Skip to content

hdbn/osus

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

--------------------------------------------------
osus by Hideo Bannai and Kazuya Tsuruta
Copyright (c) 2013
See file LICENSE for copying
--------------------------------------------------
For a given string s of length N, an interval [i..j] is unique if s[i..j] does not occur
anywhere else in s. A unique interval [i..j] is a Shortest Unique Substring (SUS)
at position p, if i \leq p \leq j and there is no shorter unique interval [i'..j']
where i' \leq p \leq j'.

SUSs were first considered in [2]. However, 
- their algorithm was O(N^2) time, and 
- they only considered the left-most SUS (lmSUS) of each position.

This program implements the algorithm proposed in the paper [1], which
computes the lmSUS for all positions of the input string, in O(N) time.
(For strange reasons, we have changed the name of the implementation 
 from TSUS to OSUS ;-P)

The algorithm can also compute all SUSs for any given position p in O(k)
time, where k is the number of SUSs at position p, if we are allowed O(N) 
time pre-processing of the string. 

If you use this code for experiments in a research paper, please cite the
paper [1] and publish the URL from which you downloaded the code.

--

The libdivsufsort-lite library by Yuta Mori is used for suffix array construction.
Scons (http://www.scons.org/) is used for the building process.
Just type "scons" if you have it installed.

-- Usage:

Usage  : ./osus [options]
Options: 
  -f iFile : file to process
  -n int   : truncate input string to size
  -q int   : query position (output all if negative (default))
  -l       : only output leftmost SUS
  -c       : (DEBUG) check results against semi-naive algorithm
  -h       : usage

------------------------------
Ex:
$ ./osus -f test.txt 
input string length = 9
0: [0,1];
1: [0,1];
2: [0,2];
3: [0,3];
4: [0,4];[2,6];[4,8];
5: [5,8];
6: [6,8];
7: [7,8];
8: [8,8];
------------------------------

--------------------------------------------------
References:

  [1] Shortest Unique Substrings Queries in Optimal Time,
      Kazuya Tsuruta, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda,
      SOFSEM '14 Proceedings of the 40th International Conference on
      Current Trends in Theory and Practice of Computer Science.

  [2] On Shortest Unique Substring Queries
      Jian Pei, Wush Chi-Hsuan Wu, and Mi-Yen Yeh
      ICDE '13 Proceedings of the 2013 IEEE International Conference on 
      Data Engineering (ICDE 2013) Pages 937-948 
      http://dx.doi.org/10.1109/ICDE.2013.6544887
      http://www.cs.sfu.ca/~jpei/publications/Shortest%20Unique%20Substring%20Queries%20ICDE13.pdf

About

Automatically exported from code.google.com/p/osus

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors