2017
Conference article  Open Access

An encoding for order-preserving matching

Gagie T., Manzini G., Venturini R.

FOS: Computer and information sciences  Computer Science - Data Structures and Algorithms  Compact data structures  encodings  Computer Science  general works  order-preserving matching  000 Computer science  knowledge  Data Structures and Algorithms (cs.DS) 

Encoding data structures store enough information to answer the queries they are meant to support but not enough to recover their underlying datasets. In this paper we give the first encoding data structure for the challenging problem of order-preserving pattern matching. This problem was introduced only a few years ago but has already attracted significant attention because of its applications in data analysis. Two strings are said to be an order-preserving match if the relative order of their characters is the same: E.g., 4, 1, 3, 2 and 10, 3, 7, 5 are an order preserving match. We show how, given a string S[1..n] over an arbitrary alphabet of size ? and a constant c >= 1, we can build an O(n log log n)-bit encoding such that later, given a pattern P[1..m] with m <= logcn, we can return the number of order-preserving occurrences of P in S in O(m) time. Within the same time bound we can also return the starting position of some order preserving match for P in S (if such a match exists). We prove that our space bound is within a constant factor of optimal if log?=? (log log n); our query time is optimal if log?= (log n). Our space bound contrasts with the ? (n log n) bits needed in the worst case to store S itself, an index for order-preserving pattern matching with no restrictions on the pattern length, or an index for standard pattern matching even with restrictions on the pattern length. Moreover, we can build our encoding knowing only how each character compares to O(logc n) neighbouring characters.

Source: 25th European Symposium on Algorithms, ESA 2017, pp. 38:1–38:15, Vienna, Austria, 4-6 September, 2016


1. Brenda S. Baker. A theory of parameterized pattern matching: algorithms and applications. In STOC, pages 71-80, 1993.
2. Brenda S. Baker. Parameterized pattern matching: Algorithms and applications. J. Comput. Syst. Sci., 52(1):28- 42, 1996.
3. J´er´emy Barbay, Meng He, J. Ian Munro, and Srinivasa Rao Satti. Succinct indexes for strings, binary relations and multilabeled trees. ACM Trans. Algorithms, 7(4):52, 2011.
4. Djamal Belazzougui, Adeline Pierrot, Mathieu Raffinot, and St´ephane Vialette. Single and multiple consecutive permutation motif search. In ISAAC, volume 8283 of Lecture Notes in Computer Science, pages 66-77. Springer, 2013.
5. Domenico Cantone, Simone Faro, and M. Oguzhan Ku¨lekci. An efficient skip-search approach to the orderpreserving pattern matching problem. In Stringology, pages 22-35. Department of Theoretical Computer Science, Czech Technical University in Prague, 2015.
6. Tamanna Chhabra, Simone Faro, M. Oguzhan Ku¨lekci, and Jorma Tarhio. Engineering order-preserving pattern matching with SIMD parallelism. Software: Practice and Experience, 2016.
7. Tamanna Chhabra, Emanuele Giaquinta, and Jorma Tarhio. Filtration algorithms for approximate orderpreserving matching. In SPIRE, volume 9309 of Lecture Notes in Computer Science, pages 177-187. Springer, 2015.
8. Tamanna Chhabra, M. Oguzhan Ku¨lekci, and Jorma Tarhio. Alternative algorithms for order-preserving matching. In Stringology, pages 36-46. Department of Theoretical Computer Science, Czech Technical University in Prague, 2015.
9. Tamanna Chhabra and Jorma Tarhio. Order-preserving matching with filtration. In SEA, volume 8504 of Lecture Notes in Computer Science, pages 307-314. Springer, 2014.
10. Tamanna Chhabra and Jorma Tarhio. A filtration method for order-preserving matching. Inf. Process. Lett., 116(2):71-74, 2016.
11. Sukhyeun Cho, Joong Chae Na, Kunsoo Park, and Jeong Seop Sim. A fast algorithm for order-preserving pattern matching. Inf. Process. Lett., 115(2):397-402, 2015.
12. Maxime Crochemore, Costas S. Iliopoulos, Tomasz Kociumaka, Marcin Kubica, Alessio Langiu, Solon P. Pissis, Jakub Radoszewski, Wojciech Rytter, and Tomasz Walen. Order-preserving indexing. Theor. Comput. Sci., 638:122-135, 2016.
13. Gianni Decaroli and Giovanni Manzini. Compressing and indexing stock market data. CoRR, abs/1606.05724, 2016.
14. Simone Faro and M. Oguzhan Ku¨lekci. Efficient algorithms for the order preserving pattern matching problem. In Algorithmic Aspects in Information and Management, volume 9778, 2016.
15. Paolo Ferragina and Giovanni Manzini. An experimental study of a compressed index. Inf. Sci., 135(1-2):13-28, 2001.
16. Arnab Ganguly, Rahul Shah, and Sharma V. Thankachan. Parameterized pattern matching - succinctly. CoRR, abs/1603.07457, 2016. An updated version has been accepted to SODA 2017.
17. Pawel Gawrychowski and Przemyslaw Uznanski. Order-preserving pattern matching with k mismatches. In CPM, volume 8486 of Lecture Notes in Computer Science, pages 130-139. Springer, 2014.
18. Tommi Hirvola and Jorma Tarhio. Approximate online matching of circular strings. In SEA, volume 8504 of Lecture Notes in Computer Science, pages 315-325. Springer, 2014.
19. Jinil Kim, Peter Eades, Rudolf Fleischer, Seok-Hee Hong, Costas S. Iliopoulos, Kunsoo Park, Simon J. Puglisi, and Takeshi Tokuyama. Order-preserving matching. Theor. Comput. Sci., 525:68-79, 2014.
20. Marcin Kubica, Tomasz Kulczynski, Jakub Radoszewski, Wojciech Rytter, and Tomasz Walen. A linear time algorithm for consecutive permutation pattern matching. Inf. Process. Lett., 113(12):430-433, 2013.
21. Rahul Shah. Personal communication, 2016.

Metrics



Back to previous page
BibTeX entry
@inproceedings{oai:it.cnr:prodotti:424365,
	title = {An encoding for order-preserving matching},
	author = {Gagie T. and Manzini G. and Venturini R.},
	doi = {10.4230/lipics.esa.2017.38 and 10.48550/arxiv.1610.02865},
	booktitle = {25th European Symposium on Algorithms, ESA 2017, pp. 38:1–38:15, Vienna, Austria, 4-6 September, 2016},
	year = {2017}
}