Approximate String Matching with Lempel-Ziv Compressed Indexes
Author
dc.contributor.author
Russo, Luís M. S.
Author
dc.contributor.author
Navarro, Gonzalo
es_CL
Author
dc.contributor.author
Oliveira, Arlindo L.
es_CL
Admission date
dc.date.accessioned
2014-01-08T20:17:44Z
Available date
dc.date.available
2014-01-08T20:17:44Z
Publication date
dc.date.issued
2007
Cita de ítem
dc.identifier.citation
En: String Processing and Information Retrieval. 14th International Symposium, SPIRE 2007 Santiago, Chile, October 29-31, 2007 Proceedings. pp 264-275
en_US
Identifier
dc.identifier.isbn
978-3-540-75530-2 (online)
Identifier
dc.identifier.other
DOI: 10.1007/978-3-540-75530-2_24
Identifier
dc.identifier.uri
https://repositorio.uchile.cl/handle/2250/120327
Abstract
dc.description.abstract
A compressed full-text self-index for a text T is a data structure
requiring reduced space and able of searching for patterns P in T.
Furthermore, the structure can reproduce any substring of T, thus it
actually replaces T. Despite the explosion of interest on self-indexes in
recent years, there has not been much progress on search functionalities
beyond the basic exact search. In this paper we focus on indexed
approximate string matching (ASM), which is of great interest, say, in
computational biology applications. We present an ASM algorithm that
works on top of a Lempel-Ziv self-index.We consider the so-called hybrid
indexes, which are the best in practice for this problem. We show that a
Lempel-Ziv index can be seen as an extension of the classical q-samples
index. We give new insights on this type of index, which can be of independent
interest, and then apply them to the Lempel-Ziv index. We
show experimentally that our algorithm has a competitive performance
and provides a useful space-time tradeoff compared to classical indexes.