Show simple item record

Authordc.contributor.authorArroyuelo, Diego 
Authordc.contributor.authorNavarro, Gonzalo es_CL
Authordc.contributor.authorSadakane, Kunihiko es_CL
Admission datedc.date.accessioned2012-05-22T21:16:34Z
Available datedc.date.available2012-05-22T21:16:34Z
Publication datedc.date.issued2012
Cita de ítemdc.identifier.citationAlgorithmica (2012) 62:54–101es_CL
Identifierdc.identifier.otherDOI 10.1007/s00453-010-9443-8
Identifierdc.identifier.urihttps://repositorio.uchile.cl/handle/2250/125597
Abstractdc.description.abstractGiven a text T [1..u] over an alphabet of size σ, the full-text search problem consists in finding the occ occurrences of a given pattern P[1..m] in T . In indexed text searching we build an index on T to improve the search time, yet increasing the space requirement. The current trend in indexed text searching is that of compressed full-text self-indices, which replace the text with a more space-efficient representation of it, at the same time providing indexed access to the text. Thus, we can provide efficient access within compressed space. The Lempel-Ziv index (LZ-index) of Navarro is a compressed full-text self-index able to represent T using 4uHk(T ) + o(ulogσ) bits of space, where Hk(T ) denotes the k-th order empirical entropy of T , for any k = o(logσ u). This space is about four times the compressed text size. The index can locate all the occ occurrences of a pattern P in T in O(m3 log σ + (m + occ) log u) worst-case time. Although this index has proven very competitive in practice, the O(m3 logσ) term can be excessive for long patterns. Also, the factor 4 in its space complexity makes it larger than other state-of-the-art alternatives. In this paper we present stronger Lempel-Ziv based indices (LZ-indices), improving the overall performance of the original LZ-index. We achieve indices requiring (2 + )uHk(T ) + o(ulogσ) bits of space, for any constant > 0, which makes them the smallest existing LZ-indices. We simultaneously improve the search time to O(m2 + (m + occ) log u), which makes our indices very competitive with stateof- the-art alternatives. Our indices support displaying any text substring of length in optimal O( / logσ u) time. In addition, we show how the space can be squeezed to (1+ )uHk(T )+o(ulogσ) to obtain a structure with O(m2) average search time for m 2 logσ u. Alternatively, the search time of LZ-indices can be improved to O((m + occ) log u) with (3 + )uHk(T ) + o(ulogσ) bits of space, which is much less than the space needed by other Lempel-Ziv-based indices achieving the same search time. Overall our indices stand out as a very attractive alternative for spaceefficient indexed text searching.es_CL
Patrocinadordc.description.sponsorshipCONICYT PhD Fellowship Program, Fondecyt Grant 1-080019, Grant-in-Aid of the Ministry of Education, Science, Sports and Culture of Japanes_CL
Lenguagedc.language.isoenes_CL
Publisherdc.publisherSpringeres_CL
Keywordsdc.subjectText compressiones_CL
Títulodc.titleStronger Lempel-Ziv Based Compressed Text Indexinges_CL
Document typedc.typeArtículo de revista


Files in this item

Icon

This item appears in the following Collection(s)

Show simple item record