Position-Restricted Substring Searching

Makinen, Veli; Navarro, Gonzalo

Author	dc.contributor.author	Makinen, Veli
Author	dc.contributor.author	Navarro, Gonzalo	es_CL
Admission date	dc.date.accessioned	2009-05-11T11:58:05Z
Available date	dc.date.available	2009-05-11T11:58:05Z
Publication date	dc.date.issued	2006
Cita de ítem	dc.identifier.citation	Lecture Notes in Computer Science, Volume 3887/2006	en
Identifier	dc.identifier.issn	0302-9743 (Print) 1611-3349 (Online)
Identifier	dc.identifier.uri	https://repositorio.uchile.cl/handle/2250/124918
Abstract	dc.description.abstract	A full-text index is a data structure built over a text string T[1, n]. The most basic functionality provided is (a) counting how many times a pattern string P[1,m] appears in T and (b) locating all those occ positions. There exist several indexes that solve (a) in O(m) time and (b) in O(occ) time. In this paper we propose two new queries, (c) counting how many times P[1,m] appears in T[l, r] and (d) locating all those occl,r positions. These can be solved using (a) and (b) but this requires O(occ) time. We present two solutions to (c) and (d) in this paper. The first is an index that requires O(n log n) bits of space and answers (c) in O(m + log n) time and (d) in O(log n) time per occurrence (that is, O(occl,r log n) time overall). A variant of the first solution answers (c) in O(m+log log n) time and (d) in constant time per occurrence, but requires O(n log1+Ç« n) bits of space for any constant Ç« > 0. The second solution requires O(nmlog Ïƒ) bits of space, solving (c) in O(mâŒˆlog Ïƒ/ log log nâŒ‰) time and (d) in O(mâŒˆlog Ïƒ/ log log nâŒ‰) time per occurrence, where Ïƒ is the alphabet size. This second structure takes less space when the text is compressible. Our solutions can be seen as a generalization of rank and select dictionaries, which allow computing how many times a given character c appears in a prefix T[1, i] and also locate the i-th occurrence of c in T. Our solution to (c) extends character rank queries to substring rank queries, and our solution to (d) extends character select to substring select queries. As a byproduct, we show how rank queries can be used to implement fractional cascading in little space, so as to obtain an alternative implementation of a well-known two-dimensional range search data structure by Chazelle. We also show how Grossi et al.â€™s wavelet trees are suitable for two-dimensional range searching, and their connection with Chazelleâ€™s data structure.	en
Lenguage	dc.language.iso	en	en
Publisher	dc.publisher	Springer Berlin / Heidelberg	en
Título	dc.title	Position-Restricted Substring Searching	en
Document type	dc.type	Artículo de revista

Files in this item

Name:: Makinen_Veli_Position.pdf
Size:: 167.9Kb
Format:: PDF

This item appears in the following Collection(s)

Artículos de revistas
Artículos de revistas

Show simple item record