Show simple item record

Authordc.contributor.authorGagie, Travis 
Authordc.contributor.authorKärkkäinen, Juha es_CL
Authordc.contributor.authorNavarro, Gonzalo es_CL
Authordc.contributor.authorPuglisi, Simon J. es_CL
Admission datedc.date.accessioned2014-01-24T13:40:47Z
Available datedc.date.available2014-01-24T13:40:47Z
Publication datedc.date.issued2013
Cita de ítemdc.identifier.citationTheoretical Computer Science 483 (2013) 36–50en_US
Identifierdc.identifier.issn0304-3975
Identifierdc.identifier.otherdoi:10.1016/j.tcs.2012.08.004
Identifierdc.identifier.urihttps://repositorio.uchile.cl/handle/2250/126273
General notedc.descriptionArtículo de publicación ISIen_US
Abstractdc.description.abstractColored range queries are a well-studied topic in computational geometry and database research that, in the past decade, have found exciting applications in information retrieval. In this paper, we give improved time and space bounds for three important onedimensional colored range queries — colored range listing, colored range top-k queries and colored range counting — and, as a consequence, new bounds for various document retrieval problems on general collections of sequences. Colored range listing is the problem of preprocessing a sequence S[1, n] of colors so that, later, given an interval [i, i+ℓ−1], we list the different colors in S[i, i+ℓ−1]. Colored range top-k queries ask instead for k most frequent colors in the interval. Colored range counting asks for the number of different colors in the interval. We first describe a framework including almost all recent results on colored range listing and document listing, which suggests new combinations of data structures for these problems. For example, we give the first compressed data structure (using nHk(S) + o(n log σ) bits, for any k = o(logσ n), where Hk(S) is the k-th order empirical entropy of S and σ the number of different colors in S) that answers colored range listing queries in constant time per returned result. We also give an efficient data structure for document listing whose size is bounded in terms of the k-th order entropy of the library of documents. We then show how (approximate) colored top-k queries can be reduced to (approximate) range-mode queries on subsequences, yielding the first efficient data structure for this problem. Finally, we show how modified wavelet trees can support colored range counting using nH0(S) + O(n) + o(nH0(S)) bits, and answer queries in O(log ℓ) time. As far as we know, this is the first data structure in which the query time depends only on ℓ and not on n. We also show how our data structure can be made dynamic.en_US
Lenguagedc.language.isoen_USen_US
Type of licensedc.rightsAttribution-NonCommercial-NoDerivs 3.0 Chile*
Link to Licensedc.rights.urihttp://creativecommons.org/licenses/by-nc-nd/3.0/cl/*
Keywordsdc.subjectData structuresen_US
Títulodc.titleColored range queries and document retrievalen_US
Document typedc.typeArtículo de revista


Files in this item

Icon

This item appears in the following Collection(s)

Show simple item record

Attribution-NonCommercial-NoDerivs 3.0 Chile
Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Chile