The wavelet matrix: An efficient wavelet tree for large alphabets
Author
dc.contributor.author
Claude, Francisco
Author
dc.contributor.author
Navarro, Gonzalo
Author
dc.contributor.author
Ordóñez, Alberto
Admission date
dc.date.accessioned
2015-09-15T19:09:42Z
Available date
dc.date.available
2015-09-15T19:09:42Z
Publication date
dc.date.issued
2015
Item citation
dc.identifier.citation
Information Systems 47 (2015) 15–32
en_US
Identifier
dc.identifier.other
DOI: 10.1016/j.is.2014.06.002
Identifier
dc.identifier.uri
https://repositorio.uchile.cl/handle/2250/133661
General note
dc.description
ISI publication article
en_US
Abstract
dc.description.abstract
The wavelet tree is a flexible data structure that represents a sequence S[1, n] of symbols over an alphabet of size σ within compressed space while supporting a wide range of operations on S. When σ is significant compared to n, current wavelet tree representations incur noticeable space or time overheads. In this article we introduce the wavelet matrix, an alternative representation for large alphabets that retains all the properties of wavelet trees but is significantly faster. We also show how the wavelet matrix can be compressed to the zero-order entropy of the sequence without sacrificing its time performance, and in fact improving it. Our experimental results show that the wavelet matrix outperforms all the wavelet tree variants across the space/time tradeoff map.
en_US
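As a rough illustration of the structure the abstract describes, the following is a minimal sketch of an uncompressed wavelet matrix supporting access and rank. It is not the authors' implementation: the class name `WaveletMatrix`, its method names, and the use of plain Python lists with prefix sums in place of succinct rank-enabled bitvectors are all assumptions made for readability, and the zero-order compression mentioned in the abstract is not shown.

```python
from typing import List, Sequence


class WaveletMatrix:
    """Illustrative (hypothetical) uncompressed wavelet matrix over non-negative integers."""

    def __init__(self, seq: Sequence[int]) -> None:
        self.n = len(seq)
        sigma = max(seq) + 1 if seq else 1
        # Number of levels = bits needed to write the largest symbol.
        self.levels = max(1, (sigma - 1).bit_length())
        self.zeros: List[int] = []        # number of 0 bits at each level
        self.ranks: List[List[int]] = []  # prefix counts of 1 bits at each level
        cur = list(seq)
        for level in range(self.levels):
            shift = self.levels - 1 - level
            bits = [(c >> shift) & 1 for c in cur]
            prefix = [0]
            for b in bits:
                prefix.append(prefix[-1] + b)
            self.ranks.append(prefix)
            self.zeros.append(self.n - prefix[-1])
            # Stable partition: symbols with bit 0 go first, then those with bit 1.
            cur = ([c for c, b in zip(cur, bits) if b == 0]
                   + [c for c, b in zip(cur, bits) if b == 1])

    def _rank1(self, level: int, i: int) -> int:
        """Number of 1 bits in positions [0, i) of the given level."""
        return self.ranks[level][i]

    def _rank0(self, level: int, i: int) -> int:
        """Number of 0 bits in positions [0, i) of the given level."""
        return i - self.ranks[level][i]

    def access(self, i: int) -> int:
        """Return the symbol at 0-based position i."""
        c = 0
        for level in range(self.levels):
            bit = self.ranks[level][i + 1] - self.ranks[level][i]
            c = (c << 1) | bit
            if bit:
                i = self.zeros[level] + self._rank1(level, i)
            else:
                i = self._rank0(level, i)
        return c

    def rank(self, c: int, i: int) -> int:
        """Return the number of occurrences of symbol c in positions [0, i)."""
        if c >> self.levels:
            return 0  # symbol outside the representable range
        lo, hi = 0, i
        for level in range(self.levels):
            bit = (c >> (self.levels - 1 - level)) & 1
            if bit:
                lo = self.zeros[level] + self._rank1(level, lo)
                hi = self.zeros[level] + self._rank1(level, hi)
            else:
                lo = self._rank0(level, lo)
                hi = self._rank0(level, hi)
        return hi - lo
```

For example, building `WaveletMatrix([4, 7, 6, 5, 3, 2, 1, 0, 1, 4, 1, 7])` and calling `access(i)` for each position recovers the original sequence. Each query visits one bitvector per level, so with constant-time bitvector rank its cost grows with log σ rather than with n.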
Sponsor
dc.description.sponsorship
Fondecyt 11130104; Millennium Nucleus Information and Coordination in Networks ICM/FIC, Chile P10-024F; CDTI CDTI EXP 000645663/ITC-20133062; Ministerio de Economia y Competitividad-MEC- CDTI EXP 000645663/ITC-20133062; Axencia Galega de Innovacion -AGI- CDTI EXP 000645663/ITC-20133062; Xunta de Galicia - (FEDER) GRC2013/053; MICINN (PGE); MICINN (FEDER) TIN2009-14560-C03-02, TIN2010-21246-C02-01; FPU Program AP2010-6038