A grammar compression algorithm based on induced suffix sorting
Author
dc.contributor.author
Nogueira Nunes, Daniel
Author
dc.contributor.author
Louza, Felipe
Author
dc.contributor.author
Gog, Simon
Author
dc.contributor.author
Ayala-Rincon, Mauricio
Author
dc.contributor.author
Navarro, Gonzalo
Accession date
dc.date.accessioned
2019-05-31T15:20:02Z
Available date
dc.date.available
2019-05-31T15:20:02Z
Publication date
dc.date.issued
2018
Item citation
dc.identifier.citation
Data Compression Conference Proceedings, 2018
Identifier
dc.identifier.issn
10680314
Identifier
dc.identifier.other
10.1109/DCC.2018.00012
Identifier
dc.identifier.uri
https://repositorio.uchile.cl/handle/2250/169427
Abstract
dc.description.abstract
We introduce GCIS, a grammar compression algorithm based on the induced suffix sorting algorithm SAIS, presented by Nong et al. in 2009. Our solution builds on the factorization performed by SAIS during suffix sorting. We construct a context-free grammar on the input string, which can be further reduced into a shorter string by substituting each substring by its corresponding factor. The resulting grammar is encoded by exploiting some redundancies, such as common prefixes between suffix rules, which are sorted according to the SAIS framework. When compared to well-known compression tools such as Re-Pair and 7-zip on repetitive sequences, our algorithm is faster at compressing and achieves a compression ratio close to that of Re-Pair, at the cost of being the slowest at decompressing.
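The factorization idea in the abstract can be illustrated with a minimal Python sketch of a single reduction round. This is not the authors' implementation. It assumes a simplified variant in which the text is cut at its LMS (leftmost S-type) positions, as classified by SAIS, and each distinct factor becomes one grammar rule. The function names (`lms_positions`, `compress_one_round`, `decompress_one_round`) are hypothetical, Python's built-in sort stands in for induced sorting, and the rule encoding described in the abstract (shared prefixes between rules) is omitted.

```python
def lms_positions(s):
    """Return the LMS positions of s, a list of ints ending in a unique smallest sentinel."""
    n = len(s)
    is_s = [False] * n          # True where the suffix starting at i is S-type
    is_s[n - 1] = True          # the sentinel suffix is S-type by convention
    for i in range(n - 2, -1, -1):
        if s[i] < s[i + 1] or (s[i] == s[i + 1] and is_s[i + 1]):
            is_s[i] = True
    # An LMS position is an S-type position whose left neighbour is L-type.
    return [i for i in range(1, n) if is_s[i] and not is_s[i - 1]]


def compress_one_round(text):
    """Cut text at LMS positions; return (reduced string, list of rules)."""
    s = [ord(c) + 1 for c in text] + [0]        # shift symbols, append sentinel 0
    cuts = [0] + lms_positions(s) + [len(s)]
    factors = [tuple(s[a:b]) for a, b in zip(cuts, cuts[1:])]
    rules = sorted(set(factors))                # stand-in for SAIS ordering (assumption)
    name = {f: k for k, f in enumerate(rules)}  # rule index acts as a nonterminal
    reduced = [name[f] for f in factors]
    return reduced, rules


def decompress_one_round(reduced, rules):
    """Expand every nonterminal and drop the trailing sentinel."""
    s = [sym for k in reduced for sym in rules[k]]
    return "".join(chr(sym - 1) for sym in s[:-1])


if __name__ == "__main__":
    reduced, rules = compress_one_round("banana" * 3)
    print(len(reduced), "symbols in the reduced string,", len(rules), "rules")
    print(decompress_one_round(reduced, rules))
```

In this sketch a repetitive input yields fewer distinct rules than factors, so the reduced string reuses rule names. A full scheme along the lines of the abstract would presumably apply the reduction recursively to the reduced string and encode the rules compactly.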
Language
dc.language.iso
en
Publisher
dc.publisher
Institute of Electrical and Electronics Engineers Inc.