Binary RDF representation for publication and exchange (HDT)
Author
dc.contributor.author
Fernández, Javier D.
Author
dc.contributor.author
Martínez Prieto, Miguel A.
es_CL
Author
dc.contributor.author
Gutiérrez Gallardo, Claudio
es_CL
Author
dc.contributor.author
Polleres, Axel
es_CL
Author
dc.contributor.author
Arias, Mario
es_CL
Admission date
dc.date.accessioned
2014-02-05T13:54:05Z
Available date
dc.date.available
2014-02-05T13:54:05Z
Publication date
dc.date.issued
2013
Item citation
dc.identifier.citation
Web Semantics: Science, Services and Agents on the World Wide Web 19 (2013) 22–41
en_US
Identifier
dc.identifier.other
doi:10.1016/j.websem.2013.01.002
Identifier
dc.identifier.uri
https://repositorio.uchile.cl/handle/2250/126366
General note
dc.description
ISI publication article
en_US
Abstract
dc.description.abstract
The current Web of Data is producing increasingly large RDF datasets. Massive publication efforts of RDF
data, driven by initiatives like the Linked Open Data movement, and the need to exchange large datasets
have unveiled the drawbacks of traditional RDF representations, inspired and designed by a document-centric
and human-readable Web. Among the main problems are high levels of verbosity/redundancy
and weak machine-processable capabilities in the description of these datasets. This scenario calls for
efficient formats for publication and exchange.
This article presents a binary RDF representation addressing these issues. Based on a set of metrics
that characterizes the skewed structure of real-world RDF data, we develop a proposal of an RDF
representation that modularly partitions and efficiently represents three components of RDF datasets:
Header information, a Dictionary, and the actual Triples structure (thus called HDT). Our experimental
evaluation shows that datasets in HDT format can be compacted by more than fifteen times as compared
to current naive representations, improving both parsing and processing while keeping a consistent
publication scheme. Specific compression techniques over HDT further improve these compression rates
and prove to outperform existing compression solutions for efficient RDF exchange.
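
The three-component split described in the abstract can be illustrated with a toy sketch. This is not the HDT binary format or the authors' implementation: the function names `encode` and `decode`, the header fields, and the single flat dictionary are assumptions made for illustration only. It shows only the general idea that long RDF terms are stored once in a dictionary and the triples are kept as compact integer IDs.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
IdTriple = Tuple[int, int, int]


def encode(triples: List[Triple]) -> Tuple[Dict[str, object], Dict[str, int], List[IdTriple]]:
    """Split a list of RDF triples into header, dictionary, and ID triples.

    Hypothetical helper: a toy illustration, not the HDT serialization.
    """
    unique = set(triples)
    # Header: descriptive metadata about the dataset (field names are made up).
    header = {"format": "toy-hdt-sketch", "num_triples": len(unique)}
    # Dictionary: every distinct term is stored once and mapped to an integer ID.
    terms = sorted({term for triple in unique for term in triple})
    dictionary = {term: i + 1 for i, term in enumerate(terms)}
    # Triples: the graph structure, expressed only through the integer IDs.
    id_triples = sorted(
        (dictionary[s], dictionary[p], dictionary[o]) for s, p, o in unique
    )
    return header, dictionary, id_triples


def decode(dictionary: Dict[str, int], id_triples: List[IdTriple]) -> List[Triple]:
    """Rebuild the original string triples from the dictionary and ID triples."""
    reverse = {i: term for term, i in dictionary.items()}
    return [(reverse[s], reverse[p], reverse[o]) for s, p, o in id_triples]


if __name__ == "__main__":
    data = [
        ("ex:alice", "foaf:knows", "ex:bob"),
        ("ex:bob", "foaf:knows", "ex:alice"),
        ("ex:alice", "foaf:name", "Alice"),
    ]
    header, dictionary, id_triples = encode(data)
    print(header)
    print(dictionary)
    print(id_triples)
```

Because repeated terms such as `foaf:knows` appear only once in the dictionary, the triples component carries no string redundancy; the compression figures reported in the abstract come from the actual HDT encoding, which this sketch does not reproduce.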