Binary RDF representation for publication and exchange (HDT)
Article
Publication date
2013
Author
Fernández, Javier D.
Abstract
The current Web of Data is producing increasingly large RDF datasets. Massive publication efforts of RDF
data driven by initiatives like the Linked Open Data movement, and the need to exchange large datasets
have unveiled the drawbacks of traditional RDF representations, inspired and designed by a document-centric
and human-readable Web. Among the main problems are high levels of verbosity/redundancy
and weak machine-processable capabilities in the description of these datasets. This scenario calls for
efficient formats for publication and exchange.
This article presents a binary RDF representation addressing these issues. Based on a set of metrics
that characterizes the skewed structure of real-world RDF data, we develop a proposal of an RDF
representation that modularly partitions and efficiently represents three components of RDF datasets:
Header information, a Dictionary, and the actual Triples structure (thus called HDT). Our experimental
evaluation shows that datasets in HDT format can be compacted by more than fifteen times as compared
to current naive representations, improving both parsing and processing while keeping a consistent
publication scheme. Specific compression techniques over HDT further improve these compression rates
and prove to outperform existing compression solutions for efficient RDF exchange.
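The Dictionary/Triples split described in the abstract can be illustrated with a minimal sketch. It assumes a single shared dictionary that maps every RDF term to an integer ID and stores triples as sorted ID tuples. The function names `encode` and `decode` are hypothetical, and the sketch does not reproduce the binary layout or the compression techniques of HDT itself.

```python
def encode(triples):
    """Split triples into a term dictionary and integer-ID triples (toy sketch)."""
    # Dictionary component: each distinct term is stored once, in sorted order.
    terms = sorted({term for triple in triples for term in triple})
    term_to_id = {term: i + 1 for i, term in enumerate(terms)}  # IDs start at 1
    # Triples component: the graph structure expressed only with integer IDs.
    id_triples = sorted(
        tuple(term_to_id[term] for term in triple) for triple in triples
    )
    return terms, id_triples


def decode(terms, id_triples):
    """Rebuild the original string triples from the two components."""
    return [tuple(terms[i - 1] for i in triple) for triple in id_triples]


if __name__ == "__main__":
    data = [
        ("ex:alice", "foaf:knows", "ex:bob"),
        ("ex:alice", "foaf:name", '"Alice"'),
        ("ex:bob", "foaf:name", '"Bob"'),
    ]
    dictionary, ids = encode(data)
    print(dictionary)
    print(ids)
```

Repeated terms such as `ex:alice` and `foaf:name` appear once in the dictionary and are referenced by ID in the triples, which is the source of the redundancy reduction this sketch is meant to show.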
General note
Article published in an ISI-indexed journal
Identifier
URI: https://repositorio.uchile.cl/handle/2250/126366
DOI: 10.1016/j.websem.2013.01.002
Quote Item
Web Semantics: Science, Services and Agents on the World Wide Web 19 (2013) 22–41