Compressed representations for web and social graphs
Author
dc.contributor.author
Hernández, Cecilia
Author
dc.contributor.author
Navarro, Gonzalo
es_CL
Admission date
dc.date.accessioned
2014-12-11T14:33:03Z
Available date
dc.date.available
2014-12-11T14:33:03Z
Publication date
dc.date.issued
2013
Cita de ítem
dc.identifier.citation
Knowl Inf Syst (2014) 40:279–313
en_US
Identifier
dc.identifier.other
DOI 10.1007/s10115-013-0648-4
Identifier
dc.identifier.uri
https://repositorio.uchile.cl/handle/2250/126521
General note
dc.description
Articulo de publicacion SCOPUS
en_US
Abstract
dc.description.abstract
Compressed representations have become effective to store and access largeWeb
and social graphs, in order to support various graph querying and mining tasks. The existing
representations exploit various typical patterns in those networks and provide basic navigation
support. In this paper, we obtain unprecedented results by finding “dense subgraph” patterns
and combining them with techniques such as node orderings and compact data structures.
On those representations, we support out-neighbor and out/in-neighbor queries, as well as
mining queries based on the dense subgraphs. First, we propose a compression scheme for
Web graphs that reduces edges by representing dense subgraphs with “virtual nodes”; over
this scheme, we apply node orderings and other compression techniques.With this approach,
we match the best current compression ratios that support out-neighbor queries (i.e., nodes
pointed from a given node), using 1.0–1.8 bits per edge (bpe) on large Web graphs, and
retrieving each neighbor of a node in 0.6–1.0microseconds (μs). When supporting both outand
in-neighbor queries, instead, our technique generally offers the best timewhen using little
space. If the reduced graph, instead, is represented with a compact data structure that supports
bidirectional navigation,weobtain the most compactWeb graph representations (0.9–1.5 bpe)
that support out/in-neighbor navigation; yet, the time per neighbor extracted raises to around
5–20μs. We also propose a compact data structure that represents dense subgraphs without
using virtual nodes. It allows us to recover out/in-neighbors and answer other more complex
queries on the dense subgraphs identified. This structure is not competitive on Web graphs,
but on social networks, it achieves 4–13 bpe and 8–12μs per out/in-neighbor retrieved, which
improves upon all existing representations.
en_US
Patrocinador
dc.description.sponsorship
Millennium Nucleus Information and Coordination in Networks
FONDECYT