
Author (dc.contributor.author): Umbrich, Juergen
Author (dc.contributor.author): Hogan, Aidan
Author (dc.contributor.author): Polleres, Axel
Author (dc.contributor.author): Decker, Stefan
Admission date (dc.date.accessioned): 2015-12-29T20:11:24Z
Available date (dc.date.available): 2015-12-29T20:11:24Z
Publication date (dc.date.issued): 2015
Item citation (dc.identifier.citation): Semantic Web, Volume 6, Number 6, Pages 585-624 (2015) [en_US]
Identifier (dc.identifier.other): DOI: 10.3233/SW-140164
Identifier (dc.identifier.uri): https://repositorio.uchile.cl/handle/2250/136048
General note (dc.description): ISI publication article [en_US]
Abstract (dc.description.abstract): Traditional approaches for querying the Web of Data often involve centralised warehouses that replicate remote data. Conversely, Linked Data principles allow for answering queries live over the Web by dereferencing URIs to traverse remote data sources at runtime. A number of authors have looked at answering SPARQL queries in such a manner; these link-traversal based query execution (LTBQE) approaches for Linked Data offer up-to-date results and decentralised (i.e., client-side) execution, but must operate over incomplete dereferenceable knowledge available in remote documents, thus affecting response times and "recall" for query answers. In this paper, we study the recall and effectiveness of LTBQE, in practice, for the Web of Data. Furthermore, to integrate data from diverse sources, we propose lightweight reasoning extensions to help find additional answers. From the state-of-the-art which (1) considers only dereferenceable information and (2) follows rdfs:seeAlso links, we propose extensions to consider (3) owl:sameAs links and reasoning, and (4) lightweight RDFS reasoning. We then estimate the recall of link-traversal query techniques in practice: we analyse a large crawl of the Web of Data (the BTC'11 dataset), looking at the ratio of raw data contained in dereferenceable documents vs. the corpus as a whole and determining how much more raw data our extensions make available for query answering. We then stress-test LTBQE (and our extensions) in real-world settings using the FedBench and DBpedia SPARQL Benchmark frameworks, and propose a novel benchmark called QWalk based on random walks through diverse data. We show that link-traversal query approaches often work well in uncontrolled environments for simple queries, but need to retrieve an unfeasible number of sources for more complex queries. We also show that our reasoning extensions increase recall at the cost of slower execution, often increasing the rate at which results return; conversely, we show that reasoning aggravates performance issues for complex queries. [en_US]
Sponsor (dc.description.sponsorship): Science Foundation Ireland (SFI), Vienna Science and Technology Fund (WWTF), Millennium Nucleus Center for Semantic Web Research [en_US]
Language (dc.language.iso): en [en_US]
Publisher (dc.publisher): IOS Press [en_US]
Type of license (dc.rights): Attribution-NonCommercial-NoDerivs 3.0 Chile [*]
Link to License (dc.rights.uri): http://creativecommons.org/licenses/by-nc-nd/3.0/cl/ [*]
Keywords (dc.subject): Reasoning [en_US]
Keywords (dc.subject): Live querying [en_US]
Keywords (dc.subject): Web of Data [en_US]
Keywords (dc.subject): RDF [en_US]
Keywords (dc.subject): Semantic Web [en_US]
Keywords (dc.subject): OWL [en_US]
Keywords (dc.subject): RDFS [en_US]
Keywords (dc.subject): SPARQL [en_US]
Keywords (dc.subject): Linked data [en_US]
Title (dc.title): Link traversal querying for a diverse Web of Data [en_US]
Document type (dc.type): Journal article


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Chile