We define a procedure for canonicalising SPARQL 1.1 queries. Specifically, given two input queries that return the
same solutions modulo variable names over any RDF graph (which we call congruent queries), the canonicalisation procedure
aims to rewrite both input queries to a syntactically canonical query that likewise returns the same results modulo variable renaming.
The use-cases for such canonicalisation include caching, optimisation, redundancy elimination, question answering, and
more besides. To begin, we formally define the semantics of the SPARQL 1.1 language, including features often overlooked in
the literature. We then propose a canonicalisation procedure based on mapping a SPARQL query to an RDF graph, applying algebraic
rewritings, removing redundancy, and then using canonical labelling techniques to produce a canonical form. Unfortunately,
full canonicalisation of SPARQL 1.1 queries is undecidable. We instead propose a procedure that we prove
to be sound and complete for a decidable fragment of monotone queries under both set and bag semantics, and that is sound but
incomplete in the case of the full SPARQL 1.1 query language. Although the worst-case complexity of the procedure is super-exponential,
our experiments show that it is efficient for real-world queries, and that such difficult cases are rare.
Sponsor
dc.description.sponsorship
Comision Nacional de Investigacion Cientifica y Tecnologica (CONICYT)
CONICYT FONDECYT 1181896
ANID - Millennium Science Initiative Program ICN17_002
Language
dc.language.iso
en
Publisher
dc.publisher
IOS PRESS
Type of license
dc.rights
Attribution-NonCommercial-NoDerivs 3.0 United States