SPARQL Query generation

This is a SHACL-based SPARQL generator. It generates SPARQL queries used to extract a subset of data from a knowledge graph, based on the SHACL specification of the target dataset structure. The tool takes into account a subset of SHACL constraints such as sh:hasValue, sh:in, sh:languageIn, sh:node or sh:inversePath. It generates CONSTRUCT queries, so each query returns an RDF graph as output.
Detailed documentation is available below.

  Shapes

You can select multiple files. Supported extensions: .rdf, .ttl, .n3, .trig. Other extensions will be treated as RDF/XML.
URL of an RDF file. The same extensions as for file upload are supported.
Supported syntaxes: Turtle, RDF/XML, JSON-LD, TriG, TriX, N-Quads. We recommend Turtle.

  Target override (optional)

You can select multiple files. Supported extension: .ttl
URL of an RDF file. The same extension as for file upload is supported.
Supported syntax: Turtle.

  Options

Generate a single query per initial target defined in the SHACL file, using UNION clauses. By default, the process generates one query file per possible path in the shapes file.

Documentation

Sample file

To test and better understand how the SPARQL query generation works, you can download this Turtle example of an application profile specified in SHACL, or the corresponding Excel file. This Excel file can be converted to SHACL using the SKOS Play xls2rdf conversion tool. The conversion rules are documented in detail on the converter page. This documentation only explains the query generation algorithms.

SHACL file structure

SPARQL query generation requires at least one NodeShape with a SPARQL-based target, that is, a NodeShape whose sh:target has a sh:select holding the SPARQL query that defines the target of the shape. The query in sh:select is the starting point of the generation and is inserted as a subquery in the generated query.
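As a rough illustration of how a sh:select target can become the starting point of a generated query, the Python sketch below wraps a target query as a subquery of a CONSTRUCT query. The helper name wrap_target_as_subquery, the ?this variable and the foaf:name pattern are assumptions made for this example; the exact query text emitted by the tool may differ.

```python
def wrap_target_as_subquery(target_select, patterns):
    """Embed a sh:select target query as a subquery of a CONSTRUCT query.

    Hypothetical helper: `patterns` are the triple patterns contributed by
    the property shapes, reused in both the CONSTRUCT and WHERE clauses.
    """
    body = "\n".join(f"  {p} ." for p in patterns)
    return (
        "CONSTRUCT {\n" + body + "\n}\n"
        "WHERE {\n"
        "  { " + target_select.strip() + " }\n"
        + body + "\n}"
    )


# Assumed target: every foaf:Person, bound to ?this.
target = (
    "SELECT ?this WHERE { "
    "?this a <http://xmlns.com/foaf/0.1/Person> }"
)
query = wrap_target_as_subquery(
    target, ["?this <http://xmlns.com/foaf/0.1/name> ?name"]
)
print(query)
```

The subquery is evaluated first, so the outer patterns only match resources selected by the target.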



Property Shapes

On property shapes, the following SHACL predicates and constraints are considered:

sh:path (required)

The property indicated in sh:path is inserted in the CONSTRUCT clause as well as in the WHERE clause of the generated SPARQL query.
The only supported property paths in sh:path for the SPARQL query generation are inverse property paths.

Optional filtering criteria

Within the property shape, three possible conditions are considered:

  • sh:hasValue: Value nodes must be equal to the given RDF term. This generates a VALUES ?x {...} condition in the SPARQL query.
  • sh:in: Value nodes must be a member of the provided list of values. This generates a VALUES ?x {...} condition in the SPARQL query.
  • sh:languageIn: The language tag of each value node must be in the given list of language tags. This generates a FILTER (lang(?x) IN (...)) condition in the SPARQL query.
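The mapping from these three constraints to SPARQL conditions can be sketched in Python as follows. This is a minimal sketch, not the tool's implementation: the function filter_clauses, its parameters and the sample terms (ex:Alice, the "en" and "fr" tags) are hypothetical, and terms are assumed to be already serialized in SPARQL syntax.

```python
def filter_clauses(var, has_value=None, in_values=None, language_in=None):
    """Return the SPARQL conditions for the value variable of one property shape."""
    clauses = []
    if has_value is not None:
        # sh:hasValue -> VALUES with a single term
        clauses.append(f"VALUES ?{var} {{ {has_value} }}")
    if in_values:
        # sh:in -> VALUES with every member of the list
        clauses.append(f"VALUES ?{var} {{ {' '.join(in_values)} }}")
    if language_in:
        # sh:languageIn -> FILTER on the language tag of the value
        tags = ", ".join(f'"{tag}"' for tag in language_in)
        clauses.append(f"FILTER (lang(?{var}) IN ({tags}))")
    return clauses


print(filter_clauses("x", has_value="ex:Alice"))
print(filter_clauses("x", in_values=["ex:Alice", "ex:Bob"]))
print(filter_clauses("label", language_in=["en", "fr"]))
```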

sh:node (optional)

When a property shape has a sh:node constraint, the SPARQL query generation "follows" the sh:node to generate either another SPARQL query or another UNION clause (see below).

Multiple "target" node shapes are supported through the use of an sh:or constraint:
sh:or ( [ sh:node ex:nodeShape1 ] [ sh:node ex:nodeShape2 ] )
For each NodeShape listed in the sh:or, a new SPARQL query or another UNION clause is generated (see below).
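To make the handling of sh:node and sh:or concrete, here is a small Python sketch that collects the node shapes to follow from one property shape. The dictionary layout (keys "node" and "or") is a hypothetical in-memory model chosen for this example, not the tool's data structure.

```python
def referenced_node_shapes(property_shape):
    """List the NodeShapes to follow from one property shape.

    `property_shape` is a hypothetical dict: "node" holds a direct sh:node,
    "or" holds the members of an sh:or list, each possibly with a "node".
    """
    targets = []
    if "node" in property_shape:
        targets.append(property_shape["node"])
    for alternative in property_shape.get("or", []):
        if "node" in alternative:
            targets.append(alternative["node"])
    return targets


shape = {
    "path": "ex:relatedTo",
    "or": [{"node": "ex:nodeShape1"}, {"node": "ex:nodeShape2"}],
}
print(referenced_node_shapes(shape))
```

Each returned shape then gives rise to its own query or UNION branch.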

sh:inversePath (optional)

When sh:path is an inverse path, its value is a blank node that is the subject of exactly one triple, whose predicate is sh:inversePath.
This inserts an inverse property path in the generated SPARQL query, for example: ?x ^foaf:knows ?y
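The sketch below shows one way an inverse path could be rendered. The function path_patterns is hypothetical. It relies on the fact that SPARQL CONSTRUCT templates accept plain triples only, so the sketch assumes the inverse operator is used in the WHERE clause while the CONSTRUCT clause states the same triple with subject and object swapped; the tool may do this differently.

```python
def path_patterns(subject, prop, obj, inverse=False):
    """Return (construct_triple, where_pattern) for one property shape."""
    if inverse:
        # The WHERE clause may use the SPARQL inverse path operator ...
        where = f"?{subject} ^{prop} ?{obj}"
        # ... but a CONSTRUCT template cannot contain property paths, so the
        # equivalent triple is written with subject and object swapped.
        construct = f"?{obj} {prop} ?{subject}"
    else:
        where = construct = f"?{subject} {prop} ?{obj}"
    return construct, where


print(path_patterns("x", "foaf:knows", "y", inverse=True))
print(path_patterns("x", "foaf:name", "name"))
```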




Target override

If you provide a target override model, the targets (sh:target) of the shapes are read from this model instead of the original SHACL shapes graph. This allows the same base model to be used with different target specifications.
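A minimal Python sketch of this behaviour is given below. It assumes that the override replaces the targets of the shapes graph as a whole, and it models targets as hypothetical dictionaries mapping a NodeShape IRI to the text of its sh:select query; both points are assumptions of the example.

```python
def effective_targets(shapes_targets, override_targets=None):
    """Pick the sh:select target queries used for generation.

    Both arguments are hypothetical dicts mapping a NodeShape IRI to the
    text of its sh:select query.
    """
    if override_targets is not None:
        # Targets are read from the override model, not from the shapes graph.
        return dict(override_targets)
    return dict(shapes_targets)


base = {"ex:Person": "SELECT ?this WHERE { ?this a foaf:Person }"}
override = {"ex:Person": "SELECT ?this WHERE { ?this ex:status ex:Published }"}

print(effective_targets(base))
print(effective_targets(base, override))
```

The same shapes file can thus be paired with several override files, one per extraction scenario.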

SPARQL query generation algorithm (multiple queries)

  1. Start by looking at each NodeShape having a sh:target with a sh:select...
  2. Generate one query for the "starting point" NodeShape. In this query, for each PropertyShape:
  3. If the property shape has a sh:hasValue or sh:in, a VALUES clause is inserted; if it has a sh:languageIn, a FILTER clause is inserted.

    Example:


  4. Then, for each NodeShape referred to in an sh:node (or sh:or containing multiple sh:node), follow to the target NodeShape and apply the same algorithm recursively. The recursion stops when it encounters a NodeShape that was already processed.

    Example: the ex:Country property in the Person NodeShape refers to the ex:City NodeShape through the ex:country property. This generates this second query:
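Independently of the example above, the recursive traversal described in this section can be sketched in Python. This is a simplified illustration under stated assumptions, not the tool's implementation: shapes are modelled as a hypothetical dictionary, the ex:Book and ex:Author shapes are invented for the sketch, filtering conditions are left out, and the generated text omits PREFIX declarations.

```python
def generate_queries(shapes, start, target_select):
    """Return one CONSTRUCT query per NodeShape reachable from `start`.

    `shapes` is a hypothetical dict: NodeShape IRI -> list of property
    shapes, each a dict with a "path" and optional "nodes" (sh:node targets).
    """
    queries = []
    visited = set()

    def visit(shape, focus, prefix):
        if shape in visited:  # recursion stops on already-processed shapes
            return
        visited.add(shape)
        own, followed = [], []
        for i, prop in enumerate(shapes[shape]):
            value = f"{focus}_{i}"
            pattern = f"?{focus} {prop['path']} ?{value} ."
            own.append(pattern)
            for target in prop.get("nodes", []):
                # Remember how to reach the referenced shape from ?this
                followed.append((target, value, prefix + [pattern]))
        construct = "\n".join("  " + p for p in own)
        where = "\n".join("  " + p for p in prefix + own)
        queries.append(
            "CONSTRUCT {\n" + construct + "\n}\n"
            "WHERE {\n  { " + target_select + " }\n" + where + "\n}"
        )
        for target, value, new_prefix in followed:
            visit(target, value, new_prefix)

    visit(start, "this", [])
    return queries


shapes = {
    "ex:Book": [
        {"path": "dct:title"},
        {"path": "ex:author", "nodes": ["ex:Author"]},
    ],
    "ex:Author": [
        {"path": "foaf:name"},
        {"path": "ex:wrote", "nodes": ["ex:Book"]},  # cycle back to ex:Book
    ],
}
queries = generate_queries(
    shapes, "ex:Book", "SELECT ?this WHERE { ?this a ex:Book }"
)
for q in queries:
    print(q, end="\n\n")
```

The cycle from ex:Author back to ex:Book does not produce a third query, because ex:Book has already been processed.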



SPARQL query generation algorithm (single query)

Instead of generating multiple queries, one for each "path" in the SHACL specification, it is possible to produce a single SPARQL query for each "starting point" NodeShape having a sh:target with a sh:select. The query uses a series of UNION clauses. In this case, no filtering using sh:hasValue, sh:in or sh:languageIn happens: the query only takes into account the structure of the triples in the graph and cannot filter on their values.
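The single-query mode can be sketched in Python as follows. This is an illustration only: the function single_union_query, the branch layout (one list of triple patterns per path) and the sample patterns are assumptions, and PREFIX declarations are omitted.

```python
def single_union_query(target_select, branches):
    """Build one CONSTRUCT query whose WHERE clause unions all paths.

    `branches` is a hypothetical list with one list of triple patterns
    per path through the shapes.
    """
    # Every pattern appears once in the CONSTRUCT template, in first-seen order.
    template = list(dict.fromkeys(p for branch in branches for p in branch))
    construct = "\n".join("  " + p for p in template)
    union = "\n  UNION\n".join(
        "  { " + " ".join(branch) + " }" for branch in branches
    )
    return (
        "CONSTRUCT {\n" + construct + "\n}\n"
        "WHERE {\n  { " + target_select + " }\n" + union + "\n}"
    )


branches = [
    ["?this dct:title ?title ."],
    ["?this ex:author ?author .", "?author foaf:name ?name ."],
]
query = single_union_query("SELECT ?this WHERE { ?this a ex:Book }", branches)
print(query)
```

Each solution binds the variables of one branch only; template triples that contain an unbound variable are left out of the CONSTRUCT output, so a single template can serve all branches.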