SPARQLUpdateStore in Python RDFLib 5.0.0
Recently I had to use the SPARQLUpdateStore in Python's RDFLib 5.0.0. Out of the box, the SPARQL Update queries constructed by RDFLib were failing when sent to local instances of Virtuoso Single Server Edition (version 07.20.3215) and GraphDB Free (version 9.0). Here is how I got the SPARQLUpdateStore to work.
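For context, a bare-bones SPARQLUpdateStore setup, without any of the workarounds described below, looks roughly like this. The endpoint URLs here are placeholders; this is the kind of out-of-the-box usage that was failing for me:

from rdflib import URIRef, Graph
from rdflib.namespace import RDF, OWL
from rdflib.plugins.stores.sparqlstore import SPARQLUpdateStore

# Placeholder endpoints - point these at your own triplestore.
store = SPARQLUpdateStore(queryEndpoint='http://localhost:7200/repositories/test-repo',
                          update_endpoint='http://localhost:7200/repositories/test-repo/statements')
g = Graph(store=store, identifier='http://test.com/')
# Against Virtuoso and GraphDB, updates issued this way were rejected out of the box.
g.add((URIRef('http://example.com/test888'), RDF.type, OWL.Ontology))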
I ran both Virtuoso and GraphDB using Docker.
Virtuoso docker-compose.yml:
db:
  image: tenforce/virtuoso:1.3.1-virtuoso7.2.2
  environment:
    SPARQL_UPDATE: "false"
    DEFAULT_GRAPH: "http://www.example.com/my-graph"
  volumes:
    - ./data/virtuoso:/data
  ports:
    - "8890:8890"
GraphDB docker-compose.yml:
version: '3.0'
services:
  graphdb:
    image: ternau/graphdb:9.0.0
    ports:
      - '7200:7200'
    environment:
      GDB_HEAP_SIZE: '4G'
      GDB_JAVA_OPTS: '-Dgraphdb.workbench.cors.enable=true'
    volumes:
      - ./graphdb-data:/graphdb-free-8.5.0/data
      - ./graphdb-work:/graphdb-free-8.5.0/work
      - ./graphdb-logs:/graphdb-free-8.5.0/logs
The requirements.txt file:
rdflib==5.0.0
requests==2.24.0
The config.py file:
# SPARQL_ENDPOINT = 'http://localhost:8890/sparql' # Virtuoso
SPARQL_ENDPOINT = 'http://localhost:7200/repositories/test-repo' # GraphDB
# SPARQL_UPDATE_ENDPOINT = 'http://localhost:8890/sparql-auth' # Virtuoso
SPARQL_UPDATE_ENDPOINT = 'http://localhost:7200/repositories/test-repo/statements' # GraphDB
AUTH_USER = 'dba' # Default Virtuoso username
AUTH_PASS = 'dba' # Default Virtuoso password
DEFAULT_GRAPH_URI = 'http://test.com/' # Named graph context
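Before wiring up RDFLib, a quick sanity check can confirm that the query endpoint in config.py is reachable. This is not part of the solution, just a sketch that sends a trivial query over the plain SPARQL 1.1 Protocol with requests; the ASK query is only illustrative, and the auth argument only matters if the endpoint is protected:

import requests
from requests.auth import HTTPDigestAuth

import config

# Send a trivial ASK query via the SPARQL 1.1 Protocol (GET with a `query` parameter).
response = requests.get(config.SPARQL_ENDPOINT,
                        params={'query': 'ASK { ?s ?p ?o }'},
                        headers={'Accept': 'application/sparql-results+json'},
                        auth=HTTPDigestAuth(config.AUTH_USER, config.AUTH_PASS))
print(response.status_code, response.text)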
My solution - app.py:
from rdflib import URIRef, Graph
from rdflib.plugins.stores.sparqlstore import SPARQLUpdateStore, Store
from rdflib.namespace import RDF, OWL
from requests.auth import HTTPDigestAuth
import config
def set_store_header_update(store: Store):
    """Call this function before any `Graph.add()` calls to set the appropriate request headers."""
    if 'headers' not in store.kwargs:
        store.kwargs.update({'headers': {}})
    store.kwargs['headers'].update({'content-type': 'application/sparql-update'})


def set_store_header_read(store: Store):
    """Call this function before any `Graph.triples()` calls to set the appropriate request headers."""
    if 'headers' not in store.kwargs:
        store.kwargs.update({'headers': {}})
    store.kwargs['headers'].pop('content-type', None)
if __name__ == '__main__':
    # The code below works on RDFLib 5.0.0. It took a lot of effort to get the SPARQLUpdateStore to work.
    # I think the API of the SPARQLUpdateStore is not working as intended, and I suspect future versions of
    # RDFLib may change its API or fix the internal issues we've come across.
    # For sending SPARQL 1.1 Update queries, we need to set the content type of the request header
    # to 'application/sparql-update'. The downside is that this header makes read-only SPARQL 1.1 queries fail,
    # so for read-only queries we need to remove the content type from the header again.
    # See `set_store_header_read()` and `set_store_header_update()`.
    # postAsEncoded doesn't do anything! We pass False anyway so the code keeps working with future versions of
    # RDFLib once it is fixed. SPARQL 1.1 Update expects the content type of the request to be either
    # 'application/sparql-update' or 'application/x-www-form-urlencoded'.
    #
    # We are using Digest auth. Remember, additional kwargs are passed on to the Requests library.
    store = SPARQLUpdateStore(queryEndpoint=config.SPARQL_ENDPOINT, update_endpoint=config.SPARQL_UPDATE_ENDPOINT,
                              auth=HTTPDigestAuth(config.AUTH_USER, config.AUTH_PASS), context_aware=True,
                              postAsEncoded=False)
    # Default is 'GET'. We want to send 'POST' requests in this instance.
    store.method = 'POST'
    # Initially I tried to use the Dataset class, but it does not work with the SPARQLUpdateStore: a couple of
    # things get overridden when it is created, e.g. g.store.query_endpoint gets set to None and g.open(None)
    # needs to be called manually. Instead, create an instance of `Graph` and pass it the context via the
    # `identifier` param.
    g = Graph(store=store, identifier=config.DEFAULT_GRAPH_URI)

    # Insert some triples.
    set_store_header_update(store)
    g.add((URIRef('http://example.com/test888'), RDF.type, OWL.Ontology))

    # Read some triples.
    set_store_header_read(store)
    for triple in g.triples((None, None, None)):
        print(triple)
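Having to remember which header helper to call before each operation is easy to get wrong. One way to make it less error-prone, shown here only as a sketch built on the same two helpers from app.py, is a small context manager that switches the store into update mode and back again:

from contextlib import contextmanager


@contextmanager
def update_mode(store: Store):
    """Set the SPARQL Update content-type header on entry and remove it again on exit."""
    set_store_header_update(store)
    try:
        yield store
    finally:
        set_store_header_read(store)


# Usage (the example URI is arbitrary):
# with update_mode(store):
#     g.add((URIRef('http://example.com/test999'), RDF.type, OWL.Ontology))
# for triple in g.triples((None, None, None)):
#     print(triple)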