A declarative query language from the W3C for querying
structured propositional data (in the form of
3-tuple [triples] or 4-tuple [quads] records)
stored in a deductive database (colloquially referred
to as triple or quad stores in Semantic Web and Linked Data parlance).
SPARQL is inherently platform independent. Like SQL, the query language and the backend
database engine are distinct. Database clients capture SPARQL
queries which are then passed on to compliant backend
databases.
Why is it important?
Like SQL for relational databases, it provides a powerful
mechanism for accessing and joining data across one or more data
partitions (named graphs identified by IRIs). This capability
also enables the construction of sophisticated Views,
Reports (HTML or those produced in native form by desktop
productivity tools), and data streams for other services.
Unlike SQL, SPARQL includes result serialization formats and an
HTTP-based wire protocol. Thus, the ubiquity and sophistication of
HTTP are integral to SPARQL; i.e., client-side applications (user
agents) only need to be able to perform an HTTP GET against a
URL to exploit the power of SPARQL.
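As a minimal sketch of what such an HTTP GET looks like, the Python below builds a request URL that carries the query text in the "query" parameter defined by the SPARQL Protocol. The helper name sparql_get_url is made up for illustration, and the endpoint address assumes a local Virtuoso instance like the one set up later in this guide.

```python
from urllib.parse import urlencode

def sparql_get_url(endpoint, query):
    """Return a GET URL that passes the query in the 'query' parameter."""
    # Reuse an existing query string on the endpoint URL if there is one.
    separator = "&" if "?" in endpoint else "?"
    return endpoint + separator + urlencode({"query": query})

# Assumed endpoint address; substitute your own host and port.
url = sparql_get_url("http://localhost:8890/sparql",
                     "SELECT DISTINCT * WHERE {?s ?p ?o} LIMIT 50")
print(url)
```

Any HTTP client (a browser, curl, or urllib.request.urlopen) can then fetch that URL.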
How do I use it, generally?
- Locate a SPARQL endpoint (DBpedia, LOD Cloud
Cache, Data.Gov, URIBurner, others), or;
- Install a SPARQL compliant database server (quad or triple
store) on your desktop, workgroup server, data center, or cloud
(e.g., Amazon EC2 AMI)
- Start the database server
- Execute SPARQL Queries via the SPARQL
endpoint.
How do I use SPARQL with Virtuoso?
What follows is a very simple guide for using SPARQL against
your own instance of Virtuoso:
- Software Download and Installation
- Data Loading from Data Sources exposed at Network Addresses
(e.g. HTTP URLs) using very simple methods
- Actual SPARQL query execution via SPARQL endpoint.
Installation Steps
- Download Virtuoso Open Source or Virtuoso Commercial Editions
- Run installer (if using Commercial Edition or Windows Open
Source Edition, otherwise follow build guide)
- Follow post-installation guide and verify installation by
typing in the command: virtuoso -? (if this fails check you've
followed installation and setup steps, then verify environment
variables have been set)
- Start the Virtuoso server using the command:
virtuoso-start.sh
- Verify you have a connection to the Virtuoso Server via the
command: isql localhost (assuming you're using default DB settings)
or the command: isql localhost:1112 (assuming demo database), or
go to your browser and type in:
http://<virtuoso-server-host-name>:[port]/conductor (e.g.
http://localhost:8889/conductor for default DB or
http://localhost:8890/conductor if using Demo DB)
- Go to the SPARQL endpoint, which is typically --
http://<virtuoso-server-host-name>:[port]/sparql
- Run a quick sample query (since the database always has system
data in place): select distinct * where {?s ?p ?o} limit 50 .
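If you run that sample query from a program rather than the browser form, the endpoint can return the standard SPARQL JSON results format (media type application/sparql-results+json). The sketch below shows one way to flatten such a document into rows. The function name and the sample response are made up for illustration; a real response from your server will contain different values.

```python
import json

def rows_from_sparql_json(document):
    """Flatten a SPARQL JSON results document into {variable: value} dicts."""
    data = json.loads(document)
    variables = data["head"]["vars"]
    rows = []
    for binding in data["results"]["bindings"]:
        # A variable can be unbound in a given solution, so skip missing keys.
        rows.append({name: binding[name]["value"]
                     for name in variables if name in binding})
    return rows

# Made-up sample response, shaped like the result of the sample query above.
SAMPLE = """
{
  "head": {"vars": ["s", "p", "o"]},
  "results": {"bindings": [
    {"s": {"type": "uri", "value": "http://example.org/a"},
     "p": {"type": "uri", "value": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"},
     "o": {"type": "uri", "value": "http://example.org/Thing"}}
  ]}
}
"""

for row in rows_from_sparql_json(SAMPLE):
    print(row["s"], row["p"], row["o"])
```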
Troubleshooting
- Ensure environment settings are set and functional -- if using
Mac OS X or Windows, you don't have to worry about this; just
start and stop your Virtuoso server using native OS services
applets
- If using the Open Source Edition, follow the getting started guide -- it covers PATH
and startup directory location re. starting and stopping Virtuoso
servers.
- Sponging (HTTP GETs against external Data Sources) within
SPARQL queries is disabled by default. You can enable this feature
by assigning "SPARQL_SPONGE" privileges to user
"SPARQL". Note, more sophisticated security exists via WebID based ACLs.
Data Loading Steps
- Identify an RDF based structured data source of interest -- a
file that contains 3-tuple / triples available at an address on a
public or private HTTP based network
- Determine the Address (URL) of the RDF data source
- Go to your Virtuoso SPARQL endpoint and type in the following
SPARQL query: DEFINE GET:SOFT "replace" SELECT DISTINCT * FROM
<RDFDataSourceURL> WHERE {?s ?p ?o}
- All the triples in the RDF resource (data source accessed via
URL) will be loaded into the Virtuoso Quad Store (using RDF Data
Source URL as the internal quad store Named Graph IRI) as part of
the SPARQL query processing pipeline.
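The query text in the steps above can also be assembled programmatically. The following is a minimal sketch, assuming a hypothetical helper named build_load_query and an example.org placeholder URL; it only produces the query string, which you would then submit to your Virtuoso SPARQL endpoint.

```python
def build_load_query(source_url):
    """Return the load query from the steps above for one data source URL."""
    # Characters that SPARQL does not allow inside <...> IRI references.
    forbidden = set('<>"{}|^`\\')
    if any(ch in forbidden or ord(ch) <= 0x20 for ch in source_url):
        raise ValueError("URL contains characters not allowed in an IRI reference")
    return ('DEFINE GET:SOFT "replace" '
            f'SELECT DISTINCT * FROM <{source_url}> WHERE {{?s ?p ?o}}')

# Placeholder address; use the URL of your own RDF data source.
print(build_load_query("http://example.org/data.rdf"))
```

Rejecting characters that cannot appear in an IRI reference keeps a malformed URL from altering the structure of the query.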
Note: the data source URL doesn't even have to be RDF based --
which is where the Virtuoso Sponger Middleware comes into play
(download and install the VAD installer package first) since it
delivers the following features to Virtuoso's SPARQL engine:
- Transformation of data from non RDF data sources (file content,
hypermedia resources, web services
output etc.) into RDF based 3-tuples (triples)
- Cache Invalidation Scheme Construction -- thus, in subsequent
queries the define get:soft "replace" pragma will not be
required, except when you want to forcefully override the cache.
- If you have very large data sources, such as DBpedia or others
from CKAN, simply use our bulk loader.
SPARQL Endpoint Discovery
Public SPARQL endpoints are emerging at an ever increasing rate.
Thus, we've set up a DNS lookup service that provides access to a
large number of SPARQL endpoints. Of course, this doesn't cover all
existing endpoints, so if your endpoint is missing please ping
me.
Here is a collection of commands for using DNS-SD to discover
SPARQL endpoints:
- dns-sd -B _sparql._tcp sparql.openlinksw.com -- browse for
service instances
- dns-sd -Z _sparql._tcp sparql.openlinksw.com -- output results
in Zone File format
Related
- Using HTTP from Ruby -- you can simply construct SPARQL
Protocol URLs
- Using SPARQL Endpoints via Ruby -- Ruby
example using DBpedia endpoint
- Interactive SPARQL Query By Example (QBE)
tool -- provides a graphical user interface (as is common in
SQL realm re. query building against RDBMS engines) that works with any
SPARQL endpoint
- Other methods of loading RDF data into
Virtuoso
- Virtuoso Sponger -- architecture and how
it turns a wide variety of non RDF data sources into SPARQL
accessible data
- Using OpenLink Data Explorer (ODE) to
populate Virtuoso -- locate a resource of interest; click on a
bookmarklet or use context menus (if using ODE extensions for
Firefox, Safari, or Chrome); and you'll have SPARQL accessible data
automatically inserted into your Virtuoso instance.
- W3C's SPARQLing Data Access Ingenuity --
an older generic SPARQL introduction post
- Collection of SPARQL Query Examples --
GoodRelations (Product Offers), FOAF (Profiles), SIOC
(Data Spaces -- Blogs, Wikis, Bookmarks, Feed Collections, Photo Galleries, Briefcase/DropBox, AddressBook, Calendars, Discussion Forums)
- Collection of Live SPARQL Queries against LOD
Cloud Cache -- simple and advanced queries.