Details

Kingsley Uyi Idehen
Lexington, United States

New Preconfigured Virtuoso AMI for Amazon EC2 Cloud comprised of Linked Data from BBC & DBpedia

What?

Introducing a new preloaded and preconfigured Virtuoso (Cluster Edition) AMI for the Amazon EC2 Cloud that hosts combined Linked Datasets from the BBC and DBpedia.

Why?

Predictably instantiate a powerful database with high-quality data and cross-links within minutes, for personal or service-specific use.

How?

Simply follow the instructions in our Amazon EC2 guide for the BBC + DBpedia 3.6 Linked Dataset.

Your installation steps are as follows:

  1. Instantiate a Virtuoso EC2 AMI
  2. Mount the Amazon Elastic Block Storage (EBS) snapshot that hosts the preloaded Virtuoso Database.

02/18/2011 20:20 GMT-0500 Modified: 03/29/2011 09:52 GMT-0500
DBpedia + BBC (combined) Linked Data Space Installation Guide

What?

The DBpedia + BBC Combo Linked Dataset is a preconfigured Virtuoso Cluster, available via the Amazon EC2 Cloud and preloaded with the DBpedia and BBC Linked Datasets. The cluster consists of 4 Virtuoso Cluster Nodes, each comprised of one Virtuoso Instance. Initial deployment is to a single Cluster Host, but the license may be converted for physically distributed deployment.

Why?

The BBC has been publishing Linked Data from its Web Data Space for a number of years. In line with best practices for injecting Linked Data into the World Wide Web (Web), the BBC datasets are interlinked with other datasets such as DBpedia and MusicBrainz.

Typical follow-your-nose exploration using a Web Browser (or even via sophisticated SPARQL query crawls) isn't always practical once you get past the initial euphoria that comes from comprehending the Linked Data concept. As your queries get more complex, the overhead of remote sub-queries increases its impact, until query results take so long to return that you simply give up.

Thus, maximizing the effects of the BBC's efforts requires Linked Data that shares locality in a Web-accessible Data Space — i.e., where all Linked Data sets have been loaded into the same data store or warehouse. This holds true even when leveraging SPARQL-FED style virtualization — there's always a need to localize data as part of any marginally-decent locality-aware cost-optimization algorithm.

This DBpedia + BBC dataset, exposed via a preloaded and preconfigured Virtuoso Cluster, delivers a practical point of presence on the Web for immediate and cost-effective exploitation of Linked Data at the individual and/or service specific levels.

How?

To work through this guide, you'll need to start with 90 GB of free disk space. (Only 41 GB will be consumed after you delete the installer archives, but starting with 90+ GB ensures enough work space for the installation.)
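
The free-space requirement can be checked before starting. The following is a minimal sketch, assuming Python 3; the helper names (has_enough_space, check_path) and the "." path are illustrative assumptions, not part of the Virtuoso tooling.

```python
import shutil

REQUIRED_GB = 90  # free space recommended by this guide before installation


def has_enough_space(free_bytes, required_gb=REQUIRED_GB):
    """Return True when free_bytes covers required_gb (1 GB taken as 10**9 bytes)."""
    return free_bytes >= required_gb * 10**9


def check_path(path="."):
    """Report whether the filesystem holding `path` has the recommended free space."""
    free = shutil.disk_usage(path).free
    print(f"{path}: {free / 10**9:.1f} GB free, {REQUIRED_GB} GB recommended")
    return has_enough_space(free)


if __name__ == "__main__":
    # "." is an assumption; point this at your intended installation directory.
    check_path(".")
```

Run it from the directory (or pass the path) where the cluster database files will live, since that is the filesystem that must hold the installer archives and the database.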

Install Virtuoso

  1. Download Virtuoso installer archive(s). You must deploy the Personal or Enterprise Edition; the Open Source Edition does not support Shared-Nothing Cluster Deployment.

  2. Obtain a Virtuoso Cluster license.

  3. Install Virtuoso.

  4. Set key environment variables and start the OpenLink License Manager, using command (this may vary depending on your shell and install directory):

    . /opt/virtuoso/virtuoso-enterprise.sh
  5. Optional: To keep the default single-server configuration file and demo database intact, set the VIRTUOSO_HOME environment variable to a different directory, e.g.,

    export VIRTUOSO_HOME=/opt/virtuoso/cluster-home/

    Note: You will have to adjust this setting every time you shift between this cluster setup and your single-server setup. Either may be made your environment's default through the virtuoso-enterprise.sh and related scripts.

  6. Set up your cluster by running the mkcluster.sh script. Note that initial deployment of the DBpedia + BBC Combo requires a 4 node cluster, which is the default for this script.

  7. Start the Virtuoso Cluster with this command:

    virtuoso-start.sh
  8. Stop the Virtuoso Cluster with this command:

    virtuoso-stop.sh

Using the DBpedia + BBC Combo dataset

  1. Navigate to your installation directory.

  2. Download the combo dataset installer script — bbc-dbpedia-install.sh.

  3. For best results, make the downloaded script executable using this command:

    chmod 755 bbc-dbpedia-install.sh
  4. Shut down any Virtuoso instances that may be currently running.

  5. Optional: As above, if you have decided to keep the default single-server configuration file and demo database intact, set the VIRTUOSO_HOME environment variable appropriately, e.g.,

    export VIRTUOSO_HOME=/opt/virtuoso/cluster-home/
  6. Run the combo dataset installer script with this command:

    sh bbc-dbpedia-install.sh

Verify installation

The combo dataset typically deploys to EC2 virtual machines in under 90 minutes; your time will vary depending on your network connection speed, machine speed, and other variables.

Once the script completes, perform the following steps:

  1. Verify that the Virtuoso Conductor (HTTP-based Admin UI) is in place via:

    http://localhost:[port]/conductor
  2. Verify that the Virtuoso SPARQL endpoint is in place via:

    http://localhost:[port]/sparql
  3. Verify that the Precision Search & Find UI is in place via:

    http://localhost:[port]/fct
  4. Verify that the Virtuoso hosted PivotViewer is in place via:

    http://localhost:[port]/PivotViewer
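
The four checks above can also be scripted. The sketch below assumes Python 3 and uses port 8890 only as an example (it is the port in the typical local SPARQL endpoint address); substitute the port your cluster actually listens on. The helper names endpoint_urls and is_reachable are hypothetical.

```python
import urllib.error
import urllib.request

# Paths listed in the verification steps above.
PATHS = ["/conductor", "/sparql", "/fct", "/PivotViewer"]


def endpoint_urls(port, host="localhost"):
    """Build the verification URLs for a given HTTP port."""
    return [f"http://{host}:{port}{path}" for path in PATHS]


def is_reachable(url, timeout=5):
    """Return True if the URL can be fetched without an HTTP or network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and 4xx/5xx responses.
        return False


if __name__ == "__main__":
    # 8890 is an assumption; use your own port.
    for url in endpoint_urls(8890):
        print(("OK   " if is_reachable(url) else "FAIL ") + url)
```

A FAIL line only tells you the page did not answer; it does not distinguish a stopped cluster from a wrong port.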

02/17/2011 17:15 GMT-0500 Modified: 03/29/2011 10:09 GMT-0500
SPARQL Guide for the Perl Developer

What?

A simple guide usable by any Perl developer seeking to exploit SPARQL without hassles.

Why?

SPARQL is a powerful query language, results serialization format, and an HTTP based data access protocol from the W3C. It provides a mechanism for accessing and integrating data across Deductive Database Systems (colloquially referred to as triple or quad stores in Semantic Web and Linked Data circles) -- database systems (or data spaces) that manage proposition oriented records in 3-tuple (triples) or 4-tuple (quads) form.

How?

SPARQL queries are actually HTTP payloads (typically). Thus, using a RESTful client-server interaction pattern, you can dispatch calls to a SPARQL compliant data server and receive a payload for local processing.

Steps:

  1. Determine which SPARQL endpoint you want to access e.g. DBpedia or a local Virtuoso instance (typically: http://localhost:8890/sparql).
  2. If using Virtuoso, and you want to populate its quad store using SPARQL, assign "SPARQL_SPONGE" privileges to user "SPARQL" (this is basic control, more sophisticated WebID based ACLs are available for controlling SPARQL access).

Script:

#
# Demonstrating use of a single query to populate a
# Virtuoso Quad Store via Perl.
#

#
# The HTTP URL is constructed with CSV as the default query results format, selected via MIME type.
#

use strict;
use warnings;

use CGI qw/:standard/;
use LWP::UserAgent;
use Data::Dumper;
use Text::CSV_XS;

sub sparqlQuery {
	my ($query, $baseURL, $format) = @_;
	$format = "text/csv" unless defined $format;

	my %params=(
		"default-graph" => "", "should-sponge" => "soft", "query" => $query,
		"debug" => "on", "timeout" => "", "format" => $format,
		"save" => "display", "fname" => ""
	);

	# URL-encode each parameter and join the pairs into a query string.
	my @fragments=();
	foreach my $k (keys %params) {
		push(@fragments, "$k=".CGI::escape($params{$k}));
	}
	my $sparqlURL="${baseURL}?".join("&", @fragments);

	my $ua = LWP::UserAgent->new;
	$ua->agent("MyApp/0.1 ");
	my $req = HTTP::Request->new(GET => $sparqlURL);
	my $res = $ua->request($req);
	die "SPARQL request failed: ".$res->status_line."\n" unless $res->is_success;
	my $str=$res->content;

	# Parse the CSV payload one line at a time into an array of rows.
	my $csv = Text::CSV_XS->new({ binary => 1 });
	my @rows;
	foreach my $line ( split(/^/, $str) ) {
		$line =~ s/\r?\n\z//;
		$csv->parse($line) or next;
		push(@rows, [ $csv->fields() ]);
	}
	return \@rows;
}


# Setting Data Source Name (DSN)

my $dsn="http://dbpedia.org/resource/DBpedia";

# The Virtuoso pragma instructs the SPARQL engine to perform an HTTP GET using the IRI in
# the FROM clause as the Data Source URL en route to DBMS record inserts.
#
# Note: without the pragma, the generic (non Virtuoso specific) SPARQL
# that follows it will not add records to the DBMS.

my $query="DEFINE get:soft \"replace\"\nSELECT DISTINCT * FROM <$dsn> WHERE {?s ?p ?o}";

my $data=sparqlQuery($query, "http://localhost:8890/sparql/", "text/csv");

print "Retrieved data:\n";
print Dumper($data);

Output

Retrieved data:
$VAR1 = [
          [
            's',
            'p',
            'o'
          ],
          [
            'http://dbpedia.org/resource/DBpedia',
            'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
            'http://www.w3.org/2002/07/owl#Thing'
          ],
          [
            'http://dbpedia.org/resource/DBpedia',
            'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
            'http://dbpedia.org/ontology/Work'
          ],
          [
            'http://dbpedia.org/resource/DBpedia',
            'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
            'http://dbpedia.org/class/yago/Software106566077'
          ],
...

Conclusion

CSV was chosen over XML as the output format since this is a "no-brainer installation and utilization" guide for a Perl developer who already knows how to use Perl for HTTP-based data access. SPARQL just provides an added bonus to URL dexterity (delivered via URI abstraction) with regard to constructing Data Source Names or Addresses.

01/25/2011 11:05 GMT-0500 Modified: 01/26/2011 18:11 GMT-0500
Virtuoso + DBpedia 3.6 Installation Guide (Update 1)

What is DBpedia?

DBpedia is a community effort to provide a contemporary deductive database derived from Wikipedia content. Project contributions can be partitioned as follows:

  1. Ontology Construction and Maintenance
  2. Dataset Generation via Wikipedia Content Extraction & Transformation
  3. Live Database Maintenance & Administration -- includes actual Linked Data loading and publishing, provision of SPARQL endpoint, and traditional DBA activity
  4. Internationalization.

Why is DBpedia important?

Comprising the nucleus of the Linked Open Data effort, DBpedia also serves as a fulcrum for the burgeoning Web of Linked Data by delivering a dense and highly-interlinked lookup database. In its most basic form, DBpedia is a great source of strong and resolvable identifiers for People, Places, Organizations, Subject Matter, and many other data items of interest. Naturally, it provides a fantastic starting point for comprehending the fundamental concepts underlying TimBL's initial Linked Data meme.

How do I use DBpedia?

Depending on your particular requirements, whether personal or service-specific, DBpedia offers the following:

  • Datasets that can be loaded on your deductive database (also known as triple or quad stores) platform of choice
  • Live browsable HTML+RDFa based entity description pages
  • A wide variety of data formats for importing entity description data into a broad range of existing applications and services
  • A SPARQL endpoint allowing ad-hoc querying over HTTP using the SPARQL query language, and delivering results serialized in a variety of formats
  • A broad variety of tools covering query by example, faceted browsing, full text search, entity name lookups, etc.
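
As a rough illustration of ad-hoc querying over HTTP, the sketch below (Python 3) sends a query using the standard SPARQL Protocol `query` parameter and requests JSON results through the Accept header. The endpoint address http://dbpedia.org/sparql and the helper names build_request and run_query are assumptions made for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumed address of the public DBpedia SPARQL endpoint.
ENDPOINT = "http://dbpedia.org/sparql"


def build_request(query, endpoint=ENDPOINT, accept="application/sparql-results+json"):
    """Build a GET request carrying the query, with the result format set via Accept."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(url, headers={"Accept": accept})


def run_query(query, endpoint=ENDPOINT):
    """Send the query and decode the JSON result document."""
    with urllib.request.urlopen(build_request(query, endpoint), timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    sample = ("SELECT DISTINCT ?type WHERE "
              "{ <http://dbpedia.org/resource/DBpedia> a ?type } LIMIT 5")
    try:
        for binding in run_query(sample)["results"]["bindings"]:
            print(binding["type"]["value"])
    except (OSError, ValueError) as exc:
        # Network errors, HTTP errors, or a non-JSON response.
        print("query failed:", exc)
```

Passing a different `endpoint` argument points the same code at a personal or service-specific instance instead of the live one.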

What is the DBpedia 3.6 + Virtuoso Cluster Edition Combo?

OpenLink Software has preloaded the DBpedia 3.6 datasets into a preconfigured Virtuoso Cluster Edition database, and made the package available for easy installation.

Why is the DBpedia+Virtuoso package important?

The DBpedia+Virtuoso package provides a cost-effective option for personal or service-specific incarnations of DBpedia.

For instance, you may have a service that isn't best-served by competing with the rest of the world for ad-hoc query time and resources on the live instance, which itself operates under various restrictions which enable this ad-hoc query service to be provided at Web Scale.

Now you can easily commission your own instance and quickly exploit DBpedia and Virtuoso's database feature set to the max, powered by your own hardware and network infrastructure.

How do I use the DBpedia+Virtuoso package?

Pre-requisites are simply:

  1. Functional Virtuoso Cluster Edition installation.
  2. Virtuoso Cluster Edition License.
  3. 90 GB of free disk space -- you ultimately need only 43 GB, but this is our recommended amount of free disk space before the installation completes.

To install the Virtuoso Cluster Edition simply perform the following steps:

  1. Download Software.
  2. Run installer
  3. Set key environment variables and start the OpenLink License Manager, using command (this may vary depending on your shell):

    . /opt/virtuoso/virtuoso-enterprise.sh
  4. Run the mkcluster.sh script which defaults to a 4 node cluster
  5. Optional: Set the VIRTUOSO_HOME environment variable if you want to keep cluster databases separate from single-server databases, using a distinct root directory for the database files (one that isn't adjacent to the single-server database directories)
  6. Start Virtuoso Cluster Edition instances using command:
    virtuoso-start.sh
  7. Stop Virtuoso Cluster Edition instances using command:
    virtuoso-stop.sh

To install your personal or service specific edition of DBpedia simply perform the following steps:

  1. Navigate to your installation directory
  2. Download Installer script (dbpedia-install.sh)
  3. Set execution mode on script using command:
    chmod 755 dbpedia-install.sh
  4. Shut down any Virtuoso instances that may be currently running
  5. Set your VIRTUOSO_HOME environment variable, e.g., to the current directory, via command (this may vary depending on your shell):
    export VIRTUOSO_HOME=`pwd`
  6. Run script using command:
    sh dbpedia-install.sh

Once the installation completes (approximately 1 hour and 30 minutes from start time), perform the following steps:

  1. Verify that the Virtuoso Conductor (HTML based Admin UI) is in place via:
    http://localhost:[port]/conductor
  2. Verify that the Precision Search & Find UI is in place via:
    http://localhost:[port]/fct
  3. Verify that DBpedia's Green Entity Description Pages are in place via:
    http://localhost:[port]/resource/DBpedia

01/24/2011 20:08 GMT-0500 Modified: 01/25/2011 14:46 GMT-0500
SPARQL Guide for the Javascript Developer

What?

A simple guide usable by any Javascript developer seeking to exploit SPARQL without hassles.

Why?

SPARQL is a powerful query language, results serialization format, and an HTTP based data access protocol from the W3C. It provides a mechanism for accessing and integrating data across Deductive Database Systems (colloquially referred to as triple or quad stores in Semantic Web and Linked Data circles) -- database systems (or data spaces) that manage proposition oriented records in 3-tuple (triples) or 4-tuple (quads) form.

How?

SPARQL queries are actually HTTP payloads (typically). Thus, using a RESTful client-server interaction pattern, you can dispatch calls to a SPARQL compliant data server and receive a payload for local processing.

Steps:

  1. Determine which SPARQL endpoint you want to access e.g. DBpedia or a local Virtuoso instance (typically: http://localhost:8890/sparql).
  2. If using Virtuoso, and you want to populate its quad store using SPARQL, assign "SPARQL_SPONGE" privileges to user "SPARQL" (this is basic control, more sophisticated WebID based ACLs are available for controlling SPARQL access).

Script:

/*
Demonstrating use of a single query to populate a Virtuoso Quad Store via Javascript.
*/

/*
The HTTP URL is constructed with JSON as the default query results format, selected via MIME type.
*/

function sparqlQuery(query, baseURL, format) {
	if(!format)
		format="application/json";
	var params={
		"default-graph": "", "should-sponge": "soft", "query": query,
		"debug": "on", "timeout": "", "format": format,
		"save": "display", "fname": ""
	};

	// URL-encode each parameter and join the pairs with "&".
	var pairs=[];
	for(var k in params) {
		pairs.push(k+"="+encodeURIComponent(params[k]));
	}
	var queryURL=baseURL + '?' + pairs.join("&");

	// Fall back to ActiveX on legacy Internet Explorer.
	var xmlhttp;
	if (window.XMLHttpRequest) {
		xmlhttp=new XMLHttpRequest();
	}
	else {
		xmlhttp=new ActiveXObject("Microsoft.XMLHTTP");
	}
	// Synchronous request (third argument false): blocks until the response arrives.
	xmlhttp.open("GET",queryURL,false);
	xmlhttp.send();
	return JSON.parse(xmlhttp.responseText);
}

/*
Setting Data Source Name (DSN)
*/

var dsn="http://dbpedia.org/resource/DBpedia";

/*
The Virtuoso pragma DEFINE get:soft "replace" instructs the Virtuoso SPARQL engine to perform an HTTP GET
using the IRI in the FROM clause as the Data Source URL, with regard to DBMS record inserts.
*/

var query="DEFINE get:soft \"replace\"\nSELECT DISTINCT * FROM <"+dsn+"> WHERE {?s ?p ?o}";
var data=sparqlQuery(query, "/sparql/");

Output

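Place the snippet above into the <script> section of an HTML document served by your Virtuoso instance (the relative "/sparql/" URL resolves against the page's host); the parsed query result is then held in the data variable.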

Conclusion

JSON was chosen over XML as the output format since this is a "no-brainer installation and utilization" guide for a Javascript developer who already knows how to use Javascript for HTTP-based data access within HTML. SPARQL just provides an added bonus to URL dexterity (delivered via URI abstraction) with regard to constructing Data Source Names or Addresses.

01/21/2011 14:59 GMT-0500 Modified: 01/26/2011 18:10 GMT-0500
SPARQL Guide for the PHP Developer

What?

A simple guide usable by any PHP developer seeking to exploit SPARQL without hassles.

Why?

SPARQL is a powerful query language, results serialization format, and an HTTP based data access protocol from the W3C. It provides a mechanism for accessing and integrating data across Deductive Database Systems (colloquially referred to as triple or quad stores in Semantic Web and Linked Data circles) -- database systems (or data spaces) that manage proposition oriented records in 3-tuple (triples) or 4-tuple (quads) form.

How?

SPARQL queries are actually HTTP payloads (typically). Thus, using a RESTful client-server interaction pattern, you can dispatch calls to a SPARQL compliant data server and receive a payload for local processing e.g. local object binding re. PHP.

Steps:

  1. From your command line execute: php --version, to verify PHP is in place
  2. Determine which SPARQL endpoint you want to access e.g. DBpedia or a local Virtuoso instance (typically: http://localhost:8890/sparql).
  3. If using Virtuoso, and you want to populate its quad store using SPARQL, assign "SPARQL_SPONGE" privileges to user "SPARQL" (this is basic control, more sophisticated WebID based ACLs are available for controlling SPARQL access).

Script:

#!/usr/bin/env php
<?php
#
# Demonstrating use of a single query to populate a Virtuoso Quad Store via PHP.
#

# The HTTP URL is constructed with the JSON query results format in mind.

function sparqlQuery($query, $baseURL, $format="application/json")
{
	$params=array(
		"default-graph" =>  "",
		"should-sponge" =>  "soft",
		"query" =>  $query,
		"debug" =>  "on",
		"timeout" =>  "",
		"format" =>  $format,
		"save" =>  "display",
		"fname" =>  ""
	);

	# http_build_query() URL-encodes each value and joins the pairs with "&".
	$sparqlURL=$baseURL . "?" . http_build_query($params);

	$response=file_get_contents($sparqlURL);
	if ($response === false) {
		die("SPARQL request failed\n");
	}
	return json_decode($response);
}



# Setting Data Source Name (DSN)
$dsn="http://dbpedia.org/resource/DBpedia";

# Virtuoso pragma instructing the SPARQL engine to perform an HTTP GET
# using the IRI in the FROM clause as the Data Source URL

$query="DEFINE get:soft \"replace\"
SELECT DISTINCT * FROM <$dsn> WHERE {?s ?p ?o}";

$data=sparqlQuery($query, "http://localhost:8890/sparql/");

print "Retrieved data:\n" . json_encode($data);

?>

Output

Retrieved data:
  {"head":
  {"link":[],"vars":["s","p","o"]},
  "results":
		{"distinct":false,"ordered":true,
		"bindings":[
			{"s":
			{"type":"uri","value":"http:\/\/dbpedia.org\/resource\/DBpedia"},"p":
			{"type":"uri","value":"http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type"},"o":
			{"type":"uri","value":"http:\/\/www.w3.org\/2002\/07\/owl#Thing"}},
			{"s":
			{"type":"uri","value":"http:\/\/dbpedia.org\/resource\/DBpedia"},"p":
			{"type":"uri","value":"http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type"},"o":
			{"type":"uri","value":"http:\/\/dbpedia.org\/ontology\/Work"}},
			{"s":
			{"type":"uri","value":"http:\/\/dbpedia.org\/resource\/DBpedia"},"p":
			{"type":"uri","value":"http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type"},"o":
			{"type":"uri","value":"http:\/\/dbpedia.org\/class\/yago\/Software106566077"}},
...

Conclusion

JSON was chosen over XML as the output format since this is a "no-brainer installation and utilization" guide for a PHP developer who already knows how to use PHP for HTTP-based data access. SPARQL just provides an added bonus to URL dexterity (delivered via URI abstraction) with regard to constructing Data Source Names or Addresses.

01/20/2011 16:25 GMT-0500 Modified: 01/25/2011 10:36 GMT-0500
SPARQL Guide for the Python Developer

What?

A simple guide usable by any Python developer seeking to exploit SPARQL without hassles.

Why?

SPARQL is a powerful query language, results serialization format, and an HTTP based data access protocol from the W3C. It provides a mechanism for accessing and integrating data across Deductive Database Systems (colloquially referred to as triple or quad stores in Semantic Web and Linked Data circles) -- database systems (or data spaces) that manage proposition oriented records in 3-tuple (triples) or 4-tuple (quads) form.

How?

SPARQL queries are actually HTTP payloads (typically). Thus, using a RESTful client-server interaction pattern, you can dispatch calls to a SPARQL compliant data server and receive a payload for local processing e.g. local object binding re. Python.

Steps:

  1. From your command line execute: python --version, to verify Python is in place
  2. Determine which SPARQL endpoint you want to access e.g. DBpedia or a local Virtuoso instance (typically: http://localhost:8890/sparql).
  3. If using Virtuoso, and you want to populate its quad store using SPARQL, assign "SPARQL_SPONGE" privileges to user "SPARQL" (this is basic control, more sophisticated WebID based ACLs are available for controlling SPARQL access).

Script:

#!/usr/bin/env python3
#
# Demonstrating use of a single query to populate a Virtuoso Quad Store via Python.
#

import json
import urllib.parse
import urllib.request

# The HTTP request is constructed with the JSON query results format in mind.

def sparqlQuery(query, baseURL, format="application/json"):
	params={
		"default-graph": "",
		"should-sponge": "soft",
		"query": query,
		"debug": "on",
		"timeout": "",
		"format": format,
		"save": "display",
		"fname": ""
	}
	# Passing a data payload makes urlopen issue an HTTP POST with the encoded parameters.
	querypart=urllib.parse.urlencode(params).encode("utf-8")
	response = urllib.request.urlopen(baseURL,querypart).read()
	return json.loads(response)

# Setting Data Source Name (DSN)
dsn="http://dbpedia.org/resource/DBpedia"

# Virtuoso pragma instructing the SPARQL engine to perform an HTTP GET
# using the IRI in the FROM clause as the Data Source URL

query="""DEFINE get:soft "replace"
SELECT DISTINCT * FROM <%s> WHERE {?s ?p ?o}""" % dsn

data=sparqlQuery(query, "http://localhost:8890/sparql/")

print("Retrieved data:\n" + json.dumps(data, sort_keys=True, indent=4))

#
# End

Output

Retrieved data:
{
    "head": {
        "link": [], 
        "vars": [
            "s", 
            "p", 
            "o"
        ]
    }, 
    "results": {
        "bindings": [
            {
                "o": {
                    "type": "uri", 
                    "value": "http://www.w3.org/2002/07/owl#Thing"
                }, 
                "p": {
                    "type": "uri", 
                    "value": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
                }, 
                "s": {
                    "type": "uri", 
                    "value": "http://dbpedia.org/resource/DBpedia"
                }
            }, 
...
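
The result document shown above lists the variable names under head.vars and holds one object per row under results.bindings. For local processing it is often handier to flatten this into plain rows. The following is a minimal sketch of one way to do so; the helper name bindings_to_rows is hypothetical.

```python
def bindings_to_rows(data):
    """Flatten a SPARQL JSON result into (variables, rows) with plain string values."""
    variables = data["head"]["vars"]
    rows = []
    for binding in data["results"]["bindings"]:
        # A variable may be unbound in a given row; use None in that case.
        rows.append(tuple(
            binding[name]["value"] if name in binding else None
            for name in variables
        ))
    return variables, rows


if __name__ == "__main__":
    # Sample trimmed from the output shown above.
    sample = {
        "head": {"link": [], "vars": ["s", "p", "o"]},
        "results": {"bindings": [{
            "s": {"type": "uri", "value": "http://dbpedia.org/resource/DBpedia"},
            "p": {"type": "uri", "value": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"},
            "o": {"type": "uri", "value": "http://www.w3.org/2002/07/owl#Thing"},
        }]},
    }
    variables, rows = bindings_to_rows(sample)
    print(variables)
    for row in rows:
        print(row)
```

The rows come back in the column order given by head.vars, so they line up with the s, p, o layout regardless of how the JSON keys were sorted when printed.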

Conclusion

JSON was chosen over XML as the output format since this is a "no-brainer installation and utilization" guide for a Python developer who already knows how to use Python for HTTP-based data access. SPARQL just provides an added bonus to URL dexterity (delivered via URI abstraction) with regard to constructing Data Source Names or Addresses.

01/19/2011 12:13 GMT-0500 Modified: 01/25/2011 10:35 GMT-0500
Powered by OpenLink Virtuoso Universal Server
Running on Linux platform
The posts on this weblog are my personal views, and not those of OpenLink Software.