Kingsley Uyi Idehen
Lexington, United States

SPARQL Guide for the Perl Developer

What?

A simple guide usable by any Perl developer seeking to exploit SPARQL without hassles.

Why?

SPARQL is a powerful query language, results serialization format, and an HTTP based data access protocol from the W3C. It provides a mechanism for accessing and integrating data across Deductive Database Systems (colloquially referred to as triple or quad stores in Semantic Web and Linked Data circles) -- database systems (or data spaces) that manage proposition oriented records in 3-tuple (triples) or 4-tuple (quads) form.
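
As a rough illustration of the triple and quad record shapes mentioned above, here is a minimal Python 3 sketch. The names Triple and Quad are hypothetical, chosen only for this illustration; they do not come from any SPARQL library. The sample values are taken from the query output shown later in this post, and the graph name is assumed to be the IRI used in the FROM clause of the query.

```python
from collections import namedtuple

# Hypothetical record types, for illustration only.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])
Quad = namedtuple("Quad", ["graph", "subject", "predicate", "object"])

t = Triple(
    "http://dbpedia.org/resource/DBpedia",
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
    "http://www.w3.org/2002/07/owl#Thing",
)

# A quad is the same proposition plus the named graph that holds it
# (assumed here to be the IRI given in the FROM clause).
q = Quad("http://dbpedia.org/resource/DBpedia", *t)

print(len(t), len(q))
```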

How?

SPARQL queries are actually HTTP payloads (typically). Thus, using a RESTful client-server interaction pattern, you can dispatch calls to a SPARQL compliant data server and receive a payload for local processing.
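
To make the "query as HTTP payload" idea concrete, here is a minimal Python 3 sketch that only builds the request URL and sends nothing. The function name sparql_get_url is hypothetical. "query" is the parameter name defined by the SPARQL Protocol; "format" is the parameter that the script in this post passes to Virtuoso, so treat it as endpoint-specific.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def sparql_get_url(endpoint, query, result_format=None):
    """Build a GET URL that carries a SPARQL query in its query string."""
    params = {"query": query}  # parameter name defined by the SPARQL Protocol
    if result_format:
        # Endpoint-specific parameter, as passed to Virtuoso in this post.
        params["format"] = result_format
    return endpoint + "?" + urlencode(params)

url = sparql_get_url(
    "http://localhost:8890/sparql",
    "SELECT * WHERE {?s ?p ?o} LIMIT 5",
    "text/csv",
)
print(url)
```

Any HTTP client can then fetch the resulting URL; the response body is the result set in the requested format.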

Steps:

  1. Determine which SPARQL endpoint you want to access e.g. DBpedia or a local Virtuoso instance (typically: http://localhost:8890/sparql).
  2. If using Virtuoso, and you want to populate its quad store using SPARQL, assign "SPARQL_SPONGE" privileges to user "SPARQL" (this is basic control; more sophisticated WebID based ACLs are available for controlling SPARQL access).

Script:

#
# Demonstrating use of a single query to populate a 
# Virtuoso Quad Store via Perl. 
#

# 
# HTTP URL is constructed accordingly with CSV query results format as the default via mime type.
#

use strict;
use warnings;
use CGI qw/:standard/;
use LWP::UserAgent;
use Data::Dumper;
use Text::CSV_XS;

sub sparqlQuery {
	my ($query, $baseURL, $format) = @_;

	my %params = (
		"default-graph" => "", "should-sponge" => "soft", "query" => $query,
		"debug" => "on", "timeout" => "", "format" => $format,
		"save" => "display", "fname" => ""
	);

	# URL-encode each parameter and join the pairs into a query string.
	my @fragments;
	foreach my $k (keys %params) {
		push(@fragments, "$k=" . CGI::escape($params{$k}));
	}
	my $sparqlURL = "${baseURL}?" . join("&", @fragments);

	my $ua = LWP::UserAgent->new;
	$ua->agent("MyApp/0.1 ");
	my $res = $ua->get($sparqlURL);
	die "SPARQL request failed: " . $res->status_line . "\n" unless $res->is_success;

	# Parse the CSV payload line by line into an array of rows.
	my $csv = Text::CSV_XS->new({ binary => 1 });
	my @rows;
	foreach my $line (split(/^/, $res->content)) {
		next unless $csv->parse($line);
		push(@rows, [ $csv->fields() ]);
	}
	return \@rows;
}


# Setting Data Source Name (DSN)

my $dsn = "http://dbpedia.org/resource/DBpedia";

# Virtuoso pragma instructing the SPARQL engine to perform an HTTP GET using the IRI in
# the FROM clause as Data Source URL en route to DBMS record inserts.

my $pragma = "DEFINE get:soft \"replace\"\n";

# Generic (non Virtuoso specific) SPARQL.
# Note: on its own, this will not add records to the DBMS.

my $select = "SELECT DISTINCT * FROM <$dsn> WHERE {?s ?p ?o}";

my $data = sparqlQuery($pragma . $select, "http://localhost:8890/sparql/", "text/csv");

print "Retrieved data:\n";
print Dumper($data);
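
The difference between the Virtuoso-specific and the generic form of the query can be sketched as a small Python 3 helper. The function name build_query and its sponge flag are hypothetical; the pragma and the SELECT text are copied from the script above.

```python
def build_query(dsn, sponge=True):
    """Return the SELECT used above, optionally prefixed with the Virtuoso pragma."""
    select = "SELECT DISTINCT * FROM <%s> WHERE {?s ?p ?o}" % dsn
    if sponge:
        # Virtuoso-specific: ask the engine to fetch the FROM IRI over HTTP
        # en route to inserting records into the quad store.
        return 'DEFINE get:soft "replace"\n' + select
    # Generic SPARQL: query only, no records are added.
    return select

print(build_query("http://dbpedia.org/resource/DBpedia"))
```

Passing sponge=False yields a query that any SPARQL endpoint should accept, since it contains no vendor pragma.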

Output

Retrieved data:
$VAR1 = [
          [
            's',
            'p',
            'o'
          ],
          [
            'http://dbpedia.org/resource/DBpedia',
            'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
            'http://www.w3.org/2002/07/owl#Thing'
          ],
          [
            'http://dbpedia.org/resource/DBpedia',
            'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
            'http://dbpedia.org/ontology/Work'
          ],
          [
            'http://dbpedia.org/resource/DBpedia',
            'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
            'http://dbpedia.org/class/yago/Software106566077'
          ],
...
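
In the output above, the first row holds the variable names and every later row holds one solution. A minimal Python 3 sketch of turning such a CSV payload into keyed records follows. The function name rows_as_dicts is hypothetical, and the sample payload is an assumption modeled on the first data row above; it is not captured from a live endpoint.

```python
import csv
import io

# Sample payload shaped like the CSV result above (first data row only).
payload = (
    "s,p,o\n"
    "http://dbpedia.org/resource/DBpedia,"
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type,"
    "http://www.w3.org/2002/07/owl#Thing\n"
)

def rows_as_dicts(csv_text):
    """Use the header row (variable names) as keys for each result row."""
    return list(csv.DictReader(io.StringIO(csv_text)))

rows = rows_as_dicts(payload)
print(rows[0]["o"])
```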

Conclusion

CSV was chosen over XML (re. output format) since this is about a "no-brainer installation and utilization" guide for a Perl developer that already knows how to use Perl for HTTP based data access. SPARQL just provides an added bonus to URL dexterity (delivered via URI abstraction) with regards to constructing Data Source Names or Addresses.

01/25/2011 11:05 GMT-0500 Modified: 01/26/2011 18:11 GMT-0500
SPARQL Guide for the Javascript Developer

What?

A simple guide usable by any Javascript developer seeking to exploit SPARQL without hassles.

Why?

SPARQL is a powerful query language, results serialization format, and an HTTP based data access protocol from the W3C. It provides a mechanism for accessing and integrating data across Deductive Database Systems (colloquially referred to as triple or quad stores in Semantic Web and Linked Data circles) -- database systems (or data spaces) that manage proposition oriented records in 3-tuple (triples) or 4-tuple (quads) form.

How?

SPARQL queries are actually HTTP payloads (typically). Thus, using a RESTful client-server interaction pattern, you can dispatch calls to a SPARQL compliant data server and receive a payload for local processing.

Steps:

  1. Determine which SPARQL endpoint you want to access e.g. DBpedia or a local Virtuoso instance (typically: http://localhost:8890/sparql).
  2. If using Virtuoso, and you want to populate its quad store using SPARQL, assign "SPARQL_SPONGE" privileges to user "SPARQL" (this is basic control; more sophisticated WebID based ACLs are available for controlling SPARQL access).

Script:

/*
Demonstrating use of a single query to populate a Virtuoso Quad Store via Javascript.
*/

/*
HTTP URL is constructed accordingly with JSON query results format as the default via mime type.
*/

function sparqlQuery(query, baseURL, format) {
	if(!format)
		format="application/json";
	var params={
		"default-graph": "", "should-sponge": "soft", "query": query,
		"debug": "on", "timeout": "", "format": format,
		"save": "display", "fname": ""
	};

	// URL-encode each parameter and join the pairs into a query string.
	var pairs=[];
	for(var k in params) {
		pairs.push(k+"="+encodeURIComponent(params[k]));
	}
	var queryURL=baseURL + '?' + pairs.join("&");

	// Declared locally so the request object does not leak into the global scope.
	var xmlhttp;
	if (window.XMLHttpRequest) {
		xmlhttp=new XMLHttpRequest();
	}
	else {
		xmlhttp=new ActiveXObject("Microsoft.XMLHTTP");
	}
	// The third argument (false) makes the request synchronous, so the result can be returned directly.
	xmlhttp.open("GET",queryURL,false);
	xmlhttp.send();
	return JSON.parse(xmlhttp.responseText);
}

/*
Setting Data Source Name (DSN)
*/

var dsn="http://dbpedia.org/resource/DBpedia";

/*
The Virtuoso pragma DEFINE get:soft "replace" instructs the Virtuoso SPARQL engine to perform an HTTP GET using the IRI in the FROM clause as Data Source URL with regards to DBMS record inserts
*/

var query="DEFINE get:soft \"replace\"\nSELECT DISTINCT * FROM <"+dsn+"> WHERE {?s ?p ?o}";
var data=sparqlQuery(query, "/sparql/");

Output

Place the snippet above in a <script> element of an HTML document; the parsed query result is then available in the variable data.

Conclusion

JSON was chosen over XML (re. output format) since this is about a "no-brainer installation and utilization" guide for a Javascript developer that already knows how to use Javascript for HTTP based data access within HTML. SPARQL just provides an added bonus to URL dexterity (delivered via URI abstraction) with regards to constructing Data Source Names or Addresses.

01/21/2011 14:59 GMT-0500 Modified: 01/26/2011 18:10 GMT-0500
SPARQL Guide for the PHP Developer

What?

A simple guide usable by any PHP developer seeking to exploit SPARQL without hassles.

Why?

SPARQL is a powerful query language, results serialization format, and an HTTP based data access protocol from the W3C. It provides a mechanism for accessing and integrating data across Deductive Database Systems (colloquially referred to as triple or quad stores in Semantic Web and Linked Data circles) -- database systems (or data spaces) that manage proposition oriented records in 3-tuple (triples) or 4-tuple (quads) form.

How?

SPARQL queries are actually HTTP payloads (typically). Thus, using a RESTful client-server interaction pattern, you can dispatch calls to a SPARQL compliant data server and receive a payload for local processing e.g. local object binding re. PHP.

Steps:

  1. From your command line execute: aptitude search '^php', to verify PHP is in place
  2. Determine which SPARQL endpoint you want to access e.g. DBpedia or a local Virtuoso instance (typically: http://localhost:8890/sparql).
  3. If using Virtuoso, and you want to populate its quad store using SPARQL, assign "SPARQL_SPONGE" privileges to user "SPARQL" (this is basic control; more sophisticated WebID based ACLs are available for controlling SPARQL access).

Script:

#!/usr/bin/env php
<?php
#
# Demonstrating use of a single query to populate a Virtuoso Quad Store via PHP.
#

# HTTP URL is constructed accordingly with JSON query results format in mind.

function sparqlQuery($query, $baseURL, $format="application/json")
{
	$params=array(
		"default-graph" =>  "",
		"should-sponge" =>  "soft",
		"query" =>  $query,
		"debug" =>  "on",
		"timeout" =>  "",
		"format" =>  $format,
		"save" =>  "display",
		"fname" =>  ""
	);

	# http_build_query URL-encodes each value and joins the pairs with "&".
	$sparqlURL=$baseURL . "?" . http_build_query($params);

	return json_decode(file_get_contents($sparqlURL));
}



# Setting Data Source Name (DSN)
$dsn="http://dbpedia.org/resource/DBpedia";

#Virtuoso pragmas for instructing SPARQL engine to perform an HTTP GET
#using the IRI in FROM clause as Data Source URL

$query="DEFINE get:soft \"replace\"
SELECT DISTINCT * FROM <$dsn> WHERE {?s ?p ?o}"; 

$data=sparqlQuery($query, "http://localhost:8890/sparql/");

print "Retrieved data:\n" . json_encode($data);

?>

Output

Retrieved data:
  {"head":
  {"link":[],"vars":["s","p","o"]},
  "results":
		{"distinct":false,"ordered":true,
		"bindings":[
			{"s":
			{"type":"uri","value":"http:\/\/dbpedia.org\/resource\/DBpedia"},"p":
			{"type":"uri","value":"http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type"},"o":
			{"type":"uri","value":"http:\/\/www.w3.org\/2002\/07\/owl#Thing"}},
			{"s":
			{"type":"uri","value":"http:\/\/dbpedia.org\/resource\/DBpedia"},"p":
			{"type":"uri","value":"http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type"},"o":
			{"type":"uri","value":"http:\/\/dbpedia.org\/ontology\/Work"}},
			{"s":
			{"type":"uri","value":"http:\/\/dbpedia.org\/resource\/DBpedia"},"p":
			{"type":"uri","value":"http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type"},"o":
			{"type":"uri","value":"http:\/\/dbpedia.org\/class\/yago\/Software106566077"}},
...
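
The JSON result above has two parts: "head" lists the variable names, and "results" holds one binding object per solution. A minimal Python 3 sketch of flattening that structure into plain rows follows. The function name bindings_to_rows is hypothetical, and the sample payload is trimmed from the output above to its first binding.

```python
import json

# Sample payload trimmed from the output above (first binding only).
payload = """
{"head": {"link": [], "vars": ["s", "p", "o"]},
 "results": {"distinct": false, "ordered": true, "bindings": [
   {"s": {"type": "uri", "value": "http://dbpedia.org/resource/DBpedia"},
    "p": {"type": "uri", "value": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"},
    "o": {"type": "uri", "value": "http://www.w3.org/2002/07/owl#Thing"}}
 ]}}
"""

def bindings_to_rows(result):
    """Flatten each binding into a tuple ordered by the variables in "head"."""
    names = result["head"]["vars"]
    return [
        # A variable left unbound in a solution is absent from its binding object.
        tuple(b[name]["value"] if name in b else None for name in names)
        for b in result["results"]["bindings"]
    ]

rows = bindings_to_rows(json.loads(payload))
print(rows[0])
```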

Conclusion

JSON was chosen over XML (re. output format) since this is about a "no-brainer installation and utilization" guide for a PHP developer that already knows how to use PHP for HTTP based data access. SPARQL just provides an added bonus to URL dexterity (delivered via URI abstraction) with regards to constructing Data Source Names or Addresses.

01/20/2011 16:25 GMT-0500 Modified: 01/25/2011 10:36 GMT-0500
SPARQL Guide for Python Developer

What?

A simple guide usable by any Python developer seeking to exploit SPARQL without hassles.

Why?

SPARQL is a powerful query language, results serialization format, and an HTTP based data access protocol from the W3C. It provides a mechanism for accessing and integrating data across Deductive Database Systems (colloquially referred to as triple or quad stores in Semantic Web and Linked Data circles) -- database systems (or data spaces) that manage proposition oriented records in 3-tuple (triples) or 4-tuple (quads) form.

How?

SPARQL queries are actually HTTP payloads (typically). Thus, using a RESTful client-server interaction pattern, you can dispatch calls to a SPARQL compliant data server and receive a payload for local processing e.g. local object binding re. Python.

Steps:

  1. From your command line execute: aptitude search '^python26', to verify Python is in place
  2. Determine which SPARQL endpoint you want to access e.g. DBpedia or a local Virtuoso instance (typically: http://localhost:8890/sparql).
  3. If using Virtuoso, and you want to populate its quad store using SPARQL, assign "SPARQL_SPONGE" privileges to user "SPARQL" (this is basic control; more sophisticated WebID based ACLs are available for controlling SPARQL access).

Script:

#!/usr/bin/env python
#
# Demonstrating use of a single query to populate a Virtuoso Quad Store via Python.
#

import urllib, json

# HTTP URL is constructed accordingly with JSON query results format in mind.

def sparqlQuery(query, baseURL, format="application/json"):
	params={
		"default-graph": "",
		"should-sponge": "soft",
		"query": query,
		"debug": "on",
		"timeout": "",
		"format": format,
		"save": "display",
		"fname": ""
	}
	querypart=urllib.urlencode(params)
	response = urllib.urlopen(baseURL,querypart).read()
	return json.loads(response)

# Setting Data Source Name (DSN)
dsn="http://dbpedia.org/resource/DBpedia"

# Virtuoso pragmas for instructing SPARQL engine to perform an HTTP GET
# using the IRI in FROM clause as Data Source URL

query="""DEFINE get:soft "replace"
SELECT DISTINCT * FROM <%s> WHERE {?s ?p ?o}""" % dsn 

data=sparqlQuery(query, "http://localhost:8890/sparql/")

print "Retrieved data:\n" + json.dumps(data, sort_keys=True, indent=4)

#
# End

Output

Retrieved data:
{
    "head": {
        "link": [], 
        "vars": [
            "s", 
            "p", 
            "o"
        ]
    }, 
    "results": {
        "bindings": [
            {
                "o": {
                    "type": "uri", 
                    "value": "http://www.w3.org/2002/07/owl#Thing"
                }, 
                "p": {
                    "type": "uri", 
                    "value": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
                }, 
                "s": {
                    "type": "uri", 
                    "value": "http://dbpedia.org/resource/DBpedia"
                }
            }, 
...
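
The script above relies on Python 2 APIs (urllib.urlencode, urllib.urlopen with two arguments, and the print statement). For readers on Python 3, a minimal sketch of an equivalent follows. The split into build_request and sparql_query is a hypothetical arrangement made so that the request can be inspected without a running endpoint; the parameter set is copied from the script above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

def build_request(query, base_url, fmt="application/json"):
    """Prepare the same form-encoded POST that the script above sends."""
    params = {
        "default-graph": "", "should-sponge": "soft", "query": query,
        "debug": "on", "timeout": "", "format": fmt,
        "save": "display", "fname": "",
    }
    # Supplying data makes urllib issue a POST, as urlopen(url, data) did in Python 2.
    return Request(base_url, data=urlencode(params).encode("ascii"))

def sparql_query(query, base_url, fmt="application/json"):
    """Send the request and decode the JSON result (needs a reachable endpoint)."""
    with urlopen(build_request(query, base_url, fmt)) as response:
        return json.loads(response.read().decode("utf-8"))

req = build_request("SELECT * WHERE {?s ?p ?o} LIMIT 1", "http://localhost:8890/sparql/")
print(req.get_method())
```

Calling sparql_query with the query and endpoint from the script above should return the same structure as the output shown, provided a Virtuoso instance is listening on that address.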

Conclusion

JSON was chosen over XML (re. output format) since this is about a "no-brainer installation and utilization" guide for a Python developer that already knows how to use Python for HTTP based data access. SPARQL just provides an added bonus to URL dexterity (delivered via URI abstraction) with regards to constructing Data Source Names or Addresses.

01/19/2011 12:13 GMT-0500 Modified: 01/25/2011 10:35 GMT-0500
SPARQL for the Ruby Developer

What?

A simple guide usable by any Ruby developer seeking to exploit SPARQL without hassles.

Why?

SPARQL is a powerful query language, results serialization format, and an HTTP based data access protocol from the W3C. It provides a mechanism for accessing and integrating data across Deductive Database Systems (colloquially referred to as triple or quad stores in Semantic Web and Linked Data circles) -- database systems (or data spaces) that manage proposition oriented records in 3-tuple (triples) or 4-tuple (quads) form.

How?

SPARQL queries are actually HTTP payloads (typically). Thus, using a RESTful client-server interaction pattern, you can dispatch calls to a SPARQL compliant data server and receive a payload for local processing e.g. local object binding re. Ruby.

Steps:

  1. From your command line execute: aptitude search '^ruby', to verify Ruby is in place
  2. Determine which SPARQL endpoint you want to access e.g. DBpedia or a local Virtuoso instance (typically: http://localhost:8890/sparql).
  3. If using Virtuoso, and you want to populate its quad store using SPARQL, assign "SPARQL_SPONGE" privileges to user "SPARQL" (this is basic control; more sophisticated WebID based ACLs are available for controlling SPARQL access).

Script:

#!/usr/bin/env ruby
#
# Demonstrating use of a single query to populate a Virtuoso Quad Store via Ruby.
#

require 'net/http'
require 'cgi'
require 'csv'

#
# We opt for CSV based output since handling this format is straightforward in Ruby, by default.
# HTTP URL is constructed accordingly with CSV as query results format in mind.

def sparqlQuery(query, baseURL, format="text/csv")
	params={
		"default-graph" => "",
		"should-sponge" => "soft",
		"query" => query,
		"debug" => "on",
		"timeout" => "",
		"format" => format,
		"save" => "display",
		"fname" => ""
	}
	querypart=""
	params.each { |k,v|
		querypart+="#{k}=#{CGI.escape(v)}&"
	}
  
	sparqlURL=baseURL+"?#{querypart}"
	
	response = Net::HTTP.get_response(URI.parse(sparqlURL))

	return CSV::parse(response.body)
	
end

# Setting Data Source Name (DSN)

dsn="http://dbpedia.org/resource/DBpedia"

#Virtuoso pragmas for instructing SPARQL engine to perform an HTTP GET
#using the IRI in FROM clause as Data Source URL

query="DEFINE get:soft \"replace\"
SELECT DISTINCT * FROM <#{dsn}> WHERE {?s ?p ?o} "

#Assume use of local installation of Virtuoso 
#otherwise you can change URL to that of a public endpoint
#for example DBpedia: http://dbpedia.org/sparql

data=sparqlQuery(query, "http://localhost:8890/sparql/")

puts "Got data:"
p data

#
# End

Output

Got data:
[["s", "p", "o"], 
  ["http://dbpedia.org/resource/DBpedia", 
   "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", 
   "http://www.w3.org/2002/07/owl#Thing"], 
  ["http://dbpedia.org/resource/DBpedia", 
   "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", 
   "http://dbpedia.org/ontology/Work"], 
  ["http://dbpedia.org/resource/DBpedia", 
   "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", 
   "http://dbpedia.org/class/yago/Software106566077"],
...

Conclusion

CSV was chosen over XML (re. output format) since this is about a "no-brainer installation and utilization" guide for a Ruby developer that already knows how to use Ruby for HTTP based data access. SPARQL just provides an added bonus to URL dexterity (delivered via URI abstraction) with regards to constructing Data Source Names or Addresses.

01/18/2011 14:48 GMT-0500 Modified: 01/25/2011 10:17 GMT-0500
Virtuoso Linked Data Deployment In 3 Simple Steps

Injecting Linked Data into the Web has been a major pain point for those who seek personal, service, or organization-specific variants of DBpedia. Basically, the sequence goes something like this:

  1. You encounter DBpedia or the LOD Cloud Pictorial.
  2. You look around (typically following your nose from link to link).
  3. You attempt to publish your own stuff.
  4. You get stuck.

The problems typically take the following form:

  1. Functionality confusion about the complementary Name and Address functionality of a single URI abstraction
  2. Terminology confusion due to conflation and over-loading of terms such as Resource, URL, Representation, Document, etc.
  3. Inability to find robust tools with which to generate Linked Data from existing data sources such as relational databases, CSV files, XML, Web Services, etc.

To start addressing these problems, here is a simple guide for generating and publishing Linked Data using Virtuoso.

Step 1 - RDF Data Generation

Existing RDF data can be added to the Virtuoso RDF Quad Store via a variety of built-in data loader utilities.

Many options allow you to easily and quickly generate RDF data from other data sources:

  • Install the Sponger Bookmarklet for the URIBurner service. Bind this to your own SPARQL-compliant backend RDF database (in this scenario, your local Virtuoso instance), and then Sponge some HTTP-accessible resources.
  • Convert relational DBMS data to RDF using the Virtuoso RDF Views Wizard.
  • Starting with CSV files, you can
    • Place them at an HTTP-accessible location, and use the Virtuoso Sponger to convert them to RDF or;
    • Use the CSV import feature to import their content into Virtuoso's relational data engine; then use the built-in RDF Views Wizard as with other RDBMS data.
  • Starting from XML files, you can
    • Use Virtuoso's inbuilt XSLT-Processor for manual XML to RDF/XML transformation or;
    • Leverage the Sponger Cartridge for GRDDL, if there is a transformation service associated with your XML data source, or;
    • Let the Sponger analyze the XML data source and make a best-effort transformation to RDF.

Step 2 - Linked Data Deployment

Install the Faceted Browser VAD package (fct_dav.vad) which delivers the following:

  1. Faceted Browser Engine UI
  2. Dynamic Hypermedia Resource Generator
    • delivers descriptor resources for every entity (data object) in the Native or Virtual Quad Stores
    • supports a broad array of output formats, including HTML+RDFa, RDF/XML, N3/Turtle, NTriples, RDF-JSON, OData+Atom, and OData+JSON.

Step 3 - Linked Data Consumption & Exploitation

A few simple steps allow you, your enterprise, and your customers to consume and exploit your newly deployed Linked Data --

  1. Load a page like this in your browser: http://<cname>[:<port>]/describe/?uri=<entity-uri>
    • <cname>[:<port>] gets replaced by the host and port of your Virtuoso instance
    • <entity-uri> gets replaced by the URI you want to see described -- for instance, the URI of one of the resources you let the Sponger handle.
  2. Follow the links presented in the descriptor page.
  3. If you ever see a blank page with a hyperlink subject name in the About: section at the top of the page, simply add the parameter "&sp=1" to the URL in the browser's Address box, and hit [ENTER]. This will result in an "on the fly" resource retrieval, transformation, and descriptor page generation.
  4. Use the navigator controls to page up and down the data associated with the "in scope" resource descriptor.
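
The descriptor page URL from step 1, together with the "&sp=1" parameter from step 3, can be assembled programmatically. Here is a minimal Python 3 sketch; the function name describe_url and its arguments are hypothetical, and the http scheme is an assumption about how the Virtuoso instance is exposed.

```python
from urllib.parse import urlencode

def describe_url(host, entity_uri, port=None, sponge=False):
    """Build the /describe/ page URL for an entity URI."""
    authority = host if port is None else "%s:%d" % (host, port)
    params = [("uri", entity_uri)]
    if sponge:
        # The "&sp=1" parameter from step 3: request on-the-fly retrieval.
        params.append(("sp", "1"))
    # Assumption: the instance is reachable over plain http.
    return "http://%s/describe/?%s" % (authority, urlencode(params))

print(describe_url("localhost", "http://dbpedia.org/resource/DBpedia", port=8890, sponge=True))
```

The entity URI is percent-encoded so that it survives as a single query-string value.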

10/29/2010 18:54 GMT-0500 Modified: 11/02/2010 11:55 GMT-0500
Linked Data & Socially Enhanced Collaboration (Enterprise or Individual) -- Update 1

Socially enhanced enterprise and invididual collaboration is becoming a focal point for a variety of solutions that offer erswhile distinct content managment features across the realms of Blogging, Wikis, Shared Bookmarks, Discussion Forums etc.. as part of an integrated platform suite. Recently, Socialtext has caught my attention courtesy of its nice features and benefits page . In addition, I've also found the Mike 2.0 portal immensely interesting and valuable, for those with an enterprise collaboration bent.

Anyway, Socialtext and Mike 2.0 (they aren't identical, and the juxtaposition isn't meant to imply that they are) provide nice demonstrations of what socially enhanced collaboration for individuals and/or enterprises is all about:

  1. Identifying Yourself
  2. Identifying Others (key contributors, peers, collaborators)
  3. Serendipitous Discovery of key contributors, peers, and collaborators
  4. Serendipitous Discovery by key contributors, peers, and collaborators
  5. Developing and sustaining relationships via a socially enhanced professional network hybrid
  6. Utilizing your new "trusted network" (which you've personally indexed) when seeking help or propagating a meme.

As is typically the case in this emerging realm, the critical issue of discrete "identifiers" (record keys, in a sense) for data items, data containers, and data creators (individuals and groups) is overlooked, albeit unintentionally.

How HTTP based Linked Data Addresses the Identifier Issue

Rather than using platform constrained identifiers such as:

  • an email address (a "mailto" scheme identifier),
  • a DBMS user account,
  • an application specific account, or
  • an OpenID,

HTTP based Linked Data enables you to leverage the platform independence of HTTP scheme Identifiers (Generic URIs) such that Identifiers for:

  1. You,
  2. Your Peers,
  3. Your Groups, and
  4. Your Activity Generated Data,

simply become conduits into a mesh of HTTP referenceable and accessible Linked Data Objects endowed with a high SDQ (Serendipitous Discovery Quotient). For example, my Personal WebID is all anyone needs to know if they want to explore:

  1. My Profile (which includes references to data objects associated with my interests, social-network, calendar, bookmarks etc.)
  2. Data generated by my activities across various data spaces (via data objects associated with my online accounts e.g. Del.icio.us, Twitter, Last.FM)
  3. Linked Data Meshups via URIBurner (or any other Virtuoso instance) that provide an extended view of my profile

How FOAF+SSL adds Socially aware Security

Even when you reach a point of equilibrium where your daily activities trigger orchestration of CRUD (Create, Read, Update, Delete) operations against Linked Data Objects within your socially enhanced collaboration network, you still have to deal with the thorny issues of security, which include the following:

  1. Single Sign On,
  2. Authentication, and
  3. Data Access Policies.

FOAF+SSL, an application of HTTP based Linked Data, enables you to enhance your Personal HTTP scheme based Identifier (or WebID) via the following steps (performed by a FOAF+SSL compliant platform):

  1. Imprint your WebID within a self-signed X.509 based public key (certificate) associated with your private key (generated by the FOAF+SSL platform or manually via OpenSSL)
  2. Store the public key components (modulus and exponent) in your FOAF based profile document, which references your Personal HTTP Identifier as its primary topic
  3. Leverage the HTTP URL component of the WebID to make the public key components (modulus and exponent) available for X.509 certificate based authentication challenges posed by systems secured by FOAF+SSL (directly) or OpenID (indirectly, via FOAF+SSL to OpenID proxy services).
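
The check that such an authentication challenge boils down to can be sketched as follows. This is a simplified illustration under stated assumptions, not the code of any FOAF+SSL platform: it presumes the certificate has already been parsed (WebID plus RSA modulus and exponent) and the profile document has already been dereferenced into a mapping of WebIDs to published keys. The names RsaPublicKey, parse_hex_modulus, and webid_is_verified, as well as the sample WebID and key values, are hypothetical.

```python
from dataclasses import dataclass

HEX_DIGITS = set("0123456789abcdefABCDEF")


@dataclass(frozen=True)
class RsaPublicKey:
    """The two public key components stored in the profile document."""
    modulus: int
    exponent: int


def parse_hex_modulus(text):
    """Parse a hex modulus, ignoring colons, whitespace, and letter case."""
    cleaned = "".join(ch for ch in text if ch in HEX_DIGITS)
    return int(cleaned, 16)


def webid_is_verified(claimed_webid, cert_key, profile_keys):
    """Return True if the key presented in the certificate is one of the
    keys that the dereferenced profile publishes for the claimed WebID.

    profile_keys maps a WebID to the list of RsaPublicKey values found in
    the profile document (an assumed, already-parsed representation).
    """
    return cert_key in profile_keys.get(claimed_webid, [])


if __name__ == "__main__":
    webid = "http://example.org/people/alice#me"  # hypothetical WebID
    cert_key = RsaPublicKey(parse_hex_modulus("00:CB:24"), 65537)
    profile = {webid: [RsaPublicKey(parse_hex_modulus("cb24"), 65537)]}
    print(webid_is_verified(webid, cert_key, profile))
```

Comparing the components as integers rather than as strings avoids false mismatches caused by formatting differences (leading zeros, colons, upper versus lower case) between the certificate and the profile document.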

Contrary to conventional experiences with all things PKI (Public Key Infrastructure) related, FOAF+SSL compliant platforms typically handle the PKI issues as part of the protocol implementation; thereby protecting you from any administrative tedium without compromising security.

Conclusions

Understanding how new technology innovations address long standing problems, or understanding how new solutions inadvertently fail to address old problems, provides time tested mechanisms for product selection and value proposition comprehension that ultimately save scarce resources such as time and money.

If you want to understand real world problem solution #1 with regard to HTTP based Linked Data, look no further than the issue of secure, socially aware, and platform independent identifiers for data objects that build bridges across erstwhile data silos.

If you want to cost-effectively experience what I've outlined in this post, take a look at OpenLink Data Spaces (ODS), which is a distributed collaboration engine (enterprise or individual) built around the Virtuoso database engines. It simply enhances existing collaboration tools via the following capabilities:

  1. Addition of Social Dimensions via HTTP based Data Object Identifiers for all Data Items (if missing)
  2. Ability to integrate across a myriad of Data Source Types, rather than a select few, across RDBMS Engines, LDAP, Web Services, and various HTTP accessible Resources (Hypermedia or Non Hypermedia content types)
  3. Addition of FOAF+SSL based authentication
  4. Addition of FOAF+SSL based Access Control Lists (ACLs) for policy based data access.

03/02/2010 15:47 GMT-0500 Modified: 03/03/2010 19:50 GMT-0500
Powered by OpenLink Virtuoso Universal Server
Running on Linux platform
The posts on this weblog are my personal views, and not those of OpenLink Software.