Last week we officially released Virtuoso 5.0.1 (in Commercial and Open Source Editions). The press release provided us with an official mechanism and timestamp for the current Virtuoso feature set.
A vital component of the new Virtuoso release is the finalization of our SQL to RDF mapping functionality -- enabling the declarative mapping of SQL Data to RDF. Additional technical insight covering other new features (delivered and pending) is provided by Orri Erling, as part of a series of post-Banff posts.
Why is SQL to RDF Mapping a Big Deal?
A majority of the world's data (especially in the enterprise realm) resides in SQL Databases. In addition, Open Access to the data residing in said databases remains the biggest challenge to enterprises for the following reasons:
-
SQL Data Sources are inherently heterogeneous because they are acquired with business applications that are in many cases inextricably bound to a particular DBMS engine
-
Data is predictably dirty
-
DBMS vendors ultimately hold the data captive and have traditionally resisted data access standards such as ODBC (*trust me they have, just look at the unprecedented bad press associated with ODBC the only truly platform independent data access API. Then look at how this bad press arose..*)
Enterprises have known from the beginning of modern corporate times that data access, discovery, and manipulation capabilities are inextricably linked to the "Real-time Enterprise" nirvana (hence my use of 0.0 before this becomes 3.0).
In my experience, as someone whose operated in the data access and data integration realms since the late '80s, I've painfully observed enterprises pursue, but unsuccessfully attain, full control over enterprise data (the prized asset of any organization) such that data-, information-, knowledge-workers are just a click away from commencing coherent platform and database independent data drill-downs and/or discovery that transcend intranet, internet, and extranet boundaries -- serendipitous interaction with relevant data, without compromise!
Okay, situation analysis done, we move on..
At our most recent (12th June) monthly Semantic Web Gathering, I unveiled to TimBL and a host of other attendees a simple, but powerful, demonstration of how Linked Data, as an aspect of the Semantic Data Web, can be applied to enterprise data integration challenges.
Actual SQL to RDF Mapping Demo / Experiment
Hypothesis
A SQL Schema can be effectively mapped declaratively to RDF such that SQL Rows morph into RDF Instance Data (Entity Sets) based on the Concepts & Properties defined in a Concrete Conceptual Data Model oriented Data Dictionary (
RDF Schema and/or
OWL Ontology). In addition, the solution must demonstrate how "Linked Data in the Web" is completely different from "Data on the Web" or "Linked Data on the Web" (btw -
Tom Heath eloquently unleashed this point in his recent
podcast interview with Talis).
Apparatus
An Ontology - in this case we simply derived the
Northwind Ontology from the XML Schema based CSDL (
Conceptual Schema Definition Language) used by Microsoft's public
Astoria demo (specifically the
Northwind Data Services demo).
SQL Database Schema -
Northwind (comes bundled with ACCESS, SQL Server, and Virtuoso) comprised of tables such as:
Customer,
Employee,
Product,
Category,
Supplier,
Shipper etc.
OpenLink Virtuoso - SQL DBMS Engine (although this could have been any
ODBC or
JDBC accessible Database),
SQL-RDF Metaschema Language, HTTP URL-rewriter, WebDAV Engine, and DBMS hosted XSLT processor
Client Tools -
iSPARQL Query Builder,
RDF Browser (which could also have been
Tabulator or
DISCO or a standard Web Browser)
Experiment / Demo
-
Declaratively map the Northwind SQL Schema to RDF using the Virtuoso Meta Schema Language (see: Virtuoso PL based Northwind_SQL_RDF script)
-
Start browsing the data by clicking on the URIs that represent the RDF Data Model Entities resulting from the SQL to RDF Mapping
Observations
-
Via a single Data Link click I was able to obtain specific information about the Customer represented by the URI "ALFKI" (act of URI Dereferencing as you would an Object ID in an Object or Object-Relational Database)
-
Via a
Dynamic Data Page I was able to explore all the entity relationships or specific entity data (i.e Exploratory or Entity specific dereferencing) in the Northwind Data Space
-
I was able to perform similar exploration (as per item 2) using our
OpenLink Browser.
Conclusions
The vision of data, information, or knowledge at your fingertips is nigh! Thanks to the infrastructure provided by the Semantic Data Web (URIs, RDF Data Model, variety of RDF Serialization Formats[1][2][3], and Shared Data Dictionaries / Schemas / Ontologies [1][2][3][4][5]) it's now possible to Virtualize enterprise data from the Physical Storage Level, through the Logical Data Management Levels (Relational), up to a Concrete Conceptual Model (Graph) without operating system, development environment or framework, or database engine lock-in.
Next Steps
We produce a shared ontology for the CRM and Business Reporting Domains. I hope this experiment clarifies how this is quite achievable by converting XML Schemas to RDF Data Dictionaries (RDF Schemas or Ontologies). Stay tuned :-)
Also watch TimBL amplify and articulate Linked Data value in a recent interview.
Other Related Matters
To deliver a mechanism that facilitates the crystallization of this reality is a contribution of boundless magnitude (as we shall all see in due course). Thus, it is easy to understand why even "her majesty", the queen of England, simply had to get in on the act and appoint TimBL to the "British Order of Merit" :-)
Note: All of the demos above now work with IE & Safari (a "remember what Virtuoso is epiphany") by simply putting Virtuoso's DBMS hosted XSLT engine to use :-) This also applies to my earlier collection of demos from the Hello Data Web and other Data Web & Linked Data related demo style posts.