Last week we officially released
Virtuoso 5.0.1 (in Commercial and Open Source Editions). The
press release provided us with an official mechanism and timestamp
for the current Virtuoso feature set.
A vital component of the new Virtuoso release is the
finalization of our SQL to RDF mapping functionality -- enabling
the declarative mapping of SQL Data to RDF. Additional technical
insight covering other new features (delivered and pending) is
provided by Orri Erling, as
part of a series of post-Banff posts.
Why is SQL to RDF Mapping a Big Deal?
A majority of the world's data (especially in the enterprise
realm) resides in SQL Databases. In addition, Open Access to the
data residing in said databases remains the biggest challenge to
enterprises for the following reasons:
- SQL Data Sources are inherently heterogeneous because they are
acquired with business applications that are in many cases
inextricably bound to a particular DBMS engine
- Data is predictably dirty
- DBMS vendors ultimately hold the data captive and have
traditionally resisted data access standards such as ODBC (*trust
me they have; just look at the unprecedented bad press associated
with ODBC, the only truly platform-independent data access API. Then
look at how this bad press arose.*)
Enterprises have known from the beginning of modern corporate
times that data access, discovery, and manipulation capabilities
are inextricably linked to the "Real-time Enterprise" nirvana
(hence my use of 0.0 before this becomes 3.0).
In my experience, as someone who has operated in the data access
and data integration realms since the late '80s, I've painfully
observed enterprises pursue, but unsuccessfully attain, full
control over enterprise data (the prized asset of any organization)
such that data-, information-, knowledge-workers are just a click
away from commencing coherent platform and database independent
data drill-downs and/or discovery that transcend intranet,
internet, and extranet boundaries -- serendipitous interaction with
relevant data, without compromise!
Okay, situation analysis done; we move on.
At our most recent (12th June) monthly Semantic Web Gathering, I
unveiled to TimBL and a host of other attendees a simple, but
powerful, demonstration of how Linked Data, as an aspect of the
Semantic Data Web, can be applied to enterprise data integration
challenges.
Actual SQL to RDF Mapping Demo / Experiment
Hypothesis
A SQL Schema can be effectively mapped declaratively to RDF such
that SQL Rows morph into RDF Instance Data (Entity Sets) based on
the Concepts & Properties defined in a Concrete Conceptual Data
Model oriented Data Dictionary (RDF Schema and/or OWL Ontology). In
addition, the solution must demonstrate how "Linked Data in the Web"
is completely different from "Data on the Web" or "Linked Data on
the Web" (btw - Tom Heath eloquently unleashed this point in his
recent podcast interview with Talis).
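To make the hypothesis concrete, here is a minimal Python sketch of the idea, not the Virtuoso Meta Schema Language itself. A small declarative mapping (a plain dictionary) says which table becomes which class, which column supplies the entity key, and which columns become literal properties or links to other entities. The `example.org` namespaces, the property names, the `MAPPING` layout, and the two sample rows are all assumptions made for illustration.

```python
import sqlite3

BASE = "http://example.org/northwind/"        # hypothetical instance namespace
ONT = "http://example.org/northwind/schema#"  # hypothetical ontology namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Declarative mapping (illustrative layout): table -> class, key, columns.
MAPPING = {
    "Customers": {
        "class": "Customer",
        "key": "CustomerID",
        "literals": {"CompanyName": "companyName", "Country": "country"},
        "links": {},
    },
    "Orders": {
        "class": "Order",
        "key": "OrderID",
        "literals": {"ShipCity": "shipCity"},
        # Foreign key column -> (property, class of the target entity)
        "links": {"CustomerID": ("has_customer", "Customer")},
    },
}


def literal(value):
    """Render a value as a quoted N-Triples literal."""
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{text}"'


def rows_to_triples(conn, mapping):
    """Turn every mapped SQL row into N-Triples lines."""
    lines = []
    for table, spec in mapping.items():
        cursor = conn.execute(f"SELECT * FROM {table}")
        columns = [d[0] for d in cursor.description]
        for row in cursor:
            record = dict(zip(columns, row))
            subject = f"<{BASE}{spec['class']}/{record[spec['key']]}>"
            lines.append(f"{subject} <{RDF_TYPE}> <{ONT}{spec['class']}> .")
            for column, prop in spec["literals"].items():
                lines.append(f"{subject} <{ONT}{prop}> {literal(record[column])} .")
            # Foreign keys become URIs, which is what makes the data "linked".
            for column, (prop, target) in spec["links"].items():
                lines.append(
                    f"{subject} <{ONT}{prop}> <{BASE}{target}/{record[column]}> ."
                )
    return lines


# Sample rows in an in-memory database, standing in for Northwind.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID TEXT PRIMARY KEY, CompanyName TEXT, Country TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID TEXT, ShipCity TEXT);
    INSERT INTO Customers VALUES ('ALFKI', 'Alfreds Futterkiste', 'Germany');
    INSERT INTO Orders VALUES (10643, 'ALFKI', 'Berlin');
""")

triples = rows_to_triples(conn, MAPPING)
for line in triples:
    print(line)
```

The point of keeping the mapping as data rather than code is that the SQL tables stay where they are; only the description of how rows correspond to entities changes.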
Apparatus
- An Ontology - in this case we simply derived the Northwind
Ontology from the XML Schema based CSDL (Conceptual Schema
Definition Language) used by Microsoft's public Astoria demo
(specifically the Northwind Data Services demo)
- SQL Database Schema - Northwind (comes bundled with Access, SQL
Server, and Virtuoso), comprised of tables such as Customer,
Employee, Product, Category, Supplier, Shipper, etc.
- OpenLink Virtuoso - SQL DBMS Engine (although this could have been
any ODBC or JDBC accessible Database), SQL-RDF Metaschema Language,
HTTP URL-rewriter, WebDAV Engine, and DBMS hosted XSLT processor
- Client Tools - iSPARQL Query Builder, RDF Browser (which could
also have been Tabulator or DISCO or a standard Web Browser)
Experiment / Demo
- Declaratively map the Northwind SQL Schema to RDF using the
Virtuoso Meta Schema Language (see:
Virtuoso PL based Northwind_SQL_RDF script)
- Start browsing the data by clicking on the URIs that represent
the RDF Data Model Entities resulting from the SQL to RDF
Mapping
Observations
- Via a single Data Link click I was able to obtain specific
information about the Customer represented by the URI "ALFKI"
(an act of URI Dereferencing, much as you would dereference an
Object ID in an Object or Object-Relational Database)
- Via a Dynamic Data Page I was able to explore all the entity
relationships or specific entity data (i.e. Exploratory or Entity
specific dereferencing) in the Northwind Data Space
- I was able to perform similar exploration (as per item 2) using
our OpenLink Browser.
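The observations above amount to two operations: dereference one entity, then follow a link to the next. The Python sketch below imitates that over a handful of in-memory triples. It is an assumption-laden toy, not how Virtuoso or the OpenLink Browser work internally: the short identifiers such as `Customer/ALFKI`, the property names, and the `describe` helper are all hypothetical, and no HTTP is involved.

```python
from collections import defaultdict

# A few (subject, predicate, object) triples with shortened, made-up names.
TRIPLES = [
    ("Customer/ALFKI", "rdf:type", "Customer"),
    ("Customer/ALFKI", "companyName", "Alfreds Futterkiste"),
    ("Order/10643", "rdf:type", "Order"),
    ("Order/10643", "has_customer", "Customer/ALFKI"),
    ("Order/10643", "shipCity", "Berlin"),
]


def describe(uri, triples):
    """Collect what an entity says about itself and what points at it."""
    outgoing = defaultdict(list)  # property -> values stated on this entity
    incoming = defaultdict(list)  # property -> entities that link here
    for subject, predicate, obj in triples:
        if subject == uri:
            outgoing[predicate].append(obj)
        if obj == uri:
            incoming[predicate].append(subject)
    return {"uri": uri, "outgoing": dict(outgoing), "incoming": dict(incoming)}


# "Click" on the customer, then follow the link back to one of its orders.
page = describe("Customer/ALFKI", TRIPLES)
next_uri = page["incoming"]["has_customer"][0]
order_page = describe(next_uri, TRIPLES)

print(page)
print(order_page)
```

Entity-specific dereferencing corresponds to reading the `outgoing` part; exploratory browsing corresponds to following identifiers found in either part.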
Conclusions
The vision of data, information, or knowledge at your fingertips
is nigh! Thanks to the infrastructure provided by the Semantic Data
Web (URIs, RDF
Data Model, variety of RDF Serialization Formats[1][2][3],
and Shared Data Dictionaries / Schemas / Ontologies [1][2][3][4][5])
it's now possible to Virtualize enterprise data from the Physical
Storage Level, through the Logical Data Management Levels
(Relational), up to a Concrete Conceptual Model (Graph) without
operating system, development environment or framework, or database
engine lock-in.
Next Steps
We will produce a shared ontology for the CRM and Business Reporting
Domains. I hope this experiment clarifies how this is quite
achievable by converting XML Schemas to RDF Data Dictionaries (RDF
Schemas or Ontologies). Stay tuned :-)
Also watch TimBL amplify and
articulate Linked Data value in a recent interview.
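As a rough illustration of converting an XML Schema into an RDF Data Dictionary, the Python sketch below walks a tiny entity description and emits RDF Schema style triples. The XML here is a simplified stand-in invented for this example, not real CSDL (it omits namespaces, keys, and associations), and the `example.org` ontology namespace and the lower-camel-case property naming are assumptions.

```python
import xml.etree.ElementTree as ET

ONT = "http://example.org/northwind/schema#"  # hypothetical ontology namespace
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Simplified, made-up schema document (not actual CSDL).
SCHEMA_XML = """
<Schema>
  <EntityType Name="Customer">
    <Property Name="CompanyName" Type="String"/>
    <Property Name="Country" Type="String"/>
  </EntityType>
  <EntityType Name="Order">
    <Property Name="ShipCity" Type="String"/>
  </EntityType>
</Schema>
"""


def schema_to_rdfs(xml_text):
    """Map each entity type to a class and each property to an RDF property."""
    root = ET.fromstring(xml_text.strip())
    triples = []
    for entity in root.findall("EntityType"):
        cls = ONT + entity.get("Name")
        triples.append((cls, RDF + "type", RDFS + "Class"))
        for prop in entity.findall("Property"):
            name = prop.get("Name")
            prop_uri = ONT + name[0].lower() + name[1:]
            triples.append((prop_uri, RDF + "type", RDF + "Property"))
            triples.append((prop_uri, RDFS + "domain", cls))
    return triples


schema_triples = schema_to_rdfs(SCHEMA_XML)
for triple in schema_triples:
    print(triple)
```

A real conversion would also need to carry over data types, keys, and relationships between entity types; this sketch only shows the class and property skeleton.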
Other Related Matters
To deliver a mechanism that facilitates the crystallization of
this reality is a contribution of boundless magnitude (as we shall
all see in due course). Thus, it is easy to understand why even
"her majesty", the Queen of England, simply had to get in on the
act and appoint TimBL to the "British Order of Merit" :-)
Note: All of the demos above now work with IE & Safari (a
"remember what Virtuoso is" epiphany) by simply putting Virtuoso's
DBMS hosted XSLT engine to use :-) This also applies to my earlier
collection of demos from the Hello Data Web and other Data Web &
Linked Data related demo style posts.