This post is a response to the Read/WriteWeb post titled
"Semantic Web: Difficulties with the Classic Approach".
First off, I am going to focus on the Semantic Data Web aspect
of the overall Semantic Web vision (a continuum) as this is what we
have now. I am also writing this post as a deliberate contribution
to the discourse swirling around the real topic: Semantic Web Value
Proposition.
Situation Analysis
We are in the early stages of the long anticipated Knowledge
Economy. That being the case, it would be safe to assume that
information access, processing, and dissemination are of utmost
importance to individuals and organizations alike. You don't
produce Knowledge in a vacuum! Likewise, you can't produce Information
in a vacuum; you need Data.
The Semantic Data Web's value to Individuals
Problem:
Increasingly, Blogs, Wikis, Shared Bookmarks, Photo Galleries,
Discussion Forums, Shared Calendars, and the like have become
invaluable tools for individual and organizational participation in
Web-enabled global discourse (where a lot of knowledge is
discovered). These tools are typically associated with Web 2.0,
implying Read-Write access via Web Services, centralized application
hosting, and data lock-in (silos).
The reality expressed above is a recipe for "Information
Overload" and the complete annihilation of one's effective pursuit
and exploitation of knowledge due to "Time Scarcity" (note:
disconnecting is not an option). Information abundance is inversely
related to available processing time (for humans in particular). In
my case, for instance, I was actively subscribed to over 500 RSS
feeds in 2003. As of today, I've simply stopped counting, and
that's just my Weblog Data Space. Add to that all of the
discussions I track across blogs, wikis, message boards, mailing
lists, traditional Usenet discussion forums, and the like, and I
think you get the picture.
Beyond information overload, Web 2.0 data is "Semi-Structured"
by way of its dominant data containers ((X)HTML, RSS, Atom
documents and data streams etc.) lacking semantics that formally
expose individual data items as distinct entities, endowed with
unambiguous naming / identification, descriptive attributes (a type
of property/predicate), and relationships (a type of
property/predicate).
Solution:
Devise a standard for Structured Data Semantics that is
compatible with the Web Information BUS.
Produce structured data (entities, entity types, entity
relationships) from Web 1.0 and Web 2.0 resources that already exist
on the Web, such that individual entities, their attributes, and
relationships are accessible and discernible to software agents
(machines).
Once the entities are individually exposed, the next requirement
is a mechanism for selective access to these entities, i.e. a query
language.
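To make the three steps above concrete, here is a minimal sketch (not a real RDF library) of entities exposed as subject / predicate / object statements, plus a tiny pattern matcher standing in for "selective access". The example.org URIs, the sample data, and the `match` helper are all assumptions made purely for illustration; the Dublin Core and FOAF property URIs are used only as familiar predicate names.

```python
from typing import Iterator, List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str    # URI naming the entity
    predicate: str  # URI naming an attribute or a relationship
    obj: str        # literal value, or the URI of another entity


# Hypothetical entities "extracted" from a blog post; the URIs are placeholders.
POST = "http://example.org/blog/post/1"
AUTHOR = "http://example.org/people/alice"

GRAPH: List[Triple] = [
    Triple(POST, "http://purl.org/dc/terms/title", "Hello, Data Web"),   # attribute
    Triple(POST, "http://purl.org/dc/terms/creator", AUTHOR),            # relationship
    Triple(AUTHOR, "http://xmlns.com/foaf/0.1/name", "Alice"),           # attribute
]


def match(graph: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for t in graph:
        if ((s is None or t.subject == s)
                and (p is None or t.predicate == p)
                and (o is None or t.obj == o)):
            yield t
```

Calling `list(match(GRAPH, s=POST))` returns everything known about the post, while `match(GRAPH, p="http://purl.org/dc/terms/creator")` follows the relationship from posts to their authors. A real query language does far more, but the access pattern is the same.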
Semantic Data Web Technologies that facilitate the solution
described above include:
Structured Data Standards:
RDF - Data Model for structured data
RDF/XML - A serialization format for RDF based structured data
N3 / Turtle - more human friendly serialization formats for RDF
based structured data
Entity Exposure & Generation:
GRDDL - enables association between XHTML pages and XSLT
stylesheets that facilitates loosely coupled "on the fly" extraction
of RDF from non RDF documents
RDFa - enables document publishers or viewers (i.e. those
repurposing or annotating) to embed structured data into existing
XHTML documents
eRDF - another option for embedding structured RDF data within
(X)HTML documents
RDF Middleware - typically incorporating GRDDL, RDFa, eRDF, and
custom extraction and mapping as part of a structured data
production pipeline
Entity Naming & Identification:
Use of URIs or IRIs to uniquely identify physical entities (HTML
Documents, Image Files, Multimedia Files, etc.) and abstract
entities (People, Places, Music, and other abstract things).
Entity Access & Querying:
SPARQL Query Language - the SQL analog of the Semantic Data Web
that enables query constructs that target named entities, entity
attributes, and entity relationships
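As a rough illustration of the querying piece, the sketch below builds a SPARQL query that asks for every property and value of one named entity, then encodes it as an HTTP GET URL with the query passed in the `query` parameter, as the SPARQL protocol defines. The endpoint address, the entity URI, and the helper names are assumptions for this example; nothing is sent over the network.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Assumed endpoint; substitute whichever SPARQL endpoint you actually use.
ENDPOINT = "https://dbpedia.org/sparql"


def build_query(entity_uri: str, limit: int = 10) -> str:
    """Return a SPARQL SELECT listing properties and values of one entity."""
    return (
        "SELECT ?property ?value\n"
        f"WHERE {{ <{entity_uri}> ?property ?value }}\n"
        f"LIMIT {limit}"
    )


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query as the 'query' parameter of an HTTP GET request."""
    return endpoint + "?" + urlencode({"query": query})


def query_from_url(url: str) -> str:
    """Decode the SPARQL text back out of a request URL (handy for checking)."""
    return parse_qs(urlparse(url).query)["query"][0]
```

For example, `build_request_url(ENDPOINT, build_query("http://dbpedia.org/resource/Semantic_Web", 5))` yields a URL that a browser or HTTP client could fetch. Note how the entity in the query is named by its URI, which is exactly what the naming and identification layer above provides.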
The Semantic Data Web's value to Organizations
Problem:
Organizations run a plethora of business systems built atop a
myriad of database engines sourced from a variety of DBMS vendors.
A typical organization would have a different database engine, from
a specific DBMS vendor, underlying each of its critical business
applications: Human Resource Management (HR), Customer Relationship
Management (CRM), Accounting, Supply Chain Management, etc. In a
nutshell, DBMS engine and DBMS schema heterogeneity permeates the
IT infrastructure of organizations on a global scale, making Data &
Information Integration the biggest headache across all IT driven
organizations.
Solution:
Alleviation of the pain (costs) associated with Data &
Information Integration.
Semantic Data Web offerings:
A dexterous data model (RDF) that enables the construction of
conceptual views of disparate data sources across an organization
based on existing web architecture components such as HTTP and
URIs.
Existing middleware solutions that facilitate the exposure of
SQL DBMS data as RDF based Structured Data include:
BTW - There is an upcoming W3C Workshop covering the
integration of SQL and RDF data.
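To give a feel for what exposing SQL data as RDF involves, here is a deliberately naive sketch: each row becomes an entity named by a URI built from the table name and primary key, and each remaining column becomes a property of that entity. This is not how any particular middleware product works; the `http://example.org/hr/` namespace, the `employee` table, and the helper names are hypothetical.

```python
import sqlite3
from typing import List, Tuple

BASE = "http://example.org/hr/"  # hypothetical URI namespace for this database


def rows_to_triples(conn: sqlite3.Connection, table: str,
                    key_column: str) -> List[Tuple[str, str, str]]:
    """Map each row to (subject URI, property URI, value) statements.

    The table name is interpolated directly into the SQL text, so it must
    come from trusted configuration, never from user input.
    """
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cur.description]
    triples = []
    for row in cur:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key_column]}"
        for column, value in record.items():
            if column == key_column or value is None:
                continue  # the key is already in the URI; NULLs carry no statement
            triples.append((subject, f"{BASE}schema/{table}#{column}", str(value)))
    return triples


def demo() -> List[Tuple[str, str, str]]:
    """Build a throwaway in-memory table and map it to triples."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
    conn.execute("INSERT INTO employee VALUES (1, 'Alice', 'HR')")
    return rows_to_triples(conn, "employee", "id")
```

Run the same mapping over the HR, CRM, and Accounting databases and the results land in one data model with URI-named entities, regardless of which engine or schema each system started from. That shared model is the conceptual view described above.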
Conclusion
The Semantic Data Web is here, and its value delivery vehicle is
the URI. The URI is a conduit to Interlinked Structured Data (RDF
based Linked Data) derived from existing data sources on the World
Wide Web alongside data continuously injected into the Web by
organizations worldwide. Ironically, the Semantic Data Web is the
only platform that crystallizes the "Information at Your Fingertips"
vision without development environment, operating system,
application, or database lock-in. You simply click on a Linked Data URI and
the serendipitous exploration and discovery of data commences.
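For software agents, "clicking" a Linked Data URI usually means an HTTP GET with an Accept header that asks for an RDF serialization rather than HTML. The sketch below only constructs such a request and sends nothing; the helper name and the choice of `text/turtle` as the default are assumptions for illustration.

```python
from urllib.request import Request


def linked_data_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Build (but do not send) a GET request asking for an RDF representation."""
    return Request(uri, headers={"Accept": media_type})
```

Passing the result to `urllib.request.urlopen` would perform the actual lookup. Whether a given server honors the Accept header depends on how its data is published.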
The unobtrusive emergence of the Semantic Data Web is a
reflection of the soundness of the underlying Semantic Web
vision.
If you are excited about Mash-ups,
then you are a Semantic Web enthusiast and beneficiary in the
making, because you only "Mash" (brute force data extraction and
interlinking) because you can't "Mesh" (natural data extraction and
interlinking). Likewise, if you are a social-networking, open
social-graph, or portable social-network enthusiast, then you are
also a Semantic Data Web beneficiary and enthusiast, because your
"values" (yes, the values associated with the properties that
define you, e.g. your interests) are the fundamental basis for
portable, open social-networking, which is what the Semantic Data
Web hands to you on a platter without compromise (i.e. without data
lock-in or loss of data ownership).
Some practical examples of Semantic Data Web prowess:
DBpedia (note: I deliberately
use DBpedia URIs in my posts where I would otherwise have used a
Wikipedia article URI)