Note: An updated version of a previously unpublished blog
post:
Following on from our recent
podcast conversation, Jon Udell offers further insight into the
essence of that conversation in a “Strategic Developer” column
titled:
Accessing the web of databases.
Below, I present an initial dump of a DataSpace FAQ that
hopefully sheds light on the DataSpace vision espoused during my
podcast conversation with Jon.
What is a DataSpace?
A moniker for Web-accessible atomic containers that manage and
expose Data, Information, Services, Processes, and Knowledge.
What would you typically find in a Data Space? Examples
include:
- Raw Data - SQL, HTML, XML (raw), XHTML, RDF etc.
- Information (Data In Context) - XHTML (various microformats),
Blog Posts (in RSS, Atom, RSS-RDF formats), Subscription Lists
(OPML, OCS, etc), Social Networks (FOAF, XFN etc.), and many other
forms of applied XML.
- Web Services (Application/Service Logic) - REST or SOAP based
invocation of application logic for context sensitive and
controlled data access and manipulation.
- Persisted Knowledge - Information in actionable context that is
also available in transient or persistent forms expressed using a
Graph Data Model. A modern knowledgebase would more than likely
have RDF as its Data Language, RDFS as its Schema Language, and OWL
as its Domain Definition (Ontology) Language. Actual Domain,
Schema, and Instance Data would be serialized using formats such as
RDF-XML, N3, Turtle, etc.
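To make the Graph Data Model mentioned in the last item concrete, here is a minimal sketch in plain Python. It stores statements as subject-predicate-object triples and answers a simple pattern match. The `ex:` identifiers, the sample data, and the `match` helper are assumptions made up for illustration; only the `foaf:name` and `foaf:knows` terms echo the real FOAF vocabulary, and no RDF library is involved.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical instance data; the ex: identifiers are invented for this sketch.
TRIPLES: Set[Triple] = {
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:name", "Bob"),
}


def match(
    triples: Set[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )
```

Calling `match(TRIPLES, p="foaf:knows")` returns the single "knows" statement, which is roughly what a one-pattern graph query does.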
How do Data Spaces and Databases differ?
Data Spaces are fundamentally problem-domain-specific database
applications. They offer functionality that you would instinctively
expect of a database (e.g. ACID data management) with the additional
benefit of being data model and query language agnostic. Data
Spaces are for the most part DBMS Engine and Data Access Middleware
hybrids in the sense that ownership and control of data is
inherently loosely-coupled.
How do Data Spaces and Content Management Systems differ?
Data Spaces are inherently more flexible: they support multiple
data models and data representation formats. Content management
systems do not possess the same degree of data model and data
representation dexterity.
How do Data Spaces and Knowledgebases differ?
A Data Space cannot dictate the perception of its content. For
instance, what I consider knowledge in my Data Space may not be
regarded as such by a remote client that interacts with it from a
distance. Thus, defining my Data Space purely as a Knowledgebase
introduces constraints that reduce its broader effectiveness to
third party clients (applications, services, users, etc.). A
Knowledgebase is based on a Graph Data Model, which results in
significant impedance for clients that are built around alternative
models. To reiterate, Data Spaces support multiple data models.
What Architectural Components make up a Data Space?
- ORDBMS Engine - for Data Modeling agility (via complex,
purpose-specific data types and data access methods), plus
Atomicity, Consistency, Isolation, and Durability (aka ACID).
- Virtual Database Engine - for creating a single view of, and
access point to, heterogeneous SQL, XML, Free Text, and other data.
This is all about Virtualization at the Data Access Level.
- Web Services Platform - enabling controlled access and
manipulation (via application, service, or protocol logic) of
Virtualized or Disparate Data. This layer handles the decoupling of
functionality from monolithic wholes for function specific
invocation via Web Services using either the SOAP or REST
approach.
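As a rough illustration of the "single view of, and access point to" heterogeneous data idea in the list above, here is a minimal sketch using only the Python standard library. It is not how any particular Virtual Database Engine is implemented; the table layout, the XML document, and the function names are all assumptions invented for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical XML source: a tiny feed-like document.
XML_SOURCE = """<feed>
  <entry><title>Post B</title><author>bob</author></entry>
</feed>"""


def make_sql_source():
    """Create a hypothetical SQL source: an in-memory SQLite table of posts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (title TEXT, author TEXT)")
    conn.execute("INSERT INTO posts VALUES (?, ?)", ("Post A", "alice"))
    conn.commit()
    return conn


def unified_posts(conn, xml_text):
    """Return (title, author, origin) rows from both sources via one call."""
    # Rows from the SQL source come first.
    rows = [
        (title, author, "sql")
        for title, author in conn.execute("SELECT title, author FROM posts")
    ]
    # Rows from the XML source are mapped onto the same shape.
    root = ET.fromstring(xml_text)
    for entry in root.findall("entry"):
        rows.append((entry.findtext("title"), entry.findtext("author"), "xml"))
    return rows
```

A caller of `unified_posts(make_sql_source(), XML_SOURCE)` sees one list of rows and does not need to know which store each row came from, which is the virtualization point in miniature.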
Where do Data Spaces fit into the Web's rapid evolution?
They are an essential part of the burgeoning Data Web / Semantic
Web. In short, they will take us from data “Mash-ups” (combining
web accessible data that exists without integration and repurposing
in mind) to “Mesh-ups” (combining web accessible data that exists
with integration and repurposing in mind).
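One way to picture a “Mesh-up” is the following sketch, which assumes that "with integration in mind" means the sources already share identifiers for the same things. Under that assumption, combining them is a plain set union with no per-source matching logic. The sources, the `ex:` names, and both helper functions are hypothetical.

```python
# Two hypothetical sources that publish triples using shared identifiers.
SOURCE_A = {("ex:alice", "foaf:name", "Alice")}
SOURCE_B = {("ex:alice", "ex:blog", "http://example.org/alice/blog")}


def mesh_up(*sources):
    """Combine sources by set union; shared identifiers line the data up."""
    merged = set()
    for source in sources:
        merged |= source
    return merged


def describe(triples, subject):
    """Collect every predicate/object pair known about one subject."""
    return {p: o for s, p, o in triples if s == subject}
```

Here `describe(mesh_up(SOURCE_A, SOURCE_B), "ex:alice")` yields both the name and the blog address, even though each fact came from a different source.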
Where can I see a DataSpace along the lines described, in
action?
Just look at my blog, and take the journey as follows:
What about other Data Spaces?
There are several, and I will attempt to categorize them by the
query method available:
Type 1 (Free Text Search over HTTP):
Google, MSN, Yahoo!, Amazon, eBay, and most Web 2.0 plays.
Type 2 (Free Text Search and XQuery/XPath over HTTP):
A few blogs and Wikis (Jon Udell's and a few others)
Type 3 (RDF Data Sets and SPARQL Queryable):
Type 4 (Generic Free Text Search, OpenSearch, GData, XQuery/XPath,
and SPARQL):
Points of Semantic Web presence such as the Data Spaces at:
What about Data Space aware tools?
- OpenLink Ajax Toolkit - provides JavaScript Control level binding
to Query Services such as XMLA for SQL, GData for Free Text,
OpenSearch for Free Text, SPARQL for RDF, in addition to service
specific Web Services (Web 2.0 hosted solutions that expose service
specific APIs)
- Semantic Radar - a Firefox Extension
- PingTheSemantic - the Semantic Web's equivalent of Web 2.0's
weblogs.com
- PiggyBank - a Firefox Extension
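Query services such as SPARQL are reached over plain HTTP, so a client needs very little to talk to a SPARQL-queryable Data Space. The sketch below only builds the request URL, following the SPARQL Protocol convention of carrying the query text in a `query` parameter; it does not contact any server. The endpoint address and the function names are placeholders assumed for the example, not a real service.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint; substitute the address of a real SPARQL service.
ENDPOINT = "http://example.org/sparql"

QUERY = """PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name WHERE { ?person foaf:name ?name } LIMIT 10"""


def sparql_get_url(endpoint, query):
    """Build a GET URL with the query text URL-encoded in 'query'."""
    return endpoint + "?" + urlencode({"query": query})


def query_from_url(url):
    """Decode the query text back out of a URL built by sparql_get_url."""
    return parse_qs(urlsplit(url).query)["query"][0]


url = sparql_get_url(ENDPOINT, QUERY)
```

Fetching `url` with any HTTP client would then return the result set in whatever format the endpoint is configured to serve.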