Kingsley Uyi Idehen
Lexington, United States
Web 2.0's Open Data Access Conundrum
Open Data Access and Web 2.0 have a very strange relationship
that continues to blur the lines of demarcation between where Web
2.0 ends and where Web.Next (i.e., Web 3.0, Semantic/Data Web, Web of
Databases etc.) starts. But before I proceed, let me attempt to
define Web 2.0 one more time:
A phase in the evolution of web usage patterns that emphasizes Web Services based interaction between “Web Users” and “Points of Web Presence” over traditional interaction between “Web Users” and “Web Sites”. Basically, a transition from visual site interaction to presence-based interaction.
BTW - Dare Obasanjo also commented about Web usage patterns in his post titled The Two Webs, where he concluded that we have a dichotomy along the lines of HTTP-for-APIs (2.0) and HTTP-for-Browsers (1.0), which Jon Udell evolved into HTTP-Services-Web and HTTP-Interactive-Web during our recent podcast conversation.
With definitions in place, I will resume my quest to unveil the
aforementioned Web 2.0 Data Access Conundrum:
- Emphasis on XML's prowess in the realms of Data and Protocol Modeling alongside Data Representation, especially as SOAP- or REST-style Web Services and various XML formats (RSS 0.92/1.0/1.1/2.0, Atom, OPML, OCS, etc.) collectively define the Web 2.0 infrastructure landscape
- Where a modicum of Data Access appreciation and comprehension does exist, it is inherently compromised by business models that mandate some form of “Walled Gardens” and “Data Silos”
- Mash-ups are a response to said “Walled Gardens” and “Data Silos”. Mash-ups, by definition, imply combining things that were not built for recombination.
As you can see from the above, Open Data access isn't genuinely
compatible with Web 2.0.
We can also look at the same issue by way of the popular M-V-C
(Model View Controller) pattern. Web 2.0 is all about the “V” and
“C” with a modicum of “M” at best (data access, open data access,
and flexible open data access are completely separate things). The
“C” items represent application logic exposed by SOAP or REST style
web services etc. I'll return to this later in this post.
What about Social Networking, you must be thinking? Isn't this a Web 2.0 manifestation? Not at all (IMHO). The Web was developed / invented by Tim Berners-Lee to leverage the “Network Effects” potential of the Internet for connecting People and Data. Social Networking, on the other hand, is simply one of several ways by which we construct network connections. I am sure we all accept the fact that connections are built for many other reasons beyond social interaction. That said, we also know that through social interactions we actually develop some of our most valuable relationships (we are social creatures after all).
The Web 2.0 Open Data Access impedance reality is ultimately going to be the greatest piece of tutorial and use-case material for the Semantic Web. I take this position because it is human nature to seek Freedom (in unadulterated form), which implies the following:
- Access Data from a myriad of data sources (irrespective of
structural differences at the database level)
- Mesh (not Mash) data in new and interesting ways
- Share the meshed data with as many relevant people as possible
for social, professional, political, religious, and other
reasons
- Construct valuable networks based on data oriented
connections
Web 2.0 by definition and use case scenarios is inherently
incompatible with the above due to the lack of Flexible and Open
Data Access.
If we take the definition of Web 2.0 (above) and rework it with an appreciation of Flexible and Open Data Access, you arrive at something like this:
A phase in the evolution of the web that emphasizes interaction between “Web Users” and “Web Data” facilitated by Web Services based APIs and an Open & Flexible Data Access Model.
In more succinct form:
A pervasive network of people
connected by data or data connected by people.
Returning to M-V-C and looking at the definition above, you now have a complete “M”, which is enigmatic in Web 2.0 and the essence of the Semantic Web (Data and Context).
To make all of this possible, a palatable Data Model is required. The model of choice is the Graph-based RDF Data Model - not to be mistaken for the RDF/XML serialization, which is just that: a data serialization that conforms to the aforementioned RDF data model.
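To make that distinction concrete, here is a minimal sketch (using the rdflib Python library; the namespace and triple are hypothetical, chosen purely for illustration) that emits one and the same RDF graph in both RDF/XML and Turtle:

```python
# One RDF graph, two serializations: the data model stays the same,
# only the wire format changes. Namespace and triple are made up.
from rdflib import Graph, Literal, Namespace, URIRef

EX = Namespace("http://example.org/")  # hypothetical namespace

g = Graph()
g.add((URIRef(EX["kidehen"]),
       URIRef(EX["authorOf"]),
       Literal("Web 2.0's Open Data Access Conundrum")))

print(g.serialize(format="xml"))     # RDF/XML serialization
print(g.serialize(format="turtle"))  # Turtle serialization of the same model
```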
The Enterprise Challenge
Web 2.0 cannot and will not make valuable inroads into the enterprise, because enterprises live and die by their ability to exploit data. Weblogs, Wikis, Shared Bookmarking Systems, and other Web 2.0 distributed collaborative application profiles are only valuable if the data is available to the enterprise for meshing (not mashing).
A good example of how enterprises will exploit data by
leveraging networks of people and data (social networks in this
case) is shown in this nice presentation by Accenture's Institute
for High Performance Business titled: Visualizing
Organizational Change.
Web 2.0 commentators (for the most part) continue to ponder the use of Web 2.0 within the enterprise while forgetting the congruency between enterprise agility and the exploitation of people & data networks (the very issue emphasized in the original Web vision document by Tim Berners-Lee). Even worse, they remain challenged or spooked by the Semantic Web vision because they do not understand that Web 2.0 is fundamentally a Semantic Web precursor due to its Open Data Access challenges. Web 2.0 is one of the greatest demonstrations of why we need the Semantic Web at the current time.
Finally, juxtapose the items below and you may even get a clearer view of what I am attempting to convey about the virtues of Open Data Access and the inflective role it plays as we move beyond Web 2.0:
- Information Management Proposal - Tim Berners-Lee
- Visualizing Organizational Change - Accenture Institute for High Performance Business
09/02/2006 16:47 GMT-0500 | Modified: 11/16/2006 15:51 GMT-0500
Data Spaces and Web of Databases
Note: this is an updated version of a previously unpublished blog post.
Continuing from our recent podcast conversation, Jon Udell sheds further light on the essence of our discussion via a “Strategic Developer” column article titled: Accessing the web of databases.
Below, I present an initial dump of a DataSpace FAQ that hopefully sheds light on the DataSpace vision espoused during my podcast conversation with Jon.
What is a DataSpace?
A moniker for Web-accessible atomic containers that manage and
expose Data, Information, Services, Processes, and Knowledge.
What would you typically find in a Data Space? Examples
include:
- Raw Data - SQL, HTML, XML (raw), XHTML, RDF etc.
- Information (Data In Context) - XHTML (various microformats),
Blog Posts (in RSS, Atom, RSS-RDF formats), Subscription Lists
(OPML, OCS, etc), Social Networks (FOAF, XFN etc.), and many other
forms of applied XML.
- Web Services (Application/Service Logic) - REST or SOAP based
invocation of application logic for context sensitive and
controlled data access and manipulation.
- Persisted Knowledge - Information in actionable context that is also available in transient or persistent forms expressed using a Graph Data Model. A modern knowledgebase would more than likely have RDF as its Data Language, RDFS as its Schema Language, and OWL as its Domain Definition (Ontology) Language. Actual Domain, Schema, and Instance Data would be serialized using formats such as RDF/XML, N3, Turtle, etc. (see the sketch after this list).
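As a hedged illustration of that layering, the sketch below defines a tiny hypothetical vocabulary (OWL for the domain definition, RDFS for the schema annotations, RDF for the instance data), serialized as Turtle and loaded with Python's rdflib:

```python
# Domain definition (OWL), schema (RDFS), and instance data (RDF),
# all expressed in one Turtle document. Vocabulary is illustrative only.
from rdflib import Graph

turtle_doc = """
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix ex:   <http://example.org/dataspace#> .

ex:BlogPost a owl:Class ;                  # domain definition
    rdfs:label "Blog Post" .               # schema annotation

ex:post1 a ex:BlogPost ;                   # instance data
    rdfs:label "Data Spaces and Web of Databases" .
"""

g = Graph()
g.parse(data=turtle_doc, format="turtle")
print(g.serialize(format="n3"))  # the same knowledge, re-serialized as N3
```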
How do Data Spaces and Databases differ?
Data Spaces are fundamentally problem-domain-specific database applications. They offer functionality that you would instinctively expect of a database (e.g. ACID data management) with the additional benefit of being data model and query language agnostic. Data Spaces are for the most part DBMS Engine and Data Access Middleware hybrids in the sense that ownership and control of data is inherently loosely-coupled.
How do Data Spaces and Content Management Systems differ?
Data Spaces are inherently more flexible: they support multiple data models and data representation formats. Content management systems do not possess the same degree of data model and data representation dexterity.
How do Data Spaces and Knowledgebases differ?
A Data Space cannot dictate the perception of its content. For instance, what I may consider knowledge relative to my Data Space may not be the case to a remote client that interacts with it from a distance. Thus, defining my Data Space purely as a Knowledgebase introduces constraints that reduce its broader effectiveness to third-party clients (applications, services, users, etc.). A Knowledgebase is based on a Graph Data Model, resulting in significant impedance for clients that are built around alternative models. To reiterate, Data Spaces support multiple data models.
What Architectural Components make up a Data Space?
- ORDBMS Engine - for Data Modeling agility (via complex purpose-specific data types and data access methods), Data Atomicity, Data Consistency, Transaction Isolation, and Durability (aka ACID).
- Virtual Database Engine - for creating a single view of, and
access point to, heterogeneous SQL, XML, Free Text, and other data.
This is all about Virtualization at the Data Access Level.
- Web Services Platform - enabling controlled access and
manipulation (via application, service, or protocol logic) of
Virtualized or Disparate Data. This layer handles the decoupling of
functionality from monolithic wholes for function specific
invocation via Web Services using either the SOAP or REST
approach.
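Purely as an illustrative sketch of that third component, the toy Python server below exposes a REST-style endpoint over an in-memory store; a real Data Space would front the virtual database layer rather than a dict, and the path scheme here is hypothetical.

```python
# Toy "Web Services Platform": REST-style GET access to records by key.
# In a real Data Space the handler would delegate to the virtual database.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

DATA = {"post1": {"title": "Data Spaces and Web of Databases"}}  # stand-in store

class DataSpaceHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        key = self.path.rstrip("/").split("/")[-1]  # e.g. GET /data/post1
        record = DATA.get(key)
        self.send_response(200 if record else 404)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(json.dumps(record or {"error": "not found"}).encode())

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), DataSpaceHandler).serve_forever()
```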
Where do Data Spaces fit into the Web's rapid evolution?
They are an essential part of the burgeoning Data Web / Semantic
Web. In short, they will take us from data “Mash-ups” (combining
web accessible data that exists without integration and repurposing
in mind) to “Mesh-ups” (combining web accessible data that exists
with integration and repurposing in mind).
Where can I see a DataSpace along the lines described, in action?
Just look at my blog, and take the journey from there.
What about other Data Spaces?
There are several, and I will attempt to categorize them along the lines of the query methods available:
Type 1 (Free Text Search over HTTP):
Google, MSN, Yahoo!, Amazon, eBay, and most Web 2.0 plays.
Type 2 (Free Text Search and XQuery/XPath over HTTP):
A few blogs and Wikis (Jon Udell's and a few others).
Type 3 (RDF Data Sets and SPARQL Queryable):
Type 4 (Generic Free Text Search, OpenSearch, GData, XQuery/XPath, and SPARQL):
Points of Semantic Web presence such as the Data Spaces at:
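For the Type 3 and Type 4 cases, a client can be as simple as an HTTP request carrying a SPARQL query. Here is a sketch using the SPARQLWrapper Python library, with a hypothetical endpoint URL standing in for a real SPARQL-queryable Data Space:

```python
# Query a SPARQL endpoint over HTTP and print the first few triples.
# The endpoint URL is hypothetical; substitute any SPARQL-queryable space.
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("http://example.org/sparql")  # hypothetical endpoint
endpoint.setQuery("SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5")
endpoint.setReturnFormat(JSON)

results = endpoint.query().convert()
for b in results["results"]["bindings"]:
    print(b["s"]["value"], b["p"]["value"], b["o"]["value"])
```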
What About Data Space aware tools?
- OpenLink Ajax Toolkit - provides JavaScript control-level binding to Query Services such as XMLA for SQL, GData for Free Text, OpenSearch for Free Text, and SPARQL for RDF, in addition to service-specific Web Services (Web 2.0 hosted solutions that expose service-specific APIs)
- Semantic Radar - a Firefox Extension
- PingTheSemanticWeb - the Semantic Web's equivalent of Web 2.0's weblogs.com
- PiggyBank - a Firefox Extension
08/28/2006 19:38 GMT-0500 | Modified: 09/04/2006 18:58 GMT-0500
The WWW Proposal and RDF: Then and Now (circa 1999)
I've just re-read an article penned by Dan Brickley in 1999 titled The WWW Proposal and RDF: Then and Now, which retains its prescience to this very day. Ironically, I stumbled across this timeless piece while revisiting the RSS name imbroglio that gave us a simple syndication format (RSS 2.0) that will ultimately implode (IMHO), since "Simple" is ultimately short-lived when dealing with attention-challenged end-users who are always assumed to be dumb when in fact they are simply ambivalent.
I was compelled to go back to the RSS 2.0 imbroglio when I came across Dave Winer's comments re. "the SEC attempting to reinvent RSS 2.0..." in response to Jon Udell's recent XBRL article.
Although I don't believe in complex entry points into complex
technology realms, I do subscribe to the approach where developers
deal with the complexity associated with a problem domain while
hiding said complexity from ambivalent end-users via coherent
interfaces -- which does not always imply User Interface.
XBRL is a great piece of work that addresses the complex problem domain of Financial Reporting. The only thing it's missing right now is an Ontology that facilitates RDF Data Model based XBRL Schema and Instance Data, which would ultimately make XBRL data available to RDF query languages such as SPARQL. This line of thought implies, for instance, an XML Schema to OWL Ontology Mapping for Schema Data (as explained in a white paper by the VSIS Group at the University of Hamburg), leaving the Instance Data to be generated in a myriad of ways that include XML to RDF and/or XML->SQL->RDF.
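To make the instance-data leg concrete, here is a rough Python sketch that lifts facts from a hypothetical, heavily simplified XBRL-like XML document into RDF triples; the element names, namespace, and naive element-to-property rule are all illustrative, since a real mapping would be driven by the XML Schema to OWL ontology mapping mentioned above.

```python
# Naive XML -> RDF lifting: each child element of the report becomes a
# property/value pair on the reporting entity. Vocabulary is made up.
import xml.etree.ElementTree as ET
from rdflib import Graph, Literal, Namespace, URIRef

EX = Namespace("http://example.org/xbrl#")  # hypothetical vocabulary

instance_xml = """
<report entity="ACME">
  <Revenue>1000000</Revenue>
  <NetIncome>250000</NetIncome>
</report>
"""

root = ET.fromstring(instance_xml)
entity = URIRef(EX[root.get("entity")])

g = Graph()
for fact in root:
    g.add((entity, URIRef(EX[fact.tag]), Literal(fact.text)))

print(g.serialize(format="turtle"))  # now queryable with SPARQL
```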
As I stated in an earlier post: we should not mistake ambivalence for lack of intelligence. Assuming "Simple" is always right at all times is another way of subscribing to this profound misconception. You know, assuming the world was flat (as opposed to a geoid) was quite palatable at some point in the history of mankind; I wonder what would have happened if we had held on to this point of view to this day because of its "Simplicity"?
08/28/2006 06:20 GMT-0500 | Modified: 09/30/2006 16:27 GMT-0500
OpenLink Ajax Toolkit (OAT) 1.0 Released
We have finally released the 1.0 edition of OAT.
OAT offers a broad JavaScript-based, browser-independent widget set for building data-source-independent rich internet applications that are usable across a wide range of Ajax-capable web browsers.
OAT supports binding to the following data sources via its Ajax Database Connectivity Layer:
- SQL Data via XML for Analysis (XMLA)
- Web Data via SPARQL, GData, and OpenSearch Query Services
- Web Services specific Data via service-specific binding to SOAP and REST style web services
The toolkit includes a collection of powerful rich internet application prototypes, including: SQL Query By Example, Visual Database Modeling, and a Data-bound Web Form Designer.
Project homepage on sourceforge.net:
http://sourceforge.net/projects/oat
Source Code:
http://sourceforge.net/projects/oat/files
Live demonstration:
http://www.openlinksw.com/oat/
08/08/2006 22:11 GMT-0500 | Modified: 08/09/2006 05:12 GMT-0500
Semantic Knight vs Web Hacker
"
SEMANTIC KNIGHT:
None shall pass without formally defining the ontological
meta-semantic thingies of their domain something-or-others!
HACKER:
What?
SEMANTIC KNIGHT:
None shall pass without using all sorts of semantic
meta-meta-meta-stuff that we will invent Real Soon Now!
HACKER:
I have no quarrel with you, good Sir Knight, but I must get my work
done on the Web. Stand aside!
More from: Semantic Knight vs. Web Hacker Duel. A nice antidote to lots of self-righteous talk in the aftermath of the TBL-Norvig encounter. Thanks, York.
"
(Via Valentin
Zacharias.)
07/23/2006 19:37 GMT-0500 | Modified: 07/24/2006 15:09 GMT-0500
GeoRSS & Geonames for Philanthropy re. Kiva Microfinance
(Via Geospatial
Semantic Web Blog.)
GeoRSS & Geonames for Philanthropy: "
I heard about Kiva.ORG in a BusinessWeek podcast. After visiting its website, I think there are a few places where GeoRSS (in the RDF/A syntax) and Geonames can be used to enhance the site’s functionality.
Kiva.ORG Background
It’s a microfinance website for people in developing countries. Its business model sits at the intersection of peer-to-peer financing and philanthropy. The goal is to help developing-country businesses borrow small loans from a large group of Web users, so that they can avoid paying high interest to banks.
For example, a person in Uganda can request a $500 loan and use it for buying and selling more poultry. One or more lenders (anyone on the Web) may decide to grant loans to that person in increments as tiny as $25. After a few years, that person will pay back the loans to the lenders.
How GeoRSS and Geonames Can Help
I went to the website and discovered the site has a relatively weak search and browsing interface. In particular, there is no way to group loan requests based on geographical locations (e.g., countries, cities, and regions).
I took a look at individual loan pages. Each page actually has standard ways to describe location information — e.g., Location: Mbale, Uganda.
It should be relatively easy to add GeoRSS points (in the RDF/A syntax) to describe this location information (an alternative may be using Microformat Geo or W3C Geo). Once the location information is annotated, one can imagine building a map mashup to display loan requests in a geospatial perspective. One can also build search engines to support spatial queries such as ‘find me all loans from Mbale’.
Since Kiva.ORG webmasters may not be GIS experts, it would be nice if we could find ways to automatically geocode location information and describe it using GeoRSS. This automatic geocoding procedure can be developed using Geonames’s web services. Take a string ‘Mbale’ or ‘Uganda’, and send it to Geonames’s search service. The procedure will get back a JSON or XML description of the location, which includes latitude and longitude. This can then be used to annotate the location information in a Kiva loan page.
Can you think of other ways to help Kiva.ORG to become more
‘geospatially intelligent’?
You can learn more about Kiva.ORG at its website and listen to
this podcast.
"
07/15/2006 14:11 GMT-0500 | Modified: 07/15/2006 10:48 GMT-0500
Standards as social contracts
Standards as social contracts: "Looking at Dave Winer's efforts
in evangelizing OPML, I try to draw some rough lines into what
makes a de-facto standard. De Facto standards are made and seldom
happen on their own. In this entry, I look back at the history of
HTML, RSS, the open source movement and try to draw some lines as
to what makes a standard.
"
(Via Tristan Louis.)
I posted a comment to Tristan Louis' post along the following lines:
The analysis is spot on re. the link between de facto standardization and bootstrapping. Likewise, the clear linkage between bootstrapping and connected communities (a variation of the social networking paradigm).
Dave built a community around an XML content syndication and subscription use-case demo that we know today as the blogosphere.
Superficially, one may conclude that the Semantic Web vision has suffered to date from the lack of a similar bootstrap effort. In reality, we are dealing with "time and context" issues that are critical to the base understanding upon which a "Dave Winer" style bootstrap for the Semantic Web would occur.
Personally, I see the emergence of Web 2.0 (esp. the mashups phenomenon) as the "time and context" seeds from which the Semantic Web bootstrap will sprout. I see shared ontologies such as FOAF and SIOC leading the way (they are the RSS 2.0s of the Semantic Web, IMHO).
07/04/2006 17:25 GMT-0500 | Modified: 07/04/2006 14:53 GMT-0500
Structured Data vs. Unstructured Data
There is an interesting article at regdeveloper.com titled: Structured data is boring and useless. This article provides insight into a serious point of confusion about what exactly is structured vs. unstructured data. Here is a key excerpt:
"We all know that structured data is boring and
useless; while unstructured data is sexy and chock full of value.
Well, only up to a point, Lord Copper. Genuinely unstructured data
can be a real nuisance - imagine extracting the return address from
an unstructured letter, without letterhead and any of the
formatting usually applied to letters. A letter may be thought of
as unstructured data, but most business letters are, in fact,
highly-structured." ....
Duncan Pauly, founder and chief technology officer of Coppereye, adds eloquent insight to the conversation:
"The labels "structured data" and "unstructured
data" are often used ambiguously by different interest groups; and
often used lazily to cover multiple distinct aspects of the issue.
In reality, there are at least three orthogonal aspects to
structure:
* The structure of the data itself.
* The structure of the container that hosts the data.
* The structure of the access method used to access the data.
These three dimensions are largely independent and one does not
need to imply another. For example, it is absolutely feasible and
reasonable to store unstructured data in a structured database
container and access it by unstructured search
mechanisms."
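Pauly's last point is easy to demonstrate. The Python sketch below stores an unstructured snippet of letter text in a structured container (SQLite) and retrieves it with an unstructured access method (full-text search); it assumes a SQLite build that includes the FTS5 extension.

```python
# Unstructured data, structured container, unstructured access method.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE letters USING fts5(body)")  # needs FTS5
conn.execute(
    "INSERT INTO letters (body) VALUES (?)",
    ("Please send the return shipment to 42 Elm Street, Lexington.",),
)

# Full-text search over free text held inside a relational container.
for (body,) in conn.execute("SELECT body FROM letters WHERE letters MATCH ?",
                            ("return",)):
    print(body)
```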
Data understanding and appreciation is dwindling at a time when the reverse should be happening. We are supposed to be in the throes of the "Information Age", but for some reason this appears to have no correlation with data and "data access" in the minds of many -- as reflected in the broad, contradictory positions taken re. unstructured data vs. structured data: structured is boring and useless while unstructured is useful and sexy....
The difference between "Structured Containers" and "Structured Data" is clearly misunderstood by most (an unfortunate fact).
For instance, all DBMS products are "Structured Containers" aligned to one or more data models (typically one). These products have been limited by proprietary data access APIs and underlying data model specificity when used in the "Open-world" model that is at the core of the World Wide Web. This confusion also carries over to the misconception that Web 2.0 and the Semantic/Data Web are mutually exclusive.
But things are changing fast, and the concept of multi-model DBMS products is beginning to crystallize. On our part, we have finally released the long-promised "OpenLink Data Spaces" application layer that has been developed using our Virtuoso Universal Server. We have structured, unified storage containment exposed to the data web cloud via endpoints for querying or accessing data using a variety of mechanisms that include: GData, OpenSearch, SPARQL, XQuery/XPath, SQL, etc.
To be continued....
06/23/2006 18:35 GMT-0500 | Modified: 06/27/2006 01:39 GMT-0500
Contd: Ajax Database Connectivity Demos
Last week I put out a series of screencast-style demos that sought to demonstrate the core elements of our soon-to-be-released JavaScript toolkit called OAT (OpenLink Ajax Toolkit) and its Ajax Database Connectivity layer.
The screencasts covered the following functionality realms:
- SQL Query By Example (basic)
- SQL Query By Example (advanced - pivot table construction)
- Web Form Design (basic database-driven map-based mashup)
- Web Form Design (advanced database-driven map-based mashup)
To bring additional clarity to the screencast demos and OAT in general, I have saved a number of documents that are the by-products of activities in the screencasts:
- Live XML Document produced using SQL Query By Example (basic) (you can drag and drop columns across the grid to reorder and sort the presentation)
- Live XML Document produced using QBE and Pivot Functionality (you can drag and drop the aggregate columns and rows to create your own views, etc.)
- Basic database-driven map-based mashup (works with Firefox, WebKit, Camino; click on pins to see the national flag)
- Advanced database-driven map-based mashup (works with Firefox, WebKit, Camino; records 36, 87, and 257 will unveil pivots via the lookup pin)
Notes:
- “Advanced”, as used above, simply means that I am embedding images (employee photos and national flags) and a database-driven pivot into the map pins, which serve as detail lookups in classic SQL master/details scenarios.
- The “Ajax Call In Progress..” dialog is there to show live interaction with a remote database (in this case Virtuoso, but this could be any ODBC, JDBC, OLEDB, ADO.NET, or XMLA accessible data source)
- The data access magic source (if you want to call it that) is XMLA - a standard that has been in place for years but is completely misunderstood and, as a result, underutilized
You can see a full collection of saved documents at the
following locations:
06/01/2006 22:48 GMT-0500 | Modified: 06/22/2006 08:56 GMT-0500
Screencast: Ajax Database Connectivity and SQL Query By Example
AJAX Database Connectivity is the Data Access Component of OAT (OpenLink AJAX Toolkit). It's basically an XML for Analysis (XMLA) client that enables the development and deployment of database-independent Rich Internet Applications (RIAs). Thus, you can now develop database-centric AJAX applications without lock-in at the Operating System, Database Connectivity mechanism (ODBC, JDBC, OLEDB, ADO.NET), or back-end Database levels.
XMLA has been around for a long time. Its fundamental goal was to provide Web Applications with Tabular and Multi-dimensional data access before it fell off the radar (a story too long to tell in this post).
AJAX Database connectivity only requires your target DBMS to be XMLA (direct), ODBC, JDBC, OLEDB, or ADO.NET accessible.
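To make the XMLA plumbing concrete, here is a bare-bones Python sketch of the kind of SOAP Execute request an XMLA client sends over HTTP. The endpoint URL and SQL statement are hypothetical, and error handling plus XMLA's companion Discover method are omitted.

```python
# Minimal XMLA Execute call: POST a SOAP envelope, get back an XML rowset.
import requests

XMLA_ENDPOINT = "http://localhost:8890/xmla"  # hypothetical endpoint

envelope = """<?xml version="1.0"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <Execute xmlns="urn:schemas-microsoft-com:xml-analysis">
      <Command>
        <Statement>SELECT * FROM Customers</Statement>
      </Command>
      <Properties>
        <PropertyList>
          <Format>Tabular</Format>
        </PropertyList>
      </Properties>
    </Execute>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""

resp = requests.post(
    XMLA_ENDPOINT,
    data=envelope,
    headers={"Content-Type": "text/xml",
             "SOAPAction": "urn:schemas-microsoft-com:xml-analysis:Execute"},
    timeout=30,
)
print(resp.text)  # the XML rowset a toolkit like OAT binds to its widgets
```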
I have attached a Query By Example (QBE) screencast movie enclosure to this post (should you be reading this post Web 1.0 style). The demo shows how Paradox-, Quattro Pro-, Access-, and MS Query-like user-friendly querying is achieved using AJAX Database Connectivity.
05/26/2006 17:59 GMT-0500 | Modified: 06/22/2006 08:56 GMT-0500