Data Spaces, Internet Reinvention, and Semantic Web
In the last week I've dispatched some thoughts about a number of issues (Data Spaces and Web 2.0's Open Data Access Paradox) that basically equate to identifying the Web 2.0 to Semantic Web (Data Web, Web of Databases, Web.next, etc.) inflection. One of the great things about the moderate “open data access” that we have today (courtesy of the blogosphere) is that you can observe the crystallization of new thinking, and/or new appreciation of emerging ideas, in near real-time. Of course, when we really hit the tracks with the Semantic Web this will be “conditional real-time” (i.e. you choose and control your scope of, and sensitivity to, data changes). For instance, by way of feed subscriptions, I stumbled upon a series of posts by Jason Kolb that articulate what I (and others who believe in the Semantic Web vision) have been attempting to convey in a myriad of ways via posts and commentary. Here are the links to the five-part series by Jason:
- Reinventing the Internet part 1 (appreciating “Presence” over traditional “Web Sites”)
- Reinventing the Internet part 2
- Reinventing the Internet part 3 (appreciating and comprehending URIs)
- Reinventing the Internet part 4 (a nice visualization of “Data Spaces”)
- Reinventing the Internet part 5 (everyone will have a Data Space in due course, because the Internet is really a Federation of Data Spaces)
09/04/2006 17:06 GMT-0500
|
Modified:
01/25/2007 16:50 GMT-0500
|
Web 2.0's Open Data Access Conundrum
Open Data Access and Web 2.0 have a very strange relationship that continues to blur the lines of demarcation between where Web 2.0 ends and where Web.Next (i.e. Web 3.0, Semantic/Data Web, Web of Databases, etc.) starts. But before I proceed, let me attempt to define Web 2.0 one more time: a phase in the evolution of web usage patterns that emphasizes Web Services based interaction between “Web Users” and “Points of Web Presence” over traditional interaction between “Web Users” and “Web Sites”. Basically, a transition from visual site interaction to presence based interaction. BTW - Dare Obasanjo also commented about Web usage patterns in his post titled The Two Webs, where he concluded that we had a dichotomy along the lines of HTTP-for-APIs (2.0) and HTTP-for-Browsers (1.0), which Jon Udell evolved into HTTP-Services-Web and HTTP-Interactive-Web during our recent podcast conversation. With definitions in place, I will resume my quest to unveil the aforementioned Web 2.0 Data Access Conundrum:
- Emphasis on XML's prowess in the realms of Data and Protocol Modeling alongside Data Representation, especially as SOAP or REST styles of Web Services and various XML formats (RSS 0.92/1.0/1.1/2.0, Atom, OPML, OCS, etc.) collectively define the Web 2.0 infrastructure landscape (see the feed-consumption sketch after this list)
- Where a modicum of Data Access appreciation and comprehension does exist it is inherently compromised by business models that mandate some form of “Walled Gardens” and “Data Silos”
- Mash-ups are a response to said “Walled Gardens” and “Data Silos”. Mash-ups by definition imply combining things that were not built for recombination.
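To make the HTTP-for-APIs pattern above concrete, here is a minimal sketch of consuming a syndication feed as data rather than as a page. It assumes the third-party feedparser package, and the feed URL is purely a placeholder:

```python
# A minimal sketch: consuming an RSS/Atom feed as data rather than as a page.
# Assumes the third-party "feedparser" package (pip install feedparser).
import feedparser

# Placeholder feed URL -- substitute any RSS/Atom feed you care about.
FEED_URL = "http://example.org/blog/rss.xml"

feed = feedparser.parse(FEED_URL)

# Each entry is structured data (title, link, date) extracted from the XML,
# which is exactly the HTTP-for-APIs usage pattern described above.
for entry in feed.entries[:5]:
    print(entry.get("title"), "->", entry.get("link"))
```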
As you can see from the points above, Open Data Access isn't genuinely compatible with Web 2.0. We can also look at the same issue by way of the popular M-V-C (Model-View-Controller) pattern. Web 2.0 is all about the “V” and “C”, with a modicum of “M” at best (data access, open data access, and flexible open data access are completely separate things). The “C” items represent application logic exposed by SOAP or REST style web services etc. I'll return to this later in this post.

What about Social Networking, you must be thinking? Isn't this a Web 2.0 manifestation? Not at all (IMHO). The Web was developed / invented by Tim Berners-Lee to leverage the “Network Effects” potential of the Internet for connecting People and Data. Social Networking, on the other hand, is simply one of several ways by which we construct network connections. I am sure we all accept the fact that connections are built for many other reasons beyond social interaction. That said, we also know that through social interactions we actually develop some of our most valuable relationships (we are social creatures after all).

The Web 2.0 Open Data Access impedance reality is ultimately going to be the greatest piece of tutorial and use case material for the Semantic Web. I take this position because it is human nature to seek Freedom (in unadulterated form), which implies the following:
- Access data from a myriad of data sources (irrespective of structural differences at the database level)
- Mesh (not Mash) data in new and interesting ways
- Share the meshed data with as many relevant people as possible for social, professional, political, religious, and other reasons
- Construct valuable networks based on data oriented connections
Web 2.0, by definition and use case scenarios, is inherently incompatible with the above due to the lack of Flexible and Open Data Access. If we take the definition of Web 2.0 (above) and rework it with an appreciation of Flexible and Open Data Access, you would arrive at something like this: a phase in the evolution of the web that emphasizes interaction between “Web Users” and “Web Data”, facilitated by Web Services based APIs and an Open & Flexible Data Access Model. In more succinct form: a pervasive network of people connected by data, or data connected by people.

Returning to M-V-C and looking at the definition above, you now have a complete “M”, which is enigmatic in Web 2.0 and the essence of the Semantic Web (Data and Context). To make all of this possible a palatable Data Model is required. The model of choice is the Graph based RDF Data Model - not to be mistaken for the RDF/XML serialization, which is just that: a data serialization that conforms to the aforementioned RDF data model (see the sketch at the end of this post).

The Enterprise Challenge

Web 2.0 cannot and will not make valuable inroads into the enterprise, because enterprises live and die by their ability to exploit data. Weblogs, Wikis, Shared Bookmarking Systems, and other Web 2.0 distributed collaborative application profiles are only valuable if the data is available to the enterprise for meshing (not mashing). A good example of how enterprises will exploit data by leveraging networks of people and data (social networks in this case) is shown in this nice presentation by Accenture's Institute for High Performance Business titled: Visualizing Organizational Change.

Web 2.0 commentators (for the most part) continue to ponder the use of Web 2.0 within the enterprise while forgetting the congruency between enterprise agility and the exploitation of people & data networks (the very issue emphasized in the original Web vision document by Tim Berners-Lee). Even worse, they remain challenged or spooked by the Semantic Web vision because they do not understand that Web 2.0 is fundamentally a Semantic Web precursor due to its Open Data Access challenges. Web 2.0 is one of the greatest demonstrations of why we need the Semantic Web at the current time.

Finally, juxtapose the items below and you may get an even clearer view of what I am attempting to convey about the virtues of Open Data Access and the inflective role it plays as we move beyond Web 2.0:

Information Management: A Proposal - Tim Berners-Lee

Visualizing Organizational Change - Accenture Institute for High Performance Business
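As promised above, here is a minimal sketch of the model-versus-serialization distinction, assuming the third-party Python rdflib package (the URIs are placeholders). One graph of triples, two serializations:

```python
# A minimal sketch: one RDF graph, two serializations of the same model.
# Assumes the third-party "rdflib" package (pip install rdflib).
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import FOAF

g = Graph()
me = URIRef("http://example.org/people/me")  # placeholder URI

# The graph data model itself: a set of (subject, predicate, object) triples.
g.add((me, FOAF.name, Literal("Jane Blogger")))
g.add((me, FOAF.homepage, URIRef("http://example.org/")))

# Two serializations of the *same* underlying model.
print(g.serialize(format="xml"))  # RDF/XML
print(g.serialize(format="n3"))   # N3/Turtle family
```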
09/02/2006 16:47 GMT-0500
|
Modified:
11/16/2006 15:51 GMT-0500
|
Data Spaces and Web of Databases
Note: This is an updated version of a previously unpublished blog post.
Continuing from our recent podcast conversation, Jon Udell sheds further light on the essence of our conversation via a “Strategic Developer” column article titled: Accessing the web of databases. Below, I present an initial dump of a DataSpace FAQ that hopefully sheds light on the DataSpace vision espoused during my podcast conversation with Jon.

What is a DataSpace?

A moniker for Web-accessible atomic containers that manage and expose Data, Information, Services, Processes, and Knowledge.

What would you typically find in a Data Space? Examples include:

- Raw Data - SQL, HTML, XML (raw), XHTML, RDF, etc.
- Information (Data In Context) - XHTML (various microformats), Blog Posts (in RSS, Atom, RSS-RDF formats), Subscription Lists (OPML, OCS, etc), Social Networks (FOAF, XFN etc.), and many other forms of applied XML.
- Web Services (Application/Service Logic) - REST or SOAP based invocation of application logic for context sensitive and controlled data access and manipulation.
- Persisted Knowledge - Information in actionable context that is also available in transient or persistent forms expressed using a Graph Data Model. A modern knowledgebase would more than likely have RDF as its Data Language, RDFS as its Schema Language, and OWL as its Domain Definition (Ontology) Language. Actual Domain, Schema, and Instance Data would be serialized using formats such as RDF/XML, N3, Turtle, etc.
How do Data Spaces and Databases differ?

Data Spaces are fundamentally problem-domain-specific database applications. They offer functionality that you would instinctively expect of a database (e.g. ACID data management) with the additional benefit of being data model and query language agnostic. Data Spaces are for the most part DBMS Engine and Data Access Middleware hybrids, in the sense that ownership and control of data is inherently loosely coupled.

How do Data Spaces and Content Management Systems differ?

Data Spaces are inherently more flexible: they support multiple data models and data representation formats. Content management systems do not possess the same degree of data model and data representation dexterity.

How do Data Spaces and Knowledgebases differ?

A Data Space cannot dictate the perception of its content. For instance, what I may consider knowledge relative to my Data Space may not be the case for a remote client that interacts with it from a distance. Thus, defining my Data Space purely as a Knowledgebase introduces constraints that reduce its broader effectiveness to third party clients (applications, services, users, etc.). A Knowledgebase is based on a Graph Data Model, resulting in significant impedance for clients that are built around alternative models. To reiterate, Data Spaces support multiple data models.

What Architectural Components make up a Data Space?

- ORDBMS Engine - for Data Modeling agility (via complex purpose specific data types and data access methods), Data Atomicity, Data Consistency, Transaction Isolation, and Durability (aka ACID).
- Virtual Database Engine - for creating a single view of, and access point to, heterogeneous SQL, XML, Free Text, and other data. This is all about Virtualization at the Data Access Level (a toy sketch of this idea follows the list below).
- Web Services Platform - enabling controlled access and manipulation (via application, service, or protocol logic) of Virtualized or Disparate Data. This layer handles the decoupling of functionality from monolithic wholes for function specific invocation via Web Services using either the SOAP or REST approach.
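Here is the toy sketch referenced in the list above: a deliberately simplified, hypothetical illustration (every class and name is made up, and it is nothing like a production virtual database engine such as Virtuoso's) of one access point dispatching a search across heterogeneous containers:

```python
# A toy illustration of "a single view of, and access point to, heterogeneous data".
# All names here are hypothetical; a real virtual database engine is far more involved.
import sqlite3
from typing import Iterable


class SqlSource:
    """Wraps a relational container (here: in-memory SQLite)."""

    def __init__(self) -> None:
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE posts (title TEXT)")
        self.db.execute("INSERT INTO posts VALUES ('Data Spaces and Web of Databases')")

    def search(self, term: str) -> Iterable[str]:
        cur = self.db.execute("SELECT title FROM posts WHERE title LIKE ?", (f"%{term}%",))
        return [row[0] for row in cur]


class FreeTextSource:
    """Stands in for a free-text index; here just a list of strings."""

    def __init__(self, docs: list[str]) -> None:
        self.docs = docs

    def search(self, term: str) -> Iterable[str]:
        return [d for d in self.docs if term.lower() in d.lower()]


class VirtualDatabase:
    """One access point over heterogeneous sources: the virtualization layer."""

    def __init__(self, sources: list) -> None:
        self.sources = sources

    def search(self, term: str) -> list[str]:
        results: list[str] = []
        for source in self.sources:
            results.extend(source.search(term))
        return results


vdb = VirtualDatabase([SqlSource(), FreeTextSource(["Accessing the web of databases"])])
print(vdb.search("databases"))  # hits from both containers via one call
```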
Where do Data Spaces fit into the Web's rapid evolution?

They are an essential part of the burgeoning Data Web / Semantic Web. In short, they will take us from data “Mash-ups” (combining web accessible data that exists without integration and repurposing in mind) to “Mesh-ups” (combining web accessible data that exists with integration and repurposing in mind).

Where can I see a DataSpace, along the lines described, in action?

Just look at my blog, and take the journey as follows:

What about other Data Spaces?

There are several, and I will attempt to categorize them along the lines of the query methods available:

- Type 1 (Free Text Search over HTTP): Google, MSN, Yahoo!, Amazon, eBay, and most Web 2.0 plays
- Type 2 (Free Text Search and XQuery/XPath over HTTP): a few blogs and Wikis (Jon Udell's and a few others)
- Type 3 (RDF Data Sets and SPARQL Queryable):
- Type 4 (Generic Free Text Search, OpenSearch, GData, XQuery/XPath, and SPARQL): points of Semantic Web presence such as the Data Spaces at:

What About Data Space aware tools? (A SPARQL query sketch follows the list below.)

- OpenLink Ajax Toolkit - provides Javascript Control level binding to Query Services such as XMLA for SQL, GData for Free Text, OpenSearch for Free Text, and SPARQL for RDF, in addition to service specific Web Services (Web 2.0 hosted solutions that expose service specific APIs)
- Semantic Radar - a Firefox Extension
- PingTheSemanticWeb - the Semantic Web's equivalent of Web 2.0's weblogs.com
- PiggyBank - a Firefox Extension
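And here is the promised SPARQL query sketch, showing the kind of Type 3/4 interaction described above. It assumes the third-party Python SPARQLWrapper package, and the endpoint URL is a placeholder for any SPARQL-queryable Data Space:

```python
# A minimal sketch: querying a SPARQL-queryable Data Space over HTTP.
# Assumes the third-party "SPARQLWrapper" package (pip install sparqlwrapper).
from SPARQLWrapper import SPARQLWrapper, JSON

# Placeholder endpoint URL -- substitute any SPARQL endpoint you have access to.
endpoint = SPARQLWrapper("http://example.org/sparql")
endpoint.setQuery("""
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name WHERE { ?person foaf:name ?name } LIMIT 10
""")
endpoint.setReturnFormat(JSON)

# Results come back as JSON bindings, one row per matching triple pattern.
results = endpoint.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["name"]["value"])
```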
08/28/2006 19:38 GMT-0500
|
Modified:
09/04/2006 18:58 GMT-0500
|
Paul Graham was Surprised by Google Calendar?
Dare's insightful take below sheds light on the problems associated with building Web 2.0 business offerings around a single Collaborative Application feature, as opposed to a coherently integrated platform.
BTW - I am just as perplexed as Dare about Paul Graham being blindsided by the integration of Calendaring and Email by Google.
Paul Graham was Surprised by Google Calendar?: "I was just reading Paul Graham's post entitled The Kiko Affair which talks about the recent failure of Kiko, an AJAX web-calendaring application. I was quite surprised to see the following sentence in Paul Graham's post:

The killer, unforeseen by the Kikos and by us, was Google Calendar's integration with Gmail. The Kikos can't very well write their own Gmail to compete.

Integrating a calendaring application with an email application seems pretty obvious to me, especially since the most popular usage of calendaring applications is using Outlook/Exchange to schedule meetings in corporate environments. What's surprising to me is how surprised people are that an idea that failed in the 1990s will turn out any differently now because you sprinkle the AJAX magic pixie dust on it. Kiko was a feature, not a full-fledged online destination, let alone a viable business. There'll be a lot more entrants into the TechCrunch deadpool that are features masquerading as companies before the 'Web 2.0' hype cycle runs its course."
(Via Dare Obasanjo aka Carnage4Life.)
08/19/2006 20:17 GMT-0500
|
Modified:
08/19/2006 23:39 GMT-0500
|
OpenLink Ajax Toolkit (OAT) 1.0 Released
We have finally released the 1.0 edition of OAT.
OAT offers a broad JavaScript-based, browser-independent widget set for building data-source-independent rich internet applications that work across a wide range of Ajax-capable web browsers.
OAT supports binding to the following data sources via its Ajax Database Connectivity layer:
SQL Data via XML for Analysis (XMLA)
Web Data via SPARQL, GData, and OpenSearch Query Services
Web Services specific Data via service specific binding to SOAP and REST style web services
The toolkit includes a collection of powerful rich internet application prototypes, including: SQL Query By Example, Visual Database Modeling, and a Data bound Web Form Designer.
Project homepage on sourceforge.net:
http://sourceforge.net/projects/oat
Source Code:
http://sourceforge.net/projects/oat/files
Live demonstration:
http://www.openlinksw.com/oat/
08/08/2006 22:11 GMT-0500
|
Modified:
08/09/2006 05:12 GMT-0500
|
Google vs Semantic Web
Google vs Semantic Web: "Google exec challenges Berners-Lee 'At the end of the keynote, however, things took a different turn. Google Director of Search and AAAI Fellow Peter Norvig was the first to the microphone during the Q&A session, and he took the opportunity to raise a few points.
'What I get a lot is: 'Why are you against the Semantic Web?' I am not against the Semantic Web. But from Google's point of view, there are a few things you need to overcome, incompetence being the first,' Norvig said. Norvig clarified that it was not Berners-Lee or his group that he was referring to as incompetent, but the general user.'
Related: Google Base -- summing up."
(Via More News.)
When will we drop the ill-conceived notion that end-users are incompetent?
Has it ever occurred to software developers and technology vendors that "incompetent", "dumb", and other contemptuous end-user adjectives simply reflect the inability of most technology products to surmount end-user "Interest Activation Thresholds"?
Interest Activation Threshold (IAT)? What's That?
I have a fundamental personal belief that all human beings are intelligent. Our ability to demonstrate intelligence, or be perceived as intelligent, is directly proportional to our interest level in a given context. In short, we have "Ambivalence Quotients" (AQs) just as we have "Intelligence Quotients" (IQs).
An interested human being is an inherently intelligent entity. The abstract nature of human intelligence also makes locating the IQ and AQ on/off buttons a mercurial quest at the best of times.
Technology end-users exhibit high AQs most of the time, due to the inability of most technology products to truly engage them, and ultimately stimulate genuine interest, by surmounting the IAT and reducing the AQ.
Ironically, when a technology vendor is lagging behind its competitors in the "features arms race", it is commonplace to use the familiar excuse: "our end-users aren't asking for this feature".
Note To Google:
Ambivalence isn't incompetence. If end-users were genuinely incompetent, how is it that they run rings around your PageRank algorithms by producing Google-friendly content at the expense of valuable context? What about the deteriorating value of AdSense due to click fraud? Likewise, the continued erosion of the value of your once exemplary "keyword based search" service? As we all know, necessity is the mother of invention, so when users develop high AQs because there is nothing better, we end up with a forced breach of the "IAT"; which is why the issues that I mention remain long term challenges for you. Ironically, the so-called "incompetents" are already outsmarting you, and you don't seem to comprehend this reality or its inevitable consequences.
Finally, how are you going to improve value without integrating the Semantic Web vision into your R&D roadmap? I can tell you categorically that you have little or no wiggle room re. this matter, especially if you want to remain true to your "don't be evil" mantra. My guess is that you will incorporate Semantic Web technologies sooner rather than later (Google Co-op is a big clue). I would even go as far as predicting a Google hosted SPARQL Query Endpoint alongside your GData endpoints during the next 6-12 months (if even that long). I believe that your GData protocol (like the rest of Web 2.0) will ultimately accelerate your appreciation of the data model dexterity that RDF brings to the loosely coupled knowledge networks espoused by the Semantic Web vision.
Google & Semantic Web Paradox
The Semantic Web vision has the RDF graph data model at its core (and for good reason), but even more confusing for me, as I process Google's sentiments about the Semantic Web, is the fact that RDF's actual creator (Ramanathan Guha, aka Guha) currently works at Google. There's a strange disconnect here, IMHO.
If I recall correctly, Google wants to organize the world's data and information, leaving the knowledge organization to someone else, which is absolutely fine. What is increasingly irksome is the current tendency to use corporate stature to generate Fear, Uncertainty, and Doubt when the subject matter is the "Semantic Web".
BTW - I've just read Frederick Giasson's perspective on the Google Semantic Web paradox, which ultimately leads to the same conclusions regarding Google's FUD stance when dealing with matters relating to the Semantic Web.
I wonder if anyone is tracking the google hits for "fud google semantic web"?
07/20/2006 19:19 GMT-0500
|
Modified:
07/29/2006 19:55 GMT-0500
|
Web 2.0 Self-Experiment aids Web 3.0 comprehension
Web 2.0 Self-Experiment: "
I shopped for everything except food on eBay. When working with foreign-language documents, I used translations from Babel Fish. (This worked only so well. After a Babel Fish round-trip through Italian, the preceding sentence reads, 'That one has only worked therefore well.') Why use up space storing files on my own hard drive when, thanks to certain free utilities, I can store them on Gmail's servers? I saved, sorted, and browsed photos I uploaded to Flickr. I used Skype for my phone calls, decided on books using Amazon's recommendations rather than 'expert' reviews, killed time with videos at YouTube, and listened to music through customizable sites like Pandora and Musicmatch. I kept my schedule on Google Calendar, my to-do list on Voo2do, and my outlines on iOutliner. I voyeured my neighborhood's home values via Zillow. I even used an online service for each stage of the production of this article, culminating in my typing right now in Writely rather than Word. (Being only so confident that Writely wouldn't somehow lose my work -- or as Babel Fish might put it, 'only confident therefore' -- I backed it up into Gmail files.)
Interesting article; Tim O'Reilly's response is here"
(Via Valentin Zacharias (Student).)
Tim O'Reilly's response provides the following hierarchy for Web 2.0, based on what he calls "Web 2.0-ness":
Level 3: The application could ONLY exist on the net, and draws its essential power from the network and the connections it makes possible between people or applications. These are applications that harness network effects to get better the more people use them. EBay, craigslist, Wikipedia, del.icio.us, Skype, (and yes, Dodgeball) meet this test. They are fundamentally driven by shared online activity. The web itself has this character, which Google and other search engines have then leveraged. (You can search on the desktop, but without link activity, many of the techniques that make web search work so well are not available to you.) Web crawling is one of the fundamental Web 2.0 activities, and search applications like Adsense for Content also clearly have Web 2.0 at their heart. I had a conversation with Eric Schmidt, the CEO of Google, the other day, and he summed up his philosophy and strategy as "Don't fight the internet." In the hierarchy of web 2.0 applications, the highest level is to embrace the network, to understand what creates network effects, and then to harness them in everything you do.
Level 2: The application could exist offline, but it is uniquely advantaged by being online. Flickr is a great example. You can have a local photo management application (like iPhoto) but the application gains remarkable power by leveraging an online community. In fact, the shared photo database, the online community, and the artifacts it creates (like the tag database) is central to what distinguishes Flickr from its offline counterparts. And its fuller embrace of the internet (for example, that the default state of uploaded photos is "public") is what distinguishes it from its online predecessors.
Level 1: The application can and does exist successfully offline, but it gains additional features by being online. Writely is a great example. If you want to do collaborative editing, its online component is terrific, but if you want to write alone, as Fallows did, it gives you little benefit (other than availability from computers other than your own.)
Level 0: The application has primarily taken hold online, but it would work just as well offline if you had all the data in a local cache. MapQuest, Yahoo! Local, and Google Maps are all in this category (but mashups like housingmaps.com are at Level 3.) To the extent that online mapping applications harness user contributions, they jump to Level 2.
So, in a sense, we have near-conclusive confirmation that Web 2.0 is simply about APIs (typically service specific Data Silos or Walled Gardens) with little concern, understanding, or interest in truly open data access across the burgeoning "Web of Databases", or the Web of "Databases and Programs" that I prefer to describe as "Data Spaces".
Thus, we can truly begin to conclude that Web 3.0 (Data Web) is the addition of Flexible and Open Data Access to Web 2.0; where the Open Data Access is achieved by leveraging Semantic Web deliverables such as the RDF Data Model and the SPARQL Query Language :-)
07/17/2006 21:46 GMT-0500
|
Modified:
07/18/2006 01:17 GMT-0500
|
Standards as social contracts
Standards as social contracts: "Looking at Dave Winer's efforts in evangelizing OPML, I try to draw some rough lines into what makes a de-facto standard. De Facto standards are made and seldom happen on their own. In this entry, I look back at the history of HTML, RSS, the open source movement and try to draw some lines as to what makes a standard.
"
(Via Tristan Louis.)
I posted a comment to Tristan Louis' post along the following lines:
The analysis is spot on re. the link between de facto standardization and bootstrapping. Likewise, the clear linkage between bootstrapping and connected communities (a variation of the social networking paradigm).
Dave built a community around an XML content syndication and subscription use case demo that we know today as the blogosphere. Superficially, one may conclude that the Semantic Web vision has suffered to date from the lack of a similar bootstrap effort. Whereas in reality, we are dealing with "time and context" issues that are critical to the base understanding upon which a "Dave Winer" style bootstrap of the Semantic Web would occur.
Personally, I see the emergence of Web 2.0 (esp. the mashups phenomenon) as the "time and context" seeds from which the Semantic Web bootstrap will sprout. I see shared ontologies such as FOAF and SIOC leading the way (they are the RSS 2.0s of the Semantic Web, IMHO); a small sketch of FOAF in action follows.
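Here is the promised FOAF sketch: a minimal illustration, assuming the third-party Python rdflib package, of why a shared ontology lets independent parties mesh profile data without prior coordination (all URIs and names below are made up):

```python
# A minimal sketch: parsing a FOAF snippet and walking its social links.
# Assumes the third-party "rdflib" package; all URIs below are placeholders.
from rdflib import Graph
from rdflib.namespace import FOAF

profile = """
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<http://example.org/me> a foaf:Person ;
    foaf:name "Jane Blogger" ;
    foaf:knows <http://example.org/you> .
<http://example.org/you> foaf:name "John Reader" .
"""

g = Graph()
g.parse(data=profile, format="turtle")

# Because FOAF is a shared ontology, any consumer can traverse "knows"
# links without prior agreement with the publisher.
for person, friend in g.subject_objects(FOAF.knows):
    print(person, "knows", friend)
```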
07/04/2006 17:25 GMT-0500
|
Modified:
07/04/2006 14:53 GMT-0500
|
Syndication Format Family Tree
Important bookmark reference to note as the Web 2.0->[Data Web|Semantic Web] fusion's inflection takes shape: Syndication Format Family Tree.
This particular inflection and, ultimately, transition is going to occur at Warp Speed!
06/28/2006 16:29 GMT-0500
|
Modified:
06/28/2006 13:02 GMT-0500
|
Structured Data vs. Unstructured Data
There is an interesting article at regdeveloper.com titled: Structured data is boring and useless. This article provides insight into a serious point of confusion about what exactly is structured vs. unstructured data. Here is a key excerpt:

"We all know that structured data is boring and useless; while unstructured data is sexy and chock full of value. Well, only up to a point, Lord Copper. Genuinely unstructured data can be a real nuisance - imagine extracting the return address from an unstructured letter, without letterhead and any of the formatting usually applied to letters. A letter may be thought of as unstructured data, but most business letters are, in fact, highly-structured."

Duncan Pauly, founder and chief technology officer of CopperEye, adds eloquent insight to the conversation:

"The labels "structured data" and "unstructured data" are often used ambiguously by different interest groups; and often used lazily to cover multiple distinct aspects of the issue. In reality, there are at least three orthogonal aspects to structure:

- The structure of the data itself.
- The structure of the container that hosts the data.
- The structure of the access method used to access the data.

These three dimensions are largely independent and one does not need to imply another. For example, it is absolutely feasible and reasonable to store unstructured data in a structured database container and access it by unstructured search mechanisms." (A quick sketch demonstrating this independence follows at the end of this post.)

Data understanding and appreciation is dwindling at a time when the reverse should be happening. We are supposed to be in the throes of the "Information Age", but for some reason this appears to have no correlation with data and "data access" in the minds of many -- as reflected in the broad, contradictory positions taken re. unstructured data vs. structured data: structured is boring and useless, while unstructured is useful and sexy.

The difference between "Structured Containers" and "Structured Data" is clearly misunderstood by most (an unfortunate fact). For instance, all DBMS products are "Structured Containers" aligned to one or more data models (typically one). These products have been limited by proprietary data access APIs and underlying data model specificity when used in the "Open World" model that is at the core of the World Wide Web. This confusion also carries over to the misconception that Web 2.0 and the Semantic/Data Web are mutually exclusive.

But things are changing fast, and the concept of multi-model DBMS products is beginning to crystallize. On our part, we have finally released the long promised "OpenLink Data Spaces" application layer, developed using our Virtuoso Universal Server. We have structured unified storage containment exposed to the Data Web cloud via endpoints for querying or accessing data using a variety of mechanisms that include GData, OpenSearch, SPARQL, XQuery/XPath, SQL, etc.

To be continued....
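Here is the quick sketch referenced above, demonstrating the independence of those three dimensions using only the Python standard library (the schema and sample text are purely illustrative):

```python
# A minimal sketch: unstructured data, structured container, "unstructured" access.
# Uses only the Python standard library; schema and sample text are illustrative.
import sqlite3

db = sqlite3.connect(":memory:")

# Structured container: a relational table with typed columns.
db.execute("CREATE TABLE letters (id INTEGER PRIMARY KEY, body TEXT)")
db.execute(
    "INSERT INTO letters (body) VALUES (?)",
    ("Dear Sir, please reply to 12 High Street, Lexington. Yours, A. Customer",),
)

# Unstructured access method: a free-text-style substring search,
# indifferent to whatever internal structure the letter itself may have.
for (body,) in db.execute(
    "SELECT body FROM letters WHERE body LIKE ?", ("%High Street%",)
):
    print(body)
```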
06/23/2006 18:35 GMT-0500
|
Modified:
06/27/2006 01:39 GMT-0500
|