Kingsley Idehen's Blog Data Space

My podcast interview with Paul Miller of Talis is out. As I listened to the podcast (a naturally awkward affair), I got a first-hand sense of Paul's mastery of the art of interviewing, even when dealing with a fast-talking data blitzer like me. Personally, I think I still talk a little too fast (the Nigerian in me), especially when the subject matter hones right in on the epicenter of my professional passions: Open Data Access and Heterogeneous Data Integration (aka Virtual Database Technology) -- so you may need to rewind every now and then during the interview :-)

During this particular podcast interview, I deliberately wanted to have a conversation about the practical value of Linked Data, rather than its technical innards. The fundamental utility of Linked Data remains somewhat mercurial, and I am certainly hoping to do my bit at the upcoming Linked Data Planet conference re. demonstrating and articulating Linked Data value across the blurring realms of "the individual" and "the enterprise".

Note to my old schoolmates on Facebook: when you listen to this podcast you will at least reconcile "Uyi Idehen" with "Kingsley Idehen". Unfortunately, Facebook refuses to let me Identify myself in the manner I choose. Ideally, I would like to have the name: "Kingsley (Uyi) Idehen" associated with my Facebook ID since this is the Identifier known to my personal network of friends, family, and old schoolmates. This Identity predicament is a long running Identity case study in the making.

My Talis Podcast re. Semantic Web, Linked Data, and OpenLink Software (Fri, 16 May 2008)
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1361
If your Web presence doesn't extend beyond (X)HTML web pages, you are only participating in Web usage Dimension 1.0.

If your Web presence goes beyond (X)HTML pages, via the addition of REST or SOAP based Web Services, then you are participating in Web usage dimension 2.0.

If your Web presence includes all of the above, with the addition of structured data interlinked with structured data across other points of presence on the Web, then you are participating in Web usage dimension 3.0, i.e., the "Linked Data Web", "Web of Data", or "Data Web".

BTW - If you've already done all of the above, and you have started building intelligent agents that exploit the aforementioned structured interlinked data substrate, then you are already in Web usage dimension 4.0.

Related

Web 1.0, 2.0, and 3.0 (Yet Again) (Mon, 15 Sep 2008)
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1439
I now have the first cut of a Facebook application called: Dynamic Linked Data Pages.

What is a Dynamic Linked Data Page (DLD)?

A dynamically generated Web Page comprised of Semantic Data Web style data links (formally typed links) and traditional Document Web links (generic links lacking type specificity).

Linked Data Pages will ultimately enable Facebook users to inject their public data into the Semantic Data Web as RDF based Linked Data. For instance, my Facebook Profile & Photo Albums data is now available as RDF, without paying a cent of RDF handcrafting tax, thanks to the Virtuoso Sponger (middleware for producing RDF from non-RDF data sources), which is now equipped with a new RDFizer Cartridge for the Facebook Query Language (FQL) and RESTful Web Service.

Demo Notes:

When you click on a link in DLD pages, you will be presented with a lookup that exposes the different interaction options associated with a given URI. Examples include:

  1. Explore - find attributes and relationships that apply to the clicked URI
  2. Dereference (get the attributes of the clicked URI)
  3. Bookmark - store the URI for subsequent use, e.g., meshing with other URIs from across the Web
  4. (X)HTML Page Open - traditional Document Web link (i.e. just opens another Web document as per usual)
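For illustration, here is a minimal Python sketch (using rdflib, with a hypothetical entity URI) of what the "Dereference" and "Explore" options boil down to behind the scenes; it is not the OpenLink RDF Browser code, just the underlying idea of fetching a URI's RDF description and listing its attributes and relationships:

    from rdflib import Graph, URIRef

    entity = URIRef("http://example.org/people/kidehen#this")  # hypothetical, dereferenceable URI

    g = Graph()
    g.parse(str(entity))  # "Dereference": HTTP GET of the URI's RDF description

    # "Explore": enumerate the attributes and relationships of the clicked URI
    for predicate, value in g.predicate_objects(subject=entity):
        print(predicate, "->", value)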

Remember, the Facebook URLs (links to web pages) are being converted, on the fly, into RDF based Structured Data (a graph model database), i.e., Entity Sets that possess formally defined characteristics (attributes) and associations (relationships).

Dynamic Linked Data Pages

  1. My facebook Profile
  2. My facebook Photo Album

Saved RDF Browser Sessions

  1. My facebook Profile
  2. My facebook Photo Album

Saved SPARQL Query Definitions

  1. My facebook Profile Query
  2. My facebook Photo Album Query
Injecting Facebook Data into the Semantic Data Web (Wed, 11 Feb 2009)
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1237
As the saying goes, "A picture speaks a thousand words..". In this post I simply provide a Data Web view of Mike Bergman's post titled: More Structure, More Terminology and (hopefully) More Clarity. I am hoping the OpenLink RDF Browser view of Mike's post aids in the understanding of the following terms:

  1. Structured Data
  2. Structured Data Resources
  3. Information Resources

Note: I make no reference to "non-information" resources, since a non-information resource is a data resource that may or may not contain 100% structured data. Also note that even when structured, the format may not be RDF.

A Structured Web of Data Picture.... (Sun, 22 Jul 2007)
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1233
Chris Bizer, Richard Cyganiak, and Tom Heath have just published a Linked Data Publishing Tutorial that provides a guide to the mechanics of Linked Data injection into the Semantic Data Web.

On a different, but related, thread, Mike Bergman recently penned a post titled: What is the Structured Web?. Both of these public contributions shed light on the "Information BUS" essence of the World Wide Web by describing the evolving nature of the payload shuttled by the BUS.

What is an Information BUS?

Middleware infrastructure for shuttling "Information" between endpoints using a messaging protocol.

The Web is the dominant Information BUS within the Network Computer we know as the "Internet". It uses HTTP to shuttle information payloads between "Data Sources" and "Information Consumers" -- which is what happens when we interact with the Web via User Agents / Clients (e.g., Browsers).

What are Web Information Payloads?

HTTP transported streams of contextualized data. Hence the terms "Information Resource" and "Non Information Resource" encountered when reading material related to httpRange-14 and Web Architecture. For example, an (X)HTML document is a specific data context (representation) that enables us to perceive, or comprehend, a data stream originating from a Web Server as a Web Page. On the other hand, if the payload lacks contextualized data, a fundamental Web requirement, then the resource is referred to as a "Non Information" resource. Of course, there is really no such thing as a "Non Information" resource; with regard to Web Architecture, it's the short way of saying "the Web transmits Information only". That said, I prefer to refer to these "Non Information" resources as "Data Sources", a term well understood in the world of Data Access Middleware (ODBC, JDBC, OLE DB, ADO.NET, etc.) and Database Management Systems (Relational, Object-Relational, Object, etc.).

Examples of Information Resource and Data Source URIs:

Explanation: The Information Resource is a conduit to the Entity identified by the Data Source URI (an entity in my RDF Data Space that is the Subject or Object of one or more Triple based Statements). The triples in question can be represented as an RDF resource when transmitted over the Web via an Information Resource that takes the form of a SPARQL REST Service URL or a physical RDF based Information Resource URL.
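Since the original example links are not reproduced here, the pair below is purely illustrative (hypothetical example.org URIs) of the distinction being drawn:

    # The Data Source URI names the Entity itself; the Information Resource URL
    # addresses the document (conduit) that carries the Entity's RDF description.
    data_source_uri = "http://example.org/dataspace/kidehen#this"
    information_resource_url = "http://example.org/dataspace/kidehen"

    # Dereferencing the Entity URI resolves (via the fragment, or a redirect in
    # hash-less schemes) to the Information Resource that travels over HTTP.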

What about Structured Data?

Prior to the emergence of the Semantic Data Web, the payloads shuttled across the Web Information BUS consisted primarily of the following:

  1. HTML - Web Resource with presentation focused structure (Web 1.0 dominant payload form)
  2. XML - Web Resource with structure that separates presentation and data (Web 2.0's dominant payload form).

The Semantic Data Web simply adds RDF to the payload formats shuttled across the Web Information BUS. RDF addresses formal data structure, which XML doesn't cover since it is semi-structured (distinct data entities aren't formally discernible). In a nutshell, an RDF payload is basically a conceptual model database packaged as an Information Resource. It's comprised of granular data items called "Entities" that expose fine grained property values, individual and/or group characteristics (attributes), and relationships (associations) with other Entities.

Where is this all headed?

The Web is in the final stages of the 3rd phase of its evolution -- a phase characterized by the shuttling of structured data payloads (RDF) alongside less data oriented payloads (HTML, XHTML, XML, etc.). As you can see, Linked Data and Structured Data are both terms used to describe the addition of more data centric payloads to the Web. Thus, you could view the process of creating a Structured Web of Linked Data as follows:

  1. Identify or Create Structured Data Sources
  2. Name these Data Sources using Data Source URIs
  3. Expose Structured Data Sources to the Web as Linked Data using Information Resource (conduit) URIs
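As a rough illustration of those three steps, here is a minimal Python/rdflib sketch using hypothetical example.org names; it is not tied to any particular product:

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF, FOAF

    EX = Namespace("http://example.org/data/")

    # Step 1: a structured data source (here, a single record)
    record = {"id": "customer-42", "name": "Acme Corp", "city": "Boston"}

    # Step 2: name the data source with a URI
    customer = EX[record["id"]]

    # Step 3: expose it as Linked Data via an Information Resource payload
    g = Graph()
    g.add((customer, RDF.type, FOAF.Organization))
    g.add((customer, FOAF.name, Literal(record["name"])))
    g.add((customer, EX.city, Literal(record["city"])))

    print(g.serialize(format="turtle"))  # the payload the conduit URI would return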

Conclusions

The Semantic Data Web is an evolution of the current Web (an Information Space) that adds structured data payloads (RDF) to current, less data oriented, structured payloads (HTML, XHTML, XML, and others).

The Semantic Data Web is increasingly seen as an inevitability because it's rapidly reaching the point of critical mass (i.e., network effect kick-in). As a result, Data Web emphasis is moving away from "What is the Semantic Data Web?" to "How will the Semantic Data Web make our globally interconnected village an even better place?", relative to the contributions accrued from the Web thus far. Remember, the initial "Document Web" (Web 1.0) bootstrapped because of the benefits it delivered to blurb-style content publishing (remember the term electronic brochure-ware?). Likewise, in the case of the "Services Web" (Web 2.0), the bootstrap occurred because it delivered platform independence to Web Application Developers -- enabling them to expose application logic behind Web Services. It is my expectation that the Data Integration prowess of the Data Web will create a value exchange realm for data architects and other practitioners from the database and data access realms.

Related Items

  1. Mike Bergman's post about Semi-Structured Data
  2. My Posts covering Structured and Un-Structured Containers
Linked Data & The Web Information BUS (Wed, 08 Aug 2007)
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1231
Frederick Giasson provides compelling data that supports the view that the Semantic Web bootstrap is a global Data Integration & Data Generation effort that inevitably involves a variety of Data Sources such as: social networks, blogs, wikis etc.

The Data in Fred's post is based on FOAF Ontology instance data generated from a myriad of Data Sources.

Semantic Web Data Generation Activity: FOAF Crawling (Mon, 22 Jan 2007)
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1123
There is increasing coalescence around the idea that HTTP-based Linked Data adds a tangible dimension to the World Wide Web (Web). This Data Dimension grants end-users, power-users, integrators, and developers the ability to experience the Web not solely as an Information Space or Document Space, but now also as a Data Space.

Here is a simple What and Why guide covering the essence of Data Spaces.

What is a Data Space?

A Data Space is a point of presence on a network, where every Data Object (item or entity) is given a Name (e.g., a URI) by which it may be Referenced or Identified.

In a Data Space, every Representation of those Data Objects (i.e., every Object Representation) has an Address (e.g., a URL) from which it may be Retrieved (or "gotten").

In a Data Space, every Object Representation is a time variant (that is, it changes over time), streamable, and format-agnostic Resource.

An Object Representation is simply a Description of that Object. It takes the form of a graph, pictorially constructed from sets of 3 elements which are themselves named Subject, Predicate, and Object (or SPO); or Entity, Attribute, and Value (or EAV). Each Entity+Attribute+Value or Subject+Predicate+Object set (or triple), is one datum, one piece of data, one persisted observation about a given Subject or Entity.

The underlying Schema that defines and constrains the construction of Object Representations is based on Logic, specifically First-Order Logic. Each Object Representation is a collection of persisted observations (Data) about a given Subject, which aid observers in materializing their perception (Information), and ultimately comprehension (Knowledge), of that Subject.
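To make the datum idea concrete, here is a one-triple Python/rdflib sketch (the URIs are placeholders, not real identifiers):

    from rdflib import Graph, Literal, URIRef

    g = Graph()
    g.add((
        URIRef("http://example.org/dataspace/kidehen#this"),  # Subject / Entity
        URIRef("http://xmlns.com/foaf/0.1/name"),             # Predicate / Attribute
        Literal("Kingsley Uyi Idehen"),                        # Object / Value
    ))

    # A collection of such observations is an Object Representation (a graph)
    print(g.serialize(format="nt"))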

Why are Data Spaces important?

In the real-world -- which is networked by nature -- data is heterogeneously (or "differently") shaped, and disparately located.

Data has been increasing at an alarming rate since the advent of computing; the interWeb simply provides context that makes this reality more palpable and more exploitable, and in the process virtuously ups the ante through increasingly exponential growth rates.

We can't stop data heterogeneity; it is endemic to the nature of its producers -- humans and/or human-directed machines. What we can do, though, is create a powerful Conceptual-level "bus" or "interface" for data integration, based on Data Description oriented Logic rather than Data Representation oriented Formats. Basically, it's possible for us to use a Common Logic as the basis for expressing and blending SPO- or EAV-based Object Representations in a variety of Formats (or "dialects").

The roadmap boils down to:

  1. Assigning unambiguous Object Names to:

    • Every record (or, in table terms, every row);

    • Every record attribute (or, in table terms, every field or column);

    • Every record relationship (that is, every relationship between one record and another);

    • Every record container (e.g., every table or view in a relational database, every named graph, every spreadsheet, every text file, etc.);

  2. Making each Object Name resolve to an Address through which Create, Read, Update, and Delete ("CRUD") operations can be performed against (can access) the associated Object Representation graph.
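A minimal sketch of step 2, assuming (hypothetically) a server that resolves each Object Name to an Address supporting HTTP CRUD against its representation; the URL and the server behaviour are assumptions, not a description of any specific product:

    import requests

    object_name = "http://example.org/data/customer-42"  # hypothetical Object Name

    # Read: fetch the Object Representation graph in a negotiated format
    read = requests.get(object_name, headers={"Accept": "text/turtle"})
    print(read.status_code, read.headers.get("Content-Type"))

    # Update: replace the representation (Create and Delete map to POST and DELETE)
    update = requests.put(
        object_name,
        data=read.text + '\n<http://example.org/data/customer-42> <http://example.org/data/city> "Cambridge" .',
        headers={"Content-Type": "text/turtle"},
    )
    print(update.status_code)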

Data Spaces (Tue, 01 Mar 2011)
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1662
Open Data Access and Web 2.0 have a very strange relationship that continues to blur the lines of demarcation between where Web 2.0 ends and where Web.Next (i.e Web 3.0, Semantic/Data Web, Web of Databases etc.) starts. But before I proceed, let me attempt to define Web 2.0 one more time:

A phase in the evolution of web usage patterns that emphasizes Web Services based interaction between “Web Users” and “Points of Web Presence” over traditional interaction between “Web Users” and “Web Sites”. Basically, a transition from visual site interaction to presence based interaction.

BTW - Dare Obasanjo also commented about Web usage patterns in his post titled: The Two Webs, where he concluded that we had a dichotomy along the lines of HTTP-for-APIs (2.0) and HTTP-for-Browsers (1.0) -- which Jon Udell evolved into HTTP-Services-Web and HTTP-Interactive-Web during our recent podcast conversation.

With definitions in place, I will resume my quest to unveil the aforementioned Web 2.0 Data Access Conundrum:

  • Emphasis on XML's prowess in the realms of Data and Protocol Modeling alongside Data Representation. Especially as SOAP or REST styles of Web Services and various XML formats (RSS 0.92/1.0/1.1/2.0, Atom, OPML, OCS etc.) collectively define the Web 2.0 infrastructure landscape
  • Where a modicum of Data Access appreciation and comprehension does exist it is inherently compromised by business models that mandate some form of “Walled Gardens” and “Data Silos”
  • Mash-ups are a response to said “Walled Gardens” and “Data Silos”. Mash-ups by definition imply combining things that were not built for recombination.

As you can see from the above, Open Data access isn't genuinely compatible with Web 2.0.

We can also look at the same issue by way of the popular M-V-C (Model View Controller) pattern. Web 2.0 is all about the “V” and “C” with a modicum of “M” at best (data access, open data access, and flexible open data access are completely separate things). The “C” items represent application logic exposed by SOAP or REST style web services etc. I'll return to this later in this post.

What about Social Networking, you must be thinking? Isn't this a Web 2.0 manifestation? Not at all (IMHO). The Web was developed / invented by Tim Berners-Lee to leverage the “Network Effects” potential of the Internet for connecting People and Data. Social Networking, on the other hand, is simply one of several ways by which we construct network connections. I am sure we all accept the fact that connections are built for many other reasons beyond social interaction. That said, we also know that through social interactions we actually develop some of our most valuable relationships (we are social creatures after all).

The Web 2.0 Open Data Access impedance reality is ultimately going to be the greatest piece of tutorial and use-case material for the Semantic Web. I take this position because it is human nature to seek Freedom (in unadulterated form), which implies the following:

  • Access Data from a myriad of data sources (irrespective of structural differences at the database level)
  • Mesh (not Mash) data in new and interesting ways
  • Share the meshed data with as many relevant people as possible for social, professional, political, religious, and other reasons
  • Construct valuable networks based on data oriented connections

Web 2.0 by definition and use case scenarios is inherently incompatible with the above due to the lack of Flexible and Open Data Access.

If we take the definition of Web 2.0 (above) and rework it with an appreciation of Flexible and Open Data Access, you would arrive at something like this:

A phase in the evolution of the web that emphasizes interaction between “Web Users” and “Web Data”, facilitated by Web Services based APIs and an Open & Flexible Data Access Model.


In more succinct form:

A pervasive network of people connected by data or data connected by people.


Returning to M-V-C and looking at the definition above, you now have a complete “M” -- the piece that is enigmatic in Web 2.0 and is the essence of the Semantic Web (Data and Context).

To make all of this possible a palatable Data Model is required. The model of choice is the Graph based RDF Data Model - not to be mistaken for the RDF/XML serialization which is just that, a data serialization that conforms to the aforementioned RDF data model.
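To keep the model-versus-serialization distinction concrete, here is a small Python/rdflib sketch showing one and the same graph rendered as RDF/XML and as Turtle (placeholder URIs):

    from rdflib import Graph, Literal, URIRef
    from rdflib.namespace import FOAF

    g = Graph()
    me = URIRef("http://example.org/people/alice#this")
    g.add((me, FOAF.name, Literal("Alice")))

    print(g.serialize(format="xml"))     # RDF/XML: one possible serialization
    print(g.serialize(format="turtle"))  # Turtle: same data model, different syntax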

The Enterprise Challenge

Web 2.0 cannot and will not make valuable inroads into the enterprise, because enterprises live and die by their ability to exploit data. Weblogs, Wikis, Shared Bookmarking Systems, and other Web 2.0 distributed collaborative application profiles are only valuable if the data is available to the enterprise for meshing (not mashing).

A good example of how enterprises will exploit data by leveraging networks of people and data (social networks in this case) is shown in this nice presentation by Accenture's Institute for High Performance Business titled: Visualizing Organizational Change.

Web 2.0 commentators (for the most part) continue to ponder the use of Web 2.0 within the enterprise while forgetting the congruency between enterprise agility and exploitation of people & data networks (The very issue emphasized in this original Web vision document by Tim Berners-Lee). Even worse, they remain challenged or spooked by the Semantic Web vision because they do not understand that Web 2.0 is fundamentally a Semantic Web precursor due to Open Data Access challenges. Web 2.0 is one of the greatest demonstrations of why we need the Semantic Web at the current time.

Finally, juxtapose the items below and you may even get a clearer view of what I am attempting to convey about the virtues of Open Data Access and the inflective role it plays as we move beyond Web 2.0:

Information Management Proposal - Tim Berners-Lee
Visualizing Organizational Change - Accenture Institute of High Performance Business

Web 2.0's Open Data Access Conundrum (Update) (Thu, 16 Nov 2006)
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1034
We have finally released the 1.0 edition of OAT.

OAT offers a broad Javascript-based, browser-independent widget set for building data source independent rich internet applications that are usable across a broad range of Ajax-capable web browsers.

OAT supports binding to the following data sources via its Ajax Database Connectivity Layer:

  • SQL Data via XML for Analysis (XMLA)
  • Web Data via SPARQL, GData, and OpenSearch Query Services
  • Web Services specific Data via service specific binding to SOAP and REST style web services

The toolkit includes a collection of powerful rich internet application prototypes, including: SQL Query By Example, Visual Database Modeling, and a Data bound Web Form Designer.

Project homepage on sourceforge.net:

http://sourceforge.net/projects/oat

Source Code:

http://sourceforge.net/projects/oat/files

Live demonstration:

http://www.openlinksw.com/oat/

OpenLink Ajax Toolkit (OAT) 1.0 Released (Wed, 09 Aug 2006)
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1023
Important bookmark reference to note as the Web 2.0->[Data Web|Semantic Web] fusion's inflection takes shape: Syndication Format Family Tree.

This particular inflection and, ultimately, transition is going to occur at Warp Speed!

Syndication Format Family Tree (Wed, 28 Jun 2006)
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/992
The net effect of Web Services and Web Data (soon to be Semantic Content) is the ability to obtain and analyze this kind of data.

Answers.com was launched a month ago, and its stock is practically on fire! Does this graph tell you anything about subject searches vs keyword searches?

The burgeoning Semantic Web will disrupt the search market in a big way (and for the better IMHO).

 

Traffic Analysis: Google vs Answers.com vs Ask.com (Thu, 22 Jun 2006)
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/677
I had been anticipating the release of Web Matrix 2.0, but was pretty disappointed with the blatant attempts to lock users into SQL Server and ACCESS (of course I know that manual imports are possible re. my .net provider for non Microsoft databases, but that's beside the point). From the feature list:

Easy Data UI Generation.  Web Matrix makes it easy to create data bound pages without writing code. Drop SQL/MSDE or Access tables on your page to create data-bound grids, or start with Data Page templates for reports or Master/Detail pages. Code builders help you generate code to select, insert, update and delete SQL/MSDE or Access data. 
 [via WebLogs @ ASP.NET]

It only makes it easy for two databases, which are both Microsoft owned? What really baffles me is why they don't use ADO.NET, which, by the way, is their own data abstraction technology. The same approach has also been applied to InfoPath, and this is certainly a disturbing trend for unsuspecting end-users, developers, systems architects, and decision makers. Before you know it, you lose your database choices.
 
Could this be an oversight on the part of Microsoft? I don't think so, somehow; we are taking a very interesting journey here from database independence to database specificity (ODBC -> OLE DB -> ADO.NET -> [SQL Server|Access]), all in a quest to covertly reduce choices (I think I've seen this movie before! And I might have to rewrite the script).
 
What's new in Web Matrix? (Thu, 22 Jun 2006)
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/282
In response to ReadWriteWeb's post titled: Who will own your Data in Web 3.0 World?. My simple answer: You!

You will control your data in the Web 3.0 realm. If somehow this remains somewhat incomprehensible and nebulous (as is typical in this emerging realm) then simply think about this as: The Magic of You!

Remember, "You" was the Times person of the year as an acknowledgement of the Web 2.0 phenomenon, and maybe this time next year it would simply be the "Magic of Being You" that's the person of the year :-)

Web 3.0 brings databasing to the Web (as a feature). The single most important action item at this stage is the act of creating a record for yourself, in this new distributed database held together by an HTTP based Network (e.g., the World Wide Web).
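As a rough sketch of what "creating a record for yourself" can look like, here is a minimal FOAF self-description in Python/rdflib; the URI and names are placeholders for whatever your own data space mints:

    from rdflib import Graph, Literal, URIRef
    from rdflib.namespace import FOAF, RDF

    you = URIRef("http://example.org/you#this")  # your Web Database ID (placeholder)

    g = Graph()
    g.add((you, RDF.type, FOAF.Person))
    g.add((you, FOAF.name, Literal("Your Name")))
    g.add((you, FOAF.homepage, URIRef("http://example.org/you")))

    print(g.serialize(format="turtle"))  # publish this at a URL you control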

Related:

  1. Get yourself a Web Database ID in 5 minutes or less
  2. 2006 Callout from TimBL: Get Yourself a URI
  3. Just watch the Numerati Video
The Numerati & The Magic of You! (Mon, 01 Feb 2010)
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1458
As the Linked Data meme continues on its quest to unravel the mysteries of the Semantic Web vision, it's quite gratifying to see that data virtualization comprehension -- creating "Conceptual Views" into logically organized "Disparate & Heterogeneous Data Sources" via "Context Lenses" -- is taking shape, as illustrated in the "note-to-self" post by David Provost.

Virtualization of heterogeneous data sources is only achievable if you have a dexterous data model based "Bus" into which the data sources are plugged. RDF has offered such a model for a long time.

When heterogeneous data sources are plugged into an RDF based integration bus -- e.g., customer records sourced from a variety of tables, across a plethora of databases -- you can only end up with true value if the emergent entities from such an effort are coherently linked and (de)referenceable; which is what Linked Data's fundamental preoccupation with dereferenceable URIs is all about. Of course, even when you have all of the above in place, you also need to be able to construct "Context Lenses", i.e., context driven views of the Linked Data Mesh (or Linked Data Spaces).

Additional Diagrams:

1. Clients of the RDF Bus
2. RDF Bus Server plugins: Scripts that emit RDF
3. RDF Bus Servers: RDF Data Managers (Triple or Quad Stores)
4. RDF Bus Servers: Relational to RDF Mappers (RDF Views, Semantic Covers etc.)
5. RDF Bus Server plugins: XML to RDF Mappers
6. RDF Bus Server plugins: GRDDL based XSLT stylesheets that emit RDF
7. RDF Bus Server plugins: Intelligent RDF Middleware

Time for Context Lenses (Update) (Mon, 04 Aug 2008)
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1405
These days I increasingly qualify myself and my Semantic Web advocacy as falling under the realm of Linked Data. Thus, I tend to use the following introduction: I am Kingsley Idehen, of the Tribe Linked Data.

The aforementioned qualification is increasingly necessary for the following reasons:

  1. The Semantic Web vision is broad and comprised of many layers
  2. A new era of confusion is taking shape just as we thought we had quelled the prior AI dominated realm of confusion
  3. None of the Semantic Web vision layers are comprehensible in practical ways without a basic foundation
  4. Open Data Access is the foundation of the Semantic Web (in a prior post I used the term: Semantic Web Layer 1)
  5. URIs are the units of Open Data Access in Semantic Web parlance, i.e., each datum on the Web must have an ID (minted by the host Data Space).

The terms GGG, Linked Data, Data Web, Web of Data, and Web 3.0 (when I use this term) all imply URI driven Open Data Access for the Web Database (maybe call this ODBC for the Web) -- the ability to point to records across data spaces without any adverse effect on the remote data spaces. It's really important to note that none of the aforementioned terms has anything to do with the "Linguistic Meaning of blurb". Building a smarter document exposed via a URL, without exposing descriptive data links, doesn't provide open access to underlying data sources.

As human beings we are all endowed with reasoning capability. But we can't reason without access to data. A dearth of openly accessible structured data is the source of many ills in cyberspace and across society in general. Today we still have Subjectivity reigning over Objectivity due to the prohibitive costs of open data access.

We can't cost-effectively pursue objectivity without cost-effective infrastructure for creating alternative views of the data behind information sources (e.g., Web Pages). More Objectivity and less Subjectivity is what the next Web Frontier is about. At OpenLink we simply use the moniker: Analysis for All! Everyone becomes a data analyst in some form and, even better, the analyses are easily accessible to anyone connected to the Web. Of course, you will be able to share a particular analysis with your private network of friends and family, or, if you so choose, not at all :-)

To recap, it's important to note that Linked Data is the foundation layer of the Semantic Web vision. It not only facilitates open data access, it also enables data integration (Meshing as opposed to Mashing) across disparate data schemas.

As demonstrated by DBpedia and the Linked Data Solar system emerging around it, if you URI everything, then everything is Cool.

Linked Data and Information Silos are mutually exclusive concepts. Thus, you cannot produce a web accessible Information Silo and then refer to it as "Semantic Web" technology. Of course, it might be very Semantic, but it's fundamentally devoid of critical "Semantic Web" essence (DNA).

My acid test for any Semantic Web solution is simply this (using a Web User Agent or Client):

  1. go to the profile page of the service
  2. ask for an RDF representation of my profile (by this I mean "get me the raw data in structured form")
  3. attempt to traverse the structured data graph (RDF) that the service provides via live dereferenceable URIs.
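Here is a rough Python sketch of running that acid test, assuming a hypothetical profile URL that honours RDF content negotiation (it is an illustration of the test, not of any specific service's API):

    import requests
    from rdflib import Graph, URIRef

    profile_url = "http://example.org/dataspace/person/you"  # hypothetical profile page

    # Step 2: ask the same URL for a structured (RDF) representation
    resp = requests.get(profile_url, headers={"Accept": "text/turtle"})
    g = Graph().parse(data=resp.text, format="turtle")

    # Step 3: traverse the graph by dereferencing a few of the URIs it links to
    for uri in [o for o in g.objects() if isinstance(o, URIRef)][:5]:
        follow = Graph()
        try:
            follow.parse(str(uri))  # live dereference of the linked URI
            print(uri, "->", len(follow), "triples")
        except Exception as err:
            print(uri, "not dereferenceable:", err)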

Here is the Acid test against my Data Space:

  1. My Profile Page (HTML representation dispatched via an instance of OpenLink Data Spaces)
  2. Click on the "Linked Data Tab" (an HTML representation endowed with Data Links that link to information resources containing other structured descriptions of things).
Semantic Web Advocate of Tribe Linked Data! (Updated) (Thu, 20 Mar 2008)
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1324
I've been a little busier than usual, of late. So busy, that even minimal blog based discourse participation has been a challenge. Anyway, during this quiet period, a number of interesting data streams have come my way that relate to OpenLink Data Spaces (ODS). Thus, in typical fashion, I'll use this post (via URIs) to contribute a few nodes to the Giant Global Graph that is the Web of Structured Linked Data, also known as the Data Web, Semantic Data Web, or Web of Data (also see prior Data Web posts).

Here goes:

  1. Alan Wilensky recalls his early encounters with OpenLink Data Spaces (circa 2004)
  2. Daniel Lewis shares his "state of the Semantic Data Web" findings
  3. Daniel Lewis experiences OpenLink Data Space first hand en route to creating Data Spaces in the Clouds (the Fourth Platform).

In addition, in one week, courtesy of the Web and the UK Semantic Web Gatherings in Bristol and Oxford, I discover, interview, and employ Daniel :-) Imagine how long this would have taken to pull off via the Document Web, assuming I would even discover Daniel.

As with all things these days, the Web and Internet change everything, which includes talent discovery and recruitment.

A Global Social Graph that is a mesh of Linked Data enables the process of recruitment, marketing, and other elements of business management to be condensed down to sending powerful beams across the aforementioned Graph :-) The only variable pieces are the traversal paths exposed to your beam via the beam's entry point URI. In my case, I have a single URI that exposes a Graph of critical paths for the Blogosphere (i.e., data spaces of RSS/Atom feeds). Thus, I can discover whether your profile matches the requirements associated with an opening at OpenLink Software (most of the time) before you do :-)

BTW - I just noticed that John Breslin described ODS as social-graph++ in his recent post, titled: Tales from the SIOC-o-sphere, part 6. In a funny way, this reminds me of a post from the early blogosphere days (circa 2003) about platforms and Weblog APIs, covering ODS (then exposed via the Blog Platform realm of Virtuoso).

Discussion: OpenLink Data Spaces (Sat, 01 Dec 2007)
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1280
Introducing the XBRL Ontology Project.

The XBRL Ontology Project seeks to address the obvious need to bring structured financial data into the emerging Semantic Data Web as articulated in this excerpt from the inaugural mailing list post:

The parallel evolution of XBRL and the Semantic Web is one of the more puzzling current day technology misnomers:

The Semantic Web expresses a vision about a Web of Data connected by formal meaning (Context). Congruently, XBRL espouses a vision whereby formally defined Financial Data is accessible via the Web (and other networks). Sadly, we have an abundance of XBRL Taxonomies and pretty wide adoption of the XBRL standard globally, but not a single RDFS Schema or OWL Ontology, derived from said taxonomies, in sight!
Read on..."

(Via XBRL Ontology Specification Group Google Group.)

XBRL Ontology Project (Tue, 05 Feb 2008)
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1189
While I continue to wrestle with screencast production, here are some screenshots that guide you through the process of providing Data Web URIs to the SPARQL Query Builder (a first cut of an MS Query or MS Access type tool for the Data Web).

  1. Step 1 - Enter a Data Source URI
  2. Step 2 - Click on the Run Control (">" video control icon)
  3. Step 3 - Interact with Custom Grid hosted results (comprised of Resource Identifiers (S), Properties (P), and Property Values (O)).

Once you grasp the concept of entering values into the "Default Data Source URI field", take a look at: http://programmableweb.com and other URIs (hint: scroll through the results grid to the QEDWiki demo item)

Hello Data Web (Take 2 - with Screenshots) (Sun, 18 Feb 2007)
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1137
Nova Spivack provides poignant insights into the recent Web 2.0 vs Web 3.0 brouhaha which I've excerpted below:

Web Me2.0 -- Exploding the Myth of Web 2.0:

"Many people have told me this week that they think 'Web 2.0' has not been very impressive so far and that they really hope for a next-generation of the Web with some more significant innovation under the hood -- regardless of what it's called. A lot of people found the Web 2.0 conference in San Francisco to be underwhelming -- there was a lot of self-congratulation by the top few brands and the companies they have recently bought, but not much else happening. Where was all the innovation? Where was the focus on what's next? It seemed to be a conference mainly about what happened in the last year, not about what will happen in the coming year. But what happened last year is already so 'last year.' And frankly Web 2.0 still leaves a lot to be desired. The reason Tim Berners-Lee proposed the Semantic Web in the first place is that it will finally deliver on the real potential and vision of the Web. Not that today's Web 2.0 sucks completely -- it only sort of sucks. It's definitely useful and there are some nice bells and whistles we didn't have before. But it could still suck so much less!"

Web 2.0 is (not was) a piece of the overall Web puzzle. The Data Web (so called Web 3.0) is another critical piece of this puzzle, especially as it provides the foundation layer (Layer 1) of the Semantic Web.

Web 2.0 was never about "Open Data Access", "Flexible Data Models", or "Open World" meshing of disparate data sources built atop disparate data schemas (see: Web 2.0's Open Data Access Conundrum). It was simply about "Execution and APIs". I have already written about "Web Interaction Dimensions", but you can also look at the relationship of the currently perceived dimensions through the M-V-C programming pattern:

  1. Viewer (V) - Web 1.0 (Interaction, Dimension 1 - Interactive-Web)
  2. Controller (C) - Web 2.0 (Services, Dimension 2 - Services-Web which is about Execution & Application Logic; SOA outside/in-front-of the Firewall for Enterprise 2.0 crowd)
  3. Model (M) - Web 3.0 (Data, Dimension 3 - Data-Web which is about data model dexterity and open data access)

Another point to note: Social Networking is hot, but nearly every social network that I know (and I know and use most of them) suffers from an impedance mismatch between the service(s) they provide (social networks) and their underlying data models (in many cases Relational as opposed to Graph). Networks are about Relationships (N-ary), and you cannot effectively exploit the deep potential of "Network Effects" (Wisdom of Crowds, Viral Marketing, etc.) without a complementary data model -- you simply can't.

Finally, the Data Web is already here, I promised a long time ago (Internet Time) that the manifestation of the Semantic Web would occur unobtrusively, meaning, we will wake up one day and realize we are using critical portions of the Semantic Web (i.e. Data-Web) without even knowing it. Guess what? It's already happening. Simple case in point, you may have started to notice the emergence of SIOC gems in the same way you may have observed those RSS 2.0 gems at the dawn of Web 2.0. What I am implying here is that the real question we should be asking is: Where is the Semantic Web Data? And how easy or difficult will it be to generate? And where are the tools? My answers are presented below:

  1. Pingthesemanticweb.com - Semantic Web Data Source Lookup & Tracking Service
  2. Swoogle - Semantic Web Ontology Location Service
  3. Semantic Web Solutions for Generating RDF Data from SQL Data
  4. Semantic Web Solutions Directory
  5. SIOC Project - Semantically-Interlinked Online Communities Ontology, a grassroots effort that provides a critical bridge between Web 2.0 and the Data-Web. For instance, existing Web 2.0 application profiles such as; Blogs, Wikis, Feed Aggregators, Content Managers, Discussion Forums etc.. are much closer to the Data-Web than you may think :-)
  6. Virtuoso - our Universal Server for the Data-Web
  7. OpenLink Data Spaces (ODS) - our SIOC based platform for transparent incorporation of the Data-Web into Web 1.0 and Web 2.0

Next stop: less writing, more demos; these are long overdue! At least from my side of the fence :-) I need to produce some step-by-step, guide oriented screencasts that demonstrate how Web 2.0 meshes nicely with the Data-Web.

Here are some (not so end-user friendly) examples of how you can use SPARQL (Data-Web's Query Language) to query Web 2.0 Instance Data projected through the SIOC Ontology:

  1. Weblog Data Query
  2. Wiki Data Query
  3. Aggregated Feeds Data Query - (RSS 1.0, RSS 2.0, Atom etc)
  4. Shared Bookmarks Data Space
  5. Web Filesystem Data Query - (Briefcase - Virtual Spotlight of sorts)
  6. Photo Gallery Data Query (this could be data from Flickr etc..)
  7. Discussion Data Query (e.g. Blog posts comments)
  8. Data Queries across different Data Spaces - combining data from Wikis, Blogs, Feeds, Photos, Bookmarks, Discussions etc..

Note: You can use the online SPARQL Query Interface at: http://demo.openlinksw.com/isparql.
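For readers who prefer code to the query interface, here is a minimal Python sketch of the same idea using SPARQLWrapper; the endpoint URL is a placeholder, and the query assumes data exposed through the SIOC ontology:

    from SPARQLWrapper import SPARQLWrapper, JSON

    endpoint = SPARQLWrapper("http://example.org/sparql")  # hypothetical SPARQL endpoint
    endpoint.setQuery("""
        PREFIX sioc: <http://rdfs.org/sioc/ns#>
        PREFIX dc:   <http://purl.org/dc/elements/1.1/>
        SELECT ?post ?title WHERE {
          ?post a sioc:Post ;
                dc:title ?title .
        } LIMIT 10
    """)
    endpoint.setReturnFormat(JSON)

    for row in endpoint.query().convert()["results"]["bindings"]:
        print(row["post"]["value"], "-", row["title"]["value"])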

Other Data-Web Technology usage demos include:

  1. TimBL's Tabulator - A Data-Web Browser
  2. Semantic Web Client Library - RDF Data Drill Down Demos using SPARQL
  3. Semantic Radar - A Firefox plug-in for auto-discovering SIOC Instance Data
  4. Talk Digger - SIOC based Web Conversation Tracker
Web Me2.0 -- Exploding the Myth of Web 2.0 (Thu, 16 Nov 2006)
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1081
A new technical white paper about our declarative language for SQL Schema to RDF Ontology Mapping has just been published.

What is this?

A declarative language adapted from SPARQL's graph pattern language (N3/Turtle) for mapping SQL Data to RDF Ontologies. We currently refer to this as a Graph Pattern based RDF VIEW Definition Language.

Why is it important?

It provides an effective mechanism for exposing existing SQL Data as virtual RDF Data Sets (Graphs), thereby avoiding the data duplication associated with generating physical RDF Graphs from SQL Data en route to persistence in a dedicated Triple Store.

Enterprise applications (traditional and web based) and most Web Applications (Web 1.0 and Web 2.0) sit atop relational databases, implying that SQL/RDF model and data integration is an essential element of the burgeoning "Data Web" (Semantic Web - Layer 1) comprehension and adoption process.

In a nutshell, this is a quick route for non-disruptive exposure of existing SQL Data to SPARQL-supporting RDF Tools and Development Environments.

How does it work?

RDF Side

  1. Locate one or more Ontologies (e.g., FOAF, SIOC, AtomOWL, SKOS, etc.) that effectively define the Concepts (Classes) and Terms (Predicates) to be exposed via your RDF Graph
  2. Using Virtuoso's RDF View Definition Language, declare an Internationalized Resource Identifier (IRI) for your Graph. Example:
    CREATE GRAPH IRI("http://myopenlink.net/dataspace")
  3. Then create Classes (Concepts), Class Properties/Predicates (Memb), and Class Instances (Inst) for the new Graph. Example:
    CREATE IRI CLASS odsWeblog:feed_iri  "http://myopenlink.net/dataspace/kidehen/weblog/MyFeeds" (
      in memb varchar not null, in inst varchar not null)

SQL Side

  1. If Virtuoso isn't your SQL Data Store, identify the ODBC or JDBC SQL data source(s) containing the SQL data to be mapped to RDF, and then link the relevant tables into Virtuoso's Virtual DBMS Layer
  2. Then use the RDF View Definition Language's graph pattern feature to generate SQL to RDF Mapping Template for your Graph. As shown in this ODS Weblog -> AtomOWL Mapping example.
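Once the view is declared, the mapped SQL data can be queried like any other graph. A rough sketch (assuming a SPARQL protocol endpoint sits in front of the server; the endpoint URL below is a placeholder):

    from SPARQLWrapper import SPARQLWrapper, JSON

    endpoint = SPARQLWrapper("http://example.org/sparql")  # hypothetical endpoint
    endpoint.setQuery("""
        SELECT ?s ?p ?o
        FROM <http://myopenlink.net/dataspace>
        WHERE { ?s ?p ?o }
        LIMIT 10
    """)
    endpoint.setReturnFormat(JSON)

    for b in endpoint.query().convert()["results"]["bindings"]:
        print(b["s"]["value"], b["p"]["value"], b["o"]["value"])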
Virtuoso's SQL Schema to RDF Ontology Mapping Language (1.0) (Fri, 17 Nov 2006)
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1064
I have yanked out a key segment from the TECH TALK: The Future of Search: Perspectives post that I find really poignant regarding the changing shape and form of the Web:

It is clear that in comparison to the Web of the last century, the nature of data on the Web later in this decade will be very different in the following aspects:

  • Volume of data is growing by orders of magnitudes every year
    Multimedia and sensor data are becoming more and more common.

  • Spatio-temporal attributes of data are important.

  • Different data sources provide information to form the holistic picture.

  • Users are not concerned with the location of data source, as long as its quality and credibility is assured. They want to know the result of the data assimilation (the big picture of the event).

  • Real-time data processing is the only way to extract meaningful information
    Exploration, not querying, is the predominant mode of interaction, which makes context and state critical.

  • The user is interested in experience and information, independent of the medium and the source.

Effectively, the nature of the knowledge on the Web is changing very fast. It used to be mostly static text documents; now it will be a combination of live and static multimedia, including text, data and documents with spatio-temporal attributes. Considering these changes, can the search engines developed for static text documents be able to deal with the needs of the Web? [via E M E R G I C . o r g]

No, but this doesn't render them useless since we wouldn't be at this point without the likes of Google, Yahoo! et al. But building upon the data substrate that web data oriented search engines provide is where the next batch of Information access and Knowledge discovery solutions will carve out their space. The symbiotic relationship between Google (data) and Gurunet's Answers.com (Information and Knowledge) is one interesting example.

The Web is a distributed collection of databases that implement a variety of data storage models but are commonly accessible via protocols that rely on HTTP for transport (in-bound and out-bound messages) services. These databases increasingly use well-formed XML for query result (data contextualization) persistence and URIs for permanent reference. "What Database?" you might ask. "What you once called your Web Site, Blog, Wiki, etc.," my time-less reply.

When you have the database that I describe above, and a collection of entry points from which discrete or composite Web Services can be invoked available from one or more internet domains, you end up with what I prefer to call "Web 2.0" presence, or what Richard McManus describes as: "The Web as a Platform".

Here is a collection of posts I have made in the past relating to Web 2.0, note that this list is dynamic since this blog is Virtuoso based (predictably):

Free Text Search with XHTML results page (with Virtuoso generated URIs for RSS, Atom, and RDF): http://www.openlinksw.com/blog/search.vspx?blogid=127&q=web+2.0&type=text&output=html 

It's also no secret that I believe that Virtuoso is a bleeding edge Web 2.0 technology platform (and more). The URIs that I am exposing provide the foundation layer for other complementary Web initiatives such as the Semantic Web (Web 2.0 provides infrastructure for the Semantic Web, as time will show). They are also completely usable outside the realm of this blog.

BTW - Jon Udell is writing, experimenting with, and demonstrating similar concepts across feeds within his Web 2.0 domain.

These are indeed fun times!

The Future of Search: Perspectives (Thu, 22 Jun 2006)
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/710
Microsoft Reinvents FrontPage, Tapping Into the Power of XML To Build Live Data-Driven Web Sites: Microsoft Corp. today announced that Microsoft Office FrontPage 2003, part of the Microsoft Office System, has been reinvented to support a wide range of capabilities for building dynamic, Extensible Markup Language (XML)-based, data-driven Web sites, while retaining the ease of use that has helped make it one of the most popular Web site design tools on the market today. FrontPage 2003 will be the first commercially available, fully WYSIWYG Extensible Stylesheet Language Transformation (XSLT) editor in which users can work with live data to create interactive and dynamic Web sites, streamlining the process of sharing information on the Web. [via Loosely Coupled news releases live feed]

This also includes Weblog Editing and Posting I believe.

Microsoft Reinvents FrontPage, Tapping Into the Power of XML To Build Live Data-Driven Web Sites (Thu, 22 Jun 2006)
http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/271
Based on the prevalence of confusion re. the Linked Data meme, here are a few important points to remember about the World Wide Web.

  1. It's an HTTP based Network Cluster within the Internet (remember: Networks are about meshes of Nodes connected by Links)
  2. Its underlying data model is that of a Network (we've had Network Data models for eons. EAV/CR is an example)
  3. Links are facilitated via URIs
  4. Until recently, the granularity of Networking on the Web was scoped to Data Containers (documents), due to the prevalence of URL style links
  5. The Linked Data meme adds Data Item (Datum) level granularity to World Wide Web networking via HTTP URIs
  6. Data Items become Web Reference-able when you Identify/Name them using HTTP based URIs
  7. An HTTP URI implicitly binds a Web Reference-able Data Item (Entity, Datum, Data Object, Resource) to its Web Accessible Metadata
  8. Web Accessible Metadata resides within Data Containers (documents or information resources)
  9. The representation of a Web Accessible Metadata container is negotiable
  10. I am able to write and dispatch this blog post courtesy of the Web features listed above
  11. You are able to explore the many dimensions of data exposed by this blog, should you decide to explore the Linked Data mesh exposed by this post's HTTP URI (via its permalink)

The HTTP URI is the secret sauce of the Web, powerfully and unobtrusively reintroduced via the Linked Data meme (a classic back-to-the-future act). This sauce possesses a unique power courtesy of its inherent duality i.e., how it uniquely combines Data Item Identity (think keys in traditional DBMS parlance) with Data Access (e.g. access to negotiable representations of associated metadata).
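
To make that duality concrete, here is a minimal sketch (assuming Python with the rdflib library installed, and that the DBpedia URI below remains publicly dereferenceable) of one and the same HTTP URI acting as both the name of a Data Item and the means of fetching its Web Accessible Metadata:

    # One HTTP URI plays two roles: identifier (key) and access mechanism.
    from rdflib import Graph, URIRef

    entity = URIRef("http://dbpedia.org/resource/Linked_Data")  # the "key" for the Data Item

    g = Graph()
    g.parse(entity)  # dereferencing the same string fetches a negotiated RDF description

    # every triple whose subject is the URI is part of the item's Web Accessible Metadata
    for predicate, obj in g.predicate_objects(subject=entity):
        print(predicate, obj)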

As you can see, I've made no mention of RDF or SPARQL, and I can still articulate the inherent value of the "Linked Data" dimension that the "Linked Data" meme adds to the World Wide Web.

As per usual this post is a live demonstration of Linked Data (dog-food style) :-)

Related

]]>
Important Things to Note about the World Wide Webhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1564Thu, 23 Jul 2009 14:33:58 GMT12009-07-23T10:33:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
Another post done in response to lost comments. This time, the comments relate to Robin Bloor's article titled: What is Web 3.0 and Why Should I Care?

Robin:

Web 3.0 is fundamentally about the World Wide Web becoming a structured database equipped with a formal data model (RDF, which is a moniker for an Entity-Attribute-Value with Classes & Relationships based Graph Model), a query language, and a protocol for handling diverse data representation requirements via negotiation.


Web 3.0 is about a Web that facilitates serendipitous discovery of relevant things; thereby making serendipitous discovery quotient (SDQ), rather than search engine optimization (SEO), the critical success factor that drives how resources get published on the Web.

Personally, I believe we are on the cusp of a major industry inflection re. how we interact with data hosted in computing spaces. In a nutshell, the conceptual model interaction based on real-world entities such as people, places, and other things (including abstract subject matter) will usurp traditional logical model interaction based on rows and columns of typed and/or untyped literal values exemplified by relational data access and management systems.

Labels such as "Web 3.0", "Linked Data", and "Semantic Web" are simply about the aforementioned model transition playing out on the World Wide Web and across private Linked Data Webs such as Intranets & Extranets, as exemplified by the emergence of the "Master Data Management" label/buzzword.

What's the critical infrastructure supporting Web 3.0?

As was the case with Web Services re. Web 2.0, there is a critical piece of infrastructure driving the evolution in question, and in this case it comes down to the evolution of Hyperlinking.

We now have a new and complementary variant of Hyperlinking, commonly referred to as "Hyperdata", that sits alongside "Hypertext". Hyperdata, when used in conjunction with HTTP based URIs as Data Source Names (or Identifiers), delivers a potent and granular data access mechanism scoped down to the datum (object or record) level; which is much different from the document (record or entity container) level linkage that Hypertext accords.

In addition, the incorporation of HTTP into this new and enhanced granular Data Source Naming mechanism also addresses past challenges relating to separation of data, data representation, and data transmission protocols -- remember XDR woes familiar to all sockets level programmers -- courtesy of in-built content negotiation. Hence, via a simple HTTP GET -- against a Data Source Name exposed by a Hyperdata link -- I can negotiate (from client or server sides) the exact representation of the description (entity-attribute-value graph) of an Entity / Data Object / Resource, dispatched by a data server.

For example, this is how a description of entity "Me" ends up being available in (X)HTML or RDF document representations (as you will observe when you click on that link to my Personal URI).
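
Here is a minimal sketch of that negotiation (assuming Python with the requests library; the DBpedia URI stands in for any Linked Data Source Name): the same identifier yields an (X)HTML or an RDF representation depending on the Accept header sent with the HTTP GET.

    import requests

    uri = "http://dbpedia.org/resource/Linked_Data"

    html = requests.get(uri, headers={"Accept": "text/html"})            # for human eyes
    rdf = requests.get(uri, headers={"Accept": "application/rdf+xml"})   # for software agents

    print(html.headers.get("Content-Type"))
    print(rdf.headers.get("Content-Type"))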

The foundation of what I describe above comes from:

  1. Entity-Attribute-Value & Class Relationship Data Model (originating in the LISP era, with detours via the Object Database era, into the Triples approach in RDF)
  2. Use of HTTP based Identifiers in the Entity ID construction process
  3. SPARQL query language for the Data Model.

Some live examples from DBpedia:

  • http://dbpedia.org/resource/Linked_Data
  • http://dbpedia.org/resource/Hyperdata
  • http://dbpedia.org/resource/Entity-attribute-value_model
  • http://dbpedia.org/resource/Benjamin_Franklin
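
As a minimal sketch of querying the data model behind those live examples (assuming Python with the requests library, and that the public DBpedia SPARQL endpoint remains reachable), here is a SPARQL lookup of the properties and values describing the last entity in the list:

    import requests

    query = """
    SELECT ?property ?value
    WHERE { <http://dbpedia.org/resource/Benjamin_Franklin> ?property ?value }
    LIMIT 10
    """

    response = requests.get(
        "http://dbpedia.org/sparql",
        params={"query": query, "format": "application/sparql-results+json"},
    )
    for row in response.json()["results"]["bindings"]:
        print(row["property"]["value"], "->", row["value"]["value"])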

Related

]]>
Response to: What is Web 3.0 and Why Should I Care?http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1524Thu, 29 Jan 2009 18:45:11 GMT22009-01-29T13:45:11-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
The sweet spot of Web 3.0 (or any other Web.vNext moniker) is all about providing Web Users with a structured and interlinked data substrate that facilitates serendipitous discovery of relevant "Things" i.e., a Linked Data Web -- a Web of Linkable Entities that goes beyond documents and other information resource (data containers) types.

Understanding potential Linked Data Web business models, relative to other Web based market segments, is best pursued via a BCG Matrix diagram, such as the one I've constructed below:



Notes:

Link Density

  • Web 1.0's collection of "Web Sites" have relatively low link density relative to Web 2.0's user-activity driven generation of semi-structured linked data spaces (e.g., Blogs, Wikis, Shared Bookmarks, RSS/Atom Feeds, Photo Galleries, Discussion Forums etc..)
  • Semantic Technologies (i.e. "Semantics Inside" style solutions), which are primarily about "Semantic Meaning" culled from Web 1.0 Pages, also have limited link density relative to Web 2.0
  • The Linked Data Web, courtesy of the open-ended linking capacity of URIs, matches and ultimately exceeds Web 2.0 link density.

Relevance

  • Web 1.0 and 2.0 are low relevance realms driven by hyperlinks to information resources ((X)HTML, RSS, Atom, OPML, XML, Images, Audio files etc.) associated with Literal Labels and Tagging schemes devoid of explicit property based resource description thereby making the pursuit of relevance mercurial at best
  • Semantic Technologies offer more relevance than Web 1.0 and 2.0 based on the increased context that semantic analysis of Web pages accords
  • The Linked Data Web, courtesy of URIs that expose self-describing data entities, matches the relevance levels attained by Semantic Technologies.

Serendipity Quotient (SDQ)

  • Web 1.0 has next to no serendipity, the closest thing is Google's "I'm Feeling Lucky" button
  • Web 2.0 possesses higher potential for serendipitous discovery than Web 1.0, but such potential is neutralized by inherent subjectivity due to its human-interaction-focused literal foundation (e.g., tags, voting schemes, wiki editors etc.)
  • Semantic Technologies produce islands-of-relevance with little scope for serendipitous discovery due to URI invisibility, since the prime focus is delivering more context to Web search relative to traditional Web 1.0 search engines.
  • The Linked Data Web's use of URIs as the naming and resolution mechanism for exposing structured and interlinked resources provides the highest potential for serendipitous discovery of relevant "Things"

To conclude, the Linked Data Web's market opportunities are all about the evolution of the Web into a powerful substrate that offers a unique intersection of "Link Density" and "Relevance", exploitable by solutions providers across horizontal and vertical market segments. Put differently, SDQ is how you take "The Ad" out of "Advertising" when matching Web users to relevant things :-)

]]>
The Linked Data Market via a BCG Matrix (Updated)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1442Fri, 26 Sep 2008 16:36:56 GMT32008-09-26T12:36:56-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
Unfortunately, our fixation with "Labels" and the artificial link that exists between "Labels" and so-called "first mover advantage" continue to impede our progress toward clarity about matters such as a fully functional Web of interlinked data.

A while back I watched Kevin Kelly's 5,000 days presentation at TED. During the presentation, I kept scratching my head, wondering why phrases like "Linked Data", "Semantic Web", "Web of Data", and "Data Web" were so unnaturally disconnected from his session narrative.

Yesterday I watched IMINDI's TechCrunch 50 presentation, and once again I saw the aforementioned pattern repeat itself. This time around, the poor founders of this "Linked Data Web" oriented company (which is what they are in reality) took a totally undeserved pasting from a bunch of panelists incapable of seeing beyond today (Web 2.0) and yesterday (the initial Web bootstrap).

Anyway, thanks to the Web, this post will make a small contribution towards re-connecting the missing phrases to these "Linked Data Web" presentations.

]]>
The Trouble with Labelshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1438Tue, 16 Sep 2008 14:07:49 GMT12008-09-16T10:07:49.000015-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
Even with the marginal degrees of serendipitous discovery that the current document oriented Web offers, it's still possible to stumble across poignant gems such as this statement from InspireUX:



The statement above resonates with a lot of my fundamental views about the essence of the Web. It also drives right at the core of what we are trying to address with the OpenLink Data Explorer (ODE), which simply isn't about Linked Data visualization, but the combination of visualization, user interaction, and unobtrusive exposure and exploitation of Linked Data Entities culled from the existing Web of Linked Documents. ODE consumes and processes URIs or URLs. Thus, as long as the (X)HTML container / host document keeps URIs or URLs in "agent view", ODE will give you the option to interact with the-data-behind Web information resources (e.g., Web Pages, Images, Audio etc..)

Do remember, "mission-critical" is no longer a corporate / enterprise theme. The lines of demarcation between the individual and enterprise are blurring at warp speed.

]]>
Nice Quote about Information Architecture & World Wide Webhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1421Wed, 27 Aug 2008 15:03:39 GMT12008-08-27T11:03:39-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
Unfortunately, a number of Linking Open Data (LOD) community / Linked Data tribe members (myself included) aren't at the Semantic Web Technologies conference in San Jose (we are in a busy period for Semantic Web Technology related Conferences). But all isn't lost, as Ivan Herman (W3C Semantic Web Activity Lead), LOD member, and SWEO colleague, has carried the banner with aplomb.

Ivan's presentation titled: State of the Semantic Web, is a must view for those who need a quick update on where things are re. the Semantic Web in general.

I also liked the fact that in proper "Lead by example" manner, his presentation isn't PDF or PPT based, it's a Web Document :-)

Hint: as per usual, this post contains a Linked Data demo nugget. This time around, it's in the form of a shared calendar covering a large number of Semantic Web Technology events. All I had to do was subscribe to a number of WebDAV accessible iCal files from my Calendar Data Space and the platform did the rest i.e. produce Linked Data Objects for events associated with a plethora of conferences.

If you assimilate Ivan's presentation properly, you will note that I've just generated, and shared, a large number of URIs covering a range of conference events. Thus, you can extend my contributions (thereby enriching the GGG) by simply associating additional data from your Linked Data Space with mine. All you have to do is use my calendar data objects' URIs in your statements.
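
As a minimal sketch of what such a statement might look like (assuming Python with rdflib; both URIs below are hypothetical placeholders rather than actual calendar object URIs from this post), a single triple in your data space is enough to associate your data with one of the shared events:

    from rdflib import Graph, URIRef
    from rdflib.namespace import RDFS

    my_profile = URIRef("http://example.org/your-dataspace#me")        # hypothetical URI in your data space
    event_uri = URIRef("http://example.org/calendar#some-conference")  # hypothetical shared event URI

    g = Graph()
    g.add((my_profile, RDFS.seeAlso, event_uri))  # the link that extends the shared Linked Data graph
    print(g.serialize(format="turtle"))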

]]>
State of the Semantic Web Presentationhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1365Fri, 23 May 2008 10:53:08 GMT22008-05-23T06:53:08-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
At the forthcoming World Wide Web 2008 Conference there will be an entire workshop dedicated to the emerging Linked Data Web (aka Linked Data). The Linked Data Workshop will include: Presentations, Demonstrations, Tutorials, and Research Papers from a variety of organizations and individuals associated with this very exciting aspect of the Web.

The deadline for submitting papers, presentations, demos, and tutorial proposals is the 28th of January, 2008.

]]>
Linked Data Workshop -- WWW2008http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1291Thu, 10 Jan 2008 18:03:29 GMT22008-01-10T13:03:29.000004-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
I've just read the extensive post by Nova Spivack titled: The Semantic Web, Collective Intelligence and Hyperdata, courtesy of a post by Danny Ayers titled: Confused about the Semantic Web, in response to a post by Tim O'Reilly titled: Economist Confused About the Semantic Web?

My Comments:

Hyperdata is short for HyperLinked Data :-) The same applies to Linked Data. Thus, we have two literal labels for the same core Concept. HTTP is the enabling protocol for "Hyper-linking" Documents and associated Structured Data via the World Wide Web (Web for short): Data Links are associated with Structured Data contained in, or hosted by, Documents on the Web.

RDFa, eRDF, GRDDL, SPARQL Query Language, SPARQL Protocol (SOAP or REST service), SPARQL Results Serializations (XML or JSON) collectively provide a myriad of unobtrusive routes to structured data embedded within, or associated with, existing Web Documents.

As Danny already states, ontologies are not prerequisites for producing structured data using the RDF Data Model. They simply aid the ability to express oneself clearly (i.e. no repetition or ambiguity) across a broad audience of machines (directly) and their human masters (indirectly).

Using the crux of this post as the anecdote: The Semantic Data Web would simplify the process of claiming and/or proving that Linked Data and Hyperdata describe the same concept. It achieves this by using Triples (Subject, Predicate, Object) expressed in various forms (N3, Turtle, RDF/XML etc.) to formalize claims in a form palatable to electronic agents (machines) operating on behalf of Humans. In a nutshell, this increases human productivity by completely obliterating the erstwhile exponential costs of discovering data, information, and knowledge.
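
As a minimal sketch of such a formalized claim (assuming Python with rdflib, and using owl:sameAs purely for illustration; a looser SKOS mapping property would serve equally well), one triple asserts that the two labels identify the same concept:

    from rdflib import Graph, URIRef
    from rdflib.namespace import OWL

    g = Graph()
    g.add((
        URIRef("http://dbpedia.org/resource/Linked_Data"),
        OWL.sameAs,
        URIRef("http://dbpedia.org/resource/Hyperdata"),
    ))
    print(g.serialize(format="nt"))  # one Subject-Predicate-Object claim, machine palatable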

BTW - for full effect, view this post in an RDF Browser (i.e. cut and paste the Permalink URI of this post, below, into one) such as:

]]>
Web of Linked Data & Hyperdatahttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1252Tue, 05 Feb 2008 01:43:55 GMT22008-02-04T20:43:55.000003-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
Dare Obasanjo's post about the issue of Open Data (or Open Data Access), indicates that the "Open Data" issue is gradually beginning to resonate across a broader audience.

From my perspective on things I prefer to align my articulation of the changes that are occurring across our industry (courtesy of the Internet Inflection) to the MVC pattern.

Re. the Web Versions (or Dimensions of Interaction):

    Web 1.0 - (V)iewer (Interactive Web experienced via Browser)
    Web 2.0 - (C)ontroller Web (via Web Services API)
    Web 3.0 - (M)odel (via the RDF Data Model as the basis for an Open and Standards based Concrete Conceptual Data Model)

The same applies to evolution of Openness:

    Early work by Sun and other early UNIX Vendors - (V)iewer (Interaction with the same OS across different hardware platforms)
    Open Source Movement - (C)ontroller (Open Access to Application Source Code )
    Open Data - (M)odel (*where we are now* Freeing the Data from the Applications and Services while moving application development to a Concrete Conceptual Data Model focus. The Data Web is a classic example.)

In the (C)ontroller realm where the focal point is Application Logic, data access issues aren't obvious (*I recall my battles with Richard Stallman re. the appropriate Open Source License variant for iODBC during the embryonic years of database and data access technology on Linux*). Data is an enigma in this realm, unfortunately. This implies that "Data Lock-in" occurs deliberately, but in most cases, inadvertently when we make Application Logic the focal point of everything. Another example is Web 2.0 in which the norm (unfortunately) is to suck in your data, and then refuse to give you complete ownership over how it is used (including the fact that you may want to share it elsewhere).

Open Data is a really big deal, which is why the SWEO supported Linking Open Data Project is a very big deal. The good news is that this movement is gathering momentum at an exponential rate :-)]]>
Open Source and Open Data Movementshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1175Sun, 01 Apr 2007 21:55:55 GMT12007-04-01T17:55:55.000001-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
Frederick Giasson penned an interesting post earlier today that highlighted the RDF Middleware services offered by Triplr and the Virtuoso Sponger.

Some Definitions (as per usual):

RDF Middleware (as defined in this context) is about producing RDF from non RDF Data Sources. This implies that you can use non RDF Data Sources (e.g. (X)HTML Web Pages, (X)HTML Web Pages hosting Microformats, and even Web Services such as those from Google, Del.icio.us, Flickr etc..) as Semantic Web Data Source URIs (pointers to RDF Data).

In this post I would like to provide a similar perspective on the ability to treat non RDF data sources as RDF, from an RDF Browser's vantage point.

First off, what's an RDF Browser?

An RDF Browser is a piece of technology that enables you to browse RDF Data Sources by way of Data Link Traversal. The key difference between this approach and traditional browsing is that Data Links are typed (they possess inherent meaning and context), whereas traditional links are untyped (although universally we have been trained to treat them as links to blurb in the form of (X)HTML pages, or what is popularly called "Web Content").
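
A minimal sketch of that Data Link Traversal (assuming Python with rdflib; the starting URI is just an example, and dereferencing failures are skipped for brevity) looks like this:

    from rdflib import Graph, URIRef

    def describe(uri):
        g = Graph()
        try:
            g.parse(uri)      # HTTP GET plus parsing of the negotiated RDF representation
        except Exception:
            pass              # not every link is dereferenceable; skip failures
        return g

    start = URIRef("http://dbpedia.org/resource/Hyperdata")
    g = describe(start)

    # each (predicate, object) pair is a typed link; object URIs can be traversed further
    for predicate, obj in list(g.predicate_objects(subject=start))[:5]:
        if isinstance(obj, URIRef):
            neighbour = describe(obj)
            print(predicate, "->", obj, "(", len(neighbour), "triples one hop away )")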

There are a number of RDF Browsers that I am aware of (note: pop me a message directly, or by way of a comment to this post, if you have a browser that I am unaware of), and they include (in order of creation and availability):

  1. Tabulator
  2. DISCO - Hyperdata Browser
  3. OpenLink Ajax Toolkit's RDF Browser (a component of the OAT Javascript Toolkit)

Each of the browsers above can consume the services of Triplr or the Virtuoso Sponger en route to unveiling RDF Data that is traversable via URI dereferencing (HTTP GETing the data exposed by the Data Pointer). Thus you can cut & paste the following into each of the aforementioned RDF Browsers:

  1. Triplr's RDF Data (Triples) extractions from Dan Connolly's Home Page
  2. The Virtuoso Sponger's RDF Data (Triples) extractions from Dan Connolly's Home Page

Since we are all time challenged (naturally!) you can also just click on these permalinks for the OAT RDF Browser demos:

  1. Permalink for Triplr's RDF Data (Triples) extractions from Dan Connolly's Home Page
  2. Permalink for the Virtuoso Sponger's RDF Data (Triples) extractions from Dan Connolly's Home Page
]]>
RDF Browsers & RDF Data Middlewarehttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1172Sun, 29 Apr 2007 18:59:05 GMT42007-04-29T14:59:05-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
A defining characteristic of the Data Web (Context Oriented Web 3.0) is that it facilitates Meshups rather than Mashups.

Quick Definitions:

    Mashups - Brute force joining of disparate Web Data
    Meshups - Natural joining of disparate Web Data

Reasons for the distinction:

    Mashups are Data Model oblivious.
    Meshups are Data Model driven.

Examples:

    Mashups are based on RSS 2.0 most of the time (RSS 2.0 is at best a Tree Structure that contains untyped or meaning challenged links).
    Meshups are RDF based and the data is self describing, since the links are typed (possess inherent meaning, thereby providing context).

So what? You may be thinking.

For starters, I can quite easily Mesh data from Googlebase (which emits RSS 2.0 or Atom) and other data sources with the Mapping Services from Yahoo!

I can achieve this in minutes without writing a single line of code. I can do it because of the Data Model prowess of RDF (self-describing instance-data), the data interchange and transformation power of XML and XSLT respectively, the inherent power of XML based Web Services (REST or SOAP), and of course, having a Hybrid Server product like Virtuoso at my disposal that delivers a cross platform solution for exploiting all of these standards coherently.
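
For those who would still like to see the idea in code form, here is a minimal sketch (assuming Python with rdflib; the two Turtle snippets are made-up stand-ins for RDFized Googlebase and Upcoming.org data) of how a Mesh falls naturally out of the shared data model:

    from rdflib import Graph

    googlebase = """
    @prefix ex: <http://example.org/vocab#> .
    <http://example.org/listing/42> ex:about <http://dbpedia.org/resource/Ajax_(programming)> ;
                                    ex:price "100"^^<http://www.w3.org/2001/XMLSchema#integer> .
    """
    upcoming = """
    @prefix ex: <http://example.org/vocab#> .
    <http://example.org/event/7> ex:about <http://dbpedia.org/resource/Ajax_(programming)> ;
                                 ex:city "Boston" .
    """

    mesh = Graph()
    mesh.parse(data=googlebase, format="turtle")
    mesh.parse(data=upcoming, format="turtle")

    # one query spans both sources; the join falls out of the shared "about" URI
    for row in mesh.query("""
        PREFIX ex: <http://example.org/vocab#>
        SELECT ?listing ?event
        WHERE { ?listing ex:price ?p ; ex:about ?topic .
                ?event   ex:city  ?c ; ex:about ?topic . }
    """):
        print(row.listing, "meshes with", row.event)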

I can share the self-describing data source that serves my Meshup. Try reusing the data presented by a Mashup via the same URL that you used to locate the Mashup, to get my drift.

Demo Links:

  1. Googlebase Query URL as an RDF Data Source
  2. Perform a simple Data Mesh by adding (via link copy and paste) this Upcoming.org Query Services URL for Ajax Events to the RDF Browser's list of Data Sources (paste into the Data Source URI input field).

What does this all mean?

"Context" is the catalyst of the burgeoning Data Web (Semantic Web Layer - 1). It's the emerging appreciation of "Context" that is driving the growing desire to increment Web versions from 2.0 to 3.0. It also the the very same "Context" that has been a preoccupation of Semantic Web vision since its inception.

The journey towards a more Semantic Web is all inclusive (all "ANDs" and no "ORs" re. participation).

The Semantic Web is self-annotating. Web 2.0 has provided a huge contribution to the self annotation effort: on the Web we now have Data Spaces for Bookmarks (e.g del.icio.us), Image Galleries ( e.g Flickr), Discussion Forums (remember those comments associated with blog posts? ditto the pingbacks and trackbacks?), People Profiles (FOAF, XFN, del.icio.us, and those crumbling walled-gardens around many Social Networks), and more..

A Web without granular access to Data is simply not a Web worth having (think about the menace of click-fraud and spam).

]]>
Data Web, Googlebase, and Yahoo!http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1165Thu, 22 Mar 2007 23:14:55 GMT22007-03-22T19:14:55-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
I just overheard the following dialog between my six year old son and his play date:

Play Date: What is that thing on the Wall?
My Son: Security Alarm
Play Date: How does it work
My Son: If you click on that top button and then open the door, I will have to enter a code when we come back in or the alarm will go off
Play Date: What is the code?
My Son: I can't tell you that!
Play Date: Why not?
My Son: You might come and steal something from our house!
Play Date: No I won't!
My Son: Well, you might tell someone that might come and steal something from our house! or that person could tell someone who could tell someone that would steal from our house

LOL!! Of course! At the same time, I am left wondering: how come a majority of adults don't quite see the need for granular access to Web Data in a manner that enables computers and humans to collectively arrive at similar decisions?

Putting Data in context en route to producing actionable knowledge is a transient endeavor that engages a myriad of human senses. We demonstrate comprehension of this fact in our daily existence as social creatures (at a very early age as depicted above). That said, we seem to forget this fact when engaging the Web: If we can't see it then it can't be valuable.

BTW - I just received a ping about the "Sensory Web" (which is just another way of describing a Data Driven Web experience from my vantage point.)

In the popular M-V-C pattern you don't see the "M", but the "M" will kill you if you get it wrong (it is the FORCE)! Come to think of it, the pattern could have been coined V-C-M or C-M-V, but isn't, for obvious reasons :-)

RDF is the vehicle that enables us tap into the Data aspect of the Web. We started off with pages of blurb linked via hypertext (Web 1.0) and then looked to "Keywords" for some kind of data access; we then isolated some "Verbs" and discovered another dimension of Web Interaction (Web 2.0) but looked to these "Verbs" for data access which left us with Mashups; and now we are starting to extract "Nouns" and "Adjectives" from sentences (Subject, Predicate, Object - Triples) associated with resources on the Web (Data Web / Web 3.0 / Semantic Web Layer 1) which provides a natural data access substrate for Meshups (natural joining of disparate data from a plethora of data sources) while providing the foundation layer for the Semantic Web.

For those who need use-cases that demonstrate tangible value re. the Semantic Web, here are some projects to note courtesy of the Semantic Web Education and Outreach (SWEO) interest group:

  1. FOAF based White-lists - Attacking SPAM
  2. Open Data Access and Linking for the Data Web - Data Integration and Generation effort that creates a cluster of RDF instance data from a myriad of data sources relating to every day things such as: People, Places, Events, Projects, Discussions, Music, Books, and other things
  3. Content Labeling - Protecting our kids on the Web amongst other matters relating to knowledge about data sources
  4. Others..
Related posts:
  1. Data Web and Global Data Integration & Generation Effort
  2. Previous Data Web posts.
]]>
Our Basic Human Instinctshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1143Sat, 24 Feb 2007 00:55:49 GMT12007-02-23T19:55:49-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
The simple demo uses our Ajax based Visual Query Builder for the SPARQL Query Language (this isn't Grandma's Data Web UI, but not to worry, that is on its way also). Here goes:

  1. go to http://demo.openlinksw.com/isparql
  2. Enter any of the following values into the "Default Data URI" field:
    • http://www.mkbergman.com/?p=336
    • http://radar.oreilly.com/archives/2007/02/pipes_and_filte.html
    • http://jeremy.zawodny.com/blog/archives/008513.html
    • Other URIs

What I am demonstrating is how existing Web Content hooks transparently into the "Data Web". Zero RDF Tax :-) Everything is good!

Note: Please look to the bottom of the screen for the "Run Query" button. Remember, it's not quite Grandma's UI, but it should do for Infonauts etc.. A screencast will follow.

]]>
Hello Data Web!http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1134Tue, 05 Feb 2008 04:22:04 GMT112008-02-04T23:22:04.000001-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
I just spotted a nice Semantic Desktop animation, courtesy of John Breslin.

This is fundamentally an animation demonstrating Semantic Web exploitation in the classic "a picture speaks a thousand words" manner. It also illustrates (yet again) the important Data Space(s) aspect of creating Semantic Web presence.

Finally, the Web 2.0 usage pattern tries to espouse what's demonstrated in this animation via data-context-challenged interactions (due to its "Walled Garden" and "Data Silo" approach to Data Access etc..). The Semantic Web (as per numerous posts on the subject) on the other hand achieves this via data-context-aware interactions (as will be exemplified via meshups).

]]>
Data Spaces and Semantic Web Animationhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1035Tue, 05 Sep 2006 20:00:17 GMT22006-09-05T16:00:17.000001-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
Mary Meeker's Web 2.0 Presentation.

Key data points:

  • Market cap of big 5: $2B (2000 pre-IPO), $178B (2000 peak), $32B (2002 trough) $261B (2005)
  • 27% of US Internet users read blogs
  • 54MM registered Skype users (9/05) - fastest product ramp ever?
  • China - More Internet users < age of 30 than anywhere
  • S. Korea Broadband penetration of 70%+ - No. 1 in world
  • Mobile is most important direction now

Conclusion: first ten years (1995-2005) of commercial Internet were a warm up act for what is about to happen

"

(Via Silkworm Blog.)

]]>
Web 2.0 Conference Notes: Mary Meekerhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/873Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

By Martin LaMonica, CNET News.com

The World Wide Web consortium, the standards body in charge of developing XML, said Tuesday that it has issued three recommendations designed to make handling XML-formatted data more efficient. The specifications have the backing of large industry software providers, including IBM, Microsoft and BEA Systems, which provide the software infrastructure to build and run XML data and Web services applications.

The W3C and vendors are looking at a variety of methods of speeding up the performance of XML, which can be slow for certain applications.

http://news.com.com/2110-1013_3-5551788.html

See also the news story: http://xml.coverpages.org/ni2005-01-25-a.html

]]>
W3C Recommends Quicker XML Transmissionhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/671Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

* IBM has introduced new portal software for accessing and integrating disparate applications, business processes, and data while collaborating with colleagues via a single Web-based environment and sign-on.

http://www.bijonline.com/News.asp?NewsID=980

]]>
IBM Announces New Integration Portalhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/235Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
As a complement to the most recent Linked Data Design Issues note by TimBL, I would like to add this subtle tweak to the enumerated rules:

  1. Identify or Name things using HTTP URIs
  2. Describe things using the RDF metadata model
  3. Increase linked data mesh density on the Web by linking (referring) to things in other data spaces using their HTTP URIs.

If you perform the steps above, on any HTTP network (e.g. World Wide Web), you implicitly bind the Names/Identifiers of things to negotiable representations of their metadata (description) bearing documents.

Also note, you can create and deploy the resulting RDF metadata using any of the following approaches:

  1. RDFa within (X)HTML documents
  2. N3, Turtle, TriX, RDF/XML etc. based documents
  3. Programmatically generated variants of 1&2.
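
A minimal sketch of the three rules in action (assuming Python with rdflib; the personal URI and name are hypothetical placeholders, while the DBpedia URI is real) follows:

    from rdflib import Graph, URIRef, Literal, Namespace
    from rdflib.namespace import RDF

    FOAF = Namespace("http://xmlns.com/foaf/0.1/")

    me = URIRef("http://example.org/dataspace/person#me")   # rule 1: name the thing with an HTTP URI

    g = Graph()
    g.add((me, RDF.type, FOAF.Person))                      # rule 2: describe it using the RDF metadata model
    g.add((me, FOAF.name, Literal("Jane Example")))
    g.add((me, FOAF.interest,                               # rule 3: link to a thing in another data space
           URIRef("http://dbpedia.org/resource/Linked_Data")))

    print(g.serialize(format="turtle"))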

Related

]]>
Linked Data Rules Simplifiedhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1561Sat, 27 Jun 2009 03:18:24 GMT22009-06-26T23:18:24.000003-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
After reading Bengee's interview with CrunchBase, I decided to knock up a quick interview remix as part of my usual attempt to add to the developing discourse.

CrunchBase: When we released the CrunchBase API, you were one of the first developers to step up and quickly released a CrunchBase Sponger Cartridge. Can you explain what a CrunchBase Sponger Cartridge is?
Me: A Sponger Cartridge is a data access driver for Web Resources that plugs into our Virtuoso Universal Server (DBMS and Linked Data Web Server combo amongst other things). It uses the internal structure of a resource and/or a web service associated with a resource, to materialize an RDF based Linked Data graph that essentially describes the resource via its properties (Attributes & Relationships).




CrunchBase: And what inspired you to create it?
Me: Bengee built a new space with your data, and we've built a space on the fly from your data which still resides in your domain. Either solution extols the virtues of Linked Data i.e. the ability to explore relationships across data items with high degrees of serendipity (also colloquially known as: following-your-nose pattern in Semantic Web circles).
Bengee posted a notice to the Linking Open Data Community's public mailing list announcing his effort. Bearing in mind the fact that we've been using middleware to mesh the realms of Web 2.0 and the Linked Data Web for a while, it was a no-brainer to knock something up based on the conceptual similarities between Wikicompany and CrunchBase. In a sense, a quadrant of orthogonality is what immediately came to mind re. Wikicompany, CrunchBase, Bengee's RDFization efforts, and ours.
Bengee created an RDF based Linked Data warehouse based on the data exposed by your API, which is exposed via the Semantic CrunchBase data space. In our case we've taken the "RDFization on the fly" approach, which produces a transient Linked Data View of the CrunchBase data exposed by your APIs. Our approach is in line with our world view: all resources on the Web are data sources, and the Linked Data Web is about incorporating HTTP into the naming scheme of these data sources so that the conventional URL based hyperlinking mechanism can be used to access a structured description of a resource, which is then transmitted using a range of negotiable representation formats. In addition, based on the fact that we house and publish a lot of Linked Data on the Web (e.g. DBpedia, PingTheSemanticWeb, and others), we've also automatically meshed CrunchBase data with related data in DBpedia and Wikicompany.

CrunchBase: Do you know of any apps that are using CrunchBase Cartridge to enhance their functionality?
Me: Yes, the OpenLink Data Explorer which provides CrunchBase site visitors with the option to explore the Linked Data in the CrunchBase data space. It also allows them to "Mesh" (rather than "Mash") CrunchBase data with other Linked Data sources on the Web without writing a single line of code.

CrunchBase: You have been immersed in the Semantic Web movement for a while now. How did you first get interested in the Semantic Web?
Me: We saw the Semantic Web as a vehicle for standardizing conceptual views of heterogeneous data sources via context lenses (URIs). In 1998, as part of our strategy to expand our business beyond the development and deployment of ODBC, JDBC, and OLE-DB data providers, we decided to build a Virtual Database Engine (see: Virtuoso History), and in doing so we sought a standards based mechanism for the conceptual output of the data virtualization effort. As of the time of the seminal unveiling of the Semantic Web in 1998, we were clear about two things in relation to the effects of the Web and Internet data management infrastructure inflections: 1) existing DBMS technology had reached its limits; 2) Web Servers would ultimately hit their functional limits. These fundamental realities compelled us to develop Virtuoso with an eye to leveraging the Semantic Web as a vehicle for completing its technical roadmap.

CrunchBase: Can you put into layman’s terms exactly what RDF and SPARQL are and why they are important? Do they only matter for developers or will they extend past developers at some point and be used by website visitors as well?
Me: RDF (Resource Description Framework) is a Graph based Data Model that facilitates resource description using the Subject, Predicate, and Object principle. Associated with the core data model, as part of the overall framework, are a number of markup languages for expressing your descriptions (just as you express presentation markup semantics in HTML or document structure semantics in XML) that include: RDFa (simple extension of HTML markup for embedding descriptions of things in a page), N3 (a human friendly markup for describing resources), RDF/XML (a machine friendly markup for describing resources).
SPARQL is the query language associated with the RDF Data Model, just as SQL is the query language associated with the Relational Database Model. Thus, when you have RDF based structured and linked data on the Web, you can query against the Web using SPARQL just as you would against an Oracle/SQL Server/DB2/Informix/Ingres/MySQL/etc. DBMS using SQL. That's it in a nutshell.
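
To make the analogy concrete, here is a minimal sketch of the same question asked in both languages (the table, column, and predicate names are hypothetical):

    # SQL addresses rows through a table schema...
    sql = """
    SELECT name, founded
    FROM   companies
    WHERE  city = 'Boston';
    """

    # ...while SPARQL addresses entities through their properties (Subject-Predicate-Object
    # patterns), wherever on the Web those descriptions happen to reside.
    sparql = """
    PREFIX ex: <http://example.org/vocab#>
    SELECT ?name ?founded
    WHERE {
      ?company ex:city    "Boston" ;
               ex:name    ?name ;
               ex:founded ?founded .
    }
    """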

CrunchBase: On your website you wrote that “RDF and SPARQL as productivity boosters in everyday web development”. Can you elaborate on why you believe that to be true?
Me: I think the ability to discern a formal description of anything via its discrete properties is of immense value re. productivity, especially when the capability in question results in a graph of Linked Data that isn't confined to a specific host operating system, database engine, application or service, programming language, or development framework. RDF Linked Data is about infrastructure for the true materialization of the "Information at Your Fingertips" vision of yore. Even though it's taken the emergence of RDF Linked Data to make the aforementioned vision tractable, the comprehension of the vision's intrinsic value has been clear for a very long time. Most organizations and/or individuals are quite familiar with the adage: Knowledge is Power; well, there isn't any knowledge without accessible Information, and there isn't any accessible Information without accessible Data. The Web has always been grounded in accessibility to data (albeit via compound container documents called Web Pages).
Bottom line, RDF based Linked Data is about Open Data access by reference using URIs (HTTP based Entity IDs / Data Object IDs / Data Source Names), and as I said earlier, the intrinsic value is pretty obvious bearing in mind the costs associated with integrating disparate and heterogeneous data sources -- across intranets, extranets, and the Internet.

CrunchBase: In his definition of Web 3.0, Nova Spivack proposes that the Semantic Web, or Semantic Web technologies, will be the force behind much of the innovation that will occur during Web 3.0. Do you agree with Nova Spivack? What role, if any, do you feel the Semantic Web will play in Web 3.0?
Me: I agree with Nova. But I see Web 3.0 as a phase within the Semantic Web innovation continuum. Web 3.0 exists because Web 2.0 exists. Both of these Web versions express usage and technology focus patterns. Web 2.0 is about the use of Open Source technologies to fashion Web Services that are ultimately used to drive proprietary Software as Service (SaaS) style solutions. Web 3.0 is about the use of "Smart Data Access" to fashion a new generation of Linked Data aware Web Services and solutions that exploit the federated nature of the Web to maximum effect; proprietary branding will simply be conveyed via quality of data (cleanliness, context fidelity, and comprehension of privacy) exposed by URIs.

Here are some examples of the CrunchBase Linked Data Space, as projected via our CrunchBase Sponger Cartridge:

  1. Amazon.com
  2. Microsoft
  3. Google
  4. Apple
]]>
Crunchbase & Semantic Web Interview (Remix - Update 1)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1424Thu, 28 Aug 2008 00:35:15 GMT32008-08-27T20:35:15-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
Stumbled across a nice post titled: What do people have against URLs?. My answer: Everything, if they don't understand the inherent power of URLs when incorporated into the "Data Source Naming" mechanism of the Web called: URIs :-)

URIs are simple to use, i.e. you simply click on them via a user agent's UI. However, URLs incorporated into Data Source Naming en route to constructing HTTP based Identifiers -- ones that deliver HTTP based pointers to the location / address of a Resource's Description -- are another matter.

I touched on this issue in my Linked Data Planet keynote last week, and I must say, it did set off a light.

I believe we can only get the broader Web community to comprehend the utility of URIs (Web Data Source Names) by exposing said utility via the Web's Universal Client (the Web Browser). For instance, how do URN based Identity / Naming schemes help in a world dominated by Web Browsers that only grok "http://"? From my vantage point, the practical solution is for data providers who already have "doi", "lsid" and other Handle based Identifiers in place, to embark upon http-to-native-naming-scheme-proxying.

In my usual "dog-fooding" and "practice what you preach" fashion, this is exactly what we do in the new Linked Data Web extension that we've decided to reveal to the public (albeit late beta). Thus, when you use an existing browser to view pages with "lsid" or "doi" URNs, you still enjoy the utility of getting at the "Raw Linked Data Sources" that these names expose.

]]>
What do people have against URLs or URIs? (Updated)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1388Mon, 23 Jun 2008 13:37:57 GMT22008-06-23T09:37:57.000003-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
ODBC delivers open data access (by reference) to a broad range of enterprise databases via a 'C' based API. Thanks to the iODBC and unixODBC projects, ODBC is available across a broad range of platforms beyond Windows.

ODBC identifies data sources using Data Source Names (DSNs).

WODBC (Web Open Database Connectivity) delivers open data access to Web Databases / Data Spaces. The Data Source Naming scheme: URI or IRI, is HTTP based thereby enabling data access by reference via the Web.

ODBC DSNs bind ODBC client applications to Tables, Views, Stored Procedures.

WODBC DSNs bind you to a Data Space (e.g. my FOAF based Profile Page, where you can use the "Explore Data Tab" to look around if you are a human visitor) or a specific Entity within a Data Space (i.e. the Person Entity "Me").

ODBC Drivers are built using APIs (DBMS Call Level Interfaces) provided by DBMS vendors. Thus, a DBMS vendor can choose not to release an API, or do so selectively, for competitive advantage or market disruption purposes (it's happened!).

WODBC Drivers are also built using APIs (Web Services associated with a Web Data Space). These drivers are also referred to as RDF Middleware or RDFizers. The "Web" component of WODBC ensures openness, you publish Data with URIs from your Linked Data Server and that's it; your data space or specific data entities are live and accessible (by reference) over the Web!

So we have come full circle (or cycle): the Web is becoming more of a structured database every day! What's new is old, and what's old is new!

Data Access is everything; without "Data" there is no information or knowledge. Without "Data" there's no notion of vitality, purpose, or value.

URIs make or break everything in the Linked Data Web just as ODBC DSNs do within the enterprise.
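
A minimal sketch of that parallel (assuming Python with the pyodbc and rdflib libraries installed; the DSN, SQL table, and profile URI below are hypothetical placeholders):

    import pyodbc
    from rdflib import Graph

    # ODBC: a Data Source Name binds the client to tables, views, and stored procedures
    con = pyodbc.connect("DSN=SalesDB;UID=demo;PWD=demo")            # hypothetical DSN
    rows = con.cursor().execute("SELECT * FROM customers").fetchall()

    # "WODBC": an HTTP based Data Source Name binds the client to a Web entity's description
    g = Graph()
    g.parse("http://example.org/dataspace/person#me")                # hypothetical profile URI

    print(len(rows), "relational rows;", len(g), "triples from the Web data space")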

I've deliberately left JDBC, ADO.NET, and OLE-DB out of this piece due to their respective programming languages and frameworks specificity. None of these mechanisms match the platform availability breadth of ODBC.

The Web as a true M-V-C pattern is now crystallizing. The "M" (Model) component of M-V-C is finally rising to the realm of broad attention courtesy of the "Linked Data" meme and "Semantic Web" vision.

By the way, M-V-C lines up nicely with Web 1.0 (Web Forms / Pages), Web 2.0 (Web Services based APIs), and Web 3.0 (Data Web, Web of Data, or Linked Data Web) :-)

]]>
ODBC & WODBC Comparisonhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1364Tue, 20 May 2008 19:46:11 GMT12008-05-20T15:46:11-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
In 2006, I stumbled across Jason Kolb (online) via a 4-part series of posts titled: Reinventing the Internet. At the time, I realized that Jason was postulating about what is popularly known today as "Data Portability", so I made contact with him (blogosphere style) via a post of my own titled: Data Spaces, Internet Reinvention, and the Semantic Web. Naturally, I tried to unveil to Jason the connection between his vision and the essence of the Semantic Web. Of course, he was skeptical :-)

Jason recently moved to Massachusetts, which led to me pinging him about our earlier blogosphere encounter and the emergence of a Data Portability Community. I also informed him about the fact that TimBL, myself, and a number of other Semantic Web technology enthusiasts frequently meet on the 2nd Tuesday of each month at the MIT hosted Cambridge Semantic Web Gatherings, to discuss, demonstrate, and debate all aspects of the Semantic Web. Luckily (for both of us), Jason attended the last event, and we got to meet each other in person.

Following our face to face meeting in Cambridge, a number of follow-on conversations ensued covering Linked Data and practical applications of the Semantic Web vision. Jason writes about our exchanges in a recent post titled: The Semantic Web. His passion for Data Portability enabled me to use OpenID and FOAF integration to connect the Semantic Web and Data Portability via the Linked Data concept.

During our conversations, Jason also alluded to the fact that he had already encountered OpenLink Software while working with our ODBC Drivers (part of our UDA product family) for IBM Informix (Single-Tier or Multi-Tier Editions) a few years ago (interesting random connection).

As I've stated in the past, I've always felt that the Semantic Web vision will materialize by way of a global epiphany. The countdown to this inevitable event started at the birth of the blogosphere, ironically, and accelerated more recently through the emergence of Web 2.0 and Social Networking, even more ironically :-)

The blogosphere started the process of Data Space coalescence via RSS/Atom based semi-structured data enclaves; Web 2.0 propagated Web Service usage en route to creating service provider controlled data and information silos; Social Networking brought attention to the fact that User Generated Data wasn't actually owned or controlled by the Data Creators; etc.

The emergence of "Data Portability" has created a palatable moniker for a clearly defined, and slightly easier to understand, problem: the meshing of Data and Identity in cyberspace i.e. individual points of presence in cyberspace, in the form of "Personal Data Spaces in the Clouds" (think: doing really powerful stuff with .name domains). In a sense, this is the critical inflection point between the document centric "Web of Linked Documents" and the data centric "Web or Linked Data". There is absolutely no other way solve this problem in a manner that alleviates the imminent challenges presented by information overload -- resulting from the exponential growth of user generated data across the Internet and enterprise Intranets.

]]>
Semantic Data Web Epiphanies: One Node at a Timehttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1300Fri, 18 Jan 2008 07:27:27 GMT12008-01-18T02:27:27.000004-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
In response to the ReadWriteWeb piece titled: Semantic Web: What is the Killer App?, by Alex Iskold:

Information overload and Data Portability are two of the most pressing and imminent challenges affecting every individual connected to the global village exposed by the Internet and World Wide Web. I wrote an earlier post titled: Why We Need Linked Data that shed light on frequently overlooked realities about the Document Web.

The real Killer application of the Semantic Web (imho) is Linked Data (or Hyperdata), just as the killer application of the Document Web was Linked Documents (Hyperlinks). Linked Data enables human users (indirectly) and software agents (directly in response to human instruction) to traverse Web Data Spaces (Linked Data enclaves within the Giant Global Graph).

Semantic Web applications (conduits between humans and agents) that take advantage of Linked Data include:

DBpedia - General Knowledge sourced from Wikipedia and a host of other Linked Data Spaces.

Various Linked Data Browsers: Zitgist Data Viewer, OpenLink RDF Browser, DISCO Browser, and TimBL's Tabulator.

zLknks - Linked Data Lookup technology for Web Content Publishing systems (note: more to come on this in a future post).

OpenLink Data Spaces - a solution for Data Portability via a Linked Data Junction Box for Web 1.0 ((X)HTML Document Webs), 2.0 (XML Web Services based Content Publishing, Content Syndication, and Aggregation), and 3.0 (Linked Data) Data Spaces. Thus, via my URI (when viewed through a Linked Data Browser/Viewer) you can traverse my Data Space (i.e my Linked Data Graph) generated by the following activities:

    Blog Posts publishing
    My RSS & Atom Content Subscriptions (what used to be called a "Blogroll")
    My Bookmarks (from my Desktop and Del.icio.us)
    and other things I choose to share with the public via the Web

Virtuoso - a Universal Server Platform that includes RDF Data Management, RDFization Middleware, SQL-RDF Mapping, RDF Linked Data Deployment, alongside a hybrid/multi-model, virtual/federated data service in a single product offering.

BTW - There is a Linked Data Workshop at this year's World Wide Web conference. Also note the Healthcare & Life Science Workshop, which is a related Linked Data technology and Semantic Web best practices realm. ]]>
Semantic Web Killer Application?http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1293Tue, 05 Feb 2008 01:32:42 GMT92008-02-04T20:32:42.000003-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
The motivation behind this post is a response to the Read/WriteWeb post titled: Semantic Web: Difficulties with the Classic Approach.

First off, I am going to focus on the Semantic Data Web aspect of the overall Semantic Web vision (a continuum) as this is what we have now. I am also writing this post as a deliberate contribution to the discourse swirling around the real topic: Semantic Web Value Proposition.

Situation Analysis

We are in the early stages of the long anticipated Knowledge Economy. That being the case, it would be safe to assume that information access, processing, and dissemination are of utmost importance to individuals and organizations alike. You don't produce knowledge in a vacuum! Likewise, you can't produce Information in a vacuum; you need Data.

The Semantic Data Web's value to Individuals

Problem:

Increasingly, Blogs, Wikis, Shared Bookmarks, Photo Galleries, Discussion Forums, Shared Calendars and the like, have become invaluable tools for individual and organizational participation in Web enabled global discourse (where a lot of knowledge is discovered). These tools, are typically associated with Web 2.0, implying Read-Write access via Web Services, centralized application hosting, and data lock-in (silos).

The reality expressed above is a recipe for "Information Overload" and complete annihilation of one's effective pursuit and exploitation of knowledge due to "Time Scarcity" (note: disconnecting is not an option). Information abundance is inversely related to available processing time (for humans in particular). In my case for instance, I was actively subscribed to over 500+ RSS feeds in 2003. As of today, I've simply stopped counting, and that's just my Weblog Data Space. Then add to that all of the Discussions I track across Blogs, Wikis, message boards, mailing lists, traditional usenet discussion forums, and the like, and I think you get the picture.

Beyond information overload, Web 2.0 data is "Semi-Structured" by way of its dominant data containers ((X)HTML, RSS, Atom documents and data streams etc.) lacking semantics that formally expose individual data items as distinct entities, endowed with unambiguous naming / identification, descriptive attributes (a type of property/predicate), and relationships (a type of property/predicate).

Solution:

Devise a standard for Structured Data Semantics that is compatible with the Web Information BUS.

Produce structured data (entities, entity types, entity relationships) from Web 1.0 and Web 2.0 resources that already exists on the Web such that individual entities, their attributes, and relationships are accessible and discernible to software agents (machines).

Once the entities are individually exposed, the next requirement is a mechanism for selective access to these entities i.e. a query language.

Semantic Data Web Technologies that facilitate the solution described above include:

Structured Data Standards:
    RDF - Data Model for structured data
    RDF/XML - A serialization format for RDF based structured data
    N3 / Turtle - more human friendly serialization formats for RDF based structured data
Entity Exposure & Generation:
    GRDDL - enables association between XHTML pages and XSLT stylesheets that facilitates loosely coupled "on the fly" extraction of RDF from non RDF documents
    RDFa - enables document publishers or viewers (i.e those repurposing or annotating) to embed structured data into existing XHTML documents
    eRDF - another option for embedding structured RDF data within (X)HTML documents
    RDF Middleware - typically incorporating GRDDL, RDFa, eRDF, and custom extraction and mapping as part of a structured data production pipeline
Entity Naming & Identification:

Use of URIs or IRIs for uniquely identifying physical (HTML Documents, Image Files, Multimedia Files etc..) and abstract (People, Places, Music, and other abstract things).

Entity Access & Querying:

    SPARQL Query Language - the SQL analog of the Semantic Data Web that enables query constructs that target named entities, entity attributes, and entity relationships

The Semantic Data Web's value to Organizations

Problem:

Organizations are rife with a plethora of business systems that are built atop a myriad of database engines, sourced from a variety of DBMS vendors. A typical organization would have a different database engine, from a specific DBMS vendor, underlying critical business applications such as: Human Resource Management (HR), Customer Relationship Management (CRM), Accounting, Supply Chain Management etc. In a nutshell, you have DBMS Engine and DBMS Schema heterogeneity permeating the IT infrastructure of organizations on a global scale, making Data & Information Integration the biggest headache across all IT driven organizations.

Solution:

Alleviation of the pain (costs) associated with Data & Information Integration.

Semantic Data Web offerings:

A dexterous data model (RDF) that enables the construction of conceptual views of disparate data sources across an organization based on existing web architecture components such as HTTP and URIs.

Existing middleware solutions that facilitate the exposure of SQL DBMS data as RDF based Structured Data include:

BTW - There is an upcoming W3C Workshop covering the integration of SQL and RDF data.

Conclusion

The Semantic Data Web is here; its value delivery vehicle is the URI. The URI is a conduit to Interlinked Structured Data (RDF based Linked Data) derived from existing data sources on the World Wide Web, alongside data continuously injected into the Web by organizations world wide. Ironically, the Semantic Data Web is the only platform that crystallizes the "Information at Your Fingertips" vision without development environment, operating system, application, or database lock-in. You simply click on a Linked Data URI and the serendipitous exploration and discovery of data commences.

The unobtrusive emergence of the Semantic Data Web is a reflection of the soundness of the underlying Semantic Web vision.

If you are excited about Mash-ups, then you are a Semantic Web enthusiast and beneficiary in the making, because you only "Mash" (brute force data extraction and interlinking) because you can't "Mesh" (natural data extraction and interlinking). Likewise, if you are a social-networking, open social-graph, or portable social-network enthusiast, then you are also a Semantic Data Web beneficiary and enthusiast, because your "values" (yes, the values associated with the properties that define you, e.g., your interests) are the fundamental basis for portable, open social-networking, which is what the Semantic Data Web hands to you on a platter without compromise (i.e., without data lock-in or loss of data ownership).

Some practical examples of Semantic Data Web prowess:
    DBpedia (*note: I deliberately use DBpedia URIs in my posts where I would otherwise have used a Wikipedia article URI*)
]]>
Semantic Web Value Propositionhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1254Fri, 21 Sep 2007 12:05:07 GMT32007-09-21T08:05:07.000009-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
The recently released Semantic Web FAQ (authored by Ivan Herman) has some neat Rich Internet and Semantic Data Web embellishments contributed by Ivan and Lee Feigenbaum. As a result, we not only have a great Semantic Web FAQ document, we also inherit a coherent piece of "demo fodder" that aids the general (S)emantic (W)eb (E)ducation and (O)utreach (SWEO) effort that is clearly in full swing.

Of course, this also enables me to provide yet another Semantic Data Web demo in the form of additional viewing perspectives for the aforementioned FAQ (just click to see):

  1. Semantic Web FAQ via Dynamic Data Page
  2. Semantic Web FAQ via OpenLink Browser

Lee also embarked on a similar embellishment effort re. the SPARQL Query Language FAQ thereby enabling me to also offer alternative viewing perspectives along similar lines:

  1. SPARQL FAQ via Dynamic Data Page
  2. SPARQL FAQ via OpenLink Browser
]]>
Exploring The Semantic Web & SPARQL FAQs, Linked Data Style!http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1205Thu, 31 May 2007 21:43:47 GMT12007-05-31T17:43:47.000001-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
Web Data Spaces

Now that broader understanding of the Semantic Data Web is emerging, I would like to revisit the issue of "Data Spaces".

A Data Space is a place where Data resides. It isn't inherently bound to a specific Data Model (Concept Oriented, Relational, Hierarchical, etc.). Neither is it implicitly an access point to Data, Information, or Knowledge (the perception is purely determined through the experiences of the user agents interacting with the Data Space).

A Web Data Space is a Web accessible Data Space.

Real world example:

Today we increasingly perform one or more of the following tasks as part of our professional and personal interactions on the Web:

  1. Blog via many service providers or personally managed weblog platforms
  2. Create Event Calendars via Upcoming.com and Eventful
  3. Maintain and participate in Social Networks (e.g. Facebook, Orkut, MySpace)
  4. Create and Participate in Discussions (note: when you comment on blogs or wikis for instance, you are participating in, or creating, a conversation)
  5. Track news by subscribing to RSS 1.0, RSS 2.0, or Atom Feeds
  6. Share Bookmarks & Tags via Del.icio.us and other Services
  7. Share Photos via Flickr
  8. Buy, Review, or Search for books via Amazon
  9. Participate in auctions via eBay
  10. Search for data via Google (of course!)

John Breslin has a nice animation depicting the creation of Web Data Spaces that drives home the point.

Web Data Space Silos

Unfortunately, what isn't as obvious to many netizens is the fact that each of the activities above results in the creation of data that is put into some context by you, the user. Even worse, you eventually realize that the service providers aren't particularly willing to, or capable of, giving you unfettered access to your own data. Of course, this isn't always by design, as the infrastructure behind the service can make this a nightmare from security and/or load balancing perspectives. Irrespective of cause, we end up creating our own "Data Spaces" all over the Web without a coherent mechanism for accessing and meshing these "Data Spaces".

What are Semantic Web Data Spaces?

Data Spaces on the Web that provide granular access to RDF Data.

What's OpenLink Data Spaces (ODS) About?

Short History

In anticipation of the "Web Data Silo" challenge (an issue that we tackled within internal enterprise networks for years), we commenced the development (circa 2001) of a distributed collaborative application suite called OpenLink Data Spaces (ODS). The project was never released to the public since the problems associated with the deliberate or inadvertent creation of Web Data silos hadn't really materialized (silos only emerged in concrete form after the emergence of the Blogosphere and Web 2.0). In addition, there wasn't a clear standard Query Language for the RDF based Web Data Model (i.e., the SPARQL Query Language didn't exist).

Today, ODS is delivered as a packaged solution (in Open Source and Commercial flavors) that alleviates the pain associated with Data Space Silos that exist on the Web and/or behind corporate firewalls. In either scenario, ODS simply allows you to create Open and Secure Data Spaces (via its suite of applications) that expose data via SQL, RDF, and XML oriented data access and data management technologies. Of course, it also enables you to integrate transparently with existing 3rd party data space generators (Blogs, Wikis, Shared Bookmarks, Discussion, etc. services) by supporting industry standards that cover:

  1. Content Publishing - Atom, Moveable Type, MetaWeblog, Blogger protocols
  2. Content Syndication Formats - RSS 1.0, RSS 2.0, Atom, OPML etc.
  3. Data Management - SQL, RDF, XML, Free Text
  4. Data Access - SQL, SPARQL, GData, Web Services (SOAP or REST styles), WebDAV/HTTP
  5. Semantic Data Web Middleware - GRDDL, XSLT, SPARQL, XPath/XQuery, HTTP (Content Negotiation) for producing RDF from non RDF Data ((X)HTML, Microformats, XML, Web Services Response Data etc).

Thus, by installing ODS on your Desktop, Workgroup, Enterprise, or public Web Server, you end up with a very powerful solution for creating Open Data access oriented presence on the "Semantic Data Web" without incurring any of the typically assumed "RDF Tax".

Naturally, ODS is built atop Virtuoso and of course it exploits Virtuoso's feature-set to the max. It's also beginning to exploit functionality offered by the OpenLink Ajax Toolkit (OAT).

]]>
Semantic Web Data Spaceshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1185Fri, 13 Apr 2007 22:19:29 GMT12007-04-13T18:19:29.000001-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
Stefano Mazzocchi, via his blog Stefano's Linotype, delivers an insightful contribution to the ongoing effort to recapture the essence of the original Semantic Web vision.

The Semantic Web is about granular exposure of the underlying web-of-data that fuels the World Wide Web. It models "Web Data" using a Directed Graph Data Model (back-to-the-future: Network Model Database) called RDF.

In line with contemporary database technology thinking, the Semantic Web also seeks to expose Web Data to architects, developers, and users via a concrete Conceptual Layer that is defined using RDF Schema.

The abstract nature of Conceptual Models implies that actual instance data (Entities, Attributes, and Relationships/Associations) occurs by way of "Logical to Conceptual" schema mapping and data generation that can involve a myriad of logical data sources (SQL, XML, Object databases, traditional web content, RSS/Atom feeds, etc.). Thus, by implication, it is safe to assume that the Semantic Web's construction is basically a Data Integration and exposure effort. This is the point that Stefano alludes to in the blog post excerpts that follow:

The semantic web is really just data integration at a global scale. Some of this data might end up being consistent, detailed and small enough to perform symbolic reasoning on, but even if this is the case, that would be such a small, expensive and fragile island of knowledge that it would have the same impact on the world as calculus had on deciding to invade Iraq.

The biggest problem we face right now is a way to 'link' information that comes from different sources that can scale to hundreds of millions of statements (and hundreds of thousands of equivalences). Equivalences and subclasses are the only things that we have ever needed of OWL and RDFS, we want to 'connect' dots that otherwise would be unconnected. We want to suggest people to use whatever ontology pleases them and then think of just mapping it against existing ones later. This is easier to bootstrap than to force them to agree on a conceptualization before they even know how to start!

Additional insightful material from Stefano:

  1. A No-Nonsense Guide to Semantic Web Specs for XML People [Part I]
  2. A No-nonsense Guide to Semantic Web Specs for XML People [Part II]

Benjamin Nowack also chimes into this conversation via his simple guide to understanding Data, Information, and Knowledge in relation to the Semantic Web.

]]>
Semantic Web & Data Integrationhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1122Thu, 18 Jan 2007 14:25:51 GMT42007-01-18T09:25:51.000006-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
In the last week I've dispatched some thoughts about a number of issues (Data Spaces and Web 2.0's Open Data Access Paradox) that basically equate to the identification of the Web 2.0 to Semantic Web (Data Web, Web of Databases, Web.next, etc.) inflection.

One of the great things about the moderate “open data access” that we have today (courtesy of the blogosphere) is the fact that you can observe the crystallization of new thinking, and/or new appreciation of emerging ideas, in near real-time. Of course, when we really hit the tracks with the Semantic Web this will be in “conditional real-time” (i.e. you choose and control your scope and sensitivity to data changes etc..).

For instance, by way of feed subscriptions, I stumbled upon a series of posts by Jason Kolb that basically articulate what I (and others who believe in the Semantic Web vision) have been attempting to convey in a myriad of ways via posts and commentary etc..

Here are the links to the series by Jason:

  1. Reinventing the Internet part 1 (appreciating “Presence” over traditional “Web Sites”)
  2. Reinventing the Internet part 2
  3. Reinventing the Internet part 3 (appreciating and comprehending URIs)
  4. Reinventing the Internet part 4 (nice visualization of "Data Spaces")
  5. Reinventing the Internet part 5 (everyone will have a Data Space in due course because the Internet is really a Federation of Data Spaces)
]]>
Data Spaces, Internet Reinvention, and Semantic Webhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1033Thu, 25 Jan 2007 21:50:40 GMT42007-01-25T16:50:40.000001-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
The act of using URIs to "refer to" (reference) Web addressable data objects. It's also the act of using the same URI to de-reference the description of a referenced data object; in this case, the representation of the description is negotiated by a Web client and/or Web server. Thus, you can access the description of a data object via data representation formats such as: JSON, XML, (X)HTML, RDF/XML, N3, Turtle, TriX etc.

Note: In proper Web parlance, a data object is referred to as a resource.

Simple example (using DBpedia)

In the Linked Data realm, If you want to make a reference to the Linked Data meme in a blog post, you are better off using the resource URI: http://dbpedia.org/resource/Linked_Data, instead of the Web page URL: http://dbpedia.org/page/Linked_Data, which is the address of a physical document (an information conveying artifact) that at best visually presents the negotiated representation of a resource description.
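For illustration, de-referencing the resource URI with content negotiation looks roughly like this at the HTTP level (the redirect target and the formats on offer may differ; this is a sketch, not a recorded transcript):

    Request:
      GET http://dbpedia.org/resource/Linked_Data
      Accept: application/rdf+xml

    Response:
      HTTP/1.1 303 See Other
      Location: http://dbpedia.org/data/Linked_Data.rdf

The same resource URI, asked for with an HTML-oriented Accept header, would instead steer a browser towards the human-readable page.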

Why is this valuable?

In the simplest sense, you only have one focal point for referencing (referring to) and de-referencing (retrieving data about) a given Web resource. It protects you from the impact of Web document location changes (amongst many other things).

Remember, a single URI is a conduit into a realm where the identity, access, representation, presentation, and storage of a resource (data object) are completely distinct. It's the mechanism for conducting data across network, machine, operating system, dbms engine, application, and service (API) boundaries. Thus, without "linked data meme" prescribed URI referencing and de-referencing, we are simply back to "business as usual" re. the industry at large, where networks, operating systems, dbms engines, applications, and services (APIs) become the basis for "data lock-in" and silo construction.

Going forward

Take a second to think about the profound virtues of the ubiquitous Web of Linked Document URLs that we have today, and then apply that thinking to the burgeoning Web of Linked Data URIs, which has just turned the corner and is heading in everyone's direction at full blast.

Note to "Social Media" players: Who you know isn't the canonical object of sociality. What you are, i.e., your description and the data objects it exposes, are the real objects of your sociality :-)

Related

]]>
What is the Linked Data Meme about?http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1546Wed, 29 Apr 2009 20:31:10 GMT62009-04-29T16:31:10-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
If you are still grappling with the "Semantic Web Project" and one of its more distinguished deliverables, the Linked Data Web, then please make time to watch and digest the prescience of this 1990 documentary about Hypermedia titled Hyperland, by the late Douglas Adams.

Related

]]>
Important Movie and Ultimate Linked Data Documentary (Update 3)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1530Sun, 15 Mar 2009 14:35:49 GMT62009-03-15T10:35:49.000003-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
As indicated in posts from Fred Giasson and Mike Bergman, the Zitgist incubation effort that contributed to the delivery of vital Linked Data Web infrastructure components such as TalkDigger (discourse discovery and participation), PingTheSemanticWeb (ground-zero data source for most Semantic Web search engines), UMBEL (binding layer for Upper and Lower Ontologies amongst other things), Music Ontology (enabling meaningful description of Music), and Bibliographic Ontology (enabling meaningful description of Bibliographic content), is now ready to continue its business development and technology growth as a going concern known as Structured Dynamics.

With great joy and pride, I wish Structured Dynamics all the success they deserve. Naturally, the collaborations and close relationship between OpenLink Software and its latest technology partner will continue -- especially as we collectively work towards a more comprehensible and pragmatic Web of Linked Data for developers (across Web 1.0, 2.0, 3.0, and beyond), end-users (information- and knowledge-workers), and entrepreneurs (driven by quality and tangible value contribution).

Related

]]>
Linked Data Web Collaborators: Introducing Structured Dynamicshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1513Sat, 03 Jan 2009 04:27:26 GMT12009-01-02T23:27:26-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
It is getting clearer by the second that Master Data Management and RDF based Linked data are two realms separated by a common desire to provide "Entity Oriented Data Access" to heterogeneous data sources (within the enterprise and/or across the World Wide Web).

Here is how I see Linked Data providing tangible value to MDM tools vendors and users:

  1. Open access to Entities across MDM instances served up by different MDM solutions acting as Linked Data publishers (i.e., expose MDM Entities as RDF resources endowed with de-referencable URIs thereby enabling Hyperdata-style linking)
  2. Use of RDF-ization middleware to hook disparate data sources (SQL, XML, and other data sources) into existing MDM packages (i.e., the MDM solutions become consumers of RDF Linked Data).
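
As a rough sketch of point 1 (all URIs and the mdm: property names below are hypothetical), a customer entity mastered in an MDM hub might be published as Linked Data like so:

    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    @prefix mdm: <http://example.com/mdm/schema#> .

    <http://mdm.example.com/entity/customer/42>
        a              mdm:Customer ;
        mdm:legalName  "Acme Corp" ;
        owl:sameAs     <http://crm.example.com/data/account/9981> ,
                       <http://erp.example.com/data/party/C-0042> .

The owl:sameAs links are what turn the master record into a hub that other MDM instances and RDF-ized sources can de-reference and hook into.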

Of course, Virtuoso was designed and developed to deliver the above from day one (circa 1998 re. the core, and 2005 re. the use of RDF for the final mile), as depicted below:

Related

]]>
Master Data Management (MDM) & RDF based Linked Datahttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1482Wed, 05 Nov 2008 23:19:02 GMT12008-11-05T18:19:02-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
The original design document (by TimBL) that led to the WWW (*an important read*) was very clear about the need to create an "information space" that connects heterogeneous data sources. Unfortunately, in trying to create a moniker to distinguish one aspect of the Web (the Linked Document Web) from the part that was overlooked (the Linked Data Web), we ended up with a project code name that's fundamentally a misnomer in the form of: "The Semantic Web".

If we could just take "The Semantic Web" moniker for what it was -- a code name for an aspect of the Web -- and move on, things will get much clearer, fast!

Basically, what is/was the "Semantic Web" should really have been code named: ("You" Oriented Data Access) as a play on Yoda's appreciation of the FORCE (Fact ORiented Connected Entities) -- the power of intergalactic, interlinked, structured data, fashioned by the World Wide Web courtesy of the HTTP protocol.

As stated in an earlier post, the next phase of the Web is all about the magic of entity "You". The single most important item of reference to every Web user will be the Person Entity ID (URI). Just by remembering your Entity ID, you will have intelligent pathways across, and into, the FORCE that the Linked Data Web delivers. The quality of the pathways and the increased density of the FORCE are the keys to high SDQ (tomorrow's SEO). Thus, the SDQ of URIs will ultimately be the unit determinant of value to Web Users, along the following personal lines, hence the critical platform questions:

  • Does your platform give me Identity (a URI) with high SDQ?
  • Do the Data Source Names (URIs) in your Data Spaces deliver high SDQ?

While most industry commentators continue to ponder and pontificate about what "The Semantic Web" is (unfortunately), the real thing (the "FORCE") is already here, and self-enhancing rapidly.

Assuming we now accept the FORCE is simply an RDF based Linked Data moniker, and that RDF Linked Data is all about the Web as a structured database, we should start to move our attention over to practical exploitation of this burgeoning global database, and in doing so we should not discard knowledge from the past such as the many great examples available gratis from the Relational Database realm. For instance, we should start paying attention to the discovery, development, and deployment of high level tools such as query builders, report writers, and intelligence oriented analytic tools, none of which should -- at first point of interaction -- expose raw RDF or the SPARQL query language. Along similar lines of thinking, we also need development environments and frameworks that are counterparts to Visual Studio, ACCESS, File Maker, and the like.

Related

]]>
YODA & the Data FORCEhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1474Tue, 20 Jul 2010 17:53:06 GMT62010-07-20T13:53:06-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
The evolution of the Web into a federated database, information space, and knowledge-base hybrid continues at frenetic pace.

As more Linked Data is injected into the Web from the Linking Open Data community and other initiatives, it's important to note that "Linked Data" is available in a variety of forms such as:

  • Data Model Definition oriented Linked Data (aka. Data Dictionary)
  • Data Model Instance Data (aka. Instance Data)
  • Linked Data oriented solutions that leverage the smart data substrate that Models and Instance Data meshes deliver.

Note: The common glue across the different types of Linked Data remains the commitment to data object (entity) identification and access via de-referencable URIs (aka. record / entity level data source names).

As stated in my recent post titled: Semantic Web: Travails to Harmony Illustrated, harmonious intersections of instance data and data dictionaries (schemas, ontologies, rules, etc.) provide a powerful substrate (smart data) for the development and deployment of "People" and/or "Machine" oriented solutions. Of course, others have commented on these matters and expressed similar views (see the related section below).

The clickable Venn diagram below provides a simple exploration path that exposes the linkage that already exists, across the different Linked Data types, within the burgeoning Linked Data Web.

Related

]]>
State of the Linked Data Webhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1455Sun, 28 Mar 2010 22:25:19 GMT62010-03-28T18:25:19-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
This post is in response to Glenn McDonald's post titled: Whole Data, where he highlights a number of issues relating to "Semantic Web" marketing communications and overall messaging, from his perspective.

By coincidence, Glenn and I presented at this month's Cambridge Semantic Web Gathering.

I've provided a dump of Glenn's issues and my responses below:

Issue - RDF

  • Ingenious data decomposition idea, but:
  • too low-level; the assembly language of data, where we need Java or Ruby
  • "resource" is not the issue; there's no such thing as "metadata", it's all data; "meta" is a perspective
  • lists need to be effortless, not painful and obscure
  • nodes need to be represented, not just implied; they need types and literals in a more pervasive, integrated way.

Response:

RDF is a Graph based Data Model; it stands for Resource Description Framework. The metadata angle comes from its Meta Content Framework (MCF) origins. You can express and serialize data based on the RDF Data Model using: Turtle, N3, TriX, N-Triples, and RDF/XML.
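For instance, here is the same (hypothetical) statement rendered in two of those serializations:

    Turtle:

      @prefix foaf: <http://xmlns.com/foaf/0.1/> .
      <http://example.org/person/jane#this> foaf:name "Jane Doe" .

    N-Triples:

      <http://example.org/person/jane#this> <http://xmlns.com/foaf/0.1/name> "Jane Doe" .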

Issue - SPARQL (and Freebase's MQL)

These are just appeasement:
- old query paradigm: fishing in dark water with superstitiously tied lures; only works well in carefully stocked lakes
- we don't ask questions by defining answer shapes and then hoping they're dredged up whole.

Response:

SPARQL, MQL, and Entity-SQL are Graph Model oriented Query Languages. Query Languages always accompany Database Engines. SQL is the Relational Model equivalent.
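
To illustrate the analogy (the table, column, and vocabulary names below are hypothetical), here is the same question asked of a relational store via SQL and of an RDF store via SPARQL:

    SQL (Relational Model):

      SELECT name FROM customers WHERE country = 'Nigeria';

    SPARQL (Graph Model):

      PREFIX ex: <http://example.com/schemas/crm#>
      SELECT ?name
      WHERE {
        ?customer ex:country "Nigeria" ;
                  ex:name    ?name .
      }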

Issue - Linked Data

Noble attempt to ground the abstract, but:
- URI dereferencing/namespace/open-world issues focus too much technical attention on cross-source cases where the human issues dwarf the technical ones anyway
- FOAF query over the people in this room? forget it.
- link asymmetry doesn't scale
- identity doesn't scale
- generating RDF from non-graph sources: more appeasement, right where the win from actually converting could be biggest!

Response:

Innovative use of HTTP to deliver "Data Access by Reference" to the Linked Data Web.

When you have a Data Model, Database Engine, and Query Language, the next thing you need is a Data Access mechanism that provides "Data Access by Reference". ODBC and JDBC (amongst others) provide "Data Access by Reference" via Data Source Names. Linked Data is about the same thing (URIs are Data Source Names) with the following differences:

  • Naming is scoped to the entity level rather than container level
  • HTTP's use within the data source naming scheme expands the referencability of the Named Entity Descriptions beyond traditional confines such as applications, operating systems, and database engines.
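
A rough side-by-side of the two naming styles (the connection string and URI below are illustrative only, not real endpoints):

    Container-level data source name (ODBC/JDBC style):
      DSN=CRM;UID=demo;PWD=demo
      (names a whole database; individual records are only reachable via queries inside a session)

    Entity-level data source name (Linked Data style):
      http://example.com/data/customer/42
      (names a single entity; an HTTP GET de-references its description from anywhere on the Web)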

Issue - Giant Global Graph

Hugely motivating and powerful idea, worthy of a superhero (Graphius!), but:
- giant and global parts are too hard, and starting global makes every problem harder
- local projects become unmanageable in global context (Cyc, Freebase data-modeling lists...). And my thus my plea, again. Forget "semantic" and "web", let's fix the database tech first:
- node/arc data-model, path-based exploratory query-model
- data-graph applications built easily on top of this common model; building them has to be easy, because if it's hard, they'll be bad
- given good database tech, good web data-publishing tech will be trivial!
- given good tools for graphs, the problems of uniting them will be only as hard as they have to be.

Response:

Giant Global Graph is just another moniker for a "Web of Linked Data" or "Linked Data Web".

Multi-Model Database technology that meshes the best of the Graph & Relational Models exist. In a nutshell, this is what Virtuoso is all about and it's existed for a very long time :-)

Virtuoso is also a Virtual DBMS engine (so you can see Heterogeneous Relational Data via Graph Model Context Lenses). Naturally, it is also a Linked Data Deployment platform (or Linked Data Server).

The issue isn't the "Semantic Web" moniker per se; it's about how Linked Data (the foundation layer of the Semantic Web) gets introduced to users. As I said during the MIT Gathering: "The Web is experienced via Web Browsers primarily, so any enhancement to the Web must be exposed via traditional Web Browsers", which is why we've opted to simply add "View Linked Data Sources" to the existing set of common Browser options that includes:

  1. View page in rendered form (default)
  2. View page source (i.e., how you see the markup behind the page)

By exposing the Linked Data Web option as described above, you enable the Web user to knowingly transition from the traditional rendered (X)HTML page view to the Linked Data View (i.e., the structured data behind the page). This simple "User Interaction" tweak makes the notion of exploiting a Structured Web somewhat clearer.

The Linked Data Web isn't a panacea. It's just an addition to the existing Web that enriches the things you can do with the Web. Its predominance, like any application feature, will be subject to the degree to which it delivers tangible value or materializes internal and external opportunity costs.

Note: The Web isn't ubiquitous today because all its users grokked HTML Markup. Its ubiquity is a function of opportunity costs: there simply came a point in the Web bootstrap when nobody could afford the opportunity costs associated with being off the Web. The same thing will play out with Linked Data and the broader Semantic Web vision.

Links:
  1. Linked Data Journey part of my Linked Data Planet Presentation Remix (from slides 15 to 22, which include bits from TimBL's presentation)
  2. OpenLink Data Explorer
  3. OpenLink Data Explorer Screenshots and examples.
]]>
Response to: Whole Data Post (Update 3)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1413Fri, 15 Aug 2008 22:31:48 GMT42008-08-15T18:31:48-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
Unfortunately, I could only spend 4 days at the recent WWW2008 event in Beijing (I departed the morning following the Linked Data Workshop), so I couldn't take my slot on the "Commercializing the Semantic Web panel" etc.. Anyway, thanks to the Web I can still inject my points of view in the broad Web based discourse. Well so I hoped, when I attempted to post a comment to Paul Miller's ZDNet domain hosted blog thread titled: Commercialising the Semantic Web.

Unfortunately, the cost of completing ZDNet's unwieldy signup process simply exceeded the benefits of dropping my comments in their particular space :-( Thus, I'll settle for a trackback ping instead.

What follows is the cut and paste of my intended comment contributions to Paul's post.

Paul,

As discussed earlier this week during our podcast session, commercialization of Semantic Web technology shouldn't be a mercurial matter at this stage in the game :-) It's all about looking at how it provides value :-)

From the Linked Data angle, the ability to produce, dispatch, and exploit "Context" across an array of "Perspectives" from a plethora of disparate data sources on the Web and/or behind corporate firewalls, offers immense commercial value.

Yahoo's Searchmonkey effort will certainly bring clarity to some of the points I made during the podcast re. the role of URIs as "value consumption tickets" (Data Services are exposed via URIs). There has to be a trigger (in user space) that compels Web users to seek broader, or simply varied, perspectives as a response to data encountered on the Web. Yahoo! is about to put this light on in a big way (imho).

The "self annotating" nature of the Web is what ultimately drives the manifestation of the long awaited Semantic Web. I believe I postulated about "Self Annotation & the Semantic Web" in a number of prior posts which, by the way, should be DataRSS compatible right now due to Yahoo's support of OpenSearch Data Providers (which this Blog Space has been for eons).

Today, we have many communities adding structure to the Web (via their respective tools of preference) without explicitly realizing what they are contributing. Every RSS/Atom feed, Tag, Weblog, Shared Bookmark, Wikiword, Microformat, Microformat++ (eRDF or RDFa), GRDDL stylesheet, RDFizer, etc. is a piece of structured data.

Finally, the different communities are all finding ways to work together (thank heavens!) and the results are going to be cataclysmic when it all plays out :-)

Data, Structure, and Extraction are the keys to the Semantic Life! First you get the Data into a container (information resource), then you add Structure to the information resource (RSS, Atom, microformats, RDFa, eRDF, SIOC, FOAF, etc.); once you have Structure, RDFization (i.e., transformation to Linked Data) is a cinch thanks to RDF Middleware (as per earlier RDF middleware posts).

]]>
Commercializing the Semantic Webhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1363Sun, 18 May 2008 14:58:26 GMT12008-05-18T10:58:26.000003-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
During a brief chat with Michael Hausenblas about a new Linked Data project he is championing called LForum, I made a Freudian slip in the form of the typo: Evoluation, which at the time was supposed to have been: Evolution. Anyway, we had a chuckle and realized we were on to something, so I proceeded to formalize the definition:

Evoluation is evolution devoid of the randomness of mutation. A state of being in which it is possible to evaluate and choose evolutionary paths.

Evoluation actually describes where we are today in relation to the World Wide Web; to the Linking Open Data community (LOD), it's taking the path towards becoming a Giant Global Graph of Linked Data; to the Web 2.0 community, it's simply a collection of Web Services and associated APIs; and to many others, it remains an opaque collection of interlinked documents.

The great thing about the Web is that it allows netizens to explore a plethora of paths without adversely affecting the paths of others. That said, controlling one's path may take mutation out of evolution, but we are still left with the requirement to adapt and eventually survive in a competitive environment. Thus, although we can evaluate and choose from the many paths the Web's evolution offers us, the path that delivers the most benefits ultimately dominates. :-)

]]>
Linked Data enters state of Evoluationhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1351Tue, 29 Apr 2008 20:25:47 GMT12008-04-29T16:25:47-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
I just listened to, and very much enjoyed (lots of chuckling), Dave Beckett's podcast interview on the Talis podcast network. Clearly Dave has a bent for funny project names. He also introduced "Inter-Webs" (Web Data Spaces in my parlance) towards the end of the interview.

Trent Adams, Steve Greenberg, and I also had a podcast chat about Web Data Portability and Accessibility (Linked Data). I also remixed John Breslin's "Data Portability & Me" presentation to produce: "Data Accessibility & Me".

The podcast interviews and presentations are contributions to the broadening discourse about Open Data Access / Connectivity on the Web.

]]>
Recent Data Portability, Linked Data, and Open Data Access Podcastshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1332Wed, 09 Apr 2008 17:22:23 GMT12008-04-09T13:22:23.000002-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
One of the biggest impediments to the adoption of technology is the cost burden typically associated with doing the right thing. For instance, requirements for making the Linked Data Web (GGG) buzz would include the following (paraphrasing TimBL's original Linked Data meme):

    -- identifying the things you observe, or stumble upon, using URIs (aka Entity IDs)
    -- construct URIs using HTTP so that the Web provides a channel for referencing things elsewhere (remote object referencing)
    -- Expose things in your Data Space(s) that are potentially useful to other Web users via URIs
    -- Link to other Web accessible things using their URIs.
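
Put together, following those rules might yield something like this (a minimal Turtle sketch; the post URI is a hypothetical placeholder, while the DBpedia URI is real):

    @prefix dcterms: <http://purl.org/dc/terms/> .
    @prefix sioc:    <http://rdfs.org/sioc/ns#> .

    <http://example.org/dataspace/blog/post-123#this>
        a              sioc:Post ;
        dcterms:title  "The Cost of doing the Right Thing" ;
        sioc:topic     <http://dbpedia.org/resource/Linked_Data> .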

The list is nice, but actual execution can be challenging. For instance, when writing a blog post, or constructing a WikiWord, would you have enough disposable time to go searching for these URIs? Or would you compromise and continue to inject "Literal" values into the Web, leaving it to the reasoning endowed human reader to connect the dots?

Anyway, OpenLink Data Spaces is now equipped with a Glossary system that allows me to manage terms, the meaning of terms, and the hyper-linking of phrases and words associated with my terms. The great thing about all of this is that everything I do is scoped to my Data Space (my universe of discourse); I don't break or impede the other meanings of these terms outside my Data Space. The Glossary system can be shared with anyone I choose to share it with, and even better, it makes my upstreaming (rules based replication) style of blogging even more productive :-)

Remember, on the Linked Data Web, who you know doesn't matter as much as what you are connected to, directly or indirectly. Jason Kolb covers this issue in his post: People as Data Connectors, and so does Frederick Giasson via a recent post titled: Networks are everywhere. For instance, this blog post (or the entire Blog) is a bona fide RDF Linked Data Source; you can use it as the Data Source of a SPARQL Query to find things that aren't even mentioned in this post, since all you are doing is beaming a query through my Data Space (a container of Linked Data Graphs) -- see the sketch below. On that note, let's re-watch Jon Udell's "On-Demand-Blogosphere" screencast from 2006 :-)
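
Here is a sketch of what "beaming a query through my Data Space" means in practice (the graph URI and the vocabulary are placeholders for whatever a given data space actually exposes):

    SELECT DISTINCT ?topic
    FROM <http://example.org/dataspace/blog/post-123>
    WHERE {
      ?post <http://rdfs.org/sioc/ns#topic> ?topic .
    }

The query is addressed at the post's URI, yet the answers can lead you beyond anything the post mentions directly, because the graph behind it links out to other data spaces.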

]]>
The Cost of doing the Right Thinghttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1330Sat, 29 Mar 2008 04:50:07 GMT32008-03-29T00:50:07.000002-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
For all the one-way feed consumers and aggregators, and readers of the original post, here is a variant equipped with hyperlinked phrases as opposed to words. As I stated in the prior post, the post (like most of my posts) was part experiment / dog-fooding of automatic tagging and hyper-linking functionality in OpenLink Data Spaces.

ReadWriteWeb, via Alex Iskold's post, has delivered another iteration of their "Guide to Semantic Technologies".

If you look at the title of this post (and their article) they seem to be accurately providing a guide to Semantic Technologies, so no qualms there. If, on the other hand, this is supposed to be a guide to the "Semantic Web" as prescribed by TimBL, then they are completely missing the essence of the whole subject, and demonstrably so I may add, since the entities "ReadWriteWeb" and "Alex Iskold" are only describable today via the attributes of the documents they publish, i.e., their respective blogs and hosted blog posts.

Preoccupation with Literal objects as described above implies we can only take what "ReadWriteWeb" and "Alex Iskold" say "Literally" (grep, regex, and XPath/XQuery are the only tools for searching deeper in this Literal realm); we have no sense of what makes them tick or where they come from, no history (bar "About Page" blurb), no data connections beyond anchored text (more pointers to opaque data sources) in posts and blogrolls. The only connection between this post and them is my deliberate use of the same literal text in the Title of this post.

TimBL's vision as espoused via the "Semantic Web" vision is about the production, consumption, and sharing of Data Objects via HTTP based Identifiers called URIs/IRIs (Hyperdata Links / Linked Data). It's how we use the Web as a Distributed Database where (as Jim Hendler once stated with immense clarity): I can point to records (entity instances) in your database (aka Data Space) from mine. Which is to say that if we can all point to data entities/objects (not just data entities of type "Document") using these Location, Value, and Structure independent Object Identifiers (courtesy of HTTP) we end up with a much more powerful Web, and one that is closer to the "Federated and Open" nature of the Web.

As I stated in a prior post, if you or your platform of choice aren't producing de-referencable URIs for your data objects, you may be Semantic (this data model predates the Web), but there is no "World Wide Web" in what you are doing.

What are the Benefits of the Semantic Web?

    Consumer - "Discovery of relevant things" and being "Discovered by relevant things" (people, places, events, and other things)
    Enterprise - ditto, plus the addition of enterprise domain specific things such as market opportunities, product portfolios, human resources, partners, customers, competitors, co-opetitors, acquisition targets, new regulation, etc.

Simple demo:

I am Kingsley Idehen, a Person who authors this weblog. I also share bookmarks gathered over the years across an array of subjects via my bookmark data space. I also subscribe to a number of RSS/Atom/RDF feeds, which I share via my feeds subscription data space. Of course, all of these data sources have Tags, which are collectively exposed via my weblog tag-cloud, feeds subscriptions tag-cloud, and bookmarks tag-cloud data spaces.

As I don't like repeating myself, and I hate wasting my time or the time of others, I simply share my Data Space (a collection of all of my purpose specific data spaces) via the Web so that others (friends, family, employees, partners, customers, project collaborators, competitors, co-opetitors, etc.) can intentionally or serendipitously discover relevant data en route to creating new information (perspectives) that is hopefully exposed to others via the Web.

Bottom-line, the Semantic Web is about adding the missing "Open Data Access & Connectivity" feature to the current Document Web (we have to go beyond regex, grep, XPath, XQuery, full text search, and other literal scraping approaches). The Linked Data Web of de-referencable data object URIs is the critical foundation layer that makes this feasible.

Remember, it's not about "Applications", it's about Data, and actually freeing Data from the "tyranny of Applications". Unfortunately, applications inadvertently always create silos (esp. on the Web), since entity data modeling, open data access, and other database technology realm matters remain of secondary interest to many application developers.

Final comment: RDF facilitates Linked Data on the Web, but not all RDF is endowed with de-referencable URIs (a major source of confusion and misunderstanding). Thus, you can have RDF Data Source Providers that simply project RDF data silos via Web Services APIs, if the RDF output emanating from a Web Service doesn't provide out-bound pathways to other data via de-referencable URIs. Of course, the same also applies to Widgets that present you with all the things they've discovered without exposing de-referencable URIs for each item.

BTW - my final comments above aren't in any way incongruent with devising successful business models for the Web. As you may or may not know, OpenLink is not only a major platform provider for the Semantic Web (expressed in our UDA, Virtuoso, OpenLink Data Spaces, and OAT products), we are also actively seeding Semantic Web (tribe: Linked Data, of course) startups. For instance, Zitgist, which now has Mike Bergman as its CEO alongside Frederick Giasson as CTO. Of course, I cannot do Zitgist justice via a footnote in a blog post, so I will expand further in a separate post.

Additional information about this blog post:

  1. I didn't spend hours looking for the URIs used in my hyperlinks
  2. The post is best viewed via an RDF Linked Data aware user agents (OpenLink RDF Browser, Zitgist Data Viewer, DISCO Hyperdata Browser, Tabulator).
]]>
Semantic Web Patterns: A Guide to Semantic Technologies (Update 2)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1329Thu, 17 Jul 2008 01:43:36 GMT42008-07-16T21:43:36-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
  • End to Buzzword Blur - how buzzwords are used to obscure comprehension of core concepts. Let SKOS, MOAT, SCOT reign!
  • End of Data Silos - you don't own me, my data, my data's mobility (import/export), or accessibility (by reference) just because I signed up for Yet Another Software as Service (ySaaS)
  • End of Misinformation - Sins of omission will no longer go unpunished; the era of self induced amnesia due to competitive concerns is over, and Co-opetition shall reign (Ray Noorda always envisioned this reality)
  • Serendipitous information and data discovery gets cheaper by the second - you're only a link away from a universe of relevant and accessible data
  • Rise of Quality - Contrary to historic precedent (due to all of the above), well engineered solutions will no longer be sure indicators of commercial failure
  • BTW - Benjamin Nowack penned an interesting post titled: Semantic Web Aliases, that covers a variety of labels used to describe the Semantic Web. The great thing about this post is that it provides yet another demonstration-in-the-making for the virtues of Linked Data :-)

    Labels are harmless when their sole purpose is the creation of routes of comprehension for concepts. Unfortunately, Labels aren't always constructed with concept comprehension in mind, most of the time they are artificial inflectors and deflectors servicing marketing communications goals.

    Anyway, irrespective of actual intent, I've endowed all of the labels from Bengee's post with URIs as my contribution to the important disambiguation effort re. the Semantic Web:

    As per usual, this post is best appreciated when processed via a Linked Data aware user agent.

    ]]>
    My 5 Favorite Things about Linked Data on the Webhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1319Sun, 09 Mar 2008 15:48:35 GMT32008-03-09T11:48:35.000004-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    At OpenLink Software, we've had an immense problem explaining the depth and breadth of our product portfolio via traditional Document Web pages. Thanks to SPARQL and Linked Data, we are now able to use Web Data Object IDs (HTTP based URIs) to produce super SKUs for every item in our product portfolio. Even better, we are able to handle the additional challenge of exposing features and benefits which, by their very nature, are mercurial across an array of fronts (product releases, product formats, supported platforms, etc).

    Now I can simply state the following using Linked Data (hyperdata) links:

    OpenLink Software's product portfolio comprises the following product families:
    1. Universal Data Access Drivers Suite (UDA) for ODBC, JDBC, ADO.NET, OLE-DB, and XMLA
    2. OpenLink Data Spaces
    3. Virtuoso

    We no longer have to explain (repeatedly) why our drivers exist in Express, Lite, and Multi-Tier Edition formats, or why you ultimately need Multi-Tier Drivers over Single Tier Drivers (Express or Lite Editions), since you ultimately need high-performance, data encryption, and policy based security across each of the data access driver formats.

    ]]>
    Linked Data Solution for Exposing OpenLink Product Portfoliohttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1317Mon, 25 Feb 2008 20:08:04 GMT42008-02-25T15:08:04-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Increasingly, I am encountering commentary from the ReadWriteWeb data space that highlights critical problems solved by a Linked Data Web. Unfortunately, most of the time, there is a disconnect between the problem and the solution. By this I mean: technology in the Semantic Web realm isn't seen as the solution.

    A while back, I wrote a post titled: Why we need Linked Data. The aim of the post was to bring attention to the implications of the exponential growth of User Generated Content (typically, semi-structured and unstructured data) on the Web. The growth in question is occurring within a fixed data & information processing timeframe (i.e., there will always be 24hrs in a day), which sets the stage for Information Overload as expressed in a recent post from ReadWriteWeb titled: Visualizing Social Media Fatigue.

    The emerging "Web of Linked Data" augments the current "Web of Linked Documents" by providing a structured data corpus partitioned by containers I prefer to call Data Spaces. These spaces enable Linked Data aware solutions to deliver immense value, such as complex data graph traversal, starting from document beachheads, that exposes relevant data within a fraction of the time it would take to achieve the same thing using traditional document web methods such as full text search patterns, scraping, and mashing.

    Remember, our DNA based data & information system far exceeds that of any inorganic system when it comes to reasoning, but it remains immensely incapable of accurately and efficiently processing huge volumes of data & information -- irrespective of data model.

    The Idea behind the Semantic Web has always been about an evolution of the Web into a structured data collective comprised of interlinked Data items and Data Containers (Data Spaces). Of course we can argue forever about the Semantics of the solution (ironically), but we can't shy away from the impending challenges that "Information Overload" is about to unleash on our limited processing time and capabilities.

    For those looking for a so called "killer application" for the Semantic Web, I would urge you to align this quest with the "Killer Problem" of our times, because when you do so you will find that all routes lead to: Linked Data that leverages existing Web Architecture.

    Once you understand the problem, you will hopefully understand that we all need some kind of "Data Junction Box" that provides a "Data Access Focal Point" for all of the data we splatter across the net as we sign up for the next greatest and latest Web X.X hosted service, or as we work on a daily basis with a variety of tools within enterprise Intranets.

    BTW - these "Data Junction Boxes" will also need to be unobtrusively bound to our individual Identities.

    ]]>
    Contd: Why we need Linked Datahttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1316Tue, 26 Feb 2008 13:16:43 GMT32008-02-26T08:16:43.000005-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    We've just released version 5.0.4 of the Virtuoso Universal Server platform for SQL, XML, and RDF. The new release includes the following enhancements:

    Web Server:

      - HTTP 1.1 compliant Transparent content-negotiation in URL-rewrite rules for Linked Data Deployment.

    RDF Data Management:

      - New providers for the Jena, Sesame, and Redland frameworks
      - Support for SPARQL INSERT and UPDATE via HTTP POST (sketched below)
      - New SPARQL-BI extensions that make Business Intelligence feasible via SPARQL
      - New "rdf_sink" folder for handling HTTP PUTs into WebDAV that automatically syncs with the Quad Store
      - New Sponger (RDFizer) cartridges that map Amazon book-search results to the Bibliographic Ontology, and support production of Linked Data from OAI, XBRL, and Yahoo! Finance data sources
      - HTTPS protocol support added to the Sponger
      - Performance optimizations for SPARQL DESCRIBE and CONSTRUCT, alongside general performance enhancements for RDF data set loading.
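
    To illustrate the INSERT support item above, here is a minimal sketch of the kind of SPARUL statement that could be POSTed to a Virtuoso SPARQL endpoint (the graph IRI and triple are purely illustrative, and the exact syntax may vary by release):

    PREFIX dc: <http://purl.org/dc/elements/1.1/>

    # Insert a single illustrative triple into a named graph
    INSERT INTO GRAPH <http://example.org/mygraph>
      {
        <http://example.org/book/1> dc:title "A Sample Book" .
      }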

    Core DBMS Engine:

      - PHP hosting module re-implemented as a Virtuoso plugin, in line with other language hosting modules
      - Improved deadlock condition management
      - Enhanced POP and FTP server-side protocol implementations that allow larger data transfers.

    Additional Information

    ]]>
    Virtuoso Universal Server 5.0.4 Release Detailshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1310Tue, 05 Feb 2008 01:30:43 GMT12008-02-04T20:30:43.000001-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    The W3C officially unveiled the SPARQL Query Language today via a press release titled: W3C Opens Data on the Web with SPARQL.

    What is SPARQL?

    A query language for the burgeoning Structured & Linked Data Web (aka Semantic Web / Giant Global Graph). Just as SQL provides a query language for the Relational Data Model, SPARQL provides one for the Graph-based RDF Data Model.

    It's also a REST or SOAP based Web Service that exposes SPARQL access to RDF Data via an endpoint.

    In addition, it's also a Query Results Serialization format that includes XML and JSON support.

    Why is it Important?

    It brings important clarity to the notion of the "Web as a Database" by transforming existing Web Sites, Portals, and Web Services into a bona fide corpus of Mesh-able (rather than Mash-able) Data Sources. For instance, you can perform queries that join one or more of the aforementioned data sources in exactly the same manner (albeit different syntax) as you would one or more SQL Tables (a sketch of such a multi-source join follows the examples below).

    Example:

    -- SPARQL equivalent of SQL SELECT * against my personal data space hosted FOAF file

    SELECT DISTINCT ?s ?p ?o
    FROM <http://myopenlink.net/dataspace/person/kidehen> 
    WHERE {?s ?p ?o}

    -- SPARQL against my social network -- Note: my SPARQL will be beamed across all of the contacts in the social networks of my contacts, as long as they are all HTTP URI based within each data space

    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT DISTINCT ?Person
    FROM <http://myopenlink.net/dataspace/person/kidehen>
    WHERE {?s a foaf:Person; foaf:knows ?Person}

    Note: you can use the basic SPARQL Endpoint, SPARQL Query By Example, or SPARQL Query Builder Demo tool to experiment with the demonstration queries above.
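
    As a rough sketch of the SQL-style join point made earlier, a single query can also draw on more than one data source at once (the second graph IRI below is illustrative):

    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT DISTINCT ?Person ?name
    FROM <http://myopenlink.net/dataspace/person/kidehen>
    FROM <http://example.org/another-data-space/foaf>
    WHERE { ?s foaf:knows ?Person . OPTIONAL { ?Person foaf:name ?name } }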

    How Do I use It?

    SPARQL is implemented by RDF Data Management Systems (Triple or Quad Stores) just as SQL is implemented by Relational Database Management Systems. The aforementioned data management systems will typically expose SPARQL access via a SPARQL endpoint.
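
    As a rough illustration, a SPARQL Protocol request is simply an HTTP call against such an endpoint with the query passed as a parameter (the endpoint shown is DBpedia's; the query text would be URL-encoded, and the format parameter is a Virtuoso endpoint convenience -- content negotiation via Accept headers also works):

    http://dbpedia.org/sparql
      ?query=SELECT DISTINCT ?s WHERE { ?s ?p ?o } LIMIT 10
      &format=application/sparql-results+xml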

    Where are its implementations?

    A SPARQL implementors' Testimonial page accompanies the SPARQL press release. In addition, there is a growing collection of implementations on the ESW Wiki page for SPARQL-compliant RDF Triple & Quad Stores.

    Is this really a big deal?

    Yes! SPARQL facilitates an unobtrusive manifestation of a Linked Data Web by way of natural extension of the existing Document Web, i.e. these Web enclaves co-exist in symbiotic fashion.

    As DBpedia very clearly demonstrates, Linked Data makes the Semantic Web demonstrable and much easier to comprehend. Without SPARQL there would be no mechanism for Linked Data deployment, and without Linked Data there is no mechanism for beaming queries (directly or indirectly) across the Giant Global Graph of data hosted by Social Networks, Shared Bookmark Services, Weblogs, Wikis, RSS/Atom/OPML feeds, Photo Galleries, and other Web accessible Data Sources (Data Spaces).

    Related items

      Detailed SPARQL Query Examples using SIOC Data Spaces
    ]]>
    W3C's SPARQLing Data Access Ingenuityhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1295Thu, 17 Jan 2008 20:41:04 GMT82008-01-17T15:41:04.000006-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    As 2007 came to a close I repeatedly mulled over the idea of putting together the usual "year in review" and a set of predictions for the coming year. Anyway, the more I pondered, the smaller the list became. While pondering (as 2008 rolled around), the Blogosphere was set ablaze with Robert Scoble's announcement of his account suspension by Facebook. Of course, many chimed in expressing views on either side of the ensuing debate: who is right -- Scoble or Facebook? The more I assimilated the views expressed about this event, the more ironic I found the general discourse, for the following reasons:

    1. Web 2.0 is fundamentally about Web Services as the prime vehicle for interactions across "points of Web presence"
    2. Facebook is a Web 2.0 hosted service for social networking that provides Web Services APIs for accessing data in the Facebook data space. You have to do so "on the fly" within clearly defined constraints i.e you can interact with data across your social network via Facebook APIs, but you cannot cache the data (perform an export style dump of the data)
    3. Facebook is a main driver of the term "social graph", but their underlying data model is relational, and the Web Services response (the data you get back) doesn't return a data graph; instead it returns a tree (i.e. XML)
    4. Scoble's had a number of close encounters with Linked Data Web | Semantic Data Web | Web 3.0 aficionados in various forms throughout 2007, but still doesn't quite make the connection between Web Services APIs as part of a processing pipeline that includes structured data extraction from XML data en route to producing Data Graphs comprised of Data Objects (Entities) endowed with: Unique Identifiers, Classification or Categorization schemes, Attributes, and Relationships prescribed by one or more shared Data Dictionaries/Schemas/Ontologies
    5. A global information bus that exposes a Linked Data mesh comprised of Data Objects, Object Attributes, and Object Relationships across "points of Web presence" is what TimBL described in 1998 (Semantic Web Roadmap) and more recently in 2007 (Giant Global Graph)
    6. The Linked Data mesh (i.e Linked Data Web or GGG) is anchored by the use of HTTP to mint Location, Structure, and Value independent Object Identifiers called URIs or IRIs. In addition, the Linked Data Web is also equipped with a query language, protocol, and results serialization format for XML and JSON called: SPARQL.

    So, unlike Scoble, I am able to make my Facebook Data portable without violating Facebook rules (no data caching outside Facebook realm) by doing the following:

    1. Use an RDFizer for Facebook to convert XML response data from Facebook Web Services into RDF "on the fly", ensuring that my RDF is comprised of Object Identifiers that are HTTP based and thereby dereferenceable (i.e. I can use SPARQL to unravel the Linked Data Graph in my Facebook data space)
    2. The act of data dereferencing enables me to expose my Facebook Data as Linked Data associated with my Personal URI
    3. This interaction only occurs via my data space and in all cases the interactions with data work via my RDFizer middleware (e.g the Virtuoso Sponger) that talks directly to Facebook Web Services.

    In a nutshell, my Linked Data Space enables you to reference data in my data space via Object Identifiers (URIs), and in some cases the Object IDs and Graphs are constructed on the fly via RDFization middleware.
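
    As a rough sketch of what "on the fly" means here, a Virtuoso-hosted SPARQL query can instruct the Sponger to fetch and RDFize a remote source named in the FROM clause before the query runs (the source URL below is illustrative, not one of my actual data space URIs):

    # Virtuoso pragma asking the Sponger to (re)fetch and RDFize the source
    DEFINE get:soft "replace"

    SELECT DISTINCT ?s ?p ?o
    FROM <http://example.org/facebook-profile-source>
    WHERE { ?s ?p ?o }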

    Here are my URIs that provide different paths to my Facebook Data Space:

    To conclude, 2008 is clearly the inflection year during which we will finally unshackle Data and Identity from the confines of "Web Data Silos" by leveraging the HTTP, SPARQL, and RDF induced virtues of Linked Data.

    Related Posts:

    1. 2008 and the Rise of Linked Data
    2. Scoble Right, Wrong, and Beyond
    3. Scoble interviewing TimBL (note to Scoble: re-watch your interview since he made some specific points about Linked Data and URIs that you need to grasp)
    4. Prior blog posts from this Blog Data Space that include the literal patterns: Scoble Semantic Web
    ]]>
    2008, Facebook Data Portability, and the Giant Global Graph of Linked Datahttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1289Mon, 07 Jan 2008 16:44:42 GMT32008-01-07T11:44:42.000007-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Question posed by Dan Brickley via a blog post: SQL, OpenOffice: would a JDBC driver for SPARQL protocol make sense?

    Writing a JDBC Driver for SPARQL is a little overkill. OpenOffice.org simply needs to make XML or Web Data (HTML, XHTML, and XML) bona fide data sources within its "Pivot Table" functionality realm. All that would then be required is a SPARQL SELECT Query transported via the SPARQL Protocol, with results sent back using the SPARQL XML results serialization format (all part of a single SPARQL Protocol URL).

    Excel successfully consumes the following information resource URI: http://tinyurl.com/yvoccj (a tiny url for a SPARQL SELECT against my FOAF file).

    Alternatively, and currently achievable, you could simply use SPASQL (SPARQL within SQL) using a DBMS engine that supports SQL, SPARQL, and SPASQL, e.g. Virtuoso.

    Virtuoso SPASQL support is exposed via its ODBC and/or JDBC Drivers. Thus you can do things such as:

    1. Use a SPARQL Query in the FROM clause of a SQL statement
    2. Execute SPARQL via the SQL processor by prepending the SPARQL query text with the literal "sparql" (both approaches are sketched below)
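
    A minimal sketch of both approaches, as issued through Virtuoso's SQL processor (e.g. via isql or an ODBC/JDBC connection); the exact derived-table syntax may vary across Virtuoso versions:

    -- (1) A SPARQL query used as a derived table inside a SQL statement
    SELECT t.*
      FROM (sparql SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10) AS t;

    -- (2) SPARQL executed directly by the SQL processor: prepend the literal "sparql"
    sparql SELECT DISTINCT ?s WHERE { ?s ?p ?o } LIMIT 10;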

    BTW - My New Year's resolution: get my act together and shrink the ever increasing list of "simple & practical Virtuoso use case demos" on my todo list, which now spans all the way back to 2006 :-(

    ]]>
    OpenOffice.org, SPARQL, and the Linked Data Webhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1288Tue, 05 Feb 2008 01:42:50 GMT52008-02-04T20:42:50.000004-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    "The phrase Open Social implies portability of personal and social data. That would be exciting but there are entirely different protocols underway to deal with those ideas. As some people have told me tonight, it may have been more accurate to call this "OpenWidget" - though the press wouldn't have been as good. We've been waiting for data and identity portability - is this all we get?"
    [Source: Read/Write Web's Commentary & Analysis of Google's OpenSocial API]

    ..Perhaps the world will read the terms of use of the API, and realize this is not an open API; this is a free API, owned and controlled by one company only: Google. Hopefully, the world will remember another time when Google offered a free API and then pulled it. Maybe the world will also take a deeper look and realize that the functionality is dependent on Google hosted technology, which has its own terms of service (including adding ads at the discretion of Google), and that building an OpenSocial application ties Google into your application, and Google into every social networking site that buys into the Dream. Hopefully the world will remember. Unlikely, though, as such memories are typically filtered in the Great Noise....

    [Source: Poignant commentary excerpt from Shelley Powers' blog (as always)]

    The "Semantic Data Web" vision has always been about "Data & Identity" portability across the Web. Its been that and more from day one.

    In a nutshell, we continue to exhibit varying degrees of Cognitive Dissonance re the following realities:

    1. The Network is the Computer (Internet/Intranet/Extranet depending on your TCP/IP usage scenarios)
    2. The Web is the OS (ditto) and it provides a communications subsystem (Information BUS) comprised of
      • - URIs (pointer system for identifying, accessing, and manipulating data)
    3. HTTP based interprocess communications (i.e. Web Apps are processes when you discard the HTML UI and interact with the application logic containers called "Web Services" behind the pages) ultimately hit data
    4. Web Data is best Modeled as a Graph (RDF, Containers/Items/Item Types, Property & Value Pairs associated with something, and other labels)
    5. Networks are Graphs and vice versa
    6. Social Networks are graphs where nodes are connected via social connectors ( [x]--knows-->[y] )
    7. The Web is a Graph that exposes a People and Data Network (to the degree we allude to humans not being data containers i.e. just nodes in a network, otherwise we are talking about a Data Network)
    8. Data access and manipulation depends inherently on canonical Data Access mechanisms such as Data Source Identifiers / Names (time-tested practice in various DBMS realms)
    9. Data is forever, it is the basis of Information, and it is increasing exponentially due to proliferation of Web Services induced user activities (User Generated Content)
    10. Survival, Vitality, Longevity, Efficiency, Productivity, etc. all depend on our ability to process data effectively in a shrinking time continuum where Data and/or Information overload is the alternative.

    The Data Web is about Presence over Eyeballs due to the following realities:

    1. Eyeballs are input devices for a DNA based processing system (Humans). The aforementioned processing system can reason very well, but simply cannot effectively process masses of data or information
    2. Widgets offer little value long term re. the imminent data and information overload dilemma, ditto Web pages (however pretty), and any other Eyeballs-only centric Web Apps
    3. Computers (machines) are equipped with inorganic (non DNA) based processing power, they are equipped to process huge volumes of data and/or information, but they cannot reason
    4. To be effective in the emerging frontier comprised of a Network Computer and a Web OS, we need an effective mechanism that makes best use of the capabilities possessed by humans and machines, by shifting the focus to creation and interaction with points of "Data Web Presence" that openly expose "Structured Linked Data".

    This is why we need to inject a mesh of Linked Data into the existing Web. This is what the often misunderstood vision of the "Semantic Data Web" or "Web of Data" or "Web of Structured Data" is all about.

    As stated earlier (point 10 above), "Data is forever" and there is only more of it to come! Sociality and associated Social Networking oriented solutions are at best a speck in the Web's ocean of data once you comprehend this reality.

    Note: I am writing this post as an early implementor of GData and an implementor of RDF Linked Data technology and a "Web Purist".

    OpenSocial implementation and support across our relevant product families: Virtuoso (i.e. the Sponger Middleware for RDF component), OpenLink Data Spaces (Data Space Controller / Services), and the OpenLink Ajax Toolkit (i.e. OAT Widgets and Libraries), is a triviality now that the OpenSocial APIs are public.

    The concern I have, and the problem that remains mangled in the vast realms of Web Architecture incomprehension, is the fact that GData and GData based APIs cannot deliver Structured Linked Data in line with the essence of the Web without introducing "lock-in" that ultimately compromises the "Open Purity" of the Web. Facebook and Google's OpenSocial response to the Facebook juggernaut (i.e. open variant of the Facebook Activity Dashboard and Social Network functionality realms, primarily), are at best icebergs in the ocean we know as the "World Wide Web". The nice and predictable thing about icebergs is that they ultimately melt into the larger ocean :-)

    On a related note, I had the pleasure of attending the W3C's RDF and DBMS Integration Workshop last week. The event was well attended by organizations with knowledge, experience, and a vested interest in addressing the issues associated with exposing non-RDF data (e.g. SQL) as RDF, and the imminence of data and/or information overload, covered in different ways via the following presentations: ]]>
    Reminder: Why We Need Linked Data!http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1267Fri, 02 Nov 2007 22:52:34 GMT52007-11-02T18:52:34-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    OpenLink Software are pleased to announce release 2.6 of the OpenLink AJAX Toolkit (OAT).

    New Semantic Data Web related features and enhancements include:

      * A Javascript-based Fresnel processor enabling declarative RDF-based display templates for RDF Data Sources
      * An XSLT template for generating HTML pages from the Fresnel processor's XML output
      * Enhanced Javascript-based N3/Turtle parser
    Related Items: ]]>
    OpenLink Ajax Toolkit (OAT) 2.6 Released!http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1238Wed, 01 Aug 2007 18:49:17 GMT12007-08-01T14:49:17-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    I stumbled across an article titled: Thoughts on Compound Documents, from the Open Archives initiative (OAI). The article discusses the increasingly popular topic of deploying structured data containers on the Web.

    This article, like the one from Mike and our soon-to-be-released Linked Data Deployment white paper, addresses the main topic without inadvertent distraction by the misnomer "non-information resource". For instance, the OAI article uses the term Generic Resource instead of Non-information Resource.

    The Semantic Data Web is here, but we need to diffuse this reality across a broader spectrum of Web communities, so as to avoid unnecessary uptake inertia that can arise due to basic incomprehension of key concepts such as Linked Data deployment.

    ]]>
    Another Paper Discussing RDF Data Publishinghttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1234Wed, 25 Jul 2007 02:02:56 GMT32007-07-24T22:02:56-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    As the Linked Data meme beams across the Web, it is important to note that Ontology / Schema sharing and reuse is critical to the overall vitality of the burgeoning Semantic Data Web.

    The items that follow attempt to demonstrate the point by way of SIOC (Semantically-Interlinked Online Communities Ontology) and MO (Music Ontology) domain exploration:

    Linked Data or Dynamic Data Web Pages:

    1. Music Ontology Overview
    2. SIOC Ontology Overview
    3. SIOC Type Ontology Module (how you extend SIOC Concepts unobtrusively)
    4. SIOC Services Ontology Module (how you extend SIOC in relation to Services Modeling).

    Semantic Web Browser Sessions:

    1. Music Ontology Overview via OpenLink RDF Browser
    2. SIOC Ontology Overview via OpenLink RDF Browser
    3. SIOC Type Ontology Module via OpenLink RDF Browser
    4. SIOC Services Ontology Module via OpenLink RDF Browser.

    Key point: if you are modeling People, Communities, Organizations, Documents, and other entities in the People, Organizations, Documents etc. Data Space, don't forget to: FOAF-FOAF-FOAF it up! :-)

    ]]>
    Shared Ontologies Linked Data Style!http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1203Fri, 01 Jun 2007 23:54:05 GMT32007-06-01T19:54:05.000001-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Scobleizer's had a Semantic Web epiphany but can't quite nail down what he's discovered in layman's prose :-)

    Well, I'll have a crack at helping him out i.e. defining the Semantic Data Web in simple terms with linked examples :-)

    Tip: Watch the recent TimBL video interview re. the Semantic Data Web before, during, or after reading this post.

    Here goes!

    The popular Web is a "Web of Documents". The Semantic Data Web is a "Web of Data". Going down a level, the popular web connects documents across the web via hyperlinks. The Semantic Data Web connects data on the web via hyperlinks. Next level, hyperlinks on the popular web have no inherent meaning (lack context beyond: "there is another document"). Hyperlinks on the Semantic Data Web have inherent meaning (they possess context: "there is a Book" or "there is a Person" or "this is a piece of Music" etc..).

    Very simple example:

    Click the traditional web document URLs for Dan Connolly and Tim Berners-Lee. Then attempt to discern how they are connected. Of course you will see some obvious connections by reading the text, but you won't easily discern other data driven connections. Basically, this is no different to reading about either individual in a print journal, bar the ability to click on hyperlinks that open up other pages. The Data Extraction process remains labour intensive :-(

    Repeat the exercise using the traditional web document URLs as Data Web URIs; this time around, paste the hyperlinks above into an RDF aware Browser (in this case the OpenLink RDF Browser). Note, we are making a subtle but critical change, i.e. the URLs are now being used as Semantic Data Web URIs (a small-big-deal kind of thing).

    If you're impatient or simply strapped for time (aren't we all these days), simply take a look at these links:

    1. Dan Connolly (DanC) RDF Browser Session permalink
    2. Tim Berners-Lee (TimBL) RDF Browser Session permalink
    3. TimBL and DanC combined RDF Browser Session permalink

    Note: There are other RDF Browsers out there such as:

    1. Tabulator
    2. DISCO
    3. Objectviewer

    All of these RDF Browsers (or User Agents) demonstrate the same core concepts in subtly different ways.

    If I haven't lost you, proceed to a post I wrote a few weeks ago titled: Hello Data Web (Take 3 - Feel the "RDF" Force).

    If you've made it this far, simply head over to DBpedia for a lot of fun :-)

    Note Re. my demos: we make use of SVG in our RDF Browser which makes them incompatible with IE (6 or 7) and Safari. That said, Firefox (1.5+), Opera 9.x, WebKit (Open Source Safari), and Camino work fine.

    Note to Scoble:

    All the Blogs, Wikis, Shared Bookmarks, Image Galleries, Discussion Forums and the like are Semantic Web Data Spaces. The great thing about all of this is that through RSS 2.0's wild popularity, the Blogosphere has done what I postulated about a while back: the Semantic Web would be self-annotating, and so it has come to be :-)

    To prove the point above: paste your blog's URL into the OpenLink RDF Browser and see it morph into a Semantic Data Web URI (a pointer to Web Data that you've created) once you click the "Query" button (click on the TimeLine tab for full effect). The same applies to del.icio.us, Flickr, Googlebase, and basically any REST style Web Service, as per my RDF Middleware post.

    Lazy Semantic Web Callout:

    If you're a good animator (pro or hobbyist), please produce an animation of a document going through a shredder. The strips that emerge from the shredder represent the granular data that was once the whole document. The same thing is happening on the Web right now, we are putting photocopies of (X)HTML documents through the shredder (in a good way) en route to producing granular items of data that remain connected to the original copy while developing new and valuable connections to other items of Web Data.

    That's it!

    ]]>
    Describing the Semantic Data Web (Take 3)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1180Fri, 13 Apr 2007 21:15:42 GMT32007-04-13T17:15:42-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Danny Ayers responds, via his post titled Sampling, to Stefano Mazzocchi's post about Data Integration using Semantic Web Technologies.

    "There is a potential problem with republication of transformed data, in that right away there may be inconsistency with the original source data. Here provenance tracking (probably via named graphs) becomes a must-have. The web data space itself can support very granular separation. Whatever, data integration is a hard problem. But if you have a uniform language for describing resources, at least it can be possible."

    Alex James also chimes in with valuable insights in his post: Sampling the global data model, where he concludes:

    "Exactly we need to use projected views, or conceptual models. '

    See a projected view can be thought of as a conceptual model that has some mapping to a *sampling* of the global data model.

    The benefits of introducing this extra layer are many and varied: Simplicity, URI predictability, Domain Specificity and the ability to separate semantics from lower level details like data mapping.

    Unfortunately if you look at today’s ORMs you will quickly notice that they simply map directly from Object Model to Data Model in one step.

    This naïve approach provides no place to manage the mapping to a conceptual model that sampling the world’s data requires.

    What we need to solve the problems Stefano sees is to bring together the world of mapping and semantics. And the place they will meet is simply the Conceptual Model."

    Data Integration challenges arise because the following facts hold true all of the time (whether we like it or not):

    1. Data Heterogeneity is a fact of life at the intranet and internet levels
    2. Data is rarely clean
    3. Data Integration prowess is ultimately measured by pain alleviation
    4. At some point human participation is required, but the trick is to move human activity up the value chain
    5. Glue code size and Data Integration success are inversely related
    6. Data Integration is best addressed via "M" rather than "C" (if we use the MVC pattern as a guide; "V" is dead on arrival for the scrapers out there)

    In 1997 we commenced the Virtuoso Virtual DBMS Project that morphed into the Virtuoso Universal Server; a fusion of DBMS functionality and Middleware functionality in a single product. The goal of this undertaking remains alleviation of the costs associated with Data Integration Challenges by Virtualizing Data at the Logical and Conceptual Layers.

    The Logical Data Layer has been concrete for a while (e.g. Relational DBMS Engines); what hasn't reached the mainstream is the Concrete Conceptual Model, but this is changing fast courtesy of the activity taking place in the realm of RDF.

    RDF provides an Open and Standards compliant vehicle for developing and exploiting Concrete Conceptual Data Models that ultimately move the Human aspect of the "Data Integration alleviation quest" higher up the value chain.

    ]]>
    RDF based Integration Challenges (update)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1174Fri, 30 Mar 2007 23:35:35 GMT12007-03-30T19:35:35-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Ivan Herman has published another great Semantic Web presentation titled: State of the Semantic Web. I have placed links to some key points below; primarily for those who are new to the Semantic Web vision or somewhat confused about it thus far:

    1. Messaging Issues - misconceptions and misrepresentations (e.g. intermingling of RDF, the Data Model, with RDF/XML, one of its several serialization formats)
    2. RDF Data Availability
    3. Generating RDF from non RDF Data ("RDF Tax" eradication)
    4. Querying RDF Data Sources
    ]]>
    Semantic Web: State of Affairs Presentationhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1167Mon, 26 Mar 2007 17:02:53 GMT12007-03-26T13:02:53-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    (Via Read/Write Web.)

    Web 3.0: When Web Sites Become Web Services: "

    .....

    Conclusion

    As more and more of the Web is becoming remixable, the entire system is turning into both a platform and the database. Yet, such transformations are never smooth. For one, scalability is a big issue. And of course legal aspects are never simple.'

    But it is not a question of if web sites become web services, but when and how. APIs are a more controlled, cleaner and altogether preferred way of becoming a web service. However, when APIs are not available or sufficient, scraping is bound to continue and expand. As always, time will be best judge; but in the meanwhile we turn to you for feedback and stories about how your businesses are preparing for 'web 3.0'.

    We are hitting a little problem re. Web 3.0 and Web 2.0, naturally :-) Web 2.0 is one of several (present and future) Dimensions of Web Interaction that turn Web Sites into Web Services Endpoints; a point I've made repeatedly [1] [2] [3] [4] across the blogosphere, in addition to my early futile attempts to make Wikipedia's Web 2.0 article meaningful (circa 2005), as per the Wikipedia Web 2.0 Talk Page excerpt below:

    Web 2.0 is a web of executable endpoints and well formed content. The executable endpoints and well formed content are accessible via URIs. Put differently, Web 2.0 is a web defined by URIs for invoking Web Services and/or consuming or syndicating well formed content.

    Hopefully, someone with more time on their hands will expand on this ( I am kinda busy)


    BTW - Web 2.0 being a platform doesn't distinguish it in any way from Web 1.0. They are both platforms; the difference comes down to platform focus and mode of experience.

    Web 3.0 is about Data Spaces: Points of Semantic Web Presence that provide granular access to Data, Information, and Knowledge via Conceptual Data Model oriented Query Languages and/or APIs.

    The common denominator across all the current and future Web Interaction Dimensions is HTTP, while their differences are as follows:

      Web 1.0 - Browser (HTTP + (X)HTML)
      Web 2.0 - Presence (Web Service Endpoints for REST or SOAP over HTTP)
      Web 3.0 - Presence (Query Languages, Data Models, and HTTP based Query Oriented Web Service Endpoints)

    Examples of Web 3.0 Infrastructure:

    1. Query Languages: SPARQL, Googlebase Query Language, Facebook Query Language (FQL), and many others to come
    2. Query Language aligned Web Services (Query Services): SPARQL Protocol, GData, or REST style Web services such as Facebook's service for FQL.
    3. Data Models: Concrete Conceptual Data Model (which RDF happens to deliver for Web Data)

    Web 3.0 is not purely about Web Sites becoming Web Services endpoints. It is about the "M" (Data Model) taking its place in the MVC pattern as applied to the Web Platform.

    I will repeat myself yet again:

    The Devil is in the Details of the Data Model. Data Models make or break everything. You ignore data at your own peril. No amount of money in the bank will protect you from Data Ignorance! A bad Data Model will bring down any venture or enterprise, the only variable is time (where time is directly related to your increasing need to obtain, analyze, and then act on data, over repetitive operational cycles, that have ever decreasing intervals).

    This applies to the Real-time enterprise of Information and/or knowledge workers and Real-time Web Users alike.

    BTW - Data Makes Shifts Happen (spotter: Sam Sethi).

    ]]>
    Web 3.0: When Web Sites Become Web Serviceshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1161Tue, 20 Mar 2007 12:27:37 GMT92007-03-20T08:27:37-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Courtesy of Henry Story's post: O'Reilly groks the Semantic Web.

    Web 2.0 commentators such as Mike Arrington and, as mentioned above, Tim O'Reilly both blogged about the imminent release of Freebase earlier today. Although I haven't looked at this database yet, it is crystal clear to me that it is one of many Web Databases to come. Others that I am personally familiar with, and involved in, include: DBpedia (Wikipedia as a true Database) and Zitgist (soon to be unveiled).

    All of these databases mark the crystallization of the "Data Web" and the imminence of what is increasingly referred to as Web 3.0.

    I certainly hope that all web 3.0 Database Providers keep the data Open, adhere to Web Best Practice recipes for sharing and publishing data, and generally make the process of data, information, and knowledge discovery via the Web much easier.

    ]]>
    Web Databases on the risehttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1152Fri, 09 Mar 2007 17:56:01 GMT12007-03-09T12:56:01-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    As I have stated, and implied, in various posts about the Data Web and burgeoning Semantic Web in general; the value of RDF is felt rather than seen (driven by presence as opposed to web sites). That said, it is always possible to use the visual Interactive-Web dimension (Web 1.0) as a conduit to the Data-Web dimension.

    In this third take on my introduction to the Data Web I would like to share a link with you (a Dynamic Start Page in Web 2.0 parlance) with a Data Web twist: You do not have to preset the Start Page Data Sources (this is a small-big thing, if you get my drift, hopefully!).

    Here are some Data Web based Dynamic Start Pages that I have built for some key players from the Semantic Web realm (in random order):

    1. Dan Brickley
    2. Tim Berners-Lee
    3. Dan Connolly
    4. Danny Ayers
    5. Planet RDF

    "These are RDF prepped Data Sources....", you might be thinking, right? Well here is the reminder: The Data Web is a Global Data Generation and Integration Effort. Participation may be active (Semantic Web & Microformats Community), or passive (web sites, weblogs, wikis, shared bookmarks, feed subscription, discussion forums, mailing lists etc..). Irrespective of participation mode, RDF instance can be generated from close to anything (I say this because I plan to add binary files holding metadata to this mix shortly). Here are examples of Dynamic Start Pages for non RDF Data Sources:

    1. del.icio.us Web 2.0 Events Bookmarks
    2. Vecosys
    3. Techcrunch
    4. Jon Udell's Blog
    5. Dave Winer's Scripting News
    6. Robert Scoble's Blog

    What about Microformats, you may be wondering? Here goes:

    1. Microformats Wiki (click on the Brian Suda link for instance)
    2. Microformats Planet
    3. Del.icio.us Microformats Bookmarks
    4. Ben Adida's home page (RDFa)

    Let's carry on.

    How about some traditional Web Sites? Here goes:

    1. OpenLink Software's Home Page
    2. Oracle's Home Page
    3. Apple's Home Page
    4. Microsoft's Home Page
    5. IBM's Home Page

    And before I forget, here is My Data Web Start Page .

    Due to the use of Ajax in the Data Web Start Pages, IE6 and Safari will not work. For Mac OS X users, Webkit works fine. Ditto re. IE7 on Windows.

    ]]>
    Hello Data Web (Take 3 - Feel The "RDF" Force)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1144Sat, 24 Feb 2007 22:01:28 GMT22007-02-24T17:01:28-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Rob Boothby aptly describes the recipe for success in a networked world.

    Our loosely coupled webs of hypertext, services, and data present an intriguing realm of perpetually expanding and contracting clusters (aka conversations as exemplified by digg swarms). The only issue we have today is that you cannot perceive the aforementioned realm through the lenses of the Hypertext- or Interactive-Web or the API oriented Services-Web. Which is why we need a new frontier in the web innovation continuum. A frontier that unveils, with clarity, the somewhat unperceived realm of "People and Data Networks" en route to simplifying "Network Effects" exploitation: spotting, connecting to, and constructing conversation clusters.

    Once again, this is what the Semantic Web facilitates by delivering a Data Model that exposes these "People & Data Networks". When you write a blog post, comment on a blog post, share bookmarks, tag resources, share and tag photos, etc., you are contributing links and nodes to this network :-)

    ]]>
    Network Effects Exploitation the Key to Success!http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1119Thu, 11 Jan 2007 23:01:02 GMT12007-01-11T18:01:02-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    This post is part contribution to the general Web 3.0 / Data-Web / Semantic Web discourse, and part experiment / demonstration of the Data Web.

    I came across a pretty deep comments trail about the aforementioned items on Fred Wilson's blog (aptly titled: A VC) under the subject heading: Web 3.0 Is The Semantic Web.

    Contributions to the general Semantic Web discourse by way of responses to valuable questions and commentary contributed by a Semantic Web skeptic (Ed Addison who may be this Ed Addison according to Google):

    Ed, responses to your points re. Semantic Web materialization:
      << 1) ontologies can be created and maintained by text extractors and crawlers" >>

      Ontologies will be developed by Humans. This process has already commenced and far more landscape has been covered than you may be aware of. For instance, there is an Ontology for Online Communities with Semantics factored in. More importantly, most Blogs, Wikis, and other "points of presence" on the Web are already capable of generating Instance Data for this Ontology by way of the underlying platforms that drive these things. The Ontology is called: SIOC (Semantically-Interlinked Online Communities).

      << 2) the entire web can be marked up, semantically indexed, and maintained by spiders without human assistance >>

      Most of it can, and already is :-) Human assistance should, and would, be on an "exception basis"; a preferred use of human time (IMHO). We do not need to annotate the Web manually when this labor intensive process can be automated (see my earlier comments).

      << 3) inference over the semantic web does not require an extremely deep heuristic search down multiple, redundant, cyclical pathways with many islands that are disconnected >>

      When you have a foundation layer of RDF Data (generated in the manner I've discussed above), you then have a substrate that's far more palatable to Intelligent Reasoning. Note, the Semantic Web is made of many layers. The critical layer at this juncture is the Data-Web (Web of RDF Data). Note, when I refer to RDF I am not referring to RDF/XML the serialization format, I am referring to the Data Model (a Graph).

      << 4) the web becomes smart enough to eliminate websites or data elements that are incorrect, misleading, false, or just plain lousy >>

      The Semantic Web vision is not about eliminating Web Sites (The Hypertext-Document-Web). It is simply about adding another dimension of interaction to the Web. This is just like the Services-Web dimension as delivered by Web 2.0.

      We are simply evolving within an innovation continuum. There is no mutual exclusivity about any of the Web Dimensions since they collectively provide us with a more powerful infrastructure for building and exploiting "collective wisdom".

    As for the Data-Web experiment part of this post, I would expect to see this post exposed as another contribution to the Data-Web via the PingTheSemanticWeb notification service :-) Implying, that all the relevant parts of this conversation are in a format (Instance Data for the SIOC Ontology) that is available for further use in a myriad of forms.

    ]]>
    Contd: Web 3.0 Commentary etc..http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1090Fri, 24 Nov 2006 18:30:08 GMT12006-11-24T13:30:08.000001-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Frederick Giasson continues the conversation about the Web Experience Dimensions in a new post --the first of several-- that chronicles the evolution of Pingthesemanticweb.com and Talk Digger, from Interactive-Web (Web 1.0) sites to Data-Web oriented Data Spaces:

    On a related front, I also came across an e-Government Data Reference Model presentation (PPT) by Mills Davis  from the Colab Wiki that  illustrates the aforementioned Web Dimensions (even though his presentation didn't have dimensionality of the Web in mind) in one of its graphics (which I've yanked and placed into this post so that it has a URI courtesy of ODS ):



    Notes:
    =====
    Conceptual - Data-Web (*we are starting to comprehend and use this dimension* aka Semantic Web Layer 1)

    Logical Theory - To follow when we let loose the intelligent agents that enrich the Data Web experience

    Philosophy - by way of Axiology (sometime in the future, but note, we are talking Internet time :-) )

    I also stumbled across another graphic that actually provides visual delineation of the value propositions of XML (Structure) and RDF (Context):

    Notes:
    =====

    Description - XML

    Context - RDF

    Sharing - Access Points (e.g SPARQL, XMLA, GData Generic Query oriented Web Service Endpoints)
    ]]>
    Contd: Web Dimensionalityhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1072Wed, 25 Oct 2006 22:19:40 GMT102006-10-25T18:19:40.000001-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Geonames marches forward with ontology v1.2: "

    Geonames announced the release of its Geonames ontology v1.2. The new ontology has a few enhancements. It introduced the notion of linked data and made a clear distinction between URIs intended for linking documents and those intended for linking ontology concepts.

    Different types of geospatial data are of different spatial granularity. Data of different spatial granularity may relate to each other by the containment relation. For example, countries contain states, states contain cities, and so on. Some geospatial data are of similar spatial granularity (e.g., two cities that are nearby each other, or two countries that neighbour each other). To support the knowledge representation of these relationships, the ontology introduced three new properties: childrenFeatures, nearbyFeatures and neighbouringFeatures.

    In the Semantic Web, both ontology concepts and physical web documents are linked by URI. Sometimes in applications, it’s useful to make clear whether the use of a URI is intended for linking documents or for linking ontology concepts. The new Geonames ontology introduced a URI convention for identifying the intended usage of a URI. This convention also simplifies the discovering of geospatial data using Geonames web services.

    Here is an example:

    Other interesting ontology properties include wikipediaArticle and locationMap. The former links a Feature instance to a Web article on Wikipedia, and the latter links a Feature instance to a digital map Web page.

    For additional information about Geonames ontology v1.2, see Marc’s post at the Geonames blog.

    "

    (Via Geospatial Semantic Web Blog.)

    ]]>
    Geonames marches foward with ontology v1.2http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1067Mon, 23 Oct 2006 13:02:33 GMT12006-10-23T09:02:33-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Note: An updated version of a previously unpublished blog post:

    Continuing from our recent Podcast conversation, Jon Udell sheds further insight into the essence of our conversation via a “Strategic Developer” column article titled: Accessing the web of databases.

    Below, I present an initial dump of a DataSpace FAQ that hopefully sheds light on the DataSpace vision espoused during my podcast conversation with Jon.

    What is a DataSpace?

    A moniker for Web-accessible atomic containers that manage and expose Data, Information, Services, Processes, and Knowledge.

    What would you typically find in a Data Space? Examples include:

    • Raw Data - SQL, HTML, XML (raw), XHTML, RDF etc.

    • Information (Data In Context) - XHTML (various microformats), Blog Posts (in RSS, Atom, RSS-RDF formats), Subscription Lists (OPML, OCS, etc), Social Networks (FOAF, XFN etc.), and many other forms of applied XML.
    • Web Services (Application/Service Logic) - REST or SOAP based invocation of application logic for context sensitive and controlled data access and manipulation.
    • Persisted Knowledge - Information in actionable context that is also available in transient or persistent forms expressed using a Graph Data Model. A modern knowledgebase would more than likely have RDF as its Data Language, RDFS as its Schema Language, and OWL as its Domain Definition (Ontology) Language. Actual Domain, Schema, and Instance Data would be serialized using formats such as RDF/XML, N3, Turtle, etc.

    How do Data Spaces and Databases differ?
    Data Spaces are fundamentally problem-domain-specific database applications. They offer functionality that you would instinctively expect of a database (e.g. ACID data management) with the additional benefit of being data model and query language agnostic. Data Spaces are for the most part DBMS Engine and Data Access Middleware hybrids in the sense that ownership and control of data is inherently loosely-coupled.

    How do Data Spaces and Content Management Systems differ?
    Data Spaces are inherently more flexible, they support multiple data models and data representation formats. Content management systems do not possess the same degree of data model and data representation dexterity.

    How do Data Spaces and Knowledgebases differ?
    A Data Space cannot dictate the perception of its content. For instance, what I may consider as knowledge relative to my Data Space may not be the case to a remote client that interacts with it from a distance. Thus, defining my Data Space purely as a Knowledgebase introduces constraints that reduce its broader effectiveness to third party clients (applications, services, users, etc.). A Knowledgebase is based on a Graph Data Model, resulting in significant impedance for clients that are built around alternative models. To reiterate, Data Spaces support multiple data models.

    What Architectural Components make up a Data Space?

    • ORDBMS Engine - for Data Modeling agility (via complex purpose specific data types and data access methods), Data Atomicity, Data Concurrency, Transaction Isolation, and Durability (aka ACID).

    • Virtual Database Engine - for creating a single view of, and access point to, heterogeneous SQL, XML, Free Text, and other data. This is all about Virtualization at the Data Access Level.
    • Web Services Platform - enabling controlled access and manipulation (via application, service, or protocol logic) of Virtualized or Disparate Data. This layer handles the decoupling of functionality from monolithic wholes for function specific invocation via Web Services using either the SOAP or REST approach.

    Where do Data Spaces fit into the Web's rapid evolution?
    They are an essential part of the burgeoning Data Web / Semantic Web. In short, they will take us from data “Mash-ups” (combining web accessible data that exists without integration and repurposing in mind) to “Mesh-ups” (combining web accessible data that exists with integration and repurposing in mind).

    Where can I see a DataSpace along the lines described, in action?

    Just look at my blog, and take the journey as follows:

    What about other Data Spaces?

    There are several and I will attempt to categorize along the lines of query method available:
    Type 1 (Free Text Search over HTTP):
    Google, MSN, Yahoo!, Amazon, eBay, and most Web 2.0 plays .

    Type 2 (Free Text Search and XQuery/XPath over HTTP)
    A few blogs and Wikis (Jon Udell's and a few others)

    Type 3 (RDF Data Sets and SPARQL Queryable):
    Type 4 (Generic Free Text Search, OpenSearch, GData, XQuery/XPath, and SPARQL):
    Points of Semantic Web presence such as the Data Spaces at:

    What About Data Space aware tools?

    •    OpenLink Ajax Toolkit - provides Javascript Control level binding to Query Services such as XMLA for SQL, GData for Free Text, OpenSearch for Free Text, SPARQL for RDF, in addition to service specific Web Services (Web 2.0 hosted solutions that expose service specific APIs)
    •    Semantic Radar - a Firefox Extension
    •    PingTheSemanticWeb - the Semantic Web's equivalent of Web 2.0's weblogs.com
    •    PiggyBank - a Firefox Extension

    ]]>
    Data Spaces and Web of Databaseshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1030Mon, 04 Sep 2006 22:58:56 GMT52006-09-04T18:58:56.000001-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    I just found this interesting Semantic Web effort via 'Danny Ayers' blog. Here is the synopsis from his post:

    Piggy Bank 2.0 Beta

    New release of Piggy Bank, the Semantic Web extension for Firefox. It harvests data as you browse (when you click a status bar indicator), which can later be searched and viewed in a facetted browser.

    The docs have come along some too -

    Piggy Bank can collect pure information in the following cases:

    1. The web page has invisible link(s) to RDF data (encoded in RDF/XML or N3 formats).
    2. The web page exports an RSS feed.
    3. The address of the web page is a file:/ URL pointing to a directory.
    4. Piggy Bank has a "screen scraper" [XSLT or Javascript] that can re-structure the web page HTML code into RDF data.

    There's a tutorial on writing Javascript screenscrapers on the site, nice touch.

    I have also added an architecture diagram to accelerate comprehension (a picture speaks a thousand words...):

    The infrastructure for tier-3 is an aspect of Virtuoso's functionality pool; combining Database & Web Application Server functionality amongst other things, as a single product offering.
    ]]>
    FireFox Semantic Web Extension: Piggy Bank 2.0 Betahttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/850Fri, 21 Jul 2006 11:25:03 GMT12006-07-21T07:25:03.000001-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    I was asked about my weblog engine in email and in comments, so I'll just post a quick reply.

    Its pretty much a very simple home-grown blogging engine along with a web-based admin front-end thats still partially in the works. All built on ASP.NET v1.1... of course. All the data (entries, comments, links etc.) is managed in a SQL database. The pages were developed in Web Matrix (as part of app-building exercise while preparing for a new updated version - more on that specifically in a future post sometime soon).
    ]]>
    <em>I was asked about my weblog engine in email and in comments, so I&#39;ll just post a quick reply.<br /><br />Its pretty much a very simple home-grown blogging engine along with a web-based admin front-end thats still partially in the works. All built on ASP.NET v1.1... of course. All the data (entries, comments, links etc.) is managed in a SQL database. The pages were developed in </em>http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/5Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Amazon RSS Feeds

    RSS feeds are everywhere, and they are changing the Web landscape fast. The Web is shifting from distributed freeform database, to distributed semi-structured database.

    Amazon.com RSS Feeds They never got around to it, so we set up 160+ separate RSS channels for darn near every type of product on Amazon.com for you. If you have any feedback for this new (free) service, please let us know immediately! We're looking to make it an outstanding and permanent part to your collection. Enjoy! (Chris) [via Lockergnome's Bits and Bytes]

    Your Web Site is gradually becoming a database (what?). Yes, your Web Site needs to be driven by database software that can rapidly create RSS feeds for your organization's non-XML and XML data sources. Your web site needs to provide direct data access to users, bots, and Web Services.

    Here is my blog database for instance, you can query the XML data in this database using XQuery, XPath, and Web Services (if I decide to publish any of my XML Query Templates as Web Services).

    Note the teaser here: each XML document is zero bytes! This is because these are live Virtuoso SQL-XML documents that are producing a variety of XML documents on the fly, which means that they retain a high degree of sensitivity to changes in the underlying databases supplying the data. I could have chosen to make these persistent XML docs with interval based synchronization with the backend data sources (but I chose not to for maximum effect).

    As you can see, SQL and XML (Relational and Hierarchical Model) engines can co-exist in a single server; ditto Object-Relational (which might be hidden from view but could be used in the SQL that serves the SQL-XML docs), ditto Full Text (see the search feature of this blog), and finally, ditto the directed graph model for accessing my RDF data (more on this as the RDF data pool increases).

    ]]>
    Amazon.com RSS Feedshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/181Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    What?

    A simple guide usable by any Perl developer seeking to exploit SPARQL without hassles.

    Why?

    SPARQL is a powerful query language, results serialization format, and an HTTP based data access protocol from the W3C. It provides a mechanism for accessing and integrating data across Deductive Database Systems (colloquially referred to as triple or quad stores in Semantic Web and Linked Data circles) -- database systems (or data spaces) that manage proposition oriented records in 3-tuple (triples) or 4-tuple (quads) form.

    How?

    SPARQL queries are actually HTTP payloads (typically). Thus, using a RESTful client-server interaction pattern, you can dispatch calls to a SPARQL compliant data server and receive a payload for local processing.

    Steps:

    1. Determine which SPARQL endpoint you want to access e.g. DBpedia or a local Virtuoso instance (typically: http://localhost:8890/sparql).
    2. If using Virtuoso, and you want to populate its quad store using SPARQL, assign "SPARQL_SPONGE" privileges to user "SPARQL" (this is basic control, more sophisticated WebID based ACLs are available for controlling SPARQL access).
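
    As a sketch of step 2, the privilege grant is typically a one-line statement issued through Virtuoso's SQL interface (e.g. isql); the exact form may differ across Virtuoso versions:

    GRANT SPARQL_SPONGE TO "SPARQL";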

    Script:

    #
    # Demonstrating use of a single query to populate a 
    # Virtuoso Quad Store via Perl. 
    #
    
    # 
    # HTTP URL is constructed accordingly with CSV query results format as the default via mime type.
    #
    
    use CGI qw/:standard/;
    use LWP::UserAgent;
    use Data::Dumper;
    use Text::CSV_XS;
    
    sub sparqlQuery {
      my $query=shift;
      my $baseURL=shift;
      my $format=shift;
    	
    	%params=(
    		"default-graph" => "", "should-sponge" => "soft", "query" => $query,
    		"debug" => "on", "timeout" => "", "format" => $format,
    		"save" => "display", "fname" => ""
    	);
    	
    	@fragments=();
    	foreach $k (keys %params) {
    		$fragment="$k=".CGI::escape($params{$k});
    		push(@fragments,$fragment);
    	}
    	$query=join("&", @fragments);
    	
    	$sparqlURL="${baseURL}?$query";
    	
    	my $ua = LWP::UserAgent->new;
    	$ua->agent("MyApp/0.1 ");
    	my $req = HTTP::Request->new(GET => $sparqlURL);
    	my $res = $ua->request($req);
    	$str=$res->content;
    	
    	$csv = Text::CSV_XS->new();
    	
    	foreach $line ( split(/^/, $str) ) {
    		$csv->parse($line);
    		@bits=$csv->fields();
    	  push(@rows, [ @bits ] );
    	}
    	return \@rows;
    }
    
    
    # Setting Data Source Name (DSN)
    
    $dsn="http://dbpedia.org/resource/DBpedia";
    
    # Virtuoso pragmas for instructing SPARQL engine to perform an HTTP GET using the IRI in
    # FROM clause as Data Source URL en route to DBMS
    # record Inserts.
    
    $query="DEFINE get:soft \"replace\"\n
    
    # Generic (non-Virtuoso-specific) SPARQL
    # Note: this will not add records to the 
    # DBMS 
    
    SELECT DISTINCT * FROM <$dsn> WHERE {?s ?p ?o}"; 
    
    $data=sparqlQuery($query, "http://localhost:8890/sparql/", "text/csv");
    
    print "Retrieved data:\n";
    print Dumper($data);
    

    Output

    Retrieved data:
    $VAR1 = [
              [
                's',
                'p',
                'o'
              ],
              [
                'http://dbpedia.org/resource/DBpedia',
                'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
                'http://www.w3.org/2002/07/owl#Thing'
              ],
              [
                'http://dbpedia.org/resource/DBpedia',
                'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
                'http://dbpedia.org/ontology/Work'
              ],
              [
                'http://dbpedia.org/resource/DBpedia',
                'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
                'http://dbpedia.org/class/yago/Software106566077'
              ],
    ...
    

    Conclusion

    CSV was chosen over XML (re. output format) since this is about a "no-brainer installation and utilization" guide for a Perl developer that already knows how to use Perl for HTTP based data access. SPARQL just provides an added bonus to URL dexterity (delivered via URI abstraction) with regards to constructing Data Source Names or Addresses.

    Related

    ]]>
    SPARQL Guide for the Perl Developerhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1655Wed, 26 Jan 2011 23:11:13 GMT32011-01-26T18:11:13-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    What?

    A simple guide usable by any Javascript developer seeking to exploit SPARQL without hassles.

    Why?

    SPARQL is a powerful query language, results serialization format, and an HTTP based data access protocol from the W3C. It provides a mechanism for accessing and integrating data across Deductive Database Systems (colloquially referred to as triple or quad stores in Semantic Web and Linked Data circles) -- database systems (or data spaces) that manage proposition oriented records in 3-tuple (triples) or 4-tuple (quads) form.

    How?

    SPARQL queries are actually HTTP payloads (typically). Thus, using a RESTful client-server interaction pattern, you can dispatch calls to a SPARQL compliant data server and receive a payload for local processing.

    Steps:

    1. Determine which SPARQL endpoint you want to access e.g. DBpedia or a local Virtuoso instance (typically: http://localhost:8890/sparql).
    2. If using Virtuoso, and you want to populate its quad store using SPARQL, assign "SPARQL_SPONGE" privileges to user "SPARQL" (this is basic control, more sophisticated WebID based ACLs are available for controlling SPARQL access).

    Script:

    /*
    Demonstrating use of a single query to populate a Virtuoso Quad Store via Javascript. 
    */
    
    /* 
    HTTP URL is constructed accordingly with JSON query results format as the default via mime type.
    */
    
    function sparqlQuery(query, baseURL, format) {
    	if(!format)
    		format="application/json";
    	var params={
    		"default-graph": "", "should-sponge": "soft", "query": query,
    		"debug": "on", "timeout": "", "format": format,
    		"save": "display", "fname": ""
    	};
    	
    	var querypart="";
    	for(var k in params) {
    		querypart+=k+"="+encodeURIComponent(params[k])+"&";
    	}
    	var queryURL=baseURL + '?' + querypart;
    	if (window.XMLHttpRequest) {
      	xmlhttp=new XMLHttpRequest();
      }
      else {
      	xmlhttp=new ActiveXObject("Microsoft.XMLHTTP");
      }
      xmlhttp.open("GET",queryURL,false);
      xmlhttp.send();
      return JSON.parse(xmlhttp.responseText);
    }
    
    /*
    setting Data Source Name (DSN)
    */
    
    var dsn="http://dbpedia.org/resource/DBpedia";
    
    /*
    Virtuoso pragma "DEFINE get:soft "replace" instructs Virtuoso SPARQL engine to perform an HTTP GET using the IRI in FROM clause as Data Source URL with regards to 
    DBMS record inserts
    */
    
    var query="DEFINE get:soft \"replace\"\nSELECT DISTINCT * FROM <"+dsn+"> WHERE {?s ?p ?o}"; 
    var data=sparqlQuery(query, "/sparql/");
    

    Output

    Place the snippet above into the <script/> section of an HTML document to see the query result.

    Conclusion

    JSON was chosen over XML (re. output format) since this is about a "no-brainer installation and utilization" guide for a Javascript developer that already knows how to use Javascript for HTTP based data access within HTML. SPARQL just provides an added bonus to URL dexterity (delivered via URI abstraction) with regards to constructing Data Source Names or Addresses.

    Related

    ]]>
    SPARQL Guide for the Javascript Developer http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1653Wed, 26 Jan 2011 23:10:28 GMT42011-01-26T18:10:28-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    What?

    A simple guide usable by any PHP developer seeking to exploit SPARQL without hassles.

    Why?

    SPARQL is a powerful query language, results serialization format, and an HTTP based data access protocol from the W3C. It provides a mechanism for accessing and integrating data across Deductive Database Systems (colloquially referred to as triple or quad stores in Semantic Web and Linked Data circles) -- database systems (or data spaces) that manage proposition oriented records in 3-tuple (triples) or 4-tuple (quads) form.

    How?

    SPARQL queries are actually HTTP payloads (typically). Thus, using a RESTful client-server interaction pattern, you can dispatch calls to a SPARQL compliant data server and receive a payload for local processing e.g. local object binding re. PHP.

    Steps:

    1. From your command line execute: aptitude search '^PHP26', to verify PHP is in place
    2. Determine which SPARQL endpoint you want to access e.g. DBpedia or a local Virtuoso instance (typically: http://localhost:8890/sparql).
    3. If using Virtuoso, and you want to populate its quad store using SPARQL, assign "SPARQL_SPONGE" privileges to user "SPARQL" (this is basic control, more sophisticated WebID based ACLs are available for controlling SPARQL access).

    Script:

    #!/usr/bin/env php
    <?php
    #
    # Demonstrating use of a single query to populate a Virtuoso Quad Store via PHP. 
    #
    
    # HTTP URL is constructed accordingly with JSON query results format in mind.
    
    function sparqlQuery($query, $baseURL, $format="application/json")
    
      {
    	$params=array(
    		"default-graph" =>  "",
    		"should-sponge" =>  "soft",
    		"query" =>  $query,
    		"debug" =>  "on",
    		"timeout" =>  "",
    		"format" =>  $format,
    		"save" =>  "display",
    		"fname" =>  ""
    	);
    
    	$querypart="?";	
    	foreach($params as $name => $value) 
      {
    		$querypart=$querypart . $name . '=' . urlencode($value) . "&";
    	}
    	
    	$sparqlURL=$baseURL . $querypart;
    	
    	return json_decode(file_get_contents($sparqlURL));
    };
    
    
    
    # Setting Data Source Name (DSN)
    $dsn="http://dbpedia.org/resource/DBpedia";
    
    #Virtuoso pragmas for instructing SPARQL engine to perform an HTTP GET
    #using the IRI in FROM clause as Data Source URL
    
    $query="DEFINE get:soft \"replace\"
    SELECT DISTINCT * FROM <$dsn> WHERE {?s ?p ?o}"; 
    
    $data=sparqlQuery($query, "http://localhost:8890/sparql/");
    
    print "Retrieved data:\n" . json_encode($data);
    
    ?>
    

    Output

    Retrieved data:
      {"head":
      {"link":[],"vars":["s","p","o"]},
      "results":
    		{"distinct":false,"ordered":true,
    		"bindings":[
    			{"s":
    			{"type":"uri","value":"http:\/\/dbpedia.org\/resource\/DBpedia"},"p":
    			{"type":"uri","value":"http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type"},"o":
    			{"type":"uri","value":"http:\/\/www.w3.org\/2002\/07\/owl#Thing"}},
    			{"s":
    			{"type":"uri","value":"http:\/\/dbpedia.org\/resource\/DBpedia"},"p":
    			{"type":"uri","value":"http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type"},"o":
    			{"type":"uri","value":"http:\/\/dbpedia.org\/ontology\/Work"}},
    			{"s":
    			{"type":"uri","value":"http:\/\/dbpedia.org\/resource\/DBpedia"},"p":
    			{"type":"uri","value":"http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type"},"o":
    			{"type":"uri","value":"http:\/\/dbpedia.org\/class\/yago\/Software106566077"}},
    ...
    

    Conclusion

    JSON was chosen over XML (re. output format) since this is about a "no-brainer installation and utilization" guide for a PHP developer that already knows how to use PHP for HTTP based data access. SPARQL just provides an added bonus to URL dexterity (delivered via URI abstraction) with regards to constructing Data Source Names or Addresses.

    Related

    ]]>
    SPARQL Guide for the PHP Developerhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1652Tue, 25 Jan 2011 15:36:58 GMT32011-01-25T10:36:58-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    What?

    A simple guide usable by any Python developer seeking to exploit SPARQL without hassles.

    Why?

    SPARQL is a powerful query language, results serialization format, and an HTTP based data access protocol from the W3C. It provides a mechanism for accessing and integrating data across Deductive Database Systems (colloquially referred to as triple or quad stores in Semantic Web and Linked Data circles) -- database systems (or data spaces) that manage proposition oriented records in 3-tuple (triples) or 4-tuple (quads) form.

    How?

    SPARQL queries are actually HTTP payloads (typically). Thus, using a RESTful client-server interaction pattern, you can dispatch calls to a SPARQL compliant data server and receive a payload for local processing e.g. local object binding re. Python.

    Steps:

    1. From your command line execute: aptitude search '^python26', to verify Python is in place
    2. Determine which SPARQL endpoint you want to access e.g. DBpedia or a local Virtuoso instance (typically: http://localhost:8890/sparql).
    3. If using Virtuoso, and you want to populate its quad store using SPARQL, assign "SPARQL_SPONGE" privileges to user "SPARQL" (this is basic control, more sophisticated WebID based ACLs are available for controlling SPARQL access).

    Script:

    #!/usr/bin/env python
    #
    # Demonstrating use of a single query to populate a Virtuoso Quad Store via Python. 
    #
    
    import urllib, json
    
    # HTTP URL is constructed accordingly with JSON query results format in mind.
    
    def sparqlQuery(query, baseURL, format="application/json"):
    	params={
    		"default-graph": "",
    		"should-sponge": "soft",
    		"query": query,
    		"debug": "on",
    		"timeout": "",
    		"format": format,
    		"save": "display",
    		"fname": ""
    	}
    	querypart=urllib.urlencode(params)
    	response = urllib.urlopen(baseURL,querypart).read()
    	return json.loads(response)
    
    # Setting Data Source Name (DSN)
    dsn="http://dbpedia.org/resource/DBpedia"
    
    # Virtuoso pragmas for instructing SPARQL engine to perform an HTTP GET
    # using the IRI in FROM clause as Data Source URL
    
    query="""DEFINE get:soft "replace"
    SELECT DISTINCT * FROM <%s> WHERE {?s ?p ?o}""" % dsn 
    
    data=sparqlQuery(query, "http://localhost:8890/sparql/")
    
    print "Retrieved data:\n" + json.dumps(data, sort_keys=True, indent=4)
    
    #
    # End
    

    Output

    Retrieved data:
    {
        "head": {
            "link": [], 
            "vars": [
                "s", 
                "p", 
                "o"
            ]
        }, 
        "results": {
            "bindings": [
                {
                    "o": {
                        "type": "uri", 
                        "value": "http://www.w3.org/2002/07/owl#Thing"
                    }, 
                    "p": {
                        "type": "uri", 
                        "value": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
                    }, 
                    "s": {
                        "type": "uri", 
                        "value": "http://dbpedia.org/resource/DBpedia"
                    }
                }, 
    ...
    

    Conclusion

    JSON was chosen over XML (re. output format) since this is about a "no-brainer installation and utilization" guide for a Python developer that already knows how to use Python for HTTP based data access. SPARQL just provides an added bonus to URL dexterity (delivered via URI abstraction) with regards to constructing Data Source Names or Addresses.

    Related

    ]]>
    SPARQL Guide for Python Developerhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1651Tue, 25 Jan 2011 15:35:46 GMT32011-01-25T10:35:46-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    What?

    A simple guide usable by any Ruby developer seeking to exploit SPARQL without hassles.

    Why?

    SPARQL is a powerful query language, results serialization format, and an HTTP based data access protocol from the W3C. It provides a mechanism for accessing and integrating data across Deductive Database Systems (colloquially referred to as triple or quad stores in Semantic Web and Linked Data circles) -- database systems (or data spaces) that manage proposition oriented records in 3-tuple (triples) or 4-tuple (quads) form.

    How?

    SPARQL queries are actually HTTP payloads (typically). Thus, using a RESTful client-server interaction pattern, you can dispatch calls to a SPARQL compliant data server and receive a payload for local processing e.g. local object binding re. Ruby.

    Steps:

    1. From your command line execute: aptitude search '^ruby', to verify Ruby is in place
    2. Determine which SPARQL endpoint you want to access e.g. DBpedia or a local Virtuoso instance (typically: http://localhost:8890/sparql).
    3. If using Virtuoso, and you want to populate its quad store using SPARQL, assign "SPARQL_SPONGE" privileges to user "SPARQL" (this is basic control, more sophisticated WebID based ACLs are available for controlling SPARQL access).

    Script:

    #!/usr/bin/env ruby
    #
    # Demonstrating use of a single query to populate a Virtuoso Quad Store. 
    #
    
    require 'net/http'
    require 'cgi'
    require 'csv'
    
    #
    # We opt for CSV based output since handling this format is straightforward in Ruby, by default.
    # HTTP URL is constructed accordingly with CSV as query results format in mind.
    
    def sparqlQuery(query, baseURL, format="text/csv")
    	params={
    		"default-graph" => "",
    		"should-sponge" => "soft",
    		"query" => query,
    		"debug" => "on",
    		"timeout" => "",
    		"format" => format,
    		"save" => "display",
    		"fname" => ""
    	}
    	querypart=""
    	params.each { |k,v|
    		querypart+="#{k}=#{CGI.escape(v)}&"
    	}
      
    	sparqlURL=baseURL+"?#{querypart}"
    	
    	response = Net::HTTP.get_response(URI.parse(sparqlURL))
    
    	return CSV::parse(response.body)
    	
    end
    
    # Setting Data Source Name (DSN)
    
    dsn="http://dbpedia.org/resource/DBpedia"
    
    #Virtuoso pragmas for instructing SPARQL engine to perform an HTTP GET
    #using the IRI in FROM clause as Data Source URL
    
    query="DEFINE get:soft \"replace\"
    SELECT DISTINCT * FROM <#{dsn}> WHERE {?s ?p ?o} "
    
    #Assume use of local installation of Virtuoso 
    #otherwise you can change URL to that of a public endpoint
    #for example DBpedia: http://dbpedia.org/sparql
    
    data=sparqlQuery(query, "http://localhost:8890/sparql/")
    
    puts "Got data:"
    p data
    
    #
    # End
    

    Output

    Got data:
    [["s", "p", "o"], 
      ["http://dbpedia.org/resource/DBpedia", 
       "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", 
       "http://www.w3.org/2002/07/owl#Thing"], 
      ["http://dbpedia.org/resource/DBpedia", 
       "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", 
       "http://dbpedia.org/ontology/Work"], 
      ["http://dbpedia.org/resource/DBpedia", 
       "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", 
       "http://dbpedia.org/class/yago/Software106566077"],
    ...
    

    Conclusion

    CSV was chosen over XML (re. output format) since this is about a "no-brainer installation and utilization" guide for a Ruby developer that already knows how to use Ruby for HTTP based data access. SPARQL just provides an added bonus to URL dexterity (delivered via URI abstraction) with regards to constructing Data Source Names or Addresses.

    Related

    ]]>
    SPARQL for the Ruby Developerhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1648Tue, 25 Jan 2011 15:17:12 GMT82011-01-25T10:17:12.000002-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Linked Data is simply hypermedia-based structured data.

    Linked Data offers everyone a Web-scale, Enterprise-grade mechanism for platform-independent creation, curation, access, and integration of data.

    The fundamental steps to creating Linked Data are as follows:

    1. Choose a Name Reference Mechanism — i.e., URIs.

    2. Choose a Data Model with which to Structure your Data — minimally, you need a model which clearly distinguishes

      1. Subjects (also known as Entities)
      2. Subject Attributes (also known as Entity Attributes), and
      3. Attribute Values (also known as Subject Attribute Values or Entity Attribute Values).
    3. Choose one or more Data Representation Syntaxes (also called Markup Languages or Data Formats) to use when creating Resources with Content based on your chosen Data Model. Some Syntaxes in common use today are HTML+RDFa, N3, Turtle, RDF/XML, TriX, XRDS, GData, OData, OpenGraph, and many others.

    4. Choose a URI Scheme that facilitates binding Referenced Names to the Resources which will carry your Content -- your Structured Data.

    5. Create Structured Data by using your chosen Name Reference Mechanism, your chosen Data Model, and your chosen Data Representation Syntax, as follows:

      1. Identify Subject(s) using Resolvable URI(s).
      2. Identify Subject Attribute(s) using Resolvable URI(s).
      3. Assign Attribute Values to Subject Attributes. These Values may be either Literals (e.g., STRINGs, BLOBs) or Resolvable URIs.

    You can create Linked Data (hypermedia-based data representations) Resources from or for many things. Examples include: personal profiles, calendars, address books, blogs, photo albums; there are many, many more.
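
    As a rough illustration of the steps above (a sketch only, not a prescription), here is how they might look in Python using the rdflib library, assuming it is installed; the example.org URIs are placeholders for identifiers you would mint in a Web space you control:

    from rdflib import Graph, Literal, Namespace, URIRef

    FOAF = Namespace("http://xmlns.com/foaf/0.1/")          # shared vocabulary for Subject Attributes

    me    = URIRef("http://example.org/about#me")            # Step 5.1: Subject named via a resolvable URI
    alice = URIRef("http://example.org/people/alice#me")     # another URI-named Subject

    g = Graph()                                              # container for the EAV/graph structured data
    g.add((me, FOAF.name, Literal("Jane Doe")))              # Step 5.3: Attribute Value as a Literal
    g.add((me, FOAF.knows, alice))                           # Step 5.3: Attribute Value as a resolvable URI

    print g.serialize(format="turtle")                       # Step 3: Turtle as the chosen Representation Syntax

    Publishing the resulting document at a location the URIs above resolve to (step 4) is what turns the description into Linked Data proper.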

    Related

    1. Linked Data an Introduction -- simple introduction to Linked Data and its virtues
    2. How Data Makes Corporations Dumb -- Jeff Jonas (IBM) interview
    3. Hypermedia Types -- evolving information portal covering different aspects of Hypermedia resource types
    4. URIBurner -- service that generates Linked Data from a plethora of heterogeneous data sources
    5. Linked Data Meme -- TimbL design issues note about Linked Data
    6. Data 3.0 Manifesto -- note about format agnostic Linked Data
    7. DBpedia -- large Linked Data Hub
    8. Linked Open Data Cloud -- collection of Linked Data Spaces
    9. Linked Open Commerce Cloud -- commerce (clicks & mortar and/or clicks & clicks) oriented Linked Data Space
    10. LOD Cloud Cache -- massive Linked Data Space hosting most of the LOD Cloud Datasets
    11. LOD2 Initiative -- EU Co-Funded Project to develop global knowledge space from LOD
    12. .
    ]]>
    What is Linked Data, really?http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1645Tue, 09 Nov 2010 18:53:01 GMT22010-11-09T13:53:01-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Conflation is the tech industry's equivalent of macroeconomic inflation. Whenever it rears its head, we lose value courtesy of diminishing productivity.

    Looking retrospectively at any technology failure -- enterprises or industry at large -- you will eventually discover -- at the core -- messy conflation of at least one of the following:

    1. Data Model (Semantics)
    2. Data Object (Entity) Names (Identifiers)
    3. Data Representation Syntax (Markup)
    4. Data Access Protocol
    5. Data Presentation Syntax (Markup)
    6. Data Presentation Media.

    The Internet & World Wide Web (InterWeb) are massive successes because their respective architectural cores embody the critical separation outlined above.

    The Web of Linked Data is going to become a global reality, and massive success, because it leverages inherently sound architecture -- bar conflationary distractions of RDF. :-)

    ]]>
    6 Things That Must Remain Distinct re. Datahttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1643Thu, 04 Nov 2010 15:01:39 GMT12010-11-04T11:01:39.000002-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Linked Data is simply hypermedia-based structured data.

    Linked Data offers everyone a Web-scale, Enterprise-grade mechanism for platform-independent creation, curation, access, and integration of data.

    The fundamental steps to creating Linked Data are as follows:

    1. Choose a Name Reference Mechanism — i.e., URIs.

    2. Choose a Data Model with which to Structure your Data — minimally, you need a model which clearly distinguishes

      1. Subjects (also known as Entities)
      2. Subject Attributes (also known as Entity Attributes), and
      3. Attribute Values (also known as Subject Attribute Values or Entity Attribute Values).
    3. Choose one or more Data Representation Syntaxes (also called Markup Languages or Data Formats) to use when creating Resources with Content based on your chosen Data Model. Some Syntaxes in common use today are HTML+RDFa, N3, Turtle, RDF/XML, TriX, XRDS, GData, and OData; there are many others.

    4. Choose a URI Scheme that facilitates binding Referenced Names to the Resources which will carry your Content -- your Structured Data.

    5. Create Structured Data by using your chosen Name Reference Mechanism, your chosen Data Model, and your chosen Data Representation Syntax, as follows:

      1. Identify Subject(s) using Resolvable URI(s).
      2. Identify Subject Attribute(s) using Resolvable URI(s).
      3. Assign Attribute Values to Subject Attributes. These Values may be either Literals (e.g., STRINGs, BLOBs) or Resolvable URIs.

    You can create Linked Data (hypermedia-based data representations) Resources from or for many things. Examples include: personal profiles, calendars, address books, blogs, photo albums; there are many, many more.

    Related

    1. Hypermedia Types -- evolving information portal covering different aspects of Hypermedia resource types
    2. URIBurner -- service that generates Linked Data from a plethora of heterogeneous data sources
    3. Linked Data Meme -- TimbL design issues note about Linked Data
    4. Data 3.0 Manifesto -- note about format agnostic Linked Data
    5. DBpedia -- large Linked Data Hub
    6. Linked Open Data Cloud -- collection of Linked Data Spaces
    7. Linked Open Commerce Cloud -- commerce (clicks & mortar and/or clicks & clicks) oriented Linked Data Space
    8. LOD Cloud Cache -- massive Linked Data Space hosting most of the LOD Cloud Datasets
    9. LOD2 Initiative -- EU Co-Funded Project to develop global knowledge space from LOD
    10. .
    ]]>
    What is Linked Data, really?http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1639Tue, 15 Feb 2011 22:28:06 GMT12011-02-15T17:28:06.000002-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Deceptively simple demonstrations of how Virtuoso's SPARQL-GEO extensions to SPARQL lay a critical foundation for Geo Spatial solutions that seek to leverage the burgeoning Web of Linked Data.

    Setup Information

    SPARQL Endpoint: Linked Open Data Cache (8.5 Billion+ Quad Store which includes data from Geonames and the Linked GeoData Project Data Sets).
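
    By way of illustration only -- the endpoint URL below is a stand-in, and the query assumes the WGS84 geo vocabulary plus Virtuoso's bif:st_* geo functions -- a meshup of this kind typically rests on a query along the following lines, here issued from Python (Python 2 style):

    import urllib, json

    endpoint = "http://lod.openlinksw.com/sparql/"   # stand-in for the LOD cache SPARQL endpoint

    # Things with WGS84 coordinates within roughly 30 km of a point (here, Boston).
    query = """PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
    SELECT ?s ?lat ?long WHERE {
      ?s geo:lat ?lat ; geo:long ?long .
      FILTER (bif:st_intersects (bif:st_point (?long, ?lat),
                                 bif:st_point (-71.0589, 42.3601), 30))
    } LIMIT 10"""

    params = urllib.urlencode({"query": query, "format": "application/json"})
    print json.loads(urllib.urlopen(endpoint, params).read())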

    Live Linked Data Meshup Links:

    Related

    ]]>
    Meshups Demonstrating How SPARQL-GEO Enhances Linked Data Exploitation (Update 2)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1612Wed, 24 Mar 2010 15:44:24 GMT32010-03-24T11:44:24.000002-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Here are 5 powerful benefits you can immediately derive from the combination of Virtuoso and Amazon's AWS services (specifically the EC2 and EBS components):

    1. Acquire your own personal or service specific data space in the Cloud. Think DBase, Paradox, FoxPRO, Access of yore, but with the power of Oracle, Informix, Microsoft SQL Server etc.. using a Conceptual, as opposed to solely Logical, model based DBMS (i.e., a Hybrid DBMS Engine for: SQL, RDF, XML, and Full Text)
    2. Ability to share and control access to your resources using innovations like FOAF+SSL, OpenID, and OAuth, all from one place
    3. Construction of personal or organization based FOAF profiles in a matter of minutes; by simply creating a basic DBMS (or ODS application layer) account; and then using this profile to create strong links (references) to all your Data silos (esp. those from the Web 2.0 realm)
    4. Load data sets from the LOD cloud or Sponge existing Web resources (i.e., on the fly data transformation to RDF model based Linked Data) and then use the combination to build powerful lookup services that enrich the value of URLs (think: Web addressable reports holding query results) that you publish
    5. Bind all of the above to a domain that you own (e.g. a .Name domain) so that you have an attribution-friendly "authority" component for resource URLs and Entity URIs published from your Personal Linked Data Space on the Web (or private HTTP network).

    In a nutshell, the AWS Cloud infrastructure simplifies the process of generating Federated presence on the Internet and/or World Wide Web. Remember, centralized networking models always end up creating data silos, in some context, ultimately! :-)

    ]]>
    5 Game Changing Things about the OpenLink Virtuoso + AWS Cloud Combohttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1590Mon, 01 Feb 2010 13:59:36 GMT22010-02-01T08:59:36-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Personally, I believe that we've actually reached a watershed moment re. the evolution of the Web from a mesh of Linked Data Containers (Web of Linked Documents) to a mesh of Linked Data Items (entities or real world objects).

    The journey towards this watershed moment started with the Semantic Web Project, gained focus and pragmatism via the Linked Data meme, attained substance & credibility via efforts such as DBpedia and the resulting cloud of Open Linked Data Spaces, and finally arrived at the most important destination of all: broad comprehension and coherence, via RDFa.

    Over the years, I've chronicled the journey above via entries in this particular data space (my blog) and, most recently, via my rapid-fire comments and debates on Twitter (basically hashtag #linkeddata, account: kidehen).

    On a parallel front re. my chronicles, I've periodically had conversations with Jon Udell, who has always provided a coherent sounding board and reconciliation framework for my world views and open data access vision; naturally, this has a lot to do with his holistic grasp of the big picture issues, associated technical details, and special communication prowess :-)

    Against this backdrop, I refer you to my most recent podcast conversation with Jon, which is about how the tandem of HTML+RDFa and the GoodRelations vocabulary deliver the critical missing links re. broad comprehension of the Semantic Web vision en route to mass exploitation.

    Related

    ]]>
    Conversation with Jon Udell: Are We There Yet Re. Web++ ?http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1584Mon, 01 Feb 2010 13:58:04 GMT22010-02-01T08:58:04.000002-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Situation Analysis

    As the "Linked Data" meme has gained momentum you've more than likely been on the receiving end of dialog with Linked Open Data community members (myself included) that goes something like this:

    "Do you have a URI", "Get yourself a URI", "Give me a de-referencable URI" etc..

    And each time, you respond with a URL -- which to the best of your Web knowledge is a bona fide URI. But to your utter confusion you are told: Nah! You gave me a Document URI instead of the URI of a real-world thing or object etc..

    What's up with that?

    Well our everyday use of the Web is an unfortunate conflation of two distinct things, which have Identity: Real World Objects (RWOs) & Address/Location of Documents (Information bearing Resources).

    The "Linked Data" meme is about enhancing the Web by unobtrusively reintroducing its core essence: the generic HTTP URI, a vital piece of Web Architecture DNA. Basically, it's about fully realizing the capabilities of the Web as a platform for Open Data Identification, Definition, Access, Storage, Representation, Presentation, and Integration.

    What is a Real World Object?

    People, Places, Music, Books, Cars, Ideas, Emotions etc..

    What is a URI?

    A Uniform Resource Identifier. A global identifier mechanism for network addressable data items. Its sole function is Name oriented Identification.

    URI Generic Syntax

    The constituent parts of a URI (from the URI Generic Syntax RFC, RFC 3986) are the scheme, authority, path, query, and fragment components.
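
    One quick way to see those parts for yourself is Python's standard urlparse module (the URL below is just an illustrative string):

    from urlparse import urlparse

    # urlparse splits a URI string into its generic-syntax components.
    scheme, netloc, path, params, query, fragment = urlparse("http://example.org/people/alice?format=ttl#me")
    print scheme     # 'http'
    print netloc     # 'example.org' (the authority component)
    print path       # '/people/alice'
    print query      # 'format=ttl'
    print fragment   # 'me'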

    What is a URL?

    A location oriented HTTP scheme based URI. The HTTP scheme introduces a powerful and inherent duality that delivers:

    1. Resource Address/Location Identifier
    2. Data Access mechanism for an Information bearing Resource (Document, File etc..)

    So far so good!

    What is an HTTP based URI?

    The kind of URI Linked Data aficionados mean when they use the term: URI.

    An HTTP URI is an HTTP scheme based URI. Unlike a URL, this kind of HTTP scheme URI is devoid of any Web Location orientation or specificity. Thus, its inherent duality provides a more powerful level of abstraction. Hence, you can use this form of URI to assign Names/Identifiers to Real World Objects (RWO). Even better, courtesy of the Identity/Address duality of the HTTP scheme, a single URI can deliver the following:

    1. RWO Identifier/Name
    2. RWO Metadata document Locator (courtesy of URL aspect)
    3. Negotiable Representation of the Located Document (courtesy of HTTP's content negotiation feature).

    What is Metadata?

    Data about Data. Put differently, data that describes other data in a structured manner.

    How Do we Model Metadata?

    The predominant model for metadata is the Entity-Attribute-Value + Classes & Relationships model (EAV/CR). A model that's been with us since the inception of modern computing (long before the Web).

    What about RDF?

    The Resource Description Framework (RDF) is a framework for describing Web addressable resources. In a nutshell, it's a framework for adding Metadata bearing Information Resources to the current Web. It's comprised of:

    1. Entity-Attribute-Value (aka. Subject-Predicate-Object) plus Classes & Relationships (Data Dictionaries e.g., OWL) metadata model
    2. A plethora of instance data representation formats that include: RDFa (when doing so within (X)HTML docs), Turtle, N3, TriX, RDF/XML etc.

    What's the Problem Today?

    The ubiquitous use of the Web is primarily focused on a Linked Mesh of Information bearing Documents. URLs, rather than generic HTTP URIs, are the prime mechanism for weaving the Web's tapestry; basically, we use URLs to conduct Information -- which is inherently subjective -- instead of using HTTP URIs to conduct "Raw Data" -- which is inherently objective.

    Note: Information is "data in context", it isn't the same thing as "Raw Data". Thus, if we can link to Information via the Web, why shouldn't we be able to do the same for "Raw Data"?

    How Does the Linked Data meme solve the problem?

    The meme simply provides a set of guidelines (best practices) for producing Web architecture friendly metadata. Meaning: when producing EAV/CR model based metadata, endow Subjects, their Attributes, and (optionally) Attribute Values with HTTP URIs. By doing so, a new level of Link Abstraction on the Web is possible, i.e., "Data Item to Data Item" level links (aka hyperdata links). Even better, when you de-reference a RWO hyperdata link you end up with a negotiated representation of its metadata.
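
    A quick sketch of that last point, from Python (Python 2 style), using a DBpedia entity URI; any RWO URI published per these guidelines should behave the same way, and RDF/XML is just one of several negotiable representations:

    import urllib2

    rwo_uri = "http://dbpedia.org/resource/DBpedia"        # HTTP URI naming a Thing, not a page

    # Ask for an RDF/XML representation of the Thing's description.
    req = urllib2.Request(rwo_uri, headers={"Accept": "application/rdf+xml"})
    doc = urllib2.urlopen(req)                             # the server redirects to the metadata document

    print doc.geturl()                                     # address of the document describing the Thing
    print doc.read()[:500]                                 # first few hundred bytes of its description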

    Conclusion

    Linked Data is ultimately about an HTTP URI for each item in the Data Organization Hierarchy :-)

    Related

    1. History of how "Resource" became part of URI - historic account by TimBL
    2. Linked Data Design Issues Document - TimBL's initial Linked Data Guide
    3. Linked Data Rules Simplified - My attempt at simplifying the Linked Data Meme without SPARQL & RDF distraction
    4. Linked Data & Identity - another related post
    5. The Linked Data Meme's Value Proposition
    6. So What Does "HREF" stand for anyway?
    7. My Del.icio.us hosted Bookmark Data Space for Identity Schemes
    8. TimBL's Ted Talk re. "Raw Linked Data"
    9. Resource Oriented Architecture
    10. More Famous Than Simon Cowell .
    ]]>
    The URI, URL, and Linked Data Meme's Generic HTTP URI (Updated)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1567Sun, 28 Mar 2010 16:19:00 GMT62010-03-28T12:19:00-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    As espoused by the Ubuntu philosophy, no Human is an Island. Although the objects of our sociality are vast and varied, the basic foundation still centers on the pursuit and/or delivery of products and services.

    Today, we put stuff on the Web because we want it to be discovered as part of a "sharing act". Likewise, we make regular use of Search Engine Services because we want to "Find" stuff in a productive manner.

    Putting the above in context, you don't need to be Einstein to figure out that, to date, the Web hasn't enabled vendors to describe their products and services clearly. Likewise, it hasn't enabled us to describe what we want, when we want it, how much we are willing to pay, etc. Basically, the SDQ (Serendipitous Discovery Quotient) of Web Content is excruciatingly low!

    The Linked Data meme is about using the essence of the Web -- HTTP URIs -- as the mechanism for conducting data across the Web that unambiguously unveils basic things like:

    1. Using a personal profile to describe exactly who I am, my interests, favorite things, what I want (wishlist), what I have to offer (offerlist) etc.
    2. Using a company profile to describe my entire product catalog, inventory levels, store locations, distributor and reseller networks, feature specs, price specs, deal terms and duration, and even opening and closing hours.

    Conclusions

    A Web of Linked Data enables a complete redefinition of eCommerce, and that's just for starters :-)

    Related

    ]]>
    Why Do We Put Stuff On The Web, Really?http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1566Sat, 25 Jul 2009 01:00:21 GMT12009-07-24T21:00:21-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    What is Linked Data?

    The primary topic of a meme penned by TimBL in the form of a Design Issues Doc (note: this is how TimBL has shared his thoughts since the Beginning of the Web).

    There are a number of dimensions to the meme, but its primary purpose is the reintroduction of the HTTP URI -- a vital component of the Web's core architecture.

    What's Special about HTTP URIs?

    They possess an intrinsic duality that combines persistent and unambiguous Data Identity with platform & representation format independent Data Access. Thus, you can use a string of characters that look like a contemporary Web URL to unambiguously achieve the following:

    1. Identify or Name Anything of Interest
    2. Describe Anything of Interest by associating the Description Subject's Identity with a constellation of Attribute and Value pairs (technically: an Entity-Attribute-Value or Subject-Predicate-Object graph)
    3. Make the Description of Named Things of Interest discoverable on the Web by implicitly binding the aforementioned to Documents that hold their descriptions (technically: metadata documents or information resources)

    What's the basic value proposition of the Linked Data meme?

    Enabling more productive use of the Web by users and developers alike. All of which is achieved by tweaking the Web's Hyperlinking feature such that it now includes Hypertext and Hyperdata as link types.

    Note: Hyperdata Linking is simply what an HTTP URI facilitates.

    Examples of problems solved by injecting Linked Data into the Web:

    1. Federated Identity by enabling Individuals to unambiguously Identify themselves (Profiles++) courtesy of existing Internet and Web protocols (e.g., FOAF+SSL's WebIDs which combine Personal Identity with X.509 certificates and HTTPs based client side certification)
    2. Security and Privacy challenge alleviation by delivering a mechanism for policy based data access that feeds off federated individual identity and social network (graph) traversal
    3. Spam Busting via the above
    4. .
    5. Increasing the Serendipitous Discovery Quotient (SDQ) of Web accessible resources by embedding Rich Metadata into (X)HTML Documents e.g., structured descriptions of your "WishLists" and "OfferLists" via a common set of terms offered by vocabularies such as GoodRelations and SIOC
    6. Coherent integration of disparate data across the Web and/or within the Enterprise via "Data Meshing" rather than "Data Mashing"
    7. Moving beyond imprecise statistically driven "Keyword Search" (e.g. Page Rank) to "Precision Find" driven by typed link based Entity Rank plus Entity Type and Entity Property filters.

    Conclusion

    If all of the above still falls into the technical mumbo-jumbo realm, then simply consider Linked Data as delivering Open Data Access in granular form to Web accessible data -- that goes beyond data containers (documents or files).

    The value proposition of Linked Data is inextricably linked to the value proposition of the World Wide Web. This is true, because the Linked Data meme is ultimately about an enhancement of the current Web; achieved by reintroducing its architectural essence -- in new context -- via a new level of link abstraction, courtesy of the Identity and Access duality of HTTP URIs.

    As a result of Linked Data, you can now have Links on the Web for a Person, Document, Music, Consumer Electronics, Products & Services, Business Opening & Closing Hours, Personal "WishLists" and "OfferList", an Idea, etc.. in addition to links for Properties (Attributes & Values) of the aforementioned. Ultimately, all of these links will be indexed in a myriad of ways providing the substrate for the next major period of Internet & Web driven innovation, within our larger human-ingenuity driven innovation continuum.

    Related

    ]]>
    Exploring the Value Proposition of Linked Datahttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1565Fri, 24 Jul 2009 12:20:01 GMT22009-07-24T08:20:01-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    We have reached a beachhead re. the Virtuoso instance hosting the Linked Open Data (LOD) Cloud; meaning, we are not going to be performing any major updates and deletions short-term, bar incorporation of fresh data sets from the Freebase and Bio2RDF projects (both communities are prepping new RDF data sets).

    At the current time we have loaded 100% of all the very large data sets from the LOD Cloud. As a result, we can start the process of exposing Linked Data virtues in a manner that's palatable to users, developers, and database professionals across the Web 1.0, 2.0, and 3.0 spectrums.

    What does this mean?

    You can use the "Search & Find", "URI Lookup", or SPARQL endpoints associated with the LOD cloud hosting instance to perform the following tasks:

    1. Find entities associated with full text search patterns -- Google Style, but with Entity & Text proximity Rank instead of Page Rank, since we are dealing with Entities rather than documents about entities
    2. Find and Lookup entities by Identifier (URI) -- which is helpful when locating URIs to use for identifying entities in your own linked data spaces on the Web
    3. View entity descriptions via a variety of representation formats (HTML, RDFa, RDF/XML, N3, Turtle etc.)
    4. Determine uses of entity identifiers across the LOD cloud -- which helps you select preferred URIs based on usage statistics.

    What does it offer Web 1.0 and 2.0 developers?

    If you don't want to use the SPARQL based Web Service, or other Linked Data Web oriented APIs for interacting with the LOD cloud programmatically, you can simply use the powerful REST style Web Service that provides URL parameters for performing full text oriented "Search", entity oriented "Find" queries, and faceted navigation over the huge data corpus with results data returned in JSON and XML formats.
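
    For a flavor of what that looks like from code -- a hedged sketch, with the endpoint URL below standing in for the instance's actual SPARQL endpoint, and bif:contains being Virtuoso's full-text predicate -- an entity-plus-text "Find" boils down to something like this in Python (Python 2 style):

    import urllib, json

    endpoint = "http://lod.openlinksw.com/sparql/"   # stand-in for the instance's SPARQL endpoint

    # Entity "Find" via full text: entities whose labels contain the word "DBpedia".
    query = """PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT DISTINCT ?s ?label WHERE {
      ?s rdfs:label ?label .
      ?label bif:contains "DBpedia" .
    } LIMIT 10"""

    params = urllib.urlencode({"query": query, "format": "application/json"})
    results = json.loads(urllib.urlopen(endpoint, params).read())
    for row in results["results"]["bindings"]:
        print row["s"]["value"], "=>", row["label"]["value"]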

    Next Steps:

    Amazon has agreed to add all the LOD Cloud data sets to its existing public data sets collection. Thus, the data sets we are loading will be available in "raw data" (RDF) format on the public data sets page via named Elastic Block Storage (EBS) Snapshots; meaning, you can make an EC2 AMI (e.g., Linux, Windows, or Solaris), install an RDF quad or triple store of choice into your AMI, then simply load data from the LOD cloud based on your needs.

    In addition to the above, we are also going to offer a Virtuoso 6.0 Cluster Edition based LOD Cloud AMI (as we've already done with DBpedia, MusicBrainz, NeuroCommons, and Bio2RDF) that will enable you to instantiate a personal and service specific edition of Virtuoso with all the LOD data in place and fully tuned for performance and scalability; basically, you will simply press "Instantiate AMI" and a LOD cloud data space, in true Linked Data form, will be at your disposal within minutes (i.e., the time it takes the DB to start).

    Work on the migration of the LOD data to EC2 starts this week. Thus, if you are interested in contributing an RDF based data set to the LOD cloud, now is the time to get your archive links in place (see: the ESW Wiki page for LOD Data Sets).

    ]]>
    Live Virtuoso instance hosting Linked Open Data (LOD) Cloudhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1539Wed, 01 Apr 2009 18:26:22 GMT22009-04-01T14:26:22.000002-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    This post is a reply to Jason Kolb's post titled: Using Advertising to Take Over the World. Jason's post is a response to Robert Scoble's post titled: Why Facebook has never listened and why it definitely won’t start now.

    Jason:

    Scoble is sensing what comes next, but in my opinion, describes it using an old obtrusive advertising model anecdote.

    I've penned a post or two about the "Magic of You" which is all about the new Web power broker (Entity: "You").

    Personally, I've long envisaged a complete overhaul of advertising where obtrusive advertising simply withers away; ultimately replaced by an unobtrusive model that is driven by individualized relevance and high doses of serendipity. Basically, this is ultimately about "taking the Ad out of item placement in Web pages".

    The fundamental ingredients of an unobtrusive advertising landscape would include the following Human facts:

    1. We are social beings and need stuff from time to time
    2. We know what we need and would like to "Find stuff" when we are in "I Need Stuff" mode.

    Ideally, we would like to be able to simply state the following, via a Web accessible profile:

    1. Here are my "Wants" or "Needs" (my Wish-List)
    2. Here are the products and services that I "Offer" (my Offer-List).

    Now put the above into the context of an evolving Web where data items are becoming more visible by the second, courtesy of the "Linked Data" meme. Thus, things that weren't discernable via the Web: "People", "Places", "Music", "Books", "Products", etc., become much easier to identify and describe.

    Assuming the comments above hold true re. the Web's evolution into a collection of Linked Data Spaces, and the following occur:

    1. Structured profile pages become the basic units of Web presence
    2. Wish-Lists and Offer-Lists are exposed by profile pages

    Wish-Lists and Offer-Lists will gradually start bonding with increasing degrees of serendipity courtesy of exponential growth in Linked Data Web density.

    So based on what I've stated so far, Scoble would simply browse the Web or visit his profile page, and in either scenario enjoy a "minority report" style of experience albeit all under his control (since he is the one driving his Web user agent).

    What I describe above simply comes down to "Wish-lists" and associated recommendations becoming the norm outside the confines of Amazon's data space on the Web. Serendipitous discovery, intelligent lookups, and linkages are going to be the fundamental essence of Linked Data Web oriented applications, services, agents.

    Beyond Scoble, it's also important to note that access to data will be controlled by entity "You". Your data space on the Web will be something you will control access to in a myriad of ways, and it will include the option to provide licensed access to commercial entities on your terms. Naturally, you will also determine the currency that facilitates the value exchange :-)

    Related

    ]]>
    How Linked Data will change Advertisinghttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1534Wed, 25 Mar 2009 12:30:58 GMT32009-03-25T08:30:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Here is a tabulated "compare and contrast" of Web usage patterns 1.0, 2.0, and 3.0.

    Dimension | Web 1.0 | Web 2.0 | Web 3.0
    Simple Definition | Interactive / Visual Web | Programmable Web | Linked Data Web
    Unit of Presence | Web Page | Web Service Endpoint | Data Space (named structured data enclave)
    Unit of Value Exchange | Page URL | Endpoint URL for API | Resource / Entity / Object URI
    Data Granularity | Low (HTML) | Medium (XML) | High (RDF)
    Defining Services | Search | Community (Blogs to Social Networks) | Find
    Participation Quotient | Low | Medium | High
    Serendipitous Discovery Quotient | Low | Medium | High
    Data Referencability Quotient | Low (Documents) | Medium (Documents) | High (Documents and their constituent Data)
    Subjectivity Quotient | High | Medium (from A-list bloggers to select source and partner lists) | Low (everything is discovered via URIs)
    Transclusence | Low | Medium (Code driven Mashups) | High (Data driven Meshups)
    What You See Is What You Prefer (WYSIWYP) | Low | Medium | High (negotiated representation of resource descriptions)
    Open Data Access (Data Accessibility) | Low | Medium (Silos) | High (no Silos)
    Identity Issues Handling | Low | Medium (OpenID) | High (FOAF+SSL)
    Solution Deployment Model | Centralized | Centralized with sprinklings of Federation | Federated with function specific Centralization (e.g. Lookup hubs like LOD Cloud or DBpedia)
    Data Model Orientation | Logical (Tree based DOM) | Logical (Tree based XML) | Conceptual (Graph based RDF)
    User Interface Issues | Dynamically generated static interfaces | Dynamically generated interfaces with semi-dynamic interfaces (courtesy of XSLT or XQuery/XPath) | Dynamic Interfaces (pre- and post-generation) courtesy of self-describing nature of RDF
    Data Querying | Full Text Search | Full Text Search | Full Text Search + Structured Graph Pattern Query Language (SPARQL)
    What Each Delivers | Democratized Publishing | Democratized Journalism & Commentary (Citizen Journalists & Commentators) | Democratized Analysis (Citizen Data Analysts)
    Star Wars Edition Analogy | Star Wars (original fight for decentralization via rebellion) | Empire Strikes Back (centralization and data silos make comeback) | Return of the JEDI (FORCE emerges and facilitates decentralization from "Identity" all the way to "Open Data Access" and "Negotiable Descriptive Data Representation")

    Naturally, I am not expecting everyone to agree with me. I am simply making my contribution to what will remain fascinating discourse for a long time to come :-)

    Related

    ]]>
    Simple Compare & Contrast of Web 1.0, 2.0, and 3.0 (Update 1)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1531Wed, 29 Apr 2009 17:21:25 GMT62009-04-29T13:21:25.000004-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    I've just read James Governor's insightful post titled: Why Applications Are Like Fish and Data is Like Wine, where he sums up the comparative value of applications (code containers) and data as follows:

    "Only one improves with age. With apologies to the originator of the phrase - “Hardware is like fish, operating systems are like wine.”"

    Yes! Applications are like Fish and Data like Wine, which is basically what Linked Data is fundamentally about, especially when you inject memes such as "Cool URIs" into the mix. Remember, the essence of Linked Data is all about a Web of Linked Data Objects endowed with Identifiers that don't change i.e., they occupy one place in public (e.g. World Wide Web) or private (your corporate Intranet or Extranet) networks, keeping the data that they expose relevant (as in fresh), accessible, and usable in many forms courtesy of the data access & representation dexterity that HTTP facilitates, when incorporated into object identifiers.

    Here is another excerpt from his post that rings true (amongst many others):

    What am I talking about? Processes change, and need to change. Baking data into the application is a bad idea because the data can’t then be extended in useful, and “unexpected ways”. But not expecting corporate data to be used in new ways is kind of like not expecting the Spanish Inquisition. But… “NOBODY expects the Spanish Inquisition! Amongst our weaponry are such diverse elements as: fear, surprise, ruthless efficiency, an almost fanatical devotion to the Pope.” (sounds like Enterprise Architecture ...).

    Related

    ]]>
    Cool URIs, Fish, and Winehttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1497Fri, 23 Jan 2009 22:22:00 GMT12009-01-23T17:22:00.000005-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    What is it?

    A pre-installed edition of Virtuoso for Amazon's EC2 Cloud platform.

    What does it offer?

    From a Web Entrepreneur perspective it offers:
    1. Low cost entry point to a game-changing Web 3.0+ (and beyond) platform that combines SQL, RDF, XML, and Web Services functionality
    2. Flexible variable cost model (courtesy of EC2 DevPay) tightly bound to revenue generated by your services
    3. Delivers federated and/or centralized model flexibility for your SaaS based solutions
    4. Simple entry point for developing and deploying sophisticated database driven applications (SQL or RDF Linked Data Web oriented)
    5. Complete framework for exploiting OpenID, OAuth (including Role enhancements) that simplifies exploitation of these vital Identity and Data Access technologies
    6. Easily implement RDF Linked Data based Mail, Blogging, Wikis, Bookmarks, Calendaring, Discussion Forums, Tagging, Social-Networking as Data Space (data containers) features of your application or service offering
    7. Instant alleviation of challenges (e.g. service costs and agility) associated with Data Portability and Open Data Access across Web 2.0 data silos
    8. LDAP integration for Intranet / Extranet style applications.

    From the DBMS engine perspective it provides you with one or more pre-configured instances of Virtuoso that enable immediate exploitation of the following services:

    1. RDF Database (a Quad Store with SPARQL & SPARUL Language & Protocol support)
    2. SQL Database (with ODBC, JDBC, OLE-DB, ADO.NET, and XMLA driver access)
    3. XML Database (XML Schema, XQuery/Xpath, XSLT, Full Text Indexing)
    4. Full Text Indexing.

    From a Middleware perspective it provides:

    1. RDF Views (Wrappers / Semantic Covers) over SQL, XML, and other data sources accessible via SOAP or REST style Web Services
    2. Sponger Service for converting non RDF information resources into RDF Linked Data "on the fly" via a large collection of pre-installed RDFizer Cartridges.

    From the Web Server Platform perspective it provides an alternative to LAMP stack components such as MySQL and Apache by offering:

    1. HTTP Web Server
    2. WebDAV Server
    3. Web Application Server (includes PHP runtime hosting)
    4. SOAP or REST style Web Services Deployment
    5. RDF Linked Data Deployment
    6. SPARQL (SPARQL Query Language) and SPARUL (SPARQL Update Language) endpoints
    7. Virtuoso Hosted PHP packages for MediaWiki, Drupal, Wordpress, and phpBB3 (just install the relevant Virtuoso Distro. Package).

    From the general System Administrator's perspective it provides:

    1. Online Backups (Backup Set dispatched to S3 buckets, FTP, or HTTP/WebDAV server locations)
    2. Synchronized Incremental Backups to Backup Set locations
    3. Backup Restore from Backup Set location (without exiting to EC2 shell).

    Higher level user oriented offerings include:

    1. OpenLink Data Explorer front-end for exploring the burgeoning Linked Data Web
    2. Ajax based SPARQL Query Builder (iSPARQL) that enables SPARQL Query construction by Example
    3. Ajax based SQL Query Builder (QBE) that enables SQL Query construction by Example.

    For Web 2.0 / 3.0 users, developers, and entrepreneurs, it offers Distributed Collaboration Tools & Social Media realm functionality, courtesy of ODS, which includes:

    1. Point of presence on the Linked Data Web that meshes your Identity and your Data via URIs
    2. System generated Social Network Profile & Contact Data via FOAF
    3. System generated SIOC (Semantically Interconnected Online Community) Data Space (that includes a Social Graph) exposing all your Web data in RDF Linked Data form
    4. System generated OpenID and automatic integration with FOAF
    5. Transparent Data Integration across Facebook, Digg, LinkedIn, FriendFeed, Twitter, and any other Web 2.0 data space equipped with RSS / Atom support and/or REST style Web Services
    6. In-built support for SyncML which enables data synchronization with Mobile Phones.

    How Do I Get Going with It?

    ]]>
    Introducing Virtuoso Universal Server (Cloud Edition) for Amazon EC2http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1489Fri, 28 Nov 2008 21:06:02 GMT22008-11-28T16:06:02.000006-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Orri Erling (Program Manager: OpenLink Virtuoso) has dropped a well explained reiteration of the essence of the "Linked Data Web" or "Data Web" with an emphasis on the business value. His post is titled: State of the Semantic Web (Part 1) - Sociology, Business, and Messaging.

Typically, Orri's posts are targeted at the hard core RDF and SQL DBMS audiences, but in this particular post, he shoots straight at the business community, revealing "Opportunity Cost" containment as the invisible driver behind the business aspects of any market inflection.

    Remember, the Web isn't ubiquitous because its users mastered the mechanics and virtues of HTML and/or HTTP. Web ubiquity is a function of the opportunity cost of not being on the Web, courtesy of the network effects of hyperlinked documents -- i.e., the instant gratification of traversing documents on the Web via a single click action. In similar fashion, the Linked Data Web's ubiquity will simply come down to the opportunity cost of not being "inside the Web", courtesy of the network effects of hyperlinked entities (documents, people, music, books, and other "Things").

    Here are some excerpts from Orri's post:

    Every time there is a major shift in technology, this shift needs to be motivated by addressing a new class of problem. This means doing something that could not be done before. The last time this happened was when the relational database became the dominant IT technology. At that time, the questions involved putting the enterprise in the database and building a cluster of line of business applications around the database. The argument for the RDBMS was that you did not have to constrain the set of queries that might later be made, when designing the database. In other words, it was making things more ad hoc. This was opposed then on grounds of being less efficient than the hierarchical and network databases which the relational eventually replaced. Today, the point of the Data Web is that you do not have to constrain what your data can join or integrate with, when you design your database. The counter-argument is that this is slow and geeky and not scalable. See the similarity? A difference is that we are not specifically aiming at replacing the RDBMS. In fact, if you know exactly what you will query and have a well defined workload, a relational representation optimized for the workload will give you about 10x the performance of the equivalent RDF warehouse. OLTP remains a relational-only domain. However, when we are talking about doing queries and analytics against the Web, or even against more than a handful of relational systems, the things which make RDBMS good become problematic.

If we think about Web 1.0 as a period where the distinguishing noun was: "Author", and Web 2.0 the noun: "Journalist", we should be able to see that what comes next is the noun: "Analyst". This new generation analyst would be equipped with de-referencable Web Identity courtesy of their Person Entity URI. The analyst's URI would also be a critical component of a Web-based, low-cost attribution ecosystem; one that ultimately turns the URI into the analyst's brand emblem / imprint.

    Related

    ]]>
    The Virtuous Web of Linked Data -- Business Perspective (Updated)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1462Fri, 24 Oct 2008 18:49:18 GMT22008-10-24T14:49:18-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    In response to the "Semantic Web Technology" application classification scheme espoused by ReadWriteWeb (RWW), emphasized in the post titled: Where are all the RDF-based Semantic Web Apps?, here is my attempt to clarify and reintroduce what OpenLink Software offers (today) in relation to Semantic Web technology.

Starting from the RWW "Top-Down" category, which I interpret as: technologies that produce RDF from non-RDF data sources, our product portfolio comprises the following: Virtuoso Universal Server, OpenLink Data Spaces, OpenLink Ajax Toolkit, and OpenLink Data Explorer (which includes Ubiquity commands).

    Virtuoso Universal Server functionality summary:

    1. Generation of RDF Linked Data Views of SQL, XML, and Web Services in general
    2. Deployment of RDF Linked Data
    3. "On the Fly" generation of RDF Linked Data from Document Web information resources (i.e. distillation of entities from their containers e.g. Web pages) via Cartridges / Drivers
    4. SPARQL query language support
5. SPARQL extensions that bring SPARQL closer to SQL, e.g. Aggregates, Update, Insert, Delete, and Named Graph support (i.e. use of logical names to partition RDF data within Virtuoso's multi-model DBMS engine)
    6. Inference Engine (currently in use re. DBpedia via Yago and UMBEL)
7. Hosts and exposes data from Drupal, Wordpress, MediaWiki, phpBB3 as RDF Linked Data via in-built support for the PHP runtime
    8. Available as an EC2 AMI
    9. etc..
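To make item 5 a little more concrete, here is an illustrative sketch (not an official example) of the kind of aggregate query over a named graph that those SPARQL extensions enable. It is shown against the public DBpedia endpoint that Virtuoso hosts; the endpoint URL, the graph name, and the use of Python's standard library are my own assumptions for illustration.

    import json
    import urllib.parse
    import urllib.request

    ENDPOINT = "http://dbpedia.org/sparql"  # public Virtuoso-hosted endpoint

    # Aggregate (COUNT / GROUP BY) over a named graph -- extensions of the kind
    # listed in item 5. Public endpoints may be slow for queries like this.
    query = """
    SELECT ?type (COUNT(?s) AS ?instances)
    FROM <http://dbpedia.org>
    WHERE { ?s a ?type }
    GROUP BY ?type
    ORDER BY DESC(?instances)
    LIMIT 5
    """

    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": query})
    req = urllib.request.Request(url, headers={"Accept": "application/sparql-results+json"})
    with urllib.request.urlopen(req) as resp:
        for row in json.load(resp)["results"]["bindings"]:
            print(row["type"]["value"], row["instances"]["value"])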

    OpenLink Data Spaces functionality summary:

    1. Simple mechanism for Linked Data Web enabling yourself by giving you an HTTP based User ID (a de-referencable URI) that is linked to a FOAF based Profile page and OpenID
2. Binds all your data sources (blogs, wikis, bookmarks, photos, calendar items, etc.) to your URI so you can "Find" things by only remembering your URI
    3. Makes your profile page and personal URI the focal point of Linked Data Web presence
    4. Delivers Data Portability (using data access by value or data access by reference) across data silos (e.g. Web 2.0 style social networks)
5. Allows you to make annotations about anything in your own Data Space(s) on the Web without exposure to RDF markup
    6. A Briefcase feature that provides a WebDAV driven RDF Linked Data variant of functionality seen in Mac OS X Spotlight and WinFS with the addition of SPARQL compliance
    7. Automatically generates RDFa in its (X)HTML pages
8. Blog, Wiki, WebDAV File Server, Shared Bookmarks, Calendar, and other applications that look and feel like their Web 2.0 counterparts but emit RDF Linked Data amongst a plethora of data exchange formats
    9. Available as an EC2 AMI
    10. etc..
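A quick way to see items 1, 3, and 8 from the outside is to dereference an ODS personal URI and ask for RDF instead of HTML via content negotiation. A minimal sketch follows; the URI is a placeholder standing in for your own ODS profile URI.

    import urllib.request

    # Placeholder: substitute the URI of your own ODS profile / Data Space.
    PROFILE_URI = "http://example.org/dataspace/person/you"

    # Asking for RDF/XML; the same URI is served as (X)HTML in a browser.
    req = urllib.request.Request(
        PROFILE_URI, headers={"Accept": "application/rdf+xml"}
    )
    with urllib.request.urlopen(req) as resp:
        print("Content-Type:", resp.headers.get("Content-Type"))
        print(resp.read()[:400].decode("utf-8", errors="replace"))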

    OpenLink Ajax Toolkit functionality summary:

    1. Provides binding to SQL, RDF, XML, and Web Services via Ajax Database Connectivity Layer (you only need an ODBC, JDBC, OLE-DB, ADO.NET, XMLA Driver, or Web Service on the backend for dynamic data access from Javascript)
    2. All controls are Ajax Database Connectivity bound (widgets get their data from Ajax Database Connectivity data sources)
    3. Bundled with Virtuoso and ODS installations.
    4. etc.

    OpenLink Data Explorer functionality summary

    1. Distills entities associated with information resource style containers (e.g. Web Pages or files) as RDF Linked Data
    2. Exposes the RDF based Linked Data graph associated with information resources (see the Linked Data behind Web pages)
    3. Ubiquity commands for invoking the above
    4. Available as a Hosted Service or Firefox Extension
    5. Bundled with Virtuoso and ODS installations
    6. etc.

    Note:

Of course, you could have simply looked up OpenLink Software's FOAF based Profile page (*note the Linked Data Explorer tab*), or simply passed the FOAF profile page URL to a Linked Data aware client application such as: OpenLink Data Explorer, Zitgist Data Viewer, Marbles, or Tabulator, and obtained the same information. Remember, OpenLink Software is an Entity of Type: foaf:Organization, on the burgeoning Linked Data Web :-)

    Related

    ]]>
    Where Are All the RDF-based Semantic Web Applications?http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1447Thu, 02 Oct 2008 19:27:41 GMT42008-10-02T15:27:41-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    All about Data Dictionary issues

Over-emphasis on Description Logics (RDFS, OWL, Inference & Reasoning, etc.) matters without any actual real-world instance data (e.g., lots of reasoning over RDF in zip files or on local drives).

    Image

    All about Linking Openly accessible RDF Data Sets

    Over emphasis on Instance Data without Data Dictionary appreciation and utilization (e.g., Linked Data instance level linkage via "owl:sameAs").

    Image

    All about Applications & Frameworks

    Here we are dealing with numerous applications and frameworks that inextricably bind Instance Data Management and Data Dictionaries. Basically, an all or nothing proposition, if you want to delve into the RDF Linked Data solutions realm.

    Image

Often overlooked is the fact that the Linked Data Web - as an aspect of the Semantic Web innovation continuum - is fundamentally about designing and constructing an "Open World" compatible DBMS for the Internet. Thus, erstwhile "Closed World" DBMS components such as Data Dictionaries (handlers of Data Definition, Referential Integrity, etc.) and actual Instance Data are now distributed and loosely coupled. Your data could be in one Data Space while the data dictionary resides in another. In actual fact, you could have several loosely bound data dictionaries that serve the specific Inference and Reasoning needs of a variety of applications, services, or agents.
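As a loose sketch of that separation, consider a client that pulls instance data from one Data Space and the ontology (data dictionary) from another, meshing them only at query time. The URLs below are placeholders, and the rdflib package is used purely for illustration.

    from rdflib import Graph
    from rdflib.namespace import RDF, RDFS

    # Placeholders: instance data lives in one Data Space, its data dictionary
    # (ontology) in another.
    g = Graph()
    g.parse("http://example.org/data/instances.ttl", format="turtle")
    g.parse("http://example.org/ontology/schema.ttl", format="turtle")

    # With both meshed locally, class-hierarchy facts from the dictionary can be
    # consulted alongside the instance data they describe.
    for cls, parent in g.subject_objects(RDFS.subClassOf):
        members = list(g.subjects(RDF.type, cls))
        print(cls, "subClassOf", parent, "-", len(members), "instances")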

    Image]]>
    Semantic Web: Travails to Harmony Illustrated (Updated)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1444Sun, 28 Sep 2008 19:18:53 GMT22008-09-28T15:18:53-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Here is another "Linked Discourse" effort via a blog post that attempts to add perspective to a developing Web based conversation. In this case, the conversation originates from Juan Sequeda's recent interview with Jana Thompson titled: Is the Semantic Web necessary (and feasible)?

    Jana: What are the benefits you see to the business community in adopting semantic technology?

Me: Exposure and exploitation of an untapped treasure trove of interlinked data, information, and knowledge across disparate IT infrastructure, via conceptual entry points (Entity IDs / URIs / Data Source Names) that I refer to as "Context Lenses".


    Jana: Do you think these benefits are great enough for businesses to adopt the changes?

    Me: Yes, infrastructural heterogeneity is a fact of corporate life (growth, mergers, acquisitions etc). Any technology that addresses these challenges is extremely important and valuable. Put differently, the opportunity costs associated with IT infrastructural heterogeneity remains high!


    Jana: How large do you think this impact will actually be?

Me: Huge. Enterprises have been aware of their data, information, and knowledge treasure troves for eons. Tapping into these via a materialization of the "information at your fingertips" vision is something they've simply been waiting to pursue, without any platform lock-in, for as long as I've been in this industry.


    Jana: I’ve heard, from contacts in the Bay Area, that they are skeptical of how large this impact of semantic technology will actually be on the web itself, but that the best uses of the technology are for fields such as medical information, or as you mentioned, geo-spatial data.

Me: Unfortunately, those people aren't connecting the Semantic Web with open access to heterogeneous data sources, or with the intrinsic value of holistic exploration of entity based data networks (aka Linked Data).


    Jana: Are semantic technologies going to be part of the web because of people championing the cause or because it is actually a necessary step?

    Me: Linked Data technology on the Web is a vital extension of the current Web. Semantic Technology without the "Web" component, or what I refer to as "Semantics Inside only" solutions, simply offer little or no value as Web enhancements based on their incongruence with the essence of the Web i.e., "Open Linkage" and no Silos! A nice looking Silo is still a Silo.


    Jana: In the early days of the web, there was an explosion of new websites, due to the ease of learning HTML, from a business to a person to some crackpot talking about aliens. Even today, CSS and XHTML are not so difficult to learn that a determined person can’t learn them from W3C or other tutorials easily. If OWL becomes the norm for websites, what do you think the effects will be on the web? Do you think it is easy enough to learn that it will be readily adopted as part of the standard toolkit for web developers for businesses?

Me: Correction: learning HTML had nothing to do with the Web's success. The value proposition of the Web simply reached critical mass, and you simply couldn't afford to not be part of it. The easiest route to joining the Web juggernaut was a Web Page hosted on a Web Site. The question right now is: what's the equivalent driver for the Linked Data Web, bearing in mind the initial Web bootstrap? My answer is simply this: Open Data Access, i.e., getting beyond the data silos that have inadvertently emerged from Web 2.0.


    Jana: Following the same theme, do you think this will lead to an internet full of corporate-controlled websites, with sites only written by developers rather than individuals?

Me: Not at all, we will have an Internet owned by its participants, i.e., you and the agents that work on your behalf.


    Jana: So, you are imagining technologies such as Drupal or Wordpress, that allow users to manage sites without a great deal of knowledge of the nuts and bolts of current web technologies?

    Me: Not at all! I envisage simple forms that provide conduits to powerful meshes of interlinked data spaces associated with Web users.


    Jana: Given all of the buzz, and my own familiarity with ontology, I am just very curious if the semantic web is truly necessary?

Me: This question is no different than saying: I hear the Web is becoming a Database, and I wonder if a Data Dictionary is necessary, or even if access to structured data is necessary. It's also akin to saying: I accept "Search" as my only mechanism for Web interaction even though, in reality, I really want to be able to "Find" and "Process" relevant things at a quicker rate than I do today, relative to the amount of information, and information processing time, at my disposal.


    Jana: Will it be worth it to most people to go away from the web in its current form, with keyword searches on sites like Google, to a richer and more interconnected internet with potentially better search technology?

    Me: As stated above, we need to add "Find" to the portfolio of functions we seek to perform against the Web. "Finding" and "Searching" are mutually inclusive pursuits at different ends of an activity spectrum.


    Jana: For our more technical readers, I have a few additional questions: If no standardization comes about for mapping relational databases to domain ontologies, how do you see that as influencing the decisions about adoption of semantic technology by businesses? After all, the success of technology often lives or dies on its ease of adoption.

Me: Standardization of RDBMS to RDF Mapping is not the critical success factor here (of course it would be nice). As stated earlier, the issue of data integration that arises from IT infrastructural heterogeneity has been with decision makers in the enterprise forever. The problem is now seeping into the broader consumer realm via Web ubiquity. The mistakes made in the enterprise realm are now playing out in the consumer Web realm. In both realms the critical success factors are:

    1. Scalable productivity relative to exponential growth of data generated across Intranets, Extranets, and the Internet
2. Concept based Context Lenses that transcend logical and physical data heterogeneity by putting dereferencable URIs in front of Line of Business Application Data and/or Web Data Spaces (such as Blogs, Wikis, Discussion Forums, etc.).
    ]]>
    Is the Semantic Web necessary (and feasible)?http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1426Fri, 29 Aug 2008 15:08:12 GMT12008-08-29T11:08:12.000002-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Jason Kolb (who initially nudged me to chime in), and then ReadWriteWeb, and of course Nova's Twine about the topic, have collectively started an interesting discussion about Web.vNext (3.0 and beyond) under the heading: The Future of the Desktop.

    My contribution to the developing discourse takes the form of a Q&A session. I've taken the questions posed and provided answers that express my particular points of view:

    Q: Is the desktop of the future going to just be a web-hosted version of the same old-fashioned desktop metaphors we have today?
    A: No, it's going to be a more Web Architecture aware and compliant variant exposed by appropriate metaphors.

    Q: The desktop of the future is going to be a hosted web service
    A: A vessel for exploiting the virtues of the Linked Data Web.

    Q: The Browser is Going to Swallow Up the Desktop
A: Literally, of course not! Metaphorically, of course! And then the Browser metaphor will decompose into function specific bits of Web interaction amenable to orchestration by its users.

    Q: The focus of the desktop will shift from information to attention
    A: No! Knowledge, Information, and Data sharing courtesy of Hyperdata & Hypertext Linking.

    Q: Users are going to shift from acting as librarians to acting as daytraders
A: They were Librarians in Web 1.0, Journalists in Web 2.0, Analysts in Web 3.0 (i.e., analyzing structured and interlinked data), and CEOs in Web 4.0 (i.e., getting Agents to do stuff intelligently en route to making decisions).

    Q: The Webtop will be more social and will leverage and integrate collective intelligence
    A: The Linked Data Web vessel will only require you to fill in your profile (once) and then serendipitous discovery and meshing of relevant data will simply happen (the serendipity quotient will grow in line with Linked Data Web density).

    Q: The desktop of the future is going to have powerful semantic search and social search capabilities built-in
    A: It is going to be able to "Find" rather than "Search" for stuff courtesy of the Linked Data Web.

    Q: Interactive shared spaces will replace folders
A: Data Spaces and their URIs (Data Source Names) replace everything. You simply choose the exploration metaphor that best suits your space interaction needs.

    Q: The Portable Desktop
    A: Ubiquitous Desktop i.e. do the same thing (all answers above) on any device connected to the Web.

    Q: The Smart Desktop
    A: Vessels with access to Smart Data (Linked Data + Action driven Context sprinklings).

    Q: Federated, open policies and permissions
    A: More federation for sure, XMPP will become a lot more important, and OAuth will enable resurgence of the federated aspects of the Web and Internet.

    Q: The personal cloud
    A: Personal Data Spaces plugged into Clouds (Intranet, Extranet, Internet).

    Q: The WebOS
    A: An operating system endowed with traditional Database and Host Operating system functionality such as: RDF Data Model, SPARQL Query Language, URI based Pointer mechanism, and HTTP based message Bus.

    Q: Who is most likely to own the future desktop?
A: You! And all you need is a URI (an ID or Data Source Name for "Entity You") and a Profile Page (a place where "Entity You" is Described by You).

    One Last Thing

    You can get a feel for the future desktop by downloading and then installing the OpenLink Data Explorer plugin for Firefox, which allows you to switch viewing modes between Web Page and Linked Data behind the page. :-)

    Related

    ]]>
    The Future of the Desktophttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1415Thu, 21 Aug 2008 19:59:25 GMT42008-08-21T15:59:25.000001-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    At OpenLink, we've been investigating LinqToRdf, an exciting project from Andrew Matthews that seeks to expose the Semantic Web technology space to the large community of .NET developers.

The LinqToRdf project is about binding LINQ to RDF. It sits atop Joshua Tauberer's C# based Semantic Web/RDF library, which has been out there for a while and works across Microsoft .NET and its open source variant "Mono".

    Historically, the Semantic Web realm has been dominated by RDF frameworks such as Sesame, Jena and Redland; which by their Open Source orientation, predominantly favor non-Windows platforms (Java and Linux). Conversely, Microsoft's .NET frameworks have sought to offer Conceptualization technology for heterogeneous Logical Data Sources via .NET's Entity Frameworks and ADO.NET, but without any actual bindings to RDF.

Interestingly, believe it or not, .NET already has a data query language that shares a number of similarities with SPARQL, called Entity-SQL, and a very innovative programming language called LINQ, which offers a blend of constructs for natural data access and manipulation across relational (SQL), hierarchical (XML), and graph (Object) models without the traditional object-language-to-database impedance tensions of the past.

With regards to all of the above, we've just released a mini white paper that covers the exploitation of RDF-based Linked Data using .NET via LINQ. The paper offers an overview of LinqToRdf, plus enhancements we've contributed to the project (available in LinqToRdf v0.8). The paper includes real-world examples that tap into a MusicBrainz powered Linked Data Space, the Music Ontology, the Virtuoso RDF Quad Store, Virtuoso Sponger Middleware, and our RDFization Cartridges for MusicBrainz.

    Enjoy!]]>
    .NET, LINQ, and RDF based Linked Data (Update 2)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1408Fri, 08 Aug 2008 12:54:01 GMT42008-08-08T08:54:01.000002-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    I just stumbled across a post titled: Why Reasoning Matters: Consistency Checking from Clark and Parsia

    As you can see from my recent post about how we've started the process of inoculating DBpedia against the potential dangers of "contextual incoherence", we are entering a newer era in the Semantic Web's evolution. My post and the one from Clark & Parsia both touch different aspects of the "Data Dictionary" for the Semantic Web issue.

    Note: in my universe of discourse, a Data Dictionary manifests when the constraints and class hierarchies defined in an ontology (e.g. a web accessible shared ontology) are functionally bound to a data manager. Interestingly the binding can take the following forms:

    • Engine Hosted - which is what you get with Virtuoso's in-built Inference Engine
    • External - which is what you get when the Inference Engine is a distinct component from the data manager (example: Owlgres which can sit in front of 3rd party SPARQL endpoints via ARQ)

The classification terminology I use above is very much off-the-cuff; its sole purpose is architectural distinction.

    Anyway, it's really nice to see that we are entering an era re. the Semantic Web vision, where the virtues of reasoning are getting simpler to demonstrate and articulate.

In a nutshell, the point-to-point data integration era is coming to an end! The era of intelligent ontology based enterprise data integration is nigh!

    Of course, there is much more to come on the practical utility front, so stay tuned as we work our way through the DBpedia inoculation program.

    ]]>
    Reasoning Matters Contdhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1373Fri, 06 Jun 2008 18:38:54 GMT12008-06-06T14:38:54-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    1995: "

    1995 (and the early 90’s) must have been a visionaries time of dreaming… most of their dreams are happening today.

    Watch Steve Jobs (then of NeXT) discuss what he thinks will be popular in 1996 and beyond at OpenStep Days 1995:

Here's a spoiler:

    • There is static web document publishing
    • There is dynamic web document publishing
    • People will want to buy things off the web: e-commerce

    The thing that OpenStep propose is:

    What Steve was suggesting was one of the beginnings of the Data Web! Yep, Portable Distributed Objects and Enterprise Objects Framework was one of the influences of the Semantic Web / Linked Data Web…. not surprising as Tim Berners-Lee designed the initial web stack on a NeXT computer!

    I’m going to spend a little time this evening figuring out how much ‘distributed objects’ stuff has been taken from the OpenStep stuff into the Objective-C + Cocoa environment. (<- I guess I must be quite geeky ;-))

    "

    (Via Daniel Lewis.)

    ]]>
    1995http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1371Fri, 06 Jun 2008 11:54:33 GMT12008-06-06T07:54:33.000010-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Courtesy of Nova Spivack's post titled: Tagging and the Semantic Web: Tags as Objects, I stumbled across a related post by John Clarke titled: Tagging and the Semantic Web. Both of these posts use the common practice of tagging to shed light on the increasing realization that "The Pursuit of Context" is the fusion point between the current Web and its evolution into a structured Web of Linked Data.

    How Semantic Tagging Works (from a 1000 feet)

    When tagging a document, the semantic tagging service passes the content of a target document through a processing pipeline (a distillation process of sorts) that results in automagic extraction of the following:

    Once the extraction phase is completed, a user is presented with a list of "suggested tags" using a variety of user interaction techniques. The literal values of elected Tags are then associated with one or more Tag and Tag Meaning Data Objects, with each Object type endowed with a unique Identifier.

    Issues to Note

Broad acceptance that "Context is king" is gradually taking shape. That said, "Context" landlocked within Literal values offers little over what we have right now (e.g. at Del.icio.us or Technorati), long term. By this I mean: if the end product of semantically enhanced tagging leaves us with Literal Tag values only, Tags associated with Tag Data Objects endowed with platform specific Identifiers, or Tag Data Objects with any other Identity scheme that excludes HTTP, then the ability of Web users to discern or derive multiple perspectives from the base Context (exposed by semantically enhanced Tags) will be lost, or severely impeded at best.
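To make the point concrete, here is a toy sketch (entirely illustrative, not any vendor's API) of the difference between a literal-only tag and one bound to an HTTP-based "Tag Meaning" identifier; DBpedia resource URIs are used as example identifiers.

    # Entirely illustrative lookup table: literal tags mapped to HTTP-based
    # "Tag Meaning" identifiers (DBpedia resource URIs used as examples).
    TAG_MEANINGS = {
        "semantic web": "http://dbpedia.org/resource/Semantic_Web",
        "linked data": "http://dbpedia.org/resource/Linked_data",
    }

    def tag_objects(literal_tags):
        """Pair each literal tag with an HTTP-dereferenceable meaning URI, if known."""
        return [
            {"literal": tag, "meaning_uri": TAG_MEANINGS.get(tag.lower())}
            for tag in literal_tags
        ]

    # Tags left as bare literals (meaning_uri of None) stay landlocked; tags
    # bound to HTTP identifiers can be dereferenced for further context.
    print(tag_objects(["Semantic Web", "Linked Data", "aliens"]))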

    The shape, form, and quality of the lookup substrate that underlies semantic tagging services, ultimately affects "context fidelity" matters such as Entity Disambiguation. The importance of quality lookup infrastructure on the burgeoning Linked Data Web is the reason why OpenLink Software is intimately involved with the DBpedia and UMBEL projects.

    Conclusions

I am immensely happy to see that the Web 2.0 and Semantic Web communities are beginning to coalesce around the issue of "Context". This was the case at the WWW2008 Linked Data Workshop, and I am feeling a similar vibe emerging from the Semantic Web Technologies conference currently nearing completion in San Jose. Of course, I will be talking about, and demonstrating, the practical utility of all of this at the upcoming Linked Data Planet conference.

    Related

    ]]>
    Context, Tagging, Semantic Web, and Linked Data (Updated)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1366Tue, 27 May 2008 22:36:37 GMT32008-05-27T18:36:37-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    After listening to the latest Semantic Web Gang podcast, I found myself agreeing with some of the points made by Alex Iskold, specifically:

  -- Business exploitation of Linked Data on the Web will certainly be driven by the correlation of opportunity costs (which is more than likely what Alex meant by "use cases") associated with the lack of URIs originating from the domain of a given business (Tom Heath also effectively alluded to this via his BBC and URI land grab anecdotes; the same applies to Georgi's examples)
      -- History is a great tutor, answers to many of today's problems always lie somewhere in plain sight of the past.

Of course, I also believe that Linked Data serves Web Data Integration across the Internet very well too, and that it will benefit businesses in a big way. No individual or organization is an island; I think the Internet and Web have done a good job of demonstrating that thus far :-) We're all data nodes in a Giant Global Graph.

Daniel Lewis did shed light on the read-write aspects of the Linked Data Web, which is actually very close to the call for a Wikipedia for Data. TimBL has been working on this via Tabulator (see the Tabulator Editing Screencast), Benjamin Nowack also added similar functionality to ARC, and of course we support the same SPARQL UPDATE into an RDF information resource via the RDF Sink feature of our WebDAV and ODS-Briefcase implementations.

    ]]>
    Comments about recent Semantic Gang Podcasthttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1357Tue, 06 May 2008 00:06:42 GMT12008-05-05T20:06:42.000004-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
I've always been of the opinion that concise value proposition articulation shouldn't be the Achilles heel of the Semantic Web. As the Linked Data wave climbs up the "Value Appreciation and Comprehension" chain, it's getting clearer by the second that "Context" is a point of confluence for Semantic Web Technologies and easy to comprehend value, from the perspectives of those outside the core community.

In today's primarily Document centric Web, the pursuit of Context is akin to pursuing a mirage in a desert of user generated content. The quest is labor intensive, and you ultimately end up without water at the end of the pursuit :-)

Listening to the Christine Connors podcast interview with Talis simply reinforces my strong belief that "Context, Context, Context" is the Semantic Web's equivalent of Real Estate's "Location, Location, Location" (ignore the subprime loans mess for now). The critical thing to note is that you cannot unravel "Context" from existing Web content without incorporating powerful disambiguation technology into an "Entity Extraction" process. Of course, you cannot even consider seriously pursuing any entity extraction and disambiguation endeavor without a lookup backbone that exposes "Named Entities" and their relationships to "Subject Matter Concepts" (BTW - this is what UMBEL is all about). Thus, when looking at the broad subject of the Semantic Web, we can also look at "Context" as the vital point of confluence for the Data oriented (Linked Data) and the "Linguistic Meaning" oriented perspectives.

I am even inclined to state publicly that "Context" may ultimately be the foundation for a 4th "Web Interaction Dimension", where practical use of AI leverages a Linked Data Web substrate en route to exposing new kinds of value :-)

    "Context" may also be the focal point of concise value proposition articulation to VCs as in: "My solution offers the ability to discover and exploit "Context" iteratively, at the rate of $X.XX per iteration, across a variety of market segments :-)

    ]]>
    In Perpetual Pursuit of Contexthttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1356Sat, 03 May 2008 19:07:32 GMT12008-05-03T15:07:32-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
Daniel Lewis has penned a variation of my post about Linked Data enabling PHP applications such as: Wordpress, phpBB3, MediaWiki, etc.

    Daniel simplifies my post by using diagrams to depict the different paths for PHP based applications exposing Linked Data - especially those that already provide a significant amount of the content that drives Web 2.0.

    If all the content in Web 2.0 information resources are distillable into discrete data objects endowed with HTTP based IDs (URIs), with zero "RDF handcrafting Tax", what do we end up with? A Giant Global Graph of Linked Data; the Web as a Database.

So, what used to apply exclusively within enterprise settings re. Oracle, DB2, Informix, Ingres, Sybase, Microsoft SQL Server, MySQL, PostgreSQL, Progress OpenEdge, Firebird, and others, now applies to the Web. The Web becomes the "Distributed Database Bus" that connects database records across disparate databases (or Data Spaces). These databases manage and expose records that are remotely accessible "by reference" via HTTP.

As I've stated at every opportunity in the past, Web 2.0 is the greatest thing that ever happened to the Semantic Web vision :-) Without the "Web 2.0 Data Silo Conundrum" we wouldn't have the cry for "Data Portability" that brings a lot of clarity to some fundamental Web 2.0 limitations that end-users ultimately find unacceptable.

In the late '80s, the SQL Access Group (now part of X/Open) addressed a similar problem with RDBMS silos within the enterprise, which led to the SAG CLI that exists today as Open Database Connectivity.

    In a sense we now have WODBC (Web Open Database Connectivity), comprised of Web Services based CLIs and/or traditional back-end DBMS CLIs (ODBC, JDBC, ADO.NET, OLE-DB, or Native), Query Language (SPARQL Query Language), and a Wire Protocol (HTTP based SPARQL Protocol) delivering Web infrastructure equivalents of SQL and RDA, but much better, and with much broader scope for delivering profound value due to the Web's inherent openness. Today's PHP, Python, Ruby, Tcl, Perl, ASP.NET developer is the enterprise 4GL developer of yore, without enterprise confinement. We could even be talking about 5GL development once the Linked Data interaction is meshed with dynamic languages (delivering higher levels of abstraction at the language and data interaction levels). Even the underlying schemas and basic design will evolve from Closed World (solely) to a mesh of Closed & Open World view schemas.

    ]]>
    Linked Data enabling PHP Applicationshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1334Thu, 10 Apr 2008 18:12:47 GMT12008-04-10T14:12:47-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    As per usual I am writing this post with the aim of killing a number of meme-birds with a single post in relation to the emerging Linked Data Web.

    *On* the ubiquitous Web of "Linked Documents", HREF means (by definition and usage): Hypertext Reference to an HTTP accessible Data Object of Type: "Document" (an information resource). Of course we don't make the formal connection of Object Type when dealing with the Web on a daily basis, but whenever you encounter the "resource not found" condition notice the message: HTTP/1.0 404 Object Not Found, from the HTTP Server tasked with retrieving and returning the resource.

*In* the Web of "Linked Data", a complementary addition to the current Web of "Linked Documents", HREF is used to reference Data Objects that are of a variety of "Types", not just "Documents". And the way this is achieved is by using Data Object Identifiers (URIs / IRIs that are generated by the Linked Data deployment platform) in the strict sense, i.e. Data Identity (URI) is separated from Data Address (URL). Thus, you can reference a Person Data Object (aka an instance of a Person Class) in your HREF and the HTTP Server returns a Description of the Data Object via a Document (again, an information resource). A document containing the Description of a Data Object typically contains HREFs to other Data Objects that expose the Attributes and Relationships of the initial Person Data Object, and it is this collection of Data Objects that is technically called a "Graph" -- which is what RDF models.

    What I describe above is basic stuff for anyone that's familiar with Object Database or Distributed Objects technology and concepts.

    URI and URL confusion

The Linked Document Web is a collection of physical resources that traverse the Web Information Bus in palatable format, i.e. documents. Thus, Document Object Identity and Document Object Data Address can be the same thing, i.e. a URL can serve as the ID/URI of a Document Data Object.

The Linked Data Web, on the other hand, is a Distributed Object Database, and each Data Object must be uniquely defined; otherwise we introduce ambiguity that ultimately taints the Database itself (making it incomprehensible to reasoning challenged machines). Thus we must have unique Object IDs (URIs / IRIs) for People, Places, Events, and other things that aren't Documents. Once we follow the time tested rules of Identity, People can then be associated with the things they create (blog posts, web pages, bookmarks, wikiwords, etc.). RDF is about expressing these graph model relationships, while RDF serialization formats enable information resources to transport these data-object-link-laden descriptions to requesting User Agents.

Put in more succinct terms, all documents on the Web are compound documents in reality (e.g. most contain at least an image these days). The Linked Data Web is about a Web where Data Object IDs (URIs) enable us to distill source data from the information contained in a compound document.

    Examples:

    1. <http://community.linkeddata.org/dataspace/person/kidehen2#this> - the ID (URI minted from URL via addition of #this) of a Data Object of Type Person that Identifies me. The Person definition I use comes from the FOAF vocabulary/schema/ontology/data dictionary
    2. <http://community.linkeddata.org/dataspace/person/kidehen2> - the URI (also a URL) of a FOAF file that contains a description of the Data Object ID: <http://community.linkeddata.org/dataspace/person/kidehen2#this> (me)
    3. As an information resource <http://community.linkeddata.org/dataspace/person/kidehen2> can be dispatched from an HTTP server to a User Agent in (X)HTML, RDF/XML, N3/Turtle representations via HTTP Content Negotiation (note: Look at the "Linked Data" tab to see one example of what Data Links facilitate re. Data Discovery and Exploration)
4. If I choose an Object ID of <http://community.linkeddata.org/dataspace/person/kidehen2/this> instead of <http://community.linkeddata.org/dataspace/person/kidehen2#this>, then the HTTP Server should not return an information resource (i.e. provide a 200 OK response) when a User Agent requests a resource via HTTP using the URI: <http://community.linkeddata.org/dataspace/person/kidehen2/this>, because a Data Object ID (URI) and the Data Object Address (URL) cannot be the same when my Data Object isn't of Type Document; the server has to use response code 303 to redirect the user agent to the URL of an information resource that matches the Content-type designated in the HTTP Request, or determine representation based on its own quality of service rules for the information resource associated with the Object ID (URI).
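Here is a small sketch of the behavior items 3 and 4 describe: probe the two URI styles and inspect the raw HTTP status codes (200 for an information resource, 303 redirecting a non-document Data Object to a document that describes it). Whether these particular URIs still resolve is beside the point; the status-code pattern is what matters.

    import http.client
    from urllib.parse import urlsplit

    def probe(uri, accept="application/rdf+xml"):
        """GET a URI without following redirects and report the raw status code."""
        parts = urlsplit(uri)
        conn = http.client.HTTPConnection(parts.netloc)
        conn.request("GET", parts.path or "/", headers={"Accept": accept})
        resp = conn.getresponse()
        print(uri, "->", resp.status, resp.getheader("Location") or "")
        conn.close()

    # URIs taken from the examples above; expect 200 for the information
    # resource and 303 for the slash-style Object ID.
    probe("http://community.linkeddata.org/dataspace/person/kidehen2")
    probe("http://community.linkeddata.org/dataspace/person/kidehen2/this")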

    The degree of unobtrusiveness of new technology, concepts, or new applications of existing technology, is what ultimately determines eventual uptake and meme virulence (network effects). For a while, the Semantic Web meme was mired in confusion and general misunderstanding due to a shortage of practical use case scenario demos.

The emergence of the SPARQL Query Language has provided critical infrastructure for a number of products, projects, and demos that now make the utility of the Semantic Web vision much clearer via the simplicity of Linked Data, as exemplified by the following:

    1. Linking Open Data Community - collection of People and Linked Data Spaces (across a variety of domains)
    2. DBpedia - Ground zero for experiencing and comprehending Linked Data
3. OpenLink Data Spaces - a simple solution for creating Linked Data Web presence from existing Web Data Sources (Blogs, Wikis, Shared Bookmarks, Tag Spaces, Web Sites, Social Networking Services, Web Services, Discussion Forums, etc.)
    4. OpenLink Virtuoso - a Universal Server for generating, managing, and deploying RDF Linked Data from SQL, XML, Web Services based data sources
Why Is This Post a Linked Data Demo, Again?

Place the permalink of this post in a Linked Data aware user agent (OpenLink RDF Browser1, OpenLink RDF Browser2, Zitgist, DISCO, Tabulator), and then you can see the universe of interlinked data exposed by this post. The Title of this post should not be the sole mechanism for determining that it is Linked to other posts about the same topic.

    Related

    ]]>
    So, What Does "HREF" Stand For, Anywayhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1323Thu, 10 Apr 2008 20:13:50 GMT32008-04-10T16:13:50-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
This post comes after absorbing the Web 3G commentary emanating from the Talis blog space. Ian Davis appears to be expending energy on the definition of, and timeframes for, the next Web Frontier (which is actually here, btw) :-)

Daniel Lewis also penned an interesting post in response to Ian's, which actually triggered this post.

    I think definition time has long expired re. the Web's many interaction dimensions, evolutionary stages, and versions.

    On my watch it's simply demo / dog-food time. Or as Dan Brickley states: Just Show It.

    Below, I've created a tabulated view of the various lanes on the Web's Information Super Highway. Of course, this is a Linked Data demo should you be interested in the universe of data exposed via the links embedded in this post :-)

The Web's Information Super Highway Lanes

Desire
  1.0: Information Creation & Retrieval
  2.0: Information Creation, Retrieval, and Extraction
  3.0: Distillation of Data from Information

Meme
  1.0: Information Linkage (Hypertext)
  2.0: Information Mashing (Mash-ups)
  3.0: Linked Data Meshing (Hyperdata)

Enabling Protocol
  1.0: HTTP
  2.0: HTTP
  3.0: HTTP

Markup
  1.0: HTML
  2.0: (X)HTML & various XML based formats (RSS, ATOM, others)
  3.0: Turtle, N3, RDF/XML, others

Basic Data Unit
  1.0: Resource (Data Object) of type "Document"
  2.0: Resource (Data Object) of type "Document"
  3.0: Resource (Data Object) that may be one of a variety of Types: Person, Place, Event, Music, etc.

Basic Data Unit Identity
  1.0: Resource URL (Web Data Object Address)
  2.0: Resource URL (Web Data Object Address)
  3.0: Unique Identifier (URI) that is independent of the actual Resource (Web Data Object) Address. Note: an Identifier by itself has no utility beyond identifying a place around which actual data may be clustered.

Query or Search
  1.0: Full Text Search patterns
  2.0: Full Text Search patterns
  3.0: Structured Querying via SPARQL

Deployment
  1.0: Web Server (Document Server)
  2.0: Web Server + Web Services Deployment modules
  3.0: Web Server + Linked Data Deployment modules (Data Server)

Auto-discovery
  1.0: <link rel="alternate"..>
  2.0: <link rel="alternate"..>
  3.0: <link rel="alternate" | "meta"..>, basic and/or transparent content negotiation

Target User
  1.0: Humans
  2.0: Humans & Text extraction and manipulation oriented agents (Scrapers)
  3.0: Agents with varying degrees of data processing intelligence and capacity

Serendipitous Discovery Quotient (SDQ)
  1.0: Low
  2.0: Low
  3.0: High

Pain
  1.0: Information Opacity
  2.0: Information Silos
  3.0: Data Graph Navigability (Quality)

    ]]>
    Driving Lanes on the Web based Information Super Highway http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1318Tue, 04 Mar 2008 23:17:56 GMT12008-03-04T18:17:56-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
Via a post by Daniel Lewis, titled: 10 Reasons to use OpenLink Data Spaces

    There are quite a few reasons to use OpenLink Data Spaces (ODS). Here are 10 of the reasons why I use ODS:

    1. Its native support of DataPortability Recommendations such as RSS, Atom, APML, Yadis, OPML, Microformats, FOAF, SIOC, OpenID and OAuth.
    2. Its native support of Semantic Web Technologies such as: RDF and SPARQL/SPARUL for querying.
    3. Everything in ODS is an Object with its own URI, this is due to the underlying Object-Relational Architecture provided by Virtuoso.
    4. It has all the social media components that you could need, including: blogs, wikis, social networks, feed readers, CRM and a calendar.
    5. It is expandable by installing pre-configured components (called VADs), or by re-configuring a LAMP application to use Virtuoso. Some examples of current VADs include: MediaWiki, Wordpress and Drupal.
    6. It works with external webservices such as: Facebook, del.icio.us and Flickr.
    7. Everything within OpenLink Data Spaces is Linked Data, which provides more meaningful information than just plain structural information. This meaningful information could be used for complex inferencing systems, as ODS can be seen as a Knowledge Base.
    8. ODS builds bridges between the existing static-document based web (aka ‘Web 1.0‘), the more dynamic,  services-oriented, social and/or user-orientated webs (aka ‘Web 2.0‘) and the web which we are just going into, which is more data-orientated (aka ‘Web 3.0’ or ‘Linked Data Web’).
    9. It is fully supportive of Cloud Computing, and can be installed on Amazon EC2.
10. It's released free under the GNU General Public License (GPL). [note]However, it is technically dual licensed as it lays on top of the Virtuoso Universal Server which has both Commercial and GPL licensing[/note]

The features above collectively provide users with a Linked Data Junction Box that may reside within corporate intranets or "out in the clouds" (Internet). You can consume, share, and publish data in a myriad of formats using a plethora of protocols, without any programming. ODS is simply about exposing the data from your Web 1.0, 2.0, 3.0 application interactions in structured form, with Linking, Sharing, and ultimately Meshing (not Mashing) in mind.

Note: Although ODS is equipped with a broad array of Web 2.0 style Applications, you do not need to use native ODS apps in order to exploit its power. It binds to anything that supports the relevant protocols and data formats.

    ]]>
    10 Reasons to use OpenLink Data Spaces (ODS)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1314Fri, 08 Feb 2008 22:08:43 GMT22008-02-08T17:08:43-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    There are two upcoming keynotes that I will be giving in the months of September and October in relation to the burgeoning Semantic Data Web. The events are: SABRE Conference about the Social Semantic Web and Jupiter's Semantic Web Strategies Fall Event.

The abstract of my Semantic Web Strategies keynote contains a reference to the acronym MLD, but it doesn't really expose what MLD is (i.e. the acronym's source isn't clearly identified in the abstract's opening paragraph). Thus, I am attempting to fix the aforementioned anomaly via this blog post :-)

    Market Leadership Discipline (MLD) is defined as follows: A strategy adopted by a company for attaining leadership in a given marketplace.

    MLD strategies usually take one of the following forms:

    1. Product Innovation - common amongst most startup and perpetual startup mode companies
    2. Customer Intimacy - common amongst large and established market leaders
3. Operational Excellence - common amongst companies (established or startup) that use Information Technology to enhance operations behind the delivery of products and services.

    MLD is a critical component of Enterprise Agility.

    ]]>
    Market Leadership Discipline (MLD) & Upcoming Keynoteshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1239Tue, 05 Feb 2008 01:45:26 GMT12008-02-04T20:45:26.000005-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Terminology is a pain to construct, and an even bigger pain to diffuse effectively, when dealing with large collections of superficially heterogeneous, and factually homogeneous, interlinked individuals.

    In my "Linked Data & Web Information BUS" post (plus a few LOD mailing list posts), I had the delight and displeasure (on the brain primarily) of attempting to get terminology right with regards to Information- and Non-Information Web Resources. I eventually settled for Data Sources instead of the simpler and more obvious term: Data Resources :-)

Thus, I redefine the URIs from the earlier post as follows:

      http://demo.openlinksw.com/Northwind/Customer/ALFKI (Information Resource)
      http://demo.openlinksw.com/Northwind/Customer/ALFKI#this (Data Resource)

    Thanks to today's internet connectivity, it took a simple Skype ping from Mike Bergman, and a 30 minute (or so) session that followed for us to arrive at "Data Resource" as a clearer term for Non Information Resources.

    Mike has promised to write a detailed post covering our Linked Data and the Structured Web terminology meshing odyssey.

    ]]>
    Terminology & Specificity http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1232Tue, 05 Feb 2008 01:47:01 GMT22008-02-04T20:47:01.000001-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    Using Solvent to extract data from structured pages: "

    I’ve put together a short tutorial on Solvent, a very nice web page parsing utility. It is still a little rough around the edges, but I wanted to throw it out there and continue working on it since there isn’t a whole lot of existing documentation.

    "

    (Via Wing Yung.)

    After reading the interesting post above I quickly (and quite easily) knocked together a "Dynamic Data Web Page for Major League Baseball" using data from the Virtuoso hosted edition of dbpedia. Just click on the "Explore" option whenever you click on a URI of interest. Enjoy!

    ]]>
    Data Web and Major League Baseballhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1149Fri, 02 Mar 2007 00:13:27 GMT12007-03-01T19:13:27-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    Linking personal posted content across communities: "

    With the help of Kingsley, Uldis and I have been looking at how SIOC can be used to link the content that a single person posts to a number of community sites. The picture below shows an example of stuff that I’ve created on Flickr, YouTube, etc. through my various user identities on those sites (these match some SIOC types that we want to add to a separate module). We can also say that each Web 2.0 content item is a user-contributed post, with some attached or embedded content (e.g. a file or maybe just some metadata). This is part of a new discussion on the sioc-dev mailing list, and we’d value your contributions.

[Image: 20070228a.png]

    Edit: The inner layer is a person (semantically described in FOAF), the next layer is their user accounts (described in FOAF, SIOC) and the outer layer is the posted content - text, files, associated metadata - on community sites (again described using SIOC).

    No Tags"

    (Via John Breslin - Cloudlands.)

The point that John is making about the Data Web and Interlinked Data Spaces exposed via URIs (e.g. Personal URIs) crystallizes a number of very important issues about the Data Web that may remain unclear. I am hoping that digesting the post excerpt above, in conjunction with the items below, aids the pursuit of clarity and comprehension about the all important Data Web (Semantic Web - Layer 1):

1. Your OpenID can be Your Personal URI (as noted by Henry Story's post about: The Many Uses of OpenID). That's what I have courtesy of OpenLink Data Spaces (ODS)
2. The above only works unobtrusively (i.e. OpenID and Personal URI being one and the same) if Content Negotiation is exploited on the Client and Server sides.
    3. TimBL's call out to Share Your Data and Link to Other Data via URIs via post titled: Give Yourself a URI.
    4. W3C's Best Practice Recipes for Publishing RDF Vocabularies
    5. W3C's Architecture of the World Wide Web - Vol 1 which covers URI Dereferencing (HTTP GET-ing the data that a URI points to)
    6. Richard Cyganiak's post titled: Debugging Semantic Web Sites with Curl.

    Examples of some of these principles in practice:

    1. Chris Bizer, Tobias Gaub, and Richard's Javascript based Semantic Web Client Library
    2. DISCO RDF Browser
    3. OpenLink Ajax Toolkit's (OAT) RDF Browser
    4. OpenLink Interactive SPARQL Query by Example (iSPARQL QBE)
    5. Dynamic Data Web Pages from my prior posts [1][2][3]
    6. dbpedia (Wikipedia as a Data Web oriented Data Source)
    7. And of course this blog post's permalink is a bona fide dereferencable URI.

And of course there is more to come, such as Grandma's Semantic Web Browser, which is coming from Zitgist LLC (pronounced: Zeitgeist), a joint venture of OpenLink Software and Frederick Giasson.

    ]]>
    Personal URIs & Data Spaceshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1148Fri, 02 Mar 2007 14:14:02 GMT12007-03-02T09:14:02.000004-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
Alex James has just written an interesting piece titled: Who Controls Your Model, that sets the stage for introducing the concept of "Self Describing Data". To cut a long story short, RDF is one example of a mechanism that facilitates the assembly/construction of self-describing databases (built around a Concrete Conceptual Model) that allow instance data to be serialized using open serialization formats such as: XML, N3, Turtle, and TriX.
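A small sketch of that "self-describing data" point: the same RDF instance data can be parsed from one open serialization and re-emitted in another, because the descriptions travel with the data. This uses the rdflib package purely for illustration; the FOAF document URL echoes the one in the examples below and may or may not still resolve.

    from rdflib import Graph

    # FOAF document referenced in the examples below, used purely for illustration.
    SOURCE = "http://sites.wiwiss.fu-berlin.de/suhl/bizer/foaf.rdf"

    g = Graph()
    g.parse(SOURCE, format="xml")          # RDF/XML in ...
    print(g.serialize(format="turtle"))    # ... Turtle (or "n3", "trix") out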

    Rich Internet Applications ultimately enable intelligent processing of self-describing databases originating from data servers as demonstrated by these examples:

    1. My Dynamic Data Web Start Page
    2. Chris Bizer Data Space
3. Our RDF Browser (just enter a Web URI, e.g. http://sites.wiwiss.fu-berlin.de/suhl/bizer/foaf.rdf or http://www.openlinksw.com, and then drill down; not Grandma's unobtrusive Data Web Navigator, but headed in that direction..)
    ]]>
    Rich Clients, Conceptual Models, and Self-Describing Datahttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1145Mon, 26 Feb 2007 23:27:47 GMT62007-02-26T18:27:47.000009-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    XMP and microformats revisited: "

    Yesterday I exercised poetic license when I suggested that Adobe’s Extensible metadata platform (XMP) was not only the spiritual cousin of microformats like hCalendar but also, perhaps, more likely to see widespread use in the near term. My poetic license was revoked, though, in a couple of comments:

    Mike Linksvayer: How someone as massively clued-in as Jon Udell could be so misled as to describe XMP as a microformat is beyond me.

    Danny Ayers: Like Mike I don’t really understand Jon’s references to microformats - I first assumed he meant XMP could be replaced with a uF.

    Actually, I’m serious about this. If I step back and ask myself what are the essential qualities of a microformat, it’s a short list:

    1. A small chunk of machine-readable metadata,
    2. embedded in a document.

    Mike notes:

    XMP is embedded in a binary file, completely opaque to nearly all users; microformats put a premium on (practically require) colocation of metadata with human-visible HTML.

    Yes, I understand. And as someone who is composing this blog entry as XHTML, in emacs, using a semantic CSS tag that will enable me to search for quotes by Mike Linksvayer and find the above fragment, I’m obviously all about metadata coexisting with human-readable HTML. And I’ve been applying this technique since long before I ever heard the term microformats — my own term was originally microcontent.

    (Via Jon Udell.)

    I believe Jon is acknowledging the fact that the propagation of metadata in "Binary based" Web data sources is no different from the microformats based propagation that is currently underway in full swing across the "Text based" Web data sources realm. He is reiterating the fact that the Web is self-annotating (exponentially) by way of Metadata Embedding. And yes, what he describes is similar to Microformats in substance and propagation style :-)

    Here is what I believe Jon is hoping to see:

    1. Binary files become valid data sources for Metadata oriented query processing. Technically I mean a binary file becomes a valid data source from which RDF instance data could be generated on the fly.
    2. Enhancement or unveiling of the Data Web by way of meshups that combine metadata from an array of data sources (not just the XML, (X)HTML, or RDF variety)
    3. The ability to use an array of query languages and techniques to construct these meshups

    My little "Hello Data Web!" meme was about demonstrating a view that Danny has sought for a while: unobtrusive meshing of microformats and RDF via GRDDL and SPARQL binding that simply eliminates the often perceived "RDF Tax". Danny, Jon, myself, and many others have always understood that making the Data Web (Web of RDF Instance Data) more of a Force (Star Wars style) is the key to unravelling the power of the "Web as a Database". Of course, we also tend the describe our nirvana in different ways that sometimes obscures the fundamental commonality of vision that we all share.

    Personally, I believe everyone should simply "feel the force" or observe "the bright and dark sides of the force" that is RDF. When this occurs en masse there will be a global epiphany (similar to what happened around the time of the initial unveiling of the Web of Hypertext). Jon's meme brings the often overlooked realm of binary based metadata sources into the general discourse.

    Binary Files as bona fide Data Web URIs (i.e., Metadata Sources) are much closer than you think :-) I should have my "Hello Data Web of Binary Data Sources" unveiled very soon!

    ]]>
    XMP and microformats revisitedhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1140Sat, 17 Feb 2007 17:43:05 GMT12007-02-17T12:43:05.000001-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    It's kind of ironic to see what has emerged after ISWC 2006 and the Web 2.0 Summit. From my vantage point, it appears as though the Web 2.0 event inadvertently (albeit beneficially) left its attendees looking for the next big thing re. the Web Innovation Continuum as exemplified by the "Web 3.0" meme from the New York Times (NYT) which triggered the current "Web 2.0 vs Web 3.0 Brouhaha".

    Amongst the numerous comments about this subject, I felt most compelled to respond to the commentary from Tim O'Reilly (based on his proximity to Web 2.0 etc..) in relation to his view that the NYT's Web 3.0 is simply the Collective Intelligence Harnessing aspect of his Web 2.0 meme.

    My response is dumped semi-verbatim below:

    Tim,

    A few things:

    1. We are in an innovation continuum
    2. The Web as a medium of innovation will evolve forever
    3. Different commentators have different views about monikers associated with these innovations
    4. To say Web 3.0 (aka the Data Web or Semantic Web - Layer 1) is what Web 2.0's collective intelligence is all about is a little inaccurate (IMHO); Web 2.0 doesn't provide "Open Data Access"
    5. Web 2.0 is a "Web of Services" primarily, a dimension of "Web Interaction" defined by interaction with Services
    6. Web 3.0 ("Data Web" or "Web of Databases" or "Semantic Web - Layer 1") is a Web dimension that provides "Open Data Access" that will be exemplified by the transition from "Mash-ups" (brute force data joining) to "Mesh-ups" (natural data joining)

    The original "Web of Hypertext" or "Interactive Web", the current "Web of Services", and the emerging "Data Web" or "Web of Databases" collectively provide dimensions of interaction in the innovation continuum called the Web.

    There are many more dimensions to come. Monikers come and go, but the retrospective "Long Shadow" of Innovation is ultimately timeless.

    "Mutual Inclusivity" is a critical requirement for truly perceiving these "Web Interaction Dimensions" ("Participation" if I recall). "Mutual Exclusivity" on the other hand, simpy leads to obscuring reality with Versionitis as exemplified by the ongoing: Web 1.0 vs 2.0 vs 3.0 debates.

    BTW - I enjoyed reading Nick Carr's take on the Web 3.0 meme, especially his "tongue in cheek" power-grab for the rights to all "Web 3.0" Conferences etc. :-)

    ]]>
    Web 2.0 vs Web 3.0 Brouhaha!http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1080Fri, 05 Sep 2008 03:00:54 GMT42008-09-04T23:00:54-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    It's really nice to see DAWG-Fooding in effect at ISWC 2006 as demonstrated by this ISWC 2006 Technical Links Page :-)

    Likewise, it would be nice if there were some Mash-ups, Service Endpoints, or Syndication Feeds that exposed relevant Data from the Web 2.0 Summit (beyond the usual selective, best-of, type Blog Commentary and traditional Speakers List).

    ]]>
    ISWC 2006 - Technical Linkshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1079Sat, 11 Nov 2006 21:59:50 GMT32006-11-11T16:59:50-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    I had just watched a pretty nifty presentation (courtesy of Babelfish) about the 10 dimensions of our existence (à la String Theory) when it dawned on me that similar thinking can be applied to the Web :-)

      Dimension 1 = Interactive Web (Visual Web of HTML based Sites aka Web 1.0)
      Dimension 2 = Services Web (Presence based Web of Services; a usage pattern commonly referred to as Web 2.0)
      Dimension 3 = Data Web (Presence and Open Data Access based Web of Databases aka Semantic Web layer 1)
      Dimension 4 = Ontology Web (Intelligent Agent palatable Web aka Semantic Web layer 2)
      ....

    Hopefully, I can expand further :-)

    ]]>
    Dimensions of the Webhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1037Sun, 12 Nov 2006 23:55:54 GMT52006-11-12T18:55:54.000001-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Another example of Data Spaces in action, from John Breslin. In this case John visualizes the connections that become exploitable by creating SIOC (Semantically-Interlinked Online Communities) instance data from existing Distributed Collaborative Application profiles (Web 2.0 in current parlance). Of course, SIOC is an Ontology for RDF data, since it describes the Concepts and Terms for a network mesh of online communities. Which, by implication, provides another insight into the realization that the Web we know has always been a "Web of Databases" (a federation of Graph Model Databases encapsulated in Data Spaces). The emergence of SPARQL as the standard Query Language for querying RDF Data Sets, alongside the SPARQL Protocol for transmitting SPARQL Queries over HTTP and the SPARQL Query Results Serialization formats (XML or JSON), basically sets the stage for truly open and flexible data access across Web Data Space clusters such as: the Blogosphere, Wikisphere, Usenetverse, Linkspaces, Boardscapes, and others.

    For additional clarity re. my comments above, you can also look at the SPARQL & SIOC Usecase samples document for our OpenLink Data Spaces platform. Bottom line, the Semantic Web and SPARQL aren't BORING. In fact, quite the contrary, since they are essential ingredients of a more powerful Web than the one we work with today!
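    To make the "SIOC instance data" point a little more concrete, here is a minimal sketch (Python plus rdflib) that emits a few SIOC triples of the kind described in John's extract below. The resource URIs are hypothetical, and the sioc: namespace URI (http://rdfs.org/sioc/ns#) is my assumption rather than something stated in the extract:

    from rdflib import Graph, Namespace, URIRef
    from rdflib.namespace import RDF

    SIOC = Namespace("http://rdfs.org/sioc/ns#")   # assumed SIOC namespace URI
    FOAF = Namespace("http://xmlns.com/foaf/0.1/")

    g = Graph()
    g.bind("sioc", SIOC)
    g.bind("foaf", FOAF)

    # Hypothetical URIs for a community, a weblog (forum), a post, a person, and a user account
    community = URIRef("http://example.org/community#planet")
    weblog    = URIRef("http://example.org/blog")
    post      = URIRef("http://example.org/blog/2006/09/07/sioc-browsers")
    person    = URIRef("http://example.org/people/john#me")
    account   = URIRef("http://example.org/blog/users/john")

    g.add((community, RDF.type, SIOC.Community))
    g.add((weblog, RDF.type, SIOC.Forum))
    g.add((post, RDF.type, SIOC.Post))

    g.add((community, SIOC.has_part, weblog))          # what objects make up a community
    g.add((weblog, SIOC.part_of, community))
    g.add((post, SIOC.topic, URIRef("http://example.org/topics/linked_data")))
    g.add((account, SIOC.account_of, person))          # one person, many user accounts
    g.add((person, FOAF.holdsOnlineAccount, account))  # property name as used in the extract

    print(g.serialize(format="turtle"))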

    Enjoy the rest of John's post:

    Creating connections between discussion clouds with SIOC:

    (Extract from our forthcoming BlogTalk paper about browsers for SIOC.)

    [Image from the BlogTalk paper extract: 20060907b.png]

    SIOC provides a unified vocabulary for content and interaction description: a semantic layer that can co-exist with existing discussion platforms. Using SIOC, various linkages are created between the aforementioned concepts, which allow new methods of accessing this linked data, including:

    • Virtual Forums. These may be a gathering of posts or threads which are distributed across discussion platforms, for example, where a user has found posts from a number of blogs that can be associated with a particular category of interest, or an agent identifies relevant posts across a certain timeframe.
    • Distributed Conversations. Trackbacks are commonly used to link blog posts to previous posts on a related topic. By creating links in both directions, not only across blogs but across all types of internet discussions, conversations can be followed regardless of what point or URI fragment a browser enters at.
    • Unified Communities. Apart from creating a web page with a number of relevant links to the blogs or forums or people involved in a particular community, there is no standard way to define what makes up an online community (apart from grouping the people who are members of that community using FOAF or OPML). SIOC allows one to simply define what objects are constituent parts of a community, or to say to what community an object belongs (using sioc:has_part / part_of): users, groups, forums, blogs, etc.
    • Shared Topics. Technorati (a search engine for blogs) and BoardTracker (for bulletin boards) have been leveraging the free-text tags that people associate with their posts for some time now. SIOC allows the definition of such tags (using the subject property), but also enables hierarchical or non-hierarchical topic definition of posts using sioc:topic when a topic is ambiguous or more information on a topic is required. Combining with other Semantic Web vocabularies, tags and topics can be further described using the SKOS organisation system.
    • One Person, Many User Accounts. SIOC also aims to help the issue of multiple identities by allowing users to define that they hold other accounts or that their accounts belong to a particular personal identity (via foaf:holdsOnlineAccount or sioc:account_of). Therefore, all the posts or comments made by a particular person using their various associated user accounts across platforms could be identified.
    ]]>
    Creating connections between discussion clouds with SIOChttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1036Tue, 05 Feb 2008 04:22:26 GMT42008-02-04T23:22:26.000001-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    OpenLink AJAX Toolkit). It's basically an XML for Analysis (XMLA) client that enables the development and deployment of database independent Rich Internet Applications (RIAs). Thus, you can now develop database centric AJAX applications without lock-in at the Operating System, Database Connectivity mechanism (ODBC, JDBC, OLEDB, ADO.NET), or back-end Database levels.

    XMLA has been around for a long time. Its fundamental goal was to provide Web Applications with Tabular and Multi-dimensional data access before it fell off the radar (a story too long to tell in this post).

    AJAX Database connectivity only requires your target DBMS to be XMLA (direct), ODBC, JDBC, OLEDB, or ADO.NET accessible.

    I have attached a Query By Example (QBE) screencast movie enclosure to this post (should you be reading this post Web 1.0 style). The demo shows how Paradox-, Quattro Pro-, Access-, and MS Query-like user friendly querying is achieved using AJAX Database Connectivity.

    ]]>
    Screencast: Ajax Database Connectivity and SQL Query By Examplehttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/982Thu, 22 Jun 2006 12:56:58 GMT72006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    chat with Jon Udell. The item in question is the OpenLink Ajax Toolkit (OAT) that enables the rapid development of Database Independent Rich Internet Applications. My very first public screencast is deliberately silent (since it's a live work in progress etc.).

    The screencast style demo covers the production of a map based mashup that simply unveils the national flag of each country underneath its map marker (a lookup associated with a geocoded map pin).

    This post is also a deliberate test of the automatic production of iPod and Yahoo RSS style syndication gems based on the content of my blog post. Naturally, this is a demonstration of the soon to be unveiled OpenLink Data Spaces technology (the one that supports GData and SPARQL Query Services).

    BTW - The Data Space that is this blog has been GData aware for a few weeks now (I digress, just watch the movie!):

    Note: If you are reading this post Web 1.0 style (i.e. via a traditional non-aggregating browser UI) then click on the "enclosure" link to grab the QuickTime movie file. If, on the other hand, you are reading via a Web 2.0 aggregator, note that the Podcast Gem should alert you to the existence of the movie enclosure.
    ]]>
    A Web 2.0 Style Mash-up using the OpenLink Ajax Toolkit (OAT)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/981Thu, 22 Jun 2006 12:56:58 GMT162006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    SPARQL with SQL (Inline)

    Virtuoso extends its SQL3 implementation with syntax for integrating SPARQL into queries and subqueries. Thus, as part of a SQL SELECT query or subquery, one can write the SPARQL keyword and a SPARQL query as part of query text processed by Virtuoso's SQL Query Processor.

    Example 1 (basic) :

    Using Virtuoso's command line or Web-based ISQL utility, type in the following (note: "SQL>" is the command line prompt for the native ISQL utility):

    SQL> sparql select distinct ?p where { graph ?g { ?s ?p ?o } };

    Which will return the following:

    	  p varchar
         ----------
         http://example.org/ns#b
         http://example.org/ns#d
         http://xmlns.com/foaf/0.1/name
         http://xmlns.com/foaf/0.1/mbox
         ...   

    Example 2 (a subquery variation):

    SQL> select distinct subseq (p, strchr (p, '#')) as fragment
         from (sparql select distinct ?p where { graph ?g { ?s ?p ?o } } ) as all_predicates
         where p like '%#%' ;

    Which will return the following:

         fragment varchar
         ----------
         #query
         #data
         #name
         #comment
         ...

    Parameterized Queries:

    You can pass parameters to a SPARQL query using a Virtuoso-specific syntax extension. '??' or '$?' indicates a positional parameter similar to '?' in standard SQL. '??' can be used in graph patterns or anywhere else where a SPARQL variable is accepted. The value of a parameter should be passed in SQL form, i.e. this should be a number or an untyped string. An IRI ID can not be passed, but an absolute IRI can. Using this notation, a dynamic SQL capable client (ODBC, JDBC, ADO.NET, OLEDB, XMLA, or others) can execute parameterized SPARQL queries using parameter binding concepts that are commonplace in dynamic SQL. This implies that existing SQL applications and development environments (PHP, Ruby, Python, Perl, VB, C#, Java, etc.) are capable of issuing SPARQL queries via their existing SQL-bound data access channels against RDF Data stored in Virtuoso.

    Note: This is the Virtuoso equivalent of a recently published example using Jena (a Java based RDF Triple Store).

    Example:

    Create a Virtuoso Function by executing the following:

    SQL> create function param_passing_demo ()
     {
       declare stat, msg varchar;
       declare mdata, rset any;
       exec ('sparql select ?s where { graph ?g { ?s ?? ?? }}',
             stat, msg,
             vector ('http://www.w3.org/2001/sw/DataAccess/tests/data/Sorting/sort-0#int1',
                     4),       -- Vector of two parameters
             10,               -- Max. result-set rows
             mdata,            -- Variable for handling result-set metadata
             rset              -- Variable for handling query result-set
            );
       return rset[0][0];
     };
    
    
    Test new "param_passing_demo" function by executing the following:
    SQL> select param_passing_demo ();
    

    Which returns:

    callret VARCHAR
     _______________________________________________________________________________
    http://www.w3.org/2001/sw/DataAccess/tests/data/Sorting/sort-0#four
    1 Rows. -- 00000 msec.
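    The same parameter binding works from an ordinary dynamic SQL client, per the description above. Here is a rough Python/pyodbc sketch of that claim; the DSN name and credentials are assumptions, and I am assuming the Virtuoso ODBC driver treats the '??' markers inside the SPARQL text just like ordinary positional parameters:

    import pyodbc

    # Hypothetical ODBC DSN pointing at a local Virtuoso instance
    conn = pyodbc.connect("DSN=VirtuosoLocal;UID=dba;PWD=dba")
    cur = conn.cursor()

    # SPARQL-inside-SQL statement text; the two '??' placeholders are bound from the
    # parameter sequence, exactly like '?' markers in plain dynamic SQL
    cur.execute(
        "sparql select ?s where { graph ?g { ?s ?? ?? } }",
        ("http://www.w3.org/2001/sw/DataAccess/tests/data/Sorting/sort-0#int1", 4),
    )

    for (s,) in cur.fetchall():
        print(s)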

     

    Using SPARQL in SQL Predicates:

    A SPARQL ASK query can be used as an argument of the SQL EXISTS predicate.

    create function sparql_ask_demo () returns varchar
    {
      if (exists (sparql ask where { graph ?g { ?s ?p 4 }})) return 'YES';
      else return 'NO';
    };
    


    Test by executing:

    SQL> select sparql_ask_demo ();
    

    Which returns:

    _________________________
    YES
    ]]>
    SPARQL Parameterized Queries (Virtuoso using SPARQL in SQL)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/973Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    Solutions to allow XMLHttpRequest to talk to external services: "

    Over on XML.com they published Fixing AJAX: XmlHttpRequest Considered Harmful.

    This article discusses a few ways to get around the security constraints that we have to live with in the browsers these days, in particular, only being able to talk to your domain via XHR.

    The article walks you through three potential solutions:

    1. Application proxies. Write an application in your favorite programming language that sits on your server, responds to XMLHttpRequests from users, makes the web service call, and sends the data back to users.
    2. Apache proxy. Adjust your Apache web server configuration so that XMLHttpRequests can be invisibly re-routed from your server to the target web service domain.
    3. Script tag hack with application proxy (doesn't use XMLHttpRequest at all). Use the HTML script tag to make a request to an application proxy (see #1 above) that returns your data wrapped in JavaScript. This approach is also known as On-Demand JavaScript.

    I can't wait for Trusted Relationships within the browser - server infrastructure.

    With respect to Apache proxies, these things are priceless. I recently talked about them in relation to Migrating data centers with zero downtime.

    What do you guys think about this general issue? Have you come up with any interesting solutions? Any ideas on how we can keep security, yet give us the freedom that we want?

    (Via Ajaxian Blog.)
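    Before I give my own take, here is a rough illustration of the first option listed above (an application proxy), sketched in Python; the upstream service URL and the port are hypothetical:

    from http.server import BaseHTTPRequestHandler, HTTPServer
    from urllib.request import urlopen

    REMOTE_SERVICE = "http://example.org/remote-api/data"   # hypothetical external service

    class ProxyHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Fetch the external service on the browser's behalf and relay the response,
            # so XMLHttpRequest only ever talks to this same-origin endpoint
            with urlopen(REMOTE_SERVICE) as upstream:
                body = upstream.read()
                content_type = upstream.headers.get("Content-Type", "application/octet-stream")
            self.send_response(200)
            self.send_header("Content-Type", content_type)
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("localhost", 8081), ProxyHandler).serve_forever()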

    Well here is what I think (actually know):

    Our Virtuoso Universal Server has been sitting waiting to deliver this for years (for the record see the Virtuoso 2000 Press Release). Virtuoso can proxy for disparate data sources and expose disparate data as Well-Formed XML using an array of vocabularies (you experience this SQL-XML integration on the fly every time you interact with various elements of my public blog).

    Virtuoso has always been able to expose Application Logic as SOAP and/or RESTful/RESTian style XML Web Services. This blog's search page is a simple demo of this capability.

    Virtuoso is basically a Junction Box / Aggregator / Proxy for disparate Data, Applications, Services, and BPEL compliant business processes. AJAX clients talk to this single multi-purpose server which basically acts as a conduit to content/data, services, and processes (which are composite services).

    BTW - there is a lot more, but for now, thou shall have to seek in order to find :-)

    ]]>
    Solutions to allow XMLHttpRequest to talk to external serviceshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/900Fri, 21 Jul 2006 11:23:03 GMT12006-07-21T07:23:03.000001-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    1. Don't you mean the fall/death of Relational Databases?
      2. Does anyone use these anymore?
      3. What are these?
    Relational Database Management Systems (RDBMS) are alive and kicking as expressed eloquently in this excerpt from a book titled "Funding A Revolution":

    Large-scale computer applications require rapid access to large amounts of data. A computerized checkout system in a supermarket must track the entire product line of the market. Airline reservation systems are used at many locations simultaneously to place passengers on numerous flights on different dates. Library computers store millions of entries and access citations from hundreds of publications. Transaction processing systems in banks and brokerage houses keep the accounts that generate international flows of capital. World Wide Web search engines scan thousands of Web pages to produce quantitative responses to queries almost instantly. Thousands of small businesses and organizations use databases to track everything from inventory and personnel to DNA sequences and pottery shards from archaeological digs.

    Thus, databases not only represent significant infrastructure for computer applications, but they also process the transactions and exchanges that drive the U.S. economy.

    My only addition to the excerpt above is that the impact of databases extends beyond the U.S. economy. We are talking about the global economy. And this will be so for all of time!

    I came across this page while enriching the links in one of my earlier "history" related posts about Relational Database Technology pioneers. During this effort I also stumbled across another historic document titled: "1995 SQL Reunion".

    ]]>
    Rise of Relational Databaseshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/889Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    A Webpage is Not An API or a Platform (The Populicio.us Remix): "

    A few months ago in my post GMail Domain Change Exposes Bad Design and Poor Code, I wrote Repeat after me, a web page is not an API or a platform. It seems some people are still learning this lesson the hard way. In the post The danger of running a remix service Richard MacManus writes

    Populicio.us was a service that used data from social bookmarking site del.icio.us, to create a site with enhanced statistics and a better variety of 'popular' links. However the Populicio.us service has just been taken off air, because its developer can no longer get the required information from del.icio.us. The developer of Populicio.us wrote:

    'Del.icio.us doesn't serve its homepage as it did and I'm not able to get all needed data to continue Populicio.us. Right now Del.icio.us doesn't show all the bookmarked links in the homepage so there is no way I can generate real statistics.'

    This plainly illustrates the danger for remix or mash-up service providers who rely on third party sites for their data. del.icio.us can not only giveth, it can taketh away.

    It seems Richard Macmanus has missed the point. The issue isn't depending on a third party site for data. The problem is depending on screen scraping their HTML webpage. An API is a service contract which is unlikely to be broken without warning. A web page can change depending on the whims of the web master or graphic designer behind the site.

    Versioning APIs is hard enough, let alone trying to figure out how to version an HTML website so screen scrapers are not broken. Web 2.0 isn't about screenscraping. Turning the Web into an online platform isn't about legitimizing bad practices from the early days of the Web. Screen scraping needs to die a horrible death. Web APIs and Web feeds are the way of the future.

    "

    (Via Dare Obasanjo aka Carnage4Life.)

    Amen! ]]>
    A Webpage is Not An API or a Platform (The Populicio.us Remix)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/867Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    By Jeremy J. Carroll, MultiLingual Computing and Technology

    The author gives a brief introduction to the Semantic Web and describes difficulties -- and occasionally solutions -- related to building multilingual Semantic Web sites and applications. The initial drivers for the Semantic Web came from metadata about web pages. Who wrote it? When? Who owns the copyright? And so on. Conveying such metadata requires agreement about the key terms such as author and date. This agreement has been reached by the Dublin Core community. For example, they have an agreed definition for the term creator, generalizing author for use in metadata records. The Semantic Web does not, however, draw a sharp distinction between metadata about the page and data contained within the page. In both cases, the idea is to provide sufficient structure around the data to turn it into information and to connect the concepts used to express such information with concepts used by others so that this information can become knowledge that can be acted upon.

    http://tinyurl.com/3o2zm

    See also W3C Semantic Web: http://www.w3.org/2001/sw/

    ]]>
    An Introduction to the Semantic Web. Considerations for Building Multilingual Semantic Web Sites and Applications.http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/670Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    The W3C RDF Data Access Working Group recently released an initial public Working Draft specification for "RDF Data Access Use Cases and Requirements". Naturally, this triggered discussion on the RDF mailing list along the following lines:

    In section 4.1, Human-friendly Syntax, you say "There must be a text-based form of the query language which can be read and written by users of the language", and you list the status as "pending".

    As background for section 4.1, you may be interested in RDFQueryLangComparison1 (original text replaced with live link).

    It shows how to write queries in a form that includes English meanings.

    The example queries can be run by pointing a browser to www.reengineeringllc.com .

    Perhaps importantly, given the intricacy of RDF for nonprogrammers, one can get an English explanation of the result of each query.

    -- Dr. Adrian Walker of Internet Business Logic

    The Semantic Web continues to take shape, and Infonauts (information centric agents) are already emerging.

    A great thing about the net is the "back to the future" nature of most Web and Internet technology. For instance, we are now frenzied about Service Oriented Architecture (SOA), Event Driven Architecture (EDA), Loose Coupling of Composite Services, etc. Basically rehashing the CORBA vision.

    I see the Semantic Web playing a similar role in relation to artificial intelligence.

    BTW - It still always comes down to data, and as you can imagine Virtuoso will be playing its usual role of alleviating the practical implementation and utilization challenges of all of the above :-)

     

    ]]>
    Comparison of RDF Query Languageshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/557Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    Mozilla's SQL Support allows applications to directly connect to SQL databases. A web application no longer needs to pass information through a scripting language, such as Perl or Python, in order to receive information it can use. The removal of the layer separating applications and data simplifies the job of the programmer.

    Somehow I missed this effort, and only stumbled across it today after experimenting with Virtuoso's SyncML features (and then pondering about Outlook, WinFS, and what may or may not happen with SyncML support - another story).

    As usual the SQL binding to Mozilla caught my attention (I do recall trying to get Marc and Jim Clark to head down this path many years ago via an email; at least Jim acknowledged not knowing that much about SQL and passed it on..., and as for Marc, well... nothing happened).

    A few

    ]]>
    SQL Support in Mozilla?http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/523Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    An interesting piece I stumbled across regarding one of the RDBMS industry's notable pioneers.

    Today, technology areas that catch Stonebraker's eye include wireless and data integration on the Web.

    Started Ingres project in early 1970s at Berkeley to develop relational databases. Ingres Corp. formed in 1980.

    Another Berkeley project, Postgres, yielded object relational databases and spawned Illustra Information Technologies in 1992.

    Became Informix's CTO in 1996, holding that post until September 2000.

    Launched Cohera, a maker of federated databases, in 1999, based on a Berkeley research project, Mariposa.

    Read on..

    ]]>
    DBMS Hall of Fame: Prof. Michael Stonebrakerhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/483Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    IBM TO SHIP DB2 INTEGRATION SOFTWARE

    Posted May 15, 2003 4:46 PM Pacific Time

    IBM on Tuesday plans to announce availability of its DB2 Information Integrator software, for integrating and analyzing multiple forms of information, the company acknowledged on Thursday.

    In beta since February, the software is intended to enable customers to manage centrally data, text, images, photos, video and audio files stored in different databases, according to IBM. XML content and Web services also are supported.

    Interesting Quote:

    "If we move to information as a utility for giant data grids, this is key technology for hiding or making unimportant the location and type of data. This software enables the data to be accessed transparently wherever it might be," Jones said.

    Product Pricing
    DB2 Information Integrator will be available for $20,000 per processor and $15,000 per data source connector.
    Detail will also be available on Tuesday.

    The cost for a bulk adapter license is about $75,000. If change capture is involved, the adapter license costs about $150,000. Real-time integration costs are mips-based, with a starting cost of about $300,000. One adapter can be used to translate and make native calls to all environments.

    Very interesting pricing! 

    For the full story: http://www.infoworld.com/article/03/05/15/HNdb2integrate_1.html

    ]]>
    <p>IBM TO SHIP DB2 INTEGRATION SOFTWARE</p>http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/301Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Microsoft to do for Usenet what it did for Email & The Web?

    Netscan is an interesting NNTP based project, and it is pretty much along the same lines as what Virtuoso has provided (albeit with an inferior UI) for NNTP since 1999.

    Using Virtuoso, the data presented by Netscan could very easily be presented as XML, which could then be further processed using XPath, XQuery, and XSL-T, with the final result being RDF (since this is metadata after all - another contribution to the Semantic Web).

    ]]>
    Microsoft to do for Usenet what it did for Email & The Web?http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/228Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Tim O'Reilly about network aware software

    Tim O'Reilly wrote some thoughts about network aware software. Good sumup and nice ideas, why not only blogs should be net-aware (and where even blogs can be improved ;) )

    "For the desktop, my personal vision is to see existing software instrumented to become increasingly web aware. It seems that Apple are doing a good job with this. (What does web aware mean for me? Being able to grok URIs, speaking WebDAV, and using open standard data formats.)" -- Edd Dumbill
    I agree, but you do have to add Open Data Access formats (such as ODBC and to some degree JDBC) to this mix, otherwise you will need to create data for Open Standard Data Formats from scratch (tough for any enterprise irrespective of size).
    Tim O'Reilly added the following items to Edd's list:
    • Rendezvous-like functionality for automatic discovery of and potential synchronization with other instances of the application on other computers. Apple is showing the power of this idea with iChat and iTunes, but it really could be applied in so many other places. For example, if every PIM supported this functionality, we could have the equivalent of "phonester" where you could automatically ask peers for contact information. Of course, that leads to guideline 2.

    Another application is discovery of ODBC data sources and database servers. Rendezvous can also simplify security and administration of data sources accessible by either one of these standard data access mechanisms. It can also apply to XML databases and data sources exposed by XML Databases.

    • If you assume ad-hoc networking, you have to automatically define levels of access. I've always thought that the old Unix ugo (user, group, other) three-level permission system was simple and elegant, and if you replace the somewhat arbitrary "group" with "on my buddy list", you get something quite powerful. Which leads me to...

      • Buddy lists ought to be supported as a standard feature of many apps, and in a consistent way. What's more, our address books really ought to make it easy to indicate who is in a "buddy list" and support numerous overlapping lists for different purposes.
    • Every application ought to expose some version of its data as an XML feed via some well-defined and standard access mechanism. It strikes me that one of the really big wins that fueled the early web was a simple naming scheme: you could go to a site called www.foo.com, and you'd find a web server there. While it wasn't required, it made web addresses eminently guessable. We missed the opportunity for xml.foo.com to mean "this is where you get the data feed" but it's probably still possible to come up with a simple, consistent naming scheme. And of course, if we can do it for web sites, we also need to think about how to do it for local applications, since...

    This is the very point I continue to make about Internet Points of Presence being actual data access points: in short, these end points should be served by database server processes. This is the very basis of Virtuoso; the inevitability of this realization remains the underpinnings of this product. There are other products out there that have some sense of this vision too, but there is a little snag (at least so far in my research efforts), and that is the tendency to create a dedicated independent server per protocol (an ultimate integration, administration, and maintenance nightmare).

    • We ought to be able to have the expectation that all applications, whether local or remote (web) will be set up for two-way interactions. That is, they can be either a source or sink of online data. So, for example, the natural complement to amazon's web services data feeds is data input (for example, the ability to comment on a book on your local blog, and syndicate the review via RSS to amazon's detail page for the book.) And that leads to:

    • We really need to understand who owns what, and come up with mechanisms that protect the legitimate rights of individuals and businesses to their own data, while creating the "liquidity" and free movement of data that will fuel the next great revolution in computer functionality. (I'm doing a panel on this subject at next week's Open Source Convention, entitled "We Need a Bill of Rights for Web Services.")

    • We need easy gateways between different application domains. I was recently in Finland at a Nokia retreat, and we used camera-enabled cell phones to create a mobile photoblog. That was great. But even more exciting was the ease with which I could send a photo from the phone not just to another phone but also to an email address. This is the functionality that enabled the blog gateway, but it also made it trivial to send photos home to my family and friends. Similarly, I often blog things that I hear on mailing lists, and read many web sites via screen-scraping enabled email lists. It would be nice to have cross-application gateways be a routine part of software, rather than something that has to be hacked on after the fact.
    The wish list is pretty much a clear articulation of key items that should matter most to decision makers (CTOs and CIOs); in particular those that continue to wrestle with the identification and isolation of relevant components for their enterprise architectures.
    ]]>
    Tim O'Reilly about network aware softwarehttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/201Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Corporate blogging is about data transformation from raw form to contextual form (knowledge aka competitive advantage). The ability to consume, distill, synthesize, and disseminate is how corporations ultimately succeed or fail. Corporate blogging done the right way is just one of many IT based initiatives at the disposal of those corporations that comprehend the potential impact on their bottom and top lines.

    Ahh, Doc Searls is covering the Corporate Weblogging thing.

    Personally, I think corporate weblogging is a non-event. For instance? Am I a corporate weblogger? I don't think so. I don't have Microsoft's executive blessing for this.

    The blessing isn't the point. Corporations have always blogged (or attempted to; they just never called it blogging, or simply lacked cohesive technology to make the concept gel). Every second of the day, in any corporation, data comes in and goes out (after numerous transformations across a plethora of contexts).

    Every corporation knows that it has to create, persist, and disseminate knowledge, and like the Internet, Web, XML, Web Services, and now Blogging, technology is simply catching up in a somewhat standardized form.

    Funny, I was talking with my boss's boss today. Vic Gundotra (General Manager of Platform Evangelism). I asked him "so, from a Microsoft's exec point of view, what would you like me to do on my weblog?"

    He answered: "I don't want to tell you what to do, because anything I tell you will only screw it up and make it boring."

    Oh, you mean like Eric Rudder's weblog? Now I'm in trouble... ;-)

    [via The Scobleizer Weblog]

    Your boss was right on every count :-)

    ]]>
    Doc Searls is covering the Corporate Weblogging thing.http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/190Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    What?

    The DBpedia + BBC Combo Linked Dataset is a preconfigured Virtuoso Cluster (4 Virtuoso Cluster Nodes, each comprised of one Virtuoso Instance; initial deployment is to a single Cluster Host, but license may be converted for physically distributed deployment), available via the Amazon EC2 Cloud, preloaded with the following datasets:

    Why?

    The BBC has been publishing Linked Data from its Web Data Space for a number of years. In line with best practices for injecting Linked Data into the World Wide Web (Web), the BBC datasets are interlinked with other datasets such as DBpedia and MusicBrainz.

    Typical follow-your-nose exploration using a Web Browser (or even via sophisticated SPARQL query crawls) isn't always practical once you get past the initial euphoria that comes from comprehending the Linked Data concept. As your queries get more complex, the overhead of remote sub-queries increases its impact, until query results take so long to return that you simply give up.

    Thus, maximizing the effects of the BBC's efforts requires Linked Data that shares locality in a Web-accessible Data Space — i.e., where all Linked Data sets have been loaded into the same data store or warehouse. This holds true even when leveraging SPARQL-FED style virtualization — there's always a need to localize data as part of any marginally-decent locality-aware cost-optimization algorithm.

    This DBpedia + BBC dataset, exposed via a preloaded and preconfigured Virtuoso Cluster, delivers a practical point of presence on the Web for immediate and cost-effective exploitation of Linked Data at the individual and/or service specific levels.

    How?

    To work through this guide, you'll need to start with 90 GB of free disk space. (Only 41 GB will be consumed after you delete the installer archives, but starting with 90+ GB ensures enough work space for the installation.)

    Install Virtuoso

    1. Download Virtuoso installer archive(s). You must deploy the Personal or Enterprise Edition; the Open Source Edition does not support Shared-Nothing Cluster Deployment.

    2. Obtain a Virtuoso Cluster license.

    3. Install Virtuoso.

    4. Set key environment variables and start the OpenLink License Manager, using this command (the exact form may vary depending on your shell and install directory):

      . /opt/virtuoso/virtuoso-enterprise.sh
    5. Optional: To keep the default single-server configuration file and demo database intact, set the VIRTUOSO_HOME environment variable to a different directory, e.g.,

      export VIRTUOSO_HOME=/opt/virtuoso/cluster-home/

      Note: You will have to adjust this setting every time you shift between this cluster setup and your single-server setup. Either may be made your environment's default through the virtuoso-enterprise.sh and related scripts.

    6. Set up your cluster by running the mkcluster.sh script. Note that initial deployment of the DBpedia + BBC Combo requires a 4 node cluster, which is the default for this script.

    7. Start the Virtuoso Cluster with this command:

      virtuoso-start.sh
    8. Stop the Virtuoso Cluster with this command:

      virtuoso-stop.sh

    Using the DBpedia + BBC Combo dataset

    1. Navigate to your installation directory.

    2. Download the combo dataset installer script — bbc-dbpedia-install.sh.

    3. For best results, set the downloaded script to fully executable using this command:

      chmod 755 bbc-dbpedia-install.sh
    4. Shut down any Virtuoso instances that may be currently running.

    5. Optional: As above, if you have decided to keep the default single-server configuration file and demo database intact, set the VIRTUOSO_HOME environment variable appropriately, e.g.,

      export VIRTUOSO_HOME=/opt/virtuoso/cluster-home/
    6. Run the combo dataset installer script with this command:

      sh bbc-dbpedia-install.sh

    Verify installation

    The combo dataset typically deploys to EC2 virtual machines in under 90 minutes; your time will vary depending on your network connection speed, machine speed, and other variables.

    Once the script completes, perform the following steps:

    1. Verify that the Virtuoso Conductor (HTTP-based Admin UI) is in place via:

      http://localhost:[port]/conductor
    2. Verify that the Virtuoso SPARQL endpoint is in place via:

      http://localhost:[port]/sparql
    3. Verify that the Precision Search & Find UI is in place via:

      http://localhost:[port]/fct
    4. Verify that the Virtuoso hosted PivotViewer is in place via:

      http://localhost:[port]/PivotViewer
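    Beyond loading those pages in a browser, a quick programmatic sanity check of the SPARQL endpoint might look like the following Python sketch. The port is an assumption (substitute the HTTP port of your Virtuoso instance), and the "format" URL parameter is the usual Virtuoso convention for selecting a results serialization:

      from urllib.parse import urlencode
      from urllib.request import urlopen

      endpoint = "http://localhost:8890/sparql"   # assumed HTTP port; adjust to your instance
      query = "SELECT (COUNT(*) AS ?triples) WHERE { ?s ?p ?o }"

      params = urlencode({"query": query, "format": "application/sparql-results+json"})
      with urlopen(endpoint + "?" + params) as response:
          print(response.read().decode("utf-8"))   # a non-zero ?triples count indicates the load worked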

    Related

    ]]>
    DBpedia + BBC (combined) Linked Data Space Installation Guidehttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1656Tue, 29 Mar 2011 14:09:45 GMT22011-03-29T10:09:45.000001-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Injecting Linked Data into the Web has been a major pain point for those who seek personal, service, or organization-specific variants of DBpedia. Basically, the sequence goes something like this:

    1. You encounter DBpedia or the LOD Cloud Pictorial.
    2. You look around (typically following your nose from link to link).
    3. You attempt to publish your own stuff.
    4. You get stuck.

    The problems typically take the following form:

    1. Functionality confusion about the complementary Name and Address functionality of a single URI abstraction
    2. Terminology confusion due to conflation and over-loading of terms such as Resource, URL, Representation, Document, etc.
    3. Inability to find robust tools with which to generate Linked Data from existing data sources such as relational databases, CSV files, XML, Web Services, etc.

    To start addressing these problems, here is a simple guide for generating and publishing Linked Data using Virtuoso.

    Step 1 - RDF Data Generation

    Existing RDF data can be added to the Virtuoso RDF Quad Store via a variety of built-in data loader utilities.

    Many options allow you to easily and quickly generate RDF data from other data sources:

    • Install the Sponger Bookmarklet for the URIBurner service. Bind this to your own SPARQL-compliant backend RDF database (in this scenario, your local Virtuoso instance), and then Sponge some HTTP-accessible resources.
    • Convert relational DBMS data to RDF using the Virtuoso RDF Views Wizard.
    • Starting with CSV files, you can
      • Place them at an HTTP-accessible location, and use the Virtuoso Sponger to convert them to RDF or;
      • Use the CSV import feature to import their content into Virtuoso's relational data engine; then use the built-in RDF Views Wizard as with other RDBMS data.
    • Starting from XML files, you can
      • Use Virtuoso's inbuilt XSLT-Processor for manual XML to RDF/XML transformation or;
      • Leverage the Sponger Cartridge for GRDDL, if there is a transformation service associated with your XML data source, or;
      • Let the Sponger analyze the XML data source and make a best-effort transformation to RDF.

    Step 2 - Linked Data Deployment

    Install the Faceted Browser VAD package (fct_dav.vad) which delivers the following:

    1. Faceted Browser Engine UI
    2. Dynamic Hypermedia Resource Generator
      • delivers descriptor resources for every entity (data object) in the Native or Virtual Quad Stores
      • supports a broad array of output formats, including HTML+RDFa, RDF/XML, N3/Turtle, NTriples, RDF-JSON, OData+Atom, and OData+JSON.

    Step 3 - Linked Data Consumption & Exploitation

    Three simple steps allow you, your enterprise, and your customers to consume and exploit your newly deployed Linked Data --

    1. Load a page like this in your browser: http://<cname>[:<port>]/describe/?uri=<entity-uri>
      • <cname>[:<port>] gets replaced by the host and port of your Virtuoso instance
      • <entity-uri> gets replaced by the URI you want to see described -- for instance, the URI of one of the resources you let the Sponger handle.
    2. Follow the links presented in the descriptor page.
    3. If you ever see a blank page with a hyperlink subject name in the About: section at the top of the page, simply add the parameter "&sp=1" to the URL in the browser's Address box, and hit [ENTER]. This will result in an "on the fly" resource retrieval, transformation, and descriptor page generation.
    4. Use the navigator controls to page up and down the data associated with the "in scope" resource descriptor.
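    If you'd rather do steps 1 and 2 above programmatically than in a browser, here is a small Python sketch that fetches a descriptor resource and lists the links it exposes. The host, port, and entity URI are placeholders, and I'm assuming the descriptor resource honors an Accept: text/turtle request, per the output formats listed in Step 2:

    from urllib.request import Request, urlopen
    from rdflib import Graph, URIRef

    entity_uri = "http://example.com/resource#this"                        # placeholder <entity-uri>
    describe_url = "http://example.com:8890/describe/?uri=" + entity_uri   # placeholder <cname>[:<port>]

    req = Request(describe_url, headers={"Accept": "text/turtle"})
    with urlopen(req) as response:
        data = response.read()

    g = Graph()
    g.parse(data=data, format="turtle")

    # Every URI-valued object is a candidate link to follow to another descriptor page
    for s, p, o in g:
        if isinstance(o, URIRef):
            print(p, "->", o)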

    Related

    ]]>
    Virtuoso Linked Data Deployment In 3 Simple Stepshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1642Tue, 02 Nov 2010 15:55:31 GMT12010-11-02T11:55:31.000005-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    I've created a new discussion space that's squarely focused on the business development and marketing aspects of "HTTP based Linked Data" (Linked Data). As its name indicates, it's a BOLD attempt to fill a VoiD. :-)

    Background

    A few months ago, Aldo Bucchi posted a message to the LOD mailing list seeking a discussion space for more business and marketing oriented topics, in relation to Linked Data. At the time, my assumption was that the existing LOD mailing list served that purpose absolutely fine, but in due course I came to realize that Aldo's request had a much larger foundation than I initially suspected.

    Historic Oversight

    Linked Data, like its umbrella Semantic Web Project, has suffered from an inadvertent oversight on the parts of many of its enthusiasts (myself included): 100% of the discussion spaces are created by, geared towards, or dominated by researchers (from Academia primarily) and/or developers. Thus, at the very least, we've been operating in an echo chamber that only feeds the existing void between the core community and those who are more interested in discussing business and marketing related topics.

    The new discussion space seeks to cover the following:

    1. Brainstorming Value Proposition Articulation
    2. War Story Exchanges
    3. Case Studies and Use-cases
    4. Market Research & Positioning (for instance Linked Data is killer technology that redefines Data Integration, but none of the major research firms currently make that connection)

    How Do I Join The Conversation? Simply sign up on the Google hosted BOLD mailing list, introduce yourself (ideally), and then start conversing! :-)

    ]]>
    The Business Of Linked Data (BOLD) Discussion Spacehttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1600Mon, 01 Feb 2010 14:02:27 GMT12010-02-01T09:02:27.000001-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    The Business Of Linked Data (BOLD) Discussion Spacehttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1596Sun, 31 Jan 2010 22:48:48 GMT12010-01-31T17:48:48-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    One of the real problems that pervades all routes to comprehension of the Linked Data value prop. stems from the layering of its value pyramid, especially when communicating with (initially detached) end-users.

    Note to Web Programmers: Linked Data is about Data (Wine) and not about Code (Fish). Thus, it isn't a "programmer only zone", far from it. More than anything else, it's inherently inclusive and spreads its participation net widely across: Data Architects, Data Integrators, Power Users, Knowledge Workers, Information Workers, Data Analysts, etc. Basically, everyone that can "click on a link" is invited to this particular party; remember, it is about "Linked Data" not "Linked Code", after all. :-)

    Problematic Value Pyramid Layering

    Here is an example of a Linked Data value pyramid that I am stumbling across --with some frequency-- these days (note: 1 being the pyramid apex):

    1. SPARQL Queries
    2. RDF Data Stores
    3. RDF Data Sets
    4. HTTP scheme URIs

    Basically, Linked Data deployment (assigning de-referenceable HTTP URIs to DBMS records, their attributes, and attribute values [optionally]) is occurring last. Even worse, this happens in the context of Linked Open Data oriented endeavors, resulting in nothing but confusion or inadvertent perpetuation of the overarching pragmatically challenged "Semantic Web" stereotype.

    As you can imagine, hitting SPARQL as your introduction to Linked Data is akin to hitting SQL as your introduction to Relational Database Technology, neither is an elevator-style value prop. relay mechanism.

    In the relational realm, killer demos always started with desktop productivity tools (spreadsheets, report-writers, SQL QBE tools, etc.) accessing relational data sources en route to unveiling the "Productivity" and "Agility" value prop. that such binding delivered; i.e., the desktop applications (clients) and the databases (servers) are distinct, but operate in a mutually beneficial manner to all, courtesy of data access standards such as ODBC (Open Database Connectivity).

    In the Linked Data realm, learning to embrace and extend best practices from the relational dbms realm remains a challenge; a lot of this has to do with hangovers from a misguided perception that RDF databases will somehow completely replace RDBMS engines, rather than complement them. Thus, you have a counterproductive variant of NIH (Not Invented Here) in play, taking us to the dreaded realm of: Break the Pot and You Own It (exemplified by the 11+ year Semantic Web Project comprehension and appreciation odyssey).

    From my vantage point, here is how I believe the Linked Data value pyramid should be layered, especially when communicating the essential value prop.:

    1. HTTP URLs -- LINKs to documents (Reports) that users already appreciate, across the public Web and/or Intranets
    2. HTTP URIs -- typically not visually distinguishable from the URLs, so use the Data exposed by de-referencing a URL to show how each Data Item (Entity or Object) is uniquely identified by a Generic HTTP URI, and how clicking on the said URIs leads to more structured metadata bearing documents available in a variety of data representation formats, thereby enabling flexible data presentation (e.g., smarter HTML pages)
    3. SPARQL -- when a user appreciates the data representation and presentation dexterity of a Generic HTTP URI, they will be more inclined to drill down an additional layer to unravel how HTTP URIs mechanically deliver such flexibility
    4. RDF Data Stores -- at this stage the user is now interested in the data sources behind the Generic HTTP URIs, courtesy of a natural desire to tweak the data presented in the report; thus, you now have an engaged user ready to absorb the "How Generic HTTP URIs Pull This Off" message
    5. RDF Data Sets -- while attempting to make or tweak HTTP URIs, users become curious about the actual data loaded into the RDF Data Store, which is where the data sets used to create powerful Lookup Data Spaces come into play, such as those from the LOD constellation as exemplified by DBpedia (extractions from Wikipedia).

    Related

    ]]>
    Getting The Linked Data Value Pyramid Layers Right (Update #2)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1595Sun, 31 Jan 2010 22:47:04 GMT12010-01-31T17:47:04-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    The recent Wikipedia imbroglio centered around DBpedia is the fundamental driver for this particular blog post. At the time of writing, the DBpedia project definition in Wikipedia remains unsatisfactory due to the following shortcomings:

    1. inaccurate and incomplete definition of the Project's What, Why, Who, Where, When, and How
    2. inaccurate reflection of project essence, by skewing focus towards data extraction and data set dump production, which is at best a quarter of the project.

    Here are some insights on DBpedia, from the perspective of someone intimately involved with the other three-quarters of the project.

    What is DBpedia?

    A live Web accessible RDF model database (Quad Store) derived from Wikipedia content snapshots, taken periodically. The RDF database underlies a Linked Data Space comprised of: HTML (and most recently HTML+RDFa) based data browser pages and a SPARQL endpoint.

    Note: DBpedia 3.4 now exists in snapshot (warehouse) and Live Editions (currently being hot-staged). This post is about the snapshot (warehouse) edition; I'll drop a different post about the DBpedia Live Edition, where a new Delta-Engine covers both extraction and database record replacement, in real time.

    When was it Created?

    As an idea under the moniker "DBpedia" it was conceptualized in late 2006 by researchers at the University of Leipzig (led by Soren Auer) and Freie University, Berlin (led by Chris Bizer). The first public instance of DBpedia (as described above) was released in February 2007. The official DBpedia coming out party occurred at WWW2007, Banff, during the inaugural Linked Data gathering, where it showcased the virtues and immense potential of TimBL's Linked Data meme.

    Who's Behind It?

    OpenLink Software (developers of OpenLink Virtuoso and providers of Web Hosting infrastructure), the University of Leipzig, and Freie University, Berlin. In addition, there is a burgeoning community of collaborators and contributors responsible for DBpedia based applications, cross-linked data sets, ontologies (OpenCyc, SUMO, UMBEL, and YAGO), and other utilities. Finally, DBpedia wouldn't be possible without the global content contribution and curation efforts of Wikipedians, a point typically overlooked (albeit inadvertently).

    How is it Constructed?

    The steps are as follows:

    1. RDF data set dump preparation via Wikipedia content extraction and transformation to RDF model data, using the N3 data representation format - Java and PHP extraction code produced and maintained by the teams at Leipzig and Berlin
    2. Deployment of Linked Data that enables Data browsing and exploration using any HTTP aware user agent (e.g. basic Web Browsers) - handled by OpenLink Virtuoso (handled by Berlin via the Pubby Linked Data Server during the early months of the DBpedia project)
    3. SPARQL compliant Quad Store, enabling direct access to database records via SPARQL (Query language, REST or SOAP Web Service, plus a variety of query results serialization formats) - OpenLink Virtuoso since first public release of DBpedia

    In a nutshell, there are four distinct and vital components to DBpedia. Thus, DBpedia doesn't exist if all the project offered was a collection of RDF data dumps. Likewise, it doesn't exist if you have a SPARQL compliant Quad Store without loaded data sets, and of course it doesn't exist if the fully loaded SPARQL compliant Quad Store isn't up to the cocktail of challenges presented by live Web accessibility.
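    As a small illustration of step 3, here is a hedged Python 3 sketch (standard library only) that sends a query to the public DBpedia SPARQL endpoint using the standard SPARQL Protocol. The endpoint URL and the example resource are simply the well-known public ones; results obviously depend on the live service.

        import json
        from urllib.parse import urlencode
        from urllib.request import Request, urlopen

        endpoint = "http://dbpedia.org/sparql"
        query = """
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        SELECT ?label WHERE {
          <http://dbpedia.org/resource/DBpedia> rdfs:label ?label .
          FILTER (lang(?label) = "en")
        }
        """

        req = Request(endpoint + "?" + urlencode({"query": query}),
                      headers={"Accept": "application/sparql-results+json"})
        with urlopen(req) as resp:
            results = json.load(resp)

        for row in results["results"]["bindings"]:
            print(row["label"]["value"])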

    Why is it Important?

    It remains a live exemplar for any individual or organization seeking to publish or exploit HTTP based Linked Data on the World Wide Web. Its existence continues to stimulate growth in both density and quality of the burgeoning Web of Linked Data.

    How Do I Use it?

    In the most basic sense, simply browse the HTML pages en route to discovering erstwhile undiscovered relationships that exist across named entities and subject matter concepts / headings. Beyond that, simply look at DBpedia as a master lookup table in a Web hosted distributed database setup; enabling you to mesh your local domain specific details with DBpedia records via structured relations (triples or 3-tuples records) comprised of HTTP URIs from both realms, e.g., owl:sameAs relations.

    What Can I Use it For?

    Expanding on the Master-Details point above, you can use its rich URI corpus to alleviate tedium associated with activities such as:

    1. List maintenance - e.g., Countries, States, Companies, Units of Measurement, Subject Headings etc.
    2. Tagging - as a complement to existing practices
    3. Analytical Research - you're only a LINK (URI) away from erstwhile difficult to attain research data spread across a broad range of topics
    4. Closed Vocabulary Construction - rather than commence the futile quest of building your own closed vocabulary, simply leverage Wikipedia's human curated vocabulary as our common base.

    Related

    ]]>
    What is the DBpedia Project? (Updated)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1594Sun, 31 Jan 2010 22:46:10 GMT12010-01-31T17:46:10.000002-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    The recent Wikipedia imbroglio centered around DBpedia is the fundamental driver for this particular blog post. At the time of writing, the DBpedia project definition in Wikipedia remains unsatisfactory due to the following shortcomings:

    1. inaccurate and incomplete definition of the Project's What, Why, Who, Where, When, and How
    2. inaccurate reflection of project essence, by skewing focus towards data extraction and data set dump production, which is at best a quarter of the project.

    Here are some insights on DBpedia, from the perspective of someone intimately involved with the other three-quarters of the project.

    What is DBpedia?

    A live Web accessible RDF model database (Quad Store) derived from Wikipedia content snapshots, taken periodically. The RDF database underlies a Linked Data Space comprised of: HTML (and most recently HTML+RDFa) based data browser pages and a SPARQL endpoint.

    Note: DBpedia 3.4 now exists in snapshot (warehouse) and Live Editions (currently being hot-staged). This post is about the snapshot (warehouse) edition; I'll drop a different post about the DBpedia Live Edition, where a new Delta-Engine covers both extraction and database record replacement, in real time.

    When was it Created?

    As an idea under the moniker "DBpedia" it was conceptualized in late 2006 by researchers at the University of Leipzig (led by Soren Auer) and Freie University, Berlin (led by Chris Bizer). The first public instance of DBpedia (as described above) was released in February 2007. The official DBpedia coming out party occurred at WWW2007, Banff, during the inaugural Linked Data gathering, where it showcased the virtues and immense potential of TimBL's Linked Data meme.

    Who's Behind It?

    OpenLink Software (developers of OpenLink Virtuoso and providers of Web Hosting infrastructure), the University of Leipzig, and Freie University, Berlin. In addition, there is a burgeoning community of collaborators and contributors responsible for DBpedia based applications, cross-linked data sets, ontologies (OpenCyc, SUMO, UMBEL, and YAGO), and other utilities. Finally, DBpedia wouldn't be possible without the global content contribution and curation efforts of Wikipedians, a point typically overlooked (albeit inadvertently).

    How is it Constructed?

    The steps are as follows:

    1. RDF data set dump preparation via Wikipedia content extraction and transformation to RDF model data, using the N3 data representation format - Java and PHP extraction code produced and maintained by the teams at Leipzig and Berlin
    2. Deployment of Linked Data that enables Data browsing and exploration using any HTTP aware user agent (e.g. basic Web Browsers) - handled by OpenLink Virtuoso (handled by Berlin via the Pubby Linked Data Server during the early months of the DBpedia project)
    3. SPARQL compliant Quad Store, enabling direct access to database records via SPARQL (Query language, REST or SOAP Web Service, plus a variety of query results serialization formats) - OpenLink Virtuoso since first public release of DBpedia

    In a nutshell, there are four distinct and vital components to DBpedia. Thus, DBpedia doesn't exist if all the project offered was a collection of RDF data dumps. Likewise, it doesn't exist without a fully populated SPARQL compliant Quad Store. Last but not least, it doesn't exist if the fully loaded SPARQL compliant Quad Store isn't up to the cocktail of challenges (query load and complexity) presented by live Web database accessibility.

    Why is it Important?

    It remains a live exemplar for any individual or organization seeking to publish or exploit HTTP based Linked Data on the World Wide Web. Its existence continues to stimulate growth in both density and quality of the burgeoning Web of Linked Data.

    How Do I Use it?

    In the most basic sense, simply browse the HTML based resource decriptor pages en route to discovering erstwhile undiscovered relationships that exist across named entities and subject matter concepts / headings. Beyond that, simply look at DBpedia as a master lookup table in a Web hosted distributed database setup; enabling you to mesh your local domain specific details with DBpedia records via structured relations (triples or 3-tuples records), comprised of HTTP URIs from both realms e.g., via owl:sameAs relations.
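    As a sketch of that "master lookup table" usage, the snippet below (Python with the third-party rdflib package) meshes a hypothetical local company record with its DBpedia counterpart via owl:sameAs. The example.com URI is made up, and the DBpedia URI is only illustrative.

        from rdflib import Graph, Literal, URIRef
        from rdflib.namespace import OWL, RDFS

        g = Graph()
        local_company = URIRef("http://example.com/data/company/acme")            # hypothetical local entity URI
        dbpedia_match = URIRef("http://dbpedia.org/resource/Acme_Corporation")    # illustrative DBpedia URI

        g.add((local_company, RDFS.label, Literal("ACME Corporation")))
        g.add((local_company, OWL.sameAs, dbpedia_match))   # the mesh: local detail joined to the lookup hub

        print(g.serialize(format="turtle"))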

    What Can I Use it For?

    Expanding on the Master-Details point above, you can use its rich URI corpus to alleviate tedium associated with activities such as:

    1. List maintenance - e.g., Countries, States, Companies, Units of Measurement, Subject Headings etc.
    2. Tagging - as a complement to existing practices
    3. Analytical Research - you're only a LINK (URI) away from erstwhile difficult to attain research data spread across a broad range of topics
    4. Closed Vocabulary Construction - rather than commence the futile quest of building your own closed vocabulary, simply leverage Wikipedia's human curated vocabulary as our common base.

    Related

    ]]>
    What is the DBpedia Project? (Updated)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1592Wed, 15 Sep 2010 22:10:51 GMT32010-09-15T18:10:51.000002-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    What is RDF?

    The acronym stands for: Resource Description Framework. And that's just what it is.

    RDF is comprised of a Data Model (EAV/CR Graph) and Data Representation Formats such as: N3, Turtle, RDF/XML etc.

    RDF's essence is about "Entities" and "Attributes" being URI based, while "Values" may be URI based or Literals (typed or untyped).

    URIs are Entity Identifiers.
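    A tiny sketch of that essence, using Python with the third-party rdflib package (all example.org names and values below are hypothetical): the Entity and its Attributes are URIs, one Value is another URI, and the others are Literals.

        from rdflib import Graph, Literal, Namespace, URIRef
        from rdflib.namespace import XSD

        EX = Namespace("http://example.org/schema#")                    # hypothetical vocabulary
        g = Graph()

        entity = URIRef("http://example.org/people/kingsley#this")      # Entity identified by a URI
        g.add((entity, EX.name, Literal("Kingsley Idehen")))            # Attribute (URI) with an untyped Literal value
        g.add((entity, EX.worksFor, URIRef("http://example.org/org/openlink#this")))   # URI based value
        g.add((entity, EX.yearsBlogging, Literal(5, datatype=XSD.integer)))            # typed Literal value (number made up)

        print(g.serialize(format="turtle"))                             # one of several representation formats (Turtle)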

    What is Linked Data?

    Short for "Web of Linked Data" or "Linked Data Web".

    A term coined by TimBL that describes an HTTP based "data access by reference pattern" that uses a single pointer or handle for "referring to" and "obtaining actual data about" an entity.

    Linked Data uses the deceptively simple messaging scheme of HTTP to deliver a granular entity reference and access mechanism that transcends traditional computing boundaries such as: operating system, application, database engines, and networks.

    How are Linked Data & RDF Related?

    Linked Data simply mandates the following re. RDF:

    • URIs should be HTTP based so that you can "refer to" (Reference) an Entity, its Attributes, or URI based Attribute values via the Web (in fact, any HTTP based network, e.g., Intranets and Extranets)
    • URIs should also be HTTP based so that you can use them to de-reference resource descriptions via the Web (or Intranets and Extranets).

    Note: by Entity I am also referring to: a resource (Web parlance), data item, data object, real-world object, or datum.

    Linked Data is also about using URIs and HTTP's content negotiation feature to separate: presentation, representation, access, and identity of data items. Even better, content negotiation can be driven by user agent and/or data server based quality of service algorithms (representation preference order schemes).

    To conclude, Linked Data is ultimately about the realization that: Data is the new Electricity, and its conductors are URIs :-)

    Tip to governments of the world: we are in exponential times. The current downturn is but one side of the "exponential times ledger"; the other side is simply about unleashing "raw data" -- in structured form -- onto the Web, so that "citizen analysts" can blossom and ultimately deliver the transparency desperately sought at every level of the economic value chain. Think: "raw data ready" whenever you ponder "shovel ready" infrastructure projects!

    ]]>
    Simple Explanation of RDF and Linked Data Dynamicshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1543Fri, 24 Apr 2009 21:14:41 GMT12009-04-24T17:14:41-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    I pose the question above because I stumbled across an interesting claim about OpenLink Software and its representatives expressed in the ReadWriteWeb post titled: XBRL: Mashing Up Financial Statements, where the following claim is made:

    "..There is evidence that they promote LINKED DATA at any expense without understanding the rationale behind other approaches...".

    To answer the question above, Linked Data is always relevant as long as we are actually talking about "Data" which is simply the case all of the time, irrespective of interaction medium.

    If XBRL can be disconnected in any way from Linked Data, I desperately would like to be enlightened (as per my comments to the post). Why wouldn't anyone desire the ability to navigate the linked data inherent in any financial report? Every item in an XBRL instance document is an entity, directly or indirectly related to other entities. Why "Mash" the data when you can harmonize XBRL data via a Generic Financial Dictionary (schema or ontology), such that descriptions of Balance Sheet, P&L, and other entities are navigable via their attributes and relationships? In short, why "Mash" (code based brute force joining across disparately shaped data) when you can "Mesh" (natural joining of structured data entities)?

    "Linked Data" is about the ability to connect all our observations (data), perceptions (information), and inferences / conclusions (knowledge) across a spectrum of interaction media. And it just so happens that the RDF data model (Entity-Attribute-Value + Class Relationships + HTTP based Object Identifiers), a range of RDF data model serialization formats, and SPARQL (Query Language and Web Service combo) actually make this possible, in a manner consistent with the essence of the global space we know as the World Wide Web.

    Related

    ]]>
    Is Linked Data Always Relevant?http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1509Wed, 31 Dec 2008 17:57:41 GMT22008-12-31T12:57:41-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    What is Virtuoso+DBpedia AMI for EC2?

    A pre-installed and fully tuned edition of Virtuoso that includes a fully configured DBpedia instance on Amazon's EC2 Cloud platform.

    Benefits?

    Generally, it provides a no-hassle mechanism for instantiating personal, organization, or service specific instances of DBpedia within approximately 1.5 hours, as opposed to a lengthy rebuild from RDF source data that takes between 8 and 22 hours, depending on machine hardware configuration and host operating system resources.

    From a Web Entrepreneur perspective it offers all of the generic benefits of a Virtuoso EC2 AMI plus the following:

    1. Instant bootstrap of a dense Lookup Hub for Linked Data Web oriented solutions
    2. No exposure to any of the complexities and nuances associated with deployment of dereferencable URIs (you have a DBpedia replica)
    3. Predictable performance and scalability due to localization of query processing (you aren't sharing the public DBpedia server with the rest of the world).

    Features:

    1. DBpedia public instance functionality replica (re. RDF and (X)HTML resource description representations & SPARQL endpoint)
    2. Local URI de-referencing (so no contention with public endpoint) as part of the Linked Data Deployment
    3. Fully tuned Virtuoso instance for DBpedia data set hosting.

    How Do I Get Started?

    Simply read the Virtuoso-DBpedia EC2 AMI installation guide.

    Here are a few live examples of DBpedia resource URIs deployed and de-referencable via one of my EC2 based personal data spaces:

    ]]>
    Virtuoso+DBpedia AMI for EC2 now Live!http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1490Fri, 12 Dec 2008 16:22:27 GMT42008-12-12T11:22:27-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Recent perturbations in Data Access and Data Management technology realms are clear signs of an imminent inflection. In a nutshell, the focus of data access is moving from the "Logical Level" (what you see if you've ever looked at a DBMS schema derived from an Entity Data Model) to the "Conceptual Level" (i.e., the Entity Model becoming concrete).

    In recent times I've stumbled across Master Data Management (MDM), which is all about entities that provide holistic views of enterprise data (or what I call: Context Lenses). I've also stumbled across emerging tensions in the .NET realm between Linq to Entities and Linq to SQL, where in either case the fundamental issue comes down to the optimal path for "Conceptual Level" access over the "Logical Level" when dealing with data access in the .NET realm.

    Strangely, the emerging realms of RDF Linked Data, MDM, and .NET's Entity Framework remain disconnected.

    Another oddity is the obvious, but barely acknowledged, blurring of the lines between the "traditional enterprise employee" and the "individual Web netizen". The fusion between these entities is one of the most defining characteristics of how the Web is reshaping the data landscape.

    At the current time, I tend to crystalize my data access world view under the moniker: YODA ("You" Oriented Data Access), based on the following:

    1. Entities are the new focal point of data access, management, and integration
    2. "You" are the entry point (Data Source Name) into this new realm of interconnected Entities that the Web exposes
    3. "You" the "Person" Entity is associated with many other "Things" such as "Organizations", "Other People", "Books", "Music", "Subject Matter" etc.
    4. "You" the "Person" needs Identity in this new global database, which is why "You" need to Identify "Yourself" using an HTTP based Entity ID (aka. URI)
    5. When "You" have an ID for "Yourself" it becomes much easier for the essence of "You" to be discovered via the Web
    6. When "Others" have IDs for "Themselves" on the Web it becomes much easier for "You" to serendipitously discover or explicitly "Find" things on the Web.
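    Here is a rough sketch of treating "You" as the Data Source Name, in Python with the third-party rdflib package: de-reference a person's HTTP based Entity ID and list the things it is connected to. TimBL's public FOAF profile URI is used purely because it is a well-known example of such an ID; the sketch assumes that profile is still published as RDF at that address.

        from rdflib import Graph, URIRef

        you = URIRef("https://www.w3.org/People/Berners-Lee/card#i")   # a well-known personal Entity ID (URI)

        g = Graph()
        g.parse(you)   # rdflib negotiates an RDF representation of the ID's descriptor document and loads it

        # Everything "You" are directly related to: other people, organizations, documents, topics...
        for predicate, thing in g.predicate_objects(subject=you):
            print(predicate, "->", thing)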

    Related

    ]]>
    Entity Oriented Data Access http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1475Tue, 04 Nov 2008 03:51:48 GMT12008-11-03T22:51:48-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    data curation efforts into the burgeoning Linked Data Web.

    Here are some examples of how we distill Entities (People, Places, Music, and other things) from Freebase (X)HTML pages (meaning: we don't have to start from RDF information resources as data sources for the eventual RDF Linked Data we generate):

    Tip: Install our OpenLink Data Explorer extension for Firefox. Once installed, simply browse through Freebase, and whenever you encounter a page about something of interest, simply use the following sequences to distill (via the Page Description feature) the entities from the page you are reading:

    • CTRL-Click (Mac OS X)
    • Right+Click (Windows & Linux)

    Related

    ]]>
    Welcoming Freebase to the Linked Data Webhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1468Fri, 31 Oct 2008 15:23:35 GMT12008-10-31T11:23:35.000002-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    RDF-ization is a term used by the Semantic Web community to describe the process of generating RDF from non RDF Data Sources such as (X)HTML, Weblogs, Shared Bookmark Collections, Photo Galleries, Calendars, Contact Managers, Feed Subscriptions, Wikis, and other information resource collections.

    If the RDF generated results in an entity-to-entity level network (graph) in which each entity is endowed with a de-referencable HTTP based ID (a URI), we end up with an enhancement to the Web that adds Hyperdata linking across extracted entities to the existing Hypertext based Web of linked documents (pages, images, and other information resource types). Thus, I can use the same URL linking mechanism to reference a broader range of "Things", i.e., documents, things that documents are about, or things loosely associated with documents.

    The Virtuoso Sponger is an example of an RDF Middleware solution from OpenLink Software. It's an in-built component of the Virtuoso Universal Server, and deployable in many forms, e.g., Software as a Service (SaaS) or traditional software installation. It delivers RDF-ization services via a collection of Web information resource specific Cartridges/Providers/Drivers covering Wikipedia, Freebase, CrunchBase, WikiCompany, OpenLibrary, Digg, eBay, Amazon, RSS/Atom/OPML feed sources, XBRL, and many more.

    RDF-ization alone doesn't ensure valuable RDF based Linked Data on the Web. The process of producing RDF Linked Data is ultimately about the art of effectively describing resources with an eye for context.

    RDF-ization Processing Steps

    1. Entity Extraction
    2. Vocabulary/Schema/Ontology (Data Dictionary) mapping
    3. HTTP based Proxy URI generation
    4. Linked Data Cloud Lookups (e.g., perform UMBEL lookup to add "isAbout" fidelity to graph and then lookup DBpedia and other LOD instance data enclaves for Identical individuals and connect via "owl:sameAs")
    5. RDF Linked Data Graph projection that uses the description of the container information resource to expose the URIs of the distilled entities.
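    Below is a deliberately simplified sketch of those steps in Python (with the third-party rdflib package). Real Sponger cartridges are far more involved; the extract_entities() stub, the example.org vocabulary, and the proxy URI scheme are all purely illustrative, and step 4 (the LOD cloud lookups) is omitted.

        from urllib.parse import quote
        from rdflib import Graph, Literal, Namespace, URIRef
        from rdflib.namespace import RDF, RDFS

        EX = Namespace("http://example.org/schema#")       # hypothetical vocabulary (stands in for step 2)

        def extract_entities(source_url):
            """Step 1 (stub): a real cartridge would fetch and parse the page, returning typed entities."""
            return [{"name": "Opera Software", "type": EX.Company}]

        def proxy_uri(source_url, name):
            """Step 3: mint a de-referencable HTTP proxy URI for an extracted entity."""
            return URIRef("http://example.org/about/" + quote(source_url, safe="") + "#" + quote(name))

        def rdfize(source_url):
            g = Graph()
            doc = URIRef(source_url)
            for entity in extract_entities(source_url):
                uri = proxy_uri(source_url, entity["name"])
                g.add((uri, RDF.type, entity["type"]))
                g.add((uri, RDFS.label, Literal(entity["name"])))
                g.add((doc, EX.mentions, uri))             # step 5: the container document exposes the entity URIs
            return g

        print(rdfize("http://www.crunchbase.com/company/opera-software").serialize(format="turtle"))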

    The animation that follows illustrates the process (5,000 feet view), from grabbing resources via HTTP GET, to injecting RDF Linked Data back into the Web cloud:

    Note: the Shredder is a Generic Cartridge, so you would have one of these per data source type (information resource type).

    ]]>
    What is Linked Data oriented RDF-ization?http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1453Tue, 07 Oct 2008 21:35:24 GMT32008-10-07T17:35:24-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    I've just read a really nice post by Henry Story titled: Are OO Languages Autistic?

    In typical style, Henry walks you through his point of view using simple but powerful illustrations. Here is a key statement in his post that really struck me:

    "In order to be able to have a mental theory one needs to be able to understand that other people may have a different view of the world. On a narrow three dimensional understanding of 'view', this reveals itself in that people at different locations in a room will see different things. One person may be able to see a cat behind a tree that will be hidden to another. In some sense though these two views can easily be merged into a coherent description."

    Opaque Web pages (e.g., generated by Semantic Technology inside offerings that will not expose or share data entity URIs), irrespective of how smart the underlying page generation and visualization technology may be, are fundamentally autistic and counterintuitive as we move toward a Web of Linked Data.

    Preoccupation with the "V" aspect of the M-V-C trinity is inadvertently compounding the problem of digital autism on the Web. Unbeknownst to the purveyors of data silos and proprietary service lock-in, digital autism on the Web ultimately implies Web business model autism.

    ]]>
    View Plurality Deficiency & Programming Language Autismhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1441Wed, 17 Sep 2008 14:54:48 GMT12008-09-17T10:54:48.000004-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    Ubiquity from Mozilla Labs provides an alternative entry point for experiencing the "Controller" aspect of the Web's natural compatibility with the MVC development pattern. As I've noted (in various posts), Web Services, as practiced by the REST oriented Web 2.0 community or the SOAP oriented SOA community within the enterprise, is fundamentally about the "Controller" aspect of MVC.

    Ubiquity provides a commandline interface for direct invocation of Web Services. For instance, in our case, we can expose Virtuoso's in-built RDF Middleware ("Sponger") and Linked Data deployment services via a single command of the form: describe-resource <url>

    To experience this neat addition to Firefox you need to do the following:

    1. Download and install the Ubiquity Extension for Firefox
    2. Subscribe to the OpenLink Command for Resource Description
    3. Click on CTRL+Space (Windows / Linux) or Option+Space (Mac OS X)
    4. Type in: describe-resource <a-web-resource-url>

    How to unsubscribe

    At the current time, you need to do this if you've installed commands using ubiquity 0.1.0 and seek to use newer versions of the same commands after upgrading to ubiquity 0.1.1.
    1. To unsubscribe, type "about:ubiquity" into the browser address bar
    2. Click on the unsubscribe links associated with your command subscription list

    Enjoy!

    ]]>
    Linked Data, Ubiquity Commands, and Resource Descriptions (Update 3)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1430Mon, 08 Sep 2008 13:00:51 GMT72008-09-08T09:00:51-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Virtuoso that have been on the Web for a while.

    Remember, Virtuoso offers data management, data access, web application server, enterprise service bus, and virtualization of disparate and heterogeneous data sources, as part of a single, multi threaded, cross-platform server solution; hence its description as a "Universal Server".

    Conceptual View:



    Technical View (kinda missing PHP, Perl, Python runtime hosting in the Virtual Application Server realm):




    Virtuoso's architecture is not a reaction to current trends. The diagrams above are pretty old (with minor touch ups in recent times). At OpenLink Software, we've had a consistent world-view re. standards and the vital role they play when it comes to developing software that enables the construction and exploitation of "Context Lenses" that tap into a substrate of Virtualized Logical Data Sources (SQL, XML, RDF, Web Services, Full Text, etc.).




    ]]>
    Virtuoso's Universal Server Architecture (Conceptual & Technical)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1406Tue, 05 Aug 2008 22:07:45 GMT32008-08-05T18:07:45-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    It's getting really hot in Linked Data land! Two days ago Benjamin Nowack pinged the LOD community about his RDFization of CrunchBase (sample (X)HTML view: http://cb.semsol.org/company/opera-software), courtesy of CrunchBase releasing an API. As you know, I've always equated Web Service APIs to Database CLIs (ODBC, JDBC, ADO.NET, etc.), as both offer code level hooks into Data Spaces.

    Naturally, we've decided to join the CrunchBase RDFization party, and have just completed a Virtuoso Sponger Cartridge (an RDFizer) for CrunchBase. What we add in our particular cartridge is additional meshing with the DBpedia and Wikicompany Linked Data Spaces, plus RDFization of the CrunchBase (X)HTML pages :-)

    As I've postulated for a while, Linked Data is about data "Meshing" and "Meshups". This isn't a buzzword play. I am pointing out an important distinction between "Mashups" and "Meshups", which goes as follows: "Mashups" are about code level joining devoid of structured modelling, hence the revelation of code, as opposed to data, when you look behind a "Mashup". "Meshups", on the other hand, are about joining disparate structured data sources across the Web. And when you look behind a "Meshup" you see structured data (preferably Linked Data) that enables further "Meshing".

    I truly believe that we are now inches away from critical mass re. Linked Data, and because we are dealing with data, the network-effect will be sky-high! I shudder to think about the state of the Linked Data Web in 12 months time. Yes, I am giving the explosion 12 months (or less). These are very exciting times.

    Demo Links:

    For best experience I encourage you to look at the OpenLink Data Explorer extension for Firefox (2.x - 3.x). This enables you to go to Crunchbase (X)HTML pages (and other sites on the Web of course), and then simply use the "View | Linked Data Sources" main or context menu sequence to unveil the Linked Data Sources associated with any Web Page.

    Of course there is much more to come!

    ]]>
    CrunchBase gets hooked up with the Linked Data Web! http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1395Wed, 30 Jul 2008 01:43:27 GMT32008-07-29T21:43:27-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    The build up to Linked Data Planet continues... Here is semanticweb.com's interview with Jim Hendler and *I* titled: Linked Data Leaders - The Semantic Web is Here.

    ]]>
    Internet.com Interviews Jim Hendler & Ihttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1385Thu, 12 Jun 2008 00:55:15 GMT12008-06-11T20:55:15-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    As I start my countdown to the upcoming Linked Data Planet conference, here is the first of a series of posts geared towards showcasing practical use of the burgeoning Linked Data Web.

    First up, the Library of Congress, take a look at the following pages which are "Human" and machine based "User Agent" friendly:

    Key point: The pages above are served up in line with Linked Data deployment and publishing tenets espoused by the Linking Open Data Community (LOD) which include (in my preferred terminology):

    • Giving "Names" to things you observe (aka Data Source Names or "DSNs" for short)
    • Use HTTP URLs in your data source naming scheme so that "access by reference" to your data sources exploits the expanse of the HTTP driven Web, i.e., make your DSNs "Linked Data Source Names" (LDSNs)
    • Remember that Documents / Pages are compound in nature, and they aren't the only data sources we would want to name; a document's LDSN must be distinct from the LDSNs used for the subject matter concepts and/or named entities associated with a document
    • Use the RDF Data Model to express structure within your data source(s)
    • Use LDSNs when constructing statements/claims/assertions/records (triples) inside your structured data sources
    • When publishing Web Pages related to your data sources, use at least one of the following methods to guide user agents to the data sources associated with your published page: the HTML LINK tag, RDFa, GRDDL, or Content Negotiation.
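    As a small illustration of the last tenet from the consuming side, here is a hedged Python 3 sketch (standard library only) of a user agent discovering a page's associated data source via the HTML LINK tag. The page URL is hypothetical, and RDFa, GRDDL, and content negotiation are the other routes mentioned above.

        from html.parser import HTMLParser
        from urllib.request import urlopen

        class AlternateDataLinks(HTMLParser):
            """Collects <link rel="alternate"> elements that advertise an RDF representation."""
            def __init__(self):
                super().__init__()
                self.data_links = []

            def handle_starttag(self, tag, attrs):
                a = dict(attrs)
                if tag == "link" and a.get("rel") == "alternate" and "rdf" in (a.get("type") or ""):
                    self.data_links.append(a.get("href"))

        page_url = "http://example.org/some-published-page"   # hypothetical page following the tenets above
        parser = AlternateDataLinks()
        parser.feed(urlopen(page_url).read().decode("utf-8", "replace"))
        print(parser.data_links)    # the data source(s) a Linked Data aware agent would fetch next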

    The items above are features that users and decision makers should start to hone in on when seeking, and evaluating, platforms that facilitate cost-effective exploitation of the Linked Data Web.

    ]]>
    Linked Data in Action: Library of Congresshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1384Wed, 11 Jun 2008 17:16:31 GMT22008-06-11T13:16:31.000010-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Yihong Ding has posted an interesting series of posts under the banner: Web Evolution. Post number 4 in the series covers: Web Evolution and Human Growth. This particular post is orthogonal (related but independent) to some of my earlier posts about Web Evolution.]]>Web Evolutionhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1367Tue, 27 May 2008 11:45:51 GMT22008-05-27T07:45:51.000003-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com> Daniel Lewis has penned a post titled: Clearing up some misconceptions..again, in response to Ben Werdmuller's post titled: Introducing the Open Data Definition.

    The great thing about the Linked Data Web is that it's much easier to discover and respond to these points of view before the ink dries :-) Ben certainly needs to take a look at the Semantic Web FAQ pre or post assimilation of Daniel's response.

    ]]>
    Clearing Up RDF misrepresentation once again!http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1352Wed, 30 Apr 2008 16:07:58 GMT12008-04-30T12:07:58.000001-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Wordpress is a Weblog platform comprised of the following:

    1. User Interface - PHP
    2. Application Logic - PHP
    3. Data Storage (SQL RDBMS) - MySQL via PHP-MySQL
    4. Application Server - Apache

    In the form above (the norm), Wordpress data can be injected into the Linked Data Web via RDFization middleware such as the Virtuoso Sponger (built into all Virtuoso instances) and Triplr. The downside of this approach is that the blog owner doesn't necessarily possess full control over their contributions to the emerging Giant Global Graph of Linked Data.

    Another route to Linked Data exposure is via Virtuoso's Metaschema Language for producing RDF Views over ODBC/JDBC accessible Data Sources, that enables the following setup:

    1. User Interface - PHP
    2. Application Logic - PHP
    3. Data Storage (SQL RDBMS) - MySQL via the PHP-MySQL data access interface
    4. Virtual Database linkage of MySQL Tables into Virtuoso
    5. RDF View generated over the Virtual SQL Tables
    6. Application Server - Virtuoso which provides Linked Data Deployment such that RDF Linked Data is exposed when requested by Web User Agents.

    Alternatively, you can also exploit Virtuoso as the SQL DBMS, RDF DBMS, Application Server, and Linked Data Deployment platform:

    1. User Interface - PHP
    2. Application Logic - PHP
    3. Data Storage (SQL RDBMS) - Virtuoso via PHP-ODBC data access interface (* ODBC is Virtuoso's native SQL CLI/API *)
    4. RDF View generated over the Native SQL Tables
    5. Application Server - Virtuoso which provides Linked Data Deployment such that RDF Linked Data is exposed when requested by Web User Agents (e.g. OpenLink RDF Browser, Zitgist Data Viewer, DISCO Hyperdata Browser, and Tabulator).

    Benefits?

    • Each user account gets a proper Linked Data URI (ID) that can be meshed/smushed with other IDs (so you can add data from this new blog space to other linked data sources associated with your other URIs/IDs)
    • Each post gets a proper URI
    • All data is now query-able via SPARQL
    • Discoverability increases exponentially (without a drop in relevance in either direction, i.e. discovering or being discovered)

    How Do I map the WordPress SQL Schema to RDF using Virtuoso?

    • Determine the RDF Schema or Ontologies that define the Classes for which you will be producing instance data (e.g. SIOC and FOAF)
    • Declare URI/IRI generator functions (*special Virtuoso functions*)
    • Use SPARQL Graph patterns to apply URI/IRI generator functions to Tables, Views, Table Values mode Stored Procedures, Query Resultsets as part of RDBMS to RDF mapping
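    To be clear, the snippet below is not Virtuoso's Meta Schema Language; it is just a generic Python sketch (sqlite3 plus the third-party rdflib package) of the idea behind an RDF View: apply an IRI generator function to SQL rows and project the result as RDF. The table layout mimics WordPress' wp_posts, and the example.org base URI and SIOC usage are approximations.

        import sqlite3
        from rdflib import Graph, Literal, Namespace, URIRef
        from rdflib.namespace import RDF, RDFS

        SIOC = Namespace("http://rdfs.org/sioc/ns#")
        BASE = "http://example.org/wordpress/"                 # hypothetical deployment base

        def post_iri(post_id):
            """IRI generator function: one de-referencable URI per wp_posts row."""
            return URIRef(f"{BASE}post/{post_id}#this")

        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE wp_posts (ID INTEGER, post_title TEXT, post_content TEXT)")
        conn.execute("INSERT INTO wp_posts VALUES (1, 'Hello Linked Data', 'First post...')")

        g = Graph()
        for post_id, title, content in conn.execute("SELECT ID, post_title, post_content FROM wp_posts"):
            uri = post_iri(post_id)
            g.add((uri, RDF.type, SIOC.Post))                  # the Class the row maps to
            g.add((uri, RDFS.label, Literal(title)))
            g.add((uri, SIOC.content, Literal(content)))

        print(g.serialize(format="turtle"))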

    Read the Meta Schema Language guide or simply apply our "WordPress SQL Schema to RDF" script to your Virtuoso hosted instance. Of course, there are other mappings that cover other PHP applications deployed via Virtuoso:

    Live Demos?

    ]]>
    Adding Wordpress Blogs into the Linked Data Web using Virtuosohttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1333Thu, 10 Apr 2008 16:33:05 GMT42008-04-10T12:33:05.000003-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    John Schmidt, from Informatica, penned an interesting post titled: IT Doesn't Matter - Integration Does.

    Yes, integration is hard, but I do profoundly believe that what's been happening on the Web over the last 10 or so years also applies to the Enterprise, and by this I absolutely do not mean "Enterprise 2.0" since "2.0" and productive agility do not compute in my realm of discourse.

    Large collections of RSS feeds, Wikiwords, Shared Bookmarks, Discussion Forums, etc., when disconnected at the data level (i.e. hosted in pages with no access to the "data behind"), simply offer information deluge and inertia (there are only so many hours for processing opaque information sources in a given day).

    Enterprises fundamentally need to process information efficiently as part of a perpetual assessment of their relative competitive Strengths, Weaknesses, Opportunities, and Threats (SWOT), in existing and/or future markets. Historically, IT acquisitions have run counterintuitively to the aforementioned quest for "Agility", due to the predominance of the "rip and replace" approach to technology acquisition that repeatedly creates and perpetuates information silos across Application, Database, Operating System, and Development Environment boundaries. The sequence of events typically occurs as follows:

    1. applications are acquired on a problem by problem basis
    2. back-end application databases are discovered once ad-hoc information views are sought by information workers
    3. back-end database disparity across applications is discovered once holistic views are sought by knowledge workers (typically domain experts).

    In the early to mid 90's (pre ubiquitous Web), operating system, programming language, and development framework independence inside the enterprise was technically achievable via ODBC (due to its platform independence). That said, DBMS specific ODBC channels alone couldn't address the holistic requirements associated with Conceptual Views of disparate data sources, hence the need for Data Access Virtualization via Virtual Database Engine technology.

    Just as is the case on the Web today, with the emergence of the "Linked Data" meme, enterprises now have a powerful mechanism for exploiting the Data Integration benefits associated with generating Data Objects from disparate data sources, endowed with HTTP based IDs (URIs).

    Conceptualizing access to data exposed via Database APIs, SOA based Web Services (SOAP style Web Services), Web 2.0 APIs (REST style Web Services), XML Views of SQL Data (SQLX), pure XML, etc., is the problem area addressed by RDF aware middleware (RDFizers, e.g., the Virtuoso Sponger).

    Here are examples of what SQL Rows exposed as RDF Data Objects (identified using HTTP based URIs) would look like outside or behind a corporate firewall:

    What's Good for the Web Goose (Personal Data Space URIs) is good for the Enterprise Gander (Enterprise Data Space URIs).

    Related

    ]]>
    Linked Data is vital to Enterprise Integration driven Agilityhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1325Sat, 22 Mar 2008 18:13:41 GMT22008-03-22T14:13:41.000002-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Daniel Lewis has published another post about OpenLink Data Spaces (ODS) functionality titled: A few new features in OpenLink Data Spaces, that exposes additional features (some hot out of the oven).

    OpenLink Data Spaces (ODS) now officially supports:

    Which means that OpenLink Data Spaces support all of the main standards being discussed in the DataPortability Interest Group!

    APML Example:

    All users of ODS automatically get a dynamically created APML file, for example: APML profile for Kingsley Idehen

    The URI for an APML profile is: http://myopenlink.net/dataspace/<ods-username>/apml.xml

    Meaning of a Tag Example:

    All users of ODS automatically have tag cloud information embedded inside their SIOC file, for example: SIOC for Kingsley Idehen on the Myopenlink.net installation of ODS.

    But even better, MOAT has been implemented in the ODS Tagging System. This has been demonstrated in a recent test blog post by my colleague Mitko Iliev, the blog post comes up on the tag search: http://myopenlink.net/dataspace/imitko/weblog/Mitko%27s%20Weblog/tag/paris

    Which can be put through the OpenLink Data Browser:

    OAuth Example:

    OAuth Tokens and Secrets can be created for any ODS application. To do this:

    1. you can log in to MyOpenlink.net beta service, the Live Demo ODS installation, an EC2 instance, or your local installation
    2. then go to ‘Settings’
    3. and then you will see ‘OAuth Keys’
    4. you will then be able to choose the applications that you have instantiated and generate the token and secret for that app.

    Related Document (Human) Links

    Remember (as per my most recent post about ODS), ODS is about unobtrusive fusion of Web 1.0, 2.0, and 3.0+ usage and interaction patterns. Thanks to a lot of recent standardization in the Semantic Web realm (e.g., SPARQL), we now employ the MOAT, SKOS, and SCOT ontologies as vehicles for Structured Tagging.

    Structured Tagging?

    This is how we take a key Web 2.0 feature (think 2D in a sense) and bend it to create a Linked Data Web (Web 3.0) experience unobtrusively (see earlier posts re. Dimensions of the Web). Thus, nobody has to change how they tag or where they tag; just expose ODS to the URLs of your Web 2.0 tagged content and it will produce URIs (Structured Data Object Identifiers) and a linked data graph for your Tags Data Space (nee. Tag Cloud). ODS will construct a graph which exposes tag subject association, tag concept alignment / intended meaning, and tag frequencies, that ultimately deliver "relative disambiguation" of intended Tag Meaning (i.e. you can easily discern the tagger's meaning via the Tag's actual Data Space, which is associated with the tagger). In a nutshell, the dynamics of relevance matching, ranking, and the like change immensely, without futile, timeless debates about matters such as:

      What's the Linked Data value proposition?
      What's the Linked Data business model?
      What's the Semantic Web Killer application?

    We can just get on with demonstrating Linked Data value using what exists on the Web today. This is the approach we are deliberately taking with ODS.

    Related Items


    Tip: This post is best viewed via an RDF aware User Agent (e.g. a Browser or Data Viewer). I say this because the permalink of this post is a URI in a Linked Data Space (My Blog) comprised of more data than meets the eye (i.e. what you see when you read this post via a Document Web Browser) :-)

    ]]>
    Additional OpenLink Data Spaces Featureshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1315Mon, 11 Feb 2008 16:38:03 GMT22008-02-11T11:38:03.000006-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    According to current media:

    Senator Barack Obama is a beacon of change within the democratic party while Senator Hillary Clinton is status quo.

    According to the data in the GovTrack.us data space:

    Senator Barack Obama is a rank-and-file Democrat, according to GovTrack's analysis of his track record in Congress, whereas Senator Hillary Clinton is a radical Democrat, according to the same GovTrack analysis of her track record in Congress.

    Who do we believe? The GovTrack.us performance data, old media pundits, or the postulations of the candidates? GovTrack.us is a new approach to candidate vetting. It provides data in traditional Document Web and Linked Data Web forms, placing analytic power in the hands of the citizen.

    Here are insights into the track records of Senators Hillary Clinton and Barack Obama via the Zitgist Linked Data Viewer:

    1. Senator Hillary Clinton
    2. Senator Barack Obama

    Note: I am not aligned to any political party or candidate, this is just a demonstration of Linked Data that has a high degree of poignancy relative to US primary elections etc..

    ]]>
    Politics, Old Media, and Linked Datahttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1290Mon, 07 Jan 2008 17:22:15 GMT22008-01-07T12:22:15.000002-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Bearing in mind we are all time challenged, here are links to OpenLink and Zitgist RDF Browser views of my earlier blog post re. Hyperdata & Linked Data.

    Both browsers should lead you to the posts from Danny, Nova, and Tim. In both cases, the permalink URI of this post (within my Blog Data Space) is a pointer to structured data, provided your user agent (browser or other Web Client) requests an RDF representation of this post via its HTTP request payload (which the browsers above do via the "Accept:" headers).

    As you can see the Data Web is actually here! Without RDF generation upheaval (or Tax).

    ]]>
    RDF Browser View of My Hyperdata & Linked Data Posthttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1253Thu, 20 Sep 2007 01:26:02 GMT52007-09-19T21:26:02-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Over the last few hours the FOAF project received a wakeup call via Dan Brickley's FOAF 0.9 "touch" effort.

    Naturally, this triggered an obvious opportunity to demonstrate the prowess of Linked Data on the Semantic Web. What follows is a quick dump of what I sent to the foaf-dev mailing list:

    Here are a variety of FOAF Views built using:

    Enabling you to explore the following lines:

    ]]>
    Exploring FOAF Linked Data Style!http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1202Fri, 25 May 2007 18:36:47 GMT32007-05-25T14:36:47-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    The term "Community" is beginning to resonate across the increasing number of conversations centered around the growing appreciation of the Semantic Data Web Vision. I've been troubled in the past by the once growing tendency to disconnect social networks from Graph based Conceptual Data Models as expressed in the underlying infrastructure (Data Management layer) of many first generation social networking services.

    Last week, John Breslin published a post that contained a very nice presentation of what is best described as "Objects of Our Sociality". The presentation provides insight into the elements that collectively drive the creation of People & Data networks (communities). The presentation certainly unveils the often forgotten fact that although People & Data network construction is always socially driven, our intentions aren't always amorous :-)

    At the core of the Semantic Data Web vision is the desire to leverage the "network effects" that communities provide, while exponentially reducing the cost of knowledge creation, discovery, and exchange in the process.

    In short, the Semantic Data Web ultimately enables us to collectively do our bit for a greater good! Thus, quoting TimBL, "you do your bit and others will do theirs" :-)

    ]]>
    It's the Community, Cupid!http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1193Tue, 05 Feb 2008 04:20:25 GMT12008-02-04T23:20:25.000002-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    As I continue my quest to unravel the thinking and vision behind the "Universal Server" branding of Virtuoso, it always simplifies matters when I come across articles that bring context to this vision.

    Tim Berners-Lee provided a keynote at WWW2004 earlier this week, and Paul Ford provided a keynote breakdown from which I have scrapped a poignant excerpt that helps me illuminate Virtuoso's role in the inevitable semantic web.

    First off, I see the Semantic Web as a core component of Web 2.x (a minor upgrade of Web 2.0), and I see Virtuoso as a definitive Web 2.0 (and beyond) technology, hence the use today of the branding term "Universal Server". A term that I expect to become a common product moniker in the not too distant future.

    The first challenge that confronts the semantic web is the creation of Semantic content. How will the content be created? Ideally, this should come from data; at the end of the day this is a data contextualization process. The excerpt below from Paul's article highlights the point:

    Rather than concerning themselves unduly with hewing to existing ontologies, Berners-Lee pushed developers to start using RDF and triples more aggressively. In particular, he wants to see existing databases exported as RDF, with ontologies created ad-hoc to match the structure of that data. Rather than using PHP scripts only to produce HTML, he suggested, create RDF as well. Then, when all of the RDF is aggregated, apply rules and see what happens. "Let's not fall back on handmade markup."

    Data in existing databases does not have to be exported as RDF, especially if sensitivity to change is a specific contextual requirement. Naturally, the assumption is made that most databases don't have the ability to produce RDF, so an additional tool would be required to perform the data exports and transformation, and then a separate HTTP server makes this repurposed RDF data accessible over HTTP.

    Later in the talk, he described a cascade of Semantic Web connections, postulating that one day, individuals may be able to follow links from a parts catalog to order status, from location to weather to taxes.

    The final excerpt (above) outlines the kinds of interactions that the Semantic Web facilitates. The traversal from a "part catalog" to "order status", or from "location" to "weather" to "taxes", illustrates the roles that services and service orchestration will also play in the Semantic Web era.

    Thus, we can safely deduce the following about the semantic web:

    1. It has RDF at its foundation
    2. We need to transform existing data into RDF; ideally retaining sensitivity to changes
    3. Allows ontologies to be associated with RDF post generation
    4. RDF graph navigation will be event driven and orchestrated (the cascading effect)
    5. There will be an RDF Query Language (there are several burgeoning ones currently)
    6. HTTP will be the prime transport protocol

    I would also like to conclude that what we know today, as the monolithic "point of presence" on the web called a "Web Site" (which infers browsing and page serving), is naturally going to morph into a different kind of "point of presence" that is capable of delivering the following from a single process:

    1. Serve up Semantic Data from existing data sources
    2. Provide execution endpoints for Web Services
    3. Provide an instigation point for events that trigger Service Orchestration

    This is what Virtuoso is all about, and why it is described as a "Universal Server"; a server instance that speaks many protocols, delivering a plethora of functionality (Database, Web Services Platform, Orchestration Engine, and more).

    ]]>
    Semantic Web brings clarity to the Universal Server concepthttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1190Mon, 23 Apr 2007 16:42:13 GMT12007-04-23T12:42:13-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Mike Bergman has written a very detailed article about OpenLink Software and its product portfolio that basically answers the question: What has OpenLink been Up To?

    As the company's founder, it was quite compelling to read a third party article that accurately navigates and articulates the depth of work that we've undertaken since that seminal moment in 1997 when we decided to extend our product portfolio beyond the Universal Data Access Drivers family.

    Of course I also take this opportunity to slip in another Semantic Data Web demo :-) Thus, take a look at this mother of all blog posts from Mike via the following:

    1. OpenLink RDF Browser Session
    2. Dynamic Data Web Page

    Note: In both cases above, you use the "Explore" or "Dereference" options of the Data Link (typed hyperlink) to traverse the RDF data that has been materialized "on the fly" courtesy of Virtuoso's in-built RDF Middleware (called the Sponger).

    BTW - I am assembling a collection of interesting DBpedia based Dynamic pages that showcase the depth of knowledge available from Wikipedia. If you're a current or future technology entrepreneur (or VC trying to grok the Semantic Web) then you certainly need to look at:

    1. Venture Capital
    2. Venture Capital Firms
    3. Venture Capitalists
    4. Entrepreneurs By Nationality
    ]]>
    What's OpenLink Software been Up To?http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1187Tue, 05 Feb 2008 01:47:40 GMT22008-02-04T20:47:40.000001-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Situation Analysis: Pre or Post Oscars, you want to research Forest Whitaker, Helen Mirren, or Jennifer Hudson. What do you do? Go on a screen scraping and keyword regular expression odyssey? Or do you simply look up a Data Web oriented Data Source like DBpedia?

    Here is what I was able to knock together using my SPARQL QBE (without writing the SPARQL by hand):

    1. Forest Whitaker Data
    2. Helen Mirren Data
    3. Jennifer Hudson Data.

    Note: Just select the "Explore" option when the link-lookup window appears in response to you clicking on any of the links. That said, if you are using the Firefox Linkification extension the page will not work properly (as per this discussion about disabling Linkification) :-(

    BTW - I have a comments page, so don't be shy about showing me how you could produce this kind of data driven web page much quicker than I have :-)

    Warning: IE6 and Safari (use Webkit instead) cannot process these pages due to the use of Ajax.

    ]]>
    Using The Data Web to Research Oscar Winnershttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1146Tue, 27 Feb 2007 05:29:02 GMT112007-02-27T00:29:02-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    The W3C RDF Data Access Working Group recently released an initial public Working Draft specification for "RDF Data Access Use Cases and Requirements". Naturally, this triggered discussion on the RDF mailing list along the following lines:

    In section, 4.1 Human-friendly Syntax, you say  "There must be a text-based form of the query language which can be read and written by users of the language",  and you list the status as "pending".

    As background for section 4.1, you may be interested in RDFQueryLangComparison1 (original text replaced with live link).

    It shows how to write queries in a form that includes English meanings.

    The example queries can be run by pointing a browser to www.reengineeringllc.com .

    Perhaps importantly, given the intricacy of RDF for nonprogrammers, one can get an English explanation of the result of each query.

    -- Dr. Adrian Walker of Internet Business Logic

    The Semantic Web continues to take shape, and Infonauts (information centric agents) are already emerging.

    A great thing about the net is the "back to the future" nature of most Web and Internet technology. For instance, we are now frenzied about Service Oriented Architecture (SOA), Event Driven Architecture (EDA), Loose Coupling of Composite Services, etc. Basically, we are rehashing the CORBA vision.

    I see the Semantic Web playing a similar role in relation to artificial intelligence.

BTW - It still always comes down to data, and as you can imagine Virtuoso will be playing its usual role of alleviating the practical implementation and utilization challenges of all of the above :-)

     

    ]]>
    Comparison of RDF Query Languageshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1100Thu, 14 Dec 2006 20:53:29 GMT12006-12-14T15:53:29-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
I have written extensively about "Presence", "Data Spaces", and "Open Access to Data". What I haven't emphasized is how "Identity" brings this all together, primarily because I didn't have something to demonstrate, or point to, coherently.

    Anyway, we now have OpenID support in OpenLink Data Spaces (ODS) which coincides nicely with the growing support of OpenID across the web.

The beauty of OpenID support in ODS is that I now have a URL that meshes with my identity (at least in line with what I have chosen to share with the public via the Web). For instance, http://www.openlinksw.com/dataspace/kidehen@openlinksw.com is my OpenID as well as my personal URI (look closely at this link and you have a map of my Data Space).

To really understand what I am getting at here, you should open up my OpenID URL using one of the following (a rough sketch of what such tools do follows the list):

    1. Semantic Radar
    2. PiggyBank
    3. SIOC Enabled Wiki
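In essence, tools like these discover and fetch the structured data behind my OpenID URL rather than its HTML view. A minimal sketch of that idea, assuming the server content-negotiates RDF/XML for the URI (the header and follow-up steps are illustrative, not a description of any one tool):

```typescript
// Sketch: ask a personal URI / OpenID URL for machine-readable data instead of HTML.
// Assumes the server content-negotiates; otherwise it just reports what it received.
const personalUri = "http://www.openlinksw.com/dataspace/kidehen@openlinksw.com";

async function fetchAsRdf(uri: string): Promise<void> {
  const response = await fetch(uri, {
    headers: { Accept: "application/rdf+xml" }, // prefer RDF over the default HTML view
  });
  const contentType = response.headers.get("content-type") ?? "unknown";
  const body = await response.text();
  console.log(`Got ${contentType}, ${body.length} bytes`);
  // A SIOC/FOAF-aware client would now parse the RDF and follow rdfs:seeAlso links
  // to map out the rest of the Data Space.
}

fetchAsRdf(personalUri).catch(console.error);
```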

    To be continued....

    ]]>
    OpenID meets Data Spaces etc..http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1048Tue, 26 Sep 2006 05:42:04 GMT12006-09-26T01:42:04.000001-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
Additional commentary from Orri Erling re. ORDBMS, ADO.NET vNext, and RDF (in relation to Semantic Web Objects):

    More Thoughts on ORDBMS Clients, .NET and RDF:

    Continuing on from the previous post... If Microsoft opens the right interfaces for independent developers, we see many exciting possibilities for using ADO .NET 3 with Virtuoso.

    Microsoft quite explicitly states that their thrust is to decouple the client side representation of data as .NET objects from the relational schema on the database. This is a worthy goal.

But we can also see other possible applications of the technology when we move away from strictly relational back ends. This can go in two directions: towards object-oriented databases and towards making applications for the Semantic Web.

In the OODBMS direction, we could equate Virtuoso table hierarchies with .NET classes and create a tighter coupling between client and database, going as it were in the other direction from Microsoft's intended decoupling. For example, we could do typical OODBMS tricks such as prefetch of objects based on storage clustering. The simplest case of this is like virtual memory, where the request for one byte brings in the whole page or group of pages. The basic idea is that what is created together probably gets used together, and if all objects are modeled as subclasses (subtables) of a common superclass, then, regardless of instance type, what is created together (has consecutive ids) will indeed tend to cluster on the same page. These tricks can deliver good results in very navigational applications like GIS or CAD. But these are rather specialized things and we do not see OODBMS making any great comeback.

But what is more interesting and more topical at present is making clients for the RDF world. There, the OWL ontology language could be used to make the .NET classes, and the DBMS could, when returning URIs serving as subjects of triples, include specified predicates on these subjects, enough to allow instantiating .NET instances as 'proxies' of these RDF objects. Of course, only predicates for which the client has a representation are relevant, thus some client-server handshake is needed at the start. The data that could be prefetched is the intersection of a concise bounded description and what the client has classes for. The rest of the mapping would be very simple, with IRIs becoming pointers, multi-valued predicates becoming lists, and so on. IRIs for which the RDF type is not known or inferable could be left out or represented as a special class with name-value pairs for its attributes; the same goes for blank nodes.

In this way, .NET's considerable UI capabilities could be directly exploited for visualizing RDF data, provided that the data complied reasonably well with a known ontology.

If a SPARQL query returned a result set, IRI-typed columns would be returned as .NET instances and the server would prefetch enough data for filling them in. For a SPARQL CONSTRUCT, a collection object could be returned with the objects materialized inside. If the interfaces allow passing an Entity SQL string, these could possibly be specialized to allow for a SPARQL string instead. LINQ might have to be extended to allow for SPARQL-type queries, though.
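To make the "proxy" idea concrete, here is a minimal sketch, written in TypeScript purely for illustration rather than .NET, of materializing typed client objects from a bag of S-P-O triples keyed on rdf:type. The class registry, the FOAF type IRI, and the Person class are example assumptions; a real implementation would generate the classes from OWL as described above.

```typescript
// Sketch: materialize typed "proxy" objects from RDF triples, keyed by rdf:type.
// The triples, class registry, and Person class are invented purely for illustration.
interface Triple { s: string; p: string; o: string; }

const RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

class Person {            // stands in for a class generated from an OWL ontology
  iri = "";
  props: Record<string, string[]> = {};
}

const classRegistry: Record<string, new () => Person> = {
  "http://xmlns.com/foaf/0.1/Person": Person,
};

function materialize(triples: Triple[]): Map<string, Person> {
  const instances = new Map<string, Person>();
  // First pass: instantiate a proxy for every subject whose rdf:type we have a class for.
  for (const t of triples) {
    if (t.p === RDF_TYPE && classRegistry[t.o]) {
      const obj = new classRegistry[t.o]();
      obj.iri = t.s;
      instances.set(t.s, obj);
    }
  }
  // Second pass: attach remaining predicates as (possibly multi-valued) properties.
  for (const t of triples) {
    const obj = instances.get(t.s);
    if (obj && t.p !== RDF_TYPE) {
      (obj.props[t.p] ??= []).push(t.o);
    }
  }
  return instances;
}
```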

Many of these questions will be better answerable as we get more details on Microsoft's forthcoming ADO .NET release. We hope that sufficient latitude exists for exploring all these interesting avenues of development.

    ]]>
    More Thoughts on ORDBMS Clients, ADO.NET vNext, and RDFhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1012Tue, 18 Jul 2006 18:28:58 GMT32006-07-18T14:28:58.000001-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    Web 2.0 Self-Experiment: "

    I shopped for everything except food on eBay. When working with foreign-language documents, I used translations from Babel Fish. (This worked only so well. After a Babel Fish round-trip through Italian, the preceding sentence reads, 'That one has only worked therefore well.') Why use up space storing files on my own hard drive when, thanks to certain free utilities, I can store them on Gmail's servers? I saved, sorted, and browsed photos I uploaded to Flickr. I used Skype for my phone calls, decided on books using Amazon's recommendations rather than 'expert' reviews, killed time with videos at YouTube, and listened to music through customizable sites like Pandora and Musicmatch. I kept my schedule on Google Calendar, my to-do list on Voo2do, and my outlines on iOutliner. I voyeured my neighborhood's home values via Zillow. I even used an online service for each stage of the production of this article, culminating in my typing right now in Writely rather than Word. (Being only so confident that Writely wouldn't somehow lose my work -- or as Babel Fish might put it, 'only confident therefore' -- I backed it up into Gmail files.
Interesting article. Tim O'Reilly's response is here"

    (Via Valentin Zacharias (Student).)

Tim O'Reilly's response provides the following hierarchy for Web 2.0, based on what he calls "Web 2.0-ness":

Level 3: The application could ONLY exist on the net, and draws its essential power from the network and the connections it makes possible between people or applications. These are applications that harness network effects to get better the more people use them. EBay, craigslist, Wikipedia, del.icio.us, Skype, (and yes, Dodgeball) meet this test. They are fundamentally driven by shared online activity. The web itself has this character, which Google and other search engines have then leveraged. (You can search on the desktop, but without link activity, many of the techniques that make web search work so well are not available to you.) Web crawling is one of the fundamental Web 2.0 activities, and search applications like Adsense for Content also clearly have Web 2.0 at their heart. I had a conversation with Eric Schmidt, the CEO of Google, the other day, and he summed up his philosophy and strategy as "Don't fight the internet." In the hierarchy of web 2.0 applications, the highest level is to embrace the network, to understand what creates network effects, and then to harness them in everything you do.

    Level 2: The application could exist offline, but it is uniquely advantaged by being online. Flickr is a great example. You can have a local photo management application (like iPhoto) but the application gains remarkable power by leveraging an online community. In fact, the shared photo database, the online community, and the artifacts it creates (like the tag database) is central to what distinguishes Flickr from its offline counterparts. And its fuller embrace of the internet (for example, that the default state of uploaded photos is "public") is what distinguishes it from its online predecessors.

    Level 1: The application can and does exist successfully offline, but it gains additional features by being online. Writely is a great example. If you want to do collaborative editing, its online component is terrific, but if you want to write alone, as Fallows did, it gives you little benefit (other than availability from computers other than your own.)

    Level 0: The application has primarily taken hold online, but it would work just as well offline if you had all the data in a local cache. MapQuest, Yahoo! Local, and Google Maps are all in this category (but mashups like housingmaps.com are at Level 3.) To the extent that online mapping applications harness user contributions, they jump to Level 2.

So, in a sense, we have near-conclusive confirmation that Web 2.0 is simply about APIs (typically service-specific Data Silos or Walled Gardens) with little concern, understanding, or interest in truly open data access across the burgeoning "Web of Databases" -- or the Web of "Databases and Programs" that I prefer to describe as "Data Spaces".

    Thus, we can truly begin to conclude that Web 3.0 (Data Web) is the addition of Flexible and Open Data Access to Web 2.0; where the Open Data Access is achieved by leveraging Semantic Web deliverables such as the RDF Data Model and the SPARQL Query Language :-)
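As a small illustration of that access pattern, here is a sketch of one standard query (SPARQL over HTTP) against a hypothetical Data Space endpoint, using the SIOC vocabulary for posts; the endpoint URL is an assumption, not a live service reference.

```typescript
// Sketch: "Web 3.0" access pattern -- one standard query language (SPARQL) over
// structured data, instead of one bespoke API per Web 2.0 silo.
// The endpoint URL below is a hypothetical placeholder.
const endpoint = "http://example-dataspace.example.com/sparql";

const query = `
  PREFIX sioc: <http://rdfs.org/sioc/ns#>
  PREFIX dct:  <http://purl.org/dc/terms/>
  SELECT ?post ?title
  WHERE {
    ?post a sioc:Post ;
          dct:title ?title .
  }
  LIMIT 10
`;

async function listPosts(): Promise<void> {
  const response = await fetch(endpoint + "?query=" + encodeURIComponent(query), {
    headers: { Accept: "application/sparql-results+json" },
  });
  const json = await response.json();
  json.results.bindings.forEach((b: any) => console.log(b.post.value, "-", b.title.value));
}

listPosts().catch(console.error);
```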

    ]]>
    Web 2.0 Self-Experiment aids Web 3.0 comprehensionhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1009Tue, 18 Jul 2006 05:17:43 GMT32006-07-18T01:17:43-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
Microsoft's recent unveiling of the next generation of ADO.NET has pretty much crystallized a long-running hunch that the era of standardized client/user level interfaces for "Object-Relational" technology is nigh. Finally, this application / problem domain is attracting the attention of industry behemoths such as Microsoft.

In an initial response to these developments, Orri Erling, Virtuoso's Program Manager, shares valuable insights from past Object-Relational technology developments and delivery challenges. As Orri notes, the Virtuoso team suspended ORM and ORDBMS work at the onset of the Kubl-Virtuoso transition due to the lack of standardized client-side functionality exposure points.

My hope is that Microsoft's efforts trigger community-wide activity that results in a collection of interfaces that make scenarios such as generating .NET-based Semantic Web Objects (where the S in an S-P->O RDF triple becomes a bona fide .NET class instance generated from OWL) a practical reality.

    To be continued since the interface specifics re. ADO.NET 3.0 remain in flux...

    ]]>
    Object Relational Rediscovered?http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1005Fri, 14 Jul 2006 01:59:16 GMT22006-07-13T21:59:16.000002-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    By Harry Fuecks
Here's a question: what if I was to tell you that you can write your own version of Word using something like HTML and JavaScript? What if I added that you could run it from your hard disk or launch it directly from your Web server and use it to update your site's content? It sounds a little far-fetched, I know, but it's right here, right now -- and it calls itself "Zool".

Here's what this three-part series will cover:

    • The XUL Revolution: just who is Zool?
• Back to School: time to dust off that JavaScript...
• Zoolology: getting ready to fire up your first XUL application
    • 3D Browsing with XUL: straight in at the deep end.
    • Desperately Seeking: the search is over.
    • Takeaway Menu: with fries please!
    • But no one uses Mozilla: back to browser detection.
    • The Rise of the Rich Client: the future is XUL.

    Part 1

    My Comments:
I am a firm believer in the possibilities presented by XUL. It will enable the bundling of UI, Data, and Data Manipulation logic (Application or Module) as part of a payload hosted on a server like Virtuoso. Basically, I anticipate the emergence of an IDE that is able to persist its UI components (widgets) and UI behaviour as XML using the XUL grammar. Then along comes a XUL Processor that is able to emit XUL-based UI payloads (via user-agent-aware transformation) as:
    .NET/Mono Windows Forms assemblies
    Javascript
    Flash MX
    XUL (If we know the client is Mozilla or Firebird for instance)
    .....
    I think this is a Virtuoso demo in the making :-)
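Here is a toy sketch of the user-agent-aware transformation idea: one stored UI description, emitted as XUL, JavaScript, or plain HTML depending on the client. The UI description, detection rules, and output formats are all invented for illustration.

```typescript
// Toy sketch of user-agent-aware emission of a stored UI description.
// The UI description, target formats, and detection rules are all illustrative.
type Target = "xul" | "javascript" | "html";

const uiDescription = {
  widget: "button",
  label: "Run Query",
  onClick: "executeQuery()",
};

function pickTarget(userAgent: string): Target {
  if (/Gecko\//.test(userAgent)) return "xul";        // Mozilla/Firefox-family clients
  if (/Mozilla|Opera|Safari/.test(userAgent)) return "javascript";
  return "html";                                       // lowest common denominator
}

function emit(target: Target): string {
  switch (target) {
    case "xul":
      return `<button label="${uiDescription.label}" oncommand="${uiDescription.onClick}"/>`;
    case "javascript":
      return `var b=document.createElement('button');b.textContent='${uiDescription.label}';` +
             `b.onclick=function(){${uiDescription.onClick}};document.body.appendChild(b);`;
    default:
      return `<input type="button" value="${uiDescription.label}" onclick="${uiDescription.onClick}"/>`;
  }
}

console.log(emit(pickTarget("Mozilla/5.0 (X11; Linux) Gecko/20100101 Firefox/115.0")));
```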




     

     

    ]]>
    By Harry Fueckshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/997Fri, 07 Jul 2006 12:29:38 GMT22006-07-07T08:29:38-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
Jon and I had a chat yesterday that is now available in Podcast form.

    "In my fourth Friday podcast we hear from Kingsley Idehen, CEO of OpenLink Software. I wrote about OpenLink's universal database and app server, Virtuoso, back in 2002 and 2003. Earlier this month Virtuoso became the first mature SQL/XML hybrid to make the transition to open source. The latest incarnation of the product also adds SPARQL (a semantic web query language) to its repertoire. ..."

    (Via Jon's Radio.)

I would like to make an important clarification re. the GData Protocol and what is popularly dubbed as "Adam Bosworth's fingerprints." I do not believe in a single solution (a simple one for the sake of simplicity) to a deceptively complex problem. Virtuoso supports Atom 1.0 (syndication only at the current time) and Atom 0.3 (syndication and publication, which have been in place for years).
    BTW - the GData Protocol and Atom 1.0 publishing support will be delivered in both the Open Source and Commercial Edition updates to Virtuoso next week (very little work due to what's already in place).

I make the clarification above to eliminate the possibility of assuming mutual exclusivity of my perspective/vision and Adam's (Jon also makes this important point when he speaks about our opinions being on either side of a spectrum/continuum). I simply want to broaden the scope of this discussion. I am a profound believer in the Semantic Web / Data Web vision, and I predict that we will be querying Google Base via SPARQL in the not too distant future (this doesn't mean that netizens will be forced to master SPARQL, absolutely not! But there will be conduit technologies that deal with this matter).

Side note: I actually last spoke with Adam at the NY Hilton in 2000 (the day I unveiled Virtuoso to the public for the first time, in person). We bumped into each other and I told him about Virtuoso (at the time the big emphasis was SQL to XML and the vocabulary we had chosen re. SQL extension...), and he told me about his departure from Microsoft and the commencement of his new venture (CrossGain, prior to his stint at BEA). What struck me even more was his interest in Linux and Open Source (bearing in mind this was about 3 or so weeks after he departed Microsoft).

    If you are encountering Virtuoso for the first time via this post or Jon's, please make time to read the product history article on the Virtuoso Wiki (which is one of many Virtuoso based applications that make up our soon to be released OpenLink DataSpace offering).

    That said, I better go listen to the podcast :-)

    ]]>
    My podcast conversation with Jon Udellhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/993Thu, 29 Jun 2006 14:14:44 GMT12006-06-29T10:14:44.000001-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
Structured data is boring and useless. This article provides insight into a serious point of confusion about what exactly is structured vs. unstructured data. Here is a key excerpt:
    "We all know that structured data is boring and useless; while unstructured data is sexy and chock full of value. Well, only up to a point, Lord Copper. Genuinely unstructured data can be a real nuisance - imagine extracting the return address from an unstructured letter, without letterhead and any of the formatting usually applied to letters. A letter may be thought of as unstructured data, but most business letters are, in fact, highly-structured." ....
Duncan Pauly, founder and chief technology officer of Coppereye, adds eloquent insight to the conversation:
    "The labels "structured data" and "unstructured data" are often used ambiguously by different interest groups; and often used lazily to cover multiple distinct aspects of the issue. In reality, there are at least three orthogonal aspects to structure:
      * The structure of the data itself.
      * The structure of the container that hosts the data.
      * The structure of the access method used to access the data.
    These three dimensions are largely independent and one does not need to imply another. For example, it is absolutely feasible and reasonable to store unstructured data in a structured database container and access it by unstructured search mechanisms."

Data understanding and appreciation is dwindling at a time when the reverse should be happening. We are supposed to be in the throes of the "Information Age", but for some reason this appears to have no correlation with data and "data access" in the minds of many -- as reflected in the broadly contradictory positions taken re. unstructured vs. structured data: structured is boring and useless while unstructured is useful and sexy.

The difference between "Structured Containers" and "Structured Data" is clearly misunderstood by most (an unfortunate fact).

For instance, all DBMS products are "Structured Containers" aligned to one or more data models (typically one). These products have been limited by proprietary data access APIs and underlying data-model specificity when used in the "Open-world" model that is at the core of the World Wide Web. This confusion also carries over to the misconception that Web 2.0 and the Semantic/Data Web are mutually exclusive.

But things are changing fast, and the concept of multi-model DBMS products is beginning to crystallize. On our part, we have finally released the long-promised "OpenLink Data Spaces" application layer that has been developed using our Virtuoso Universal Server. We have structured, unified storage containment exposed to the Data Web cloud via endpoints for querying or accessing data using a variety of mechanisms that include: GData, OpenSearch, SPARQL, XQuery/XPath, SQL, etc.
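As a rough illustration of what "one storage realm, many access mechanisms" looks like from the client side, here is a sketch that hits a hypothetical Data Space through two such endpoints (SPARQL and an OpenSearch-style keyword search); the base URL and paths are placeholders, not actual ODS endpoint addresses.

```typescript
// Sketch: the same Data Space queried through two different standard mechanisms.
// The base URL and endpoint paths below are hypothetical placeholders.
const base = "http://example-dataspace.example.com";

async function viaSparql(term: string): Promise<string> {
  const q = `SELECT ?s WHERE { ?s ?p ?o . FILTER regex(str(?o), "${term}", "i") } LIMIT 5`;
  const res = await fetch(`${base}/sparql?query=${encodeURIComponent(q)}`, {
    headers: { Accept: "application/sparql-results+json" },
  });
  return res.text();
}

async function viaOpenSearch(term: string): Promise<string> {
  // OpenSearch-style keyword query returning a feed of matching items.
  const res = await fetch(`${base}/search?q=${encodeURIComponent(term)}&format=atom`);
  return res.text();
}

Promise.all([viaSparql("virtuoso"), viaOpenSearch("virtuoso")])
  .then(([sparql, atom]) => console.log(sparql.length, atom.length))
  .catch(console.error);
```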

    To be continued....

    ]]>
    Structured Data vs. Unstructured Datahttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/991Tue, 27 Jun 2006 05:39:09 GMT12006-06-27T01:39:09-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
The return of WinFS back into SQL Server has re-ignited interest in the somewhat forgotten "DBMS Engine hosted Unified Storage System" vision. The WinFS project's struggles have more to do with the futility of "Windows Platform Monoculture" than the actual vision itself. In today's reality you simply cannot seek to deliver a "Unified Storage" solution that is inherently operating-system specific and, even worse, ignores existing complementary industry standards and the loosely coupled nature of the emerging Web Operating System.

    A quick FYI:
    Virtuoso has offered a DBMS hosted Filesystem via WebDAV for a number of years, but the implications of this functionality have remained unclear for just as long. Thus, we developed (a few years ago) and released (recently) an application layer above Virtuoso's WebDAV storage realm called: “The OpenLink Briefcase” (nee. oDrive). This application allows you to view items uploaded by content type and/or kind (People, Business Cards, Calendars, Business Reports, Office Documents, Photos, Blog Posts, Feed Channels/Subscriptions, Bookmarks etc..). it also includes automatic metadata extraction (where feasible) and indexing. Naturally, as an integral part of our “OpenLink Data Spaces” (ODS) product offering, it supports GData, URIQA, SPARQL (note: WebDAV metadata is sync'ed with Virtuoso's RDF Triplestore), SQL, and WebDAV itself.

    You can explore the power of this product via the following routes:

    1. Download the Virtuoso Open Source Edition and the ODS add-ons or
    2. Visit our live demo server (note: this is strictly a demo server with full functionality available) and simply register and then create a “Briefcase” application instance
    3. Digest this Briefcase Home Page Screenshot
    ]]>
    DBMS Hosted Filesystems & WinFShttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/990Tue, 27 Jun 2006 01:28:44 GMT12006-06-26T21:28:44-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
Last week I put out a series of screencast-style demos that sought to demonstrate the core elements of our soon-to-be-released JavaScript toolkit called OAT (OpenLink Ajax Toolkit) and its Ajax Database Connectivity layer.

    The screencasts covered the following functionality realms:

    1. SQL Query By Example (basic)
    2. SQL Query By Example (advanced - pivot table construction)
    3. Web Form Design (basic database driven map based mashup)
    4. Web Form Design (advanced database driven map based mashup)

To bring additional clarity to the screencast demos and OAT in general, I have saved a number of documents that are the by-products of activities in the screencasts:

1. Live XML Document produced using SQL Query By Example (basic) (you can drag and drop columns across the grid to reorder and sort the presentation)
2. Live XML Document produced using QBE and Pivot Functionality (you can drag and drop the aggregate columns and rows to create your own views, etc.)
3. Basic database-driven map-based mashup (works with Firefox, WebKit, Camino; click on pins to see national flag)
4. Advanced database-driven map-based mashup (works with Firefox, WebKit, Camino; records 36, 87, and 257 will unveil pivots via lookup pin)

    Notes:

    • “Advanced”, as used above, simply means that I am embedding images (employee photos and national flags) and a database driven pivot into the map pins that serve as details lookups in classic SQL master/details type scenarios.
    • The “Ajax Call In Progress..” dialog is there to show live interaction with a remote database (in this case Virtuoso but this could be any ODBC, JDBC, OLEDB, ADO.NET, or XMLA accessible data source)
• The data access magic source (if you want to call it that) is XMLA - a standard that has been in place for years but is completely misunderstood and, as a result, underutilized (a sketch of an XMLA request follows these notes)
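For the curious, here is a bare-bones sketch of the kind of XMLA interaction involved: a SOAP Execute request carrying a SQL statement, posted over HTTP. The endpoint URL and query are placeholders; the envelope follows the general shape of the XMLA Execute method rather than OAT's exact wire traffic.

```typescript
// Sketch: an XML for Analysis (XMLA) Execute call carrying a SQL statement.
// The endpoint URL, data source name, and query are placeholders.
const xmlaEndpoint = "http://example-server.example.com/xmla";

const soapBody = `<?xml version="1.0" encoding="utf-8"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <Execute xmlns="urn:schemas-microsoft-com:xml-analysis">
      <Command>
        <Statement>SELECT TOP 10 * FROM Demo.demo.Customers</Statement>
      </Command>
      <Properties>
        <PropertyList>
          <DataSourceInfo>Local Database</DataSourceInfo>
          <Format>Tabular</Format>
        </PropertyList>
      </Properties>
    </Execute>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>`;

async function execute(): Promise<void> {
  const res = await fetch(xmlaEndpoint, {
    method: "POST",
    headers: {
      "Content-Type": "text/xml",
      SOAPAction: "urn:schemas-microsoft-com:xml-analysis:Execute",
    },
    body: soapBody,
  });
  console.log(await res.text()); // an XML rowset the toolkit would bind to a grid or pivot
}

execute().catch(console.error);
```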

    You can see a full collection of saved documents at the following locations:

    ]]>
    Contd: Ajax Database Connectivity Demoshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/988Thu, 22 Jun 2006 12:56:58 GMT102006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Search Engine Challenges Posed by the Semantic Web: "

A pre-print from Tim Finin and Li Ding entitled Search Engines for Semantic Web Knowledge1 presents a thoughtful and experienced overview of the challenges posed to conventional search by semantic Web constructs. The authors base much of their observations on their experience with the Swoogle semantic Web search engine over the past two years. They also used Swoogle, whose index contains information on over 1.3M RDF documents, to generate statistics on the semantic Web's size and growth in the paper.

    Among other points, the authors note these key differences and challenges from conventional search engines:

• Harvesting — the need to discriminately discover semantic Web documents and to accurately index their semi-structured components
• Search — the need for search to cover a broader range than documents in a repository, going from the universal to the atomic granularity of a triple. Path tracing and provenance of the information may also be important
• Rank — results ranking needs to account for the contribution of the semi-structured data, and
• Archive — more versioning and tracking is needed since underlying ontologies will surely grow and evolve.

    The authors particularly note the challenge of indexing as repositories grow to actual Internet scales.

Though not noted, I would add to this list the challenge of user interfaces. Only a small percentage of users, for example, use Google's more complicated advanced search form. In its full-blown implementation, semantic Web search variations could make the advanced Google form look like child's play.



1Tim Finin and Li Ding, 'Search Engines for Semantic Web Knowledge,' a pre-print to be published in the Proceedings of XTech 2006: Building Web 2.0, May 16, 2006, 19 pp. A PDF of the paper is available for download.

    "

    (Via AI3 - Adaptive Information:::.)

    ]]>
    Search Engine Challenges Posed by the Semantic Webhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/978Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Very detailed and insightful peek into the state of affairs re. database engines (Open & Closed Source).

    I added the missing piece regarding the "Virtuoso Conductor" (the Web based Admin UI for Virtuoso) to the original post below. I also added a link to our live SPARQL Demo so that anyone interested can start playing around with SPARQL and SPARQL integrated into SQL right away.

    Another good thing about this post is the vast amount of valuable links that it contains. To really appreciate this point simply visit my Linkblog (excuse the current layout :-) - a Tab if you come in via the front door of this Data Space (what I used to call My Weblog Home Page).

    "Free" Databases: Express vs. Open-Source RDBMSs: "Open-source relational database management systems (RDBMSs) are gaining IT mindshare at a rapid pace. As an example, BusinessWeek's February 6, 2006 ' Taking On the Database Giants ' article asks 'Can open-source upstarts compete with Oracle, IBM, and Microsoft?' and then provides the answer: 'It's an uphill battle, but customers are starting to look at the alternatives.'

    There's no shortage of open-source alternatives to look at. The BusinessWeek article concentrates on MySQL, which BW says 'is trying to be the Ikea of the database world: cheap, needs some assembly, but has a sleek, modern design and does the job.' The article also discusses Postgre[SQL] and Ingres, as well as EnterpriseDB, an Oracle clone created from PostgreSQL code*. Sun includes PostgreSQL with Solaris 10 and, as of April 6, 2006, with Solaris Express.**

    *Frank Batten, Jr., the investor who originally funded Red Hat, invested a reported $16 million into Great Bridge with the hope of making a business out of providing paid support to PostgreSQL users. Great Bridge stayed in business only 18 months , having missed an opportunity to sell the business to Red Hat and finding that selling $50,000-per-year support packages for an open-source database wasn't easy. As Batten concluded, 'We could not get customers to pay us big dollars for support contracts.' Perhaps EnterpriseDB will be more successful with a choice of $5,000, $3,000, or $1,000 annual support subscriptions .

    **Interestingly, Oracle announced in November 2005 that Solaris 10 is 'its preferred development and deployment platform for most x64 architectures, including x64 (x86, 64-bit) AMD Opteron and Intel Xeon processor-based systems and Sun's UltraSPARC(R)-based systems.'

    There is a surfeit of reviews of current MySQL, PostgreSQL and—to a lesser extent—Ingres implementations. These three open-source RDBMSs come with their own or third-party management tools. These systems compete against free versions of commercial (proprietary) databases: SQL Server 2005 Express Edition (and its MSDE 2000 and 1.0 predecessors), Oracle Database 10g Express Edition, IBM DB2 Express-C, and Sybase ASE Express Edition for Linux where database size and processor count limitations aren't important. Click here for a summary of recent InfoWorld reviews of the full versions of these four databases plus MySQL, which should be valid for Express editions also. The FTPOnline Special Report article, 'Microsoft SQL Server Turns 17,' that contains the preceding table is here (requires registration.)

    SQL Server 2005 Express Edition SP-1 Advanced Features

    SQL Server 2005 Express Edition with Advanced Features enhances SQL Server 2005 Express Edition (SQL Express or SSX) dramatically, so it deserves special treatment here. SQL Express gains full text indexing and now supports SQL Server Reporting Services (SSRS) on the local SSX instance. The SP-1 with Advanced Features setup package, which Microsoft released on April 18, 2006, installs the release version of SQL Server Management Studio Express (SSMSE) and the full version of Business Intelligence Development Studio (BIDS) for designing and editing SSRS reports. My 'Install SP-1 for SQL Server 2005 and Express' article for FTPOnline's SQL Server Special Report provides detailed, illustrated installation instructions for and related information about the release version of SP-1. SP-1 makes SSX the most capable of all currently available Express editions of commercial RDBMSs for Windows.

    OpenLink Software's Virtuoso Open-Source Edition

OpenLink Software announced an open-source version of its Virtuoso Universal Server commercial DBMS on April 11, 2006. On the initial date of this post, May 2, 2006, Virtuoso Open-Source Edition (VOS) was virtually under the radar as an open-source product. According to this press release, the new edition includes:

    • SPARQL compliant RDF Triple Store
    • SQL-200n Object-Relational Database Engine (SQL, XML, and Free Text)
    • Integrated BPEL Server and Enterprise Service Bus
    • WebDAV and Native File Server
    • Web Application Server that supports PHP, Perl, Python, ASP.NET, JSP, etc.
    • Runtime Hosting for Microsoft .NET, Mono, and Java
VOS only lacks the virtual server and replication features that are offered by the commercial edition. VOS includes a Web-based administration tool called the "Virtuoso Conductor". According to Kingsley Idehen's Weblog, 'The Virtuoso build scripts have been successfully tested on Mac OS X (Universal Binary Target), Linux, FreeBSD, and Solaris (AIX, HP-UX, and True64 UNIX will follow soon). A Windows Visual Studio project file is also in the works (ETA some time this week).'

    InfoWorld's Jon Udell has tracked Virtuoso's progress since 2002, with an additional article in 2003 and a one-hour podcast with Kingsley Idehen on April 26, 2006. A major talking point for Virtuoso is its support for Atom 0.3 syndication and publication, Atom 1.0 syndication and (forthcoming) publication, and future support for Google's GData protocol, as mentioned in this Idehen post. Yahoo!'s Jeremy Zawodny points out that the 'fingerprints' of Adam Bosworth, Google's VP of Engineering and the primary force behind the development of Microsoft Access, 'are all over GData.' Click here to display a list of all OakLeaf posts that mention Adam Bosworth.

    One application for the GData protocol is querying and updating the Google Base database independently of the Google Web client, as mentioned by Jeremy: 'It's not about building an easier onramp to Google Base. ... Well, it is. But, again, that's the small stuff.' Click here for a list of posts about my experiences with Google Base. Watch for a future OakLeaf post on the subject as the GData APIs gain ground.

    Open-Source and Free Embedded Database Contenders

    Open-source and free embedded SQL databases are gaining importance as the number and types of mobile devices and OSs proliferate. Embedded databases usually consist of Java classes or Windows DLLs that are designed to minimize file size and memory consumption. Embedded databases avoid the installation hassles, heavy resource usage and maintenance cost associated with client/server RDBMSs that run as an operating system service.

    Andrew Hudson's December 2005 'Open Source databases rounded up and rodeoed' review for The Enquirer provides brief descriptions of one commercial and eight open source database purveyors/products: Sleepycat, MySQL, PostgreSQL, Ingres, InnoBase, Firebird, IBM Cloudscape (a.k.a, Derby), Genezzo, and Oracle. Oracle Sleepycat* isn't an SQL Database, Oracle InnoDB* is an OEM database engine that's used by MySQL, and Genezzo is a multi-user, multi-server distributed database engine written in Perl. These special-purpose databases are beyond the scope of this post.

    * Oracle purchased Sleepycat Software, Inc. in February 2006 and purchased Innobase OY in October 2005 . The press release states: 'Oracle intends to continue developing the InnoDB technology and expand our commitment to open source software.'

Derby is an open-source release by the Apache Software Foundation of the Cloudscape Java-based database that IBM acquired when it bought Informix in 2001. IBM offers a commercial release of Derby as IBM Cloudscape 10.1. Derby is a Java class library that has a relatively light footprint (2 MB), which makes it suitable for client/server synchronization with the IBM DB2 Everyplace Sync Server in mobile applications. The IBM DB2 Everyplace Express Edition isn't open source or free*, so it doesn't qualify for this post. The same is true for the corresponding Sybase SQL Anywhere components.**


    * IBM DB2 Everyplace Express Edition with synchronization costs $379 per server (up to two processors) and $79 per user. DB2 Everyplace Database Edition (without DB2 synchronization) is $49 per user. (Prices are based on those when IBM announced version 8 in November 2003.)

    ** Sybase's iAnywhere subsidiary calls SQL Anywhere 'the industry's leading mobile database.' A Sybase SQL Anywhere Personal DB seat license with synchronization to SQL Anywhere Server is $119; the cost without synchronization wasn't available from the Sybase Web site. Sybase SQL Anywhere and IBM DB2 Everyplace perform similar replication functions.

    Sun's Java DB, another commercial version of Derby, comes with the Solaris Enterprise Edition, which bundles Solaris 10, the Java Enterprise System, developer tools, desktop infrastructure and N1 management software. A recent Between the Lines blog entry by ZDNet's David Berlind waxes enthusiastic over the use of Java DB embedded in a browser to provide offline persistence. RedMonk analyst James Governor and eWeek's Lisa Vaas wrote about the use of Java DB as a local data store when Tim Bray announced Sun's Derby derivative and Francois Orsini demonstrated Java DB embedded in the Firefox browser at the ApacheCon 2005 conference.

    Firebird is derived from Borland's InterBase 6.0 code, the first commercial relational database management system (RDBMS) to be released as open source. Firebird has excellent support for SQL-92 and comes in three versions: Classic, SuperServer and Embedded for Windows, Linux, Solaris, HP-UX, FreeBSD and MacOS X. The embedded version has a 1.4-MB footprint. Release Candidate 1 for Firebird 2.0 became available on March 30, 2006 and is a major improvement over earlier versions. Borland continues to promote InterBase, now at version 7.5, as a small-footprint, embedded database with commercial Server and Client licenses.

SQLite is a featherweight C library for an embedded database that implements most SQL-92 entry- and transitional-level requirements (some through the JDBC driver) and supports transactions within a tiny 250-KB code footprint. Wrappers support a multitude of languages and operating systems, including Windows CE, SmartPhone, Windows Mobile, and Win32. SQLite's primary SQL-92 limitations are lack of nested transactions, inability to alter a table design once committed (other than with RENAME TABLE and ADD COLUMN operations), and foreign-key constraints. SQLite provides read-only views, triggers, and 256-bit encryption of database files. A downside is that the entire database file is locked while a transaction is in progress. SQLite uses file access permissions in lieu of GRANT and REVOKE commands. Using SQLite involves no license; its code is entirely in the public domain.

    The Mozilla Foundation's Unified Storage wiki says this about SQLite: 'SQLite will be the back end for the unified store [for Firefox]. Because it implements a SQL engine, we get querying 'for free', without having to invent our own query language or query execution system. Its code-size footprint is moderate (250k), but it will hopefully simplify much existing code so that the net code-size change should be smaller. It has exceptional performance, and supports concurrent access to the database. Finally, it is released into the public domain, meaning that we will have no licensing issues.'

    Vieka Technology, Inc.'s eSQL 2.11 is a port of SQLite to Windows Mobile (Pocket PC and Smartphone) and Win32, and includes development tools for Windows devices and PCs, as well as a .NET native data provider. A conventional ODBC driver also is available. eSQL for Windows (Win32) is free for personal and commercial use; eSQL for Windows Mobile requires a license for commercial (for-profit or business) use.

HSQLDB isn't on most reviewers' radar, which is surprising because it's the default database for OpenOffice.org (OOo) 2.0's Base suite member. HSQLDB 1.8.0.1 is an open-source (BSD license) Java embedded database engine based on Thomas Mueller's original Hypersonic SQL Project. Using OOo's Base feature requires installing the Java 2.0 Runtime Engine (which is not open-source) or the presence of an alternative open-source engine, such as Kaffe. My prior posts about OOo Base and HSQLDB are here, here and here.

    The HSQLDB 1.8.0 documentation on SourceForge states the following regarding SQL-92 and later conformance:

HSQLDB 1.8.0 supports the dialect of SQL defined by SQL standards 92, 99 and 2003. This means where a feature of the standard is supported, e.g. left outer join, the syntax is that specified by the standard text. Many features of SQL92 and 99 up to Advanced Level are supported, and there is support for most of SQL 2003 Foundation and several optional features of this standard. However, certain features of the Standards are not supported, so no claim is made for full support of any level of the standards.

Other less well-known embedded databases designed for or suited to mobile deployment are Mimer SQL Mobile and VistaDB 2.1. Neither product is open source; both require paid licensing. VistaDB requires a small up-front payment by developers but offers royalty-free distribution.

    Java DB, Firebird embedded, SQLite and eSQL 2.11 are contenders for lightweight PC and mobile device database projects that aren't Windows-only.

    SQL Server 2005 Everywhere

If you're a Windows developer, SQL Server Mobile is the logical embedded database choice for mobile applications for Pocket PCs and Smartphones. Microsoft's April 19, 2006 press release delivered the news that SQL Server 2005 Mobile Edition (SQL Mobile or SSM) would gain a big brother—SQL Server 2005 Everywhere Edition.

    Currently, the SSM client is licensed (at no charge) to run in production on devices with Windows CE 5.0, Windows Mobile 2003 for Pocket PC or Windows Mobile 5.0, or on PCs with Windows XP Tablet Edition only. SSM also is licensed for development purposes on PCs running Visual Studio 2005. Smart Device replication with SQL Server 2000 SP3 and later databases has been the most common application so far for SSM.

    By the end of 2006, Microsoft will license SSE for use on all PCs running any Win32 version or the preceding device OSs. A version of SQL Server Management Studio Express (SSMSE)—updated to support SSE—is expected to release by the end of the year. These features will qualify SSE as the universal embedded database for Windows client and smart-device applications.

    For more details on SSE, read John Galloway's April 11, 2006 blog post and my 'SQL Server 2005 Mobile Goes Everywhere' article for the FTPOnline Special Report on SQL Server."

    (Via OakLeaf Systems.)

    ]]>
    "Free" Databases: Express vs. Open-Source RDBMSshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/968Fri, 21 Jul 2006 11:21:57 GMT12006-07-21T07:21:57.000006-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    I am pleased to unveil (officially) the fact that Virtuoso is now available in Open Source form.

    What Is Virtuoso?

    A powerful next generation server product that implements otherwise distinct server functionality within a single server product. Think of Virtuoso as the server software analog of a dual core processor where each core represents a traditional server functionality realm.

    Where did it come from?

    The Virtuoso History page tells the whole story.

    What Functionality Does It Provide?

    The following:
      1. Object-Relational DBMS Engine (ORDBMS like PostgreSQL and DBMS engine like MySQL)
      2. XML Data Management (with support for XQuery, XPath, XSLT, and XML Schema)
      3. RDF Triple Store (or Database) that supports SPARQL (Query Language, Transport Protocol, and XML Results Serialization format)
      4. Service Oriented Architecture (it combines a BPEL Engine with an ESB)
      5. Web Application Server (supports HTTP/WebDAV)
      6. NNTP compliant Discussion Server
    And more. (see: Virtuoso Web Site)

    90% of the aforementioned functionality has been available in Virtuoso since 2000 with the RDF Triple Store being the only 2006 item.

    What Platforms are Supported

    The Virtuoso build scripts have been successfully tested on Mac OS X (Universal Binary Target), Linux, FreeBSD, and Solaris (AIX, HP-UX, and True64 UNIX will follow soon). A Windows Visual Studio project file is also in the works (ETA some time this week).

    Why Open Source?

Simple: there is no value in a product of this magnitude remaining the "best kept secret". That status works well for our competitors, but absolutely works against the legions of new-generation developers, systems integrators, and knowledge workers that need to be aware of what is actually achievable today with the right server architecture.

    What Open Source License is it under?

    GPL version 2.

    What's the business model?

    Dual licensing.

The Open Source version of Virtuoso includes all of the functionality listed above, while the Virtual Database (distributed heterogeneous join engine) and Replication Engine (across heterogeneous data sources) functionality will only be available in the commercial version.

    Where is the Project Hosted?

    On SourceForge.

    Is there a product Blog?

    Of course!

    Up until this point, the Virtuoso Product Blog has been a covert live demonstration of some aspects of Virtuoso (Content Management). My Personal Blog and the Virtuoso Product Blog are actual Virtuoso instances, and have been so since I started blogging in 2003.

    Is There a product Wiki?

    Sure! The Virtuoso Product Wiki is also an instance of Virtuoso demonstrating another aspect of the Content Management prowess of Virtuoso.

    What About Online Documentation?

    Yep! Virtuoso Online Documentation is hosted via yet another Virtuoso instance. This particular instance also attempts to demonstrate Free Text search combined with the ability to repurpose well formed content in a myriad of forms (Atom, RSS, RDF, OPML, and OCS).

    What about Tutorials and Demos?

The Virtuoso Online Tutorial Site has operated as a live demonstration and tutorial portal for a number of years. During the same timeframe (circa 2001) we also assembled a few screencast-style demos (their look and feel certainly shows their age; updates are in the works).

    BTW - We have also updated the Virtuoso FAQ and also released a number of missing Virtuoso White Papers (amongst many long overdue action items).

    ]]>
    Virtuoso is Officially Open Source!http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/951Fri, 21 Jul 2006 11:22:20 GMT12006-07-21T07:22:20.000001-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Swoogle knows how Semantic Web ontologies are used: "

The Dublin Core Metadata Initiative is updating the RDF expression of DC and might add range restrictions to some properties. Mikael Nilsson wondered if we would use the Swoogle Semantic Web search engine to see what types of values are being used with DC properties. This kind of query is just the ticket for Swoogle. Well, almost. The current web-based interface supports a limited number of query types. Many more can be asked if you use SQL directly to query Swoogle's underlying databases. We don't want to provide a direct SQL query service over the main Swoogle database because it's easy to ask a query that will take a looooooong time to answer and some could even crash the database server. We are planning to put up a second server with a copy of the database and give Swoogle Power Users (SPUs) access to it. We ran a simple SQL query to generate some initial data for Mikael showing all of the DC properties. For each one, we list all of the ranges that values were drawn from and the number of separate documents and triples for each combination. For example:

    Property     Range          Documents   Triples
    dc:creater   rdfs:Literal   32          648
    dc:creator   rdfs:Literal   234655      2477665
    dc:creator   wn:Person      2714        1138250
    dc:creator   cc:Agent       4090        6359
    dc:creator   foaf:Person    2281        5969
    dc:creator   foaf:Agent     1723        3234

Notice that the first property in this partial table is an obvious typo. You can see the complete table as a PDF file or as an Excel spreadsheet. [Tim Finin, UMBC ebiquity lab] "

    (Via Planet RDF.)

    ]]>
    Swoogle knows how Semantic Web ontologies are usedhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/947Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    The Future Of The Internet: "

While the framework of governance continues to evolve, there is a widespread belief that, along with the growth of the internet, more and more problems such as spam, viruses and 'denial of service' attacks that can cripple large websites will begin to be felt. It seems reasonable to assume that the number of devices on the network will continue to multiply in new and unforeseen ways. So researchers are starting from the assumption that communications chips and sensors will eventually be embedded in almost everything, from furniture to cereal boxes - 'hundreds of billions of such devices'. While today's internet traffic is generally initiated by humans - as they send e-mails, click on web links, or download music tracks - in future, the vast majority of traffic may be 'machine to machine' communications: things flirting with other things, all ready to be connected wirelessly, and moving around.

The Economist has a related article titled Reinventing the Internet, asking whether a 'clean slate' redesign of the internet can ever be implemented.
A few solutions float around:
- One is 'trust-modulated transparency'. The network's traffic-routing infrastructure would judge the trustworthiness of packets of data as they pass by and deliver only those deemed trustworthy; dubious packets might be shunted aside for screening. The whole system would be based on a 'web of trust', in which traffic flows freely between devices that trust each other, but is closely scrutinized between those that do not.
- Another idea is a new approach to addressing, called 'internet indirection infrastructure'. It would overlay an additional addressing system on top of the internet-protocol numbers now used to identify devices on the internet. This would make it easier to support mobile devices, and would also allow for 'multicasting' of data to many devices at once, enabling the efficient distribution of audio, video and software. With Activenets or metanets, devices at the edge of the network could then dynamically reprogram all the routers along the network path between them to use whatever new protocol they wanted.
While the research is still ongoing there are some hopes of making progress on the technical front - but it may well transpire that the greatest impediment to upgrading the internet will turn out to be political disagreements like this, this, over how it should work, rather than the technical difficulty of bringing it about.
The OECD hosted a workshop titled The Future of the Internet in Paris on 8 March 2006. Some of the presentations look good and a few of them make for compelling reading.



"

    (Via Sadagopan's weblog on Emerging Technologies,Thoughts, Ideas,Trends and Cyberworld.)

    ]]>
    The Future Of The Internethttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/943Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    ETech 2006 Trip Report: eBay Web Services: A Marketplace Platform for Fun and Profit: "

    These are my notes from the session eBay Web Services: A Marketplace Platform for Fun and Profit by Adam Trachtenberg.

This session was about the eBay developer program. The talk started by going over the business models for 'Web 2.0' startups. Adam Trachtenberg surmised that so far only two viable models have shown up: (i) get bought by Yahoo! and (ii) put a lot of Google AdSense ads on your site. The purpose of the talk was to introduce a third option: making money by integrating with eBay's APIs.

    Adam Trachtenberg went on to talk about the differences between providing information and providing services. Information is read-only while services are read/write. Services have value because they encourage an 'architecture of participation'.

    eBay is a global, online marketplace that facilitates the exchange of goods. The site started off as being a place to purchase used collectibles but now has grown to encompass old and new items, auctions and fixed price sales (fixed price sales are now a third of their sales) and even sales of used cars. There are currently 78 million items being listed at any given time on eBay.

As eBay has grown more popular they have come to realize that one size doesn't fit all when it comes to the website. It has to be customized to support different languages and markets as well as running on devices other than the PC. Additionally, they discovered that some companies had started screen scraping their site to give an optimized user experience for some power users. Given how fragile screen scraping is, the eBay team decided to provide a SOAP API that would be more stable and performant for them than having people screen scrape the website.

    The API has grown to over 100 methods and about 43% of the items on the website are added via the SOAP API. The API enables one to build user experiences for eBay outside the web browser such as integration with cell phones, Microsoft Office, gadgets & widgets, etc. The API has an affiliate program so developers can make money for purchases that happen through the API. An example of the kind of mashup one can build to make money from the eBay API is https://www.dudewheresmyusedcar.com. Another example of a mashup that can be used to make money using the eBay API is http://www.ctxbay.com which provides contextual eBay ads for web publishers.

    The aforementioned sites are just a few examples of the kinds of mashups that can be built with the eBay API. Since the API enables buying and listing of items for sale as well as obtaining inventory data from the service, one can build a very diverse set of applications.

    "

    (Via Dare Obasanjo aka Carnage4Life.)

    ]]>
    ETech 2006 Trip Report: eBay Web Services: A Marketplace Platform for Fun and Profithttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/938Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    what is web 2.0?: "

There has been a lot of discussion about what Web 2.0 really is, so we thought we'd use the power of Web 2.0 itself to come up with the answer, and here it is: 42. Just kidding. What we actually did was take a look at all the tag data going back to February 2004 (the month of the first use of Web 2.0 as a tag on del.icio.us), and analyzed all the bookmarks and tags related to the term. We can report that as of October 31, 2005 there have been over 230,000 separate bookmarks and over 7,000 unique tags associated with the term 'Web 2.0' by del.icio.us users. So for this exercise, we lopped off the really long tail and normalized some similar terms (e.g. combining blog, blogs, and blogging), and came up with this snapshot of what Web 2.0 REALLY is - at least according to del.icio.us users' most popular tags through the end of October 2005: ajax 9.9%, blog 6.1%, social 4.2%, tools 4.1%, software 3.3%, tagging 3.3%, javascript 2.8%, internet 2.6%, programming 2.5%, rss 2.5%. Other notable tags included rubyonrails (1.8%), del.icio.us (1.6%), folksonomy (1.4%), community (1.1%), wiki (.9%), flickr (.8%), free (.7%), trends (.6%), flock (.4%) and googlemaps (.3%). So there you have it - interesting, but it still seems to fall short of a definitive answer. Maybe the blinding flash of the obvious is that Web 2.0 is best defined as arguing about what Web 2.0 is really about. "

    (Via del.icio.us.)

    ]]>
    what is web 2.0?http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/909Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    Ajax-S: Ajaxian slideshow software: "The idea came to me because I wanted a lightweight slideshow based on HTML, CSS and JavaScript, but I also wanted to separate the data of each page from the actual code that presents it. Therefore, I decided to move the data into an XML file and then use AJAX to retrieve it. The name AJAX-S is short for AJAX-Slides (or Asynchronous JavaScript and XML Slides, if you want to)."

    (Via Ajaxian Blog.)

    AJAX is clearly illuminating one of my pet issues: Separation of Application/Service Logic and Data. Even better, the concept of XML instance data is gradually getting much clearer. AJAX has created context for validating the concept of browser hosted Rich Internet Applications (RIA).

    AJAX has become a widely accepted framework for the InternetOS that facilitates Rich Internet Application development using Web 2.0 (and beyond) APIs.
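    As a rough TypeScript sketch of the pattern described above (the file name and element names are my own placeholders, not the actual AJAX-S format), fetching slide data from a separate XML file via XMLHttpRequest looks something like this:

    // Minimal sketch: slide content lives in a separate XML file and is
    // fetched asynchronously, keeping the data apart from the presentation
    // logic. File name and element names are hypothetical.
    interface Slide {
      title: string;
      body: string;
    }

    function loadSlides(url: string, onReady: (slides: Slide[]) => void): void {
      const xhr = new XMLHttpRequest();
      xhr.open("GET", url, true); // asynchronous request: the "A" in AJAX
      xhr.onreadystatechange = () => {
        if (xhr.readyState === 4 && xhr.status === 200 && xhr.responseXML) {
          const nodes = xhr.responseXML.getElementsByTagName("slide");
          const slides = Array.from(nodes).map((node) => ({
            title: node.getElementsByTagName("title")[0]?.textContent ?? "",
            body: node.getElementsByTagName("body")[0]?.textContent ?? "",
          }));
          onReady(slides);
        }
      };
      xhr.send();
    }

    // Usage: render the first slide once the XML data arrives.
    loadSlides("slides.xml", (slides) => {
      const target = document.getElementById("slide-area");
      if (target && slides.length > 0) {
        target.innerHTML = `<h1>${slides[0].title}</h1><p>${slides[0].body}</p>`;
      }
    });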

    ]]>
    Ajax-S: Ajaxian slideshow softwarehttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/905Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Semantic Web Challenge Winners: "

    Hot from the Galway sportsdesk:

    1. Prize: CONFOTO, appmosphere web applications, Germany
    2. Prize: FungalWeb, Concordia University, Canada
    3. Prize: Personal Publication Reader, Universität Hannover, Germany

    challenge.semanticweb.org

    CONFOTO is a browsing and annotation service for conference photos. It combines recent Web trends (tag-based categorization, interactive user interfaces, syndication) with the advantages of Semantic Web platforms (machine-understandable information, an extensible data model, the possibility to mix arbitrary RDF vocabularies).

    Congrats bengee!!

    (Benjamin had a string of bad luck just prior to the conference, so there may still be glitches in the app - ‘my sparql store exploded last week’)

    "

    (Via Raw.)

    ]]>
    Semantic Web Challenge Winnershttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/898Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    I am kinda scratching my head a little re. the "Clone Google APIs" call; especially as Amazon's A9 already provides infrastructure for generic search. A9 is open at both ends; you can consume search services via a RESTian API or plug your search engine into A9 (playing the role of A9 search service provider).

    Quick Example using my blog:

      3. Hactivism" regarding this matter. Certainly worth a full-post-scrape for my ongoing content annotation efforts (see Linkblog and BlogSummary).

      Digest the rest of Dare's post:

      Clone the Google APIs: Kill That Noise: "

      Yesterday, in a post about cloning the Google API, Dave Winer wrote

      Let's make the Google API an open standard. Back in 2002, Google took a bold first step to enable open architecture search engines, by creating an API that allowed developers to build applications on top of their search engine. However, there were severe limits on the capacity of these applications. So we got a good demo of what might be, now three years later, it's time for the real thing.

      and earlier that
      If you didn't get a chance to hear yesterday's podcast, it recommends that Microsoft clone the Google API for search, without the keys, and without the limits. When a developer's application generates a lot of traffic, buy him a plane ticket and dinner, and ask how you both can make some money off their excellent booming application of search. This is something Google can't do, because search is their cash cow. That's why Microsoft should do it. And so should Yahoo. Also, there's no doubt Google will be competing with Apple soon, so they should be also thinking about ways to devalue Google's advantage.

      This doesn't seem like a great idea to me for a wide variety of reasons, but first, let's start with a history lesson before I tackle this specific issue.

      A Trip Down Memory Lane
      This history lesson used to be in a post entitled The Tragedy of the API by Evan Williams, but that post seems to be gone now. Anyway, back in the early days of blogging the folks at Pyra [which eventually got bought by Google] created the Blogger API for their service. Since Blogspot/Blogger was a popular service, the number of applications that used the API quickly grew. At this point Dave Winer decided that since the Blogger API was so popular he should implement it in his weblogging tools, but then he decided that he didn't like some aspects of it, such as application keys (sound familiar?), and did without them in his version of the API. Dave Winer's version of the Blogger API became the MetaWeblog API. These APIs became de facto standards and a number of other weblogging applications implemented them.

      After a while, the folks at Pyra decided that their API needed to evolve due to various flaws in its design. As Diego Doval put it in his post a review of blogging APIs, The Blogger API is a joke, and a bad one at that. This led to the creation of the Blogger API 2.0. At this point a heated debate erupted online where Dave Winer berated the Blogger folks for deviating from an industry standard. The irony of flaming a company for coming up with a v2 of their own API seemed to be lost on many of the people who participated in the debate. Eventually the Blogger API 2.0 went nowhere.

      Today the blogging API world consists of a few de facto standards based on a hacky API created by a startup a few years ago, a number of site-specific APIs (LiveJournal API, MovableType API, etc.), and a number of inconsistently implemented versions of the Atom API.

      On Cloning the Google Search API
      To me the most salient point in the hijacking of the Blogger API from Pyra is that it didn't change the popularity of their service or even make Radio Userland (Dave Winer's product) catch up to them in popularity. This is important to note since this is Dave Winer's key argument for Microsoft cloning the Google API.

      Off the top of my head, here are my top technical reasons for Microsoft to ignore the calls to clone the Google Search APIs:

      1. Difference in Feature Set: The features exposed by the API do not run the entire gamut of features that other search engines may want to expose. Thus even if you implement something that looks a lot like the Google API, you'd have to extend it to add the functionality that it doesn't provide. For example, compare the features provided by the Google API to the features provided by the Yahoo! search API. I can count about half a dozen features in the Yahoo! API that aren't in the Google API.

      2. Difference in Technology Choice: The Google API uses SOAP. This to me is a phenomenally bad technical decision because it raises the bar to performing a basic operation (data retrieval) by using a complex technology. I much prefer Yahoo!'s approach of providing a RESTful API and MSN Windows Live Search's approach of providing RSS search feeds and a SOAP API for the folks who need such overkill.

      3. Unreasonable Demands: A number of Dave Winer's demands seem contradictory. He asks companies to not require application keys but then advises them to contact application developers who've built high traffic applications about revenue sharing. Exactly how are these applications to be identified without some sort of application ID? As for removing the limits on the services? I guess Dave is ignoring the fact that providing services costs money, which I seem to remember is why he sold weblogs.com to Verisign for a few million dollars. I do agree that some of the limits on existing search APIs aren't terribly useful. The Google API limit of 1000 queries a day seems to guarantee that you won't be able to power a popular application with the service.
      4. Lack of Innovation: Copying Google sucks.

      (Via Dare Obasanjo aka Carnage4Life.)

    ]]>
    Clone the Google APIs: Kill That Noisehttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/892Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Marc Canter's Breaking the Web Wide Open! article is something I found pretty late (by my normal discovery standards). This was partly due to the pre- and post-Web 2.0 event noise levels that have dumped the description of an important industry inflection into the "Bozo Bin" of many. Personally, I think we shouldn't confuse the Web 2.0 traditional-pitch-fest conference with an attempt to identify an important industry inflection.

    Anyway, Marc's article is a very refreshing read because it provides a really good insight into the general landscape of a rapidly evolving Web alongside genuine appreciation of our broader timeless pursuit of "Openness".

    To really help this document provide additional value, I have scraped the content of the original post and dumped it below so that we can appreciate the value of the links embedded within the article (note: thanks to Virtuoso, I only had to paste the content into my blog; the extraction to my Linkblog and Blog Summary Pages are simply features of my Virtuoso-based Blog Engine):

    Breaking the Web Wide Open! (complete story)

    Even the web giants like AOL, Google, MSN, and Yahoo need to observe these open standards, or they'll risk becoming the "walled gardens" of the new web and be coolio no more.

    Editorial Note: Several months ago, AlwaysOn got a personal invitation from Yahoo founder Jerry Yang "to see and give us feedback on our new social media product, y!360." We were happy to oblige and dutifully showed up, joining a conference room full of hard-core bloggers and new, new media types. The geeks gave Yahoo 360 an overwhelming thumbs down, with comments like, "So the only services I can use within this new network are Yahoo services? What if I don't use Yahoo IM?" In essence, the Yahoo team was booed for being "closed web," and we heartily agreed. With Yahoo 360, Yahoo continues building its own "walled garden" to control its 135 million customers—an accusation also hurled at AOL in the early 1990s, before AOL migrated its private network service onto the web. As the  Economist recently noted, "Yahoo, in short, has old media plans for the new-media era."

    The irony to our view here is, of course, that today's AO Network is also a "closed web." In the end, Mr. Yang's thoughtful invitation and our ensuing disappointment in his new service led to the assignment of this article. It also confirmed our existing plan to completely revamp the AO Network around open standards. To tie it all together, we recruited the chief architect of our new site, the notorious Marc Canter, to pen this piece. We look forward to our reader feedback.

    Breaking the Web Wide Open!
    By Marc Canter

    For decades, "walled gardens" of proprietary standards and content have been the strategy of dominant players in mainframe computer software, wireless telecommunications services, and the World Wide Web—it was their successful lock-in strategy of keeping their customers theirs. But like it or not, those walls are tumbling down. Open web standards are being adopted so widely, with such value and impact, that the web giants—Amazon, AOL, eBay, Google, Microsoft, and Yahoo—are facing the difficult decision of opening up to what they don't control.

    The online world is evolving into a new open web (sometimes called the Web 2.0), which is all about being personalized and customized for each user. Not only open source software, but open standards are becoming an essential component.

    Many of the web giants have been using open source software for years. Most of them use at least parts of the LAMP (Linux, Apache, MySQL, Perl/Python/PHP) stack, even if they aren't well-known for giving back to the open source community. For these incumbents that grew big on proprietary web services, the methods, practices, and applications of open source software development are difficult to fully adopt. And the next open source movements—which will be as much about open standards as about code—will be a lot harder for the incumbents to exploit.

    While the incumbents use cheap open source software to run their back-end systems, their business models largely depend on proprietary software and algorithms. But in our view, a new slew of open software, open protocols, and open standards will confront the incumbents with the classic Innovator's Dilemma. Should they adopt these tools and standards, painfully cannibalizing their existing revenue for a new unproven concept, or should they stick with their currently lucrative model with the risk that eventually a bunch of upstarts eats their lunch?

    Credit should go to several of the web giants who have been making efforts to "open up." Google, Yahoo, eBay, and Amazon all have Open APIs (Application Programming Interfaces) built into their data and systems. Any software developer can access and use them for whatever creative purposes they wish. This means that the API provider becomes an open platform for everyone to use and build on top of. This notion has expanded like wildfire throughout the blogosphere, so nowadays, Open APIs are pretty much required.

    Other incumbents also have open strategies. AOL has got the RSS religion, providing a feedreader and RSS search in order to escape the "walled garden of content" stigma. Apple now incorporates podcasts, the "personal radio shows" that are the latest rage in audio narrowcasting, into iTunes. Even Microsoft is supporting open standards, for example by endorsing SIP (Session Initiation Protocol) for internet telephony and conferencing over Skype's proprietary format or one of its own devising.

    But new open standards and protocols are in use, under construction, or being proposed every day, pushing the envelope of where we are right now. Many of these standards are coming from startup companies and small groups of developers, not from the giants. Together with the Open APIs, those new standards will contribute to a new, open infrastructure. Tens of thousands of developers will use and improve this open infrastructure to create new kinds of web-based applications and services, to offer web users a highly personalized online experience.

    A Brief History of Openness

    At this point, I have to admit that I am not just a passive observer, full-time journalist or "just some blogger"—but an active evangelist and developer of these standards. It's the vision of "open infrastructure" that's driving my company and the reason why I'm writing this article. This article will give you some of the background on these standards, and what the evolution of the next generation of open standards will look like.

    Starting back in the 1980s, establishing a software standard was a key strategy for any software company. My former company, MacroMind (which became Macromedia), achieved this goal early on with Director. As Director evolved into Flash, the world saw that other companies besides Microsoft, Adobe, and Apple could establish true cross-platform, independent media standards.

    Then Tim Berners-Lee and Marc Andreessen came along, and changed the rules of the software business and of entrepreneurialism. No matter how entrenched and "standardized" software was, the rug could still get pulled out from under it. Netscape did it to Microsoft, and then Microsoft did it back  to Netscape. The web evolved, and lots of standards evolved with it. The leading open source standards (such as the LAMP stack) became widely used alternatives to proprietary closed-source offerings.

    Open standards are more than just technology. Open standards mean sharing, empowering, and community support. Someone floats a new idea (or meme) and the community runs with it – with each person making their own contributions to the standard – evolving it without a moment's hesitation about "giving away their intellectual property."

    One good example of this was Dave Sifry, who built the Technorati blog-tracking technology inspired by the Blogging Ecosystem, a weekend project by young hacker Phil Pearson. Dave liked what he saw and he ran with it—turning Technorati into what it is today.

    Dave Winer has contributed enormously to this area of open standards. He defined and personally created several open standards and protocols—such as RSS, OPML, and XML-RPC. Dave has also helped build the blogosphere through his enthusiasm and passion.

    By 2003, hundreds of programmers were working on creating and establishing new standards for almost everything. The best of these new standards have evolved into compelling web services platforms – such as del.icio.us, Webjay, or Flickr. Some have even spun off formal standards – like XSPF (a standard for playlists) or instant messaging standard XMPP (also known as Jabber).

    Today's Open APIs are complemented by standardized Schemas—the structure of the data itself and its associated meta-data. Take for example a podcasting feed. It consists of: a) the radio show itself, b) information on who is on the show, what the show is about and how long the show is (the meta-data) and also c) API calls to retrieve a show (a single feed item) and play it from a specified server.
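    As a rough TypeScript sketch (illustrative field names only, not a formal schema), that three-part structure might be modelled like this:

    // Sketch of one podcast feed item: (a) the show itself (the enclosure
    // URL), (b) its metadata, and (c) a call to retrieve it from a server.
    interface PodcastItem {
      title: string;            // what the show is about
      host: string;             // who is on the show
      durationSeconds: number;  // how long the show is
      enclosureUrl: string;     // where the audio file itself lives
    }

    // (c) Retrieve a single show from the specified server as raw bytes.
    async function fetchShow(item: PodcastItem): Promise<ArrayBuffer> {
      const response = await fetch(item.enclosureUrl);
      if (!response.ok) {
        throw new Error(`Could not retrieve show: HTTP ${response.status}`);
      }
      return response.arrayBuffer();
    }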

    The combination of Open APIs, standardized schemas for handling meta-data, and an industry which agrees on these standards are breaking the web wide open right now. So what new open standards should the web incumbents—and you—be watching? Keep an eye on the following developments:

    Identity
    Attention
    Open Media
    Microcontent Publishing
    Open Social Networks
    Tags
    Pinging
    Routing
    Open Communications
    Device Management and Control



    1. Identity

    Right now, you don't really control your own online identity. At the core of just about every online piece of software is a membership system. Some systems allow you to browse a site anonymously—but unless you register with the site you can't do things like search for an article, post a comment, buy something, or review it. The problem is that each and every site has its own membership system. So you constantly have to register with new systems, which cannot share data—even if you'd want them to. By establishing a "single sign-on" standard, disparate sites can allow users to freely move from site to site, and let them control the movement of their personal profile data, as well as any other data they've created.

    With Passport, Microsoft unsuccessfully attempted to force its proprietary standard on the industry. Instead, a world is evolving where most people assume that users want to control their own data, whether that data is their profile, their blog posts and photos, or some collection of their past interactions, purchases, and recommendations. As long as users can control their digital identity, any kind of service or interaction can be layered on top of it.

    Identity 2.0 is all about users controlling their own profile data and becoming their own agents. This way the users themselves, rather than other intermediaries, will profit from their ID info. Once developers start offering single sign-on to their users, and users have trusted places to store their data (places which respect the limits and provide access controls over that data), users will be able to access personalized services which will understand and use their personal data.

    Identity 2.0 may seem like some geeky, visionary future standard that isn't defined yet, but by putting each user's digital identity at the core of all their online experiences, Identity 2.0 is becoming the cornerstone of the new open web.

    The Initiatives:
    Right now, Identity 2.0 is under construction through various efforts from Microsoft (the "InfoCard" component built into the Vista operating system and its "Identity Metasystem"), Sxip Identity, Identity Commons, Liberty Alliance, LID (NetMesh's Lightweight ID), and SixApart's OpenID.

    More Movers and Shakers:
    Identity Commons and Kaliya Hamlin, Sxip Identity and Dick Hardt, the Identity Gang and Doc Searls, Microsoft's Kim Cameron, Craig Burton, Phil Windley, and Brad Fitzpatrick, to name a few.


    2. Attention

    How many readers know what their online attention is worth? If you don't, Google and Yahoo do—they make their living off our attention. They know what we're searching for, happily turn it into a keyword, and sell that keyword to advertisers. They make money off our attention. We don't.

    Technorati and friends proposed an attention standard, Attention.xml, designed to "help you keep track of what you've read, what you're spending time on, and what you should be paying attention to." AttentionTrust is an effort by Steve Gillmor and Seth Goldstein to standardize on how captured end-user performance, browsing, and interest data are used.

    Blogger Peter Caputa gives a good summary of AttentionTrust:
    "As we use the web, we reveal lots of information about ourselves by what we pay attention to. Imagine if all of that information could be stored in a nice neat little xml file. And when we travel around the web, we can optionally share it with websites or other people. We can make them pay for it, lease it ... we get to decide who has access to it, how long they have access to it, and what we want in return. And they have to tell us what they are going to do with our Attention data."

    So when you give your attention to sites that adhere to the AttentionTrust, your attention rights (you own your attention, you can move your attention, you can pay attention and be paid for it,  and you can see how your attention is used) are guaranteed. Attention data is crucial to the future of the open web, and Steve and Seth are making sure that no one entity or oligopoly controls it.

    Movers and Shakers:
    Steve Gillmor, Seth Goldstein, Dave Sifry and the other Attention.xml folks.


    3. Open Media

    Proprietary media standards—Flash, Windows Media, and QuickTime, to name a few—helped liven up the web. But they are proprietary standards that try to keep us locked in, and they weren't created from scratch to handle today's online content. That's why, for many of us, an Open Media standard has been a holy grail. Yahoo's new Media RSS standard brings us one step closer to achieving open media, as do Ogg Vorbis audio codecs, XSPF playlists, or MusicBrainz. And several sites offer digital creators not only a place to store their content, but also to sell it.

    Media RSS (being developed by Yahoo with help from the community) extends RSS by combining it with "RSS enclosures"—which add metadata to any media item—to create a comprehensive solution for media "narrowcasters." To gain acceptance for Media RSS, Yahoo knows it has to work with the community. As an active member of this community, I can tell you that we'll create Media RSS equivalents for RDF (an alternative subscription format) and Atom (yet another subscription format), so no one will be able to complain that Yahoo is picking sides in format wars.

    When Yahoo announced the purchase of Flickr, Yahoo founder Jerry Yang insinuated that Yahoo is acquiring "open DNA" to turn Yahoo into an open standards player. Yahoo is showing what happens when you take a multi-billion dollar company and make openness one of its core values—so Google, beware, even if Google does have more research fellows and Ph.D.s.

    The open media landscape is far and wide, reaching from game machine hacks and mobile phone downloads to PC-driven bookmarklets, players, and editors, and it includes many other standardization efforts. XSPF is an open standard for playlists, and MusicBrainz is an alternative to the proprietary (and originally effectively stolen) database that Gracenote licenses.

    Ourmedia.org is a community front-end to Brewster Kahle's Internet Archive. Brewster has promised free bandwidth and free storage forever to any content creators who choose to share their content via the Internet Archive. Ourmedia.org is providing an easy-to-use interface and community to get content in and out of the Internet Archive, giving ourmedia.org users the ability to share their media anywhere they wish, without being locked into a particular service or tool. Ourmedia plans to offer open APIs and an open media registry that interconnects other open media repositories into a DNS-like registry (just like the www domain system), so folks can browse and discover open content across many open media services. Systems like Brightcove and Odeo support the concept of an open registry, and hope to work with digital creators to sell their work to fulfill the financial aspect of the "Long Tail."

    More Movers and Shakers:
    Creative Commons, the Open Media Network, Jay Dedman, Ryanne Hodson, Michael Verdi, Eli Chapman, Kenyatta Cheese, Doug Kaye, Brad Horowitz, Lucas Gonze, Robert Kaye, Christopher Allen, Brewster Kahle, JD Lasica, and indeed, Marc Canter, among others.


    4. Microcontent Publishing

    Unstructured content is cheap to create, but hard to search through. Structured content is expensive to create, but easy to search. Microformats resolve the dilemma with simple structures that are cheap to use and easy to search.

    The first kind of widely adopted microcontent is blogging. Every post is an encapsulated idea, addressable via a URL called a permalink. You can syndicate or subscribe to this microcontent using RSS or an RSS equivalent, and news or blog aggregators can then display these feeds in a convenient readable fashion. But a blog post is just a block of unstructured text—not a bad thing, but just a first step for microcontent. When it comes to structured data, such as personal identity profiles, product reviews, or calendar-type event data, RSS was not designed to maintain the integrity of the structures.

    Right now, blogging doesn't have the underlying structure necessary for full-fledged microcontent publishing. But that will change. Think of local information services (such as movie listings, event guides, or restaurant reviews) that any college kid can access and use in her weekend programming project to create new services and tools.

    Today's blogging tools will evolve into microcontent publishing systems, and will help spread the notion of structured data across the blogosphere. New ways to store, represent and produce microcontent will create new standards, such as Structured Blogging and Microformats. Microformats differ from RSS feeds in that you can't subscribe to them. Instead, Microformats are embedded into webpages and discovered by search engines like Google or Technorati. Microformats are creating common definitions for "What is a review or event? What are the specific fields in the data structure?" They can also specify what we can do with all this information. OPML (Outline Processor Markup Language) is a hierarchical file format for storing microcontent and structured data. It was developed by Dave Winer of RSS and podcast fame.
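    As a rough TypeScript sketch (the class names loosely follow the hReview draft and are illustrative only), discovering embedded review microcontent in a page comes down to scanning for microformat class names, which is essentially what search engines like Technorati do:

    // Sketch: discover embedded review microcontent in a page by scanning
    // for microformat-style class names ("hreview", "summary", "rating"
    // loosely follow the hReview draft; treat them as illustrative).
    interface ReviewMicrocontent {
      summary: string;
      rating: string;
    }

    function extractReviews(root: Document): ReviewMicrocontent[] {
      return Array.from(root.querySelectorAll(".hreview")).map((el) => ({
        summary: el.querySelector(".summary")?.textContent?.trim() ?? "",
        rating: el.querySelector(".rating")?.textContent?.trim() ?? "",
      }));
    }

    // Usage in a browser context: list every review found in the current page.
    console.log(extractReviews(document));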

    Events are one popular type of microcontent. OpenEvents is already working to create shared databases of standardized events, which would get used by a new generation of event portals—such as Eventful/EVDB, Upcoming.org, and WhizSpark. The idea of OpenEvents is that event-oriented systems and services can work together to establish shared events databases (and associated APIs) that any developer could then use to create and offer their own new service or application. OpenReviews is still in the conceptual stage, but it would make it possible to provide open alternatives to closed systems like Epinions, and establish a shared database of local and global reviews. Its shared open servers would be filled with all sorts of reviews for anyone to access.

    Why is this important? Because I predict that in the future, 10 times more people will be writing reviews than maintaining their own blog. The list of possible microcontent standards goes on: OpenJobpostings, OpenRecipes, and even OpenLists. Microsoft recently revealed that it has been working on an important new kind of microcontent: Lists—so OpenLists will attempt to establish standards for the kind of lists we all use, such as lists of Links, lists of To Do Items, lists of People, Wish Lists, etc.

    Movers and Shakers:
    Tantek Çelik and Kevin Marks of Technorati, Danny Ayers, Eric Meyer, Matt Mullenweg, Rohit Khare, Adam Rifkin, Arnaud Leene, Seb Paquet, Alf Eaton, Phil Pearson, Joe Reger, Bob Wyman among others.


    5. Open Social Networks

    I'll never forget the first time I met Jonathan Abrams, the founder of Friendster. He was arrogant and brash and he claimed he "owned"  all his users, and that he was going to monetize them and make a fortune off them. This attitude robbed Friendster of its momentum, letting MySpace, Facebook, and other social networks take Friendster's place.

    Jonathan's notion of social networks as a way to control users is typical of the Web 1.0 business model and its attitude towards users in general. Social networks have become one of the battlegrounds between old and new ways of thinking. Open standards for Social Networking will define those sides very clearly. Since meeting Jonathan, I have been working towards finding and establishing open standards for social networks. Instead of closed, centralized social networks with 10 million people in them, the goal is making it possible to have 10 million social networks that each have 10 people in them.

    FOAF (which stands for Friend Of A Friend, and describes people and relationships in a way that computers can parse) is a schema to represent not only your personal profile's meta-data, but your social network as well. Thousands of researchers use the FOAF schema in their "Semantic Web" projects to connect people in all sorts of new ways. XFN is a microformat standard for representing your social network, while vCard (long familiar to users of contact manager programs like Outlook) is a microformat that contains your profile information. Microformats are baked into any xHTML webpage, which means that any blog, social network page, or any webpage in general can "contain" your social network in it—and be used by any compatible tool, service or application.

    PeopleAggregator is an earlier project now being integrated into open content management framework Drupal. The PeopleAggregator APIs will make it possible to establish relationships, send messages, create or join groups, and post between different social networks. (Sneak preview: this technology will be available in the upcoming GoingOn Network.)

    All of these open social networking standards mean that inter-connected social networks will form a mesh that will parallel the blogosphere. This vibrant, distributed, decentralized world will be driven by open standards: personalized online experiences are what the new open web will be all about—and what could be more personalized than people's networks?

    Movers and Shakers:
    Eric Sigler, Joel De Gan, Chris Schmidt, Julian Bond, Paul Martino, Mary Hodder, Drummond Reed, Dan Brickley, Randy Farmer, and Kaliya Hamlin, to name a few.


    6. Tags

    Nowadays, no self-respecting tool or service can ship without tags. Tags are keywords or phrases attached to photos, blog posts, URLs, or even video clips. These user- and creator-generated tags are an open alternative to what used to be the domain of librarians and information scientists: categorizing information and content using taxonomies. Tags are instead creating "folksonomies."

    The recently proposed OpenTags concept would be an open, community-owned version of the popular Technorati Tags service. It would aggregate the usage of tags across a wide range of services, sites, and content tools. In addition to Technorati's current tag features, OpenTags would let groups of people share their tags in "TagClouds." Open tagging is likely to include some of the open identity features discussed above, to create a tag system that is resilient to spam, and yet trustable across sites all over the web.

    OpenTags owes a debt to earlier versions of shared tagging systems, which include Topic Exchange and something called the k-collector—a knowledge management tag aggregator—from Italian company eVectors.

    Movers & Shakers:
    Phil Pearson, Matt Mower, Paolo Valdemarin, and Mary Hodder and Drummond Reed again, among others.


    7. Pinging

    Websites used to be mostly static. Search engines that crawled (or "spidered") them every so often did a good enough job to show reasonably current versions of your cousin's homepage or even Time magazine's weekly headlines. But when blogging took off, it became hard for search engines to keep up. (Google has only just managed to offer blog-search functionality, despite buying Blogger back in early 2003.)

    To know what was new in the blogosphere, users couldn't depend on services that spidered webpages once in a while. The solution: a way for blogs themselves to automatically notify blog-tracking sites that they'd been updated. Weblogs.com was the first blog "ping service": it displayed the name of a blog whenever that blog was updated. Pinging sites helped the blogosphere grow, and more tools, services, and portals started using pinging in new and different ways. Dozens of pinging services and sites—most of which can't talk to each other—sprang up.

    Matt Mullenweg (the creator of open source blogging software WordPress) decided that a one-stop service for pinging was needed. He created Ping-o-Matic—which aggregates ping services and simplifies the pinging process for bloggers and tool developers. With Ping-o-Matic, any developer can alert all of the industry's blogging tools and tracking sites at once. This new kind of open standard, with shared infrastructure, is critical to the scalability of Web 2.0 services.

    As Matt said:
    There are a number of services designed specifically for tracking and connecting blogs. However it would be expensive for all the services to crawl all the blogs in the world all the time. By sending a small ping to each service you let them know you've updated so they can come check you out. They get the freshest data possible, you don't get a thousand robots spidering your site all the time. Everybody wins.
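    As a rough TypeScript sketch of what such a ping looks like on the wire, the weblogUpdates.ping XML-RPC call below follows the convention popularized by Weblogs.com; the endpoint URL and blog details are placeholders:

    // Sketch of a blog ping: a tiny XML-RPC "weblogUpdates.ping" request
    // telling a ping service that a blog has fresh content. Endpoint and
    // blog details are placeholders.
    async function pingService(
      endpoint: string,
      blogName: string,
      blogUrl: string
    ): Promise<string> {
      const payload =
        `<?xml version="1.0"?>` +
        `<methodCall><methodName>weblogUpdates.ping</methodName><params>` +
        `<param><value><string>${blogName}</string></value></param>` +
        `<param><value><string>${blogUrl}</string></value></param>` +
        `</params></methodCall>`;

      const response = await fetch(endpoint, {
        method: "POST",
        headers: { "Content-Type": "text/xml" },
        body: payload,
      });
      return response.text(); // XML-RPC response indicating success or failure
    }

    // Usage (placeholder endpoint):
    pingService("http://ping.example.com/RPC2", "My Weblog", "http://example.com/blog")
      .then((xml) => console.log(xml));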

    Movers and Shakers:
    Matt Mullenweg, Jim Winstead, Dave Winer


    8. Routing

    Bloggers used to have to manually enter the links and content snippets of blog posts or news items they wanted to blog. Today, some RSS aggregators can send a specified post directly into an associated blogging tool: as bloggers browse through the feeds they subscribe to, they can easily specify and send any post they wish to "reblog" from their news aggregator or feed reader into their blogging tool. (This is usually referred to as "BlogThis.") As structured blogging comes into its own (see the section on Microcontent Publishing), it will be increasingly important to maintain the structural integrity of these pieces of microcontent when reblogging them.

    The promising RedirectThis standard will provide a "BlogThis"-like capability while maintaining the integrity of the microcontent. RedirectThis will let bloggers and content developers attach a simple "PostThis" button to their posts. Clicking on that button will send that post to the reader/blogger's favorite blogging tool. This favorite tool is specified at the RedirectThis web service, where users register their blogging tool of choice. RedirectThis also helps maintain the integrity and structure of microcontent—then it's just up to the user to prefer a blogging tool that also attains that lofty goal of microcontent integrity.

    OutputThis is another nascent web services standard, to let bloggers specify what "destinations" they'd like to have as options in their blogging tool. As new destinations are added to the service, more checkboxes would get added to their blogging tool—allowing them to route their published microcontent to additional destinations.

    Movers and Shakers:
    Michael Migurski, Lucas Gonze


    9. Open Communications

    Likely, you've experienced the joys of finding friends on AIM or Yahoo Messenger, or the convenience of Skyping with someone overseas. Not that you're about to throw away your mobile phone or BlackBerry, but for many, also having access to Instant Messaging (IM) and Voice over IP (VoIP) is crucial.

    IM and VoIP are mainstream technologies that already enjoy the benefits of open standards. Entire industries are born—right this second—based around these open standards. Jabber has been an open IM technology for years—in fact, as XMPP, it was officially dubbed a standard by the IETF. Although becoming an official IETF standard is usually the kiss of death, Jabber looks like it'll be around for a while, as entire generations of collaborative, work-group applications and services have been built on top of its messaging protocol. For VoIP, Skype is clearly the leading standard today—though one could argue just how "open" it is (and defenders of the IETF's SIP standard often do). But it is free and user-friendly, so there won't be much argument from users  about it being insufficiently open. Yet there may be a cloud on Skype's horizon: web behemoth Google recently released a beta of Google Talk, an IM client committed to open standards. It currently supports XMPP, and will support SIP for VoIP calls.

    Movers and Shakers:
    Jeremie Miller, Henning Schulzrinne, Jon Peterson, Jeff Pulver


    10. Device Management and Control

    To access online content, we're using more and more devices. BlackBerrys, iPods, Treos, you name it. As the web evolves, more and more different devices will have to communicate with each other to give us the content we want when and where we want it. No-one wants to be dependent on one vendor anymore—like, say, Sony—for their laptop, phone, MP3 player, PDA, and digital camera, just so that it all works together. We need fully interoperable devices, and the standards to make that work. And to make full use of online content and innovative web services as content moves online, those standards need to be open.

    MIDI (musical instrument digital interface), one of the very first open standards in music, connected disparate vendors' instruments, post-production equipment, and recording devices. But MIDI is limited, and MIDI II has been very slow to arrive. Now a new standard for controlling musical devices has emerged: OSC (Open SoundControl). This protocol is optimized for modern networking technology and inter-connects music, video and controller devices with "other multimedia devices." OSC is used by a wide range of developers, and is being taken up in the mainstream MIDI marketplace.

    Another open-standards-based device management technology is ZigBee, for building wireless intelligence and network monitoring into all kinds of devices. ZigBee is supported by many networking, consumer electronics, and mobile device companies.


          · · · · · ·    

    The Change to Openness

    The rise of open source software and its "architecture of participation" are completely shaking up the old proprietary-web-services-and-standards approach. Sun Microsystems—whose proprietary Java standard helped define the Web 1.0—is opening its Solaris OS and has even announced the apparent paradox of an open-source Digital Rights Management system.

    Today's incumbents will have to adapt to the new openness of the Web 2.0. If they stick to their proprietary standards, code, and content, they'll become the new walled gardens—places users visit briefly to retrieve data and content from enclosed data silos, but not where users "live." The incumbents' revenue models will have to change. Instead of "owning" their users, users will know they own themselves, and will expect a return on their valuable identity and attention. Instead of being locked into incompatible media formats, users will expect easy access to digital content across many platforms.

    Yesterday's web giants and tomorrow's users will need to find a mutually beneficial new balance—between open and proprietary, developer and user, hierarchical and horizontal, owned and shared, and compatible and closed.


    Marc Canter is an active evangelist and developer of open standards. Early in his career, Marc founded MacroMind, which became Macromedia. These days, he is CEO of Broadband Mechanics, a founding member of the Identity Gang and of ourmedia.org. Broadband Mechanics is currently developing the GoingOn Network (with the AlwaysOn Network), as well as an open platform for social networking called the PeopleAggregator.

    A version of the above post appears in the Fall 2005 issue of AlwaysOn's quarterly print blogozine, and ran as a four-part series on the AlwaysOn Network website.

    (Via Marc's Voice.)

    ]]>
    Breaking the Web Wide Open! http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/882Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Yet Another RSS History: "

    [You don’t expect me to work out the CSS right after making it semantic, do you?]

    Shift to another universe. It’s sometime in the late 1990’s. Ramanathan Guha, Tim Bray, Dave Winer, Tantek Çelik, Dan Libby and Dan Connolly are sharing a jacuzzi*. As they sip Margaritas, their conversation goes like this:

    • DanL

      So, we’ve got this idea for publishing content that’s a bit like CDF, but we’ve made the system more of a service than just a desktop thing.

    • Guha

      Sounds cool. Might be a good fit with this RDF thing I’ve been working on.

    • Dave

      Hmm, Dan’s stuff does sound cool, but with all due respect dude, RDF does seem a bit complicated. I really don’t think the folks out in userland would get it. And they majored in graphs.

    • Tim

      Maybe we could make it a bit more straightforward, you know, like put pointy brackets around it?

    • Dave

      Straightforward’s good. Better still, simple. They like simple.

    • Tantek

      But what about the rest of the Web, you know, like HTML?

    • DanL

      Hmm, but how do we do the timestamping kind of thing, and wrap it up in a ‘microposty’ way, the things that makes this distribution mode work?

    • Guha

      Yeah, metadata is cool. Keep the metadata.

    • Tim

      Not cheap though. The Web must be cheap. Did Andreessen show you his pictures..?

    • Dave

      …’Microposty’? you mean like my newsletter thing, but on the Web?

    • DanL

      Yep, like Cool Diary Entry of the Day

    • Tim

      But do we really need 1000 pages of spec for that?

    • Tantek

      …Incidentally, did you see my Box Model Hack?

    • Guha

      Yup.

    • DanL

      Yup.

    • Tim

      Yup.

    • Dave

      Yup. I explained that on DaveNet last year.

    • MarcC

      Hey! I’ve got it: ‘MyDigitalCocktail’..?

    • DanC

      Hang on, that gives me an idea

    There was a tangible outcome to this conversation: a document format which supports content and unambiguous, explicit, data and metadata, timestamping and much, much more. It’s viewable in a regular browser. Can be syndicated; can be aggregated. Unlike forgetful RSS, archives are almost always retrievable using regular HTTP methods. In this universe there was no RSS. No syndication wars. No talking-at-cross-purposes conflict between docheads and dataheads, syntax fans and model fans. No-one had to publish simple data in Byzantine RDF/XML. No-one had to deal with doubly-escaped content and silent data loss. There was no need for any new format for business cards, calendars, blogs, link lists, reviews, pet profiles. XHTML with CSS was more than enough. DanL got the MyNetscape he wanted. Tim got the simple, tight format he wanted. Guha got the AI. Tantek got to do presentations in a cool black raincoat. DanC finally got his schedule on his Palm Pilot. Dave got the credit. MarcC got the parasols and a grass skirt none of the others would admit to having brought.

    Shift back to this universe. Check out hAtom. It’s not finished yet, but David’s been methodically working through the (utterly sound) microformats process. Looks good to me.

    * apologies for the imagery, but how else do you think Silicon Valley might seem to someone raised in the cowpat-coated hills of Derbyshire?

    PS. Apologies to everyone mentioned. And before you suggest it, blogging *is* therapy.

    "

    (Via Raw.)

    ]]>
    Yet Another RSS Historyhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/880Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Great report from Dare as usual :-) Beyond the obvious value of the post (information-wise), I am also using the post placement here as a simple demonstration of what Blogs can offer (if driven or built atop a Web 2.0+ platform like Virtuoso). See the post that follows...

    Web 2.0 Conference Trip Report: Mash-ups 2.0 - Where's the Business Model?: "

    I attended the panel on business models for mash-ups hosted by Dave McClure, Jeffrey McManus, Paul Rademacher, and Adam Trachtenberg.

    A mash up used to mean remixing two songs into something new and cool, but now the term has been hijacked by geeks to mean mixing two or more web-based data sources and/or services.

    Paul Rademacher is the author of the Housing Maps mash-up, which he built as a way to find a house using Craig's List + Google Maps. The data obtained from Craig's List is fetched via screen scraping. Although Craig's List has RSS feeds, they didn't meet his needs. Paul also talked about some of the issues he had with building the site, such as the fact that since most browsers block cross-site scripting using XMLHttpRequest, a server needs to be set up to aggregate the data instead of all the code running in the browser. The site has been very popular and has garnered over 900,000 unique visitors based solely on word-of-mouth.
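    As a rough TypeScript sketch of the workaround Paul describes (the route and upstream URL are placeholders): because the browser blocks cross-site XMLHttpRequest calls, a small server on the mash-up's own origin fetches the third-party data and relays it to the page.

    // Sketch: same-origin relay for mash-up data. The browser cannot call a
    // third-party site directly, so this small server fetches (and could
    // cache) the upstream data and hands it to the page from our own origin.
    // Upstream URL and route are hypothetical.
    import * as http from "http";

    const UPSTREAM = "http://listings.example.com/latest"; // third-party data source

    http
      .createServer(async (req, res) => {
        if (req.url === "/proxy/listings") {
          try {
            const upstream = await fetch(UPSTREAM); // server-side fetch, no same-origin limits
            const body = await upstream.text();
            res.writeHead(200, {
              "Content-Type": upstream.headers.get("content-type") ?? "text/plain",
            });
            res.end(body);
          } catch {
            res.writeHead(502);
            res.end("upstream unavailable");
          }
        } else {
          res.writeHead(404);
          res.end();
        }
      })
      .listen(8080);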

    The question was asked as to why he didn't make this a business but instead took a job at Google. He listed a number of very good reasons:

    1. He did not own the data that was powering the application.
    2. The barrier to entry for such an application was low, since there was no unique intellectual property or user interface design to his application.

    I asked whether he'd gotten any angry letters from the legal department at Craig's List and he said they seem to be tolerating him because he drives traffic to their site and caches a bunch of data on his servers so as not to hit their servers with a lot of traffic. 

    A related mash-up site called Trulia, which scrapes real estate websites, was then demoed. A member of the audience asked whether Paul thought the complexity of mash-ups using more than two data sources and/or services increased in a linear or exponential fashion. Paul said he felt it increased in a linear fashion. This segued into a demo of SimplyHired, which integrates with a number of sites including PayScale, LinkedIn, job databases, etc.

    At this point I asked whether they would have service providers giving their perspective on making money from mash-ups since they are the gating factor because they own the data and/or services mash-ups are built on. The reply was that the eBay & Yahoo folks would give their perspective later.

    Then we got a demo of a Google Maps & eBay Motors mash-up. Unlike the Housing Maps mash-up, all the data is queried live instead of cached on the server. eBay has dozens of APIs that encourage people to build against their platform, and they have an affiliates program so people can make money from building on their API. We also got shown Unwired Buyer, which is a site that enables you to bid on eBay using your cell phone and even calls you just before an auction is about to close. Adam Trachtenberg pointed out that since there is a Skype API, perhaps some enterprising soul could mash up eBay & Skype.

    Jeffrey McManus of Yahoo! pointed out that you don't even need coding skills to build a Yahoo! Maps mash-up since all it takes is specifying your RSS feed with longitude and latitude elements on each item to have it embedded in the map. I asked why unlike Google Maps and MSN Virtual Earth, Yahoo! Maps doesn't allow users to host the maps on their page nor does there seem to be an avenue for revenue sharing with mash-up authors via syndicated advertising. The response I got was that they polled various developers and there wasn't significant interest in embedding the maps on developer's sites especially when this would require paying for hosting.
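    As a rough TypeScript sketch of the technique Jeffrey describes (using the W3C geo vocabulary as one common convention; element names and values are illustrative), it amounts to emitting ordinary RSS items with latitude and longitude attached:

    // Sketch: build one RSS <item> carrying latitude/longitude so a mapping
    // service can plot it. Element names follow the W3C geo vocabulary
    // (geo:lat / geo:long); values and URLs are illustrative.
    interface MappablePost {
      title: string;
      link: string;
      lat: number;
      long: number;
    }

    function toRssItem(post: MappablePost): string {
      return [
        "<item>",
        `  <title>${post.title}</title>`,
        `  <link>${post.link}</link>`,
        `  <geo:lat>${post.lat}</geo:lat>`,
        `  <geo:long>${post.long}</geo:long>`,
        "</item>",
      ].join("\n");
    }

    // Usage:
    console.log(
      toRssItem({
        title: "Coffee shop with free WiFi",
        link: "http://example.com/posts/42",
        lat: 37.7749,
        long: -122.4194,
      })
    );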

    We then got shown a number of mapping mashups, including a mashup of the London bombings which used Google Maps, Flickr & RSS feeds of news (the presenter had the poor taste to point out opportunities to place ads on the site), a mashup from alkemis which mashes Google Maps, A9.com street level photos and traffic cams, and a mash-up from Analygis which integrates census data with Google Maps data.

    The following items were then listed as the critical components of mash-ups:
     - AJAX (Jeffrey McManus said it isn't key but a few of the guys on the panel felt that at least dynamic UIs are better)
     - APIs
     - Advertising
     - Payment
     - Identity/Acct mgmt
     - Mapping Services
     - Content Hosting
     - Other?

    On the topic of identity and account management, the question of how mash-ups handle user passwords came up as a problem. If a website is password protected, then users often have to enter their usernames and passwords into third-party sites. An example of this was the fact that PayPal used to store lots of username/password information of eBay users, which caused the company some consternation, since eBay went through a lot of trouble to protect their sensitive data only to have a lot of it being stored on PayPal servers.

    eBay's current solution is similar to that used by Microsoft Passport in that applications are expected to have users log in via the eBay website, after which the user is redirected to the originating website with a ticket indicating they have been authenticated. I pointed out that although this works fine for websites, it offers no solution for people trying to build desktop applications that are not browser based. The response I got indicated that eBay hasn't solved this problem.
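    As a rough TypeScript sketch of the redirect-plus-ticket pattern described here (not eBay's or Passport's actual protocol; URLs and parameter names are hypothetical), the third-party site never sees the password, only a ticket it can verify with the identity provider:

    // Sketch of the redirect-and-ticket pattern: the user signs in at the
    // identity provider, comes back with a ticket, and the mash-up verifies
    // that ticket server-to-server. All URLs and parameter names are
    // hypothetical; no password ever touches the third-party site.
    const PROVIDER_LOGIN = "https://auth.example.com/login";
    const PROVIDER_VERIFY = "https://auth.example.com/verify";

    // Step 1: send the user to the provider, asking it to return to us.
    function loginRedirectUrl(returnUrl: string): string {
      return `${PROVIDER_LOGIN}?return=${encodeURIComponent(returnUrl)}`;
    }

    // Step 2: when the user comes back with ?ticket=..., verify the ticket
    // with the provider before trusting the session.
    async function verifyTicket(ticket: string): Promise<boolean> {
      const response = await fetch(
        `${PROVIDER_VERIFY}?ticket=${encodeURIComponent(ticket)}`
      );
      if (!response.ok) return false;
      const result = (await response.json()) as { valid: boolean };
      return result.valid;
    }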

    My main comment about this panel is that it didn't meet expectations. I'd expected to hear a discussion about turning mashups [and maybe the web platforms they are built on] into money making businesses. What I got was a show-and-tell of various mapping mashups. Disappointing.

    "

    (Via Dare Obasanjo aka Carnage4Life.)

    ]]>
    Web 2.0 Conference Trip Report: Mash-ups 2.0 - Where's the Business Model?http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/871Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Dare Obasanjo recently contributed to the Web 2.0 clarification effort. His post-processing of the Web 2.0 treatise by Tim O'Reilly certainly got me thinking about the thorny issue of attempting to define Web 2.0. As most already know, the subject of Web 2.0 definition has been contentious from the onset (unfortunately for the wrong reasons: hype over substance):
    just take a look at the oxymoronic Wikipedia 2.0 imbroglio to get my drift. In retrospect, I should have called on Esquire magazine to get the Web 2.0 article going :-).
    Anyway, back to Dare's analysis of Tim's 7 Web 2.0 litmus test items listed below:
    • Services, not packaged software, with cost-effective scalability
    • Control over unique, hard-to-recreate data sources that get richer as more people use them
    • Trusting users as co-developers
    • Harnessing collective intelligence
    • Leveraging the long tail through customer self-service
    • Software above the level of a single device
    • Lightweight user interfaces, development models, AND business models
    And trimmed down to 3 by Dare:
    • Exposes Web services that can be accessed on any device or platform by any developer or user. RSS feeds, RESTful APIs and SOAP APIs are all examples of Web services.
    • Harnesses the collective knowledge of its user base to benefit users
    • Leverages the long tail through customer self-service
    Well, I would like to summarize this a little further using a few excerpts from my numerous contributions to the Web 2.0 talk page on Wikipedia (albeit mildly revised; see strikeouts etc.):
    Web 2.0 is a web of executable service invocation endpoints (those Web Services URIs) and well-formed content (all of that RSS, Atom, RDF, XHTML, etc. based Web Content out on the NET). The executable service invocation endpoints and well-formed content are accessible via URIs.

    Put in even simpler terms, Web 2.0 is an incarnation of the web defined by URIs for invoking Web Services and/or consuming or syndicating well-formed content.

    Looks like I've self edited my own definition in the process. :-)

    If you don't grok this definition then consider using it as a trigger for taking a closer look at the dynamics that genuinely differentiate Web 1.0 and Web 2.0.

    In another Wikipedia "talk page" contribution (regarding "Web 2.0 Business Impact") I attempt to answer the question posed here, which should also shed light on the premise of my definition above:

    Web 1.0 was about web sites geared towards an interaction with human beings as opposed to computers. In a sense this mirrors the difference between HTML and XML.

    A simple example (purchasing a book):

    amazon.com provides value to you by enabling you to search and purchase the desired book online via the site http://www.amazon.com.

    In the Web 1.0 era the process of searching for your desired book, and then eventually purchasing the book in question, required visible interaction with the site http://www.amazon.com. In today's Web 2.0 based Web the process of discovering a catalog of books, searching for your particular book of interest, and eventually purchasing the book, occurs via Web Services which amazon has chosen to expose via an executable endpoint (the Web point of presence for exposing its Web Services).

    Direct interaction via http://www.amazon.com is no longer required. A weblog can quite easily associate keywords, tags, and post categories with items in amazon.com's catalogs. In addition, weblogs can also act as entry points for consuming the amazon.com value proposition (making books available for purchase online), by enabling you to purchase a book directly from the weblog (assuming the blog owner is an amazon associate etc.). Now compare the impact of this kind of value discovery and consumption cycle driven by software to the same process driven by human interaction with a static or dynamic HTML page (Web 1.0 site).
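    To make the contrast concrete, here is a small hypothetical TypeScript sketch (not Amazon's actual API; the endpoint and fields are invented for illustration) of the Web 2.0 style of interaction: a weblog querying a bookseller's catalog service directly via a URI, with no human visiting the site:

    // Hypothetical sketch of the software-driven interaction described above:
    // a weblog engine consumes a bookseller's catalog via a service URI
    // instead of a person browsing the HTML site. Endpoint and fields are
    // invented for illustration.
    interface CatalogEntry {
      isbn: string;
      title: string;
      price: number;
    }

    async function searchCatalog(keyword: string): Promise<CatalogEntry[]> {
      const endpoint = `http://catalog.example.com/search?q=${encodeURIComponent(keyword)}`;
      const response = await fetch(endpoint);
      if (!response.ok) {
        throw new Error(`Catalog service returned HTTP ${response.status}`);
      }
      return (await response.json()) as CatalogEntry[];
    }

    // Usage: a blog post about "linked data" could embed matching titles,
    // each one purchasable through the blog owner's associate link.
    searchCatalog("linked data").then((books) =>
      books.forEach((b) => console.log(`${b.title} (${b.isbn}) - $${b.price}`))
    );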

    To summarize, Web 2.0 is a reflection of the potential of XML expressed through the collective impact of Web Services (XML based distributed computing) and Well-formed Content (Blogosphere, Wikisphere, XHTML micro content etc.). The potential simply comes down to the ability to ultimately connect events, triggers, impulses (chatter, conversation, etc.), and data in general via URIs.

    Let's never forget that XML is the reason why we have a blogosphere (RSS/Atom/RDF are applications of XML). Likewise, XML is also the reason why we have Web Services (doesn't matter what format).

    As I have stated in the past, we must go by Web 2.0 en route what is popularly referred to as the Semantic Web (it will be known by another name by the time we get there; 3.0 or 4.0, who knows or cares?). At the current time, the prerequisite activity of self annotation is in full swing on the current Web, thanks to the inflective effects of Web 2.0.

    BTW - Would this URI to all Semantic Web related posts on my blog pass the Web 2.0 litmus test? Likewise, this URI to all Web 2.0 related posts? I wonder :-)

    ]]>
    The Web 2.0 Litmus Testhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/870Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    Microsoft Gadgets, Start.com and Innovation: "

    A lot of the comments in the initial post on the Microsoft Gadgets blog are complaints that Microsoft is copying ideas from Apple's dashboard. First of all, people should give credit where it is due and acknowledge that Konfabulator is the real pioneer when it comes to desktop widgets. More importantly, the core ideas in Microsoft Gadgets were pioneered by Microsoft, not Apple or Konfabulator.

    From the post A Brief History of Windows Sidebar by Sean Alexander

    Microsoft 'Sideshow*' Research Project (2000-2001)

    While work started prior, in September 2001, a team of Microsoft researchers published a paper entitled, 'Sideshow: Providing peripheral awareness of important information' including findings of their project.
    ...
    The research paper provides screenshots that bear a striking resemblance to the Windows Sidebar. The paper is a good read for anyone thinking about Gadget development. For folks who have visited Microsoft campuses, you may recall the posters in elevator hallways and Sidebar running on many employees desktops. Technically one of the first teams to implement this concept

    *Internal code-name, not directly related to the official, ‘Windows SideShow™’ auxiliary display feature in Windows Vista.>

    Microsoft ‘Longhorn’ Alpha Release (2003)

    In 2003, Microsoft unveiled a new feature called, 'Sidebar' at the Microsoft Professional Developer’s Conference. This feature took the best concepts from Microsoft Research and applied them to a new platform code-named, 'Avalon', now formally known as Windows Presentation Foundation...

    Microsoft Windows Vista PDC Release (2005)

    While removed from public eye during the Longhorn plan change in 2004, a small team was formed to continue to incubate Windows Sidebar as a concept, dating back to its roots in 2000/2001 as a research exercise. Now Windows Sidebar will be a feature of Windows Vista. Feedback from customers and hardware industry dynamics are being taken into account, particularly adding support for DHTML-based Gadgets to support a broader range of developer and designer, enhanced security infrastructure, and better support for Widescreen (16:10, 16:9) displays. Additionally a new feature in Windows Sidebar is support for hosting of Web Gadgets which can be hosted on sites such as Start.com or run locally. Gadgets that run on the Windows desktop will also be available for Windows XP customers – more details to be shared here in the future.

    So the desktop version of 'Microsoft Gadgets' is the shipping version of Microsoft Research's 'Sideshow' project. Since the research paper was published a number of parties have shipped products inspired by that research including MSN Dashboard, Google Desktop and Desktop Sidebar but this doesn't change the fact that Microsoft is the pioneer in this space.

    From the post Gadgets and Start.com by Sanaz Ahari

    Start.com was initially released on February 2005, on start.com/1 – since then we’ve been innovating regularly (start.com/2, start.com/3, start.com and start.com/pdc) working towards accomplishing our goals:

    • To bring the web’s content to users through:
      • Rich DHTML components (Gadgets)
      • RSS and behaviors associated with RSS
      • High customizability and personalization
    • To enable developers to extend their start experience by building their own Gadgets

    Yesterday marked a humble yet significant milestone for us – we opened our 'Atlas' framework enabling developers to extend their start.com experience. You can read more about it here: http://start.com/developer. The key differentiators about our Gadgets are:

    • Most web applications were designed as closed systems rather than as a web platform. For example, most customizable 'aggregator' web-sites consume feeds and provide a fair amount of layout customization. However, the systems were not extensible by developers. With start.com, the experience is now an integrated and extensible application platform.
    • We will be enriching the gadgets experience even further, enabling these gadgets to seamlessly work on Windows Sidebar

    The Start.com stuff is really cool. Currently with traditional portal sites like MyMSN or MyYahoo, I can customize my data sources by subscribing to RSS feeds but not how they look. Instead all my RSS feeds always look like a list of headlines. These portal sites usually use different widgets for displaying richer data like stock quotes or weather reports but there is no way for me to subscribe to a stock quote or weather report feed and have it look the same as the one provided by the site. Start.com fundamentally changes this model by turning it on its head. I can create a custom RSS feed and specify how it should render in Start.com using JavaScript which basically makes it a Start.com gadget, no different from the default ones provided by the site.

    From my perspective, we're shipping really innovative stuff but because of branding that has attempted to cash in on the 'widgets' hype, we end up looking like followers and copycats.

    Marketing sucks.

    "

    (Via Dare Obasanjo aka Carnage4Life.)

    Posted for historic annotation purposes (re. widgets: Microsoft didn't copy Apple here at all; Apple just packaged the idea better, at the expense of Konfabulator, as already noted above). And yes, marketing sucks big time!!]]>
    Microsoft Gadgets, Start.com and Innovationhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/868Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Regurgitating an old rant (Encoding, XForms, and SOAP/XML-RPC): "

    I ran into two work-related problems today that left me feeling like there are some aspects of two very recent (Web 2.0-esque if we wish to join the buzzword orgy of late) architectures (REST/Services and XForms) that are problematic:

    Demonstrating an Achilles Heel Of XML Messaging

    XML as a medium for remote communication (evangelized more with WSDL-related architectures than in REST) has over-stated its usefulness in at least one concrete regard, in my estimation. I've had a hard time taking most of the architectural arguments on the pros/cons of SOAP/XML-RPC versus REST seriously because it seems to be nothing more than buzzword warfare. However, I recently came across a concrete, real world example of the pitfalls of implementing certain remote service needs on XML-based communication mediums (such as SOAP/XML-RPC).

    If the objects/resources you wish to manipulate at the service endpoints are run of the mill (consider the standard cliche purchase order example(s)) then the benefits of communicating in XML are obvious: portability, machine readability, extensibility, etc. However, consider the scenario (which I face) in which the objects/resources you wish to manipulate are XML documents themselves! This scenario seems to work to the disadvantage of the communication architecture.

    Let's say you have a repository at one end (which I do) that has XML documents you wish to manipulate remotely. How do you update the documents? I've discussed this before (see: Base64 encoded XML content from an XForm) so I'll spare the details of the problem. However, I will mention that in retrospect this particular problem further emphasizes the advantage of a MinimalistRemoteProcedureCall (MRPC) approach - MRPC is my alternative acronym for REST :).

    Consider the setContent message:

    [SOAP:Envelope]
        [SOAP:Body]
         [foo:setContent]
           [path] .. path to document [/path]
           [src]... new document as a fragment ...[/src]
         [/foo:setContent]
       [/SOAP:Body]
    [/SOAP:Envelope]
    

    Notice that the location of the resource we wish to update is embedded within the message transmitted (via SOAP), which is transported on top of another communication medium (HTTP) that already has the necessary semantics for saying the same thing:

    Set the content of the resource identified by a path

    In the SOAP scenario, the above message is delivered to a single service endpoint (which serves as an external gateway for all SOAP messages) which has to then parse the entire XML message in order to determine the method invoked (setContent in this case) and the parameters passed to it (both of which are only header information on a document that consists mostly of the new document).

    However, in the MRPC scenario this service would be invoked simply as an HTTP PUT request sent directly to the XML document we wish to update:

    Method: PUT
    Protocol:  HTTP/1.0
    URI: http://remoteHost:port/< .. path to XML document ..>
    CONTENT:
    ... new document in its entirety ..
    

    Here, there is no need for a service middleman to interpret the service requested (and no need to parse a large XML document that contains another document embedded as a fragment). The HTTP request by itself specifies everything we need and does it using HTTP alone as the communication medium. This is even more advantageous when the endpoint is a repository that has a very well defined URI scheme or general addressing mechanism for its resources (which 4Suite does, the repository in my case).
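
    Concretely, the MRPC/REST variant is a handful of lines; in this Python sketch the repository host and document path are placeholders:

    # Minimal sketch of the REST-style update: PUT the new document straight at the
    # resource's own URI. Host and path are placeholders.
    from urllib.request import Request, urlopen

    def put_document(host_port, doc_path, xml_bytes):
        """Replace an XML document in the repository with a single HTTP PUT."""
        req = Request(
            url=f"http://{host_port}/{doc_path}",
            data=xml_bytes,                      # the new document, in its entirety
            method="PUT",
            headers={"Content-Type": "application/xml"},
        )
        with urlopen(req) as resp:
            return resp.status                   # e.g. 200/201/204 on success

    # put_document("remoteHost:8080", "docs/purchase-order.xml", open("po.xml", "rb").read())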

    The Headaches of Base 64 Encoding in XForms

    Since I didn't have the option of a REST-based service architecture (the preferred solution), I was relegated to having to Base64-encode the new XML content and embed it within the XML message submitted to the service endpoint, like so:

    [SOAP:Envelope]
       [SOAP:Body]
         [foo:setContent]
           [path] .. path to document [/path]
           [src]... base64 encoding of new document's serialization ...[/src]
         [/foo:setContent]
       [/SOAP:Body]
    [/SOAP:Envelope]
    

    Base 64 seemed like the obvious encoding mechanism, mostly because it would seem from an interpretation of the XForms specification that, due to the data binding restrictions of the Upload Control when bound to instances of type xsd:base64Binary, a conforming XForms processor is responsible for having the capability to encode to Base 64 on the fly. Now, this is fine and dandy if the XML content you wish to submit is retrieved from a file on the local file system of the client communicating remotely with the server. However, what if you wish to use an instance (a live DOM) as the source for the update? This seems like a very reasonable requirement given that one of the primary motivations of XForms is to encourage the use of XML instances as the user interface data model (providing a complete solution to the 'M' in the MVC architecture).
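
    For illustration, the manual workaround boils down to serializing the instance yourself and Base64-encoding it into the src element; a rough Python sketch, with the envelope shape borrowed from the setContent example above (prefixes and element names are illustrative):

    # Rough sketch: serialize an in-memory XML instance and Base64-encode it into
    # the src element of the (simplified) setContent message shown earlier.
    import base64
    import xml.etree.ElementTree as ET

    def encode_instance(instance_root):
        """Serialize a live XML instance and return its Base64 text."""
        raw = ET.tostring(instance_root, encoding="utf-8")
        return base64.b64encode(raw).decode("ascii")

    def build_set_content(path, instance_root):
        """Assemble the (simplified, illustrative) SOAP body carrying the encoded document."""
        return (
            "<SOAP:Envelope><SOAP:Body><foo:setContent>"
            f"<path>{path}</path>"
            f"<src>{encode_instance(instance_root)}</src>"
            "</foo:setContent></SOAP:Body></SOAP:Envelope>"
        )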

    However:

    • There is no mechanism within XForms for serialising live instances (there needs to be such a standard so implementations don't create their own proprietary mechanisms)
    • There is no mechanism within XForms for explicitly encoding text in some portable binary format (which is incredibly useful IMHO - as shown above)
    "

    (Via Uche Ogbuji.)

    ]]>
    Regurgitating an old rant (Encoding, XForms, and SOAP/XML-RPC)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/862Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    John C. Dvorak pens an interesting piece about the "deafening silence" accorded Windows Vista thus far.

    In the past I have expressed views that echo the essence of John's piece. It has been pretty darn clear to me that Microsoft is struggling as a result of its inability to handle challenges associated with the metaphoric "computing vase" which it sought to own solely, a result of its proclivity for crushing and/or alienating erstwhile technology partners as part of this quest (a process that commenced a long time ago, culminating in the contradiction and ultimate paradox called IE7; remember, not too long ago it was impossible to separate IE from Windows! It could only exist as an OS extension etc.).

    Windows in its current incarnation fails to provide a productive working environment: you either have a plethora of viruses and spyware contending for your computing resources, or you have all the software in place to protect against these assaults, rendering the computing resources equally busy. The computing power lag is simply too much when using Windows, and this is its Achilles heel!

    I have been using Windows since version 2.0, and although I have always found the Mac OS variations to be superior on the UI front, I never found any of the historic versions viable alternatives. In my case, this is all about providing a productive work environment across the following usage modes, in descending order of priority:

    1. Power User (Outlook, Excel, Word, and other desktop productivity tools)
    2. Product Testing and QA
    3. Programmer Buddy (a Microsoft term)
    4. Programming (for the most part prototyping)

    The release of Mac OS X Tiger led me down an evaluation path that I have repeated many times in the past: test the viability of moving wholesale from Windows to Mac OS X and remain functional (if really lucky, exceed existing productivity levels). This time around I found that I could actually migrate over 6 years' worth of emails, contacts, presentations, documents, and spreadsheets from Windows to Mac OS X. I also discovered that this success extended all the way to my data linked documents that are transparently bound to back-end databases (in my case the norm rather than the exception, via ODBC).

    I now use Mac OS X as my prime working platform (I still have to use Windows as the platform remains strategic for all our product offerings), and I am absolutely loving it! The joint feelings of euphoria and confusion that I experienced post migration were similar to how I felt after making the transition from "stick shift" to "automatic" geared cars (as I transitioned my residence from the UK to the U.S). At the time I couldn't understand why anyone (other than a grand prix driver) would ever drive a "stick shift" by choice.

    Today, I can't understand why I stuck with Windows for so long at the expense of my daily working productivity. The biggest bonus from this transition is that Mac OS X has made it easier for me to engage less technical individuals (family & friends) in the sheer joy and potential of Information Technology across a variety of realms, as opposed to being confined solely to the "business computing" realm. I can demonstrate the power and potential of the Internet, Web, Web Services, Blogosphere, Wikisphere, with much more sanity and coherence now that my machine responds in a timely fashion during these demos, amongst other benefits.

    Some may deem this Windows bashing, but if they take the time to look a little deeper, this is simply "straight shooting" from a real computer user (I like my computers to deliver on their huge promised potential; I don't compromise this basic expectation; my computer and associated software should save me time and ramp up my productivity!). If Microsoft is still the company that it once was, then it would simply use this kind of commentary to rally its troops and get its act together! That's what I would do if a customer felt so badly about our technology (UDA or Virtuoso).

    ]]>
    End of Line for Microsoft?http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/856Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    After digesting Oblique Angle's post titled: World Wide Web of Junk, it was nice to be reassured that I am not part of a shrinking minority of increasingly perturbed Web users. The post excerpt below is what compelled me to contribute some of my thoughts about the current state of the Web and a future "Semantic Web".
    The value of the Internet as a repository of useful information is very low. Carl Shapiro in “Information Rules” suggests that the amount of actually useful information on the Internet would fit within roughly 15,000 books, which is about half the size of an average mall bookstore. To put this in perspective: there are over 5 billion unique, static & publicly accessible web pages on the www. Apparently Only 6% of web sites have educational content (Maureen Henninger, “Don’t just surf the net: Effective research strategies”. UNSW Press). Even of the educational content only a fraction is of significant informational value.
    Noise is taking over the Web at an alarming rate (to be expected in a sense), and even though Tim Berners-Lee (TBL) had the foresight to create the Web, many see nothing but futility in his vision for a "Semantic Web" (I don't!). A recent example of such commentary comes from Eric Nee's CIO article, titled: Web Future is Not Semantic, Or Overly Orderly. I take issue with this article because, like most (who have been bitten at least once), I don't like mono-culture. This article inadvertently promotes "Google Mono Culture". I have excerpted the more frustrating parts of this article below:

    ..As Stanford students, Larry Page and Sergey Brin looked at the same problem—how to impart meaning to all the content on the Web—and decided to take a different approach. The two developed sophisticated software that relied on other clues to discover the meaning of content, such as which Web sites the information was linked to. And in 1998 they launched Google..

    You mean noise ranking. Now, I don't think Larry and Sergey set out to do this, but Google page ranks are ultimately based on the concept of "Google Juice" (aka links). The value quotient of this algorithm is accelerating at internet speed (ironically, but naturally). Human beings are smarter than computers; we just process data (not information!) much more slowly, that's all. Thus, we can conjure up numerous ways to bubble up the Google link-ranking algorithms in no time (as is the case today).

    ..What most differentiates Google's approach from Berners-Lee's is that Google doesn't require people to change the way they post content..

    The Semantic Web doesn't require anyone to change how they post content either! It just provides a roadmap for intelligent content management and consumption through innovative products.

    ..As Sergey Brin told Infoworld's 2002 CTO Forum, "I'd rather make progress by having computers understand what humans write, than by forcing humans to write in ways that computers can understand." In fact, Google has not participated at all in the W3C's formulation of Semantic Web standards, says Eric Miller..

    Semantic Content generated by next generation content managers will make more progress, and they certainly won't require humans to write any differently. If anything, humans will find the process quite refreshing as and when participation is required e.g. clicking bookmarklets associated with tagging services such as 'del.icio.us', 'de.lirio.us', or Unalog and others. But this is only the beginning, if I can click on a bookmarklet to post this blog post to a tagging service, then why wouldn't I be able to incorporate the "tag service post" into the same process that saves my blog post (the post is content that ends up in a content management system aka blog server)?
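
    To make that concrete, here is a small Python sketch of what a blog engine's save hook could do, modeled loosely on the del.icio.us v1 posts/add call; the endpoint and parameter names here should be treated as assumptions:

    # Hedged sketch: push a just-saved blog post to a del.icio.us-style tagging
    # service from the same code path that stores the post. Endpoint/params assumed.
    from urllib.parse import urlencode
    from urllib.request import HTTPBasicAuthHandler, HTTPPasswordMgrWithDefaultRealm, build_opener

    API = "https://api.del.icio.us/v1/posts/add"   # historical v1 endpoint (assumed here)

    def tag_on_save(post_url, title, tags, user, password):
        """Called by the blog engine right after a post is persisted."""
        mgr = HTTPPasswordMgrWithDefaultRealm()
        mgr.add_password(None, API, user, password)
        opener = build_opener(HTTPBasicAuthHandler(mgr))
        query = urlencode({"url": post_url, "description": title, "tags": " ".join(tags)})
        with opener.open(f"{API}?{query}") as resp:
            return resp.read()                      # e.g. <result code="done"/>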

    Yet Google's impact on the Web is so dramatic that it probably makes more sense to call the next generation of the Web the "Google Web" rather than the "Semantic Web."

    Ah! So you think we really want the noisy "Google Web" as opposed to a federation of distributed Information- and Knowledgebases a la the "Semantic Web"? Somehow, I don't think so!

    Today we are generally excited about "tagging" but somehow fail to see its correlation with the "Semantic Web". I have said this before, and I will say it again: the "Semantic Web" is going to be self-annotated by humans with the aid of intelligent and unobtrusive annotation technology solutions. These solutions will provide context and purpose by using our social essence as currency. The annotation effort will be subliminal; there won't be a "Semantic Web Day" parade or anything of the like. It will appear before us all, in all its glory, without any fanfare. Funnily enough, we might not even call it "The Semantic Web" -- who cares? But it will have the distinct attributes of being very "Quiet" and highly "Valuable"; with no burden on "how we write", but constructive burden on "why we write" as part of the content contribution process (less Google/Yahoo/etc. juice chasing, more knowledge assembly and exchange).

    We are social creatures at our core. The Internet and Web have collectively reduced the connectivity hurdles that once made social network oriented solutions implausible. The eradication of these hurdles ultimately feeds the very impulses that trigger the critical self-annotation that is the basis of my fundamental belief in the realization of TBL's Semantic Web vision.

     

    ]]>
    World Wide Web of Junkhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/849Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    While I'm still trying to figure this out, you should read Shelley's original post, Steve Levy, Dave Sifry, and NZ Bear: You are Hurting Us and see whether you think the arguments against blogrolls are as wrong as I think they are.

    [via Dare Obasanjo aka Carnage4Life]
     
    Shelley's post does bring attention to important issues relating to the blogosphere. It touches on how a simple matter can get complex very quickly: all of a sudden, what was so simple becomes pretty complex.
     
    Blogrolls are completely ambiguous. We use them in a variety of ways, but the inherent ambiguity leads to misinterpretation, and in some cases it breeds dysfunctionality of the kind Shelley alludes to in this excerpt:

    "..The Technorati Top 100 is too much like Google in that ‘noise’ becomes equated with ‘authority’. Rather than provide a method to expose new voices, your list becomes nothing more than a way for those on top to further cement their positions. More, it can be easily manipulated with just the release of a piece of software.."

     
    When blogrolls started to appear on blog home pages there was no blogosphere as we know it today (most viewing was browser as opposed to aggregator based). Blogrolls were a great way of bootstrapping a burgeoning blogosphere (a kind of "look who's blogging now" symbol). The issue of blogrolls being dynamic, static, or genuinely meaningful was unimportant, unfortunately. In a sense they were simple, static, and in today's parlance: fashionably sloppy.
     
    Today, we have a very extensive and lively blogosphere, it is now mainstream, and has basically become a data source in its own right; introducing challenges exemplified by our inability to clearly state the meaning and purpose of a blogroll.
     
    The question of "blogroll meaning" may result in alternative use of "attention.xml" which has the prime goal of addressing challenges associated with tracking and reading posts from a large blog subscription pool. Why not use this as the basis for generating less ambiguous blogrolls?
     
    The blogosphere has been an important catalyst for understanding the current Web 2.0 inflection, as demonstrated by the transition from Web Browsers to Feed Aggregators & Readers for reading and tracking blogs (blog home pages are secondary aspects of the interaction with any given blog these days). Unfortunately, there is a general perception that Web 2.0 and the Semantic Web are mutually exclusive, primarily due to the perceived lofty goals of the latter (what's wrong with being challenged?). From my vantage point, I continue to see Web 2.0 as a necessary infrastructure component for the Semantic Web, one that will ultimately provide context for understanding why it's so important.
     
    The Semantic Web will certainly aid in our ability to infer or deduce the meaning of a blog owner's published blogroll since it provides a vehicle for conveying such meaning in human and machine consumable forms. Until then, I remain stumped. I see where Shelley is coming from, but I don't know what to do with my blogroll right this moment :-) On the other hand I certainly know what I am planning to do with my real blogroll (not the snapshot you see today) in the not too distant future.
     
    ]]>
    When did Blogrolls Become Evil?http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/846Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    My entire time in the IT industry has been spent primarily trying to develop, architect, test, mentor, evangelize, and educate about one simple subject: Standards Appreciation!

    The trouble with "Standards Appreciation" is that vendors see standards from the following perspectives primarily:

    1. Yet another opportunity to lock-in the customer
    2. If point 1. fails then undermine the standard vociferously (an activity that takes many covert forms; attack performance, security, and maturity)
    3. Developers don't like standards (the real reason for this is to-do lists and timeframes in most cases)

    Koranteng Ofosu-Amaah provides insightful perspective on the issues above, in a recent "must read" blog post about how this dysfunctionality plays out today in the realm of HTML Buttons and Forms. Here are some notable excerpts:

    "Instead my discourse devolved into a case of I told you so, a kind of Old Testament view of things instead of the softer New Age stylings that are in vogue these days. Sure there was a little concern for the users that had been hurt by lost data, but there was almost no empathy for the developers who had to lose their weekends furiously reworking their applications to do the right thing especially because it appeared that they would rather persist in trying to do the wrong thing.

    The sentiment behind that mini tempest-in-a-teapot however was a recognition of the fact that those who have been quietly evangelizing the web style were talking about the wrong thing and to the wrong people."

    ...

    "..As application developers we should ask for better forms, we should be demanding of browser makers things like XForms or Web Forms 2.0 to make sure that we can go beyond the kind of stilted usability that we currently have. Our users would appreciate our efforts in that vein but for now, they know what to expect. Until then application developers should push back when we are told to "do the wrong thing".

    There is an unfortunate mindset trend at the current time that espouses: "Sloppiness" is good, and "Simple" justifies inadequacy at all times. Today, the real focus of most development endeavours is popularity first and coherence (backward compatibility, standards compliance, security, scalability etc.) a distant second: if you can simply make things popular, then that justifies the sloppiness (acquisition, VC money, Blogosphere Juice etc.), especially as someone else will ultimately have to deal with the predictable ramifications of the sloppiness.

    Standards are critical to the success of IT investment within any enterprise, but standards are difficult to design, write, implement, and then comprehend; due to the inherent requirement for abstraction - it's a top down, as opposed to bottom up, process.

    Vendors will never genuinely embrace standards, until IT decision makers demand standards compliance of them, by demonstrating a penchant for smelling out "leaky abstractions" embedded within product implementations. Naturally, this requires a fundamental change of mindset for most decision makers. It means moving away from the "this analyst said...", "I heard that company X is going to deliver....", "I read that .....", "I saw that demo..." approach to product evaluation, to a more knowledgeable evaluation process that seeks out the What, Why, and How of any prospective IT solution. 

    Knowledge empowers all of the time. It's a gift that stands the test of time once you invest some time in its acquisition (unfortunately this gift isn't free!). Ignorance with all its superficial seduction (free and widely available!), is temporary bliss at best, and nothing but heartache over time.

    ]]>
    Standards Contempt Revisitedhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/834Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    By Mark Birbeck:

    Ajax, Hard Facts, Brass Tacks ... and Bad Slacks

    A number of people have contacted me recently about Ajax [1] -- a catchy name -- coined to provide an umbrella term for a particular group of technologies used to build web applications. The use of the word comes from Jesse James Garrett in a recent blog [2], and describes a class of internet applications written using JavaScript in a browser. By using JavaScript these apps have full access to the DOM, and as a consequence are able to make all sorts of changes to the page that the user is interacting with, without having to go back to the server.

    When the application does need to go back to the server -- to deliver some data and get a response -- the idea is to keep the DOM intact so that the user has a smooth experience. This means that all communication with the server needs to take place outside of the normal HTML form mechanism, since this would obviously replace the current page. Ajax addresses this with what it calls 'asynchronous JavaScript' -- retrieve only the data you need, and then directly manipulate the DOM to get the effect you want. 'Asynchronous JavaScript' accounts for the first few letters of the name, with the remainder being the obligatory 'X' for XML (although XML is not really key to this technology, and many of the applications that are often cited as Ajax apps don't use XML as the data medium).

    Buzzing

    The response to Ajax has been pretty positive. In fact the only negatives have been either to suggest a change of name or to moan a little that "I've been doing this for years, why hasn't anyone noticed me?" (I won't put any links to those sorts of articles, since they are a little embarrassing -- after all, everyone has been doing this for years!) Anyway, despite a couple of sour-pusses, the software community is almost universally excited, and the blog wires have glowed over the last few months with descriptions of Google Maps, GMail, and so on.

    Just about everyone who has asked me about Ajax has expected me to be disappointed. Surely, they say, this makes the case for XForms weaker? But my answer is the exact opposite -- XForms and standards-based web applications are in every way superior to the techniques described as Ajax, since the whole raison d'être of XForms and XHTML 2 is to address the very problems that Ajax-like techniques suffer from. That may come across as a little bold... so perhaps I should explain.

    From Workaround to Feature

    We've all been using HTML mark-up for years now, and the language hasn't changed much in that time. As a consequence, the increasing demand for more complex web pages has meant that the balance in our documents has shifted increasingly from vanilla mark-up to 'the workaround'. Whether it's providing tooltips, dynamic/repeating data sections, or small portions of our page that change without having to request a new document, we've generally had to dive into script. But the shift from mark-up to script has meant that the mark-up language itself has been relegated to a mere carrier for programs.

    Unfortunately this means that no-one gains -- it's annoying for the programmer to have to produce ever more convoluted spaghetti JavaScript to meet the demands of their audience, but it's also annoying for the non-programmer, who probably only wants a tooltip. And it's particularly annoying for those who want to use documents on the web for more ambitious applications to find that most of the important stuff in a document is hidden away in script.

    All is not lost, however, since this collection of 'workarounds' provides a rich source of real-life patterns that appear for authors and programmers, time and again. They may be workarounds, but they are much-needed ones. The aim of the new generation of languages like XForms and XHTML 2 is to take these 'common patterns' and turn them into mark-up. Just like the HTML elements <a> and <form> pack an enormous amount of functionality into deceptively simple tags, so too can new declarative mark-up capture patterns that have emerged 'in the wild'. (Note that this is the opposite of so-called folksonomies, where popular practice that occurs in the wild is left in the wild, and codification is regarded as a dirty word.)

    The XML HTTP Request Object

    Let's take the much talked about XML HTTP Request Object (XMLHttpRequest). If you are not familiar with it, it was originally part of Microsoft's XML parser, and allows you to send and receive data outside of the normal HTML form processing. Since it's a handy feature to have in a client, other browsers have followed suit and it's now becoming the 'standard' way to communicate with servers without messing up your page. It's a corner-stone of Ajax. (A good summary with examples is on Jim Ley's jibbering.com site [3].)

    But... we need to be clear that we're using XMLHttpRequest to get round a weakness in HTML forms. The problem we have is that even if you know that a server is about to give you some data, and the server knows it's about to give you some data, there's no way to tell your form that -- instead your page will be wiped out and replaced with whatever the server sends back. Of course, constant round-tripping doesn't make it completely impossible to produce applications, and a lot of books and airline tickets are bought every day without the facility to get 'just the data'. But we all know it would reduce network traffic and create a smoother user experience if we could just send a list of books or seats, rather than a whole new page.

    Over the years applications such as Microsoft's Outlook Web Access (OWA) have had to step around the HTML form to get just the data they need. But, whilst OWA considerably predates GMail, until the advent of XMLHttpRequest the techniques used were quite difficult to manage. (Google Suggest is often cited as a good example of an Ajax app, but interestingly merges old and new techniques; XMLHttpRequest is used to obtain a piece of JavaScript from a server, and this script contains a call to a client-side function, but using server-provided parameters. It's one of the techniques you might have used in the past with a hidden frame.) So as many have said on their blogs, XMLHttpRequest is not a newly devised technique, but rather a generally accepted replacement for a very old technique. But ultimately that technique is a workaround, since the real problem is that HTML forms will always replace the current page.

    Beyond HTML Forms

    Whilst XMLHttpRequest gives us a way to get data to and from the server without losing our document, we've unfortunately thrown the baby out with the bath-water; whatever the weaknesses of HTML forms, you have to acknowledge that they are pretty simple to use. Here's an abbreviated version of Google's search form (note that the mark-up is HTML, not XML):

    <form action=/search name=f>
      <input type=hidden name=hl value=en>
      <input maxLength=256 size=55 name=q value="">
      <input type=submit value="Google Search" name=btnG>
    </form>

    As you can see, the simple problem with HTML forms is that we don't say anything about where the data should go when we've received it from the server. The assumption in HTML of old is that we are just doing a kind of 'super-navigation', and no matter what we send to the server, it will only ever give us back a new web page. (To put it a different way, you could say that <a> and <form> are pretty much the same thing.) To see how this problem is resolved, let's code the same Google search in XForms:

    <xf:submission id="sub-search"
      action="http://www.google.com/complete/search?hl=en"
      method="get" separator="&" replace="all"/>

    <xf:input ref="q">
      <xf:label>Query:</xf:label>
    </xf:input>

    <xf:submit submission="sub-search">
      <xf:label>Google Search</xf:label>
    </xf:submit>

    Although it will do exactly the same -- right down to replacing the current page -- it's a little different to the HTML mark-up. But the changes in structure have given us some major benefits, from accessible labels on our form controls, to the possibility of many different submissions for the same data. But what it has also given us is the possibility of solving our data update problem. The replace attribute is actually optional in XForms, but I showed it in the previous mark-up so that you can compare it to this:

    <xf:submission id="sub-search"
      action="http://www.google.com/complete/search?hl=en"
      method="get" separator="&" replace="instance"/>

    In this example the data returned from the server will just replace the instance that was sent, and our page will remain completely intact. (The replace attribute can take the values all, instance, or none.) I won't show the full equivalent using XMLHttpRequest since it's pretty large, but I'll give a flavour of it. (Jim Ley's page -- referenced earlier -- shows how to search Google with XMLHttpRequest.)

    The Script Version

    First we need to create an XMLHttpRequest object, but we need to do it in such a way that it will work on both Mozilla and IE:

    var req;

    function loadXMLDoc(url) {
      // native XMLHttpRequest object
      if (window.XMLHttpRequest) {
        req = new XMLHttpRequest();
        req.onreadystatechange = readyStateChange;
        req.open("GET", url, true);
        req.send(null);
      // IE/Windows ActiveX version
      } else if (window.ActiveXObject) {
        req = new ActiveXObject("Microsoft.XMLHTTP");
        if (req) {
          req.onreadystatechange = readyStateChange;
          req.open("GET", url, true);
          req.send();
        }
      }
    }

    When a document is loaded via this function, the readyStateChange() method is invoked:

    function readyStateChange() {
      // '4' means document "loaded"
      if (req.readyState == 4) {
        // 200 means "OK"
        if (req.status == 200) {
          // do something here
        } else {
          // error processing here
        }
      }
    }

    From a programming point of view, I guess you could say that there isn't a lot wrong with this, but then from a programming point of view there wasn't a lot wrong with Z80 or 6502 assembly languages -- I just wouldn't want to go back to them! But the most important issue is that we have lost the very thing that was responsible for HTML's success -- the use of simple, clear, declarative mark-up, in which we simply state our intent, without having to write a program to do it for us. After all, the web took off because authors only had to master <a> in order to enter the exciting new world of 'hypertext' -- but XMLHttpRequest raises the bar again, and takes us right back into the heart of geek-world.

    Beyond XMLHttpRequest

    But in keeping with the principle that I outlined above -- that XForms and XHTML 2 try to provide mark-up for commonly existing design patterns -- let's see if there are any other patterns that XMLHttpRequest has thrown up. You will have noticed in the earlier script that we had tests for success and failure:

    if (req.status == 200) {
      // do something here
    } else {
      // error processing here
    }

    XForms provides the same functionality through the use of events -- on success do this, on failure do that. This is far more powerful, since it hides the protocol-specific aspects of this code ("200" may be 'success' for HTTP, but it isn't 'success' when saving data to the hard-drive or sending an email). XForms uses declarative mark-up to express those events, which again dramatically reduces coding:

    <xf:action ev:observer="sub-search" ev:event="xforms-submit-error">
      <xf:message level="modal">Submission failed</xf:message>
    </xf:action>

    But there's lots, lots more in the submission part of XForms: it can provide full XML Schema validation before submitting the data; there is built-in support for numerous types of serialisation, such as multipart/related; abstract methods are used so the code is independent of protocol. For example, since put means the same thing whether the target URL begins http: or file:, a form with relative paths will run unchanged on a local machine or a web server; it's extensible -- in formsPlayer 2.0 we have used the submission element to read and write from an ADO database, allowing programmers to convert forms from using the web to using a local database by doing nothing more than changing a single target URL. (Try doing that with XMLHttpRequest!) The submission part of XForms is in fact so powerful that it will eventually form a separate specification, for use in other languages.

    From Patterns to Mark-up

    And there are plenty more patterns out there that were crying out to be turned into mark-up, and which are now incorporated into XForms and XHTML 2. Do you remember the days when, if we wanted a tooltip that contained mark-up -- perhaps an image, or bold text -- we had to use a carefully placed <div>, a CSS display: none;, a mouseover event handler and a timer? Nowadays the programmer with better things to do than work with spaghetti JavaScript just uses the XForms <hint> element, and for free they get platform independence (and therefore accessibility), as well as the ability to insert any mark-up. And what about the days when we had to write code to open up a text-to-speech engine, and then invoke the various methods on the object to get it to speak its mind? Nowadays who wouldn't just use a CSS property on their XForms' messages?

    Bad Slacks

    And do you remember... I'm sorry, this one always makes me laugh... do you remember how we used to write lots of JavaScript to recalculate the shopping cart when a new item was added? I know it's hard to believe -- it's like looking at old photos of us all wearing flares. Anyway, thank God for straight trousers and the XForms dependency engine. But enough of the good old days, the days of assembly language, C and JavaScript... let's stick with the new.

    Do Try This at Home

    To round all of this off, we'll take a look at Google Suggest, and we'll use XForms to implement it. I'll walk through the demo in a separate blog [4] so that this one doesn't get too cluttered -- and hopefully by dissecting this simple but useful application, we can show how declarative mark-up scores over scripting.

    [1] Will AJAX help Google clean up?, c|net, http://news.com.com/Will+AJAX+help+Google+clean+up/2100-1032_3-5621010.html
    [2] Ajax: A New Approach to Web Applications, Jesse James Garrett, Adaptive Path blog, http://www.adaptivepath.com/publications/essays/archives/000385.php
    [3] Using the XML HTTP Request object, http://jibbering.com/2002/4/httprequest.html
    [4] "Google Suggest" Using XForms, http://internet-apps.blogspot.com/2005/04/google-suggest-using-xforms.html
    [via Internet Applications]
    ]]>
    Ajax, Hard Facts, Brass Tacks ... and Bad Slackshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/825Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Uche Ogbuji comments in his blog about the use of WebDAV and SQLX in my blog as part of his commentary about Pyblosxom & WebDAV. To provide some clarity about Virtuoso and Blogging I have decided to put out this quick step-by-step guide to the workings of my blog (there is a long overdue technical white paper nearing completion that addresses this subject in more detail).

    Here goes:

    Blog Editing

    I can use any editor that supports the following Blog Post APIs:

    - Moveable Type

    - Meta Weblog

    - Blogger

    Typically I use Virtuoso (which has an unreleased WYSIWYG blog post editor), Newzcrawler, ecto, Zempt, or w.bloggar for my posts. If a post is of interest to me, or relevant to our company or customers I tend to perform one of the following tasks:

    - Generate a post using the "Blog This" feature of my blog editor

    - Write a new post that was triggered by a previously read post etc.

    Either way, the posts end up in our company wide blog server, which is Virtuoso based (more about this below). The internal blog server automatically categorizes my blog posts, and automagically determines which posts to upstream to other public blogs that I author (e.g. http://kidehen.typepad.com) or co-author (e.g. http://www.openlinksw.com/weblogs/uda and http://www.openlinksw.com/weblogs/virtuoso). I write once and my posts are dispatched conditionally to multiple outlets.
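
    For a sense of how simple the "write once" step is, posting through the MetaWeblog API is a single XML-RPC call; the endpoint URL, blog id, and credentials below are placeholders:

    # Minimal sketch of "write once" via the MetaWeblog API (XML-RPC). The endpoint
    # URL, blog id, and credentials are placeholders.
    import xmlrpc.client

    server = xmlrpc.client.ServerProxy("http://blog.example.com/RPC2")

    post = {
        "title": "Web 2.0 and the Data Web",
        "description": "<p>Post body goes here...</p>",
        "categories": ["Data Web", "Web 2.0"],
    }

    # metaWeblog.newPost(blogid, username, password, struct, publish)
    post_id = server.metaWeblog.newPost("127", "kidehen", "secret", post, True)
    print("created post", post_id)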

    RSS/Atom/RDF Aggregation & Reading

    I discover, subscribe to, and view blog feeds using Newzcrawler (primarily), and from time to time, for experimentation and evaluation purposes, I use RSS Bandit, FeedDemon, and Bloglines. I am in the process of moving this activity over to Virtuoso completely due to the large number of feeds that I consume on a daily basis (scalability is a bit of a problem with current aggregators).

    Blog Publishing

    When you visit my blog you are experiencing the soon-to-be-released Virtuoso Blog Publishing engine first hand, which is how WebDAV, SQLX, XQuery/XPath, and Free Text etc. come into the mix.

    Each time I create a post internally, or subscribe to an external feed, the data ends up in Virtuoso's SQL Engine (this is how we handle some of the obvious scalability challenges associated with large subscription counts). This engine is SQL2000N based, which implies that it can transform SQL to XML on the fly using recent extensions to SQL in the form of SQLX (prior to the emergence of this standard we used the FOR XML SQL syntax extensions for the same result). It also has its own in-built XSLT processor (DB Engine resident), and validating XML parser (with support for XML Schema).  Thus, my RSS/RDF/Atom archives, FOAF, BlogRoll, OPML, and OCS blog syndication gems are all live examples of SQLX documents that leverage Virtuoso's WebDAV engine for exposure to Blog Clients.
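
    As a rough illustration of the SQLX idea (the DSN, table, and column names are made up, and Virtuoso's exact SQL/XML dialect may differ), a relational query can emit feed-shaped XML directly:

    # Hedged sketch: SQL/XML (SQLX) style query over an ODBC connection. The DSN,
    # table, and column names are illustrative; Virtuoso also supports FOR XML.
    import pyodbc

    conn = pyodbc.connect("DSN=LocalVirtuoso;UID=dba;PWD=dba")
    cur = conn.cursor()

    cur.execute("""
        SELECT XMLELEMENT(NAME "item",
                          XMLFOREST(title AS "title",
                                    permalink AS "link",
                                    created AS "pubDate"))
        FROM blog_posts
        WHERE blog_id = ?
    """, "127")

    for (item_xml,) in cur.fetchall():
        print(item_xml)      # one ready-made RSS <item> per row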

    Blog Search

    When you search for blog posts using the basic or advanced search features of my blog, you end up interacting with one of the following methods of querying data hosted in Virtuoso: Free Text Search, XPath, or XQuery. The result sets produced by the search feature use SQLX to produce subscription gems (RSS/Atom/RDF). My blog home page exists as a result of Virtuoso's Virtual Domain / Multi-Homing Web Server functionality. The entire site resides in an Object Relational DBMS, and I can take my DB file across Windows, Solaris, Linux, Mac OS X, FreeBSD, AIX, HP-UX, IRIX, and SCO UnixWare without missing a single beat! All I have to do is instantiate my Virtuoso server and my weblog is live.
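
    And a sketch of the free-text side of that search, again over ODBC; the CONTAINS predicate syntax and the schema names are assumptions for illustration only:

    # Hedged sketch: free-text search over the same store. CONTAINS syntax and
    # schema names are assumptions; the point is that search is just a query.
    import pyodbc

    conn = pyodbc.connect("DSN=LocalVirtuoso;UID=dba;PWD=dba")
    cur = conn.cursor()

    cur.execute("""
        SELECT title, permalink
        FROM blog_posts
        WHERE CONTAINS(body, ?)
    """, "'semantic' AND 'web'")

    for title, link in cur.fetchall():
        print(title, "->", link)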

    ]]>
    WebDAV, SQLX, and my Webloghttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/810Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    If a picture speaks a thousand words, I sometimes wonder how many words we should attribute to a multimedia clip? Especially one that is now openly accessible to many who don't quite grasp the high "Back To The Future" quotient of most of what we see today.

    The Internet Archive initiative is building up an amazing collection of content that includes this "must watch" movie about the somewhat forgotten hypercard development environment.

    As I watched the hypercard movie I obtained clear reassurance that my vision of Web 2.0 as critical infrastructure for a future Semantic Web isn't unfounded. The solution building methodology espoused by hypercard is exactly how Semantic Web applications will be built, and this will be done by orchestrating the componentry of Web 2.0.

    When watching this clip make the following mental adjustments:

    1. Swap hypercard stacks for discrete and/or composite services that have published endpoints exposed by Web 2.0 points of presence

    2. Think of information taking the form of XML based content e.g. RSS, Atom, RDF, FOAF, XFN, and other future XML based data contextualization formats; all accessible via URIs

    3. When the Apple Mac operating system is mentioned (or inferred), think of the Internet (you don't need Windows, Mac OS, Linux, UNIX etc. to realize the vision; the network provided by the Internet is the Operating System)

    4. When the Apple computer is mentioned simply think about a plethora of function specific devices (computers, mobile phones, PDAs etc.) that overtly or covertly provide conduits to the new operating environment (the Internet)

    5. As you hear the phrase "whole new body of people that are non-programmers contributing their ideas", think about yourself and the increasing ease of participation that's beginning to take shape in this emerging frontier!

    6. As for "Whole Earth Catalog", think Wikipedia or more recent efforts such as Answers.com.

    Web 2.0 is a reflection of the web taking its first major step out of the technology stone age (certainly the case relative to the hypercard movie and "pre web" application development in general).

     

    ]]>
    Back To The Future: Hypermediahttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/766Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    Udell to event promoters on leveraging folksonomy: 'Pick a tag' I'm now trying to figure out why InfoWorld's Jon Udell is a journalist and not a millionaire technologist (or maybe he is). Udell keeps coming up with one brilliant idea after another. The first of these -- which I thought was just plain obvious -- was Udell's idea for vendors ...

    [via Berlind's Midnight Oil]
     
    I prefer to describe Jon Udell as a Technologist Type 3 (according to Tom Bradford's Technology Types nomenclature) who is also a journalist. His insights, thought stimulation/leadership, and power of articulation defy monetization.
    I do know Jon (albeit primarily via emails and phone interviews), he even put me forward for an innovators award in 2003 re. Virtuoso etc.
    Full disclosure aside,  you only need to trace back in time to see that he has been a Type 3 Technologist for a very long time. When I read one of Jon's articles I always sense that they are the end product of the following steps:
     
    1. Hypothesis Development
    2. Hands-on Experimentation
    3. Experiment Observation
    4. Conclusion Attainment
    5. Report / Article Generation
    6. Share findings with interested parties
     
    On the subject of "sharing his findings", the blogosphere has become a very effective dispatch outlet. He starts conversations about Google Maps, or Querying Web Data via XQuery/XPath for instance, that stimulate further discussion (in the form of related blog posts of varying relationship density, which you might discern from these posts by Tom and myself, for instance).
     
    Blog conversation replaces the need for a "Jon here is our take on this..." or "Jon here is our implementation of what you demonstrated" phone call or email (you know he sees the discussion threads coalescing around his original post / experimentation conversation; most of the time setting up the next batch of experiments).
     
    To conclude, Jon is more than likely a tech Thrillionaire  :-) 
     
    ]]>
    Udell to event promoters on leveraging folksonomy: 'Pick a tag'http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/728Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    Today is one of those days where one topic appears to be on the mind of many across cyberspace. You guessed right! It's that Web 2.0 thing again.

    Paul Bausch brings Yahoo!'s most recent Web 2.0 contribution to our broader attention in this excerpt from his O'Reilly Network article:

    I browse news, check stock prices, and get movie times with Yahoo! Even though I interact with Yahoo! technology on a regular basis, I've never thought of Yahoo! as a technology company. Now that Yahoo! has released a Web Services interface, my perception of them is changing. Suddenly having programmatic access to a good portion of their data has me seeing Yahoo! through the eyes of a developer rather than a user.

    The great thing about this move by Yahoo! is two fold (IMHO):

    1. It certainly makes Yahoo! a little more interesting of late. And it will certainly help to distinguish Yahoo! from Google. Of course these companies overlap somewhat, but they are also pretty different in focus. I see Yahoo! increasingly as a portal platform play providing content access via syndication, publishing, and web services.

    2. It will impact their bottom line pretty rapidly, and I hope they realize the impact of Web 2.0 when trying to explain the growth increments whenever they next report to their investors :-) In a previous post I expressed my sense of some confusion on the part of Jeff Bezos regarding the total contribution of AWS to Amazon's growth (BTW - my articles to date re. Amazon and Web 2.0 are available from here in a variety of XML syndication formats: Atom, RSS 2.0, RDF).

    The great thing about the Platform oriented Web 2.0 is the ability to syndicate your value proposition (aka products and services) instead of pursuing fallible email campaigns. It enables the auto-discovery of products and services by user agents (the content aspect). Web 2.0 also provides an infrastructure for user agents to enter into consumptive interactions with discrete or composite Web Services via published endpoints exposed by a platform (the execution aspect).

    A scenario example:

    You can obtain RSS feeds (electronic product catalogs) from Amazon today, although you have to explicitly locate these catalog-feeds since Amazon doesn't exploit feed auto-discovery within their domain.

    If you use Firefox or another auto-discovery supporting RSS/Atom/RDF user agent, visit this URL; Firefox users should simply click on the little orange icon at the bottom right of the browser's window to see its RSS feed auto-discovery in action.
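
    Auto-discovery itself is trivial to implement; here is a minimal Python sketch of what the browser is doing when that orange icon lights up -- it simply looks for the alternate-link convention in the page head:

    # Minimal sketch of RSS/Atom feed auto-discovery: fetch a page and collect the
    # <link rel="alternate" type="application/rss+xml|atom+xml|rdf+xml"> hrefs.
    from html.parser import HTMLParser
    from urllib.request import urlopen

    FEED_TYPES = {"application/rss+xml", "application/atom+xml", "application/rdf+xml"}

    class FeedLinkFinder(HTMLParser):
        def __init__(self):
            super().__init__()
            self.feeds = []

        def handle_starttag(self, tag, attrs):
            a = dict(attrs)
            if tag == "link" and a.get("rel") == "alternate" and a.get("type") in FEED_TYPES:
                self.feeds.append(a.get("href"))

    def discover_feeds(url):
        finder = FeedLinkFinder()
        with urlopen(url) as resp:
            finder.feed(resp.read().decode("utf-8", errors="replace"))
        return finder.feeds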

    Anyway, once you have the feeds the next step is execution endpoints discovery within the Amazon domain (the conduits to Amazon's order processing system in this example). At the current time there isn't broad standardization of Web Services auto-discovery but it's certainly coming; WSIL is a potential front runner for small scale discovery while UDDI provides a heavier duty equivalent for larger scale tasks that includes discovery and other related functionality realms.

    Back to the example trail: by having the RSS/Atom/RDF feed data within the confines of a user agent (an Internet Application to be precise), nothing stops the extraction of key purchasing data from these feeds, plus your consumer data, en route to assembling an execution message (as prescribed by the schema of the service in question) for Amazon's order processing / shopping cart service. All of this happens without ever seeing/eye-balling the Amazon site (a prerequisite of Web 1.0, hence the dated term: Web Site).

    To summarize: Web 2.0 enables you to syndicate your value proposition and then have it consumed via Web Services, leveraging computer, as opposed to human interaction cycles. This is how I believe Web 2.0 will ultimately impact the growth rates (in most cases exponentially) of those companies that comprehend its potential. 

    ]]>
    Yahoo! Web Serviceshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/718Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    Payroll hole exposes dozens of companies Flaw in PayMaxx Web site exposed the financial information of customers' workers, the payroll-services firm acknowledges.

    [via CNET News.com]
     
    Unfortunately we have more of this to come! The combination of backend Database Engine and Application Layer Data Access technology choices plays a major role in these kinds of security vulnerabilities. Databases used to be confined to access from dumb terminals and PCs within the enterprise. Today, these same databases are exposed to the Internet in a myriad of ways, and a physical firewall and password protection alone won't cut it, not in an increasingly social oriented cyberspace. Social Engineering is a major aspect of hacking!
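
    One small illustration of the data access dimension of this: application code that splices user input into SQL text hands attackers a way in, whereas parameterized statements (supported by any ODBC/JDBC/ADO.NET driver) keep data and SQL separate. A hedged Python/ODBC sketch, with made-up DSN and schema names:

    # Sketch of the data-access point: never splice user input into SQL. DSN and
    # schema names are illustrative.
    import pyodbc

    conn = pyodbc.connect("DSN=Payroll;UID=app;PWD=secret")
    cur = conn.cursor()

    def lookup_w2(user_supplied_id):
        # Vulnerable pattern (do NOT do this): the input becomes part of the SQL text.
        #   cur.execute("SELECT * FROM w2_forms WHERE emp_id = " + user_supplied_id)
        # Parameterized pattern: the driver keeps data and SQL separate.
        cur.execute("SELECT emp_id, year, gross FROM w2_forms WHERE emp_id = ?",
                    user_supplied_id)
        return cur.fetchall()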
     
    Hosted applications are currently the rage; there are many benefits, but there are also some serious security vulnerabilities that will "dope slap" those organizations that carelessly head down this route. You have to take a look at the underlying architecture driving the systems in question.
     
    Anyway, you can track past and future commentary relating to databases, data access, and security using this dynamic blog query. Naturally, I expect content exposed from the query URI to grow, and to ultimately integrate content from other sources around the blogosphere.
    ]]>
    Payroll hole exposes dozens of companieshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/715Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    Have RSS feeds killed the email star? silicon.com Feb 28 2005 12:58PM GMT

    [via Moreover - XML and metadata news]
     
    RSS and other XML based syndication formats (RDF, Atom, etc.) allow organizations to syndicate their value propositions via feeds. Thus, instead of depending solely on sending out HTML based advertorial emails (which end up in Spam Folders 75% of the time anyhow) to targets such as suspects, leads, and customers, you can rely on the Web 2.0 fabric for auto-discovery of syndicated feeds covering marketing collateral such as features & benefits data, product documentation (ODBC/JDBC Multi-Tier, ODBC/JDBC Single-Tier, and Virtuoso), product functionality tutorials, and screencasts (UDA, Virtuoso, and ODBC Benchmark & Troubleshooting Utilities), etc.
    ]]>
    Have RSS feeds killed the email star?http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/704Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Cognitive dissonance is how Dare Obasanjo aptly describes the emergence of some of the Smart Tags concepts previously introduced by Microsoft and now emulated by the new google toolbar's autolink feature (Greg Linden explains the problem with clarity).

    Anyway, back to cognitive dissonance. Could this be the reason for the following?

    1. Open Source products are increasingly database specific even though they could be database independent via Open Source ODBC SDK efforts such as iODBC and unixODBC (see the sketch after this list). We are increasingly narrowing our choices down to database specific "Closed Source" or database specific "Open Source" solutions and somehow deem this to be progress
    2. The prevalent use of free standards compliant data access drivers (ODBC, JDBC, and ADO.NET) or their native counterparts that remain vulnerable to simple password hacks (there are databases behind those dynamic web sites!!) as none of these have any notion of "rules based" authentication and data access policy
    3. The time-tested fallacy that "select * from table" defines a viable RDBMS engine, since Transaction Atomicity, Consistency, Isolation, and Durability (ACID) mean zip! Ditto scrollable cursors, stored procedures, and other presumably useless aspects of any marginally decent RDBMS engine
    4. Failing to comprehend that a Weblog is your property (if you have a personal blog) not the property of the vendor hosting your service (that important issue of separating data ownership and data storage again). You may have heard about, or experienced, total loss of weblog and/or weblog archives arising from weblog engine or blog service provider changeovers
    5. Failing to see the synergy between personal/group/corporate information stores (aka infobase) such as Wikis, Weblogs, and the burgeoning semantic web. Jon Udell, for instance, is trying to get the point across via his tireless collection of XQuery/XPath based queries aimed at the blogosphere section of the burgeoning semantic web. Here are some of mine (scoped to this weblog):
      • Security related posts to date (XPath query)
      • Infobase related posts to date (Free Text search)

    And more...
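    A minimal sketch of the database independence that point 1 alludes to, assuming an ODBC driver manager (iODBC, unixODBC, or the Windows one) with a configured DSN; the DSN name, credentials, table, and query are placeholders. Swapping the backend database means changing the DSN definition, not the application code.

```python
# Hypothetical DSN "AnyBackend"; pyodbc sits on top of whichever ODBC driver
# manager (iODBC, unixODBC, ...) and driver the DSN points at.
import pyodbc

conn = pyodbc.connect("DSN=AnyBackend;UID=demo;PWD=demo")
cursor = conn.cursor()
cursor.execute("SELECT COUNT(*) FROM customers")  # illustrative table name
print(cursor.fetchone()[0])
conn.close()
```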

    ]]>
    Cognitive Dissonancehttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/695Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    T-Mobile responds to Paris Hilton Sidekick hacking

    [via Venture Chronicles by Jeff Nolan]

    This incident is an interesting one to follow as there is a little more to it than the purported T-Mobile stance: "...Paris may have given out her password...".

    I have written about database and data access security matters on numerous occasions, and my underlying message has always been that there are many dimensions to security vulnerability that aren't catered for when the distinct functional domains of data access and data storage intersect (I am almost certain that the infrastructure at the bottom of this controversy comprises at least one or more of the following: data access drivers (free and closed- or open source), a relational database engine (closed- or open source), and a web application scripting language (closed- or open source)).

    Here is a hypothetical situation relating to this matter. Let's assume that Paris did inadvertently give away her password; would it be too much for her to assume that T-Mobile's data access infrastructure should be capable of controlling access to her data using any combination of her password and the following:

    1. Data Access Device
    2. Data Access Device host operating system
    3. Network IP or Mac Address
    4. Data Access Application

    If a very simple combination of the elements above formed part of the T-Mobile authentication and data access security matrix, we would be looking at a much clearer picture of the vulnerability scenarios for this hack, which would be confined to the following:

    1. She inadvertently gives out her password and also hands over her sidekick device to the hacker
    2. She inadvertently gives out her password and then the hacker successfully logs on to her sidekick (it does have a web browser and email implying a tcp/ip stack etc..). But I would expect Paris to be within her rights to assume some basic firewalling would be in place by default

    T-Mobile should have a data access security infrastructure with a rule that restricts Sidekick accounts (by default) from direct remote access to address book data, for instance. Account owners should be allowed to enable this feature after receiving clear notification about the security implications.
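    A minimal sketch of what such a default rule might look like in code, purely as an illustration -- every field, rule, and identifier here is hypothetical, not a description of T-Mobile's actual infrastructure.

```python
from dataclasses import dataclass

@dataclass
class AccessRequest:
    password_ok: bool
    device_id: str      # e.g. the account owner's registered Sidekick
    source_ip: str
    application: str    # which client application is asking

# Hypothetical defaults: address-book data is only reachable from the owner's
# registered device via the carrier's own client application.
REGISTERED_DEVICES = {"sidekick-abc123"}
ALLOWED_APPS = {"sidekick-native-client"}

def may_read_address_book(req: AccessRequest) -> bool:
    if not req.password_ok:
        return False
    if req.device_id not in REGISTERED_DEVICES:
        return False  # a stolen password alone is not enough from a remote host
    if req.application not in ALLOWED_APPS:
        return False
    return True

# A password-only attempt from an unregistered web client is denied.
print(may_read_address_book(AccessRequest(True, "random-pc", "203.0.113.7", "web-browser")))
```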

    ]]>
    Database & Data Access Vulnerability: T-Mobile responds to Paris Hilton Sidekick hackinghttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/689Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    Email As A Platform It looks like more people are starting to realize that email is more than it seems. Especially given the drastic increase in storage size of web-based email applications, more people are realizing that email is basically a personal database. People simply store information in their email, from contact information that was emailed to them to schedule information to purchase tracking from emailed receipts. Lots of people email messages to themselves, realizing that email is basically the best "permanent" filing system they have. That's part of the reason why good email search is so important. Of course, what the article doesn't discuss is the next stage of this evolution. If you have a database of important information, the next step is to build useful applications on top of it. In other words, people are starting to realize that email, itself, is a platform for personal information management.

    [via Techdirt]
     
    Yep! And this is where the Unified Storage vision comes into play. Many years ago the same issues emerged in the business application realm, and at the time the issue at hand was separating the DBMS engine from the application logic. This is what the SQL Access Group (SAG) addressed via the CLI that laid the foundation for ODBC, JDBC, and more recent derivatives: OLE DB and ADO.NET.
     
    Most of us live inside our email applications, and the need to integrate the content of emails, address books, notes, and calendars with other data sources (Web Portal, Blogs, Wikis, CRM, ERP, and more) as part of our application interaction cycles and domain specific workflow is finally becoming obvious. There is a need for separation of the application/service layer from the storage engine across each one of these functionality realms. XML, RDF, and Triple Stores (RDF / Semantic Data Stores) collectively provide a standards based framework for achieving this goal. On the other hand, so does WinFS, albeit totally proprietary (by this I mean not standards compliant) at the current time.
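    A small sketch of what that separation can look like in practice, assuming the rdflib library and an illustrative (not mandated) vocabulary: the same contact is held once as RDF triples, and mail, address book, and weblog front-ends all read from the one store.

```python
# Vocabulary and resource URIs below are placeholders for whatever schema the
# store actually uses.
from rdflib import Graph, Literal, Namespace, URIRef

EX = Namespace("http://example.com/schema/")
g = Graph()

person = URIRef("http://example.com/people/jane")
g.add((person, EX.name, Literal("Jane Doe")))
g.add((person, EX.mbox, Literal("jane@example.com")))
g.add((person, EX.weblog, URIRef("http://example.com/blogs/jane")))

# Any front-end (mail client, address book, blog engine) queries the same graph.
print(g.serialize(format="turtle"))
```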
     
    As you can already see there are numerous applications (conventional or hosted) that address email, address books, bookmarking, notes, calendars, blogs, wikis, crm etc. specifically, but next to none that address the obvious need for transparent integration across each functionality realm - the ultimate goal.
     
    Yes, you know what I am about to say! OpenLink Virtuoso is the platform for developing and/or implementing these next generation solutions. We have also decided to go one step further by developing a number of applications that demonstrate the vision (and ultimate reality); and each of these applications (and the inherent integration tapestry) will be the subject of a future Virtuoso Application specific post.
    ]]>
    Email As A Platformhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/680Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Bloglines

    Since last fall, I've been recommending Bloglines to first-timers as the fastest and easiest introduction to the subscription side of the blogosphere. Remarkably, this same application also meets the needs of some of the most advanced users. I've now added myself to that list. Hats off to Mark Fletcher for putting all the pieces together in such a masterful way.

    What goes around comes around. Five years ago, centralized feed aggregators -- my.netscape.com and my.userland.com -- were the only game in town. Fat-client feedreaders only arrived on the scene later. Because of the well-known rich-versus-reach tradeoffs, I never really settled in with one of those. Most of the time I've used the Radio UserLand reader. It is browser-based, and it normally points to localhost, but I've been parking Radio UserLand on a secure server so that I can read the feeds it aggregates for me from anywhere.

    Bloglines takes that idea and runs with it. Like the Radio UserLand reader, it supports the all-important (to me) consolidated view of new items. But its two-pane interface also shows me the list of feeds, highlighting those with new entries, so you can switch between a linear scan of all new items and random access to particular feeds. Once you've read an item it vanishes, but you can recall already-read items like so:

    Display items within the last

    If a month's worth of some blog's entries produces too much stuff to easily scan, you can switch that blog to a titles-only view. The titles expand to reveal all the content transmitted in the feed for that item.

    I haven't gotten around to organizing my feeds into folders, the way other users of Bloglines do, but I've poked around enough to see that Bloglines, like Zope, handles foldering about as well as you can in a Web UI -- which is to say, well enough. With an intelligent local cache it could be really good; more on that later.

    Bloglines does two kinds of data mining that are especially noteworthy. First, it counts and reports the number of Bloglines users subscribed to each blog. In the case of Jonathan Schwartz's weblog, for example, there are (as of this moment) 253 subscribers.

    Second, Bloglines is currently managing references to items more effectively than the competition. I was curious, for example, to gauge the reaction to the latest salvo in Schwartz's ongoing campaign to turn up the heat on Red Hat. Bloglines reports 10 References. In this case, the comparable query on Feedster yields a comparable result, but on the whole I'm finding Bloglines' assembly of conversations to be more reliable than Feedster's (which, however, is still marked as 'beta'). Meanwhile Technorati, though it casts a much wider net than either, is currently struggling with conversation assembly.

    I love how Bloglines weaves everything together to create a dense web of information. For example, the list of subscribers to the Schwartz blog includes: judell - subscribed since July 23, 2004. Click that link and you'll see my Bloglines subscriptions. Which you can export and then -- if you'd like to see the world through my filter -- turn around and import.

    Moving my 265 subscriptions into Bloglines wasn't a complete no-brainer. I imported my Radio UserLand-generated OPML file without any trouble, but catching up on unread items -- that is, marking all of each feed's sometimes lengthy history of items as having been read -- was painful. In theory you can do that by clicking once on the top-level folder containing all the feeds, which generates the consolidated view of unread items. In practice, that kept timing out. I finally had to touch a number of the larger feeds, one after another, in order to get everything caught up. A Catch Up All Feeds feature would solve this problem.

    Another feature I'd love to see is Move To Next Unread Item -- wired to a link in the HTML UI, or to a keystroke, or ideally both.

    Finally, I'd love it if Bloglines cached everything in a local database, not only for offline reading but also to make the UI more responsive and to accelerate queries that reach back into the archive.

    Like Gmail, Bloglines is the kind of Web application that surprises you with what it can do, and makes you crave more. Some argue that to satisfy that craving, you'll need to abandon the browser and switch to RIA (rich Internet application) technology -- Flash, Java, Avalon (someday), whatever. Others are concluding that perhaps the 80/20 solution that the browser is today can become a 90/10 or 95/5 solution tomorrow with some incremental changes.

    Dare Obasanjo wondered, over the weekend, "What is Google building?" He wrote:

    In the past couple of months Google has hired four people who used to work on Internet Explorer in various capacities [especially its XML support] who then moved to BEA; David Bau, Rod Chavez, Gary Burd and most recently Adam Bosworth. A number of my coworkers used to work with these guys since our team, the Microsoft XML team, was once part of the Internet Explorer team. It's been interesting chatting in the hallways with folks contemplating what Google would want to build that requires folks with a background in building XML data access technologies both on the client side, Internet Explorer and on the server, BEA's WebLogic. [Dare Obasanjo]
    It seems pretty clear to me. Web applications such as Gmail and Bloglines are already hard to beat. With a touch of alchemy they just might become unstoppable.

    [via Jon's Radio]
    ]]>
    Bloglineshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/600Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Data Structures and RDF

    Time to chime in on the RDF debate. There are four general ways of storing information: a list, in which one has a number of items, which may or may not be related to one another; a table, in which one has a number of items (records), each with a distinct set of properties or columns; a tree, in which one has a hierarchy of items; and a graph, in which one has a number of items (nodes), with the nodes connected to each other in some way. There are others, but they are more or less just variations of the same.

    There are examples all over of each type. Arrays are examples of lists, and of course they are used all over the place. Relational databases typically store all of their data in tables; so do spreadsheets. Trees are used for mail or news messages and your bookmarks. XML is a syntax for specifying trees of information. The Windows and Classic Macintosh file systems are presented and/or stored as a tree. The Unix file system, however, isn't a tree -- it's a graph. RDF is a graph. The Web is also a graph -- it's a bunch of pages connected via links.

    Each of the four storage methods -- lists, tables, trees, and graphs -- increases in complexity as you go up. Lists are simple to store; graphs are the most difficult. Actually, that doesn't need to be the case, but very few programming languages come with any kind of Graph structure ready to use. Due to the complexity, you should probably store data in the lowest type possible, depending on the kind of data you have. You can always use one of the structures higher than what is necessary: a list could be stored in a table with only one column, a table can be stored in a tree (where a root node has a set of records, each with a set of properties), and a tree is really a specialized form of graph. However, the reverse is not true. You can't store a graph in a tree, you can't store a tree in a table, and you can't store a table in a list. Any place where you see someone trying to is a hack.

    Many people don't know this though, so they just store everything in a tabular database or in XML, regardless of what it is. This has two problems. First, you get data that can be stored in a simpler format stored in some more complex format: people passing lists of things around using XML, or configuration files stored in XML. Second, you get people trying to coerce more complex data into a simpler format, so you might see people trying to shove trees of data into a database, or serialized RDF written as XML.

    Many people think that XML is the ultimate format for storing data. It isn't. It can represent trees nicely, and it can do tables and lists if you really wanted it to, but it can't represent graphs -- not cleanly, anyway. Perhaps what is needed is an eXtensible Graph Language, which represents graphs of data. There is RDF-XML, and XGMML, but both use a language for describing trees. Actually, it shouldn't be called the eXtensible Graph Language, because then people will get confused thinking it's like XML. Because a tree can be represented as a graph, all data could be represented in the Graph Language (not that it should be, of course), unlike XML, which can't. Of course, this assumes there isn't some higher level structure above the graph.

    Long, long ago, people stored data in lists, because that was all that was available. Then someone came up with the idea of storing data in tables, so relational databases came along and people moved up the ladder to tables. A few years ago, XML came along, so data moved up again to trees.
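    As an aside to the quoted essay, a minimal sketch of the tree-versus-graph point: a handful of links kept as (subject, predicate, object) triples, where cycles and multiple parents are unremarkable, even though a strict tree could not hold them without duplication. Node names are purely illustrative.

```python
# A toy graph of "linksTo" relations, stored as triples.
triples = {
    ("page:A", "linksTo", "page:B"),
    ("page:B", "linksTo", "page:C"),
    ("page:C", "linksTo", "page:A"),  # cycle: impossible in a strict tree
    ("page:B", "linksTo", "page:A"),  # a node with two parents: also impossible
}

def outgoing(node):
    """All nodes the given node links to."""
    return [o for (s, p, o) in triples if s == node and p == "linksTo"]

print(sorted(outgoing("page:B")))  # ['page:A', 'page:C']
```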
    Can you guess what will happen next? The Semantic Web folks want us to move to using graphs. Should we move to graphs? It seems to be the next logical step in information evolution. What's holding us back? Well, it's probably too soon. The world is still in the tree phase. One day, graphs will start to become more popular -- it will just take time. In 30 years, someone might come up with something beyond graphs, and we'll all slowly switch to it as well.

    There's also the RSS in RDF debate. Many people don't see the value in storing RSS data in RDF. This is because the information stored in a single RSS file isn't a graph -- it's a tree, so plain-old XML actually makes more sense. Of course, the Semantic Web folks don't agree. Why? Because they aren't thinking in terms of a single RSS file -- they are thinking of building giant collections of RSS data, all linked together so that it forms one giant -- hey, it's not a tree -- it's a graph. Then, you can search and navigate it like you can with the existing Web. But of course, the Semantic Web lets the servers and the software you're using know more about what you're talking about. This is unlike current popular search engines like Google, which are pretty much just guessing. You can make it better, sure, but the best way to achieve accuracy is if someone tells it the answer to begin with.
    ]]>
    <a href="http://www.xulplanet.com/cgi-bin/ndeakin/homeN.cgi?ai=133">Data Structures and RDF</a>http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/59Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    A Blog post for the ages, from Jon Udell. I expect to refer back to this post a number of times in the future, as I have the same concerns across related realms; for instance, data access API usage and evolution.

    Enjoy!

    Questions about Longhorn, part 3: Avalon's enterprise mission

    The slide shown at the right comes from a presentation entitled Windows client roadmap, given last month to the International .NET Association (INETA). When I see slides like this, I always want to change the word "How" to "Why" -- so, in this case, the question would become "Why do I have to pick between Windows Forms and Avalon?" Similarly, MSDN's Channel 9 ran a video clip of Joe Beda, from the Avalon team, entitled How should developers prepare for Longhorn/Avalon? that, at least for me, begs the question "Why should developers prepare for Longhorn/Avalon?"

    I've been looking at decision trees like the one shown in this slide for more than a decade. It's always the same yellow-on-blue PowerPoint template, and always the same message: here's how to manage your investment in current Windows technologies while preparing to assimilate the new stuff. For platform junkies, the internal logic can be compelling. The INETA presentation shows, for example, how it'll be possible to use XAML to write WinForms apps that host combinations of WinForms and Avalon components, or to write Avalon apps that host either or both style of component. Cool! But...huh? Listen to how Joe Beda frames the "rich vs. reach" debate:

    Avalon will be supplanting WinForms, but WinForms is more reach than it is rich. It's the reach versus rich thing, and in some ways there's a spectrum. If you write an ASP.NET thing and deploy via the browser, that's really reach. If you write a WinForms app, you can go down to Win98, I believe. Avalon's going to be Longhorn only.

    So developers are invited to classify degrees of reach -- not only with respect to the Web, but even within Windows -- and to code accordingly. What's more, they're invited to consider WinForms, the post-MFC (Microsoft Foundation Classes) GUI framework in the .NET Framework, as "reachier" than Avalon. That's true by definition since Avalon's not here yet, but bizarre given that mainstream Windows developers can't yet regard .NET as a ubiquitous foundation, even though many would like to.

    Beda recommends that developers isolate business logic and data-intensive stuff from the visual stuff -- which is always smart, of course -- and goes on to sketch an incremental plan for retrofitting Avalon goodness into existing apps. He concludes:

    Avalon, and Longhorn in general, is Microsoft's stake in the ground, saying that we believe power on your desktop, locally sitting there doing cool stuff, is here to stay. We're investing on the desktop, we think it's a good place to be, and we hope we're going to start a wave of excitement leveraging all these new technologies that we're building.

    It's not every decade that the Windows presentation subsystem gets a complete overhaul. As a matter of fact, it's never happened before. Avalon will retire the hodge-podge of DLLs that began with 16-bit Windows, and were carried forward (with accretion) to XP and Server 2003. It will replace this whole edifice with a new one that aims to unify three formerly distinct modes: the document, the user interface, and audio-visual media. This is a great idea, and it's a big deal. If you're a developer writing a Windows application that needs to deliver maximum consumer appeal three or four years from now, this is a wave you won't want to miss. But if you're an enterprise that will have to buy or build such applications, deploy them, and manage them, you'll want to know things like:

    • How much fragmentation can my developers and users tolerate within the Windows platform, never mind across platforms?

    • Will I be able to remote the Avalon GUI using Terminal Services and Citrix?

    • Is there any way to invest in Avalon without stealing resources from the Web and mobile stuff that I still have to support?

    Then again, why even bother to ask these questions? It's not enough to believe that the return of rich-client technology will deliver compelling business benefits. (Which, by the way, I think it will.) You'd also have to be shown that Microsoft's brand of rich-client technology will trump all the platform-neutral variations. Perhaps such a case can be made, but the concept demos shown so far don't do so convincingly. The Amazon demo at the Longhorn PDC (Professional Developers Conference) was indeed cool, but you can see similar stuff happening in Laszlo, Flex, and other RIA (rich Internet application) environments today. Not, admittedly, with the same 3D effects. But if enterprises are going to head down a path that entails more Windows lock-in, Microsoft will have to combat the perception that the 3D stuff is gratuitous eye candy, and show order-of-magnitude improvements in users' ability to absorb and interact with information-rich services.

    [via Jon's Radio]
    ]]>
    Questions about Longhorn, part 3: Avalon's enterprise missionhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/559Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Here are some thoughts on where I think things are going in the mobile and content space.

    I wrote this essay before reading Free Culture so I'm saying a lot of stuff that Larry says better...

    Several crucial shifts in technology are emerging that will drastically affect the relationship between users and technology in the near future. Wireless Internet is becoming ubiquitous and economically viable. Internet capable devices are becoming smaller and more powerful.

    Alongside technological shifts, new social trends are emerging. Users are shifting their attention from packaged content to social information about location, presence and community. Tools for identity, trust, relationship management and navigating social networks are becoming more popular. Mobile communication tools are shifting away from a 1-1 model, allowing for increased many-to-many interactions; such a shift is even being used to permit new forms of democracy and citizen participation in global dialog.

    While new technological and social trends are occurring, it is not without resistance, often by the developers and distributors of technology and content. In order to empower the consumer as a community member and producer, communication carriers, hardware manufacturers and content providers must understand and build models that focus less on the content and more on the relationships.

    Smaller faster

    Computing started out as large mainframe computers, with software developers and companies “time sharing” for slices of computing time on the large machines. The mini-computer was cheaper and smaller, allowing companies and labs to own their own computers. The mini computer allowed a much greater number of people to have access to computers and even use them in real time. The mini computer led to a burst in software and networking technologies. In the early 80’s, the personal computer increased the number of computers by an order of magnitude and again led to an explosion in new software and technology while lowering the cost even more. Console gaming companies proved once again that unit costs could be decreased significantly by dramatically increasing the number of units sold. Today, we have over a billion cell phones in the market. There are tens of millions of camera phones. The incredible number of these devices has continued to lower the unit cost of computing, as well as of components embedded in these devices such as small cameras. High end phones have the computing power of the personal computers of the 80’s and the game consoles of the 90’s.

    History repeats with WiFi

    There are parallels in the history of communications and computing. In the 1980’s the technology of packet switched networks became widely deployed. Two standards competed. X.25 was a packet switched network technology being promoted by CCITT (a large, formal international standards body) and the telephone companies. It involved a system run by telephone companies including metered tariffs and multiple bilateral agreements between carriers to hook up.

    Concurrently, universities and research labs were promoting TCP/IP and the Internet: loosely organized standards meetings, flat rate tariffs, and little or no agreements between the carriers. People just connected to the closest node and everyone agreed to freely carry traffic for others.

    There were several “free Internet” services such as “The Little Garden” in San Francisco. Commercial service providers, particularly the telephone company operators such as SprintNet tried to shut down such free services by threatening not to carry this free traffic.

    Eventually, large ISPs began providing high quality Internet connectivity and finally the telephone companies realized that the Internet was the dominant standard and shutdown or acquired the ISPs.

    A similar trend is happening in wireless data services. GPRS is currently the dominant technology among mobile telephone carriers. GPRS allows users to transmit packets of data across the carrier network to the Internet. One can roam to other networks as long as the mobile operators have agreements with each other. Just like in the days of X.25, the system requires many bilateral agreements between the carriers; their goal is to track and bill for each packet of information.

    Competing with this standard is WiFi. WiFi is just a simple wireless extension to the current Internet and many hotspots provide people with free access to the Internet in cafes and other public areas. WiFi service providers have emerged, while telephone operators -- such as T-Mobile and Vodafone -- are capitalizing on paid WiFi services. Just as with the Internet, network operators are threatening to shut down free WiFi providers, citing a violation of terms of service.

    Just as with X.25, the GPRS data network and the future data networks planned by the telephone carriers (e.g. 3G) are crippled with unwieldy standards bodies, bilateral agreements, and inherently complicated and expensive plant operations.

    It is clear that the simplicity of WiFi and the Internet is more efficient than the networks planned by the telephone companies. That said, the availability of low cost phones is controlled by mobile telephone carriers, their distribution networks and their subsidies.

    Content vs Context

    Many of the mobile telephone carriers are hoping that users will purchase branded content manufactured in Hollywood and packaged and distributed by the telephone companies using sophisticated technology to thwart copying.

    Broadband in the home will always be cheaper than mobile broadband. Therefore it will be cheaper for people to download content at home and use storage devices to carry it with them rather than downloading or viewing content over a mobile phone network. Most entertainment content is not so time sensitive that it requires real time network access.

    The mobile carriers are making the same mistake that many of the network service providers made in the 80s. Consider Delphi, a joint venture between IBM and Sears Roebuck. Delphi assumed that branded content was going to be the main use of their system and designed the architecture of the network to provide users with such content. Conversely, the users ended up primarily using email and communications, and the system failed to provide such services effectively due to the mis-design.

    Similarly, it is clear that mobile computing is about communication. Not only are mobile phones being used for 1-1 communications, as expected through voice conversations; people are learning new forms of communication because of SMS, email and presence technologies. Often, the value of these communication processes is the transmission of “state” or “context” information; the content of the messages are less important.

    Copyright and the Creative Commons

    In addition to the constant flow of traffic keeping groups of people in touch with each other, significant changes are emerging in multimedia creation and sharing. The low cost of cameras and the nearly television studio quality capability of personal computers has caused an explosion in the number and quality of content being created by amateurs. Not only is this content easier to develop, people are using the power of weblogs and phones to distribute their creations to others.

    The network providers and many of the hardware providers are trying to build systems that make it difficult for users to share and manipulate multimedia content. Such regulation drastically stifles the users’ ability to produce, share and communicate. This is particularly surprising given that such activities are considered the primary “killer application” for networks.

    It may seem unintuitive to argue that packaged commercial content can co-exist alongside consumer content while concurrently stimulating content creation and sharing. In order to understand how this can work, it is crucial to understand how the current system of copyright is broken and can be fixed.

    First of all, copyright in the multimedia digital age is inherently broken. Historically, copyright works because it is difficult to copy or edit works and because only few people produce new works over a very long period of time. Today, technology allows us to find, sample, edit and share very quickly. The problem is that the current notion of copyright is not capable of addressing the complexity and the speed of what technology enables artists to create. Large copyright holders, notably Hollywood studios, have aggressively extended and strengthened their copyright protections to try to keep the ability to produce and distribute creative works in the realm of large corporations.

    Hollywood asserts, “all rights reserved” on works that they own. Sampling music, having a TV show running in the background in a movie scene or quoting lyrics to a song in a book about the history of music all require payment to and a negotiation with the copyright holder. Even though the Internet makes available a wide palette of wonderful works based on content from all over the world, the current copyright practices forbid most of such creation.

    However, most artists are happy to have their music sampled if they receive attribution. Most writers are happy to be quoted or have their books copied for non-commercial use. Most creators of content realize that all content builds on the past and the ability for people to build on what one has created is a natural and extremely important part of the creative process.

    Creative Commons tries to give artists that choice. By providing a more flexible copyright than the standard “all rights reserved” copyright of commercial content providers, Creative Commons allows artists to set a variety of rights to their works. This includes the ability to reuse for commercial use, copy, sample, require attribution, etc. Such an approach allows artists to decide how their work can be used, while providing people with the materials necessary for increased creation and sharing.

    Creative Commons also provides for a way to make the copyright of pieces of content machine-readable. This means that a search engine or other tool to manipulate content is able to read the copyright. As such, an artist can search for songs, images and text to use while having the information to provide the necessary attribution.

    Creative Commons can co-exist with the stringent copyright regimes of the Hollywood studios while allowing professional and amateur artists to take more control of how much they want their works to be shared and integrated into the commons. Until copyright law itself is fundamentally changed, the Creative Commons will provide an essential tool to provide an alternative to the completely inflexible copyright of commercial content.

    Content is not like some lump of gold to be hoarded and owned, which diminishes in value each time it is shared. Content is a foundation upon which community and relationships are formed. Content is the foundation for culture. We must evolve beyond the current copyright regime that was developed in a world where the creation and transmission of content was unwieldy and expensive, reserved to those privileged artists who were funded by commercial enterprises. This will provide the emerging wireless networks and mobile devices with the freedom necessary for them to become the community building tools of sharing that are their destiny.

    [via Joi Ito's Web]
    ]]>
    Essay about current and past trends -- Joi Itohttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/528Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    XML based generation of Rich and Native UIs is gathering momentum; it might also be a good point at which to understand the complementary relationship that exists between XForms and these XML based GUI generators.

    BTW - Here is a great XForms presentation that aids in the contextualization of my prior comments.

    The actual Macromedia MXML (Flex) review by Jon Udell follows:

    After a decade of web-style development, I'm sold on the idea of using markup languages to describe the layouts of user interfaces and to coordinate the event-driven code that interconnects widgets and binds them to data. The original expression of that model was HTML and JavaScript, but variations have flourished. Mozilla-based applications have been using XUL (XML User Interface Language) for years. The Laszlo Presentation Server uses a description language called LZX. Microsoft has previewed XAML (Extensible Application Markup Language) for Longhorn.

    Now comes MXML (Macromedia Flex Markup Language), the latest development in Macromedia's ongoing quest to reposition the near-ubiquitous Flash player as a general-purpose presentation engine for rich Internet applications. With XML markup at its core, Flex is inherently IDE- friendly, and Macromedia has two IDE initiatives underway. One, code-named Brady, builds on Dreamweaver MX. The other, code-named Partridge, leverages Eclipse.

    Full Review: http://www.infoworld.com/article/04/03/29/13TCflex_1.html

    Also see XML for UI Languages: http://xml.coverpages.org/userInterfaceXML.html

    Nothing stops any of the engines mentioned above (proprietary user interfaces as per the diagram below)

    ]]>
    Macromedia Brings Flash to the Enterprisehttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/498Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    WebDAV is one of those interesting standards that sometimes gets lost in the broader industry hoopla. Well, I finally decided to take a look at Mozilla's Calendar project as a more open solution for sharing my calendar. After browsing around a little I came across the following piece:

    To share your calendars, you need access to a webDAV server. If you run your own web server, you can install mod_dav, a free Apache module that will turn your web server into a webDAV server. Instructions on how to set it up are on their website. Once you set up your webDAV server, you can publish your calendar to the site, then subscribe to it from any other Mozilla Calendar. Automatically updating the calendar will give you a poor man's calendar server.

    Through WebDAV we will be able to share calendars across disparate calendaring tools (albeit with some degree of pain when Outlook is in the mix). Even better for me, I can post my shared calendar data via a Virtuoso instance (internally and externally, since WebDAV is one of the many protocols that it implements); in short, I could even seriously consider generating this on the fly and sharing it via this blog (Wow!).
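    Publishing over WebDAV really is just an HTTP PUT of the calendar file to a DAV-enabled URL; a minimal sketch follows, with the server URL, path, and credentials all placeholders.

```python
import base64
from urllib.request import Request, urlopen

ics_body = open("mycalendar.ics", "rb").read()
req = Request(
    "https://dav.example.com/calendars/mycalendar.ics",  # hypothetical DAV collection
    data=ics_body,
    method="PUT",
    headers={
        "Content-Type": "text/calendar",
        "Authorization": "Basic " + base64.b64encode(b"user:password").decode(),
    },
)
with urlopen(req) as resp:
    # Typically 201 Created on first publish, 204 No Content on an update.
    print(resp.status)
```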

    We aren't too many miles away from open and standards compliant Unified Data Storage thanks to WebDAV.

     

    ]]>
    Remember WebDAVhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/462Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    Databases get a grip on XML
    From Inforworld.

    The next iteration of the SQL standard was supposed to arrive in 2003. But SQL standardization has always been a glacially slow process, so nobody should be surprised that SQL:2003 -- now known as SQL:200n -- isn't ready yet. Even so, 2003 was a year in which XML-oriented data management, one of the areas addressed by the forthcoming standard, showed up on more and more developers' radar screens.  >> READ MORE

    This article rounds up products for 2003 in the critical area of Enterprise Database Technology. It certainly provides an apt reflection of how Virtuoso compares with offerings from some of the larger (but certainly slower to implement) database vendors in this space. As usual, Jon Udell's quote pretty much sums this up:

    "While the spotlight shone on the heavyweight contenders, a couple of agile innovators made noteworthy advances in 2003. OpenLink Software?s Virtuoso 3.0, which we reviewed in March, stole thunder from all three major players. Like Oracle, it offers a WebDAV-accessible XML repository. Like DB2 Information Integrator, it functions as database middleware that can perform federated ?joins? across SQL and XML sources. And like the forthcoming Yukon, it embeds the .Net CLR (Common Language Runtime), or in the case of Linux, Novell/Ximian?s Mono."

    Albeit still somewhat unknown to the broader industry, we have remained true to our "innovator" discipline, which still remains our chosen path to market leadership. Thus, it's worth a quick recap of Virtuoso's release history and features as we get set to up the ante even further in 2004:

    1998 - Virtuoso's initial public beta release with functional emphasis on Virtual Database Engine for ODBC and JDBC Data Sources.

    1999 - Virtuoso's official commercial release, with emphasis still on Virtual Database functionality for ODBC, JDBC accessible SQL Databases.

    2000 - Virtuoso 2.0 adds XML Storage, XPath, XML Schema, XQuery, XSL-T, WebDAV, SOAP, UDDI, HTTP, Replication, Free Text Indexing (*feature update*), POP3, and NNTP support.

    2002 - Virtuoso 2.7 extends Virtualization prowess beyond data access via enhancements to its Web Services protocol stack implementation by enabling SQL Stored Procedures to be published as Web Services. It also debuts its Object-Relational engine enhancements that include the incorporation of Java and Microsoft .NET Objects into its User Defined Type, User Defined Functions, and Stored Procedure offerings.

    2003 - Virtuoso 3.0 extends data and application logic virtualization into the Application Server realm (basically a Virtual Application server too!), by adding support for ASP.NET, PHP, Java Server Pages runtime hosting (making applications built using any of these languages deployable using Virtuoso across all supported platforms).

    Collectively, these releases have contributed to a very deliberate architecture and vision that will ultimately unveil the inherent power of critical I.S. infrastructure virtualization along the following lines: data storage, data access, and application logic, via coherent integration of SQL, XML, Web Services, and Persistent Stored Modules (.NET, Java, and other object based component building blocks).

     

    ]]>
    Enterprise Databases get a grip on XMLhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/442Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    NETWORK WORLD NEWSLETTER: MARK GIBBS ON WEB APPLICATIONS

    Today's focus: A Virtuoso of a server

    By Mark Gibbs

    One of the bigger drags of Web applications development is that building a system of even modest complexity is a lot like herding cats - you need a database, an applications server, an XML engine, etc., etc. And as they all come from different vendors you are faced with solving the constellation of integration issues that inevitably arise.

    If you are lucky, your integration results in a smoothly functioning system. If not, you have a lot of spare parts flying in loose formation with the risk of a crash and burn at any moment.

    An alternative is to look for all of these features and services in a single package but you'll find few choices in this arena.

    One that is available and looks very promising is OpenLink's Virtuoso (see links below).

    Virtuoso is described as a cross platform (runs on Windows, all Unix flavors, Linux, and Mac OS X) universal server that provides databases, XML services, a Web application server and supporting services all in a single package.

    OpenLink's list of supported standards is impressive and includes .Net, Mono, J2EE, XML Web Services (Simple Object Access Protocol, Web Services Description Language, WS-Security, Universal Description, Discovery and Integration), XML, XPath, XQuery, XSL-T, WebDav, HTTP, SMTP, LDAP, POP3, SQL-92, ODBC, JDBC and OLE-DB.

    Virtuoso provides an HTTP-compliant Web Server; native XML document creation, storage and management; a Web services platform for creation, hosting and consumption of Web services; content replication and synchronization services; free text index server, mail delivery and storage and an NNTP server.

    Another interesting feature is that with Virtuoso you can create Web services from existing SQL Stored Procedures, Java classes, C++ classes, and 'C' functions, as well as create dynamic XML documents from ODBC and JDBC data sources.

    This is an enormous product and implies a serious commitment on the part of adopters due to its scope and range of services.

    Virtuoso is enormous by virtue of its architectural ambitions, but actual disk requirements are

    ]]>
    A Virtuoso of a Serverhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/395Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    RSS: The Best Of All Possible Worlds

    The thing that most surprised me today in the SoftEdge panel on Social Software was the reaction to RSS. I should be clear that I am an RSS true believer. It seems to me that metadata as a byproduct of social software engines (be it blogging or social networking or whatever) is not only enviable, it is inevitable. RSS and FOAF and other yet-to-be-determined social software data protocols will become standards because it simply makes good sense for them to be standardized. Anyone paying attention to the unbelievable development and adoption curve of wireless can appreciate the immense value driven by standards -- and, in particular, standards that are truly standard. So it came as a bit of a shock to me that when I questioned the panelists on the implications of RSS and the Semantic Web, they were less sold on the inevitability of it all.

    When asked the question of whether the proliferation of RSS and FOAF might make it possible for reader technology to be the next killer application in knowledge management, I got very strong reactions from both Reid Hoffman and Meg Hourihan. Reid stated that he did not believe that RSS was sufficiently robust to provide significant value at any level. Meg followed up with a general indictment of the semantic web, which she views merely as a geek utopia. I will admit that I'm a fan of Candide (particularly at the hands of Bernstein), but I hardly view myself as Pangloss. One need look no further than, for example, the tools that Oddpost has incorporated into its web email client to allow an integrated email and blog experience. Better yet, through a relatively simple web service, Oddpost can deliver an RSS feed of a particular Google News search so that you can keep track of keywords that are of interest to you without having to visit Google repeatedly to find out if your company or candidate or favorite band has been mentioned in today's news. The same is true of watch lists on Technorati. Rather than periodically check to see if someone has linked to your blog, Technorati will do the work for you and deliver the info to your inbox only when there is information to be delivered. These examples are just the tip of the iceberg, but they demonstrate the nascent power of RSS and related standards. I'll have to wait for another panel to have that argument with Reid and Meg.

    [via VentureBlog]
    ]]>
    RSS: The Best Of All Possible Worldshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/383Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    ]]>
    <a href="http://www.xulplanet.com/cgi-bin/ndeakin/homeN.cgi?ai=133">Data Structures and RDF</a>http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/330Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    XML Features of Oracle 8i and 9i

    XML and relational databases are both technologies for structuring, cataloguing and processing data. If data has a regular and atomic structure, it is more appropriate and efficient to use a database than XML.

    Databases store data; XML is not a storage mechanism, it is a storage format (amongst its many capabilities).

    In this case, why would you wish to go to the trouble of converting such data from a database into XML and vice versa? Reasons include:

    • XML is easy to convert further into different formats as required: e.g. HTML, PDF, and plain text. This gives flexibility to web applications where data can be searched for and accessed from the database, and then formatted for output in different formats using, e.g., XSL.

    XML separates data from formatting (and programming logic). XSL is now broken down into two parts: XSLT (transformations) and XSL-FO (formatting objects).
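    A minimal sketch of that database-to-XML conversion, with an invented result set standing in for the output of a SQL query; the element and column names are illustrative. The resulting document is then fair game for XSLT into HTML, PDF, or plain text.

```python
import xml.etree.ElementTree as ET

rows = [(1, "Widget", 9.99), (2, "Gadget", 14.50)]  # pretend SELECT output

products = ET.Element("products")
for pid, name, price in rows:
    item = ET.SubElement(products, "product", id=str(pid))
    ET.SubElement(item, "name").text = name
    ET.SubElement(item, "price").text = f"{price:.2f}"

print(ET.tostring(products, encoding="unicode"))
```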

    ]]>
    XML Features of Oracle 8i and 9ihttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/296Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Ingres - A Forgotten Database: The Untold Story

    Ingres (technically, Advantage Ingres Enterprise) is, arguably, the forgotten database. There used to be five major databases: Oracle, DB2, Sybase, Informix and Ingres. Then along came Microsoft and, if you listened to most press comment (or the lack of it), you would think that there were only two of these left, plus SQL Server. [From IT-Director]

    Oracle, Microsoft, and IBM would certainly like the illusion of a 3 horse race, as this is the only way they can induce Ingres, Informix, and Sybase users to jump ship, and this, even though database migrations are by far the most risk prone and problematic aspects of any IT infrastructure.

    Here is the interesting logic from the self-made big three: if you want to take advantage of new paradigms and technologies such as XML, Web Services, and anything else in the pipeline, you have to move all your data out of these databases, and then get all the mission-critical applications re-associated with one of these databases; and, by the way, when you do so it is advisable that you use native interfaces (so that sometime in the future you have no chance whatsoever of repeating this folly at their expense).

    The simple fact of the matter (which the self-made big three do not want you to know) is that you can put ODBC, JDBC, or even platform-specific data access APIs such as OLE DB and ADO.NET atop any of these databases, and then explore and exploit the benefits of new technologies and paradigms as long as the tool pool supports one or more of these standards.

    Unfortunately the no-brainer above appears to be the more difficult of the choices before decision makers. In other words, many would rather dig themselves into a deeper hole (unknowingly, I can only presume) that ultimately leads to technology lock-in.

    The biggest challenge before any RDBMS based infrastructure today isn't which of the self-made big three to migrate to wholesale, but rather how to make progressive use of the pool of disparate applications and application databases that proliferate across the enterprise.

    This is another way of understanding the burgeoning market for Virtual Databases, which in my opinion present the new frontier in database technology.

     

    ]]>
    Ingres - A Forgotten Database, the untold storyhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/279Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    An interesting piece by Michael Carey, architect for Liquid Data at BEA, re. Enterprise Information Integration, from XML Journal.

    Key quote.

    Since the dawn of the database era more than three decades ago, enterprises have been amassing an ever-increasing volume of information - both current and historical - about their operations. For the past two of those three decades, the database world has struggled with the problem of somehow integrating information that natively resides in multiple database systems or other information sources (Landers and Rosenberg).

    This is the root cause of many of the systems integration challenges facing many IT decision makers. They want to exploit the new and emerging technologies, but the internal disparity of data and application logic presents many obstacles.

    Michael had this to say in his introduction.

    The IT world knows this problem today as the enterprise information integration (EII) problem: enterprise applications need to be able to easily access and combine information about a given business entity from a distributed and highly varied collection of information sources. Relevant sources include various relational database systems (RDBMSs); packaged applications from vendors such as Siebel, PeopleSoft, SAP, and others; "homegrown" proprietary systems; and an increasing number of data sources that are starting to speak XML, such as XML files and Web services.

    Virtuoso (which coincidentally has been used to build and host this blog) has been developed to address the challenges presented above, by providing a Virtual Database Engine for disparate data and application logic (all the GEMs on this page have been generated on the fly using its SQL-XML functionality).

    Additional article excerpts:
    With XQuery, the solution sketched above can be implemented by viewing the enterprise's different data sources all as virtual XML documents and functions. XQuery can stitch the distributed customer information together into a comprehensive, reusable base view.

    A critical issue at this point is how sensitive the XML VIEW is to underlying data source changes. Enterprises are dynamic, so static XML VIEWs are going to be suboptimal in many situations. Applications are only as relevant as the fluidity (freshness) of the underlying data served up by the data access layer (an issue that is data-format agnostic).

    Virtuoso addresses this problem through its support of Persistent and Transient forms of XML VIEWs (which are derived from SQL, XML, Web Services, or any combination of these).

    Final excerpt:
    The relational data sources can be exposed using simple default XML Schemas, and the other sources - SAP and the credit-checking Web service - can be exposed to XQuery as callable XQuery functions with appropriate signatures.

    Unfortunately XML Schemas aren't easy, so making this a requirement for producing XML VIEWs is somewhat problematic (or should I say challenging). Of course this approach has its merits, but it does put a significant knowledge acquisition burden on the end-user or developer. This is why Virtuoso also supports an approach based on SQL extensions for generating XML from SQL that facilitate the production of Well Formed and/or Valid XML documents on the fly from heterogeneous SQL Data Sources (this syntax is identical to the FOR XML RAW | AUTO | EXPLICIT modes of SQL Server). It can also use its in-built XSL-T engine to further transform other non-SQL XML data sources (and then generate an XML Schema for the final product if required, and validate against this schema using its in-built XML Schema validation engine).

    This article certainly sheds light on the kinds of problems that EII based technologies such as Virtual Databases are positioned to address.

    There is a live XQuery demo of Virtuoso at: http://demo.openlinksw.com:8890/xqdemo

    ]]>
    <a href="http://www.sys-con.com/xml/article2a.cfm?id=652&amp;count=18437&amp;tot=14&amp;page=12">piece</a>http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/276Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    When Virtuoso first unleashed support for XML (in-built XSL, Native XML Storage, Validating XML Parser, XPath, and XQuery) the core message was the delivery of a single server solution that would address the challenges of creating XML data.

    In the year 2000 the question of the shape and form of XML data was unclear to many, and reading the article below basically took me back in time to when we released Virtuoso 2.0 (we are now at release 3.0 commercially with a 3.2 beta dropping any minute).

    RSS is a great XML application, and it does a great job of demonstrating how XML --the new data access foundation layer-- will galvanize the next generation Web (I refer to this as Web 2.0.).

    RSS: INJAN (It's not just about news)

    RSS is not just about news, according to Ian Davis on rss-dev.
    He presents a nice list of alternatives, which I reproduce here (and to which I'd add, of course, bibliography management)

    • Sitemaps: one of the S's in RSS stands for summary. A sitemap is a summary of the content on a site; the items are pages or content areas. This is clearly a non-chronological ordering of items. Is a hierarchy of RSS sitemaps implied here -- how would the linking between them work? How hard would it be to hack a web browser to pick up the RSS sitemap and display it in a sidebar when you visit the site?
    • Small ads: also known as classifieds. These expire, so there's some kind of dynamic going on here, but the ordering of items isn't necessarily chronological. How to describe the location of the seller, or the condition of the item, or even the price. Not every ad is selling something -- perhaps it's to rent out a room.
    • Personals: similar model to the small ads. No prices though (I hope). Comes with a ready made vocabulary of terms that could be converted to an RDF schema. Probably should do that just for the hell of it anyway -- gsoh
    • Weather reports: how about a week's worth of weather in an RSS channel. If an item is dated in the future, should an aggregator display it before time? Alternate representations include maps of temperature and pressure etc.
    • Auctions: again, related to small ads, but these are much more time limited since there is a hard cutoff after which the auction is closed. The sequence of bids could be interesting -- would it make sense to thread them like a discussion so you can see the tactics?
    • TV listings: this is definitely chronological but with a twist -- the items have durations. They also have other metadata such as cast lists, classification ratings, widescreen, stereo, program type. Some types have additional information such as director and production year.
    • Top ten listings: top ten singles, books, dvds, richest people, ugliest, rear of the year etc. Not chronological, but has a definite order. May update from day to day or even more often.
    • Sales reporting: imagine if every department of a company reported their sales figures via RSS. Then the divisions aggregate the departmental figures and republish to the regional offices, who aggregate and add value up the chain. The chairman of the company subscribes to one super-aggregate feed.
    • Membership lists / buddy lists: could I publish my buddy list from Jabber or other instant messengers? Maybe as an interchange format, or perhaps it could be used to look for shared contacts. Lots of potential overlap with FOAF here.
    • Mailing lists: or in fact any messaging system such as usenet. There are some efforts at doing this already (e.g. yahoogroups) but we need more information -- threads; references; headers; links into archives.
    • Price lists / inventory: the items here are products or services. No particular ordering, but it'd be nice to be able to subscribe to a catalog of products and prices from a company. The aggregator should be able to pick out price rises or bargains given enough history.

    Thus, if we can comprehend RSS (the blog article below does a great job), we should be able to see the fundamental challenge before any organization seeking to exploit the potential of the imminent Web 2.0 inflection: how will you cost-effectively create XML data from existing data sources, without upgrading or switching database engines, operating systems, or programming languages? Put differently, how can you exploit this phenomenon without losing your ever-dwindling technology choices (believe me, choices are dwindling fast, but most are oblivious to this fact)?

     

    ]]>
    RSS: INJAN (It's not just about news)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/241Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Microsoft just made the VSIP program free of charge. Awesome.

    [via The Scobleizer Weblog]

    Now this is good news from Microsoft! This means that products like Virtuoso can now compete head-on with Yukon (on a level playing field when it arrives) as far as Visual Studio.NET integration goes. Hopefully I will no longer have to rant about any of the following:

    1. Missing Data Access Controls and Wizards for ODBC (we already have an interesting Generic ADO.NET Provider en route to GA release)
    2. Tightly bound integration between Visual Studio.NET ("Whidbey" or "Orcas") and Yukon (the next release of SQL Server); it's up to us (OpenLink) to get the same degree of integration re. Virtuoso (via VSIP), but most importantly Visual Studio's future will not be inextricably linked to Yukon's (let's hope the same applies to IE and Longhorn)

    I wonder if the same degree of openness could extend to Web Matrix? That would be something indeed!

    ]]>
    VSIP program free of chargehttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/209Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    O'Reilly on the Commoditization of Software

    Certainly an interesting proposition, or should I say vision, but I don't think this proposition does justice to some of the valid insights contained in this recent IDG interview with Tim O'Reilly. Here are some of Tim's quotes:

    "Nobody is pointing out something that I think is way more significant: all of the killer apps of the Internet era: Amazon (.com, Inc), Google (Inc.), and Maps.yahoo.com. They run on Linux or FreeBSD, but they're not apps in the way that people have traditionally thought of applications, so they just don't get considered. Amazon is built with Perl on top of Linux. It's basically a bunch of open source hackers, but they're working for a company that's as fiercely proprietary as any proprietary software company."

    Solutions are always more important than the technology that makes up the solutions, from a business development perspective. The trouble is that the constituent parts of a solution ultimately affect the longevity of the solution (the future adaptability of the solution), hence the middleware and components segments of the software industry.

    "With eBay it's even clearer. The fact is, it's the critical mass of marketplace buyers and sellers and all the information that people have put in that marketplace as a repository."

    "So I think we're going to find more and more places where that happens, where somebody gets a critical mass of customers and data and that becomes their source of value. On that basis, I will predict that -- this is an outrageous prediction -- but eBay will buy Oracle someday. The value will have moved so much to people who are not now seen as software suppliers."

    In reading this article, I can only assume that Tim does realize the inevitable: computing is, and always will be, about data -- creation, transformation, dissemination, and exploitation. That said, you don't maximize the opportunities that such a realization accords by acquiring the largest vendor of database software.

    Being the largest database vendor doesn't imply dominance in any of the following areas:

    1. Data Creation
    2. Data Storage
    3. Data Access
    4. Data Dissemination
    5. Data Exploitation

    I see the Internet as the Database (comprising various forms), and the Web as a dominant database segment within the Internet realm. Every Internet Point of Presence is really a point of Data interaction: Creation, Storage, Access, Dissemination, and Exploitation.

    eBay can acquire a license from Oracle or any other database vendor and still be successful. All they need to do is come to the actual realization that, like Amazon and Google, they could become a very important Executable and Semantic Web platform, by finally understanding that their home page isn't that important; it's the interactions with the site that matter. All of this is certainly achievable without acquiring Oracle.

    In short, this applies to any organization that seeks to incorporate the Internet into their operational strategy (Business Development, Customer Services, Intranets, Extranets, etc.). I am inclined to believe that Software Commoditization (which has been with us for a very long time) is the new moniker for "it's all about data" or, to quote Sam Ruby, "It's just data".

    ]]>
    eBay Will Someday Buy Oracle?http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/202Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    How Amazon Opens Up And Cleans Up

    Just yesterday we had an article about how Amazon's technology was becoming their biggest product, but that could soon change as people continue to innovate around Amazon's web services offering, letting just about anyone access Amazon's vast database, and build interesting and useful applications on it. When they originally launched this offering a number of developers thought it was cool, but weren't sure what could actually be done with it. However, given some time, data, and an open API, creative developers are always going to come up with interesting solutions. I don't know if any of these are really a "killer app" yet, but Amazon now has a vision of being the "e-commerce platform" for the world. There's something appealing about that notion. If, anytime you wanted to sell something on your website, you could easily hook into Amazon's catalog, transaction processing, and fulfillment process, there are some interesting possibilities. Right now, it's just simple things, such as creating a way to automatically match up the top song titles being played on the radio with those CDs at Amazon. In the future, though, you could see how an even bigger and more powerful Amazon could become something of a central "bucket of e-commerce" which many other sites pull from in creative ways. So, then, the question becomes how big is this opportunity, really? As I said, it's an appealing idea, but how many people actually buy through these sorts of applications vs. those who just go to Amazon and buy it themselves. The "killer app" built on top of Amazon would need to have really compelling reasons to buy directly through it - and I don't think anyone's gotten that far yet. [via Techdirt]

    There is nothing wrong with embracing Open Standards. Amazon is demonstrating just that.

    ]]>
    How Amazon Opens Up And Cleans Uphttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/179Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    What is DBpedia?

    DBpedia is a community effort to provide a contemporary deductive database derived from Wikipedia content. Project contributions can be partitioned as follows:

    1. Ontology Construction and Maintenance
    2. Dataset Generation via Wikipedia Content Extraction & Transformation
    3. Live Database Maintenance & Administration -- includes actual Linked Data loading and publishing, provision of SPARQL endpoint, and traditional DBA activity
    4. Internationalization.

    Why is DBpedia important?

    Comprising the nucleus of the Linked Open Data effort, DBpedia also serves as a fulcrum for the burgeoning Web of Linked Data by delivering a dense and highly-interlinked lookup database. In its most basic form, DBpedia is a great source of strong and resolvable identifiers for People, Places, Organizations, Subject Matter, and many other data items of interest. Naturally, it provides a fantastic starting point for comprehending the fundamental concepts underlying TimBL's initial Linked Data meme.

    How do I use DBpedia?

    Depending on your particular requirements, whether personal or service-specific, DBpedia offers the following:

    • Datasets that can be loaded on your deductive database (also known as triple or quad stores) platform of choice
    • Live browsable HTML+RDFa based entity description pages
    • A wide variety of data formats for importing entity description data into a broad range of existing applications and services
    • A SPARQL endpoint allowing ad-hoc querying over HTTP using the SPARQL query language, and delivering results serialized in a variety of formats
    • A broad variety of tools covering query by example, faceted browsing, full text search, entity name lookups, etc.

    What is the DBpedia 3.6 + Virtuoso Cluster Edition Combo?

    OpenLink Software has preloaded the DBpedia 3.6 datasets into a preconfigured Virtuoso Cluster Edition database, and made the package available for easy installation.

    Why is the DBpedia+Virtuoso package important?

    The DBpedia+Virtuoso package provides a cost-effective option for personal or service-specific incarnations of DBpedia.

    For instance, you may have a service that isn't best-served by competing with the rest of the world for ad-hoc query time and resources on the live instance, which itself operates under various restrictions which enable this ad-hoc query service to be provided at Web Scale.

    Now you can easily commission your own instance and quickly exploit DBpedia and Virtuoso's database feature set to the max, powered by your own hardware and network infrastructure.

    How do I use the DBpedia+Virtuoso package?

    Pre-requisites are simply:

    1. Functional Virtuoso Cluster Edition installation.
    2. Virtuoso Cluster Edition License.
    3. 90 GB of free disk space -- you ultimately only need 43 GB, but this is our recommended free disk space prior to installation completion.

    To install the Virtuoso Cluster Edition simply perform the following steps:

    1. Download Software.
    2. Run installer
    3. Set key environment variables and start the OpenLink License Manager, using command (this may vary depending on your shell):

      . /opt/virtuoso/virtuoso-enterprise.sh
    4. Run the mkcluster.sh script which defaults to a 4 node cluster
    5. Set the VIRTUOSO_HOME environment variable -- if you want to keep cluster databases distinct from single-server databases, use a distinct root directory for database files (one that isn't adjacent to single-server database directories)
    6. Start Virtuoso Cluster Edition instances using command:
      virtuoso-start.sh
    7. Stop Virtuoso Cluster Edition instances using command:
      virtuoso-stop.sh

    To install your personal or service specific edition of DBpedia simply perform the following steps:

    1. Navigate to your installation directory
    2. Download Installer script (dbpedia-install.sh)
    3. Set execution mode on script using command:
      chmod 755 dbpedia-install.sh
    4. Shutdown any Virtuoso instances that may be currently running
    5. Set your VIRTUOSO_HOME environment variable, e.g., to the current directory, via command (this may vary depending on your shell):
      export VIRTUOSO_HOME=`pwd`
    6. Run script using command:
      sh dbpedia-install.sh

    Once the installation completes (approximately 1 hour and 30 minutes from start time), perform the following steps:

    1. Verify that the Virtuoso Conductor (HTML based Admin UI) is in place via:
      http://localhost:[port]/conductor
    2. Verify that the Precision Search & Find UI is in place via:
      http://localhost:[port]/fct
    3. Verify that DBpedia's Green Entity Description Pages are in place via:
      http://localhost:[port]/resource/DBpedia
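
    If you prefer to script the verification, the following minimal Python sketch (not part of the original guide) checks that all three URLs respond; the hostname and port are assumptions, so substitute whatever your instance actually listens on.

      import urllib.request

      # Hypothetical host and port -- substitute your instance's HTTP listener.
      base = "http://localhost:8890"

      for path in ("/conductor", "/fct", "/resource/DBpedia"):
          try:
              with urllib.request.urlopen(base + path, timeout=10) as resp:
                  print(path, resp.status)  # expect 200 once installation has completed
          except Exception as exc:
              print(path, "not reachable:", exc)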

    Related

    ]]>
    Virtuoso + DBpedia 3.6 Installation Guide (Update 1)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1654Tue, 25 Jan 2011 19:46:26 GMT42011-01-25T14:46:26-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    What is SPARQL?

    A declarative query language from the W3C for querying structured propositional data (in the form of 3-tuple [triples] or 4-tuple [quads] records) stored in a deductive database (colloquially referred to as triple or quad stores in Semantic Web and Linked Data parlance).

    SPARQL is inherently platform independent. Like SQL, the query language and the backend database engine are distinct. Database clients capture SPARQL queries which are then passed on to compliant backend databases.

    Why is it important?

    Like SQL for relational databases, it provides a powerful mechanism for accessing and joining data across one or more data partitions (named graphs identified by IRIs). The aforementioned capability also enables the construction of sophisticated Views, Reports (HTML or those produced in native form by desktop productivity tools), and data streams for other services.

    Unlike SQL, SPARQL includes result serialization formats and an HTTP based wire protocol. Thus, the ubiquity and sophistication of HTTP is integral to SPARQL i.e., client side applications (user agents) only need to be able to perform an HTTP GET against a URL en route to exploiting the power of SPARQL.
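
    To make the wire protocol concrete, here is a minimal Python sketch (my example, not from the post): a SPARQL SELECT is URL-encoded into a plain HTTP GET and the standard JSON results serialization is read back. The DBpedia endpoint is used purely for illustration; any compliant endpoint URL works.

      import json
      import urllib.parse
      import urllib.request

      # Any SPARQL endpoint URL works; DBpedia's is used here as an example.
      endpoint = "http://dbpedia.org/sparql"
      query = "SELECT DISTINCT ?s WHERE { ?s ?p ?o } LIMIT 5"

      url = endpoint + "?" + urllib.parse.urlencode({"query": query})
      request = urllib.request.Request(url, headers={"Accept": "application/sparql-results+json"})

      with urllib.request.urlopen(request, timeout=30) as response:
          results = json.load(response)

      # The standard SPARQL JSON results layout: results -> bindings -> variable -> value.
      for binding in results["results"]["bindings"]:
          print(binding["s"]["value"])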

    How do I use it, generally?

    1. Locate a SPARQL endpoint (DBpedia, LOD Cloud Cache, Data.Gov, URIBurner, others), or;
    2. Install a SPARQL compliant database server (quad or triple store) on your desktop, workgroup server, data center, or cloud (e.g., Amazon EC2 AMI)
    3. Start the database server
    4. Execute SPARQL Queries via the SPARQL endpoint.

    How do I use SPARQL with Virtuoso?

    What follows is a very simple guide for using SPARQL against your own instance of Virtuoso:

    1. Software Download and Installation
    2. Data Loading from Data Sources exposed at Network Addresses (e.g. HTTP URLs) using very simple methods
    3. Actual SPARQL query execution via SPARQL endpoint.

    Installation Steps

    1. Download Virtuoso Open Source or Virtuoso Commercial Editions
    2. Run installer (if using the Commercial Edition or the Open Source Edition on Windows; otherwise, follow the build guide)
    3. Follow post-installation guide and verify installation by typing in the command: virtuoso -? (if this fails check you've followed installation and setup steps, then verify environment variables have been set)
    4. Start the Virtuoso server using the command: virtuoso-start.sh
    5. Verify you have a connection to the Virtuoso Server via the command: isql localhost (assuming you're using default DB settings) or the command: isql localhost:1112 (assuming demo database), or go to your browser and type in: http://<virtuoso-server-host-name>:[port]/conductor (e.g. http://localhost:8889/conductor for default DB or http://localhost:8890/conductor if using Demo DB)
    6. Go to SPARQL endpoint which is typically -- http://<virtuoso-server-host-name>:[port]/sparql
    7. Run a quick sample query (since the database always has system data in place): select distinct * where {?s ?p ?o} limit 50 .

    Troubleshooting

    1. Ensure environment settings are set and functional -- if using Mac OS X or Windows, you don't have to worry about this; just start and stop your Virtuoso server using the native OS services applets
    2. If using the Open Source Edition, follow the getting started guide -- it covers PATH and startup directory location re. starting and stopping Virtuoso servers.
    3. Sponging (HTTP GETs against external Data Sources) within SPARQL queries is disabled by default. You can enable this feature by assigning "SPARQL_SPONGE" privileges to user "SPARQL". Note, more sophisticated security exists via WebID based ACLs.

    Data Loading Steps

    1. Identify an RDF based structured data source of interest -- a file that contains 3-tuple / triples available at an address on a public or private HTTP based network
    2. Determine the Address (URL) of the RDF data source
    3. Go to your Virtuoso SPARQL endpoint and type in the following SPARQL query: DEFINE GET:SOFT "replace" SELECT DISTINCT * FROM <RDFDataSourceURL> WHERE {?s ?p ?o}
    4. All the triples in the RDF resource (data source accessed via URL) will be loaded into the Virtuoso Quad Store (using RDF Data Source URL as the internal quad store Named Graph IRI) as part of the SPARQL query processing pipeline.

    Note: the data source URL doesn't even have to be RDF based -- which is where the Virtuoso Sponger Middleware comes into play (download and install the VAD installer package first) since it delivers the following features to Virtuoso's SPARQL engine:

    1. Transformation of data from non RDF data sources (file content, hypermedia resources, web services output etc..) into RDF based 3-tuples (triples)
    2. Cache Invalidation Scheme Construction -- thus, for subsequent queries the define get:soft "replace" pragma will not be required, bar when you forcefully want to override the cache.
    3. If you have very large data sources like DBpedia etc. from CKAN, simply use our bulk loader.

    SPARQL Endpoint Discovery

    Public SPARQL endpoints are emerging at an ever increasing rate. Thus, we've set up a DNS lookup service that provides access to a large number of SPARQL endpoints. Of course, this doesn't cover all existing endpoints, so if your endpoint is missing please ping me.

    Here are a collection of commands for using DNS-SD to discover SPARQL endpoints:

    1. dns-sd -B _sparql._tcp sparql.openlinksw.com -- browse for service instances
    2. dns-sd -Z _sparql._tcp sparql.openlinksw.com -- output results in Zone File format
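
    For those who prefer code over the dns-sd command line, here is a hedged Python sketch of the equivalent lookup using the third-party dnspython package; the exact record name is my assumption, derived from the browse domain shown above.

      import dns.resolver  # third-party package: dnspython

      # DNS-SD browse as a plain DNS query: PTR records for the _sparql._tcp service
      # type under the sparql.openlinksw.com browse domain (record name assumed).
      for ptr in dns.resolver.resolve("_sparql._tcp.sparql.openlinksw.com", "PTR"):
          instance = str(ptr.target)
          print("service instance:", instance)
          # Each instance should carry an SRV record naming the endpoint host and port.
          for srv in dns.resolver.resolve(instance, "SRV"):
              print("  host:", srv.target, "port:", srv.port)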

    Related

    1. Using HTTP from Ruby -- you can just make SPARQL Protocol URLs re. SPARQL
    2. Using SPARQL Endpoints via Ruby -- Ruby example using DBpedia endpoint
    3. Interactive SPARQL Query By Example (QBE) tool -- provides a graphical user interface (as is common in SQL realm re. query building against RDBMS engines) that works with any SPARQL endpoint
    4. Other methods of loading RDF data into Virtuoso
    5. Virtuoso Sponger -- architecture and how it turns a wide variety of non RDF data sources into SPARQL accessible data
    6. Using OpenLink Data Explorer (ODE) to populate Virtuoso -- locate a resource of interest; click on a bookmarklet or use context menus (if using ODE extensions for Firefox, Safari, or Chrome); and you'll have SPARQL accessible data automatically inserted into your Virtuoso instance.
    7. W3C's SPARQLing Data Access Ingenuity -- an older generic SPARQL introduction post
    8. Collection of SPARQL Query Examples -- GoodRelations (Product Offers), FOAF (Profiles), SIOC (Data Spaces -- Blogs, Wikis, Bookmarks, Feed Collections, Photo Galleries, Briefcase/DropBox, AddressBook, Calendars, Discussion Forums)
    9. Collection of Live SPARQL Queries against LOD Cloud Cache -- simple and advanced queries.
    ]]>
    Simple Virtuoso Installation & Utilization Guide for SPARQL Users (Update 5)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1647Wed, 19 Jan 2011 15:43:35 GMT102011-01-19T10:43:35-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    After a long period of trying to demystify and unravel the wonders of standards compliant structured data access, combined with protocols (e.g., HTTP) that separate:

    1. Identity,
    2. Access,
    3. Storage,
    4. Representation, and
    5. Presentation.

    I ended up with what I can best describe as the Data 3.0 Manifesto: a manifesto for standards-compliant access to structured data object (or entity) descriptors.

    Some Related Work

    Alex James (Program Manager, Entity Frameworks at Microsoft) put together something quite similar to this via his Base4 blog (around the Web 2.0 bootstrap time). Sadly -- quoting Alex -- that post has gone where discontinued blogs and their host platforms go (deep, deep irony here).

    It's also important to note that this manifesto is a variant of TimBL's Linked Data Design Issues meme re. Linked Data, but totally decoupled from RDF (the data representation formats aspect) and SPARQL, which -- in my world view -- remain implementation details.

    Data 3.0 manifesto

    • An "Entity" is the "Referent" of an "Identifier."
    • An "Identifier" SHOULD provide a global, unambiguous, and unchanging (though it MAY be opaque!) "Name" for its "Referent".
    • A "Referent" MAY have many "Identifiers" (Names), but each "Identifier" MUST have only one "Referent".
    • Structured Entity Descriptions SHOULD be based on the Entity-Attribute-Value (EAV) Data Model, and SHOULD therefore take the form of one or more 3-tuples (triples), each comprised of:
      • an "Identifier" that names an "Entity" (i.e., Entity Name),
      • an "Identifier" that names an "Attribute" (i.e., Attribute Name), and
      • an "Attribute Value", which may be an "Identifier" or a "Literal".
    • Structured Descriptions SHOULD be CARRIED by "Descriptor Documents" (i.e., purpose specific documents where Entity Identifiers, Attribute Identifiers, and Attribute Values are clearly discernible by the document's intended consumers, e.g., humans or machines).
    • Structured Descriptor Documents can contain (carry) several Structured Entity Descriptions
    • Structured Descriptor Documents SHOULD be network accessible via network addresses (e.g., HTTP URLs when dealing with HTTP-based Networks).
    • An Identifier SHOULD resolve (de-reference) to a Structured Representation of the Referent's Structured Description.
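
    As a minimal illustration of the EAV prescription above (my sketch, not part of the manifesto), an entity description is simply a collection of 3-tuples whose Entity Name and Attribute Name slots hold Identifiers, and whose value slot holds either an Identifier or a Literal; the identifiers below are hypothetical.

      # Hypothetical identifiers; any global, unambiguous names will do (e.g., HTTP URIs).
      me = "http://example.org/people/kidehen#this"
      friend = "http://example.org/people/friend#this"

      description = [
          # (Entity Name, Attribute Name, Attribute Value)
          (me, "http://xmlns.com/foaf/0.1/name", "Kingsley Idehen"),  # value is a Literal
          (me, "http://xmlns.com/foaf/0.1/knows", friend),            # value is an Identifier
      ]

      # A "Descriptor Document" is then simply a network-accessible resource that
      # carries one or more such descriptions in a form its consumers can parse.
      for entity, attribute, value in description:
          print(entity, attribute, value)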

    Related

    ]]>
    Data 3.0 (a Manifesto for Platform Agnostic Structured Data) Update 5http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1624Tue, 25 May 2010 21:10:28 GMT82010-05-25T17:10:28.000001-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    What is URIBurner?

    A service from OpenLink Software, available at: http://uriburner.com, that enables anyone to generate structured descriptions -on the fly- for resources that are already published to HTTP based networks. These descriptions exist as hypermedia resource representations where links are used to identify:

    • the entity (data object or datum) being described,
    • each of its attributes, and
    • each of its attributes values (optionally).

    The hypermedia resource representation outlined above is what is commonly known as an Entity-Attribute-Value (EAV) Graph. The use of generic HTTP scheme based Identifiers is what distinguishes this type of hypermedia resource from others.

    Why is it Important?

    The virtues (dual-pronged serendipitous discovery) of publishing HTTP based Linked Data across public (World Wide Web) or private (Intranet and/or Extranet) networks are rapidly becoming clearer to everyone. That said, the nuance-laced nature of Linked Data publishing presents significant challenges to most. Thus, for Linked Data to really blossom the process of publishing needs to be simplified i.e., "just click and go" (for human interaction) or REST-ful orchestration of HTTP CRUD (Create, Read, Update, Delete) operations between Client Applications and Linked Data Servers.

    How Do I Use It?

    In a similar vein to the role played by FeedBurner with regard to Atom and RSS feed generation during the early stages of the Blogosphere, it enables anyone to publish Linked Data bearing hypermedia resources on an HTTP network. Thus, its usage covers two profiles: Content Publisher and Content Consumer.

    Content Publisher

    The steps that follow cover all you need to do:

    • place a <link> tag within your HTTP based hypermedia resource (e.g., within the <head> section of an HTML document)
    • use a URL via the @href attribute value to identify the location of the structured description of your resource, in this case it takes the form: http://linkeddata.uriburner.com/about/id/{scheme-or-protocol}/{your-hostname-or-authority}/{your-local-resource}
    • for human visibility you may consider associating a button (as you do with Atom and RSS) with the URL above.

    That's it! The discoverability (SDQ) of your content has just multiplied significantly; its structured description is now part of the Linked Data Cloud with a reference back to your site (which is now a bona fide HTTP based Linked Data Space).

    Examples

    HTML+RDFa based representation of a structured resource description:

    <link rel="describedby" title="Resource Description (HTML)"type="text/html" href="http://linkeddata.uriburner.com/about/id/http/example.org/xyz.html"/>

    JSON based representation of a structured resource description:

    <link rel="describedby" title="Resource Description (JSON)" type="application/json" href="http://linkeddata.uriburner.com/about/id/http/example.org/xyz.html"/>

    N3 based representation of a structured resource description:

    <link rel="describedby" title="Resource Description (N3)" type="text/n3" href="http://linkeddata.uriburner.com/about/id/http/example.org/xyz.html"/>

    RDF/XML based representations of a structured resource description:

    <link rel="describedby" title="Resource Description (RDF/XML)" type="application/rdf+xml" href="http://linkeddata.uriburner.com/about/id/http/example.org/xyz.html"/>

    Content Consumer

    As an end-user, obtaining a structured description of any resource published to an HTTP network boils down to the following steps:

    1. go to: http://uriburner.com
    2. drag the Page Metadata Bookmarklet link to your Browser's toolbar
    3. whenever you encounter a resource of interest (e.g. an HTML page) simply click on the Bookmarklet
    4. you will be presented with an HTML representation of a structured resource description (i.e., identifier of the entity being described, its attributes, and its attribute values will be clearly presented).

    Examples

    If you are a developer, you can simply perform an HTTP operation request (from your development environment of choice) using any of the URL patterns presented below:

    HTML:
    • curl -I -H "Accept: text/html" http://linkeddata.uriburner.com/about/id/{scheme}/{authority}/{local-path}

    JSON:

    • curl -I -H "Accept: application/json" http://linkeddata.uriburner.com/about/id/{scheme}/{authority}/{local-path}
    • curl http://linkeddata.uriburner.com/about/data/json/{scheme}/{authority}/{local-path}

    Notation 3 (N3):

    • curl -I -H "Accept: text/n3" http://linkeddata.uriburner.com/about/id/{scheme}/{authority}/{local-path}
    • curl http://linkeddata.uriburner.com/about/data/n3/{scheme}/{authority}/{local-path}
    • curl -I -H "Accept: text/turtle" http://linkeddata.uriburner.com/about/id/{scheme}/{authority}/{local-path}
    • curl http://linkeddata.uriburner.com/about/data/ttl/{scheme}/{authority}/{local-path}

    RDF/XML:

    • curl -I -H "Accept: application/rdf+xml" http://linkeddata.uriburner.com/about/id/{scheme}/{authority}/{local-path}
    • curl http://linkeddata.uriburner.com/about/data/xml/{scheme}/{authority}/{local-path}
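
    The curl patterns above translate directly to any HTTP client. Here is a hedged Python equivalent that fetches the Turtle description of a placeholder resource; substitute a real scheme, authority, and local path for the example.org values.

      import urllib.request

      # Placeholder target: scheme "http", authority "example.org", local path "xyz.html".
      description_url = "http://linkeddata.uriburner.com/about/data/ttl/http/example.org/xyz.html"

      request = urllib.request.Request(description_url, headers={"Accept": "text/turtle"})
      with urllib.request.urlopen(request, timeout=30) as response:
          print(response.headers.get("Content-Type"))
          print(response.read().decode("utf-8")[:500])  # first part of the Turtle description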

    Conclusion

    URIBurner is a "deceptively simple" solution for cost-effective exploitation of HTTP based Linked Data meshes. It doesn't require any programming or customization en route to immediately realizing its virtues.

    If you like what URIBurner offers, but prefer to leverage its capabilities within your own domain -- such that resource description URLs reside in your domain -- all you have to do is perform the following steps:

    1. download a copy of Virtuoso (for local desktop, workgroup, or data center installation) or
    2. instantiate Virtuoso via the Amazon EC2 Cloud
    3. enable the Sponger Middleware component via the RDF Mapper VAD package (which includes cartridges for over 30 different resource types)

    When you install your own URIBurner instances, you also have the ability to perform customizations that increase resource description fidelity in line with your specific needs. All you need to do is develop a custom extractor cartridge and/or meta cartridge.

    Related:

    ]]>
    URIBurner: Painless Generation & Exploitation of Linked Data (Update 1 - Demo Links Added)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1613Thu, 11 Mar 2010 15:16:34 GMT52010-03-11T10:16:34.000003-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Motivation for this post arose from a series of Twitter exchanges between Tony Hirst and me, in relation to his blog post titled: So What Is It About Linked Data that Makes it Linked Data™ ?

    At the end of the marathon session, it was clear to me that a blog post was required for future reference, at the very least :-)

    What is Linked Data?

    "Data Access by Reference" mechanism for Data Objects (or Entities) on HTTP networks. It enables you to Identify a Data Object and Access its structured Data Representation via a single Generic HTTP scheme based Identifier (HTTP URI). Data Object representation formats may vary; but in all cases, they are hypermedia oriented, fully structured, and negotiable within the context of a client-server message exchange.

    Why is it Important?

    Information makes the world tick!

    Information doesn't exist without data to contextualize.

    Information is inaccessible without a projection (presentation) medium.

    All information (without exception, when produced by humans) is subjective. Thus, to truly maximize the innate heterogeneity of collective human intelligence, loose coupling of our information and associated data sources is imperative.

    How is Linked Data Delivered?

    Linked Data is exposed to HTTP networks (e.g. World Wide Web) via hypermedia resources bearing structured representations of data object descriptions. Remember, you have a single Identifier abstraction (generic HTTP URI) that embodies: Data Object Name and Data Representation Location (aka URL).

    How are Linked Data Object Representations Structured?

    A structured representation of data exists when an Entity (Datum), its Attributes, and its Attribute Values are clearly discernible. In the case of a Linked Data Object, structured descriptions take the form of a hypermedia based Entity-Attribute-Value (EAV) graph pictorial -- where each Entity, its Attributes, and its Attribute Values (optionally) are identified using Generic HTTP URIs.

    Examples of structured data representation formats (content types) associated with Linked Data Objects include:

    • text/html
    • text/turtle
    • text/n3
    • application/json
    • application/rdf+xml
    • Others
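
    To ground the formats listed above, the following small Python sketch (using the third-party rdflib package, my choice rather than anything mandated by the post) builds one such entity-attribute-value description and serializes it as two of the listed content types; the entity identifiers are hypothetical.

      from rdflib import Graph, Literal, Namespace, URIRef  # third-party package: rdflib

      FOAF = Namespace("http://xmlns.com/foaf/0.1/")
      entity = URIRef("http://example.org/people/alice#this")  # hypothetical entity identifier

      g = Graph()
      g.add((entity, FOAF.name, Literal("Alice Example")))
      g.add((entity, FOAF.knows, URIRef("http://example.org/people/bob#this")))

      print(g.serialize(format="turtle"))  # text/turtle representation
      print(g.serialize(format="xml"))     # application/rdf+xml representation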

    How Do I Create Linked Data oriented Hypermedia Resources?

    You mark up resources by expressing distinct entity-attribute-value statements (basically, these are 3-tuple records) using a variety of notations:

    • (X)HTML+RDFa,
    • JSON,
    • Turtle,
    • N3,
    • TriX,
    • TriG,
    • RDF/XML, and
    • Others (for instance you can use Atom data format extensions to model EAV graph as per OData initiative from Microsoft).

    You can achieve this task using any of the following approaches:

    • Notepad
    • WYSIWYG Editor
    • Transformation of Database Records via Middleware
    • Transformation of XML based Web Services output via Middleware
    • Transformation of other Hypermedia Resources via Middleware
    • Transformation of non Hypermedia Resources via Middleware
    • Use a platform that delivers all of the above.

    Practical Examples of Linked Data Objects Enable

    • Describe Who You Are, What You Offer, and What You Need via your structured profile, then leave your HTTP network to perform the REST (serendipitous discovery of relevant things)
    • Identify (via map overlay) all items of interest based on a 2km+ radius of my current location (this could include vendor offerings or services sought by existing or future customers)
    • Share the latest and greatest family photos with family members *only* without forcing them to signup for Yet Another Web 2.0 service or Social Network
    • No repetitive signup and username and password based login sequences per Web 2.0 or Mobile Application combo
    • Going beyond imprecise Keyword Search to the new frontier of Precision Find - Example: Find Data Objects associated with the keywords: Tiger, while enabling the seeker to disambiguate across the "Who", "What", "Where", "When" dimensions (with negation capability)
    • Determine how two Data Objects are Connected - person to person, person to subject matter etc. (LinkedIn outside the walled garden)
    • Use any resource address (e.g blog or bookmark URL) as the conduit into a Data Object mesh that exposes all associated Entities and their social network relationships
    • Apply patterns (social dimensions) above to traditional enterprise data sources in combination (optionally) with external data without compromising security etc.

    How Do OpenLink Software Products Enable Linked Data Exploitation?

    Our data access middleware heritage (which spans 16+ years) has enabled us to assemble a rich portfolio of coherently integrated products that enable cost-effective evaluation and utilization of Linked Data, without writing a single line of code, or exposing you to the hidden, but extensive admin and configuration costs. Post installation, the benefits of Linked Data simply materialize (along the lines described above).

    Our main Linked Data oriented products include:

    • OpenLink Data Explorer -- visualizes Linked Data or Linked Data transformed "on the fly" from hypermedia and non hypermedia data sources
    • URIBurner -- a "deceptively simple" solution that enables the generation of Linked Data "on the fly" from a broad collection of data sources and resource types
    • OpenLink Data Spaces -- a platform for enterprises and individuals that enhances distributed collaboration via Linked Data driven virtualization of data across its native and/or 3rd party content manager for: Blogs, Wikis, Shared Bookmarks, Discussion Forums, Social Networks etc
    • OpenLink Virtuoso -- a secure and high-performance native hybrid data server (Relational, RDF-Graph, Document models) that includes in-built Linked Data transformation middleware (aka. Sponger).

    Related

    ]]>
    Revisiting HTTP based Linked Data (Update 1 - Demo Video Links Added)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1611Mon, 08 Mar 2010 14:59:37 GMT42010-03-08T09:59:37.000010-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Socially enhanced enterprise and individual collaboration is becoming a focal point for a variety of solutions that offer erstwhile distinct content management features across the realms of Blogging, Wikis, Shared Bookmarks, Discussion Forums etc., as part of an integrated platform suite. Recently, Socialtext has caught my attention courtesy of its nice features and benefits page. In addition, I've also found the Mike 2.0 portal immensely interesting and valuable, for those with an enterprise collaboration bent.

    Anyway, Socialtext and Mike 2.0 (they aren't identical, and the juxtaposition isn't seeking to imply this) provide nice demonstrations of what socially enhanced collaboration for individuals and/or enterprises is all about:

    1. Identifying Yourself
    2. Identifying Others (key contributors, peers, collaborators)
    3. Serendipitous Discovery of key contributors, peers, and collaborators
    4. Serendipitous Discovery by key contributors, peers, and collaborators
    5. Develop and sustain relationships via socially enhanced professional network hybrid
    6. Utilize your new "trusted network" (which you've personally indexed) when seeking help or propagating a meme.

    As is typically the case in this emerging realm, the critical issue of discrete "identifiers" (record keys, in a sense) for data items, data containers, and data creators (individuals and groups) is overlooked, albeit unintentionally.

    How HTTP based Linked Data Addresses the Identifier Issue

    Rather than using platform constrained identifiers such as:

    • email address (a "mailto" scheme identifier),
    • a dbms user account,
    • application specific account, or
    • OpenID.

    It enables you to leverage the platform independence of HTTP scheme Identifiers (Generic URIs) such that Identifiers for:

    1. You,
    2. Your Peers,
    3. Your Groups, and
    4. Your Activity Generated Data,

    simply become conduits into a mesh of HTTP -- referenceable and accessible -- Linked Data Objects endowed with High SDQ (Serendipitous Discovery Quotient). For example my Personal WebID is all anyone needs to know if they want to explore:

    1. My Profile (which includes references to data objects associated with my interests, social-network, calendar, bookmarks etc.)
    2. Data generated by my activities across various data spaces (via data objects associated with my online accounts e.g. Del.icio.us, Twitter, Last.FM)
    3. Linked Data Meshups via URIBurner (or any other Virtuoso instance) that provide an extended view of my profile

    How FOAF+SSL adds Socially aware Security

    Even when you reach a point of equilibrium where your daily activities trigger orchestration of CRUD (Create, Read, Update, Delete) operations against Linked Data Objects within your socially enhanced collaboration network, you still have to deal with the thorny issues of security, which include the following:

    1. Single Sign On,
    2. Authentication, and
    3. Data Access Policies.

    FOAF+SSL, an application of HTTP based Linked Data, enables you to enhance your Personal HTTP scheme based Identifier (or WebID) via the following steps (performed by a FOAF+SSL compliant platform):

    1. Imprint the WebID within a self-signed x.509 based public key (certificate) associated with your private key (generated by the FOAF+SSL platform or manually via OpenSSL)
    2. Store the public key components (modulus and exponent) in your FOAF based profile document, which references your Personal HTTP Identifier as its primary topic
    3. Leverage the HTTP URL component of the WebID for making the public key components (modulus and exponent) available for x.509 certificate based authentication challenges posed by systems secured by FOAF+SSL (directly) or OpenID (indirectly via FOAF+SSL to OpenID proxy services).

    Contrary to conventional experiences with all things PKI (Public Key Infrastructure) related, FOAF+SSL compliant platforms typically handle the PKI issues as part of the protocol implementation; thereby protecting you from any administrative tedium without compromising security.

    Conclusions

    Understanding how new technology innovations address long standing problems, or understanding how new solutions inadvertently fail to address old problems, provides time tested mechanisms for product selection and value proposition comprehension that ultimately save scarce resources such as time and money.

    If you want to understand real world problem solution #1 with regards to HTTP based Linked Data look no further than the issues of secure, socially aware, and platform independent identifiers for data objects, that build bridges across erstwhile data silos.

    If you want to cost-effectively experience what I've outlined in this post, take a look at OpenLink Data Spaces (ODS), which is a distributed collaboration engine (enterprise or individual) built around the Virtuoso database engines. It simply enhances existing collaboration tools via the following capabilities:

    Addition of Social Dimensions via HTTP based Data Object Identifiers for all Data Items (if missing)

    1. Ability to integrate across a myriad of Data Source Types rather than a select few, across RDBMS Engines, LDAP, Web Services, and various HTTP accessible Resources (Hypermedia or Non Hypermedia content types)
    2. Addition of FOAF+SSL based authentication
    3. Addition of FOAF+SSL based Access Control Lists (ACLs) for policy based data access.

    Related:

    ]]>
    Linked Data & Socially Enhanced Collaboration (Enterprise or Individual) -- Update 1http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1610Thu, 04 Mar 2010 00:50:37 GMT42010-03-03T19:50:37-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Situation Analysis

    Since the beginning of the modern IT era, each period of innovation has inadvertently introduced its fair share of Data Silos. The driving force behind this anomaly remains an overemphasis on the role of applications when selecting problem solutions. Unfortunately, most solution selecting decision makers remain oblivious to the fact that most applications are architecturally monolithic; i.e., they fail to separate the following five layers that are critical to all solutions:

    1. Data Unit (Datum or Data Object) Identity,
    2. Data Storage/Persistence,
    3. Data Access,
    4. Data Representation, and
    5. Data Presentation/Visualization.

    The rise of the Internet, and its exponentially-growing user-friendly enclave known as the World Wide Web, is bringing the intrinsic costs of the monolithic application architecture anomaly to bear -- in manners unanticipated by many. For example, the emergence of network-oriented solutions across the realms of Enterprise 2.0-based Collaboration and Web 2.0-based Software-as-a-Service (SaaS), combined with the overarching influence of Social Media, are producing more heterogeneously-structured and disparately-located data sources than people can effectively process.

    As is often the case, a variety of problem and product monikers have emerged for the data access and integration challenges outlined above. Contemporary examples include Enterprise Information Integration, Master Data Management, and Data Virtualization. Labeling aside, the fundamental issues of the unresolved Data Integration challenge boil down to the following:

    • Data Model Heterogeneity
    • Data Quality (Cleanliness)
    • Semantic Variance across Contexts (e.g., weights and measures).

    Effectively solving today's data integration challenges requires a move away from monolithic application architecture to loosely-coupled, network-centric application architectures. Basically, we need a ubiquitous network-centric application protocol that lends itself to loosely-coupled across-the-wire orchestration of data interactions. In short, this will be what revitalizes the art of application development and deployment.

    The World Wide Web is built around a network application protocol called HTTP. This protocol intrinsically separates the five layers listed earlier, thereby enabling:

    • Use of Generic HTTP URIs as Data Object (Entity) Identifiers;
    • Identifier Co-reference, such that multiple Data Object Identifiers may reference the same Data Object;
    • Use of the Entity-Attribute-Value Model to describe Data Objects using real world modeling friendly conceptual graphs;
    • Use of HTTP URLs to Identify Locations of Resources that bear (host) Data Object Descriptions (Representations);
    • Data Access mechanism for retrieving Data Object Representations from persistent or transient storage locations.

    What is Virtuoso?

    A hybrid data server uniquely designed to address today's escalating Data Access and Integration challenges without compromising performance, security, or platform independence. At its core lies an unrivaled commitment to industry standards, combined with unique technology innovation that transcends erstwhile distinct realms such as:

    When Virtuoso is installed and running, HTTP-based Data Objects are automatically created as a by-product of its powerful data virtualization, transcending data sources and data representation formats. The benefits of such power extend across profiles such as:

    Product Benefits Summary

    • Enterprise Agility — Virtuoso lets you mix-&-match best-of-class combinations of Operating Systems, Programming Environments, Database Engines and Data-Access Middleware when building or tweaking your IS infrastructure, without the typical impedance of vendor-lock-in.
    • Data Model Dexterity — By supporting multiple protocols and data models in a single product, Virtuoso protects you against costly vulnerabilities such as: perennial acquisition and accumulation of expensive data model specific DBMS products that still operate on the fundamental principle of: proprietary technology lock-in, at a time when heterogeneity continues to intrinsically define the information technology landscape.
    • Cost-effectiveness — By providing a single point of access (and single-sign-on, SSO) to a plethora of Web 2.0-style social networks, Web Services, and Content Management Systems, and by using Data Object Identifiers as units of Data Virtualization that become the focal points of all data access, Virtuoso lowers the cost to exploit emerging frontiers such as socially-enhanced enterprise collaboration.
    • Speed of Exploitation — Virtuoso provides the ability to rapidly assemble 360-degree conceptual views of data, across internal line-of-business application (CRM, ERP, ECM, HR, etc.) data and/or external data sources, whether these are unstructured, semi-structured, or fully structured.

    Bottom line, Virtuoso delivers unrivaled flexibility and scalability, without compromising performance or security.

    Related

     

    ]]>
    OpenLink Virtuoso - Product Value Proposition Overviewhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1609Sat, 27 Feb 2010 17:46:36 GMT32010-02-27T12:46:36-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    In recent times a lot of the commentary and focus re. Virtuoso has centered on the RDF Quad Store and Linked Data. What sometimes gets overlooked is the sophisticated Virtual Database Engine that provides the foundation for all of Virtuoso's data integration capabilities.

    In this post I provide a brief re-introduction to this essential aspect of Virtuoso.

    What is it?

    This component of Virtuoso is known as the Virtual Database Engine (VDBMS). It provides transparent high-performance and secure access to disparate data sources that are external to Virtuoso. It enables federated access and integration of data hosted by any ODBC- or JDBC-accessible RDBMS, RDF Store, XML database, or Document (Free Text)-oriented Content Management System. In addition, it facilitates integration with Web Services (SOAP-based SOA RPCs or REST-fully accessible Web Resources).

    Why is it important?

    In the most basic sense, you shouldn't need to upgrade your existing database engine version simply because your current DBMS and Data Access Driver combo isn't compatible with ODBC-compliant desktop tools such as Microsoft Access, Crystal Reports, BusinessObjects, Impromptu, or other ODBC-, JDBC-, ADO.NET-, or OLE DB-compliant applications. Simply place Virtuoso in front of your so-called "legacy database," and let it deliver the compliance levels sought by these tools.

    In addition, it's important to note that today's enterprise, through application evolution, company mergers, or acquisitions, is often faced with disparately-structured data residing in any number of line-of-business-oriented data silos. Compounding the problem is the exponential growth of user-generated data via new social media-oriented collaboration tools and platforms. For companies to cost-effectively harness the opportunities accorded by the increasing intersection between line-of-business applications and social media, virtualization of data silos must be achieved, and this virtualization must be delivered in a manner that doesn't prohibitively compromise performance or completely undermine security at either the enterprise or personal level. Again, this is what you get by simply installing Virtuoso.

    How do I use it?

    The VDBMS may be used in a variety of ways, depending on the data access and integration task at hand. Examples include:

    Relational Database Federation

    You can make a single ODBC, JDBC, ADO.NET, OLE DB, or XMLA connection to multiple ODBC- or JDBC-accessible RDBMS data sources, concurrently, with the ability to perform intelligent distributed joins against externally-hosted database tables. For instance, you can join internal human resources data against internal sales and external stock market data, even when the HR team uses Oracle, the Sales team uses Informix, and the Stock Market figures come from Ingres!
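
    A minimal sketch of the idea, in Python via pyodbc: the DSN name, attached-table qualifiers, and column names below are illustrative assumptions, but the point is one connection, one SQL statement, and a distributed join performed by Virtuoso behind the scenes.

        import pyodbc

        # "VirtuosoFed" is a hypothetical ODBC DSN pointing at a Virtuoso instance
        # that has remote Oracle and Informix tables attached (linked) into it.
        conn = pyodbc.connect("DSN=VirtuosoFed;UID=dba;PWD=dba")
        cursor = conn.cursor()

        cursor.execute("""
            SELECT e.emp_name, s.total_sales
              FROM ORACLE_HR..EMPLOYEES e          -- attached from Oracle (illustrative name)
              JOIN INFORMIX_SALES..SALES_SUMMARY s -- attached from Informix (illustrative name)
                ON e.emp_id = s.emp_id
        """)
        for row in cursor.fetchall():
            print(row.emp_name, row.total_sales)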

    Conceptual Level Data Access using the RDF Model

    You can construct RDF Model-based Conceptual Views atop Relational Data Sources. This is about generating HTTP-based Entity-Attribute-Value (E-A-V) graphs using data culled "on the fly" from native or external data sources (Relational Tables/Views, XML-based Web Services, or User Defined Types).

    You can also derive RDF Model-based Conceptual Views from Web Resource transformations "on the fly" -- the Virtuoso Sponger (RDFizing middleware component) enables you to generate RDF Model Linked Data via a RESTful Web Service or within the process pipeline of the SPARQL query engine (i.e., you simply use the URL of a Web Resource in the FROM clause of a SPARQL query).
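
    For illustration, here is a minimal Python sketch (using the SPARQLWrapper library) of the "Web Resource URL in the FROM clause" pattern just described; the endpoint hostname is an assumption, and the Last.FM URL is simply an example of a non-RDF Web resource the Sponger can RDFize on the fly:

        from SPARQLWrapper import SPARQLWrapper, JSON

        sparql = SPARQLWrapper("http://example-virtuoso-host/sparql")   # hypothetical endpoint
        sparql.setQuery("""
            SELECT DISTINCT ?p ?o
            FROM <http://www.last.fm/music/Dr.+Dre>   # a plain Web resource, RDFized on the fly
            WHERE { ?s ?p ?o }
            LIMIT 25
        """)
        sparql.setReturnFormat(JSON)
        results = sparql.query().convert()
        for binding in results["results"]["bindings"]:
            print(binding["p"]["value"], binding["o"]["value"])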

    It's important to note that Views take the form of HTTP links that serve as both Data Source Names and Data Source Addresses. This enables you to query and explore relationships across entities (i.e., People, Places, and other Real World Things) via HTTP clients (e.g., Web Browsers) or directly via SPARQL Query Language constructs transmitted over HTTP.

    Conceptual Level Data Access using ADO.NET Entity Frameworks

    As an alternative to RDF, Virtuoso can expose ADO.NET Entity Frameworks-based Conceptual Views over Relational Data Sources. It achieves this by generating Entity Relationship graphs via its native ADO.NET Provider, exposing all externally attached ODBC- and JDBC-accessible data sources. In addition, the ADO.NET Provider supports direct access to Virtuoso's native RDF database engine, eliminating the need for resource intensive Entity Frameworks model transformations.

    Related

    ]]>
    Re-introducing the Virtuoso Virtual Database Engine http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1608Wed, 17 Feb 2010 21:46:53 GMT12010-02-17T16:46:53-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Thanks to the TechCrunch post titled: Ten Technologies That Will Rock 2010, I've been able to quickly construct a derivative post that condenses the ten item list down to a Single Technology That Will Rock 2010 :-)

    Sticking with the TechCrunch layout, here is why all roads simply lead to Linked Data come 2010 and beyond:

    1. The Tablet: a new form factor addition re. Internet and Web application hosts which is just another way of saying: Linked Data will be accessible from Tablet applications.
    2. Geo: GPS chips are now standard features of mobile phones, so geolocation is increasingly becoming a necessary feature for any killer app. Thus, GeoSpatial Linked Data and GeoSpatial Queries are going to be a critical success factor for any endeavor that seeks to engage mobile application developers and, ultimately, their end-users. Basically, you want to be able to perform Esoteric Search from these devices of the form: find Vendors of a Camcorder (e.g., with a Zoom Factor: Weight Ratio of X) within a 2km radius of my current location, or how many items from my WishList are available from a Vendor within a 2km radius of my current location. Conversely, provide Vendors with the ability to spot potential Customers within a 2km radius of a given "clicks & mortar" location (e.g., a BestBuy store).
    3. Realtime Search: Rich Structured Profiles that leverage standards such as FOAF and FOAF+SSL will enable Highly Personalized Realtime Search (HPRS) without compromising privacy. Technically, this is about WebIDs securely bound to X.509 Certificates, providing access to verifiable and highly navigable Personal Profile Data Spaces that also double as personal search index entry points.
    4. Chrome OS: Just another operating system for exploiting the burgeoning Web of Linked Data
    5. HTML5: Courtesy of RDFa, just another mechanism for exposing Linked Data by making HTML+RDFa a bona fide markup for metadata (i.e., format for describing real world objects via their attribute-value graphs)
    6. Mobile Video: Simplifies the production and sharing of Video annotations (comments, reviews etc.) en route to creating rich Linked Discourse Data Spaces.
    7. Augmented Reality: Ditto
    8. Mobile Transactions: As per points 1 & 2 above, Vendor Discovery and Transaction Consummation will increasingly be driven by high-SDQ applications. The "Funnel Effect" (more choices based on individual preferences) will be a critical success factor for anyone operating in the Mobile Transaction realm. Note: without Linked Data you cannot deliver scalable solutions that handle the combined requirements of SDQ, the "Funnel Effect", and the Mobile Device form factor; these requirements simply magnify the importance of Web-accessible Linked Data.
    9. Android: An additional platform for items 1-8; basically, 2010 isn't going to be an iPhone only zone. Personally, this reminds me of a battle from the past i.e., Microsoft vs Apple, re. desktop computing dominance. Google has studied history very well :-)
    10. Social CRM: this is simply about applying points 1-9 alongside the construction of Linked Data from eCRM Data Spaces.

    As I've stated in the past (across a variety of mediums), you cannot build applications that have long term value without addressing the following issues:

    1. Data Item or Object Identity
    2. Data Structure -- Data Models
    3. Data Representation -- Data Model Entity & Relationships Representation mechanism (as delivered by metadata oriented markup)
    4. Data Storage -- Database Management Systems
    5. Data Access -- Data Access Protocols
    6. Data Presentation -- How you present Views and Reports from Structured Data Sources
    7. Data Security -- Data Access Policies

    The items above basically showcase the very essence of the HTTP URI abstraction that drives HTTP based Linked Data; which is also the basic payload unit that underlies REST.

    Conclusion

    I simply hope that the next decade marks a period of broad appreciation and comprehension of Data Access, Integration, and Management issues on the parts of application developers, integrators, analysts, end-users, and decision makers. Remember, without structured Data we cannot produce or share Information, and without Information, we cannot produce or share Knowledge.

    Related

    ]]>
    One Technology That Will Rock 2010 (Update 1)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1601Mon, 01 Feb 2010 14:02:41 GMT12010-02-01T09:02:41-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    We have just released an Amazon EC2 based public Snapshot of DBpedia 3.4. Thus, you can now instantiate a personal and/or service specific variant of the DBpedia 3.4 Linked Data Space. Basically, you can replicate what we host, within minutes (as opposed to days). In addition, you no longer need to squabble --on an unpredictable basis with others-- for the infrastructure resources behind DBpedia's public instance, when using the SPARQL Endpoint, Faceted Search & Find Services, or HTML Browser Pages etc.

    How Does It Work?

    1. Instantiate a Virtuoso EC2 AMI (paid variety, which is aggressively priced at $49.99 for setup and $19.99 per month thereafter)
    2. Mount the shared DBpedia 3.4 public snapshot
    3. Start Virtuoso Server
    4. Start exploiting the DBpedia Linked Data Space.

    What Interfaces are exposed?

    1. SPARQL Endpoint
    2. Linked Data Viewer Pages (as you see in the public DBpedia instance)
    3. Faceted Search & Find UI and Web Services (REST or SOAP)
    4. All the inference rules for UMBEL, SUMO, YAGO, OpenCYC, and DBpedia-OWL data dictionaries
    5. Type Correlations Between DBpedia and Freebase
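
    Once your instance is up, the SPARQL Endpoint listed above (item 1) is immediately usable from code. A minimal sketch, assuming your EC2-hosted instance answers at the hypothetical hostname below and exposes Virtuoso's standard /sparql path:

        import json
        import urllib.parse
        import urllib.request

        endpoint = "http://my-dbpedia-ec2-instance.example.com/sparql"   # assumption
        query = """
            PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
            SELECT ?label
            WHERE { <http://dbpedia.org/resource/Berlin> rdfs:label ?label }
            LIMIT 5
        """
        params = urllib.parse.urlencode({
            "query": query,
            "format": "application/sparql-results+json",
        })
        with urllib.request.urlopen(f"{endpoint}?{params}") as resp:
            data = json.load(resp)

        for row in data["results"]["bindings"]:
            print(row["label"]["value"])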

    Enjoy!

    ]]>
    Personal and/or Service Specific Linked Data Spaces in the Cloud: DBpedia 3.4http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1599Mon, 01 Feb 2010 13:58:14 GMT12010-02-01T08:58:14-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    One of the real problems underlying incomprehension of the Linked Data value proposition stems from the layering of its value pyramid, especially when communicating with initially detached end-users.

    Note to Web Programmers: Linked Data is about Data (Wine) and not about Code (Fish). Thus, it isn't a "programmer only zone", far from it. More than anything else, it's inherently inclusive and spreads its participation net widely across: Data Architects, Data Integrators, Power Users, Knowledge Workers, Information Workers, Data Analysts, etc. Basically, everyone that can "click on a link" is invited to this particular party; remember, it is about "Linked Data" not "Linked Code", after all. :-)

    Problematic Value Pyramid Layering

    Here is an example of a Linked Data value pyramid that I am stumbling across --with some frequency-- these days (note: 1 being the pyramid apex):

    1. SPARQL Queries
    2. RDF Data Stores
    3. RDF Data Sets
    4. HTTP scheme URIs

    Basically, Linked Data deployment (assigning de-referencable HTTP URIs to DBMS records, their attributes, and attribute values [optionally] ) is occurring last. Even worse, this happens in the context of Linked Open Data oriented endeavors, resulting in nothing but confusion or inadvertent perpetuation of the overarching pragmatically challenged "Semantic Web" stereotype.

    As you can imagine, hitting SPARQL as your introduction to Linked Data is akin to hitting SQL as your introduction to Relational Database Technology, neither is an elevator-style value prop. relay mechanism.

    In the relational realm, killer demos always started with desktop productivity tools (spreadsheets, report-writers, SQL QBE tools, etc.) accessing relational data sources en route to unveiling the "Productivity" and "Agility" value proposition that such binding delivered, i.e., the desktop applications (clients) and the databases (servers) are distinct, but operate in a mutually beneficial manner, courtesy of data access standards such as ODBC (Open Database Connectivity).

    In the Linked Data realm, learning to embrace and extend best practices from the relational DBMS realm remains a challenge; a lot of this has to do with hangovers from a misguided perception that RDF databases will somehow completely replace RDBMS engines, rather than complement them. Thus, you have a counterproductive variant of NIH (Not Invented Here) in play, taking us to the dreaded realm of: Break the Pot and You Own It (exemplified by the 11+ year Semantic Web Project comprehension and appreciation odyssey).

    From my vantage point, here is how I believe the Linked Data value pyramid should be layered, especially when communicating the essential value prop.:

    1. HTTP URLs -- LINKs to documents (Reports) that users already appreciate, across the public Web and/or Intranets
    2. HTTP URIs -- typically not visually distinguishable from the URLs, so use the Data exposed by de-referencing a URL to show how each Data Item (Entity or Object) is uniquely identified by a Generic HTTP URI, and how clicking on the said URIs leads to more structured metadata bearing documents available in a variety of data representation formats, thereby enabling flexible data presentation (e.g., smarter HTML pages)
    3. SPARQL -- when a user appreciates the data representation and presentation dexterity of a Generic HTTP URI, they will be more inclined to drill down an additional layer to unravel how HTTP URIs mechanically deliver such flexibility
    4. RDF Data Stores -- at this stage the user is now interested in the data sources behind the Generic HTTP URIs, courtesy of a natural desire to tweak the data presented in the report; thus, you now have an engaged user ready to absorb the "How Generic HTTP URIs Pull This Off" message
    5. RDF Data Sets -- while attempting to make or tweak HTTP URIs, users become curious about the actual data loaded into the RDF Data Store, which is where the data sets used to create powerful Lookup Data Spaces come into play, such as those from the LOD constellation as exemplified by DBpedia (extractions from Wikipedia).

    Related

    ]]>
    Getting The Linked Data Value Pyramid Layers Right (Update #2)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1593Mon, 01 Feb 2010 14:02:14 GMT22010-02-01T09:02:14.000004-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
  • It isn't World Wide Web Specific (HTTP != World Wide Web)
  • It isn't Open Data Specific
  • It isn't about "Free" (Beer or Speech)
  • It isn't about Markup (so don't expect to grok it via "markup first" approach)
  • It's about Hyperdata - the use of HTTP and REST to deliver a powerful platform agnostic mechanism for Data Reference, Access, and Integration.
  • When trying to understand HTTP based Linked Data, especially if you're well versed in DBMS technology use (User, Power User, Architect, Analyst, DBA, or Programmer) think:

    • Open Database Connectivity (ODBC) without operating system, data model, or wire-protocol specificity or lock-in potential
    • Java Database Connectivity (JDBC) without programming language specificity
    • ADO.NET without .NET runtime specificity and .NET bound language specificity
    • OLE-DB without Windows operating system & programming language specificity
    • XMLA without XML format specificity - with Tabular and Multidimensional results formats expressible in a variety of data representation formats.
    • All of the above scoped to the Record rather than Container level, with Generic HTTP scheme URIs associated with each Record, Field, and Field value (optionally)

    Remember, the need for Data Access & Integration technology is the by-product of the following realities:

    1. Human curated data is ultimately dirty, because:
      • our thick thumbs, inattention, distractions, and general discomfort with typing, make typos prevalent
      • database engines exist for a variety of data models - Graph, Relational, Hierarchical;
      • within databases you have different record container/partition names e.g. Table Names;
      • within a database record container you have records that are really aspects of the same thing (different keys exist in a plethora of operational / line of business systems that expose aspects of the same entity e.g., customer data that spans Accounts, CRM, ERP application databases);
      • different field names (one database has "EMP" while another has "Employee") for the same record
    2. Units of measurement are driven by locale: the UK office wants to see sales in Pounds Sterling while the French office prefers Euros, etc.
    3. All of the above is subject to context halos, which can be quite granular re. sensitivity, e.g., staff travel between locations, which alters locales and roles; basically, profiles matter a lot.

    Related

    ]]>
    5 Very Important Things to Note about HTTP based Linked Datahttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1591Mon, 01 Feb 2010 14:00:56 GMT22010-02-01T09:00:56-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    We have just released an Amazon EC2 based public Snapshot of DBpedia 3.4. Thus, you can now instantiate a personal and/or service specific variant of the DBpedia 3.4 Linked Data Space. Basically, you can replicate what we host, within minutes (as opposed to days). In addition, you no longer need to squabble --on an unpredictable basis with others-- for the infrastructure resources behind DBpedia's public instance, when using the SPARQL Endpoint, Faceted Search & Find Services, or HTML Browser Pages etc.

    How Does It Work?

    1. Instantiate a Virtuoso EC2 AMI (paid variety, which is aggressively priced at $49.99 for setup and $19.99 per month thereafter)
    2. Mount the shared DBpedia 3.4 public snapshot
    3. Start Virtuoso Server
    4. Start exploiting the DBpedia Linked Data Space.

    What Interfaces are exposed?

    1. SPARQL Endpoint
    2. Linked Data Viewer Pages (as you see in the public DBpedia instance)
    3. Faceted Search & Find UI and Web Services (REST or SOAP)
    4. All the inference rules for UMBEL, SUMO, YAGO, OpenCYC, and DBpedia-OWL data dictionaries
    5. Type Correlations Between DBpedia and Freebase

    Enjoy!

    ]]>
    Personal and/or Service Specific Linked Data Spaces in the Cloud: DBpedia 3.4http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1589Mon, 16 Nov 2009 18:30:20 GMT12009-11-16T13:30:20-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Situation Analysis:

    Dr. Dre is one of the artists in the Linked Data Space we host for the BBC. He is also referenced in music oriented data spaces such as DBpedia, MusicBrainz and Last.FM (to name a few).

    Challenge:

    How do I obtain a holistic view of the entity "Dr. Dre" across the BBC, MusicBrainz, and Last.FM data spaces? We know the BBC publishes Linked Data, but what about Last.FM and MusicBrainz? Both of these data spaces only expose XML or JSON data via REST APIs.

    Solution:

    A simple 3-step Linked Data Meshup, courtesy of Virtuoso's in-built RDFizer Middleware, "the Sponger" (think: ODBC Driver Manager for the Linked Data Web), and its numerous Cartridges (think: ODBC Drivers for the Linked Data Web).

    Steps:

    1. Go to Last.FM and search using pattern: Dr. Dre (you will end up with this URL: http://www.last.fm/music/Dr.+Dre)
    2. Go to the Virtuoso powered BBC Linked Data Space home page and enter: http://bbc.openlinksw.com/about/html/http://www.last.fm/music/Dr.+Dre
    3. Go to the BBC Linked Data Space home page and type full text pattern (using default tab): Dr. Dre, then view Dr. Dre's metadata via the Statistics Link.
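
    Step 2 above can also be done programmatically: the proxy URI is simply the Sponger service prefix concatenated with the target Web resource URL. A minimal Python sketch (hostnames and paths are taken from the steps above and may have changed since this was written):

        import requests

        sponger_prefix = "http://bbc.openlinksw.com/about/html/"
        target = "http://www.last.fm/music/Dr.+Dre"

        proxy_uri = sponger_prefix + target
        page = requests.get(proxy_uri)
        print(page.status_code)
        print(page.text[:300])   # HTML rendering of the RDFized (Linked Data) view of the Last.FM page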

    What Happened?

    The following took place:

    1. Virtuoso Sponger sent an HTTP GET to Last.FM
    2. Distilled the "Artist" entity "Dr. Dre" from the page, and made a Linked Data graph
    3. Inverse Functional Property and sameAs reasoning handled the Meshup (augmented graph from a conjunctive query processing pipeline)
    4. Links for "Dr. Dre" across BBC (sameAs), Last.FM (seeAlso), via DBpedia URI.

    The new enhanced URI for Dr. Dre now provides a rich holistic view of the aforementioned "Artist" entity. This URI is usable anywhere on the Web for Linked Data Conduction :-)

    Related (as in NearBy)

    ]]>
    BBC Linked Data Meshup In 3 Stepshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1560Fri, 12 Jun 2009 20:38:34 GMT22009-06-12T16:38:34.000046-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    While exploring the Subject Headings Linked Data Space (LCSH) recently unveiled by the Library of Congress, I noticed that the URI for the subject heading: World Wide Web, exposes an "owl:sameAs" link to resource URI: "info:lc/authorities/sh95000541" -- in fact, a URI.URN that isn't HTTP protocol scheme based.

    The observations above triggered a discussion thread on Twitter involving @edsu, @iand, and moi. Naturally, it morphed into a live demonstration of human vs. machine interpretation of the claims expressed in the RDF graph.

    What makes this whole thing interesting?

    It showcases (in Man vs Machine style) the issue of unambiguously discerning the meaning of the owl:sameAs claim expressed in the LCSH Linked Data Space.

    Perspectives & Potential Confusion

    From the Linked Data perspective, it may spook a few people to see owl:sameAs values such as: "info:lc/authorities/sh95000541", that cannot be de-referenced using HTTP.

    It may confuse a few people or user agents that see URI de-referencing as not necessarily HTTP specific, thereby attempting to de-reference the URI.URN on the assumption that it's associated with a "handle system", for instance.

    It may even confuse RDFizer / RDFization middleware that use owl:sameAs as a data provider attribution mechanism via hint/nudge URI values derived from original content / data URI.URLs that de-reference to nothing e.g., an original resource URI.URL plus "#this" which produces URI.URN-URL -- think of this pattern as "owl:shameAs" in a sense :-)

    Unambiguously Discerning Meaning

    Simply bring OWL reasoning (inference rules and reasoners) into the mix, thereby negating human dialogue about interpretation, which ultimately unveils a mesh of orthogonal viewpoints. Remember, OWL is all about infrastructure that ultimately enables you to express yourself clearly, i.e., say what you mean, and mean what you say.

    Path to Clarity (using Virtuoso, its in-built Sponger Middleware, and Inference Engine):

    1. GET the data into the Virtuoso Quad store -- what the sponger does via its URIBurner Service (while following designated predicates such as owl:sameAs in case they point to other mesh-able data sources)
    2. Query the data in Quad Store with "owl:sameAs" inference rules enabled
    3. Repeat the last step with the inference rules excluded.

    Actual SPARQL Queries:
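
    For illustration, a pair of queries along these lines (not the originals; the endpoint hostname and the inference rule set name are assumptions) can be driven from Python, running the same SELECT with and without Virtuoso's inference pragma:

        from SPARQLWrapper import SPARQLWrapper, JSON

        endpoint = "http://example-virtuoso-host/sparql"               # hypothetical endpoint
        subject  = "<http://id.loc.gov/authorities/sh95000541#concept>"

        base_query = f"SELECT ?p ?o WHERE {{ {subject} ?p ?o }}"
        # "my_owl_rules" is an assumed rule set name, created beforehand on the server.
        with_inference = "DEFINE input:inference 'my_owl_rules'\n" + base_query

        for label, q in [("no inference", base_query), ("owl:sameAs inference", with_inference)]:
            sparql = SPARQLWrapper(endpoint)
            sparql.setQuery(q)
            sparql.setReturnFormat(JSON)
            rows = sparql.query().convert()["results"]["bindings"]
            print(label, "->", len(rows), "statements")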

    Observations:

    The SPARQL queries against the Graph generated and automatically populated by the Sponger reveal -- without human intervention -- that "info:lc/authorities/sh95000541" is just an alternative name for <http://id.loc.gov/authorities/sh95000541#concept>, and that the graph produced by LCSH is self-describing enough for an OWL reasoner to figure this all out, courtesy of the owl:sameAs property :-).

    Hopefully, this post also provides a simple example of how OWL facilitates "Reasonable Linked Data".

    Related

    ]]>
    Library of Congress & Reasonable Linked Datahttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1556Wed, 06 May 2009 18:26:15 GMT22009-05-06T14:26:15.000034-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Problem:

    Your Life, Profession, Web, and Internet do not need to become mutually exclusive due to "information overload".

    Solution:

    A platform or service that delivers a point of online presence that embodies the fundamental separation of: Identity, Data Access, Data Representation, Data Presentation, by adhering to Web and Internet protocols.

    How:

    Typical post installation (Local or Cloud) task sequence:

    1. Identify myself (happens automatically by way of registration)
    2. If in an LDAP environment, import accounts or associate system with LDAP for account lookup and authentication
    3. Identify Online Accounts (by fleshing out profile) which also connects system to online accounts and their data
    4. Use Profile for granular description (Biography, Interests, WishList, OfferList, etc.)
    5. Optionally upstream or downstream data to and from my online accounts
    6. Create content Tagging Rules
    7. Create rules for associating Tags with formal URIs
    8. Create automatic Hyperlinking Rules for reuse when new content is created (e.g. Blog posts)
    9. Exploit Data Portability virtues of RSS, Atom, OPML, RDFa, RDF/XML, and other formats for imports and exports
    10. Automatically tag imported content
    11. Use function-specific helper application UIs for domain specific data generation e.g. AddressBook (optionally use vCard import), Calendar (optionally use iCalendar import), Email, File Storage (use WebDAV mount with copy and paste or HTTP GET), Feed Subscriptions (optionally import RSS/Atom/OPML feeds), Bookmarking (optionally import bookmark.html or XBEL) etc..
    12. Optionally enable "Conversation" feature (today: Social Media feature) across the relevant application domains (manage conversations under covers using NNTP, the standard for this functionality realm)
    13. Generate HTTP based Entity IDs (URIs) for every piece of data in this burgeoning data space
    14. Use REST based APIs to perform CRUD tasks against my data (local and remote) (SPARQL, GData, Ubiquity Commands, Atom Publishing)
    15. Use OpenID, OAuth, FOAF+SSL, FOAF+SSL+OpenID for accessing data elsewhere
    16. Use OpenID, OAuth, FOAF+SSL, FOAF+SSL+OpenID for Controlling access to my data (Self Signed Certificate Generation, Browser Import of said Certificate & associated Private Key, plus persistence of Certificate to FOAF based profile data space in "one click")
    17. Have a simple UI for Entity-Attribute-Value or Subject-Predicate-Object arbitrary data annotations and creation since you can't pre model an "Open World" where the only constant is data flow
    18. Have my Personal URI (Web ID) as the single entry point for controlled access to my HTTP accessible data space
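
    As a small illustration of item 14 above (REST-based CRUD against your data space), here is a minimal sketch assuming a SPARQL 1.1 Update-capable endpoint with write privileges; the endpoint URL, graph name, and profile URI are illustrative assumptions:

        import urllib.parse
        import urllib.request

        endpoint = "http://my-data-space.example.com/sparql"      # assumption
        update = """
            PREFIX foaf: <http://xmlns.com/foaf/0.1/>
            INSERT DATA {
              GRAPH <http://my-data-space.example.com/people> {
                <http://my-data-space.example.com/people#me>
                    foaf:interest <http://dbpedia.org/resource/Linked_data> .
              }
            }
        """
        data = urllib.parse.urlencode({"update": update}).encode("utf-8")
        req = urllib.request.Request(endpoint, data=data, method="POST")
        with urllib.request.urlopen(req) as resp:
            print(resp.status)   # 200/204 on success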

    I've just outlined a snippet of the capabilities of the OpenLink Data Spaces platform, a platform built using OpenLink Virtuoso and architected to deliver open, platform-independent, multi-model data access and data management across heterogeneous data sources.

    All you need to remember is your URI when seeking to interact with your data space.

    Related

    1. Get Yourself a URI (Web ID) in 5 Minutes or Less!
    2. Various posts over the years about Data Spaces
    3. Future of Desktop Post
    4. Simplify My Life Post by Bengee Nowack
    ]]>
    Take N: Yet Another OpenLink Data Spaces Introductionhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1542Wed, 22 Apr 2009 19:32:06 GMT22009-04-22T15:32:06.000020-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Yesterday, I stumbled across an ebiz article by David Linthicum titled: RDF & Data Integration. Naturally, I read it, and while reading encountered a number of inaccuracies that compelled me to comment on the post.

    Today, I revisited the same article -- and to my shock and horror -- my comments do not exist (note: the site did accept my comments yesterday!). Even more frustrating for me, I now have to expend time I don't have re-writing my comments due to the depth and danger of the inaccuracies in this post re. RDF in general.

    Important Note to ebiz and David:

    Please look into what happened to my comments. It's too early for me to conclude that subjective censorship is at play on the Web -- which isn't a hard-copy journalistic platform where editors get away with such shenanigans. The Web is a sticky database, and outer joining is well and truly functional (meaning: exclusion and omission ultimately come back to bite via full outer join query results against the Web DB).

    By the way, if you publish the comments I made to the post (yesterday), I will add a note to this post, accordingly.

    Yes! David just confirmed to me via Twitter that this is yet another comment system related issue and absolutely no intent to censor etc. His words Twervatim :-)

    For sake of clarity, I've itemized the inaccuracies and applied my correction comments (inline) accordingly:

    Inaccuracy #1:

    Resource Description Framework (RDF), a part of the XML story, provides interoperability between applications that exchange information.

    Correction #1:

    RDF and XML are not inextricably linked in any way. RDF is, first and foremost, a Data Model (an EAV/CR-style Graph) with associated markup and data serialization formats that include: N3, Turtle, TriX, and RDF/XML.

    Inaccuracy #2:

    RDF uses XML to define a foundation for processing metadata and to provide a standard metadata infrastructure for both the Web and the enterprise.

    Correction #2:

    RDF/XML is an XML based markup and data serialization format. As a markup language it can be used for creating RDF model records/statements (using Subject, Predicate, Object or Entity, Attribute, Value). As a serialization format, it provides a mechanism for marshaling RDF data across data managers and data consumers.
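
    To make the model-vs-serialization distinction concrete, here is a minimal Python sketch (assuming the rdflib library) that takes one Subject-Predicate-Object statement and serializes it three ways; the person URI is purely illustrative:

        from rdflib import Graph, Literal, URIRef
        from rdflib.namespace import FOAF

        g = Graph()
        g.add((
            URIRef("http://example.org/people/alice"),   # Subject (Entity) - illustrative URI
            FOAF.name,                                   # Predicate (Attribute)
            Literal("Alice"),                            # Object (Value)
        ))

        print(g.serialize(format="turtle"))   # Turtle
        print(g.serialize(format="nt"))       # N-Triples
        print(g.serialize(format="xml"))      # RDF/XML -- just one serialization among several

    Same RDF statement, three different wire formats; RDF/XML is merely the XML-flavored one.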

    Inaccuracy #3:

    The difference between the two is that XML is used to transport data using a common format, while RDF is layered on top of XML defining a broad category of data.

    Correction #3:

    See earlier corrections above.

    Inaccuracy #4:

    When the XML data is declared to be of the RDF format, applications are then able to understand the data without understanding who sent it.

    Correction #4:

    You do not declare data to be "of RDF format". RDF isn't a format; it is a data model (as stated above). You can "up lift" or map data from XML to RDF (hierarchical-to-graph model mapping). Likewise, you can "down shift" or map data from RDF to XML (example: SPARQL SELECT query patterns "down shift" to SPARQL Results XML, which isn't RDF/XML, while keeping access to graphs via URIs or Entity Identifiers that reside within the serialization).

    Inaccuracy #5:

    RDF extends the XML model and syntax to be specified for describing either resources or a collection of information. (XML points to a resource in order to scope and uniquely identify a set of properties known as the schema.).

    Correction #5:

    See earlier comments.

    The single accurate paragraph in this ebiz article lies right at the end and it states the following:

    "I've always thought RDF has been underutilized for data integration, and it's really an old standard. Now that we're focused on both understanding and integrating data, perhaps RDF should make a comeback."

    Related:

    ]]>
    ebiz RDF & Data Integration Article Retorthttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1522Thu, 29 Jan 2009 21:25:58 GMT22009-01-29T16:25:58-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    As the world works its way through a "once in a generation" economic crisis, the long overdue downgrade of the RDBMS from its pivotal position at the apex of the data access and data management pyramid is nigh.

    What is the Data Access, and Data Management Value Pyramid?

    As depicted below, this is a top-down view of the data access and data management value chain. The term "apex" simply indicates value primacy, which takes the form of a data access API based entry point into a DBMS realm -- aligned to an underlying data model. Examples of data access APIs include: Native Call Level Interfaces (CLIs), ODBC, JDBC, ADO.NET, OLE-DB, XMLA, and Web Services.

    See: AVF Pyramid Diagram.

    The degree to which ad-hoc views of data managed by a DBMS can be produced and dispatched to relevant data consumers (e.g. people), without compromising concurrency, data durability, and security, collectively determine the "Agility Value Factor" (AVF) of a given DBMS. Remember, agility as the cornerstone of environmental adaptation is as old as the concept of evolution, and intrinsic to all pursuits of primacy.

    In simpler business oriented terms, look at AVF as the degree to which DBMS technology affects the ability to effectively implement "Market Leadership Discipline" along the following pathways: innovation, operation excellence, or customer intimacy.

    Why has RDBMS Primacy Endured?

    Historically, at least since the late '80s, the RDBMS genre of DBMS has consistently offered the highest AVF relative to other DBMS genres en route to primacy within the value pyramid. The desire to improve on paper reports and spreadsheets is basically what DBMS technology has fundamentally addressed to date, even though conceptual level interaction with data has never been its forte.

    See: RDBMS Primacy Diagram.

    For more than 10 years -- at the very least -- the limitations of the traditional RDBMS in the realm of conceptual level interaction with data across diverse data sources and schemas (enterprise, Web, and Internet) have been crystal clear to many RDBMS technology practitioners, as indicated by some of the quotes excerpted below:

    "Future of Database Research is excellent, but what is the future of data?"

    "..it is hard for me to disagree with the conclusions in this report. It captures exactly the right thoughts, and should be a must read for everyone involved in the area of databases and database research in particular."

    -- Dr. Anant Jhingran, CTO, IBM Information Management Systems, commenting on the 2007 RDBMS technology retreat attended by a number of key DBMS technology pioneers and researchers.

    "One size fits all: A concept whose time has come and gone

    1. They are direct descendants of System R and Ingres and were architected more than 25 years ago
    2. They are advocating "one size fits all"; i.e. a single engine that solves all DBMS needs.

    -- Prof. Michael Stonebraker, one of the founding fathers of the RDBMS industry.

    Until this point in time, the requisite confluence of "circumstantial pain" and "open standards" based technology, required to enable an objective "compare and contrast" of RDBMS engine virtues and viable alternatives, hasn't occurred. Thus, the RDBMS has endured its position of primacy, albeit on a "one size fits all" basis.

    Circumstantial Pain

    As mentioned earlier, we are in the midst of an economic crisis that is ultimately about a consistent inability to connect dots across a substrate of interlinked data sources that transcend traditional data access boundaries with high doses of schematic heterogeneity. Ironically, in the era of the dot-com, we haven't been able to make meaningful connections between relevant "real-world things" that extend beyond primitive data hosted in database tables and content management style document containers; we've struggled to achieve this in the most basic sense, let alone evolve our ability to connect in line with the exponential rate at which the Internet & Web are spawning "universes of discourse" (data spaces) that emanate from user activity (within the enterprise and across the Internet & Web). In a nutshell, we haven't been able to upgrade our interaction with data such that "conceptual models" and resulting "context lenses" (or facets) become concrete; by this I mean: real-world entity interaction making its way into the computer realm, as opposed to the impedance we all suffer today when we transition from conceptual model interaction (real-world) to logical model interaction (when dealing with RDBMS based data access and data management).

    Here are some simple examples of what I can only best describe as: "critical dots unconnected", resulting from an inability to interact with data conceptually:

    Government (Globally) -

    Financial regulatory bodies couldn't effectively discern that a Credit Default Swap is an Insurance policy in all but literal name. And in not doing so the cost of an unregulated insurance policy laid the foundation for exacerbating the toxicity of fatally flawed mortgage backed securities. Put simply: a flawed insurance policy was the fallback on a toxic security that financiers found exotic based on superficial packaging.

    Enterprises -

    Banks still don't understand that capital really does exist in tangible and intangible forms, with the intangible being the variant that is inherently dynamic. For example, a tech company's intellectual capital far exceeds the value of fixtures, fittings, and buildings, but you'd be amazed to find that in most cases this vital asset has no significant value when banks get down to the nitty gritty of debt collateral; instead, a buffer of flawed securitization has occurred atop a borderline static asset class covering the aforementioned buildings, fixtures, and fittings.

    In the general enterprise arena, IT executives have continued to "rip and replace" existing technology without ever effectively addressing the timeless inability to connect data across disparate data silos generated by internal enterprise applications, let alone the broader need to mesh data from the inside with external data sources. No correlations are made between the growth of buzzwords and the compounding nature of data integration challenges. It's 2009 and only a minuscule number of executives dare fantasize about being anywhere within reach of the "relevant information at your fingertips" vision.

    Looking more holistically at data interaction in general, whether you interact with data in the enterprise space (i.e., at work) or on the Internet or Web, you ultimately are delving into a mishmash of disparate computer systems, applications, services (Web or SOA), and databases (of the RDBMS variety in a majority of cases) associated with a plethora of disparate schemas. Yet even today "rip and replace" is still the norm pushed by most vendors, pitting one monoculture against another as exemplified by irrelevances such as FOSS/LAMP vs. Commercial or Web vs. Enterprise, when none of this matters if the data access and integration issues are recognized, let alone addressed (see: Applications are Like Fish and Data Like Wine).

    Like the current credit-crunch, exponential growth of data originating from disparate application databases and associated schemas, within shrinking processing time frames, has triggered a rethinking of what defines data access and data management value today en route to an inevitable RDBMS downgrade within the value pyramid.

    Technology

    There have been many attempts to address real-world modeling requirements across the broader DBMS community from Object Databases to Object-Relational Databases, and more recently the emergence of simple Entity-Attribute-Value model DBMS engines. In all cases failure has come down to the existence of one or more of the following deficiencies, across each potential alternative:

    1. Query language standardization - nothing close to SQL standardization
    2. Data Access API standardization - nothing close to ODBC, JDBC, OLE-DB, or ADO.NET
    3. Wire protocol standardization - nothing close to HTTP
    4. Distributed Identity infrastructure - nothing close to the non-repudiable digital Identity that FOAF+SSL accords
    5. Use of Identifiers as network based pointers to data sources - nothing close to RDF based Linked Data
    6. Negotiable data representation - nothing close to Mime and HTTP based Content Negotiation
    7. Scalability especially in the era of Internet & Web scale.

    Entity-Attribute-Value with Classes & Relationships (EAV/CR) data models

    A common characteristic shared by all post-relational database management systems (from Object-Relational to pure Object) is an orientation towards variations of EAV/CR based data models. Unfortunately, all efforts in the EAV/CR realm have typically suffered from at least one of the deficiencies listed above. In addition, the same "one DBMS model fits all" approach that lies at the heart of the RDBMS downgrade also exists in the EAV/CR realm.

    What Comes Next?

    The RDBMS is not going away (ever), but its era of primacy -- by virtue of its placement at the apex of the data access and data management value pyramid -- is over! I make this bold claim for the following reasons:

    1. The Internet aided "Global Village" has brought "Open World" vs "Closed World" assumption issues to the fore e.g., the current global economic crisis remains centered on the inability to connect dots across "Open World" and "Closed World" data frontiers
    2. Entity-Attribute-Value with Classes & Relationships (EAV/CR) based DBMS models are more effective when dealing with disparate data associated with disparate schemas, across disparate DBMS engines, host operating systems, and networks.

    Based on the above, it is crystal clear that a different kind of DBMS -- one with higher AVF relative to the RDBMS -- needs to sit atop today's data access and data management value pyramid. The characteristics of this DBMS must include the following:

    1. Every item of data (Datum/Entity/Object/Resource) has Identity
    2. Identity is achieved via Identifiers that aren't locked at the DBMS, OS, Network, or Application levels
    3. Object Identifiers and Object values are independent (extricably linked by association)
    4. Object values should be de-referencable via Object Identifier
    5. Representation of de-referenced value graph (entity, attributes, and values mesh) must be negotiable (i.e. content negotiation)
    6. Structured query language must provide mechanism for Creation, Deletion, Updates, and Querying of data objects
    7. Performance & Scalability across "Closed World" (enterprise) and "Open World" (Internet & Web) realms.

    Quick recap, I am not saying that RDBMS engine technology is dead or obsolete. I am simply stating that the era of RDBMS primacy within the data access and data management value pyramid is over.

    The problem domain (conceptual model views over heterogeneous data sources) at the apex of the aforementioned pyramid has simply evolved beyond the natural capabilities of the RDBMS which is rooted in "Closed World" assumptions re., data definition, access, and management. The need to maintain domain based conceptual interaction with data is now palpable at every echelon within our "Global Village" - Internet, Web, Enterprise, Government etc.

    It is my personal view that an EAV/CR model based DBMS, with support for the seven items enumerated above, can trigger the long anticipated RDBMS downgrade. Such a DBMS would be inherently multi-model, because you would need the best of RDBMS and EAV/CR model engines in a single product, with in-built support for HTTP and other Internet protocols, in order to effectively address data representation and serialization issues.

    EAV/CR Oriented Data Access & Management Technology

    Examples of contemporary EAV/CR frameworks that provide concrete conceptual layers for data access and data management currently include:

    The frameworks above provide the basis for a revised AVF pyramid, as depicted below, that reflects today's data access and management realities i.e., an Internet & Web driven global village comprised of interlinked distributed data objects, compatible with "Open World" assumptions.

    See: New EAV/CR Primacy Diagram.

    Related

    ]]>
    Time for RDBMS Primacy Downgrade is Nigh! (No Embedded Images Edition - Update 1)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1520Tue, 17 Mar 2009 15:50:58 GMT22009-03-17T11:50:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    As the world works its way through a "once in a generation" economic crisis, the long overdue downgrade of the RDBMS from its pivotal position at the apex of the data access and data management pyramid is nigh.

    What is the Data Access, and Data Management Value Pyramid?

    As depicted below, this is a top-down view of the data access and data management value chain. The term "apex" simply indicates value primacy, which takes the form of a data access API based entry point into a DBMS realm -- aligned to an underlying data model. Examples of data access APIs include: Native Call Level Interfaces (CLIs), ODBC, JDBC, ADO.NET, OLE-DB, XMLA, and Web Services.

    See: AVF Pyramid Diagram.

    The degree to which ad-hoc views of data managed by a DBMS can be produced and dispatched to relevant data consumers (e.g. people), without compromising concurrency, data durability, and security, collectively determine the "Agility Value Factor" (AVF) of a given DBMS. Remember, agility as the cornerstone of environmental adaptation is as old as the concept of evolution, and intrinsic to all pursuits of primacy.

    In simpler business oriented terms, look at AVF as the degree to which DBMS technology affects the ability to effectively implement "Market Leadership Discipline" along the following pathways: innovation, operation excellence, or customer intimacy.

    Why has RDBMS Primacy Endured?

    Historically, at least since the late '80s, the RDBMS genre of DBMS has consistently offered the highest AVF relative to other DBMS genres en route to primacy within the value pyramid. The desire to improve on paper reports and spreadsheets is basically what DBMS technology has fundamentally addressed to date, even though conceptual level interaction with data has never been its forte.

    See: RDBMS Primacy Diagram.

    For more than 10 years -- at the very least -- the limitations of the traditional RDBMS in the realm of conceptual level interaction with data across diverse data sources and schemas (enterprise, Web, and Internet) have been crystal clear to many RDBMS technology practitioners, as indicated by some of the quotes excerpted below:

    "Future of Database Research is excellent, but what is the future of data?"

    "..it is hard for me to disagree with the conclusions in this report. It captures exactly the right thoughts, and should be a must read for everyone involved in the area of databases and database research in particular."

    -- Dr. Anant Jhingran, CTO, IBM Information Management Systems, commenting on the 2007 RDBMS technology retreat attended by a number of key DBMS technology pioneers and researchers.

    "One size fits all: A concept whose time has come and gone

    1. They are direct descendants of System R and Ingres and were architected more than 25 years ago
    2. They are advocating "one size fits all"; i.e. a single engine that solves all DBMS needs.

    -- Prof. Michael Stonebraker, one of the founding fathers of the RDBMS industry.

    Until this point in time, the requisite confluence of "circumstantial pain" and "open standards" based technology, required to enable an objective "compare and contrast" of RDBMS engine virtues and viable alternatives, hasn't occurred. Thus, the RDBMS has endured its position of primacy, albeit on a "one size fits all" basis.

    Circumstantial Pain

    As mentioned earlier, we are in the midst of an economic crisis that is ultimately about a consistent inability to connect dots across a substrate of interlinked data sources that transcend traditional data access boundaries with high doses of schematic heterogeneity. Ironically, in the era of the dot-com, we haven't been able to make meaningful connections between relevant "real-world things" that extend beyond primitive data hosted in database tables and content management style document containers; we've struggled to achieve this in the most basic sense, let alone evolve our ability to connect in line with the exponential rate at which the Internet & Web are spawning "universes of discourse" (data spaces) that emanate from user activity (within the enterprise and across the Internet & Web). In a nutshell, we haven't been able to upgrade our interaction with data such that "conceptual models" and resulting "context lenses" (or facets) become concrete; by this I mean: real-world entity interaction making its way into the computer realm, as opposed to the impedance we all suffer today when we transition from conceptual model interaction (real-world) to logical model interaction (when dealing with RDBMS based data access and data management).

    Here are some simple examples of what I can only best describe as: "critical dots unconnected", resulting from an inability to interact with data conceptually:

    Government (Globally) -

    Financial regulatory bodies couldn't effectively discern that a Credit Default Swap is an Insurance policy in all but literal name. And in not doing so the cost of an unregulated insurance policy laid the foundation for exacerbating the toxicity of fatally flawed mortgage backed securities. Put simply: a flawed insurance policy was the fallback on a toxic security that financiers found exotic based on superficial packaging.

    Enterprises -

    Banks still don't understand that capital really does exist in tangible and intangible forms, with the intangible being the variant that is inherently dynamic. For example, a tech company's intellectual capital far exceeds the value of fixtures, fittings, and buildings, but you'd be amazed to find that in most cases this vital asset has no significant value when banks get down to the nitty gritty of debt collateral; instead, a buffer of flawed securitization has occurred atop a borderline static asset class covering the aforementioned buildings, fixtures, and fittings.

    In the general enterprise arena, IT executives have continued to "rip and replace" existing technology without ever effectively addressing the timeless inability to connect data across disparate data silos generated by internal enterprise applications, let alone the broader need to mesh data from the inside with external data sources. No correlations are made between the growth of buzzwords and the compounding nature of data integration challenges. It's 2009 and only a minuscule number of executives dare fantasize about being anywhere within reach of the "relevant information at your fingertips" vision.

    Looking more holistically at data interaction in general, whether you interact with data in the enterprise space (i.e., at work) or on the Internet or Web, you ultimately are delving into a mishmash of disparate computer systems, applications, services (Web or SOA), and databases (of the RDBMS variety in a majority of cases) associated with a plethora of disparate schemas. Yet even today "rip and replace" is still the norm pushed by most vendors, pitting one monoculture against another as exemplified by irrelevances such as FOSS/LAMP vs. Commercial or Web vs. Enterprise, when none of this matters if the data access and integration issues are recognized, let alone addressed (see: Applications are Like Fish and Data Like Wine).

    Like the current credit-crunch, exponential growth of data originating from disparate application databases and associated schemas, within shrinking processing time frames, has triggered a rethinking of what defines data access and data management value today en route to an inevitable RDBMS downgrade within the value pyramid.

    Technology

    There have been many attempts to address real-world modeling requirements across the broader DBMS community from Object Databases to Object-Relational Databases, and more recently the emergence of simple Entity-Attribute-Value model DBMS engines. In all cases failure has come down to the existence of one or more of the following deficiencies, across each potential alternative:

    1. Query language standardization - nothing close to SQL standardization
    2. Data Access API standardization - nothing close to ODBC, JDBC, OLE-DB, or ADO.NET
    3. Wire protocol standardization - nothing close to HTTP
    4. Distributed Identity infrastructure - nothing close to the non-repudiable digital Identity that FOAF+SSL accords
    5. Use of Identifiers as network based pointers to data sources - nothing close to RDF based Linked Data
    6. Negotiable data representation - nothing close to Mime and HTTP based Content Negotiation
    7. Scalability especially in the era of Internet & Web scale.

    Entity-Attribute-Value with Classes & Relationships (EAV/CR) data models

    A common characteristic shared by all post-relational database management systems (from Object-Relational to pure Object) is an orientation towards variations of EAV/CR based data models. Unfortunately, all efforts in the EAV/CR realm have typically suffered from at least one of the deficiencies listed above. In addition, the same "one DBMS model fits all" approach that lies at the heart of the RDBMS downgrade also exists in the EAV/CR realm.

    What Comes Next?

    The RDBMS is not going away (ever), but its era of primacy -- by virtue of its placement at the apex of the data access and data management value pyramid -- is over! I make this bold claim for the following reasons:

    1. The Internet aided "Global Village" has brought "Open World" vs "Closed World" assumption issues to the fore e.g., the current global economic crisis remains centered on the inability to connect dots across "Open World" and "Closed World" data frontiers
    2. Entity-Attribute-Value with Classes & Relationships (EAV/CR) based DBMS models are more effective when dealing with disparate data associated with disparate schemas, across disparate DBMS engines, host operating systems, and networks.

    Based on the above, it is crystal clear that a different kind of DBMS -- one with higher AVF relative to the RDBMS -- needs to sit atop today's data access and data management value pyramid. The characteristics of this DBMS must include the following:

    1. Every item of data (Datum/Entity/Object/Resource) has Identity
    2. Identity is achieved via Identifiers that aren't locked at the DBMS, OS, Network, or Application levels
    3. Object Identifiers and Object values are independent (extricably linked by association)
    4. Object values should be de-referencable via Object Identifier
    5. Representation of de-referenced value graph (entity, attributes, and values mesh) must be negotiable (i.e. content negotiation)
    6. Structured query language must provide mechanism for Creation, Deletion, Updates, and Querying of data objects
    7. Performance & Scalability across "Closed World" (enterprise) and "Open World" (Internet & Web) realms.

    Quick recap, I am not saying that RDBMS engine technology is dead or obsolete. I am simply stating that the era of RDBMS primacy within the data access and data management value pyramid is over.

    The problem domain (conceptual model views over heterogeneous data sources) at the apex of the aforementioned pyramid has simply evolved beyond the natural capabilities of the RDBMS which is rooted in "Closed World" assumptions re., data definition, access, and management. The need to maintain domain based conceptual interaction with data is now palpable at every echelon within our "Global Village" - Internet, Web, Enterprise, Government etc.

    It is my personal view that an EAV/CR model based DBMS, with support for the seven items enumerated above, can trigger the long anticipated RDBMS downgrade. Such a DBMS would be inherently multi-model, because you would need the best of RDBMS and EAV/CR model engines in a single product, with in-built support for HTTP and other Internet protocols, in order to effectively address data representation and serialization issues.

    EAV/CR Oriented Data Access & Management Technology

    Examples of contemporary EAV/CR frameworks that provide concrete conceptual layers for data access and data management currently include:

    The frameworks above provide the basis for a revised AVF pyramid, as depicted below, that reflects today's data access and management realities i.e., an Internet & Web driven global village comprised of interlinked distributed data objects, compatible with "Open World" assumptions.

    Related

    ]]>
    The Time for RDBMS Primacy Downgrade is Nigh!http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1519Wed, 03 Jun 2009 22:09:58 GMT72009-06-03T18:09:58.000001-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    As I cannot post directly to Glenn's blog titled: This is Not the Near Future (Either), I have to basically respond to him here, in blog post form :-(

    What is our "Search" and "Find" demonstration about? It is about how you use the "Description" of "Things" to unambiguously locate things in a database at Web Scale.

    To our perpetual chagrin, we are trying to demonstrate an engine -- not UI prowess -- but the immediate response is to jump to the UI aesthetics.

    Google, Yahoo etc. offer a simple input form for full text search patterns, and they have a processing window for completing full text searches across Web Content indexed on their servers. Once the search patterns are processed, you get a page ranked result set (basically a collection of Web pages that claim/state: we found N pages out of a document corpus of about M indexed pages).

    Note: the estimate aspect of traditional search results is like "advertising small print"; the user lives with the illusion that all possible documents on the Web (or even the Internet) have been searched, whereas in reality 25% of the possible total is a major stretch, since the Web and Internet are fractal, scale-free networks, inherently growing at exponential rates "ad infinitum" across boundless dimensions of human comprehension.

    The power of Linked Data ultimately comes down to the fact that the user constructs the path to what they seek via the properties of the "Things" in question. The routes are not hardwired since URI de-referencing (follow your nose pattern) is available to Linked Data aware query engines and crawlers.

    We are simply trying to demonstrate how you can combine the best of full text search with the best of structured querying while reusing familiar interaction patterns from Google/Yahoo. Thus, you start with full text search, get all the entities associated with the pattern, and then use the entity types or entity properties to find what you seek.
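
    As a rough illustration of that two-step pattern, here is a minimal SPARQL sketch using Virtuoso's bif:contains full text extension (shown elsewhere on this blog); the class chosen for the type filter is illustrative:

    PREFIX foaf: <http://xmlns.com/foaf/0.1/>

    # Step 1: full text search narrows the corpus to entities with matching literals;
    # Step 2: an entity type (class) filter disambiguates the hits
    SELECT DISTINCT ?s ?label
    WHERE {
      ?s a foaf:Person .
      ?s ?p ?label .
      ?label bif:contains '"glenn mcdonald"' .
    }
    LIMIT 25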

    You state in your post:

    "To state the obvious caveat, the claim OpenLink is making about this demo is not that it delivers better search-term relevance, therefore the ranking of searching results is not the main criteria on which it is intended to be assessed."

    Correct.

    "On the other hand, one of the things they are bragging about is that their server will automatically cut off long-running queries. So how do you like your first page of results?".

    Not exactly correct. We are performing aggregates using a configurable interactive time factor. Example: tell me how many entities of type: Person, with interest: Semantic Web, exist in this database within 2 seconds. Also understand that you could retry the same query and get different numbers within the same interactive time factor. It isn't your basic "query cut-off".

    "And on the other other hand, the big claim OpenLink is making about this demo is that the aggregate experience of using it is better than the aggregate experience of using "traditional" search. So go ahead, use it. If you can."

    Yes, "Microsoft" was a poor example for sure, the example could have been pattern: "glenn mcdonald", which should demonstrate the fundamental utility of what we are trying to demonstrate i.e., entity disambiguation courtesy of entity properties and/or entity type filtering.

    Compare Google's results for: Glenn McDonald with those from our demo (which disambiguates "Glenn McDonald" via associated properties and/or types), assuming we both agree that your Web Site or Blog Home isn't the center of your entity graph or personal data space (i.e., data about you); so getting your home page at the top of the Google page rank offers limited value, in reality.

    What are we bragging about? A little more than what you attempt to explain. Yes, we are showing that we can find stuff within a processing window, but understand the following:

    • Processing Time Window (or interactive time) is configurable
    • Data Corpus is a Billion+ Triples (from Billion Triples Challenge Data Set)
    • SPARQL doesn't have Aggregation capabilities by default (we have implemented SPARQL-BI to deliver aggregates for analytics against large data sets, we even handle the TPC-H industry standard benchmark with SPARQL-BI)
    • Paging isn't possible without aggregates, and doing aggregates on a Billion+ triples as part of a query processing cycle isn't trivial stuff (otherwise it would be everywhere due to inherent and obvious necessity).
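
    For a sense of the kind of aggregation involved, here is a hedged sketch using SPARQL aggregate syntax (an extension at the time this was written, delivered via SPARQL-BI; now standard in SPARQL 1.1), again with Virtuoso's bif:contains full text predicate:

    # Count matching entities per class so that facets and result pages can be computed
    SELECT ?type (COUNT(DISTINCT ?s) AS ?members)
    WHERE {
      ?s ?p ?o .
      ?o bif:contains 'Microsoft' .
      ?s a ?type .
    }
    GROUP BY ?type
    ORDER BY DESC(?members)
    LIMIT 20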

    I hope I've clarified what's going on with our demo. If not, pose your challenge via examples and I will respond with solutions or simply cry out loud: "no mas!".

    As for your "Mac OX X Leopard" comments, I can only say this: I emphasized that this is a demo, the data is pretty old, and the input data has issues (i.e. some of the input data is bad as your example shows). The purpose of this demo is not about the text per se., it's about the size of the data corpus and faceted querying. We are going to have the entire LOD Cloud loaded into the real thing, and in addition to that our Sponger Middleware will be enabled, and then you can take issue with data quality as per your reference to "Cyndi Lauper" (btw - it takes one property filter to find information about her quickly using "dbpprop:name" after filtering for properties with text values).

    Of all things, this demo had nothing to do with UI and information presentation aesthetics. It was all about combining full text search and structured queries (SPARQL behind the scenes) against a huge data corpus en route to solving challenges associated with faceted browsing over large data sets. We have built a service that resides inside Virtuoso. The Service is naturally of the "Web Service" variety and can be used from any consumer / client environment that speaks HTTP (directly or indirectly).

    To be continued ...

    ]]>
    In Response to: This is Not the Future (Update #3) http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1518Thu, 22 Jan 2009 00:02:47 GMT62009-01-21T19:02:47-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    The first salvo of what we've been hinting about re. server side faceted browsing over Unlimited Data within configurable Interactive Time-frames is now available for experimentation at: http://b3s.openlinksw.com/fct/facet.vsp.

    Simple example / demo:

    Enter search pattern: Microsoft

    You will get the usual result from a full text pattern search i.e., hits and text excerpts with matching patterns in boldface. This first step is akin to throwing your net out to sea while fishing.

    Now you have your catch, what next? Basically, this is where traditional text search value ends, since regex or XPath/XQuery offer little when the structure of literal text is the key to filtering or categorization based analysis of real-world entities. Naturally, this is where the value of structured querying of linked data starts, as you seek to use entity descriptions (combinations of attribute and relationship properties) to "Find relevant things".

    Continuing with the demo.

    Click on "Properties" link within the Navigation section of the browser page which results in a distillation and aggregation of the properties of the entities associated with the search results. Then use the "Next" link to page through the properties until to find the properties that best match what you seek. Note, this particular step is akin to using the properties of the catch (using fishing analogy) for query filtering, with each subsequent property link click narrowing your selection further.

    Using property based filtering is just one perspective on the data corpus associated with the text search pattern; you can alter perspectives by clicking on the "Class" link so that you can filter your search results by entity type. Of course, in a number of scenarios you would use a combination of entity type and entity property filters to locate the entities of interest to you.
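
    For illustration, the property distillation step above might boil down to a query along these lines (a hedged sketch; the actual demo uses a Virtuoso PL function rather than a hand-written query, and the text pattern is just an example):

    # Distill the properties carried by entities that match the text pattern,
    # so they can be offered as filters (the "Properties" view in the demo)
    SELECT DISTINCT ?property
    WHERE {
      ?s ?p ?literal .
      ?literal bif:contains 'Microsoft' .
      ?s ?property ?value .
    }
    LIMIT 50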

    A Few Notes about this demo instance of Virtuoso:

    • Lookup Data Size (Local Linked Data Corpus): 2 Billion+ Triples (entity-attribute-value tuples)
    • This is a *temporary* teaser / precursor to the LOD (Linking Open Data Cloud) variant of our Linked Data driven "Search" & "Find" service; we decided to implement this functionality prior to commissioning a larger and more up to date instance based on the entire LOD Cloud
    • The browser is simply using a Virtuoso PL function that also exists in Web Service form for loose binding by 3rd parties that have a UI orientation and focus (our UI is deliberately bare boned).
    • The properties and entity types (classes) links expose formal definitions and dictionary provenance information materialized in an HTML page (of course your browser or any other HTTP user agent can negotiate alternative representations of this descriptive information)
    • UMBEL based inference rules are enabled, giving you a live and simple demonstration of the virtues of Linked Data Dictionaries. For example: click on the description link of any property or class from the foaf (friend-of-a-friend vocabulary), sioc (semantically-interlinked-online-communities ontology), mo (music ontology), or bibo (bibliographic data ontology) namespaces to see how the data from these lower level vocabularies or ontologies is meshed with OpenCyc's upper level ontology.

    Related

    ]]>
    A Linked Data Web Approach To Semantic "Search" & "Find" (Updated)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1517Sat, 10 Jan 2009 18:55:56 GMT22009-01-10T13:55:56.000001-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    What is Neurocommons?

    Excerpted from the project home page:

    The NeuroCommons project seeks to make all scientific research materials - research articles, annotations, data, physical materials - as available and as useable as they can be. We do this by both fostering practices that render information in a form that promotes uniform access by computational agents - sometimes called "interoperability". We want knowledge sources to combine meaningfully, enabling semantically precise queries that span multiple information sources.

    In a nutshell, a great project that makes practical use of Linked Data Web technology in the areas of computational biology and neuroscience.

    What is Virtuoso and Neurocommons AMI for EC2?

    A pre-installed and fully tuned edition of Virtuoso that includes a fully configured Neurocommons Knowledgebase (in RDF Linked Data form) on Amazon's EC2 Cloud platform.

    Benefits?

    Generally, it provides a no-hassles mechanism for instantiating personal-, organization-, or service-specific instances of a very powerful research knowledgebase within approximately 1.15 hours, compared to the alternative of a lengthy rebuild from RDF source data that takes 14 hours or more, depending on machine hardware configuration and host operating system resources.

    Features:

    1. Neurocommons public instance functionality replica (re. RDF and (X)HTML resource description representations & SPARQL endpoint)
    2. Local URI de-referencing (so no contention with public endpoint) as part of the RDF Linked Data Deployment
    3. Fully tuned Virtuoso instance for neurocommons knowledgebase.

    Installation Guide

    Simply read the Virtuoso+NeuroCommons EC2 AMI installation guide.

    Related

    ]]>
    Virtuoso+Neurocommons EC2 AMI released! (Update - 1)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1491Thu, 11 Dec 2008 03:48:49 GMT32008-12-10T22:48:49-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    We are just about done with an end-to-end workflow pattern that enables reconstitution of DBpedia 3.2 instances in the Clouds courtesy of Virtuoso and EC2.

    Basically, this is how it works:

    1. Instantiate a Virtuoso EC2 AMI (paid variety)
    2. Install the special EC2 extensions VAD (ec2ext_dav.vad) via the Conductor UI or iSQL
    3. Restore the Virtuoso+DBpedia backup from our S3 bucket
    4. After approx. 1 hr, you will have a complete DBpedia replica in your own data space on the Linked Data Web.

    DBpedia replica implies:

    1. SPARQL Endpoint
    2. Linked Data Viewer Pages (as you see in the public DBpedia instance)
    3. All requisite re-write rules for URI de-referencing and attribution (i.e., low cost triples that link back to the main DBpedia instance using terms from our little Attribution Ontology)
    4. All the inference rules for UMBEL, YAGO, OpenCYC, and DBpedia-OWL data dictionaries
    5. All Full Text Indexes
    6. All Bitmap Indexes.
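
    Once the restore completes, a quick sanity check against your replica's local SPARQL endpoint might look like the following (a hedged sketch; the endpoint host depends on your instance, and the COUNT aggregate is a Virtuoso extension to the SPARQL of that era):

    # Issued against http://<your-ec2-host>:8890/sparql
    SELECT (COUNT(*) AS ?triples)
    FROM <http://dbpedia.org>
    WHERE { ?s ?p ?o }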

    Tomorrow is the official go live day (due to last minute price changes), but you can instantiate a paid Virtuoso AMI starting now :-)

    To be continued...

    ]]>
    Your Personal Edition of DBpedia in the Cloudshttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1486Tue, 25 Nov 2008 23:55:55 GMT12008-11-25T18:55:55-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Thanks to RDF and Linked Data, it's becoming a lot easier for us to explain and reveal the depth of the OpenLink technology portfolio.

    Here is a look at our offerings by product family:

    As you explore the Linked Data graph exposed via our product portfolio, I expect you to experience, or at least spot, the virtuous potential of high SDQ (Serendipitous Discovery Quotient) courtesy of Linked Data, which is Web 3.0's answer to SEO. For instance, notice how the Database, Operating System, and Processor family paths in the product portfolio graph (data network) unveil a lot more about OpenLink Software than meets the proverbial "eye" :-)

    ]]>
    Dog-fooding: Linked Data and OpenLink Product Portfoliohttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1463Fri, 24 Oct 2008 22:13:50 GMT12008-10-24T18:13:50-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Runtime hosting is a functionality realm of Virtuoso that is sometimes easily overlooked. In this post I want to provide a simple no-hassles HOWTO guide for installing Virtuoso on Windows (32 or 64 Bit), Mac OS X (Universal or Native 64 Bit), and Linux (32 or 64 Bit). The installation guide also covers the instantiation of phpBB3 as verification of the Virtuoso hosted PHP runtime.

    What are the benefits of PHP Runtime Hosting?

    Like Apache, Virtuoso is a bona-fide Web Application Server for PHP based applications. Unlike Apache, Virtuoso is also the following:

    • a Hybrid Native DBMS Engine (Relational, RDF-Graph, and Document models) that is accessible solely via industry standard interfaces
    • a Virtual DBMS or Master Data Manager (MDM) that virtualizes heterogeneous data sources (ODBC, JDBC, Web Services, Hypermedia Resources, Non Hypermedia Resources)
    • an RDF Middleware solution for RDFization of non RDF resources across the Web and enterprise Intranets and/or Extranets (in the form of Cartridges for data exposed via REST or SOA oriented SOAP interfaces)
    • an RDF Linked Data Server (meaning it can deploy RDF Linked Data based on its native and/or virtualized data)

    As a result of the above, when you deploy a PHP application using Virtuoso, you inherit the following benefits:

    1. Use of PHP-iODBC for in-process communication with Virtuoso
    2. Easy generation of RDF Linked Data Views atop the SQL schemas of PHP applications
    3. Easy deployment of RDF Linked Data from virtualized data sources
    4. Less LAMP monoculture (*there is no such thing as virtuous monoculture*) when dealing with PHP based Web applications.

    As indicated in prior posts, producing RDF Linked Data from the existing Web, where a lot of content is deployed by PHP based content managers, should simply come down to RDF Views over the SQL Schemas and deployment / publishing of those RDF Views in RDF Linked Data form. In a nutshell, this is what Virtuoso delivers via its PHP runtime hosting and pre-packaged VADs (Virtuoso Application Distribution packages) for popular PHP based applications such as: phpBB3, Drupal, WordPress, and MediaWiki.
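
    To make that concrete, here is a hedged sketch of the kind of query such generated RDF Views make possible against a hosted phpBB3 instance, using the SIOC vocabulary referenced elsewhere on this blog (the graph URI is a hypothetical placeholder, not the actual mapping shipped with the VAD):

    PREFIX sioc: <http://rdfs.org/sioc/ns#>
    PREFIX dcterms: <http://purl.org/dc/terms/>

    # Query phpBB3 forum content exposed as SIOC-shaped Linked Data via RDF Views
    SELECT ?post ?title ?creator
    FROM <http://example.org/phpbb3>
    WHERE {
      ?post a sioc:Post ;
            dcterms:title ?title ;
            sioc:has_creator ?creator .
    }
    LIMIT 25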

    In addition to the RDF Linked Data deployment, we've also taken the traditional LAMP installation tedium out of the typical PHP application deployment process. For instance, you don't have to rebuild PHP (32 or 64 Bit) on Windows, Mac OS X, or Linux to get going; simply install Virtuoso, then select a VAD package for the relevant application, and you're set. If the application of choice isn't pre-packaged by us, simply install as you would when using Apache, which comes down to situating the PHP files in your Web structure under the Web Application's root directory.

    Installation Guide

    1. Download the Virtuoso installer for Windows (32 Bit msi file or 64 Bit msi file), Mac OS X (Universal Binary dmg file), or instantiate the Virtuoso EC2 AMI (*search for the pattern "Virtuoso" when using the Firefox extension for EC2; the AMI ID is currently: ami-7c31d515 and the name: virtuoso-test/virtuoso-cloud-beta-9-i386.manifest.xml, for the latest cut*)
    2. Run the installer (or download the movies using the links in the related section below)
    3. Go to the Virtuoso Conductor (*which will show up at the end of the installation process* or go to http://localhost:8890/conductor)
    4. Go to the "Admin" tab within the (X)HTML based UI and select the "Packages" sub-menu item (a Tab)
    5. Pick phpBB3 (or any other pre-packaged PHP app) and then click on "Install/Upgrade"
    6. Then watch one of my silent movies or read the initial startup guides for Virtuoso hosted phpBB3, Drupal, Wordpress, or MediaWiki.

    Related

    At the current time, I've only provided links to ZIP files containing the Virtuoso installation "silent movies". This approach is a short-term solution to some of my current movie publishing challenges re. YouTube and Vimeo -- where the compressed output hasn't been of acceptable visual quality. Once resolved, I will publish much more "Multimedia Web" friendly movies :-)

    ]]>
    Virtuoso, PHP Runtime Hosting: phpBB, Wordpress, Drupal, MediaWiki, and Linked Datahttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1461Fri, 26 Mar 2010 01:19:59 GMT52010-03-25T21:19:59-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    I just stumbled across a post from IT Business Edge titled: How Semantic Technology Can Help Companies with Integration. While reading the post I encountered the term: Master Data Manager (MDM), and wondered to myself, "what's that?", only to realize it's the very same thing I described as Data Virtualization or Virtual Database technology (circa 1998).

    Now, if re-labeling can confuse me when applied to a realm I've been intimately involved with for eons (internet time), I don't want to imagine what it does for others who aren't that intimately involved with the important data access and data integration realms.

    On the more refreshing side, the article does shed some light on the potency of RDF and OWL when applied to the construction of conceptual views of heterogeneous data sources.

    "How do you know that data coming from one place calculates net revenue the same way that data coming from another place does? You’ve got people using the same term for different things and different terms for the same things. How do you reconcile all of that? That’s really what semantic integration is about."

    BTW - I discovered this article via another titled: Understanding Integration And How It Can Help with SOA, that covers SOA and Integration matters. Again, in this piece I feel the gradual realization of the virtues that RDF, OWL, and RDF Linked Data bring to bear in the vital realm of data integration across heterogeneous data silos.

    Conclusion

    A number of events, at the micro and macro economic levels, are forcing attention back to the issue of productive use of existing IT resources. The trouble with the aforementioned quest is that it ultimately unveils the global IT affliction known as heterogeneous data silos, and the challenges of pain alleviation that have been ignored forever or approached inadequately, as clearly shown by the rapid build-up of SOA horror stories in the data integration realm.

    Data Integration via conceptualization of heterogeneous data sources, resulting in concrete conceptual layer data access and management, remains the greatest and most potent application of technologies associated with the "Semantic Web" and/or "Linked Data" monikers.

    Related

    ]]>
    The Trouble with Labels (Contd.): Data Integration & SOAhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1457Sun, 12 Oct 2008 22:54:22 GMT22008-10-12T18:54:22-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    As articulated in timeless fashion by Albert Einstein:

    The significant problems we face cannot be solved at the same level of thinking we were at when we created them.

    This quote also applies to the current global financial mess because the essence of this crisis remains inextricably linked to dependency on outdated "closed world" systems.

    How we got here (5,000 ft. view)

    We have a global human network that depends on systems driven by, and confined to, data silos! Every time you hear a CEO, Government Official, work colleague, neighbor, sibling, or relative tell you they didn't see it coming, just remember:

    • For every action, there is an equal and opposite reaction
    • For every debit there is a credit
    • What goes around, comes around
    • No man is an Island (little tweak: Human)
    • We are all Linked whether we like it or not
    • System preserving reboots are a feature of all intelligently designed systems.

    Why there won't be a Depression

    There won't be a depression because we can't afford one. Just like we couldn't afford to continue with the manner in which our systems work today. Unlike the '30s, we all know that there are no absolute safe havens right now; we have enough information at our disposal to eventually understand (post panic) that stuffing the mattress isn't an option (even government bonds won't cut it, ditto money market accounts).

    The Opportunity

    Take a deep breath and tell traditional media to "shut up". As per usual, the traditional mass media wants to have it both ways by stoking the panic and maxing out on the frenzy with reckless abandon. If there is a time to appreciate the blogosphere and quality journalism etc., it's now.

    Anyway, as the saying goes: "It's always darkest before dawn", and as bizarre as this may sound in some quarters, things will ultimately change for the better. It just so happened that a really big cane was required in order for us to change our dysfunctional ways :-(

    I recently wrote a post about "zero based cognition" that sought to bring attention to the power of "Human Thought" in relation to value creation.

    Innovative creation and dissemination of value is how we will eventually get out of the current mess (as we've done in the past). The predictability of the aforementioned reality is significantly increased by the sheer link density and resulting "network effects" potential of the Internet and World Wide Web. Our ability to "connect the dots" as part of our value creation, dissemination, and consumption processing pipelines is what will ultimately separate the winners from the losers (individuals, enterprises, nations).

    Related

    ]]>
    The Calamitous Nature of Opportunityhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1456Fri, 24 Oct 2008 02:20:17 GMT52008-10-23T22:20:17-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Human beings, courtesy of the gift of cognition, are capable of creating reusable data, information, and knowledge from simple or complex observations in an abstract realm. A machine, on the other hand, can only discover and infer based on a substrate of structured and interlinked data, information, or knowledge in a concrete human created realm e.g., a Web of Linked Data.

    As is quite common these days, Yihong Ding has written another great piece titled: A New Take on Internet-Based AI, that delves into this specific matter. Yihong expresses a vital insight as excerpted below:
    "Artificial intelligence is supposed to let machines do things for people. The risk is that we may rely too much on them. Two months ago, for instance, writer Nicolas Carr asked whether Google is making us stupid. In my recent blog series "The Age of Google," I extended Carr’s discussion. Due to the success of Google, we are relying more on objective search than on active thinking to answer questions. In consequence, the more Google has advanced its service, the farther Google users have drifted from active thinking."
    "But at least one form of human thinking cannot be replaced by machines. I am not talking about inference/discovery (which machines may be capable of doing) but about creation/generation-from-nothing (which I don’t believe machines may ever do)."

    I tend to describe our ability to create/generate-from-nothing as "Zero-based Cognition", which is initially about "thought" and then eventually about "speed of thought dissemination" and "global thought meshing".

    In a peculiar sense, Zero-based cognition is analogous to Zero-based budgeting from the accounting realm :-)

    ]]>
    Zero-based Cognition (Difference between Humans & Machines)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1440Fri, 17 Oct 2008 11:23:42 GMT12008-10-17T07:23:42.000003-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    All enterprises run IS/MIS/EIS systems that are supposed to enable optimized exploitation of data, information, and knowledge. Unfortunately, applications, services (SOAP or REST), database engines, middleware, operating systems, programming languages, development frameworks, network protocols, network topologies, or some other piece of infrastructure, eventually lay claim (possessively) to the data.

    Courtesy of Linked Data, we are now able to extend the "document to document" linking mechanism of the Web (Hypertext Linking) to more granular "entity to entity" level linking. And in doing so, we have a layer of abstraction that in one swoop alleviates all of the infrastructure oriented data access impediments of yore. I know this sounds simplistic, but rest assured, imbibing Linked Data's value proposition is really just that simple, once you engage solutions (e.g. Virtuoso) that enable you to deploy Linked Data across your enterprise.

    Example:

    Microsoft Access, SQL Server, and Virtuoso all use the Northwind SQL DB Schema as the basis of the demonstration database shipped with each DBMS product. This schema is comprised of common IS/MIS entities that include: Customers, Contacts, Orders, Products, Employees etc.

    What we all really want as data, information, and knowledge consumers and/or dispatchers is to be no more than a single "mouse click" away from relevant data/information/knowledge access and/or exploration. Even better (but not always so obvious), we also want anyone in our network (company, division, department, cube-cluster) to inherit these data access efficiencies.

    In this example, the Web Page about the Customer "ALKI" provides me with a myriad of exploration and data access paths e.g., when I click on the foaf:primaryTopic property value link.

    This simple example, via a single Web Page, should put to rest any doubts about the utility of Linked Data. Of course this is an old demo, but this time around the UI is minimalist as my prior attempts skipped a few steps i.e., starting from within a Linked Data explorer/browser.

    Important note: I haven't exported SQL into an RDF data warehouse; I am converting the SQL into RDF Linked Data on the fly, which has two fundamental benefits:

    1. No vulnerability to changes in the source DBMS
    2. Superior performance over the RDF warehouse since the source schema is SQL based and I can leverage the optimization of the underlying SQL engine when translating between SPARQL and SQL.
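
    For a feel of what "on the fly" means here, the following is a hedged sketch of a SPARQL query that such an RDF View would answer by compiling the graph pattern down to SQL against the live Northwind schema (the vocabulary prefix and graph URI are illustrative placeholders, not the actual mapping shipped with the demo database):

    PREFIX northwind: <http://example.org/schemas/northwind#>

    # Customer-to-Orders traversal translated into SQL at query time (no RDF warehouse involved)
    SELECT ?customer ?companyName ?order
    FROM <http://example.org/northwind>
    WHERE {
      ?customer a northwind:Customer ;
                northwind:companyName ?companyName ;
                northwind:has_order ?order .
    }
    LIMIT 20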

    Enjoy!

    Related

    1. Requirements for Relational to RDF Mapping
    2. Handling Graph Transitivity in a SQL/RDF Hybrid Engine
    3. How Virtuoso handles the Web Aspects of Linked Data Queries.
    ]]>
    Business Value of Linked Data (Enterprise Angle)? http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1437Thu, 11 Sep 2008 19:52:48 GMT22008-09-11T15:52:48.000050-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    I continue to be intrigued by Yihong Ding's shared insights as expressed in part 2 of his blog series titled: Programming the Universe. The blog series shares Yihong's thoughts and reflections stimulated by the book, also titled: Programming the Universe.

    What strikes me the most is how sharing his findings acts as a serendipitous connector to related insights and points of view, which ultimately create deeper shared knowledge about the core subject matter, courtesy of the Web hosted Blogosphere.

    Related

    ]]>
    Programming the Universe http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1428Wed, 03 Sep 2008 11:56:50 GMT22008-09-03T07:56:50-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Here are some demonstrations of (X)HTML based representations of resource descriptions from Freebase, DBpedia, BBC Music Beta, CrunchBase, OpenCyc, UMBEL, etc. What is really being demonstrated here is the use of Proxy / Wrapper URIs to expose powerful links across entities distilled from their container documents (or information resources). Of course, you see exactly the same technique in action whenever you visit DBpedia pages. Again, we are moving the concept of Linking from the document to document level, down to the document-entity to document-entity level. The evolution of network link focal points is illustrated in slides 15 to 22 of my Linked Data Planet presentation remix.

    Live Examples

    1. Abraham Lincoln - Freebase (note: link from Freebase to DBpedia via Wikipedia)
    2. Amazon - CrunchBase (note: links from CrunchBase to DBpedia)
    3. Coldplay - BBC Music Beta (note: links to Musicbrainz)
    4. Linked Data Planet Presentation - Also a Slidy, Bibo Ontology, and RDFa usage example
    5. Music - OpenCyc Concept which exposes a Hyperdata link to its equivalent UMBEL Subject Concept and back

    Virtuoso's RDFization Middleware & Linked Data Deployment Architecture Diagram





    Note: You can substitute my examples using any Web resource URL. The underlying RDFization and Linked Data deployment functionality of the Virtuoso demo instance takes care of everything else. Also note that the HTML based resource description page capability is now deployed as part of the Virtuoso Sponger component of every Virtuoso installation starting with version 5.0.8.

    ]]>
    Connecting Freebase, Wikipedia, DBpedia, and other Linked Data Spaces (Update 1)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1427Fri, 29 Aug 2008 18:57:02 GMT32008-08-29T14:57:02.000001-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    The title of this post is an expression of my gut reaction to the quotes below, which originate from Leo Sauermann's post about the Nepomuk Semantic Desktop for KDE:

    Ansgar Bernardi, deputy head of the Knowledge Management Department at Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI, or the German Research Center for Artificial Intelligence) and Nepomuk's coordinator, explains, "The basic problem that we all face nowadays is how to handle vast amounts of information at a sensible rate." According to Bernardi, Nepomuk takes a traditional approach by creating a meta-data layer with well-defined elements that services can be built upon to create and manipulate the information.

    The comment above echoes my sentiments about the imminence of "information overload" due to the vast amounts of user generated content on the Internet as a whole. We are going to need to process more and more data within a fixed 24 hour timeframe, while attempting to balance our professional and personal lives. Rest assured, this is a very serious issue, and you cannot even begin to address it without a Web of Linked Data.

    "The first idea of building the semantic desktop arose from the fact that one of our colleagues could not remember the girlfriends of his friends," Bernard says, more than half-seriously. "Because they kept changing -- you know how it is. The point is, you have a vast amount of information on your desktop, hidden in files, hidden in emails, hidden in the names and structures of your folders. Nepomuk gives a standard way to handle such information."

    If you get a personal URI for Entity "You", via a Linked Data aware platform (e.g. OpenLink Data Spaces) that virtualizes data across your existing Web data spaces (blogs, feed subscriptions, wikis, shared bookmarks, photo galleries, calendars, etc.), you then only have to remember your URI whenever you need to "Find" something, imagine that!

    To conclude, "information overload" is the imminent challenge of our time, and the keys to challenge alleviation lie in our ability to construct and maintain (via solutions) few context lenses (URIs) that provide coherent conduits into the dense mesh of structured Linked Data on the Web.

    ]]>
    The Essence of the Matter re. Information Overloadhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1425Thu, 28 Aug 2008 19:56:20 GMT12008-08-28T15:56:20-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    There are many challenges that have dogged attempts to mesh the DBMS & Object Technology realms for years, critical issues include:

    1. data access & manipulation impedance arising from Model mismatches between Relational Databases and Object Oriented & Object based Languages
    2. Record / Data Object Referencing by ID.

    The big deal about LINQ has been the singular focus on addressing point 1, in particular.

    I've already written about the Linq2Rdf effort that meshes the best of .NET with the virtues of the "Linked Data Web".

    Here is an architecture diagram that seeks to illustrate the powerful data access and manipulation options that the combination of Linq2RDF and Linked Data deliver:


    What may not have been obvious to most in the past is the fact that Mapping from Object Models to Relational Models wasn't really the solution to the problem at hand. Instead, the mapping should have been the other way around, i.e., Relational to Object Model mapping. The emergence of RDF and RDBMS to RDF mapping technology is what makes this age-old headache addressable in very novel ways.

    Related

    1. RDBMS to RDF Mapping - W3C Workshop Presentation
    2. Virtuoso RDBMS to RDF Mapping - W3C Rdb2Rdf Incubator Group Presentation
    3. Creating RDF Views over SQL Data Sources - Technology Tutorial
    ]]>
    Virtuoso, Linked Data, and Linq2Rdf (Update 1)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1420Wed, 27 Aug 2008 11:51:23 GMT22008-08-27T07:51:23.000002-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    WUPnP Cheatsheet: "

    The Web Universal Plug and Play (WUPnP) Cheatsheet:

    Web Universal Plug and Play (WUPnP) Cheatsheet

    Essentially, if you build an application and use the technologies suggested in the ‘glue section’ then your web application/service (whether it’s front-end or back-end) will fit into many many other web applications/services… and therefore also more manageable for the future! This is WUPnP.

    Key technologies for making your services/applications as sticky as possible:

    Web-based plug and play fun!

    "

    (Via Daniel Lewis.)

    ]]>
    WUPnP Cheatsheethttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1397Tue, 29 Jul 2008 17:06:40 GMT22008-07-29T13:06:40-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Metcalfe’s law states that the value of a telecommunications network is proportional to the square of the number of users of the system (n²), where the linkages between users (nodes) exist by definition. For information bases, the data objects are the nodes. Linked Data works to add the connections between the nodes.

    I would tweak the law modification expressed in Mike Bergman's post, which states:

    the value of a Linked Data network is proportional to the square of the number of links between the data objects.
    By simply injecting "Context", which is what a high fidelity linked data mesh facilitates, i.e. a mesh of weighted links endowed with specifically typed links (as opposed to a single ambiguous type unspecific link), you end up with even more insight into the power of a Linked Data Web.

    Channeling Einstein

    How about Einstein's famous equation: E=mc²? I am talking Energy (vitality) and Mass equivalence, where "E" is for Energy, "m" for Network Mesh based Mass (where each entity network node contains sub-particles that are themselves dense network meshes, all endowed with typed links and weightings), and "c" is for computer processing speed (processing speed is growing exponentially!). When you beam queries down a context rich mesh (a giant global graph comprised of named and dereferencable data sources), especially a mesh to which we are all connected, what do you get? Infrastructure for generating an unbelievable amount of intellectual energy (the result of exploding the sub-data-graphs within graph nodes) that is much better equipped to handle current and future challenges. Even better, we end up making constructive use of Einstein's findings (remember, we built a bomb the first time around!). TimBL articulates this fundamental value of the Web in slightly different language, but at the core, this is the essence of the Web as I believe he envisioned it: the ability to connect us all in such a way that we exploit our collective manpower and knowledge constructively and unobtrusively, en route to making the world a much better place :-)

    Note: None of this is incongruent with being compensated (i.e. making money) for contributing tangible value into, or around, the Mesh we know as the Web :-)

    Related

    ]]>
    Metcalfe, Einstein, and Linked Datahttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1390Tue, 02 Sep 2008 17:03:01 GMT22008-09-02T13:03:01-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    I've finally found a second to drop a note about my keynote.

    The keynote: Creating, Deploying, and Exploiting Linked Data, sought to achieve one fundamental goal: demystifying the concept of "Linked Data" using anecdotal material that resonates with enterprise decision makers.

    To my pleasure, 90% of the audience members confirmed familiarity with the "Data Source Name" concept of Open Database Connectivity (ODBC). Thus, all I had to do was map "Linked Data" to ODBC, and then unveil the fundamental add-ons that "Linked Data" delivers:

    • The ability to give database records names (Identifiers)
    • The use of HTTP in the database record naming mechanism - which expands a named database record's reference scope via the expanse of the Web (i.e HTTP based Identifiers called URIs).

    I believe a majority of attendees came to realize that the combination above injects a new Web interaction dynamic: access to "Subject matter Concepts" and Named Entities contained within a page via HTTP based Data Source Names (URIs).

    BTW - My presentation is a Linked Data Space in its own right courtesy of the Bibliographic Ontology (which provides slide show modeling) and RDFa that allows me to embed annotations into my Slidy based presentation :-)

    Related

    ]]>
    My Linked Data Planet Keynote (Updated with missing link)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1387Thu, 19 Jun 2008 13:48:14 GMT52008-06-19T09:48:14-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    The current live instance of DBpedia has just received dose #1 of a series of planned "Context" oriented booster shots. These shots seek to protect DBpedia from contextual incoherence as it grows in data set expanse and popularity. Dose #1 (vaccine label: Yago) equips DBpedia with a functional (albeit non exclusive) Data Dictionary component courtesy of the Yago Class Hierarchy.

    When the DBpedia & Yago integration took place last year (around WWW2007, Banff) there was a little, but costly, omission: nobody sought to load the Yago Class Hierarchy into Virtuoso's Inference Engine :-(

    Anyway, the Class Hierarchy has now been loaded into Virtuoso's inference engine (as Virtuoso Inference Rules) and the following queries are now feasible using the live Virtuoso based DBpedia instance hosted by OpenLink Software:

    -- Find all Fiction Books associated with a property "dbpedia:name" that has the literal value: "The Lord of the Rings".

    DEFINE input:inference "http://dbpedia.org/resource/inference/rules/yago#"
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX dbpedia: <http://dbpedia.org/property/>
    PREFIX yago: <http://dbpedia.org/class/yago/>

    SELECT DISTINCT ?s
    FROM <http://dbpedia.org>
    WHERE {
      ?s a yago:Fiction106367107 .
      ?s dbpedia:name "The Lord of the Rings"@en .
    }

    -- Variant of the query using Virtuoso's Full Text Index extension via the bif:contains function / magic predicate.

    DEFINE input:inference "http://dbpedia.org/resource/inference/rules/yago#"
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX dbpedia: <http://dbpedia.org/property/>
    PREFIX yago: <http://dbpedia.org/class/yago/>

    SELECT DISTINCT ?s ?n
    FROM <http://dbpedia.org>
    WHERE {
      ?s a yago:Fiction106367107 .
      ?s dbpedia:name ?n .
      ?n bif:contains 'Lord and Rings'
    }

    -- Retrieve all individual instances of the Fiction Class, which should include all Books.

    DEFINE input:inference "http://dbpedia.org/resource/inference/rules/yago#"
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX dbpedia: <http://dbpedia.org/property/>
    PREFIX yago: <http://dbpedia.org/class/yago/>

    SELECT DISTINCT ?s
    FROM <http://dbpedia.org>
    WHERE {
      ?s a yago:Fiction106367107 .
    } LIMIT 50

    Note: you can also move the inference pragmas to the Virtuoso Server side, i.e., place the inference rules in a server instance config file, thereby negating the need to place the "define input:inference 'http://dbpedia.org/resource/inference/rules/yago#'" pragma directly in your SPARQL queries.

    Related

    ]]>
    DBpedia receives shot #1 of CLASSiness vaccinehttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1372Tue, 13 Jul 2010 14:45:40 GMT62010-07-13T10:45:40-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Courtesy of a post by Chris Bizer to the LOD community mailing list, here is a list of Linked Data oriented talks at the upcoming XTech 2008 event (also see the XTech 2008 Schedule, which is Linked Data friendly). Of course, I am posting this to my Blog Data Space with the sole purpose of adding data to the rapidly growing Giant Global Graph of Linked Data, basically adding to my collection of live Linked Data utility demos :-)

    Here is the list:

    1. Linked Data Deployment (Daniel Lewis, OpenLink Software)
    2. The Programmes Ontology (Tom Scott, BBC and all)
    3. SemWebbing the London Gazette (Jeni Tennison, The Stationery Office)
    4. Searching, publishing and remixing a Web of Semantic Data (Richard Cyganiak, DERI Galway)
    5. Building a Semantic Web Search Engine: Challenges and Solutions (Aidan Hogan, DERI Galway)
    6. 'That's not what you said yesterday!' - evolving your Web API (Ian Davis, Talis)
    7. Representing, indexing and mining scientific data using XML and RDF: Golem and CrystalEye (Andrew Walkingshaw, University of Cambridge)

    For the time challenged (i.e. those unable to view this post using its permalink / URI as a data source via the OpenLink RDF Browser, Zitgist Data Viewer, DISCO Hyperdata Browser, or Tabulator), the benefits of this post are as follows:

    • automatic URI generation for all linked items in this post
    • automatic propagation of tags to del.icio.us, Technorati, and PingTheSemanticWeb
    • automatic association of formal meanings to my Tags using the MOAT Ontology
    • automatic collation and generation of statistical data about my tags using the SCOT Ontology (*missing link is a callout to SCOT Tag Ontology folks to sort the project's home page URL at the very least*)
    • explicit typing of my Tags as SKOS Concepts.

    Put differently, I cost-effectively contribute to the GGG across all Web interaction dimensions (1.0, 2.0, 3.0) :-)

    ]]>
    XTech Talks covering Linked Data http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1355Mon, 05 May 2008 21:07:17 GMT42008-05-05T17:07:17-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Typo cleansed edition :-)

    Objectives

    • Meet LOD Community Members
    • Participate in Workshop

    Meeting LOD Community Members

    Although the Web continues to shrink the planet by removing the restrictions of geographic location, meeting people face-to-face remains invaluable (*priceless in Mastercard AD speak*). Naturally, meeting and chatting with as many LOD community members as possible was high up on my agenda.

    Participate in Workshop

    As one of the co-chairs of the Linking Open Data Workshop (LODW), I had a 5 minute workshop opening slot during which I spoke about the following:

    Where we are today:

    We have DBpedia as a major hub on the burgeoning Linked Data Web. When OpenLink offered to host DBpedia (a combination of Virtuoso DBMS Software and sizable backend Hardware infrastructure), it did so knowing that such an effort would emphatically address the "chicken and egg" conundrum that, prior to this undertaking, stifled the ability to demonstrate practical utility of HTTP based Linked Data.

    Today, the Linked Data bootstrap mission has been accomplished.

    Where we go next:

    Although DBpedia is a hub (ground zero of Linked Data), we have to put it into perspective in relation to a new set of needs and expectations moving forward. Today, DBpedia is a Sun at the heart of a Solar System within the Linked Data Galaxy. But unlike Space as we know it, in Cyberspace we can have connectivity and collaboration across Solar Systems -- life exists elsewhere and we are part of a collaborative collective unimpeded by the constraints of space travel etc. Thus, expect to see the emergence of other Solar Systems accessible to DBpedia and its collection of planets (see the LOD diagram). Examples underway include UMBEL, which will serve the Linked Data planets from OpenCyc (Subject Matter Concepts), Yago (Named Entities), and Bio2RDF (which provides a powerful Bio Informatics based Linked Data planet).

    I urged the community to veer more aggressively towards developing and demonstrating practical Linked Data driven solutions that are aligned to well known problems. Of course, I encouraged all presenters to make this an integral part of their presentations :-)

    Workshop Summary:

    The workshop was well attended and I found all the presentations engaging and full of enthusiasm.

    As the sessions progressed, it became clear during a number of accompanying Q&A sessions that a new Linked Data exploitation frontier is emerging. The frontier in question takes the form of a Linked Data substrate capable of addressing the taxonomic needs of solutions aimed at automated Named Entity Extraction, Disambiguation, and Subject matter Concept alignment, transparently integrated with existing Web Content. Thus, we are moving beyond the minting and deployment of dereferencable URIs and RDF data sets to automagically associating existing Web Content with Named Entities (People, Organizations, Places, Events etc.) and Subject matter Concepts (Politics, Music, Sports, and others) while remaining true to the Linking Open Data Community creed, i.e. ensuring the Named Entity and Subject matter Concept URIs are available to user agents or users seeking to produce alternative data views (i.e. Mesh-ups).

    I will get to part 2 of this report once the actual workshop session slides go live (*these are different from the pre-event PDF links*).

    ]]>
    Linked Data Trip Report - Part 1 (Update 2)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1343Tue, 29 Apr 2008 15:07:43 GMT32008-04-29T11:07:43.000002-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Daniel Lewis has put together a nice collection of Linked Data related posts that illustrate the fundamentals of the Linked Data Web and the vital role that Virtuoso plays as a deployment platform. Remember, Virtuoso was architected in 1998 (see Virtuoso History) in anticipation of the eventual Internet, Intranet, and Extranet level requirements for a different kind of Server. At the time of Virtuoso's inception, many thought our desire to build a multi-protocol, multi-model, and multi-purpose, virtual and native data server was sheer craziness, but we pressed on (courtesy of our vision and technical capabilities). Today, we have a very sophisticated Universal Server Platform (in Open Source and Commercial forms) that is naturally equipped to do the following via very simple interfaces:
      - Provide highly scalable RDF Data Management via a Quad Store (DBpedia is an example of a live demonstration)
      - Powerful WebDAV innovations that simplify read-write mode interaction with Linked Data
      - More...
    ]]>
    Linked Data Illustrated and a Virtuoso Functionality Reminderhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1342Mon, 28 Apr 2008 18:47:06 GMT12008-04-28T14:47:06.000001-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Courtesy of Thomas Vander Wal's interesting blog post titled: Explaining the Granular Social Network, I found a nice video that highlights the Who + What you know aspect of Social Networking and the GGG in general.

    As I can't quite remix Videos on the spur of the moment (yet), I would encourage you to watch the video and then click on the link to my FOAF Profile, then follow the "Linked Data" tab to see how Linked Data oriented platforms (in my case OpenLink Data Spaces) that exist today actually deliver what's explained in the video.

    "What You Know" (Data & Friend Networks) ultimately trumps "Who You Know" (Friend only Networks). The exploitation power of this reality is enhanced exponentially via the Linked Data Web once the implications of beaming SPARQL queries down specific URIs (entry points to Linked Data graphs) become clearer :-)

    ]]>
    Explaining the Granular Social Networkhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1341Tue, 15 Apr 2008 21:22:42 GMT12008-04-15T17:22:42-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    The new RDB2RDF Incubator Group is now official. The group is sponsored by Oracle, HP, PartnersHealth, and OpenLink Software.

    Goals

    The goal of this effort is standardization of approaches (syntax and methodology) for mapping Relational Data Model instance data to RDF (Graph Data Model).

    Benefits

    Every record in a relational table/view/stored procedure (Table Valued Functions/Procedures) is declaratively morphed into an Entity (instance of a Class associated with a Schema/Ontology). The derived entities become part of a graph that exposes relationships and relationship traversal paths that have lower JOIN Costs than attempting the same thing directly via SQL. In a nutshell, you end up with a conceptual interface atop a logical data layer that enables a much more productive mechanism for exploring homogeneous and/or heterogeneous data without confinement at the DB instance, SQL DBMS type, host operating system, local area network, or wide area network levels.
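
    A hedged sketch of the kind of traversal this enables (the vocabulary and class names are illustrative placeholders for whatever a concrete mapping would generate):

    PREFIX ex: <http://example.org/schemas/sales#>

    # One graph pattern walks Customer -> Order -> LineItem -> Product;
    # the equivalent SQL would spell out three explicit JOINs across four tables
    SELECT ?customerName ?productName
    WHERE {
      ?customer ex:name ?customerName ;
                ex:has_order ?order .
      ?order    ex:has_lineItem ?item .
      ?item     ex:refers_to ?product .
      ?product  ex:name ?productName .
    }
    LIMIT 10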

    Just as we have to mesh the Linked Data and Document Webs unobtrusively, it's also important that the same principles apply to the exposure of RDBMS hosted data as RDF based Linked Data.

    We all know that a large amount of the data driving the IT engines of most enterprises resides in Relational Databases. And contrary to recent RDBMS vs RDF database misunderstandings espoused (hopefully inadvertently) by some commentators, Relational Database engines aren't going away anytime soon. Meshing Relational (logical) and Graph (conceptual) data models is a natural progression along an evolutionary path towards: Analysis for All. By the way, there is a parallel evolution occurring in other realms, such as Microsoft's ADO.NET Entity Framework.

    How would I use RDB2RDF Mapping?

    To Unobtrusively expose existing data sources as RDF Linked Data. The links that follow provide examples:

    Related

    1. Virtuoso's Meta Schema Language for Declaratively generating RDF Views of SQL Data (Presentation, White Paper, Tutorial, and Online Docs)
    2. ESW Wiki's Collection of SQL-RDF Mapping Tools
    3. What the Semantic Web means for your Business
    ]]>
    New W3C Incubator Group: Relational Database to RDF Mappinghttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1320Tue, 11 Mar 2008 17:58:24 GMT52008-03-11T13:58:24-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    If your Data Space was a Solar System, your personal Identity would be the Sun. I say this because your Identity is the conduit (access mechanism) to your data graph; the data you generate from various application interaction activities such as: Blogging, Bookmarking, Photo Sharing, Feed Aggregation etc.

    Daniel Lewis has just published a nice blog post titled: The Data Space Philosophy, that puts the underlying Data Space concept in perspective.

    The Linked Data Web is a Giant Global Graph of Data Spaces (meshes of data and identity exposed by graphs connecting data and identity)

    Data Portability ultimately depends on platforms that provide unobtrusive generation of Linked Data (for data referencing) alongside support for a plethora of industry standard data formats -- which is what OpenLink Data Spaces has been about for a very long time :-)

    Related

    ]]>
    Data Spaces, User Identity, and Data Portabilityhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1311Mon, 04 Feb 2008 15:06:43 GMT12008-02-04T10:06:43-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    A new release of Virtuoso is now available in both Open Source and Commercial variants. The main features and Enhancements associated with this release include:

      * 64-bit Integer Support
      * RDF Sink Folders for WebDAV - enabling RDF Quad Store population by simply dropping RDF files into WebDAV or via HTTP (meaning you can use cURL as an RDF input mechanism, for instance; a command-line sketch follows this list)
      * Additional Sponger Cartridges for Audio binary files (i.e., ID3 tag extraction and Music Ontology mapping, which exposes the fine details of music as RDF based Structured Data; one for the DJs & Remixers out there!)
      * New Sponger Cartridges for Facebook, Freebase, Wikipedia, GRDDL, RDFa, eRDF and more
      * Support for PHP 5.2 runtime hosting (Virtuoso is a bona fide deployment platform for: Wordpress, MediaWiki, phpBB, Drupal etc.)
      * Enhanced UI for managing RDF Linked Data deployment (covering Multi Homed domains and Virtual Directories associated with URL-rewrite rules)
      * Demonstration Database includes SQL-RDF Views & SQL Table samples for the THALIA Web Data Integration benchmark and test-suite
      * Tutorial Application includes Linked Data style SQL-RDF Views for the Northwind SQL DBMS schema (which is the same as the standard Virtuoso demo database schema)
      * SQL-RDF Views implementation of the TPC-D benchmark (Yes, we can run this grueling SQL benchmark via RDF views of SQL Data!)
      * A new Amazon EC2 Image for Virtuoso that enables you to instantiate a fully configured instance comprising the Virtuoso core, OpenLink Data Spaces platform and the OpenLink Ajax Toolkit (OAT) (we now have bona fide Data Spaces in the Clouds as an addition to the emerging Semantic Data Web mesh).
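
    As a sketch of the cURL-based RDF sink approach mentioned above (the WebDAV path, port, and credentials here are illustrative; adjust them to your own instance's sink folder and DAV account):

    # HTTP PUT an RDF file into a WebDAV RDF sink folder; Virtuoso then loads it into the Quad Store.
    curl -T mydata.rdf -u dav:dav http://localhost:8890/DAV/home/dav/rdf_sink/mydata.rdf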

    Download Links:

    ]]>
    Virtuoso 5.0.2 Released!http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1265Mon, 08 Oct 2007 14:27:27 GMT12007-10-08T10:27:27-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    I've written extensively on the subject of Data Spaces in relation to the Data Web for a while. I've also written sparingly about OpenLink Data Spaces (a Data Web Platform built using Virtuoso). On the other hand, I haven't shed much light on installation and deployment of OpenLink Data Spaces.

    Jon Udell recently penned a post titled: The Fourth Platform. The post arrives at a spookily coincidental time (this happens quite often between Jon and me, as demonstrated last year during our podcast; the "Fourth" in his Innovators Podcast series).

    The platform that Jon describes is "Cloud Based" and comprised of Storage and Computation. I would like to add Data Access and Management (native and virtual) under the fourth platform banner with the end product called: "Cloud based Data Spaces".

    As I write, we are releasing a Virtuoso AMI (Amazon Image) labeled: virtuoso-dataspace-server. This edition of Virtuoso includes the OpenLink Data Spaces Layer and all of the OAT applications we've been developing for a while.

    What Benefits Does this offer?

    1. Personal Data Spaces in the Cloud - a place where you can control and consolidate data across your Blogs, Wikis, RSS/Atom Feed Subscriptions, Shared Bookmarks, Shared Calendars, Discussion Threads, Photo Galleries etc
    2. All the data in your Data Space is SPARQL or GData accessible.
    3. All of the data in your Personal Data Space is Linked Data from the get go. Each Item of data is URI addressable
    4. SIOC support - your Blogs, Wikis, Bookmarks etc.. are based on the SIOC ontology for Semantically Interlinking Online Communities (think: Open social-graph++)
    5. FOAF support - your FOAF Profile page provides a URI that is an in-road to all Data in your Data Space.
    6. OpenID support - your Personal Data Space ID is usable wherever OpenID is supported. OpenID and FOAF are integrated as per latest FOAF specs
    7. Two-way Integration with Facebook - you can access your Data Space from Facebook or access Facebook from your Data Space
    8. Unified Storage - The WebDAV based filesystem provides Cloud Storage that's integrated with Amazon S3; It also exposes all of your Data Space data via a traditional filesystem UI (think virtual Spotlight); You can also mount this drive to your local filesystem via your native operating system's WebDAV support
    9. SyncML - you can sync calendar and contact details with your Data Space in the cloud from your Mobile phone.
    10. A practical Semantic Data Web solution - based on Web Infrastructure and doesn't require you to do anything beyond exposing URIs for data in your Data Spaces.

    EC2-AMI Details:

      AMI ID: ami-e2ca2f8b
      Manifest file: virtuoso-images/virtuoso-dataspace-server.manifest.xml

    Installation Guide:

    1. Get an Amazon Web Services (AWS) account
    2. Signup for S3 and EC2 services
    3. Install the EC2 plugin for Firefox
    4. Start the EC2 plugin
    5. Locate the row containing ami-7c31d515  Manifest virtuoso-test/virtuoso-cloud-beta-9-i386.manifest.xml (sort using the AMI ID or Manifest Columns or search on pattern: virtuoso, due to name flux)
    6. Start the Virtuoso Data Space Server AMI
    7. Wait 4-5 minutes (*it takes a few minutes to create the pre-configured Linux Image*)
    8. Connect to http://your-ec2-instance-cname:8890/ and log in with user/password dba/dba
    9. Go to the Admin UI (Virtuoso Conductor) and change the PWDs for the 'dba' and 'dav' accounts (*Important!*)
    10. Give the "SPARQL" user "SPARQL_UPDATE" privileges (required if you want to exploit the in-built Sponger Middleware)
    11. Click on the ODS (OpenLink Data Spaces) link to start a Personal Edition of OpenLink Data Spaces (or go to: http://your-ec2-instance-cname/dataspace/ods/index.html)
    12. Log in using the username and password credentials for the 'dav' account (or register a new user; note: OpenID is an option here also), then create a Data Space Application Instance by clicking on a Data Space App. Tab
    13. Import data from your existing Web 2.0 style applications into OpenLink Data Spaces e.g. subscribe to a few RSS/Atom feeds via the "Feeds Manager" application or import some Bookmarks using the "Bookmarks" application
    14. Then look at the imported data in Linked Data form via your ODS generated URIs based on the patterns: http://your-ec2-instance-cname/dataspace/person/your-ods-id#this (URI for You the Person), http://your-ec2-instance-cname/dataspace/person/your-ods-id (FOAF File URI), http://your-ec2-instance-cname/dataspace/your-ods-id (SIOC File URI)
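
    To sanity-check the Linked Data output from the command line, something along the following lines should work (a minimal sketch that assumes the instance honours content negotiation on these URIs, which is what the ODS generated URIs are designed to do):

    # Ask for RDF/XML instead of HTML when dereferencing your FOAF File URI.
    curl -L -H "Accept: application/rdf+xml" http://your-ec2-instance-cname/dataspace/person/your-ods-id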

    Using the OpenLink Ajax Toolkit (OAT) from your Data Space instance

    Install the OAT VAD package via the Admin UI and then apply the URI patterns below within your browser:
    1. http://your-ec2-instance-cname:8890/oatdemo - Entire OAT Demo Collection
    2. http://your-ec2-instance-cname:8890/rdfbrowser - RDF Browser
    3. http://your-ec2-instance-cname:8890/isparql - SPARQL Query Builder (iSPARQL)
    4. http://your-ec2-instance-cname:8890/qbe - SQL Query Builder (iSQL)
    5. http://your-ec2-instance-cname:8890/formdesigner - Forms Builder (for building Meshups based on RDF, SQL, or Web Services Data Sources)
    6. http://your-ec2-instance-cname:8890/dbdesigner - SQL DB Schema Designer (note: a Visual SQL-RDF Mapper is also on its way)
    7. http://your-ec2-instance-cname:8890/DAV/JS/ - To view the OAT Tree (there are some experimental demos that are missing from the main demo app etc.)

    There's more to come!

    ]]>
    Fourth Platform: Data Spaces in The Cloud (Update)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1261Sun, 26 Oct 2008 21:59:33 GMT202008-10-26T17:59:33-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Ivan Herman just posted another nice example of practical RDFa usage in a blog post titled: Yet Another RDFa Processor. In his post, Ivan exposes a URI for his FOAF-in-RDFa file.

    Since I am aggressively tracking RDFa developments, I decided to quickly view Ivan's FOAF-in-RDFa file via the OpenLink RDF Browser. The full implications are best understood when you click on each of the Browser's Tabs -- each providing a different perspective on this interesting addition to the Semantic Data Web (note: the Fresnel Tab which demonstrates declarative UI templating using N3).
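
    For readers new to RDFa, here is a minimal FOAF-in-RDFa fragment of the kind such a file contains (the names and URIs are illustrative, not Ivan's actual markup):

    <div xmlns:foaf="http://xmlns.com/foaf/0.1/" about="#me" typeof="foaf:Person">
      <span property="foaf:name">Example Person</span>
      knows <a rel="foaf:knows" href="http://example.org/people/friend#me">a friend</a>
    </div>

    An RDFa-aware extractor (such as the Sponger's RDFa Cartridge described below) turns markup like this into triples of the form <#me> foaf:name "Example Person" and <#me> foaf:knows <http://example.org/people/friend#me>.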

    What's Going on Here?

    The OpenLink RDF Browser is a Rich Internet Application built using OAT (OpenLink Ajax Toolkit). In my case, I am deploying the RDF Browser from a Virtuoso instance, which implies that the Browser is able to use the Virtuoso Sponger Middleware (exposed as a REST Service at the Virtuoso instance endpoint: /proxy); which includes an RDFa Cartridge comprised of a metadata extractor and an RDF Schema / OWL Ontology mapper. That's it!

    ]]>
    Yet Another RDFa Demohttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1249Tue, 05 Feb 2008 01:44:37 GMT22008-02-04T20:44:37.000009-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    I've been exchanging notes with Ben Adida re. RDFa as part of a perpetual certification process for my ODS based Weblog. The most recent post from Ben contains a link to an "RDFa in the Wild" portal (in the making).

    Once I installed Operator 0.8, I scanned a few of the pages from the RDFa portal. Operator 0.8 didn't do much for me, i.e., if the RDFa didn't express RDF aligned in some form with a microformat that it understood, it simply routed its findings to a generic "resource" category :-( Of course, it is possible to enhance this aspect of Operator (and I may get round to that some day). Anyway, I pressed on, and took one of the more interesting URIs from the RDFa page and pasted it into the OpenLink RDF Browser instead. Here are the links:

    1. Semantically annotated publication database using Ajax (a page containing structured data expressed in RDF and exposed via RDFa)

    2. Same Page via OpenLink RDF Browser

    The RDF Browser uses the Virtuoso Sponger to extract the RDF embedded via RDFa in the page.]]>
    The Power of Structured Data Exposure via RDFahttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1243Tue, 05 Feb 2008 01:45:02 GMT22008-02-04T20:45:02.000004-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    An update for RSS crawlers that don't have synchronization capability etc..

    "Reasonable Humans" continue to struggle with the congruency of social-networking and the Semantic Web. Funnily enough, our less capable counterparts (in the reasoning department) don't have such struggles. Here is my little interaction with an IRC Bot called "Phenny" (possibly mother of the Agent Web or Web of Agents or Web 4.0):

    kidehen: .w web

    phenny: web n. 1: An intricate network suggesting something that was formed by weaving or interweaving
    phenny: web n. 2: An intricate trap that entangles or ensnares its victim
    phenny: web v. 1: Construct or form a web, as if by weaving.


    kidehen:
    .w network

    phenny: network n. 1: An interconnected system of things or people
    phenny: network n. 2: (broadcasting) a communication system consisting of a group of broadcasting stations that all transmit the same programs
    phenny: network v. 1: Communicate with and within a group.



    kidehen: .w social

    phenny: social n. 1: A party of people assembled to promote sociability and communal activity
    phenny: social a. 1: Relating to human society and its members
    phenny: social a. 2: Living together or enjoying life in communities or organized groups.


    kidehen: .w semantic

    phenny: semantic a. 1: Of or relating to meaning or the study of meaning.


    kidehen: .w graph

    phenny: graph n. 1: A visual representation of the relations between certain quantities plotted with reference to a set of axes
    phenny: graph v. 1: Represent by means of a graph
    phenny: graph v. 2: Plot upon a graph.

    Note: Wikipedia's Graph definition is the one that applies to RDF :-)

    To conclude, the whole Semantic Data Web vision and Open social-networking go hand-in-hand. You can't attain "Open social-networking" without appreciating the underlying core concepts (Structured Data, Graphs, and Triples). There is no way around the reality that all networks are graphs. Of course, this doesn't imply that solution users (end-users) have to master graph theory en route to network participation and exploitation. It simply implies that solution developers need some graph data model appreciation, or a graph data model aware platform, when developing social-networking solutions that seek to leverage the Web.

    Funnily enough, my session with the Phenny Bot was triggered by a prior session between Chimezie Ogbuji (who also trains a Bot called "Emeka" that does SPARQL) and said Bot:

    
     


    chimezie: .w tautology

    phenny: tautology n. 1: (logic) a statement that is necessarily true
    phenny: tautology n. 2: Useless repetition.


    chimezie: .ety tautology

    phenny: "1579, from L.L. tautologia 'representation of the same thing' (c.350), from Gk. tautologia, from tautologos 'repeating what has been said,' from tauto 'the same' + -logos 'saying,' related to legein 'to say' (see lecture)." - http://etymonline.com/?term=tautology

    That led me to the following sequence (preceding the initial IRC session dump in this post):

    
     

    kidehen: .w conflagration

    phenny: conflagration n. 1: A very intense and uncontrolled fire.

    kidehen: .w imbroglio

    phenny: imbroglio n. 1: An intricate and confusing interpersonal or political situation

    phenny: imbroglio n. 2: A very embarrassing misunderstanding.


    kidehen: .w buzzword

    phenny: buzzword n. 1: Stock phrases that have become nonsense through endless repetition.

    In a sense, proposing the Semantic Data Web as a solution to open social-networking challenges more often than not results in a "No Semantic Web here" imbroglio: the shortest path to a buzzword fueled conflagration :-)

    ]]>
    Social-Networking & Semantic Web (update)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1241Wed, 15 Aug 2007 22:14:36 GMT32007-08-15T18:14:36.000003-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Last week we officially released Virtuoso 5.0.1 (in Commercial and Open Source Editions). The press release provided us with an official mechanism and timestamp for the current Virtuoso feature set.

    A vital component of the new Virtuoso release is the finalization of our SQL to RDF mapping functionality -- enabling the declarative mapping of SQL Data to RDF. Additional technical insight covering other new features (delivered and pending) is provided by Orri Erling, as part of a series of post-Banff posts.

    Why is SQL to RDF Mapping a Big Deal?

    A majority of the world's data (especially in the enterprise realm) resides in SQL Databases. In addition, Open Access to the data residing in said databases remains the biggest challenge to enterprises for the following reasons:

    1. SQL Data Sources are inherently heterogeneous because they are acquired with business applications that are in many cases inextricably bound to a particular DBMS engine
    2. Data is predictably dirty
    3. DBMS vendors ultimately hold the data captive and have traditionally resisted data access standards such as ODBC (*trust me, they have; just look at the unprecedented bad press associated with ODBC, the only truly platform independent data access API, and then look at how this bad press arose..*)

    Enterprises have known from the beginning of modern corporate times that data access, discovery, and manipulation capabilities are inextricably linked to the "Real-time Enterprise" nirvana (hence my use of 0.0 before this becomes 3.0).

    In my experience, as someone who has operated in the data access and data integration realms since the late '80s, I've painfully observed enterprises pursue, but never successfully attain, full control over enterprise data (the prized asset of any organization) such that data-, information-, and knowledge-workers are just a click away from commencing coherent, platform and database independent data drill-downs and/or discovery that transcend intranet, internet, and extranet boundaries -- serendipitous interaction with relevant data, without compromise!

    Okay, situation analysis done; we move on.

    At our most recent (12th June) monthly Semantic Web Gathering, I unveiled to TimBL and a host of other attendees a simple, but powerful, demonstration of how Linked Data, as an aspect of the Semantic Data Web, can be applied to enterprise data integration challenges.

    Actual SQL to RDF Mapping Demo / Experiment

    Hypothesis

    A SQL Schema can be effectively mapped declaratively to RDF such that SQL Rows morph into RDF Instance Data (Entity Sets) based on the Concepts & Properties defined in a Concrete Conceptual Data Model oriented Data Dictionary (RDF Schema and/or OWL Ontology). In addition, the solution must demonstrate how "Linked Data in the Web" is completely different from "Data on the Web" or "Linked Data on the Web" (btw - Tom Heath eloquently unleashed this point in his recent podcast interview with Talis).

    Apparatus

    1. An Ontology - in this case we simply derived the Northwind Ontology from the XML Schema based CSDL (Conceptual Schema Definition Language) used by Microsoft's public Astoria demo (specifically the Northwind Data Services demo).
    2. SQL Database Schema - Northwind (comes bundled with ACCESS, SQL Server, and Virtuoso), comprised of tables such as: Customer, Employee, Product, Category, Supplier, Shipper etc.
    3. OpenLink Virtuoso - SQL DBMS Engine (although this could have been any ODBC or JDBC accessible Database), SQL-RDF Metaschema Language, HTTP URL-rewriter, WebDAV Engine, and DBMS hosted XSLT processor.
    4. Client Tools - iSPARQL Query Builder, RDF Browser (which could also have been Tabulator or DISCO or a standard Web Browser).

    Experiment / Demo

    1. Declaratively map the Northwind SQL Schema to RDF using the Virtuoso Meta Schema Language (see: Virtuoso PL based Northwind_SQL_RDF script)
    2. Start browsing the data by clicking on the URIs that represent the RDF Data Model Entities resulting from the SQL to RDF Mapping

    Observations

    1. Via a single Data Link click I was able to obtain specific information about the Customer represented by the URI "ALFKI" (an act of URI Dereferencing, just as you would dereference an Object ID in an Object or Object-Relational Database)
    2. Via a Dynamic Data Page I was able to explore all the entity relationships or specific entity data (i.e., Exploratory or Entity-specific dereferencing) in the Northwind Data Space
    3. I was able to perform similar exploration (as per item 2) using our OpenLink RDF Browser.
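
    To make the "lower JOIN cost" point concrete, a query of the kind one can run once the mapping is in place might look like the sketch below; the graph IRI and ontology namespace are hypothetical stand-ins for whatever the mapping actually declares:

    PREFIX northwind: <http://example.com/northwind/schema#>

    SELECT ?customer ?company ?orderDate
    FROM <http://example.com/Northwind>
    WHERE {
      # Traverse the Order -> Customer relationship exposed by the SQL-to-RDF mapping.
      ?order northwind:has_customer ?customer ;
             northwind:orderDate ?orderDate .
      ?customer a northwind:Customer ;
                northwind:companyName ?company .
    }
    ORDER BY ?orderDate
    LIMIT 10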

    Conclusions

    The vision of data, information, or knowledge at your fingertips is nigh! Thanks to the infrastructure provided by the Semantic Data Web (URIs, RDF Data Model, variety of RDF Serialization Formats[1][2][3], and Shared Data Dictionaries / Schemas / Ontologies [1][2][3][4][5]) it's now possible to Virtualize enterprise data from the Physical Storage Level, through the Logical Data Management Levels (Relational), up to a Concrete Conceptual Model (Graph) without operating system, development environment or framework, or database engine lock-in.

    Next Steps

    We produce a shared ontology for the CRM and Business Reporting Domains. I hope this experiment clarifies how this is quite achievable by converting XML Schemas to RDF Data Dictionaries (RDF Schemas or Ontologies). Stay tuned :-)

    Also watch TimBL amplify and articulate Linked Data value in a recent interview.

    Other Related Matters

    To deliver a mechanism that facilitates the crystallization of this reality is a contribution of boundless magnitude (as we shall all see in due course). Thus, it is easy to understand why even "her majesty", the queen of England, simply had to get in on the act and appoint TimBL to the "British Order of Merit" :-)

    Note: All of the demos above now work with IE & Safari (a "remember what Virtuoso is epiphany") by simply putting Virtuoso's DBMS hosted XSLT engine to use :-) This also applies to my earlier collection of demos from the Hello Data Web and other Data Web & Linked Data related demo style posts.

    ]]>
    Enterprise 0.0, Linked Data, and Semantic Data Webhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1224Tue, 05 Feb 2008 04:19:26 GMT42008-02-04T23:19:26.000001-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Frederick Giasson has put out a number of interesting posts (via his blog) about a conceptual Music Data Space (one of many Data Spaces that will ultimately permeate the Semantic Data Web). Anyway, while reading his initial post covering Music Domain URIs and Linked Data, it occurred to me that by only exposing the raw RDF instance data (RDF/XML format in this case) via URIs for Diana Ross, Paul McCartney, The Beatles, and Madonna, the essence of the post might not be revealed to all, so I've knocked up a few demos to illustrate the core message:

    Note: the enhanced hyperlink (typed data link) lookup presents options to perform an Explore (all data about the subject across Domains in the data space, i.e., data links to and from the Subject) or a Dereference (specific data in the Subject's Domain, i.e., data links originating from the Subject).

    1. Diana Ross
    2. Paul McCartney
    3. The Beatles
    4. Madonna

    I built these Linked Data Pages by simply doing the following:

    1. Open up our OAT based iSPARQL (Interactive SPARQL Query By Example) Tool
    2. Paste a URI of Interest into the Data Source URI input field
    3. Execute the Query (hitting the ">" button)
    4. Saving the Query to WebDAV as a Linked Data Page (or what I initially called Dynamic Data Web pages in my Hello Data Web series of posts).
    5. Share your Data, Information, Knowledge with others via URIs (as shown in the section above).
    ]]>
    Exploring a Music Data Space via Linked Data http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1204Tue, 05 Feb 2008 04:20:47 GMT22008-02-04T23:20:47.000003-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    What's the best way to move Radio UserLand over to a new computer? Without breaking anything. Yeah, I've read the "backup Radio" site, but that's not what I want to do. I want to move my entire Radio license, copy, and all the data contained therein, to a newly-setup computer. I can't get it to work. Any tips?[via The Scobleizer Weblog]

    Well, what I wanted to do, and have successfully achieved, is as follows (this isn't to knock Radio UserLand, which in my opinion is a fabulous piece of pioneering work in the weblog space):
    1. Migrate my Radio Blog Web to a Virtuoso Blog Server (it is a Web Log server that supports: Blogger API 1.0/2.0, Meta-Weblog API, Movable Type, and xmlStorageSystem)
    2. Continue to use Radio as my desktop blogging tool, but also as the local blog server gateway for other tools that I use such as w.bloggar, FM Radio and Newzcrawler

    How was this achieved?

    1. I had to reconfigure the Radio #upstream.xml file so that it points to my Virtuoso Server for xmlStorageSystem Web Publishing
      • This is my modified version of #upstream.xml
        <!-- edited with XMLSPY v5 rel. 3 U (http://www.xmlspy.com) by Kingsley Idehen (OpenLink Software) -->
        <upstream type="xmlStorageSystem" version="1.0">
          <!--This is my Virtuoso WebDAV account-->
          <usernum>kingsley</usernum>
          <name>Kingsley Idehen</name>
          <!--This is my Radio Password Name Reference-->
          <passwordName>default</passwordName>
          <!--This is the Virtuoso instance reference-->
          <server>demo2.usnet.private</server>
          <!--Virtuoso HTTP Server Instance Port Number-->
          <port>8890</port>
          <protocol>soap</protocol>
          <!--Virtuoso XML-RPC or SOAP Endpoint-->
          <rpcPath>/xmlStorageSystem</rpcPath>
          <soapAction>/xmlStorageSystem</soapAction>
        </upstream>

      • You also have to make the following change via the UserLand Radio menu path "Radio"->Window->Radio.root->user->radio->prefs->upstream->servers:
        'serverCapabilities'->flError = true;

    2. Publish my local Radio site, this time to Virtuoso rather than the Userland Community Server destination

    New Architecture

    -------------------------------------------------------------------------
    | Blogging Clients
    -------------------------------------------------------------------------
          |
    -------------------------------------------------------------------------
    | Local Radio Userland Web Server
    -------------------------------------------------------------------------
          |
    -------------------------------------------------------------------------
    | Virtuoso Server (RSS, RDF, XML, SQL etc.. in one place for further use)
    -------------------------------------------------------------------------

    The end result is productive blogging and reusable content storage in my Virtuoso knowledgebase.

     

    ]]>
    <a href="http://radio.weblogs.com/0100059/stories/2002/04/05/howToBackupImportantRadioFiles.html">backup Radio</a>http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/12Thu, 22 Jun 2006 12:56:58 GMT12006-06-22T08:56:58-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Here are some examples of using Exhibit against RDF via SPARQL on the fly:

    1. Flickr photos tagged under rdf and semanticweb
    2. Del.icio.us tags for semanticweb

    The examples above combine OAT and Exhibit. OAT handles the binding to SPARQL.

    Here is a pure OAT variation of the prior examples that includes an enhanced anchor (hyperlink) feature that enables a variety of traversal behaviors and actions against the same RDF Data:

    1. Dynamic Data Web Page for Flickr photos tagged under rdf and semanticweb (click on a URI associated with a jpeg to see metadata for a given picture)
    2. Del.icio.us tags for semanticweb.

    Note: Use the "dereference option" (retrieve/get data associated with URI) for maximum effect. The "explore" is useful after you've dereferenced a few URIs. Also note that columns are resizable, like those in a spreadsheet, which also implies dynamic sorting capability.

    ]]>
    Exhibit & SPARQLhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1158Thu, 20 Mar 2008 04:14:10 GMT62008-03-20T00:14:10-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    Virtuoso joins Boca and ARC 2.0 as RDF Quad or Triple Stores with Full Text Index extensions to SPARQL. Here is our example applied to DBpedia:

    PREFIX dbpedia: <http://dbpedia.org/>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
    SELECT ?name ?birth ?death
    FROM <http://dbpedia.org>
    WHERE {
       ?person dbpedia:birthplace <http://dbpedia.org/resource/Berlin> .
       ?person dbpedia:birth ?birth .
       ?person foaf:name ?name .
       ?person dbpedia:death ?death
       FILTER (?birth < "1900-01-01"^^xsd:date and bif:contains (?name, 'otto')) .
    }
    ORDER BY ?name
    
    
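    For comparison, here is a sketch of the same query in portable SPARQL, using a plain regex FILTER instead of Virtuoso's bif:contains extension; it should run on any compliant endpoint, though without the benefit of the full text index:

    PREFIX dbpedia: <http://dbpedia.org/>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
    SELECT ?name ?birth ?death
    FROM <http://dbpedia.org>
    WHERE {
       ?person dbpedia:birthplace <http://dbpedia.org/resource/Berlin> .
       ?person dbpedia:birth ?birth .
       ?person foaf:name ?name .
       ?person dbpedia:death ?death .
       # Case-insensitive substring match; typically slower than a full-text-index lookup.
       FILTER (?birth < "1900-01-01"^^xsd:date && regex(?name, "otto", "i"))
    }
    ORDER BY ?name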

    You can test further using our SPARQL Endpoint for DBpedia or via the DBPedia bound Interactive SPARQL Query Builder or just click *Here* for results courtesy of the SPARQL Protocol (REST based Web Service).

    Note: This is in-built functionality as Virtuoso has possessed Full Text Indexing since 1998-99. This capability applies to physical and virtual graphs managed by Virtuoso.

    As per usual, there is more to come, as we now have a nice intersection point for SPARQL and XQuery/XPath, since Triple Objects (the Literal variety) can take the form of XML Schema based Complex Types :-) This is a point I alluded to in my podcast interview with Jon Udell last year (*note: the mechanical turk based transcript is bad*). The point I made went something like this: "...you use SPARQL to traverse the typed links and then use XPath/XQuery for further granular access to the data if well-formed..."

    Anyway, the podcast interview led to this InfoWorld article titled: Unified Data Theory.

    ]]>
    SPARQL and Full Text Indexing implementations are growinghttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1157Tue, 13 Mar 2007 10:09:43 GMT12007-03-13T06:09:43-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    OAT: OpenAjax Alliance Compliant Toolkit: "

    Ondrej Zara and his team at OpenLink Software have created an OpenLink Software JS Toolkit, known as OAT. It is a full-blown JS framework, suitable for developing rich applications with a special focus on data access.

    OAT works standalone, offers a vast number of widgets and has some rarely seen features, such as on-demand library loading (which reduces the total amount of downloaded JS code).

    OAT is one of the first JS toolkits which show full OpenAjax Alliance conformance: see the appropriate wiki page and conformance test page.

    There is a lot to see with this toolkit:

    You can see some of the widgets in a Kitchen sink application

    Sample data access applications:

    OAT is Open Source and GPL’ed over at sourceforge and the team has recently managed to incorporate our OAT data access layer as a module to dojo datastore.

    (Via Ajaxian Blog.)

    This is a corrected version of the initial post. Unfortunately, the initial post was inadvertently littered with invalid links :-( Also, since the original post, we have released OAT 1.2, which includes integration of our iSPARQL QBE into the OAT Form Designer application.

    Re. Data Access, it is important to note that OAT's Ajax Database Connectivity layer supports data binding to the following data source types:

    1. RDF - via SPARQL (Query Language, Protocol, and Resultset Serialization formats: RDF/XML, RDF/N3, RDF/Turtle, XML, and JSON)
    2. SQL - via XMLA (somewhat forgotten SOAP protocol for SQL Data Access that can sit atop ODBC, ADO.NET, OLE-DB, and even JDBC)
    3. XML - via SOAP or REST style Web Services

    In all cases, OAT also provides Data Aware controls for the above, including:
    1. Tabular Grids
    2. Pivot Tables
    3. TimeLines
    4. Extended Anchor Tags
    5. Map Service Controls (Google, Yahoo!, OpenLayers, Microsoft Virtual Earth)
    6. SVG based RDF Graph Control (Opera 9.x provides best viewing experience at the current time)

    OAT also includes a number of prototype applications that are completely developed using OAT Controls and Libraries:

    1. Visual SPARQL Query Builder
    2. Visual SQL Query Builder
    3. Web Forms Designer (includes Drag-Drop usage of Data Aware Controls etc.)
    4. Visual DB Designer

    Note: Pick "Local DSN" from page initialization dialog's drop-down list control when prompted

    ]]>
    OAT: OpenAjax Alliance Compliant Toolkit (Live Links Version)http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1129Fri, 02 Feb 2007 15:29:55 GMT22007-02-02T10:29:55-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    I tried to post a comment to Dare Obasanjo's blog post: How Do We Get Rid of Lies on Wikipedia, without success (due to my attempts to add links to the post etc..). Hence a Blog style response instead.

    Dare:

    I have been through the Wikipedia fires a few times. You may recall that I actually triggered the early Web 2.0 Wikipedia article, along the following lines:

    1. Asked one of my staff to start a post with the sole intention of defining Web 2.0 properly
    2. I then attempted to edit the initial post
    3. I left a typo re. REST
    4. Got set on Fire etc... (see very beginning of Wikipedia Web 2.0 history page)

    As annoying as the experience above was, I didn't find this inconsistent with the spirit of Wikipedia (i.e. open contribution and discourse). I felt, at the time, that a lot of historical data was being left in place for future reference etc.. In addition, the ultimate aim of creating an evolving Web 2.0 document did commence albeit some distance from "modern man" re. accuracy and meaningfulness as of my last read (today).

    Even closer to home, I repeated the process above re. Virtuoso Universal Server. This basically ended up being a live case study in how to handle the Wikipedia NPOV conundrum. Just look at the Virtuoso Universal Server Talk Pages to see how the process evolved (the key was Virtuoso's lineage and its proximity to the very DBMS platform upon which Wikipedia runs, i.e., MySQL).

    Bearing in mind the size and magnitude of Microsoft, there should be no reason why Microsoft's "Microsoft Digital Caucus" (legions of Staff, MSDN members, Integrators, and other partners) can't simply go into Wikipedia and participate in the edit and discourse process.

    Truth cannot be suppressed! At best, it can only be temporarily delayed :-) Even more so on the Web!

    ]]>
    Microsoft & Wikipedia Imbrogliohttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1124Thu, 25 Jan 2007 23:47:47 GMT22007-01-25T18:47:47.000001-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    The Music Ontology: "

    A new and exciting project in the Semantic Web area: The Music Ontology, by Frederic Giasson (PTSW, TalkDigger).

    Its goal is to provide a vocabulary to describe Artists, Releases, Songs and so on in RDF. It is mainly based on the MusicBrainz Metadata Vocabulary, but with new improvements such as defining relationships between artists and links to external services. And, most importantly, a lot of triples from the current MusicBrainz database should be available in a few weeks. A mailing-list has been launched for discussions and improvements.

    I had been waiting for this kind of vocabulary (and data) for some time (as I never took the time to look at the MBz database export), especially to easily find all covers of a given song. From another point of view, I'll be happy to use it to represent - and query - various releases of a given record (using the mo:other_release_of property), especially for vinyl records with reissues (so what about a mo:reissue property?) with different colors, inner sleeves ...

    Well, finally, what about converting the FLEX book into RDF to query this huge punk and hardcore database (and use its URIs for want-lists)?

    "

    (Via Alexandre Passant - Terraces.)

    ]]>
    The Music Ontologyhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1104Tue, 26 Dec 2006 16:18:28 GMT12006-12-26T11:18:28.000005-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>
    A quick dump that demonstrates how I integrate tags and links from del.icio.us with links from my local bookmark database via one of my public Data Spaces (this demo uses the kidehen Data Space).

    SPARQL (query language for the Semantic Web) basically enables me to query a collection of typed links (predicates/properties/attributes) in my Data Space (ODS based of course) without breaking my existing local bookmarks database or the one I maintain at del.icio.us.

    I am also demonstrating how Web 2.0 concepts such as Tagging mesh nicely with the more formal concepts of Topics in the Semantic Web realm. The key to all of this is the ability to generate RDF Data Model Instance Data based on Shared Ontologies such as SIOC (from DERI's SIOC Project) and SKOS (again showing that Ontologies and Folksonomies are complementary).

    This demo also shows that Ajax works well in the Semantic Web realm (or web dimension of interaction 3.0), especially when you have a toolkit with Data Aware controls (for SQL, RDF, and XML) such as OAT (OpenLink Ajax Toolkit). For instance, we've successfully used this to build a Visual Query Building Tool for SPARQL (alpha) that really takes a lot of the pain out of constructing SPARQL Queries (there is much more to come on this front re. handling of DISTINCT, FILTER, ORDER BY, etc.).

    For now, take a look at the SPARQL Query dump generated by this SIOC & SKOS SPARQL QBE Canvas Screenshot.

    You can cut and paste the queries that follow into the Query Builder or use the screenshot to build your variation of this query sample. Alternatively, you can simply click on *This* SPARQL Protocol URL to see the query results in a basic HTML Table. And one last thing, you can grab the SPARQL Query File saved into my ODS-Briefcase (the WebDAV repository aspect of my Data Space).

    Note the following SPARQL Protocol Endpoints:

    1. MyOpenLink Data Space
    2. Experimental Data Space SPARQL Query Builder (you need to register at http://myopenlink.net:8890/ods to use this version)
    3. Live Demo Sever
    4. Demo Server SPARQL Query Builder (use: demo for both username and pwd when prompted)

    My beautified version of the SPARQL generated by the QBE (you can cut and paste it into the "Advanced Query" section of the QBE) is presented below:

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX sioc: <http://rdfs.org/sioc/ns#>
    PREFIX dct: <http://purl.org/dc/elements/1.1/>
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
    
    SELECT distinct ?forum_name, ?owner, ?post, ?title, ?link, ?url, ?tag
    FROM <http://myopenlink.net/dataspace>
    WHERE {
      ?forum a sioc:Forum ;
             sioc:type "bookmark" ;
             sioc:id ?forum_name ;
             sioc:has_member ?owner .
      ?owner sioc:id "kidehen" .
      ?forum sioc:container_of ?post .
      ?post dct:title ?title .
      OPTIONAL { ?post sioc:link ?link }
      OPTIONAL { ?post sioc:links_to ?url }
      OPTIONAL { ?post sioc:topic ?topic .
                 ?topic a skos:Concept ;
                        skos:prefLabel ?tag } .
    }

    Unmodified dump from the QBE (this will be beautified automatically in due course by the QBE):

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX sioc: <http://rdfs.org/sioc/ns#>
    PREFIX dct: <http://purl.org/dc/elements/1.1/>
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
    
    SELECT ?var8 ?var9 ?var13 ?var14 ?var24 ?var27 ?var29 ?var54 ?var56 WHERE { graph ?graph { ?var8 rdf:type sioc:Forum . ?var8 sioc:container_of ?var9 . ?var8 sioc:type "bookmark" . ?var8 sioc:id ?var54 . ?var8 sioc:has_member ?var56 . ?var9 rdf:type sioc:Post . OPTIONAL {?var9 dc:title ?var13} . OPTIONAL {?var9 sioc:links_to ?var14} . OPTIONAL {?var9 sioc:link ?var29} . ?var9 sioc:has_creator ?var37 . OPTIONAL {?var9 sioc:topic ?var24} . ?var24 rdf:type skos:Concept . OPTIONAL {?var24 skos:prefLabel ?var27} . ?var56 rdf:type sioc:User . ?var56 sioc:id "kidehen" . } }

    Current missing items re. Visual QBE for SPARQL are:

    1. Ability to Save properly to WebDAV so that I can then expose various saved SPARQL Queries (.rq file) from my Data Space via URIs
    2. Handling of DISTINCT, FILTERS (note: OPTIONAL is handled via dotted predicate-links)
    3. General tidying up re. click event handling etc.
    Note: You can even open up your own account (using our Live Demo or Live Experiment Data Space servers) which enables you to repeat this demo by doing the following (post registration/sign-up):
    1. Export some bookmarks from your local browser to the usual HTML bookmarks dump file
    2. Create an ODS-Bookmarks Instance using your new ODS account
    3. Use the ODS-Bookmark Instance to import your local bookmarks from the HTML dump file
    4. Repeat the same import sequence using the ODS-Bookmark Instance, but this time pick the del.icio.us option
    5. Build your query (change 'kidehen' to your ODS-user-name)
    6. That's it: you now have a Semantic Web presence in the form of a Data Space for your local and del.icio.us-hosted bookmarks, with tags integrated

    Quick Query Builder Tip: You will need to import the following (using the Import Button in the Ontologies & Schemas side-bar):

    1. http://www.w3.org/1999/02/22-rdf-syntax-ns# (RDF)
    2. http://rdfs.org/sioc/ns# (SIOC)
    3. http://purl.org/dc/elements/1.1/ (Dublin Core)
    4. http://www.w3.org/2004/02/skos/core# (SKOS)

    Browser Support: The SPARQL QBE is SVG based and currently works fine with the following browsers: Firefox 1.5/2.0, Camino (Cocoa variant of Firefox for Mac OS X), WebKit (Safari pre-release / advanced sibling), and Opera 9.x. We are evaluating the use of the Adobe SVG plugin re. IE 6/7 support.

    Of course this should be a screencast, but I am in the middle of a plethora of things right now :-)

    ]]>
    SPARQL, Ajax, Tagging, Folksonomies, Share Ontologies and Semantic Webhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1095Wed, 13 Dec 2006 20:09:50 GMT112006-12-13T15:09:50-05:00Kingsley Uyi Idehen <kidehen@openlinksw.com>

    Google vs Semantic Web: "Google exec challenges Berners-Lee 'At the end of the keynote, however, things took a different turn. Google Director of Search and AAAI Fellow Peter Norvig was the first to the microphone during the Q&A session, and he took the opportunity to raise a few points.

    'What I get a lot is: 'Why are you against the Semantic Web?' I am not against the Semantic Web. But from Google's point of view, there are a few things you need to overcome, incompetence being the first,' Norvig said. Norvig clarified that it was not Berners-Lee or his group that he was referring to as incompetent, but the general user.'

    Related: Google Base -- summing up."

    (Via More News.)

    When will we drop the ill-conceived notion that end-users are incompetent?

    Has it ever occurred to software developers and technology vendors that incompetent, dumb, and other contemptuous end-user adjectives simply reflect the inability of most technology products to surmount end-user "Interest Activation Thresholds"?

    Interest Activation Threshold (IAT)? What's That?

    I have a fundamental personal belief that all human beings are intelligent. Our ability to demonstrate intelligence, or be perceived as intelligent, is directly proportional to our interest level in a given context. In short, we have "Ambivalence Quotients" (AQs) just as we have "Intelligence Quotients" (IQs).

    An interested human being is an inherently intelligent entity. The abstract nature of human intelligence also makes locating the IQ and AQ on/off buttons a mercurial quest at the best of times.

    Technology end-users exhibit high AQs, most of the time due to the inability of most technology products to truly engage, and ultimately stimulate genuine interest, by surmounting IAT and reducing AQ.

    Ironically, when a technology vendor is lagging behind its competitors in the "features arms race", it is commonplace to use the familiar excuse: "our end-users aren't asking for this feature".

    Note To Google:

    Ambivalence isn't incompetence. If end-users were genuinely incompetent, how is it that they run rings around your page rank algorithms by producing google-friendly content at the expense of valuable context? What about the deteriorating value of AdSense due to click fraud? Likewise, the continued erosion of the value of your once exemplary "keyword based search" service? As we all know, necessity is the mother of invention, so when users develop high AQs because there is nothing better, we end up with a forced breach of "IAT"; which is why the issues that I mention remain long term challenges for you. Ironically, the so-called "incompetents" are already outsmarting you, and you don't seem to comprehend this reality or its inevitable consequences.

    Finally, how are you going to improve value without integrating the Semantic Web vision into your R&D roadmap? I can tell you categorically that you have little or no wiggle room re. this matter, especially if you want to remain true to your "don't be evil" mantra. My guess is that you will incorporate Semantic Web technologies sooner rather than later (Google Co-op is a big clue). I would even go as far as predicting a Google hosted SPARQL Query Endpoint alongside your GData endpoints during the next 6-12 months (if even that long). I believe that your GData protocol (like the rest of Web 2.0) will ultimately accelerate your appreciation of the data model dexterity that RDF brings to the loosely coupled knowledge networks espoused by the Semantic Web vision.

    Google & Semantic Web Paradox

    The Semantic Web vision has the RDF graph data model at its core (and for good reason), but even more confusing for me, as I process Google sentiments about the Semantic Web, is the fact that RDF's actual creator (Ramanathan Guha aka. Guha) currently works at Google. There's a strange disconnect here IMHO.

    If I recall correctly, Google wants to organize the world's data and information, leaving the knowledge organization to someone else, which is absolutely fine. What is increasingly irksome is the current tendency to use corporate stature to generate Fear, Uncertainty, and Doubt when the subject matter is the "Semantic Web".

    BTW - I've just read Frederick Giasson's perspective on the Google Semantic Web paradox which ultimately leads to the same conclusions regarding Google's FUD stance when dealing with matters relating to the Semantic Web.

    I wonder if anyone is tracking the google hits for "fud google semantic web"?

    ]]>
    Google vs Semantic Webhttp://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1018Sat, 29 Jul 2006 23:55:57 GMT32006-07-29T19:55:57-04:00Kingsley Uyi Idehen <kidehen@openlinksw.com>