OpenLink Software
Burlington, United States

RDF Browsers & RDF Data Middleware [ Kingsley Uyi Idehen ]

Frederick Giasson penned an interesting post earlier today that highlighted the RDF Middleware services offered by Triplr and the Virtuoso Sponger.

Some Definitions (as per usual):

RDF Middleware (as defined in this context) is about producing RDF from non-RDF Data Sources. This implies that you can use non-RDF Data Sources (e.g., (X)HTML Web Pages, (X)HTML Web Pages hosting Microformats, and even Web Services such as those from Google, Del.icio.us, Flickr, etc.) as Semantic Web Data Source URIs (pointers to RDF Data).

In this post I would like to provide a similar perspective on this ability to treat non-RDF data as RDF, from an RDF Browser perspective.

First off, what's an RDF Browser?

An RDF Browser is a piece of technology that enables you to browse RDF Data Sources by way of Data Link Traversal. The key difference between this approach and traditional browsing is that Data Links are typed (they possess inherent meaning and context), whereas traditional links are untyped (although we have universally been trained to treat them as links to blurbs in the form of (X)HTML pages, or what is popularly called "Web Content").
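The Data Link Traversal idea can be sketched in a few lines. This is a toy, in-memory illustration (no real HTTP dereferencing takes place; the URIs and the foaf:knows predicate are purely illustrative):

```python
# Each "data space" maps a URI to the triples you would get by dereferencing it.
GRAPHS = {
    "http://example.org/kidehen": [
        ("http://example.org/kidehen", "foaf:knows", "http://example.org/fgiasson"),
    ],
    "http://example.org/fgiasson": [
        ("http://example.org/fgiasson", "foaf:workplaceHomepage", "http://example.org/zitgist"),
    ],
}

def traverse(start: str, depth: int = 2):
    """Follow object URIs breadth-first, 'dereferencing' each one we can resolve."""
    seen, frontier, triples = {start}, [start], []
    for _ in range(depth):
        next_frontier = []
        for uri in frontier:
            for triple in GRAPHS.get(uri, []):
                triples.append(triple)
                obj = triple[2]  # a typed Data Link: the object is itself dereferenceable
                if obj in GRAPHS and obj not in seen:
                    seen.add(obj)
                    next_frontier.append(obj)
        frontier = next_frontier
    return triples

print(traverse("http://example.org/kidehen"))
```

The point of the sketch: because links are typed, the browser knows what each hop means (a person known, a workplace), rather than just that there is "a page over there".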

There are a number of RDF Browsers that I am aware of (note: pop me a message directly or by way of a comment to this post if you have a browser that I am unaware of), and they include (in order of creation and availability):

  1. Tabulator
  2. DISCO - Hyperdata Browser
  3. OpenLink Ajax Toolkit's RDF Browser (a component of the OAT Javascript Toolkit)

Each of the browsers above can consume the services of Triplr or the Virtuoso Sponger en route to unveiling RDF Data that is traversable via URI dereferencing (HTTP GETting the data exposed by the Data Pointer). Thus, you can cut and paste the following into each of the aforementioned RDF Browsers:

  1. Triplr's RDF Data (Triples) extractions from Dan Connolly's Home Page
  2. The Virtuoso Sponger's RDF Data (Triples) extractions from Dan Connolly's Home Page
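The general shape of such middleware service URIs can be sketched as follows; the base URL patterns below are hypothetical approximations for illustration, not the services' documented endpoints:

```python
from urllib.parse import quote

# Hypothetical endpoint patterns -- the real Triplr and Sponger URL shapes may differ.
TRIPLR_BASE = "http://triplr.org/turtle/"
SPONGER_BASE = "http://example.openlinksw.com/proxy?url="

def triplr_uri(source_uri: str) -> str:
    """Compose a Triplr-style extraction URI: the source URL rides as a path suffix."""
    return TRIPLR_BASE + source_uri.removeprefix("http://")

def sponger_uri(source_uri: str) -> str:
    """Compose a Sponger-style proxy URI: the source URL rides as a query parameter."""
    return SPONGER_BASE + quote(source_uri, safe="")

page = "http://www.w3.org/People/Connolly/"
print(triplr_uri(page))
print(sponger_uri(page))
```

Either way, the result is itself a URI that an RDF Browser can dereference, which is what makes the non-RDF source behave like a Semantic Web Data Source.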

Since we are all time challenged (naturally!) you can also just click on these permalinks for the OAT RDF Browser demos:

  1. Permalink for Triplr's RDF Data (Triples) extractions from Dan Connolly's Home Page
  2. Permalink for the Virtuoso Sponger's RDF Data (Triples) extractions from Dan Connolly's Home Page
# PermaLink Comments [0]
03/28/2007 19:17 GMT Modified: 04/29/2007 14:59 GMT
Recent Virtuoso Developments [ Orri Erling ]

We have been working extensively on virtual database refinements. There are many SQL cost model adjustments to better model distributed queries, and we now support direct access to Oracle and Informix statistics system tables. Thus, when you attach a table from one or the other, you automatically get up-to-date statistics. This helps Virtuoso optimize distributed queries. The documentation has also been updated on these points, with a new section on distributed query optimization.

On the applications side, we have been keeping up with the SIOC RDF ontology developments. All ODS applications now make their data available as SIOC graphs for download and SPARQL query access.
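As a sketch of what SPARQL query access to such SIOC graphs looks like, the following composes a protocol-style GET request. The endpoint host is hypothetical; the query uses the real SIOC namespace:

```python
from urllib.parse import urlencode

# Hypothetical host; ODS instances conventionally expose a /sparql endpoint.
endpoint = "http://example-ods-host/sparql"

query = """
PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX dc:   <http://purl.org/dc/elements/1.1/>
SELECT ?post ?title
WHERE {
  ?post a sioc:Post ;
        dc:title ?title .
}
LIMIT 10
"""

# Per the SPARQL protocol, the query travels as a URL-encoded 'query' parameter.
request_url = endpoint + "?" + urlencode({"query": query})
print(request_url)
```

An HTTP GET on `request_url` would return the matching posts from the SIOC graph in the endpoint's chosen result serialization.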

What is most exciting, however, is our advance in mapping relational data into RDF. We now have a mapping language that makes arbitrary legacy data, in Virtuoso or elsewhere in the relational world, RDF queryable. We will put out a white paper on this in a few days.
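The core idea behind such a mapping can be illustrated in miniature: each row becomes a subject URI and each column a predicate. (Virtuoso's actual mapping language is declarative and far richer; the base URI and naming scheme below are purely illustrative.)

```python
def row_to_triples(table: str, pk: str, row: dict, base: str = "http://example.org/"):
    """Turn one relational row into RDF-style triples.

    The primary-key column is used to mint the subject URI; every other
    column becomes a (subject, predicate, object) triple.
    """
    subject = f"{base}{table}/{row[pk]}"
    return [(subject, f"{base}schema/{table}#{col}", val)
            for col, val in row.items() if col != pk]

triples = row_to_triples("employees", "id", {"id": 7, "name": "Ada", "dept": "R&D"})
print(triples)
```

Once rows are exposed this way, a SPARQL processor can query legacy tables as if they were native triple stores, without copying the data.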

Also, we have some innovations in mind for optimizing the physical storage of RDF triples. We keep experimenting, now with our sights set on the high end of triple storage, towards billion-triple data sets. We are experimenting with a new, more space-efficient index structure for better working-set behavior. Next week will yield the first results.

# PermaLink Comments [0]
09/19/2006 10:59 GMT Modified: 09/19/2006 07:45 GMT
Data Spaces and Web of Databases [ Kingsley Uyi Idehen ]

Note: An updated version of a previously unpublished blog post:

Continuing from our recent Podcast conversation, Jon Udell sheds further insight into the essence of our conversation via a “Strategic Developer” column article titled: Accessing the web of databases.

Below, I present an initial dump of a DataSpace FAQ that hopefully sheds light on the DataSpace vision espoused during my podcast conversation with Jon.

What is a DataSpace?

A moniker for Web-accessible atomic containers that manage and expose Data, Information, Services, Processes, and Knowledge.

What would you typically find in a Data Space? Examples include:

  • Raw Data - SQL, HTML, XML (raw), XHTML, RDF etc.

  • Information (Data In Context) - XHTML (various microformats), Blog Posts (in RSS, Atom, RSS-RDF formats), Subscription Lists (OPML, OCS, etc.), Social Networks (FOAF, XFN, etc.), and many other forms of applied XML.
  • Web Services (Application/Service Logic) - REST or SOAP based invocation of application logic for context sensitive and controlled data access and manipulation.
  • Persisted Knowledge - Information in actionable context that is also available in transient or persistent forms expressed using a Graph Data Model. A modern knowledgebase would more than likely have RDF as its Data Language, RDFS as its Schema Language, and OWL as its Domain Definition (Ontology) Language. Actual Domain, Schema, and Instance Data would be serialized using formats such as RDF/XML, N3, Turtle, etc.

How do Data Spaces and Databases differ?
Data Spaces are fundamentally problem-domain-specific database applications. They offer functionality that you would instinctively expect of a database (e.g. ACID data management) with the additional benefit of being data model and query language agnostic. Data Spaces are for the most part DBMS Engine and Data Access Middleware hybrids in the sense that ownership and control of data is inherently loosely coupled.

How do Data Spaces and Content Management Systems differ?
Data Spaces are inherently more flexible, they support multiple data models and data representation formats. Content management systems do not possess the same degree of data model and data representation dexterity.

How do Data Spaces and Knowledgebases differ?
A Data Space cannot dictate the perception of its content. For instance, what I may consider knowledge relative to my Data Space may not be the case to a remote client that interacts with it from a distance. Thus, defining my Data Space purely as a Knowledgebase introduces constraints that reduce its broader effectiveness to third-party clients (applications, services, users, etc.). A Knowledgebase is based on a Graph Data Model, resulting in significant impedance for clients that are built around alternative models. To reiterate, Data Spaces support multiple data models.

What Architectural Components make up a Data Space?

  • ORDBMS Engine - for Data Modeling agility (via complex purpose specific data types and data access methods), Data Atomicity, Data Concurrency, Transaction Isolation, and Durability (aka ACID).

  • Virtual Database Engine - for creating a single view of, and access point to, heterogeneous SQL, XML, Free Text, and other data. This is all about Virtualization at the Data Access Level.
  • Web Services Platform - enabling controlled access and manipulation (via application, service, or protocol logic) of Virtualized or Disparate Data. This layer handles the decoupling of functionality from monolithic wholes for function specific invocation via Web Services using either the SOAP or REST approach.

Where do Data Spaces fit into the Web's rapid evolution?
They are an essential part of the burgeoning Data Web / Semantic Web. In short, they will take us from data “Mash-ups” (combining web accessible data that exists without integration and repurposing in mind) to “Mesh-ups” (combining web accessible data that exists with integration and repurposing in mind).

Where can I see a DataSpace along the lines described, in action?

Just look at my blog, and take the journey as follows:

What about other Data Spaces?

There are several and I will attempt to categorize along the lines of query method available:
Type 1 (Free Text Search over HTTP):
Google, MSN, Yahoo!, Amazon, eBay, and most Web 2.0 plays.

Type 2 (Free Text Search and XQuery/XPath over HTTP)
A few blogs and Wikis (Jon Udell's and a few others)

Type 3 (RDF Data Sets and SPARQL Queryable):
Type 4 (Generic Free Text Search, OpenSearch, GData, XQuery/XPath, and SPARQL):
Points of Semantic Web presence such as the Data Spaces at:

What About Data Space aware tools?

  •    OpenLink Ajax Toolkit - provides Javascript Control level binding to Query Services such as XMLA for SQL, GData for Free Text, OpenSearch for Free Text, SPARQL for RDF, in addition to service specific Web Services (Web 2.0 hosted solutions that expose service specific APIs)
  •    Semantic Radar - a Firefox Extension
  •    PingTheSemantic - the Semantic Web's equivalent of Web 2.0's weblogs.com
  •    PiggyBank - a Firefox Extension

# PermaLink Comments [1]
08/28/2006 19:38 GMT Modified: 09/04/2006 18:58 GMT
Contd: Ajax Database Connectivity Demos [ Kingsley Uyi Idehen ]

Last week I put out a series of screencast style demos that sought to demonstrate the core elements of our soon to be released Javascript Toolkit called OAT (OpenLink Ajax Toolkit) and its Ajax Database Connectivity layer.

The screencasts covered the following functionality realms:

  1. SQL Query By Example (basic)
  2. SQL Query By Example (advanced - pivot table construction)
  3. Web Form Design (basic database driven map based mashup)
  4. Web Form Design (advanced database driven map based mashup)

To bring additional clarity to the screencast demos and OAT in general, I have saved a number of documents that are the by-products of activities in the screencasts:

  1. Live XML Document produced using SQL Query By Example (basic) (you can drag and drop columns across the grid to reorder and sort the presentation)
  2. Live XML Document produced using QBE and Pivot Functionality (you can drag and drop the aggregate columns and rows to create your own views etc..)
  3. Basic database driven map based mashup (works with Firefox, Webkit, Camino; click on pins to see national flag)
  4. Advanced database driven map based mashup (works with Firefox, Webkit, Camino; records 36, 87, and 257 will unveil pivots via lookup pin)

Notes:

  • “Advanced”, as used above, simply means that I am embedding images (employee photos and national flags) and a database driven pivot into the map pins that serve as details lookups in classic SQL master/details type scenarios.
  • The “Ajax Call In Progress..” dialog is there to show live interaction with a remote database (in this case Virtuoso but this could be any ODBC, JDBC, OLEDB, ADO.NET, or XMLA accessible data source)
  • The data access magic source (if you want to call it that) is XMLA - a standard that has been in place for years but completely misunderstood and, as a result, underutilized
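For the curious, XMLA is a SOAP-over-HTTP protocol, so a client (Ajax or otherwise) simply POSTs an Execute envelope to the service endpoint. A minimal sketch of such an envelope follows; the statement and DataSourceInfo value are illustrative, while the namespace and Execute/Command/Statement structure come from the XMLA specification:

```python
XMLA_NS = "urn:schemas-microsoft-com:xml-analysis"

def execute_envelope(statement: str, dsn: str = "DemoDSN") -> str:
    """Build an XMLA Execute SOAP envelope carrying a SQL statement."""
    return f"""<SOAP-ENV:Envelope
  xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <Execute xmlns="{XMLA_NS}">
      <Command><Statement>{statement}</Statement></Command>
      <Properties><PropertyList>
        <DataSourceInfo>{dsn}</DataSourceInfo>
      </PropertyList></Properties>
    </Execute>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""

print(execute_envelope("SELECT * FROM Customers"))
```

Because the transport is plain HTTP and the payload plain XML, the same envelope works from a browser's XMLHttpRequest, which is exactly what makes XMLA a natural fit for Ajax database connectivity.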

You can see a full collection of saved documents at the following locations:

# PermaLink Comments [0]
06/01/2006 22:48 GMT Modified: 06/22/2006 08:56 GMT
My podcast conversation with Jon Udell [ Kingsley Uyi Idehen ]

Jon and I had a chat yesterday that is now available in Podcast form.

"In my fourth Friday podcast we hear from Kingsley Idehen, CEO of OpenLink Software. I wrote about OpenLink's universal database and app server, Virtuoso, back in 2002 and 2003. Earlier this month Virtuoso became the first mature SQL/XML hybrid to make the transition to open source. The latest incarnation of the product also adds SPARQL (a semantic web query language) to its repertoire. ..."

(Via Jon's Radio.)

I would like to make an important clarification re. the GData Protocol and what is popularly dubbed "Adam Bosworth's fingerprints." I do not believe in a single solution (a simple one for the sake of simplicity) to a deceptively complex problem. Virtuoso supports Atom 1.0 (syndication only at the current time) and Atom 0.3 (syndication and publication, which have been in place for years).
BTW - the GData Protocol and Atom 1.0 publishing support will be delivered in both the Open Source and Commercial Edition updates to Virtuoso next week (very little work due to what's already in place).

I make the clarification above to eliminate the possibility of assuming mutual exclusivity between my perspective/vision and Adam's (Jon also makes this important point when he speaks about our opinions being on either side of a spectrum/continuum). I simply want to broaden the scope of this discussion. I am a profound believer in the Semantic Web / Data Web vision, and I predict that we will be querying the Googlebase via SPARQL in the not too distant future (this doesn't mean that netizens will be forced to master SPARQL, absolutely not! But there will be conduit technologies that deal with this matter).

Side note: I actually last spoke with Adam at the NY Hilton in 2000 (the day I unveiled Virtuoso to the public for the first time, in person). We bumped into each other and I told him about Virtuoso (at the time the big emphasis was SQL to XML and the vocabulary we had chosen re. SQL extension...), and he told me about his departure from Microsoft and the commencement of his new venture (CrossGain, prior to his stint at BEA). What struck me even more was his interest in Linux and Open Source (bearing in mind this was about three or so weeks after he departed Microsoft).

If you are encountering Virtuoso for the first time via this post or Jon's, please make time to read the product history article on the Virtuoso Wiki (which is one of many Virtuoso based applications that make up our soon to be released OpenLink DataSpace offering).

That said, I better go listen to the podcast :-)

# PermaLink Comments [0]
04/28/2006 14:43 GMT Modified: 06/29/2006 10:14 GMT
Are You a Google Away from Being Amazoned? [ Kingsley Uyi Idehen ]

This piece from SD Times, which I simply do not agree with, led me to the question: are you a "google" away from being "amazoned"?

Here is the excerpt in SD Times that irked me so much:

Eric Newcomer, CTO of Iona Technologies PLC, argues that avoiding vendor lock-in is not the most important role played by standards. "We hear a lot about the importance of standards. And the standards argument usually centers on guarding against vendor lock-in, since lock-in can be an expensive prospect. You will even find that most vendors readily acknowledge this benefit. While I do not dispute that avoiding vendor lock-in is of some importance, I do argue that of far more significance is the role industry standards play in reducing the overall cost of developing software and increasing developer productivity, especially for enterprise applications. What's needed is a common way of programming to any language or operating system, and a common way of communicating between any two or more programs. Heterogeneous hardware, operating-system and software environments are the main problems that businesses have, and will continue to have into the foreseeable future."

The benefit of standards is to prevent lock-in, be it vendor or technology lock-in. There is a lot of hype around the Real-Time Enterprise vision, and most technology vendors (OpenLink included) have realization of this vision as part of their value proposition. Any enterprise that is locked into a technology or vendor is simply abdicating a timeless responsibility to attain the enterprise agility levels espoused by the Real-Time Enterprise vision.

The real cost of engaging any technology or vendor is all about the long-term impact on the customer's ability to respond to market inflections via existing and future IT infrastructure.

A standards-based IT infrastructure enables a company to dispose of those components that impede its ability to sustain desired agility levels. Put differently, standards enable companies to assemble IT infrastructure from an increasingly heterogeneous pool of vendors. Thus, a company should be able to mix and match "best of class" IT infrastructure components in line with Enterprise Agility goals, something that is only attainable via a commitment to standards-based infrastructure components in the first place.

An enterprise cannot be locked into a database, operating system, programming language, or technology religion and expect to be agile. Failure to engage standards ultimately implies that you are a "google" away from being "amazoned" in your chosen marketplace. Be forewarned!

# PermaLink Comments [0]
06/30/2004 19:59 GMT Modified: 09/01/2006 17:06 GMT
         
Powered by OpenLink Virtuoso Universal Server
Running on Linux platform