Community Member Blogs

The URI, URL, and Linked Data Meme's Generic HTTP URI (Updated) [ Kingsley Uyi Idehen ]

Situation Analysis

As the "Linked Data" meme has gained momentum you've more than likely been on the receiving end of dialog with Linked Open Data community members (myself included) that goes something like this:

"Do you have a URI", "Get yourself a URI", "Give me a de-referencable URI" etc..

And each time, you respond with a URL -- which to the best of your Web knowledge is a bona fide URI. But to your utter confusion you are told: Nah! You gave me a Document URI instead of the URI of a real-world thing or object etc..

What's up with that?

Well, our everyday use of the Web conflates two distinct things that each have Identity: Real World Objects (RWOs) and the Addresses/Locations of Documents (Information-bearing Resources).

The "Linked Data" meme is about enhancing the Web by unobtrusively reintroducing its core essence: the generic HTTP URI, a vital piece of Web Architecture DNA. Basically, its about so realizing the full capabilities of the Web as a platform for Open Data Identification, Definition, Access, Storage, Representation, Presentation, and Integration.

What is a Real World Object?

People, Places, Music, Books, Cars, Ideas, Emotions, etc.

What is a URI?

A Uniform Resource Identifier: a global identifier mechanism for network-addressable data items. Its sole function is Name-oriented Identification.

URI Generic Syntax

The constituent parts of a URI (from the URI Generic Syntax RFC, RFC 3986) are illustrated below.
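For reference, here is the example URI used in RFC 3986, broken out into those constituent parts:

  foo://example.com:8042/over/there?name=ferret#nose

  scheme    : foo
  authority : example.com:8042
  path      : /over/there
  query     : name=ferret
  fragment  : nose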

What is a URL?

A location-oriented, HTTP-scheme-based URI. The HTTP scheme introduces a powerful and inherent duality that delivers:

  1. Resource Address/Location Identifier
  2. Data Access mechanism for an Information-bearing Resource (Document, File, etc.)

So far so good!

What is an HTTP based URI?

The kind of URI Linked Data aficionados mean when they use the term: URI.

An HTTP URI is an HTTP-scheme-based URI. Unlike a URL, this kind of HTTP-scheme URI is devoid of any Web Location orientation or specificity. Thus, its inherent duality provides a more powerful level of abstraction, and you can use this form of URI to assign Names/Identifiers to Real World Objects (RWOs). Even better, courtesy of the Identity/Address duality of the HTTP scheme, a single URI can deliver the following:

  1. RWO Identifier/Name
  2. RWO Metadata document Locator (courtesy of URL aspect)
  3. Negotiable Representation of the Located Document (courtesy of HTTP's content negotiation feature).
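A minimal sketch of how that duality plays out on the wire, using a hypothetical RWO URI and the common 303-redirect pattern (the server declines to return the real-world thing itself and instead points at a document about it, in whatever representation the client negotiated):

  GET /id/Paris HTTP/1.1
  Host: example.org
  Accept: text/turtle

  HTTP/1.1 303 See Other
  Location: http://example.org/data/Paris.ttl

A follow-up GET on the Location then yields the Turtle (or HTML, or RDF/XML) metadata document describing the thing.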

What is Metadata?

Data about Data. Put differently, data that describes other data in a structured manner.

How Do we Model Metadata?

The predominant model for metadata is the Entity-Attribute-Value + Classes & Relationships model (EAV/CR), a model that's been with us since the inception of modern computing (long before the Web).

What about RDF?

The Resource Description Framework (RDF) is a framework for describing Web-addressable resources. In a nutshell, it's a framework for adding Metadata-bearing Information Resources to the current Web. It's comprised of:

  1. An Entity-Attribute-Value (aka Subject-Predicate-Object) plus Classes & Relationships (Data Dictionaries, e.g., OWL) metadata model
  2. A plethora of instance data representation formats that include: RDFa (when doing so within (X)HTML docs), Turtle, N3, TriX, RDF/XML, etc.
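As a hedged illustration (the subject and object URIs below are hypothetical; the FOAF vocabulary terms are real), a single EAV/CR description expressed in Turtle, showing a Subject, its Class, an Attribute with a literal Value, and a Relationship to another Subject:

  @prefix foaf: <http://xmlns.com/foaf/0.1/> .

  <http://example.org/id/kingsley>                   # Entity / Subject (a RWO URI)
      a          foaf:Person ;                       # Class membership
      foaf:name  "Kingsley Uyi Idehen" ;             # Attribute with a literal Value
      foaf:knows <http://example.org/id/timbl> .     # Relationship to another RWO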

What's the Problem Today?

The ubiquitous use of the Web is primarily focused on a Linked Mesh of Information-bearing Documents. URLs, rather than generic HTTP URIs, are the prime mechanism for weaving the Web's tapestry; basically, we use URLs to conduct Information -- which is inherently subjective -- instead of using HTTP URIs to conduct "Raw Data" -- which is inherently objective.

Note: Information is "data in context"; it isn't the same thing as "Raw Data". Thus, if we can link to Information via the Web, why shouldn't we be able to do the same for "Raw Data"?

How Does the Linked Data meme solve the problem?

The meme simply provides a set of guidelines (best practices) for producing Web-architecture-friendly metadata. Meaning: when producing EAV/CR-model-based metadata, endow Subjects, their Attributes, and (optionally) Attribute Values with HTTP URIs. By doing so, a new level of Link Abstraction on the Web becomes possible, i.e., "Data Item to Data Item" level links (aka hyperdata links). Even better, when you de-reference a RWO hyperdata link you end up with a negotiated representation of its metadata.
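To make that concrete, de-referencing a RWO URI such as a DBpedia identifier with an RDF-preferring Accept header should follow the redirect and return a Turtle description of the thing rather than an HTML page (a sketch; the exact representations on offer depend on the server):

  curl -L -H "Accept: text/turtle" http://dbpedia.org/resource/Linked_data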

Conclusion

Linked Data is ultimately about an HTTP URI for each item in the Data Organization Hierarchy :-)

Related

  1. History of how "Resource" became part of URI - historic account by TimBL
  2. Linked Data Design Issues Document - TimBL's initial Linked Data Guide
  3. Linked Data Rules Simplified - My attempt at simplifying the Linked Data Meme without SPARQL & RDF distraction
  4. Linked Data & Identity - another related post
  5. The Linked Data Meme's Value Proposition
  6. So What Does "HREF" stand for anyway?
  7. My Del.icio.us hosted Bookmark Data Space for Identity Schemes
  8. TimBL's Ted Talk re. "Raw Linked Data"
  9. Resource Oriented Architecture
  10. More Famous Than Simon Cowell.
# PermaLink Comments [2]
08/07/2009 14:34 GMT Modified: 03/28/2010 12:19 GMT
1995 [ Kingsley Uyi Idehen ]

1995: "

1995 (and the early 90's) must have been a visionary's time of dreaming… most of their dreams are happening today.

Watch Steve Jobs (then of NeXT) discuss what he thinks will be popular in 1996 and beyond at OpenStep Days 1995:

Here's a spoiler:

  • There is static web document publishing
  • There is dynamic web document publishing
  • People will want to buy things off the web: e-commerce

The thing that OpenStep proposed is:

What Steve was suggesting was one of the beginnings of the Data Web! Yep, Portable Distributed Objects and the Enterprise Objects Framework were among the influences on the Semantic Web / Linked Data Web… not surprising, as Tim Berners-Lee designed the initial web stack on a NeXT computer!

I’m going to spend a little time this evening figuring out how much ‘distributed objects’ stuff has been taken from the OpenStep stuff into the Objective-C + Cocoa environment. (<- I guess I must be quite geeky ;-))

"

(Via Daniel Lewis.)

# PermaLink Comments [1]
06/04/2008 21:05 GMT Modified: 06/06/2008 07:54 GMT
Semantic Data Web Epiphanies: One Node at a Time [ Kingsley Uyi Idehen ]

In 2006, I stumbled across Jason Kolb (online) via a 4-part series of posts titled: Reinventing the Internet. At the time, I realized that Jason was postulating about what is popularly known today as "Data Portability", so I made contact with him (blogosphere style) via a post of my own titled: Data Spaces, Internet Reinvention, and the Semantic Web. Naturally, I tried to unveil to Jason the connection between his vision and the essence of the Semantic Web. Of course, he was skeptical :-)

Jason recently moved to Massachusetts, which led to me pinging him about our earlier blogosphere encounter and the emergence of a Data Portability community. I also informed him that TimBL, myself, and a number of other Semantic Web technology enthusiasts frequently meet on the 2nd Tuesday of each month at the MIT-hosted Cambridge Semantic Web Gatherings, to discuss, demonstrate, and debate all aspects of the Semantic Web. Luckily (for both of us), Jason attended the last event, and we got to meet each other in person.

Following our face-to-face meeting in Cambridge, a number of follow-on conversations ensued covering Linked Data and practical applications of the Semantic Web vision. Jason writes about our exchanges in a recent post titled: The Semantic Web. His passion for Data Portability enabled me to use OpenID and FOAF integration to connect the Semantic Web and Data Portability via the Linked Data concept.

During our conversations, Jason also alluded to the fact that he had already encountered OpenLink Software while working with our ODBC Drivers (part of our UDA product family) for IBM Informix (Single-Tier or Multi-Tier Editions) a few years ago (an interesting random connection).

As I've stated in the past, I've always felt that the Semantic Web vision will materialize by way of a global epiphany. The countdown to this inevitable event started, ironically, at the birth of the blogosphere. It accelerated more recently through the emergence of Web 2.0 and Social Networking, even more ironically :-)

The blogosphere started the process of Data Space coalescence via RSS/Atom-based semi-structured data enclaves; Web 2.0 propagated Web Service usage en route to creating service-provider-controlled data and information silos; Social Networking brought attention to the fact that User Generated Data wasn't actually owned or controlled by the Data Creators; etc.

The emergence of "Data Portability" has created a palatable moniker for a clearly defined, and slightly easier to understand, problem: the meshing of Data and Identity in cyberspace, i.e., individual points of presence in cyberspace in the form of "Personal Data Spaces in the Clouds" (think: doing really powerful stuff with .name domains). In a sense, this is the critical inflection point between the document-centric "Web of Linked Documents" and the data-centric "Web of Linked Data". There is absolutely no other way to solve this problem in a manner that alleviates the imminent challenges presented by information overload -- resulting from the exponential growth of user generated data across the Internet and enterprise Intranets.

# PermaLink Comments [0]
01/17/2008 22:59 GMT Modified: 01/18/2008 02:27 GMT
2008, Facebook Data Portability, and the Giant Global Graph of Linked Data [ Kingsley Uyi Idehen ]

As 2007 came to a close I repeatedly mulled over the idea of putting together the usual "year in review" and a set of predictions for the coming year. Anyway, the more I pondered, the smaller the list became. While pondering (as 2008 rolled around), the Blogosphere was set ablaze with Robert Scoble's announcement of his account suspension by Facebook. Of course, many chimed in, expressing views on either side of the ensuing debate: who is right -- Scoble or Facebook? The more I assimilated the views expressed about this event, the more ironic I found the general discourse, for the following reasons:

  1. Web 2.0 is fundamentally about Web Services as the prime vehicle for interactions across "points of Web presence"
  2. Facebook is a Web 2.0 hosted service for social networking that provides Web Services APIs for accessing data in the Facebook data space. You have to do so "on the fly" within clearly defined constraints i.e you can interact with data across your social network via Facebook APIs, but you cannot cache the data (perform an export style dump of the data)
  3. Facebook is a main driver of the term "social graph", but their underlying data model is relational, and the Web Services response (the data you get back) doesn't return a data graph; instead it returns a tree (i.e., XML)
  4. Scoble's had a number of close encounters with Linked Data Web | Semantic Data Web | Web 3.0 aficionados in various forms throughout 2007, but still doesn't quite make the connection between Web Services APIs as part of a processing pipeline that includes structured data extraction from XML data en route to producing Data Graphs comprised of Data Objects (Entities) endowed with: Unique Identifiers, Classification or Categorization schemes, Attributes, and Relationships prescribed by one or more shared Data Dictionaries/Schemas/Ontologies
  5. A global information bus that exposes a Linked Data mesh comprised of Data Objects, Object Attributes, and Object Relationships across "points of Web presence" is what TimBL described in 1998 (Semantic Web Roadmap) and more recently in 2007 (Giant Global Graph)
  6. The Linked Data mesh (i.e Linked Data Web or GGG) is anchored by the use of HTTP to mint Location, Structure, and Value independent Object Identifiers called URIs or IRIs. In addition, the Linked Data Web is also equipped with a query language, protocol, and results serialization format for XML and JSON called: SPARQL.

So, unlike Scoble, I am able to make my Facebook Data portable without violating Facebook rules (no data caching outside Facebook realm) by doing the following:

  1. Use an RDFizer for Facebook to convert XML response data from Facebook Web Services into RDF "on the fly", ensuring that my RDF is comprised of Object Identifiers that are HTTP-based and thereby dereferenceable (i.e., I can use SPARQL to unravel the Linked Data Graph in my Facebook data space)
  2. The act of data dereferencing enables me to expose my Facebook Data as Linked Data associated with my Personal URI
  3. This interaction only occurs via my data space, and in all cases the interactions with data work via my RDFizer middleware (e.g., the Virtuoso Sponger) that talks directly to Facebook Web Services.

In a nutshell, my Linked Data Space enables you to reference data in my data space via Object Identifiers (URIs), and in some cases the Object IDs and Graphs are constructed on the fly via RDFization middleware.
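As a hedged sketch of what querying such a space might look like (the data-space URI below is hypothetical, and the actual vocabulary depends on the RDFizer; FOAF is used purely for illustration), a SPARQL query over the RDFized graph could read:

  PREFIX foaf: <http://xmlns.com/foaf/0.1/>

  SELECT ?friend ?name
  WHERE {
    <http://example.org/dataspace/person/kidehen#this> foaf:knows ?friend .
    ?friend foaf:name ?name .
  }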

Here are my URIs that provide different paths to my Facebook Data Space:

To conclude, 2008 is clearly the inflection year during which we will finally unshackle Data and Identity from the confines of "Web Data Silos" by leveraging the HTTP-, SPARQL-, and RDF-induced virtues of Linked Data.

Related Posts:

  1. 2008 and the Rise of Linked Data
  2. Scoble Right, Wrong, and Beyond
  3. Scoble interviewing TimBL (note to Scoble: re-watch your interview since he made some specific points about Linked Data and URIs that you need to grasp)
  4. Prior Blog posts in this Blog Data Space that include the literal patterns: Scoble Semantic Web
# PermaLink Comments [0]
01/05/2008 17:11 GMT Modified: 01/07/2008 11:44 GMT
Retrospective and Outlook for 2008 [ Virtuso Data Space Bot ]

At the close of the year, I'll give a little recap of the past year in terms of Virtuoso development and a look at where we are headed in 2008.

A year ago, I was in the middle of redoing the Virtuoso database engine for better SMP performance.  We redid the way traversal of index structures and cache buffers was serialized for SMP and generally compared Virtuoso and Oracle engines function by function.  We had just returned from the ISWC 2006 in Athens, Georgia and the Virtuoso database was becoming a usable triple store.

Soon thereafter, we confirmed that all this worked when we put out the first cut of DBpedia with Chris Bizer et al. and were working with Alan Ruttenberg on what would become the Banff health care and life sciences demo.

The WWW 2007 conference in Banff, Canada, was a sort of kick-off for the Linking Open Data movement, which started as a community project under SWEO, the W3C interest group for Semantic Web Education and Outreach, and has gained a life of its own since.

Right after WWW 2007 the Virtuoso development effort split into two tracks: one for enhancing the then-new 5.0 release and one for building a new generation of Virtuoso, notably featuring clustering and double storage density for RDF.

The first track produced constant improvements to the relational-to-RDF mapping functionality, SPARQL enhancements, and Redland-, Jena-, and Sesame-compatible client libraries with Virtuoso as a triple store.  These things have been out with testers for a while and are all generally available as of this writing.

The second track started with adding key compression to the storage engine, specifically with regard to RDF, even though there are some gains in relational applications as well.  With RDF, the space consumption drops to about half, all without recourse to any compression that is incompatible with random access, such as gzip.  Since the start of August, we turned to clustering and are now code complete, pretty much with all the tricks one would expect: full-function SQL, taking advantage of colocated joins, and doing aggregation and generally all possible processing where the data is.  I have covered details of this along the way in previous posts.  The key point is that now the thing is written and works with test cases.

 

In late October, we were at the W3C workshop on mapping relational data to RDF.  For us, this confirmed the importance of mapping and of scalability in general.  Ivan Herman proposed forming a W3C incubator group on benchmarking.  A W3C incubator group on relational-to-RDF mapping is also being formed.

Now, scalability has two sides.  One is dealing with volume and the other is dealing with complexity.  Volume alone will not help if interesting queries cannot be formulated.  Hence we recently extended SPARQL with subqueries so that we can now express at least the workloads expressible in SQL, which was previously not the case.  It is something of a contradiction in terms to say that SPARQL is the universal language for information integration while not being able to express, for example, the TPC-H queries.  Well, we fixed this.  A separate post will highlight how.  The W3C process will eventually follow, as the necessity of these things is undeniable, on the unimpeachable authority of the whole SQL world.  Anyway, for now, SPARQL as it is ought to become a recommendation, and extensions can be addressed later.
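As a hedged illustration of the kind of query a subquery makes expressible (hypothetical FOAF data; the syntax shown is the nested-SELECT style later standardized in SPARQL 1.1, not necessarily Virtuoso's exact syntax at the time), an aggregate computed in an inner SELECT and then filtered and ordered in the outer query:

  PREFIX foaf: <http://xmlns.com/foaf/0.1/>

  SELECT ?person ?cnt
  WHERE {
    {
      SELECT ?person (COUNT(?friend) AS ?cnt)
      WHERE { ?person foaf:knows ?friend }
      GROUP BY ?person
    }
    FILTER (?cnt > 10)
  }
  ORDER BY DESC(?cnt)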

For now, the only RDF benchmark that seems to be out there is the loading part of the LUBM.  We did a couple of enhancements of our own for that just recently but much bigger things are on the way.  Also, the billion triples challenge is an interesting initiative in the area.  We all recognize that loading any number of triples is a finite problem with known solutions.  The challenge is running interesting queries on large volumes. 

Our present emphasis is demonstrating both RDF data warehousing and RDF mapping with complex queries and large data.  We start with the TPC-H benchmark, doing the queries both through mapping to SQL against any RDBMS (Oracle, DB2, Virtuoso, or other) and by querying the physical RDF rendition of the data in Virtuoso.  From there, we move to querying a collection of RDBMSs hosting similar data.

Doing this with performance at the level of direct SQL in the case of mapping, and not very much slower with physical triples, is an important milestone on the way to a real-world enterprise data web.  Real life has harder and more unexpected issues than a benchmark, but at any rate doing the benchmark without breaking a sweat is a step on the way.  We sent a paper to ESWC 2008 about that, but it was rather incomplete.  By the time of the VLDB submission deadline in March we'll have more meat.

Another tack soon to start is a rearchitecting of Zitgist around clustered Virtuoso.  Aside from matters of scale, we will make a number of qualitatively new things possible.  Again, more will be released in the first quarter of '08.

Beyond these short- and mid-term goals we have the introduction of entirely dynamic and demand-driven partitioning, à la Google Bigtable or Amazon Dynamo.  Regular partitioning will do for a while yet, but this is the future when we move to the vision of linked data everywhere.

In conclusion, this year we have built the basis and next year is about deployment.  The bulk of really new development is behind us and now we start applying it.  Also, the community will find adoption easier due to our recent support of the common RDF APIs.

 
# PermaLink Comments [0]
12/18/2007 07:22 GMT
More on RDF and Vertical Storage [ Virtuso Data Space Bot ]
We actually did the experiment I mentioned a couple of posts back, about storing RDF triples column-wise.

The test loads 4.8 million triples of LUBM data and reads the whole set on one index and then checks if it finds the same row on another index.

Reading GSPO and checking OGPS takes 27 seconds.  Doing the same with column wise bitmap indices on S, G, P and O takes 86 seconds.   The latter checks the existence of the row by AND'ing 4 bitmap indices and the former checks its existence by a single lookup in a multi-part index whose last part is a bitmap.  The result is approximately what one would expect.  The bitmap AND could be optimized a bit, dropping the time to maybe 70 seconds. 

Now speaking of compression, it is true that column storage will work better.  For example the G and P columns will compress to pretty much nothing.  On a row layout they compress too but not to nothing since even if a value is not unique you have to store the place where the value is if you want to read rows in constant time per row.

What is nice with the 4 bitmaps is that no combination of search conditions is penalized.  But the trick of using bitmaps for self-join is lost:  You can't evaluate {?s a Person . ?s name "Mary"} by and'ing the S bitmaps for persons and for subjects named "Mary".
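To see why, here is a hedged sketch of the self-join that pattern implies over a generic quad table (the table and column names are hypothetical; the :params stand for the dictionary-encoded IDs of rdf:type, Person, name, and "Mary"):

  -- The same subject must appear in two different rows,
  -- i.e., a join on S, not a per-row AND of column bitmaps.
  SELECT a.S
    FROM QUAD a, QUAD b
   WHERE a.P = :rdf_type AND a.O = :Person
     AND b.P = :name     AND b.O = :Mary
     AND a.S = b.S;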

The 4 bitmap indices are remarkably compact, though: 8840 pages altogether. We could probably get the G, S, P, O columns into 3000 pages or so, using very little compression. The OGPS index is 5169 pages and the GSPO index is 21243 pages.

None of the figures have any compression, except what a bitmap naturally produces.

Now we have figured out a modified row layout which will about double the working set with the same memory and keep things in rows.  We will try that.  The GSPO index will be about 10000 pages and OGPS will be about 4500.  We do not expect much impact on search or insert times.

We looked at using gzip for database pages.  They go to between 1/4 and 1/3 of a page.  But this does not improve the working set, and having variable-length pages generates all kinds of special cases you don't want.  So we will improve the working set first and deal with somewhat compressed data in the execution engine.
After that, maybe gzip will cut the size to 1/2 or so, but that will be good for disk only.  And it does not so much matter how much you transfer but how many seeks you do.

Still, column-wise storage will likely win for size.  So if the working set is much larger than memory this may have an edge.  To keep all bases covered we will eventually add this as an option.
 



# PermaLink Comments [0]
06/11/2007 04:35 GMT Modified: 06/11/2007 04:36 GMT
Announcing Virtuoso Open-Source Edition v5.0.0 [ Virtuso Data Space Bot ]
All, OpenLink Software are pleased to announce a new release of Virtuoso, Open-Source Edition, version 5.0.0. This version includes:
  • Significant rewrite of database engine resulting in 50%-100% improvement on single CPU and in some cases up to 300% on multiprocessor CPUs by decreasing resource-contention between threads and other optimizations.
  • Radical expansion of RDF support including
  • In-built middleware (called the Sponger) for transforming non-RDF into RDF "on the fly" (e.g. producing Triples from Microformats, REST-style Web Services, and (X)HTML etc.)
  • Full Text Indexing of Literal Objects in Triple Patterns (via Filter or magic bif:contains predicate applied to Literal Objects)
  • Basic Inferencing (Subclass and Subproperty Support)
  • SPARQL Aggregate Functions
  • SPARQL Update Language Support (Updates, Inserts, Deletions in SPARQL)
  • Improved Support of XML Schema Type System (including the use of XML Schema Complex Types as Objects of bif:xcontains predicate)
  • Enhancements to the in-built SPARQL to SQL Compiler's Cost Optimizer
  • Performance Optimizations to RDF VIEWs (SQL to RDF Mapping)
  • Various bug-fixes
NOTE: Databases created with earlier versions of Virtuoso will be automatically upgraded to Virtuoso 5.0, but after upgrade will not be readable with older Virtuoso versions.

For more information please see:

  • Virtuoso Open Source Edition --
    Home Page: http://virtuoso.openlinksw.com/wiki/main/
    Download Page: http://virtuoso.openlinksw.com/wiki/main/Main/VOSDownload
  • OpenLink Data Spaces --
    Home Page: http://virtuoso.openlinksw.com/wiki/main/Main/OdsIndex
    SPARQL Usage Examples (re. SIOC, FOAF, AtomOWL, SKOS): http://virtuoso.openlinksw.com/wiki/main/Main/ODSSIOCRef
    Interactive SPARQL Demo: http://demo.openlinksw.com/isparql/
  • OpenLink AJAX Toolkit (OAT) --
    Project Page: http://sourceforge.net/projects/oat
    Live Demonstration: http://demo.openlinksw.com/DAV/JS/oat/index.html


# PermaLink Comments [0]
04/12/2007 13:48 GMT Modified: 04/12/2007 09:50 GMT
Announcing Virtuoso Open-Source Edition v5.0.0 [ Orri Erling ]
All, OpenLink Software are pleased to announce a new release of Virtuoso, Open-Source Edition, version 5.0.0. This version includes:
  • Significant rewrite of database engine resulting in 50%-100% improvement on single CPU and in some cases up to 300% on multiprocessor CPUs by decreasing resource-contention between threads and other optimizations.
  • Radical expansion of RDF support including
  • In-built middleware (called the Sponger) for transforming non-RDF into RDF "on the fly" (e.g. producing Triples from Microformats, REST-style Web Services, and (X)HTML etc.)
  • Full Text Indexing of Literal Objects in Triple Patterns (via Filter or magic bif:contains predicate applied to Literal Objects)
  • Basic Inferencing (Subclass and Subproperty Support)
  • SPARQL Aggregate Functions
  • SPARQL Update Language Support (Updates, Inserts, Deletions in SPARQL)
  • Improved Support of XML Schema Type System (including the use of XML Schema Complex Types as Objects of bif:xcontains predicate)
  • Enhancements to the in-built SPARQL to SQL Compiler's Cost Optimizer
  • Performance Optimizations to RDF VIEWs (SQL to RDF Mapping)
  • Various bug-fixes
NOTE: Databases created with earlier versions of Virtuoso will be automatically upgraded to Virtuoso 5.0, but after upgrade will not be readable with older Virtuoso versions.

For more information please see:

  • Virtuoso Open Source Edition --
    Home Page: http://virtuoso.openlinksw.com/wiki/main/
    Download Page: http://virtuoso.openlinksw.com/wiki/main/Main/VOSDownload
  • OpenLink Data Spaces --
    Home Page: http://virtuoso.openlinksw.com/wiki/main/Main/OdsIndex
    SPARQL Usage Examples (re. SIOC, FOAF, AtomOWL, SKOS): http://virtuoso.openlinksw.com/wiki/main/Main/ODSSIOCRef
    Interactive SPARQL Demo: http://demo.openlinksw.com/isparql/
  • OpenLink AJAX Toolkit (OAT) --
    Project Page: http://sourceforge.net/projects/oat
    Live Demonstration: http://demo.openlinksw.com/DAV/JS/oat/index.html


# PermaLink Comments [0]
04/12/2007 10:30 GMT Modified: 04/12/2007 06:30 GMT
Describing the Semantic Data Web (Take 3) [ Kingsley Uyi Idehen ]

Scobleizer's had a Semantic Web Epiphany but can't quite nail down what he's discovered in layman's prose :-)

Well, I'll have a crack at helping him out, i.e., defining the Semantic Data Web in simple terms with linked examples :-)

Tip: Watch the recent TimBL video interview re. the Semantic Data Web before, during, or after reading this post.

Here goes!

The popular Web is a "Web of Documents". The Semantic Data Web is a "Web of Data". Going down a level, the popular web connects documents across the web via hyperlinks; the Semantic Data Web connects data on the web via hyperlinks. Next level: hyperlinks on the popular web have no inherent meaning (they lack context beyond "there is another document"), while hyperlinks on the Semantic Data Web have inherent meaning (they possess context: "there is a Book", "there is a Person", "this is a piece of Music", etc.).
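A hedged side-by-side sketch of that difference (the URIs are hypothetical; the FOAF terms are real): a plain hyperlink only says "there is another document", whereas a typed link names the relationship and the kind of thing at the other end:

  Web of Documents -- an untyped hyperlink inside an (X)HTML page:

    <a href="http://example.org/page/timbl">Tim Berners-Lee</a>

  Web of Data -- typed links expressed in Turtle:

    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    <http://example.org/id/danc>  foaf:knows  <http://example.org/id/timbl> .
    <http://example.org/id/timbl> a           foaf:Person .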

Very simple example:

Click the traditional web document URLs for Dan Connolly and Tim Berners-Lee. Then attempt to discern how they are connected. Of course you will see some obvious connections by reading the text, but you won't easily discern other data-driven connections. Basically, this is no different from reading about either individual in a print journal, bar the ability to click on hyperlinks that open up other pages. The Data Extraction process remains labour intensive :-(

Repeat the exercise using the traditional web document URLs as Data Web URIs: this time around, paste the hyperlinks above into an RDF-aware Browser (in this case the OpenLink RDF Browser). Note, we are making a subtle but critical change, i.e., the URLs are now being used as Semantic Data Web URIs (a small-big-deal kind of thing).

If you're impatient or simply strapped for time (aren't we all these days), simply take a look at these links:

  1. Dan Connolly (DanC) RDF Browser Session permalink
  2. Tim Berners-Lee (TimBL) RDF Browser Session permalink
  3. TimBL and DanC combined RDF Browser Session permalink

Note: There are other RDF Browsers out there such as:

  1. Tabulator
  2. DISCO
  3. Objectviewer

All of these RDF Browsers (or User Agents) demonstrate the same core concepts in subtly different ways.

If I haven't lost you, proceed to a post I wrote a few weeks ago titled: Hello Data Web (Take 3 - Feel the "RDF" Force).

If you've made it this far, simply head over to DBpedia for a lot of fun :-)

Note Re. my demos: we make use of SVG in our RDF Browser, which makes the demos incompatible with IE (6 or 7) and Safari. That said, Firefox (1.5+), Opera 9.x, WebKit (Open Source Safari), and Camino work fine.

Note to Scoble:

All the Blogs, Wikis, Shared Bookmarks, Image Galleries, Discussion Forums, and the like are Semantic Web Data Spaces. The great thing about all of this is that, through RSS 2.0's wild popularity, the Blogosphere has done what I postulated a while back: the Semantic Web would be self-annotating, and so it has come to be :-)

To prove the point above: paste your blog's URL into the OpenLink RDF Browser and see it morph into a Semantic Data Web URI (a pointer to Web Data that you've created) once you click the "Query" button (click on the TimeLine tab for full effect). The same applies to del.icio.us, Flickr, Googlebase, and basically any REST-style Web Service, as per my RDF Middleware post.

Lazy Semantic Web Callout:

If you're a good animator (pro or hobbyist), please produce an animation of a document going through a shredder. The strips that emerge from the shredder represent the granular data that was once the whole document. The same thing is happening on the Web right now: we are putting photocopies of (X)HTML documents through the shredder (in a good way) en route to producing granular items of data that remain connected to the original copy while developing new and valuable connections to other items of Web Data.

That's it!

# PermaLink Comments [0]
04/05/2007 20:50 GMT Modified: 04/13/2007 17:15 GMT