
OpenLink Software
Burlington, United States

VLDB 2009 (1 of 5) [ Orri Erling ]

I was at the VLDB 2009 conference in Lyon, France. In the next few posts, I will discuss some of the prominent themes and how they relate to our products or to RDF and Linked Data.

Firstly, RDF was as good as absent from the presentations and discussions we saw. There were a few mentions in the panel on structured data on the web; however, RDF was not in any way seen to be essential for this. There were also a couple of RDF mentions in questions at other sessions, but that was about it.

It is a common perception that RDF and database people do not talk with each other. Evidence seems to bear this out.

As a database developer I did get a lot of readily applicable ideas from the VLDB talks. These run across the whole range of DBMS topics, from key compression and SQL optimization, to column storage, CPU cache optimization, and the like. In this sense, VLDB is directly relevant to all we do. In a conversation, someone was mildly confused that I should on one hand mention I was doing RDF, and on the other hand also be concerned about database performance. These things are not seen to belong together, even though making RDF do something useful certainly depends on a great deal of database optimization.

The question of all questions — that of infinite scale-out with complex queries, resilience, replication, and full database semantics — was strongly in the air.

But it was in the air more as a question than as an answer. Not very much at all was said about the performance of distributed query plans, of 2PC (two-phase commit), of the impact of interconnect latency, and such things. On the other hand, people were talking quite liberally about optimizing CPU cache and local multi-core execution, not to mention SQL plans and rewrites. Also, almost nothing was said about transactions.

Still, there is bound to be a great deal of work in scale-out of complex workloads by any number of players. Either these things are all figured out and considered self-evidently trivial, or they are so hot that people will go there only by way of allusion and vague reference. I think it is the latter.

By and large, we were confirmed in our understanding that infinite scale-out on the go, with redundancy, is the ticket, especially if one can offer complex queries and transactional semantics coupled with instant data loading and schema-last.

Column storage and cache optimizations seem to come right after these.

Certainly the database space is diversifying.

MapReduce was discussed quite a bit, as an intruder into what would be the database turf. We have no great problem with MapReduce; we do that in SQL procedures if one likes to program in this way. Greenplum also seems to have come by the same idea.

As said before, RDF and RDF reasoning were ignored. Do these actually offer something to the database side? Certainly for search, discovery, integration, and resource discovery, linked data has evident advantages.

Two points of the design space — the warehouse, and the web-scale key-value store — got a lot of attention. Would I do either in RDF? RDF is a slightly different design space point, like key-value with complex queries — on the surface, a fusion of the two. As opposed to RDF, the relational warehouse gains from fixed data-types and task-specific layout, whether row or column. The key-value store gains from having a concept of a semi-structured record, a bit like the RDF subject of a triple, but now with ad-hoc (if any) secondary indices, and inline blobs. The latter is much simpler and more compact than the generic RDF subject with graphs and all, and can be easily treated as a unit of version control and replication mastering. RDF, being more generic and more normalized, is representationally neither as ad-hoc nor as compact.
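To make the representational difference concrete, here is a minimal Python sketch. The record, the `user:1` identifier, and both function names are made up for illustration; neither a key-value store nor an RDF store is required to work this way internally. It shows one semi-structured record flattened into subject-predicate-object triples and then regrouped by subject.

```python
from collections import defaultdict

def record_to_triples(subject, record):
    """Flatten one key-value record into (subject, predicate, object) triples."""
    triples = []
    for key, value in record.items():
        # A multi-valued field becomes one triple per value.
        values = value if isinstance(value, list) else [value]
        for v in values:
            triples.append((subject, key, v))
    return triples

def triples_to_records(triples):
    """Regroup triples by subject, recovering one record per subject."""
    records = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        records[s][p].append(o)
    return {s: dict(props) for s, props in records.items()}

# Hypothetical record, for illustration only.
user = {"name": "Alice", "likes": ["jazz", "chess"]}
triples = record_to_triples("user:1", user)
records = triples_to_records(triples)
```

The record is a single compact unit, while the triple form repeats the subject on every row; in exchange, every property becomes uniformly addressable by a query.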

But RDF will be the natural choice when complex queries and ad-hoc schema meet, for example in web-wide integrations of application data.

There seems to be a huge divide in understanding between database-developing people and those who would be using databases. On one side, this has led to a back-to-basics movement with no SQL, no ACID, key-value pairs instead of schema, MapReduce instead of fancy but hard-to-follow parallel execution plans. On the other side, the database space specializes more and more; it is no longer simply transactions vs. analytics, but many more points of specialization.

Some frustration can be sensed in the ivory towers of science when it is seen that the ones most in need of database understanding in fact have the least. Google, Yahoo!, and Microsoft know what they are doing, with or without SQL, but the medium-size or fast-growing web sites seem to be in confusion when LAMP or Ruby or the scripting-du-jour can no longer cut it.

Can somebody using a database be expected to understand how it works? I would say no, not in general. Can a database be expected to unerringly self-configure based on workload? Sure, a database can suggest layouts, but it ought not restructure itself on the spur of the moment under full load.

It is safe to say that the community at large no longer believes in "one size fits all". Since there is no general solution, there is a fragmented space of specific solutions. We will be looking at some of these issues in the following posts.

# PermaLink Comments [0]
09/01/2009 11:30 GMT Modified: 09/01/2009 16:53 GMT
Faceted Search: Unlimited Data in Interactive Time [ Orri Erling ]

Why not see the whole world of data as facets? Well, we'd like to, but there is the feeling that this is not practical.

The old problem has been that it is not really practical to pre-compute counts of everything for all possible combinations of search conditions and counting/grouping/sorting. The actual matches take time.

Well, neither is in fact necessary. When there are large numbers of items matching the conditions, counting them can take time, but this is the beginning of the search, and the user is not even likely to look very closely at the counts. It is enough to see that there are many of one and few of another. If the user already knows the precise predicate or class to look for, then the top-level faceted view is not even needed. The faceted view for guiding search and precise analytics are two different problems.

There are client-side faceted views like Exhibit or our own ODE. The problem with these is that there are a few orders of magnitude difference between the actual database size and what fits on the user agent. This is compounded by the fact that one does not know what to cache on the user agent because of the open nature of the data web. If this were about a fixed workflow, then a good guess would be possible — but we are talking about the data web, the very soul of serendipity and unexpected discovery.

So we made a web service that will do faceted search on arbitrary RDF. If it does not get complete results within a timeout, it will return what it has counted so far, using Virtuoso's Anytime feature. Looking for subjects with some specific combination of properties is however a bit limited, so this will also do JOINs. Many features are one or two JOINs away; take geographical locations or social networks, for example.
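The idea of returning whatever has been counted when the timeout hits can be sketched in a few lines of Python. This is only an illustration of the principle, not how Virtuoso implements Anytime; the function name, its parameters, and the injectable `clock` are assumptions made for the example.

```python
import time
from collections import Counter

def anytime_facet_counts(items, facet_of, timeout_s, clock=time.monotonic):
    """Count facet values until the input is exhausted or the budget runs out.

    Returns (counts, complete). `complete` is False when the deadline
    cut the scan short, in which case `counts` is a partial answer.
    """
    deadline = clock() + timeout_s
    counts = Counter()
    for item in items:
        if clock() >= deadline:
            return counts, False  # partial result, flagged as incomplete
        counts[facet_of(item)] += 1
    return counts, True

# Hypothetical data: each item is already its own facet value.
items = ["book", "book", "film", "film", "book"]
full_counts, full_done = anytime_facet_counts(items, lambda x: x, 60)

# A fake clock that advances one tick per call makes the cutoff deterministic.
ticks = iter(range(100))
part_counts, part_done = anytime_facet_counts(
    items, lambda x: x, 3, clock=lambda: next(ticks)
)
```

The caller always gets an answer plus a flag saying whether it is exact, which is all a top-level faceted view needs to show "many of one and few of another".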

Yet a faceted search should be point-and-click, and should not involve a full query construction. We put the compromise at starting with full text or property or class, then navigating down properties or classes, to arbitrary depth, tree-wise. At each step, one can see the matching instances or their classes or properties, all with counts, faceted-style.

This is good enough for queries like 'what do Harry Potter fans also like' or 'who are the authors of articles tagged semantic web and machine learning and published in 2008'. For complex grouping, sub-queries, arithmetic or such, one must write the actual query.

But one can begin with facets, and then continue refining the query by hand since the service also returns SPARQL text. We made a small web interface on top of the service with all logic server side. This proves that the web service is usable and that an interface with no AJAX, and no problems with browser interoperability or such, is possible and easy. Also, the problem of syncing between a user-agent-based store and a database is entirely gone.

If we are working with a known data structure, the user interface should choose the display by the data type and offer links to related reports. This is all easy to build as web pages or AJAX. We show how the generic interface is done in Virtuoso PL, and you can adapt that or rewrite it in PHP, Java, JavaScript, or anything else, to accommodate use-case specific navigation needs such as data format.

The web service takes an XML representation of the search, which is more restricted and easier to process by machine than the SPARQL syntax. The web service returns the results, the SPARQL query it generated, whether the results are complete or not, and some resource use statistics.
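As a rough sketch of why a tree-shaped XML description is easy to process by machine, the Python below turns a small facet tree into SPARQL text. The element names (`query`, `class`, `property`), the `iri` attribute, and the variable naming are invented for this example and should not be read as the actual schema of the service; see the Programmer's Guide for the real format.

```python
import itertools
import xml.etree.ElementTree as ET

def facet_xml_to_sparql(xml_text):
    """Translate a tiny, made-up facet description into SPARQL text."""
    root = ET.fromstring(xml_text)
    fresh = itertools.count(1)  # supplies numbers for new variables
    patterns = []

    def walk(node, subject):
        for child in node:
            if child.tag == "class":
                patterns.append(f"{subject} a <{child.get('iri')}> .")
            elif child.tag == "property":
                obj = f"?s{next(fresh)}"
                patterns.append(f"{subject} <{child.get('iri')}> {obj} .")
                # Nested conditions apply to the object: one JOIN deeper.
                walk(child, obj)

    walk(root, "?s0")
    body = "\n  ".join(patterns)
    return f"SELECT DISTINCT ?s0\nWHERE {{\n  {body}\n}}"

# Hypothetical input: persons who know some other person.
example = """
<query>
  <class iri="http://xmlns.com/foaf/0.1/Person"/>
  <property iri="http://xmlns.com/foaf/0.1/knows">
    <class iri="http://xmlns.com/foaf/0.1/Person"/>
  </property>
</query>
"""
sparql = facet_xml_to_sparql(example)
```

Each level of nesting in the tree becomes one more JOIN in the generated query, which matches the point-and-click navigation "down properties or classes, to arbitrary depth".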

The source of the PL functions, Web Service and Virtuoso Server Page (HTML UI) will be available as part of Virtuoso 6.0 and higher. A Programmer's Guide will be available as part of the standard Virtuoso Documentation collection, including the Virtuoso Open Source Edition Website.

# PermaLink Comments [0]
01/09/2009 22:03 GMT Modified: 01/09/2009 17:15 GMT
Where Are All the RDF-based Semantic Web Applications? [ Kingsley Uyi Idehen ]

In response to the "Semantic Web Technology" application classification scheme espoused by ReadWriteWeb (RWW), emphasized in the post titled: Where are all the RDF-based Semantic Web Apps?, here is my attempt to clarify and reintroduce what OpenLink Software offers (today) in relation to Semantic Web technology.

From the RWW Top-Down category, which I interpret as technologies that produce RDF from non-RDF data sources, our product portfolio comprises the following: Virtuoso Universal Server, OpenLink Data Spaces, OpenLink Ajax Toolkit, and OpenLink Data Explorer (which includes Ubiquity commands).

Virtuoso Universal Server functionality summary:

  1. Generation of RDF Linked Data Views of SQL, XML, and Web Services in general
  2. Deployment of RDF Linked Data
  3. "On the Fly" generation of RDF Linked Data from Document Web information resources (i.e. distillation of entities from their containers e.g. Web pages) via Cartridges / Drivers
  4. SPARQL query language support
  5. SPARQL extensions that bring SPARQL closer to SQL, e.g., Aggregates, Update, Insert, Delete, and Named Graph support (i.e. use of logical names to partition RDF data within Virtuoso's multi-model dbms engine)
  6. Inference Engine (currently in use re. DBpedia via Yago and UMBEL)
  7. Hosts and exposes data from Drupal, WordPress, MediaWiki, phpBB3 as RDF Linked Data via in-built support for PHP runtime
  8. Available as an EC2 AMI
  9. etc.

OpenLink Data Spaces functionality summary:

  1. Simple mechanism for Linked Data Web enabling yourself by giving you an HTTP based User ID (a de-referencable URI) that is linked to a FOAF based Profile page and OpenID
  2. Binds all your data sources (blogs, wikis, bookmarks, photos, calendar items etc.) to your URI so you can "Find" things by only remembering your URI
  3. Makes your profile page and personal URI the focal point of Linked Data Web presence
  4. Delivers Data Portability (using data access by value or data access by reference) across data silos (e.g. Web 2.0 style social networks)
  5. Allows you to make annotations about anything in your own Data Space(s) on the Web without exposure to RDF markup
  6. A Briefcase feature that provides a WebDAV driven RDF Linked Data variant of functionality seen in Mac OS X Spotlight and WinFS with the addition of SPARQL compliance
  7. Automatically generates RDFa in its (X)HTML pages
  8. Blog, Wiki, WebDAV File Server, Shared Bookmarks, Calendar, and other applications that look and feel like Web 2.0 counterparts but emit RDF Linked Data amongst a plethora of data exchange formats
  9. Available as an EC2 AMI
  10. etc.

OpenLink Ajax Toolkit functionality summary:

  1. Provides binding to SQL, RDF, XML, and Web Services via Ajax Database Connectivity Layer (you only need an ODBC, JDBC, OLE-DB, ADO.NET, XMLA Driver, or Web Service on the backend for dynamic data access from JavaScript)
  2. All controls are Ajax Database Connectivity bound (widgets get their data from Ajax Database Connectivity data sources)
  3. Bundled with Virtuoso and ODS installations.
  4. etc.

OpenLink Data Explorer functionality summary:

  1. Distills entities associated with information resource style containers (e.g. Web Pages or files) as RDF Linked Data
  2. Exposes the RDF based Linked Data graph associated with information resources (see the Linked Data behind Web pages)
  3. Ubiquity commands for invoking the above
  4. Available as a Hosted Service or Firefox Extension
  5. Bundled with Virtuoso and ODS installations
  6. etc.

Note:

Of course you could have simply looked up OpenLink Software's FOAF based Profile page (*note the Linked Data Explorer tab*), or simply passed the FOAF profile page URL to a Linked Data aware client application such as: OpenLink Data Explorer, Zitgist Data Viewer, Marbles, and Tabulator, and obtained information. Remember, OpenLink Software is an Entity of Type: foaf:Organization, on the burgeoning Linked Data Web :-)


# PermaLink Comments [3]
10/01/2008 19:09 GMT Modified: 10/02/2008 15:27 GMT
Response to: Whole Data Post (Update 3) [ Kingsley Uyi Idehen ]

This post is in response to Glenn McDonald's post titled: Whole Data, where he highlights a number of issues relating to "Semantic Web" marketing communications and overall messaging, from his perspective.

By coincidence, Glenn and I presented at this month's Cambridge Semantic Web Gathering.

I've provided a dump of Glenn's issues and my responses below:

Issue - RDF

  • Ingenious data decomposition idea, but:
  • too low-level; the assembly language of data, where we need Java or Ruby
  • "resource" is not the issue; there's no such thing as "metadata", it's all data; "meta" is a perspective
  • lists need to be effortless, not painful and obscure
  • nodes need to be represented, not just implied; they need types and literals in a more pervasive, integrated way.

Response:

RDF stands for Resource Description Framework; it is a Graph-based Data Model. The Metadata angle comes from its Meta Content Framework (MCF) origins. You can express and serialize data based on the RDF Data Model using Turtle, N3, TriX, N-Triples, and RDF/XML.

Issue - SPARQL (and Freebase's MQL)

These are just appeasement:
- old query paradigm: fishing in dark water with superstitiously tied lures; only works well in carefully stocked lakes
- we don't ask questions by defining answer shapes and then hoping they're dredged up whole.

Response:

SPARQL, MQL, and Entity-SQL are Graph Model oriented Query Languages. Query Languages always accompany Database Engines. SQL is the Relational Model equivalent.
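For readers who have not seen a Graph Model query evaluated, here is a deliberately naive Python sketch of matching a conjunction of triple patterns against a handful of triples. The data and function names are made up, and real engines (Virtuoso included) use indexes and query optimization rather than nested loops; only the matching idea carries over.

```python
def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; None on failure."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):          # a variable: bind it or check it
            if new.setdefault(p, t) != t:
                return None
        elif p != t:                   # a constant: must be equal
            return None
    return new

def query(patterns, triples):
    """Evaluate a conjunction of triple patterns by nested-loop matching."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for b in bindings:
            for t in triples:
                b2 = match(pattern, t, b)
                if b2 is not None:
                    extended.append(b2)
        bindings = extended
    return bindings

# Hypothetical data, for illustration only.
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "likes", "jazz"),
]
# "Whom do the people Alice knows know?"
result = query([("alice", "knows", "?x"), ("?x", "knows", "?y")], triples)
```

The shared variable `?x` is what joins the two patterns, in the same way a shared column joins two tables in SQL.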

Issue - Linked Data

Noble attempt to ground the abstract, but:
- URI dereferencing/namespace/open-world issues focus too much technical attention on cross-source cases where the human issues dwarf the technical ones anyway
- FOAF query over the people in this room? forget it.
- link asymmetry doesn't scale
- identity doesn't scale
- generating RDF from non-graph sources: more appeasement, right where the win from actually converting could be biggest!

Response:

Innovative use of HTTP to deliver "Data Access by Reference" to the Linked Data Web.

When you have a Data Model, Database Engine, and Query Language, the next thing you need is a Data Access mechanism that provides "Data Access by Reference". ODBC and JDBC (amongst others) provide "Data Access by Reference" via Data Source Names. Linked Data is about the same thing (URIs are Data Source Names) with the following differences:

  • Naming is scoped to the entity level rather than container level
  • HTTP's use within the data source naming scheme expands the referencability of the Named Entity Descriptions beyond traditional confines such as applications, operating systems, and database engines.
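A small Python sketch of what "Data Access by Reference" over HTTP amounts to on the client side: the entity's URI is the name, and an `Accept` header states which description format is wanted. The URI below is a placeholder under `example.org`, the helper name is made up, and the request is only constructed, not sent.

```python
import urllib.request

def linked_data_request(entity_uri, media_type="text/turtle"):
    """Build (but do not send) an HTTP GET asking for RDF about one entity."""
    return urllib.request.Request(
        entity_uri,
        headers={"Accept": media_type},  # content negotiation picks the format
        method="GET",
    )

# Placeholder entity URI, for illustration only.
req = linked_data_request("http://example.org/id/customer/ALFKI")
```

Passing `req` to `urllib.request.urlopen` would perform the lookup against a real Linked Data server; nothing beyond an HTTP client is needed, which is the point of putting HTTP into the data source naming scheme.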

Issue - Giant Global Graph

Hugely motivating and powerful idea, worthy of a superhero (Graphius!), but:
- giant and global parts are too hard, and starting global makes every problem harder
- local projects become unmanageable in global context (Cyc, Freebase data-modeling lists...).

And thus my plea, again. Forget "semantic" and "web"; let's fix the database tech first:
- node/arc data-model, path-based exploratory query-model
- data-graph applications built easily on top of this common model; building them has to be easy, because if it's hard, they'll be bad
- given good database tech, good web data-publishing tech will be trivial!
- given good tools for graphs, the problems of uniting them will be only as hard as they have to be.

Response:

Giant Global Graph is just another moniker for a "Web of Linked Data" or "Linked Data Web".

Multi-Model Database technology that meshes the best of the Graph & Relational Models exists. In a nutshell, this is what Virtuoso is all about, and it has existed for a very long time :-)

Virtuoso is also a Virtual DBMS engine (so you can see Heterogeneous Relational Data via Graph Model Context Lenses). Naturally, it is also a Linked Data Deployment platform (or Linked Data Server).

The issue isn't the "Semantic Web" moniker per se; it's about how Linked Data (the foundation layer of the Semantic Web) gets introduced to users. As I said during the MIT Gathering: "The Web is experienced via Web Browsers primarily, so any enhancement to the Web must be exposed via traditional Web Browsers", which is why we've opted to simply add "View Linked Data Sources" to the existing set of common Browser options that includes:

  1. View page in rendered form (default)
  2. View page source (i.e., how you see the markup behind the page)

By exposing the Linked Data Web option as described above, you enable the Web user to knowingly transition from the traditional Rendered (X)HTML page view to the Linked Data View (i.e., structured data behind the page). This simple "User Interaction" tweak makes the notion of exploiting a Structured Web somewhat clearer.

The Linked Data Web isn't a panacea. It's just an addition to the existing Web that enriches the things you can do with the Web. Its predominance, like any application feature, will be subject to the degrees to which it delivers tangible value or materializes internal and external opportunity costs.

Note: The Web isn't ubiquitous today because all its users grokked HTML Markup. Its ubiquity is a function of opportunity costs: there simply came a point in the Web bootstrap when nobody could afford the opportunity costs associated with being off the Web. The same thing will play out with Linked Data and the broader Semantic Web vision.

Links:
  1. Linked Data Journey part of my Linked Data Planet Presentation Remix (from slides 15 to 22 - which include bits from TimBL's presentation)
  2. OpenLink Data Explorer
  3. OpenLink Data Explorer Screenshots and examples.
# PermaLink Comments [0]
08/15/2008 13:06 GMT Modified: 08/15/2008 18:31 GMT
Virtuoso's Universal Server Architecture (Conceptual & Technical) [ Kingsley Uyi Idehen ]
As they say, a picture speaks a thousand words, so I am exposing two views of Virtuoso that have been on the Web for a while.

Remember, Virtuoso offers data management, data access, web application server, enterprise service bus, and virtualization of disparate and heterogeneous data sources, as part of a single, multi-threaded, cross-platform server solution; hence its description as a "Universal Server".

Conceptual View:

Image

Technical View (kinda missing PHP, Perl, Python runtime hosting in the Virtual Application Server realm):

Image


Virtuoso's architecture is not a reaction to current trends. The diagrams above are pretty old (with minor touch-ups in recent times). At OpenLink Software, we've had a consistent world-view re. standards and the vital role they play when it comes to developing software that enables the construction and exploitation of "Context Lenses" that tap into a substrate of Virtualized Logical Data Sources (SQL, XML, RDF, Web Services, Full Text etc.).




# PermaLink Comments [0]
08/03/2008 13:07 GMT Modified: 08/05/2008 18:07 GMT
Missing Bits from semanticweb.com Interview [ Kingsley Uyi Idehen ]

Yikes! I've just discovered that the final part of semanticweb.com's interview with Jim Hendler and me includes critical paragraphs that omit my example links :-( As you can imagine, this is quite excruciating, bearing in mind that "Literals" are of marginal value in a Linked Data world.

Anyway, thanks to the Blogosphere, I can attempt to fix this problem myself -- via this post :-)

Q. If you wanted to provide a bewildered but still curious novice a public example of Linked Data at work in their everyday life, what would it be?

Kingsley Idehen: Any one of the following:

  • My Linking Open Data community Profile Page - the Linked Data integration is exposed via the "Explore Data" Tab
  • My Linked Data Space - viewed via OpenLink's AJAR (Asynchronous JavaScript and RDF) based Linked Data Browser
  • My Events Calendar Tag Cloud - a Linked Data view of my Calendar Space using an RDF-aware browser

In all cases, you have the ability to explore my data spaces by simply clicking on the links, which on the surface appear to be standard hypertext links, although in reality you are dealing with hyperdata links (i.e., links to entities that result in the generation of entity description pages that expose entity properties via hyperdata links). Thus, you have a single page that describes me in a very rich way since it encompasses all data associated with me, covering: personal profile, blog posts, bookmarks, tag clouds, social networks etc.

Q. What would you show the CEO or CTO of a company outside the tech industry?

Kingsley Idehen: A link to the Entity ALFKI, from the popular Northwind Database associated with Microsoft Access and SQL Server database installations. This particular link exposes a typical enterprise data space (orders, customers, employees, suppliers ...) in a single page. The hyperdata links represent intricate data relationships common to most business systems that will ultimately seek to repurpose existing legacy data sources and SOA services as Linked Data. Alternatively, I would show the same links via the Zitgist Data Viewer (another Linked Data-aware browser). In both cases, I am exploiting direct access to entities via HTTP due to the protocol's incorporation into the Data Source Naming scheme.

06/13/2008 02:02 GMT Modified: 06/13/2008 09:01 GMT
Linked Data Illustrated and a Virtuoso Functionality Reminder [ Kingsley Uyi Idehen ]
Daniel Lewis has put together a nice collection of Linked Data related posts that illustrate the fundamentals of the Linked Data Web and the vital role that Virtuoso plays as a deployment platform. Remember, Virtuoso was architected in 1998 (see Virtuoso History) in anticipation of the eventual Internet, Intranet, and Extranet level requirements for a different kind of Server. At the time of Virtuoso's inception, many thought our desire to build a multi-protocol, multi-model, and multi-purpose, virtual and native data server was sheer craziness, but we pressed on (courtesy of our vision and technical capabilities). Today, we have a very sophisticated Universal Server Platform (in Open Source and Commercial forms) that is naturally equipped to do the following via very simple interfaces:
    - Provide highly scalable RDF Data Management via a Quad Store (DBpedia is an example of a live demonstration)
    - Deliver powerful WebDAV innovations that simplify read-write mode interaction with Linked Data
    - More...
04/28/2008 17:32 GMT Modified: 04/28/2008 14:47 GMT
Linked Data enabling PHP Applications [ Kingsley Uyi Idehen ]

Daniel Lewis has penned a variation of my post about Linked Data enabling PHP applications such as WordPress, phpBB3, MediaWiki, etc.

Daniel simplifies my post by using diagrams to depict the different paths for PHP based applications exposing Linked Data - especially those that already provide a significant amount of the content that drives Web 2.0.

If all the content in Web 2.0 information resources is distillable into discrete data objects endowed with HTTP based IDs (URIs), with zero "RDF handcrafting Tax", what do we end up with? A Giant Global Graph of Linked Data; the Web as a Database.

So, what used to apply exclusively within enterprise settings re. Oracle, DB2, Informix, Ingres, Sybase, Microsoft SQL Server, MySQL, PostgreSQL, Progress OpenEdge, Firebird, and others, now applies to the Web. The Web becomes the "Distributed Database Bus" that connects database records across disparate databases (or Data Spaces). These databases manage and expose records that are remotely accessible "by reference" via HTTP.
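
To make access "by reference" via HTTP a little more concrete, here is a minimal Python sketch of how a client might prepare a request for a record identified by a URI. The URI is hypothetical (an example.org placeholder borrowing the ALFKI customer ID), the helper name is mine, and the RDF/XML media type is just one plausible choice; the request is only constructed, never sent.

```python
from urllib.request import Request


def build_record_request(record_uri: str,
                         media_type: str = "application/rdf+xml") -> Request:
    """Prepare (without sending) an HTTP GET for a record identified by a URI.

    The Accept header asks the server for a structured description of the
    record rather than an HTML page.
    """
    return Request(record_uri, headers={"Accept": media_type}, method="GET")


# Hypothetical record URI, used for illustration only.
RECORD_URI = "http://example.org/data/customer/ALFKI"

record_request = build_record_request(RECORD_URI)
print(record_request.get_method(), record_request.full_url)
print("Accept:", record_request.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the actual lookup; whether a given server honors the Accept header depends on how that Data Space is deployed.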

As I've stated at every opportunity in the past, Web 2.0 is the greatest thing that ever happened to the Semantic Web vision :-) Without the "Web 2.0 Data Silo Conundrum" we wouldn't have the cry for "Data Portability" that brings a lot of clarity to some fundamental Web 2.0 limitations that end-users ultimately find unacceptable.

In the late '80s, the SQL Access Group (now part of X/Open) addressed a similar problem with RDBMS silos within the enterprise, which led to the SAG CLI that exists today as Open Database Connectivity (ODBC).

In a sense we now have WODBC (Web Open Database Connectivity), comprised of Web Services based CLIs and/or traditional back-end DBMS CLIs (ODBC, JDBC, ADO.NET, OLE-DB, or Native), Query Language (SPARQL Query Language), and a Wire Protocol (HTTP based SPARQL Protocol) delivering Web infrastructure equivalents of SQL and RDA, but much better, and with much broader scope for delivering profound value due to the Web's inherent openness. Today's PHP, Python, Ruby, Tcl, Perl, ASP.NET developer is the enterprise 4GL developer of yore, without enterprise confinement. We could even be talking about 5GL development once the Linked Data interaction is meshed with dynamic languages (delivering higher levels of abstraction at the language and data interaction levels). Even the underlying schemas and basic design will evolve from Closed World (solely) to a mesh of Closed & Open World view schemas.
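
As a rough illustration of the query language plus wire protocol pairing described above, the following Python sketch builds a SPARQL Protocol request: the query text travels as the `query` parameter of an ordinary HTTP GET. The endpoint URL and the entity URI are hypothetical placeholders, the helper name is mine, and the request is constructed but not sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Prepare (without sending) a SPARQL Protocol query as an HTTP GET."""
    params = urlencode({"query": query})  # URL-encode the SPARQL text
    return Request(
        f"{endpoint}?{params}",
        # Ask for SELECT results in the SPARQL JSON results format.
        headers={"Accept": "application/sparql-results+json"},
        method="GET",
    )


# Hypothetical endpoint and entity URI, used for illustration only.
ENDPOINT = "http://example.org/sparql"
QUERY = (
    "SELECT ?property ?value "
    "WHERE { <http://example.org/data/customer/ALFKI> ?property ?value } "
    "LIMIT 10"
)

sparql_request = build_sparql_request(ENDPOINT, QUERY)
print(sparql_request.full_url)
```

The point of the sketch is that no database-specific driver appears anywhere: the same HTTP client that fetches Web pages can carry the query, which is what makes the comparison to ODBC-style connectivity apt.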

04/10/2008 18:09 GMT Modified: 04/10/2008 14:12 GMT
Powered by OpenLink Virtuoso Universal Server
Running on Linux platform