We have reached a beachhead re. the Virtuoso instance
hosting the Linked Open Data (LOD) Cloud; meaning, we are not
going to be performing any major updates and deletions short-term,
bar incorporation of fresh data sets from the Freebase and Bio2RDF projects
(both communities are prepping new RDF data sets).
At the current time we have loaded all of the very large
data sets from the LOD Cloud. As a result, we can start the
process of exposing Linked Data virtues in a manner that's
palatable to users, developers, and database professionals across
the Web 1.0, 2.0, and 3.0 spectrums.
What does this mean?
You can use the "Search & Find", "URI Lookup", or SPARQL endpoint associated with the LOD cloud
hosting instance to perform the following tasks:
- Find entities associated with full text search patterns -- Google Style,
but with Entity & Text proximity Rank instead of
Page Rank, since we are dealing with Entities rather than documents
about entities
- Find and Lookup entities by Identifier (URI) -- which is
helpful when locating URIs to use for identifying entities in your own
linked data spaces on the Web
- View entity descriptions via a variety of representation
formats (HTML, RDFa, RDF/XML, N3, Turtle etc.)
- Determine uses of entity identifiers across the LOD cloud --
which helps you select preferred URIs based on usage
statistics.
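
As a rough illustration of the tasks above via the SPARQL endpoint, here is a minimal Python sketch. It only builds the requests and does not send them. The endpoint URL is a placeholder, and the function names are my own for illustration; `bif:contains` is Virtuoso's full text predicate, so the first query will not work unchanged against other stores. Treat this as a sketch under those assumptions, not as the service's documented interface.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: placeholder endpoint, not taken from this post.
# Substitute the SPARQL URL of the hosting instance you are using.
SPARQL_ENDPOINT = "http://example.org/sparql"


def text_search_query(pattern, limit=10):
    """Build a SPARQL query that finds entities by full text pattern."""
    if any(ch in pattern for ch in "'\"\\"):
        raise ValueError("quotes and backslashes are not supported in this sketch")
    # bif:contains is Virtuoso-specific; the inner double quotes
    # make the pattern a phrase match.
    return (
        "SELECT DISTINCT ?entity ?text WHERE { "
        "?entity ?property ?text . "
        f"?text bif:contains '\"{pattern}\"' . "
        f"}} LIMIT {int(limit)}"
    )


def usage_count_query(entity_uri):
    """Build a SPARQL query counting triples that use a URI as subject or object."""
    if any(ch in entity_uri for ch in '<>"{}|\\^` '):
        raise ValueError("not a usable URI for this sketch")
    return (
        "SELECT (COUNT(*) AS ?uses) WHERE { "
        f"{{ <{entity_uri}> ?p ?o }} UNION {{ ?s ?p <{entity_uri}> }} "
        "}"
    )


def build_request(query, accept="application/sparql-results+json"):
    """Wrap a query in an HTTP GET request, per the SPARQL protocol."""
    url = SPARQL_ENDPOINT + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": accept})


def describe_request(entity_uri, media_type="text/turtle"):
    """Ask for an entity description in a given format via content negotiation."""
    return Request(entity_uri, headers={"Accept": media_type})


if __name__ == "__main__":
    # Print the request URL; pass the Request to urllib.request.urlopen to run it.
    print(build_request(text_search_query("linked data", limit=5)).full_url)
```

Swapping the `media_type` argument (for example to `application/rdf+xml`) is how you would ask for one of the other representation formats listed above, assuming the server supports content negotiation for that format.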
What does it offer Web 1.0 and 2.0 developers?
If you don't want to use the SPARQL
based Web Service, or other Linked Data Web oriented APIs for interacting with the
LOD cloud programmatically, you can simply use the powerful
REST style Web Service that provides
URL parameters for performing full text
oriented "Search", entity oriented "Find" queries, and faceted
navigation over the huge data corpus with results data returned in
JSON and XML formats.
Next Steps:
Amazon have agreed to add all the LOD Cloud data sets to their
existing public data sets collective. Thus, the data
sets we are loading will be available in "raw data" (RDF) format on
the public data sets page via named Elastic Block Store (EBS)
snapshots; meaning, you can make an EC2 AMI (e.g. Linux,
Windows, or Solaris) and install an RDF quad or triple store of choice
into your AMI, then simply load data from the LOD cloud based on
your needs.
In addition to the above, we are also going to offer a Virtuoso 6.0 Cluster Edition based LOD Cloud
AMI (as we've already done with DBpedia, MusicBrainz, NeuroCommons, and Bio2RDF) that will enable you to simply
instantiate a personal and service specific edition of Virtuoso
with all the LOD data in place and fully tuned for performance and
scalability; basically, you will simply press "Instantiate AMI" and
a LOD cloud data space, in true Linked Data form, will be
at your disposal within minutes (i.e. the time it takes the DB to
start).
Work on the migration of the LOD data to EC2 starts this week.
Thus, if you are interested in contributing an RDF based data set
to the LOD cloud, now is the time to get your archive links in place
on the ESW Wiki page for LOD Data Sets.