Kingsley Uyi Idehen
Lexington, United States
Linked Data Rules Simplified
As a complement to the most recent Linked Data Design Issues note by TimBL, I would like to add this subtle tweak to the enumerated rules:
-
Identify or Name things using HTTP URIs
-
Describe things using the RDF metadata model
-
Increase Linked Data mesh density on the Web by linking (referring) to things in other data spaces using their HTTP URIs.
If you perform the steps above, on any HTTP network (e.g. World Wide Web), you implicitly bind the Names/Identifiers of things to negotiable representations of their metadata (description) bearing documents.
Also note, you can create and deploy the resulting RDF metadata using any of the following approaches:
-
RDFa within (X)HTML documents
-
N3, Turtle, TriX, RDF/XML etc. based documents
- Programmatically generated variants of 1&2.
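The third rule can be made concrete with a minimal sketch that counts how many statements point into a different data space. The `example.org` URIs and the choice of FOAF properties are illustrative assumptions, and "data space" is approximated here by the host part of a URI.

```python
from urllib.parse import urlparse


def external_links(triples):
    """Return (subject, object) pairs whose object is an HTTP URI on another host."""
    links = []
    for subject, _predicate, obj in triples:
        subject_host = urlparse(subject).netloc
        target = urlparse(obj)
        # Literals such as "Alice" have no scheme or host and are skipped.
        if target.scheme in ("http", "https") and target.netloc and target.netloc != subject_host:
            links.append((subject, obj))
    return links


# Hypothetical description of one thing named with an HTTP URI.
ALICE = "http://example.org/people/alice#this"
triples = [
    (ALICE, "http://xmlns.com/foaf/0.1/name", "Alice"),
    (ALICE, "http://xmlns.com/foaf/0.1/topic_interest", "http://dbpedia.org/resource/Linked_Data"),
    (ALICE, "http://xmlns.com/foaf/0.1/knows", "http://example.org/people/bob#this"),
]

for subject, target in external_links(triples):
    print(subject, "->", target)
```

Only the DBpedia reference counts as a cross-space link; the more of these a description carries, the denser the mesh.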
|
06/26/2009 10:49 GMT-0500
|
Modified:
06/26/2009 23:18 GMT-0500
|
BBC Linked Data Meshup In 3 Steps
Situation Analysis:
Dr. Dre is one of the artists in the Linked Data Space we host for the BBC. He is also referenced in music oriented data spaces such as DBpedia, MusicBrainz and Last.FM (to name a few).
Challenge:
How do I obtain a holistic view of the entity "Dr. Dre" across the BBC, MusicBrainz, and Last.FM data spaces? We know the BBC published Linked Data, but what about Last.FM and MusicBrainz? Both of these data spaces only expose XML or JSON data via REST APIs.
Solution:
Simple 3 step Linked Data Meshup courtesy of Virtuoso's in-built RDFizer Middleware "the Sponger" (think ODBC Driver Manager for the Linked Data Web) and its numerous Cartridges (think ODBC Drivers for the Linked Data Web).
Steps:
-
Go to Last.FM and search using pattern: Dr. Dre (you will end up with this URL: http://www.last.fm/music/Dr.+Dre)
-
Go to the Virtuoso powered BBC Linked Data Space home page and enter: http://bbc.openlinksw.com/about/html/http://www.last.fm/music/Dr.+Dre
-
Go to the BBC Linked Data Space home page and type full text pattern (using default tab): Dr. Dre, then view Dr. Dre's metadata via the Statistics Link.
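The URL entered in the second step follows a simple pattern: the Sponger proxy prefix followed by the source page URL. A minimal sketch of that composition, assuming the prefix shown in the step is all that is needed (the function name is hypothetical, and the service may not be reachable today):

```python
# Prefix taken from the step above; everything after it is the page to be "sponged".
SPONGER_PROXY = "http://bbc.openlinksw.com/about/html/"


def sponger_url(source_url: str) -> str:
    """Build the proxy URL that asks the Sponger to RDFize source_url."""
    return SPONGER_PROXY + source_url


print(sponger_url("http://www.last.fm/music/Dr.+Dre"))
```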
What Happened?
The following took place:
-
Virtuoso Sponger sent an HTTP GET to Last.FM
-
Distilled the "Artist" entity "Dr. Dre" from the page, and made a Linked Data graph
-
Inverse Functional Property and sameAs reasoning handled the Meshup (augmented graph from a conjunctive query processing pipeline)
- Links for "Dr. Dre" across BBC (sameAs), Last.FM (seeAlso), via DBpedia URI.
The new enhanced URI for Dr. Dre now provides a rich holistic view of the aforementioned "Artist" entity. This URI is usable anywhere on the Web for Linked Data Conduction :-)
Related (as in NearBy)
|
06/12/2009 14:09 GMT-0500
|
Modified:
06/12/2009 16:38 GMT-0500
|
Understanding the BBC's Virtuoso Powered Linked Data Space
The BBC's recently announced Linked Data space for Programmes and Music data joins a growing list of immediately useful "Virtuoso Powered" linked data spaces, driving the burgeoning Web of Linked Data. Others include: DBpedia, Bio2RDF, NeuroCommons etc. (the click friendly version of the LOD-Cloud diagram reveals a snapshot of other Virtuoso driven linked data spaces).
Why is it important?
As a leading media organization, the BBC's use of Linked Data provides a clear beacon to other media players re. the imminence of a serious Linked Data induced sector inflection. In a nutshell, every Web Site has to evolve into a Linked Data Space: a location on the Web that provides granular access to discrete data items in line with the core principles of the Linked Data meme.
Remember, the essence of the Linked Data meme is simply this: you reference data items and access their metadata, in a variety of formats, via a single HTTP based URI. This approach to Web data publishing is compatible with any HTTP aware user agent (e.g., your Web Browser or tools & applications that provide abstracted access to HTTP).
How Do I use it?
There are a number of very powerful things available to end-users and developers alike.
End-Users:
The most powerful feature of our variant of the BBC's Linked Data Space is the exposure of Faceted Find (think Search++ and beyond). Thus, you can go to the home page of the service and commence data discovery and exploration via any of the following interfaces:
-
Full Text Search Tab -- type in a full text pattern and then experience Linked Data Entity Ranking as opposed to Page Ranking
- URI Lookup (By Label) Tab -- type in part of a URI and let the system auto-complete by looking up Entity Labels
- URI Lookup (Raw String Pattern) Tab -- type in part of a URI and let the system auto-complete by looking up the raw URI
-
OpenLink Data Explorer Service -- "deceptively simple" Linked Data explorer and Data Mesher (simply type in a URI or Text pattern, then view the data via a myriad of entity type specific viewer tabs).
Once you are comfortable with at least one of the items above, you can exploit the system further by performing any of the following:
Information Architects & Developers
Disambiguated Search (aka. Search++ or Find)
In line with the time-tested "embrace and extend" pattern, we provide Full Text search capability, but unlike Google, Yahoo!, Bing and other search engines, we don't use a "Page Rank" algorithm to sort results; instead, we use an "Entity Rank" algorithm since we are dealing with an RDF based Graph model DBMS where links exist between entities across instance data and data dictionary (vocabularies, schemas, ontologies) boundaries. In addition, when you get results (by clicking "show values" or "show values with distinct counts") that list entities associated with a full text search pattern, we take a quantum leap beyond search engines by allowing you to use "Entity Type" and/or "Entity Properties" (all of these have HTTP URIs too) to set your own context for what you seek.
Much more to come in the form of BBC specific demo queries and tutorials :-)
Related
-
Live LOD Cloud Cache instance that combines BBC data with other data sets from the LOD Cloud (in a single Virtuoso RDF DBMS hosting 5 Billion+ triples & counting)
|
06/11/2009 17:59 GMT-0500
|
Modified:
06/26/2009 23:15 GMT-0500
|
Library of Congress & Reasonable Linked Data
While exploring the Subject Headings Linked Data Space (LCSH) recently unveiled by the Library of Congress, I noticed that the URI for the subject heading: World Wide Web, exposes an "owl:sameAs" link to resource URI: "info:lc/authorities/sh95000541" -- in fact, a URI.URN that isn't HTTP protocol scheme based.
The observations above triggered a discussion thread on Twitter that involved: @edsu, @iand, and moi. Naturally, it morphed into a live demonstration of: human vs machine, interpretation of claims expressed in the RDF graph.
What makes this whole thing interesting?
It showcases (in Man vs Machine style) the issue of unambiguously discerning the meaning of the owl:sameAs claim expressed in the LCSH Linked Data Space.
Perspectives & Potential Confusion
From the Linked Data perspective, it may spook a few people to see owl:sameAs values such as: "info:lc/authorities/sh95000541", that cannot be de-referenced using HTTP.
It may confuse a few people or user agents that see URI de-referencing as not necessarily HTTP specific, thereby attempting to de-reference the URI.URN on the assumption that it's associated with a "handle system", for instance.
It may even confuse RDFizer / RDFization middleware that use owl:sameAs as a data provider attribution mechanism via hint/nudge URI values derived from original content / data URI.URLs that de-reference to nothing e.g., an original resource URI.URL plus "#this" which produces URI.URN-URL -- think of this pattern as "owl:shameAs" in a sense :-)
Unambiguously Discerning Meaning
Simply bring OWL reasoning (inference rules and reasoners) into the mix, thereby negating human dialogue about interpretation which ultimately unveils a mesh of orthogonal view points. Remember, OWL is all about infrastructure that ultimately enables you to express yourself clearly i.e., say what you mean, and mean what you say.
Path to Clarity (using Virtuoso, its in-built Sponger Middleware, and Inference Engine):
- GET the data into the Virtuoso Quad store -- what the sponger does via its URIBurner Service (while following designated predicates such as owl:sameAs in case they point to other mesh-able data sources)
- Query the data in Quad Store with "owl:sameAs" inference rules enabled
- Repeat the last step with the inference rules excluded.
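The last two steps can be sketched as a pair of query strings that differ only in Virtuoso's `DEFINE input:same-as "yes"` pragma. This is an illustrative sketch rather than the queries originally run: the `SELECT` shape and the helper name are assumptions, and nothing here contacts an endpoint.

```python
# Virtuoso-specific pragma that switches on owl:sameAs inference for a query.
SAME_AS_PRAGMA = 'DEFINE input:same-as "yes"'

# The non-HTTP identifier exposed by the LCSH owl:sameAs link.
LCSH_INFO_URI = "info:lc/authorities/sh95000541"


def describe_query(uri: str, same_as: bool = False) -> str:
    """Build a SPARQL query listing the properties and values of `uri`."""
    body = f"SELECT ?p ?o WHERE {{ <{uri}> ?p ?o }}"
    return f"{SAME_AS_PRAGMA}\n{body}" if same_as else body


print(describe_query(LCSH_INFO_URI, same_as=True))
print()
print(describe_query(LCSH_INFO_URI, same_as=False))
```

Running both against the same Quad Store and comparing the result sets is what shows the effect of the inference rule.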
Actual SPARQL Queries:
Observations:
The SPARQL queries against the Graph generated and automatically populated by the Sponger reveal -- without human intervention -- that: "info:lc/authorities/sh95000541", is just an alternative name for <http://id.loc.gov/authorities/sh95000541#concept>, and that the graph produced by LCSH is self-describing enough for an OWL reasoner to figure this all out courtesy of the owl:sameAs property :-).
Hopefully, this post also provides a simple example of how OWL facilitates "Reasonable Linked Data".
|
05/05/2009 13:53 GMT-0500
|
Modified:
05/06/2009 14:26 GMT-0500
|
Linked Data & Identity
A person, organization, place, idea, subject matter topic/heading, and other real world things possess "identity" -- that is, a constellation of characteristics that distinguish them from any other identity. Associated with this abstraction can be a label used as a reference, or "identifier". This is the distinction between a thing and the name of the thing.
Section from IETF's Domain Keys spec. (paraphrased by me).
The Linked Data meme is based on the use of HTTP based URIs as reference / identifier labels associated with the "identity abstraction" referred to above. Thus, when you de-reference (request information about) an HTTP based URI you ultimately end up with a resource URL that exposes the "constellation of characteristics" mentioned above, in a representation negotiated at request time -- between an HTTP client and server e.g., (X)HTML, JSON, XML, RDF/XML, N3, Turtle, TriX, others :-)
|
04/29/2009 16:05 GMT-0500
|
Modified:
05/01/2009 12:25 GMT-0500
|
What is the Linked Data Meme about?
The act of using URIs to "refer to" (reference) Web addressable data objects. It's also the act of using the same URI to de-reference the description of a referenced data object; in this case, the representation of the description is negotiated by a Web client and/or Web server. Thus, you can access the description of a data object via data representation formats such as: JSON, XML, (X)HTML, RDF/XML, N3, Turtle, TriX etc.
Note: In proper Web parlance, a data object is referred to as a resource.
Simple example (using DBpedia)
In the Linked Data realm, if you want to make a reference to the Linked Data meme in a blog post, you are better off using the resource URI: http://dbpedia.org/resource/Linked_Data, instead of the Web page URL: http://dbpedia.org/page/Linked_Data, which is the address of a physical document (an information conveying artifact) that at best visually presents the negotiated representation of a resource description.
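A minimal sketch of what "one URI, many representations" looks like from the client side: the same resource URI is requested twice, and only the `Accept` header changes. The helper name is hypothetical, and the requests are built but not sent so the sketch runs offline; passing one to `urllib.request.urlopen` would perform the actual fetch.

```python
import urllib.request

RESOURCE_URI = "http://dbpedia.org/resource/Linked_Data"


def build_request(uri: str, media_type: str) -> urllib.request.Request:
    """Prepare a GET for `uri` that asks for one specific representation."""
    return urllib.request.Request(uri, headers={"Accept": media_type})


turtle_request = build_request(RESOURCE_URI, "text/turtle")
html_request = build_request(RESOURCE_URI, "text/html")

# Same identifier in both cases; only the requested representation differs.
for request in (turtle_request, html_request):
    print(request.get_method(), request.full_url, request.get_header("Accept"))
```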
Why is this valuable?
In the simplest sense, you only have one focal point for referencing (referring to) and de-referencing (retrieving data about) a given Web resource. It protects you from the impact of Web document location changes (amongst many other things).
Remember, a single URI is a conduit into a realm where the identity, access, representation, presentation, and storage of a resource (data object) are completely distinct. It's the mechanism for conducting data across network, machine, operating system, dbms engine, application, and service (API) boundaries. Thus, without "linked data meme" prescribed URI referencing and de-referencing, we are simply back to "business as usual" re. the industry at large, where networks, operating systems, dbms engines, applications, and services (APIs) become the basis for "data lock-in" and silo construction.
Going forward
Take a second to think about the profound virtues of the ubiquitous Web of Linked Document URLs that we have today, and then apply that thinking to the burgeoning Web of Linked Data URIs, which has just turned the corner and is heading in everyone's direction at full blast.
Note to "Social Media" players: Who you know isn't the canonical object of sociality. What you are i.e., your description and the data objects it exposes, are real objects of your sociality :-)
|
04/29/2009 11:32 GMT-0500
|
Modified:
04/29/2009 16:31 GMT-0500
|
Simple Explanation of RDF and Linked Data Dynamics
What is RDF?
The acronym stands for: Resource Description Framework. And that's just what it is.
RDF is comprised of a Data Model (EAV/CR Graph) and Data Representation Formats such as: N3, Turtle, RDF/XML etc.
RDF's essence is about: "Entities" and "Attributes" being URI based, while "Values" may be URI or Literals (typed or untyped) based.
URIs are Entity Identifiers.
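A minimal sketch of the Entity-Attribute-Value shape described above, assuming hypothetical class names and `example.org` identifiers: entities and attributes are always URIs, while a value is either a URI or a literal (optionally typed). The serializer emits one N-Triples style line and does not escape special characters in literals.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Literal:
    text: str
    datatype: Optional[str] = None  # datatype URI for typed literals, None if untyped


@dataclass(frozen=True)
class Statement:
    entity: str                 # URI identifying the thing described
    attribute: str              # URI identifying the property
    value: Union[str, Literal]  # a URI (str) or a Literal


def to_ntriples(statement: Statement) -> str:
    """Render one statement as an N-Triples style line (no literal escaping)."""
    if isinstance(statement.value, Literal):
        obj = f'"{statement.value.text}"'
        if statement.value.datatype:
            obj += f"^^<{statement.value.datatype}>"
    else:
        obj = f"<{statement.value}>"
    return f"<{statement.entity}> <{statement.attribute}> {obj} ."


ENTITY = "http://example.org/id/thing#this"  # hypothetical entity URI
statements = [
    Statement(ENTITY, "http://www.w3.org/2000/01/rdf-schema#label", Literal("A thing")),
    Statement(ENTITY, "http://example.org/schema/size",
              Literal("42", "http://www.w3.org/2001/XMLSchema#integer")),
    Statement(ENTITY, "http://www.w3.org/2000/01/rdf-schema#seeAlso",
              "http://dbpedia.org/resource/Linked_Data"),
]

for statement in statements:
    print(to_ntriples(statement))
```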
What is Linked Data?
Short for "Web of Linked Data" or "Linked Data Web".
A term coined by TimBL that describes an HTTP based "data access by reference pattern" that uses a single pointer or handle for "referring to" and "obtaining actual data about" an entity.
Linked Data uses the deceptively simple messaging scheme of HTTP to deliver a granular entity reference and access mechanism that transcends traditional computing boundaries such as: operating system, application, database engines, and networks.
How are Linked Data & RDF Related?
Linked Data simply mandates the following re. RDF:
- URIs should be HTTP based so that you can "refer to" (Reference) an Entity, its Attributes, or URI based Attribute values via the Web (in fact any HTTP based network e.g., Intranets and Extranets)
-
URIs should also be HTTP based so that you can use them to de-reference resource descriptions via the Web (or Intranets and Extranets).
Note: by Entity I am also referring to: a resource (Web parlance), data item, data object, real-world object, or datum.
Linked Data is also about using URIs and HTTP's content negotiation feature to separate: presentation, representation, access, and identity of data items. Even better, content negotiation can be driven by user agent and/or data server based quality of service algorithms (representation preference order schemes).
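A minimal sketch of a "representation preference order scheme", assuming the user agent expresses its preferences as q-values in an HTTP `Accept` header and the server picks the best format it can produce. The function names are hypothetical, and wildcards such as `*/*` are deliberately not handled.

```python
def parse_accept(header: str):
    """Parse an Accept header into (media_type, q) pairs, most preferred first."""
    preferences = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type, quality = fields[0], 1.0  # q defaults to 1.0 when omitted
        for parameter in fields[1:]:
            name, _, value = parameter.partition("=")
            if name.strip() == "q":
                quality = float(value)
        if media_type:
            preferences.append((media_type, quality))
    # sorted() is stable, so equal q-values keep their order in the header.
    return sorted(preferences, key=lambda pair: pair[1], reverse=True)


def negotiate(header: str, available):
    """Return the client's most preferred media type that the server offers, or None."""
    for media_type, quality in parse_accept(header):
        if quality > 0 and media_type in available:
            return media_type
    return None


accept = "text/html;q=0.5, text/turtle, application/rdf+xml;q=0.9"
print(negotiate(accept, {"application/rdf+xml", "text/html"}))
print(negotiate(accept, {"text/turtle", "text/html"}))
```

The identity (the URI) stays fixed throughout; only the representation handed back changes with the preference order.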
To conclude, Linked Data is ultimately about the realization that: Data is the new Electricity, and its conductors are URIs :-)
Tip to governments of the world: we are in exponential times, the current downturn is but one side of the "exponential times ledger", the other side of the "exponential times ledger" is simply about unleashing "raw data" -- in structured form -- into the Web, so that "citizen analysts" can blossom and ultimately deliver the transparency desperately sought at every level of the economic value chain. Think: "raw data ready" whenever you ponder about "shovel ready" infrastructure projects!
|
04/24/2009 16:59 GMT-0500
|
Modified:
04/24/2009 17:14 GMT-0500
|
Take N: Yet Another OpenLink Data Spaces Introduction
Problem:
Your Life, Profession, Web, and Internet do not need to become mutually exclusive due to "information overload".
Solution:
A platform or service that delivers a point of online presence that embodies the fundamental separation of: Identity, Data Access, Data Representation, Data Presentation, by adhering to Web and Internet protocols.
How:
Typical post installation (Local or Cloud) task sequence:
-
Identify myself (happens automatically by way of registration)
- If in an LDAP environment, import accounts or associate system with LDAP for account lookup and authentication
-
Identify Online Accounts (by fleshing out profile) which also connects system to online accounts and their data
- Use Profile for granular description (Biography, Interests, WishList, OfferList, etc.)
- Optionally upstream or downstream data to and from my online accounts
- Create content Tagging Rules
- Create rules for associating Tags with formal URIs
- Create automatic Hyperlinking Rules for reuse when new content is created (e.g. Blog posts)
- Exploit Data Portability virtues of RSS, Atom, OPML, RDFa, RDF/XML, and other formats for imports and exports
- Automatically tag imported content
- Use function-specific helper application UIs for domain specific data generation e.g. AddressBook (optionally use vCard import), Calendar (optionally use iCalendar import), Email, File Storage (use WebDAV mount with copy and paste or HTTP GET), Feed Subscriptions (optionally import RSS/Atom/OPML feeds), Bookmarking (optionally import bookmark.html or XBEL) etc.
- Optionally enable "Conversation" feature (today: Social Media feature) across the relevant application domains (manage conversations under the covers using NNTP, the standard for this functionality realm)
- Generate HTTP based Entity IDs (URIs) for every piece of data in this burgeoning data space
- Use REST based APIs to perform CRUD tasks against my data (local and remote) (SPARQL, GData, Ubiquity Commands, Atom Publishing)
- Use OpenID, OAuth, FOAF+SSL, FOAF+SSL+OpenID for accessing data elsewhere
- Use OpenID, OAuth, FOAF+SSL, FOAF+SSL+OpenID for Controlling access to my data (Self Signed Certificate Generation, Browser Import of said Certificate & associated Private Key, plus persistence of Certificate to FOAF based profile data space in "one click")
- Have a simple UI for Entity-Attribute-Value or Subject-Predicate-Object arbitrary data annotations and creation since you can't pre model an "Open World" where the only constant is data flow
- Have my Personal URI (Web ID) as the single entry point for controlled access to my HTTP accessible data space
I've just outlined a snippet of the capabilities of the OpenLink Data Spaces platform. A platform built using OpenLink Virtuoso, architected to deliver: open, platform independent, multi-model, data access and data management across heterogeneous data sources.
All you need to remember is your URI when seeking to interact with your data space.
Related
-
Get Yourself a URI (Web ID) in 5 Minutes or Less!
-
Various posts over the years about Data Spaces
-
Future of Desktop Post
-
Simplify My Life Post by Bengee Nowack
|
04/22/2009 14:46 GMT-0500
|
Modified:
04/22/2009 15:32 GMT-0500
|
Live Virtuoso instance hosting Linked Open Data (LOD) Cloud
We have reached a beachhead re. the Virtuoso instance hosting the Linked Open Data (LOD) Cloud; meaning, we are not going to be performing any major updates and deletions short-term, bar incorporation of fresh data sets from the Freebase and Bio2RDF projects (both communities are prepping new RDF data sets). At the current time we have loaded 100% of all the very large data sets from the LOD Cloud. As a result, we can start the process of exposing Linked Data virtues in a manner that's palatable to users, developers, and database professionals across the Web 1.0, 2.0, and 3.0 spectrums.
What does this mean?
You can use the "Search & Find", "URI Lookup", or SPARQL endpoint associated with the LOD cloud hosting instance to perform the following tasks:
- Find entities associated with full text search patterns -- Google Style, but with Entity & Text proximity Rank instead of Page Rank, since we are dealing with Entities rather than documents about entities
- Find and Lookup entities by Identifier (URI) -- which is helpful when locating URIs to use for identifying entities in your own linked data spaces on the Web
- View entity descriptions via a variety of representation formats (HTML, RDFa, RDF/XML, N3, Turtle etc.)
- Determine uses of entity identifiers across the LOD cloud -- which helps you select preferred URIs based on usage statistics.
What does it offer Web 1.0 and 2.0 developers?
If you don't want to use the SPARQL based Web Service, or other Linked Data Web oriented APIs for interacting with the LOD cloud programmatically, you can simply use the powerful REST style Web Service that provides URL parameters for performing full text oriented "Search", entity oriented "Find" queries, and faceted navigation over the huge data corpus with results data returned in JSON and XML formats.
Next Steps:
Amazon have agreed to add all the LOD Cloud data sets to their existing public data sets collective. Thus, the data sets we are loading will be available in "raw data" (RDF) format on the public data sets page via Named Elastic Block Storage (EBS) Snapshots; meaning, you can make an EC2 AMI (e.g. Linux, Windows, Solaris) and install an RDF quad or triple store of choice into your AMI, then simply load data from the LOD cloud based on your needs.
In addition to the above, we are also going to offer a Virtuoso 6.0 Cluster Edition based LOD Cloud AMI (as we've already done with DBpedia, MusicBrainz, NeuroCommons, and Bio2RDF) that will enable you to simply instantiate a personal and service specific edition of Virtuoso with all the LOD data in place and fully tuned for performance and scalability; basically, you will simply press "Instantiate AMI" and a LOD cloud data space, in true Linked Data form, will be at your disposal within minutes (i.e. the time it takes the DB to start).
Work on the migration of the LOD data to EC2 starts this week. Thus, if you are interested in contributing an RDF based data set to the LOD cloud, now is the time to get your archive links in place (see: ESW Wiki page for LOD Data Sets).
|
03/30/2009 11:27 GMT-0500
|
Modified:
04/01/2009 14:26 GMT-0500
|
How Linked Data will change Advertising
This post is a reply to Jason Kolb's post titled: Using Advertising to Take Over the World. Jason's post is a response to Robert Scoble's post titled: Why Facebook has never listened and why it definitely won’t start now.
Jason: Scoble is sensing what comes next, but in my opinion, describes it using an old obtrusive advertising model anecdote. I've penned a post or two about the "Magic of You" which is all about the new Web power broker (Entity: "You"). Personally, I've long envisaged a complete overhaul of advertising where obtrusive advertising simply withers away; ultimately replaced by an unobtrusive model that is driven by individualized relevance and high doses of serendipity. Basically, this is ultimately about "taking the Ad out of item placement in Web pages".
The fundamental ingredients of an unobtrusive advertising landscape would include the following Human facts:
- We are social beings and need stuff from time to time
- We know what we need and would like to "Find stuff" when we are in "I Need Stuff" mode.
Ideally, we would like to be able to simply state the following, via a Web accessible profile:
- Here are my "Wants" or "Needs" (my Wish-List)
- Here are the products and services that I "Offer" (my Offer-List).
Now put the above into the context of an evolving Web where data items are becoming more visible by the second, courtesy of the "Linked Data" meme. Thus, things that weren't discernable via the Web: "People", "Places", "Music", "Books", "Products", etc., become much easier to identify and describe.
Assuming the comments above hold true re. the Web's evolution into a collection of Linked Data Spaces, and the following occur:
- Structured profile pages become the basic units of Web presence
- Wish-Lists and Offer-Lists are exposed by profile pages
Wish-Lists and Offer-Lists will gradually start bonding with increasing degrees of serendipity courtesy of exponential growth in Linked Data Web density.
So based on what I've stated so far, Scoble would simply browse the Web or visit his profile page, and in either scenario enjoy a "minority report" style of experience albeit all under his control (since he is the one driving his Web user agent). What I describe above simply comes down to "Wish-lists" and associated recommendations becoming the norm outside the confines of Amazon's data space on the Web. Serendipitous discovery, intelligent lookups, and linkages are going to be the fundamental essence of Linked Data Web oriented applications, services, and agents.
Beyond Scoble, it's also important to note that access to data will be controlled by entity "You". Your data space on the Web will be something you will control access to in a myriad of ways, and it will include the option to provide licensed access to commercial entities on your terms. Naturally, you will also determine the currency that facilitates the value exchange :-)
|
03/22/2009 23:39 GMT-0500
|
Modified:
03/25/2009 08:30 GMT-0500
|