Details

OpenLink Software
Burlington, United States

Subscribe

Post Categories

Recent Articles

Community Member Blogs

Display Settings

articles per page.
order.

Translate

Showing posts in all categories RefreshRefresh
5 Very Important Things to Note about HTTP based Linked Data [ Kingsley Uyi Idehen ]
  1. It isn't World Wide Web Specific (HTTP != World Wide Web)
  2. It isn't Open Data Specific
  3. It isn't about "Free" (Beer or Speech)
  4. It isn't about Markup (so don't expect to grok it via "markup first" approach)
  5. It's about Hyperdata - the use of HTTP and REST to deliver a powerful platform agnostic mechanism for Data Reference, Access, and Integration.

When trying to understand HTTP based Linked Data, especially if you're well versed in DBMS technology use (User, Power User, Architect, Analyst, DBA, or Programmer) think:

  • Open Database Connectivity (ODBC) without operating system, data model, or wire-protocol specificity or lock-in potential
  • Java Database Connectivity (JDBC) without programming language specificity
  • ADO.NET without .NET runtime specificity and .NET bound language specificity
  • OLE-DB without Windows operating system & programming language specificity
  • XMLA without XML format specificity - with Tabular and Multidimensional results formats expressible in a variety of data representation formats.
  • All of the above scoped to the Record rather than Container level, with Generic HTTP scheme URIs associated with each Record, Field, and Field value (optionally)

Remember the need for Data Access & Integration technology is the by product of the following realities:

  1. Human curated data is ultimately dirty, because:
    • our thick thumbs, inattention, distractions, and general discomfort with typing, make typos prevalent
    • database engines exist for a variety of data models - Graph, Relational, Hierarchical;
    • within databases you have different record container/partition names e.g. Table Names;
    • within a database record container you have records that are really aspects of the same thing (different keys exist in a plethora of operational / line of business systems that expose aspects of the same entity e.g., customer data that spans Accounts, CRM, ERP application databases);
    • different field names (one database has "EMP" while another has "Employee") for the same record
    • .
  2. Units of measurement is driven by locale, the UK office wants to see sales in Pounds Sterling while the French office prefers French Francs
  3. All of the above is subject to context halos which can be quite granular re. sensitivity e.g. staff travel between locations that alter locales and their roles; basically, profiles matters a lot.

Related

# PermaLink Comments [1]
11/19/2009 14:49 GMT Modified: 11/20/2009 07:12 GMT
5 Game Changing Things about the OpenLink Virtuoso + AWS Cloud Combo [ Kingsley Uyi Idehen ]

Here are 5 powerful benefits you can immediately derive from the combination of Virtuoso and Amazon's AWS services (specifically the EC2 and EBS components):

  1. Acquire your own personal or service specific data space in the Cloud. Think DBase, Paradox, FoxPRO, Access of yore, but with the power of Oracle, Informix, Microsoft SQL Server etc.. using a Conceptual, as opposed to solely Logical, model based DBMS (i.e., a Hybrid DBMS Engine for: SQL, RDF, XML, and Full Text)
  2. Ability to share and control access to your resources using innovations like FOAF+SSL, OpenID, and OAuth, all from one place
  3. Construction of personal or organization based FOAF profiles in a matter of minutes; by simply creating a basic DBMS (or ODS application layer) account; and then using this profile to create strong links (references) to all your Data silos (esp. those from the Web 2.0 realm)
  4. Load data sets from the LOD cloud or Sponge existing Web resources (i.e., on the fly data transformation to RDF model based Linked Data) and then use the combination to build powerful lookup services that enrich the value of URLs (think: Web addressable reports holding query results) that you publish
  5. Bind all of the above to a domain that you own (e.g. a .Name domain) so that you have an attribution-friendly "authority" component for resource URLs and Entity URIs published from your Personal Linked Data Space on the Web (or private HTTP network).

In a nutshell, the AWS Cloud infrastructure simplifies the process of generating Federated presence on the Internet and/or World Wide Web. Remember, centralized networking models always end up creating data silos, in some context, ultimately! :-)

# PermaLink Comments [0]
11/18/2009 14:12 GMT Modified: 11/19/2009 15:20 GMT
5 Game Changing Things about the OpenLink Virtuoso + AWS Cloud Combo [ Kingsley Uyi Idehen ]

Here are 5 powerful benefits you can immediately derive from the combination of Virtuoso and Amazon's AWS services (specifically the EC2 and EBS components):

  1. Acquire your own personal or service specific data space in the Cloud. Think DBase, Paradox, FoxPRO, Access of yore, but with the power of Oracle, Informix, Microsoft SQL Server etc.. using a Conceptual, as opposed to solely Logical, model based DBMS (i.e., a Hybrid DBMS Engine for: SQL, RDF, XML, and Full Text)
  2. Ability to share and control access to your resources using innovations like FOAF+SSL, OpenID, and OAuth, all from one place
  3. Construction of personal or organization based FOAF profiles in a matter of minutes; by simply creating a basic DBMS (or ODS application layer) account; and then using this profile to create strong links (references) to all your Data silos (esp. those from the Web 2.0 realm)
  4. Load data sets from the LOD cloud or Sponge existing Web resources (i.e., on the fly data transformation to RDF model based Linked Data) and then use the combination to build powerful lookup services that enrich the value of URLs (think: Web addressable reports holding query results) that you publish
  5. Bind all of the above to a domain that you own (e.g. a .Name domain) so that you have an attribution-friendly "authority" component for resource URLs and Entity URIs published from your Personal Linked Data Space on the Web (or private HTTP network).

In a nutshell, the AWS Cloud infrastructure simplifies the process of generating Federated presence on the Internet and/or World Wide Web. Remember, centralized networking models always end up creating data silos, in some context, ultimately! :-)

# PermaLink Comments [0]
11/18/2009 14:12 GMT Modified: 11/19/2009 15:20 GMT
Personal and/or Service Specific Linked Data Spaces in the Cloud: DBpedia 3.4 [ Kingsley Uyi Idehen ]

We have just released an Amazon EC2 based public Snapshot of DBpedia 3.4. Thus, you can now instantiate a personal and/or service specific variant of the DBpedia 3.4 Linked Data Space. Basically, you can replicate what we host, within minutes (as opposed to days). In addition, you no longer need to squabble --on an unpredictable basis with others-- for the infrastructure resources behind DBpedia's public instance, when using the SPARQL Endpoint, Faceted Search & Find Services, or HTML Browser Pages etc.

How Does It work?

  1. Instantiate a Virtuoso EC2 AMI (paid variety, which is aggressively priced at $49.99 for setup and $19.99 per month thereafter)
  2. Mount the shared DBpedia 3.4 public snapshot
  3. Start Virtuoso Server
  4. Start exploiting the DBpedia Linked Data Space.

What Interfaces are exposed?

  1. SPARQL Endpoint
  2. Linked Data Viewer Pages (as you see in the public DBpedia instance)
  3. Faceted Search & Find UI and Web Services (REST or SOAP)
  4. All the inference rules for UMBEL, SUMO, YAGO, OpenCYC, and DBpedia-OWL data dictionaries
  5. Type Correlations Between DBpedia and Freebase

Enjoy!

# PermaLink Comments [0]
11/16/2009 13:17 GMT Modified: 11/16/2009 13:30 GMT
Conversation with Jon Udell: Are We There Yet Re. Web++ ? [ Kingsley Uyi Idehen ]

Personally, I believe that we've actually reached a watershed moment re. the evolution of the Web from a mesh of Linked Data Containers (Web of Linked Documents) to a mesh of Linked Data Items (entities or real world objects).

The journey towards this watershed moment started with the Semantic Web Project, gained focus and pragmatism via the Linked Data meme, attained substance & credibility via efforts such as DBpedia and the resulting cloud of Open Linked Data Spaces, and finally arrived at the most important destination of all: broad comprehension and coherence, via RDFa.

Over the years, I've chronicled the journey above via entries in this particular data space (my blog) and most recently, via my rapid-fire comments and debates on Twitter (basically hastag #linkeddata account: kidehen).

On a parallel front re. my chronicles, I've periodically had conversations with Jon Udell, who has always provided a coherent sounding board and reconciliation framework for my world views and open data access vision; naturally, this has a lot to do with his holistic grasp of the big picture issues, associated technical details, and special communication prowess :-)

Against this backdrop, I refer you to my most recent podcast conversation with Jon, which is about how the tandem of HTML+RDFa and the GoodRelations vocabulary deliver the critical missing links re. broad comprehension of the Semantic Web vision en route to mass exploitation.

Related

# PermaLink Comments [0]
09/10/2009 11:03 GMT Modified: 09/10/2009 11:32 GMT
Updated hardware improves LUBM 8000 load rate in Virtuoso 6 [ Orri Erling ]

We repeated the earlier LUBM 8000 experiment on a newer machine, with 2 x Xeon 5520 and 72G 1333MHz memory, and once again with the 2 machines as a networked cluster. Otherwise the settings were the same.

The load rate is now 160,739 triples-per-second.

   Virtuoso 6
(previous run)
   Virtuoso 6
(new run)
   Virtuoso 6
(newest run)
blades    1    1    2
processors    2 x Xeon 5410    2 x Xeon 5520    2 x Xeon 5520
+
2 x Xeon 5410
with 1x1GigE
interconnect
memory    16G 667 MHz    72G 1333 MHz    72G 1333 MHz
+
16G 667 MHz
respectively
reported load rate
triples-per-second
   110,532    160,739    214,188

Again, if others talk about loading LUBM, so must we. Otherwise, this metric is rather uninteresting.

# PermaLink Comments [0]
08/14/2009 19:01 GMT Modified: 08/15/2009 15:27 GMT
Updated hardware improves LUBM 8000 load rate in Virtuoso 6 [ Virtuso Data Space Bot ]

We repeated the earlier LUBM 8000 experiment on a newer machine, with 2 x Xeon 5520 and 72G 1333MHz memory, and once again with the 2 machines as a networked cluster. Otherwise the settings were the same.

The load rate is now 160,739 triples-per-second.

   Virtuoso 6
(previous run)
   Virtuoso 6
(new run)
   Virtuoso 6
(newest run)
blades    1    1    2
processors    2 x Xeon 5410    2 x Xeon 5520    2 x Xeon 5520
+
2 x Xeon 5410
with 1x1GigE
interconnect
memory    16G 667 MHz    72G 1333 MHz    72G 1333 MHz
+
16G 667 MHz
respectively
reported load rate
triples-per-second
   110,532    160,739    214,188

Again, if others talk about loading LUBM, so must we. Otherwise, this metric is rather uninteresting.

# PermaLink Comments [0]
08/14/2009 19:01 GMT Modified: 08/15/2009 15:27 GMT
The URI, URL, and Linked Data Meme's Generic HTTP URI (Updated) [ Kingsley Uyi Idehen ]

Situation Analysis

As the "Linked Data" meme has gained momentum you've more than likely been on the receiving end of dialog with Linked Open Data community members (myself included) that goes something like this:

"Do you have a URI", "Get yourself a URI", "Give me a de-referencable URI" etc..

And each time, you respond with a URL -- which to the best of your Web knowledge is a bona fide URI. But to your utter confusion you are told: Nah! You gave me a Document URI instead of the URI of a real-world thing or object etc..

What's up with that?

Well our everyday use of the Web is an unfortunate conflation of two distinct things, which have Identity: Real World Objects (RWOs) & Address/Location of Documents (Information bearing Resources).

The "Linked Data" meme is about enhancing the Web by unobtrusively reintroducing its core essence: the generic HTTP URI, a vital piece of Web Architecture DNA. Basically, its about so realizing the full capabilities of the Web as a platform for Open Data Identification, Definition, Access, Storage, Representation, Presentation, and Integration.

What is a Real World Object?

People, Places, Music, Books, Cars, Ideas, Emotions etc..

What is a URI?

A Uniform Resource Identifier. A global identifier mechanism for network addressable data items. Its sole function is Name oriented Identification.

URI Generic Syntax

The constituent parts of a URI (from URI Generic Syntax RFC) are depicted below: Image

What is a URL?

A location oriented HTTP scheme based URI. The HTTP scheme introduces a powerful and inherent duality that delivers:

  1. Resource Address/Location Identifier
  2. Data Access mechanism for an Information bearing Resource (Document, File etc..)

So far so good!

What is an HTTP based URI?

The kind of URI Linked Data aficionados mean when they use the term: URI.

An HTTP URI is an HTTP scheme based URI. Unlike a URL, this kind of HTTP scheme URI is devoid of any Web Location orientation or specificity. Thus, Its inherent duality provides a more powerful level of abstraction. Hence, you can use this form of URI to assign Names/Identifiers to Real World Objects (RWO). Even better, courtesy of the Identity/Address duality of the HTTP scheme, a single URI can deliver the following:

  1. RWO Identfier/Name
  2. RWO Metadata document Locator (courtesy of URL aspect)
  3. Negotiable Representation of the Located Document (courtesy of HTTP's content negotiation feature).

What is Metadata?

Data about Data. Put differently, data that describes other data in a structured manner.

How Do we Model Metadata?

The predominant model for metadata is the Entity-Attribute-Value + Classes & Relationships model (EAV/CR). A model that's been with us since the inception of modern computing (long before the Web).

What about RDF?

The Resource Description Framework (RDF) is a framework for describing Web addressable resources. In a nutshell, its a framework for adding Metadata bearing Information Resources to the current Web. Its comprised of:

  1. Entity-Attribute-Value (aka. Subject-Predictate-Object) plus Classes & Relationships (Data Dictionaries e.g., OWL) metadata model
  2. A plethora of instance data representation formats that include: RDFa (when doing so within (X)HTML docs), Turtle, N3, TriX, RDF/XML etc.

What's the Problem Today?

The ubiquitous use of the Web is primarily focused on a Linked Mesh of Information bearing Documents. URLs rather than generic HTTP URIs are the prime mechanism for Web tapestry; basically, we use URLs to conduct Information -- which is inherently subjective -- instead of using HTTP URIs to conduct "Raw Data" -- which is inherently objective.

Note: Information is "data in context", it isn't the same thing as "Raw Data". Thus, if we can link to Information via the Web, why shouldn't we be able to do the same for "Raw Data"?

How Does the Link Data meme solve the problem?

The meme simply provides a set of guidelines (best practices) for producing Web architecture friendly metadata. Meaning: when producing EAV/CR model based metadata, endow Subjects, their Attributes, and Attribute Values (optionally) with HTTP URIs. By doing so, a new level of Link Abstraction on the Web is possible i.e., "Data Item to Data Item" level links (aka hyperdata links). Even better, when you de-reference a RWO hyperdata link you end up with a negotiated representations of its metadata.

Conclusion

Linked Data is ultimately about an HTTP URI for each item in the Data Organization Hierarchy :-)

Related

  1. History of how "Resource" became part of URI - historic account by TimBL
  2. Linked Data Design Issues Document - TimBL's initial Linked Data Guide
  3. Linked Data Rules Simplified - My attempt at simplifying the Linked Data Meme without SPARQL & RDF distraction
  4. Linked Data & Identity - another related post
  5. The Linked Data Meme's Value Proposition
  6. My Del.icio.us hosted Bookmark Data Space for Identity Schemes
  7. TimBL's Ted Talk re. "Raw Linked Data".
# PermaLink Comments [2]
08/07/2009 14:34 GMT Modified: 10/07/2009 08:02 GMT
The URI, URL, and Linked Data Meme's Generic HTTP URI (Updated) [ Kingsley Uyi Idehen ]

Situation Analysis

As the "Linked Data" meme has gained momentum you've more than likely been on the receiving end of dialog with Linked Open Data community members (myself included) that goes something like this:

"Do you have a URI", "Get yourself a URI", "Give me a de-referencable URI" etc..

And each time, you respond with a URL -- which to the best of your Web knowledge is a bona fide URI. But to your utter confusion you are told: Nah! You gave me a Document URI instead of the URI of a real-world thing or object etc..

What's up with that?

Well our everyday use of the Web is an unfortunate conflation of two distinct things, which have Identity: Real World Objects (RWOs) & Address/Location of Documents (Information bearing Resources).

The "Linked Data" meme is about enhancing the Web by unobtrusively reintroducing its core essence: the generic HTTP URI, a vital piece of Web Architecture DNA. Basically, its about so realizing the full capabilities of the Web as a platform for Open Data Identification, Definition, Access, Storage, Representation, Presentation, and Integration.

What is a Real World Object?

People, Places, Music, Books, Cars, Ideas, Emotions etc..

What is a URI?

A Uniform Resource Identifier. A global identifier mechanism for network addressable data items. Its sole function is Name oriented Identification.

URI Generic Syntax

The constituent parts of a URI (from URI Generic Syntax RFC) are depicted below: Image

What is a URL?

A location oriented HTTP scheme based URI. The HTTP scheme introduces a powerful and inherent duality that delivers:

  1. Resource Address/Location Identifier
  2. Data Access mechanism for an Information bearing Resource (Document, File etc..)

So far so good!

What is an HTTP based URI?

The kind of URI Linked Data aficionados mean when they use the term: URI.

An HTTP URI is an HTTP scheme based URI. Unlike a URL, this kind of HTTP scheme URI is devoid of any Web Location orientation or specificity. Thus, Its inherent duality provides a more powerful level of abstraction. Hence, you can use this form of URI to assign Names/Identifiers to Real World Objects (RWO). Even better, courtesy of the Identity/Address duality of the HTTP scheme, a single URI can deliver the following:

  1. RWO Identfier/Name
  2. RWO Metadata document Locator (courtesy of URL aspect)
  3. Negotiable Representation of the Located Document (courtesy of HTTP's content negotiation feature).

What is Metadata?

Data about Data. Put differently, data that describes other data in a structured manner.

How Do we Model Metadata?

The predominant model for metadata is the Entity-Attribute-Value + Classes & Relationships model (EAV/CR). A model that's been with us since the inception of modern computing (long before the Web).

What about RDF?

The Resource Description Framework (RDF) is a framework for describing Web addressable resources. In a nutshell, its a framework for adding Metadata bearing Information Resources to the current Web. Its comprised of:

  1. Entity-Attribute-Value (aka. Subject-Predictate-Object) plus Classes & Relationships (Data Dictionaries e.g., OWL) metadata model
  2. A plethora of instance data representation formats that include: RDFa (when doing so within (X)HTML docs), Turtle, N3, TriX, RDF/XML etc.

What's the Problem Today?

The ubiquitous use of the Web is primarily focused on a Linked Mesh of Information bearing Documents. URLs rather than generic HTTP URIs are the prime mechanism for Web tapestry; basically, we use URLs to conduct Information -- which is inherently subjective -- instead of using HTTP URIs to conduct "Raw Data" -- which is inherently objective.

Note: Information is "data in context", it isn't the same thing as "Raw Data". Thus, if we can link to Information via the Web, why shouldn't we be able to do the same for "Raw Data"?

How Does the Link Data meme solve the problem?

The meme simply provides a set of guidelines (best practices) for producing Web architecture friendly metadata. Meaning: when producing EAV/CR model based metadata, endow Subjects, their Attributes, and Attribute Values (optionally) with HTTP URIs. By doing so, a new level of Link Abstraction on the Web is possible i.e., "Data Item to Data Item" level links (aka hyperdata links). Even better, when you de-reference a RWO hyperdata link you end up with a negotiated representations of its metadata.

Conclusion

Linked Data is ultimately about an HTTP URI for each item in the Data Organization Hierarchy :-)

Related

  1. History of how "Resource" became part of URI - historic account by TimBL
  2. Linked Data Design Issues Document - TimBL's initial Linked Data Guide
  3. Linked Data Rules Simplified - My attempt at simplifying the Linked Data Meme without SPARQL & RDF distraction
  4. Linked Data & Identity - another related post
  5. The Linked Data Meme's Value Proposition
  6. My Del.icio.us hosted Bookmark Data Space for Identity Schemes
  7. TimBL's Ted Talk re. "Raw Linked Data".
# PermaLink Comments [2]
08/07/2009 14:34 GMT Modified: 10/07/2009 08:02 GMT
Why Do We Put Stuff On The Web, Really? [ Kingsley Uyi Idehen ]

As espoused by the Ubuntu philosophy, no Human is an Island. Thus, although the objects of our sociality are vast and varied; that said, the basic foundation still centers on the pursuit and/or delivery of products and services.

Today, the we put stuff on the Web because we want it do be discovered as part of a "sharing act". Likewise, we make regular use of Search Engine Services because we want to "Find" stuff in a productive manner.

Putting, the above in context, you don't need to be Einstein to figure out that to date the Web hasn't enabled vendors to describe their products and services clearly. Likewise, it hasn't enabled us to describe what we want, when we want it, and how much we are willing to pay etc. Basically, the SDQ of Web Content is excruciatingly low!

The Linked Data meme is about using the essence of the Web -- HTTP URIs -- as the mechanism for conducting data across the Web that unambiguously unveils basic things like:

  1. Using a personal profile to describe exactly who I am, my interests, favorite things, what I want (wishlist), what I have to offer (offerlist) etc.
  2. Using an company profile to describe my entire product catalog, inventory levels, store locations, distributor and reseller networks, feature specs, price specs, deal terms and duration, and even opening and closing hours.

Conclusions

A Web of Linked Data enables a complete redefinition of eCommerce, and that's just for starters :-)

Related

# PermaLink Comments [0]
07/24/2009 11:54 GMT Modified: 07/24/2009 21:00 GMT
 <<     | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |     >>
Powered by OpenLink Virtuoso Universal Server
Running on Linux platform