Hopefully, I can expand further :-)
]]>"We all know that structured data is boring and useless; while unstructured data is sexy and chock full of value. Well, only up to a point, Lord Copper. Genuinely unstructured data can be a real nuisance - imagine extracting the return address from an unstructured letter, without letterhead and any of the formatting usually applied to letters. A letter may be thought of as unstructured data, but most business letters are, in fact, highly-structured." ....Duncan Pauly, founder and chief technology officer of Coppereye add's eloquent insight to the conversation:
"The labels "structured data" and "unstructured data" are often used ambiguously by different interest groups; and often used lazily to cover multiple distinct aspects of the issue. In reality, there are at least three orthogonal aspects to structure:* The structure of the data itself.
* The structure of the container that hosts the data.
* The structure of the access method used to access the data.
These three dimensions are largely independent and one does not need to imply another. For example, it is absolutely feasible and reasonable to store unstructured data in a structured database container and access it by unstructured search mechanisms."
Data understanding and appreciation is dwindling at a time when the reverse should be happening. We are supposed to be in the throes of the "Information Age", but for some reason this appears to have no correlation with data and "data access" in the minds of many -- as reflected in the broadly contradictory positions taken re. unstructured vs. structured data: structured is boring and useless, while unstructured is useful and sexy....
The difference between "Structured Containers" and "Structured Data" is clearly misunderstood by most (an unfortunate fact).
For instance, all DBMS products are "Structured Containers" aligned to one or more data models (typically one). These products have been limited by proprietary data access APIs and underlying data model specificity when used in the "Open-world" model that is at the core of the World Wide Web. This confusion also carries over to the misconception that Web 2.0 and the Semantic/Data Web are mutually exclusive.
But things are changing fast, and the concept of multi-model DBMS products is beginning to crystallize. On our part, we have finally released the long promised "OpenLink Data Spaces" application layer that has been developed using our Virtuoso Universal Server. It provides unified, structured storage containment exposed to the Data Web cloud via endpoints for querying or accessing data using a variety of mechanisms that include: GData, OpenSearch, SPARQL, XQuery/XPath, SQL, etc.
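For instance, here is a minimal sketch (Python, standard library only) of querying such a SPARQL endpoint over plain HTTP; the endpoint URL is illustrative, so substitute your own ODS/Virtuoso instance:

```python
import json
import urllib.parse
import urllib.request

# Illustrative endpoint; any SPARQL-protocol endpoint (e.g. a Virtuoso instance) works.
ENDPOINT = "http://demo.openlinksw.com/sparql"

query = "SELECT ?post WHERE { ?post a <http://rdfs.org/sioc/ns#Post> } LIMIT 10"

req = urllib.request.Request(
    ENDPOINT + "?" + urllib.parse.urlencode({"query": query}),
    headers={"Accept": "application/sparql-results+json"},  # standard SPARQL results format
)
with urllib.request.urlopen(req) as resp:
    for row in json.load(resp)["results"]["bindings"]:
        print(row["post"]["value"])
```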
To be continued....
T-Mobile responds to Paris Hilton Sidekick hacking
[via Venture Chronicles by Jeff Nolan]
This incident is an interesting one to follow as there is a little more to it than the purported T-Mobile stance: "..Paris may have given out her password..".
I have written about database and data access security matters on numerous occasions, and my underlying message has always been that there are many dimensions to security vulnerability that aren't catered for when the distinct functional domains of data access and data storage intersect. (I am almost certain that the infrastructure at the bottom of this controversy comprises at least one or more of the following: data access drivers (closed- or open source), a relational database engine (closed- or open source), and a web application scripting language (closed- or open source).)
Here is a hypothetical situation relating to this matter. Let's assume that Paris did inadvertently give away her password; would it be too much for her to assume that T-Mobile's data access infrastructure should be capable of controlling access to her data using any combination of her password and the following:
If a very simple combination of the elements above formed part of the T-Mobile authentication and data access security matrix, we would be looking at a much clearer picture of this hack's vulnerability scenarios, confined to the following:
T-Mobile should have a data access security infrastructure with a rule that restricts Sidekick accounts, by default, from directly accessing address book data from remote locations, for instance. Account owners should be allowed to enable this feature after receiving clear notification about the security implications.
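To make the idea concrete, here is a purely hypothetical sketch of such a default-deny rule (the names are mine, not T-Mobile's):

```python
from dataclasses import dataclass

@dataclass
class SidekickAccount:
    owner: str
    remote_address_book_access: bool = False  # locked down by default

def may_read_address_book(account: SidekickAccount, request_is_remote: bool) -> bool:
    # On-device access is always permitted; remote access requires an
    # explicit, informed opt-in by the account owner.
    return (not request_is_remote) or account.remote_address_book_access

paris = SidekickAccount(owner="paris")
assert may_read_address_book(paris, request_is_remote=False)      # local: allowed
assert not may_read_address_book(paris, request_is_remote=True)   # remote: denied by default
```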
At the end of the marathon session, it was clear to me that a blog post was required for future reference, at the very least :-)
"Data Access by Reference" mechanism for Data Objects (or Entities) on HTTP networks. It enables you to Identify a Data Object and Access its structured Data Representation via a single Generic HTTP scheme based Identifier (HTTP URI). Data Object representation formats may vary; but in all cases, they are hypermedia oriented, fully structured, and negotiable within the context of a client-server message exchange.
Information makes the world tick!
Information doesn't exist without data to contextualize.
Information is inaccessible without a projection (presentation) medium.
All information (without exception, when produced by humans) is subjective. Thus, to truly maximize the innate heterogeneity of collective human intelligence, loose coupling of our information and associated data sources is imperative.
Linked Data is exposed to HTTP networks (e.g. World Wide Web) via hypermedia resources bearing structured representations of data object descriptions. Remember, you have a single Identifier abstraction (generic HTTP URI) that embodies: Data Object Name and Data Representation Location (aka URL).
A structured representation of data exists when an Entity (Datum), its Attributes, and its Attribute Values are clearly discernible. In the case of a Linked Data Object, structured descriptions take the form of a hypermedia based Entity-Attribute-Value (EAV) graph -- where each Entity, its Attributes, and (optionally) its Attribute Values are identified using Generic HTTP URIs.
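Here is a minimal sketch of such an EAV description, built with the rdflib Python library (assumed installed via pip); the entity URI is hypothetical:

```python
from rdflib import Graph, Literal, Namespace, URIRef

FOAF = Namespace("http://xmlns.com/foaf/0.1/")

g = Graph()
me = URIRef("http://example.org/people/kidehen#this")  # hypothetical Entity identifier

# Each statement is one Entity-Attribute-Value 3-tuple.
g.add((me, FOAF.name, Literal("Kingsley Idehen")))                              # literal Value
g.add((me, FOAF.interest, URIRef("http://dbpedia.org/resource/Semantic_Web")))  # URI Value

print(g.serialize(format="turtle"))
```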
Examples of structured data representation formats (content types) associated with Linked Data Objects include:
You mark up resources by expressing distinct entity-attribute-value statements (basically, these are 3-tuple records) using a variety of notations:
You can achieve this task using any of the following approaches:
Our data access middleware heritage (which spans 16+ years) has enabled us to assemble a rich portfolio of coherently integrated products that enable cost-effective evaluation and utilization of Linked Data, without writing a single line of code or being exposed to hidden but extensive admin and configuration costs. Post installation, the benefits of Linked Data simply materialize (along the lines described above).
Our main Linked Data oriented products include:
Anyway, Socialtext and Mike 2.0 (they aren't identical, and this juxtaposition isn't seeking to imply that they are) provide nice demonstrations of what socially enhanced collaboration for individuals and/or enterprises is all about:
As is typically the case in this emerging realm, the critical issue of discrete "identifiers" (record keys, in a sense) for data items, data containers, and data creators (individuals and groups) is overlooked, albeit unintentionally.
Rather than using platform constrained identifiers such as:
It enables you to leverage the platform independence of HTTP scheme Identifiers (Generic URIs) such that Identifiers for:
simply become conduits into a mesh of HTTP -- referenceable and accessible -- Linked Data Objects endowed with High SDQ (Serendipitous Discovery Quotient). For example, my Personal WebID is all anyone needs to know if they want to explore:
Even when you reach a point of equilibrium where your daily activities trigger orchestration of CRUD (Create, Read, Update, Delete) operations against Linked Data Objects within your socially enhanced collaboration network, you still have to deal with the thorny issues of security, which include the following:
FOAF+SSL, an application of HTTP based Linked Data, enables you to enhance your Personal HTTP scheme based Identifier (or WebID) via the following steps (performed by a FOAF+SSL compliant platform):
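In essence, a compliant platform extracts the WebID from the client certificate's subjectAltName, dereferences that WebID, and checks that the certificate's public key appears in the profile it retrieves. A conceptual sketch follows (certificate parsing elided; the cert: vocabulary shape reflects the later WebID spec, so treat the property names as assumptions):

```python
from rdflib import Graph, Namespace, URIRef

CERT = Namespace("http://www.w3.org/ns/auth/cert#")

def verify_webid(webid: str, cert_modulus: int, cert_exponent: int) -> bool:
    g = Graph()
    g.parse(webid)  # dereference the WebID; the profile lists the owner's public keys
    person = URIRef(webid)
    for key in g.objects(person, CERT.key):
        mod = g.value(key, CERT.modulus)      # typically an xsd:hexBinary literal
        exp = g.value(key, CERT.exponent)
        if mod is not None and exp is not None \
                and int(str(mod), 16) == cert_modulus \
                and int(str(exp)) == cert_exponent:
            return True  # key in profile matches key in certificate: identity verified
    return False
```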
Contrary to conventional experiences with all things PKI (Public Key Infrastructure) related, FOAF+SSL compliant platforms typically handle the PKI issues as part of the protocol implementation; thereby protecting you from any administrative tedium without compromising security.
Understanding how new technology innovations address long-standing problems, or how new solutions inadvertently fail to address old problems, provides time-tested mechanisms for product selection and value proposition comprehension that ultimately save scarce resources such as time and money.
If you want to understand real world problem solution #1 with regards to HTTP based Linked Data, look no further than the issue of secure, socially aware, and platform independent identifiers for data objects, which build bridges across erstwhile data silos.
If you want to cost-effectively experience what I've outlined in this post, take a look at OpenLink Data Spaces (ODS), a distributed collaboration engine (for enterprises or individuals) built around the Virtuoso database engine. It simply enhances existing collaboration tools via the following capabilities:
Addition of Social Dimensions via HTTP based Data Object Identifiers for all Data Items (if missing)
The primary topic of a meme penned by TimBL in the form of a Design Issues Doc (note: this is how TimBL has shared his thoughts since the Beginning of the Web).
There are a number of dimensions to the meme, but its primary purpose is the reintroduction of the HTTP URI -- a vital component of the Web's core architecture.
They possess an intrinsic duality that combines persistent and unambiguous Data Identity with platform & representation format independent Data Access. Thus, you can use a string of characters that look like a contemporary Web URL to unambiguously achieve the following:
Enabling more productive use of the Web by users and developers alike. All of which is achieved by tweaking the Web's Hyperlinking feature such that it now includes Hypertext and Hyperdata as link types.
Note: Hyperdata Linking is simply what an HTTP URI facilitates.
Example problems solved by injecting Linked Data into the Web:
If all of the above still falls into the technical mumbo-jumbo realm, then simply consider Linked Data as delivering Open Data Access in granular form to Web accessible data -- that goes beyond data containers (documents or files).
The value proposition of Linked Data is inextricably linked to the value proposition of the World Wide Web. This is true, because the Linked Data meme is ultimately about an enhancement of the current Web; achieved by reintroducing its architectural essence -- in new context -- via a new level of link abstraction, courtesy of the Identity and Access duality of HTTP URIs.
As a result of Linked Data, you can now have Links on the Web for a Person, Document, Music, Consumer Electronics, Products & Services, Business Opening & Closing Hours, Personal "WishLists" and "OfferLists", an Idea, etc., in addition to links for Properties (Attributes & Values) of the aforementioned. Ultimately, all of these links will be indexed in a myriad of ways, providing the substrate for the next major period of Internet & Web driven innovation, within our larger human-ingenuity driven innovation continuum.
The HTTP URI is the secret sauce of the Web that is powerfully and unobtrusively reintroduced via the Linked Data meme (classic back to the future act). This powerful sauce possesses a unique power courtesy of its inherent duality, i.e., how it uniquely combines Data Item Identity (think keys in traditional DBMS parlance) with Data Access (e.g. access to negotiable representations of associated metadata).
As you can see, I've made no mention of RDF or SPARQL, and I can still articulate the inherent value of the "Linked Data" dimension that the "Linked Data" meme adds to the World Wide Web.
As per usual this post is a live demonstration of Linked Data (dog-food style) :-)
What is our "Search" and "Find" demonstration about? It is about how you use the "Description" of "Things" to unambiguously locate things in a database at Web Scale.
To our perpetual chagrin, we are trying to demonstrate an engine -- not UI prowess -- but the immediate response is to jump to the UI aesthetics.
Google, Yahoo, etc. offer a simple input form for full text search patterns; they have a processing window for completing full text searches across Web content indexed on their servers. Once the search patterns are processed, you get a page-ranked result set (basically a collection of Web pages, plus a claim/statement: we found N pages out of a document corpus of about M indexed pages).
Note: the estimates in traditional search results are like "advertising small print": the user lives with the illusion that all possible documents on the Web (or even the Internet) have been searched, whereas in reality 25% of the possible total is a major stretch, since the Web and Internet are fractal, scale-free networks, inherently growing at exponential rates "ad infinitum" across boundless dimensions of human comprehension.
The power of Linked Data ultimately comes down to the fact that the user constructs the path to what they seek via the properties of the "Things" in question. The routes are not hardwired since URI de-referencing (follow your nose pattern) is available to Linked Data aware query engines and crawlers.
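The "follow your nose" pattern is simple enough to sketch in a few lines with rdflib: dereference a URI, read its description, and hop along any URI-valued property (the starting URI is illustrative):

```python
from rdflib import Graph, URIRef

start = URIRef("http://dbpedia.org/resource/Paris")  # any Linked Data URI works

g = Graph()
g.parse(start)  # content negotiation retrieves an RDF representation

for _, prop, obj in g.triples((start, None, None)):
    if isinstance(obj, URIRef):
        print(f"next hop candidate: {obj} (via {prop})")
```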
We are simply trying to demonstrate how you can combine the best of full text search with the best of structured querying, while reusing familiar interaction patterns from Google/Yahoo. Thus, you start with a full text search, get all the entities associated with the pattern, then use the entity types or entity properties to find what you seek.
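A sketch of the query shape behind that interaction (bif:contains is Virtuoso's full text extension, which powers the demo; a stock SPARQL endpoint would use FILTER or its own text index instead):

```python
# Full text pattern first, then structured narrowing by entity type.
query = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?entity WHERE {
  ?entity ?attribute ?text .
  ?text bif:contains '"glenn mcdonald"' .   # Virtuoso full text match
  ?entity a foaf:Person .                   # keep only entities typed as Person
}
LIMIT 25
"""
```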
You state in your post:
"To state the obvious caveat, the claim OpenLink is making about this demo is not that it delivers better search-term relevance, therefore the ranking of searching results is not the main criteria on which it is intended to be assessed."
Correct.
"On the other hand, one of the things they are bragging about is that their server will automatically cut off long-running queries. So how do you like your first page of results?".
Not exactly correct. We are performing aggregates using a configurable interactive time factor. Example: tell me how many entities of type: Person, with interest: Semantic Web, exist in this database within 2 seconds. Also understand that you could retry the same query and get different numbers within the same interactive time factor. It isn't your basic "query cut-off".
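For the curious, a sketch of how a client exercises this against a Virtuoso endpoint; the timeout parameter (milliseconds) enables the anytime-query behavior, though the parameter name and the endpoint URL here are assumptions, not part of the SPARQL standard:

```python
import urllib.parse
import urllib.request

params = urllib.parse.urlencode({
    "query": """SELECT (COUNT(?s) AS ?people) WHERE {
                  ?s a <http://xmlns.com/foaf/0.1/Person> ;
                     <http://xmlns.com/foaf/0.1/interest>
                     <http://dbpedia.org/resource/Semantic_Web> . }""",
    "timeout": "2000",  # best count obtainable within ~2 seconds; a rerun may differ
})
req = urllib.request.Request(
    "http://lod.openlinksw.com/sparql?" + params,  # assumed demo endpoint
    headers={"Accept": "application/sparql-results+json"},
)
print(urllib.request.urlopen(req).read().decode("utf-8"))
```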
"And on the other other hand, the big claim OpenLink is making about this demo is that the aggregate experience of using it is better than the aggregate experience of using "traditional" search. So go ahead, use it. If you can."
Yes, "Microsoft" was a poor example for sure, the example could have been pattern: "glenn mcdonald", which should demonstrate the fundamental utility of what we are trying to demonstrate i.e., entity disambiguation courtesy of entity properties and/or entity type filtering.
Compare Google's results for "Glenn McDonald" with those from our demo (which disambiguates "Glenn McDonald" via associated properties and/or types), assuming we both agree that your Web site or blog home page isn't the center of your entity graph or personal data space (i.e., data about you); getting your home page to the top of the Google page rank offers limited value, in reality.
What are we bragging about? A little more than what you attempt to explain. Yes, we are showing that we can find stuff within a processing window, but understand the following:
I hope I've clarified what's going on with our demo. If not, pose your challenge via examples, and I will respond with solutions or simply cry out loud: "no mas!".
As for your "Mac OX X Leopard" comments, I can only say this: I emphasized that this is a demo, the data is pretty old, and the input data has issues (i.e. some of the input data is bad as your example shows). The purpose of this demo is not about the text per se., it's about the size of the data corpus and faceted querying. We are going to have the entire LOD Cloud loaded into the real thing, and in addition to that our Sponger Middleware will be enabled, and then you can take issue with data quality as per your reference to "Cyndi Lauper" (btw - it takes one property filter to find information about her quickly using "dbpprop:name" after filtering for properties with text values).
Of all things, this demo had nothing to do with UI and information presentation aesthetics. It was all about combining full text search and structured queries (SPARQL behind the scenes) against a huge data corpus, en route to solving challenges associated with faceted browsing over large data sets. We have built a service that resides inside Virtuoso. The service is naturally of the "Web Service" variety and can be used from any consumer / client environment that speaks HTTP (directly or indirectly).
To be continued ...
Here is the list:
For the time challenged (i.e., those unable to view this post using its permalink / URI as a data source via the OpenLink RDF Browser, Zitgist Data Viewer, DISCO Hyperdata Browser, or Tabulator), the benefits of this post are as follows:
Put differently, I cost-effectively contribute to the GGG across all Web interaction dimensions (1.0, 2.0, 3.0) :-)
Daniel Lewis also penned an interesting post in response to Ian's, which actually triggered this post.
I think definition time has long expired re. the Web's many interaction dimensions, evolutionary stages, and versions.
On my watch it's simply demo / dog-food time. Or as Dan Brickley states: Just Show It.
Below, I've created a tabulated view of the various lanes on the Web's Information Super Highway. Of course, this is a Linked Data demo, should you be interested in the universe of data exposed via the links embedded in this post :-)
| | 1.0 | 2.0 | 3.0 |
|---|---|---|---|
| Desire | Information Creation & Retrieval | Information Creation, Retrieval, and Extraction | Distillation of Data from Information |
| | Information Linkage (Hypertext) | Information Mashing (Mash-ups) | Linked Data Meshing (Hyperdata) |
| Enabling Protocol | HTTP | HTTP | HTTP |
| Markup | | | |
| Basic Data Unit | Resource (Data Object) of type "Document" | Resource (Data Object) of type "Document" | Resource (Data Object) that may be one of a variety of Types: Person, Place, Event, Music, etc. |
| Basic Data Unit Identity | Resource URL (Web Data Object Address) | Resource URL (Web Data Object Address) | Unique Identifier (URI) that is independent of the actual Resource (Web Data Object) Address. Note: an Identifier by itself has no utility beyond identifying a place around which actual data may be clustered. |
| Query or Search | Full Text Search patterns | Full Text Search patterns | Structured Querying via SPARQL |
| Deployment | Web Server (Document Server) | Web Server + Web Services Deployment modules | Web Server + Linked Data Deployment modules (Data Server) |
| Auto-discovery | `<link rel="alternate"..>` | `<link rel="alternate"..>` | `<link rel="alternate" \| "meta"..>`, basic and/or transparent content negotiation |
| Target User | Humans | Humans & text extraction and manipulation oriented agents (Scrapers) | Agents with varying degrees of data processing intelligence and capacity |
| Serendipitous Discovery Quotient (SDQ) | Low | Low | High |
| Pain | Information Opacity | Information Silos | Data Graph Navigability (Quality) |
OpenLink Data Spaces (ODS) now officially supports:
- Attention Profiling Markup Language (APML).
- Meaning of a Tag (MOAT) in conjunction with Simple Knowledge Organisation System (SKOS) and Social-Semantic Cloud of Tags (SCOT).
- OAuth - an Open Authentication Protocol
Which means that OpenLink Data Spaces supports all of the main standards being discussed in the DataPortability Interest Group!
APML Example:
All users of ODS automatically get a dynamically created APML file, for example: APML profile for Kingsley Idehen
The URI for an APML profile is: http://myopenlink.net/dataspace/<ods-username>/apml.xml
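Since the profile is plain XML over HTTP, consuming it takes only a few lines (using 'kidehen' as the example <ods-username>; substitute your own):

```python
import urllib.request
from xml.etree import ElementTree

url = "http://myopenlink.net/dataspace/kidehen/apml.xml"  # <ods-username> = kidehen

with urllib.request.urlopen(url) as resp:
    root = ElementTree.parse(resp).getroot()
print(root.tag)  # the APML document element
```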
Meaning of a Tag Example:
All users of ODS automatically have tag cloud information embedded inside their SIOC file, for example: SIOC for Kingsley Idehen on the Myopenlink.net installation of ODS.
But even better, MOAT has been implemented in the ODS Tagging System. This has been demonstrated in a recent test blog post by my colleague Mitko Iliev; the blog post comes up on the tag search: http://myopenlink.net/dataspace/imitko/weblog/Mitko%27s%20Weblog/tag/paris
Which can be put through the OpenLink Data Browser:
OAuth Example:
OAuth Tokens and Secrets can be created for any ODS application. To do this:
- you can log in to MyOpenlink.net beta service, the Live Demo ODS installation, an EC2 instance, or your local installation
- then go to ‘Settings’
- and then you will see ‘OAuth Keys’
- you will then be able to choose the applications that you have instantiated and generate the token and secret for that app (a request-signing sketch follows below).
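Once you have the token and secret, you can sign API calls with them. A minimal OAuth 1.0 HMAC-SHA1 signing sketch (credentials and the API URL are placeholders, not real ODS values):

```python
import base64, hashlib, hmac, time, urllib.parse

def oauth_signature(method, url, params, consumer_secret, token_secret):
    # OAuth 1.0 signature base string: METHOD & encoded-URL & encoded-sorted-params
    encoded = urllib.parse.urlencode(sorted(params.items()))
    base = "&".join(urllib.parse.quote(part, safe="") for part in (method, url, encoded))
    key = f"{consumer_secret}&{token_secret}".encode()
    return base64.b64encode(hmac.new(key, base.encode(), hashlib.sha1).digest()).decode()

params = {
    "oauth_consumer_key": "CONSUMER_KEY",
    "oauth_token": "ODS_TOKEN",               # token generated under 'OAuth Keys'
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_timestamp": str(int(time.time())),
    "oauth_nonce": "use-a-random-nonce-here",
    "oauth_version": "1.0",
}
params["oauth_signature"] = oauth_signature(
    "GET", "http://myopenlink.net/ods/api/example", params,  # placeholder API URL
    "CONSUMER_SECRET", "ODS_TOKEN_SECRET",
)
```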
Related Document (Human) Links
- OpenLink Data Spaces Official Page
- OpenLink Software Page
- OpenLink Data Spaces Wikipedia Page
- Attention Profiling Markup Language Project Website
- Meaning of a Tag Project Website
- Simple Knowledge Organisation Systems Project Website
- Social-Semantic Cloud of Tags Project Website
- OAuth Protocol Website
- DataPortability.org Website
- Semantically Interlinked Online Communities Project Website
Remember (as per my most recent post about ODS), ODS is about unobtrusive fusion of Web 1.0, 2.0, and 3.0+ usage and interaction patterns. Thanks to a lot of recent standardization in the Semantic Web realm (e.g. SPARQL), we now employ the MOAT, SKOS, and SCOT ontologies as vehicles for Structured Tagging.
This is how we take a key Web 2.0 feature (think 2D, in a sense) and bend it to create a Linked Data Web (Web 3.0) experience, unobtrusively (see earlier posts re. Dimensions of the Web). Thus, nobody has to change how they tag or where they tag; just expose ODS to the URLs of your Web 2.0 tagged content and it will produce URIs (Structured Data Object Identifiers) and a linked data graph for your Tags Data Space (née Tag Cloud). ODS will construct a graph which exposes tag subject association, tag concept alignment / intended meaning, and tag frequencies, ultimately delivering "relative disambiguation" of intended Tag Meaning (i.e., you can easily discern the tagger's meaning via the Tag's actual Data Space, which is associated with the tagger). In a nutshell, the dynamics of relevance matching, ranking, and the like change immensely, without futile, timeless debates about matters such as:
We can just get on with demonstrating Linked Data value using what exists on the Web today. This is the approach we are deliberately taking with ODS.
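For example, here is roughly the shape of graph ODS emits for the "paris" tag above, pinning the tag to an intended meaning (the MOAT property names are from memory, so treat them as assumptions; the tag and post URIs are illustrative):

```python
from rdflib import Graph, Literal, Namespace, URIRef

MOAT = Namespace("http://moat-project.org/ns#")
TAGS = Namespace("http://www.holygoat.co.uk/owl/redwood/0.1/tags/")

g = Graph()
tag = URIRef("http://myopenlink.net/dataspace/imitko/tag/paris")       # illustrative
post = URIRef("http://myopenlink.net/dataspace/imitko/weblog/post-1")  # illustrative

g.add((tag, TAGS.name, Literal("paris")))
g.add((tag, MOAT.hasMeaning, URIRef("http://dbpedia.org/resource/Paris")))  # intended meaning
g.add((post, TAGS.taggedWithTag, tag))

print(g.serialize(format="turtle"))
```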
Tip: This post is best viewed via an RDF aware User Agent (e.g. a Browser or Data Viewer). I say this because the permalink of this post is a URI in a Linked Data Space (My Blog) comprised of more data than meets the eye (i.e. what you see when you read this post via a Document Web Browser) :-)
From my perspective on things, I prefer to align my articulation of the changes that are occurring across our industry (courtesy of the Internet Inflection) to the MVC pattern.
Re. the Web Versions (or Dimensions of Interaction):
The same applies to evolution of Openness:
In the (C)ontroller realm, where the focal point is Application Logic, data access issues aren't obvious (*I recall my battles with Richard Stallman re. the appropriate Open Source License variant for iODBC during the embryonic years of database and data access technology on Linux*). Data is an enigma in this realm, unfortunately. This implies that "Data Lock-in" occurs sometimes deliberately but, in most cases, inadvertently, when we make Application Logic the focal point of everything. Another example is Web 2.0, in which the norm (unfortunately) is to suck in your data and then refuse to give you complete ownership over how it is used (including the fact that you may want to share it elsewhere).
Open Data is a really big deal, which is why the SWEO supported Linking Open Data Project is a very big deal. The good news is that this movement is gathering momentum at an exponential rate :-)
(Via Read/Write Web.)
Web 3.0: When Web Sites Become Web Services: "
.....As more and more of the Web is becoming remixable, the entire system is turning into both a platform and the database. Yet, such transformations are never smooth. For one, scalability is a big issue. And of course legal aspects are never simple.
But it is not a question of if web sites become web services, but when and how. APIs are a more controlled, cleaner and altogether preferred way of becoming a web service. However, when APIs are not available or sufficient, scraping is bound to continue and expand. As always, time will be best judge; but in the meanwhile we turn to you for feedback and stories about how your businesses are preparing for 'web 3.0'."
We are hitting a little problem re. Web 3.0 and Web 2.0, naturally :-) Web 2.0 is one of several (present and future) Dimensions of Web Interaction that turn Web Sites into Web Services Endpoints; a point I've made repeatedly [1] [2] [3] [4] across the blogosphere, in addition to my early, futile attempts to make Wikipedia's Web 2.0 article meaningful (circa 2005), as per the Wikipedia Web 2.0 Talk Page excerpt below:
Web 2.0 is a web of executable endpoints and well formed content. The executable endpoints and well formed content are accessible via URIs. Put differently, Web 2.0 is a web defined by URIs for invoking Web Services and/or consuming or syndicating well formed content.
Hopefully, someone with more time on their hands will expand on this (I am kinda busy).
BTW - Web 2.0 being a platform doesn't distinguish it in any way from Web 1.0. They are both platforms; the difference comes down to platform focus and mode of experience.
Web 3.0 is about Data Spaces: Points of Semantic Web Presence that provide granular access to Data, Information, and Knowledge via Conceptual Data Model oriented Query Languages and/or APIs.
The common denominator across all the current and future Web Interaction Dimensions is HTTP. While their differences are as follows:
Examples of Web 3.0 Infrastructure:
Web 3.0 is not purely about Web Sites becoming Web Services endpoints. It is about the "M" (Data Model) taking its place in the MVC pattern as applied to the Web Platform.
I will repeat myself yet again:
The Devil is in the Details of the Data Model. Data Models make or break everything. You ignore data at your own peril. No amount of money in the bank will protect you from Data Ignorance! A bad Data Model will bring down any venture or enterprise; the only variable is time (where time is directly related to your increasing need to obtain, analyze, and then act on data, over repetitive operational cycles that have ever-decreasing intervals).
This applies to the Real-time enterprise of Information and/or knowledge workers and Real-time Web Users alike.
BTW - Data Makes Shifts Happen (spotter: Sam Sethi).
I came across a pretty deep comments trail about the aforementioned items on Fred Wilson's blog (aptly titled: A VC) under the subject heading: Web 3.0 Is The Semantic Web.
Contributions to the general Semantic Web discourse by way of responses to valuable questions and commentary contributed by a Semantic Web skeptic (Ed Addison, who may be this Ed Addison according to Google):
Ed, responses to your points re. Semantic Web Materialization:
<< 1) ontologies can be created and maintained by text extractors and crawlers >>
Ontologies will be developed by Humans. This process has already commenced, and far more landscape has been covered than you may be aware of. For instance, there is an Ontology for Online Communities with Semantics factored in. More importantly, most Blogs, Wikis, and other "points of presence" on the Web are already capable of generating Instance Data for this Ontology by way of the underlying platforms that drive these things. The Ontology is called SIOC (Semantically-Interlinked Online Communities).
<< 2) the entire web can be marked up, semantically indexed, and maintained by spiders without human assistance >>
Most of it can, and already is :-) Human assistance should, and would, be on an "exception basis" a preferred use of human time (IMHO). We do not need to annotate the Web manually when this labor intensive process can be automated (see my earlier comments).
<< 3) inference over the semantic web does not require an extremely deep heuristic search down multiple, redundant, cyclical pathways with many islands that are disconnected >>
When you have a foundation layer of RDF Data (generated in the manner I've discussed above), you then have a substrate that's far more palatable to Intelligent Reasoning. Note, the Semantic Web is made of many layers. The critical layer at this juncture is the Data-Web (Web of RDF Data). Note, when I refer to RDF I am not referring to RDF/XML the serialization format, I am referring to the Data Model (a Graph).
<< 4) the web becomes smart enough to eliminate websites or data elements that are incorrect, misleading, false, or just plain lousy >>
The Semantic Web vision is not about eliminating Web Sites (The Hypertext-Document-Web). It is simply about adding another dimension of interaction to the Web. This is just like the Services-Web dimension as delivered by Web 2.0.
We are simply evolving within an innovation continuum. There is no mutual exclusivity about any of the Web Dimensions since they collectively provide us with a more powerful infrastructure for building and exploiting "collective wisdom".
As for the Data-Web experiment part of this post, I would expect to see this post exposed as another contribution to the Data-Web via the PingTheSemanticWeb notification service :-) Implying, that all the relevant parts of this conversation are in a format (Instance Data for the SIOC Ontology) that is available for further use in a myriad of forms.
Web Me2.0 -- Exploding the Myth of Web 2.0: "Many people have told me this week that they think 'Web 2.0' has not been very impressive so far and that they really hope for a next-generation of the Web with some more significant innovation under the hood -- regardless of what it's called. A lot of people found the Web 2.0 conference in San Francisco to be underwhelming -- there was a lot of self-congratulation by the top few brands and the companies they have recently bought, but not much else happening. Where was all the innovation? Where was the focus on what's next? It seemed to be a conference mainly about what happened in the last year, not about what will happen in the coming year. But what happened last year is already so 'last year.' And frankly Web 2.0 still leaves a lot to be desired. The reason Tim Berners-Lee proposed the Semantic Web in the first place is that it will finally deliver on the real potential and vision of the Web. Not that today's Web 2.0 sucks completely -- it only sort of sucks. It's definitely useful and there are some nice bells and whistles we didn't have before. But it could still suck so much less!"
Web 2.0 is (not was) a piece of the overall Web puzzle. The Data Web (so called Web 3.0) is another critical piece of this puzzle, especially as it provides the foundation layer (Layer 1) of the Semantic Web.
Web 2.0 was never about "Open Data Access", "Flexible Data Models", or "Open World" meshing of disparate data sources built atop disparate data schemas (see: Web 2.0's Open Data Access Conundrum). It was simply about "Execution and APIs". I have already written about "Web Interaction Dimensions", but you can also look at the relationship of the currently perceived dimensions through the M-V-C programming pattern:
Another point to note: Social Networking is hot, but nearly every social network that I know (and I know and use most of them) suffers from an impedance mismatch between the service(s) they provide (social networks) and their underlying data models (in many cases Relational, as opposed to Graph). Networks are about Relationships (N-ary), and you cannot effectively exploit the deep potential of "Network Effects" (Wisdom of Crowds, Viral Marketing, etc.) without a complementary data model; you simply can't.
Finally, the Data Web is already here. I promised a long time ago (in Internet Time) that the manifestation of the Semantic Web would occur unobtrusively, meaning we will wake up one day and realize we are using critical portions of the Semantic Web (i.e., the Data-Web) without even knowing it. Guess what? It's already happening. A simple case in point: you may have started to notice the emergence of SIOC gems, in the same way you may have observed those RSS 2.0 gems at the dawn of Web 2.0. What I am implying here is that the real questions we should be asking are: Where is the Semantic Web Data? How easy or difficult will it be to generate? And where are the tools? My answers are presented below:
Next stop: less writing, more demos; these are long overdue! At least from my side of the fence :-) I need to produce a few step-by-step guide oriented screencasts that demonstrate how Web 2.0 meshes nicely with the Data-Web.
Here are some (not so end-user friendly) examples of how you can use SPARQL (Data-Web's Query Language) to query Web 2.0 Instance Data projected through the SIOC Ontology:
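A representative query of the kind those examples run, finding posts and their creators in SIOC instance data:

```python
query = """
PREFIX sioc: <http://rdfs.org/sioc/ns#>
SELECT ?post ?creator WHERE {
  ?post a sioc:Post ;
        sioc:has_creator ?creator .
}
LIMIT 20
"""
```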
Note: You can use the online SPARQL Query Interface at: http://demo.openlinksw.com/isparql.
Other Data-Web Technology usage demos include:
Amongst the numerous comments about this subject, I felt most compelled to respond to the commentary from Tim O'Reilly (based on his proximity to Web 2.0, etc.) in relation to his view that the NYT's Web 3.0 is simply the Collective Intelligence Harnessing aspect of his Web 2.0 meme.
My response is dumped semi-verbatim below:
Tim,
A few things:
- We are in an innovation continuum
- The Web as a medium of innovation will evolve forever
- Different commentators have different views about monikers associated with these innovations
- To say Web 3.0 (aka the Data Web or Semantic Web - Layer 1) is what Web 2.0's collective intelligence is all about is a little inaccurate (IMHO); Web 2.0 doesn't provide "Open Data Access"
- Web 2.0 is a "Web of Services" primarily, a dimension of "Web Interaction" defined by interaction with Services
- Web 3.0 ("Data Web" or "Web of Databases" or "Semantic Web - Layer 1") is a Web dimension that provides "Open Data Access" that will be exemplified by the transition from "Mash-ups" (brute force data joining) to "Mesh-ups" (natural data joining)
The original "Web of Hypertext" or "Interactive Web", the current "Web of Services", and the emerging "Data Web" or "Web of Databases" collectively provide dimensions of interaction in the innovation continuum called the Web.
There are many more dimensions to come. Monikers come and go, but the retrospective "Long Shadow" of Innovation is ultimately timeless.
"Mutual Inclusivity" is a critical requirement for truly perceiving these "Web Interaction Dimensions" ("Participation" if I recall). "Mutual Exclusivity" on the other hand, simpy leads to obscuring reality with Versionitis as exemplified by the ongoing: Web 1.0 vs 2.0 vs 3.0 debates.
BTW - I enjoyed reading Nick Carr's take on the Web 3.0 meme, especially his "tongue in cheek" power-grab for the rights to all "Web 3.0" Conferences etc. :-)