As the "Linked Data" meme has gained momentum, you've more than likely been on the receiving end of dialog with Linked Open Data community members (myself included) that goes something like this:
"Do you have a URI?", "Get yourself a URI", "Give me a de-referencable URI", etc.
And each time, you respond with a URL -- which to the best of your Web knowledge is a bona fide URI. But to your utter confusion you are told: Nah! You gave me a Document URI instead of the URI of a real-world thing or object.
Well, our everyday use of the Web is an unfortunate conflation of two distinct things that have Identity: Real World Objects (RWOs) and the Addresses/Locations of Documents (Information bearing Resources).
The "Linked Data" meme is about enhancing the Web by unobtrusively reintroducing its core essence: the generic HTTP URI, a vital piece of Web Architecture DNA. Basically, it's about realizing the full capabilities of the Web as a platform for Open Data Identification, Definition, Access, Storage, Representation, Presentation, and Integration.
Real World Objects include People, Places, Music, Books, Cars, Ideas, Emotions, etc.
A Uniform Resource Identifier. A global identifier mechanism for network addressable data items. Its sole function is Name oriented Identification.
The constituent parts of a URI (from URI Generic Syntax RFC) are depicted below:
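As a rough illustration of those constituent parts, Python's standard `urllib.parse` module can split a URI into its components. The URI below is the example used in section 3 of RFC 3986; note that `urlsplit` calls the authority component `netloc`. This is a sketch for orientation only, not a substitute for the RFC's own grammar.

```python
from urllib.parse import urlsplit

# Example URI taken from RFC 3986, section 3.
uri = "foo://example.com:8042/over/there?name=ferret#nose"

parts = urlsplit(uri)

# Map the library's attribute names onto the RFC's component names.
components = {
    "scheme": parts.scheme,
    "authority": parts.netloc,
    "path": parts.path,
    "query": parts.query,
    "fragment": parts.fragment,
}

for name, value in components.items():
    print(f"{name:<10}{value}")
```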
A URL is a location oriented, HTTP scheme based URI. The HTTP scheme introduces a powerful and inherent duality that delivers:
So far so good!
The kind of URI Linked Data aficionados mean when they use the term: URI.
An HTTP URI is an HTTP scheme based URI. Unlike a URL, this kind of HTTP scheme URI is devoid of any Web Location orientation or specificity. Thus, its inherent duality provides a more powerful level of abstraction. Hence, you can use this form of URI to assign Names/Identifiers to Real World Objects (RWOs). Even better, courtesy of the Identity/Address duality of the HTTP scheme, a single URI can deliver the following:
Data about Data. Put differently, data that describes other data in a structured manner.
The predominant model for metadata is the Entity-Attribute-Value + Classes & Relationships model (EAV/CR). A model that's been with us since the inception of modern computing (long before the Web).
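To make the EAV/CR idea a little more concrete, here is a minimal sketch in Python. Every identifier in it (`ex:alice`, `ex:knows`, and so on) is hypothetical and chosen purely for illustration; the point is only that each statement is an entity, an attribute, and a value, and that class membership and relationships are recorded the same way.

```python
from collections import defaultdict

# Each statement is an (entity, attribute, value) tuple.
# All identifiers are hypothetical and exist only for this illustration.
statements = [
    ("ex:alice", "ex:type", "ex:Person"),  # class membership (the "C" in CR)
    ("ex:alice", "ex:name", "Alice"),      # attribute with a literal value
    ("ex:alice", "ex:knows", "ex:bob"),    # relationship to another entity (the "R")
    ("ex:bob", "ex:type", "ex:Person"),
    ("ex:bob", "ex:name", "Bob"),
]


def describe(entity, rows):
    """Collect every attribute/value pair recorded for one entity."""
    description = defaultdict(list)
    for subject, attribute, value in rows:
        if subject == entity:
            description[attribute].append(value)
    return dict(description)


alice = describe("ex:alice", statements)
print(alice)
```

Because a value can itself be an entity identifier (as with `ex:knows`), the rows form a graph rather than a flat table.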
The Resource Description Framework (RDF) is a framework for describing Web addressable resources. In a nutshell, it's a framework for adding Metadata bearing Information Resources to the current Web. It comprises:
The ubiquitous use of the Web is primarily focused on a Linked Mesh of Information bearing Documents. URLs rather than generic HTTP URIs are the prime mechanism for Web tapestry; basically, we use URLs to conduct Information -- which is inherently subjective -- instead of using HTTP URIs to conduct "Raw Data" -- which is inherently objective.
Note: Information is "data in context", it isn't the same thing as "Raw Data". Thus, if we can link to Information via the Web, why shouldn't we be able to do the same for "Raw Data"?
The meme simply provides a set of guidelines (best practices) for producing Web architecture friendly metadata. Meaning: when producing EAV/CR model based metadata, endow Subjects, their Attributes, and Attribute Values (optionally) with HTTP URIs. By doing so, a new level of Link Abstraction on the Web is possible, i.e., "Data Item to Data Item" level links (aka hyperdata links). Even better, when you de-reference a RWO hyperdata link you end up with a negotiated representation of its metadata.
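The "negotiated representation" part can be sketched with nothing more than the HTTP `Accept` header. The snippet below only prepares the requests and does not send them; the URI `http://example.org/id/alice` is a hypothetical identifier, and whether a given server honors the header (and which media types it offers) is an assumption that has to be checked per data source.

```python
from urllib.request import Request

# Hypothetical HTTP URI naming a real-world object (not a document).
THING_URI = "http://example.org/id/alice"


def build_request(uri, media_type):
    """Prepare a GET request asking for one representation of the description."""
    return Request(uri, headers={"Accept": media_type})


# Same identifier, two requested representations of its metadata.
html_request = build_request(THING_URI, "text/html")
rdf_request = build_request(THING_URI, "application/rdf+xml")

print(rdf_request.full_url, rdf_request.get_header("Accept"))
```

Passing either request to `urllib.request.urlopen` would perform the actual de-reference; the name stays the same while the representation varies.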
Linked Data is ultimately about an HTTP URI for each item in the Data Organization Hierarchy :-)
1995 (and the early 90’s) must have been a visionaries time of dreaming… most of their dreams are happening today.
Watch Steve Jobs (then of NeXT) discuss what he thinks will be popular in 1996 and beyond at OpenStep Days 1995:
Here's a spoiler:
The thing that OpenStep propose is:
What Steve was suggesting was one of the beginnings of the Data Web! Yep, Portable Distributed Objects and Enterprise Objects Framework was one of the influences of the Semantic Web / Linked Data Web…. not surprising as Tim Berners-Lee designed the initial web stack on a NeXT computer!
I’m going to spend a little time this evening figuring out how much ‘distributed objects’ stuff has been taken from the OpenStep stuff into the Objective-C + Cocoa environment. (<- I guess I must be quite geeky ;-))
(Via Daniel Lewis.)
Jason recently moved to Massachusetts, which led to me pinging him about our earlier blogosphere encounter and the emergence of a Data Portability Community. I also informed him about the fact that TimBL, myself, and a number of other Semantic Web technology enthusiasts frequently meet on the 2nd Tuesday of each month at the MIT hosted Cambridge Semantic Web Gatherings to discuss, demonstrate, and debate all aspects of the Semantic Web. Luckily (for both of us), Jason attended the last event, and we got to meet each other in person.
Following our face to face meeting in Cambridge, a number of follow-on conversations ensued covering Linked Data and practical applications of the Semantic Web vision. Jason writes about our exchanges in a recent post titled: The Semantic Web. His passion for Data Portability enabled me to use OpenID and FOAF integration to connect the Semantic Web and Data Portability via the Linked Data concept.
During our conversations, Jason also alluded to the fact that he had already encountered OpenLink Software while working with our ODBC Drivers (part of our UDA product family) for IBM Informix (Single-Tier or Multi-Tier Editions) a few years ago (interesting random connection).
As I've stated in the past, I've always felt that the Semantic Web vision will materialize by way of a global epiphany. The countdown to this inevitable event started, ironically, at the birth of the blogosphere, and accelerated more recently through the emergence of Web 2.0 and Social Networking, even more ironically :-)
The blogosphere started the process of Data Space coalescence via RSS/Atom based semi-structured data enclaves; Web 2.0 propagated Web Service usage en route to creating service provider controlled data and information silos; Social Networking brought attention to the fact that User Generated Data wasn't actually owned or controlled by the Data Creators, etc.
The emergence of "Data Portability" has created a palatable moniker for a clearly defined, and slightly easier to understand, problem: the meshing of Data and Identity in cyberspace, i.e., individual points of presence in cyberspace, in the form of "Personal Data Spaces in the Clouds" (think: doing really powerful stuff with .name domains). In a sense, this is the critical inflection point between the document centric "Web of Linked Documents" and the data centric "Web of Linked Data". There is absolutely no other way to solve this problem in a manner that alleviates the imminent challenges presented by information overload -- resulting from the exponential growth of user generated data across the Internet and enterprise Intranets.
So, unlike Scoble, I am able to make my Facebook Data portable without violating Facebook rules (no data caching outside the Facebook realm) by doing the following:
In a nutshell, my Linked Data Space enables you to reference data in my data space via Object Identifiers (URIs), and in some cases the Object IDs and Graphs are constructed on the fly via RDFization middleware.
Here are my URIs that provide different paths to my Facebook Data Space:
To conclude, 2008 is clearly the inflection year during which we will finally unshackle Data and Identity from the confines of "Web Data Silos" by leveraging the HTTP, SPARQL, and RDF induced virtues of Linked Data.
Related Posts:
Well, I'll have a crack at helping him out i.e. defining the Semantic Data Web in simple terms with linked examples :-)
Tip: Watch the recent TimBL video interview re. the Semantic Data Web before, during, or after reading this post.
Here goes!
The popular Web is a "Web of Documents". The Semantic Data Web is a "Web of Data". Going down a level, the popular web connects documents across the web via hyperlinks. The Semantic Data Web connects data on the web via hyperlinks. Next level, hyperlinks on the popular web have no inherent meaning (they lack context beyond: "there is another document"). Hyperlinks on the Semantic Data Web have inherent meaning (they possess context: "there is a Book" or "there is a Person" or "this is a piece of Music", etc.).
Very simple example:
Click the traditional web document URLs for Dan Connolly and Tim Berners-Lee. Then attempt to discern how they are connected. Of course you will see some obvious connections by reading the text, but you won't easily discern other data driven connections. Basically, this is no different to reading about either individual in a print journal, bar the ability to click on hyperlinks that open up other pages. The Data Extraction process remains labour intensive :-(
Repeat the exercise using the traditional web document URLs as Data Web URIs: this time around, paste the hyperlinks above into an RDF aware Browser (in this case the OpenLink RDF Browser). Note, we are making a subtle but critical change, i.e., the URLs are now being used as Semantic Data Web URIs (a small-big-deal kind of thing).
If you're impatient or simply strapped for time (aren't we all these days), simply take a look at these links:
Note: There are other RDF Browsers out there such as:
All of these RDF Browsers (or User Agents) demonstrate the same core concepts in subtly different ways.
If I haven't lost you, proceed to a post I wrote a few weeks ago titled: Hello Data Web (Take 3 - Feel the "RDF" Force).
If you've made it this far, simply head over to DBpedia for a lot of fun :-)
Note Re. my demos: we make use of SVG in our RDF Browser which makes them incompatible with IE (6 or 7) and Safari. That said, Firefox (1.5+), Opera 9.x, WebKit (Open Source Safari), and Camino work fine.
Note to Scoble:
All the Blogs, Wikis, Shared Bookmarks, Image Galleries, Discussion Forums and the like are Semantic Web Data Spaces. The great thing about all of this is that through RSS 2.0's wild popularity, the Blogosphere has done what I postulated about a while back: The Semantic Web would be self-annotating, and so it has come to be :-)
To prove the point above: paste your blog's URL into the OpenLink RDF Browser and see it morph into a Semantic Data Web URI (a pointer to Web Data that you've created) once you click the "Query" button (click on the TimeLine tab for full effect). The same applies to del.icio.us, Flickr, Googlebase, and basically any REST style Web Service as per my RDF Middleware post.
Lazy Semantic Web Callout:
If you're a good animator (pro or hobbyist), please produce an animation of a document going through a shredder. The strips that emerge from the shredder represent the granular data that was once the whole document. The same thing is happening on the Web right now, we are putting photocopies of (X)HTML documents through the shredder (in a good way) en route to producing granular items of data that remain connected to the original copy while developing new and valuable connections to other items of Web Data.
That's it!
In this third take on my introduction to the Data Web I would like to share a link with you (a Dynamic Start Page in Web 2.0 parlance) with a Data Web twist: You do not have to preset the Start Page Data Sources (this is a small-big thing, if you get my drift, hopefully!).
Here are some Data Web based Dynamic Start Pages that I have built for some key players from the Semantic Web realm (in random order):
"These are RDF prepped Data Sources....", you might be thinking, right? Well here is the reminder: The Data Web is a Global Data Generation and Integration Effort. Participation may be active (Semantic Web & Microformats Community), or passive (web sites, weblogs, wikis, shared bookmarks, feed subscription, discussion forums, mailing lists, etc.). Irrespective of participation mode, RDF instance data can be generated from close to anything (I say this because I plan to add binary files holding metadata to this mix shortly). Here are examples of Dynamic Start Pages for non RDF Data Sources:
What about Microformats, you may be wondering? Here goes:
Let's carry on.
How about some traditional Web Sites? Here goes:
And before I forget, here is My Data Web Start Page .
Due to the use of Ajax in the Data Web Start Pages, IE6 and Safari will not work. For Mac OS X users, Webkit works fine. Ditto re. IE7 on Windows.
Web Me2.0 -- Exploding the Myth of Web 2.0: "Many people have told me this week that they think 'Web 2.0' has not been very impressive so far and that they really hope for a next-generation of the Web with some more significant innovation under the hood -- regardless of what it's called. A lot of people found the Web 2.0 conference in San Francisco to be underwhelming -- there was a lot of self-congratulation by the top few brands and the companies they have recently bought, but not much else happening. Where was all the innovation? Where was the focus on what's next? It seemed to be a conference mainly about what happened in the last year, not about what will happen in the coming year. But what happened last year is already so 'last year.' And frankly Web 2.0 still leaves a lot to be desired. The reason Tim Berners-Lee proposed the Semantic Web in the first place is that it will finally deliver on the real potential and vision of the Web. Not that today's Web 2.0 sucks completely -- it only sort of sucks. It's definitely useful and there are some nice bells and whistles we didn't have before. But it could still suck so much less!"
Web 2.0 is (not was) a piece of the overall Web puzzle. The Data Web (so called Web 3.0) is another critical piece of this puzzle, especially as it provides the foundation layer (Layer 1) of the Semantic Web.
Web 2.0 was never about "Open Data Access", "Flexible Data Models", or "Open World" meshing of disparate data sources built atop disparate data schemas (see: Web 2.0's Open Data Access Conundrum). It was simply about "Execution and APIs". I have already written about "Web Interaction Dimensions", but you can also look at the relationship of the currently perceived dimensions through the M-V-C programming pattern:
Another point to note: Social Networking is hot, but nearly every social network that I know (and I know and use most of them) suffers from an impedance mismatch between the service(s) they provide (social networks) and their underlying data models (in many cases Relational as opposed to Graph). Networks are about Relationships (N-ary) and you cannot effectively exploit the deep potential of "Network Effects" (Wisdom of Crowds, Viral Marketing, etc.) without a complementary data model; you simply can't.
Finally, the Data Web is already here. I promised a long time ago (Internet Time) that the manifestation of the Semantic Web would occur unobtrusively, meaning, we will wake up one day and realize we are using critical portions of the Semantic Web (i.e. Data-Web) without even knowing it. Guess what? It's already happening. Simple case in point, you may have started to notice the emergence of SIOC gems in the same way you may have observed those RSS 2.0 gems at the dawn of Web 2.0. What I am implying here is that the real questions we should be asking are: Where is the Semantic Web Data? How easy or difficult will it be to generate? And where are the tools? My answers are presented below:
Next stop, less writing, more demos; these are long overdue! At least from my side of the fence :-) I need to produce some step-by-step guide oriented screencasts that demonstrate how Web 2.0 meshes nicely with the Data-Web.
Here are some (not so end-user friendly) examples of how you can use SPARQL (Data-Web's Query Language) to query Web 2.0 Instance Data projected through the SIOC Ontology:
Note: You can use the online SPARQL Query Interface at: http://demo.openlinksw.com/isparql.
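For readers who would rather see the shape of such a query, here is a minimal sketch that builds one and encodes it for a GET request using the `query` parameter defined by the SPARQL protocol. The endpoint `http://example.org/sparql` is a hypothetical placeholder, and the query is an illustration written against the public SIOC and Dublin Core vocabularies rather than one of the original examples; whether a given data space actually exposes `sioc:Post` items with these properties is an assumption.

```python
from urllib.parse import urlencode

# Hypothetical SPARQL endpoint; substitute the endpoint of your own data space.
ENDPOINT = "http://example.org/sparql"

# List posts with their titles and creators, using SIOC ontology terms.
QUERY = """
PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT ?post ?title ?creator
WHERE {
  ?post a sioc:Post ;
        dcterms:title ?title ;
        sioc:has_creator ?creator .
}
LIMIT 10
"""


def build_query_url(endpoint, query):
    """Encode a query into a GET URL via the protocol's `query` parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_query_url(ENDPOINT, QUERY)
print(url)
```

The same query text can also be pasted into an interactive query interface instead of being sent programmatically.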
Other Data-Web Technology usage demos include:
A phase in the evolution of web usage patterns that emphasizes Web Services based interaction between “Web Users” and “Points of Web Presence” over traditional “Web Users” and “Web Sites” based interaction. Basically, a transition from visual site interaction to presence based interaction.
BTW - Dare Obasanjo also commented about Web usage patterns in his post titled: The Two Webs, where he concluded that we had a dichotomy along the lines of: HTTP-for-APIs (2.0) and HTTP-for-Browsers (1.0). Jon Udell evolved this into: HTTP-Services-Web and HTTP-Interactive-Web during our recent podcast conversation.
With definitions in place, I will resume my quest to unveil the aforementioned Web 2.0 Data Access Conundrum:
As you can see from the above, Open Data access isn't genuinely compatible with Web 2.0.
We can also look at the same issue by way of the popular M-V-C (Model View Controller) pattern. Web 2.0 is all about the “V” and “C” with a modicum of “M” at best (data access, open data access, and flexible open data access are completely separate things). The “C” items represent application logic exposed by SOAP or REST style web services, etc. I'll return to this later in this post.
What about Social Networking, you must be thinking? Isn't this a Web 2.0 manifestation? Not at all (IMHO). The Web was developed / invented by Tim Berners-Lee to leverage the “Network Effects” potential of the Internet for connecting People and Data. Social Networking, on the other hand, is simply one of several ways by which we construct network connections. I am sure we all accept the fact that connections are built for many other reasons beyond social interaction. That said, we also know that through social interactions we actually develop some of our most valuable relationships (we are social creatures after all).
The Web 2.0 Open Data Access impedance reality is ultimately going to be the greatest piece of tutorial and usecase material for the Semantic Web. I take this position because it is human nature to seek Freedom (in unadulterated form) which implies the following:
Web 2.0 by definition and use case scenarios is inherently incompatible with the above due to the lack of Flexible and Open Data Access.
If we take the definition of Web 2.0 (above) and rework it with an appreciation of Flexible and Open Data Access, you would arrive at something like this:
A phase in the evolution of the web that emphasizes interaction between “Web Users” and “Web Data” facilitated by Web Services based APIs and an Open & Flexible Data Access Model.
In more succinct form:
A pervasive network of people connected by data or data connected by people.
Returning to M-V-C and looking at the definition above, you now have a complete “M”, which is enigmatic in Web 2.0 and the essence of the Semantic Web (Data and Context).
To make all of this possible a palatable Data Model is required. The model of choice is the Graph based RDF Data Model - not to be mistaken for the RDF/XML serialization which is just that, a data serialization that conforms to the aforementioned RDF data model.
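The distinction between the data model and any one serialization can be sketched as follows: the graph is just a set of triples held in memory, and a serializer is one of several possible ways to write it out. The example uses N-Triples because it is the simplest syntax to emit by hand; the `example.org` identifiers are hypothetical, the `Literal` helper is invented for this illustration, and the escaping is deliberately minimal rather than a complete implementation of the N-Triples grammar.

```python
from typing import NamedTuple


class Literal(NamedTuple):
    """Hypothetical wrapper marking a plain string value (not an IRI)."""
    value: str


FOAF = "http://xmlns.com/foaf/0.1/"

# The graph itself: a set of (subject, predicate, object) triples.
graph = {
    ("http://example.org/id/alice", FOAF + "name", Literal("Alice")),
    ("http://example.org/id/alice", FOAF + "knows", "http://example.org/id/bob"),
}


def term(node):
    """Render one node in N-Triples syntax (minimal escaping only)."""
    if isinstance(node, Literal):
        escaped = (
            node.value.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
        )
        return f'"{escaped}"'
    return f"<{node}>"


def to_ntriples(triples):
    """Serialize the graph, one statement per line, sorted for stable output."""
    lines = [f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples]
    return "\n".join(sorted(lines))


print(to_ntriples(graph))
```

Swapping `to_ntriples` for an RDF/XML or Turtle writer would change the bytes produced but not the graph, which is the point being made above.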
The Enterprise Challenge
Web 2.0 cannot and will not make valuable inroads into the enterprise because enterprises live and die by their ability to exploit data. Weblogs, Wikis, Shared Bookmarking Systems, and other Web 2.0 distributed collaborative application profiles are only valuable if the data is available to the enterprise for meshing (not mashing).
A good example of how enterprises will exploit data by leveraging networks of people and data (social networks in this case) is shown in this nice presentation by Accenture's Institute for High Performance Business titled: Visualizing Organizational Change.
Web 2.0 commentators (for the most part) continue to ponder the use of Web 2.0 within the enterprise while forgetting the congruency between enterprise agility and exploitation of people & data networks (The very issue emphasized in this original Web vision document by Tim Berners-Lee). Even worse, they remain challenged or spooked by the Semantic Web vision because they do not understand that Web 2.0 is fundamentally a Semantic Web precursor due to Open Data Access challenges. Web 2.0 is one of the greatest demonstrations of why we need the Semantic Web at the current time.
Finally, juxtapose the items below and you may even get a clearer view of what I am attempting to convey about the virtues of Open Data Access and the inflective role it plays as we move beyond Web 2.0:
Information Management Proposal - Tim Berners-Lee
Visualizing Organizational Change - Accenture Institute of High Performance Business
The Semantic Web is only the beginning and an enabling technology for realizing the dreams of Vannevar Bush, Doug Engelbart and Tim Berners-Lee: My current and future objective is the creation and wide dissemination of the next generation collaboration and augmentation infrastructure - the Social Semantic Desktop.
To ensure the loop is closed I have deliberately added the following references to this post: Vannevar Bush wrote the seminal article "As We May Think", in which he describes a theoretical analog computer called "The Memex" - a World Wide Web precursor. This document was also a source of inspiration for Ted Nelson (discussed briefly in an earlier post re. compatibility of his vision and that of Tim Berners-Lee).
"...Also today I came across the latest project of a man who wants to tear down Tim Berners-Lee's World Wide Web and replace it with his own vision. It used to be known as Xanadu, but has since morphed into Transliterature, A Humanist Design. I am of course referring to Ted Nelson, who invented the term 'hypertext' in 1965 and is generally regarded as a computing pioneer.
Ted Nelson recently wrote an essay about 'Indirect Documents', which got Slashdotted today. In the essay Nelson outlines why (in his opinion) the Xanadu project failed and he explains his new vision for Transliterature. He takes a number of potshots at Tim Berners-Lee's WWW on the way, e.g.:
'Why don't I like the web? I hate its flapping and screeching and emphasis on appearance; its paper-simulation rectangles of Valuable Real Estate, artifically created by the NCSA browser, now hired out to advertisers; its hierarchies exposed and imposed; its untyped one-way links only from inside the document. (The one-way links hidden under text were a regrettable simplification of hypertext which I assented to in '68 on the HES project. But that's another story.) Only trivial links are possible; there is nothing to support careful annotation and study; and, of course, there is no transclusion.'
Ted Nelson is certainly an original and I'm glad he's still around to throw spanners in the works. I've written about him before and I'm sure I will again, Web 2.0 or not.
"(Excerpted From: Read/Write Web.)
My thoughts on the commentary above:
There is nothing fundamentally incompatible between Ted Nelson's pursuits and future incarnations of the Web. None whatsoever -- we are simply working our way through a process. The process in question is what I call "standards driven ubiquity" (becoming de facto at Internet Speed). Remember Sun's "The Network is the Computer" vision? Well, without a "Computer" in mind-space you can't think in terms of "Operating Systems". That's all changing, because today we are gradually beginning to accept the imminent reality that "The Internet is the Operating System" and not Windows/UNIX/Mac OS X/Others. Ahem! And after the Operating System what comes next? I think a set of Application Programming Interfaces (APIs), and I think we know what that is (in all of its controversial glory): the very thing we refer to as Web 2.0 (the APIs for the Internet Operating System).
Note: In addition to the Computer, Operating System, and Application Programming Interfaces, we also have those frequently misunderstood and under-appreciated workhorses called "Databases" in place (but we still call them Web Sites for now). And by the way, "Internet Filesystem" has been there forever, but for some reason we can't see WebDAV in all its current and future glory (that will change very soon also!).
Ted and TBL are cool with each other (whether they know it or not)! I see no mutual exclusivity in their collective visions (IMHO) :-)
Anyway, Marc's article is a very refreshing read because it provides a really good insight into the general landscape of a rapidly evolving Web alongside genuine appreciation of our broader timeless pursuit of "Openness".
To really help this document provide additional value, I have scraped the content of the original post and dumped it below so that we can appreciate the value of the links embedded within the article (note: thanks to Virtuoso I only had to paste the content into my blog; the extractions to my Linkblog and Blog Summary Pages are simply features of my Virtuoso based Blog Engine):
Breaking the Web Wide Open! (complete story)
Even the web giants like AOL, Google, MSN, and Yahoo need to observe these open standards, or they'll risk becoming the "walled gardens" of the new web and be coolio no more.
Editorial Note: Several months ago, AlwaysOn got a personal invitation from Yahoo founder Jerry Yang "to see and give us feedback on our new social media product, y!360." We were happy to oblige and dutifully showed up, joining a conference room full of hard-core bloggers and new, new media types. The geeks gave Yahoo 360 an overwhelming thumbs down, with comments like, "So the only services I can use within this new network are Yahoo services? What if I don't use Yahoo IM?" In essence, the Yahoo team was booed for being "closed web," and we heartily agreed. With Yahoo 360, Yahoo continues building its own "walled garden" to control its 135 million customers -- an accusation also hurled at AOL in the early 1990s, before AOL migrated its private network service onto the web. As the Economist recently noted, "Yahoo, in short, has old media plans for the new-media era."
The irony to our view here is, of course, that today's AO Network is also a "closed web." In the end, Mr. Yang's thoughtful invitation and our ensuing disappointment in his new service led to the assignment of this article. It also confirmed our existing plan to completely revamp the AO Network around open standards. To tie it all together, we recruited the chief architect of our new site, the notorious Marc Canter, to pen this piece. We look forward to our reader feedback.
Breaking the Web Wide Open!
By Marc Canter
For decades, "walled gardens" of proprietary standards and content have been the strategy of dominant players in mainframe computer software, wireless telecommunications services, and the World Wide Web -- it was their successful lock-in strategy of keeping their customers theirs. But like it or not, those walls are tumbling down. Open web standards are being adopted so widely, with such value and impact, that the web giants -- Amazon, AOL, eBay, Google, Microsoft, and Yahoo -- are facing the difficult decision of opening up to what they don't control.
The online world is evolving into a new open web (sometimes called the Web 2.0), which is all about being personalized and customized for each user. Not only open source software, but open standards are becoming an essential component.
Many of the web giants have been using open source software for years. Most of them use at least parts of the LAMP (Linux, Apache, MySQL, Perl/Python/PHP) stack, even if they aren't well-known for giving back to the open source community. For these incumbents that grew big on proprietary web services, the methods, practices, and applications of open source software development are difficult to fully adopt. And the next open source movements -- which will be as much about open standards as about code -- will be a lot harder for the incumbents to exploit.
While the incumbents use cheap open source software to run their back-end systems, their business models largely depend on proprietary software and algorithms. But in our view, a new slew of open software, open protocols, and open standards will confront the incumbents with the classic Innovator's Dilemma. Should they adopt these tools and standards, painfully cannibalizing their existing revenue for a new unproven concept, or should they stick with their currently lucrative model at the risk that eventually a bunch of upstarts eat their lunch?
Credit should go to several of the web giants who have been making efforts to "open up." Google, Yahoo, eBay, and Amazon all have Open APIs (Application Programming Interfaces) built into their data and systems. Any software developer can access and use them for whatever creative purposes they wish. This means that the API provider becomes an open platform for everyone to use and build on top of. This notion has expanded like wildfire throughout the blogosphere, so nowadays, Open APIs are pretty much required.
Other incumbents also have open strategies. AOL has got the RSS religion, providing a feedreader and RSS search in order to escape the "walled garden of content" stigma. Apple now incorporates podcasts, the "personal radio shows" that are the latest rage in audio narrowcasting, into iTunes. Even Microsoft is supporting open standards, for example by endorsing SIP (Session Initiation Protocol) for internet telephony and conferencing over Skype's proprietary format or one of its own devising.
But new open standards and protocols are in use, under construction, or being proposed every day, pushing the envelope of where we are right now. Many of these standards are coming from startup companies and small groups of developers, not from the giants. Together with the Open APIs, those new standards will contribute to a new, open infrastructure. Tens of thousands of developers will use and improve this open infrastructure to create new kinds of web-based applications and services, to offer web users a highly personalized online experience.
A Brief History of Openness
At this point, I have to admit that I am not just a passive observer, full-time journalist or "just some blogger" -- but an active evangelist and developer of these standards. It's the vision of "open infrastructure" that's driving my company and the reason why I'm writing this article. This article will give you some of the background on these standards, and what the evolution of the next generation of open standards will look like.
Starting back in the 1980s, establishing a software standard was a key strategy for any software company. My former company, MacroMind (which became Macromedia), achieved this goal early on with Director. As Director evolved into Flash, the world saw that other companies besides Microsoft, Adobe, and Apple could establish true cross-platform, independent media standards.
Then Tim Berners-Lee and Marc Andreessen came along, and changed the rules of the software business and of entrepreneurialism. No matter how entrenched and "standardized" software was, the rug could still get pulled out from under it. Netscape did it to Microsoft, and then Microsoft did it back to Netscape. The web evolved, and lots of standards evolved with it. The leading open source standards (such as the LAMP stack) became widely used alternatives to proprietary closed-source offerings.
Open standards are more than just technology. Open standards mean sharing, empowering, and community support. Someone floats a new idea (or meme) and the community runs with it -- with each person making their own contributions to the standard -- evolving it without a moment's hesitation about "giving away their intellectual property."
One good example of this was Dave Sifry, who built the Technorati blog-tracking technology inspired by the Blogging Ecosystem, a weekend project by young hacker Phil Pearson. Dave liked what he saw and he ran with it -- turning Technorati into what it is today.
Dave Winer has contributed enormously to this area of open standards. He defined and personally created several open standards and protocols -- such as RSS, OPML, and XML-RPC. Dave has also helped build the blogosphere through his enthusiasm and passion.
By 2003, hundreds of programmers were working on creating and establishing new standards for almost everything. The best of these new standards have evolved into compelling web services platforms -- such as del.icio.us, Webjay, or Flickr. Some have even spun off formal standards -- like XSPF (a standard for playlists) or instant messaging standard XMPP (also known as Jabber).
Today's Open APIs are complemented by standardized Schemas -- the structure of the data itself and its associated meta-data. Take for example a podcasting feed. It consists of: a) the radio show itself, b) information on who is on the show, what the show is about and how long the show is (the meta-data) and also c) API calls to retrieve a show (a single feed item) and play it from a specified server.
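To make the podcasting example concrete, here is a minimal sketch of how a tool might read the meta-data and the media location out of one RSS 2.0 feed item. The feed below is invented for illustration (the show title, URL, and file size are placeholders), and the `read_items` helper is hypothetical rather than part of any particular podcasting tool.

```python
import xml.etree.ElementTree as ET

# A made-up one-item podcast feed; the titles, URL, and size are placeholders.
FEED = """<rss version="2.0">
  <channel>
    <title>Example Show</title>
    <item>
      <title>Episode 1: Open Standards</title>
      <description>A conversation about open APIs.</description>
      <enclosure url="http://example.com/audio/ep1.mp3"
                 length="12345678" type="audio/mpeg"/>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml):
    """Return one dict per feed item: meta-data plus where to fetch the audio."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        items.append({
            "title": item.findtext("title"),              # (b) meta-data
            "description": item.findtext("description"),  # (b) meta-data
            "media_url": enclosure.get("url"),            # (a) the show itself
            "media_type": enclosure.get("type"),
            "size_bytes": int(enclosure.get("length")),
        })
    return items


episodes = read_items(FEED)
print(episodes[0]["title"], "->", episodes[0]["media_url"])
```

Retrieving and playing the show (part c) would then be an ordinary HTTP request for `media_url`; that step is left out here so the sketch runs without a network connection.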
The combination of Open APIs, standardized schemas for handling meta-data, and an industry which agrees on these standards is breaking the web wide open right now. So what new open standards should the web incumbents -- and you -- be watching? Keep an eye on the following developments:
Identity
Attention
Open Media
Microcontent Publishing
Open Social Networks
Tags
Pinging
Routing
Open Communications
Device Management and Control
1. Identity
Right now, you don't really control your own online identity. At the core of just about every online piece of software is a membership system. Some systems allow you to browse a site anonymously -- but unless you register with the site you can't do things like search for an article, post a comment, buy something, or review it. The problem is that each and every site has its own membership system. So you constantly have to register with new systems, which cannot share data -- even if you'd want them to. By establishing a "single sign-on" standard, disparate sites can allow users to freely move from site to site, and let them control the movement of their personal profile data, as well as any other data they've created.
With Passport, Microsoft unsuccessfully attempted to force its proprietary standard on the industry. Instead, a world is evolving where most people assume that users want to control their own data, whether that data is their profile, their blog posts and photos, or some collection of their past interactions, purchases, and recommendations. As long as users can control their digital identity, any kind of service or interaction can be layered on top of it.
Identity 2.0 is all about users controlling their own profile data and becoming their own agents. This way the users themselves, rather than other intermediaries, will profit from their ID info. Once developers start offering single sign-on to their users, and users have trusted places to store their data -- places which respect the limits and provide access controls over that data -- users will be able to access personalized services which will understand and use their personal data.
Identity 2.0 may seem like some geeky, visionary future standard that isn't defined yet, but by putting each user's digital identity at the core of all their online experiences, Identity 2.0 is becoming the cornerstone of the new open web.
The Initiatives:
Right now, Identity 2.0 is under construction through various efforts from Microsoft (the "InfoCard" component built into the Vista operating system and its "Identity Metasystem"), Sxip Identity, Identity Commons, Liberty Alliance, LID (NetMesh's Lightweight ID), and SixApart's OpenID.
More Movers and Shakers:
Identity Commons and Kaliya Hamlin, Sxip Identity and Dick Hardt, the Identity Gang and Doc Searls, Microsoft's Kim Cameron, Craig Burton, Phil Windley, and Brad Fitzpatrick, to name a few.
2. Attention
How many readers know what their online attention is worth? If you don't, Google and Yahoo do -- they make their living off our attention. They know what we're searching for, happily turn it into a keyword, and sell that keyword to advertisers. They make money off our attention. We don't.
Technorati and friends proposed an attention standard, Attention.xml, designed to "help you keep track of what you've read, what you're spending time on, and what you should be paying attention to." AttentionTrust is an effort by Steve Gillmor and Seth Goldstein to standardize on how captured end-user performance, browsing, and interest data are used.
Blogger Peter Caputa gives a good summary of AttentionTrust: "As we use the web, we reveal lots of information about ourselves by what we pay attention to. Imagine if all of that information could be stored in a nice neat little xml file. And when we travel around the web, we can optionally share it with websites or other people. We can make them pay for it, lease it ... we get to decide who has access to it, how long they have access to it, and what we want in return. And they have to tell us what they are going to do with our Attention data."
So when you give your attention to sites that adhere to the AttentionTrust, your attention rights (you own your attention, you can move your attention, you can pay attention and be paid for it, and you can see how your attention is used) are guaranteed. Attention data is crucial to the future of the open web, and Steve and Seth are making sure that no one entity or oligopoly controls it.
Movers and Shakers:
Steve Gillmor, Seth Goldstein, Dave Sifry and the other Attention.xml folks.
3. Open Media
Proprietary media standards -- Flash, Windows Media, and QuickTime, to name a few -- helped liven up the web. But they are proprietary standards that try to keep us locked in, and they weren't created from scratch to handle today's online content. That's why, for many of us, an Open Media standard has been a holy grail. Yahoo's new Media RSS standard brings us one step closer to achieving open media, as do Ogg Vorbis audio codecs, XSPF playlists, or MusicBrainz. And several sites offer digital creators not only a place to store their content, but also to sell it.
Media RSS (being developed by Yahoo with help from the community) extends RSS and combines it with "RSS enclosures" -- adding metadata to any media item -- to create a comprehensive solution for media "narrowcasters." To gain acceptance for Media RSS, Yahoo knows it has to work with the community. As an active member of this community, I can tell you that we'll create Media RSS equivalents for RDF (an alternative subscription format) and Atom (yet another subscription format), so no one will be able to complain that Yahoo is picking sides in format wars.
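As a rough illustration of what "adding metadata to any media item" looks like, the sketch below builds a single RSS item that carries Media RSS elements. It uses the Media RSS namespace and its `media:content`, `media:title`, and `media:keywords` elements; the `media_item` helper, the clip title, and the URL are invented for this example and are not taken from Yahoo's tools.

```python
import xml.etree.ElementTree as ET

MEDIA_NS = "http://search.yahoo.com/mrss/"  # Media RSS namespace
ET.register_namespace("media", MEDIA_NS)


def media_item(title, url, mime_type, keywords):
    """Build an RSS <item> carrying Media RSS metadata for one media file."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    # media:content points at the file; its children describe it.
    content = ET.SubElement(
        item, f"{{{MEDIA_NS}}}content", {"url": url, "type": mime_type}
    )
    ET.SubElement(content, f"{{{MEDIA_NS}}}title").text = title
    ET.SubElement(content, f"{{{MEDIA_NS}}}keywords").text = ", ".join(keywords)
    return item


# Placeholder values, purely for illustration.
item = media_item(
    "Garage Band Demo",
    "http://example.com/video/demo.mov",
    "video/quicktime",
    ["music", "demo"],
)
xml_text = ET.tostring(item, encoding="unicode")
print(xml_text)
```

Because the extra elements live in their own namespace, a feed reader that knows nothing about Media RSS can still read the plain `<title>` and ignore the rest.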
When Yahoo announced the purchase of Flickr, Yahoo founder Jerry Yang insinuated that Yahoo is acquiring "open DNA" to turn Yahoo into an open standards player. Yahoo is showing what happens when you take a multi-billion dollar company and make openness one of its core values -- so Google, beware, even if Google does have more research fellows and Ph.D.s.
The open media landscape is far and wide, reaching from game machine hacks and mobile phone downloads to PC-driven bookmarklets, players, and editors, and it includes many other standardization efforts. XSPF is an open standard for playlists, and MusicBrainz is an alternative to the proprietary (and originally effectively stolen) database that Gracenote licenses.
Ourmedia.org is a community front-end to Brewster Kahle's Internet Archive. Brewster has promised free bandwidth and free storage forever to any content creators who choose to share their content via the Internet Archive. Ourmedia.org is providing an easy-to-use interface and community to get content in and out of the Internet Archive, giving ourmedia.org users the ability to share their media anywhere they wish, without being locked into a particular service or tool. Ourmedia plans to offer open APIs and an open media registry that interconnects other open media repositories into a DNS-like registry (just like the www domain system), so folks can browse and discover open content across many open media services. Systems like Brightcove and Odeo support the concept of an open registry, and hope to work with digital creators to sell their work to fulfill the financial aspect of the "Long Tail."
More Movers and Shakers:
Creative Commons, the Open Media Network, Jay Dedman, Ryanne Hodson, Michael Verdi, Eli Chapman, Kenyatta Cheese, Doug Kaye, Brad Horowitz, Lucas Gonze, Robert Kaye, Christopher Allen, Brewster Kahle, JD Lasica, and indeed, Marc Canter, among others.
4. Microcontent Publishing
Unstructured content is cheap to create, but hard to search through. Structured content is expensive to create, but easy to search. Microformats resolve the dilemma with simple structures that are cheap to use and easy to search.
The first kind of widely adopted microcontent is blogging. Every post is an encapsulated idea, addressable via a URL called a permalink. You can syndicate or subscribe to this microcontent using RSS or an RSS equivalent, and news or blog aggregators can then display these feeds in a convenient readable fashion. But a blog post is just a block of unstructured text -- not a bad thing, but just a first step for microcontent. When it comes to structured data, such as personal identity profiles, product reviews, or calendar-type event data, RSS was not designed to maintain the integrity of the structures.
Right now, blogging doesn't have the underlying structure necessary for full-fledged microcontent publishing. But that will change. Think of local information services (such as movie listings, event guides, or restaurant reviews) that any college kid can access and use in her weekend programming project to create new services and tools.
Today's blogging tools will evolve into microcontent publishing systems, and will help spread the notion of structured data across the blogosphere. New ways to store, represent and produce microcontent will create new standards, such as Structured Blogging and Microformats. Microformats differ from RSS feeds in that you can't subscribe to them. Instead, Microformats are embedded into webpages and discovered by search engines like Google or Technorati. Microformats are creating common definitions for "What is a review or event? What are the specific fields in the data structure?" They can also specify what we can do with all this information. OPML (Outline Processor Markup Language) is a hierarchical file format for storing microcontent and structured data. It was developed by Dave Winer of RSS and podcast fame.
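To show what "embedded into webpages" means in practice, here is a small sketch that pulls an event out of ordinary HTML marked up with hCalendar-style class names (`vevent`, `summary`, `dtstart`, `location`). The page fragment and the event in it are made up, and the `EventParser` class is a deliberately simple illustration that handles one event per page; real microformat parsers are considerably more thorough.

```python
from html.parser import HTMLParser

# A made-up page fragment; the event details are placeholders.
PAGE = """<div class="vevent">
  <span class="summary">Open Standards Meetup</span> on
  <abbr class="dtstart" title="2005-10-05">October 5</abbr> at
  <span class="location">San Francisco</span>
</div>"""


class EventParser(HTMLParser):
    """Collect the fields of a single hCalendar-style event."""

    FIELDS = ("summary", "dtstart", "location")

    def __init__(self):
        super().__init__()
        self.event = {}
        self._current = None  # field whose text we are waiting for

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        for field in self.FIELDS:
            if field in classes:
                if "title" in attrs:
                    # Machine-readable value stored in the title attribute.
                    self.event[field] = attrs["title"]
                else:
                    self._current = field

    def handle_data(self, data):
        if self._current and data.strip():
            self.event[self._current] = data.strip()
            self._current = None


parser = EventParser()
parser.feed(PAGE)
print(parser.event)
```

The same page still renders as a normal sentence for human readers, which is the point: the structure rides along inside markup people already publish.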
Events are one popular type of microcontent. OpenEvents is already working to create shared databases of standardized events, which would get used by a new generation of event portals -- such as Eventful/EVDB, Upcoming.org, and WhizSpark. The idea of OpenEvents is that event-oriented systems and services can work together to establish shared events databases (and associated APIs) that any developer could then use to create and offer their own new service or application. OpenReviews is still in the conceptual stage, but it would make it possible to provide open alternatives to closed systems like Epinions, and establish a shared database of local and global reviews. Its shared open servers would be filled with all sorts of reviews for anyone to access.
Why is this important? Because I predict that in the future, 10 times more people will be writing reviews than maintaining their own blog. The list of possible microcontent standards goes on: OpenJobpostings, OpenRecipes, and even OpenLists. Microsoft recently revealed that it has been working on an important new kind of microcontent: Lists -- so OpenLists will attempt to establish standards for the kind of lists we all use, such as lists of Links, lists of To Do Items, lists of People, Wish Lists, etc.
Movers and Shakers:
Tantek Çelik and Kevin Marks of Technorati, Danny Ayers, Eric Meyer, Matt Mullenweg, Rohit Khare, Adam Rifkin, Arnaud Leene, Seb Paquet, Alf Eaton, Phil Pearson, Joe Reger, Bob Wyman among others.
5. Open Social Networks
I'll never forget the first time I met Jonathan Abrams, the founder of Friendster. He was arrogant and brash and he claimed he "owned" all his users, and that he was going to monetize them and make a fortune off them. This attitude robbed Friendster of its momentum, letting MySpace, Facebook, and other social networks take Friendster's place.
Jonathan's notion of social networks as a way to control users is typical of the Web 1.0 business model and its attitude towards users in general. Social networks have become one of the battlegrounds between old and new ways of thinking. Open standards for Social Networking will define those sides very clearly. Since meeting Jonathan, I have been working towards finding and establishing open standards for social networks. Instead of closed, centralized social networks with 10 million people in them, the goal is making it possible to have 10 million social networks that each have 10 people in them.
FOAF (which stands for Friend Of A Friend, and describes people and relationships in a way that computers can parse) is a schema to represent not only your personal profile's meta-data, but your social network as well. Thousands of researchers use the FOAF schema in their "Semantic Web" projects to connect people in all sorts of new ways. XFN is a microformat standard for representing your social network, while hCard, which is based on vCard (long familiar to users of contact manager programs like Outlook), is a microformat that contains your profile information. Microformats are baked into any xHTML webpage, which means that any blog, social network page, or any webpage in general can "contain" your social network in it -- and be used by any compatible tool, service or application.
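Here is a minimal sketch of how a tool could read a social network out of a plain blogroll using XFN, which records relationships in the `rel` attribute of ordinary links. The blogroll, names, and URLs are invented, the `XFNParser` class is hypothetical, and `XFN_VALUES` lists only a subset of the XFN vocabulary to keep the example short.

```python
from html.parser import HTMLParser

# A subset of XFN relationship values, enough for this illustration.
XFN_VALUES = {"friend", "acquaintance", "contact", "met",
              "co-worker", "colleague", "neighbor", "me"}

# A made-up blogroll; the people and URLs are placeholders.
BLOGROLL = """<ul>
  <li><a href="http://alice.example.org/" rel="friend met">Alice</a></li>
  <li><a href="http://bob.example.org/" rel="colleague nofollow">Bob</a></li>
  <li><a href="http://example.org/about">About</a></li>
</ul>"""


class XFNParser(HTMLParser):
    """Map each linked URL to the XFN relationships declared on the link."""

    def __init__(self):
        super().__init__()
        self.relationships = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag != "a" or not attrs.get("href"):
            return
        # Keep only rel values that belong to the XFN vocabulary.
        rels = [r for r in (attrs.get("rel") or "").split() if r in XFN_VALUES]
        if rels:
            self.relationships[attrs["href"]] = rels


parser = XFNParser()
parser.feed(BLOGROLL)
print(parser.relationships)
```

Any page that links to people this way can be crawled into a graph of who knows whom, without the page's owner joining a centralized network.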
PeopleAggregator is an earlier project now being integrated into open content management framework Drupal. The PeopleAggregator APIs will make it possible to establish relationships, send messages, create or join groups, and post between different social networks. (Sneak preview: this technology will be available in the upcoming GoingOn Network.)
All of these open social networking standards mean that inter-connected social networks will form a mesh that will parallel the blogosphere. This vibrant, distributed, decentralized world will be driven by open standards: personalized online experiences are what the new open web will be all about -- and what could be more personalized than people's networks?
Movers and Shakers:
Eric Sigler, Joel De Gan, Chris Schmidt, Julian Bond, Paul Martino, Mary Hodder, Drummond Reed, Dan Brickley, Randy Farmer, and Kaliya Hamlin, to name a few.
6. Tags
Nowadays, no self-respecting tool or service can ship without tags. Tags are keywords or phrases attached to photos, blog posts, URLs, or even video clips. These user- and creator-generated tags are an open alternative to what used to be the domain of librarians and information scientists: categorizing information and content using taxonomies. Tags are instead creating "folksonomies."
The recently proposed OpenTags concept would be an open, community-owned version of the popular Technorati Tags service. It would aggregate the usage of tags across a wide range of services, sites, and content tools. In addition to Technorati's current tag features, OpenTags would let groups of people share their tags in "TagClouds." Open tagging is likely to include some of the open identity features discussed above, to create a tag system that is resilient to spam, and yet trustable across sites all over the web.
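The aggregation idea behind a shared tag service can be sketched in a few lines: collect the tags people attach to items on different services, normalize them, and count how often each one is used. Everything below is hypothetical (the service names, the tags, and the `build_tag_cloud` helper); it is not the OpenTags design, only an illustration of cross-service tag counting.

```python
from collections import Counter


def build_tag_cloud(tagged_items, top=2):
    """Count normalized tags across services and return the most used ones."""
    counts = Counter()
    for _service, tags in tagged_items:
        # Treat "RSS", "rss" and " rss " as the same tag.
        counts.update(tag.strip().lower() for tag in tags)
    return counts.most_common(top)


# Made-up tags from three made-up services.
items = [
    ("photos", ["OpenStandards", "web2.0"]),
    ("bookmarks", ["openstandards", "RSS"]),
    ("blog", ["rss", "openstandards ", "identity"]),
]
cloud = build_tag_cloud(items)
print(cloud)
```

A real service would also need the identity features mentioned above, so that counts reflect distinct, trusted users rather than whoever submits the most tags.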
OpenTags owes a debt to earlier versions of shared tagging systems, which include Topic Exchange and something called the k-collector -- a knowledge management tag aggregator -- from Italian company eVectors.
Movers & Shakers:
Phil Pearson, Matt Mower, Paolo Valdemarin, and Mary Hodder and Drummond Reed again, among others.
7. Pinging
Websites used to be mostly static. Search engines that crawled (or "spidered") them every so often did a good enough job to show reasonably current versions of your cousin's homepage or even Time magazine's weekly headlines. But when blogging took off, it became hard for search engines to keep up. (Google has only just managed to offer blog-search functionality, despite buying Blogger back in early 2003.)
To know what was new in the blogosphere, users couldn't depend on services that spidered webpages once in a while. The solution: a way for blogs themselves to automatically notify blog-tracking sites that they'd been updated. Weblogs.com was the first blog "ping service": it displayed the name of a blog whenever that blog was updated. Pinging sites helped the blogosphere grow, and more tools, services, and portals started using pinging in new and different ways. Dozens of pinging services and sites -- most of which can't talk to each other -- sprang up.
Matt Mullenweg (the creator of open source blogging software WordPress) decided that a one-stop service for pinging was needed. He created Ping-o-Matic -- which aggregates ping services and simplifies the pinging process for bloggers and tool developers. With Ping-o-Matic, any developer can alert all of the industry's blogging tools and tracking sites at once. This new kind of open standard, with shared infrastructure, is critical to the scalability of Web 2.0 services.
As Matt said: There are a number of services designed specifically for tracking and connecting blogs. However it would be expensive for all the services to crawl all the blogs in the world all the time. By sending a small ping to each service you let them know you've updated so they can come check you out. They get the freshest data possible, you don't get a thousand robots spidering your site all the time. Everybody wins.
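A ping really is small. Blog ping services of this kind commonly accept an XML-RPC call named `weblogUpdates.ping` that takes the blog's name and URL. The sketch below only builds and re-reads that request body with Python's standard `xmlrpc.client` module, so it runs without a network; the blog name and URL are placeholders, and the `build_ping` helper is an illustration, not Ping-o-Matic's own code.

```python
import xmlrpc.client


def build_ping(blog_name, blog_url):
    """Return the XML-RPC request body announcing that a blog has updated."""
    return xmlrpc.client.dumps(
        (blog_name, blog_url), methodname="weblogUpdates.ping"
    )


# Placeholder blog details, for illustration only.
payload = build_ping("Example Blog", "http://blog.example.com/")
print(payload)

# Decode the payload again, as a receiving ping service would.
params, method = xmlrpc.client.loads(payload)
print(method, params)
```

To actually send the ping, a tool would pass the service's XML-RPC endpoint URL to `xmlrpc.client.ServerProxy` and call `weblogUpdates.ping(name, url)` on it; the endpoint is left out here because it depends on the service being notified.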
Movers and Shakers:
Matt Mullenweg, Jim Winstead, Dave Winer
8. Routing
Bloggers used to have to manually enter the links and content snippets of blog posts or news items they wanted to blog. Today, some RSS aggregators can send a specified post directly into an associated blogging tool: as bloggers browse through the feeds they subscribe to, they can easily specify and send any post they wish to "reblog" from their news aggregator or feed reader into their blogging tool. (This is usually referred to as "BlogThis.") As structured blogging comes into its own (see the section on Microcontent Publishing), it will be increasingly important to maintain the structural integrity of these pieces of microcontent when reblogging them.
Promising standard RedirectThis will combine a "BlogThis"-like capability while maintaining the integrity of the microcontent. RedirectThis will let bloggers and content developers attach a simple "PostThis" button to their posts. Clicking on that button will send that post to the reader/blogger's favorite blogging tool. This favorite tool is specified at the RedirectThis web service, where users register their blogging tool of choice. RedirectThis also helps maintain the integrity and structure of microcontent -- then it's just up to the user to prefer a blogging tool that also attains that lofty goal of microcontent integrity.
OutputThis is another nascent web services standard, to let bloggers specify what "destinations" they'd like to have as options in their blogging tool. As new destinations are added to the service, more checkboxes would get added to their blogging tool -- allowing them to route their published microcontent to additional destinations.
Movers and Shakers:
Michael Migurski, Lucas Gonze
9. Open Communications
Likely, you've experienced the joys of finding friends on AIM or Yahoo Messenger, or the convenience of Skyping with someone overseas. Not that you're about to throw away your mobile phone or BlackBerry, but for many, also having access to Instant Messaging (IM) and Voice over IP (VoIP) is crucial.
IM and VoIP are mainstream technologies that already enjoy the benefits of open standards. Entire industries are born -- right this second -- based around these open standards. Jabber has been an open IM technology for years -- in fact, as XMPP, it was officially dubbed a standard by the IETF. Although becoming an official IETF standard is usually the kiss of death, Jabber looks like it'll be around for a while, as entire generations of collaborative, work-group applications and services have been built on top of its messaging protocol. For VoIP, Skype is clearly the leading standard today -- though one could argue just how "open" it is (and defenders of the IETF's SIP standard often do). But it is free and user-friendly, so there won't be much argument from users about it being insufficiently open. Yet there may be a cloud on Skype's horizon: web behemoth Google recently released a beta of Google Talk, an IM client committed to open standards. It currently supports XMPP, and will support SIP for VoIP calls.
Movers and Shakers:
Jeremie Miller, Henning Schulzrinne, Jon Peterson, Jeff Pulver
10. Device Management and Control
To access online content, we're using more and more devices. BlackBerrys, iPods, Treos, you name it. As the web evolves, more and more different devices will have to communicate with each other to give us the content we want when and where we want it. No one wants to be dependent on one vendor anymore -- like, say, Sony -- for their laptop, phone, MP3 player, PDA, and digital camera, so that it all works together. We need fully interoperable devices, and the standards to make that work. And to take full advantage of content moving online and of innovative web services, those standards need to be open.
MIDI (musical instrument digital interface), one of the very first open standards in music, connected disparate vendors' instruments, post-production equipment, and recording devices. But MIDI is limited, and MIDI II has been very slow to arrive. Now a new standard for controlling musical devices has emerged: OSC (Open SoundControl). This protocol is optimized for modern networking technology and inter-connects music, video and controller devices with "other multimedia devices." OSC is used by a wide range of developers, and is being taken up in the mainstream MIDI marketplace.
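Part of what makes OSC network-friendly is its simple binary layout: an address string, a type-tag string, and then the arguments, each aligned to 4 bytes with numbers in big-endian order. The sketch below encodes a message that way for integer and float arguments only. The `osc_message` helper and the `/synth/freq` address are made up for illustration; a real project would more likely use an existing OSC library.

```python
import struct


def _pad(data):
    """Null-terminate and pad to a multiple of 4 bytes, as OSC strings require."""
    data += b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def osc_message(address, *values):
    """Encode an OSC message with int32 ('i') and float32 ('f') arguments."""
    type_tags = ","
    payload = b""
    for value in values:
        if isinstance(value, float):
            type_tags += "f"
            payload += struct.pack(">f", value)  # big-endian float32
        elif isinstance(value, int):
            type_tags += "i"
            payload += struct.pack(">i", value)  # big-endian int32
        else:
            raise TypeError(f"unsupported OSC argument: {value!r}")
    return _pad(address.encode("ascii")) + _pad(type_tags.encode("ascii")) + payload


# Hypothetical address: ask a synthesizer to play 440 Hz.
packet = osc_message("/synth/freq", 440.0)
print(packet)
```

The resulting bytes can be sent as a single UDP datagram to any OSC-aware device or program, which is how one controller can drive instruments from different vendors.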
Another open-standards-based device management technology is ZigBee, for building wireless intelligence and network monitoring into all kinds of devices. ZigBee is supported by many networking, consumer electronics, and mobile device companies.
The Change to Openness
The rise of open source software and its "architecture of participation" are completely shaking up the old proprietary-web-services-and-standards approach. Sun Microsystems -- whose proprietary Java standard helped define the Web 1.0 -- is opening its Solaris OS and has even announced the apparent paradox of an open-source Digital Rights Management system.
Today's incumbents will have to adapt to the new openness of the Web 2.0. If they stick to their proprietary standards, code, and content, they'll become the new walled gardens -- places users visit briefly to retrieve data and content from enclosed data silos, but not where users "live." The incumbents' revenue models will have to change. Instead of being "owned," users will know they own themselves, and will expect a return on their valuable identity and attention. Instead of being locked into incompatible media formats, users will expect easy access to digital content across many platforms.
Yesterday's web giants and tomorrow's users will need to find a mutually beneficial new balance -- between open and proprietary, developer and user, hierarchical and horizontal, owned and shared, and compatible and closed.
Marc Canter is an active evangelist and developer of open standards. Early in his career, Marc founded MacroMind, which became Macromedia. These days, he is CEO of Broadband Mechanics, a founding member of the Identity Gang and of ourmedia.org. Broadband Mechanics is currently developing the GoingOn Network (with the AlwaysOn Network), as well as an open platform for social networking called the PeopleAggregator.
A version of the above post appears in the Fall 2005 issue of AlwaysOn's quarterly print blogozine, and ran as a four-part series on the AlwaysOn Network website. (Via Marc's Voice.)
The value of the Internet as a repository of useful information is very low. Carl Shapiro in "Information Rules" suggests that the amount of actually useful information on the Internet would fit within roughly 15,000 books, which is about half the size of an average mall bookstore. To put this in perspective: there are over 5 billion unique, static & publicly accessible web pages on the www. Apparently only 6% of web sites have educational content (Maureen Henninger, "Don't just surf the net: Effective research strategies". UNSW Press). Even of the educational content only a fraction is of significant informational value.
..As Stanford students, Larry Page and Sergey Brin looked at the same problem -- how to impart meaning to all the content on the Web -- and decided to take a different approach. The two developed sophisticated software that relied on other clues to discover the meaning of content, such as which Web sites the information was linked to. And in 1998 they launched Google..
You mean noise ranking. Now, I don't think Larry and Sergey set out to do this, but Google page ranks are ultimately based on the concept of "Google Juice" (aka links). The value quotient of this algorithm is accelerating at internet speed (ironically, but naturally). Human beings are smarter than computers; we just process data (not information!) much slower, that's all. Thus, we can conjure up numerous ways to bubble up the Google link ranking algorithms in no time (as is the case today).
..What most differentiates Google's approach from Berners-Lee's is that Google doesn't require people to change the way they post content..
The Semantic Web doesn't require anyone to change how they post content either! It just provides a roadmap for intelligent content management and consumption through innovative products.
..As Sergey Brin told Infoworld's 2002 CTO Forum, "I'd rather make progress by having computers understand what humans write, than by forcing humans to write in ways that computers can understand." In fact, Google has not participated at all in the W3C's formulation of Semantic Web standards, says Eric Miller..
Semantic Content generated by next generation content managers will make more progress, and they certainly won't require humans to write any differently. If anything, humans will find the process quite refreshing as and when participation is required e.g. clicking bookmarklets associated with tagging services such as 'del.icio.us', 'de.lirio.us', or Unalog and others. But this is only the beginning, if I can click on a bookmarklet to post this blog post to a tagging service, then why wouldn't I be able to incorporate the "tag service post" into the same process that saves my blog post (the post is content that ends up in a content management system aka blog server)?
Yet Google's impact on the Web is so dramatic that it probably makes more sense to call the next generation of the Web the "Google Web" rather than the "Semantic Web."
Ah! so you think we really want the noisy "Google Web" as opposed to a federation of distributed Information- and Knowledgbases ala the "Semantic Web"? I don't think so somehow!
Today we are generally excited about "tagging" but fail to see its correlation with the "Semantic Web", somehow? I have said this before, and I will say it again: the "Semantic Web" is going to be self-annotated by humans with the aid of intelligent and unobtrusive annotation technology solutions. These solutions will provide context and purpose by using our social essence as currency. The annotation effort will be subliminal; there won't be a "Semantic Web Day" parade or anything of the like. It will appear before us all, in all its glory, without any fanfare. Funnily enough, we might not even call it "The Semantic Web", who cares? But it will have the distinct attributes of being very "Quiet" and highly "Valuable"; with no burden on "how we write", but constructive burden on "why we write" as part of the content contribution process (less Google/Yahoo/etc juice chasing for more knowledge assembly and exchange).
We are social creatures at our core. The Internet and Web have collectively reduced the connectivity hurdles that once made social network oriented solutions implausible. The eradication of these hurdles ultimately feeds the very impulses that trigger the critical self-annotation that is the basis of my fundamental belief in the realization of TBL's Semantic Web vision.
]]>Tim Berners-Lee delivered a keynote at WWW2004 earlier this week, and Paul Ford provided a keynote breakdown from which I have scraped a poignant excerpt that helps me illuminate Virtuoso's role in the inevitable semantic web.
First off, I see the Semantic Web as a core component of Web 2.x (a minor upgrade of Web 2.0), and I see Virtuoso as a definitive Web 2.0 (and beyond) technology, hence the use today of the branding term "Universal Server". A term that I expect to become a common product moniker in the not too distant future.
The first challenge that confronts the semantic web is the creation of Semantic content. How will the content be created? Ideally, it should come from data; at the end of the day, this is a data contextualization process. The excerpt below from Paul's article highlights the point:
Rather than concerning themselves unduly with hewing to existing ontologies, Berners-Lee pushed developers to start using RDF and triples more aggressively. In particular, he wants to see existing databases exported as RDF, with ontologies created ad-hoc to match the structure of that data. Rather than using PHP scripts only to produce HTML, he suggested, create RDF as well. Then, when all of the RDF is aggregated, apply rules and see what happens. "Let's not fall back on handmade markup."
Data in existing databases does not have to be exported as RDF, especially if sensitivity to change is a specific contextual requirement. Naturally, the assumption is made that most databases don't have the ability to produce RDF, so an additional tool would be required to perform the data export and transformation, with a separate HTTP server then making the repurposed RDF data accessible over HTTP.
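To make the export-and-transform step concrete, here is a minimal sketch of what such an additional tool might do: read rows from a relational table and emit RDF triples (in N-Triples form) using an ad-hoc vocabulary derived from the table and column names. Everything specific here is an assumption for illustration only: the in-memory SQLite database, the `part` table, the `http://example.org/` namespace, and the function names. It is not how Virtuoso or any particular product does it.

```python
import sqlite3

# Hypothetical namespace used to mint URIs; not taken from any real deployment.
BASE = "http://example.org/"

def escape_literal(value):
    """Escape a value for use inside an N-Triples string literal."""
    return (str(value).replace("\\", "\\\\").replace('"', '\\"')
            .replace("\n", "\\n").replace("\r", "\\r"))

def table_to_ntriples(conn, table, key_column):
    """Yield one N-Triples line per non-null, non-key cell of the table.

    The predicates form an ad-hoc ontology that mirrors the table structure:
    one predicate per column, one subject per row.
    """
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cursor.description]
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"<{BASE}{table}/{record[key_column]}>"
        for column, value in record.items():
            if column == key_column or value is None:
                continue
            predicate = f"<{BASE}schema/{table}#{column}>"
            yield f'{subject} {predicate} "{escape_literal(value)}" .'

# Illustrative data: a one-row parts table in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.execute("INSERT INTO part VALUES (1, 'Widget', 9.5)")

triples = list(table_to_ntriples(conn, "part", "id"))
for line in triples:
    print(line)
```

A sketch like this produces a one-off snapshot, which is exactly the weakness noted above: once the underlying rows change, the exported RDF is stale until the export runs again.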
Later in the talk, he described a cascade of Semantic Web connections, postulating that one day, individuals may be able to follow links from a parts catalog to order status, from location to weather to taxes.
The final excerpt (above) outlines the kinds of interactions that the Semantic Web facilitates. The traversal from a "parts catalog" to "order status", or from "location" to "weather" to "taxes", illustrates the roles that services and service orchestration will also play in the Semantic Web era.
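The kind of traversal described in the excerpt can be sketched as following a chain of predicates across a set of triples. This is only a toy, in-memory illustration: the `ex:` identifiers, the predicate names, and the `follow` helper are all hypothetical, and a real deployment would dereference HTTP URIs or call services rather than scan a list.

```python
# Hypothetical triples; every identifier and predicate here is made up.
TRIPLES = [
    ("ex:catalog/part-42", "ex:hasOrder", "ex:order/1001"),
    ("ex:order/1001", "ex:status", "shipped"),
    ("ex:order/1001", "ex:shipsTo", "ex:place/p1"),
    ("ex:place/p1", "ex:weather", "ex:weather/p1-today"),
]

def follow(start, path, triples=TRIPLES):
    """Follow a chain of predicates from start; return the set of nodes reached."""
    frontier = {start}
    for predicate in path:
        frontier = {o for (s, p, o) in triples
                    if s in frontier and p == predicate}
    return frontier

# From a catalog entry to its order status, and from the same entry to weather.
order_status = follow("ex:catalog/part-42", ["ex:hasOrder", "ex:status"])
weather = follow("ex:catalog/part-42", ["ex:hasOrder", "ex:shipsTo", "ex:weather"])
print(order_status, weather)
```

Each hop in the path stands in for what would, in practice, be a link followed to another data source or service, which is where orchestration comes into play.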
Thus, we can safely deduce the following about the semantic web:
I would also like to conclude that what we know today as the monolithic "point of presence" on the web called a "Web Site" (which implies browsing and page serving) is naturally going to morph into a different kind of "point of presence" that is capable of delivering the following from a single process:
This is what Virtuoso is all about, and why it is described as a "Universal Server"; a server instance that speaks many protocols, delivering a plethora of functionality (Database, Web Services Platform, Orchestration Engine, and more).
]]>Tim Berners-Lee First Honoree of Millennium Technology Prize.
Reuters, via MSNBC News
World Wide Web inventor Tim Berners-Lee won $1.23 million on Thursday, the largest single amount of money he has made from an invention that has made many others very rich. Berners-Lee, 48, was named the first winner of the world's largest technology award -- the Millennium Technology Prize -- by the Finnish Technology Award Foundation at a ceremony in the Finnish city of Espoo. When myriad dot-com firms went public in the late 1990s, their founders were instantly turned into millionaires at the height of the Internet investment bubble. Most people would be hard-pressed to name the retiring Internet architect, who bypassed cashing-in on his technology contributions for an academic's salary at the Massachusetts Institute of Technology in the United States.
http://msnbc.msn.com/id/4744554/
See also the W3C news item: http://www.w3.org/News/2004#item64
]]>