Sticking with the TechCrunch layout, here is why all roads simply lead to Linked Data come 2010 and beyond:
As I've stated in the past (across a variety of mediums), you cannot build applications that have long term value without addressing the following issues:
The items above basically showcase the very essence of the HTTP URI abstraction that drives HTTP based Linked Data, which is also the basic payload unit that underlies REST.
I simply hope that the next decade marks a period of broad appreciation and comprehension of Data Access, Integration, and Management issues on the parts of: application developers, integrators, analysts, end-users, and decision makers. Remember, without structured Data we cannot produce or share Information, and without Information, we cannot produce or share Knowledge.
As the "Linked Data" meme has gained momentum you've more than likely been on the receiving end of dialog with Linked Open Data community members (myself included) that goes something like this:
"Do you have a URI", "Get yourself a URI", "Give me a de-referencable URI" etc..
And each time, you respond with a URL -- which to the best of your Web knowledge is a bona fide URI. But to your utter confusion you are told: Nah! You gave me a Document URI instead of the URI of a real-world thing or object etc..
Well, our everyday use of the Web is an unfortunate conflation of two distinct things that both have Identity: Real World Objects (RWOs) & the Address/Location of Documents (Information bearing Resources).
The "Linked Data" meme is about enhancing the Web by unobtrusively reintroducing its core essence: the generic HTTP URI, a vital piece of Web Architecture DNA. Basically, its about so realizing the full capabilities of the Web as a platform for Open Data Identification, Definition, Access, Storage, Representation, Presentation, and Integration.
People, Places, Music, Books, Cars, Ideas, Emotions etc..
A Uniform Resource Identifier. A global identifier mechanism for network addressable data items. Its sole function is Name oriented Identification.
The constituent parts of a URI (from URI Generic Syntax RFC) are depicted below:
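Here's a quick illustrative breakdown of those parts, sketched with Python's standard urllib.parse module (my choice of tool here, not something mandated by the RFC); the example URI is hypothetical:

```python
# A minimal sketch: splitting a URI into the generic components defined by
# RFC 3986, using Python's standard library. The URI below is illustrative.
from urllib.parse import urlparse

uri = "http://dbpedia.org/resource/Paris?lang=en#about"
parts = urlparse(uri)

print(parts.scheme)    # 'http'            -> scheme
print(parts.netloc)    # 'dbpedia.org'     -> authority
print(parts.path)      # '/resource/Paris' -> path
print(parts.query)     # 'lang=en'         -> query
print(parts.fragment)  # 'about'           -> fragment
```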
A location oriented HTTP scheme based URI. The HTTP scheme introduces a powerful and inherent duality that delivers:
So far so good!
The kind of URI Linked Data aficionados mean when they use the term: URI.
An HTTP URI is an HTTP scheme based URI. Unlike a URL, this kind of HTTP scheme URI is devoid of any Web Location orientation or specificity. Thus, its inherent duality provides a more powerful level of abstraction. Hence, you can use this form of URI to assign Names/Identifiers to Real World Objects (RWOs). Even better, courtesy of the Identity/Address duality of the HTTP scheme, a single URI can deliver the following:
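To make the duality concrete, here's a minimal sketch of dereferencing a real-world-thing URI with content negotiation. It assumes the Python requests library, and uses a DBpedia identifier purely for illustration; the exact redirect behavior is the publisher's choice, not something this post prescribes:

```python
# A sketch of the Name/Address duality: one HTTP URI names the city of Paris,
# and dereferencing it (with content negotiation) yields a description of it.
# Assumes the 'requests' library; DBpedia is used purely as an illustration.
import requests

thing_uri = "http://dbpedia.org/resource/Paris"   # names the real-world thing

resp = requests.get(thing_uri,
                    headers={"Accept": "text/turtle"},  # ask for structured data
                    allow_redirects=True)

# The server typically redirects the "thing" URI to a "document about the thing".
print(resp.url)          # the description document's URI, distinct from thing_uri
print(resp.status_code)  # 200 once the description document is retrieved
print(resp.text[:300])   # Turtle triples describing the resource
```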
Data about Data. Put differently, data that describes other data in a structured manner.
The predominant model for metadata is the Entity-Attribute-Value + Classes & Relationships model (EAV/CR). A model that's been with us since the inception of modern computing (long before the Web).
The Resource Description Framework (RDF) is a framework for describing Web addressable resources. In a nutshell, it's a framework for adding Metadata bearing Information Resources to the current Web. It's comprised of:
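To make the EAV/CR-to-RDF connection concrete, here's a minimal sketch assuming the rdflib library; every URI below is illustrative rather than canonical:

```python
# A minimal sketch of an EAV/CR-style description expressed as RDF triples,
# assuming the rdflib library; the URIs below are purely illustrative.
from rdflib import Graph, URIRef, Literal, Namespace
from rdflib.namespace import RDF

FOAF = Namespace("http://xmlns.com/foaf/0.1/")

g = Graph()
entity = URIRef("http://example.org/people/kingsley#this")  # the Entity (Subject)
g.add((entity, RDF.type, FOAF.Person))                      # Class membership
g.add((entity, FOAF.name, Literal("Kingsley Idehen")))      # Attribute -> literal Value
g.add((entity, FOAF.knows,                                  # Relationship to another Entity
       URIRef("http://example.org/people/jon#this")))

print(g.serialize(format="turtle"))
```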
The ubiquitous use of the Web is primarily focused on a Linked Mesh of Information bearing Documents. URLs rather than generic HTTP URIs are the prime mechanism for Web tapestry; basically, we use URLs to conduct Information -- which is inherently subjective -- instead of using HTTP URIs to conduct "Raw Data" -- which is inherently objective.
Note: Information is "data in context", it isn't the same thing as "Raw Data". Thus, if we can link to Information via the Web, why shouldn't we be able to do the same for "Raw Data"?
The meme simply provides a set of guidelines (best practices) for producing Web architecture friendly metadata. Meaning: when producing EAV/CR model based metadata, endow Subjects, their Attributes, and (optionally) Attribute Values with HTTP URIs. By doing so, a new level of Link Abstraction on the Web is possible i.e., "Data Item to Data Item" level links (aka hyperdata links). Even better, when you de-reference an RWO hyperdata link you end up with a negotiated representation of its metadata.
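A rough sketch of that "follow-your-nose" traversal over hyperdata links, again assuming rdflib (the starting URI is just an example):

```python
# A sketch of the "follow-your-nose" pattern over hyperdata links, assuming
# rdflib: dereference one RWO URI, then collect further HTTP URIs to follow.
from rdflib import Graph, URIRef

start = URIRef("http://dbpedia.org/resource/Paris")  # illustrative RWO URI

g = Graph()
g.parse(str(start))  # content negotiation fetches an RDF description of the resource

# Every object that is itself an HTTP URI is an outbound hyperdata link.
outbound = {o for _, _, o in g.triples((start, None, None))
            if isinstance(o, URIRef) and str(o).startswith("http")}

for link in sorted(outbound)[:10]:
    print(link)  # candidate data items to dereference next
```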
Linked Data is ultimately about an HTTP URI for each item in the Data Organization Hierarchy :-)
If we could just take "The Semantic Web" moniker for what it was -- a code name for an aspect of the Web -- and move on, things will get much clearer, fast!
Basically, what is/was the "Semantic Web" should really have been code named: ("You" Oriented Data Access) as a play on: Yoda's appreciation of the FORCE (Fact ORiented Connected Entities) -- the power of inter galactic, interlinked, structured data, fashioned by the World Wide Web courtesy of the HTTP protocol.
As stated in an earlier post, the next phase of the Web is all about the magic of entity "You". The single most important item of reference for every Web user would be the Person Entity ID (URI). Just by remembering your Entity ID, you will have intelligent pathways across, and into, the FORCE that the Linked Data Web delivers. The quality of the pathways and increased density of the FORCE are the keys to high SDQ (tomorrow's SEO). Thus, the SDQ of URIs will ultimately be the unit determinant of value to Web Users, along the following personal lines, hence the critical platform questions:
While most industry commentators continue to ponder and pontificate about what "The Semantic Web" is (unfortunately), the real thing (the "FORCE") is already here, and self-enhancing rapidly.
Assuming we now accept that the FORCE is simply an RDF based Linked Data moniker, and that RDF Linked Data is all about the Web as a structured database, we should start to move our attention over to practical exploitation of this burgeoning global database. In doing so, we should not discard knowledge from the past, such as the many great examples available gratis from the Relational Database realm. For instance, we should start paying attention to the discovery, development, and deployment of high level tools such as query builders, report writers, and intelligence oriented analytic tools, none of which should -- at first point of interaction -- expose raw RDF or the SPARQL query language. Along similar lines of thinking, we also need development environments and frameworks that are counterparts to Visual Studio, ACCESS, FileMaker, and the like.
Now, if re-labeling can confuse me when applied to a realm I've been intimately involved with for eons (internet time), I don't want to imagine what it does for others who aren't that intimately involved with the important data access and data integration realms.
On the more refreshing side, the article does shed some light on the potency of RDF and OWL when applied to the construction of conceptual views of heterogeneous data sources.
"How do you know that data coming from one place calculates net revenue the same way that data coming from another place does? Youâve got people using the same term for different things and different terms for the same things. How do you reconcile all of that? Thatâs really what semantic integration is about."
BTW - I discovered this article via another titled: Understanding Integration And How It Can Help with SOA, that covers SOA and Integration matters. Again, in this piece I feel the gradual realization of the virtues that RDF, OWL, and RDF Linked Data bring to bear in the vital realm of data integration across heterogeneous data silos.
A number of events, at the micro and macro economic levels, are forcing attention back to the issue of productive use of existing IT resources. The trouble with the aforementioned quest is that it ultimately unveils the global IT affliction known as: heterogeneous data silos, and the challenges of pain alleviation, that have been ignored forever or approached inadequately as clearly shown by the rapid build up of SOA horror stories in the data integration realm.
Data Integration via conceptualization of heterogeneous data sources, resulting in concrete conceptual layer data access and management, remains the greatest and most potent application of technologies associated with the "Semantic Web" and/or "Linked Data" monikers.
CrunchBase: When we released the CrunchBase API, you were one of the first developers to step up and quickly released a CrunchBase Sponger Cartridge. Can you explain what a CrunchBase Sponger Cartridge is?
Me: A Sponger Cartridge is a data access driver for Web Resources that plugs into our Virtuoso Universal Server (DBMS and Linked Data Web Server combo amongst other things). It uses the internal structure of a resource and/or a web service associated with a resource, to materialize an RDF based Linked Data graph that essentially describes the resource via its properties (Attributes & Relationships).
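For readers who want a feel for the general pattern (this is emphatically not OpenLink's actual cartridge code), here's a rough sketch of turning an API's JSON record into RDF triples; the endpoint, vocabulary, and identifiers are all hypothetical:

```python
# A rough sketch of the general idea behind a Sponger-style cartridge -- not
# OpenLink's actual implementation: fetch a resource's JSON from an API and
# re-project its attributes as RDF triples. The URLs and vocabulary are made up.
import requests
from rdflib import Graph, URIRef, Literal, Namespace

CB = Namespace("http://example.org/crunchbase/vocab#")  # illustrative vocabulary

def spongify(api_url: str, entity_uri: str) -> Graph:
    record = requests.get(api_url).json()       # raw Web 2.0 data
    g = Graph()
    s = URIRef(entity_uri)                       # the Linked Data identifier
    for key, value in record.items():
        if isinstance(value, (str, int, float)):
            g.add((s, CB[key], Literal(value)))  # attribute -> literal value
    return g

# Usage (hypothetical endpoint and identifier):
# g = spongify("https://api.example.org/company/openlink.json",
#              "http://example.org/crunchbase/company/openlink#this")
# print(g.serialize(format="turtle"))
```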
CrunchBase: And what inspired you to create it?
Me: Bengee built a new space with your data, and we've built a space on the fly from your data which still resides in your domain. Either solution extols the virtues of Linked Data i.e. the ability to explore relationships across data items with high degrees of serendipity (also colloquially known as the "follow-your-nose" pattern in Semantic Web circles).
Bengee posted a notice to the Linking Open Data Community's public mailing list announcing his effort. Bearing in mind the fact that we've been using middleware to mesh the realms of Web 2.0 and the Linked Data Web for a while, it was a no-brainer to knock something up based on the conceptual similarities between Wikicompany and CrunchBase. In a sense, a quadrant of orthogonality is what immediately came to mind re. Wikicompany, CrunchBase, Bengee's RDFization efforts, and ours.
Bengee created an RDF based Linked Data warehouse based on the data exposed by your API, which is exposed via the Semantic CrunchBase data space. In our case we've taken the "RDFization on the fly" approach, which produces a transient Linked Data View of the CrunchBase data exposed by your APIs. Our approach is in line with our world view: all resources on the Web are data sources, and the Linked Data Web is about incorporating HTTP into the naming scheme of these data sources so that the conventional URL based hyperlinking mechanism can be used to access a structured description of a resource, which is then transmitted using a range of negotiable representation formats. In addition, because we house and publish a lot of Linked Data on the Web (e.g. DBpedia, PingTheSemanticWeb, and others), we've also automatically meshed CrunchBase data with related data in DBpedia and Wikicompany.
CrunchBase: Do you know of any apps that are using CrunchBase Cartridge to enhance their functionality?
Me: Yes, the OpenLink Data Explorer which provides CrunchBase site visitors with the option to explore the Linked Data in the CrunchBase data space. It also allows them to "Mesh" (rather than "Mash") CrunchBase data with other Linked Data sources on the Web without writing a single line of code.
CrunchBase: You have been immersed in the Semantic Web movement for a while now. How did you first get interested in the Semantic Web?
Me: We saw the Semantic Web as a vehicle for standardizing conceptual views of heterogeneous data sources via context lenses (URIs). In 1998, as part of our strategy to expand our business beyond the development and deployment of ODBC, JDBC, and OLE-DB data providers, we decided to build a Virtual Database Engine (see: Virtuoso History), and in doing so we sought a standards based mechanism for the conceptual output of the data virtualization effort. As of the seminal unveiling of the Semantic Web in 1998, we were clear about two things in relation to the effects of the Web and Internet data management infrastructure inflections: 1) existing DBMS technology had reached its limits; 2) Web Servers would ultimately hit their functional limits. These fundamental realities compelled us to develop Virtuoso with an eye to leveraging the Semantic Web as a vehicle for completing its technical roadmap.
CrunchBase: Can you put into layman's terms exactly what RDF and SPARQL are and why they are important? Do they only matter for developers or will they extend past developers at some point and be used by website visitors as well?
Me: RDF (Resource Description Framework) is a Graph based Data Model that facilitates resource description using the Subject, Predicate, and Object principle. Associated with the core data model, as part of the overall framework, are a number of markup languages for expressing your descriptions (just as you express presentation markup semantics in HTML or document structure semantics in XML) that include: RDFa (simple extension of HTML markup for embedding descriptions of things in a page), N3 (a human friendly markup for describing resources), RDF/XML (a machine friendly markup for describing resources).
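A small sketch of that point, assuming rdflib: one and the same description rendered in two of the markups mentioned above:

```python
# One description, two markups: the same triples written as N3 (human friendly)
# and as RDF/XML (machine friendly). Assumes rdflib; the URI is illustrative.
from rdflib import Graph, URIRef, Literal, Namespace

FOAF = Namespace("http://xmlns.com/foaf/0.1/")

g = Graph()
me = URIRef("http://example.org/people/kingsley#this")
g.add((me, FOAF.name, Literal("Kingsley Idehen")))

print(g.serialize(format="n3"))   # human friendly markup
print(g.serialize(format="xml"))  # machine friendly RDF/XML markup
```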
SPARQL is the query language associated with the RDF Data Model, just as SQL is the query language associated with the Relational Database Model. Thus, when you have RDF based structured and linked data on the Web, you can query against the Web using SPARQL just as you would against an Oracle/SQL Server/DB2/Informix/Ingres/MySQL/etc. DBMS using SQL. That's it in a nutshell.
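By way of illustration, here's a minimal SPARQL query run with rdflib over a few illustrative triples -- the analog of a SELECT statement against a SQL table:

```python
# A sketch of the SQL analogy, assuming rdflib: query RDF with SPARQL much as
# you would query a relational DBMS with SQL. URIs and data are illustrative.
from rdflib import Graph

g = Graph()
g.parse(data="""
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<http://example.org/people/kingsley#this> foaf:name "Kingsley Idehen" ;
    foaf:knows <http://example.org/people/jon#this> .
""", format="turtle")

results = g.query("""
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name ?friend WHERE {
        ?person foaf:name ?name ;
                foaf:knows ?friend .
    }
""")

for name, friend in results:
    print(name, friend)
```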
CrunchBase: On your website you wrote that "RDF and SPARQL as productivity boosters in everyday web development". Can you elaborate on why you believe that to be true?
Me: I think the ability to discern a formal description of anything via its discrete properties is of immense value re. productivity, especially when the capability in question results in a graph of Linked Data that isn't confined to a specific host operating system, database engine, application or service, programming language, or development framework. RDF Linked Data is about infrastructure for the true materialization of the "Information at Your Fingertips" vision of yore. Even though it's taken the emergence of RDF Linked Data to make the aforementioned vision tractable, the comprehension of the vision's intrinsic value has been clear for a very long time. Most organizations and/or individuals are quite familiar with the adage "Knowledge is Power"; well, there isn't any knowledge without accessible Information, and there isn't any accessible Information without accessible Data. The Web has always been grounded in accessibility to data (albeit via compound container documents called Web Pages).
Bottom line, RDF based Linked Data is about Open Data access by reference using URIs (HTTP based Entity IDs / Data Object IDs / Data Source Names), and as I said earlier, the intrinsic value is pretty obvious bearing in mind the costs associated with integrating disparate and heterogeneous data sources -- across intranets, extranets, and the Internet.
CrunchBase: In his definition of Web 3.0, Nova Spivack proposes that the Semantic Web, or Semantic Web technologies, will be the force behind much of the innovation that will occur during Web 3.0. Do you agree with Nova Spivack? What role, if any, do you feel the Semantic Web will play in Web 3.0?
Me: I agree with Nova. But I see Web 3.0 as a phase within the Semantic Web innovation continuum. Web 3.0 exists because Web 2.0 exists. Both of these Web versions express usage and technology focus patterns. Web 2.0 is about the use of Open Source technologies to fashion Web Services that are ultimately used to drive proprietary Software as Service (SaaS) style solutions. Web 3.0 is about the use of "Smart Data Access" to fashion a new generation of Linked Data aware Web Services and solutions that exploit the federated nature of the Web to maximum effect; proprietary branding will simply be conveyed via quality of data (cleanliness, context fidelity, and comprehension of privacy) exposed by URIs.
Here are some examples of the CrunchBase Linked Data Space, as projected via our CrunchBase Sponger Cartridge:
Of course, I also believe that Linked Data serves Web Data Integration across the Internet very well too, and that it will benefit businesses in a big way. No individual or organization is an island; I think the Internet and Web have done a good job of demonstrating that thus far :-) We're all data nodes in a Giant Global Graph.
Daniel Lewis did shed light on the read-write aspects of the Linked Data Web, which is actually very close to the callout for a Wikipedia for Data. TimBL has been working on this via Tabulator (see the Tabulator Editing Screencast), Benjamin Nowack also added similar functionality to ARC, and of course we support the same SPARQL UPDATE into an RDF information resource via the RDF Sink feature of our WebDAV and ODS-Briefcase implementations.
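For a feel of the read-write idea, here's a minimal SPARQL UPDATE sketch using rdflib against a local graph; in the hosted scenarios above, the same INSERT would travel over the SPARQL protocol to a server-side information resource instead:

```python
# A sketch of the read-write idea, assuming rdflib's SPARQL Update support:
# write new triples into a graph with INSERT DATA, then read them back.
# The person URI below is illustrative.
from rdflib import Graph

g = Graph()
g.update("""
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    INSERT DATA {
        <http://example.org/people/daniel#this> foaf:name "Daniel Lewis" .
    }
""")

for s, p, o in g:
    print(s, p, o)
```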
ReadWriteWeb, via Alex Iskold's post, has delivered another iteration of their "Guide to Semantic Technologies".
If you look at the title of this post (and their article) they seem to be accurately providing a guide to Semantic Technologies, so no qualms there. If, on the other hand, this is supposed to be a guide to the "Semantic Web" as prescribed by TimBL, then they are completely missing the essence of the whole subject -- and demonstrably so, I may add, since the entities "ReadWriteWeb" and "Alex Iskold" are only describable today via the attributes of the documents they publish, i.e. their respective blogs and hosted blog posts.
Preoccupation with Literal objects as described above implies we can only take what "ReadWriteWeb" and "Alex Iskold" say "Literally" (grep, regex, and XPath/XQuery are the only tools for searching deeper in this Literal realm). We have no sense of what makes them tick or where they come from, no history (bar "About Page" blurb), and no data connections beyond anchored text (more pointers to opaque data sources) in posts and blogrolls. The only connection between this post and them is my deliberate use of the same literal text in the Title of this post.
TimBL's vision as espoused via the "Semantic Web" vision is about the production, consumption, and sharing of Data Objects via HTTP based Identifiers called URIs/IRIs (Hyperdata Links / Linked Data). It's how we use the Web as a Distributed Database where (as Jim Hendler once stated with immense clarity): I can point to records (entity instances) in your database (aka Data Space) from mine. Which is to say that if we can all point to data entities/objects (not just data entities of type "Document") using these Location, Value, and Structure independent Object Identifiers (courtesy of HTTP) we end up with a much more powerful Web, and one that is closer to the "Federated and Open" nature of the Web.
As I stated in a prior post, if you or your platform of choice aren't producing de-referencable URIs for your data objects, you may be Semantic (this data model predates the Web), but there is no "World Wide Web" in what you are doing.
I am Kingsley Idehen, a Person who authors this weblog. I also share bookmarks gathered over the years across an array of subjects via my bookmark data space. I also subscribe to a number of RSS/Atom/RDF feeds, which I share via my feeds subscription data space. Of course, all of these data sources have Tags which are collectively exposed via my weblog tag-cloud, feeds subscriptions tag-cloud, and bookmarks tag-cloud data spaces.
As I don't like repeating myself, and I hate wasting my time or the time of others, I simply share my Data Space (a collection of all of my purpose specific data spaces) via the Web so that others (friends, family, employees, partners, customers, project collaborators, competitors, co-opetitors etc.) can intentionally or serendipitously discover relevant data en route to creating new information (perspectives) that is hopefully exposed to others via the Web.
Bottom-line, the Semantic Web is about adding the missing "Open Data Access & Connectivity" feature to the current Document Web (we have to go beyond regex, grep, XPath, XQuery, full text search, and other literal scraping approaches). The Linked Data Web of de-referencable data object URIs is the critical foundation layer that makes this feasible.
Remember, it's not about "Applications"; it's about Data, and actually freeing Data from the "tyranny of Applications". Unfortunately, applications inadvertently always create silos (esp. on the Web) since entity data modeling, open data access, and other database technology realm matters remain of secondary interest to many application developers.
Final comment: RDF facilitates Linked Data on the Web, but not all RDF is endowed with de-referencable URIs (a major source of confusion and misunderstanding). Thus, you can have RDF Data Source Providers that simply project RDF data silos via Web Services APIs, if the RDF output emanating from a Web Service doesn't provide out-bound pathways to other data via de-referencable URIs. Of course, the same also applies to Widgets that present you with all the things they've discovered without exposing de-referencable URIs for each item.
BTW - my final comments above aren't in any way incongruent with devising successful business models for the Web. As you may or may not know, OpenLink is not only a major platform provider for the Semantic Web (expressed in our UDA, Virtuoso, OpenLink Data Spaces, and OAT products), we are also actively seeding Semantic Web (tribe: Linked Data, of course) startups. For instance, Zitgist, which now has Mike Bergman as its CEO alongside Frederick Giasson as CTO. Of course, I cannot do Zitgist justice via a footnote in a blog post, so I will expand further in a separate post.
Now that broader understanding of the Semantic Data Web is emerging, I would like to revisit the issue of "Data Spaces".
A Data Space is a place where Data resides. It isn't inherently bound to a specific Data Model (Concept Oriented, Relational, Hierarchical, etc.). Neither is it implicitly an access point to Data, Information, or Knowledge (the perception is purely determined through the experiences of the user agents interacting with the Data Space).
A Web Data Space is a Web accessible Data Space.
Real world example:
Today we increasingly perform one or more of the following tasks as part of our professional and personal interactions on the Web:
John Breslin has a nice animation depicting the creation of Web Data Spaces that drives home the point.
Web Data Space Silos
Unfortunately, what isn't as obvious to many netizens is the fact that each of the activities above results in the creation of data that is put into some context by you, the user. Even worse, you eventually realize that the service providers aren't particularly willing, or capable of, giving you unfettered access to your own data. Of course, this isn't always by design, as the infrastructure behind the service can make this a nightmare from security and/or load balancing perspectives. Irrespective of cause, we end up creating our own "Data Spaces" all over the Web without a coherent mechanism for accessing and meshing these "Data Spaces".
What are Semantic Web Data Spaces?
Data Spaces on the Web that provide granular access to RDF Data.
What's OpenLink Data Spaces (ODS) About?
Short History
In anticipation of the "Web Data Silo" challenge (an issue that we tackled within internal enterprise networks for years) we commenced the development (circa 2001) of a distributed collaborative application suite called OpenLink Data Spaces (ODS). The project was never released to the public, since the problems associated with the deliberate or inadvertent creation of Web Data silos hadn't really materialized (silos only emerged in concrete form after the emergence of the Blogosphere and Web 2.0). In addition, there wasn't a clear standard Query Language for the RDF based Web Data Model (i.e. the SPARQL Query Language didn't exist).
Today, ODS is delivered as a packaged solution (in Open Source and Commercial flavors) that alleviates the pain associated with Data Space Silos that exist on the Web and/or behind corporate firewalls. In either scenario, ODS simply allows you to create Open and Secure Data Spaces (via its suite of applications) that expose data via SQL, RDF, and XML oriented data access and data management technologies. Of course, it also enables you to integrate transparently with existing 3rd party data space generators (Blogs, Wikis, Shared Bookmarks, Discussion, etc. services) by supporting industry standards that cover:
Thus, by installing ODS on your Desktop, Workgroup, Enterprise, or public Web Server, you end up with a very powerful solution for creating Open Data access oriented presence on the "Semantic Data Web" without incurring any of the typically assumed "RDF Tax".
Naturally, ODS is built atop Virtuoso and of course it exploits Virtuoso's feature-set to the max. It's also beginning to exploit functionality offered by the OpenLink Ajax Toolkit (OAT).
Dare:
I have been through the Wikipedia fires a few times. If you recall, I actually triggered the early Web 2.0 Wikipedia article, along the following lines:
As annoying as the experience above was, I didn't find this inconsistent with the spirit of Wikipedia (i.e. open contribution and discourse). I felt, at the time, that a lot of historical data was being left in place for future reference etc.. In addition, the ultimate aim of creating an evolving Web 2.0 document did commence albeit some distance from "modern man" re. accuracy and meaningfulness as of my last read (today).
Even closer to home, I repeated the process above re. the Virtuoso Universal Server. This basically ended up being a live case study on how you handle the Wikipedia NPOV conundrum. Just look at the Virtuoso Universal Server Talk Pages to see how the process evolved (the key was Virtuoso's lineage and its proximity to the very DBMS platform upon which Wikipedia runs, i.e. MySQL).
Bearing in mind the size and magnitude of Microsoft, there should be no reason why Microsoft's "Microsoft Digital Caucus" (legions of Staff, MSDN members, Integrators, and other partners) can't simply go into Wikipedia and participate in the edit and discourse process.
Truth cannot be suppressed! At best, it can only be temporarily delayed :-) Even more so on the Web!
Here is a very important excerpt:
...And then something happened. Visual Basic became popular as a scriptable "automation language". ODBC, being a C-style interface, was not directly consumable from VB. However, some of you clever folks figured out that Microsoft Access supported executing queries against ODBC Datasources, and that Access did support scriptable automation through its Data Access Object (DAO) API. Voila! Now you could write applications against ODBC sources using VB.
However, DAO went through Access's internal "Jet" (Joint Engine Technology) database engine, which defaulted to building local keysets for each result in order to do advanced query processing and cursoring against the remote data. This was fine if you needed that functionality, but significant performance overhead and additional round trips when you didn't.
Enter the Visual Basic team who, responding to customer demand for better performance against ODBC sources, came up with something called Remote Data Objects (RDO). RDO implemented the same DAO programming patterns directly against ODBC, rather than going through Jet. RDO was extremely popular among VB developers, but the fact that we had two different sets of automation objects for accessing ODBC sources caused confusion.
But apparently not enough confusion, because our solution was to introduce "ODBCDirect". Despite its name, ODBCDirect was not a new API; it was just a mode we added to DAO that set defaults in such a way as to avoid the overhead of building keysets and such
...
To this very day (unfortunately!) ODBC has been maligned by the perpetuated misunderstanding of JET's DAO layer that sits atop ODBC providing advanced query processing (i.e. Virtual DBMS functionality) alongside a client-side keyset cursor model implementation.
A phase in the evolution of web usage patterns that emphasizes Web Services based interaction between "Web Users" and "Points of Web Presence" over traditional "Web Users" and "Web Sites" based interaction. Basically, a transition from visual site interaction to presence based interaction.
BTW - Dare Obasanjo also commented about Web usage patterns in his post titled The Two Webs, where he concluded that we had a dichotomy along the lines of HTTP-for-APIs (2.0) and HTTP-for-Browsers (1.0), which Jon Udell evolved into HTTP-Services-Web and HTTP-Interactive-Web during our recent podcast conversation.
With definitions in place, I will resume my quest to unveil the aforementioned Web 2.0 Data Access Conundrum:
As you can see from the above, Open Data access isn't genuinely compatible with Web 2.0.
We can also look at the same issue by way of the popular M-V-C (Model View Controller) pattern. Web 2.0 is all about the "V" and "C" with a modicum of "M" at best (data access, open data access, and flexible open data access are completely separate things). The "C" items represent application logic exposed by SOAP or REST style web services etc. I'll return to this later in this post.
What about Social Networking, you must be thinking? Isn't this a Web 2.0 manifestation? Not at all (IMHO). The Web was developed / invented by Tim Berners-Lee to leverage the "Network Effects" potential of the Internet for connecting People and Data. Social Networking, on the other hand, is simply one of several ways by which we construct network connections. I am sure we all accept the fact that connections are built for many other reasons beyond social interaction. That said, we also know that through social interactions we actually develop some of our most valuable relationships (we are social creatures after-all).
The Web 2.0 Open Data Access impedance reality is ultimately going to be the greatest piece of tutorial and usecase material for the Semantic Web. I take this position because it is human nature to seek Freedom (in unadulterated form) which implies the following:
Web 2.0 by definition and use case scenarios is inherently incompatible with the above due to the lack of Flexible and Open Data Access.
If we take the definition of Web 2.0 (above) and rework it with an appreciation of Flexible and Open Data Access, you would arrive at something like this:
A phase in the evolution of the web that emphasizes interaction between "Web Users" and "Web Data", facilitated by Web Services based APIs and an Open & Flexible Data Access Model.
In more succinct form:
A pervasive network of people connected by data or data connected by people.
Returning to M-V-C and looking at the definition above, you now have a complete "M", which is enigmatic in Web 2.0 and the essence of the Semantic Web (Data and Context).
To make all of this possible a palatable Data Model is required. The model of choice is the Graph based RDF Data Model - not to be mistaken for the RDF/XML serialization which is just that, a data serialization that conforms to the aforementioned RDF data model.
The Enterprise Challenge
Web 2.0 cannot and will not make valuable inroads into the enterprise, because enterprises live and die by their ability to exploit data. Weblogs, Wikis, Shared Bookmarking Systems, and other Web 2.0 distributed collaborative application profiles are only valuable if the data is available to the enterprise for meshing (not mashing).
A good example of how enterprises will exploit data by leveraging networks of people and data (social networks in this case) is shown in this nice presentation by Accenture's Institute for High Performance Business titled: Visualizing Organizational Change.
Web 2.0 commentators (for the most part) continue to ponder the use of Web 2.0 within the enterprise while forgetting the congruency between enterprise agility and exploitation of people & data networks (The very issue emphasized in this original Web vision document by Tim Berners-Lee). Even worse, they remain challenged or spooked by the Semantic Web vision because they do not understand that Web 2.0 is fundamentally a Semantic Web precursor due to Open Data Access challenges. Web 2.0 is one of the greatest demonstrations of why we need the Semantic Web at the current time.
Finally, juxtapose the items below and you may even get a clearer view of what I am attempting to convey about the virtues of Open Data Access and the inflective role it plays as we move beyond Web 2.0:
Information Management Proposal - Tim Berners-Lee
Visualizing Organizational Change - Accenture Institute of High Performance Business
I was compelled to go back to the RSS 2.0 imbroglio when I came across Dave Winer's comments re. "the SEC attempting to reinvent RSS 2.0..." in response to Jon Udell's recent XBRL article.
Although I don't believe in complex entry points into complex technology realms, I do subscribe to the approach where developers deal with the complexity associated with a problem domain while hiding said complexity from ambivalent end-users via coherent interfaces -- which does not always imply User Interface.
XBRL is a great piece of work that addresses the complex problem domain of Financial Reporting. The only thing it's missing right now is an Ontology that facilitates RDF Data Model based XBRL Schema and Instance Data, which would ultimately make XBRL data available to RDF query languages such as SPARQL. This line of thought implies, for instance, an XML Schema to OWL Ontology Mapping for Schema Data (as explained in a white paper by the VSIS Group at the University of Hamburg), leaving the Instance Data to be generated in a myriad of ways that includes XML to RDF and/or XML->SQL->RDF.
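A loose sketch of the instance-data direction (XML to RDF), not the XML Schema to OWL mapping from the cited paper; the element names, ontology URI, and figures below are all made up:

```python
# A loose sketch of lifting values from a tiny, made-up XBRL-like instance
# document into RDF triples. Assumes rdflib; vocabulary and data are fictional.
import xml.etree.ElementTree as ET
from rdflib import Graph, URIRef, Literal, Namespace

XBRLO = Namespace("http://example.org/xbrl-ontology#")  # hypothetical ontology

instance = """<report entity="http://example.org/company/acme#this">
                 <NetRevenue>1500000</NetRevenue>
                 <OperatingCost>900000</OperatingCost>
              </report>"""

root = ET.fromstring(instance)
g = Graph()
subject = URIRef(root.attrib["entity"])
for element in root:
    # each reporting concept becomes a property of the reporting entity
    g.add((subject, XBRLO[element.tag], Literal(element.text)))

print(g.serialize(format="turtle"))
```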
As I stated in an earlier post: we should not mistake ambivalence for lack of intelligence. Assuming "Simple" is always right, at all times, is another way of subscribing to this profound misconception. You know, assuming the world was flat (as opposed to a geoid) was quite palatable at some point in the history of mankind; I wonder what would have happened if we held on to that point of view to this day because of its "Simplicity"?
I think we're all sorta jumping around the same bush. It's been a good dance because I've learned some things. First of all, nothing's simple and it isn't getting any simpler. There are no rules any more and as much as I'd like to come up with some kind of all encompassing unified field theory of ethical research method, I know that smarter people than me have already done a better job, and none of it is perfect.
Please allow me to do something kinda strange. I want to look backward for some clues. When I was young, my Dad loved to build things. He was the preeminent do-it-yourselfer. Every weekend, he had a building project, and every Saturday morning he loaded us boys into the station wagon and off we went to the Lowes Hardware Store in Shelby, where he bought the tools and materials he would need for the project.
He did not have a list of criteria for selecting his materials, because every project was different -- the goal was different. If he had selected everything based on the same criteria, then everything he built would have been made with pine shelving, two-penny finishing nails, and all the work would have been done with a Craftsman common nail hammer. Instead, he selected his building materials and tools based on the goal of the project. To do otherwise would have resulted in a product that did not last long, and that would have been unethical.
Years later, I studied under the best teacher I ever had, Mr. Bill Edwards -- my industrial arts teacher. His technique was to help us learn industrial arts skills by helping us to build something of value. I built a kayak. Other students built book shelves, stools, and chess boards. Two friends of mine built a life-size replica of a Gemini Space Capsule. Mr. Edwards taught us to set goals and to make decisions based on those goals.
This was the perfect way to teach industrial arts skills, since we were in the industrial age. If Edwards had taught us in the same way that my information arts teachers were teaching, he would have put a stack of lumber on our desks and asked us to practice driving nails. But he taught us by putting us in the industry. We should be teaching today by putting students in the industry of information. We need to stop teaching science and start teaching students to be scientists. Stop teaching history, but rather teach to be historians. Stop teaching students to be researchers, and instead, teach them to solve problems and accomplish goals using information.
I am certain that there were brands of wood and nails that my father wouldn't buy, because he couldn't depend on them. He swore by Craftsman tools. To build with materials that were unreliable would have been unethical. But his conscious work in finding and selecting materials was based on the goal at hand. All else pointed to that criteria.
It is critical to know and understand the source of the information. But what is it about the source that helps you accomplish your goal? It's important to understand when the information was generated and published. But what is it about "when" that helps you accomplish your goal? It's important to understand what the information is made of, and what it is about its format and how you can use it that helps you accomplish your goal. It's important to understand the information's cultural, economic, environmental, and emotional context, and what it is about the context that helps you accomplish your goal. All aspects remain critical, but it's problem solving and goal achieving that children need to be doing, not just hoop-jumping in their schools. They need to look for the information's value as a tool for ethically accomplishing their goals.
Technorati Tags: librarians, warlick
Note to Tim:
Is the RDF.net domain deal still on? I know it's past 1st Jan 2006, but do bear in mind that the critical issue of a broadly supported RDF Query Language only took significant shape approximately 13 months ago (in the form of SPARQL), and this is all so critical to the challenge you posed in 2003.
RDF.net could become a point of semantic-web-presence through which the benefits of SPARQL compliant Triple|Quad Stores, Shared Ontologies, and SPARQL Protocol are unveiled in their well intended glory :-).
Standards as social contracts: "Looking at Dave Winer's efforts in evangelizing OPML, I try to draw some rough lines into what makes a de-facto standard. De Facto standards are made and seldom happen on their own. In this entry, I look back at the history of HTML, RSS, the open source movement and try to draw some lines as to what makes a standard.
"(Via Tristan Louis.)
I posted a comment to Tristan Louis' post along the following lines:
The analysis is spot on re. the link between de facto standardization and bootstrapping. Likewise, the clear linkage between bootstrapping and connected communities (a variation of the social networking paradigm).
Dave built a community around an XML content syndication and subscription use-case demo that we know today as the blogosphere. Superficially, one may conclude that the Semantic Web vision has suffered to date from the lack of a similar bootstrap effort. Whereas in reality, we are dealing with "time and context" issues that are critical to the base understanding upon which a "Dave Winer" style bootstrap for the Semantic Web would occur.
Personally, I see the emergence of Web 2.0 (esp. the mashups phenomenon) as the "time and context" seeds from which the Semantic Web bootstrap will sprout. I see shared ontologies such as FOAF and SIOC leading the way (they are the RSS 2.0's of the Semantic Web IMHO).
I would like to make an important clarification re. the GData Protocol and what is popularly dubbed "Adam Bosworth's fingerprints": I do not believe in one solution (a simple one for the sake of simplicity) to a deceptively complex problem. Virtuoso supports Atom 1.0 (syndication only at the current time) and Atom 0.3 (syndication and publication, which have been in place for years).
"In my fourth Friday podcast we hear from Kingsley Idehen, CEO of OpenLink Software. I wrote about OpenLink's universal database and app server, Virtuoso, back in 2002 and 2003. Earlier this month Virtuoso became the first mature SQL/XML hybrid to make the transition to open source. The latest incarnation of the product also adds SPARQL (a semantic web query language) to its repertoire. ..."
(Via Jon's Radio.)
BTW - the GData Protocol and Atom 1.0 publishing support will be delivered in both the Open Source and Commercial Edition updates to Virtuoso next week (very little work due to what's already in place).
I make the clarification above to eliminate the possibility of assuming mutual exclusivity between my perspective/vision and Adam's (Jon also makes this important point when he speaks about our opinions being on either side of a spectrum/continuum). I simply want to broaden the scope of this discussion. I am a profound believer in the Semantic Web / Data Web vision, and I predict that we will be querying Googlebase via SPARQL in the not too distant future (this doesn't mean that netizens will be forced to master SPARQL, absolutely not! But there will be conduit technologies that deal with this matter).
Side note: I actually last spoke with Adam at the NY Hilton in 2000 (the day I unveiled Virtuoso to the public for the first time, in person). We bumped into each other and I told him about Virtuoso (at the time the big emphasis was SQL to XML and the vocabulary we had chosen re. the SQL extension...), and he told me about his departure from Microsoft and the commencement of his new venture (CrossGain, prior to his stint at BEA). What struck me even more was his interest in Linux and Open Source (bearing in mind this was about 3 or so weeks after he departed Microsoft).
If you are encountering Virtuoso for the first time via this post or Jon's, please make time to read the product history article on the Virtuoso Wiki (which is one of many Virtuoso based applications that make up our soon to be released OpenLink DataSpace offering).
That said, I better go listen to the podcast :-)
A powerful next generation server product that implements otherwise distinct server functionality within a single server product. Think of Virtuoso as the server software analog of a dual core processor where each core represents a traditional server functionality realm.
The Virtuoso History page tells the whole story.
90% of the aforementioned functionality has been available in Virtuoso since 2000 with the RDF Triple Store being the only 2006 item.
The Virtuoso build scripts have been successfully tested on Mac OS X (Universal Binary Target), Linux, FreeBSD, and Solaris (AIX, HP-UX, and Tru64 UNIX will follow soon). A Windows Visual Studio project file is also in the works (ETA some time this week).
Simple, there is no value in a product of this magnitude remaining the "best kept secret". That status works well for our competitors, but absolutely works against the legions of new generation developers, systems integrators, and knowledge workers that need to be aware of what is actually achievable today with the right server architecture.
GPL version 2.
Dual licensing.
The Open Source version of Virtuoso includes all of the functionality listed above. While the Virtual Database (distributed heterogeneous join engine) and Replication Engine (across heterogeneous data sources) functionality will only be available in the commercial version.
On SourceForge.
Of course!
Up until this point, the Virtuoso Product Blog has been a covert live demonstration of some aspects of Virtuoso (Content Management). My Personal Blog and the Virtuoso Product Blog are actual Virtuoso instances, and have been so since I started blogging in 2003.
Is there a product Wiki?
Sure! The Virtuoso Product Wiki is also an instance of Virtuoso demonstrating another aspect of the Content Management prowess of Virtuoso.
Yep! Virtuoso Online Documentation is hosted via yet another Virtuoso instance. This particular instance also attempts to demonstrate Free Text search combined with the ability to repurpose well formed content in a myriad of forms (Atom, RSS, RDF, OPML, and OCS).
The Virtuoso Online Tutorial Site has operated as a live demonstration and tutorial portal for a number of years. During the same timeframe (circa 2001) we also assembled a few Screencast style demos (their look and feel certainly shows their age; updates are in the works).
BTW - We have also updated the Virtuoso FAQ and also released a number of missing Virtuoso White Papers (amongst many long overdue action items).
Quick Example using my blog:
Digest the rest of Dare's post:
Clone the Google APIs: Kill That Noise: "
Yesterday, in a post about cloning the Google API, Dave Winer wrote:
"Let's make the Google API an open standard. Back in 2002, Google took a bold first step to enable open architecture search engines, by creating an API that allowed developers to build applications on top of their search engine. However, there were severe limits on the capacity of these applications. So we got a good demo of what might be, now three years later, it's time for the real thing."

and earlier that:

"If you didn't get a chance to hear yesterday's podcast, it recommends that Microsoft clone the Google API for search, without the keys, and without the limits. When a developer's application generates a lot of traffic, buy him a plane ticket and dinner, and ask how you both can make some money off their excellent booming application of search. This is something Google can't do, because search is their cash cow. That's why Microsoft should do it. And so should Yahoo. Also, there's no doubt Google will be competing with Apple soon, so they should be also thinking about ways to devalue Google's advantage."

This doesn't seem like a great idea to me for a wide variety of reasons but first, let's start with a history lesson before I tackle this specific issue.
A Trip Down Memory Lane
This history lesson used to be in a post entitled The Tragedy of the API by Evan Williams but seems to be gone now. Anyway, back in the early days of blogging the folks at Pyra [which eventually got bought by Google] created the Blogger API for their service. Since Blogspot/Blogger was a popular service, the number of applications that used the API quickly grew. At this point Dave Winer decided that since the Blogger API was so popular he should implement it in his weblogging tools, but then he decided that he didn't like some aspects of it such as application keys (sound familiar?) and did without them in his version of the API. Dave Winer's version of the Blogger API became the MetaWeblog API. These APIs became de facto standards and a number of other weblogging applications implemented them.

After a while, the folks at Pyra decided that their API needed to evolve due to various flaws in its design. As Diego Doval put it in his post a review of blogging APIs, "The Blogger API is a joke, and a bad one at that." This led to the creation of the Blogger API 2.0. At this point a heated debate erupted online where Dave Winer berated the Blogger folks for deviating from an industry standard. The irony of flaming a company for coming up with a v2 of their own API seemed to be lost on many of the people who participated in the debate. Eventually the Blogger API 2.0 went nowhere.
Today the blogging API world consists of a few de facto standards based on a hacky API created by a startup a few years ago, a number of site-specific APIs (LiveJournal API, MovableType API, etc), and a number of inconsistently implemented versions of the Atom API.
On Cloning the Google Search API
To me the most salient point in the hijacking of the Blogger API from Pyra is that it didn't change the popularity of their service or even make Radio Userland (Dave Winer's product) catch up to them in popularity. This is important to note since this is Dave Winer's key argument for Microsoft cloning the Google API.

Off the top of my head, here are my top three technical reasons for Microsoft to ignore the calls to clone the Google Search APIs:
- Difference in Feature Set: The features exposed by the API do not run the entire gamut of features that other search engines may want to expose. Thus even if you implement something that looks a lot like the Google API, you'd have to extend it to add the functionality that it doesn't provide. For example, compare the features provided by the Google API to the features provided by the Yahoo! search API. I can count about half a dozen features in the Yahoo! API that aren't in the Google API.
- Difference in Technology Choice: The Google API uses SOAP. This to me is a phenomenally bad technical decision because it raises the bar to performing a basic operation (data retrieval) by using a complex technology. I much prefer Yahoo!'s approach of providing a RESTful API and Windows Live Search's approach of providing RSS search feeds and a SOAP API for the folks who need such overkill. (See the sketch after this list.)
- Unreasonable Demands: A number of Dave Winer's demands seem contradictory. He asks companies to not require application keys but then advises them to contact application developers who've built high traffic applications about revenue sharing. Exactly how are these applications to be identified without some sort of application ID? As for removing the limits on the services? I guess Dave is ignoring the fact that providing services costs money, which I seem to remember is why he sold weblogs.com to Verisign for a few million dollars. I do agree that some of the limits on existing search APIs aren't terribly useful. The Google API limit of 1000 queries a day seems to guarantee that you won't be able to power a popular application with the service.
- Lack of Innovation: Copying Google sucks.
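As an aside (added here for illustration; not part of Dare's post), the REST-versus-SOAP point above boils down to the difference sketched below. The endpoint, operation, and parameter names are hypothetical placeholders, not any real search API:

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# REST-style retrieval: one HTTP GET with query parameters, nothing else needed.
# NOTE: example.org/search and its parameters are hypothetical placeholders.
def rest_search(query, count=10):
    url = "https://example.org/search?" + urlencode({"q": query, "count": count})
    with urlopen(url) as response:
        return response.read()  # e.g. an RSS or JSON payload, ready to parse

# SOAP-style retrieval: the same operation wrapped in an XML envelope that must be
# POSTed with special headers, and is usually driven through generated client stubs.
SOAP_REQUEST = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <doSearch xmlns="urn:example:search"><!-- hypothetical operation -->
      <q>linked data</q>
      <count>10</count>
    </doSearch>
  </soap:Body>
</soap:Envelope>"""
```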
Large-scale computer applications require rapid access to large amounts of data. A computerized checkout system in a supermarket must track the entire product line of the market. Airline reservation systems are used at many locations simultaneously to place passengers on numerous flights on different dates. Library computers store millions of entries and access citations from hundreds of publications. Transaction processing systems in banks and brokerage houses keep the accounts that generate international flows of capital. World Wide Web search engines scan thousands of Web pages to produce quantitative responses to queries almost instantly. Thousands of small businesses and organizations use databases to track everything from inventory and personnel to DNA sequences and pottery shards from archaeological digs.
Thus, databases not only represent significant infrastructure for computer applications, but they also process the transactions and exchanges that drive the U.S. economy.
My only addition to the excerpt above is that the impact of databases extends beyond the U.S. economy. We are talking about the global economy. And this will be so for all of time!
I came across this page while enriching the links in one of my earlier "history" related posts about Relational Database Technology pioneers. During this effort I also stumbled across another historic document titled: "1995 SQL Reunion".
Anyway, Marc's article is a very refreshing read because it provides a really good insight into the general landscape of a rapidly evolving Web alongside genuine appreciation of our broader timeless pursuit of "Openness".
To really help this document provide additional value, I have scraped the content of the original post and dumped it below so that we can appreciate the value of the links embedded within the article (note: thanks to Virtuoso I only had to paste the content into my blog; the extraction to my Linkblog and Blog Summary Pages is simply a feature of my Virtuoso based Blog Engine):
Breaking the Web Wide Open! (complete story)
Even the web giants like AOL, Google, MSN, and Yahoo need to observe these open standards, or they'll risk becoming the "walled gardens" of the new web and be coolio no more.
Editorial Note: Several months ago, AlwaysOn got a personal invitation from Yahoo founder Jerry Yang "to see and give us feedback on our new social media product, y!360." We were happy to oblige and dutifully showed up, joining a conference room full of hard-core bloggers and new, new media types. The geeks gave Yahoo 360 an overwhelming thumbs down, with comments like, "So the only services I can use within this new network are Yahoo services? What if I don't use Yahoo IM?" In essence, the Yahoo team was booed for being "closed web," and we heartily agreed. With Yahoo 360, Yahoo continues building its own "walled garden" to control its 135 million customers -- an accusation also hurled at AOL in the early 1990s, before AOL migrated its private network service onto the web. As the Economist recently noted, "Yahoo, in short, has old media plans for the new-media era."
The irony to our view here is, of course, that today's AO Network is also a "closed web." In the end, Mr. Yang's thoughtful invitation and our ensuing disappointment in his new service led to the assignment of this article. It also confirmed our existing plan to completely revamp the AO Network around open standards. To tie it all together, we recruited the chief architect of our new site, the notorious Marc Canter, to pen this piece. We look forward to our reader feedback.
Breaking the Web Wide Open!
By Marc Canter
For decades, "walled gardens" of proprietary standards and content have been the strategy of dominant players in mainframe computer software, wireless telecommunications services, and the World Wide WebÂit was their successful lock-in strategy of keeping their customers theirs. But like it or not, those walls are tumbling down. Open web standards are being adopted so widely, with such value and impact, that the web giantsÂAmazon, AOL, eBay, Google, Microsoft, and YahooÂare facing the difficult decision of opening up to what they don't control.
The online world is evolving into a new open web (sometimes called the Web 2.0), which is all about being personalized and customized for each user. Not only open source software, but open standards are becoming an essential component.
Many of the web giants have been using open source software for years. Most of them use at least parts of the LAMP (Linux, Apache, MySQL, Perl/Python/PHP) stack, even if they aren't well-known for giving back to the open source community. For these incumbents that grew big on proprietary web services, the methods, practices, and applications of open source software development are difficult to fully adopt. And the next open source movements -- which will be as much about open standards as about code -- will be a lot harder for the incumbents to exploit.
While the incumbents use cheap open source software to run their back-end systems, their business models largely depend on proprietary software and algorithms. But in our view, a new slew of open software, open protocols, and open standards will confront the incumbents with the classic Innovator's Dilemma. Should they adopt these tools and standards, painfully cannibalizing their existing revenue for a new unproven concept, or should they stick with their currently lucrative model with the risk that eventually a bunch of upstarts eat their lunch?
Credit should go to several of the web giants who have been making efforts to "open up." Google, Yahoo, eBay, and Amazon all have Open APIs (Application Programming Interfaces) built into their data and systems. Any software developer can access and use them for whatever creative purposes they wish. This means that the API provider becomes an open platform for everyone to use and build on top of. This notion has expanded like wildfire throughout the blogosphere, so nowadays, Open APIs are pretty much required.
Other incumbents also have open strategies. AOL has got the RSS religion, providing a feedreader and RSS search in order to escape the "walled garden of content" stigma. Apple now incorporates podcasts, the "personal radio shows" that are the latest rage in audio narrowcasting, into iTunes. Even Microsoft is supporting open standards, for example by endorsing SIP (Session Initiation Protocol) for internet telephony and conferencing over Skype's proprietary format or one of its own devising.
But new open standards and protocols are in use, under construction, or being proposed every day, pushing the envelope of where we are right now. Many of these standards are coming from startup companies and small groups of developers, not from the giants. Together with the Open APIs, those new standards will contribute to a new, open infrastructure. Tens of thousands of developers will use and improve this open infrastructure to create new kinds of web-based applications and services, to offer web users a highly personalized online experience.
A Brief History of Openness
At this point, I have to admit that I am not just a passive observer, full-time journalist or "just some blogger" -- but an active evangelist and developer of these standards. It's the vision of "open infrastructure" that's driving my company and the reason why I'm writing this article. This article will give you some of the background on these standards, and what the evolution of the next generation of open standards will look like.
Starting back in the 1980s, establishing a software standard was a key strategy for any software company. My former company, MacroMind (which became Macromedia), achieved this goal early on with Director. As Director evolved into Flash, the world saw that other companies besides Microsoft, Adobe, and Apple could establish true cross-platform, independent media standards.
Then Tim Berners-Lee and Marc Andreessen came along, and changed the rules of the software business and of entrepreneurialism. No matter how entrenched and "standardized" software was, the rug could still get pulled out from under it. Netscape did it to Microsoft, and then Microsoft did it back to Netscape. The web evolved, and lots of standards evolved with it. The leading open source standards (such as the LAMP stack) became widely used alternatives to proprietary closed-source offerings.
Open standards are more than just technology. Open standards mean sharing, empowering, and community support. Someone floats a new idea (or meme) and the community runs with it -- with each person making their own contributions to the standard -- evolving it without a moment's hesitation about "giving away their intellectual property."
One good example of this was Dave Sifry, who built the Technorati blog-tracking technology inspired by the Blogging Ecosystem, a weekend project by young hacker Phil Pearson. Dave liked what he saw and he ran with it -- turning Technorati into what it is today.
Dave Winer has contributed enormously to this area of open standards. He defined and personally created several open standards and protocols -- such as RSS, OPML, and XML-RPC. Dave has also helped build the blogosphere through his enthusiasm and passion.
By 2003, hundreds of programmers were working on creating and establishing new standards for almost everything. The best of these new standards have evolved into compelling web services platforms -- such as del.icio.us, Webjay, or Flickr. Some have even spun off formal standards -- like XSPF (a standard for playlists) or instant messaging standard XMPP (also known as Jabber).
Today's Open APIs are complemented by standardized Schemas -- the structure of the data itself and its associated meta-data. Take for example a podcasting feed. It consists of: a) the radio show itself, b) information on who is on the show, what the show is about and how long the show is (the meta-data), and also c) API calls to retrieve a show (a single feed item) and play it from a specified server.
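To make that breakdown concrete, here is a rough sketch (added for illustration; it is not from the original article and is not tied to any specific Media RSS profile) of a single podcast feed item, with made-up show details and URLs:

```python
import xml.etree.ElementTree as ET

# One feed item: the enclosure points at the show itself (a), the child elements
# carry the descriptive meta-data (b), and retrieving/playing the show (c) is just
# an HTTP GET against the enclosure URL. All names and URLs below are made up.
item = ET.Element("item")
ET.SubElement(item, "title").text = "Episode 12: Open Media Standards"
ET.SubElement(item, "author").text = "host@example.org (Example Host)"   # who is on the show
ET.SubElement(item, "description").text = "A conversation about open media formats."
ET.SubElement(item, "duration").text = "00:42:10"                        # how long the show is
ET.SubElement(item, "enclosure", {
    "url": "http://example.org/shows/episode-12.mp3",  # the radio show itself
    "type": "audio/mpeg",
    "length": "31457280",
})

print(ET.tostring(item, encoding="unicode"))
```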
The combination of Open APIs, standardized schemas for handling meta-data, and an industry which agrees on these standards is breaking the web wide open right now. So what new open standards should the web incumbents -- and you -- be watching? Keep an eye on the following developments:
Identity
Attention
Open Media
Microcontent Publishing
Open Social Networks
Tags
Pinging
Routing
Open Communications
Device Management and Control
1. Identity
Right now, you don't really control your own online identity. At the core of just about every online piece of software is a membership system. Some systems allow you to browse a site anonymously -- but unless you register with the site you can't do things like search for an article, post a comment, buy something, or review it. The problem is that each and every site has its own membership system. So you constantly have to register with new systems, which cannot share data -- even if you'd want them to. By establishing a "single sign-on" standard, disparate sites can allow users to freely move from site to site, and let them control the movement of their personal profile data, as well as any other data they've created.
With Passport, Microsoft unsuccessfully attempted to force its proprietary standard on the industry. Instead, a world is evolving where most people assume that users want to control their own data, whether that data is their profile, their blog posts and photos, or some collection of their past interactions, purchases, and recommendations. As long as users can control their digital identity, any kind of service or interaction can be layered on top of it.
Identity 2.0 is all about users controlling their own profile data and becoming their own agents. This way the users themselves, rather than other intermediaries, will profit from their ID info. Once developers start offering single sign-on to their users, and users have trusted places to store their data -- which respect the limits and provide access controls over that data -- users will be able to access personalized services which will understand and use their personal data.
Identity 2.0 may seem like some geeky, visionary future standard that isn't defined yet, but by putting each user's digital identity at the core of all their online experiences, Identity 2.0 is becoming the cornerstone of the new open web.
The Initiatives:
Right now, Identity 2.0 is under construction through various efforts from Microsoft (the "InfoCard" component built into the Vista operating system and its "Identity Metasystem"), Sxip Identity, Identity Commons, Liberty Alliance, LID (NetMesh's Lightweight ID), and SixApart's OpenID.
More Movers and Shakers:
Identity Commons and Kaliya Hamlin, Sxip Identity and Dick Hardt, the Identity Gang and Doc Searls, Microsoft's Kim Cameron, Craig Burton, Phil Windley, and Brad Fitzpatrick, to name a few.
2. Attention
How many readers know what their online attention is worth? If you don't, Google and Yahoo do -- they make their living off our attention. They know what we're searching for, happily turn it into a keyword, and sell that keyword to advertisers. They make money off our attention. We don't.
Technorati and friends proposed an attention standard, Attention.xml, designed to "help you keep track of what you've read, what you're spending time on, and what you should be paying attention to." AttentionTrust is an effort by Steve Gillmor and Seth Goldstein to standardize on how captured end-user performance, browsing, and interest data are used.
Blogger Peter Caputa gives a good summary of AttentionTrust: "As we use the web, we reveal lots of information about ourselves by what we pay attention to. Imagine if all of that information could be stored in a nice neat little xml file. And when we travel around the web, we can optionally share it with websites or other people. We can make them pay for it, lease it ... we get to decide who has access to it, how long they have access to it, and what we want in return. And they have to tell us what they are going to do with our Attention data."
So when you give your attention to sites that adhere to the AttentionTrust, your attention rights (you own your attention, you can move your attention, you can pay attention and be paid for it, and you can see how your attention is used) are guaranteed. Attention data is crucial to the future of the open web, and Steve and Seth are making sure that no one entity or oligopoly controls it.
Movers and Shakers:
Steve Gillmor, Seth Goldstein, Dave Sifry and the other Attention.xml folks.
3. Open Media
Proprietary media standards -- Flash, Windows Media, and QuickTime, to name a few -- helped liven up the web. But they are proprietary standards that try to keep us locked in, and they weren't created from scratch to handle today's online content. That's why, for many of us, an Open Media standard has been a holy grail. Yahoo's new Media RSS standard brings us one step closer to achieving open media, as do Ogg Vorbis audio codecs, XSPF playlists, or MusicBrainz. And several sites offer digital creators not only a place to store their content, but also to sell it.
Media RSS (being developed by Yahoo with help from the community) extends RSS and combines it with "RSS enclosures" -- adding metadata to any media item -- to create a comprehensive solution for media "narrowcasters." To gain acceptance for Media RSS, Yahoo knows it has to work with the community. As an active member of this community, I can tell you that we'll create Media RSS equivalents for RDF (an alternative subscription format) and Atom (yet another subscription format), so no one will be able to complain that Yahoo is picking sides in format wars.
When Yahoo announced the purchase of Flickr, Yahoo founder Jerry Yang insinuated that Yahoo is acquiring "open DNA" to turn Yahoo into an open standards player. Yahoo is showing what happens when you take a multi-billion dollar company and make openness one of its core values -- so Google, beware, even if Google does have more research fellows and Ph.D.s.
The open media landscape is far and wide, reaching from game machine hacks and mobile phone downloads to PC-driven bookmarklets, players, and editors, and it includes many other standardization efforts. XSPF is an open standard for playlists, and MusicBrainz is an alternative to the proprietary (and originally effectively stolen) database that Gracenote licenses.
Ourmedia.org is a community front-end to Brewster Kahle's Internet Archive. Brewster has promised free bandwidth and free storage forever to any content creators who choose to share their content via the Internet Archive. Ourmedia.org is providing an easy-to-use interface and community to get content in and out of the Internet Archive, giving ourmedia.org users the ability to share their media anywhere they wish, without being locked into a particular service or tool. Ourmedia plans to offer open APIs and an open media registry that interconnects other open media repositories into a DNS-like registry (just like the www domain system), so folks can browse and discover open content across many open media services. Systems like Brightcove and Odeo support the concept of an open registry, and hope to work with digital creators to sell their work to fulfill the financial aspect of the "Long Tail."
More Movers and Shakers:
Creative Commons, the Open Media Network, Jay Dedman, Ryanne Hodson, Michael Verdi, Eli Chapman, Kenyatta Cheese, Doug Kaye, Brad Horowitz, Lucas Gonze, Robert Kaye, Christopher Allen, Brewster Kahle, JD Lasica, and indeed, Marc Canter, among others.
4. Microcontent Publishing
Unstructured content is cheap to create, but hard to search through. Structured content is expensive to create, but easy to search. Microformats resolve the dilemma with simple structures that are cheap to use and easy to search.
The first kind of widely adopted microcontent is blogging. Every post is an encapsulated idea, addressable via a URL called a permalink. You can syndicate or subscribe to this microcontent using RSS or an RSS equivalent, and news or blog aggregators can then display these feeds in a convenient readable fashion. But a blog post is just a block of unstructured text -- not a bad thing, but just a first step for microcontent. When it comes to structured data, such as personal identity profiles, product reviews, or calendar-type event data, RSS was not designed to maintain the integrity of the structures.
Right now, blogging doesn't have the underlying structure necessary for full-fledged microcontent publishing. But that will change. Think of local information services (such as movie listings, event guides, or restaurant reviews) that any college kid can access and use in her weekend programming project to create new services and tools.
Today's blogging tools will evolve into microcontent publishing systems, and will help spread the notion of structured data across the blogosphere. New ways to store, represent and produce microcontent will create new standards, such as Structured Blogging and Microformats. Microformats differ from RSS feeds in that you can't subscribe to them. Instead, Microformats are embedded into webpages and discovered by search engines like Google or Technorati. Microformats are creating common definitions for "What is a review or event? What are the specific fields in the data structure?" They can also specify what we can do with all this information.

OPML (Outline Processor Markup Language) is a hierarchical file format for storing microcontent and structured data. It was developed by Dave Winer of RSS and podcast fame.
Events are one popular type of microcontent. OpenEvents is already working to create shared databases of standardized events, which would get used by a new generation of event portals -- such as Eventful/EVDB, Upcoming.org, and WhizSpark. The idea of OpenEvents is that event-oriented systems and services can work together to establish shared events databases (and associated APIs) that any developer could then use to create and offer their own new service or application. OpenReviews is still in the conceptual stage, but it would make it possible to provide open alternatives to closed systems like Epinions, and establish a shared database of local and global reviews. Its shared open servers would be filled with all sorts of reviews for anyone to access.
Why is this important? Because I predict that in the future, 10 times more people will be writing reviews than maintaining their own blog. The list of possible microcontent standards goes on: OpenJobpostings, OpenRecipes, and even OpenLists. Microsoft recently revealed that it has been working on an important new kind of microcontent: Lists -- so OpenLists will attempt to establish standards for the kind of lists we all use, such as lists of Links, lists of To Do Items, lists of People, Wish Lists, etc.
Movers and Shakers:
Tantek Çelik and Kevin Marks of Technorati, Danny Ayers, Eric Meyer, Matt Mullenweg, Rohit Khare, Adam Rifkin, Arnaud Leene, Seb Paquet, Alf Eaton, Phil Pearson, Joe Reger, Bob Wyman among others.
5. Open Social Networks
I'll never forget the first time I met Jonathan Abrams, the founder of Friendster. He was arrogant and brash and he claimed he "owned" all his users, and that he was going to monetize them and make a fortune off them. This attitude robbed Friendster of its momentum, letting MySpace, Facebook, and other social networks take Friendster's place.
Jonathan's notion of social networks as a way to control users is typical of the Web 1.0 business model and its attitude towards users in general. Social networks have become one of the battlegrounds between old and new ways of thinking. Open standards for Social Networking will define those sides very clearly. Since meeting Jonathan, I have been working towards finding and establishing open standards for social networks. Instead of closed, centralized social networks with 10 million people in them, the goal is making it possible to have 10 million social networks that each have 10 people in them.
FOAF (which stands for Friend Of A Friend, and describes people and relationships in a way that computers can parse) is a schema to represent not only your personal profile's meta-data, but your social network as well. Thousands of researchers use the FOAF schema in their "Semantic Web" projects to connect people in all sorts of new ways. XFN is a microformat standard for representing your social network, while vCard (long familiar to users of contact manager programs like Outlook) is a microformat that contains your profile information. Microformats are baked into any xHTML webpage, which means that any blog, social network page, or any webpage in general can "contain" your social network in it -- and be used by any compatible tool, service or application.
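To see what a FOAF description looks like in practice, here is a minimal sketch (added for illustration; not part of the original article) that builds a tiny profile-plus-social-network graph with the Python rdflib library. The people and URIs are invented placeholders:

```python
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import FOAF, RDF

g = Graph()
alice = URIRef("http://example.org/people/alice#me")   # placeholder identifiers
bob = URIRef("http://example.org/people/bob#me")

# Profile meta-data plus a social link, all expressed in the FOAF vocabulary.
g.add((alice, RDF.type, FOAF.Person))
g.add((alice, FOAF.name, Literal("Alice Example")))
g.add((alice, FOAF.homepage, URIRef("http://alice.example.org/")))
g.add((alice, FOAF.knows, bob))
g.add((bob, RDF.type, FOAF.Person))
g.add((bob, FOAF.name, Literal("Bob Example")))

# Serialize as RDF/XML -- the form most early FOAF crawlers expected.
print(g.serialize(format="xml"))
```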
PeopleAggregator is an earlier project now being integrated into open content management framework Drupal. The PeopleAggregator APIs will make it possible to establish relationships, send messages, create or join groups, and post between different social networks. (Sneak preview: this technology will be available in the upcoming GoingOn Network.)
All of these open social networking standards mean that inter-connected social networks will form a mesh that will parallel the blogosphere. This vibrant, distributed, decentralized world will be driven by open standards: personalized online experiences are what the new open web will be all about -- and what could be more personalized than people's networks?
Movers and Shakers:
Eric Sigler, Joel De Gan, Chris Schmidt, Julian Bond, Paul Martino, Mary Hodder, Drummond Reed, Dan Brickley, Randy Farmer, and Kaliya Hamlin, to name a few.
6. Tags
Nowadays, no self-respecting tool or service can ship without tags. Tags are keywords or phrases attached to photos, blog posts, URLs, or even video clips. These user- and creator-generated tags are an open alternative to what used to be the domain of librarians and information scientists: categorizing information and content using taxonomies. Tags are instead creating "folksonomies."
The recently proposed OpenTags concept would be an open, community-owned version of the popular Technorati Tags service. It would aggregate the usage of tags across a wide range of services, sites, and content tools. In addition to Technorati's current tag features, OpenTags would let groups of people share their tags in "TagClouds." Open tagging is likely to include some of the open identity features discussed above, to create a tag system that is resilient to spam, and yet trustable across sites all over the web.
OpenTags owes a debt to earlier versions of shared tagging systems, which include Topic Exchange and something called the k-collector -- a knowledge management tag aggregator -- from Italian company eVectors.
Movers & Shakers:
Phil Pearson, Matt Mower, Paolo Valdemarin, and Mary Hodder and Drummond Reed again, among others.
7. Pinging
Websites used to be mostly static. Search engines that crawled (or "spidered") them every so often did a good enough job to show reasonably current versions of your cousin's homepage or even Time magazine's weekly headlines. But when blogging took off, it became hard for search engines to keep up. (Google has only just managed to offer blog-search functionality, despite buying Blogger back in early 2003.)
To know what was new in the blogosphere, users couldn't depend on services that spidered webpages once in a while. The solution: a way for blogs themselves to automatically notify blog-tracking sites that they'd been updated. Weblogs.com was the first blog "ping service": it displayed the name of a blog whenever that blog was updated. Pinging sites helped the blogosphere grow, and more tools, services, and portals started using pinging in new and different ways. Dozens of pinging services and sites -- most of which can't talk to each other -- sprang up.
Matt Mullenweg (the creator of open source blogging software WordPress) decided that a one-stop service for pinging was needed. He created Ping-o-Matic -- which aggregates ping services and simplifies the pinging process for bloggers and tool developers. With Ping-o-Matic, any developer can alert all of the industry's blogging tools and tracking sites at once. This new kind of open standard, with shared infrastructure, is critical to the scalability of Web 2.0 services.
As Matt said: "There are a number of services designed specifically for tracking and connecting blogs. However it would be expensive for all the services to crawl all the blogs in the world all the time. By sending a small ping to each service you let them know you've updated so they can come check you out. They get the freshest data possible, you don't get a thousand robots spidering your site all the time. Everybody wins."
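For a sense of how small such a ping actually is, here is a minimal sketch (added for illustration; not from the original article) using Python's standard XML-RPC client. weblogUpdates.ping is the de facto method name these services accept; the endpoint URL is an assumption and the blog details are placeholders:

```python
import xmlrpc.client

# A blog "ping" is a single XML-RPC call that says "this blog just updated".
PING_ENDPOINT = "http://rpc.pingomatic.com/"   # assumed aggregator endpoint

def ping(blog_name, blog_url):
    server = xmlrpc.client.ServerProxy(PING_ENDPOINT)
    # weblogUpdates.ping(name, url) is the convention popularized by Weblogs.com.
    return server.weblogUpdates.ping(blog_name, blog_url)

if __name__ == "__main__":
    # Placeholder blog details; a real tool would fire this right after publishing.
    print(ping("Example Blog", "http://blog.example.org/"))
```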
Movers and Shakers:
Matt Mullenweg, Jim Winstead, Dave Winer
8. Routing
Bloggers used to have to manually enter the links and content snippets of blog posts or news items they wanted to blog. Today, some RSS aggregators can send a specified post directly into an associated blogging tool: as bloggers browse through the feeds they subscribe to, they can easily specify and send any post they wish to "reblog" from their news aggregator or feed reader into their blogging tool. (This is usually referred to as "BlogThis.") As structured blogging comes into its own (see the section on Microcontent Publishing), it will be increasingly important to maintain the structural integrity of these pieces of microcontent when reblogging them.
The promising RedirectThis standard will provide a "BlogThis"-like capability while maintaining the integrity of the microcontent. RedirectThis will let bloggers and content developers attach a simple "PostThis" button to their posts. Clicking on that button will send that post to the reader/blogger's favorite blogging tool. This favorite tool is specified at the RedirectThis web service, where users register their blogging tool of choice. RedirectThis also helps maintain the integrity and structure of microcontent -- then it's just up to the user to prefer a blogging tool that also attains that lofty goal of microcontent integrity.
OutputThis is another nascent web services standard, to let bloggers specify what "destinations" they'd like to have as options in their blogging tool. As new destinations are added to the service, more checkboxes would get added to their blogging tool -- allowing them to route their published microcontent to additional destinations.
Movers and Shakers:
Michael Migurski, Lucas Gonze
9. Open Communications
Likely, you've experienced the joys of finding friends on AIM or Yahoo Messenger, or the convenience of Skyping with someone overseas. Not that you're about to throw away your mobile phone or BlackBerry, but for many, also having access to Instant Messaging (IM) and Voice over IP (VoIP) is crucial.
IM and VoIP are mainstream technologies that already enjoy the benefits of open standards. Entire industries are born -- right this second -- based around these open standards. Jabber has been an open IM technology for years -- in fact, as XMPP, it was officially dubbed a standard by the IETF. Although becoming an official IETF standard is usually the kiss of death, Jabber looks like it'll be around for a while, as entire generations of collaborative, work-group applications and services have been built on top of its messaging protocol. For VoIP, Skype is clearly the leading standard today -- though one could argue just how "open" it is (and defenders of the IETF's SIP standard often do). But it is free and user-friendly, so there won't be much argument from users about it being insufficiently open. Yet there may be a cloud on Skype's horizon: web behemoth Google recently released a beta of Google Talk, an IM client committed to open standards. It currently supports XMPP, and will support SIP for VoIP calls.
Movers and Shakers:
Jeremie Miller, Henning Schulzrinne, Jon Peterson, Jeff Pulver
10. Device Management and Control
To access online content, we're using more and more devices. BlackBerrys, iPods, Treos, you name it. As the web evolves, more and more different devices will have to communicate with each other to give us the content we want, when and where we want it. No-one wants to be dependent on one vendor anymore -- like, say, Sony -- for their laptop, phone, MP3 player, PDA, and digital camera, so that it all works together. We need fully interoperable devices, and the standards to make that work. And to fully make use of online content and innovative web services, those standards need to be open.
MIDI (musical instrument digital interface), one of the very first open standards in music, connected disparate vendors' instruments, post-production equipment, and recording devices. But MIDI is limited, and MIDI II has been very slow to arrive. Now a new standard for controlling musical devices has emerged: OSC (Open Sound Control). This protocol is optimized for modern networking technology and inter-connects music, video and controller devices with "other multimedia devices." OSC is used by a wide range of developers, and is being taken up in the mainstream MIDI marketplace.
Another open-standards-based device management technology is ZigBee, for building wireless intelligence and network monitoring into all kinds of devices. ZigBee is supported by many networking, consumer electronics, and mobile device companies.
The Change to Openness
The rise of open source software and its "architecture of participation" are completely shaking up the old proprietary-web-services-and-standards approach. Sun Microsystems -- whose proprietary Java standard helped define the Web 1.0 -- is opening its Solaris OS and has even announced the apparent paradox of an open-source Digital Rights Management system.
Today's incumbents will have to adapt to the new openness of the Web 2.0. If they stick to their proprietary standards, code, and content, they'll become the new walled gardens -- places users visit briefly to retrieve data and content from enclosed data silos, but not where users "live." The incumbents' revenue models will have to change. Instead of "owning" their users, users will know they own themselves, and will expect a return on their valuable identity and attention. Instead of being locked into incompatible media formats, users will expect easy access to digital content across many platforms.
Yesterday's web giants and tomorrow's users will need to find a mutually beneficial new balance -- between open and proprietary, developer and user, hierarchical and horizontal, owned and shared, and compatible and closed.
Marc Canter is an active evangelist and developer of open standards. Early in his career, Marc founded MacroMind, which became Macromedia. These days, he is CEO of Broadband Mechanics, a founding member of the Identity Gang and of ourmedia.org. Broadband Mechanics is currently developing the GoingOn Network (with the AlwaysOn Network), as well as an open platform for social networking called the PeopleAggregator.
A version of the above post appears in the Fall 2005 issue of AlwaysOn's quarterly print blogozine, and ran as a four-part series on the AlwaysOn Network website. (Via Marc's Voice.)
[You don't expect me to work out the CSS right after making it semantic, do you?]
Shift to another universe. It's sometime in the late 1990s. Ramanathan Guha, Tim Bray, Dave Winer, Tantek Çelik, Dan Libby and Dan Connolly are sharing a jacuzzi*. As they sip Margaritas, their conversation goes like this:
So, we've got this idea for publishing content that's a bit like CDF, but we've made the system more of a service than just a desktop thing.
Sounds cool. Might be a good fit with this RDF thing I've been working on.
Hmm, Dan's stuff does sound cool, but with all due respect dude, RDF does seem a bit complicated. I really don't think the folks out in userland would get it. And they majored in graphs.
Maybe we could make it a bit more straightforward, you know, like put pointy brackets around it?
Straightforwardâs good. Better still, simple. They like simple.
But what about the rest of the Web, you know, like HTML?
Hmm, but how do we do the timestamping kind of thing, and wrap it up in a "microposty" way, the things that make this distribution mode work?
Yeah, metadata is cool. Keep the metadata.
Not cheap though. The Web must be cheap. Did Andreessen show you his pictures..?
..."Microposty"? You mean like my newsletter thing, but on the Web?
Yep, like Cool Diary Entry of the Day
But do we really need 1000 pages of spec for that?
...Incidentally, did you see my Box Model Hack?
Yup.
Yup.
Yup.
Yup. I explained that on DaveNet last year.
Hey! I've got it: "MyDigitalCocktail"..?
Hang on, that gives me an idea...
There was a tangible outcome to this conversation: a document format which supports content and unambiguous, explicit data and metadata, timestamping and much, much more. It's viewable in a regular browser. Can be syndicated; can be aggregated. Unlike forgetful RSS, archives are almost always retrievable using regular HTTP methods. In this universe there was no RSS. No syndication wars. No talking-at-cross-purposes conflict between docheads and dataheads, syntax fans and model fans. No-one had to publish simple data in Byzantine RDF/XML. No-one had to deal with doubly-escaped content and silent data loss. There was no need for any new format for business cards, calendars, blogs, link lists, reviews, pet profiles. XHTML with CSS was more than enough. DanL got the MyNetscape he wanted. Tim got the simple, tight format he wanted. Guha got the AI. Tantek got to do presentations in a cool black raincoat. DanC finally got his schedule on his Palm Pilot. Dave got the credit. MarcC got the parasols and a grass skirt none of the others would admit to having brought.
Shift back to this universe. Check out hAtom. It's not finished yet, but David's been methodically working through the (utterly sound) microformats process. Looks good to me.
* apologies for the imagery, but how else do you think Silicon Valley might seem to someone raised in the cowpat-coated hills of Derbyshire?
PS. Apologies to everyone mentioned. And before you suggest it, blogging *is* therapy.
"(Via Raw.)
Microsoft Gadgets, Start.com and Innovation: "
A lot of the comments in the initial post on the Microsoft Gadgets blog are complaints that Microsoft is copying ideas from Apple's dashboard. First of all, people should give credit where it is due and acknowledge that Konfabulator is the real pioneer when it comes to desktop widgets. More importantly, the core ideas in Microsoft Gadgets were pioneered by Microsoft, not Apple or Konfabulator.
From the post A Brief History of Windows Sidebar by Sean Alexander
Microsoft 'Sideshow*' Research Project (2000-2001)
While work started prior, in September 2001, a team of Microsoft researchers published a paper entitled, 'Sideshow: Providing peripheral awareness of important information' including findings of their project.
...
The research paper provides screenshots that bear a striking resemblance to the Windows Sidebar. The paper is a good read for anyone thinking about Gadget development. For folks who have visited Microsoft campuses, you may recall the posters in elevator hallways and Sidebar running on many employees' desktops. Technically one of the first teams to implement this concept. (*Internal code-name, not directly related to the official "Windows SideShow™" auxiliary display feature in Windows Vista.)
Microsoft 'Longhorn' Alpha Release (2003)
In 2003, Microsoft unveiled a new feature called 'Sidebar' at the Microsoft Professional Developer's Conference. This feature took the best concepts from Microsoft Research and applied them to a new platform code-named 'Avalon', now formally known as Windows Presentation Foundation...
Microsoft Windows Vista PDC Release (2005)
While removed from public eye during the Longhorn plan change in 2004, a small team was formed to continue to incubate Windows Sidebar as a concept, dating back to its roots in 2000/2001 as a research exercise. Now Windows Sidebar will be a feature of Windows Vista. Feedback from customers and hardware industry dynamics are being taken into account, particularly adding support for DHTML-based Gadgets to support a broader range of developers and designers, enhanced security infrastructure, and better support for Widescreen (16:10, 16:9) displays. Additionally, a new feature in Windows Sidebar is support for hosting of Web Gadgets, which can be hosted on sites such as Start.com or run locally. Gadgets that run on the Windows desktop will also be available for Windows XP customers -- more details to be shared here in the future.
So the desktop version of 'Microsoft Gadgets' is the shipping version of Microsoft Research's 'Sideshow' project. Since the research paper was published, a number of parties have shipped products inspired by that research, including MSN Dashboard, Google Desktop and Desktop Sidebar, but this doesn't change the fact that Microsoft is the pioneer in this space.
From the post Gadgets and Start.com by Sanaz Ahari
Start.com was initially released in February 2005, on start.com/1 -- since then we've been innovating regularly (start.com/2, start.com/3, start.com and start.com/pdc) working towards accomplishing our goals:
- To bring the web's content to users through:
- Rich DHTML components (Gadgets)
- RSS and behaviors associated with RSS
- High customizability and personalization
- To enable developers to extend their start experience by building their own Gadgets
Yesterday marked a humble yet significant milestone for us -- we opened our 'Atlas' framework enabling developers to extend their start.com experience. You can read more about it here: http://start.com/developer. The key differentiators about our Gadgets are:
- Most web applications were designed as closed systems rather than as a web platform. For example, most customizable 'aggregator' web-sites consume feeds and provide a fair amount of layout customization. However, the systems were not extensible by developers. With start.com, the experience is now an integrated and extensible application platform.
- We will be enriching the gadgets experience even further, enabling these gadgets to seamlessly work on Windows Sidebar
The Start.com stuff is really cool. Currently with traditional portal sites like MyMSN or MyYahoo, I can customize my data sources by subscribing to RSS feeds but not how they look. Instead all my RSS feeds always look like a list of headlines. These portal sites usually use different widgets for displaying richer data like stock quotes or weather reports, but there is no way for me to subscribe to a stock quote or weather report feed and have it look the same as the one provided by the site. Start.com fundamentally changes this model by turning it on its head. I can create a custom RSS feed and specify how it should render in Start.com using JavaScript, which basically makes it a Start.com gadget, no different from the default ones provided by the site.
From my perspective, we're shipping really innovative stuff but because of branding that has attempted to cash in on the 'widgets' hype, we end up looking like followers and copycats.
Marketing sucks.
" Posted for historic annotation purposes (re. Widgets as Microsoft didn't copy Apple here at all; Apple just packaged this better at the expense of Konfabulator as already noted above). And yes, Marketing sucks big time!!]]>Bill Gates: Cell Phones Will Overtake MP3 Players, Calls iPod 'Unsustainable' Microsoft's chairman draws on computing history to make his proclamation that the iPod phenomenon won't...
"Why did Foreman lose to Ali? The fact is Ali beat Foreman because he was tougher and stronger than he's ever given credit for. Ali didn't box Foreman! He went to the ropes and allowed Foreman to hit on him, is that boxing? What if Foreman had knocked him out while he was stationary against the ropes. It would've been said for the rest of time, why did Ali remain stationary letting Foreman get off on him? How come he didn't use the ring and box? Which is exactly what those watching the fight were thinking and saying during rounds two through eight. That's not boxing, that's being forced to fight because your opponent will not allow you to box."
The article discusses most of the key issues, but it should also have included and discussed the following question: "should Microsoft benefit from the mess that we let them create?". By "we" I mean the extensive pool of Microsoft product consumers, developers, and partners etc.
I have worked with Microsoft products (as a developer and user) for more years than I would like to remember; I have personally experienced the journey from Windows 2.0 to Windows XP (and played around with Longhorn).
I added my question to this dialog because, without its resultant perspective, history will simply repeat itself. If IT technology decision makers don't change their product selection and acquisition habits, then why should Microsoft or any other vendor change their ways? Especially when a perpetual promise, under-deliver, re-promise cycle works absolutely fine. This isn't rocket science, it's basic common sense (but we know that common sense ain't that common).
Microsoft, like most software companies, seeks significant portions of its revenue growth from product upgrades. In a sense, this inherently implies that these products will always be millions of miles away from the "silver bullet" promises espoused in the pre-product-release marketing and PR hype. Sadly, there was a time when Marketing and PR hype used to be about new features; a time when there was a clear line between a new feature and a fundamental product bug.
Buying products from any company simply because they have the largest market share is dumb! All it does is encourage other vendors to focus on product market share rather than product quality, which ultimately results in the following:
Microsoft isn't a unique source of this problem, but hey! They are the largest software company (the one with the vital market share), their software products are on some 80-90% of desktops on this planet, and the planet isn't at its most productive at the current time. No matter how you look at it, this loss of productivity has something to do with the increased nuisance of desktop computing.
If Microsoft could just focus on its core competence (BTW - I can't quite pinpoint this anymore since they are in every software market that exists today), it would at least have an iota of a chance in hell of cleaning up this mess.
Great Business Strategy or Dumb Luck

Interesting read here today at ZDNet -- Open Solaris and strategic consequences. Here's a bit of the conclusion:
Since last fall, I've been recommending Bloglines to first-timers as the fastest and easiest introduction to the subscription side of the blogosphere. Remarkably, this same application also meets the needs of some of the most advanced users. I've now added myself to that list. Hats off to Mark Fletcher for putting all the pieces together in such a masterful way.
What goes around comes around. Five years ago, centralized feed aggregators -- my.netscape.com and my.userland.com -- were the only game in town. Fat-client feedreaders only arrived on the scene later. Because of the well-known rich-versus-reach tradeoffs, I never really settled in with one of those. Most of the time I've used the Radio UserLand reader. It is browser-based, and it normally points to localhost, but I've been parking Radio UserLand on a secure server so that I can read the feeds it aggregates for me from anywhere.
Bloglines takes that idea and runs with it. Like the Radio UserLand reader, it supports the all-important (to me) consolidated view of new items. But its two-pane interface also shows me the list of feeds, highlighting those with new entries, so you can switch between a linear scan of all new items and random access to particular feeds. Once you've read an item it vanishes, but you can recall already-read items like so:
If a month's worth of some blog's entries produces too much stuff to easily scan, you can switch that blog to a titles-only view. The titles expand to reveal all the content transmitted in the feed for that item.
I haven't gotten around to organizing my feeds into folders, the way other users of Bloglines do, but I've poked around enough to see that Bloglines, like Zope, handles foldering about as well as you can in a Web UI -- which is to say, well enough. With an intelligent local cache it could be really good; more on that later.
Bloglines does two kinds of data mining that are especially noteworthy. First, it counts and reports the number of Bloglines users subscribed to each blog. In the case of Jonathan Schwartz's weblog, for example, there are (as of this moment) 253 subscribers.
Second, Bloglines is currently managing references to items more effectively than the competition. I was curious, for example, to gauge the reaction to the latest salvo in Schwartz's ongoing campaign to turn up the heat on Red Hat. Bloglines reports 10 References. In this case, the comparable query on Feedster yields a comparable result, but on the whole I'm finding Bloglines' assembly of conversations to be more reliable than Feedster's (which, however, is still marked as 'beta'). Meanwhile Technorati, though it casts a much wider net than either, is currently struggling with conversation assembly.
I love how Bloglines weaves everything together to create a dense web of information. For example, the list of subscribers to the Schwartz blog includes: judell - subscribed since July 23, 2004. Click that link and you'll see my Bloglines subscriptions. Which you can export and then -- if you'd like to see the world through my filter -- turn around and import.
Moving my 265 subscriptions into Bloglines wasn't a complete no-brainer. I imported my Radio UserLand-generated OPML file without any trouble, but catching up on unread items -- that is, marking all of each feed's sometimes lengthy history of items as having been read -- was painful. In theory you can do that by clicking once on the top-level folder containing all the feeds, which generates the consolidated view of unread items. In practice, that kept timing out. I finally had to touch a number of the larger feeds, one after another, in order to get everything caught up. A Catch Up All Feeds feature would solve this problem.
Another feature I'd love to see is Move To Next Unread Item -- wired to a link in the HTML UI, or to a keystroke, or ideally both.
Finally, I'd love it if Bloglines cached everything in a local database, not only for offline reading but also to make the UI more responsive and to accelerate queries that reach back into the archive.
Like Gmail, Bloglines is the kind of Web application that surprises you with what it can do, and makes you crave more. Some argue that to satisfy that craving, you'll need to abandon the browser and switch to RIA (rich Internet application) technology -- Flash, Java, Avalon (someday), whatever. Others are concluding that perhaps the 80/20 solution that the browser is today can become a 90/10 or 95/5 solution tomorrow with some incremental changes.
Dare Obasanjo wondered, over the weekend, "What is Google building?" He wrote:
In the past couple of months Google has hired four people who used to work on Internet Explorer in various capacities [especially its XML support] who then moved to BEA; David Bau, Rod Chavez, Gary Burd and most recently Adam Bosworth. A number of my coworkers used to work with these guys since our team, the Microsoft XML team, was once part of the Internet Explorer team. It's been interesting chatting in the hallways with folks contemplating what Google would want to build that requires folks with a background in building XML data access technologies both on the client side, Internet Explorer and on the server, BEA's WebLogic. [Dare Obasanjo]

It seems pretty clear to me. Web applications such as Gmail and Bloglines are already hard to beat. With a touch of alchemy they just might become unstoppable.
I have little to add to this matter as our understanding and vision is aptly expressed via the architecture and feature set of Virtuoso (this area was actually addressed circa 1999).
We are heading into an era of multi-model databases: single database engines that are capable of effectively serving the requirements of the Hierarchical, Network, Relational, and Object database models. As we get closer to the unravelling of universal storage, hopefully this will get clearer.
Back to Dare's commentary:
C.J. Date, one of the most influential names in the relational database world, had some harsh words about XML's encroachment into the world of relational databases in a recent article entitled Date defends relational model that appeared on SearchDatabases.com. Key parts of the article are excerpted below
Date reserved his harshest criticism for the competition, namely object-oriented and XML-based DBMSs. Calling them "the latest fashions in the computer world," Date said he rejects the argument that relational DBMSs are yesterday's news. Fans of object-oriented database systems "see flaws in the relational model because they don't fully understand it," he said.
Date also said that XML enthusiasts have gone overboard.
"XML was invented to solve the problem of data interchange, but having solved that, they now want to take over the world," he said. "With XML, it's like we forget what we are supposed to be doing, and focus instead on how to do it."
Craig S. Mullins, the director of technology planning at BMC Software and a SearchDatabase.com expert, shares Date's opinion of XML. It can be worthwhile, Mullins said, as long as XML is only used as a method of taking data and putting it into a DBMS. But Mullins cautioned that XML data that is stored in relational DBMSs as whole documents will be useless if the data needs to be queried, and he stressed Date's point that XML is not a real data model.
Craig Mullins' points are more straightforward to answer, since his comments don't jibe with the current state of the art in the XML world. He states that you can't query XML documents stored in databases, but this is untrue. Almost three years ago, I was writing articles about querying XML documents stored in relational databases. Storing XML in a relational database doesn't mean it has to be stored as an opaque binary BLOB or as a big bunch of text that cannot effectively be queried. The next version of SQL Server will have extensive capabilities for querying XML data in a relational database and doing joins across relational and XML data; a lot of this functionality is described in the article on XML Support in SQL Server 2005. As for XML not having a data model, I beg to differ. There is a data model for XML that many applications and people adhere to, often without realizing that they are doing so. This data model is the XPath 1.0 data model, which is being updated to handle typed data as the XQuery and XPath 2.0 data model.
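To make this concrete, here is a minimal sketch (mine, not Dare's) of querying XML that is stored inside a relational table, using only Python's standard sqlite3 and ElementTree modules; the table and element names are invented for illustration:

    # XML stored in a relational table need not be an opaque, unqueryable blob:
    # a SQL query narrows the rows, then XPath queries inside each document.
    import sqlite3
    import xml.etree.ElementTree as ET

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
    conn.execute(
        "INSERT INTO docs (body) VALUES (?)",
        ("<order id='1'><item sku='A1' qty='2'/><item sku='B7' qty='1'/></order>",),
    )

    for (body,) in conn.execute("SELECT body FROM docs"):
        root = ET.fromstring(body)
        for item in root.findall("./item[@qty='2']"):   # ElementTree's XPath subset
            print(item.get("sku"))                      # -> A1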
Now to tackle the meat of C.J. Date's criticism, which is that XML solves the problem of data interchange but is now showing up in the database. The first point I'd like to make is that there are two broad usage patterns of XML: it is used to represent both rigidly structured tabular data (e.g., relational data or serialized objects) and semi-structured data (e.g., office documents). The latter type of data will only grow now that office productivity software like Microsoft Office has enabled users to save their documents as XML instead of proprietary binary formats. In many cases, these documents cannot simply be shredded into relational tables. Sure, you can shred an Excel spreadsheet written in spreadsheetML into relational tables, but is the same really feasible for a Word document written in WordprocessingML? Many enterprises would rather have their important business data stored in and queried from a unified location, instead of the current situation where some data is in document management systems, some hangs around as random files in people's folders, while some sits in a database management system.
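As a rough sketch of that distinction (the markup below is a made-up simplification, not real spreadsheetML or WordprocessingML): tabular XML decomposes cleanly into rows, while mixed-content documents generally do not:

    # Rigidly tabular XML shreds naturally into relational rows.
    import xml.etree.ElementTree as ET

    sheet = """
    <sheet>
      <row><cell>Widget</cell><cell>4</cell><cell>9.99</cell></row>
      <row><cell>Gadget</cell><cell>2</cell><cell>14.50</cell></row>
    </sheet>
    """

    rows = [
        tuple(cell.text for cell in row.findall("cell"))
        for row in ET.fromstring(sheet).findall("row")
    ]
    print(rows)   # [('Widget', '4', '9.99'), ('Gadget', '2', '14.50')]

    # A word-processing document, by contrast, is mixed content (paragraphs, runs,
    # nested formatting) with no obvious row/column decomposition to shred into.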
As for stating that critics of the relational model don't understand it, I disagree. One of the major benefits of using XML in relational databases is that it is a lot easier to deal with fluid schemas or data with sparse entries using XML. When the shape of the data tends to change or is not fixed, the relational model is simply not designed to deal with this. Constantly changing your database schema is simply not feasible, and there is no easy way to provide the extensibility of XML, where one can say "after the X element, any element from any namespace can appear". How would one describe the capacity to store "any data" in a traditional relational database without resorting to an opaque blob?
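As a hedged illustration of the sparse-schema point (table and element names are invented), compare a wide, NULL-heavy relational layout with an XML column that only carries the attributes each row actually has:

    # Relationally, every new per-item attribute means another (mostly NULL) column or
    # an EAV side table; as an XML column, each row carries only what it actually has.
    import sqlite3
    import xml.etree.ElementTree as ET

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, extra_xml TEXT)")
    conn.executemany(
        "INSERT INTO products (name, extra_xml) VALUES (?, ?)",
        [
            ("laptop", "<extra><ram>16GB</ram><screen>14in</screen></extra>"),
            ("novel",  "<extra><author>Achebe</author><pages>209</pages></extra>"),
        ],
    )

    for name, extra in conn.execute("SELECT name, extra_xml FROM products"):
        attrs = {child.tag: child.text for child in ET.fromstring(extra)}
        print(name, attrs)   # each row exposes a different set of attributes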
I do tend to agree that some people are going overboard and trying to model their data hierarchically instead of relationally, which experience has taught us is a bad idea. Recently there was a thread on the XML-DEV mailing list entitled Designing XML to Support Information Evolution in which Roger L. Costello described his travails trying to model, in a hierarchical manner, data that was being transferred as XML. Michael Champion accurately described the process Roger Costello went through as having "rediscovered the relational model". In a response to that thread I wrote "Hierarchical databases failed for a reason".
Using hierarchy as a primary way to model data is bad for at least the following reasons:
- Hierarchies tend to encourage redundancy. Imagine I have a <Customer> element that has one or more <ShippingAddress> elements as children, as well as one or more <Order> elements. Each order was shipped to an address, so if modelled hierarchically each <Order> element also has a <ShippingAddress> element, which leads to a lot of unnecessary duplication of data. In the real world, there are often multiple groups to which a piece of data belongs, which often cannot be modelled with a single hierarchy.
- Data is too tightly coupled. If I delete a <Customer> element, this means I've automatically deleted his entire order history, since all the <Order> elements are children of <Customer>. Similarly, if I query for a <Customer>, I end up getting all the <Order> information as well.

To put it simply, experience has taught the software world that the relational model is a better way to model data than the hierarchical model. Unfortunately, in the rush to embrace XML, many are repeating the mistakes of decades ago in the new millennium.
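A small sketch of the duplication just described, reusing the <Customer>/<Order>/<ShippingAddress> names from the example above; the normalized Python structures are a textbook-style illustration, not a prescription:

    # Hierarchically, every <Order> repeats its <ShippingAddress>; normalized
    # relationally, the address text is stored once and referenced by key.
    hierarchical = """
    <Customer id="c1">
      <ShippingAddress id="a1">12 Main St</ShippingAddress>
      <Order id="o1"><ShippingAddress id="a1">12 Main St</ShippingAddress></Order>
      <Order id="o2"><ShippingAddress id="a1">12 Main St</ShippingAddress></Order>
    </Customer>
    """

    customers = {"c1": "Acme Corp"}                       # customer table
    addresses = {"a1": ("c1", "12 Main St")}              # address table (stored once)
    orders    = {"o1": ("c1", "a1"), "o2": ("c1", "a1")}  # orders reference the address by key

    print(hierarchical.count("12 Main St"))   # 3 -- the same address appears three times
    print(len(addresses))                     # 1 -- normalized, it is stored once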
I wrote this essay before reading Free Culture so I'm saying a lot of stuff that Larry says better...
Several crucial shifts in technology are emerging that will drastically affect the relationship between users and technology in the near future. Wireless Internet is becoming ubiquitous and economically viable. Internet capable devices are becoming smaller and more powerful.
Alongside technological shifts, new social trends are emerging. Users are shifting their attention from packaged content to social information about location, presence and community. Tools for identity, trust, relationship management and navigating social networks are becoming more popular. Mobile communication tools are shifting away from a 1-1 model, allowing for increased many-to-many interactions; such a shift is even being used to permit new forms of democracy and citizen participation in global dialog.
While new technological and social trends are occurring, it is not without resistance, often by the developers and distributors of technology and content. In order to empower the consumer as a community member and producer, communication carriers, hardware manufacturers and content providers must understand and build models that focus less on the content and more on the relationships.
Smaller faster
Computing started out as large mainframe computers, with software developers and companies “time sharing” for slices of computing time on the large machines. The mini-computer was cheaper and smaller, allowing companies and labs to own their own computers. The mini-computer allowed a much greater number of people to have access to computers and even use them in real time. The mini-computer led to a burst in software and networking technologies. In the early '80s, the personal computer increased the number of computers by an order of magnitude and again led to an explosion in new software and technology while lowering the cost even more. Console gaming companies proved once again that unit costs could be decreased significantly by dramatically increasing the number of units sold. Today, we have over a billion cell phones in the market. There are tens of millions of camera phones. The incredible number of these devices has continued to lower the unit cost of computing, as well as of components embedded in these devices, such as small cameras. High-end phones have the computing power of the personal computers of the '80s and the game consoles of the '90s.
History repeats with WiFi
There are parallels in the history of communications and computing. In the 1980’s the technology of packet switched networks became widely deployed. Two standards competed. X.25 was a packet switched network technology being promoted by CCITT (a large, formal international standards body) and the telephone companies. It involved a system run by telephone companies including metered tariffs and multiple bilateral agreements between carriers to hook up.
Concurrently, universities and research labs were promoting TCP/IP and the Internet: loosely organized standards meetings, flat rate tariffs, and little or no formal agreements between the carriers. People just connected to the closest node and everyone agreed to freely carry traffic for others.
There were several “free Internet” services, such as “The Little Garden” in San Francisco. Commercial service providers, particularly telephone company operators such as SprintNet, tried to shut down such free services by threatening not to carry this free traffic.
Eventually, large ISPs began providing high quality Internet connectivity, and finally the telephone companies realized that the Internet was the dominant standard and shut down or acquired the ISPs.
A similar trend is happening in wireless data services. GPRS is currently the dominant technology among mobile telephone carriers. GPRS allows users to transmit packets of data across the carrier network to the Internet. One can roam to other networks as long as the mobile operators have agreements with each other. Just like in the days of X.25, the system requires many bilateral agreements between the carriers; their goal is to track and bill for each packet of information.
Competing with this standard is WiFi. WiFi is just a simple wireless extension to the current Internet, and many hotspots provide people with free access to the Internet in cafes and other public areas. WiFi service providers have emerged, while telephone operators -- such as T-Mobile and Vodafone -- are capitalizing on paid WiFi services. Just as with the Internet, network operators are threatening to shut down free WiFi providers, citing violations of terms of service.
Just as with X.25, the GPRS data network and the future data networks planned by the telephone carriers (e.g. 3G) are crippled with unwieldy standards bodies, bilateral agreements, and inherently complicated and expensive plant operations.
It is clear that the simplicity of WiFi and the Internet is more efficient than the networks planned by the telephone companies. That said, the availability of low cost phones is controlled by mobile telephone carriers, their distribution networks and their subsidies.
Content vs Context
Many of the mobile telephone carriers are hoping that users will purchase branded content manufactured in Hollywood and packaged and distributed by the telephone companies using sophisticated technology to thwart copying.
Broadband in the home will always be cheaper than mobile broadband. Therefore it will be cheaper for people to download content at home and use storage devices to carry it with them rather than downloading or viewing content over a mobile phone network. Most entertainment content is not so time sensitive that it requires real time network access.
The mobile carriers are making the same mistake that many of the network service providers made in the 80s. Consider Delphi, a joint venture between IBM and Sears Roebuck. Delphi assumed that branded content was going to be the main use of its system and designed the architecture of the network to provide users with such content. Conversely, the users ended up using primarily email and communications, and the system failed to provide such services effectively due to the mis-design.
Similarly, it is clear that mobile computing is about communication. Not only are mobile phones being used for 1-1 communications, as expected, through voice conversations; people are also learning new forms of communication because of SMS, email and presence technologies. Often, the value of these communication processes is the transmission of “state” or “context” information; the content of the messages is less important.
Copyright and the Creative Commons
In addition to the constant flow of traffic keeping groups of people in touch with each other, significant changes are emerging in multimedia creation and sharing. The low cost of cameras and the near television-studio-quality capability of personal computers have caused an explosion in the amount and quality of content being created by amateurs. Not only is this content easier to develop; people are also using the power of weblogs and phones to distribute their creations to others.
The network providers and many of the hardware providers are trying to build systems that make it difficult for users to share and manipulate multimedia content. Such regulation drastically stifles the users’ ability to produce, share and communicate. This is particularly surprising given that such activities are considered the primary “killer application” for networks.
It may seem unintuitive to argue that packaged commercial content can co-exist alongside consumer content while concurrently stimulating content creation and sharing. In order to understand how this can work, it is crucial to understand how the current system of copyright is broken and can be fixed.
First of all, copyright in the multimedia digital age is inherently broken. Historically, copyright works because it is difficult to copy or edit works and because only a few people produce new works over a very long period of time. Today, technology allows us to find, sample, edit and share very quickly. The problem is that the current notion of copyright is not capable of addressing the complexity and the speed of what technology enables artists to create. Large copyright holders, notably Hollywood studios, have aggressively extended and strengthened their copyright protections to try to keep the ability to produce and distribute creative works in the realm of large corporations.
Hollywood asserts “all rights reserved” on the works it owns. Sampling music, having a TV show running in the background in a movie scene, or quoting lyrics to a song in a book about the history of music all require payment to and a negotiation with the copyright holder. Even though the Internet makes available a wide palette of wonderful works based on content from all over the world, current copyright practices forbid most such creation.
However, most artists are happy to have their music sampled if they receive attribution. Most writers are happy to be quoted or have their books copied for non-commercial use. Most creators of content realize that all content builds on the past and the ability for people to build on what one has created is a natural and extremely important part of the creative process.
Creative Commons tries to give artists that choice. By providing a more flexible copyright than the standard “all rights reserved” copyright of commercial content providers, Creative Commons allows artists to set a variety of rights to their works. This includes the ability to permit commercial use, copying, and sampling, to require attribution, etc. Such an approach allows artists to decide how their work can be used, while providing people with the materials necessary for increased creation and sharing.
Creative Commons also provides a way to make the copyright of pieces of content machine-readable. This means that a search engine or other content manipulation tool is able to read the copyright. As such, an artist can search for songs, images and text to use while having the information needed to provide the necessary attribution.
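A rough, hypothetical sketch of what "machine-readable" can mean in practice: the snippet below scans a page for a rel="license" link, one common way such license statements are embedded, using only Python's standard library (the page markup is invented; the license URL is a real Creative Commons license URI):

    # Find machine-readable license links (rel="license") in an HTML page.
    from html.parser import HTMLParser

    class LicenseFinder(HTMLParser):
        def __init__(self):
            super().__init__()
            self.licenses = []

        def handle_starttag(self, tag, attrs):
            attrs = dict(attrs)
            if tag == "a" and "license" in (attrs.get("rel") or "").split():
                self.licenses.append(attrs.get("href"))

    page = ('<p>Photo by A. Artist, <a rel="license" '
            'href="http://creativecommons.org/licenses/by/2.0/">some rights reserved</a>.</p>')

    finder = LicenseFinder()
    finder.feed(page)
    print(finder.licenses)   # ['http://creativecommons.org/licenses/by/2.0/']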
Creative Commons can co-exist with the stringent copyright regimes of the Hollywood studios while allowing professional and amateur artists to take more control of how much they want their works to be shared and integrated into the commons. Until copyright law itself is fundamentally changed, Creative Commons will provide an essential alternative to the completely inflexible copyright of commercial content.
Content is not like some lump of gold to be hoarded and owned, diminishing in value each time it is shared. Content is a foundation upon which community and relationships are formed. Content is the foundation for culture. We must evolve beyond the current copyright regime, which was developed in a world where the creation and transmission of content was unwieldy and expensive, reserved for those privileged artists who were funded by commercial enterprises. This will provide the emerging wireless networks and mobile devices with the freedom necessary for them to become the community-building tools of sharing that are their destiny.
Databases get a grip on XML
From InfoWorld.
The next iteration of the SQL standard was supposed to arrive in 2003. But SQL standardization has always been a glacially slow process, so nobody should be surprised that SQL:2003 -- now known as SQL:200n -- isn't ready yet. Even so, 2003 was a year in which XML-oriented data management, one of the areas addressed by the forthcoming standard, showed up on more and more developers' radar screens. >> READ MORE
This article rounds up products for 2003 in the critical area of Enterprise Database Technology. It certainly provides an apt reflection of how Virtuoso compares with offerings from some of the larger (but certainly slower to implement) database vendors in this space. As usual, Jon Udell's quote pretty much sums this up:
"While the spotlight shone on the heavyweight contenders, a couple of agile innovators made noteworthy advances in 2003. OpenLink Software?s Virtuoso 3.0, which we reviewed in March, stole thunder from all three major players. Like Oracle, it offers a WebDAV-accessible XML repository. Like DB2 Information Integrator, it functions as database middleware that can perform federated ?joins? across SQL and XML sources. And like the forthcoming Yukon, it embeds the .Net CLR (Common Language Runtime), or in the case of Linux, Novell/Ximian?s Mono."
Albeit still somewhat unknown to the broader industry, we have remained true to our "innovator" discipline, which remains our chosen path to market leadership. Thus, it's worth a quick recap of Virtuoso's release history and features as we get set to up the ante even further in 2004:
1998 - Virtuoso's initial public beta release with functional emphasis on Virtual Database Engine for ODBC and JDBC Data Sources.
1999 - Virtuoso's official commercial release, with emphasis still on Virtual Database functionality for ODBC, JDBC accessible SQL Databases.
2000 - Virtuoso 2.0 adds XML Storage, XPath, XML Schema, XQuery, XSL-T, WebDAV, SOAP, UDDI, HTTP, Replication, Free Text Indexing (*feature update*), POP3, and NNTP support.
2002 - Virtuoso 2.7 extends Virtualization prowess beyond data access via enhancements to its Web Services protocol stack implementation by enabling SQL Stored Procedures to be published as Web Services. It also debuts its Object-Relational engine enhancements, which include the incorporation of Java and Microsoft .NET Objects into its User Defined Type, User Defined Function, and Stored Procedure offerings.
2003 - Virtuoso 3.0 extends data and application logic virtualization into the Application Server realm (basically a Virtual Application server too!), by adding support for ASP.NET, PHP, Java Server Pages runtime hosting (making applications built using any of these languages deployable using Virtuoso across all supported platforms).
Collectively, these releases have contributed to a very premeditated architecture and vision that will ultimately unveil the inherent power of critical I.S. infrastructure virtualization along the following lines: data storage, data access, and application logic, via coherent integration of SQL, XML, Web Services, and Persistent Stored Modules (.NET, Java, and other object-based component building blocks).
October the 1st is an emotional day for many Nigerians, especially those of us in the Diaspora. Our country remains a paradox as the excerpts below attest:
The more popular view of Nigerians is the one arising from the proliferation of 419 scams (the mangled by-product of misdirected intellectual prowess and the boundless depths of greed -- which applies to perpetrators and victims alike).
The Nigerian SCO Connection "I AM MR. DARL MCBRIDE CURRENTLY SERVING AS THE PRESIDENT AND CHIEF EXECUTIVE OFFICER OF THE SCO GROUP ..." [via Be Blogging]
Funny! But many a truth is told in jest (I think that's how the quote goes); this one is pretty damned poignant.
Unbeknownst to many, there are other views of Nigeria (unfortunately these aren't the norm).
The call for optimism by our president (he doesn't support or condone the 419 nonsense):
President Olusegun Obasanjo urged Nigerians to change their ways and be optimistic about the future...
[via Odili.net -- this site desperately needs RSS!]
There is an increasing pool of key high-tech players of Nigerian descent (and nationality) making a constructive impact on the high-tech industry (making it less lonely for myself and other Nigerians in the high-tech arena):
Dare Obasanjo is a member of Microsoft's WebData team, which among other things develops the components within the System.Xml and System.Data namespace of the .NET Framework, Microsoft XML Core Services (MSXML), and Microsoft Data Access Components (MDAC). More of Dare's writings on XML can be found on his Extreme XML column on MSDN.
Uche Ogbuji is a consultant and co-founder of Fourthought Inc., a consulting firm specializing in XML solutions for enterprise knowledge management applications. Fourthought develops 4Suite, the open source platform for XML middleware. Mr. Ogbuji is a Computer Engineer and writer born in
Website: http://www.fourthought.com/
"Philip Emeagwali, a computer scientist, is one of the fathers of the Internet and a trailblazer in petroleum extraction," as quoted by CNN.
Philip leaves all Nigerians with this important message on this special day (key excerpt below):
"Our investments in education and technology will be our legacy to our children. They are investments that will bring the best out of the next generation of Nigerians and enable us to reach our potential as individuals, as communities, as a nation."
Happy Birthday dear motherland!
In the year 2000, the question of the shape and form of XML data was unclear to many, and reading the article below basically took me back in time to when we released Virtuoso 2.0 (we are now at release 3.0 commercially, with a 3.2 beta dropping any minute).
RSS is a great XML application, and it does a great job of demonstrating how XML -- the new data access foundation layer -- will galvanize the next generation Web (I refer to this as Web 2.0).
RSS: INJAN (It's not just about news)
RSS is not just about news, according to Ian Davis on rss-dev.
He presents a nice list of alternatives, which I reproduce here (and to which I'd add, of course, bibliography management):
- Sitemaps: one of the S's in RSS stands for summary. A sitemap is a summary of the content on a site, the items are pages or content areas. This is clearly a non-chronological ordering of items. Is a hierarchy of RSS sitemaps implied here -- how would the linking between them work? How hard would it be to hack a web browser to pick up the RSS sitemap and display it in a sidebar when you visit the site?
- Small ads: also known as classifieds. These expire so there's some kind of dynamic going on here but the ordering of items isn't necessarily chronological. How to describe the location of the seller, or the condition of the item or even the price. Not every ad is selling something -- perhaps it's to rent out a room.
- Personals: similar model to the small ads. No prices though (I hope). Comes with a ready made vocabulary of terms that could be converted to an RDF schema. Probably should do that just for the hell of it anyway -- gsoh
- Weather reports: how about a week's worth of weather in an RSS channel. If an item is dated in the future, should an aggregator display it before time? Alternate representations include maps of temperature and pressure etc.
- Auctions: again, related to small ads, but these are much more time limited since there is a hard cutoff after which the auction is closed. The sequence of bids could be interesting -- would it make sense to thread them like a discussion so you can see the tactics?
- TV listings: this is definitely chronological but with a twist -- the items have durations. They also have other metadata such as cast lists, classification ratings, widescreen, stereo, program type. Some types have additional information such as director and production year.
- Top ten listings: top ten singles, books, dvds, richest people, ugliest, rear of the year etc. Not chronological, but has definite order. May update from day to day or even more often.
- Sales reporting: imagine if every department of a company reported their sales figures via RSS. Then the divisions aggregate the departmental figures and republish to the regional offices, who aggregate and add value up the chain. The chairman of the company subscribes to one super-aggregate feed.
- Membership lists / buddy lists: could I publish my buddy list from Jabber or other instant messengers? Maybe as an interchange format or perhaps could be used to look for shared contacts. Lots of potential overlap with FOAF here.
- Mailing lists: or in fact any messaging system such as usenet. There are some efforts at doing this already (e.g. yahoogroups) but we need more information � threads; references; headers; links into archives.
- Price lists / inventory: the items here are products or services. No particular ordering but it'd be nice to be able to subscribe to a catalog of products and prices from a company. The aggregator should be able to pick out price rises or bargains given enough history.
Thus, if we can comprehend RSS (the blog article below does a great job), we should be able to see the fundamental challenges facing any organization seeking to exploit the potential of the imminent Web 2.0 inflection: how will you cost-effectively create XML data from existing data sources, without upgrading or switching database engines, operating systems, or programming languages? Put differently, how can you exploit this phenomenon without losing your ever-dwindling technology choices (believe me, choices are dwindling fast, but most are oblivious to this fact)?
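To make the question concrete, here is a minimal, hypothetical sketch of exposing rows that already live in a relational table as an RSS 2.0 feed, without changing the underlying storage engine; the table and field names are invented for illustration:

    # Serve existing relational rows as RSS 2.0 items.
    import sqlite3
    import xml.etree.ElementTree as ET

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE news (title TEXT, link TEXT, pub_date TEXT)")
    conn.execute(
        "INSERT INTO news VALUES (?, ?, ?)",
        ("Q3 results posted", "http://example.com/q3", "Tue, 07 Oct 2003 09:00:00 GMT"),
    )

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Company News"
    ET.SubElement(channel, "link").text = "http://example.com/"
    ET.SubElement(channel, "description").text = "Sales figures and announcements"

    for title, link, pub_date in conn.execute("SELECT title, link, pub_date FROM news"):
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = link
        ET.SubElement(item, "pubDate").text = pub_date

    print(ET.tostring(rss, encoding="unicode"))

The point of the sketch is that the feed is generated from the data where it already lives; nothing about the underlying storage has to change.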
From Wikipedia, the free encyclopedia.
Informix is a relational database and, for almost 20 years, was also the name of the company that developed it. The Informix DBMS was a development of the pioneering Ingres system that also led to Sybase and SQL Server, and was the #2 database system behind Oracle for some time in the 1990s. Its brush with success was surprisingly short-lived, however, and by 2000 a series of management blunders had all but destroyed the company. In 2001 it was purchased by IBM in order to gain access to Informix's existing market share and customer base. Long term plans to merge Informix technology with DB2 are in place, since the Informix Arrowhead project is now called DB2 Arrowhead. IBM is also committed to supporting older versions.
How did the database industry get started? How has it changed the face of business? What were the key milestones, the big obstacles and the lessons learned? I recently came across an interesting panel discussion addressing these very issues, featuring many of the database pioneers and leaders of the last 30 years:
I just read this on Scott Loftesness's blog and thought it was worth sharing. Scott was an EVP at Visa in the early 90's and his blog is an unbelievably comprehensive discussion of the payments space. Here is his discussion of Visa's recent announcement that the Visa system had reached $1 trillion in annual United States transaction volume. It is an amazing growth curve and reminds me of a comment Peter Thiel, the former CEO of PayPal, made to me one day when we were having lunch -- he said that when he was pitching PayPal to VCs he was tempted to describe his market opportunity as the "market for money." Visa's numbers prove that that is precisely their market. VCs are always looking for big markets to penetrate and the market for money certainly qualifies. Here are Scott's thoughts:
Visa USA announced this morning that, for the first time, its annual sales volume exceeded $1 trillion.
The record usage means that an average of $32,000 went through the Visa system every second of every day over the 12-month period that ended March 31 - or nearly 10 percent of the 2002 U.S. Gross Domestic Product.
"One trillion dollars is an almost incomprehensible number, but it represents clear evidence of the silent revolution we're witnessing in the way consumers pay for goods and services. It means $12 of every $100 consumers spent in the U.S. is spent using a Visa card," said Carl Pascarella, president and CEO of Visa USA. "This is an important milestone in the history of U.S. commerce. Clearly, more and more people rely upon the security and convenience of Visa credit, debit and other payment products. To put it into context, $1 trillion could buy 162,000 Harley-Davidson motorcycles every day for a year."
By comparison, $1 trillion is greater than the combined volume of all other U.S. payment organizations, a field that includes MasterCard, American Express, Discover and others.
Just before I left Visa in 1994, I remember having a discussion with a colleague about growth in sales volume. 1993 had just ended with $500 billion in annual Visa sales on an international basis. We were focused on that total growing to $1 trillion globally over the next five years. As I recall, the US in 1993 was about 40+% of the global total -- so the growth in US volume over the last nine years has been pretty amazing. Of course, this is also one of those statistics that has a nice built-in inflation hedge too (the numbers just keep growing!). $32,000 a second -- at a $50 average ticket that works out to an average of 640 Visa transactions per second. [via VentureBlog]
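A quick back-of-the-envelope check of the figures quoted above (my arithmetic, not Scott's):

    annual_volume = 1_000_000_000_000                  # $1 trillion in annual Visa volume
    seconds_per_year = 365 * 24 * 60 * 60
    print(round(annual_volume / seconds_per_year))     # ~31710 dollars/second, i.e. roughly $32,000
    print(32_000 / 50)                                 # 640 transactions/second at a $50 average ticket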
Key excerpt of relevance to us (as potential providers of an application that demonstrates RDF's value prop.):
It's not the syntax that makes the difference, it's the app. History supports this view. How many people tried to pry apart the obscure Excel file format on the Mac? Or the Lotus file format on the PC? Name all the market leaders of the past, and only the Web had both the killer app and a transparent format. Maybe the relationship is multiplicative. Maybe Excel would have been the Web if it had used an open file format that anyone could understand. What if you could have created a spreadsheet with BBEdit or a HyperTalk script? The mind boggles at the possibilities (it never happened, of course).
Even in Office 2003 there is a failure to really open things up.
An aside, Jean Paoli rushes into the room, jumping up and down and saying "That's what I'm doing that's what I'm doing."
Anyway, I don't see any killer apps in the RDF crowd. I see lots of people with strong opinions and not much software. Killer apps are not something you wish into existence. Lots of people have said that RDF models a relational database. Okay that tells me something important, the killer app is a relational database.
Ha Ha!
But we already have relational databases. They were new when I was a grad student, and that was a long time ago.
Yeah, but what we don't have are relational databases that incorporate RDF as part of the database technology evolution roadmap. Of course, many will get it (and FUD-emulate) when we unveil something via Virtuoso.
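As a loose sketch of the "RDF models a relational database" observation above (not a description of Virtuoso's actual implementation), each row can be read as a subject, each column as a predicate, and each cell as an object; the URIs and data below are invented:

    # Map relational rows to triples: row -> subject, column -> predicate, cell -> object.
    rows = [
        {"id": 1, "name": "Virtuoso", "vendor": "OpenLink Software"},
        {"id": 2, "name": "DB2", "vendor": "IBM"},
    ]

    BASE = "http://example.com/products/"
    VOCAB = "http://example.com/schema#"

    triples = []
    for row in rows:
        subject = f"<{BASE}{row['id']}>"
        for column, value in row.items():
            if column == "id":
                continue
            triples.append((subject, f"<{VOCAB}{column}>", f'"{value}"'))

    for s, p, o in triples:
        print(s, p, o, ".")   # crude N-Triples-style output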