<?xml version="1.0" encoding="UTF-8" ?>
<!--ATOM based XML document generated By OpenLink Virtuoso-->
<atom:feed xmlns:atom="http://www.w3.org/2005/Atom" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:vi="http://www.openlinksw.com/weblog/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:openSearch="http://a9.com/-/spec/opensearchrss/1.0/" xmlns:itunes="http://www.itunes.com/DTDs/Podcast-1.0.dtd" xmlns:dc="http://purl.org/dc/elements/1.1/">
<atom:id>http://www.openlinksw.com/weblog/dav/dav-blog-1/</atom:id>
<atom:title>OpenLink Community Blog</atom:title>
<atom:link href="http://www.openlinksw.com/weblog/dav/dav-blog-1/" type="text/html" rel="alternate" />
<atom:link href="http://www.openlinksw.com/weblog/dav/dav-blog-1/gems/atom_tag_arch.xml?:tag=history&amp;:bid=dav-blog-1" type="application/atom+xml" rel="self" />
<atom:subtitle>A Collection of blogs by OpenLink Staff</atom:subtitle>
 <atom:author>
  <atom:name>kidehen@openlinksw.com</atom:name>
  <atom:email>kidehen@openlinksw.com</atom:email>
  </atom:author>
<atom:updated>2009-11-23T14:21:42Z</atom:updated>
<atom:generator>Virtuoso Universal Server 05.12.3041</atom:generator>
<atom:logo>http://www.openlinksw.com/weblog/public/images/vbloglogo.gif</atom:logo>
 <atom:entry>
  <atom:title>European Commission and the Data Overflow</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2009-10-27#1586</atom:id>
  <atom:published>2009-10-27T18:29:51Z</atom:published>
  <atom:updated>2009-10-27T14:57:31-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;The European Commission recently circulated a questionnaire to selected experts on what could be done for the future of big &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x43bae00&quot;&gt;data&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Since the &lt;a href=&quot;http://cordis.europa.eu/fp7/ict/content-knowledge/consultation_en.html&quot; id=&quot;link-id1191c0f8&quot;&gt;questionnaire is public&lt;/a&gt;, I am publishing my answers below.&lt;/p&gt; &lt;ol type=&quot;1&quot; start=&quot;1&quot;&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Data and data types&lt;/b&gt; &lt;/p&gt; &lt;ol type=&quot;a&quot; start=&quot;1&quot;&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What volumes of data are we dealing with today? What is the growth rate? Where can we expect to be in 2015? &lt;/b&gt; &lt;/p&gt; &lt;p&gt;Private data warehouses of corporations have more than doubled yearly for the past years; hundreds of TB is not exceptional. This will continue. The real shift is in structured data being published in increasing quantities with a minimum level of integrate-ability through use of &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x5c7add0&quot;&gt;RDF&lt;/a&gt; and &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x5c7adb8&quot;&gt;linked data&lt;/a&gt; principles. There are rewards for use of standard vocabularies and identifiers through search engines recognizing such data. There is convergence around &lt;a href=&quot;http://dbpedia.org/resource/DBpedia&quot; id=&quot;link-id0x5c7ada0&quot;&gt;DBpedia&lt;/a&gt; identifiers for real-world entities, e.g., most things that would be in the news.&lt;/p&gt; &lt;p&gt;This also means that internal data processes and silos may be enriched with this content. 
There is consequent pressure for accommodating more diversity of data, with more flexible &lt;a href=&quot;http://dbpedia.org/resource/Database_schema&quot; id=&quot;link-id0x7d87a88&quot;&gt;schema&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Ultimately, all content presently stored in RDBs and presented in publicly accessible dynamic web pages will end up on the web of linked data. Examples are product catalogs, price lists, event schedules and the like.&lt;/p&gt; &lt;p&gt;The volume of the well-known linked data sets is around 10 billion statements. With the above-mentioned trends, growth by two or three orders of magnitude by 2015 seems reasonable. This is especially so if explicit semantics are extracted from the document web and if there is some further progress in the precision/recall of such extraction.&lt;/p&gt; &lt;p&gt;Relevant sections of this mass of data are a potential addition to any present or future analytics application.&lt;/p&gt; &lt;p&gt;Since arbitrary analytics over the database which is the web cannot be economically provided by a centralized search engine, a cloud model may be used for on-demand selection of relevant data and mixing it with private data. This will drive database innovation for the next years even more than the continued classical warehouse growth.&lt;/p&gt; &lt;p&gt;Science data is another driver of the data overflow. For example, faster gene sequencing, more accurate measurements in high energy physics, better imaging, and remote sensing will produce large volumes of data. This data has highly regular structure but labeling this data with source and lineage calls for a flexible, schema-last, self-describing model, such as RDF and linked data. 
Data and &lt;a href=&quot;http://dbpedia.org/resource/Metadata&quot; id=&quot;link-id0x7a3fb40&quot;&gt;metadata&lt;/a&gt; should travel together but may have different data models.&lt;/p&gt; &lt;p&gt;By and large, the metadata of science data will be another stream to the web of linked data, at least to the degree it is publicly accessible. Restricted circles can and likely will implement similar ideas.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What types of data can we deal with intelligently due to their inherent structure (geospatial, temporal, social or &lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id0x5a48058&quot;&gt;knowledge&lt;/a&gt; graphs, 3D, sensor streams...)?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;All the above types should be supported inside one DBMS so as to allow efficient querying combining conditions on all these types of data, e.g., &lt;i&gt;photos of sunsets taken last summer in Ibiza, with over 20 megapixels, by people I know.&lt;/i&gt; &lt;/p&gt; &lt;p&gt;Note that the test for being a sunset is an operation on the image blob that should be taken to the data; the images cannot be economically transferred.&lt;/p&gt; &lt;p&gt;Interleaving of all database functions and types becomes increasingly important.&lt;/p&gt; &lt;/li&gt; &lt;/ol&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Industries, communities&lt;/b&gt; &lt;/p&gt; &lt;ol type=&quot;a&quot; start=&quot;1&quot;&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Who is producing these data and why? Could they do it better? How?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;Right now, projects such as &lt;a href=&quot;http://www.bio2rdf.org/&quot; id=&quot;link-id0x2a29de8&quot;&gt;Bio2RDF&lt;/a&gt;, &lt;a href=&quot;http://neurocommons.org/page/Main_Page&quot; id=&quot;link-id0x7ddaed0&quot;&gt;Neurocommons&lt;/a&gt;, and DBPedia produce this data. The processes are in place and are reasonable. Incremental improvement is to be expected. 
These processes, along with the &lt;a href=&quot;http://www.w3.org/DesignIssues/LinkedData.html&quot; id=&quot;link-id0xbab4dfd0&quot;&gt;linked data meme&lt;/a&gt; generally taking off, drive demand for better &lt;a href=&quot;http://dbpedia.org/resource/Natural_language_processing&quot; id=&quot;link-id0x51f4e0&quot;&gt;NLP&lt;/a&gt; (&lt;a href=&quot;http://dbpedia.org/resource/Natural_language_processing&quot; id=&quot;link-id0x51a1b48&quot;&gt;Natural Language Processing&lt;/a&gt;), e.g., &lt;a href=&quot;http://dbpedia.org/resource/Entity&quot; id=&quot;link-id0x956680&quot;&gt;entity&lt;/a&gt; and relationship extraction, especially extraction that can produce instance data in given ontologies (e.g., events) using common identifiers (e.g., DBPedia URIs).&lt;/p&gt; &lt;p&gt;Mapping of RDBs to RDF is possible, and a W3C working group is developing standards for this. The required baseline level has been reached; the rest is a matter of automating deployment. Within the enterprise, there are advantages to be gained for &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x7da9e80&quot;&gt;information&lt;/a&gt; integration; e.g., all entities in the CRM space can be integrated with all email and support tickets through giving everything a &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Identifier&quot; id=&quot;link-id0x71673f8&quot;&gt;URI&lt;/a&gt;. Some of this information may even be published on an &lt;a href=&quot;http://dbpedia.org/resource/Extranet&quot; id=&quot;link-id0x9aa6e0&quot;&gt;extranet&lt;/a&gt; for self-service and web-service interfaces. This has been done at small scales and the rest is a matter of spreading adoption and lowering the entry barrier. Incremental progress will take place, eventually resulting in qualitatively better integration along the value chain when adoption is sufficiently widespread.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Who is consuming these data and why? 
Could they do it better? How?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;Consumers are various. The greatest need is for tools that summarize complex data and allow getting a bird&amp;#39;s eye view of what data is in the first instance available. Consuming the data is hindered by the user not even necessarily knowing what data there is. This is somewhat new, as traditionally the business analyst did know the schema of the warehouse and was proficient with &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x7f7b148&quot;&gt;SQL&lt;/a&gt; report generators and statistics packages.&lt;/p&gt; &lt;p&gt;Where Web 2.0 made the &lt;i&gt;citizen journalist&lt;/i&gt;, the web of linked data will make the &lt;i&gt;citizen analyst&lt;/i&gt;. For this to happen, with benefits for individuals, enterprises, and governments alike, more work in user interfaces, knowledge discovery, and query composition will be useful. We may envision a &amp;quot;meshup economy&amp;quot; where data is plentiful, but the unit of value and exchange is the smart report that crystallizes actionable value from this ocean.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What industrial sectors in Europe could become more competitive if they became much better at managing data?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;Any sector could benefit. Early adopters are seen in the biomedical field and to an extent in media. &lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Is the regulation landscape imposing constraints (privacy, compliance ...) that don&amp;#39;t have today good tool support?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;The regulation landscape drives database demand through data retention requirements and the like.&lt;/p&gt; &lt;p&gt;With data integration, especially with privacy-sensitive data (as in medicine), there are issues of whether one dares put otherwise-shareable information online. 
Regulation is needed to protect individuals, but integration should still be possible for science.&lt;/p&gt; &lt;p&gt;For this, we see a need for progress in applying policy-based approaches (e.g., row level security) to relatively schema-last data such as RDF. This is possible but needs some more work. Also, creating on-the-fly-anonymizing views on data might help.&lt;/p&gt; &lt;p&gt;More research is needed for reconciling the need for security with the advantages of broad-based &lt;i&gt;ad hoc&lt;/i&gt; integration. Ideally, data should be intelligent, aware of its origins and classification and cautious of whom it interacts with, all of this supported under the covers so that the user could ask anything but the data might refuse to answer or might restrict answers according to the user&amp;#39;s profile. This is a tall order and implementing something of the sort is an open question.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What are the main practical problem identified for individuals and organizations? Please give examples and tell us about the main obstacles and barriers.&lt;/b&gt; &lt;/p&gt; &lt;p&gt;We have come across the following:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;Knowing that the data exists in the first place.&lt;/li&gt; &lt;li&gt;If the data is found, figuring out the provenance, units and precision of measurement, identifiers, and the like.&lt;/li&gt; &lt;li&gt;Compatible subject matter but incompatible representation: For example, one has numbers on a map with different maps for different points in time; another has time series of instrument data with geo-location for the instrument. It is only to be expected that the time interval between measurements is not the same. So there is need for a lot of one-off programming to align data.&lt;/li&gt; &lt;/ul&gt; &lt;p&gt;Other problems have to do with sheer volume, i.e., transfer of data even in a local area network is too slow, let alone over a wide area network. 
Computation needs to go to the data, and databases need to support this.&lt;/p&gt; &lt;/li&gt; &lt;/ol&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Services, software stacks, protocols, standards, benchmarks&lt;/b&gt; &lt;/p&gt; &lt;ol type=&quot;a&quot; start=&quot;1&quot;&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What combinations of components are needed to deal with these problems?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;Recent times have seen a proliferation of special-purpose databases. Since the data needs of the future are about combining data with maximum agility and minimum performance hit, there is a need to gather the currently separate functionality into an integrated system with sufficient flexibility. We see some of this in the integration of map-reduce and scale-out databases. The former antagonists have become partners. Vertica, &lt;a href=&quot;http://dbpedia.org/resource/Greenplum&quot; id=&quot;link-id0x7a94e70&quot;&gt;Greenplum&lt;/a&gt;, and OpenLink &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x2ab2868&quot;&gt;Virtuoso&lt;/a&gt; are examples of DBMSs featuring work in this direction.&lt;/p&gt; &lt;p&gt;Interoperability and at least &lt;i&gt;de facto&lt;/i&gt; standards in ways of doing this will emerge.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What data exchange and processing mechanisms will be needed to work across platforms and programming languages?&lt;/b&gt; &lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://dbpedia.org/resource/Hypertext_Transfer_Protocol&quot; id=&quot;link-id0x78a0458&quot;&gt;HTTP&lt;/a&gt;, &lt;a href=&quot;http://dbpedia.org/resource/XML&quot; id=&quot;link-id0x7ff2360&quot;&gt;XML&lt;/a&gt;, and RDF are in fact very verbose, yet these are the formats and models that have uptake. 
Thus, these will continue to be used even though one might think binary formats to be more efficient.&lt;/p&gt; &lt;p&gt;There are of course science data set standards that are more compressed and these will continue, hopefully adding a practice of rich metadata in RDF.&lt;/p&gt; &lt;p&gt;For internals of systems, MPI and TCP/IP with proprietary optimized wire formats will continue. Inter-system communication will likely continue to be HTTP, XML, and RDF as appropriate.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What data environments are today so wastefully messy that they would benefit from the development of standards?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;RDF and &lt;a href=&quot;http://dbpedia.org/resource/Web_Ontology_Language&quot; id=&quot;link-id0x5643d70&quot;&gt;OWL&lt;/a&gt; are not messy but they could use some more performance; we are working on this. &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x152ab18&quot;&gt;SPARQL&lt;/a&gt; is finally acquiring the capabilities of a serious query language, so things are slowly coming together.&lt;/p&gt; &lt;p&gt;Community process for developing application domain specific vocabularies works quite well, even though one could argue it is &lt;i&gt;ad hoc&lt;/i&gt; and not up to what a modeling purist might wish.&lt;/p&gt; &lt;p&gt;Top-down imposition of standards has a mixed history, with long and expensive development and sometimes no or little uptake, consider some WS* standards for example.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What kind of performance is expected or required of these systems? Who will measure it reliably? 
How?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;Relational databases have a history of substantial investment in &lt;a href=&quot;http://dbpedia.org/resource/Program_optimization&quot; id=&quot;link-id0xecc100&quot;&gt;optimization&lt;/a&gt; and some of them are very good for what they do, e.g., the newer generation of analytics databases.&lt;/p&gt; &lt;p&gt;The very large schema-last, no-SQL, sometimes eventually consistent key-value stores have a somewhat shorter history but do fill a real need.&lt;/p&gt; &lt;p&gt;These trends will merge: Extreme scale, schema-last, complex queries, even more complex inference, custom code for in-database machine learning and other bulk processing.&lt;/p&gt; &lt;p&gt;We find RDF augmented with some binary types at this crossroads. This point of the design space will have to provide performance roughly on the level of today&amp;#39;s best relational solution for workloads that fit the relational model. The added cost of schema-last and inference must come down. We are working on this. Research work such as carried out with &lt;a href=&quot;http://dbpedia.org/resource/MonetDB&quot; id=&quot;link-id0x7ae2890&quot;&gt;MonetDB&lt;/a&gt; gives clues as to how these aims can be reached.&lt;/p&gt; &lt;p&gt;The separation of query language and inference is artificial. After the concepts are mature, these functions will merge and execute close to the data; there are clear evolutionary pressures in this direction.&lt;/p&gt; &lt;p&gt;Benchmarks are key. Some gain can be had even from repurposing standard relational benchmarks like &lt;a href=&quot;http://www.tpc.org/&quot; id=&quot;link-id0x71eb528&quot;&gt;TPC&lt;/a&gt;-&lt;a href=&quot;http://dbpedia.org/resource/TPC-H&quot; id=&quot;link-id0x5e16a40&quot;&gt;H&lt;/a&gt;. But the TPC-H rules do not allow official reporting of such.&lt;/p&gt; &lt;p&gt;Development of benchmarks for RDF, complex queries, and inference is needed. 
A bold challenge to the community, it should be rooted in real-life integration needs and involve high heterogeneity. A key-value store benchmark might also be conceived. A transaction benchmark like TPC-&lt;a href=&quot;http://dbpedia.org/resource/C%2B%2B&quot; id=&quot;link-id0x78562d0&quot;&gt;C&lt;/a&gt; might be the basis, maybe augmented with massive user-generated content like reviews and blogs.&lt;/p&gt; &lt;p&gt;If benchmarks exist and are not too easy nor inaccessibly difficult nor too expensive to run (think of the high end TPC-C results), then TPC-style rules and processes would be quite adequate. The threshold to publish should be lowered: Everybody runs the TPC workloads internally but few publish.&lt;/p&gt; &lt;p&gt;Some EC initiative for benchmarking could make sense, similar to the TREC initiative of the US government. Industry should be consulted for the specific content; possibly the answers to the present questionnaire can provide an approximate direction.&lt;/p&gt; &lt;p&gt;Benchmarks should be run by software vendors on their own systems, tuned by themselves. But there should be a process of disclosure and auditing; the TPC rules give an example. Compliance should not be too expensive or time consuming. Some community development for automating these things would be a worthwhile target for EC funding.&lt;/p&gt; &lt;/li&gt; &lt;/ol&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Usability and training&lt;/b&gt; &lt;/p&gt; &lt;ol type=&quot;a&quot; start=&quot;1&quot;&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;How difficult will it be for a developer of average competence to deploy components whose core is based on rather deep computer science? Do we all need to understand Monads and Continuations? What can be done to make it ever easier?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;In the database world, huge advances in technology have taken place behind a relatively simple and stable interface: SQL. 
For the linked data &lt;a href=&quot;http://dbpedia.org/resource/Giant_Global_Graph&quot; id=&quot;link-id0x7761e50&quot;&gt;web&lt;/a&gt;, the same will take place behind SPARQL.&lt;/p&gt; &lt;p&gt;Beyond these, for example, programming with MPI with good utilization of a cluster platform for an arbitrary algorithm, is quite difficult. The casual amateur is hereby warned.&lt;/p&gt; &lt;p&gt;There is no single solution. For automatic parallelization, since explicit, programmatic parallelization of things with MPI for example is very unscalable in terms of required skill, we should favor declarative and/or functional approaches.&lt;/p&gt; &lt;p&gt;Developing a debugger and explanation engine for rule-based and description-logics-based inference would be an idea.&lt;/p&gt; &lt;p&gt;For procedural workloads, things like Erlang may be good in cases and are not overly difficult in principle, especially if there are good debugging facilities.&lt;/p&gt; &lt;p&gt;For shipping functions in a cluster or cloud, the &lt;a href=&quot;http://www.eecs.berkeley.edu/Research/Projects/Data/105733.html&quot; id=&quot;link-id0x5494b0&quot;&gt;BOOM&lt;/a&gt; (&lt;a href=&quot;http://www.eecs.berkeley.edu/Research/Projects/Data/105733.html&quot; id=&quot;link-id0x7f1f148&quot;&gt;Berkeley Orders Of Magnitude&lt;/a&gt;) approach or logic programming with explicit specification of compute location seem promising, surely more flexible than map-reduce. The question is whether a &lt;a href=&quot;http://dbpedia.org/resource/PHP&quot; id=&quot;link-id0x5c758c8&quot;&gt;PHP&lt;/a&gt; developer can be made to do logic programming.&lt;/p&gt; &lt;p&gt;This bridge will be crossed only with actual need and even then reluctantly. We may look at the Web 2.0 practice of sharding &lt;a href=&quot;http://dbpedia.org/resource/MySQL&quot; id=&quot;link-id0x432f868&quot;&gt;MySQL&lt;/a&gt;, inconvenient as this may be, for an example. 
There is inertia and thus re-architecting is a constant process that is generally in reaction to facts, &lt;i&gt;post hoc&lt;/i&gt;, often a point solution. One could argue that planning ahead would be smarter but by and large the world does not work so.&lt;/p&gt; &lt;p&gt;One part of the answer is an infinitely-scalable SQL database that expands and shrinks in the clouds, with the usual semantics, maybe optional eventual consistency and built-in map reduce. If such a thing is inexpensive enough and syntax-level-compatible with present installed base, many developers do not have to learn very much more.&lt;/p&gt; &lt;p&gt;This is maybe good for the bread-and-butter IT, but European competitiveness should not rest on this. Therefore we wish to go for bold new application types for which the client-server database application is not the model. Data-centric languages like BOOM, if they can be made very efficient and have good debugging support, are attractive there. These do require more intellectual investment but that is not a problem since the less-inquisitive part of the developer community is served by the first part of the answer.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;How is a developer of average skills going to learn about these new advanced tools? How can we plan for excellent documentation and training, community mentoring, exchange of good practices, etc... across all EU countries?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;For the most part, developers do not learn things for the sake of learning. When they have learned something and it is adequate, they stay with it for the most part and are even reluctant to engage in cross-camps interaction. The research world is often similarly insular. A new inflection in the application landscape is needed to drive learning. 
This inflection is provided by the &lt;a href=&quot;https://wiki.mozilla.org/Labs/Ubiquity&quot; id=&quot;link-id0x7f051c8&quot;&gt;ubiquity&lt;/a&gt; of mobile devices, sensor data, explicit semantics, NLP concept extraction, web of linked data, and such factors.&lt;/p&gt; &lt;p&gt;RDFa is a good example of a new technique piggybacking on something everybody uses, namely HTML. These new things should, within possibility, be deployed in the usual technology stack, &lt;a href=&quot;http://en.wikipedia.org/wiki/LAMP_%28software_bundle%29&quot; id=&quot;link-id0x77151e0&quot;&gt;LAMP&lt;/a&gt; or Java. Of course these do not have to be LAMP or Java or HTML or HTTP themselves but they must manifest through these.&lt;/p&gt; &lt;p&gt;A lot of the &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id0x7940cd0&quot;&gt;semantic web&lt;/a&gt; potential can be realized within the client-server database application model, thus no fundamental re-architecting, just some new data types and queries.&lt;/p&gt; &lt;p&gt;For data- or processing-intensive tasks, an on-demand hookup to cloud-based servers with Erlang and/or BOOM for programming model would be easy enough to learn and utilize.&lt;/p&gt; &lt;p&gt;The question is one of providing challenges. Addressing actual challenges with these techniques will lead to maturity, documentation, examples, and training. With virtual, Europe-wide distributed teams a reality in many places, Europe-wide dissemination is no longer insurmountable.&lt;/p&gt; &lt;p&gt;As the data overflow proceeds, its victims will multiply and create demand for solutions. 
The EC could here encourage research project use cases gaining an extended life past the end of research projects, possibly being maintained and multiplied and spun off.&lt;/p&gt; &lt;p&gt;If such things could be mutated into self-sustaining service businesses with pay-per-use revenue, say through a cloud SaaS business model, still primarily leveraging an open source technology stack, we could have self-propagating and self-supporting models for exploiting advanced IT. This would create interest, and interest would drive training and dissemination.&lt;/p&gt; &lt;p&gt;The problem is creating the pull.&lt;/p&gt; &lt;/li&gt; &lt;/ol&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Challenges&lt;/b&gt; &lt;/p&gt; &lt;ol type=&quot;a&quot; start=&quot;1&quot;&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What should be, in this domain, the equivalent of the Netflix challenge, Ansari X Prize, &lt;a href=&quot;http://dbpedia.org/resource/Google&quot; id=&quot;link-id0x7e72f40&quot;&gt;Google&lt;/a&gt; Lunar X Prize, etc. ... ?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;The EC itself no doubt suffers from data overflow in one function or another. Unless security/secrecy prohibits, simply publishing a large data set and a description of what operations should be done on it would be a start. The more real the data, the better; reality is consistently more complex and surprising than imagination. 
Since many interesting problems touch on fraud detection and law enforcement, there may be some security obstacles for using these application domains as subject matters of open challenges.&lt;/p&gt; &lt;p&gt;Once there is a good benchmark, as discussed above, there can be some prize money allocated for the winners, especially if the race is tight.&lt;/p&gt; &lt;p&gt;The Semantic Web Challenge and the Billion Triples Challenge exist and are useful as such, but do not seem to have any huge impact.&lt;/p&gt; &lt;p&gt;The incentives should be sufficient and part of the expenses arising from running for such challenges could be funded. Otherwise investing in existing business development will be more interesting to industry. Some industry participation seems necessary; we would wish academia and industry to work more closely together. Also, having industry supply the baseline guarantees that academia actually does further the state of the art. This is not always certain.&lt;/p&gt; &lt;p&gt;If challenges are based on actual problems, whether of the EC, its member governments, or private entities, and winning the challenge may lead to a contract for supplying an actual solution, these will naturally become more interesting for consortia involving integrators, specialist software vendors, and academia. Such a model would build actual capacity to deploy leading edge technologies in production, which is sorely needed.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What should one do to set up such a challenge, administer, and monitor it?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;The EC should probably circulate a call for actual problem scenarios involving big data. If the matter of the overflow is as dire as represented, cases should be easy to find. A few should be selected and then anonymized if needed.&lt;/p&gt; &lt;p&gt;The party with the use case would benefit by having hopefully the best work on it. The contestants would benefit from having real world needs guide R&amp;amp;D. 
The EC would not have to do very much, except possibly use some money for funding the best proposals. The winner would possibly get a large account and related sales and service income. The contestants would have to be teams possibly involving many organizations; for example, development and first-line services and support could come from different companies along a systems integrator model such as is widely used in the US.&lt;/p&gt; &lt;p&gt;There may be a good benchmark at the time, possibly resulting from FP7 itself. In such a case, the EC could offer a prize for winners. Details would have to be worked out case by case. Such a challenge could be repeated a few times, as benchmark-driven progress in databases or TREC for example have taken some years to reach a point of slowdown in progress.&lt;/p&gt; &lt;p&gt;Administering such an activity should not be prohibitive, as most of the expertise can be found with the stakeholders.&lt;/p&gt; &lt;/li&gt; &lt;/ol&gt; &lt;/li&gt; &lt;/ol&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>European Commission and the Data Overflow</atom:title>
  <atom:id>http://www.openlinksw.com/weblog/oerling/?date=2009-10-27#1585</atom:id>
  <atom:published>2009-10-27T18:29:51Z</atom:published>
  <atom:updated>2009-10-27T14:57:28.000002-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;The European Commission recently circulated a questionnaire to selected experts on what could be done for the future of big &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x79cfe58&quot;&gt;data&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Since the &lt;a href=&quot;http://cordis.europa.eu/fp7/ict/content-knowledge/consultation_en.html&quot; id=&quot;link-id1191c0f8&quot;&gt;questionnaire is public&lt;/a&gt;, I am publishing my answers below.&lt;/p&gt; &lt;ol type=&quot;1&quot; start=&quot;1&quot;&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Data and data types&lt;/b&gt; &lt;/p&gt; &lt;ol type=&quot;a&quot; start=&quot;1&quot;&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What volumes of data are we dealing with today? What is the growth rate? Where can we expect to be in 2015? &lt;/b&gt; &lt;/p&gt; &lt;p&gt;Private data warehouses of corporations have more than doubled yearly for the past years; hundreds of TB is not exceptional. This will continue. The real shift is in structured data being published in increasing quantities with a minimum level of integrate-ability through use of &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x7d7e7a0&quot;&gt;RDF&lt;/a&gt; and &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x7f2a788&quot;&gt;linked data&lt;/a&gt; principles. There are rewards for use of standard vocabularies and identifiers through search engines recognizing such data. There is convergence around &lt;a href=&quot;http://dbpedia.org/resource/DBpedia&quot; id=&quot;link-id0x7dfbca8&quot;&gt;DBpedia&lt;/a&gt; identifiers for real-world entities, e.g., most things that would be in the news.&lt;/p&gt; &lt;p&gt;This also means that internal data processes and silos may be enriched with this content. 
There is consequent pressure for accommodating more diversity of data, with more flexible &lt;a href=&quot;http://dbpedia.org/resource/Database_schema&quot; id=&quot;link-id0x7babaf8&quot;&gt;schema&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Ultimately, all content presently stored in RDBs and presented in publicly accessible dynamic web pages will end up on the web of linked data. Examples are product catalogs, price lists, event schedules and the like.&lt;/p&gt; &lt;p&gt;The volume of the well-known linked data sets is around 10 billion statements. With the above-mentioned trends, growth by two or three orders of magnitude by 2015 seems reasonable. This is so especially if explicit semantics are extracted from the document web and if there is some further progress in the precision/recall of such extraction.&lt;/p&gt; &lt;p&gt;Relevant sections of this mass of data are a potential addition to any present or future analytics application.&lt;/p&gt; &lt;p&gt;Since arbitrary analytics over the database which is the web cannot be economically provided by a centralized search engine, a cloud model may be used for on-demand selection of relevant data and mixing it with private data. This will drive database innovation for the next years even more than the continued classical warehouse growth.&lt;/p&gt; &lt;p&gt;Science data is another driver of the data overflow. For example, faster gene sequencing, more accurate measurements in high energy physics, better imaging, and remote sensing will produce large volumes of data. This data has highly regular structure but labeling this data with source and lineage calls for a flexible, schema-last, self-describing model, such as RDF and linked data. 
Data and &lt;a href=&quot;http://dbpedia.org/resource/Metadata&quot; id=&quot;link-id0x96ce60&quot;&gt;metadata&lt;/a&gt; should travel together but may have different data models.&lt;/p&gt; &lt;p&gt;By and large, the metadata of science data will be another stream to the web of linked data, at least to the degree it is publicly accessible. Restricted circles can and likely will implement similar ideas.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What types of data can we deal with intelligently due to their inherent structure (geospatial, temporal, social or &lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id0x7e8e248&quot;&gt;knowledge&lt;/a&gt; graphs, 3D, sensor streams...)?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;All the above types should be supported inside one DBMS so as to allow efficient querying combining conditions on all these types of data, e.g., &lt;i&gt;photos of sunsets taken last summer in Ibiza, with over 20 megapixels, by people I know.&lt;/i&gt; &lt;/p&gt; &lt;p&gt;Note that the test for being a sunset is an operation on the image blob that should be taken to the data; the images cannot be economically transferred.&lt;/p&gt; &lt;p&gt;Interleaving of all database functions and types becomes increasingly important.&lt;/p&gt; &lt;/li&gt; &lt;/ol&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Industries, communities&lt;/b&gt; &lt;/p&gt; &lt;ol type=&quot;a&quot; start=&quot;1&quot;&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Who is producing these data and why? Could they do it better? How?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;Right now, projects such as &lt;a href=&quot;http://www.bio2rdf.org/&quot; id=&quot;link-id0x43bd098&quot;&gt;Bio2RDF&lt;/a&gt;, &lt;a href=&quot;http://neurocommons.org/page/Main_Page&quot; id=&quot;link-id0x5c074b0&quot;&gt;Neurocommons&lt;/a&gt;, and DBPedia produce this data. The processes are in place and are reasonable. Incremental improvement is to be expected. 
These processes, along with the &lt;a href=&quot;http://www.w3.org/DesignIssues/LinkedData.html&quot; id=&quot;link-id0x72131d0&quot;&gt;linked data meme&lt;/a&gt; generally taking off, drive demand for better &lt;a href=&quot;http://dbpedia.org/resource/Natural_language_processing&quot; id=&quot;link-id0x71e7798&quot;&gt;NLP&lt;/a&gt; (&lt;a href=&quot;http://dbpedia.org/resource/Natural_language_processing&quot; id=&quot;link-id0x7e0e2f0&quot;&gt;Natural Language Processing&lt;/a&gt;), e.g., &lt;a href=&quot;http://dbpedia.org/resource/Entity&quot; id=&quot;link-id0x71ab500&quot;&gt;entity&lt;/a&gt; and relationship extraction, especially extraction that can produce instance data in given ontologies (e.g., events) using common identifiers (e.g., DBPedia URIs).&lt;/p&gt; &lt;p&gt;Mapping of RDBs to RDF is possible, and a W3C working group is developing standards for this. The required baseline level has been reached; the rest is a matter of automating deployment. Within the enterprise, there are advantages to be gained for &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x7a8e9a8&quot;&gt;information&lt;/a&gt; integration; e.g., all entities in the CRM space can be integrated with all email and support tickets through giving everything a &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Identifier&quot; id=&quot;link-id0x599f630&quot;&gt;URI&lt;/a&gt;. Some of this information may even be published on an &lt;a href=&quot;http://dbpedia.org/resource/Extranet&quot; id=&quot;link-id0x2a28f98&quot;&gt;extranet&lt;/a&gt; for self-service and web-service interfaces. This has been done at small scales and the rest is a matter of spreading adoption and lowering the entry barrier. Incremental progress will take place, eventually resulting in qualitatively better integration along the value chain when adoption is sufficiently widespread.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Who is consuming these data and why? 
Could they do it better? How?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;Consumers are various. The greatest need is for tools that summarize complex data and allow getting a bird&amp;#39;s eye view of what data is in the first instance available. Consuming the data is hindered by the user not even necessarily knowing what data there is. This is somewhat new, as traditionally the business analyst did know the schema of the warehouse and was proficient with &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x5999558&quot;&gt;SQL&lt;/a&gt; report generators and statistics packages.&lt;/p&gt; &lt;p&gt;Where Web 2.0 made the &lt;i&gt;citizen journalist&lt;/i&gt;, the web of linked data will make the &lt;i&gt;citizen analyst&lt;/i&gt;. For this to happen, with benefits for individuals, enterprises, and governments alike, more work in user interfaces, knowledge discovery, and query composition will be useful. We may envision a &amp;quot;meshup economy&amp;quot; where data is plentiful, but the unit of value and exchange is the smart report that crystallizes actionable value from this ocean.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What industrial sectors in Europe could become more competitive if they became much better at managing data?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;Any sector could benefit. Early adopters are seen in the biomedical field and to an extent in media. &lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Is the regulation landscape imposing constraints (privacy, compliance ...) that don&amp;#39;t have today good tool support?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;The regulation landscape drives database demand through data retention requirements and the like.&lt;/p&gt; &lt;p&gt;With data integration, especially with privacy-sensitive data (as in medicine), there are issues of whether one dares put otherwise-shareable information online. 
Regulation is needed to protect individuals, but integration should still be possible for science.&lt;/p&gt; &lt;p&gt;For this, we see a need for progress in applying policy-based approaches (e.g., row level security) to relatively schema-last data such as RDF. This is possible but needs some more work. Also, creating on-the-fly-anonymizing views on data might help.&lt;/p&gt; &lt;p&gt;More research is needed for reconciling the need for security with the advantages of broad-based &lt;i&gt;ad hoc&lt;/i&gt; integration. Ideally, data should be intelligent, aware of its origins and classification and cautious of whom it interacts with, all of this supported under the covers so that the user could ask anything but the data might refuse to answer or might restrict answers according to the user&amp;#39;s profile. This is a tall order and implementing something of the sort is an open question.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What are the main practical problems identified for individuals and organizations? Please give examples and tell us about the main obstacles and barriers.&lt;/b&gt; &lt;/p&gt; &lt;p&gt;We have come across the following:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;Knowing that the data exists in the first place.&lt;/li&gt; &lt;li&gt;If the data is found, figuring out the provenance, units and precision of measurement, identifiers, and the like.&lt;/li&gt; &lt;li&gt;Compatible subject matter but incompatible representation: For example, one has numbers on a map with different maps for different points in time; another has time series of instrument data with geo-location for the instrument. It is only to be expected that the time interval between measurements is not the same. So there is need for a lot of one-off programming to align data.&lt;/li&gt; &lt;/ul&gt; &lt;p&gt;Other problems have to do with sheer volume, i.e., transfer of data even in a local area network is too slow, let alone over a wide area network. 
Computation needs to go to the data, and databases need to support this.&lt;/p&gt; &lt;/li&gt; &lt;/ol&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Services, software stacks, protocols, standards, benchmarks&lt;/b&gt; &lt;/p&gt; &lt;ol type=&quot;a&quot; start=&quot;1&quot;&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What combinations of components are needed to deal with these problems?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;Recent times have seen a proliferation of special purpose databases. Since the data needs of the future are about combining data with maximum agility and minimum performance hit, there is need to gather the currently-separate functionality into an integrated system with sufficient flexibility. We see some of this in integration of map-reduce and scale-out databases. The former antagonists have become partners. Vertica, &lt;a href=&quot;http://dbpedia.org/resource/Greenplum&quot; id=&quot;link-id0x45ecfa0&quot;&gt;Greenplum&lt;/a&gt;, and OpenLink &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x7f73fc8&quot;&gt;Virtuoso&lt;/a&gt; are examples of DBMSs featuring work in this direction.&lt;/p&gt; &lt;p&gt;Interoperability and at least &lt;i&gt;de facto&lt;/i&gt; standards in ways of doing this will emerge.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What data exchange and processing mechanisms will be needed to work across platforms and programming languages?&lt;/b&gt; &lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://dbpedia.org/resource/Hypertext_Transfer_Protocol&quot; id=&quot;link-id0x776a1a0&quot;&gt;HTTP&lt;/a&gt;, &lt;a href=&quot;http://dbpedia.org/resource/XML&quot; id=&quot;link-id0x2a4e8d0&quot;&gt;XML&lt;/a&gt;, and RDF are in fact very verbose, yet these are the formats and models that have uptake. 
Thus, these will continue to be used even though one might think binary formats to be more efficient.&lt;/p&gt; &lt;p&gt;There are of course science data set standards that are more compressed and these will continue, hopefully adding a practice of rich metadata in RDF.&lt;/p&gt; &lt;p&gt;For internals of systems, MPI and TCP/IP with proprietary optimized wire formats will continue. Inter-system communication will likely continue to be HTTP, XML, and RDF as appropriate.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What data environments are today so wastefully messy that they would benefit from the development of standards?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;RDF and &lt;a href=&quot;http://dbpedia.org/resource/Web_Ontology_Language&quot; id=&quot;link-id0x2a35960&quot;&gt;OWL&lt;/a&gt; are not messy but they could use some more performance; we are working on this. &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x12362e8&quot;&gt;SPARQL&lt;/a&gt; is finally acquiring the capabilities of a serious query language, so things are slowly coming together.&lt;/p&gt; &lt;p&gt;Community process for developing application domain specific vocabularies works quite well, even though one could argue it is &lt;i&gt;ad hoc&lt;/i&gt; and not up to what a modeling purist might wish.&lt;/p&gt; &lt;p&gt;Top-down imposition of standards has a mixed history, with long and expensive development and sometimes no or little uptake, consider some WS* standards for example.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What kind of performance is expected or required of these systems? Who will measure it reliably? 
How?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;Relational databases have a history of substantial investment in &lt;a href=&quot;http://dbpedia.org/resource/Program_optimization&quot; id=&quot;link-id0x7b2d7c8&quot;&gt;optimization&lt;/a&gt; and some of them are very good for what they do, e.g., the newer generation of analytics databases.&lt;/p&gt; &lt;p&gt;The very large schema-last, no-SQL, sometimes eventually consistent key-value stores have a somewhat shorter history but do fill a real need.&lt;/p&gt; &lt;p&gt;These trends will merge: Extreme scale, schema-last, complex queries, even more complex inference, custom code for in-database machine learning and other bulk processing.&lt;/p&gt; &lt;p&gt;We find RDF augmented with some binary types at this crossroads. This point of the design space will have to provide performance roughly on the level of today&amp;#39;s best relational solution for workloads that fit the relational model. The added cost of schema-last and inference must come down. We are working on this. Research work such as carried out with &lt;a href=&quot;http://dbpedia.org/resource/MonetDB&quot; id=&quot;link-id0x794ee48&quot;&gt;MonetDB&lt;/a&gt; gives clues as to how these aims can be reached.&lt;/p&gt; &lt;p&gt;The separation of query language and inference is artificial. After the concepts are mature, these functions will merge and execute close to the data; there are clear evolutionary pressures in this direction.&lt;/p&gt; &lt;p&gt;Benchmarks are key. Some gain can be had even from repurposing standard relational benchmarks like &lt;a href=&quot;http://www.tpc.org/&quot; id=&quot;link-id0x7d45c58&quot;&gt;TPC&lt;/a&gt;-&lt;a href=&quot;http://dbpedia.org/resource/TPC-H&quot; id=&quot;link-id0x45b0198&quot;&gt;H&lt;/a&gt;. But the TPC-H rules do not allow official reporting of such.&lt;/p&gt; &lt;p&gt;Development of benchmarks for RDF, complex queries, and inference is needed. 
A bold challenge to the community, it should be rooted in real-life integration needs and involve high heterogeneity. A key-value store benchmark might also be conceived. A transaction benchmark like TPC-&lt;a href=&quot;http://dbpedia.org/resource/C%2B%2B&quot; id=&quot;link-id0x7e32178&quot;&gt;C&lt;/a&gt; might be the basis, maybe augmented with massive user-generated content like reviews and blogs.&lt;/p&gt; &lt;p&gt;If benchmarks exist and are not too easy nor inaccessibly difficult nor too expensive to run (think of the high end TPC-C results), then TPC-style rules and processes would be quite adequate. The threshold to publish should be lowered: Everybody runs the TPC workloads internally but few publish.&lt;/p&gt; &lt;p&gt;Some EC initiative for benchmarking could make sense, similar to the TREC initiative of the US government. Industry should be consulted for the specific content; possibly the answers to the present questionnaire can provide an approximate direction.&lt;/p&gt; &lt;p&gt;Benchmarks should be run by software vendors on their own systems, tuned by themselves. But there should be a process of disclosure and auditing; the TPC rules give an example. Compliance should not be too expensive or time consuming. Some community development for automating these things would be a worthwhile target for EC funding.&lt;/p&gt; &lt;/li&gt; &lt;/ol&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Usability and training&lt;/b&gt; &lt;/p&gt; &lt;ol type=&quot;a&quot; start=&quot;1&quot;&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;How difficult will it be for a developer of average competence to deploy components whose core is based on rather deep computer science? Do we all need to understand Monads and Continuations? What can be done to make it ever easier?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;In the database world, huge advances in technology have taken place behind a relatively simple and stable interface: SQL. 
For the linked data &lt;a href=&quot;http://dbpedia.org/resource/Giant_Global_Graph&quot; id=&quot;link-id0x7e01618&quot;&gt;web&lt;/a&gt;, the same will take place behind SPARQL.&lt;/p&gt; &lt;p&gt;Beyond these, for example, programming with MPI with good utilization of a cluster platform for an arbitrary algorithm, is quite difficult. The casual amateur is hereby warned.&lt;/p&gt; &lt;p&gt;There is no single solution. For automatic parallelization, since explicit, programmatic parallelization of things with MPI for example is very unscalable in terms of required skill, we should favor declarative and/or functional approaches.&lt;/p&gt; &lt;p&gt;Developing a debugger and explanation engine for rule-based and description-logics-based inference would be an idea.&lt;/p&gt; &lt;p&gt;For procedural workloads, things like Erlang may be good in cases and are not overly difficult in principle, especially if there are good debugging facilities.&lt;/p&gt; &lt;p&gt;For shipping functions in a cluster or cloud, the &lt;a href=&quot;http://www.eecs.berkeley.edu/Research/Projects/Data/105733.html&quot; id=&quot;link-id0x43665a8&quot;&gt;BOOM&lt;/a&gt; (&lt;a href=&quot;http://www.eecs.berkeley.edu/Research/Projects/Data/105733.html&quot; id=&quot;link-id0x7718f00&quot;&gt;Berkeley Orders Of Magnitude&lt;/a&gt;) approach or logic programming with explicit specification of compute location seem promising, surely more flexible than map-reduce. The question is whether a &lt;a href=&quot;http://dbpedia.org/resource/PHP&quot; id=&quot;link-id0x7d64f68&quot;&gt;PHP&lt;/a&gt; developer can be made to do logic programming.&lt;/p&gt; &lt;p&gt;This bridge will be crossed only with actual need and even then reluctantly. We may look at the Web 2.0 practice of sharding &lt;a href=&quot;http://dbpedia.org/resource/MySQL&quot; id=&quot;link-id0xbab1ae98&quot;&gt;MySQL&lt;/a&gt;, inconvenient as this may be, for an example. 
There is inertia and thus re-architecting is a constant process that is generally in reaction to facts, &lt;i&gt;post hoc&lt;/i&gt;, often a point solution. One could argue that planning ahead would be smarter but by and large the world does not work so.&lt;/p&gt; &lt;p&gt;One part of the answer is an infinitely-scalable SQL database that expands and shrinks in the clouds, with the usual semantics, maybe optional eventual consistency and built-in map reduce. If such a thing is inexpensive enough and syntax-level-compatible with present installed base, many developers do not have to learn very much more.&lt;/p&gt; &lt;p&gt;This is maybe good for the bread-and-butter IT, but European competitiveness should not rest on this. Therefore we wish to go for bold new application types for which the client-server database application is not the model. Data-centric languages like BOOM, if they can be made very efficient and have good debugging support, are attractive there. These do require more intellectual investment but that is not a problem since the less-inquisitive part of the developer community is served by the first part of the answer.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;How is a developer of average skills going to learn about these new advanced tools? How can we plan for excellent documentation and training, community mentoring, exchange of good practices, etc... across all EU countries?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;For the most part, developers do not learn things for the sake of learning. When they have learned something and it is adequate, they stay with it for the most part and are even reluctant to engage in cross-camps interaction. The research world is often similarly insular. A new inflection in the application landscape is needed to drive learning. 
This inflection is provided by the &lt;a href=&quot;https://wiki.mozilla.org/Labs/Ubiquity&quot; id=&quot;link-id0x770df38&quot;&gt;ubiquity&lt;/a&gt; of mobile devices, sensor data, explicit semantics, NLP concept extraction, web of linked data, and such factors.&lt;/p&gt; &lt;p&gt;RDFa is a good example of a new technique piggybacking on something everybody uses, namely HTML. These new things should, within possibility, be deployed in the usual technology stack, &lt;a href=&quot;http://en.wikipedia.org/wiki/LAMP_%28software_bundle%29&quot; id=&quot;link-id0x55596a8&quot;&gt;LAMP&lt;/a&gt; or Java. Of course these do not have to be LAMP or Java or HTML or HTTP themselves but they must manifest through these.&lt;/p&gt; &lt;p&gt;A lot of the &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id0x3d5378&quot;&gt;semantic web&lt;/a&gt; potential can be realized within the client-server database application model, thus no fundamental re-architecting, just some new data types and queries.&lt;/p&gt; &lt;p&gt;For data- or processing-intensive tasks, an on-demand hookup to cloud-based servers with Erlang and/or BOOM for programming model would be easy enough to learn and utilize.&lt;/p&gt; &lt;p&gt;The question is one of providing challenges. Addressing actual challenges with these techniques will lead to maturity, documentation, examples, and training. With virtual, Europe-wide distributed teams a reality in many places, Europe-wide dissemination is no longer insurmountable.&lt;/p&gt; &lt;p&gt;As the data overflow proceeds, its victims will multiply and create demand for solutions. 
The EC could here encourage research project use cases gaining an extended life past the end of research projects, possibly being maintained and multiplied and spun off.&lt;/p&gt; &lt;p&gt;If such things could be mutated into self-sustaining service businesses with pay-per-use revenue, say through a cloud SaaS business model, still primarily leveraging an open source technology stack, we could have self-propagating and self-supporting models for exploiting advanced IT. This would create interest, and interest would drive training and dissemination.&lt;/p&gt; &lt;p&gt;The problem is creating the pull.&lt;/p&gt; &lt;/li&gt; &lt;/ol&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Challenges&lt;/b&gt; &lt;/p&gt; &lt;ol type=&quot;a&quot; start=&quot;1&quot;&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What should be, in this domain, the equivalent of the Netflix challenge, Ansari X Prize, &lt;a href=&quot;http://dbpedia.org/resource/Google&quot; id=&quot;link-id0x6a6c2b0&quot;&gt;Google&lt;/a&gt; Lunar X Prize, etc. ... ?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;The EC itself no doubt suffers from data overflow in one function or another. Unless security/secrecy prohibits, simply publishing a large data set and a description of what operations should be done on it would be a start. The more real the data, the better: reality is consistently more complex and surprising than imagination. 
Since many interesting problems touch on fraud detection and law enforcement, there may be some security obstacles for using these application domains as subject matters of open challenges.&lt;/p&gt; &lt;p&gt;Once there is a good benchmark, as discussed above, there can be some prize money allocated for the winners, especially if the race is tight.&lt;/p&gt; &lt;p&gt;The Semantic Web Challenge and the Billion Triples Challenge exist and are useful as such, but do not seem to have any huge impact.&lt;/p&gt; &lt;p&gt;The incentives should be sufficient and part of the expenses arising from running for such challenges could be funded. Otherwise investing in existing business development will be more interesting to industry. Some industry participation seems necessary; we would wish academia and industry to work more closely together. Also, having industry supply the baseline guarantees that academia actually does further the state of the art. This is not always certain.&lt;/p&gt; &lt;p&gt;If challenges are based on actual problems, whether of the EC, its member governments, or private entities, and winning the challenge may lead to a contract for supplying an actual solution, these will naturally become more interesting for consortia involving integrators, specialist software vendors, and academia. Such a model would build actual capacity to deploy leading edge technologies in production, which is sorely needed.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What should one do to set up such a challenge, administer, and monitor it?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;The EC should probably circulate a call for actual problem scenarios involving big data. If the matter of the overflow is as dire as represented, cases should be easy to find. A few should be selected and then anonymized if needed.&lt;/p&gt; &lt;p&gt;The party with the use case would benefit by having hopefully the best work on it. The contestants would benefit from having real world needs guide R&amp;amp;D. 
The EC would not have to do very much, except possibly use some money for funding the best proposals. The winner would possibly get a large account and related sales and service income. The contestants would have to be teams possibly involving many organizations; for example, development and first-line services and support could come from different companies along a systems integrator model such as is widely used in the US.&lt;/p&gt; &lt;p&gt;There may be a good benchmark at the time, possibly resulting from FP7 itself. In such a case, the EC could offer a prize for winners. Details would have to be worked out case by case. Such a challenge could be repeated a few times, as benchmark-driven progress in databases or TREC for example have taken some years to reach a point of slowdown in progress.&lt;/p&gt; &lt;p&gt;Administrating such an activity should not be prohibitive, as most of the expertise can be found with the stakeholders.&lt;/p&gt; &lt;/li&gt; &lt;/ol&gt; &lt;/li&gt; &lt;/ol&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>VLDB 2009 Web Scale Data Management Panel (5 of 5)</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2009-09-01#1583</atom:id>
  <atom:published>2009-09-01T16:24:17Z</atom:published>
  <atom:updated>2009-09-02T12:05:26-04:00</atom:updated>
  <atom:content type="html">&lt;blockquote&gt; &lt;p&gt; &lt;i&gt;&amp;quot;The universe of cycles is not exactly one of literal cycles, but rather one of spirals,&amp;quot; mused &lt;a href=&quot;http://db.cs.berkeley.edu/jmh/&quot; id=&quot;link-id117455a0&quot;&gt;Joe Hellerstein&lt;/a&gt; of UC Berkeley.&lt;/i&gt; &lt;/p&gt; &lt;p&gt; &lt;i&gt;&amp;quot;Come on, let&amp;#39;s all drop some &lt;a href=&quot;http://dbpedia.org/resource/ACID&quot; id=&quot;link-id16b3db50&quot;&gt;ACID&lt;/a&gt;,&amp;quot; interjected another.&lt;/i&gt; &lt;/p&gt; &lt;p&gt; &lt;i&gt;&amp;quot;It is not that we end up repeating the exact same things, rather even if some patterns seem to repeat, they do so at a higher level, enhanced by the experience gained,&amp;quot; continued Joe.&lt;/i&gt; &lt;/p&gt; &lt;/blockquote&gt; &lt;p&gt;Thus did the Web Scale &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id11061ae0&quot;&gt;Data&lt;/a&gt; Management panel conclude.&lt;/p&gt; &lt;p&gt;Whether successive generations are made wiser by the ones that have gone before may be argued either way.&lt;/p&gt; &lt;p&gt;The cycle in question was that of developers discovering ACID in the 1960s, i.e., Atomicity, Consistency, Isolation, Durability. Thus did the DBMS come into being. Then DBMSs kept becoming more complex until, as there will be a counter-force to each force, came the &lt;a href=&quot;http://dbpedia.org/resource/Meme&quot; id=&quot;link-id11076cc8&quot;&gt;meme&lt;/a&gt; of key value stores and BASE, no multiple-row transactions, eventual consistency, no query language but scaling to thousands of computers. 
So now, the DBMS community asks itself what went wrong.&lt;/p&gt; &lt;p&gt;In the words of one panelist, another demonstrated a &amp;quot;shocking familiarity with the subject matter of substance abuse&amp;quot; when he called for the DBMS community to get on a &lt;a href=&quot;http://dbpedia.org/resource/Twelve-step_program&quot; id=&quot;link-id15d954a8&quot;&gt;12 step program&lt;/a&gt; and to look where addiction to certain ideas, ACID among them, had brought its life. Look at yourself: The influential papers in what ought to be your space by rights are coming from the OS community: &lt;a href=&quot;http://dbpedia.org/resource/Google&quot; id=&quot;link-id166675f0&quot;&gt;Google&lt;/a&gt; Bigtable, Amazon Dynamo, want more? When you ought to drive, you give excuses and play catch up! Stop denial, drop &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id1105adf0&quot;&gt;SQL&lt;/a&gt;, drop ACID!&lt;/p&gt; &lt;p&gt;The web developers have revolted against the time-honored principles of the DBMS. This is true. Sharded &lt;a href=&quot;http://dbpedia.org/resource/MySQL&quot; id=&quot;link-id1221c230&quot;&gt;MySQL&lt;/a&gt; is not the ticket, or is it? Must they rediscover the virtues of ACID, just like the previous generation did?&lt;/p&gt; &lt;p&gt;Nothing under the sun is new. As in music and fashion, trends keep cycling also in science and engineering.&lt;/p&gt; &lt;p&gt;But seriously, does the full-featured DBMS scale to web scale? &lt;a href=&quot;http://dbpedia.org/resource/Microsoft&quot; id=&quot;link-id10ffcaf8&quot;&gt;Microsoft&lt;/a&gt; says the Azure version of SQL server does. 
&lt;a href=&quot;http://dbpedia.org/resource/Yahoo%21&quot; id=&quot;link-id16b3f138&quot;&gt;Yahoo&lt;/a&gt; says they want no SQL but &lt;a href=&quot;http://dbpedia.org/resource/Hadoop&quot; id=&quot;link-id11046ef0&quot;&gt;Hadoop&lt;/a&gt; and &lt;a href=&quot;http://research.yahoo.com/node/2304&quot; id=&quot;link-id110a0040&quot;&gt;PNUTS&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Twitter, Facebook, and other web names got their own discussion. Why do they not go to serious DBMS vendors for their data but make their own, like Facebook with Hive?&lt;/p&gt; &lt;p&gt;Who can divine the mind of the web developer? What makes them go to &lt;a href=&quot;http://www.danga.com/memcached/&quot; id=&quot;link-id1109e280&quot;&gt;memcached&lt;/a&gt;, manually sharded MySQL, and &lt;a href=&quot;http://dbpedia.org/resource/MapReduce&quot; id=&quot;link-id1107cd60&quot;&gt;MapReduce&lt;/a&gt;, walking away from the 40 years of technology invested in declarative query and ACID? What is this highly visible but hard to grasp &lt;a href=&quot;http://dbpedia.org/resource/Entity&quot; id=&quot;link-id1105b6b8&quot;&gt;entity&lt;/a&gt;? My guess is that they want something they can understand, at least at the beginning. A DBMS, especially on a cluster, is complicated, and it is not so easy to say how it works and how its performance is determined. The big brands, if deployed on a thousand PCs, would also be prohibitively expensive. But if all you do with the DBMS is single row selects and updates, it is no longer so scary, but you end up doing all the distributed things in a middle layer, and abandoning expressive queries, transactions, and database-supported transparency of location. But at least now you know how it works and what it is good/not good for.&lt;/p&gt; &lt;p&gt;This would be the case for those who make a conscious choice. 
But by and large the choice is not deliberate; it is something one drifts into: The application gains popularity; the single &lt;a href=&quot;http://en.wikipedia.org/wiki/LAMP_%28software_bundle%29&quot; id=&quot;link-iddc68d28&quot;&gt;LAMP&lt;/a&gt; can no longer keep all in memory; you need a second MySQL in the LAMP and you decide that users A–M go left and N–Z right (horizontal partitioning). This siren of sharding beckons you and all is good until you hit the reef of re-architecting. Memcached and duct-tape help, like aspirin helps with a hangover, but the root cause of the headache lies unaddressed.&lt;/p&gt; &lt;p&gt;The conclusion was that there ought to be something incrementally scalable from the get-go. Low cost of entry and built-in scale-out. No, the web developers do not hate SQL; they just have gotten the idea that it does not scale. But they would really wish it to. So, DBMS people, show there is life in you yet.&lt;/p&gt; &lt;p&gt;Joe Hellerstein was the philosopher and paradigmatician of the panel. His team had developed a protocol-compatible Hadoop in a few months using a declarative logic programming style approach. His claim was that developers made the market. Thus, for writing applications against web scale data, there would have to be data centric languages. Why not? These are discussed in &lt;a href=&quot;http://www.eecs.berkeley.edu/Research/Projects/Data/105733.html&quot; id=&quot;link-id110ba0e0&quot;&gt;Berkeley Orders Of Magnitude&lt;/a&gt; (&lt;a href=&quot;http://www.eecs.berkeley.edu/Research/Projects/Data/105733.html&quot; id=&quot;link-id16aab768&quot;&gt;BOOM&lt;/a&gt;).&lt;/p&gt; &lt;p&gt;I come from &lt;a href=&quot;http://en.wikipedia.org/wiki/Lisp_%28programming_language%29&quot; id=&quot;link-id10f2cd68&quot;&gt;Lisp&lt;/a&gt; myself, way back. I have since abandoned any desire to tell anybody what they ought to program in. 
This is a bit like religion: Attempting to impose or legislate or ram it on somebody just results in anything from lip service to rejection to war. The appeal exerted by the diverse language/paradigm -isms on their followers seems to be based on hitting a simplification of reality that coincides with a problem in the air. MapReduce is an example of this. &lt;a href=&quot;http://dbpedia.org/resource/PHP&quot; id=&quot;link-ide22cdd0&quot;&gt;PHP&lt;/a&gt; is another. A quick fix for a present need: Scripting web servers (PHP) or processing tons of files (MapReduce). The full database is not as quick a fix, even though it has many desirable features. It is also not as easy to tell what happens inside one, so MapReduce may give a greater feeling of control.&lt;/p&gt; &lt;p&gt;Totally self-managing, dynamically-scalable &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id152864b0&quot;&gt;RDF&lt;/a&gt; would be a fix for not having to design or administer databases: Since it would be indexed on everything, complex queries would be possible; no full database scans would stop everything. For the mid-size segment of web sites this might be a fit. For the extreme ends of the spectrum, the choice is likely something custom built and much less expressive.&lt;/p&gt; &lt;p&gt;The BOOM rule language for data-centric programming would be something very easy for us to implement, in fact we will get something of the sort essentially for free when we do the rule support already planned.&lt;/p&gt; &lt;p&gt;The question is, can one induce web developers to do logic? The history is one of procedures, both in LAMP and MapReduce. On the other hand, the query languages that were ever universally adopted were declarative, i.e., keyword search and SQL. There certainly is a quest for an application model for the cloud space beyond just migrating apps. We&amp;#39;ll see. More on this another time.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>VLDB 2009 Web Scale Data Management Panel (5 of 5)</atom:title>
  <atom:id>http://www.openlinksw.com/weblog/oerling/?date=2009-09-01#1582</atom:id>
  <atom:published>2009-09-01T16:24:17Z</atom:published>
  <atom:updated>2009-09-02T12:05:20.000001-04:00</atom:updated>
  <atom:content type="html">&lt;blockquote&gt; &lt;p&gt; &lt;i&gt;&amp;quot;The universe of cycles is not exactly one of literal cycles, but rather one of spirals,&amp;quot; mused &lt;a href=&quot;http://db.cs.berkeley.edu/jmh/&quot; id=&quot;link-id117455a0&quot;&gt;Joe Hellerstein&lt;/a&gt; of UC Berkeley.&lt;/i&gt; &lt;/p&gt; &lt;p&gt; &lt;i&gt;&amp;quot;Come on, let&amp;#39;s all drop some &lt;a href=&quot;http://dbpedia.org/resource/ACID&quot; id=&quot;link-id16b3db50&quot;&gt;ACID&lt;/a&gt;,&amp;quot; interjected another.&lt;/i&gt; &lt;/p&gt; &lt;p&gt; &lt;i&gt;&amp;quot;It is not that we end up repeating the exact same things, rather even if some patterns seem to repeat, they do so at a higher level, enhanced by the experience gained,&amp;quot; continued Joe.&lt;/i&gt; &lt;/p&gt; &lt;/blockquote&gt; &lt;p&gt;Thus did the Web Scale &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id11061ae0&quot;&gt;Data&lt;/a&gt; Management panel conclude.&lt;/p&gt; &lt;p&gt;Whether successive generations are made wiser by the ones that have gone before may be argued either way.&lt;/p&gt; &lt;p&gt;The cycle in question was that of developers discovering ACID in the 1960s, i.e., Atomicity, Consistency, Isolation, Durability. Thus did the DBMS come into being. Then DBMSs kept becoming more complex until, as there will be a counter-force to each force, came the &lt;a href=&quot;http://dbpedia.org/resource/Meme&quot; id=&quot;link-id11076cc8&quot;&gt;meme&lt;/a&gt; of key value stores and BASE, no multiple-row transactions, eventual consistency, no query language but scaling to thousands of computers. 
So now, the DBMS community asks itself what went wrong.&lt;/p&gt; &lt;p&gt;In the words of one panelist, another demonstrated a &amp;quot;shocking familiarity with the subject matter of substance abuse&amp;quot; when he called for the DBMS community to get on a &lt;a href=&quot;http://dbpedia.org/resource/Twelve-step_program&quot; id=&quot;link-id15d954a8&quot;&gt;12 step program&lt;/a&gt; and to look where addiction to certain ideas, ACID among them, had brought its life. Look at yourself: The influential papers in what ought to be your space by rights are coming from the OS community: &lt;a href=&quot;http://dbpedia.org/resource/Google&quot; id=&quot;link-id166675f0&quot;&gt;Google&lt;/a&gt; Bigtable, Amazon Dynamo, want more? When you ought to drive, you give excuses and play catch up! Stop denial, drop &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id1105adf0&quot;&gt;SQL&lt;/a&gt;, drop ACID!&lt;/p&gt; &lt;p&gt;The web developers have revolted against the time-honored principles of the DBMS. This is true. Sharded &lt;a href=&quot;http://dbpedia.org/resource/MySQL&quot; id=&quot;link-id1221c230&quot;&gt;MySQL&lt;/a&gt; is not the ticket -- or is it? Must they rediscover the virtues of ACID, just like the previous generation did?&lt;/p&gt; &lt;p&gt;Nothing under the sun is new. As in music and fashion, trends keep cycling also in science and engineering.&lt;/p&gt; &lt;p&gt;But seriously, does the full-featured DBMS scale to web scale? &lt;a href=&quot;http://dbpedia.org/resource/Microsoft&quot; id=&quot;link-id10ffcaf8&quot;&gt;Microsoft&lt;/a&gt; says the Azure version of SQL server does. 
&lt;a href=&quot;http://dbpedia.org/resource/Yahoo%21&quot; id=&quot;link-id16b3f138&quot;&gt;Yahoo&lt;/a&gt; says they want no SQL but &lt;a href=&quot;http://dbpedia.org/resource/Hadoop&quot; id=&quot;link-id11046ef0&quot;&gt;Hadoop&lt;/a&gt; and &lt;a href=&quot;http://research.yahoo.com/node/2304&quot; id=&quot;link-id110a0040&quot;&gt;PNUTS&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Twitter, Facebook, and other web names got their own discussion. Why do they not go to serious DBMS vendors for their data but make their own, like Facebook with Hive?&lt;/p&gt; &lt;p&gt;Who can divine the mind of the web developer? What makes them go to &lt;a href=&quot;http://www.danga.com/memcached/&quot; id=&quot;link-id1109e280&quot;&gt;memcached&lt;/a&gt;, manually sharded MySQL, and &lt;a href=&quot;http://dbpedia.org/resource/MapReduce&quot; id=&quot;link-id1107cd60&quot;&gt;MapReduce&lt;/a&gt;, walking away from the 40 years of technology invested in declarative query and ACID? What is this highly visible but hard to grasp &lt;a href=&quot;http://dbpedia.org/resource/Entity&quot; id=&quot;link-id1105b6b8&quot;&gt;entity&lt;/a&gt;? My guess is that they want something they can understand, at least at the beginning. A DBMS, especially on a cluster, is complicated, and it is not so easy to say how it works and how its performance is determined. The big brands, if deployed on a thousand PCs, would also be prohibitively expensive. But if all you do with the DBMS is single row selects and updates, it is no longer so scary, but you end up doing all the distributed things in a middle layer, and abandoning expressive queries, transactions, and database-supported transparency of location. But at least now you know how it works and what it is good/not good for.&lt;/p&gt; &lt;p&gt;This would be the case for those who make a conscious choice. 
But by and large the choice is not deliberate; it is something one drifts into: The application gains popularity; the single &lt;a href=&quot;http://en.wikipedia.org/wiki/LAMP_%28software_bundle%29&quot; id=&quot;link-iddc68d28&quot;&gt;LAMP&lt;/a&gt; can no longer keep all in memory; you need a second MySQL in the LAMP and you decide that users A–M go left and N–Z right (horizontal partitioning). This siren of sharding beckons you and all is good until you hit the reef of re-architecting. Memcached and duct-tape help, like aspirin helps with hangover, but the root cause of the headache lies unaddressed.&lt;/p&gt; &lt;p&gt;The conclusion was that there ought to be something incrementally scalable from the get-go. Low cost of entry and built-in scale-out. No, the web developers do not hate SQL; they just have gotten the idea that it does not scale. But they would really wish it to. So, DBMS people, show there is life in you yet.&lt;/p&gt; &lt;p&gt;Joe Hellerstein was the philosopher and paradigmatician of the panel. His team had developed a protocol-compatible Hadoop in a few months using a declarative logic programming style approach. His claim was that developers made the market. Thus, for writing applications against web scale data, there would have to be data centric languages. Why not? These are discussed in &lt;a href=&quot;http://www.eecs.berkeley.edu/Research/Projects/Data/105733.html&quot; id=&quot;link-id110ba0e0&quot;&gt;Berkeley Orders Of Magnitude&lt;/a&gt; (&lt;a href=&quot;http://www.eecs.berkeley.edu/Research/Projects/Data/105733.html&quot; id=&quot;link-id16aab768&quot;&gt;BOOM&lt;/a&gt;).&lt;/p&gt; &lt;p&gt;I come from &lt;a href=&quot;http://en.wikipedia.org/wiki/Lisp_%28programming_language%29&quot; id=&quot;link-id10f2cd68&quot;&gt;Lisp&lt;/a&gt; myself, way back. I have since abandoned any desire to tell anybody what they ought to program in. 
This is a bit like religion: Attempting to impose or legislate or ram it on somebody just results in anything from lip service to rejection to war. The appeal exerted by the diverse language/paradigm -isms on their followers seems to be based on hitting a simplification of reality that coincides with a problem in the air. MapReduce is an example of this. &lt;a href=&quot;http://dbpedia.org/resource/PHP&quot; id=&quot;link-ide22cdd0&quot;&gt;PHP&lt;/a&gt; is another. A quick fix for a present need: Scripting web servers (PHP) or processing tons of files (MapReduce). The full database is not as quick a fix, even though it has many desirable features. It is also not as easy to tell what happens inside one, so MapReduce may give a greater feeling of control.&lt;/p&gt; &lt;p&gt;Totally self-managing, dynamically-scalable &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id152864b0&quot;&gt;RDF&lt;/a&gt; would be a fix for not having to design or administer databases: Since it would be indexed on everything, complex queries would be possible; no full database scans would stop everything. For the mid-size segment of web sites this might be a fit. For the extreme ends of the spectrum, the choice is likely something custom built and much less expressive.&lt;/p&gt; &lt;p&gt;The BOOM rule language for data-centric programming would be something very easy for us to implement, in fact we will get something of the sort essentially for free when we do the rule support already planned.&lt;/p&gt; &lt;p&gt;The question is, can one induce web developers to do logic? The history is one of procedures, both in LAMP and MapReduce. On the other hand, the query languages that were ever universally adopted were declarative, i.e., keyword search and SQL. There certainly is a quest for an application model for the cloud space beyond just migrating apps. We&amp;#39;ll see. More on this another time.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>The URI, URL, and Linked Data Meme&#39;s Generic HTTP URI (Updated)</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2009-08-07#1567</atom:id>
  <atom:published>2009-08-07T18:34:50Z</atom:published>
  <atom:updated>2009-10-07T08:02:34-04:00</atom:updated>
  <atom:content type="html">&lt;h3&gt;Situation Analysis&lt;/h3&gt; &lt;p&gt;As the &amp;quot;&lt;a href=&quot;http://www.w3.org/DesignIssues/LinkedData.html&quot; id=&quot;link-id12f96a00&quot;&gt;Linked Data&amp;quot; meme&lt;/a&gt; has gained momentum you&amp;#39;ve more than likely been on the receiving end of dialog with Linked Open &lt;a href=&quot;http://dbpedia.org/resource/Data&quot;&gt;Data&lt;/a&gt; community members (myself included) that goes something like this:&lt;/p&gt; &lt;blockquote&gt; &lt;cite&gt;&amp;quot;Do you have a &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Identifier&quot; id=&quot;link-id139252a0&quot;&gt;URI&lt;/a&gt;&amp;quot;, &amp;quot;Get yourself a URI&amp;quot;, &amp;quot;Give &lt;a href=&quot;http://myopenlink.net/dataspace/person/kidehen#this&quot; id=&quot;link-id140eab68&quot;&gt;me&lt;/a&gt; a de-referenceable URI&amp;quot; etc.&lt;/cite&gt; &lt;/blockquote&gt; &lt;p&gt;And each time, you respond with a &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Locator&quot; id=&quot;link-id112c1860&quot;&gt;URL&lt;/a&gt; -- which to the best of your &lt;a href=&quot;http://dbpedia.org/resource/World_Wide_Web&quot;&gt;Web&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id140b51c0&quot;&gt;knowledge&lt;/a&gt; is a bona fide URI. But to your utter confusion you are told: Nah! 
You gave me a Document URI instead of the URI of a real-world thing or object etc.&lt;/p&gt; &lt;h3&gt;What&amp;#39;s up with that?&lt;/h3&gt; &lt;p&gt;Well, our everyday use of the Web is an unfortunate conflation of two distinct things that have Identity: Real World Objects (RWOs) &amp;amp; Address/Location of Documents (&lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id144838b0&quot;&gt;Information&lt;/a&gt; bearing Resources).&lt;/p&gt; &lt;p&gt;The &amp;quot;Linked Data&amp;quot; meme is about enhancing the Web by unobtrusively reintroducing its core essence: the generic HTTP URI, a vital piece of Web Architecture DNA. Basically, it&amp;#39;s about realizing the full capabilities of the Web as a platform for Open Data Identification, Definition, Access, Storage, Representation, Presentation, and Integration.&lt;/p&gt; &lt;h3&gt;What is a Real World Object?&lt;/h3&gt; &lt;p&gt;People, Places, Music, Books, Cars, Ideas, Emotions etc.&lt;/p&gt; &lt;h3&gt;What is a URI?&lt;/h3&gt; &lt;p&gt;A Uniform Resource Identifier. A global identifier mechanism for network addressable data items. Its sole function is Name oriented Identification.&lt;/p&gt; &lt;h4&gt;URI Generic Syntax&lt;/h4&gt; &lt;p&gt;The constituent parts of a URI (from &lt;a href=&quot;http://www.ietf.org/rfc/rfc2396.txt&quot; id=&quot;link-id1180c700&quot;&gt;URI Generic Syntax RFC&lt;/a&gt;) are depicted below: &lt;img src=&quot;http://virtuoso.openlinksw.com/images/generic_uri_syntax_image.png&quot; /&gt; &lt;/p&gt; &lt;h3&gt;What is a URL?&lt;/h3&gt; &lt;p&gt;A location oriented HTTP scheme based URI. The HTTP scheme introduces a powerful and inherent duality that delivers:&lt;/p&gt; &lt;ol&gt; &lt;li&gt; Resource Address/Location Identifier&lt;/li&gt; &lt;li&gt; Data Access mechanism for an Information bearing Resource (Document, File etc.) 
&lt;/li&gt; &lt;/ol&gt; &lt;p&gt;So far so good!&lt;/p&gt; &lt;h3&gt;What is an HTTP based URI?&lt;/h3&gt; &lt;p&gt;The kind of URI &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id11100a28&quot;&gt;Linked Data&lt;/a&gt; aficionados mean when they use the term: URI.&lt;/p&gt; &lt;p&gt;An HTTP URI is an HTTP scheme based URI. Unlike a URL, this kind of HTTP scheme URI is devoid of any Web Location orientation or specificity. Thus, its inherent duality provides a more powerful level of abstraction. Hence, you can use this form of URI to assign Names/Identifiers to Real World Objects (RWO). Even better, courtesy of the Identity/Address duality of the HTTP scheme, a single URI can deliver the following:&lt;/p&gt; &lt;ol&gt; &lt;li&gt; RWO Identifier/Name&lt;/li&gt; &lt;li&gt; RWO Metadata document Locator (courtesy of URL aspect) &lt;/li&gt; &lt;li&gt; Negotiable Representation of the Located Document (courtesy of HTTP&amp;#39;s content negotiation feature).&lt;/li&gt; &lt;/ol&gt; &lt;h3&gt;What is Metadata?&lt;/h3&gt; &lt;p&gt; Data about Data. Put differently, data that describes other data in a structured manner.&lt;/p&gt; &lt;h3&gt;How Do we Model Metadata?&lt;/h3&gt; &lt;p&gt;The predominant model for metadata is the &lt;a href=&quot;http://dbpedia.org/resource/Entity-attribute-value_model&quot; id=&quot;link-id11193d30&quot;&gt;Entity&lt;/a&gt;-Attribute-Value + Classes &amp;amp; Relationships model (&lt;a href=&quot;http://dbpedia.org/resource/Entity-attribute-value_model&quot; id=&quot;link-id11725710&quot;&gt;EAV&lt;/a&gt;/CR). A model that&amp;#39;s been with us since the inception of modern computing (long before the Web). &lt;/p&gt; &lt;h3&gt;What about RDF?&lt;/h3&gt; &lt;p&gt;The Resource Description Framework (RDF) is a framework for describing Web addressable resources. In a nutshell, it&amp;#39;s a framework for adding Metadata bearing Information Resources to the current Web. 
It comprises:&lt;/p&gt; &lt;ol&gt; &lt;li&gt; Entity-Attribute-Value (aka Subject-Predicate-Object) plus Classes &amp;amp; Relationships (&lt;a href=&quot;http://dbpedia.org/resource/Data_dictionary&quot; id=&quot;link-id138df0f8&quot;&gt;Data Dictionaries&lt;/a&gt; e.g., &lt;a href=&quot;http://dbpedia.org/resource/Web_Ontology_Language&quot; id=&quot;link-id116bf590&quot;&gt;OWL&lt;/a&gt;) metadata model&lt;/li&gt; &lt;li&gt; A plethora of instance data representation formats that include: &lt;a href=&quot;http://dbpedia.org/resource/RDFa&quot; id=&quot;link-id13360b90&quot;&gt;RDFa&lt;/a&gt; (when doing so within (X)HTML docs), Turtle, N3, TriX, RDF/XML etc. &lt;/li&gt; &lt;/ol&gt; &lt;h3&gt;What&amp;#39;s the Problem Today?&lt;/h3&gt; &lt;p&gt;The ubiquitous use of the Web is primarily focused on a Linked Mesh of Information bearing Documents. URLs rather than generic HTTP URIs are the prime mechanism for Web tapestry; basically, we use URLs to conduct Information -- which is inherently subjective -- instead of using HTTP URIs to conduct &amp;quot;Raw Data&amp;quot; -- which is inherently objective. &lt;/p&gt; &lt;blockquote&gt; &lt;strong&gt;Note:&lt;/strong&gt; Information is &amp;quot;data in &lt;a href=&quot;http://dbpedia.org/resource/Context_%28language_use%29&quot; id=&quot;link-id1395ca50&quot;&gt;context&lt;/a&gt;&amp;quot;, it isn&amp;#39;t the same thing as &amp;quot;Raw Data&amp;quot;. Thus, if we can link to Information via the Web, why shouldn&amp;#39;t we be able to do the same for &amp;quot;Raw Data&amp;quot;?&lt;/blockquote&gt; &lt;h3&gt;How Does the Linked Data &lt;a href=&quot;http://dbpedia.org/resource/Meme&quot; id=&quot;link-id1160ab70&quot;&gt;meme&lt;/a&gt; solve the problem?&lt;/h3&gt; &lt;p&gt;The meme simply provides a set of guidelines (best practices) for producing Web architecture friendly metadata. 
Meaning: when producing EAV/CR model based metadata, endow Subjects, their Attributes, and Attribute Values (optionally) with HTTP URIs. By doing so, a new level of Link Abstraction on the Web is possible i.e., &amp;quot;Data Item to Data Item&amp;quot; level links (aka &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id137a78a0&quot;&gt;hyperdata&lt;/a&gt; links). Even better, when you de-reference a RWO hyperdata link you end up with a negotiated representation of its metadata.&lt;/p&gt; &lt;h3&gt;Conclusion&lt;/h3&gt; &lt;p&gt;Linked Data is ultimately about an HTTP URI for each item in the &lt;a href=&quot;http://dbpedia.org/resource/Data_hierarchy&quot; id=&quot;link-id1393c3e0&quot;&gt;Data Organization Hierarchy&lt;/a&gt; :-)&lt;/p&gt; &lt;h3&gt;Related&lt;/h3&gt; &lt;ol&gt; &lt;li&gt; &lt;a href=&quot;http://www.nabble.com/Review-of-new-HTTPbis-text-for-303-See-Other-to24035004.html#a24774368&quot; id=&quot;link-id10fcaba8&quot;&gt;History of how &amp;quot;Resource&amp;quot; became part of URI&lt;/a&gt; - historic account by &lt;a href=&quot;http://www.w3.org/People/Berners-Lee/card#i&quot; id=&quot;link-id1172b128&quot;&gt;TimBL&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.w3.org/DesignIssues/LinkedData.html&quot; id=&quot;link-id1338cbd0&quot;&gt;Linked Data Design Issues Document&lt;/a&gt; - &lt;a href=&quot;http://www.w3.org/People/Berners-Lee/card#i&quot; id=&quot;link-id13536ad8&quot;&gt;TimBL&lt;/a&gt;&amp;#39;s initial Linked Data Guide&lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1561&quot; id=&quot;link-id116c1af8&quot;&gt;Linked Data Rules Simplified&lt;/a&gt; - My attempt at simplifying the Linked Data Meme without &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id116c3b40&quot;&gt;SPARQL&lt;/a&gt; &amp;amp; RDF distraction&lt;/li&gt; &lt;li&gt; &lt;a 
href=&quot;http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1547&quot; id=&quot;link-id135dd1b8&quot;&gt;Linked Data &amp;amp; Identity&lt;/a&gt; - another related post&lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1565&quot; id=&quot;link-id134afc50&quot;&gt;The Linked Data Meme&amp;#39;s Value Proposition&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://delicious.com/kidehen/identifier_scheme&quot; id=&quot;link-id14cc7e18&quot;&gt;My Del.icio.us hosted Bookmark Data Space for Identity Schemes&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html&quot; id=&quot;link-id115a3748&quot;&gt;TimBL&amp;#39;s Ted Talk re. &amp;quot;Raw Linked Data&amp;quot;&lt;/a&gt;.&lt;/li&gt; &lt;/ol&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Social Web Camp (#5 of 5)</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2009-04-30#1555</atom:id>
  <atom:published>2009-04-30T16:14:02Z</atom:published>
  <atom:updated>2009-04-30T12:51:54-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;(Last of five posts related to the &lt;a href=&quot;http://www2009.org/&quot; id=&quot;link-id0x112efd58&quot;&gt;WWW 2009&lt;/a&gt; conference, held the week of April 20, 2009.) &lt;/p&gt; &lt;p&gt;The social networks camp was interesting, with a special meeting around Twitter. Half jokingly, we (that is, the OpenLink folks attending) concluded that societies would never be completely classless, although mobility between, as well as criteria for membership in, given classes would vary with time and circumstance. Now, there would be a new class division between people for whom micro-blogging is obligatory and those for whom it is an option.&lt;/p&gt; &lt;p&gt;By my experience, a great deal is possible in a short time, but this possibility depends on focus and concentration. These are increasingly rare. I am a great believer in core competence and focus. This is not only for geeks -- one can have a lot of breadth-of-scope but this too depends on not getting sidetracked by constant &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x14e380b8&quot;&gt;information&lt;/a&gt; overload.&lt;/p&gt; &lt;p&gt;Insofar as personal success depends on constant reaction to online social media, this comes at a cost in time and focus and this cost will have to be managed somehow, for example by automation or outsourcing. But if the social media is only automated fronts twitting and re-twitting among themselves, a bit like electronic trading systems do with securities, with or without human operators, the value of the medium decreases.&lt;/p&gt; &lt;p&gt;There are contradictory requirements. On one hand, what is said in electronic media is essentially permanent, so one had best only say things that are well considered. On the other hand, one must say these things without adequate time for reflection or analysis. 
To cope with this, one must have a well-rehearsed position that is compacted so that it fits in a short format and is easy to remember and unambiguous to express. A culture of pre-cooked fast-food advertising cuts down on depth. Real-world things are complex and multifaceted. Besides, prevalent patterns of communication train the brain for a certain mode of functioning. If we train for rapid-fire 140-character messaging, we optimize one side but probably at the expense of another. In the meantime, the world continues developing increased complexity by all kinds of emergent effects. Connectivity is good but don&amp;#39;t get lost in it.&lt;/p&gt; &lt;p&gt;There is &lt;a href=&quot;https://www.cia.gov/library/center-for-the-study-of-intelligence/csi-publications/books-and-monographs/psychology-of-intelligence-analysis/index.html&quot; id=&quot;link-id170cb010&quot;&gt;a CIA memorandum about how analysts misinterpret data and see what they want to see&lt;/a&gt;. This is a relevant resource for understanding some psychology of perception and memory. With the information overload, largely driven by user generated content, interpreting fragmented and variously-biased real-time information is not only for the analyst but for everyone who needs to intelligently function in cyber-social space.&lt;/p&gt; &lt;p&gt;I participated in discussions on security and privacy and on mobile social networks and context.&lt;/p&gt; &lt;p&gt;For privacy, the main thing turned out to be whether people should be protected from themselves. Should information expire? Will it get buried by itself under huge volumes of new content? Well, for purposes of visibility, it will certainly get buried and will require constant management to stay visible. 
But for purposes of future finding of dirt, it will stay findable for those who are looking.&lt;/p&gt; &lt;p&gt;There is also the corollary of setting security for resources, like documents, versus setting security for statements, i.e., structured data like social networks. As I have blogged before, policies &lt;a id=&quot;link-id14aaff90&quot;&gt;à la&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x13d77830&quot;&gt;SQL&lt;/a&gt; do not work well when schema is fluid and end-users can&amp;#39;t be expected to formulate or understand these. Remember &lt;a href=&quot;http://dbpedia.org/resource/Ted_Nelson&quot; id=&quot;link-id0x156ceae0&quot;&gt;Ted Nelson&lt;/a&gt;? A user interface should be such that a beginner understands it in 10 seconds in an emergency. The user interaction question is how to present things so that the user understands who will have access to what content. Also, users should themselves be able to check what potentially sensitive information can be found out about them. A service along the lines of Garlic&amp;#39;s Data Patrol should be a part of the social web infrastructure of the future.&lt;/p&gt; &lt;p&gt;People at MIT have developed AIR (Accountability In &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x14e2abc0&quot;&gt;RDF&lt;/a&gt;) for expressing policies about what can be done with data and for explaining why access is denied if it is denied. However, if we at all look at the history of secrets, it is rather seldom that one hears that access to information about X is restricted to compartment so-and-so; it is much more common to hear that there is no X. I would say that a policy system that just leaves out information that is not supposed to be available will please the users more. 
This is not only so for organizations; it is fully plausible that an individual might not wish to expose even the existence of some selected inner circle of friends, their parties together, or whatever.&lt;/p&gt; &lt;p&gt;In conclusion, there is no self-evident solution for careless use of social media. A site that requires people to confirm multiple times that they know what they are doing when publishing a photo will not get much use. We will see.&lt;/p&gt; &lt;p&gt;For mobility, there was some talk about the context of usage. Again, this is difficult. For different contexts, one would for example disclose one&amp;#39;s location at the granularity of the city; for some other purposes, one would say which conference room one is in.&lt;/p&gt; &lt;p&gt;Embarrassing social situations may arise if mobile devices are too clever: If information about travel is pushed into the social network, one would feel like having to explain why one does not call on such-and-such a person and so on. Too much initiative in the mobile phone seems like a recipe for problems.&lt;/p&gt; &lt;p&gt;There is a thin line between convenience and having IT infrastructure rule one&amp;#39;s life. The complexities and subtleties of social situations ought not to be reduced to the level of if-then rules. People and their interactions are more complex than they themselves often realize. A system is not its own metasystem, as Gödel put it. Similarly, human self-&lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id0x70d82ff8&quot;&gt;knowledge&lt;/a&gt;, let alone knowledge about another, is by this very principle only approximate. Not to forget what psychology tells us about state-dependent recall and of how circumstance can evoke patterns of behavior before one even notices. The history of expert systems did show that people do not do very well at putting their skills in the form of if-then rules. 
Thus automating sociality past a certain point seems a problematic proposition.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Social Web Camp (#5 of 5)</atom:title>
  <atom:id>http://www.openlinksw.com/weblog/oerling/?date=2009-04-30#1554</atom:id>
  <atom:published>2009-04-30T16:14:02Z</atom:published>
  <atom:updated>2009-04-30T12:51:49-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;(Last of five posts related to the &lt;a href=&quot;http://www2009.org/&quot; id=&quot;link-id0xd28c860&quot;&gt;WWW 2009&lt;/a&gt; conference, held the week of April 20, 2009.) &lt;/p&gt; &lt;p&gt;The social networks camp was interesting, with a special meeting around Twitter. Half jokingly, we (that is, the OpenLink folks attending) concluded that societies would never be completely classless, although mobility between, as well as criteria for membership in, given classes would vary with time and circumstance. Now, there would be a new class division between people for whom micro-blogging is obligatory and those for whom it is an option.&lt;/p&gt; &lt;p&gt;By my experience, a great deal is possible in a short time, but this possibility depends on focus and concentration. These are increasingly rare. I am a great believer in core competence and focus. This is not only for geeks -- one can have a lot of breadth-of-scope but this too depends on not getting sidetracked by constant &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x10019a70&quot;&gt;information&lt;/a&gt; overload.&lt;/p&gt; &lt;p&gt;Insofar as personal success depends on constant reaction to online social media, this comes at a cost in time and focus and this cost will have to be managed somehow, for example by automation or outsourcing. But if the social media is only automated fronts twitting and re-twitting among themselves, a bit like electronic trading systems do with securities, with or without human operators, the value of the medium decreases.&lt;/p&gt; &lt;p&gt;There are contradictory requirements. On one hand, what is said in electronic media is essentially permanent, so one had best only say things that are well considered. On the other hand, one must say these things without adequate time for reflection or analysis. 
To cope with this, one must have a well-rehearsed position that is compacted so that it fits in a short format and is easy to remember and unambiguous to express. A culture of pre-cooked fast-food advertising cuts down on depth. Real-world things are complex and multifaceted. Besides, prevalent patterns of communication train the brain for a certain mode of functioning. If we train for rapid-fire 140-character messaging, we optimize one side but probably at the expense of another. In the meantime, the world continues developing increased complexity by all kinds of emergent effects. Connectivity is good but don&amp;#39;t get lost in it.&lt;/p&gt; &lt;p&gt;There is &lt;a href=&quot;https://www.cia.gov/library/center-for-the-study-of-intelligence/csi-publications/books-and-monographs/psychology-of-intelligence-analysis/index.html&quot; id=&quot;link-id170cb010&quot;&gt;a CIA memorandum about how analysts misinterpret data and see what they want to see&lt;/a&gt;. This is a relevant resource for understanding some psychology of perception and memory. With the information overload, largely driven by user generated content, interpreting fragmented and variously-biased real-time information is not only for the analyst but for everyone who needs to intelligently function in cyber-social space.&lt;/p&gt; &lt;p&gt;I participated in discussions on security and privacy and on mobile social networks and context.&lt;/p&gt; &lt;p&gt;For privacy, the main thing turned out to be whether people should be protected from themselves. Should information expire? Will it get buried by itself under huge volumes of new content? Well, for purposes of visibility, it will certainly get buried and will require constant management to stay visible. 
But for purposes of future finding of dirt, it will stay findable for those who are looking.&lt;/p&gt; &lt;p&gt;There is also the corollary of setting security for resources, like documents, versus setting security for statements, i.e., structured data like social networks. As I have blogged before, policies &lt;a id=&quot;link-id14aaff90&quot;&gt;à la&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x10b058d0&quot;&gt;SQL&lt;/a&gt; do not work well when schema is fluid and end-users can&amp;#39;t be expected to formulate or understand these. Remember &lt;a href=&quot;http://dbpedia.org/resource/Ted_Nelson&quot; id=&quot;link-id0x145b3070&quot;&gt;Ted Nelson&lt;/a&gt;? A user interface should be such that a beginner understands it in 10 seconds in an emergency. The user interaction question is how to present things so that the user understands who will have access to what content. Also, users should themselves be able to check what potentially sensitive information can be found out about them. A service along the lines of Garlic&amp;#39;s Data Patrol should be a part of the social web infrastructure of the future.&lt;/p&gt; &lt;p&gt;People at MIT have developed AIR (Accountability In &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x10dec8f8&quot;&gt;RDF&lt;/a&gt;) for expressing policies about what can be done with data and for explaining why access is denied if it is denied. However, if we look at the history of secrets at all, it is rather seldom that one hears that access to information about X is restricted to compartment so-and-so; it is much more common to hear that there is no X. I would say that a policy system that just leaves out information that is not supposed to be available will please the users more. 
This is not only so for organizations; it is fully plausible that an individual might not wish to expose even the existence of some selected inner circle of friends, their parties together, or whatever.&lt;/p&gt; &lt;p&gt;In conclusion, there is no self-evident solution for careless use of social media. A site that requires people to confirm multiple times that they know what they are doing when publishing a photo will not get much use. We will see.&lt;/p&gt; &lt;p&gt;For mobility, there was some talk about the context of usage. Again, this is difficult. For different contexts, one would for example disclose one&amp;#39;s location at the granularity of the city; for some other purposes, one would say which conference room one is in.&lt;/p&gt; &lt;p&gt;Embarrassing social situations may arise if mobile devices are too clever: If information about travel is pushed into the social network, one would feel like having to explain why one does not call on such-and-such a person and so on. Too much initiative in the mobile phone seems like a recipe for problems.&lt;/p&gt; &lt;p&gt;There is a thin line between convenience and having IT infrastructure rule one&amp;#39;s life. The complexities and subtleties of social situations ought not to be reduced to the level of if-then rules. People and their interactions are more complex than they themselves often realize. A system is not its own metasystem, as Gödel put it. Similarly, human self-&lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id0xd7b1808&quot;&gt;knowledge&lt;/a&gt;, let alone knowledge about another, is by this very principle only approximate. Not to forget what psychology tells us about state-dependent recall and of how circumstance can evoke patterns of behavior before one even notices. The history of expert systems did show that people do not do very well at putting their skills in the form of if-then rules. 
Thus automating sociality past a certain point seems a problematic proposition.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>YODA &amp; the Data FORCE</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2008-11-03#1474</atom:id>
  <atom:published>2008-11-03T17:32:49Z</atom:published>
  <atom:updated>2008-11-06T09:04:34.000004-05:00</atom:updated>
  <atom:content type="html">&lt;p&gt;The original &lt;a href=&quot;http://www.w3.org/History/1989/proposal.html&quot; id=&quot;link-id13b25ba8&quot;&gt;design document&lt;/a&gt; (by &lt;a href=&quot;http://www.w3.org/People/Berners-Lee/card#i&quot; id=&quot;link-id181e4c70&quot;&gt;TimBL&lt;/a&gt;) that led to the WWW (*an important read*) was very clear about the need to create an &amp;quot;&lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id10f23918&quot;&gt;information&lt;/a&gt; space&amp;quot; that connects heterogeneous &lt;a href=&quot;http://dbpedia.org/resource/Data&quot;&gt;data&lt;/a&gt; sources. Unfortunately, in trying to create a moniker to distinguish one aspect of the &lt;a href=&quot;http://dbpedia.org/resource/World_Wide_Web&quot;&gt;Web&lt;/a&gt; (the Linked Document Web) from the part that was overlooked (the &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id11096818&quot;&gt;Linked Data&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Giant_Global_Graph&quot; id=&quot;link-id1b9c6b98&quot;&gt;Web&lt;/a&gt;), we ended up with a project code name that&amp;#39;s fundamentally a misnomer in the form of: &amp;quot;The &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id10ffe228&quot;&gt;Semantic Web&lt;/a&gt;&amp;quot;.&lt;/p&gt; &lt;p&gt;If we could just take &amp;quot;The Semantic Web&amp;quot; moniker for what it was -- a code name for an aspect of the Web -- and move on, things would get much clearer, fast!&lt;/p&gt; &lt;p&gt;Basically, what is/was the &amp;quot;Semantic Web&amp;quot; should really have been code named: (&amp;quot;You&amp;quot; Oriented Data Access) as a play on: Yoda&amp;#39;s appreciation of the FORCE (Fact ORiented Connected Entities) -- the power of intergalactic, interlinked, structured data, fashioned by the &lt;a href=&quot;http://dbpedia.org/resource/World_Wide_Web&quot; id=&quot;link-id191b22e0&quot;&gt;World Wide Web&lt;/a&gt; 
courtesy of the HTTP protocol.&lt;/p&gt; &lt;div&gt; &lt;img src=&quot;http://www.the-planets.com/star-biography/yoda_biography_3.jpg&quot; /&gt; &lt;/div&gt; &lt;p&gt;As stated in an earlier post, the next phase of the Web is all about the magic of &lt;a href=&quot;http://dbpedia.org/resource/Entity&quot; id=&quot;link-id1a7395f0&quot;&gt;entity&lt;/a&gt; &amp;quot;You&amp;quot;. The single most important item of reference to every Web user would be the Person Entity &lt;a href=&quot;http://dbpedia.org/resource/Identity_%28object-oriented_programming%29&quot; id=&quot;link-id16ab9308&quot;&gt;ID&lt;/a&gt; (&lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Identifier&quot; id=&quot;link-id1d403c88&quot;&gt;URI&lt;/a&gt;). Just by remembering your Entity ID, you will have intelligent pathways across, and into, the FORCE that the Linked Data Web delivers. The quality of the pathways and increased density of the FORCE are the keys to high &lt;a href=&quot;http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1442&quot; id=&quot;link-id1c549b28&quot;&gt;SDQ&lt;/a&gt; (tomorrow&amp;#39;s SEO). Thus, the SDQ of URIs will ultimately be the unit determinant of value to Web Users along the following personal lines:&lt;/p&gt; &lt;ul&gt; &lt;li&gt; Does your platform give &lt;a href=&quot;http://myopenlink.net/dataspace/person/kidehen#this&quot; id=&quot;link-id175afe00&quot;&gt;me&lt;/a&gt; Identity (a URI) with high SDQ?&lt;/li&gt; &lt;li&gt; Do the Data Source Names (URIs) in your Data Spaces deliver high SDQ? &lt;/li&gt; &lt;/ul&gt; &lt;p&gt;While most industry commentators continue to ponder and pontificate about what &amp;quot;The Semantic Web&amp;quot; is (unfortunately), the real thing (the &amp;quot;FORCE&amp;quot;) is already here, and self-enhancing rapidly. 
&lt;/p&gt; &lt;p&gt;Assuming we now accept the FORCE is simply an RDF based Linked Data moniker, and that RDF Linked Data is all about the Web as a structured database, we should start to move our attention over to practical exploitation of this burgeoning global database, and in doing so we should not discard &lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id19e2c6e0&quot;&gt;knowledge&lt;/a&gt; from the past such as the many great examples available gratis from the Relational Database realm. For instance, we should start paying attention to the discovery, development, and deployment of high level tools such as query builders, report writers, and intelligence oriented analytic tools, none of which should -- at first point of interaction -- expose raw RDF or the &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id117921f0&quot;&gt;SPARQL&lt;/a&gt; query language. Along similar lines of thinking, we also need development environments and frameworks that are counterparts to Visual Studio, ACCESS, File Maker, and the like.&lt;/p&gt; &lt;h3&gt;Related&lt;/h3&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1458&quot; id=&quot;link-id1cec1a40&quot;&gt;Numerati &amp;amp; The Magic of You!&lt;/a&gt; &lt;/li&gt; &lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso - Are We Too Clever for Our Own Good? (updated)</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-10-26#1467</atom:id>
  <atom:published>2008-10-26T12:15:35Z</atom:published>
  <atom:updated>2008-10-27T12:07:58-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;&amp;quot;Physician, heal thyself,&amp;quot; it is said. We profess to say what the messaging of the &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id0x1b4a25f0&quot;&gt;semantic web&lt;/a&gt; ought to be, but is our own perfect?&lt;/p&gt; &lt;p&gt;I will here engage in some critical introspection as well as amplify on some answers given to &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1e4f9928&quot;&gt;Virtuoso&lt;/a&gt;-related questions in recent times.&lt;/p&gt; &lt;p&gt;I use some conversations from the &lt;a href=&quot;http://dbpedia.org/resource/Vienna&quot; id=&quot;link-id0x1e6c0ca8&quot;&gt;Vienna&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x1e56df88&quot;&gt;Linked Data&lt;/a&gt; Practitioners meeting as a starting point. These views are mine and are limited to the Virtuoso server. These do not apply to the &lt;a href=&quot;http://dbpedia.org/resource/OpenLink_Data_Spaces&quot; id=&quot;link-id0x1e680440&quot;&gt;ODS&lt;/a&gt; (&lt;a href=&quot;http://dbpedia.org/resource/OpenLink_Data_Spaces&quot; id=&quot;link-id0x1e140068&quot;&gt;OpenLink Data Spaces&lt;/a&gt;) applications line, &lt;a href=&quot;http://oat.openlinksw.com/&quot; id=&quot;link-id0x1f4ba630&quot;&gt;OAT&lt;/a&gt; (&lt;a href=&quot;http://oat.openlinksw.com/&quot; id=&quot;link-id0x1ba4bac8&quot;&gt;OpenLink Ajax Toolkit&lt;/a&gt;), or &lt;a href=&quot;http://ode.openlinksw.com/&quot; id=&quot;link-id0x1d4159b0&quot;&gt;ODE&lt;/a&gt; (&lt;a href=&quot;http://ode.openlinksw.com/&quot; id=&quot;link-id0x1e973c80&quot;&gt;OpenLink Data Explorer&lt;/a&gt;).&lt;/p&gt; &lt;h3&gt;&amp;quot;It is not always clear what the main thrust is, we get the impression that you are spread too thin,&amp;quot; said &lt;a href=&quot;http://www.informatik.uni-leipzig.de/~auer/foaf.rdf#me&quot; id=&quot;link-id0x1f8bafe0&quot;&gt;Sören Auer&lt;/a&gt;.&lt;/h3&gt; 
&lt;p&gt;Well, personally, I am all for core competence. This is why I do not participate in all the online conversations and groups as much as I could, for example. Time and energy are critical resources and must be invested where they make a difference. In this case, the real core competence is running in the database race. This in itself, come to think of it, is a pretty broad concept.&lt;/p&gt; &lt;p&gt;This is why we put a lot of emphasis on Linked Data and the &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x200bd1f0&quot;&gt;Data&lt;/a&gt; Web for now, as this is the emerging game. This is a deliberate choice, not an outside imperative or built-in limitation. More specifically, this means exposing any pre-existing relational data as linked data plus being the definitive &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1fb03528&quot;&gt;RDF&lt;/a&gt; store.&lt;/p&gt; &lt;p&gt;We can do this because we own our database and &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x1e7dcc70&quot;&gt;SQL&lt;/a&gt; and data access middleware and have a history of connecting to any &lt;a href=&quot;http://dbpedia.org/resource/Relational_database_management_system&quot; id=&quot;link-id0x1e9baf18&quot;&gt;RDBMS&lt;/a&gt; out there.&lt;/p&gt; &lt;p&gt;The principal message we have been hearing from the RDF field is the call for scale of triple storage. This is even louder than the call for relational mapping. We believe that in time mapping will exceed triple storage as such, once we get some real production strength mappings deployed, enough to outperform RDF warehousing.&lt;/p&gt; &lt;p&gt;There are also RDF middleware things like RDF-ization and demand-driven web harvesting (i.e., the so-called Sponger). These are &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x1f5f6b78&quot;&gt;SPARQL&lt;/a&gt; options, thus accessed via standard interfaces. 
We have little desire to create our own languages or APIs, or to tell people how to program. This is why we recently introduced &lt;a href=&quot;http://sourceforge.net/projects/sesame/&quot; id=&quot;link-id0x206818c8&quot;&gt;Sesame&lt;/a&gt;- and &lt;a href=&quot;http://jena.sourceforge.net/&quot; id=&quot;link-id0x202b3348&quot;&gt;Jena&lt;/a&gt;-compatible APIs to our RDF store. From what we hear, these work. On the other hand, we do not hesitate to move beyond the standards when there is obvious value or necessity. This is why we brought SPARQL up to and beyond SQL expressivity. It is not a case of E3 (Embrace, Extend, Extinguish).&lt;/p&gt; &lt;p&gt;Now, this message could be better reflected in our material on the web. This &lt;a href=&quot;http://dbpedia.org/resource/Blog&quot; id=&quot;link-id0x1c82e508&quot;&gt;blog&lt;/a&gt; is a rather informal step in this direction; more is to come. For now we concentrate on delivering.&lt;/p&gt; &lt;p&gt;The conventional communications wisdom is to split the message by target audience. For this, we should split the RDF, relational, and web services messages from each other. We believe that a challenger, like the semantic web technology stack, must have a compelling message to tell for it to be interesting. This is not a question of research prototypes. The new technology cannot lack something the installed technology takes for granted.&lt;/p&gt; &lt;p&gt;This is why we do not tend to show things like how to insert and query a few triples: No business out there will insert and query triples for the sake of triples. There must be a more compelling story: for example, turning the whole world into a database. This is why our examples start with things like turning the &lt;a href=&quot;http://dbpedia.org/resource/TPC-H&quot; id=&quot;link-id0x20832510&quot;&gt;TPC-H&lt;/a&gt; database into RDF, queries and all. Anything less is not interesting. 
Why would an enterprise that has business intelligence and integration issues way more complex than the rather stereotypical TPC-H even look at a technology that pretends to be all for integration and all for expressivity of queries, yet cannot answer the first question of the entry exam?&lt;/p&gt; &lt;p&gt;The world out there is complex. But maybe we ought to make some simple tutorials? So, as a call to the people out there, tell us what a good tutorial would be. The question is more about figuring out what is out there and adapting these and making a sort of compatibility list. Jena and Sesame stuff ought to run as is. We could offer a webinar to all the data web luminaries showing how to promote the data web message with Virtuoso. After all, why not show it on the best platform?&lt;/p&gt; &lt;h3&gt;&amp;quot;You are arrogant. When I read your papers or documentation, the impression I get is that you say you are smart and the reader is stupid.&amp;quot;&lt;/h3&gt; &lt;p&gt;We should answer in multiple parts.&lt;/p&gt; &lt;p&gt;For general collateral, like web sites and documentation:&lt;/p&gt; &lt;p&gt;The web site gives a confused product image. For the Virtuoso product, we should divide at the top into&lt;/p&gt; &lt;ul&gt; &lt;li&gt; Data web and RDF - Host linked data, expose relational assets as linked data;&lt;/li&gt; &lt;li&gt; Relational Database - Full function, high performance, open source, Federated/Virtual Relational DBMS, expose heterogeneous RDB assets through one point of contact for integration;&lt;/li&gt; &lt;li&gt; Web Services - access all the above over standard protocols, dynamic web pages, web hosting.&lt;/li&gt; &lt;/ul&gt; &lt;p&gt;For each point, one simple statement. We all know what the above things mean?&lt;/p&gt; &lt;p&gt;Then we add a new point about scalability that impacts all the above, namely the Virtuoso version 6 Cluster, meaning that you can do all these things at 10 to 1000 times the scale. 
This means this much more data or in some cases this many more requests per second. This too is clear.&lt;/p&gt; &lt;p&gt;As far as I am concerned, hosting Java or .&lt;a href=&quot;http://dbpedia.org/resource/.NET_Framework&quot; id=&quot;link-id0x20283a88&quot;&gt;NET&lt;/a&gt; does not have to be on the front page. Also, we have no great interest in going against &lt;a href=&quot;http://dbpedia.org/resource/Apache&quot; id=&quot;link-id0x2024a068&quot;&gt;Apache&lt;/a&gt; when it comes to a web server only situation. The fact that we have a web listener is important for some things but our claim to fame does not rest on this.&lt;/p&gt; &lt;p&gt;Then for documentation and training materials: The documentation should be better. Specifically it should have more of a how-to dimension since nobody reads the whole thing anyhow. About online tutorials, the order of presentation should be different. They do not really reflect what is important at the present moment either.&lt;/p&gt; &lt;p&gt;Now for conference papers: Since taking the data web as a focus area, we have submitted some papers and had some rejected because these do not have enough references and do not explain what is obvious to ourselves.&lt;/p&gt; &lt;p&gt;I think that the communications failure in this case is that we want to talk about end to end solutions and the reviewers expect research. For us, the solution is interesting and exists only if there is an adequate functionality mix for addressing a specific use case. This is why we do not make a paper about the query cost model alone because the cost model, while indispensable, is a thing that is taken for granted where we come from. So we mention RDF adaptations to the cost model, as these are important to the whole, but do not find these to be the justification for a whole paper. If we made papers on this basis, we would have to make five times as many. 
Maybe we ought to.&lt;/p&gt; &lt;h3&gt;&amp;quot;Virtuoso is very big and very difficult&amp;quot;&lt;/h3&gt; &lt;p&gt;One thing that is not obvious from the Virtuoso packaging is that the minimum installation is an executable under 10MB and a config file. Two files.&lt;/p&gt; &lt;p&gt;This gives you SQL and SPARQL out of the box. Adding &lt;a href=&quot;http://dbpedia.org/resource/Open_Database_Connectivity&quot; id=&quot;link-id0x1ee61058&quot;&gt;ODBC&lt;/a&gt; and &lt;a href=&quot;http://dbpedia.org/resource/Java_Database_Connectivity&quot; id=&quot;link-id0x1b8c31c0&quot;&gt;JDBC&lt;/a&gt; clients is as simple as it gets. After this, there is basic database functionality. Tuning is a matter of a few parameters that are explained on this blog and elsewhere. Also, the full scale installation is available as an Amazon EC2 image, so no installation required.&lt;/p&gt; &lt;p&gt;Now for the difficult side:&lt;/p&gt; &lt;p&gt;Use SQL and SPARQL; use stored procedures whenever there is server side business logic. For some time critical web pages, use VSP. Do not use VSPX. Otherwise, use whatever you are used to: &lt;a href=&quot;http://dbpedia.org/resource/PHP&quot; id=&quot;link-id0x20a13c00&quot;&gt;PHP&lt;/a&gt; or Java or anything else. For web services, simple is best. Stick to basics. &amp;quot;The engineer is one who can invent a simple thing.&amp;quot; Use SQL statements rather than admin UI.&lt;/p&gt; &lt;p&gt;Know that you can start a server with no database file and you get an initial database with nothing extra. The demo database, the way it is produced by installers, is cluttered.&lt;/p&gt; &lt;p&gt;We should put this into a couple of use case oriented how-tos.&lt;/p&gt; &lt;p&gt;Also, we should create a network of &amp;quot;friendly local virtuoso geeks&amp;quot; for providing basic training and services so we do not have to explain these things all the time. To all you data-web-ers out there: please sign up and we will provide instructions, etc. 
Contact Yrjänä Rankka (ghard[at-sign]openlinksw.com), or go through the mailing lists; do not contact me directly.&lt;/p&gt; &lt;h3&gt;&amp;quot;OK, we understand that you may be good at the large end of the spectrum but how do you reconcile this with the lightweight or embedded end, like the semantic desktop?&amp;quot;&lt;/h3&gt; &lt;p&gt;Now, what is good for one end is usually good for the other. Namely, a database, no matter the scale, needs to have space efficient storage, fast index lookup, and correct query plans. Then there are things that occur only at the high-end, like clustering, but these are separate things. For embedding, the initial memory footprint needs to be small. With Virtuoso, this is accomplished by leaving out some 200 built-in tables and 100,000 lines of SQL procedures that are normally in by default, supporting things such as DAV and diverse other protocols. After all, if SPARQL is all one wants these are not needed.&lt;/p&gt; &lt;p&gt;If one really wants to do one&amp;#39;s server logic (like web listener and thread dispatching) oneself, this is not impossible but requires some advice from us. On the other hand, if one wants to have logic for security close to the data, then using stored procedures is recommended; these execute right next to the data, and support inline SPARQL and SQL. Depending on the license status of the other code, some special licensing arrangements may apply.&lt;/p&gt; &lt;p&gt;We are talking about such things with different parties at present.&lt;/p&gt; &lt;h3&gt;&amp;quot;How webby are you? What is webby?&amp;quot;&lt;/h3&gt; &lt;p&gt;&amp;quot;Webby means distributed, heterogeneous, open; not monolithic consolidation of everything.&amp;quot;&lt;/p&gt; &lt;p&gt;We are philosophically webby. We come from open standards; we are after all called OpenLink; our history consists of connecting things. We believe in choice: the user should be able to pick the best of breed for components and have them work together. 
We cannot and do not wish to force replacement of existing assets. Transforming data on the fly and connecting systems, leaving data where it originally resides, is the first preference. For the data web, the first preference is a federation of independent SPARQL end points. When there is harvesting, we prefer to do it on demand, as with our Sponger. With the immense amount of data out there we believe in finding what is relevant &lt;i&gt;when&lt;/i&gt; it is relevant, preferably close at hand, leveraging things like social networks. With a data web, many things which are now siloized, such as marketplaces and social networks, will return to the open.&lt;/p&gt; &lt;p&gt;Google-style crawling of everything becomes less practical if one needs to run complex &lt;i&gt;ad hoc&lt;/i&gt; queries against the mass of data. For these types of scenarios, if one needs to warehouse, the data cloud will offer solutions where one pays for database on demand. While we believe in loosely coupled federation where possible, we have serious work on the scalability side for the data center and the compute-on-demand cloud.&lt;/p&gt; &lt;h3&gt;&amp;quot;How does OpenLink see the next five years unfolding?&amp;quot;&lt;/h3&gt; &lt;p&gt;Personally, I think we have the basics for the birth of a new inflection in the &lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id0x1fb9ae58&quot;&gt;knowledge&lt;/a&gt; economy. The &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Identifier&quot; id=&quot;link-id0x1f07c648&quot;&gt;URI&lt;/a&gt; is the unit of exchange; its value and competitive edge lie in the data it links you with. A name without context is worth little, but as a name gets more use, more &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x1f007d60&quot;&gt;information&lt;/a&gt; can be found through that name. This is anything from financial statistics, to legal precedents, to news reporting or government data. 
Right now, if the SEC just added one line of markup to the XBRL template, this would instantaneously make all SEC-mandated reporting into linked data via GRDDL.&lt;/p&gt; &lt;p&gt;The URI is a carrier of brand. An information brand gets traffic and references, and this can be monetized in diverse ways. The key word is &lt;i&gt;context&lt;/i&gt;. Information overload is here to stay, and only better context offers the needed increase in productivity to stay ahead of the flood.&lt;/p&gt; &lt;p&gt;Semantic technologies on the whole can help with this. Why these should be semantic web or data web technologies as opposed to just semantic is the linked data value proposition. Even smart islands are still islands. Agility, scale, and scope depend on the possibility of combining things. Therefore common terminologies and dereferenceability and discoverability are important. Without these, we are at best dealing with closed systems even if they were smart. The expert systems of the 1980s are a case in point.&lt;/p&gt; &lt;p&gt;Ever since the .com era, the &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Locator&quot; id=&quot;link-id0x2048e670&quot;&gt;URL&lt;/a&gt; has been a brand. Now it becomes a URI. Thus, entirely hiding the URI from the user experience is not always desirable. The URI is a sort of handle on the provenance and where more can be found; besides, people are already used to these.&lt;/p&gt; &lt;p&gt;With linked data, information value-add products become easy to build and deploy. They can be basically just canned SPARQL queries combining data in a useful and insightful manner. And where there is traffic there can be monetization, whether by advertising, subscription, or other means. Such possibilities are a natural adjunct to the blogosphere. To publish analysis, one no longer needs to be a think tank or media company. We could call this scenario the birth of a meshup economy.&lt;/p&gt; &lt;p&gt;For OpenLink itself, this is our roadmap. 
The immediate future is about getting our high end offerings like clustered RDF storage generally available, both on the cloud and for private data centers. Ourselves, we will offer the whole &lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id0x1c696170&quot;&gt;Linked Open Data&lt;/a&gt; cloud as a database. The single feature to come in version 2 of this is fully automatic partitioning and repartitioning for on-demand scale; now, you have to choose how many partitions you have.&lt;/p&gt; &lt;p&gt;This makes some things possible that were hard thus far.&lt;/p&gt; &lt;p&gt;On the mapping front, we go for real-scale data integration scenarios where we can show that SPARQL can unify terms and concepts across databases, yet bring no added cost for complex queries. Enterprises can use their existing warehouses and have an added level of abstraction, the possibility of cross systems interlinking, the advantages of using the same taxonomies and ontologies across systems, and so forth.&lt;/p&gt; &lt;p&gt;Then there will be developments in the direction of smarter web harvesting on demand with the Virtuoso &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/VirtSpongerWhitePaper.html&quot; id=&quot;link-id0x206ab780&quot;&gt;Sponger&lt;/a&gt;, and federation of heterogeneous SPARQL end points. The federation is not so unlike clustering, except the time scales are 2 orders of magnitude longer. The work on SPARQL end point statistics and data set description and discovery is a good development in the community.&lt;/p&gt; &lt;p&gt;Then there will be NLP integration, as exemplified by the Open Calais linked data wrapper and more.&lt;/p&gt; &lt;p&gt;Can we pull this off or is this being spread too thin? We know from experience that all this can be accomplished. Scale is already here; we show it with the billion triples set. Mapping is here; we showed it last in the Berlin Benchmark. 
We will also show some TPC-H results after we get a little quiet after the ISWC event. Then there is ongoing maintenance but with this we have shown a steady turnaround and quick time to fix for pretty much anything.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso - Are We Too Clever for Our Own Good? (updated)</atom:title>
  <atom:id>http://www.openlinksw.com/weblog/oerling/?date=2008-10-26#1465</atom:id>
  <atom:published>2008-10-26T12:15:35Z</atom:published>
  <atom:updated>2008-10-27T12:07:52-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;&amp;quot;Physician, heal thyself,&amp;quot; it is said. We profess to say what the messaging of the &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id0x1fa3da18&quot;&gt;semantic web&lt;/a&gt; ought to be, but is our own perfect?&lt;/p&gt; &lt;p&gt;I will here engage in some critical introspection as well as amplify on some answers given to &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1e1eecf0&quot;&gt;Virtuoso&lt;/a&gt;-related questions in recent times.&lt;/p&gt; &lt;p&gt;I use some conversations from the &lt;a href=&quot;http://dbpedia.org/resource/Vienna&quot; id=&quot;link-id0x1ec0b2e0&quot;&gt;Vienna&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x2045ac10&quot;&gt;Linked Data&lt;/a&gt; Practitioners meeting as a starting point. These views are mine and are limited to the Virtuoso server. These do not apply to the &lt;a href=&quot;http://dbpedia.org/resource/OpenLink_Data_Spaces&quot; id=&quot;link-id0x2045ac38&quot;&gt;ODS&lt;/a&gt; (&lt;a href=&quot;http://dbpedia.org/resource/OpenLink_Data_Spaces&quot; id=&quot;link-id0x14f63c58&quot;&gt;OpenLink Data Spaces&lt;/a&gt;) applications line, &lt;a href=&quot;http://oat.openlinksw.com/&quot; id=&quot;link-id0x14f63c80&quot;&gt;OAT&lt;/a&gt; (&lt;a href=&quot;http://oat.openlinksw.com/&quot; id=&quot;link-id0x1e536928&quot;&gt;OpenLink Ajax Toolkit&lt;/a&gt;), or &lt;a href=&quot;http://ode.openlinksw.com/&quot; id=&quot;link-id0x1eaed7f8&quot;&gt;ODE&lt;/a&gt; (&lt;a href=&quot;http://ode.openlinksw.com/&quot; id=&quot;link-id0x1edfff88&quot;&gt;OpenLink Data Explorer&lt;/a&gt;).&lt;/p&gt; &lt;h3&gt;&amp;quot;It is not always clear what the main thrust is, we get the impression that you are spread too thin,&amp;quot; said &lt;a href=&quot;http://www.informatik.uni-leipzig.de/~auer/foaf.rdf#me&quot; id=&quot;link-id0x1b8a9580&quot;&gt;Sören Auer&lt;/a&gt;.&lt;/h3&gt; 
&lt;p&gt;Well, personally, I am all for core competence. This is why I do not participate in all the online conversations and groups as much as I could, for example. Time and energy are critical resources and must be invested where they make a difference. In this case, the real core competence is running in the database race. This in itself, come to think of it, is a pretty broad concept.&lt;/p&gt; &lt;p&gt;This is why we put a lot of emphasis on Linked Data and the &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1b85fa38&quot;&gt;Data&lt;/a&gt; Web for now, as this is the emerging game. This is a deliberate choice, not an outside imperative or built-in limitation. More specifically, this means exposing any pre-existing relational data as linked data plus being the definitive &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1f5b4468&quot;&gt;RDF&lt;/a&gt; store.&lt;/p&gt; &lt;p&gt;We can do this because we own our database and &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x20076468&quot;&gt;SQL&lt;/a&gt; and data access middleware and have a history of connecting to any &lt;a href=&quot;http://dbpedia.org/resource/Relational_database_management_system&quot; id=&quot;link-id0x1ffd6f98&quot;&gt;RDBMS&lt;/a&gt; out there.&lt;/p&gt; &lt;p&gt;The principal message we have been hearing from the RDF field is the call for scale of triple storage. This is even louder than the call for relational mapping. We believe that in time mapping will exceed triple storage as such, once we get some real production strength mappings deployed, enough to outperform RDF warehousing.&lt;/p&gt; &lt;p&gt;There are also RDF middleware things like RDF-ization and demand-driven web harvesting (i.e., the so-called Sponger). These are &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x1316f720&quot;&gt;SPARQL&lt;/a&gt; options, thus accessed via standard interfaces. 
We have little desire to create our own languages or APIs, or to tell people how to program. This is why we recently introduced &lt;a href=&quot;http://sourceforge.net/projects/sesame/&quot; id=&quot;link-id0x20756a68&quot;&gt;Sesame&lt;/a&gt;- and &lt;a href=&quot;http://jena.sourceforge.net/&quot; id=&quot;link-id0x1ec01ac0&quot;&gt;Jena&lt;/a&gt;-compatible APIs to our RDF store. From what we hear, these work. On the other hand, we do not hesitate to move beyond the standards when there is obvious value or necessity. This is why we brought SPARQL up to and beyond SQL expressivity. It is not a case of E3 (Embrace, Extend, Extinguish).&lt;/p&gt; &lt;p&gt;Now, this message could be better reflected in our material on the web. This &lt;a href=&quot;http://dbpedia.org/resource/Blog&quot; id=&quot;link-id0x2027b410&quot;&gt;blog&lt;/a&gt; is a rather informal step in this direction; more is to come. For now we concentrate on delivering.&lt;/p&gt; &lt;p&gt;The conventional communications wisdom is to split the message by target audience. For this, we should split the RDF, relational, and web services messages from each other. We believe that a challenger, like the semantic web technology stack, must have a compelling message to tell for it to be interesting. This is not a question of research prototypes. The new technology cannot lack something the installed technology takes for granted.&lt;/p&gt; &lt;p&gt;This is why we do not tend to show things like how to insert and query a few triples: No business out there will insert and query triples for the sake of triples. There must be a more compelling story: for example, turning the whole world into a database. This is why our examples start with things like turning the &lt;a href=&quot;http://dbpedia.org/resource/TPC-H&quot; id=&quot;link-id0x2051ff98&quot;&gt;TPC-H&lt;/a&gt; database into RDF, queries and all. Anything less is not interesting. 
Why would an enterprise that has business intelligence and integration issues way more complex than the rather stereotypical TPC-H even look at a technology that pretends to be all for integration and all for expressivity of queries, yet cannot answer the first question of the entry exam?&lt;/p&gt; &lt;p&gt;The world out there is complex. But maybe we ought to make some simple tutorials? So, as a call to the people out there, tell us what a good tutorial would be. The question is more about figuring out what is out there and adapting these and making a sort of compatibility list. Jena and Sesame stuff ought to run as is. We could offer a webinar to all the data web luminaries showing how to promote the data web message with Virtuoso. After all, why not show it on the best platform?&lt;/p&gt; &lt;h3&gt;&amp;quot;You are arrogant. When I read your papers or documentation, the impression I get is that you say you are smart and the reader is stupid.&amp;quot;&lt;/h3&gt; &lt;p&gt;We should answer in multiple parts.&lt;/p&gt; &lt;p&gt;For general collateral, like web sites and documentation:&lt;/p&gt; &lt;p&gt;The web site gives a confused product image. For the Virtuoso product, we should divide at the top into&lt;/p&gt; &lt;ul&gt; &lt;li&gt; Data web and RDF - Host linked data, expose relational assets as linked data;&lt;/li&gt; &lt;li&gt; Relational Database - Full function, high performance, open source, Federated/Virtual Relational DBMS, expose heterogeneous RDB assets through one point of contact for integration;&lt;/li&gt; &lt;li&gt; Web Services - access all the above over standard protocols, dynamic web pages, web hosting.&lt;/li&gt; &lt;/ul&gt; &lt;p&gt;For each point, one simple statement. We all know what the above things mean?&lt;/p&gt; &lt;p&gt;Then we add a new point about scalability that impacts all the above, namely the Virtuoso version 6 Cluster, meaning that you can do all these things at 10 to 1000 times the scale. 
This means this much more data or in some cases this many more requests per second. This too is clear.&lt;/p&gt; &lt;p&gt;As far as I am concerned, hosting Java or .&lt;a href=&quot;http://dbpedia.org/resource/.NET_Framework&quot; id=&quot;link-id0x1f297540&quot;&gt;NET&lt;/a&gt; does not have to be on the front page. Also, we have no great interest in going against &lt;a href=&quot;http://dbpedia.org/resource/Apache&quot; id=&quot;link-id0x1ea29578&quot;&gt;Apache&lt;/a&gt; when it comes to a web-server-only situation. The fact that we have a web listener is important for some things but our claim to fame does not rest on this.&lt;/p&gt; &lt;p&gt;Then for documentation and training materials: The documentation should be better. Specifically it should have more of a how-to dimension since nobody reads the whole thing anyhow. About online tutorials, the order of presentation should be different. They do not really reflect what is important at the present moment either.&lt;/p&gt; &lt;p&gt;Now for conference papers: Since taking the data web as a focus area, we have submitted some papers and had some rejected because these do not have enough references and do not explain what is obvious to ourselves.&lt;/p&gt; &lt;p&gt;I think that the communications failure in this case is that we want to talk about end-to-end solutions and the reviewers expect research. For us, the solution is interesting and exists only if there is an adequate functionality mix for addressing a specific use case. This is why we do not make a paper about the query cost model alone because the cost model, while indispensable, is a thing that is taken for granted where we come from. So we mention RDF adaptations to the cost model, as these are important to the whole, but do not find these to be the justification for a whole paper. If we made papers on this basis, we would have to make five times as many. 
Maybe we ought to.&lt;/p&gt; &lt;h3&gt;&amp;quot;Virtuoso is very big and very difficult&amp;quot;&lt;/h3&gt; &lt;p&gt;One thing that is not obvious from the Virtuoso packaging is that the minimum installation is an executable under 10MB and a config file. Two files.&lt;/p&gt; &lt;p&gt;This gives you SQL and SPARQL out of the box. Adding &lt;a href=&quot;http://dbpedia.org/resource/Open_Database_Connectivity&quot; id=&quot;link-id0x20a2e7d0&quot;&gt;ODBC&lt;/a&gt; and &lt;a href=&quot;http://dbpedia.org/resource/Java_Database_Connectivity&quot; id=&quot;link-id0x1e4cceb8&quot;&gt;JDBC&lt;/a&gt; clients is as simple as it gets. After this, there is basic database functionality. Tuning is a matter of a few parameters that are explained on this blog and elsewhere. Also, the full scale installation is available as an Amazon EC2 image, so no installation required.&lt;/p&gt; &lt;p&gt;Now for the difficult side:&lt;/p&gt; &lt;p&gt;Use SQL and SPARQL; use stored procedures whenever there is server side business logic. For some time critical web pages, use VSP. Do not use VSPX. Otherwise, use whatever you are used to: &lt;a href=&quot;http://dbpedia.org/resource/PHP&quot; id=&quot;link-id0x20b03f08&quot;&gt;PHP&lt;/a&gt; or Java or anything else. For web services, simple is best. Stick to basics. &amp;quot;The engineer is one who can invent a simple thing.&amp;quot; Use SQL statements rather than admin UI.&lt;/p&gt; &lt;p&gt;Know that you can start a server with no database file and you get an initial database with nothing extra. The demo database, the way it is produced by installers, is cluttered.&lt;/p&gt; &lt;p&gt;We should put this into a couple of use case oriented how-tos.&lt;/p&gt; &lt;p&gt;Also, we should create a network of &amp;quot;friendly local virtuoso geeks&amp;quot; for providing basic training and services so we do not have to explain these things all the time. To all you data-web-ers out there: please sign up and we will provide instructions, etc. 
Contact Yrjänä Rankka (ghard[at-sign]openlinksw.com), or go through the mailing lists; do not contact me directly.&lt;/p&gt; &lt;h3&gt;&amp;quot;OK, we understand that you may be good at the large end of the spectrum but how do you reconcile this with the lightweight or embedded end, like the semantic desktop?&amp;quot;&lt;/h3&gt; &lt;p&gt;Now, what is good for one end is usually good for the other. Namely, a database, no matter the scale, needs to have space efficient storage, fast index lookup, and correct query plans. Then there are things that occur only at the high-end, like clustering, but these are separate things. For embedding, the initial memory footprint needs to be small. With Virtuoso, this is accomplished by leaving out some 200 built-in tables and 100,000 lines of SQL procedures that are normally in by default, supporting things such as DAV and diverse other protocols. After all, if SPARQL is all one wants these are not needed.&lt;/p&gt; &lt;p&gt;If one really wants to do one&amp;#39;s server logic (like web listener and thread dispatching) oneself, this is not impossible but requires some advice from us. On the other hand, if one wants to have logic for security close to the data, then using stored procedures is recommended; these execute right next to the data, and support inline SPARQL and SQL. Depending on the license status of the other code, some special licensing arrangements may apply.&lt;/p&gt; &lt;p&gt;We are talking about such things with different parties at present.&lt;/p&gt; &lt;h3&gt;&amp;quot;How webby are you? What is webby?&amp;quot;&lt;/h3&gt; &lt;p&gt;&amp;quot;Webby means distributed, heterogeneous, open; not monolithic consolidation of everything.&amp;quot;&lt;/p&gt; &lt;p&gt;We are philosophically webby. We come from open standards; we are after all called OpenLink; our history consists of connecting things. We believe in choice: the user should be able to pick the best of breed for components and have them work together. 
We cannot and do not wish to force replacement of existing assets. Transforming data on the fly and connecting systems, leaving data where it originally resides, is the first preference. For the data web, the first preference is a federation of independent SPARQL end points. When there is harvesting, we prefer to do it on demand, as with our Sponger. With the immense amount of data out there we believe in finding what is relevant &lt;i&gt;when&lt;/i&gt; it is relevant, preferably close at hand, leveraging things like social networks. With a data web, many things which are now siloized, such as marketplaces and social networks, will return to the open.&lt;/p&gt; &lt;p&gt;Google-style crawling of everything becomes less practical if one needs to run complex &lt;i&gt;ad hoc&lt;/i&gt; queries against the mass of data. For these types of scenarios, if one needs to warehouse, the data cloud will offer solutions where one pays for database on demand. While we believe in loosely coupled federation where possible, we have serious work on the scalability side for the data center and the compute-on-demand cloud.&lt;/p&gt; &lt;h3&gt;&amp;quot;How does OpenLink see the next five years unfolding?&amp;quot;&lt;/h3&gt; &lt;p&gt;Personally, I think we have the basics for the birth of a new inflection in the &lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id0x2018bd98&quot;&gt;knowledge&lt;/a&gt; economy. The &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Identifier&quot; id=&quot;link-id0x1ec110d8&quot;&gt;URI&lt;/a&gt; is the unit of exchange; its value and competitive edge lie in the data it links you with. A name without context is worth little, but as a name gets more use, more &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x1ecfba08&quot;&gt;information&lt;/a&gt; can be found through that name. This is anything from financial statistics, to legal precedents, to news reporting or government data. 
Right now, if the SEC just added one line of markup to the XBRL template, this would instantaneously make all SEC-mandated reporting into linked data via GRDDL.&lt;/p&gt; &lt;p&gt;The URI is a carrier of brand. An information brand gets traffic and references, and this can be monetized in diverse ways. The key word is &lt;i&gt;context&lt;/i&gt;. Information overload is here to stay, and only better context offers the needed increase in productivity to stay ahead of the flood.&lt;/p&gt; &lt;p&gt;Semantic technologies on the whole can help with this. Why these should be semantic web or data web technologies as opposed to just semantic is the linked data value proposition. Even smart islands are still islands. Agility, scale, and scope, depend on the possibility of combining things. Therefore common terminologies and dereferenceability and discoverability are important. Without these, we are at best dealing with closed systems even if they were smart. The expert systems of the 1980s are a case in point.&lt;/p&gt; &lt;p&gt;Ever since the .com era, the &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Locator&quot; id=&quot;link-id0x1c4c9248&quot;&gt;URL&lt;/a&gt; has been a brand. Now it becomes a URI. Thus, entirely hiding the URI from the user experience is not always desirable. The URI is a sort of handle on the provenance and where more can be found; besides, people are already used to these.&lt;/p&gt; &lt;p&gt;With linked data, information value-add products become easy to build and deploy. They can be basically just canned SPARQL queries combining data in a useful and insightful manner. And where there is traffic there can be monetization, whether by advertizing, subscription, or other means. Such possibilities are a natural adjunct to the blogosphere. To publish analysis, one no longer needs to be a think tank or media company. We could call this scenario the birth of a meshup economy.&lt;/p&gt; &lt;p&gt;For OpenLink itself, this is our roadmap. 
The immediate future is about getting our high end offerings like clustered RDF storage generally available, both on the cloud and for private data centers. Ourselves, we will offer the whole &lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id0x20791bf0&quot;&gt;Linked Open Data&lt;/a&gt; cloud as a database. The single feature to come in version 2 of this is fully automatic partitioning and repartitioning for on-demand scale; now, you have to choose how many partitions you have.&lt;/p&gt; &lt;p&gt;This makes some things possible that were hard thus far.&lt;/p&gt; &lt;p&gt;On the mapping front, we go for real-scale data integration scenarios where we can show that SPARQL can unify terms and concepts across databases, yet bring no added cost for complex queries. Enterprises can use their existing warehouses and have an added level of abstraction, the possibility of cross systems interlinking, the advantages of using the same taxonomies and ontologies across systems, and so forth.&lt;/p&gt; &lt;p&gt;Then there will be developments in the direction of smarter web harvesting on demand with the Virtuoso &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/VirtSpongerWhitePaper.html&quot; id=&quot;link-id0x1f27e6d8&quot;&gt;Sponger&lt;/a&gt;, and federation of heterogeneous SPARQL end points. The federation is not so unlike clustering, except the time scales are 2 orders of magnitude longer. The work on SPARQL end point statistics and data set description and discovery is a good development in the community.&lt;/p&gt; &lt;p&gt;Then there will be NLP integration, as exemplified by the Open Calais linked data wrapper and more.&lt;/p&gt; &lt;p&gt;Can we pull this off or is this being spread too thin? We know from experience that all this can be accomplished. Scale is already here; we show it with the billion triples set. Mapping is here; we showed it last in the Berlin Benchmark. 
We will also show some TPC-H results after we get a little quiet after the ISWC event. Then there is ongoing maintenance but with this we have shown a steady turnaround and quick time to fix for pretty much anything.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>The Trouble with Labels (Contd.): Data Integration &amp; SOA</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2008-10-12#1457</atom:id>
  <atom:published>2008-10-12T18:53:44Z</atom:published>
  <atom:updated>2008-10-12T18:54:22-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;I just stumbled across a post from &lt;a href=&quot;http://www.itbusinessedge.com&quot; id=&quot;link-id10f82f50&quot;&gt;IT Business Edge&lt;/a&gt; titled: &lt;a href=&quot;http://www.itbusinessedge.com/item/?ci=48119&quot; id=&quot;link-id10f37b90&quot;&gt;How Semantic Technology Can Help Companies with Integration&lt;/a&gt;. While reading the post I encountered the term: &lt;a href=&quot;http://dbpedia.org/resource/Master_Data_Management&quot; id=&quot;link-id11055eb8&quot;&gt;Master Data Manager (MDM)&lt;/a&gt;, and wondered to myself, &amp;quot;what&amp;#39;s that?&amp;quot; only to realize it&amp;#39;s the very same thing I described as &lt;a href=&quot;http://dbpedia.org/resource/Federated_database_system&quot; id=&quot;link-id13985af0&quot;&gt;Data Virtualization&lt;/a&gt; or &lt;a href=&quot;http://dbpedia.org/resource/Virtual_Database&quot; id=&quot;link-id1167c720&quot;&gt;Virtual Database technology&lt;/a&gt; (circa 1998).&lt;/p&gt; &lt;p&gt;Now, if re-labeling can confuse &lt;a href=&quot;http://myopenlink.net/dataspace/person/kidehen#this&quot; id=&quot;link-id14aaaaf0&quot;&gt;me&lt;/a&gt; when applied to a realm I&amp;#39;ve been intimately involved with for eons (&lt;a href=&quot;http://dbpedia.org/resource/Internet&quot; id=&quot;link-id112042f0&quot;&gt;internet&lt;/a&gt; time), I don&amp;#39;t want to imagine what it does for others who aren&amp;#39;t that intimately involved with the important data access and data integration realms. &lt;/p&gt; &lt;p&gt;On the more refreshing side, the article does shed some light on the potency of RDF and OWL when applied to the construction of conceptual views of heterogeneous data sources.&lt;/p&gt; &lt;blockquote&gt; &lt;cite&gt;&amp;quot;How do you know that data coming from one place calculates net revenue the same way that data coming from another place does? You&amp;#39;ve got people using the same term for different things and different terms for the same things. 
How do you reconcile all of that? That&amp;#39;s really what semantic integration is about.&amp;quot; &lt;/cite&gt; &lt;/blockquote&gt; &lt;p&gt;BTW - I discovered this article via another titled: &lt;a href=&quot;http://www.itbusinessedge.com/blogs/mia/?p=485&quot; id=&quot;link-id11134098&quot;&gt;Understanding Integration And How It Can Help with SOA&lt;/a&gt;, that covers SOA and Integration matters. Again, in this piece I feel the gradual realization of the virtues that RDF, OWL, and RDF &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id11048740&quot;&gt;Linked Data&lt;/a&gt; bring to bear in the vital realm of data integration across heterogeneous data silos.&lt;/p&gt; &lt;h3&gt;Conclusion&lt;/h3&gt; &lt;p&gt;A number of events, at the micro and macro economic levels, are forcing attention back to the issue of productive use of existing IT resources. The trouble with the aforementioned quest is that it ultimately unveils the global IT affliction known as: heterogeneous data silos, and the challenges of pain alleviation, that have been ignored forever or approached inadequately as clearly shown by the rapid build up of SOA horror stories in the data integration realm.&lt;/p&gt; &lt;p&gt;Data Integration via conceptualization of heterogeneous data sources, which results in concrete conceptual layer data access and management, remains the greatest and most potent application of technologies associated with the &amp;quot;&lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id10fa5050&quot;&gt;Semantic Web&lt;/a&gt;&amp;quot; and/or &amp;quot;Linked Data&amp;quot; monikers.&lt;/p&gt; &lt;h3&gt;Related&lt;/h3&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://www.infoworld.com/article/03/05/23/21FEinnovidehen_1.html&quot; id=&quot;link-id118c9c00&quot;&gt;InfoWorld 2003 Innovator article&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://weblog.infoworld.com/udell/2006/04/28.html&quot; 
id=&quot;link-id11057298&quot;&gt;2006 Podcast Interview with Jon Udell&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://dbpedia.org/resource/Enterprise_Information_Integration&quot; id=&quot;link-id13f89030&quot;&gt;Enterprise Information Integration&lt;/a&gt; &lt;/li&gt; &lt;li&gt;One of &lt;a href=&quot;http://www.openlinksw.com/weblog/public/search.vspx?blogid=127&amp;amp;q=data%20integration&amp;amp;type=text&amp;amp;output=html&quot; id=&quot;link-id11048b98&quot;&gt;several posts&lt;/a&gt; about our &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id10fef0e0&quot;&gt;Virtuoso&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Virtuoso_Universal_Server&quot; id=&quot;link-id10e5a068&quot;&gt;Universal Server&lt;/a&gt; and &lt;a href=&quot;http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1406&quot; id=&quot;link-id111d5aa8&quot;&gt;Conceptual Model based data integration&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main/VOSHistory&quot; id=&quot;link-id11020108&quot;&gt;History of Virtuoso&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.mkbergman.com/me/&quot; id=&quot;link-id1101e7b0&quot;&gt;Mike Bergman&lt;/a&gt;&amp;#39;s post titled: &lt;a href=&quot;http://www.mkbergman.com/?p=459&quot; id=&quot;link-id10fdb640&quot;&gt;WOA: A New Enterprise Partner for Linked Data&lt;/a&gt; &lt;/li&gt; &lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Transitivity and Graphs for SQL</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-09-08#1435</atom:id>
  <atom:published>2008-09-08T09:41:24Z</atom:published>
  <atom:updated>2008-09-08T15:43:07-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Transitivity and Graphs for SQL&lt;/div&gt; &lt;h2&gt;Background&lt;/h2&gt; &lt;p&gt;I have mentioned on a couple of prior occasions that basic graph operations ought to be integrated into the &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0xa1a18c58&quot;&gt;SQL&lt;/a&gt; query language.&lt;/p&gt; &lt;p&gt;The history of databases is by and large about moving from specialized applications toward a generic platform. The introduction of the DBMS itself is the archetypal example. It is all about extracting the common features of applications and making these the features of a platform instead.&lt;/p&gt; &lt;p&gt;It is now time to apply this principle to graph traversal.&lt;/p&gt; &lt;p&gt;The rationale is that graph operations are somewhat tedious to write in a parallelize-able, latency-tolerant manner. Writing them as one would for memory-based &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0xaf8c730&quot;&gt;data&lt;/a&gt; structures is easier but totally unscalable as soon as there is any latency involved, i.e., disk reads or messages between cluster peers.&lt;/p&gt; &lt;p&gt;The ad-hoc nature and very large volume of &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0xae41ef0&quot;&gt;RDF&lt;/a&gt; data makes this a timely question. Up until now, the answer to this question has been to materialize any implied facts in RDF stores. 
If &lt;i&gt;a&lt;/i&gt; was part of &lt;i&gt;b&lt;/i&gt;, and &lt;i&gt;b&lt;/i&gt; part of &lt;i&gt;c&lt;/i&gt;, the implied fact that &lt;i&gt;a&lt;/i&gt; is part of &lt;i&gt;c&lt;/i&gt; would be inserted explicitly into the database as a pre-query step.&lt;/p&gt; &lt;p&gt;This is simple and often efficient, but tends to have the downside that one makes a specialized warehouse for each new type of query. The activity becomes less ad-hoc.&lt;/p&gt; &lt;p&gt;Also, this becomes next to impossible when the scale approaches web scale, or if some of the data is liable to be intermittently included in or excluded from the set being analyzed. This is why with &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xb68f9d0&quot;&gt;Virtuoso&lt;/a&gt; we have tended to favor inference on demand (&amp;quot;backward chaining&amp;quot;) and mapping of relational data into RDF without copying.&lt;/p&gt; &lt;p&gt;The SQL world has taken steps towards dealing with recursion with the &lt;code&gt;WITH - UNION&lt;/code&gt; construct which allows definition of recursive views. The idea there is to define, for example, a tree walk as a &lt;code&gt;UNION&lt;/code&gt; of the data of the starting node plus the recursive walk of the starting node&amp;#39;s immediate children.&lt;/p&gt; &lt;p&gt;The main problem with this is that I do not quite see how a SQL optimizer could effectively rearrange queries involving &lt;code&gt;JOIN&lt;/code&gt;s between such recursive views. This model of recursion seems to lose SQL&amp;#39;s non-procedural nature. One can no longer easily rearrange &lt;code&gt;JOIN&lt;/code&gt;s based on what data is given and what is to be retrieved. If the recursion is written from root to leaf, it is not obvious how to do this from leaf to root. 
At any rate, queries written in this way are so complex to write, let alone optimize, that I decided to take another approach.&lt;/p&gt; &lt;p&gt;Take a question like &amp;quot;list the parts of products of category &lt;i&gt;C&lt;/i&gt; which have materials that are classified as toxic.&amp;quot; Suppose that the product categories are a tree, the product parts are a tree, and the materials classification is a tree taxonomy where &amp;quot;toxic&amp;quot; has a multilevel substructure.&lt;/p&gt; &lt;p&gt;Depending on the count of products and materials, the query can be evaluated by going from products to parts to materials and then climbing up the materials tree to see if the material is toxic. Or one could do it in reverse, starting with the different toxic materials, looking up the parts containing these, going up the part tree to the product, and up the product hierarchy to see if the product is in the right category. One should be able to evaluate the identical query either way depending on what indices exist, what the cardinalities of the relations are, and so forth; this is regular cost-based optimization.&lt;/p&gt; &lt;p&gt;Especially with RDF, there are many problems of this type. In regular SQL, it is a long-standing cultural practice to flatten hierarchies, but this is not the case with RDF.&lt;/p&gt; &lt;p&gt;In Virtuoso, we see &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0xb3bdcc0&quot;&gt;SPARQL&lt;/a&gt; as reducing to SQL. Any RDF-oriented database-engine or query-optimization feature is accessed via SQL. Thus, if we address run-time-recursion in the Virtuoso query engine, this becomes, &lt;i&gt;ipso facto&lt;/i&gt;, an SQL feature. Besides, we remember that SQL is a much more mature and expressive language than the current SPARQL recommendation.&lt;/p&gt; &lt;h2&gt; SQL and Transitivity &lt;/h2&gt; &lt;p&gt;We will here look at some simple social network queries. 
A later article will show how to do more general graph operations. We extend the SQL derived table construct, i.e., &lt;code&gt;SELECT&lt;/code&gt; in another &lt;code&gt;SELECT&lt;/code&gt;&amp;#39;s &lt;code&gt;FROM&lt;/code&gt; clause, with a &lt;code&gt;TRANSITIVE&lt;/code&gt; clause.&lt;/p&gt; &lt;p&gt;Consider the data:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;CREATE TABLE &amp;quot;knows&amp;quot; (&amp;quot;p1&amp;quot; INT, &amp;quot;p2&amp;quot; INT, PRIMARY KEY (&amp;quot;p1&amp;quot;, &amp;quot;p2&amp;quot;) ); ALTER INDEX &amp;quot;knows&amp;quot; ON &amp;quot;knows&amp;quot; PARTITION (&amp;quot;p1&amp;quot; INT); CREATE INDEX &amp;quot;knows2&amp;quot; ON &amp;quot;knows&amp;quot; (&amp;quot;p2&amp;quot;, &amp;quot;p1&amp;quot;) PARTITION (&amp;quot;p2&amp;quot; INT); &lt;/code&gt; &lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;We represent a social network with the many-to-many relation &amp;quot;knows&amp;quot;. The persons are identified by integers.&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;INSERT INTO &amp;quot;knows&amp;quot; VALUES (1, 2); INSERT INTO &amp;quot;knows&amp;quot; VALUES (1, 3); INSERT INTO &amp;quot;knows&amp;quot; VALUES (2, 4);&lt;/code&gt; &lt;/pre&gt; &lt;pre&gt;&lt;code&gt;SELECT * FROM (SELECT TRANSITIVE T_IN (1) T_OUT (2) T_DISTINCT &amp;quot;p1&amp;quot;, &amp;quot;p2&amp;quot; FROM &amp;quot;knows&amp;quot; ) &amp;quot;k&amp;quot; WHERE &amp;quot;k&amp;quot;.&amp;quot;p1&amp;quot; = 1;&lt;/code&gt;&lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;We obtain the result:&lt;/p&gt; &lt;blockquote&gt; &lt;table width=&quot;100&quot;&gt; &lt;tr&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p1&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p2&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;3&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;2&lt;/td&gt; 
&lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;/blockquote&gt; &lt;p&gt;The operation is reversible:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;SELECT * FROM (SELECT TRANSITIVE T_IN (1) T_OUT (2) T_DISTINCT &amp;quot;p1&amp;quot;, &amp;quot;p2&amp;quot; FROM &amp;quot;knows&amp;quot; ) &amp;quot;k&amp;quot; WHERE &amp;quot;k&amp;quot;.&amp;quot;p2&amp;quot; = 4; &lt;/code&gt; &lt;/pre&gt; &lt;table width=&quot;100&quot;&gt; &lt;tr&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p1&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p2&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;2&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;/blockquote&gt; &lt;p&gt;Since now we give &lt;i&gt;p2&lt;/i&gt;, we traverse from &lt;i&gt;p2&lt;/i&gt; towards &lt;i&gt;p1&lt;/i&gt;. 
The result set states that 4 is known by 2 and 2 is known by 1.&lt;/p&gt; &lt;p&gt;To see what would happen if &lt;i&gt;x&lt;/i&gt; knowing &lt;i&gt;y&lt;/i&gt; also meant &lt;i&gt;y&lt;/i&gt; knowing &lt;i&gt;x&lt;/i&gt;, one could write:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;SELECT * FROM (SELECT TRANSITIVE T_IN (1) T_OUT (2) T_DISTINCT &amp;quot;p1&amp;quot;, &amp;quot;p2&amp;quot; FROM (SELECT &amp;quot;p1&amp;quot;, &amp;quot;p2&amp;quot; FROM &amp;quot;knows&amp;quot; UNION ALL SELECT &amp;quot;p2&amp;quot;, &amp;quot;p1&amp;quot; FROM &amp;quot;knows&amp;quot; ) &amp;quot;k2&amp;quot; ) &amp;quot;k&amp;quot; WHERE &amp;quot;k&amp;quot;.&amp;quot;p2&amp;quot; = 4;&lt;/code&gt; &lt;/pre&gt; &lt;table width=&quot;100&quot;&gt; &lt;tr&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p1&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p2&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;2&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;3&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;/blockquote&gt; &lt;p&gt;Now, since we know that 1 and 4 are related, we can ask how they are related.&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;SELECT * FROM (SELECT TRANSITIVE T_IN (1) T_OUT (2) T_DISTINCT &amp;quot;p1&amp;quot;, &amp;quot;p2&amp;quot;, T_STEP (1) AS &amp;quot;via&amp;quot;, T_STEP (&amp;#39;step_no&amp;#39;) AS &amp;quot;step&amp;quot;, T_STEP (&amp;#39;path_id&amp;#39;) AS &amp;quot;path&amp;quot; FROM &amp;quot;knows&amp;quot; ) &amp;quot;k&amp;quot; WHERE &amp;quot;p1&amp;quot; = 1 AND &amp;quot;p2&amp;quot; = 4;&lt;/code&gt; &lt;/pre&gt; &lt;table width=&quot;250&quot;&gt; &lt;tr&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p1&lt;/th&gt; &lt;th 
align=&quot;center&quot; width=&quot;50&quot;&gt;p2&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;via&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;step&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;path&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;0&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;0&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;2&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;0&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;2&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;0&lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;/blockquote&gt; &lt;p&gt;The two first columns are the ends of the path. The next column is the person that is a step on the path. The next one is the number of the step, counting from 0, so that the end of the path that corresponds to the end condition on the column designated as input, i.e., &lt;i&gt;p1&lt;/i&gt;, has number 0. 
Since there can be multiple solutions, the last column is a sequence number that distinguishes alternative paths from each other.&lt;/p&gt; &lt;p&gt;For LinkedIn users, the query for friends ordered by distance and descending friend count, which underlies most LinkedIn search result views, can be written as: &lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;SELECT &amp;quot;p2&amp;quot;, &amp;quot;dist&amp;quot;, (SELECT COUNT (*) FROM &amp;quot;knows&amp;quot; &amp;quot;c&amp;quot; WHERE &amp;quot;c&amp;quot;.&amp;quot;p1&amp;quot; = &amp;quot;k&amp;quot;.&amp;quot;p2&amp;quot; ) FROM (SELECT TRANSITIVE T_IN (1) T_OUT (2) T_DISTINCT &amp;quot;p1&amp;quot;, &amp;quot;p2&amp;quot;, T_STEP (&amp;#39;step_no&amp;#39;) AS &amp;quot;dist&amp;quot; FROM &amp;quot;knows&amp;quot; ) &amp;quot;k&amp;quot; WHERE &amp;quot;p1&amp;quot; = 1 ORDER BY &amp;quot;dist&amp;quot;, 3 DESC;&lt;/code&gt; &lt;/pre&gt; &lt;table width=&quot;150&quot;&gt; &lt;tr&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p2&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;dist&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;aggregate&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;2&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;3&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;0&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;2&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;0&lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;/blockquote&gt; &lt;h2&gt;How?&lt;/h2&gt; &lt;p&gt;The queries shown above work on Virtuoso v6. When running in cluster mode, several thousand graph traversal steps may be proceeding at the same time, meaning that all database access is parallelized and that the algorithm is internally latency-tolerant. 
By default, all results are produced in a deterministic order, permitting predictable slicing of result sets.&lt;/p&gt; &lt;p&gt;Furthermore, for queries where both ends of a path are given, the optimizer may decide to attack the path from both ends simultaneously. So, supposing that every member of a social network has an average of 30 contacts, and we need to find a path between two users that are no more than 6 steps apart, we begin at both ends, expanding each up to 3 levels, and we stop when we find the first intersection. Thus, we reach 2 * 30^3 = 54,000 nodes, and not 30^6 = 729,000,000 nodes.&lt;/p&gt; &lt;p&gt;Writing a generic database driven graph traversal framework on the application side, say in Java over &lt;a href=&quot;http://dbpedia.org/resource/Java_Database_Connectivity&quot; id=&quot;link-id0xa8a9ef8&quot;&gt;JDBC&lt;/a&gt;, would easily be over a thousand lines. This is much more work than can be justified just for a one-off, ad-hoc query. Besides, the traversal order in such a case could not be optimized by the DBMS.&lt;/p&gt; &lt;h2&gt;Next&lt;/h2&gt; &lt;p&gt;In a future &lt;a href=&quot;http://dbpedia.org/resource/Blog&quot; id=&quot;link-id0xb526a40&quot;&gt;blog&lt;/a&gt; post I will show how this feature can be used for common graph tasks like critical path, itinerary planning, traveling salesman, the 8 queens chess problem, etc. There are lots of switches for controlling different parameters of the traversal. This is just the beginning. I will also give examples of the use of this in SPARQL.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Transitivity and Graphs for SQL</atom:title>
  <atom:id>http://www.openlinksw.com/weblog/oerling/?date=2008-09-08#1433</atom:id>
  <atom:published>2008-09-08T09:20:11Z</atom:published>
  <atom:updated>2008-09-08T15:43:04.000006-04:00</atom:updated>
  <atom:content type="html">&lt;h2&gt;Background&lt;/h2&gt; &lt;p&gt;I have mentioned on a couple of prior occasions that basic graph operations ought to be integrated into the &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0xb1fe830&quot;&gt;SQL&lt;/a&gt; query language.&lt;/p&gt; &lt;p&gt;The history of databases is by and large about moving from specialized applications toward a generic platform. The introduction of the DBMS itself is the archetypal example. It is all about extracting the common features of applications and making these the features of a platform instead.&lt;/p&gt; &lt;p&gt;It is now time to apply this principle to graph traversal.&lt;/p&gt; &lt;p&gt;The rationale is that graph operations are somewhat tedious to write in a parallelize-able, latency-tolerant manner. Writing them as one would for memory-based &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1cb37218&quot;&gt;data&lt;/a&gt; structures is easier but totally unscalable as soon as there is any latency involved, i.e., disk reads or messages between cluster peers.&lt;/p&gt; &lt;p&gt;The ad-hoc nature and very large volume of &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1e1850a0&quot;&gt;RDF&lt;/a&gt; data makes this a timely question. Up until now, the answer to this question has been to materialize any implied facts in RDF stores. If &lt;i&gt;a&lt;/i&gt; was part of &lt;i&gt;b&lt;/i&gt;, and &lt;i&gt;b&lt;/i&gt; part of &lt;i&gt;&lt;a href=&quot;http://dbpedia.org/resource/C_(programming_language)&quot; id=&quot;link-id0xa1a08d38&quot;&gt;c&lt;/a&gt;&lt;/i&gt;, the implied fact that &lt;i&gt;a&lt;/i&gt; is part of &lt;i&gt;c&lt;/i&gt; would be inserted explicitly into the database as a pre-query step.&lt;/p&gt; &lt;p&gt;This is simple and often efficient, but tends to have the downside that one makes a specialized warehouse for each new type of query. 
The activity becomes less ad-hoc.&lt;/p&gt; &lt;p&gt;Also, this becomes next to impossible when the scale approaches web scale, or if some of the data is liable to be on-and-off included-into or excluded-from the set being analyzed. This is why with &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xa51bd10&quot;&gt;Virtuoso&lt;/a&gt; we have tended to favor inference on demand (&amp;quot;backward chaining&amp;quot;) and mapping of relational data into RDF without copying.&lt;/p&gt; &lt;p&gt;The SQL world has taken steps towards dealing with recursion with the &lt;code&gt;WITH - UNION&lt;/code&gt; construct which allows definition of recursive views. The idea there is to define, for example, a tree walk as a &lt;code&gt;UNION&lt;/code&gt; of the data of the starting node plus the recursive walk of the starting node&amp;#39;s immediate children.&lt;/p&gt; &lt;p&gt;The main problem with this is that I do not very well see how a SQL optimizer could effectively rearrange queries involving &lt;code&gt;JOIN&lt;/code&gt;s between such recursive views. This model of recursion seems to lose SQL&amp;#39;s non-procedural nature. One can no longer easily rearrange &lt;code&gt;JOIN&lt;/code&gt;s based on what data is given and what is to be retrieved. If the recursion is written from root to leaf, it is not obvious how to do this from leaf to root. 
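&lt;/p&gt; &lt;p&gt;As a rough illustration of the recursive-view style, here is a sketch of a root-to-leaf walk in standard SQL:1999 &lt;code&gt;WITH RECURSIVE&lt;/code&gt; syntax, as accepted by, e.g., PostgreSQL or SQLite; it is not a Virtuoso feature discussed in this post. The table &lt;code&gt;part_of&lt;/code&gt;, its columns, and the root id 1 are hypothetical names chosen only for this example.&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;-- Hypothetical table: each row says that "part" is directly contained in "parent".
CREATE TABLE part_of (part INT, parent INT);
INSERT INTO part_of VALUES (2, 1);
INSERT INTO part_of VALUES (3, 1);
INSERT INTO part_of VALUES (4, 2);

-- Root-to-leaf walk: start from node 1, then repeatedly add the direct children
-- of every node found so far.
WITH RECURSIVE subparts (part) AS (
  SELECT 1
  UNION ALL
  SELECT p.part FROM part_of p JOIN subparts s ON p.parent = s.part
)
SELECT part FROM subparts;&lt;/code&gt; &lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;On this data the query returns the parts 1, 2, 3, and 4, in no guaranteed order. The direction of the walk is fixed by the join condition in the recursive branch, so asking the leaf-to-root question typically means writing a second view with that join reversed.&lt;/p&gt; &lt;p&gt;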
At any rate, queries written in this way are so complex to write, let alone optimize, that I decided to take another approach.&lt;/p&gt; &lt;p&gt;Take a question like &amp;quot;list the parts of products of category &lt;i&gt;C&lt;/i&gt; which have materials that are classified as toxic.&amp;quot; Suppose that the product categories are a tree, the product parts are a tree, and the materials classification is a tree taxonomy where &amp;quot;toxic&amp;quot; has a multilevel substructure.&lt;/p&gt; &lt;p&gt;Depending on the count of products and materials, the query can be evaluated by going from products to parts to materials and then climbing up the materials tree to see if the material is toxic. Or one could do it in reverse, starting with the different toxic materials, looking up the parts containing these, going up the part tree to the product, and up the product hierarchy to see if the product is in the right category. One should be able to evaluate the identical query either way depending on what indices exist, what the cardinalities of the relations are, and so forth: regular cost-based optimization.&lt;/p&gt; &lt;p&gt;Especially with RDF, there are many problems of this type. In regular SQL, it is a long-standing cultural practice to flatten hierarchies, but this is not the case with RDF.&lt;/p&gt; &lt;p&gt;In Virtuoso, we see &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0xb4b3ce8&quot;&gt;SPARQL&lt;/a&gt; as reducing to SQL. Any RDF-oriented database-engine or query-optimization feature is accessed via SQL. Thus, if we address run-time recursion in the Virtuoso query engine, this becomes, &lt;i&gt;ipso facto&lt;/i&gt;, an SQL feature. Besides, we remember that SQL is a much more mature and expressive language than the current SPARQL recommendation.&lt;/p&gt; &lt;h2&gt; SQL and Transitivity &lt;/h2&gt; &lt;p&gt;We will here look at some simple social network queries. 
A later article will show how to do more general graph operations. We extend the SQL derived table construct, i.e., &lt;code&gt;SELECT&lt;/code&gt; in another &lt;code&gt;SELECT&lt;/code&gt;&amp;#39;s &lt;code&gt;FROM&lt;/code&gt; clause, with a &lt;code&gt;TRANSITIVE&lt;/code&gt; clause.&lt;/p&gt; &lt;p&gt;Consider the data:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;CREATE TABLE &amp;quot;knows&amp;quot; (&amp;quot;p1&amp;quot; INT, &amp;quot;p2&amp;quot; INT, PRIMARY KEY (&amp;quot;p1&amp;quot;, &amp;quot;p2&amp;quot;) ); ALTER INDEX &amp;quot;knows&amp;quot; ON &amp;quot;knows&amp;quot; PARTITION (&amp;quot;p1&amp;quot; INT); CREATE INDEX &amp;quot;knows2&amp;quot; ON &amp;quot;knows&amp;quot; (&amp;quot;p2&amp;quot;, &amp;quot;p1&amp;quot;) PARTITION (&amp;quot;p2&amp;quot; INT); &lt;/code&gt; &lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;We represent a social network with the many-to-many relation &amp;quot;knows&amp;quot;. The persons are identified by integers.&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;INSERT INTO &amp;quot;knows&amp;quot; VALUES (1, 2); INSERT INTO &amp;quot;knows&amp;quot; VALUES (1, 3); INSERT INTO &amp;quot;knows&amp;quot; VALUES (2, 4);&lt;/code&gt; &lt;/pre&gt; &lt;pre&gt;&lt;code&gt;SELECT * FROM (SELECT TRANSITIVE T_IN (1) T_OUT (2) T_DISTINCT &amp;quot;p1&amp;quot;, &amp;quot;p2&amp;quot; FROM &amp;quot;knows&amp;quot; ) &amp;quot;k&amp;quot; WHERE &amp;quot;k&amp;quot;.&amp;quot;p1&amp;quot; = 1;&lt;/code&gt;&lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;We obtain the result:&lt;/p&gt; &lt;blockquote&gt; &lt;table width=&quot;100&quot;&gt; &lt;tr&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p1&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p2&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;3&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;2&lt;/td&gt; 
&lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;/blockquote&gt; &lt;p&gt;The operation is reversible:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;SELECT * FROM (SELECT TRANSITIVE T_IN (1) T_OUT (2) T_DISTINCT &amp;quot;p1&amp;quot;, &amp;quot;p2&amp;quot; FROM &amp;quot;knows&amp;quot; ) &amp;quot;k&amp;quot; WHERE &amp;quot;k&amp;quot;.&amp;quot;p2&amp;quot; = 4; &lt;/code&gt; &lt;/pre&gt; &lt;table width=&quot;100&quot;&gt; &lt;tr&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p1&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p2&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;2&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;/blockquote&gt; &lt;p&gt;Since now we give &lt;i&gt;p2&lt;/i&gt;, we traverse from &lt;i&gt;p2&lt;/i&gt; towards &lt;i&gt;p1&lt;/i&gt;. 
The result set states that 4 is known by 2 and 2 is known by 1.&lt;/p&gt; &lt;p&gt;To see what would happen if &lt;i&gt;x&lt;/i&gt; knowing &lt;i&gt;y&lt;/i&gt; also meant &lt;i&gt;y&lt;/i&gt; knowing &lt;i&gt;x&lt;/i&gt;, one could write:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;SELECT * FROM (SELECT TRANSITIVE T_IN (1) T_OUT (2) T_DISTINCT &amp;quot;p1&amp;quot;, &amp;quot;p2&amp;quot; FROM (SELECT &amp;quot;p1&amp;quot;, &amp;quot;p2&amp;quot; FROM &amp;quot;knows&amp;quot; UNION ALL SELECT &amp;quot;p2&amp;quot;, &amp;quot;p1&amp;quot; FROM &amp;quot;knows&amp;quot; ) &amp;quot;k2&amp;quot; ) &amp;quot;k&amp;quot; WHERE &amp;quot;k&amp;quot;.&amp;quot;p2&amp;quot; = 4;&lt;/code&gt; &lt;/pre&gt; &lt;table width=&quot;100&quot;&gt; &lt;tr&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p1&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p2&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;2&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;3&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;/blockquote&gt; &lt;p&gt;Now, since we know that 1 and 4 are related, we can ask how they are related.&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;SELECT * FROM (SELECT TRANSITIVE T_IN (1) T_OUT (2) T_DISTINCT &amp;quot;p1&amp;quot;, &amp;quot;p2&amp;quot;, T_STEP (1) AS &amp;quot;via&amp;quot;, T_STEP (&amp;#39;step_no&amp;#39;) AS &amp;quot;step&amp;quot;, T_STEP (&amp;#39;path_id&amp;#39;) AS &amp;quot;path&amp;quot; FROM &amp;quot;knows&amp;quot; ) &amp;quot;k&amp;quot; WHERE &amp;quot;p1&amp;quot; = 1 AND &amp;quot;p2&amp;quot; = 4;&lt;/code&gt; &lt;/pre&gt; &lt;table width=&quot;250&quot;&gt; &lt;tr&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p1&lt;/th&gt; &lt;th 
align=&quot;center&quot; width=&quot;50&quot;&gt;p2&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;via&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;step&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;path&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;0&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;0&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;2&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;0&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;2&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;0&lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;/blockquote&gt; &lt;p&gt;The two first columns are the ends of the path. The next column is the person that is a step on the path. The next one is the number of the step, counting from 0, so that the end of the path that corresponds to the end condition on the column designated as input, i.e., &lt;i&gt;p1&lt;/i&gt;, has number 0. 
Since there can be multiple solutions, the last column is a sequence number that distinguishes alternative paths from each other.&lt;/p&gt; &lt;p&gt;For LinkedIn users, the query for friends ordered by distance and descending friend count, which underlies most LinkedIn search result views, can be written as: &lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;SELECT &amp;quot;p2&amp;quot;, &amp;quot;dist&amp;quot;, (SELECT COUNT (*) FROM &amp;quot;knows&amp;quot; &amp;quot;c&amp;quot; WHERE &amp;quot;c&amp;quot;.&amp;quot;p1&amp;quot; = &amp;quot;k&amp;quot;.&amp;quot;p2&amp;quot; ) FROM (SELECT TRANSITIVE T_IN (1) T_OUT (2) T_DISTINCT &amp;quot;p1&amp;quot;, &amp;quot;p2&amp;quot;, T_STEP (&amp;#39;step_no&amp;#39;) AS &amp;quot;dist&amp;quot; FROM &amp;quot;knows&amp;quot; ) &amp;quot;k&amp;quot; WHERE &amp;quot;p1&amp;quot; = 1 ORDER BY &amp;quot;dist&amp;quot;, 3 DESC;&lt;/code&gt; &lt;/pre&gt; &lt;table width=&quot;150&quot;&gt; &lt;tr&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p2&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;dist&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;aggregate&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;2&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;3&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;0&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;2&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;0&lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;/blockquote&gt; &lt;h2&gt;How?&lt;/h2&gt; &lt;p&gt;The queries shown above work on Virtuoso v6. When running in cluster mode, several thousand graph traversal steps may be proceeding at the same time, meaning that all database access is parallelized and that the algorithm is internally latency-tolerant. 
By default, all results are produced in a deterministic order, permitting predictable slicing of result sets.&lt;/p&gt; &lt;p&gt;Furthermore, for queries where both ends of a path are given, the optimizer may decide to attack the path from both ends simultaneously. So, supposing that every member of a social network has an average of 30 contacts, and we need to find a path between two users that are no more than 6 steps apart, we begin at both ends, expanding each up to 3 levels, and we stop when we find the first intersection. Thus, we reach 2 * 30^3 = 54,000 nodes, and not 30^6 = 729,000,000 nodes.&lt;/p&gt; &lt;p&gt;Writing a generic database driven graph traversal framework on the application side, say in Java over &lt;a href=&quot;http://dbpedia.org/resource/Java_Database_Connectivity&quot; id=&quot;link-id0xb595050&quot;&gt;JDBC&lt;/a&gt;, would easily be over a thousand lines. This is much more work than can be justified just for a one-off, ad-hoc query. Besides, the traversal order in such a case could not be optimized by the DBMS.&lt;/p&gt; &lt;h2&gt;Next&lt;/h2&gt; &lt;p&gt;In a future &lt;a href=&quot;http://dbpedia.org/resource/Blog&quot; id=&quot;link-id0x1e4d4f18&quot;&gt;blog&lt;/a&gt; post I will show how this feature can be used for common graph tasks like critical path, itinerary planning, traveling salesman, the 8 queens chess problem, etc. There are lots of switches for controlling different parameters of the traversal. This is just the beginning. I will also give examples of the use of this in SPARQL.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Crunchbase &amp; Semantic Web Interview (Remix - Update 1)</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2008-08-27#1424</atom:id>
  <atom:published>2008-08-27T18:16:37Z</atom:published>
  <atom:updated>2008-08-27T20:35:15-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;After reading &lt;a href=&quot;http://blog.crunchbase.com/2008/08/26/building-a-semantic-web-interview-with-benjamin-nowack/&quot; id=&quot;link-id16b8e0e0&quot;&gt;Bengee&amp;#39;s interview with CrunchBase&lt;/a&gt;, I decided to knock up a quick interview remix as part of my usual attempt to add to the developing discourse.&lt;/p&gt; &lt;blockquote&gt; &lt;cite&gt;&lt;a href=&quot;http://www.crunchbase.com/&quot; id=&quot;link-id17c8e7b8&quot;&gt;CrunchBase&lt;/a&gt;: When we released the &lt;a href=&quot;http://www.crunchbase.com/help/api&quot; id=&quot;link-id16681f68&quot;&gt;CrunchBase API&lt;/a&gt;, you were one of the first developers to step up and quickly released a &lt;a href=&quot;http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com&#39;s%20BLOG%20%5B127%5D/1395&quot; id=&quot;link-id1016d5f0&quot;&gt;CrunchBase Sponger Cartridge&lt;/a&gt;. Can you explain what a CrunchBase Sponger Cartridge is?&lt;/cite&gt; &lt;/blockquote&gt; &lt;blockquote&gt; &lt;a href=&quot;http://myopenlink.net/dataspace/person/kidehen#this&quot; id=&quot;link-id13243300&quot;&gt;Me&lt;/a&gt;: A Sponger Cartridge is a &lt;a href=&quot;http://dbpedia.org/resource/Data&quot;&gt;data&lt;/a&gt; access driver for &lt;a href=&quot;http://dbpedia.org/resource/World_Wide_Web&quot;&gt;Web&lt;/a&gt; Resources that plugs into our &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id17042f08&quot;&gt;Virtuoso&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Virtuoso_Universal_Server&quot; id=&quot;link-id1399b588&quot;&gt;Universal Server&lt;/a&gt; (DBMS and &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id137fd188&quot;&gt;Linked Data&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Giant_Global_Graph&quot; id=&quot;link-id100b23d8&quot;&gt;Web&lt;/a&gt; Server combo amongst other things). 
It uses the internal structure of a resource and/or a web service associated with a resource, to materialize an RDF based &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id10418750&quot;&gt;Linked Data graph&lt;/a&gt; that essentially describes the resource via its properties (Attributes &amp;amp; Relationships). &lt;/blockquote&gt; &lt;br /&gt; &lt;img src=&quot;http://virtuoso.openlinksw.com/presentations/Creating_Deploying_Exploiting_Linked_Data2/images/ldp4.png&quot; /&gt; &lt;br /&gt; &lt;br /&gt; &lt;br /&gt; &lt;blockquote&gt; &lt;cite&gt;CrunchBase: And what inspired you to create it?&lt;/cite&gt; &lt;/blockquote&gt; &lt;blockquote&gt; &lt;a href=&quot;http://myopenlink.net/dataspace/person/kidehen#this&quot; id=&quot;link-id12fa60c0&quot;&gt;Me&lt;/a&gt;: Bengee built a new space with your data, and we&amp;#39;ve built a space on the fly from your data which still resides in your domain. Either solution extols the virtues of &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id101a8d28&quot;&gt;Linked Data&lt;/a&gt; i.e. the ability to explore relationships across data items with high degrees of serendipity (also colloquially known as: following-your-nose pattern in &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id14a3ff30&quot;&gt;Semantic Web&lt;/a&gt; circles).&lt;/blockquote&gt; &lt;blockquote&gt; &lt;a href=&quot;http://cb.semsol.org/&quot; id=&quot;link-id182a0170&quot;&gt;Bengee&lt;/a&gt; posted a notice to the &lt;a href=&quot;http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData&quot; id=&quot;link-id131e8d10&quot;&gt;Linking Open Data Community&lt;/a&gt;&amp;#39;s public &lt;a href=&quot;http://lists.w3.org/Archives/Public/public-lod/2008Jul/0110.html&quot; id=&quot;link-id11dd0720&quot;&gt;mailing list announcing his effort&lt;/a&gt;. 
Bearing in mind the fact that we&amp;#39;ve been using &lt;a href=&quot;http://www.openlinksw.com/blog/~kidehen/?id=1144&quot; id=&quot;link-id117cf6e8&quot;&gt;middleware to mesh the realms of Web 2.0 and the Linked Data Web&lt;/a&gt; for a while, it was a no-brainer to knock something up based on the conceptual similarities between &lt;a href=&quot;http://wikicompany.org/wiki/Main_Page&quot; id=&quot;link-id13b87a68&quot;&gt;Wikicompany&lt;/a&gt; and CrunchBase. In a sense, a quadrant of orthogonality is what immediately came to mind re. Wikicompany, CrunchBase, Bengee&amp;#39;s RDFization efforts, and ours.&lt;/blockquote&gt; &lt;blockquote&gt;Bengee created an RDF based &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id133c8fc8&quot;&gt;Linked Data&lt;/a&gt; warehouse based on the data exposed by your API, which is exposed via the &lt;a href=&quot;http://cb.semsol.org/&quot; id=&quot;link-id1826f928&quot;&gt;Semantic CrunchBase&lt;/a&gt; &lt;a href=&quot;http://en.wikipedia.org/wiki/Data_Spaces&quot; id=&quot;link-id102d8890&quot;&gt;data space&lt;/a&gt;. In our case we&amp;#39;ve taken the &amp;quot;RDFization on the fly&amp;quot; approach which produces a transient &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id16a0b8d0&quot;&gt;Linked Data&lt;/a&gt; View of the CrunchBase data exposed by your APIs. 
Our approach is in line with our world view: all resources on the Web are data sources, and the &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id1668e6c8&quot;&gt;Linked Data&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Giant_Global_Graph&quot; id=&quot;link-id188e7da0&quot;&gt;Web&lt;/a&gt; is about incorporating HTTP into the naming scheme of these data sources so that the conventional &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Locator&quot; id=&quot;link-id13490710&quot;&gt;URL&lt;/a&gt; based hyperlinking mechanism can be used to access a structured description of a resource, which is then transmitted using a range of negotiable representation formats. In addition, based on the fact that we house and publish a lot of &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id169aa568&quot;&gt;Linked Data&lt;/a&gt; on the Web (e.g. &lt;a href=&quot;http://dbpedia.org/resource/DBpedia&quot; id=&quot;link-id10af10e8&quot;&gt;DBpedia&lt;/a&gt;, &lt;a href=&quot;http://www.pingthesemanticweb.com/about/&quot; id=&quot;link-id10a2b710&quot;&gt;PingTheSemanticWeb&lt;/a&gt;, and others), we&amp;#39;ve also automatically meshed CrunchBase data with related data in &lt;a href=&quot;http://dbpedia.org/resource/DBpedia&quot; id=&quot;link-id1403cd40&quot;&gt;DBpedia&lt;/a&gt; and Wikicompany.&lt;/blockquote&gt; &lt;br /&gt; &lt;blockquote&gt; &lt;cite&gt;CrunchBase: Do you know of any apps that are using CrunchBase Cartridge to enhance their functionality?&lt;/cite&gt; &lt;/blockquote&gt; &lt;blockquote&gt; &lt;a href=&quot;http://myopenlink.net/dataspace/person/kidehen#this&quot; id=&quot;link-id177d24c8&quot;&gt;Me&lt;/a&gt;: Yes, the &lt;a href=&quot;http://ode.openlinksw.com&quot; id=&quot;link-id10725ca0&quot;&gt;OpenLink Data Explorer&lt;/a&gt; which provides CrunchBase site visitors with the option to explore the &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; 
id=&quot;link-id17dedea8&quot;&gt;Linked Data&lt;/a&gt; in the CrunchBase &lt;a href=&quot;http://en.wikipedia.org/wiki/Data_Spaces&quot; id=&quot;link-id13f02a00&quot;&gt;data space&lt;/a&gt;. It also allows them to &amp;quot;Mesh&amp;quot; (rather than &amp;quot;Mash&amp;quot;) CrunchBase data with other &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id11fb3ba0&quot;&gt;Linked Data&lt;/a&gt; sources on the Web without writing a single line of code. &lt;/blockquote&gt; &lt;br /&gt; &lt;blockquote&gt; &lt;cite&gt;CrunchBase: You have been immersed in the &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id12e18a00&quot;&gt;Semantic Web&lt;/a&gt; movement for a while now. How did you first get interested in the &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id15132110&quot;&gt;Semantic Web&lt;/a&gt;?&lt;/cite&gt; &lt;/blockquote&gt; &lt;blockquote&gt; &lt;a href=&quot;http://myopenlink.net/dataspace/person/kidehen#this&quot; id=&quot;link-id0xddaa9c8&quot;&gt;Me&lt;/a&gt;: We saw the &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id188b3330&quot;&gt;Semantic Web&lt;/a&gt; as a vehicle for standardizing conceptual views of heterogeneous data sources via &lt;a href=&quot;http://dbpedia.org/resource/Context_%28language_use%29&quot; id=&quot;link-id10350978&quot;&gt;context&lt;/a&gt; lenses (URIs). 
In 1998, as part of our strategy to expand our business beyond the development and deployment of &lt;a href=&quot;http://dbpedia.org/resource/Open_Database_Connectivity&quot; id=&quot;link-id171d6798&quot;&gt;ODBC&lt;/a&gt;, &lt;a href=&quot;http://dbpedia.org/resource/Java_Database_Connectivity&quot; id=&quot;link-id138120a0&quot;&gt;JDBC&lt;/a&gt;, and OLE-DB data providers, we decided to build a &lt;a href=&quot;http://dbpedia.org/resource/Virtual_Database&quot; id=&quot;link-id13ea6618&quot;&gt;Virtual Database&lt;/a&gt; Engine (see: &lt;a href=&quot;http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VOSHistory&quot; id=&quot;link-id11a4fa30&quot;&gt;Virtuoso History&lt;/a&gt;), and in doing so we sought a standards-based mechanism for the conceptual output of the &lt;a href=&quot;http://dbpedia.org/resource/Federated_database_system&quot; id=&quot;link-id101a1248&quot;&gt;data virtualization&lt;/a&gt; effort. As of the time of the &lt;a href=&quot;http://www.w3.org/DesignIssues/Semantic.html&quot; id=&quot;link-id18882cf8&quot;&gt;seminal unveiling of the Semantic Web in 1998&lt;/a&gt;, we were clear about two things in relation to the effects of the Web and &lt;a href=&quot;http://dbpedia.org/resource/Internet&quot; id=&quot;link-id12fa2c58&quot;&gt;Internet&lt;/a&gt; data management infrastructure inflections: 1) Existing DBMS technology had reached its limits; 2) Web Servers would ultimately hit their functional limits. 
These fundamental realities compelled us to develop &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id102b09a0&quot;&gt;Virtuoso&lt;/a&gt; with an eye to leveraging the &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id11984d98&quot;&gt;Semantic Web&lt;/a&gt; as a vehicle for completing its technical roadmap.&lt;/blockquote&gt; &lt;br /&gt; &lt;blockquote&gt; &lt;cite&gt;CrunchBase: Can you put into layman&amp;#39;s terms exactly what RDF and &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id1066dcf0&quot;&gt;SPARQL&lt;/a&gt; are and why they are important? Do they only matter for developers or will they extend past developers at some point and be used by website visitors as well?&lt;/cite&gt; &lt;/blockquote&gt; &lt;blockquote&gt;Me: RDF (Resource Description Framework) is a Graph based Data Model that facilitates resource description using the &lt;a href=&quot;http://www.eslincanada.com/englishlesson2.html&quot; id=&quot;link-id178b94a8&quot;&gt;Subject, Predicate, and Object principle&lt;/a&gt;. 
Associated with the core data model, as part of the overall framework, are a number of markup languages for expressing your descriptions (just as you express presentation markup semantics in HTML or document structure semantics in XML) that include: &lt;a href=&quot;http://dbpedia.org/resource/RDFa&quot; id=&quot;link-id188db0a8&quot;&gt;RDFa&lt;/a&gt; (simple extension of HTML markup for embedding descriptions of things in a page), N3 (a human friendly markup for describing resources), RDF/XML (a machine friendly markup for describing resources).&lt;/blockquote&gt; &lt;blockquote&gt; &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id188c2030&quot;&gt;SPARQL&lt;/a&gt; is the query language associated with the RDF Data Model, just as &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id13f0ffe0&quot;&gt;SQL&lt;/a&gt; is a query language associated with the Relational Database Model. Thus, when you have RDF based structured and &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id166874d0&quot;&gt;linked data&lt;/a&gt; on the Web, you can query against the Web using &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id1016cc98&quot;&gt;SPARQL&lt;/a&gt; just as you would against an &lt;a href=&quot;http://dbpedia.org/resource/Oracle_Database&quot; id=&quot;link-id101c9708&quot;&gt;Oracle&lt;/a&gt;/&lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id11cb0b18&quot;&gt;SQL&lt;/a&gt; Server/&lt;a href=&quot;http://dbpedia.org/resource/IBM_DB2&quot; id=&quot;link-id10760ec0&quot;&gt;DB2&lt;/a&gt;/&lt;a href=&quot;http://dbpedia.org/resource/IBM_Informix&quot; id=&quot;link-id1066c8c0&quot;&gt;Informix&lt;/a&gt;/&lt;a href=&quot;http://dbpedia.org/resource/Ingres&quot; id=&quot;link-id18894f40&quot;&gt;Ingres&lt;/a&gt;/&lt;a href=&quot;http://dbpedia.org/resource/MySQL&quot; id=&quot;link-iddc9ebb0&quot;&gt;MySQL&lt;/a&gt;/etc. 
DBMS using &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id1030d120&quot;&gt;SQL&lt;/a&gt;. That&amp;#39;s it in a nutshell.&lt;/blockquote&gt; &lt;br /&gt; &lt;blockquote&gt; &lt;cite&gt;CrunchBase: On your website you wrote that &amp;quot;RDF and &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id168e9ad0&quot;&gt;SPARQL&lt;/a&gt; as productivity boosters in everyday web development&amp;quot;. Can you elaborate on why you believe that to be true?&lt;/cite&gt; &lt;/blockquote&gt; &lt;blockquote&gt;Me: I think the ability to discern a formal description of anything via its discrete properties is of immense value re. productivity, especially when the capability in question results in a graph of &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x179f6328&quot;&gt;Linked Data&lt;/a&gt; that isn&amp;#39;t confined to a specific host operating system, database engine, application or service, programming language, or development framework. RDF &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot;&gt;Linked Data&lt;/a&gt; is about infrastructure for the true materialization of the &amp;quot;&lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id13e475b8&quot;&gt;Information&lt;/a&gt; at Your Fingertips&amp;quot; vision of yore. Even though it&amp;#39;s taken the emergence of RDF Linked Data to make the aforementioned vision tractable, the comprehension of the vision&amp;#39;s intrinsic value has been clear for a very long time. 
Most organizations and/or individuals are quite familiar with the adage: &lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id13e38a30&quot;&gt;Knowledge&lt;/a&gt; is Power. Well, there isn&amp;#39;t any &lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id188b7348&quot;&gt;knowledge&lt;/a&gt; without accessible &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id140415d0&quot;&gt;Information&lt;/a&gt;, and there isn&amp;#39;t any accessible &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id11a976e8&quot;&gt;Information&lt;/a&gt; without accessible Data. The Web has always been grounded in accessibility to data (albeit via compound container documents called Web Pages).&lt;/blockquote&gt; &lt;blockquote&gt;Bottom line, RDF based Linked Data is about Open &lt;a href=&quot;http://dbpedia.org/resource/Reference_(computer_science)&quot; id=&quot;link-id1206bfb8&quot;&gt;Data access by reference&lt;/a&gt; using URIs (HTTP based &lt;a href=&quot;http://dbpedia.org/resource/Entity&quot; id=&quot;link-idfaa6ce0&quot;&gt;Entity&lt;/a&gt; IDs / Data Object IDs / Data Source Names), and as I said earlier, the intrinsic value is pretty obvious bearing in mind the costs associated with integrating disparate and heterogeneous data sources -- across intranets, extranets, and the &lt;a href=&quot;http://dbpedia.org/resource/Internet&quot; id=&quot;link-id188ecc68&quot;&gt;Internet&lt;/a&gt;.&lt;/blockquote&gt; &lt;br /&gt; &lt;blockquote&gt; &lt;cite&gt;CrunchBase: In his definition of Web 3.0, Nova Spivack proposes that the &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id12e2d968&quot;&gt;Semantic Web&lt;/a&gt;, or Semantic Web technologies, will be the force behind much of the innovation that will occur during Web 3.0. 
Do you agree with Nova Spivack? What role, if any, do you feel the &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id13fa4218&quot;&gt;Semantic Web&lt;/a&gt; will play in Web 3.0?&lt;/cite&gt; &lt;/blockquote&gt; &lt;blockquote&gt;Me: I agree with Nova. But I see Web 3.0 as a phase within the &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id188c9000&quot;&gt;Semantic Web&lt;/a&gt; innovation continuum. Web 3.0 exists because Web 2.0 exists. Both of these Web versions express usage and technology focus patterns. Web 2.0 is about the use of Open Source technologies to fashion Web Services that are ultimately used to drive proprietary Software as a Service (SaaS) style solutions. Web 3.0 is about the use of &amp;quot;Smart Data Access&amp;quot; to fashion a new generation of Linked Data aware Web Services and solutions that exploit the federated nature of the Web to maximum effect; proprietary branding will simply be conveyed via quality of data (cleanliness, &lt;a href=&quot;http://dbpedia.org/resource/Context_%28language_use%29&quot; id=&quot;link-id188d2ef8&quot;&gt;context&lt;/a&gt; fidelity, and comprehension of privacy) exposed by URIs.&lt;/blockquote&gt; &lt;p&gt;Here are some examples of the CrunchBase Linked Data &lt;a href=&quot;http://en.wikipedia.org/wiki/Data_Spaces&quot; id=&quot;link-id122756f8&quot;&gt;Space&lt;/a&gt;, as projected via our CrunchBase Sponger Cartridge:&lt;/p&gt; &lt;ol&gt; &lt;li&gt; &lt;a href=&quot;http://demo.openlinksw.com/rdfbrowser2/?uri=http%3A%2F%2Fwww.crunchbase.com%2Fcompany%2Famazon&quot; id=&quot;link-id13e0fd18&quot;&gt;Amazon.com&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://demo.openlinksw.com/rdfbrowser2/?uri=http%3A%2F%2Fwww.crunchbase.com%2Fcompany%2Fmicrosoft&quot; id=&quot;link-id13eef9e0&quot;&gt;Microsoft&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a 
href=&quot;http://demo.openlinksw.com/rdfbrowser2/?uri=http%3A%2F%2Fwww.crunchbase.com%2Fcompany%2Fgoogle&quot; id=&quot;link-id13fe47a0&quot;&gt;Google&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://demo.openlinksw.com/rdfbrowser2/?uri=http%3A%2F%2Fwww.crunchbase.com%2Fcompany%2Fapple&quot; id=&quot;link-id170c73b8&quot;&gt;Apple&lt;/a&gt; &lt;/li&gt; &lt;/ol&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Comments about recent Semantic Gang Podcast</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2008-05-02#1357</atom:id>
  <atom:published>2008-05-02T21:44:31Z</atom:published>
  <atom:updated>2008-05-05T20:06:42.000004-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;After listening to the &lt;a href=&quot;http://semanticgang.talis.com/2008/05/02/april-2008-the-semantic-web-gang-discuss-a-wikipedia-for-data/&quot; id=&quot;link-id1089e218&quot;&gt;latest Semantic Web Gang podcast&lt;/a&gt;, I found myself agreeing with some of the points made by &lt;a href=&quot;http://www.linkedin.com/in/iskold&quot; id=&quot;link-id10b91e58&quot;&gt;Alex Iskold&lt;/a&gt;, specifically: &lt;/p&gt; &lt;ul&gt;-- &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id106e24e0&quot;&gt;Linked Data&lt;/a&gt; does not implicitly imply making all your &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id17ab3d48&quot;&gt;data&lt;/a&gt; public&lt;/ul&gt; &lt;ul&gt;-- &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id11fdcef0&quot;&gt;Linked Data&lt;/a&gt; principles benefit &lt;a href=&quot;http://dbpedia.org/resource/Intranet&quot; id=&quot;link-id109756e8&quot;&gt;Intranet&lt;/a&gt; and &lt;a href=&quot;http://dbpedia.org/resource/Extranet&quot; id=&quot;link-id1099cfd8&quot;&gt;Extranet&lt;/a&gt; style &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id10cd25b0&quot;&gt;data&lt;/a&gt; integration (trumps alternative &lt;a href=&quot;http://dbpedia.org/resource/federated_database_system&quot; id=&quot;link-id14f29940&quot;&gt;distributed database&lt;/a&gt; integration approaches any day)&lt;/ul&gt; &lt;ul&gt;-- Business exploitation of &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0xca51940&quot;&gt;Linked Data&lt;/a&gt; on the &lt;a href=&quot;http://dbpedia.org/resource/World_Wide_Web&quot;&gt;Web&lt;/a&gt; will certainly be driven by the correlation of opportunity costs (which is more than likely what Alex meant by &amp;quot;use cases&amp;quot;) associated with the lack of URIs originating from the domain of a given business (Tom Heath: also effectively alluded to this via his &lt;a 
href=&quot;http://dbpedia.org/resource/BBC&quot; id=&quot;link-id16f33348&quot;&gt;BBC&lt;/a&gt; and &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Identifier&quot; id=&quot;link-id10decf38&quot;&gt;URI&lt;/a&gt; land grab anecdotes; same applies to Georgi&amp;#39;s examples)&lt;/ul&gt; &lt;ul&gt;-- History is a great tutor; answers to many of today&amp;#39;s problems always lie somewhere in plain sight of the past.&lt;/ul&gt; &lt;p&gt;Of course, I also believe that &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot;&gt;Linked Data&lt;/a&gt; serves Web &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1afebd58&quot;&gt;Data&lt;/a&gt; Integration across the &lt;a href=&quot;http://dbpedia.org/resource/Internet&quot; id=&quot;link-id10aa5668&quot;&gt;Internet&lt;/a&gt; very well too, and that it will be beneficial to businesses in a big way. No individual or organization is an island; I think the &lt;a href=&quot;http://dbpedia.org/resource/Internet&quot; id=&quot;link-id0xb25fbd0&quot;&gt;Internet&lt;/a&gt; and Web have done a good job of demonstrating that thus far :-) We&amp;#39;re all &lt;a href=&quot;http://dbpedia.org/resource/Data&quot;&gt;data&lt;/a&gt; nodes in a &lt;a href=&quot;http://dbpedia.org/resource/Giant_Global_Graph&quot; id=&quot;link-id5d8a3a8&quot;&gt;Giant Global Graph&lt;/a&gt;.&lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://myopenlink.net/dataspace/person/danieljohnlewis#this&quot; id=&quot;link-id17cac8a0&quot;&gt;Daniel Lewis&lt;/a&gt; did shed light on the read-write aspects of the Linked Data &lt;a href=&quot;http://dbpedia.org/resource/Giant_Global_Graph&quot; id=&quot;link-id10be8590&quot;&gt;Web&lt;/a&gt;, which is actually very close to the callout for a Wikipedia for Data. 
&lt;a href=&quot;http://www.w3.org/People/Berners-Lee/card#i&quot; id=&quot;link-id10a810c0&quot;&gt;TimBL&lt;/a&gt; has been working on this via &lt;a href=&quot;http://dig.csail.mit.edu/2005/ajar/release/tabulator/0.8/tab.html&quot; id=&quot;link-id184b7108&quot;&gt;Tabulator&lt;/a&gt; (see &lt;a href=&quot;http://dig.csail.mit.edu/2007/tab/tutorial/editing.mov&quot; id=&quot;link-id1416f1e8&quot;&gt;Tabulator Editing Screencast&lt;/a&gt;); &lt;a href=&quot;http://bnode.org/about&quot; id=&quot;link-id17e33750&quot;&gt;Benjamin Nowack&lt;/a&gt; also added &lt;a href=&quot;http://arc.semsol.org/download/plugins/data_wiki&quot; id=&quot;link-id1688cc40&quot;&gt;similar functionality to ARC&lt;/a&gt;, and of course we support the same &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id10bff7c8&quot;&gt;SPARQL&lt;/a&gt; UPDATE into an &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id168ace08&quot;&gt;RDF&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id10641878&quot;&gt;information&lt;/a&gt; resource via the &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0xddb5240&quot;&gt;RDF&lt;/a&gt; Sink feature of our WebDAV and &lt;a href=&quot;http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/OdsBriefcase&quot; id=&quot;link-id0x11199310&quot;&gt;ODS&lt;/a&gt;-Briefcase implementations.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Linked Data and Information Architecture</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-04-29#1350</atom:id>
  <atom:published>2008-04-29T14:37:22Z</atom:published>
  <atom:updated>2008-04-29T17:18:21.000048-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Linked Data and Information Architecture&lt;/div&gt; &lt;p&gt;We had a workshop on &lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id0x1437ac70&quot;&gt;Linked Open Data&lt;/a&gt; (&lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id0x1315f788&quot;&gt;LOD&lt;/a&gt;) last week in &lt;a href=&quot;http://www2008.org/&quot; id=&quot;link-id0x13737468&quot;&gt;Beijing&lt;/a&gt;. You can see the papers in &lt;a href=&quot;http://events.linkeddata.org/ldow2008/#program&quot; id=&quot;link-id10651ab8&quot;&gt;the program&lt;/a&gt;. The event was a success with plenty of good talks and animated conversation. I will not go into every paper here but will comment a little on the conversation and draw some technology requirements going forward.&lt;/p&gt; &lt;p&gt;Tim Berners-Lee showed a read-write version of &lt;a href=&quot;http://dig.csail.mit.edu/2005/ajar/release/tabulator/0.8/tab.html&quot; id=&quot;link-id0x15633520&quot;&gt;Tabulator&lt;/a&gt;. This raises the question of updating on the &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1350a178&quot;&gt;Data&lt;/a&gt; Web. The consensus was that one could assert what one wanted in one&amp;#39;s own space but that others&amp;#39; spaces would be read-only. What spaces one considered relevant would be the user&amp;#39;s or developer&amp;#39;s business, as in the document web.&lt;/p&gt; &lt;p&gt;It seems to me that a significant use case of LOD is an open-web situation where the user picks a broad read-only &amp;quot;data wallpaper&amp;quot; or backdrop of assertions, and then uses this combined with a much smaller, local, writable data set. This is certainly the case when editing data for publishing, as in Tim&amp;#39;s demo. 
This will also be the case when developing mesh-ups combining multiple distinct data sets bound together by sets of SameAs assertions, for example. Questions like, &amp;quot;What is the minimum subset of n data sets needed for deriving the result?&amp;quot; will be common. This will also be the case in applications using proprietary data combined with open data.&lt;/p&gt; &lt;p&gt;This means that databases will have to deal with queries that specify large lists of included graphs, all graphs in the store, or all graphs with an exclusion list. All this is quite possible but again should be considered when architecting systems for an open &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0xa27bae8&quot;&gt;linked data&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Giant_Global_Graph&quot; id=&quot;link-id0x155c3f18&quot;&gt;web&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;&amp;quot;There is data but what can we really do with it? How far can we trust it, and what can we confidently decide based on it?&amp;quot;&lt;/p&gt; &lt;p&gt;As an answer to this question, &lt;a href=&quot;http://zitgist.com/about/&quot; id=&quot;link-id0xd447580&quot;&gt;Zitgist&lt;/a&gt; has compiled the &lt;a href=&quot;http://umbel.org/about/&quot; id=&quot;link-id0x14735008&quot;&gt;UMBEL&lt;/a&gt; taxonomy using &lt;a href=&quot;http://dbpedia.org/resource/SKOS&quot; id=&quot;link-id0x15ab1c48&quot;&gt;SKOS&lt;/a&gt;. This draws on Wikipedia, Open CYC, Wordnet, and &lt;a href=&quot;http://www.mpi-inf.mpg.de/~suchanek/downloads/yago/&quot; id=&quot;link-id0x15d5aa88&quot;&gt;YAGO&lt;/a&gt;, hence the acronym WOWY. UMBEL is both a taxonomy and a set of instance data, containing a large set of &lt;a href=&quot;http://dbpedia.org/resource/Named_entity_recognition&quot; id=&quot;link-id0x9fe45d98&quot;&gt;named entities&lt;/a&gt;, including persons, organizations, geopolitical entities, and so forth. 
By extracting references to this set of named entities from documents and correlating this to the taxonomy, one gets a good idea of what a document (or part thereof) is about.&lt;/p&gt; &lt;p&gt;Kingsley presented this in the Zitgist demo. This is our answer to the criticism about &lt;a href=&quot;http://dbpedia.org/resource/DBpedia&quot; id=&quot;link-id0xa1920800&quot;&gt;DBpedia&lt;/a&gt; having errors in classification. DBpedia, as a bootstrap stage, is about giving names to all things. Subsequent efforts like UMBEL are about refining the relationships.&lt;/p&gt; &lt;p&gt;&amp;quot;Should there be a global &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Identifier&quot; id=&quot;link-id0x12cd5290&quot;&gt;URI&lt;/a&gt; dictionary?&amp;quot;&lt;/p&gt; &lt;p&gt;There was a talk by Paolo Bouquet about the &lt;a href=&quot;http://dbpedia.org/resource/Entity&quot; id=&quot;link-id0x12d03400&quot;&gt;Entity&lt;/a&gt; Name System, a sort of data DNS, with the purpose of associating some description and rough classification to URIs. This would allow discovering URIs for reuse. I&amp;#39;d say that this is good if it can cut down on the SameAs proliferation and if this can be widely distributed and replicated for resilience, &lt;i&gt;à la&lt;/i&gt; DNS. On the other hand, it was pointed out that this was not quite in the LOD spirit, where parties would mint their own dereferenceable URIs, in their own domains. We&amp;#39;ll see.&lt;/p&gt; &lt;p&gt;&amp;quot;What to do when identity expires?&amp;quot;&lt;/p&gt; &lt;p&gt;Giovanni of Sindice said that a document should be removed from search if it was no longer available. Kingsley pointed out that resilience of reference requires some way to recover data. The data web cannot be less resilient than the document web, and there is a point to having access to history. 
He recommended hooking up with the &lt;a href=&quot;http://dbpedia.org/resource/Internet&quot; id=&quot;link-id0x143e4130&quot;&gt;Internet&lt;/a&gt; Archive, since they make long term persistence their business. In this way, if an application depends on data, and the URIs on which it depends are no longer dereferenceable or provide content from a new owner of the domain, those who need the old version can still get it and host it themselves.&lt;/p&gt; &lt;p&gt;It is increasingly clear that OWL SameAs is both the blessing and bane of linked data. We can easily have tens of URIs for the same thing, especially with people. Still, these should be considered the same.&lt;/p&gt; &lt;p&gt;Returning every synonym in a query answer hardly makes sense but accepting them as input seems almost necessary. This is what we do with &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x15a2a930&quot;&gt;Virtuoso&lt;/a&gt;&amp;#39;s SameAs support. Even so, this can easily double query times even when there are no synonyms.&lt;/p&gt; &lt;p&gt;Be that as it may, SameAs is here to stay; just consider the mapping of DBpedia to Geonames, for example.&lt;/p&gt; &lt;p&gt;Also, making aberrant SameAs statements can completely poison a data set and lead to absurd query results. Hence choosing which SameAs assertions from which source will be considered seems necessary. In an open web scenario, this leads inevitably to multi-graph queries that can be complex to write with regular &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x12bb8ce8&quot;&gt;SPARQL&lt;/a&gt;. By extension, it seems that a good query would also include the graphs actually used for deriving each result row. This is of course possible but has some implications on how databases should be organized.&lt;/p&gt; &lt;p&gt;Yves Raimond gave a talk about deriving identity between Musicbrainz and Jamendo. I see the issue as a core question of linked data in general. 
The algorithm Yves presented started with attribute value similarities and then followed related entities. Artists would be the same if they had similar names and similar names of albums with similar song titles, for example. We can find the same basic question in any analysis, for example, looking at how news reporting differs between media, supposing there is adequate entity extraction.&lt;/p&gt; &lt;p&gt;There is basic graph diffing in &lt;a href=&quot;http://data.semanticweb.org/conference/iswc-aswc/2007/tracks/research/papers/533/html&quot; id=&quot;link-id0x153c1fa8&quot;&gt;RDFSync&lt;/a&gt;, for example. But here we are expanding the context significantly. We will traverse references to some depth, allow similarity matches, SameAs, and so forth. Having presumed identity of two URIs, we can then look at the difference in their environment to produce a human readable summary. This could then be evaluated for purposes of analysis or of combining content.&lt;/p&gt; &lt;p&gt;At first sight, these algorithms seem well parallelizable, as long as all threads have access to all data. For scaling, this means a probably message-bound distributed algorithm. This is something to look into for the next stage of linked data.&lt;/p&gt; &lt;p&gt;Some inference is needed, but if everybody has their own choice of data sets to query, then everybody would also have their own entailed triples. This will make for an explosion of entailed graphs if forward chaining is used. Forward chaining is very nice because it keeps queries simple and easy to optimize. With Virtuoso, we still favor backward chaining since we expect a great diversity of graph combinations and near infinite volume in the open web scenario. With private repositories of slowly changing data put together for a special application, the situation is different.&lt;/p&gt; &lt;p&gt;In conclusion, we have a real LOD movement with actual momentum and a good idea of what to do next. 
The next step is promoting this to the broader community, starting with &lt;a href=&quot;http://www.linkeddataplanet.com/&quot; id=&quot;link-id0x155d1d00&quot;&gt;Linked Data Planet&lt;/a&gt; in New York in June.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Linked Data and Information Architecture</atom:title>
  <atom:id>http://www.openlinksw.com/weblog/oerling/?date=2008-04-29#1347</atom:id>
  <atom:published>2008-04-29T12:08:24Z</atom:published>
  <atom:updated>2008-04-29T17:18:15.000009-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;We had a workshop on &lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id0x9e7e5098&quot;&gt;Linked Open Data&lt;/a&gt; (&lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id0xb9e86c8&quot;&gt;LOD&lt;/a&gt;) last week in &lt;a href=&quot;http://www2008.org/&quot; id=&quot;link-id0x131a72a0&quot;&gt;Beijing&lt;/a&gt;. You can see the papers in &lt;a href=&quot;http://events.linkeddata.org/ldow2008/#program&quot; id=&quot;link-id10651ab8&quot;&gt;the program&lt;/a&gt;. The event was a success with plenty of good talks and animated conversation. I will not go into every paper here but will comment a little on the conversation and draw some technology requirements going forward.&lt;/p&gt; &lt;p&gt;Tim Berners-Lee showed a read-write version of &lt;a href=&quot;http://dig.csail.mit.edu/2005/ajar/release/tabulator/0.8/tab.html&quot; id=&quot;link-id0x9da015a0&quot;&gt;Tabulator&lt;/a&gt;. This raises the question of updating on the &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x14813578&quot;&gt;Data&lt;/a&gt; Web. The consensus was that one could assert what one wanted in one&amp;#39;s own space but that others&amp;#39; spaces would be read-only. What spaces one considered relevant would be the user&amp;#39;s or developer&amp;#39;s business, as in the document web.&lt;/p&gt; &lt;p&gt;It seems to me that a significant use case of LOD is an open-web situation where the user picks a broad read-only &amp;quot;data wallpaper&amp;quot; or backdrop of assertions, and then uses this combined with a much smaller, local, writable data set. This is certainly the case when editing data for publishing, as in Tim&amp;#39;s demo. This will also be the case when developing mesh-ups combining multiple distinct data sets bound together by sets of SameAs assertions, for example. 
Questions like, &amp;quot;What is the minimum subset of n data sets needed for deriving the result?&amp;quot; will be common. This will also be the case in applications using proprietary data combined with open data.&lt;/p&gt; &lt;p&gt;This means that databases will have to deal with queries that specify large lists of included graphs, all graphs in the store, or all graphs with an exclusion list. All this is quite possible but again should be considered when architecting systems for an open &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x9f1a1d8&quot;&gt;linked data&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Giant_Global_Graph&quot; id=&quot;link-id0xa5a3e1b0&quot;&gt;web&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;&amp;quot;There is data but what can we really do with it? How far can we trust it, and what can we confidently decide based on it?&amp;quot;&lt;/p&gt; &lt;p&gt;As an answer to this question, &lt;a href=&quot;http://zitgist.com/about/&quot; id=&quot;link-id0xad44dc0&quot;&gt;Zitgist&lt;/a&gt; has compiled the &lt;a href=&quot;http://umbel.org/about/&quot; id=&quot;link-id0x9ebde358&quot;&gt;UMBEL&lt;/a&gt; taxonomy using &lt;a href=&quot;http://dbpedia.org/resource/SKOS&quot; id=&quot;link-id0xa04a85c0&quot;&gt;SKOS&lt;/a&gt;. This draws on Wikipedia, Open CYC, Wordnet, and &lt;a href=&quot;http://www.mpi-inf.mpg.de/~suchanek/downloads/yago/&quot; id=&quot;link-id0x9fdd9018&quot;&gt;YAGO&lt;/a&gt;, hence the acronym WOWY. UMBEL is both a taxonomy and a set of instance data, containing a large set of &lt;a href=&quot;http://dbpedia.org/resource/Named_entity_recognition&quot; id=&quot;link-id0xa0b9cb70&quot;&gt;named entities&lt;/a&gt;, including persons, organizations, geopolitical entities, and so forth. 
By extracting references to this set of named entities from documents and correlating this to the taxonomy, one gets a good idea of what a document (or part thereof) is about.&lt;/p&gt; &lt;p&gt;Kingsley presented this in the Zitgist demo. This is our answer to the criticism about &lt;a href=&quot;http://dbpedia.org/resource/DBpedia&quot; id=&quot;link-id0xdc5d940&quot;&gt;DBpedia&lt;/a&gt; having errors in classification. DBpedia, as a bootstrap stage, is about giving names to all things. Subsequent efforts like UMBEL are about refining the relationships.&lt;/p&gt; &lt;p&gt;&amp;quot;Should there be a global &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Identifier&quot; id=&quot;link-id0xa0aa2cc0&quot;&gt;URI&lt;/a&gt; dictionary?&amp;quot;&lt;/p&gt; &lt;p&gt;There was a talk by Paolo Bouquet about the &lt;a href=&quot;http://dbpedia.org/resource/Entity&quot; id=&quot;link-id0x9e6f4b28&quot;&gt;Entity&lt;/a&gt; Name System, a sort of data DNS, with the purpose of associating some description and rough classification to URIs. This would allow discovering URIs for reuse. I&amp;#39;d say that this is good if it can cut down on the SameAs proliferation and if this can be widely distributed and replicated for resilience, &lt;i&gt;à la&lt;/i&gt; DNS. On the other hand, it was pointed out that this was not quite in the LOD spirit, where parties would mint their own dereferenceable URIs, in their own domains. We&amp;#39;ll see.&lt;/p&gt; &lt;p&gt;&amp;quot;What to do when identity expires?&amp;quot;&lt;/p&gt; &lt;p&gt;Giovanni of Sindice said that a document should be removed from search if it was no longer available. Kingsley pointed out that resilience of reference requires some way to recover data. The data web cannot be less resilient than the document web, and there is a point to having access to history. 
He recommended hooking up with the &lt;a href=&quot;http://dbpedia.org/resource/Internet&quot; id=&quot;link-id0xcc203f8&quot;&gt;Internet&lt;/a&gt; Archive, since they make long term persistence their business. In this way, if an application depends on data, and the URIs on which it depends are no longer dereferenceable or provide content from a new owner of the domain, those who need the old version can still get it and host it themselves.&lt;/p&gt; &lt;p&gt;It is increasingly clear that OWL SameAs is both the blessing and bane of linked data. We can easily have tens of URIs for the same thing, especially with people. Still, these should be considered the same.&lt;/p&gt; &lt;p&gt;Returning every synonym in a query answer hardly makes sense but accepting them as input seems almost necessary. This is what we do with &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xa0d28c98&quot;&gt;Virtuoso&lt;/a&gt;&amp;#39;s SameAs support. Even so, this can easily double query times even when there are no synonyms.&lt;/p&gt; &lt;p&gt;Be that as it may, SameAs is here to stay; just consider the mapping of DBpedia to Geonames, for example.&lt;/p&gt; &lt;p&gt;Also, making aberrant SameAs statements can completely poison a data set and lead to absurd query results. Hence choosing which SameAs assertions from which source will be considered seems necessary. In an open web scenario, this leads inevitably to multi-graph queries that can be complex to write with regular &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0xc89a768&quot;&gt;SPARQL&lt;/a&gt;. By extension, it seems that a good query would also include the graphs actually used for deriving each result row. This is of course possible but has some implications on how databases should be organized.&lt;/p&gt; &lt;p&gt;Yves Raimond gave a talk about deriving identity between Musicbrainz and Jamendo. I see the issue as a core question of linked data in general. 
The algorithm Yves presented started with attribute value similarities and then followed related entities. Artists would be the same if they had similar names and similar names of albums with similar song titles, for example. We can find the same basic question in any analysis, for example, looking at how news reporting differs between media, supposing there is adequate entity extraction.&lt;/p&gt; &lt;p&gt;There is basic graph diffing in &lt;a href=&quot;http://data.semanticweb.org/conference/iswc-aswc/2007/tracks/research/papers/533/html&quot; id=&quot;link-id0x9fe62620&quot;&gt;RDFSync&lt;/a&gt;, for example. But here we are expanding the context significantly. We will traverse references to some depth, allow similarity matches, SameAs, and so forth. Having presumed identity of two URIs, we can then look at the difference in their environment to produce a human readable summary. This could then be evaluated for purposes of analysis or of combining content.&lt;/p&gt; &lt;p&gt;At first sight, these algorithms seem well parallelizable, as long as all threads have access to all data. For scaling, this means a probably message-bound distributed algorithm. This is something to look into for the next stage of linked data.&lt;/p&gt; &lt;p&gt;Some inference is needed, but if everybody has their own choice of data sets to query, then everybody would also have their own entailed triples. This will make for an explosion of entailed graphs if forward chaining is used. Forward chaining is very nice because it keeps queries simple and easy to optimize. With Virtuoso, we still favor backward chaining since we expect a great diversity of graph combinations and near infinite volume in the open web scenario. With private repositories of slowly changing data put together for a special application, the situation is different.&lt;/p&gt; &lt;p&gt;In conclusion, we have a real LOD movement with actual momentum and a good idea of what to do next. 
The next step is promoting this to the broader community, starting with &lt;a href=&quot;http://www.linkeddataplanet.com/&quot; id=&quot;link-id0xa16319a8&quot;&gt;Linked Data Planet&lt;/a&gt; in New York in June.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Linked Data Illustrated and a Virtuoso Functionality Reminder</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2008-04-28#1342</atom:id>
  <atom:published>2008-04-28T17:32:47Z</atom:published>
  <atom:updated>2008-04-28T14:47:06.000001-04:00</atom:updated>
  <atom:content type="html">&lt;a href=&quot;http://myopenlink.net/dataspace/person/danieljohnlewis#this&quot; id=&quot;link-id156ceb30&quot;&gt;Daniel Lewis&lt;/a&gt; has put together a nice &lt;a href=&quot;http://vanirsystems.com/danielsblog/2008/04/27/linked-data-the-role-of-the-data-server/&quot; id=&quot;link-id10456040&quot;&gt;collection of Linked Data related posts&lt;/a&gt; that illustrate the fundamentals of the &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id1033f6f0&quot;&gt;Linked Data&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Giant_Global_Graph&quot; id=&quot;link-id106fa168&quot;&gt;Web&lt;/a&gt; and the vital role that &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id10141c20&quot;&gt;Virtuoso&lt;/a&gt; plays as a deployment platform. Remember, &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id10301e38&quot;&gt;Virtuoso&lt;/a&gt; was architected in 1998 (see &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main/VOSHistory&quot; id=&quot;link-id10c44088&quot;&gt;Virtuoso History&lt;/a&gt;) in anticipation of the eventual &lt;a href=&quot;http://dbpedia.org/resource/Internet&quot; id=&quot;link-id1383a1e8&quot;&gt;Internet&lt;/a&gt;, &lt;a href=&quot;http://dbpedia.org/resource/Intranet&quot; id=&quot;link-id1028e770&quot;&gt;Intranet&lt;/a&gt;, and &lt;a href=&quot;http://dbpedia.org/resource/Extranet&quot; id=&quot;link-id14b07b40&quot;&gt;Extranet&lt;/a&gt; level requirements for a different kind of Server. At the time of &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id14ad24a8&quot;&gt;Virtuoso&lt;/a&gt;&amp;#39;s inception, many thought our desire to build a multi-protocol, multi-model, and multi-purpose, virtual and native &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id108dac48&quot;&gt;data&lt;/a&gt; server was sheer craziness, but we pressed on (courtesy of our vision and technical capabilities). 
Today, we have a very sophisticated &lt;a href=&quot;http://dbpedia.org/resource/Virtuoso_Universal_Server&quot; id=&quot;link-id14a65d48&quot;&gt;Universal Server&lt;/a&gt; Platform (in Open Source and Commercial forms) that is naturally equipped to do the following via very simple interfaces: &lt;ul&gt; - Produce &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id11fb1170&quot;&gt;RDF&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id10871da8&quot;&gt;Linked Data&lt;/a&gt; from non &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id156ec3d0&quot;&gt;RDF&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id10f0ca38&quot;&gt;Data&lt;/a&gt; Sources (Heterogeneous &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id15133078&quot;&gt;SQL&lt;/a&gt;, XML, &lt;a href=&quot;http://dbpedia.org/resource/World_Wide_Web&quot;&gt;Web&lt;/a&gt; Services)&lt;/ul&gt; &lt;ul&gt; - Provide highly scalable &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id10585940&quot;&gt;RDF&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id15151e10&quot;&gt;Data&lt;/a&gt; Management via a Quad Store (&lt;a href=&quot;http://dbpedia.org/resource/DBpedia&quot; id=&quot;link-id1530d640&quot;&gt;DBpedia&lt;/a&gt; is an example of a live demonstration)&lt;/ul&gt; &lt;ul&gt; - Sophisticated Deployment of &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id10141c80&quot;&gt;Linked Data&lt;/a&gt; that exploits the power of &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id1064fa18&quot;&gt;SPARQL&lt;/a&gt; &lt;/ul&gt; &lt;ul&gt; - Powerful WebDAV innovations that simplify read-write mode interaction with &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; 
id=&quot;link-id1396ff68&quot;&gt;Linked Data&lt;/a&gt; &lt;/ul&gt; &lt;ul&gt; - Use Web &lt;a href=&quot;http://dbpedia.org/resource/Federated_database_system&quot; id=&quot;link-id108256e8&quot;&gt;Data Virtualization&lt;/a&gt; to address the pain and frustration associated with Web &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id147e65f8&quot;&gt;Data&lt;/a&gt; Silos (e.g. &lt;a href=&quot;http://dbpedia.org/resource/OpenLink_Data_Spaces&quot; id=&quot;link-idffaf078&quot;&gt;OpenLink Data Spaces&lt;/a&gt; layer atop &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id14ae8fe8&quot;&gt;Virtuoso&lt;/a&gt; that delivers &lt;a href=&quot;http://dbpedia.org/resource/OpenLink_Data_Spaces&quot; id=&quot;link-id0xa0fb5e40&quot;&gt;Personal Data Spaces&lt;/a&gt; / Unified Storage in the Clouds) &lt;/ul&gt; &lt;ul&gt; - Deliver a &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id10869700&quot;&gt;Linked Data&lt;/a&gt; development and deployment platform to .&lt;a href=&quot;http://dbpedia.org/resource/.NET_Framework&quot; id=&quot;link-id1514cac0&quot;&gt;NET&lt;/a&gt; (&lt;a href=&quot;http://dbpedia.org/resource/Visual_Basic&quot; id=&quot;link-id10c107a8&quot;&gt;VB&lt;/a&gt;, &lt;a href=&quot;http://dbpedia.org/resource/C_(programming_language)&quot; id=&quot;link-id101f3c68&quot;&gt;C&lt;/a&gt;#), Java, &lt;a href=&quot;http://dbpedia.org/resource/PHP&quot; id=&quot;link-id106e4710&quot;&gt;PHP&lt;/a&gt;, &lt;a href=&quot;http://dbpedia.org/resource/Ruby_programming_language&quot; id=&quot;link-id10277448&quot;&gt;Ruby&lt;/a&gt;, &lt;a href=&quot;http://dbpedia.org/resource/Perl&quot; id=&quot;link-id10a75748&quot;&gt;Perl&lt;/a&gt;, &lt;a href=&quot;http://dbpedia.org/resource/Python_programming_language&quot; id=&quot;link-id12fdb118&quot;&gt;Python&lt;/a&gt;, &amp;#39;&lt;a href=&quot;http://dbpedia.org/resource/C_(programming_language)&quot; 
id=&quot;link-id10c9d9e0&quot;&gt;C&lt;/a&gt;&amp;#39;, &lt;a href=&quot;http://dbpedia.org/resource/C%2B%2B&quot; id=&quot;link-id10392400&quot;&gt;C++&lt;/a&gt;, and other developers &lt;/ul&gt; &lt;ul&gt;- More...&lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Semantic Web Patterns: A Guide to Semantic Technologies (Update 2)</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2008-03-27#1329</atom:id>
  <atom:published>2008-03-27T00:08:13Z</atom:published>
  <atom:updated>2008-07-16T21:43:36-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;For all the one-way feed consumers and aggregators, and readers of the original post, here is a variant equipped with hyperlinked phrases as opposed to words. As I stated in the prior post, the post (like most of my posts) was part experiment / dog-fooding of automatic tagging and hyper-linking functionality in &lt;a href=&quot;http://dbpedia.org/resource/OpenLink_Data_Spaces&quot; id=&quot;link-id0x194f56f0&quot;&gt;OpenLink Data Spaces&lt;/a&gt;. &lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://www.readwriteweb.com&quot; id=&quot;link-id0x1bddde00&quot;&gt;ReadWriteWeb&lt;/a&gt; via &lt;a href=&quot;http://alexiskold.wordpress.com/&quot; id=&quot;link-id154ae848&quot;&gt;Alex Iskold&amp;#39;s post&lt;/a&gt; have delivered another iteration of their &amp;quot;Guide to Semantic Technologies&amp;quot;. &lt;/p&gt; &lt;p&gt;If you look at the title of this post (and &lt;a href=&quot;http://feeds.feedburner.com/%7Er/readwriteweb/%7E3/257943334/semantic_web_patterns.php&quot; id=&quot;link-id10a9a900&quot;&gt;their article&lt;/a&gt;) they seem to be accurately providing a guide to Semantic Technologies, so no qualms there. 
If, on the other hand, this is supposed to be a guide to the &amp;quot;&lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id0x15ccef28&quot;&gt;Semantic Web&lt;/a&gt;&amp;quot; as prescribed by &lt;a href=&quot;http://www.w3.org/People/Berners-Lee/card#i&quot; id=&quot;link-id0xb94a2d40&quot;&gt;TimBL&lt;/a&gt;, then they are completely missing the essence of the whole subject, and demonstrably so, I may add, since the entities: &amp;quot;&lt;a href=&quot;http://www.readwriteweb.com&quot;&gt;ReadWriteWeb&lt;/a&gt;&amp;quot; and &amp;quot;&lt;a href=&quot;http://www.linkedin.com/in/iskold&quot; id=&quot;link-id0x19960308&quot;&gt;Alex Iskold&lt;/a&gt;&amp;quot; are only describable today via the attributes of the documents they publish, i.e., their respective blogs and hosted &lt;a href=&quot;http://dbpedia.org/resource/Blog&quot; id=&quot;link-id0x1a719968&quot;&gt;blog&lt;/a&gt; posts.&lt;/p&gt; &lt;blockquote&gt; &lt;p&gt;Preoccupation with Literal objects as described above implies we can only take what &amp;quot;ReadWriteWeb&amp;quot; and &amp;quot;&lt;a href=&quot;http://www.linkedin.com/in/iskold&quot;&gt;Alex Iskold&lt;/a&gt;&amp;quot; say &amp;quot;Literally&amp;quot; (&lt;a href=&quot;http://dbpedia/resource/Grep&quot; id=&quot;link-id0xbc8568f8&quot;&gt;grep&lt;/a&gt;, &lt;a href=&quot;http://dbpedia.org/resource/regular_expression&quot; id=&quot;link-id0x1d915e70&quot;&gt;regex&lt;/a&gt;, and &lt;a href=&quot;http://dbpedia.org/resource/XPath&quot; id=&quot;link-id0xbc617820&quot;&gt;XPath&lt;/a&gt;/&lt;a href=&quot;http://dbpedia.org/resource/XQuery&quot; id=&quot;link-id0x150e1c50&quot;&gt;XQuery&lt;/a&gt; are the only tools for searching deeper in this Literal realm); we have no sense of what makes them tick or where they come from, no history (bar &amp;quot;About Page&amp;quot; blurb), no &lt;a href=&quot;http://dbpedia.org/resource/Data&quot;&gt;data&lt;/a&gt; connections beyond anchored text (more pointers to opaque data 
sources) in posts and blogrolls. The only connection between this post and them is my deliberate use of the same literal text in the Title of this post.&lt;/p&gt; &lt;/blockquote&gt; &lt;p&gt;&lt;a href=&quot;http://www.w3.org/People/Berners-Lee/card#i&quot;&gt;TimBL&lt;/a&gt;&amp;#39;s vision as espoused via the &amp;quot;&lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot;&gt;Semantic Web&lt;/a&gt;&amp;quot; vision is about the production, consumption, and sharing of Data Objects via HTTP based Identifiers called URIs/IRIs (&lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0xb867ced0&quot;&gt;Hyperdata&lt;/a&gt; Links / &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x3c8f438&quot;&gt;Linked Data&lt;/a&gt;). It&amp;#39;s how we use the &lt;a href=&quot;http://dbpedia.org/resource/World_Wide_Web&quot;&gt;Web&lt;/a&gt; as a &lt;a href=&quot;http://dbpedia.org/resource/federated_database_system&quot; id=&quot;link-id0xbcb04f20&quot;&gt;Distributed Database&lt;/a&gt; where (as &lt;a href=&quot;http://www.cs.umd.edu/~hendler/2003/foaf.rdf#jhendler&quot; id=&quot;link-id0xb8595f18&quot;&gt;Jim Hendler&lt;/a&gt; once stated with immense clarity): I can point to records (&lt;a href=&quot;http://dbpedia.org/resource/Entity&quot; id=&quot;link-id0xbc9c8ab8&quot;&gt;entity&lt;/a&gt; instances) in your database (aka &lt;a href=&quot;http://en.wikipedia.org/wiki/Data_Spaces&quot; id=&quot;link-id0x3b911c0&quot;&gt;Data Space&lt;/a&gt;) from mine. 
Which is to say that if we can all point to data entities/objects (not just data entities of type &amp;quot;Document&amp;quot;) using these Location, Value, and Structure independent Object Identifiers (courtesy of HTTP) we end up with a much more powerful Web, and one that is closer to the &amp;quot;Federated and Open&amp;quot; nature of the Web.&lt;/p&gt; &lt;p&gt;As I stated in a prior post, if you or your platform of choice aren&amp;#39;t producing de-referencable URIs for your data objects, you may be Semantic (this data model predates the Web), but there is no &amp;quot;&lt;a href=&quot;http://dbpedia.org/resource/World_Wide_Web&quot; id=&quot;link-id0xbcb968b0&quot;&gt;World Wide Web&lt;/a&gt;&amp;quot; in what you are doing.&lt;/p&gt; &lt;h2&gt;What are the Benefits of the Semantic Web?&lt;/h2&gt; &lt;ul&gt; &lt;strong&gt;Consumer&lt;/strong&gt; - &amp;quot;Discovery of relevant things&amp;quot; and being &amp;quot;Discovered by relevant things&amp;quot; (people, places, events, and other things)&lt;/ul&gt; &lt;ul&gt; &lt;strong&gt;Enterprise&lt;/strong&gt; - ditto plus the addition of enterprise domain specific things such as market opportunities, product portfolios, human resources, partners, customers, competitors, co-opetitors, acquisition targets, new regulation, etc.&lt;/ul&gt; &lt;h2&gt;Simple demo:&lt;/h2&gt; &lt;blockquote&gt; &lt;p&gt;I am &lt;a href=&quot;http://myopenlink.net/dataspace/person/kidehen#this&quot; id=&quot;link-id0x150661b0&quot;&gt;Kingsley Idehen&lt;/a&gt;, a Person who authors &lt;a href=&quot;http://www.openlinksw.com/blog/~kidehen&quot; id=&quot;link-id0x3b956d0&quot;&gt;this weblog&lt;/a&gt;. I also share bookmarks gathered over the years across an array of subjects via &lt;a href=&quot;http://myopenlink.net/dataspace/kidehen/bookmark/KingsleyBookmarks&quot; id=&quot;link-id0x164fecb0&quot;&gt;my bookmark data space&lt;/a&gt;. 
I also subscribe to a number of RSS/Atom/RDF feeds, which I share via my feeds subscription data &lt;a href=&quot;http://en.wikipedia.org/wiki/Data_Spaces&quot;&gt;space&lt;/a&gt;. Of course, all of these data sources have Tags which are collectively exposed via my &lt;a href=&quot;http://myopenlink.net/dataspace/kidehen/weblog/MyBlogDataSpace/tagcloud&quot; id=&quot;link-id0x15188c50&quot;&gt;weblog tag-cloud&lt;/a&gt;, feeds subscriptions &lt;a href=&quot;http://dbpedia.org/resource/Tag&quot; id=&quot;link-id0x5f38b98&quot;&gt;tag&lt;/a&gt;-cloud, and &lt;a href=&quot;http://myopenlink.net/dataspace/kidehen/bookmark/KingsleyBookmarks/tagcloud&quot; id=&quot;link-id0xb93c2a50&quot;&gt;bookmarks tag-cloud&lt;/a&gt; data spaces.&lt;/p&gt; &lt;p&gt;As I don&amp;#39;t like repeating myself, and I hate wasting my time or the time of others, I simply share &lt;a href=&quot;http://myopenlink.net/dataspace/kidehen&quot; id=&quot;link-id0x3aeba98&quot;&gt;my Data Space&lt;/a&gt; (a collection of all of my purpose specific data spaces) via the Web so that others (friends, family, employees, partners, customers, project collaborators, competitors, co-opetitors etc.) 
can intentionally or serendipitously discover relevant data en route to creating new &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x14e35d78&quot;&gt;information&lt;/a&gt; (perspectives) that is hopefully exposed to others via the Web.&lt;/p&gt; &lt;/blockquote&gt; &lt;p&gt;Bottom-line, the Semantic Web is about adding the missing &amp;quot;Open Data Access &amp;amp; Connectivity&amp;quot; feature to the current Document Web (we have to go beyond &lt;a href=&quot;http://dbpedia.org/resource/regular_expression&quot;&gt;regex&lt;/a&gt;, &lt;a href=&quot;http://dbpedia/resource/Grep&quot;&gt;grep&lt;/a&gt;, &lt;a href=&quot;http://dbpedia.org/resource/XPath&quot;&gt;xpath&lt;/a&gt;, &lt;a href=&quot;http://dbpedia.org/resource/XQuery&quot;&gt;xquery&lt;/a&gt;, &lt;a href=&quot;http://dbpedia.org/resource/Full_text_search&quot; id=&quot;link-id0x1c1bf9c8&quot;&gt;full text search&lt;/a&gt;, and other literal scraping approaches). The &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot;&gt;Linked Data&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Giant_Global_Graph&quot; id=&quot;link-id0x14c9e0e8&quot;&gt;Web&lt;/a&gt; of de-referencable data object URIs is the critical foundation layer that makes this feasible.&lt;/p&gt; &lt;p&gt; Remember, it&amp;#39;s not about &amp;quot;Applications&amp;quot;; it&amp;#39;s about Data and actually freeing Data from the &amp;quot;tyranny of Applications&amp;quot;. Unfortunately, applications inadvertently always create silos (esp. on the Web) since &lt;a href=&quot;http://dbpedia.org/resource/Entity&quot;&gt;entity&lt;/a&gt; data modeling, open data access, and other database technology realm matters remain of secondary interest to many application developers.&lt;/p&gt; &lt;p&gt;Final comment: RDF facilitates Linked Data on the Web, but not all RDF is endowed with de-referencable URIs (a major source of confusion and misunderstanding). 
Thus, you can have RDF Data Source Providers that simply project RDF data silos via Web Services APIs if RDF output emanating from a Web Service doesn&amp;#39;t provide out-bound pathways to other data via de-referencable URIs. Of course the same also applies to Widgets that present you with all the things they&amp;#39;ve discovered without exposing de-referencable URIs for each item.&lt;/p&gt; &lt;p&gt;BTW - my final comments above aren&amp;#39;t in any way incongruent with devising successful business models for the Web. As you may or may not know, OpenLink is not only a major platform provider for the Semantic Web (expressed in our UDA, &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xb919b098&quot;&gt;Virtuoso&lt;/a&gt;, OpenLink Data Spaces, and OAT products), we are also actively seeding Semantic Web (tribe: Linked Data of course) startups. For instance, &lt;a href=&quot;http://zitgist.com/about/&quot; id=&quot;link-id0x1481b218&quot;&gt;Zitgist&lt;/a&gt;, which now has &lt;a href=&quot;http://community.linkeddata.org/dataspace/person/mkbergman#this&quot; id=&quot;link-id0xb869bb18&quot;&gt;Mike Bergman&lt;/a&gt; as its CEO alongside &lt;a href=&quot;http://fgiasson.com/me/&quot; id=&quot;link-id0x1d18fe50&quot;&gt;Frederick Giasson&lt;/a&gt; as CTO. 
Of course, I cannot do &lt;a href=&quot;http://zitgist.com/about/&quot;&gt;Zitgist&lt;/a&gt; justice via a footnote in a &lt;a href=&quot;http://dbpedia.org/resource/Blog&quot;&gt;blog&lt;/a&gt; post, so I will expand further in a separate post.&lt;/p&gt; &lt;h2&gt;Additional &lt;a href=&quot;http://dbpedia.org/resource/Information&quot;&gt;information&lt;/a&gt; about this blog post: &lt;/h2&gt; &lt;ol&gt; &lt;li&gt; I didn&amp;#39;t spend hours looking for URIs used in my hyperlinks&lt;/li&gt; &lt;li&gt; The post is best viewed via an RDF Linked Data aware user agent (&lt;a href=&quot;http://demo.openlinksw.com/rdfbrowser&quot; id=&quot;link-id0x19af3468&quot;&gt;OpenLink RDF Browser&lt;/a&gt;, Zitgist &lt;a href=&quot;http://dataviewer.zitgist.com&quot; id=&quot;link-id0x13b17138&quot;&gt;Data Viewer&lt;/a&gt;, &lt;a href=&quot;http://www4.wiwiss.fu-berlin.de/rdf_browser&quot; id=&quot;link-id0xbc8579e0&quot;&gt;DISCO Hyperdata Browser&lt;/a&gt;, &lt;a href=&quot;http://dig.csail.mit.edu/2005/ajar/release/tabulator/0.8/tab.html&quot; id=&quot;link-id0x18ad0ec8&quot;&gt;Tabulator&lt;/a&gt;).&lt;/li&gt; &lt;/ol&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Semantic Web Patterns: A Guide to Semantic Technologies (Update 1)</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2008-03-26#1328</atom:id>
  <atom:published>2008-03-26T22:44:00Z</atom:published>
  <atom:updated>2008-07-16T21:43:04-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt; &lt;a href=&quot;http://www.readwriteweb.com&quot; id=&quot;link-id11846528&quot;&gt;ReadWriteWeb&lt;/a&gt; via &lt;a href=&quot;http://alexiskold.wordpress.com/&quot; id=&quot;link-id154ae848&quot;&gt;Alex Iskold&lt;/a&gt; have delivered another iteration of their &amp;quot;Guide to Semantic Technologies&amp;quot;. &lt;/p&gt; &lt;p&gt;If you look at the title of this post (and &lt;a href=&quot;http://feeds.feedburner.com/%7Er/readwriteweb/%7E3/257943334/semantic_web_patterns.php&quot; id=&quot;link-id10a9a900&quot;&gt;their article&lt;/a&gt;) they seem to be accurately providing a guide to Semantic Technologies, so no qualms there. If, on the other hand, this is supposed to be a guide to the &amp;quot;&lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id0xbcb19320&quot;&gt;Semantic Web&lt;/a&gt;&amp;quot; as prescribed by &lt;a href=&quot;http://www.w3.org/People/Berners-Lee/card#i&quot; id=&quot;link-id0xb8725878&quot;&gt;TimBL&lt;/a&gt;, then they are completely missing the essence of the whole subject, and demonstrably so, I may add, since the entities: &amp;quot;&lt;a href=&quot;http://www.readwriteweb.com&quot; id=&quot;link-id0x16804040&quot;&gt;ReadWriteWeb&lt;/a&gt;&amp;quot; and &amp;quot;&lt;a href=&quot;http://www.linkedin.com/in/iskold&quot; id=&quot;link-id0x13f08538&quot;&gt;Alex Iskold&lt;/a&gt;&amp;quot; are only describable today via the attributes of the documents they publish, i.e., their respective blogs and hosted &lt;a href=&quot;http://dbpedia.org/resource/Blog&quot; id=&quot;link-id0x1850ca98&quot;&gt;blog&lt;/a&gt; posts. 
&lt;/p&gt; &lt;blockquote&gt; &lt;p&gt;Preoccupation with Literal objects as described above implies we can only take what &amp;quot;&lt;a href=&quot;http://www.readwriteweb.com&quot;&gt;ReadWriteWeb&lt;/a&gt;&amp;quot; and &amp;quot;&lt;a href=&quot;http://www.linkedin.com/in/iskold&quot;&gt;Alex Iskold&lt;/a&gt;&amp;quot; say &amp;quot;Literally&amp;quot; (&lt;a href=&quot;http://dbpedia/resource/Grep&quot; id=&quot;link-id0xb95a6a40&quot;&gt;grep&lt;/a&gt;, &lt;a href=&quot;http://dbpedia.org/resource/regular_expression&quot; id=&quot;link-id0x1a719968&quot;&gt;regex&lt;/a&gt;, and &lt;a href=&quot;http://dbpedia.org/resource/XPath&quot; id=&quot;link-id0xb89d78b8&quot;&gt;XPath&lt;/a&gt;/&lt;a href=&quot;http://dbpedia.org/resource/XQuery&quot; id=&quot;link-id0x1bddde00&quot;&gt;XQuery&lt;/a&gt; are the only tools for searching deeper in this Literal realm); we have no sense of what makes them tick or where they come from, no history (bar &amp;quot;About Page&amp;quot; blurb), no &lt;a href=&quot;http://dbpedia.org/resource/Data&quot;&gt;data&lt;/a&gt; connections beyond anchored text (more pointers to opaque data sources) in posts and blogrolls. The only connection between this post and them is my deliberate use of the same literal text in the Title of this post.&lt;/p&gt; &lt;/blockquote&gt; &lt;p&gt;&lt;a href=&quot;http://www.w3.org/People/Berners-Lee/card#i&quot;&gt;TimBL&lt;/a&gt;&amp;#39;s vision as espoused via the &amp;quot;&lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot;&gt;Semantic Web&lt;/a&gt;&amp;quot; vision is about the production, consumption, and sharing of Data Objects via HTTP based Identifiers called URIs/IRIs (&lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x150e7be0&quot;&gt;Hyperdata&lt;/a&gt; Links / &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x18e50818&quot;&gt;Linked Data&lt;/a&gt;). 
It&amp;#39;s how we use the &lt;a href=&quot;http://dbpedia.org/resource/World_Wide_Web&quot;&gt;Web&lt;/a&gt; as a &lt;a href=&quot;http://dbpedia.org/resource/federated_database_system&quot; id=&quot;link-id0x194f56f0&quot;&gt;Distributed Database&lt;/a&gt; where (as &lt;a href=&quot;http://www.cs.umd.edu/~hendler/2003/foaf.rdf#jhendler&quot; id=&quot;link-id0x17043b38&quot;&gt;Jim Hendler&lt;/a&gt; once stated with immense clarity): I can point to records (&lt;a href=&quot;http://dbpedia.org/resource/Entity&quot; id=&quot;link-id0x1476f788&quot;&gt;entity&lt;/a&gt; instances) in your database (aka &lt;a href=&quot;http://en.wikipedia.org/wiki/Data_Spaces&quot; id=&quot;link-id0x2621140&quot;&gt;Data Space&lt;/a&gt;) from mine. Which is to say that if we can all point to data entities/objects (not just data entities of type &amp;quot;Document&amp;quot;) using these Location, Value, and Structure independent Object Identifiers (courtesy of HTTP) we end up with a much more powerful Web, and one that is closer to the &amp;quot;Federated and Open&amp;quot; nature of the Web.&lt;/p&gt; &lt;p&gt;As I stated in a prior post, if you or your platform of choice aren&amp;#39;t producing de-referencable URIs for your data objects, you may be Semantic (this data model predates the Web), but there is no &amp;quot;&lt;a href=&quot;http://dbpedia.org/resource/World_Wide_Web&quot; id=&quot;link-id0xb860eec8&quot;&gt;World Wide Web&lt;/a&gt;&amp;quot; in what you are doing.&lt;/p&gt; &lt;h2&gt;What are the Benefits of the Semantic Web?&lt;/h2&gt; &lt;ul&gt; &lt;strong&gt;Consumer&lt;/strong&gt; - &amp;quot;Discovery of relevant things&amp;quot; and being &amp;quot;Discovered by relevant things&amp;quot; (people, places, events, and other things)&lt;/ul&gt; &lt;ul&gt; &lt;strong&gt;Enterprise&lt;/strong&gt; - ditto plus the addition of enterprise domain specific things such as market opportunities, product portfolios, human resources, partners, customers, competitors, 
co-opetitors, acquisition targets, new regulation, etc.&lt;/ul&gt; &lt;h2&gt;Simple demo:&lt;/h2&gt; &lt;blockquote&gt; &lt;p&gt;I am &lt;a href=&quot;http://myopenlink.net/dataspace/person/kidehen#this&quot; id=&quot;link-id0x15394798&quot;&gt;Kingsley Idehen&lt;/a&gt;, a Person who authors &lt;a href=&quot;http://www.openlinksw.com/blog/~kidehen&quot; id=&quot;link-id0x2556670&quot;&gt;this weblog&lt;/a&gt;. I also share bookmarks gathered over the years across an array of subjects via &lt;a href=&quot;http://myopenlink.net/dataspace/kidehen/bookmark/KingsleyBookmarks&quot; id=&quot;link-id0x142eaa10&quot;&gt;my bookmark data space&lt;/a&gt;. I also subscribe to a number of RSS/Atom/RDF feeds, which I share via my feeds subscription data &lt;a href=&quot;http://en.wikipedia.org/wiki/Data_Spaces&quot;&gt;space&lt;/a&gt;. Of course, all of these data sources have Tags which are collectively exposed via my &lt;a href=&quot;http://myopenlink.net/dataspace/kidehen/weblog/MyBlogDataSpace/tagcloud&quot; id=&quot;link-id0x140b8050&quot;&gt;weblog tag-cloud&lt;/a&gt;, feeds subscriptions &lt;a href=&quot;http://dbpedia.org/resource/Tag&quot; id=&quot;link-id0x15158d60&quot;&gt;tag&lt;/a&gt;-cloud, and &lt;a href=&quot;http://myopenlink.net/dataspace/kidehen/bookmark/KingsleyBookmarks/tagcloud&quot; id=&quot;link-id0xb8652490&quot;&gt;bookmarks tag-cloud&lt;/a&gt; data spaces.&lt;/p&gt; &lt;p&gt;As I don&amp;#39;t like repeating myself, and I hate wasting my time or the time of others, I simply share &lt;a href=&quot;http://myopenlink.net/dataspace/kidehen&quot; id=&quot;link-id0x13b63208&quot;&gt;my Data Space&lt;/a&gt; (a collection of all of my purpose specific data spaces) via the Web so that others (friends, family, employees, partners, customers, project collaborators, competitors, co-opetitors etc.) 
can intentionally or serendipitously discover relevant data en route to creating new &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x14365150&quot;&gt;information&lt;/a&gt; (perspectives) that is hopefully exposed to others via the Web.&lt;/p&gt; &lt;/blockquote&gt; &lt;p&gt;Bottom-line, the Semantic Web is about adding the missing &amp;quot;Open Data Access &amp;amp; Connectivity&amp;quot; feature to the current Document Web (we have to go beyond &lt;a href=&quot;http://dbpedia.org/resource/regular_expression&quot;&gt;regex&lt;/a&gt;, &lt;a href=&quot;http://dbpedia/resource/Grep&quot;&gt;grep&lt;/a&gt;, &lt;a href=&quot;http://dbpedia.org/resource/XPath&quot;&gt;xpath&lt;/a&gt;, &lt;a href=&quot;http://dbpedia.org/resource/XQuery&quot;&gt;xquery&lt;/a&gt;, &lt;a href=&quot;http://dbpedia.org/resource/Full_text_search&quot; id=&quot;link-id0x15ccef28&quot;&gt;full text search&lt;/a&gt;, and other literal scraping approaches). The &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot;&gt;Linked Data&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Giant_Global_Graph&quot; id=&quot;link-id0x1a2810b8&quot;&gt;Web&lt;/a&gt; of de-referencable data object URIs is the critical foundation layer that makes this feasible.&lt;/p&gt; &lt;p&gt; Remember, it&amp;#39;s not about &amp;quot;Applications&amp;quot;; it&amp;#39;s about Data and actually freeing Data from the &amp;quot;tyranny of Applications&amp;quot;. Unfortunately, applications inadvertently always create silos (esp. on the Web) since &lt;a href=&quot;http://dbpedia.org/resource/Entity&quot;&gt;entity&lt;/a&gt; data modeling, open data access, and other database technology realm matters remain of secondary interest to many application developers.&lt;/p&gt; &lt;p&gt;Final comment: RDF facilitates Linked Data on the Web, but not all RDF is endowed with de-referencable URIs (a major source of confusion and misunderstanding). 
Thus, you can have RDF Data Source Providers that simply project RDF data silos via Web Services APIs if RDF output emanating from a Web Service doesn&amp;#39;t provide out-bound pathways to other data via de-referencable URIs. Of course the same also applies to Widgets that present you with all the things they&amp;#39;ve discovered without exposing de-referencable URIs for each item.&lt;/p&gt; &lt;p&gt;BTW - my final comments above aren&amp;#39;t in any way incongruent with devising successful business models for the Web. As you may or may not know, OpenLink is not only a major platform provider for the Semantic Web (expressed in our UDA, &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x19e44e80&quot;&gt;Virtuoso&lt;/a&gt;, &lt;a href=&quot;http://dbpedia.org/resource/OpenLink_Data_Spaces&quot; id=&quot;link-id0xb8637720&quot;&gt;OpenLink Data Spaces&lt;/a&gt;, and OAT products), we are also actively seeding Semantic Web (tribe: Linked Data of course) startups. For instance, &lt;a href=&quot;http://zitgist.com/about/&quot; id=&quot;link-id0x397b940&quot;&gt;Zitgist&lt;/a&gt;, which now has &lt;a href=&quot;http://community.linkeddata.org/dataspace/person/mkbergman#this&quot; id=&quot;link-id0x5fabcf0&quot;&gt;Mike Bergman&lt;/a&gt; as its CEO alongside &lt;a href=&quot;http://fgiasson.com/me/&quot; id=&quot;link-id0xb84720f8&quot;&gt;Frederick Giasson&lt;/a&gt; as CTO. 
Of course, I cannot do &lt;a href=&quot;http://zitgist.com/about/&quot;&gt;Zitgist&lt;/a&gt; justice via a footnote in a &lt;a href=&quot;http://dbpedia.org/resource/Blog&quot;&gt;blog&lt;/a&gt; post, so I will expand further in a separate post.&lt;/p&gt; &lt;h2&gt;Additional &lt;a href=&quot;http://dbpedia.org/resource/Information&quot;&gt;information&lt;/a&gt; about this blog post:&lt;/h2&gt; &lt;ol&gt; &lt;li&gt; I didn&amp;#39;t spend hours looking for URIs used in my hyperlinks &lt;/li&gt; &lt;li&gt; The post is best viewed via an RDF Linked Data aware user agent (&lt;a href=&quot;http://demo.openlinksw.com/rdfbrowser&quot; id=&quot;link-id0x3ac1b68&quot;&gt;OpenLink RDF Browser&lt;/a&gt;, Zitgist &lt;a href=&quot;http://dataviewer.zitgist.com&quot; id=&quot;link-id0x1d8e7ec0&quot;&gt;Data Viewer&lt;/a&gt;, &lt;a href=&quot;http://www4.wiwiss.fu-berlin.de/rdf_browser&quot; id=&quot;link-id0x19af3468&quot;&gt;DISCO Hyperdata Browser&lt;/a&gt;, &lt;a href=&quot;http://dig.csail.mit.edu/2005/ajar/release/tabulator/0.8/tab.html&quot; id=&quot;link-id0x1532e630&quot;&gt;Tabulator&lt;/a&gt;).&lt;/li&gt; &lt;/ol&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>RDF Benchmarking, Role, Motives, and Rationale</atom:title>
  <atom:id>http://www.openlinksw.com/weblog/oerling/?date=2007-11-21#1274</atom:id>
  <atom:published>2007-11-21T14:19:39Z</atom:published>
  <atom:updated>2008-04-30T14:28:17.000001-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;Arising from the recent W3C workshop on &lt;a href=&quot;http://www.w3.org/2007/03/RdfRDB/&quot; id=&quot;link-id10679c70&quot;&gt;mapping relational data to RDF&lt;/a&gt;, there is some discussion on &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1268&quot; id=&quot;link-id1258dca8&quot;&gt;starting a benchmarking oriented experimental group&lt;/a&gt; under the W3C. I&amp;#39;ll here make some comments on where this might fit and how this might serve our nascent industry.&lt;/p&gt; &lt;p&gt;To the public, basically any recipient of the semantic &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0xa203a350&quot;&gt;data&lt;/a&gt; web message, the benchmarking activity should communicate:&lt;/p&gt; &lt;ul&gt; &lt;li&gt; &lt;p&gt;The semantic data web claims to&lt;/p&gt; &lt;ol&gt; &lt;li&gt; allow integrating any legacy data from wherever and allow translating this into common, mutually joinable vocabularies, and&lt;/li&gt; &lt;li&gt;make the web into a big database capable of answering structured queries on any open data.&lt;/li&gt; &lt;/ol&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;The benchmarking activity is to prove that this is not a pipe dream that Gartner Group forecast for 2027. 
Instead, there exists &lt;/p&gt; &lt;ol&gt; &lt;li&gt;an industry, &lt;/li&gt; &lt;li&gt;a degree of consensus within the industry concerning what the semantic data web is for, and&lt;/li&gt; &lt;li&gt;products that are beyond experimental and can deliver at least some of the claimed benefits of the semantic data web.&lt;/li&gt; &lt;/ol&gt; &lt;/li&gt; &lt;/ul&gt; &lt;p&gt;To the general public, the message will be best delivered by the existence of online services that do interesting things with &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x1404e708&quot;&gt;linked data&lt;/a&gt;, starting from search and going to more specialized derivative products of structured &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0xa1e12cd8&quot;&gt;information&lt;/a&gt; on the web.&lt;/p&gt; &lt;p&gt;To those intending to apply some semantic data web things themselves, the benchmark activity should give a directory of products to look at. The reason why a benchmark suite backed by some industry consortium is useful is that it adds to the end user&amp;#39;s confidence that the use case being measured is of somewhat general relevance and not just made to demonstrate any single product&amp;#39;s strengths. Besides this, the TPC idea of disclosing scale, throughput, price per throughput and date is fine because it makes for easy tabulation of results. The intricacies of the full disclosure are effectively masked and it is my guess that very few read the actual full disclosures.&lt;/p&gt; &lt;p&gt;The inference that an evaluator draws from benchmark results is that some product figuring there consistently is somewhat serious and can be studied further. Being in the running is like a stamp of approval. The benchmarks are complex and the evaluator seldom goes to the trouble of really analyzing performance by individual query or transaction even if these are and must be given. 
It is a bit like how Formula 1 viewers do not generally read the rules on car engines or aerodynamics, let alone understand their finer points.&lt;/p&gt; &lt;p&gt;For credibility to be thus given to products and hence the industry, we should just have a couple of well defined and agreed upon benchmarks, just like TPC.&lt;/p&gt; &lt;p&gt;The third public is the developer. As a DBMS developer, I am a great fan of TPC. The great benefit I derive from their work is that they give a test suite for measuring effects of code changes on performance. Also, assuming that the TPC workload mix is representative, it allows ranking which optimizations are more important than others. Lastly, TPC gives a great way of describing results, e.g., changes resulting in x% improvement on throughput of y. In such usage, the benchmarks are pretty much never run by the rules but results obtained are still good for internal comparison.&lt;/p&gt; &lt;p&gt;Communication about IS should allow for short, simple messages: Release XX Halves Price per Throughput.&lt;/p&gt; &lt;p&gt;The existence of benchmarks is, if not absolutely necessary, then at least a great help for such communication. Besides, people are culturally used to all kinds of racing and sports results so this is even a familiar format.&lt;/p&gt; &lt;p&gt;Now the TPC is also not perfect. In the high end, the measured configurations are so large that one does not see them very often in practice. It is like the techno sports of Formula 1 or America&amp;#39;s Cup. Interesting for the curiosity value but not immediately relevant to the regular car buyer or weekend yachtsman. Further, sponsoring a by-the-book audited TPC result is not so simple. Not as expensive as putting out an America&amp;#39;s Cup challenge but still some trouble and expense.&lt;/p&gt; &lt;p&gt;So, for us to benefit by the benchmarking activity, we must find a group that can both agree and be somewhat representative. 
Then we must put out a simple message: This here is for integration of relational sources and this here for storage and query of &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0xa192c590&quot;&gt;RDF&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Furthermore, in so far as we derive from relational or similar sources, the technology should not do less than the established alternative. This sends the wrong message.&lt;/p&gt; &lt;p&gt;Entering the running should not be overly difficult for vendors, hence we should not have too many benchmarks and the ones that there are should be representative and sufficiently varied workloads. The results should be compact and easy to state. One more reason why I like TPC&amp;#39;s work is the fact that the benchmarks have an easy to understand, unified use case behind them. Approximately what is done in each becomes clear from a very short and succinct description even though the details can be complex. I suspect this is one side of their appeal. I would venture the guess that a single use case story is easier to sell than a composite metric of disparate tests. Also in the scientific computing world, we have use cases, like NAS for aerodynamics, so having a use case story is quite common and a factor for making a benchmark&amp;#39;s relevance understandable.&lt;/p&gt; &lt;p&gt;Is this all possible?&lt;/p&gt; &lt;p&gt;To play the devil&amp;#39;s advocate, I could say that the use cases are not as well settled as the relational ones hence formulating a generally representative benchmark is not possible. Now this is certainly not a message that this community wishes to send. Besides, there exists decades worth of history of the problems of information integration and a great deal of RDF data out there, even a compilation of dozens of industry use cases by the SWEO, so we are not exactly in the dark here.&lt;/p&gt; &lt;p&gt;Can there be political agreement in reasonable time? 
If we look at the TPC as a precedent, judging by the rate of publication and revision, the process is not exactly quick. Now, for the TPC, it does not have to be. Judging by the frequency of published test results, hardware vendors are happy enough to have a forum to show off and do so at every turn.&lt;/p&gt; &lt;p&gt;Now we are not at this stage of maturity yet.&lt;/p&gt; &lt;p&gt;Composing a TPC style test spec is possible in a reasonable time for an individual but likely not for a committee. It is quite voluminous but also quite formulaic. While TPC&amp;#39;s material is their own, I see no reason that we could not reference or link to it where applicable.&lt;/p&gt; &lt;p&gt;Who would be motivated by such activity? How to pitch the activity to would-be participants? I don&amp;#39;t think that just talking about what to measure and how is interesting enough. This is covered ground. Vendors want to promote themselves and end users want to have vendors compete at solving their problems. Or so it would be in a simpler world.&lt;/p&gt; &lt;p&gt;Personally, I&amp;#39;d like to see a benchmark with a use case story people can relate to emerge in the next few months. Now I am not necessarily holding my breath waiting for this. For purposes of ongoing development, there is the real data out there and we can for example do the social web workload mix I suggested a couple of &lt;a href=&quot;http://dbpedia.org/resource/Blog&quot; id=&quot;link-id0x1f671bf8&quot;&gt;blog&lt;/a&gt; posts back on that and it is good enough for us. But that is not good enough for the industry&amp;#39;s messaging.&lt;/p&gt; &lt;p&gt;I&amp;#39;d say that we have to assume that people play in good faith and simply ask who wants to run and get an extra edge by being in on the design of the race track. 
By good faith I here mean a sincere wish to have the race take place in the first place.&lt;/p&gt; &lt;p&gt;The sport is exciting for the players and spectators alike if there is a use case story that they can relate to and an actual tournament. So this is what we should aim for. Because this is so far a niche public, we should not fragment the activity too much and we should consider how understandable and relevant the benchmark activity is to likely semantic data web adopters.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>On RDF and Vertical Storage</atom:title>
  <atom:id>http://www.openlinksw.com/weblog/oerling/?date=2007-05-23#1197</atom:id>
  <atom:published>2007-05-23T14:08:23Z</atom:published>
  <atom:updated>2008-04-24T09:52:09-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;The topic of column-wise storage has not escaped us. We are not convinced that this is good for &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1f98ac20&quot;&gt;RDF&lt;/a&gt;. There is a point to this for business intelligence &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1e0d2d30&quot;&gt;data&lt;/a&gt; warehouses, no doubt, although one could argue that one could get the same IO benefit with suitably selected covering indices but this is more design work. Column storage fits in less space and is more versatile For unexpected workloads.&lt;/p&gt; &lt;p&gt;But we can look at the RDF case in specific. You have a quad of G, S, P, O. You have a one part index on each and you have a unique row number for each quad. Given the row number, you must get the G, S, P, and O, and given any one of these, you must get the row numbers where this occurs. If there were multi-part keys, then this would be a row store with covering indices, like &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1c81d0b0&quot;&gt;Virtuoso&lt;/a&gt;&amp;#39;s RDF store.&lt;/p&gt; &lt;p&gt;Each datum is stored 8 times. What is nice is that one can use any combination of selection criteria with equal ease and in the same working set. With the RDF workload, you end up typically referencing all parts of each quad. It is not like in the business intelligence case where the typical query accesses 4 columns of the 15 column history table. Of the 4 RDF quad keys, at least 2 are generally given. So this becomes a merge intersection of two or three indices and random lookups for the unspecified columns. Complicated control path, even if the engine is meant to do this thing alone.&lt;/p&gt; &lt;p&gt;We&amp;#39;ll have to try this. We could set up Virtuoso with 4 bitmap indices, each column to row ID and then a table with the 4 columns. 
Then we&amp;#39;d get bitmap ANDs for multi-column criteria and would have to get the row by row ID. As long as we run in memory, this should perform like a column store, close enough. We get the row with all the columns once, so we compensate for the fact that a column store has a special means for dereferencing the row ID for any column.&lt;/p&gt; &lt;p&gt;If we optimized this specially, which would not be so terribly hard, we&amp;#39;d have a column store. The main new thing would be making a special index by row ID that would have the ID just once per index leaf and a bitmap for dense allocation of row IDs. The rest is not too different.&lt;/p&gt; &lt;p&gt;For now, we will watch. If this is the next big thing, we can get there in little time.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Semantic Web Data Spaces</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2007-04-13#1185</atom:id>
  <atom:published>2007-04-13T21:15:54Z</atom:published>
  <atom:updated>2007-04-13T18:19:29.000001-04:00</atom:updated>
  <atom:content type="html">&lt;b&gt;Web Data Spaces&lt;/b&gt; &lt;p&gt;Now that broader understanding of the Semantic Data Web is emerging, I would like to revisit the issue of &amp;quot;&lt;a href=&quot;http://www.openlinksw.com/weblog/public/search.vspx?blogid=127&amp;q=&#39;data%20spaces&#39;&amp;type=text&amp;output=html&quot;&gt;Data Spaces&lt;/a&gt;&amp;quot;.&lt;/p&gt; &lt;p&gt;A Data Space is a place where Data Resides. It isn&amp;#39;t inherently bound to a specific Data Model (&lt;a href=&quot;http://en.wikipedia.org/wiki/Network_model&quot;&gt;Concept Oriented&lt;/a&gt;, &lt;a href=&quot;http://en.wikipedia.org/wiki/Relational_model&quot;&gt;Relational&lt;/a&gt;, &lt;a href=&quot;http://en.wikipedia.org/wiki/Hierarchical_database&quot;&gt;Hierarchical&lt;/a&gt; etc..). Neither is it implicitly an access point to Data, Information, or Knowledge (the perception is purely determined through the experiences of the user agents interacting with the Data Space.&lt;/p&gt; &lt;p&gt;A Web Data Space is a Web accessible Data Space.&lt;/p&gt; &lt;p&gt;Real world example:&lt;/p&gt; &lt;p&gt;Today we increasing perform one of more of the following tasks as part of our professional and personal interactions on the Web:&lt;/p&gt; &lt;ol&gt; &lt;li&gt;Blog via many service providers or personally managed weblog platforms&lt;/li&gt; &lt;li&gt;Create Event Calendars via &lt;a href=&quot;http://upcoming.com&quot;&gt;Upcoming.com&lt;/a&gt; and &lt;a href=&quot;http://eventful.com&quot;&gt;Eventful&lt;/a&gt; &lt;/li&gt; &lt;li&gt;Maintain and participate in Social Networks (e.g. 
&lt;a href=&quot;http://facebook.com&quot;&gt;Facebook&lt;/a&gt;, &lt;a href=&quot;http://orkut.com&quot;&gt;Orkut&lt;/a&gt;, &lt;a href=&quot;http://myspace.com&quot;&gt;MySpace&lt;/a&gt;)&lt;/li&gt; &lt;li&gt;Create and Participate in Discussions (note: when you comment on blogs or wikis for instance, you are participating in, or creating, a conversation)&lt;/li&gt; &lt;li&gt;Track news by subscribing to &lt;a href=&quot;http://web.resource.org/rss/1.0/&quot;&gt;RSS 1.0&lt;/a&gt;, &lt;a href=&quot;http://cyber.law.harvard.edu/rss/rss.html&quot;&gt;RSS 2.0&lt;/a&gt;, or &lt;a href=&quot;http://en.wikipedia.org/wiki/Atom_(standard)&quot;&gt;Atom&lt;/a&gt; Feeds&lt;/li&gt; &lt;li&gt;Share Bookmarks &amp;amp; Tags via &lt;a href=&quot;http://del.icio.us&quot;&gt;Del.icio.us&lt;/a&gt; and other Services&lt;/li&gt; &lt;li&gt;Share Photos via &lt;a href=&quot;http://flickr.com&quot;&gt;Flickr&lt;/a&gt; &lt;/li&gt; &lt;li&gt;Buy, Review, or Search for books via &lt;a href=&quot;http://amazon.com&quot;&gt;Amazon&lt;/a&gt; &lt;/li&gt; &lt;li&gt;Participate in auctions via &lt;a href=&quot;http://ebay.com&quot;&gt;eBay&lt;/a&gt; &lt;/li&gt; &lt;li&gt;Search for data via &lt;a href=&quot;http://google.com&quot;&gt;Google&lt;/a&gt; (of course!)&lt;/li&gt; &lt;/ol&gt; &lt;p&gt; &lt;a href=&quot;http://www.johnbreslin.com/&quot;&gt;John Breslin&lt;/a&gt; has a nice &lt;a href=&quot;http://www.johnbreslin.com/blog/wp-content/20051015a.gif&quot;&gt;animation depicting the creation of Web Data Spaces&lt;/a&gt; that drives home the point.&lt;/p&gt; &lt;b&gt;Web Data Space Silos&lt;/b&gt; &lt;p&gt; Unfortunately, what isn&amp;#39;t as obvious to many netizens is the fact that each of the activities above results in the creation of data that is put into some context by you the user. Even worse, you eventually realize that the service providers aren&amp;#39;t particularly willing, or capable of, giving you unfettered access to your own data. 
Of course, this isn&amp;#39;t always by design as the infrastructure behind the service can make this a nightmare from security and/or load balancing perspectives. Irrespective of cause, we end up creating our own &amp;quot;Data Spaces&amp;quot; all over the Web without a coherent mechanism for accessing and meshing these &amp;quot;Data Spaces&amp;quot;.&lt;/p&gt; &lt;b&gt;What are Semantic Web Data Spaces?&lt;/b&gt; &lt;p&gt;Data Spaces on the Web that provide granular access to RDF Data.&lt;/p&gt; &lt;b&gt;What&amp;#39;s OpenLink Data Spaces (ODS) About?&lt;/b&gt; &lt;blockquote&gt; &lt;p&gt;Short History&lt;/p&gt; &lt;p&gt;In anticipation of the &amp;quot;Web Data Silo&amp;quot; challenge (an issue that we tackled within internal enterprise networks for years), we commenced the development (circa 2001) of a distributed collaborative application suite called OpenLink Data Spaces (ODS). The project was never released to the public since the problems associated with the deliberate or inadvertent creation of Web Data silos hadn&amp;#39;t really materialized (silos only emerged in concrete form after the emergence of the Blogosphere and Web 2.0). In addition, there wasn&amp;#39;t a clear standard Query Language for the RDF based Web Data Model (i.e. the SPARQL Query Language didn&amp;#39;t exist).&lt;/p&gt; &lt;/blockquote&gt; &lt;p&gt; Today, ODS is delivered as a packaged solution (in Open Source and Commercial flavors) that alleviates the pain associated with Data Space Silos that exist on the Web and/or behind corporate firewalls. In either scenario, ODS simply allows you to create Open and Secure Data Spaces (via its suite of applications) that expose data via SQL, RDF, XML oriented data access and data management technologies. Of course it also enables you to integrate transparently with existing 3rd party data space generators (Blogs, Wikis, Shared Bookmarks, Discussion etc. 
services) by supporting industry standards that cover:&lt;/p&gt; &lt;ol&gt; &lt;li&gt; Content Publishing - Atom, &lt;a href=&quot;http://www.sixapart.com/developers/product_documentation/movable_type/&quot;&gt;Movable Type&lt;/a&gt;, &lt;a href=&quot;http://www.xmlrpc.com/metaWeblogApi&quot;&gt;MetaWeblog&lt;/a&gt;, Blogger protocols &lt;/li&gt; &lt;li&gt; Content Syndication Formats - RSS 1.0, RSS 2.0, Atom, OPML etc. &lt;/li&gt; &lt;li&gt; Data Management - &lt;a href=&quot;http://en.wikipedia.org/wiki/SQL&quot;&gt;SQL&lt;/a&gt;, &lt;a href=&quot;http://www.w3.org/RDF/&quot;&gt;RDF&lt;/a&gt;, XML, Free Text &lt;/li&gt; &lt;li&gt; Data Access - SQL, &lt;a href=&quot;http://www.w3.org/TR/rdf-sparql-query/&quot;&gt;SPARQL&lt;/a&gt;, GData, Web Services (SOAP or REST styles), WebDAV/HTTP &lt;/li&gt; &lt;li&gt; Semantic Data Web Middleware - &lt;a href=&quot;http://www.w3.org/2004/01/rdxh/spec&quot;&gt;GRDDL&lt;/a&gt;, &lt;a href=&quot;http://www.w3.org/TR/xslt&quot;&gt;XSLT&lt;/a&gt;, SPARQL, XPath/XQuery, HTTP (Content Negotiation) for producing RDF from non RDF Data ((X)HTML, Microformats, XML, Web Services Response Data etc). &lt;/li&gt; &lt;/ol&gt; &lt;p&gt;Thus, by installing ODS on your Desktop, Workgroup, Enterprise, or public Web Server, you end up with a very powerful solution for creating an Open Data access oriented presence on the &amp;quot;Semantic Data Web&amp;quot; without incurring any of the typically assumed &amp;quot;RDF Tax&amp;quot;.&lt;/p&gt; &lt;p&gt;Naturally, ODS is built atop &lt;a href=&quot;http://virtuoso.openlinksw.com&quot;&gt;Virtuoso&lt;/a&gt; and of course it exploits Virtuoso&amp;#39;s feature-set to the max. It&amp;#39;s also beginning to exploit functionality offered by the OpenLink Ajax Toolkit (&lt;a href=&quot;http://demo.openlinksw.com/DAV/JS/demo/index.html&quot;&gt;OAT&lt;/a&gt;).&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Microsoft &amp; Wikipedia Imbroglio</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2007-01-25#1124</atom:id>
  <atom:published>2007-01-26T00:10:00Z</atom:published>
  <atom:updated>2007-01-25T18:47:47.000001-05:00</atom:updated>
  <atom:content type="html">&lt;p&gt;I tried to post a comment to &lt;a href=&quot;http://www.25hoursaday.com/weblog&quot;&gt;Dare Obasanjo&lt;/a&gt;&amp;#39;s blog post: &lt;a href=&quot;http://www.25hoursaday.com/weblog/PermaLink.aspx?guid=0c22a95a-2d81-4f40-bbce-c763d8447468&quot;&gt;How Do We Get Rid of Lies on Wikipedia&lt;/a&gt;, without success (due to my attempts to add links to the post etc..). Hence a Blog style response instead.&lt;/p&gt; &lt;p&gt;Dare:&lt;/p&gt; &lt;p&gt;I have been through the Wikipedia fires a few times. If you recall that I actually triggered the early&lt;a href=&quot;http://en.wikipedia.org/wiki/Web_2.0&quot;&gt; Web 2.0 Wikipedia article&lt;/a&gt;. along the following lines: &lt;/p&gt; &lt;ol&gt; &lt;li&gt; Asked one of my staff to start a post with the sole intention of defining Web 2.0 properly &lt;/li&gt; &lt;li&gt; I then attempted to edit the initial post &lt;/li&gt; &lt;li&gt; I left a typo re. REST &lt;/li&gt; &lt;li&gt; Got set on Fire etc... (see very beginning of &lt;a href=&quot;http://en.wikipedia.org/w/index.php?title=Talk:Web_2.0&amp;action=history&quot;&gt;Wikipedia Web 2.0 history page&lt;/a&gt;) &lt;/li&gt; &lt;/ol&gt; &lt;p&gt;As annoying as the experience above was, I didn&amp;#39;t find this inconsistent with the spirit of Wikipedia (i.e. open contribution and discourse). I felt, at the time, that a lot of historical data was being left in place for future reference etc.. In addition, the ultimate aim of creating an evolving Web 2.0 document did commence albeit some distance from &amp;quot;modern man&amp;quot; re. accuracy and meaningfulness as of my last read (today).&lt;/p&gt; &lt;p&gt;Even closer to home, I repeated the process above re. &lt;a href=&quot;http://en.wikipedia.org/wiki/Virtuoso_Universal_Server&quot;&gt;Virtuoso Universal Server&lt;/a&gt;. This basically ended up being a live case study on how you handle the Wikipedia NPOV conundurum. 
Just look at the &lt;a href=&quot;http://en.wikipedia.org/wiki/Talk:Virtuoso_Universal_Server&quot;&gt;Virtuoso Universal Server Talk Pages&lt;/a&gt; to see how the process evolved (the key was Virtuoso&amp;#39;s lineage and its proximity to the very DBMS platform upon which Wikipedia runs i.e. &lt;a href=&quot;http://en.wikipedia.org/wiki/MySQL&quot;&gt;MySQL&lt;/a&gt;).&lt;/p&gt; &lt;p&gt;Bearing in mind the size and magnitude of Microsoft, there should be no reason why Microsoft&amp;#39;s &amp;quot;Microsoft Digital Caucus&amp;quot; (legions of Staff, MSDN members, Integrators, and other partners) can&amp;#39;t simply go into Wikipedia and participate in the edit and discourse process.&lt;/p&gt; &lt;p&gt; Truth cannot be suppressed! At best, it can only be temporarily delayed :-) Even more so on the Web!&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Microsoft Data Access API Backgrounder: ODBC</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2007-01-03#1106</atom:id>
  <atom:published>2007-01-03T18:20:45Z</atom:published>
  <atom:updated>2007-01-03T13:35:51.000001-05:00</atom:updated>
  <atom:content type="html">&lt;p&gt;Mike Pizzo has commenced a much needed&lt;a href=&quot;http://blogs.msdn.com/data/archive/2006/12/05/data-access-api-of-the-day-part-i.aspx&quot;&gt; 4-part article series covering the history of Microsoft&amp;#39;s various Data Access&lt;/a&gt; related APIs. Naturally, Part 1 covers: &lt;a href=&quot;http://en.wikipedia.org/wiki/Open_Database_Connectivity&quot;&gt;Open Database Connectivity&lt;/a&gt; (ODBC) which is the first of a series of purpose specific Data Access APIs.&lt;/p&gt; &lt;p&gt;Here is a very important excerpt:&lt;/p&gt; &lt;blockquote&gt; &lt;cite&gt; ... &lt;p&gt;And then something happened. Visual Basic became popular as a scriptable &amp;quot;automation language&amp;quot;. ODBC, being a C-style interface, was not directly consumable from VB. However, some of you clever folks figured out that Microsoft Access supported executing queries against ODBC Datasources, and that Access did support scriptable automation through its Data Access Object (DAO) API. Voila! Now you could write applications against ODBC sources using VB.&lt;/p&gt; &lt;p&gt;However, DAO went through &lt;b&gt;Access&amp;#39;s internal &amp;quot;Jet&amp;quot; (Joint Engine Technology)&lt;/b&gt; database engine, which defaulted to building local keysets for each result in order to do advanced query processing and cursoring against the remote data. This was fine if you needed that functionality, but significant performance overhead and additional round trips when you didn&amp;#39;t. &lt;/p&gt; &lt;p&gt;Enter the Visual Basic team who, responding to customer demand for better performance against ODBC sources, came up with something called Remote Data Objects (RDO). RDO implemented the same DAO programming patterns directly against ODBC, rather than going through Jet. 
RDO was extremely popular among VB developers, but the fact that we had two different sets of automation objects for accessing ODBC sources caused confusion.&lt;/p&gt; &lt;p&gt; But apparently not enough confusion, because our solution was to introduce &amp;quot;ODBCDirect&amp;quot;. Despite its name, ODBCDirect was not a new API; it was just a mode we added to DAO that set defaults in such a way as to avoid the overhead of building keysets and such&lt;/p&gt; ... &lt;/cite&gt; &lt;/blockquote&gt; &lt;p&gt;To this very day (unfortunately!) ODBC has been maligned by the perpetuated misunderstanding of JET&amp;#39;s DAO layer that sits atop ODBC providing advanced query processing (i.e. Virtual DBMS functionality) alongside a client-side keyset cursor model implementation.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Web 2.0&#39;s Open Data Access Conundrum (Update)</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2006-09-05#1034</atom:id>
  <atom:published>2006-09-05T21:02:00Z</atom:published>
  <atom:updated>2006-11-16T16:11:45-05:00</atom:updated>
  <atom:content type="html">&lt;p&gt; Open Data Access and Web 2.0 have a very strange relationship that continues to blur the lines of demarcation between where Web 2.0 ends and where Web.Next (i.e Web 3.0, Semantic/Data Web, Web of Databases etc.) starts. But before I proceed, let me attempt to define Web 2.0 one more time: &lt;/p&gt; &lt;p style=&quot;text-align: center;&quot;&gt; &lt;em&gt;A phase in the evolution web usage patterns that emphasizes Web Services based interaction between âWeb Usersâ and âPoints of Web Presenceâ over traditional âWeb Usersâ and âWeb Sitesâ based interaction. Basically, a transition from visual site interaction to presence based interaction.&lt;/em&gt; &lt;/p&gt; &lt;p&gt; BTW - Dare Obasanjo also commented about Web usage patterns in his post titled: &lt;a href=&quot;http://www.25hoursaday.com/weblog/PermaLink.aspx?guid=929a7fd6-1dfc-43f4-a549-d2c9fa873655&quot;&gt;The Two Webs&lt;/a&gt;. Where he concluded that we had a dichotomy along the lines of: HTTP-for-APIs (2.0) and HTTP-for-Browsers (1.0). Which &lt;a href=&quot;http://weblog.infoworld.com/udell&quot;&gt;Jon Udell&lt;/a&gt; evolved into: HTTP-Services-Web and HTTP-Intereactive-Web during our recent &lt;a href=&quot;http://weblog.infoworld.com/udell/gems/ju_idehen.mp3&quot;&gt;podcast conversation&lt;/a&gt;. &lt;/p&gt; &lt;p&gt; With definitions in place, I will resume my quest to unveil the aforementioned Web 2.0 Data Access Conundrum: &lt;/p&gt; &lt;ul&gt; &lt;li&gt;Emphasis on XML&amp;#39;s prowess in the realms of Data and Protocol Modeling alongside Data Representation. Especially as SOAP or REST styles of Web Services and various XML formats (RSS 0.92/1.0/1.1/2.0, Atom, OPML, OCS etc.) 
collectively define the Web 2.0 infrastructure landscape&lt;/li&gt; &lt;li&gt;Where a modicum of Data Access appreciation and comprehension does exist it is inherently compromised by business models that mandate some form of &amp;quot;Walled Gardens&amp;quot; and &amp;quot;Data Silos&amp;quot;&lt;/li&gt; &lt;li&gt;Mash-ups are a response to said &amp;quot;Walled Gardens&amp;quot; and &amp;quot;Data Silos&amp;quot;. Mash-ups by definition imply combining things that were not built for recombination.&lt;/li&gt; &lt;/ul&gt; &lt;p&gt; As you can see from the above, Open Data access isn&amp;#39;t genuinely compatible with Web 2.0. &lt;/p&gt; &lt;p&gt; We can also look at the same issue by way of the popular M-V-C (Model View Controller) pattern. Web 2.0 is all about the &amp;quot;V&amp;quot; and &amp;quot;C&amp;quot; with a modicum of &amp;quot;M&amp;quot; at best (data access, open data access, and flexible open data access are completely separate things). The &amp;quot;C&amp;quot; items represent application logic exposed by SOAP or REST style web services etc. I&amp;#39;ll return to this later in this post. &lt;/p&gt; &lt;p&gt; What about Social Networking you must be thinking? Isn&amp;#39;t this a Web 2.0 manifestation? Not at all (IMHO). The Web was developed / invented by Tim Berners-Lee to leverage the &amp;quot;Network Effects&amp;quot; potential of the Internet for connecting &lt;a href=&quot;http://www.w3.org/History/1989/Image1.gif&quot;&gt;People and Data&lt;/a&gt;. Social Networking on the other hand, is simply one of several ways by which we construct network connections. I am sure we all accept the fact that connections are built for many other reasons beyond social interaction. That said, we also know that through social interactions we actually develop some of our most valuable relationships (we are social creatures after all). &lt;/p&gt; &lt;p&gt; The Web 2.0 Open Data Access impedance reality is ultimately going to be the greatest piece of tutorial and use case material for the Semantic Web. 
I take this position because it is human nature to seek Freedom (in unadulterated form) which implies the following: &lt;/p&gt; &lt;ul&gt; &lt;li&gt;Access Data from a myriad of data sources (irrespective of structural differences at the database level)&lt;/li&gt; &lt;li&gt;Mesh (not Mash) data in new and interesting ways&lt;/li&gt; &lt;li&gt;Share the meshed data with as many relevant people as possible for social, professional, political, religious, and other reasons&lt;/li&gt; &lt;li&gt;Construct valuable networks based on data oriented connections&lt;/li&gt; &lt;/ul&gt; &lt;p&gt; Web 2.0 by definition and use case scenarios is inherently incompatible with the above due to the lack of Flexible and Open Data Access. &lt;/p&gt; &lt;p&gt; If we take the definition of Web 2.0 (above) and rework it with an appreciation of Flexible and Open Data Access you would arrive at something like this: &lt;/p&gt; &lt;p style=&quot;text-align: center;&quot;&gt; &lt;em&gt;A phase in the evolution of the web that emphasizes interaction between &amp;quot;Web Users&amp;quot; and &amp;quot;Web Data&amp;quot; facilitated by Web Services based APIs and an Open &amp;amp; Flexible Data Access Model. &lt;/em&gt; &lt;/p&gt; &lt;p&gt; &lt;em&gt; &lt;br /&gt; &lt;/em&gt;In more succinct form: &lt;/p&gt; &lt;p style=&quot;text-align: center;&quot;&gt; &lt;em&gt;A pervasive network of people connected by data or data connected by people.&lt;/em&gt; &lt;/p&gt; &lt;p&gt; &lt;em&gt; &lt;br /&gt; &lt;/em&gt;Returning to M-V-C and looking at the definition above, you now have a complete &amp;quot;M&amp;quot;, which is enigmatic in Web 2.0 and the essence of the Semantic Web (Data and Context). &lt;/p&gt; &lt;p&gt; To make all of this possible a palatable Data Model is required. The model of choice is the Graph based RDF Data Model - not to be mistaken for the RDF/XML serialization which is just that, a data serialization that conforms to the aforementioned RDF data model. 
&lt;/p&gt; &lt;p&gt; &lt;strong&gt;The Enterprise Challenge&lt;/strong&gt; &lt;/p&gt; &lt;p&gt; Web 2.0 cannot and will not make valuable inroads into the enterprise because enterprises live and die by their ability to exploit data. Weblogs, Wikis, Shared Bookmarking Systems, and other Web 2.0 distributed collaborative application profiles are only valuable if the data is available to the enterprise for meshing (not mashing). &lt;/p&gt; &lt;p&gt; A good example of how enterprises will exploit data by leveraging networks of people and data (social networks in this case) is shown in this nice presentation by Accenture&amp;#39;s Institute for High Performance Business titled: &lt;a href=&quot;http://www.accenture.com/xdoc/en/AccentureSNA.swf&quot;&gt;Visualizing Organizational Change&lt;/a&gt;. &lt;/p&gt; &lt;p&gt; Web 2.0 commentators (for the most part) continue to ponder the use of Web 2.0 within the enterprise while forgetting the congruency between enterprise agility and exploitation of people &amp;amp; data networks (The very issue emphasized in this original &lt;a href=&quot;http://www.w3.org/History/1989/proposal.html&quot;&gt;Web vision document by Tim Berners-Lee&lt;/a&gt;). Even worse, they remain challenged or spooked by the Semantic Web vision because they do not understand that Web 2.0 is fundamentally a Semantic Web precursor due to Open Data Access challenges. Web 2.0 is one of the greatest demonstrations of why we need the Semantic Web at the current time. 
&lt;/p&gt; &lt;p&gt; Finally, juxtapose the items below and you may even get a clearer view of what I am attempting to convey about the virtues of Open Data Access and the inflective role it plays as we move beyond Web 2.0: &lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://www.w3.org/History/1989/proposal.html&quot;&gt;Information Management Proposal &lt;/a&gt;- &lt;a href=&quot;http://www.w3.org/People/Berners-Lee/&quot;&gt;Tim Berners-Lee&lt;/a&gt; &lt;br /&gt; &lt;a href=&quot;http://www.accenture.com/xdoc/en/AccentureSNA.swf&quot;&gt;Visualizing Organizational Change&lt;/a&gt; - &lt;a href=&quot;http://www.accenture.com/Global/High_Performance_Business/default.htm&quot;&gt;Accenture Institute of High Performance Business&lt;/a&gt; &lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Web 2.0&#39;s Open Data Access Conundrum</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2006-09-02#1032</atom:id>
  <atom:published>2006-09-02T16:47:52Z</atom:published>
  <atom:updated>2006-11-16T15:51:43-05:00</atom:updated>
  <atom:content type="html">&lt;p&gt; Open Data Access and Web 2.0 have a very strange relationship that continues to blur the lines of demarcation between where Web 2.0 ends and where Web.Next (i.e. Web 3.0, Semantic/Data Web, Web of Databases etc.) starts. But before I proceed, let me attempt to define Web 2.0 one more time: &lt;/p&gt; &lt;p style=&quot;text-align: center;&quot;&gt; &lt;em&gt;A phase in the evolution of web usage patterns that emphasizes Web Services based interaction between &amp;quot;Web Users&amp;quot; and &amp;quot;Points of Web Presence&amp;quot; over traditional &amp;quot;Web Users&amp;quot; and &amp;quot;Web Sites&amp;quot; based interaction. Basically, a transition from visual site interaction to presence based interaction.&lt;/em&gt; &lt;/p&gt; &lt;p&gt; BTW - Dare Obasanjo also commented about Web usage patterns in his post titled: &lt;a href=&quot;http://www.25hoursaday.com/weblog/PermaLink.aspx?guid=929a7fd6-1dfc-43f4-a549-d2c9fa873655&quot;&gt;The Two Webs&lt;/a&gt;, where he concluded that we had a dichotomy along the lines of: HTTP-for-APIs (2.0) and HTTP-for-Browsers (1.0), which &lt;a href=&quot;http://weblog.infoworld.com/udell&quot;&gt;Jon Udell&lt;/a&gt; evolved into: HTTP-Services-Web and HTTP-Interactive-Web during our recent &lt;a href=&quot;http://weblog.infoworld.com/udell/gems/ju_idehen.mp3&quot;&gt;podcast conversation&lt;/a&gt;. &lt;/p&gt; &lt;p&gt; With definitions in place, I will resume my quest to unveil the aforementioned Web 2.0 Data Access Conundrum: &lt;/p&gt; &lt;ul&gt; &lt;li&gt;Emphasis on XML&amp;#39;s prowess in the realms of Data and Protocol Modeling alongside Data Representation. Especially as SOAP or REST styles of Web Services and various XML formats (RSS 0.92/1.0/1.1/2.0, Atom, OPML, OCS etc.) 
collectively define the Web 2.0 infrastructure landscape&lt;/li&gt; &lt;li&gt;Where a modicum of Data Access appreciation and comprehension does exist it is inherently compromised by business models that mandate some form of &amp;quot;Walled Gardens&amp;quot; and &amp;quot;Data Silos&amp;quot;&lt;/li&gt; &lt;li&gt;Mash-ups are a response to said &amp;quot;Walled Gardens&amp;quot; and &amp;quot;Data Silos&amp;quot;. Mash-ups by definition imply combining things that were not built for recombination.&lt;/li&gt; &lt;/ul&gt; &lt;p&gt; As you can see from the above, Open Data Access isn&amp;#39;t genuinely compatible with Web 2.0. &lt;/p&gt; &lt;p&gt; We can also look at the same issue by way of the popular M-V-C (Model View Controller) pattern. Web 2.0 is all about the &amp;quot;V&amp;quot; and &amp;quot;C&amp;quot; with a modicum of &amp;quot;M&amp;quot; at best (data access, open data access, and flexible open data access are completely separate things). The &amp;quot;C&amp;quot; items represent application logic exposed by SOAP or REST style web services etc. I&amp;#39;ll return to this later in this post. &lt;/p&gt; &lt;p&gt; What about Social Networking, you must be thinking? Isn&amp;#39;t this a Web 2.0 manifestation? Not at all (IMHO). The Web was developed / invented by Tim Berners-Lee to leverage the &amp;quot;Network Effects&amp;quot; potential of the Internet for connecting &lt;a href=&quot;http://www.w3.org/History/1989/Image1.gif&quot;&gt;People and Data&lt;/a&gt;. Social Networking, on the other hand, is simply one of several ways by which we construct network connections. I am sure we all accept the fact that connections are built for many other reasons beyond social interaction. That said, we also know that through social interactions we actually develop some of our most valuable relationships (we are social creatures after all). &lt;/p&gt; &lt;p&gt; The Web 2.0 Open Data Access impedance reality is ultimately going to be the greatest piece of tutorial and use case material for the Semantic Web. 
I take this position because it is human nature to seek Freedom (in unadulterated form), which implies the following: &lt;/p&gt; &lt;ul&gt; &lt;li&gt;Access Data from a myriad of data sources (irrespective of structural differences at the database level)&lt;/li&gt; &lt;li&gt;Mesh (not Mash) data in new and interesting ways&lt;/li&gt; &lt;li&gt;Share the meshed data with as many relevant people as possible for social, professional, political, religious, and other reasons&lt;/li&gt; &lt;li&gt;Construct valuable networks based on data oriented connections&lt;/li&gt; &lt;/ul&gt; &lt;p&gt; Web 2.0 by definition and use case scenarios is inherently incompatible with the above due to the lack of Flexible and Open Data Access. &lt;/p&gt; &lt;p&gt; If we take the definition of Web 2.0 (above) and rework it with an appreciation of Flexible and Open Data Access, we arrive at something like this: &lt;/p&gt; &lt;p style=&quot;text-align: center;&quot;&gt; &lt;em&gt;A phase in the evolution of the web that emphasizes interaction between &amp;quot;Web Users&amp;quot; and &amp;quot;Web Data&amp;quot; facilitated by Web Services based APIs and an Open &amp;amp; Flexible Data Access Model. &lt;/em&gt; &lt;/p&gt; &lt;p&gt; &lt;em&gt; &lt;br /&gt; &lt;/em&gt;In more succinct form: &lt;/p&gt; &lt;p style=&quot;text-align: center;&quot;&gt; &lt;em&gt;A pervasive network of people connected by data or data connected by people.&lt;/em&gt; &lt;/p&gt; &lt;p&gt; &lt;em&gt; &lt;br /&gt; &lt;/em&gt;Returning to M-V-C and looking at the definition above, you now have a complete &amp;quot;M&amp;quot;, which is enigmatic in Web 2.0 and the essence of the Semantic Web (Data and Context). &lt;/p&gt; &lt;p&gt; To make all of this possible a palatable Data Model is required. The model of choice is the Graph based RDF Data Model - not to be mistaken for the RDF/XML serialization which is just that, a data serialization that conforms to the aforementioned RDF data model. 
&lt;/p&gt; &lt;p&gt; &lt;strong&gt;The Enterprise Challenge&lt;/strong&gt; &lt;/p&gt; &lt;p&gt; Web 2.0 cannot and will not make valuable inroads into the enterprise because enterprises live and die by their ability to exploit data. Weblogs, Wikis, Shared Bookmarking Systems, and other Web 2.0 distributed collaborative application profiles are only valuable if the data is available to the enterprise for meshing (not mashing). &lt;/p&gt; &lt;p&gt; A good example of how enterprises will exploit data by leveraging networks of people and data (social networks in this case) is shown in this nice presentation by Accenture&amp;#39;s Institute for High Performance Business titled: &lt;a href=&quot;http://www.accenture.com/xdoc/en/AccentureSNA.swf&quot;&gt;Visualizing Organizational Change&lt;/a&gt;. &lt;/p&gt; &lt;p&gt; Web 2.0 commentators (for the most part) continue to ponder the use of Web 2.0 within the enterprise while forgetting the congruency between enterprise agility and exploitation of people &amp;amp; data networks (the very issue emphasized in this original &lt;a href=&quot;http://www.w3.org/History/1989/proposal.html&quot;&gt;Web vision document by Tim Berners-Lee&lt;/a&gt;). Even worse, they remain challenged or spooked by the Semantic Web vision because they do not understand that Web 2.0 is fundamentally a Semantic Web precursor due to Open Data Access challenges. Web 2.0 is one of the greatest demonstrations of why we need the Semantic Web at the current time. 
&lt;/p&gt; &lt;p&gt; Finally, juxtapose the items below and you may even get a clearer view of what I am attempting to convey about the virtues of Open Data Access and the inflective role it plays as we move beyond Web 2.0: &lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://www.w3.org/History/1989/proposal.html&quot;&gt;Information Management Proposal &lt;/a&gt;- &lt;a href=&quot;http://www.w3.org/People/Berners-Lee/&quot;&gt;Tim Berners-Lee&lt;/a&gt; &lt;br /&gt; &lt;a href=&quot;http://www.accenture.com/xdoc/en/AccentureSNA.swf&quot;&gt;Visualizing Organizational Change&lt;/a&gt; - &lt;a href=&quot;http://www.accenture.com/Global/High_Performance_Business/default.htm&quot;&gt;Accenture Institute of High Performance Business&lt;/a&gt; &lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>The WWW Proposal and RDF: Then and Now (circa 1999)</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2006-08-28#1029</atom:id>
  <atom:published>2006-08-28T10:20:00Z</atom:published>
  <atom:updated>2006-09-30T16:27:36-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;I&amp;#39;ve just re-read an article penned by Dan Brickley in 1999 titled: &lt;a href=&quot;http://www.w3.org/1999/11/11-WWWProposal/thenandnow&quot;&gt;The WWW Proposal and RDF: Then and Now&lt;/a&gt;, that retains its prescience to this very day. Ironically I stumbled across this timeless piece while revisiting the &lt;a href=&quot;http://diveintomark.org/archives/2002/09/06/history_of_the_rss_fork&quot;&gt;RSS name imbroglio&lt;/a&gt; that gave us a simple syndication format (RSS 2.0) that will ultimately implode (IMHO) since &amp;quot;Simple&amp;quot; is ultimately short lived when dealing with attention challenged end-users that are always assumed to be dumb when in fact they are simply ambivalent.&lt;/p&gt; &lt;p&gt;I was compelled to go back to the RSS 2.0 imbroglio when I came across &lt;a href=&quot;http://www.scripting.com/dwiner/&quot;&gt;Dave Winer&lt;/a&gt;&amp;#39;s comments re. &amp;quot;the SEC attempting to reinvent RSS 2.0...&amp;quot; response to &lt;a href=&quot;http://weblog.infoworld.com/udell/2006/08/16.html&quot;&gt;Jon Udell&amp;#39;s recent XBRL article&lt;/a&gt;. &lt;/p&gt; &lt;p&gt;Although I don&amp;#39;t believe in complex entry points into complex technology realms, I do subscribe to the approach where developers deal with the complexity associated with a problem domain while hiding said complexity from ambivalent end-users via coherent interfaces -- which does not always imply User Interface.&lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://xml.coverpages.org/xbrl.html&quot;&gt;XBRL&lt;/a&gt; is a great piece of work that addresses the complex problem domain of Financial Reporting. 
The only thing it&amp;#39;s missing right now is an Ontology that facilitates &lt;a href=&quot;http://www.w3.org/TR/rdf-primer/&quot;&gt;RDF Data Model&lt;/a&gt; based XBRL Schema and Instance Data which ultimately makes XBRL data available to RDF query languages such as &lt;a href=&quot;http://www.w3.org/TR/rdf-sparql-query/&quot;&gt;SPARQL&lt;/a&gt;. This line of thought implies, for instance, an XML Schema to &lt;a href=&quot;http://www.w3.org/TR/owl-guide/&quot;&gt;OWL Ontology Mapping&lt;/a&gt; for Schema Data (as explained in a &lt;a href=&quot;http://www.google.com/url?sa=t&amp;ct=res&amp;cd=4&amp;url=http%3A%2F%2Fvsis-www.informatik.uni-hamburg.de%2FgetDoc.php%2Fpublications%2F204%2Ffzt-lxs-04.pdf&amp;ei=4lXzRPLaO8SmaJmgsLgC&amp;sig2=INc-OyDoxj16TW8tb0pNXA#search=%22xml%20schema%20owl%20mapping%22&quot;&gt;white paper by the VSIS Group at the University of Hamburg&lt;/a&gt;) leaving the Instance Data to be generated in a myriad of ways that includes XML to RDF and/or XML-&amp;gt;SQL-&amp;gt;RDF.&lt;/p&gt; &lt;p&gt;As I stated in an earlier post: &lt;a href=&quot;http://www.openlinksw.com/blog/%7Ekidehen/index.vspx?page=&amp;id=1018&quot;&gt;we should not mistake ambivalence for lack of intelligence&lt;/a&gt;. Assuming &amp;quot;Simple&amp;quot; is always right at all times is another way of subscribing to this profound misconception. You know, assuming the world was flat (as opposed to geoid) was quite palatable at some point in the history of mankind. I wonder what would have happened if we had held on to this point of view to this day because of its &amp;quot;Simplicity&amp;quot;?&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Value vs Source</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2006-07-29#1020</atom:id>
  <atom:published>2006-07-29T22:19:22Z</atom:published>
  <atom:updated>2006-07-29T18:55:52.000002-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;(Via &lt;a href=&quot;http://davidwarlick.com/2cents&quot;&gt;David Warlick&lt;/a&gt;.)&lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://davidwarlick.com/2cents/2006/07/29/value-vs-source/#comments&quot;&gt;Value vs Source&lt;/a&gt;: &amp;quot;&lt;/p&gt; &lt;p&gt;I think we&amp;#39;re all sorta jumping around the same bush. It&amp;#39;s been a good dance because I&amp;#39;ve learned some things. First of all, nothing&amp;#39;s simple and it isn&amp;#39;t getting any simpler. There are no rules any more and as much as I&amp;#39;d like to come up with some kind of all encompassing unified field theory of ethical research method, I know that smarter people than me have already done a better job, and none of it is perfect.&lt;/p&gt; &lt;p&gt;Please allow me to do something kinda strange. I want to look backward for some clues. When I was young, my Dad loved to build things. He was the preeminent do-it-yourselfer. Every weekend, he had a building project, and every Saturday morning he loaded us boys into the station wagon and off we went to the Lowes Hardware Store in Shelby, where he bought the tools and materials he would need for the project. &lt;/p&gt; &lt;p&gt;He did not have a list of criteria for selecting his materials, because every project was different -- the goal was different. If he had selected everything based on the same criteria, then everything he built would have been made with pine shelving, two-penny finishing nails, and all the work would have been done with a Craftsman common nail hammer. Instead, he selected his building materials and tools based on the goal of the project. 
To do otherwise would have resulted in a product that did not last long, and that would have been unethical.&lt;/p&gt; &lt;p&gt; &lt;img src=&quot;http://davidwarlick.com/2cents/wp-content/uploads/2006/07/bill_edwards.jpg&quot; height=&quot;309&quot; width=&quot;169&quot; border=&quot;1&quot; align=&quot;right&quot; hspace=&quot;4&quot; vspace=&quot;4&quot; alt=&quot;Bill Edwards&quot; /&gt;Years later, I studied under the best teacher I ever had, Mr. Bill Edwards -- my industrial arts teacher. His technique was to help us learn industrial arts skills by helping us to build something of value. I built a kayak. Other students built book shelves, stools, and chess boards. Two friends of mine built a life-size replica of a Gemini Space Capsule. Mr. Edwards taught us to set goals and to make decisions based on those goals.&lt;/p&gt; &lt;p&gt;This was the perfect way to teach industrial arts skills, since we were in the industrial age. If Edwards had taught us in the same way that my information arts teachers were teaching, he would have put a stack of lumber on our desks and asked us to practice driving nails. But he taught us by putting us in the industry. We should be teaching today by putting students in the industry of information. We need to stop teaching science and start teaching students to be scientists. Stop teaching history, but rather teach to be historians. Stop teaching students to be researchers, and instead, teach them to solve problems and accomplish goals using information.&lt;/p&gt; &lt;p&gt;I am certain that there were brands of wood and nails that my father wouldn&amp;#39;t buy, because he couldn&amp;#39;t depend on them. He swore by Craftsman tools. To build with materials that were unreliable would have been unethical. But his conscious work in finding and selecting materials was based on the goal at hand. All else pointed to that criteria.&lt;/p&gt; &lt;p&gt;It is critical to know and understand the source of the information. 
But what is it about the source that helps you accomplish your goal? It&amp;#39;s important to understand when the information was generated and published. But what is it about &amp;quot;when&amp;quot; that helps you accomplish your goal? It&amp;#39;s important to understand what the information is made of, and what it is about its format and how you can use it that helps you accomplish your goal. It&amp;#39;s important to understand the information&amp;#39;s cultural, economic, environmental, and emotional context, and what it is about the context that helps you accomplish your goal. All aspects remain critical, but it&amp;#39;s problem solving and goal achieving that children need to be doing, not just hoop-jumping in their schools. They need to look for the information&amp;#39;s value as a tool for &lt;span style=&quot;text-decoration:underline;&quot;&gt;ethically&lt;/span&gt; accomplishing their goals.&lt;/p&gt; &lt;p&gt; &lt;br /&gt; &lt;/p&gt; &lt;p style=&quot;text-align:right;font-size:10px;&quot;&gt;Technorati Tags: &lt;a href=&quot;http://www.technorati.com/tag/librarians&quot; rel=&quot;tag&quot;&gt;librarians&lt;/a&gt;, &lt;a href=&quot;http://www.technorati.com/tag/warlick&quot; rel=&quot;tag&quot;&gt;warlick&lt;/a&gt; &lt;/p&gt; &lt;p&gt;&lt;/p&gt; &lt;hr size=&quot;1&quot; /&gt; &lt;br /&gt; Portions of this post come from Raw Materials for the Mind ISBN #1-4116-2795-4&lt;br /&gt; &lt;a href=&quot;http://www.lulu.com/content/116469&quot;&gt;&lt;br /&gt; &lt;img src=&quot;http://www.lulu.com/services/buy_now_buttons/images/book_maroon.gif&quot; border=&quot;0&quot; alt=&quot;Support independent publishing: buy this book on Lulu.&quot; /&gt; &lt;br /&gt; &lt;/a&gt; &amp;quot;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>RDF&#39;s History</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2006-07-13#1004</atom:id>
  <atom:published>2006-07-13T21:42:57Z</atom:published>
  <atom:updated>2006-07-13T19:04:36-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;We are getting very close to a Semantic Web watershed moment (IMHO). Thus, for the purpose of historic record, I would like to create a public bookmark to Tim Bray&amp;#39;s 2003 post titled: &lt;a href=&quot;http://www.tbray.org/ongoing/When/200x/2003/05/21/RDFNet&quot;&gt;RDF.net&lt;/a&gt; Challenge that also contains a nice section about the &lt;a href=&quot;http://www.tbray.org/ongoing/When/200x/2003/05/21/RDFNet&quot;&gt;History of RDF&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Note to Tim:&lt;/p&gt; &lt;p&gt; Is the RDF.net domain deal still on? I know it&amp;#39;s past 1st Jan 2006, but do bear in mind that the critical issue of a broadly supported RDF Query Language only took significant shape approximately 13 months ago (in the form of SPARQL), and this is all so critical to the challenge you posed in 2003.&lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://rdf.net&quot;&gt;RDF.net&lt;/a&gt; could become a point of semantic-web-presence through which the benefits of SPARQL compliant Triple|Quad Stores, Shared Ontologies, and SPARQL Protocol are unveiled in their well intended glory :-).&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Standards as social contracts</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2006-07-04#995</atom:id>
  <atom:published>2006-07-04T17:25:51Z</atom:published>
  <atom:updated>2006-07-04T14:53:48.000001-04:00</atom:updated>
  <atom:content type="html">&lt;blockquote&gt; &lt;p&gt; &lt;a href=&quot;http://www.tnl.net/blog/2006/06/07/standards-as-social-contracts/#comments&quot;&gt;Standards as social contracts&lt;/a&gt;: &amp;quot;Looking at Dave Winer&amp;#39;s efforts in evangelizing OPML, I try to draw some rough lines into what makes a de-facto standard. De Facto standards are made and seldom happen on their own. In this entry, I look back at the history of HTML, RSS, the open source movement and try to draw some lines as to what makes a standard.&amp;quot; &lt;/p&gt; &lt;p&gt;(Via &lt;a href=&quot;http://www.tnl.net/blog&quot;&gt;Tristan Louis&lt;/a&gt;.)&lt;/p&gt; &lt;/blockquote&gt; &lt;p&gt;I posted a comment on Tristan Louis&amp;#39; post along the following lines:&lt;/p&gt; &lt;p&gt;Analysis is spot on re. the link between de facto standardization and bootstrapping. 
Likewise, the clear linkage between bootstrapping and connected communities (a variation of the social networking paradigm). &lt;/p&gt; &lt;p&gt;Dave built a community around an XML content syndication and subscription use case demo that we know today as the blogosphere. Superficially, one may conclude that the Semantic Web vision has suffered to date from the lack of a similar bootstrap effort. Whereas in reality, we are dealing with &amp;quot;time and context&amp;quot; issues that are critical to the base understanding upon which a &amp;quot;Dave Winer&amp;quot; style bootstrap for the Semantic Web would occur.&lt;/p&gt; &lt;p&gt;Personally, I see the emergence of Web 2.0 (esp. the mashups phenomenon) as the &amp;quot;time and context&amp;quot; seeds from which the Semantic Web bootstrap will sprout. I see shared ontologies such as &lt;a href=&quot;http://oplussol5.usnet.private:8893/foaf&quot;&gt;FOAF&lt;/a&gt; and &lt;a href=&quot;http://rdfs.org/sioc/&quot;&gt;SIOC&lt;/a&gt; leading the way (they are the RSS 2.0&amp;#39;s of the Semantic Web IMHO).&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>My podcast conversation with Jon Udell</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2006-04-28#965</atom:id>
  <atom:published>2006-04-28T14:43:12Z</atom:published>
  <atom:updated>2006-07-21T07:22:41.000001-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;Jon and I had a chat yesterday that is now available in &lt;a href=&quot;http://weblog.infoworld.com/udell/2006/04/28.html#a1437&quot;&gt;Podcast&lt;/a&gt; form.&lt;/p&gt; &lt;blockquote&gt; &lt;cite&gt;&lt;p&gt;&amp;quot;In my &lt;a href=&quot;http://weblog.infoworld.com/udell/gems/ju_idehen.mp3&quot;&gt;fourth Friday podcast&lt;/a&gt; we hear from Kingsley Idehen, CEO of &lt;a href=&quot;http://openlinksw.com/&quot;&gt;OpenLink Software&lt;/a&gt;. I wrote about OpenLink&amp;#39;s universal database and app server, Virtuoso, back in &lt;a href=&quot;http://www.infoworld.com/article/02/04/12/020415plvirtuoso_1.html&quot;&gt;2002&lt;/a&gt; and &lt;a href=&quot;http://www.infoworld.com/article/03/03/21/12virtuoso_1.html&quot;&gt;2003&lt;/a&gt;. Earlier this month Virtuoso became the first mature SQL/XML hybrid to make the &lt;a href=&quot;http://www.openlinksw.com/blog/~kidehen/?id=951&quot;&gt;transition to open source&lt;/a&gt;. The latest incarnation of the product also adds SPARQL (a semantic web query language) to its repertoire. &lt;b&gt;...&lt;/b&gt;&amp;quot;&lt;/p&gt; &lt;p&gt;(Via &lt;a href=&quot;http://weblog.infoworld.com/udell/&quot;&gt;Jon&amp;#39;s Radio&lt;/a&gt;.)&lt;/p&gt; &lt;/cite&gt; &lt;/blockquote&gt; I would like to make an important clarification re. the GData Protocol and what is popularly dubbed as &amp;quot;&lt;a href=&quot;http://jeremy.zawodny.com/blog/archives/006687.html&quot;&gt;Adam Bosworth&amp;#39;s fingerprints.&lt;/a&gt;&amp;quot; I do not believe in a single solution (a simple one for the sake of simplicity) to a deceptively complex problem. Virtuoso supports Atom 1.0 (syndication only at the current time) and Atom 0.3 (syndication and publication which have been in place for years). 
&lt;blockquote&gt;BTW - the GData Protocol and Atom 1.0 publishing support will be delivered in both the Open Source and Commercial Edition updates to Virtuoso next week (very little work due to what&amp;#39;s already in place).&lt;/blockquote&gt; &lt;p&gt;I make the clarification above to eliminate the possibility of assuming mutual exclusivity of my perspective/vision and Adam&amp;#39;s (Jon also makes this important point when he speaks about our opinions being on either side of a spectrum/continuum). I simply want to broaden the scope of this discussion. I am a profound believer in the Semantic Web / Data Web vision, and I predict that we will be querying the Googlebase via SPARQL in the not too distant future (this doesn&amp;#39;t mean that netizens will be forced to master SPARQL, absolutely not! But there will be conduit technologies that deal with the matter).&lt;/p&gt; &lt;p&gt;Side note: I actually last spoke with Adam at the NY Hilton in 2000 (the day I unveiled Virtuoso to the public for the first time, in person). We bumped into each other and I told him about Virtuoso (at the time the big emphasis was SQL to XML and the vocabulary we had chosen re. 
SQL extension...), and he told me about his departure from Microsoft and the commencement of his new venture (CrossGain prior to his stint at BEA). What struck me even more was his interest in Linux and Open Source (bearing in mind this was about 3 or so weeks after he departed Microsoft).&lt;/p&gt; &lt;p&gt;If you are encountering Virtuoso for the first time via this post or Jon&amp;#39;s, please make time to read the &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main/VOSHistory/&quot;&gt;product history&lt;/a&gt; article on the &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main/&quot;&gt;Virtuoso Wiki&lt;/a&gt; (which is one of many Virtuoso based applications that make up our soon to be released OpenLink DataSpace offering).&lt;/p&gt; &lt;p&gt;That said, I better go listen to the podcast :-)&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>My podcast conversation with Jon Udell</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2006-04-28#993</atom:id>
  <atom:published>2006-04-28T14:43:12Z</atom:published>
  <atom:updated>2006-06-29T10:14:44.000001-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;Jon and I had a chat yesterday that is now available in &lt;a href=&quot;http://weblog.infoworld.com/udell/2006/04/28.html#a1437&quot;&gt;Podcast&lt;/a&gt; form.&lt;/p&gt; &lt;blockquote&gt; &lt;cite&gt;&lt;/cite&gt; &lt;p&gt;&amp;quot;In my &lt;a href=&quot;http://weblog.infoworld.com/udell/gems/ju_idehen.mp3&quot;&gt;fourth Friday podcast&lt;/a&gt; we hear from Kingsley Idehen, CEO of &lt;a href=&quot;http://openlinksw.com/&quot;&gt;OpenLink Software&lt;/a&gt;. I wrote about OpenLink&amp;#39;s universal database and app server, Virtuoso, back in &lt;a href=&quot;http://www.infoworld.com/article/02/04/12/020415plvirtuoso_1.html&quot;&gt;2002&lt;/a&gt; and &lt;a href=&quot;http://www.infoworld.com/article/03/03/21/12virtuoso_1.html&quot;&gt;2003&lt;/a&gt;. Earlier this month Virtuoso became the first mature SQL/XML hybrid to make the &lt;a href=&quot;http://www.openlinksw.com/blog/%7Ekidehen/?id=951&quot;&gt;transition to open source&lt;/a&gt;. The latest incarnation of the product also adds SPARQL (a semantic web query language) to its repertoire. &lt;b&gt;...&lt;/b&gt;&amp;quot;&lt;/p&gt; &lt;p&gt;(Via &lt;a href=&quot;http://weblog.infoworld.com/udell/&quot;&gt;Jon&amp;#39;s Radio&lt;/a&gt;.)&lt;/p&gt; &lt;/blockquote&gt; I would like to make an important clarification re. the GData Protocol and what is popularly dubbed as &amp;quot;&lt;a href=&quot;http://jeremy.zawodny.com/blog/archives/006687.html&quot;&gt;Adam Bosworth&amp;#39;s fingerprints.&lt;/a&gt;&amp;quot; I do not believe in a single solution (a simple one for the sake of simplicity) to a deceptively complex problem. Virtuoso supports Atom 1.0 (syndication only at the current time) and Atom 0.3 (syndication and publication which have been in place for years). 
&lt;blockquote&gt;BTW - the GData Protocol and Atom 1.0 publishing support will be delivered in both the Open Source and Commercial Edition updates to Virtuoso next week (very little work due to what&amp;#39;s already in place).&lt;/blockquote&gt; &lt;p&gt;I make the clarification above to eliminate the possibility of assuming mutual exclusivity of my perspective/vision and Adam&amp;#39;s (Jon also makes this important point when he speaks about our opinions being on either side of a spectrum/continuum). I simply want to broaden the scope of this discussion. I am a profound believer in the Semantic Web / Data Web vision, and I predict that we will be querying the Googlebase via SPARQL in the not too distant future (this doesn&amp;#39;t mean that netizens will be forced to master SPARQL, absolutely not! But there will be conduit technologies that deal with the matter).&lt;/p&gt; &lt;p&gt;Side note: I actually last spoke with Adam at the NY Hilton in 2000 (the day I unveiled Virtuoso to the public for the first time, in person). We bumped into each other and I told him about Virtuoso (at the time the big emphasis was SQL to XML and the vocabulary we had chosen re. 
SQL extension...), and he told me about his departure from Microsoft and the commencement of his new venture (CrossGain prior to his stint at BEA). What struck me even more was his interest in Linux and Open Source (bearing in mind this was about 3 or so weeks after he departed Microsoft).&lt;/p&gt; &lt;p&gt;If you are encountering Virtuoso for the first time via this post or Jon&amp;#39;s, please make time to read the &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main/VOSHistory&quot;&gt;product history&lt;/a&gt; article on the &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/&quot;&gt;Virtuoso Wiki&lt;/a&gt; (which is one of many Virtuoso based applications that make up our soon to be released OpenLink DataSpace offering).&lt;/p&gt; &lt;p&gt;That said, I better go listen to the podcast :-)&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso is Officially Open Source!</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2006-04-11#951</atom:id>
  <atom:published>2006-04-11T18:01:44Z</atom:published>
  <atom:updated>2006-07-21T07:22:20.000001-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;I am pleased to unveil (officially) the fact that &lt;a href=&quot;http://www.prnewswire.com/cgi-bin/stories.pl?ACCT=104&amp;STORY=/www/story/04-11-2006/0004338324&amp;EDATE=&quot;&gt;Virtuoso is now available in Open Source form&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;&lt;/p&gt; &lt;h4&gt;What Is Virtuoso?&lt;/h4&gt; &lt;p&gt;A powerful next generation server product that implements otherwise distinct server functionality within a single server product. Think of Virtuoso as the server software analog of a dual core processor where each core represents a traditional server functionality realm.&lt;/p&gt; &lt;p&gt;&lt;/p&gt; &lt;h4&gt;Where did it come from?&lt;/h4&gt; &lt;p&gt;The &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main/VOSHistory&quot;&gt;Virtuoso History page&lt;/a&gt; tells the whole story.&lt;/p&gt; &lt;p&gt;&lt;/p&gt; &lt;h4&gt;What Functionality Does It Provide?&lt;/h4&gt; The following: &lt;ul&gt; 1. Object-Relational DBMS Engine (ORDBMS like PostgreSQL and DBMS engine like MySQL) &lt;/ul&gt; &lt;ul&gt; 2. XML Data Management (with support for XQuery, XPath, XSLT, and XML Schema) &lt;/ul&gt; &lt;ul&gt; 3. RDF Triple Store (or Database) that supports SPARQL (Query Language, Transport Protocol, and XML Results Serialization format) &lt;/ul&gt; &lt;ul&gt; 4. Service Oriented Architecture (it combines a BPEL Engine with an ESB) &lt;/ul&gt; &lt;ul&gt; 5. Web Application Server (supports HTTP/WebDAV) &lt;/ul&gt; &lt;ul&gt; 6. NNTP compliant Discussion Server &lt;/ul&gt; And more. 
(see: &lt;a href=&quot;http://virtuoso.openlinksw.com&quot;&gt;Virtuoso Web Site&lt;/a&gt;) &lt;p&gt; 90% of the aforementioned functionality has been available in Virtuoso since 2000 with the RDF Triple Store being the only 2006 item.&lt;/p&gt; &lt;p&gt;&lt;/p&gt; &lt;h4&gt;What Platforms are Supported?&lt;/h4&gt; &lt;p&gt; The Virtuoso build scripts have been successfully tested on Mac OS X (Universal Binary Target), Linux, FreeBSD, and Solaris (AIX, HP-UX, and Tru64 UNIX will follow soon). A Windows Visual Studio project file is also in the works (ETA some time this week).&lt;/p&gt; &lt;p&gt;&lt;/p&gt; &lt;h4&gt;Why Open Source?&lt;/h4&gt; &lt;p&gt;Simple: there is no value in a product of this magnitude remaining the &amp;quot;best kept secret&amp;quot;. That status works well for our competitors, but absolutely works against the legions of new generation developers, systems integrators, and knowledge workers that need to be aware of what is actually achievable today with the right server architecture.&lt;/p&gt; &lt;p&gt;&lt;/p&gt; &lt;h4&gt;What Open Source License is it under?&lt;/h4&gt; &lt;p&gt;GPL version 2.&lt;/p&gt; &lt;p&gt;&lt;/p&gt; &lt;h4&gt;What&amp;#39;s the business model?&lt;/h4&gt; &lt;p&gt;Dual licensing.&lt;/p&gt; &lt;p&gt;The Open Source version of Virtuoso includes all of the functionality listed above, while the Virtual Database (distributed heterogeneous join engine) and Replication Engine (across heterogeneous data sources) functionality is available only in the commercial version. &lt;/p&gt; &lt;p&gt;&lt;/p&gt; &lt;h4&gt;Where is the Project Hosted?&lt;/h4&gt; &lt;p&gt;On &lt;a href=&quot;http://sourceforge.net/projects/virtuoso&quot;&gt;SourceForge.&lt;/a&gt; &lt;/p&gt; &lt;p&gt;&lt;/p&gt; &lt;h4&gt;Is there a product Blog?&lt;/h4&gt; &lt;p&gt;Of course! 
&lt;/p&gt; &lt;p&gt;Up until this point, the &lt;a href=&quot;http://virtuoso.openlinksw.com/blog/&quot;&gt;Virtuoso Product Blog&lt;/a&gt; has been a covert live demonstration of some aspects of Virtuoso (Content Management). My Personal Blog and the Virtuoso Product Blog are actual Virtuoso instances, and have been so since I started blogging in 2003.&lt;/p&gt; &lt;p&gt;&lt;/p&gt; &lt;h4&gt;Is There a product Wiki?&lt;/h4&gt; &lt;p&gt;Sure! &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/&quot;&gt;The Virtuoso Product Wiki&lt;/a&gt; is also an instance of Virtuoso demonstrating another aspect of the Content Management prowess of Virtuoso.&lt;/p&gt; &lt;p&gt;&lt;/p&gt; &lt;h4&gt;What About Online Documentation?&lt;/h4&gt; &lt;p&gt;Yep! &lt;a href=&quot;http://docs.openlinksw.com/virtuoso/&quot;&gt;Virtuoso Online Documentation&lt;/a&gt; is hosted via yet another Virtuoso instance. This particular instance also attempts to demonstrate Free Text search combined with the ability to repurpose well formed content in a myriad of forms (Atom, RSS, RDF, OPML, and OCS).&lt;/p&gt; &lt;p&gt;&lt;/p&gt; &lt;h4&gt;What about Tutorials and Demos?&lt;/h4&gt; &lt;p&gt;The &lt;a href=&quot;http://demo.openlinksw.com/tutorial/&quot;&gt;Virtuoso Online Tutorial&lt;/a&gt; Site has operated as a live demonstration and tutorial portal for a number of years. During the same timeframe (circa 2001) we also assembled a few Screencast-style demos (their look and feel certainly shows its age; updates are in the works).&lt;/p&gt; &lt;p&gt;BTW - We have also updated the &lt;a href=&quot;http://virtuoso.openlinksw.com/FAQ/&quot;&gt;Virtuoso FAQ&lt;/a&gt; and released a number of missing &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/&quot;&gt;Virtuoso White Papers&lt;/a&gt; (amongst many long overdue action items).&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>History of Programming Languages</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2006-03-15#940</atom:id>
  <atom:published>2006-03-15T04:12:00Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;a href=&quot;http://www.oreilly.com/news/graphics/prog_lang_poster.pdf&quot;&gt;History of Programming Languages Poster&lt;/a&gt;.</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Ted Nelson&#39;s Perspective on Technology Lock-in</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2006-02-15#935</atom:id>
  <atom:published>2006-02-15T19:50:41Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt; &lt;a href=&quot;http://www.invisiblerevolution.net/ted-bar-it/top-level.html&quot;&gt;Ted Nelson expresses his dislike of technology lock-in&lt;/a&gt;. This applies to lock-in of any form: operating system, programming language, database, or otherwise. &lt;/p&gt; Amen! &lt;a href=&quot;http://www.openlinksw.com:8889/index.vspx?tag=zigzag&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;zigzag&lt;/a&gt;&lt;a href=&quot;http://www.openlinksw.com:8889/index.vspx?tag=xanadu&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;xanadu&lt;/a&gt;&lt;a href=&quot;http://www.openlinksw.com:8889/index.vspx?tag=semantic_web&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;semantic_web&lt;/a&gt;&lt;a href=&quot;http://www.openlinksw.com:8889/index.vspx?tag=semweb&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;semweb&lt;/a&gt;&lt;a href=&quot;http://www.openlinksw.com:8889/index.vspx?tag=visionary&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;visionary&lt;/a&gt;&lt;a href=&quot;http://www.openlinksw.com:8889/index.vspx?tag=history&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;history&lt;/a&gt;&lt;a href=&quot;http://www.openlinksw.com:8889/index.vspx?tag=hypertext&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;hypertext&lt;/a&gt;&lt;a href=&quot;http://www.openlinksw.com:8889/index.vspx?tag=hyperlink&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;hyperlink&lt;/a&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Video: Tribute to Innovation (featuring: Doug Engelbart)</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2006-02-15#934</atom:id>
  <atom:published>2006-02-15T19:08:55Z</atom:published>
  <atom:updated>2006-07-21T07:22:48.000001-04:00</atom:updated>
  <atom:content type="html">A really nice &lt;a href=&quot;http://www.invisiblerevolution.net/index-video-web.html&quot;&gt;video tribute to Doug Engelbart&lt;/a&gt; and the fundamental challenges of seeing way ahead of your time (aka prescience) :-) &lt;a href=&quot;http://www.openlinksw.com:8889/index.vspx?tag=semantic_web&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;semantic_web&lt;/a&gt;&lt;a href=&quot;http://www.openlinksw.com:8889/index.vspx?tag=semweb&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;semweb&lt;/a&gt;&lt;a href=&quot;http://www.openlinksw.com:8889/index.vspx?tag=visionary&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;visionary&lt;/a&gt;&lt;a href=&quot;http://www.openlinksw.com:8889/index.vspx?tag=history&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;history&lt;/a&gt;&lt;a href=&quot;http://www.openlinksw.com:8889/index.vspx?tag=hypertext&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;hypertext&lt;/a&gt;&lt;a href=&quot;http://www.openlinksw.com:8889/index.vspx?tag=hyperlink&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;hyperlink&lt;/a&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>A Sketch of Database History</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2005-11-04#893</atom:id>
  <atom:published>2005-11-04T21:14:55Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">I just stumbled across a 2003 article titled: &lt;a href=&quot;http://math.hws.edu/vaughn/cpsc/343/2003/history.html&quot;&gt;A Sketch of Database History&lt;/a&gt;. A pretty good read for those interested in this very important technology.</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Clone the Google APIs: Kill That Noise</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2005-11-03#892</atom:id>
  <atom:published>2005-11-03T22:44:04Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;I am kinda scratching my head a little re. the &amp;quot;Clone Google APIs&amp;quot; call, especially as Amazon&amp;#39;s &lt;a href=&quot;http://opensearch.a9.com/&quot;&gt;A9&lt;/a&gt; already provides &lt;a href=&quot;http://opensearch.a9.com/docs/howto.jsp&quot;&gt;infrastructure for generic search&lt;/a&gt;. A9 is open at both ends; you can consume search services via a RESTian API or plug your search engine into A9 (playing the role of A9 search service provider). &lt;/p&gt;&lt;p&gt;Quick Example using my blog: &lt;/p&gt;&lt;ul&gt;1. &lt;a href=&quot;http://www.openlinksw.com/weblog/public/search.vspx?blogid=127&quot;&gt;My Blog&amp;#39;s Search Page&lt;/a&gt; (note that it supports Full Text and XPath/XQuery)&lt;/ul&gt;&lt;ul&gt;2. &lt;a href=&quot;http://www.openlinksw.com/weblog/public/search.vspx?blogid=127&amp;q=#39web%202.0#39&amp;type=text&amp;output=html&quot;&gt;Search on pattern &amp;#39;Web 2.0&amp;#39;&lt;/a&gt; via my Blog&amp;#39;s Search Engine&lt;/ul&gt;&lt;ul&gt;3. &lt;a href=&quot;http://en.wikipedia.org/wiki/Hacktivism&quot;&gt;Hacktivism&lt;/a&gt;&amp;quot; regarding this matter. Certainly worth a full-post-scrape for my ongoing content annotation efforts (see &lt;a href=&quot;http://www.openlinksw.com/blog/~kidehen/index.vspx?page=linkblog&quot;&gt;Linkblog&lt;/a&gt; and &lt;a href=&quot;http://www.openlinksw.com/blog/~kidehen/index.vspx?page=summary&quot;&gt;BlogSummary&lt;/a&gt;). 
&lt;p&gt;Digest the rest of Dare&amp;#39;s post:&lt;/p&gt;&lt;blockquote&gt;&lt;p&gt;&lt;a href=&quot;http://www.25hoursaday.com/weblog/PermaLink.aspx?guid=3faf48bb-cf43-4fad-9145-cd749bd0288e&quot;&gt;Clone the Google APIs: Kill That Noise&lt;/a&gt;: &amp;quot;&lt;/p&gt;&lt;p&gt; Yesterday Dave Winer wrote in a post about &lt;a href=&quot;http://www.scripting.com/2005/11/02.html#When:2:31:38PM&quot;&gt;cloning the Google API&lt;/a&gt; Dave Winer wrote &lt;/p&gt;&lt;blockquote&gt;&lt;i&gt;Let&amp;#39;s make the &lt;a href=&quot;http://www.clonethegoogleapi.com/&quot;&gt;Google API an open standard&lt;/a&gt;. Back in 2002, Google took a bold first step to enable open architecture search engines, by creating an API that allowed developers to build applications on top of their search engine. However, there were severe limits on the capacity of these applications. So we got a good demo of what might be, now three years later, it&amp;#39;s time for the real thing.&lt;br /&gt;&lt;br /&gt;&lt;/i&gt;&lt;/blockquote&gt;and earlier that &lt;br /&gt;&lt;blockquote&gt;&lt;i&gt;If you didn&amp;#39;t get a chance to hear &lt;a href=&quot;http://www.scripting.com/2005/11/01.html#When:12:26:58AM&quot;&gt;yesterday&amp;#39;s podcast&lt;/a&gt;, it recommends that Microsoft clone the &lt;a href=&quot;http://davenet.scripting.com/2002/04/13/whatsNextAfterTheGoogleApi&quot;&gt;Google API&lt;/a&gt; for search, without the keys, and without the limits. When a developer&amp;#39;s application generates a lot of traffic, buy him a plane ticket and dinner, and ask how you both can make some money off their excellent booming application of search. This is something Google can&amp;#39;t do, because search is their cash cow. That&amp;#39;s why Microsoft should do it. And so should Yahoo. 
Also, there&amp;#39;s no doubt Google will be competing with Apple soon, so they should be also thinking about ways to devalue Google&amp;#39;s advantage.&lt;/i&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;/blockquote&gt;&lt;p&gt; This doesn&amp;#39;t seem like a great idea to me for a wide variety of reasons but first, let&amp;#39;s start with a history lesson before I tackle this specific issue &lt;/p&gt;&lt;p&gt;&lt;b&gt;A Trip Down Memory Lane&lt;/b&gt;&lt;br /&gt; This history lesson &lt;strike&gt;used to be in&lt;/strike&gt; is in a post entitled &lt;a href=&quot;http://web.archive.org/web/20041011135623/http://www.evhead.com/archives/2003_05_10_archive_default.asp&quot;&gt;The Tragedy of the API&lt;/a&gt; by &lt;a href=&quot;http://www.evhead.com/&quot;&gt;Evan Williams&lt;/a&gt; &lt;strike&gt;but seems to be gone now&lt;/strike&gt;. Anyway, back in the early days of blogging the folks at Pyra [which eventually got bought by Google] created the &lt;a href=&quot;http://www.blogger.com/developers/api/1_docs/&quot;&gt;Blogger API&lt;/a&gt; for their service. Since Blogspot/Blogger was a popular service, a the number of applications that used the API quickly grew. At this point Dave Winer decided that since the Blogger API was so popular he should implement it in his weblogging tools but then he decided that he didn&amp;#39;t like some aspects of it such as application keys (sound familiar?) and did without them in his version of the API. Dave Winer&amp;#39;s version of the Blogger API became the &lt;a href=&quot;http://www.xmlrpc.com/metaWeblogApi&quot;&gt;MetaWeblog API&lt;/a&gt;. These APIs became de facto standards and a number of other weblogging applications implemented them. &lt;/p&gt;&lt;p&gt; After a while, the folks at Pyra decided that their API needed to evolve due to various flaws in its design. 
As Diego Doval put it in his post &lt;a href=&quot;http://www.dynamicobjects.com/d2r/archives/001921.html&quot;&gt;a review of blogging APIs&lt;/a&gt;, &lt;i&gt;The Blogger API is a joke, and a bad one at that&lt;/i&gt;. This lead to the creation of the &lt;a href=&quot;http://www.blogger.com/developers/api/documentation20.html&quot;&gt;Blogger API 2.0&lt;/a&gt;. At this point a heated debate erupted online where Dave Winer berated the Blogger folks for deviating from an industry standard. The irony of flaming a company for coming up with a v2 of their own API seemed to be lost on many of the people who participated in the debate. Eventually the Blogger API 2.0 went nowhere. &lt;/p&gt;&lt;p&gt; Today the blogging API world is a few de facto standards based on a hacky API created by a startup a few years ago, a number of site specific APIs (&lt;a href=&quot;http://www.livejournal.com/doc/server/ljp.csp.xml-rpc.protocol.html&quot;&gt;LiveJournal API&lt;/a&gt;, &lt;a href=&quot;http://www.sixapart.com/movabletype/docs/mtmanual_programmatic.html&quot;&gt;MovableType API&lt;/a&gt;, etc) and a number of inconsistently implemented versions of the &lt;a href=&quot;http://bitworking.org/projects/atom/&quot;&gt;Atom API&lt;/a&gt;.&lt;br /&gt;&lt;/p&gt;&lt;p&gt;&lt;b&gt;On Cloning the Google Search API&lt;/b&gt;&lt;br /&gt; To me the most salient point in the hijacking of the Blogger API from Pyra is that it didn&amp;#39;t change the popularity of their service or even make Radio Userland (Dave Winer&amp;#39;s product) catch up to them in popularity. This is important to note since this is Dave Winer&amp;#39;s key argument for Microsoft cloning the Google API. 
&lt;/p&gt;&lt;p&gt; Off the top of my head, here are my top three technical reasons for Microsoft to ignore the calls to clone the Google Search APIs&lt;br /&gt;&lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;p&gt;&lt;u&gt;Difference in Feature Set:&lt;/u&gt; The features exposed by the API do not run the entire gamut of features that other search engines may want to expose. Thus even if you implement something that looks a lot like the Google API, you&amp;#39;d have to extend it to add the functionality that it doesn&amp;#39;t provide. For example, compare the &lt;a href=&quot;http://www.google.com/apis/reference.html&quot;&gt;features provided by the Google API&lt;/a&gt; to the &lt;a href=&quot;http://developer.yahoo.net/search/&quot;&gt;features provided by the Yahoo! search API&lt;/a&gt;. I can count about half a dozen features in the Yahoo! API that aren&amp;#39;t in the Google API. &lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;u&gt;Difference in Technology Choice:&lt;/u&gt; The Google API uses SOAP. This to me is a phenomenally bad technical decision because it raises the bar to performing a basic operation (data retrieval) by using a complex technology. I much prefer Yahoo!&amp;#39;s approach of providing a RESTful API and &lt;strike&gt;MSN&lt;/strike&gt; Windows Live Search&amp;#39;s approach of providing RSS search feeds and a SOAP API for the folks who need such overkill. &lt;br /&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;u&gt;Unreasonable Demands:&lt;/u&gt; A number of Dave Winer&amp;#39;s demands seem contradictory. He asks companies to not require application keys but then advises them to contact application developers who&amp;#39;ve built high traffic applications about revenue sharing. Exactly how are these applications to be identified without some sort of application ID? As for removing the limits on the services? 
I guess Dave is ignoring the fact that providing services costs money, which I seem to remember is why &lt;a href=&quot;http://www.kottke.org/05/10/weblogscom-sold-to-verisign&quot;&gt;he sold weblogs.com to Verisign for a few million dollars&lt;/a&gt;. I do agree that some of the limits on existing search APIs aren&amp;#39;t terribly useful. The Google API limit of 1000 queries a day seems to guarantee that you won&amp;#39;t be able to power a popular application with the service. &lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;u&gt;Lack of Innovation:&lt;/u&gt; Copying Google sucks. &lt;br /&gt;&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;(Via &lt;a href=&quot;http://www.25hoursaday.com/weblog/&quot;&gt;Dare Obasanjo aka Carnage4Life&lt;/a&gt;.)&lt;/p&gt;&lt;/blockquote&gt;&lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Rise of Relational Databases</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2005-10-29#889</atom:id>
  <atom:published>2005-10-29T20:33:43Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">I suspect the subject of this post triggers the following questions: &lt;ul&gt;1. Don&amp;#39;t you mean the fall/death of Relational Databases?&lt;/ul&gt;&lt;ul&gt;2. Does anyone use these anymore?&lt;/ul&gt;&lt;ul&gt;3. What are these?&lt;/ul&gt; Relational Database Management Systems (RDBMS) are alive and kicking as expressed eloquently in this excerpt from a book titled &amp;quot;Funding A Revolution&amp;quot;: &lt;blockquote&gt;&lt;cite&gt;&lt;/cite&gt;&lt;p&gt; Large-scale computer applications require rapid access to large amounts of data. A computerized checkout system in a supermarket must track the entire product line of the market. Airline reservation systems are used at many locations simultaneously to place passengers on numerous flights on different dates. Library computers store millions of entries and access citations from hundreds of publications. Transaction processing systems in banks and brokerage houses keep the accounts that generate international flows of capital. World Wide Web search engines scan thousands of Web pages to produce quantitative responses to queries almost instantly. Thousands of small businesses and organizations use databases to track everything from inventory and personnel to DNA sequences and pottery shards from archaeological digs.&lt;/p&gt;&lt;p&gt;Thus, databases not only represent significant infrastructure for computer applications, but they also process the transactions and exchanges that drive the U.S. economy. &lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;My only addition to the excerpt above is that the impact of databases extends beyond the U.S. economy. We are talking about the global economy. 
And this will be so for all of time!&lt;/p&gt;&lt;p&gt;I came across this page while enriching the links in one of my earlier &amp;quot;&lt;a href=&quot;http://www.openlinksw.com/weblog/public/search.vspx?blogid=127&amp;q=history&amp;type=text&amp;output=html&quot;&gt;history&lt;/a&gt;&amp;quot; related posts about &lt;a href=&quot;http://www.openlinksw.com/weblog/kidehen@openlinksw.com/127/index.vspx?page=&amp;id=266&quot;&gt;Relational Database Technology pioneers&lt;/a&gt;. During this effort I also stumbled across another historic document titled: &amp;quot;&lt;a href=&quot;http://www.mcjones.org/System_R/SQL_Reunion_95/sqlr95.html&quot;&gt;1995 SQL Reunion&lt;/a&gt;&amp;quot;.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Breaking the Web Wide Open!</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2005-10-26#882</atom:id>
  <atom:published>2005-10-26T19:28:47Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;&lt;a href=&quot;http://marc.blogs.it/&quot;&gt;Marc Canter&lt;/a&gt;&amp;#39;s &lt;a href=&quot;http://marc.blogs.it/archives/2005/10/breaking_the_we.html&quot;&gt;Breaking the Web Wide Open!&lt;/a&gt; article is something I found pretty late (by my normal discovery standards). This was partly due to the pre- and post-Web 2.0 event noise levels that have dumped the description of an important industry inflection into the &amp;quot;Bozo Bin&amp;quot; of many. (Personally, I think we shouldn&amp;#39;t confuse the Web 2.0 traditional-pitch-fest conference with an attempt to identify an important industry inflection.)&lt;/p&gt;&lt;p&gt; Anyway, Marc&amp;#39;s article is a very refreshing read because it provides a really good insight into the general landscape of a rapidly evolving Web alongside genuine appreciation of our broader timeless pursuit of &amp;quot;Openness&amp;quot;. &lt;/p&gt;&lt;p&gt;To help this document provide additional value, I have scraped the content of the original post and dumped it below so that we can appreciate the value of the links embedded within the article (note: thanks to Virtuoso, I only had to paste the content into my blog; the extraction to my &lt;a href=&quot;http://www.openlinksw.com/blog/~kidehen/index.vspx?page=linkblog&quot;&gt;Linkblog&lt;/a&gt; and &lt;a href=&quot;http://www.openlinksw.com/blog/~kidehen/index.vspx?page=summary&quot;&gt;Blog Summary&lt;/a&gt; pages is simply a feature of my &lt;a href=&quot;http://www.openlinksw.com/virtuos&quot;&gt;Virtuoso&lt;/a&gt;-based Blog Engine):&lt;/p&gt;&lt;blockquote&gt;&lt;h3 class=&quot;hed2&quot; style=&quot;padding-bottom: 10px&quot;&gt;Breaking the Web Wide Open! 
(complete story)&lt;/h3&gt;&lt;p&gt;Even the web giants like AOL, Google, MSN, and Yahoo need to observe these open standards, or they&amp;#39;ll risk becoming the &amp;quot;walled gardens&amp;quot; of the new web and be coolio no more.&lt;/p&gt;&lt;p class=&quot;byline&quot;&gt;&lt;b&gt;&lt;a href=&quot;http://community.alwayson-network.com/cgi-bin/WebObjects/AlwaysOn.woa/wa/display?id=9254:Person&quot;&gt;Marc Canter&lt;/a&gt;&lt;/b&gt; [&lt;a href=&quot;http://community.alwayson-network.com/cgi-bin/WebObjects/AlwaysOn.woa/wa/display?id=9254:Person&quot;&gt;&lt;b&gt;Broadband Mechanics, Inc.&lt;/b&gt;&lt;/a&gt;] | POSTED: 09.26.05 @12:00&lt;/p&gt;&lt;table width=&quot;100%&quot; border=&quot;0&quot; cellspacing=&quot;0&quot; cellpadding=&quot;0&quot;&gt;&lt;tr&gt;&lt;td valign=&quot;TOP&quot; class=&quot;copy1&quot;&gt;&lt;img src=&quot;http://community.alwayson-network.com/ao/images/thumb/19433429363e7cd6b1ecfb7.jpg&quot; align=&quot;LEFT&quot; border=&quot;0&quot; width=&quot;80&quot; style=&quot;margin: 0px 10px 5px 0px&quot; alt=&quot;&quot; /&gt;&lt;i&gt;&lt;b&gt;Editorial Note:&lt;/b&gt; Several months ago, AlwaysOn got a personal invitation from Yahoo founder Jerry Yang &amp;quot;to see and give us feedback on our new social media product, y!360.&amp;quot; We were happy to oblige and dutifully showed up, joining a conference room full of hard-core bloggers and new, new media types. The geeks gave Yahoo 360 an overwhelming thumbs down, with comments like, &amp;quot;So the only services I can use within this new network are Yahoo services? What if I don&amp;#39;t use Yahoo IM?&amp;quot; In essence, the Yahoo team was booed for being &amp;quot;closed web,&amp;quot; and we heartily agreed. With Yahoo 360, Yahoo continues building its own &amp;quot;walled garden&amp;quot; to control its 135 million customers - an accusation also hurled at AOL in the early 1990s, before AOL migrated its private network service onto the web. 
As the&lt;/i&gt; &lt;a href=&quot;http://bernardmoon.blogspot.com/2005/08/yahoos-personality-crisis.html&quot; target=&quot;_blank&quot;&gt;Economist&lt;i&gt; recently noted&lt;/i&gt;&lt;/a&gt;, &amp;quot;Yahoo, in short, has old media plans for the new-media era.&amp;quot;&lt;br /&gt;&lt;br /&gt;The irony to our view here is, of course, that today&amp;#39;s AO Network is also a &amp;quot;closed web.&amp;quot; In the end, Mr. Yang&amp;#39;s thoughtful invitation and our ensuing disappointment in his new service led to the assignment of this article. It also confirmed our existing plan to completely revamp the AO Network around open standards. To tie it all together, we recruited the chief architect of our new site, &lt;a href=&quot;http://www.corante.com/amateur/articles/20030211-3564.html&quot; target=&quot;_blank&quot;&gt;the notorious Marc Canter&lt;/a&gt;, to pen this piece. We look forward to our reader feedback.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Breaking the Web Wide Open!&lt;/b&gt;&lt;br /&gt;By Marc Canter&lt;br /&gt;&lt;br /&gt;For decades, &amp;quot;walled gardens&amp;quot; of proprietary standards and content have been the strategy of dominant players in mainframe computer software, wireless telecommunications services, and the World Wide Web - it was their successful lock-in strategy of keeping their customers theirs. But like it or not, those walls are tumbling down. Open web standards are being adopted so widely, with such value and impact, that the web giants - Amazon, AOL, eBay, Google, Microsoft, and Yahoo - are facing the difficult decision of opening up to what they don&amp;#39;t control.&lt;br /&gt;&lt;br /&gt;The online world is evolving into a new open web (sometimes called the Web 2.0), which is all about being personalized and customized for each user. Not only open source software, but &lt;i&gt;open standards&lt;/i&gt; are becoming an essential component. 
&lt;br /&gt;&lt;br /&gt;Many of the web giants have been using open source software for years. Most of them use at least parts of the &lt;a href=&quot;http://www.onlamp.com/pub/a/onlamp/2001/01/25/lamp.html&quot; target=&quot;_blank&quot;&gt;LAMP&lt;/a&gt; (Linux, Apache, MySQL, Perl/Python/PHP) stack, even if they aren&amp;#39;t well-known for giving back to the open source community. For these incumbents that grew big on proprietary web services, the methods, practices, and applications of open source software development are difficult to fully adopt. And the next open source movements - which will be as much about open standards as about code - will be a lot harder for the incumbents to exploit.&lt;br /&gt;&lt;br /&gt;While the incumbents use cheap open source software to run their back-ends systems, their business models largely depend on proprietary software and algorithms. But our view a new slew of open software, open protocols, and open standards will confront the incumbents with the classic &lt;i&gt;&lt;a href=&quot;http://www.businessweek.com/chapter/christensen.htm&quot; target=&quot;_blank&quot;&gt;Innovator&amp;#39;s Dilemma&lt;/a&gt;&lt;/i&gt;. Should they adopt these tools and standards, painfully cannibalizing their existing revenue for a new unproven concept, or should they stick with their currently lucrative model with the risk that eventually a bunch of upstarts eat their lunch? &lt;br /&gt;&lt;br /&gt;Credit should go to several of the web giants who have been making efforts to &amp;quot;open up.&amp;quot; Google, Yahoo, eBay, and Amazon all have Open APIs (Application Programming Interfaces) built into their data and systems. Any software developer can access and use them for whatever creative purposes they wish. This means that the API provider becomes an open platform for everyone to use and build on top of. 
This notion has expanded like wildfire throughout the blogosphere, so nowadays, Open APIs are pretty much required.&lt;br /&gt;&lt;br /&gt;Other incumbents also have open strategies. AOL has got the RSS religion, &lt;a href=&quot;http://www.siliconbeat.com/entries/2005/07/27/aol_gets_rss_religion_with_my_aoland_feedsters_help.html&quot; target=&quot;_blank&quot;&gt;providing a feedreader and RSS search&lt;/a&gt; in order to escape the &amp;quot;walled garden of content&amp;quot; stigma. &lt;a href=&quot;http://www.apple.com/podcasting/&quot; target=&quot;_blank&quot;&gt;Apple now incorporates podcasts&lt;/a&gt;, the &amp;quot;personal radio shows&amp;quot; that are latest rage in audio narrowcasting, into iTunes. Even Microsoft is supporting open standards, for example &lt;a href=&quot;http://www.microsoft.com/technet/prodtechnol/winxppro/plan/rtcprot.mspx#EKAA&quot; target=&quot;_blank&quot;&gt;by endorsing SIP (Session Initiation Protocol) for internet telephony and conferencing&lt;/a&gt; over Skype&amp;#39;s proprietary format or one of its own devising.&lt;br /&gt;&lt;br /&gt;But new open standards and protocols are in use, under construction, or being proposed every day, pushing the envelope of where we are right now. Many of these standards are coming from startup companies and small groups of developers, not from the giants. Together with the Open APIs, those new standards will contribute to a new, open infrastructure. Tens of thousands of developers will use and improve this open infrastructure to create new kinds of web-based applications and services, to offer web users a highly personalized online experience.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;A Brief History of Openness&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;At this point, I have to admit that I am not just a passive observer, full-time journalist or &amp;quot;just some blogger&amp;quot; - but an active evangelist and developer of these standards. 
It&amp;#39;s the vision of &amp;quot;open infrastructure&amp;quot; that&amp;#39;s driving &lt;a href=&quot;http://www.broadbandmechanics.com/bbm2005.htm&quot; target=&quot;_blank&quot;&gt;my company &lt;/a&gt; and the reason why I&amp;#39;m writing this article. This article will give you some of the background behind on these standards, and what the evolution of the next generation of open standards will look like.&lt;br /&gt;&lt;br /&gt;Starting back in the 1980s, establishing a software standard was a key strategy for any software company. My former company, MacroMind (which became Macromedia), achieved this goal early on with Director. As &lt;a href=&quot;http://webmonkey.wired.com/webmonkey/99/27/index3a_page6.html?tw=multimedia&quot; target=&quot;_blank&quot;&gt;Director evolved into Flash&lt;/a&gt;, the world saw that other companies besides Microsoft, Adobe, and Apple could establish true cross-platform, independent media standards.&lt;br /&gt;&lt;br /&gt;Then &lt;a href=&quot;http://www.w3.org/People/Berners-Lee/&quot; target=&quot;_blank&quot;&gt;Tim Berners-Lee&lt;/a&gt; and &lt;a href=&quot;http://www.ibiblio.org/pioneers/andreesen.html&quot; target=&quot;_blank&quot;&gt;Marc Andreessen&lt;/a&gt; came along, and changed the rules of the software business and of entrepreneurialism. No matter how entrenched and &amp;quot;standardized&amp;quot; software was, the rug could still get pulled out from under it. &lt;a href=&quot;http://geekphilosopher.com/MainPage/WebBrowserWars.htm?q=Stocks&quot; target=&quot;_blank&quot;&gt;Netscape did it to Microsoft, and then Microsoft did it &lt;i&gt;back&lt;/i&gt; to Netscape&lt;/a&gt;. The web evolved, and lots of standards evolved with it. The leading open source standards (such as the LAMP stack) became widely used alternatives to proprietary closed-source offerings. &lt;br /&gt;&lt;br /&gt;Open standards are more than just technology. Open standards mean sharing, empowering, and community support. 
Someone floats a new idea (or &lt;a href=&quot;http://en.wikipedia.org/wiki/Meme&quot; target=&quot;_blank&quot;&gt;meme&lt;/a&gt;) and the community runs with it, with each person making their own contributions to the standard, evolving it without a moment&amp;#39;s hesitation about &amp;quot;giving away their intellectual property.&amp;quot;&lt;br /&gt;&lt;br /&gt;One good example of this was &lt;a href=&quot;http://www.sifry.com/alerts/&quot; target=&quot;_blank&quot;&gt;Dave Sifry&lt;/a&gt;, who built the Technorati blog-tracking technology inspired by the &lt;a href=&quot;http://www.myelin.co.nz/ecosystem/&quot; target=&quot;_blank&quot;&gt;Blogging Ecosystem&lt;/a&gt;, a weekend project by young hacker &lt;a href=&quot;http://marc.blogs.it/archives/2005/07/phil_pearson_jo.html&quot; target=&quot;_blank&quot;&gt;Phil Pearson&lt;/a&gt;. Dave liked what he saw and he ran with it, turning Technorati into what it is today.&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://en.wikipedia.org/wiki/Dave_Winer&quot; target=&quot;_blank&quot;&gt;Dave Winer&lt;/a&gt; has contributed enormously to this area of open standards. He defined and personally created several open standards and protocols, such as RSS, OPML, and XML-RPC. Dave has also &lt;a href=&quot;http://newhome.weblogs.com/historyOfWeblogs&quot; target=&quot;_blank&quot;&gt;helped build&lt;/a&gt; the blogosphere through his enthusiasm and passion.&lt;br /&gt;&lt;br /&gt;By 2003, hundreds of programmers were working on creating and establishing new standards for almost everything. The best of these new standards have evolved into compelling web services platforms, such as &lt;a href=&quot;http://del.icio.us/&quot; target=&quot;_blank&quot;&gt;del.icio.us&lt;/a&gt;, &lt;a href=&quot;http://webjay.org/about&quot; target=&quot;_blank&quot;&gt;Webjay&lt;/a&gt;, or &lt;a href=&quot;http://www.flickr.com/photos/tags/ao2005/&quot; target=&quot;_blank&quot;&gt;Flickr&lt;/a&gt;. 
Some have even spun off formal standards, like XSPF (a standard for playlists) or the instant messaging standard XMPP (also known as Jabber).&lt;br /&gt;&lt;br /&gt;Today&amp;#39;s Open APIs are complemented by standardized Schemas: the structure of the data itself and its associated meta-data. Take, for example, a &lt;a href=&quot;http://www.ipodder.org/whatIsPodcasting&quot; target=&quot;_blank&quot;&gt;podcasting feed&lt;/a&gt;. It consists of: a) the radio show itself, b) information on who is on the show, what the show is about and how long the show is (the meta-data) and also c) API calls to retrieve a show (a single feed item) and play it from a specified server. &lt;br /&gt;&lt;br /&gt;The combination of Open APIs, standardized schemas for handling meta-data, and an industry which agrees on these standards is breaking the web wide open right now. So what new open standards should the web incumbents (and you) be watching? Keep an eye on the following developments:&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Identity&lt;br /&gt;Attention&lt;br /&gt;Open Media&lt;br /&gt;Microcontent Publishing&lt;br /&gt;Open Social Networks&lt;br /&gt;Tags&lt;br /&gt;Pinging &lt;br /&gt;Routing&lt;br /&gt;Open Communications&lt;br /&gt;Device Management and Control&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;1. Identity&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Right now, you don&amp;#39;t really control your own online identity. At the core of just about every online piece of software is a membership system. Some systems allow you to browse a site anonymously, but unless you register with the site you can&amp;#39;t do things like search for an article, post a comment, buy something, or review it. The problem is that each and every site has its own membership system. So you constantly have to register with new systems, which cannot share data, even if you&amp;#39;d want them to. 
By establishing a &lt;a href=&quot;http://www.wired.com/news/privacy/0,1848,68329-2,00.html?tw=wn_story_page_next1&quot; target=&quot;_blank&quot;&gt;&amp;quot;single sign-on&amp;quot; standard&lt;/a&gt;, disparate sites can allow users to freely move from site to site, and let them control the movement of their personal profile data, as well as any other data they&amp;#39;ve created. &lt;br /&gt;&lt;br /&gt;With &lt;a href=&quot;http://www.thehindubusinessline.com/2005/01/03/stories/2005010301440200.htm&quot; target=&quot;_blank&quot;&gt;Passport, Microsoft unsuccessfully attempted&lt;/a&gt; to force its proprietary standard on the industry. Instead, a world is evolving where most people assume that users want to control their own data, whether that data is their profile, their blog posts and photos, or some collection of their past interactions, purchases, and recommendations. As long as users can control their digital identity, any kind of service or interaction can be layered on top of it.&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://www.identity20.com/media/OSCON2005/&quot; target=&quot;_blank&quot;&gt;Identity 2.0&lt;/a&gt; is all about users controlling their own profile data and becoming their own agents. This way the users themselves, rather than other intermediaries, will profit from their ID info. Once developers start offering single sign-on to their users, and users have trusted places to store their data, which respect the limits and provide access controls over that data, users will be able to access personalized services which will understand and use their personal data.&lt;br /&gt;&lt;br /&gt;Identity 2.0 may seem like some geeky, visionary future standard that isn&amp;#39;t defined yet, but by putting each user&amp;#39;s digital identity at the core of all their online experiences, Identity 2.0 is becoming the cornerstone of the new open web. 
&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The Initiatives:&lt;/b&gt;&lt;br /&gt;Right now, Identity 2.0 is under construction through various efforts from Microsoft (the &lt;a href=&quot;http://msdn.microsoft.com/webservices/webservices/understanding/advancedwebservices/default.aspx?pull=/library/en-us/dnwebsrv/html/identitymetasystem.asp&quot; target=&quot;_blank&quot;&gt;&amp;quot;InfoCard&amp;quot; component built into the Vista operating system&lt;/a&gt; and its &amp;quot;&lt;a href=&quot;http://garage.docsearls.com/node/605&quot; target=&quot;_blank&quot;&gt;Identity Metasystem&lt;/a&gt;&amp;quot;), &lt;a href=&quot;http://sxip.com&quot; target=&quot;_blank&quot;&gt;Sxip Identity&lt;/a&gt;, &lt;a href=&quot;http://www.identtycommons.net&quot; target=&quot;_blank&quot;&gt;Identity Commons&lt;/a&gt;, &lt;a href=&quot;http://www.projectliberty.org/&quot; target=&quot;_blank&quot;&gt;Liberty Alliance&lt;/a&gt;, &lt;a href=&quot;http://lid.netmesh.org/&quot; target=&quot;_blank&quot;&gt;LID&lt;/a&gt; (NetMesh&amp;#39;s Lightweight ID), and SixApart&amp;#39;s &lt;a href=&quot;http://openid.net/&quot; target=&quot;_blank&quot;&gt;OpenID&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;More Movers and Shakers:&lt;/b&gt;&lt;br /&gt;Identity Commons and &lt;a href=&quot;http://www.identitywoman.net&quot; target=&quot;_blank&quot;&gt;Kaliya Hamlin&lt;/a&gt;, Sxip Identity and &lt;a href=&quot;http://blame.ca/dick/&quot; target=&quot;_blank&quot;&gt;Dick Hardt&lt;/a&gt;, the &lt;a href=&quot;http://www.identitygang.org/&quot; target=&quot;_blank&quot;&gt; Identity Gang&lt;/a&gt; and &lt;a href=&quot;http://www.searls.com/dochome.html#Bio&quot; target=&quot;_blank&quot;&gt;Doc Searls&lt;/a&gt;, Microsoft&amp;#39;s &lt;a href=&quot;http://www.identityblog.com/&quot; target=&quot;_blank&quot;&gt;Kim Cameron&lt;/a&gt;, &lt;a href=&quot;http://www.craigburton.com/&quot; target=&quot;_blank&quot;&gt;Craig Burton&lt;/a&gt;, &lt;a href=&quot;http://phil.windley.org/&quot; 
target=&quot;_blank&quot;&gt;Phil Windley&lt;/a&gt;, and &lt;a href=&quot;http://slashdot.org/article.pl?sid=05/07/05/2020221&amp;from=rss&quot; target=&quot;_blank&quot;&gt;Brad Fitzpatrick&lt;/a&gt;, to name a few.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;2. Attention&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;How many readers know what their online attention is worth? If you don&amp;#39;t, Google and Yahoo do: they make their living off our attention. They know what we&amp;#39;re searching for, happily turn it into a keyword, and sell that keyword to advertisers. They make money off our attention. We don&amp;#39;t. &lt;br /&gt;&lt;br /&gt;Technorati and friends proposed &lt;a href=&quot;http://blogs.zdnet.com/Gillmor/index.php?p=74&quot; target=&quot;_blank&quot;&gt;an attention standard, Attention.xml&lt;/a&gt;, designed to &amp;quot;help you keep track of what you&amp;#39;ve read, what you&amp;#39;re spending time on, and what you should be paying attention to.&amp;quot; &lt;a href=&quot;http://attentiontrust.org/&quot; target=&quot;_blank&quot;&gt;AttentionTrust&lt;/a&gt; is an effort by &lt;a href=&quot;http://blogs.zdnet.com/Gillmor/?p=132&quot; target=&quot;_blank&quot;&gt;Steve Gillmor&lt;/a&gt; and &lt;a href=&quot;http://majestic.typepad.com/seth/2005/07/attentiontrusto.html&quot; target=&quot;_blank&quot;&gt;Seth Goldstein&lt;/a&gt; to standardize how captured end-user performance, browsing, and interest data are used. &lt;br /&gt;&lt;br /&gt;Blogger &lt;a href=&quot;http://worcester.typepad.com/pc4media/2005/07/attentiontrusto_1.html&quot; target=&quot;_blank&quot;&gt;Peter Caputa gives a good summary&lt;/a&gt; of AttentionTrust: &lt;blockquote&gt;&amp;quot;As we use the web, we reveal lots of information about ourselves by what we pay attention to. Imagine if all of that information could be stored in a nice neat little xml file. And when we travel around the web, we can optionally share it with websites or other people. 
We can make them pay for it, lease it ... we get to decide who has access to it, how long they have access to it, and what we want in return. And they have to tell us what they are going to do with our Attention data.&amp;quot;&lt;/blockquote&gt;&lt;br /&gt;So when you give your attention to sites that adhere to the AttentionTrust, your attention rights (&lt;i&gt;you own your attention, you can move your attention, you can pay attention and be paid for it&lt;/i&gt;, and &lt;i&gt;you can see how your attention is used&lt;/i&gt;) are guaranteed. Attention data is crucial to the future of the open web, and Steve and Seth are making sure that no one entity or oligopoly controls it. &lt;br /&gt;&lt;br /&gt;&lt;b&gt;Movers and Shakers:&lt;/b&gt;&lt;br /&gt;&lt;a href=&quot;http://blogs.zdnet.com/Gillmor/&quot; target=&quot;_blank&quot;&gt;Steve Gillmor&lt;/a&gt;, &lt;a href=&quot;http://majestic.typepad.com/about.html&quot; target=&quot;_blank&quot;&gt;Seth Goldstein&lt;/a&gt;, &lt;a href=&quot;http://www.sifry.com/alerts/&quot; target=&quot;_blank&quot;&gt;Dave Sifry&lt;/a&gt; and the &lt;a href=&quot;http://developers.technorati.com/wiki/attentionxml&quot; target=&quot;_blank&quot;&gt;other Attention.xml folks&lt;/a&gt;. &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;3. Open Media&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Proprietary media standards (Flash, Windows Media, and QuickTime, to name a few) helped liven up the web. But they are proprietary standards that try to keep us locked in, and they weren&amp;#39;t created from scratch to handle today&amp;#39;s online content. That&amp;#39;s why, for many of us, an Open Media standard has been a holy grail. 
Yahoo&amp;#39;s new Media RSS standard brings us one step closer to achieving open media, as do &lt;a href=&quot;http://www.vorbis.com/faq/#what&quot; target=&quot;_blank&quot;&gt;Ogg Vorbis&lt;/a&gt; audio codecs, &lt;a href=&quot;http://webjay.org/&quot; target=&quot;_blank&quot;&gt;XSPF playlists&lt;/a&gt;, and &lt;a href=&quot;http://musicbrainz.org/&quot; target=&quot;_blank&quot;&gt;MusicBrainz&lt;/a&gt;. And several sites offer digital creators not only a place to store their content, but also to sell it. &lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://search.yahoo.com/mrss&quot; target=&quot;_blank&quot;&gt;Media RSS&lt;/a&gt; (being developed by Yahoo with help from the community) extends RSS and combines it with &amp;quot;RSS enclosures,&amp;quot; adding metadata to any media item, to create a comprehensive solution for media &amp;quot;narrowcasters.&amp;quot; To gain acceptance for Media RSS, Yahoo knows it has to work with the community. As an active member of this community, I can tell you that we&amp;#39;ll create Media RSS equivalents for &lt;a href=&quot;http://www.xml.com/pub/a/2001/01/24/rdf.html&quot; target=&quot;_blank&quot;&gt;RDF&lt;/a&gt; (an alternative subscription format) and &lt;a href=&quot;http://www.atomenabled.org/&quot; target=&quot;_blank&quot;&gt;Atom&lt;/a&gt; (yet &lt;i&gt;another&lt;/i&gt; subscription format), so no one will be able to complain that Yahoo is picking sides in format wars.&lt;br /&gt;&lt;br /&gt;When Yahoo announced the purchase of Flickr, Yahoo founder Jerry Yang insinuated that Yahoo was acquiring &amp;quot;open DNA&amp;quot; to turn Yahoo into &lt;a href=&quot;http://www.flickr.com/services/api/&quot; target=&quot;_blank&quot;&gt;an open standards player&lt;/a&gt;. Yahoo is showing what happens when you take a multi-billion dollar company and make openness one of its core values. So Google, beware, even if Google does have more research fellows and Ph.D.s. 
&lt;br /&gt;&lt;br /&gt;The open media landscape is far and wide, reaching from game machine hacks and mobile phone downloads to PC-driven bookmarklets, players, and editors, and it includes many other standardization efforts. &lt;a href=&quot;http://www.xspf.org/&quot; target=&quot;_blank&quot;&gt;XSPF&lt;/a&gt; is an open standard for playlists, and MusicBrainz is an alternative to the proprietary (and originally effectively stolen) database that &lt;a href=&quot;http://en.wikipedia.org/wiki/Gracenote&quot; target=&quot;_blank&quot;&gt;Gracenote&lt;/a&gt; licenses. &lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://www.ourmedia.org/&quot; target=&quot;_blank&quot;&gt;Ourmedia.org&lt;/a&gt; is a community front-end to Brewster Kahle&amp;#39;s &lt;a href=&quot;http://www.archive.org&quot; target=&quot;_blank&quot;&gt;Internet Archive&lt;/a&gt;. Brewster has promised free bandwidth and free storage forever to any content creators who choose to share their content via the Internet Archive. Ourmedia.org is providing an easy-to-use interface and community to get content in and out of the Internet Archive, giving ourmedia.org users the ability to share their media anywhere they wish, without being locked into a particular service or tool. Ourmedia plans to offer open APIs and an open media registry that interconnects other open media repositories into a DNS-like registry (just like the www domain system), so folks can browse and discover open content across many open media services. 
Systems like &lt;a href=&quot;http://www.brightcove.com/&quot; target=&quot;_blank&quot;&gt;Brightcove&lt;/a&gt; and &lt;a href=&quot;http://www.evhead.com/2005/02/how-odeo-happened.asp&quot; target=&quot;_blank&quot;&gt;Odeo&lt;/a&gt; support the concept of an open registry, and hope to work with digital creators to sell their work to fulfill the financial aspect of &lt;a href=&quot;http://en.wikipedia.org/wiki/The_Long_Tail&quot; target=&quot;_blank&quot;&gt;the &amp;quot;Long Tail.&amp;quot;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;More Movers and Shakers:&lt;/b&gt;&lt;br /&gt;&lt;a href=&quot;http://creativecommons.org/about/people&quot; target=&quot;_blank&quot;&gt;Creative Commons&lt;/a&gt;, the &lt;a href=&quot;http://www.omn.org/&quot; target=&quot;_blank&quot;&gt;Open Media Network&lt;/a&gt;, &lt;a href=&quot;http://www.momentshowing.net/about.html&quot; target=&quot;_blank&quot;&gt;Jay Dedman&lt;/a&gt;, &lt;a href=&quot;http://ryanedit.blogspot.com/&quot; target=&quot;_blank&quot;&gt;Ryanne Hodson&lt;/a&gt;, &lt;a href=&quot;http://michaelverdi.com/index.php&quot; target=&quot;_blank&quot;&gt;Michael Verdi&lt;/a&gt;, &lt;a href=&quot;http://www.chapmanlogic.com/blog/aboutEli.html&quot; target=&quot;_blank&quot;&gt;Eli Chapman&lt;/a&gt;, &lt;a href=&quot;http://www.unmediated.org/&quot; target=&quot;_blank&quot;&gt;Kenyatta Cheese&lt;/a&gt;, &lt;a href=&quot;http://www.itconversations.com/about.html&quot; target=&quot;_blank&quot;&gt;Doug Kaye&lt;/a&gt;, &lt;a href=&quot;http://www.wired.com/wired/archive/13.09/yahoo.html&quot; target=&quot;_blank&quot;&gt;Brad Horowitz&lt;/a&gt;, &lt;a href=&quot;http://webjay.org/about#colophon&quot; target=&quot;_blank&quot;&gt;Lucas Gonze&lt;/a&gt;, &lt;a href=&quot;http://musicbrainz.org/wd/MusicBrainzBio&quot; target=&quot;_blank&quot;&gt;Robert Kaye&lt;/a&gt;, &lt;a href=&quot;http://www.lifewithalacrity.com/&quot; target=&quot;_blank&quot;&gt;Christopher Allen&lt;/a&gt;, &lt;a 
href=&quot;http://en.wikipedia.org/wiki/Brewster_Kahle&quot; target=&quot;_blank&quot;&gt;Brewster Kahle&lt;/a&gt;, &lt;a href=&quot;http://www.newmediamusings.com/&quot; target=&quot;_blank&quot;&gt;JD Lasica&lt;/a&gt;, and indeed, &lt;a href=&quot;http://www.corante.com/amateur/articles/20030211-3564.html&quot; target=&quot;_blank&quot;&gt;Marc Canter&lt;/a&gt;, among others.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;4. Microcontent Publishing&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Unstructured content is cheap to create, but hard to search through. Structured content is expensive to create, but easy to search. &lt;a href=&quot;http://developers.technorati.com/wiki/MicroFormats&quot; target=&quot;_blank&quot;&gt;Microformats&lt;/a&gt; resolve the dilemma with simple structures that are cheap to use and easy to search.&lt;br /&gt;&lt;br /&gt;The first kind of widely adopted microcontent is blogging. Every post is an encapsulated idea, addressable via a URL called a permalink. You can syndicate or subscribe to this microcontent using RSS or an RSS equivalent, and news or blog aggregators can then display these feeds in a convenient readable fashion. But a blog post is just a block of unstructured text: not a bad thing, but just a first step for microcontent. When it comes to &lt;i&gt;structured&lt;/i&gt; data, such as personal identity profiles, product reviews, or calendar-type event data, RSS was not designed to maintain the integrity of the structures. &lt;br /&gt;&lt;br /&gt;Right now, blogging doesn&amp;#39;t have the underlying structure necessary for full-fledged microcontent publishing. But that will change. 
Think of local information services (such as movie listings, event guides, or restaurant reviews) that any college kid can access and use in her weekend programming project to create new services and tools.&lt;br /&gt;&lt;br /&gt;Today&amp;#39;s blogging tools will evolve into microcontent publishing systems, and will help spread the notion of structured data across the blogosphere. New ways to store, represent and produce microcontent will create new standards, such as &lt;a href=&quot;http://structuredblogging.org/&quot; target=&quot;_blank&quot;&gt;Structured Blogging&lt;/a&gt; and &lt;a href=&quot;http://microformats.org/&quot; target=&quot;_blank&quot;&gt;Microformats&lt;/a&gt;. Microformats differ from RSS feeds in that you can&amp;#39;t subscribe to them. Instead, Microformats are embedded into webpages and discovered by search engines like Google or Technorati. Microformats are creating common definitions for &amp;quot;What is a review or event? What are the specific fields in the data structure?&amp;quot; They can also specify what we can do with all this information. &lt;a href=&quot;http://www.opml.org/spec&quot; target=&quot;_blank&quot;&gt;OPML (Outline Processor Markup Language)&lt;/a&gt; is a hierarchical file format for storing microcontent and structured data. It was developed by &lt;a href=&quot;http://en.wikipedia.org/wiki/Dave_Winer&quot; target=&quot;_blank&quot;&gt;Dave Winer&lt;/a&gt; of RSS and podcast fame.&lt;br /&gt;&lt;br /&gt;Events are one popular type of microcontent. 
&lt;a href=&quot;http://www.openevents.com&quot; target=&quot;_blank&quot;&gt;OpenEvents&lt;/a&gt; is already working to create shared databases of standardized events, which would be used by a new generation of event portals, such as &lt;a href=&quot;http://eventful.com/gotevents/&quot; target=&quot;_blank&quot;&gt;Eventful/EVDB&lt;/a&gt;, &lt;a href=&quot;http://upcoming.org/&quot; target=&quot;_blank&quot;&gt;Upcoming.org&lt;/a&gt;, and &lt;a href=&quot;http://www.whizspark.com/&quot; target=&quot;_blank&quot;&gt;WhizSpark&lt;/a&gt;. The idea of OpenEvents is that event-oriented systems and services can work together to establish shared events databases (and associated APIs) that any developer could then use to create and offer their own new service or application. &lt;a href=&quot;http://marc.blogs.it/archives/2005/04/rvw_redux_openr.html&quot; target=&quot;_blank&quot;&gt;OpenReviews&lt;/a&gt; is still in the conceptual stage, but it would make it possible to provide open alternatives to closed systems like Epinions, and establish a shared database of local and global reviews. Its shared open servers would be filled with all sorts of reviews for anyone to access. &lt;br /&gt;&lt;br /&gt;Why is this important? Because I predict that in the future, 10 times more people will be writing reviews than maintaining their own blog. The list of possible microcontent standards goes on: OpenJobpostings, OpenRecipes, and even OpenLists. 
Microsoft &lt;a href=&quot;http://www.reallysimplesyndication.com/2005/06/22&quot; target=&quot;_blank&quot;&gt;recently revealed&lt;/a&gt; that it has been working on an important new kind of microcontent: Lists. So OpenLists will attempt to establish standards for the &lt;i&gt;kind&lt;/i&gt; of lists we all use, such as lists of Links, lists of To Do Items, lists of People, Wish Lists, etc.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Movers and Shakers:&lt;/b&gt;&lt;br /&gt;&lt;a href=&quot;http://tantek.com/log/2005/09.html&quot; target=&quot;_blank&quot;&gt;Tantek Çelik&lt;/a&gt; and &lt;a href=&quot;http://en.wikipedia.org/wiki/Kevin_Marks&quot; target=&quot;_blank&quot;&gt;Kevin Marks&lt;/a&gt; of &lt;a href=&quot;http://developers.technorati.com/wiki/MicroFormats&quot; target=&quot;_blank&quot;&gt;Technorati&lt;/a&gt;, &lt;a href=&quot;http://dannyayers.com/&quot; target=&quot;_blank&quot;&gt;Danny Ayers&lt;/a&gt;, &lt;a href=&quot;http://www.meyerweb.com/&quot; target=&quot;_blank&quot;&gt;Eric Meyer&lt;/a&gt;, &lt;a href=&quot;http://photomatt.net/&quot; target=&quot;_blank&quot;&gt;Matt Mullenweg&lt;/a&gt;, &lt;a href=&quot;http://zlab.commerce.net/&quot; target=&quot;_blank&quot;&gt;Rohit Khare&lt;/a&gt;, &lt;a href=&quot;http://ifindkarma.typepad.com/relax/&quot; target=&quot;_blank&quot;&gt;Adam Rifkin&lt;/a&gt;, &lt;a href=&quot;http://www.sivas.com/aleene/&quot; target=&quot;_blank&quot;&gt;Arnaud Leene&lt;/a&gt;, &lt;a href=&quot;http://radio.weblogs.com/0110772/&quot; target=&quot;_blank&quot;&gt;Seb Paquet&lt;/a&gt;, &lt;a href=&quot;http://hublog.hubmed.org/&quot; target=&quot;_blank&quot;&gt;Alf Eaton&lt;/a&gt;, &lt;a href=&quot;http://www.myelin.co.nz/post/&quot; target=&quot;_blank&quot;&gt;Phil Pearson&lt;/a&gt;, &lt;a href=&quot;http://www.joereger.com/&quot; target=&quot;_blank&quot;&gt;Joe Reger&lt;/a&gt;, &lt;a href=&quot;http://bobwyman.pubsub.com/&quot; target=&quot;_blank&quot;&gt;Bob Wyman&lt;/a&gt;, among others.&lt;br /&gt;&lt;br /&gt;&lt;br 
/&gt;&lt;b&gt;5. Open Social Networks&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I&amp;#39;ll never forget the first time I met &lt;a href=&quot;http://www.jabrams.com/&quot; target=&quot;_blank&quot;&gt;Jonathan Abrams&lt;/a&gt;, the founder of Friendster. He was arrogant and brash and he claimed he &amp;quot;&lt;i&gt;owned&lt;/i&gt;&amp;quot; all his users, and that he was going to monetize them and make a fortune off them. This attitude robbed Friendster of its momentum, letting MySpace, Facebook, and other social networks take Friendster&amp;#39;s place.&lt;br /&gt;&lt;br /&gt;Jonathan&amp;#39;s notion of social networks as a way to control users is typical of the Web 1.0 business model and its attitude towards users in general. Social networks have become one of the battlegrounds between old and new ways of thinking. Open standards for Social Networking will define those sides very clearly. Since meeting Jonathan, I have been working towards finding and establishing open standards for social networks. Instead of closed, centralized social networks with 10 million people in them, the goal is making it possible to have 10 million social networks that each have 10 people in them.&lt;br /&gt;&lt;br /&gt;FOAF (which stands for Friend Of A Friend, and describes people and relationships in a way that computers can parse) is a schema to represent not only your personal profile&amp;#39;s meta-data, but your social network as well. Thousands of researchers use the &lt;a href=&quot;http://www.foaf-project.org/&quot; target=&quot;_blank&quot;&gt;FOAF schema&lt;/a&gt; in their &amp;quot;Semantic Web&amp;quot; projects to connect people in all sorts of new ways. 
&lt;a href=&quot;http://gmpg.org/xfn/&quot; target=&quot;_blank&quot;&gt;XFN&lt;/a&gt; is a microformat standard for representing your social network, while &lt;a href=&quot;http://www.imc.org/pdi/&quot; target=&quot;_blank&quot;&gt;vCard&lt;/a&gt; (long familiar to users of contact manager programs like Outlook) is a microformat that contains your profile information. Microformats are baked into any xHTML webpage, which means that &lt;i&gt;any&lt;/i&gt; blog, social network page, or any webpage in general can &amp;quot;contain&amp;quot; your social network in it, and be used by &lt;i&gt;any&lt;/i&gt; compatible tool, service or application. &lt;br /&gt;&lt;br /&gt;PeopleAggregator is an earlier project now being integrated into the &lt;a href=&quot;http://drupal.org/&quot; target=&quot;_blank&quot;&gt;open content management framework Drupal&lt;/a&gt;. The &lt;a href=&quot;http://www.broadbandmechanics.com/PeopleAggregator/&quot; target=&quot;_blank&quot;&gt;PeopleAggregator APIs&lt;/a&gt; will make it possible to establish relationships, send messages, create or join groups, and post between different social networks. (Sneak preview: this technology will be available in the upcoming GoingOn Network.) &lt;br /&gt;&lt;br /&gt;All of these open social networking standards mean that inter-connected social networks will form a mesh that will parallel the blogosphere. 
This vibrant, distributed, decentralized world will be driven by open standards: personalized online experiences are what the new open web will be all about, and what could be more personalized than people&amp;#39;s networks?&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Movers and Shakers:&lt;/b&gt;&lt;br /&gt;&lt;a href=&quot;http://esigler.2nw.net/&quot; target=&quot;_blank&quot;&gt;Eric Sigler&lt;/a&gt;, &lt;a href=&quot;http://lucifer.intercosmos.net/index.php?view=about&quot; target=&quot;_blank&quot;&gt;Joel De Gan&lt;/a&gt;, &lt;a href=&quot;http://crschmidt.net/&quot; target=&quot;_blank&quot;&gt;Chris Schmidt&lt;/a&gt;, &lt;a href=&quot;http://voidstar.com/&quot; target=&quot;_blank&quot;&gt;Julian Bond&lt;/a&gt;, &lt;a href=&quot;http://people.tribe.net/paul?_click_path=Application%5Btribe%5D.Person%5Bf2232c95-e123-43a3-b48d-24a5f11f09dc%5D&amp;r=10535&quot; target=&quot;_blank&quot;&gt;Paul Martino&lt;/a&gt;, &lt;a href=&quot;http://napsterization.org/stories/archives/000513.html&quot; target=&quot;_blank&quot;&gt;Mary Hodder&lt;/a&gt;, &lt;a href=&quot;http://public.2idi.com/=Drummond.Reed&quot; target=&quot;_blank&quot;&gt;Drummond Reed&lt;/a&gt;, &lt;a href=&quot;http://danbri.org/&quot; target=&quot;_blank&quot;&gt;Dan Brickley&lt;/a&gt;, &lt;a href=&quot;http://360.yahoo.com/profile-9lciejI3aafX1stHPoIRNmkmv4EowQ--&quot; target=&quot;_blank&quot;&gt;Randy Farmer&lt;/a&gt;, and &lt;a href=&quot;http://www.kaliyasblogs.net/Iwoman/&quot; target=&quot;_blank&quot;&gt;Kaliya Hamlin&lt;/a&gt;, to name a few.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;6. Tags&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Nowadays, no self-respecting tool or service can ship without &lt;a href=&quot;http://www.salon.com/tech/feature/2005/02/08/tagging/index_np.html&quot; target=&quot;_blank&quot;&gt;tags&lt;/a&gt;. Tags are keywords or phrases attached to photos, blog posts, URLs, or even video clips. 
These user- and creator-generated tags are an open alternative to what used to be the domain of librarians and information scientists: categorizing information and content using taxonomies. Tags are instead creating &lt;a href=&quot;http://www.wired.com/wired/archive/13.04/view.html?pg=4&quot; target=&quot;_blank&quot;&gt;&amp;quot;folksonomies.&amp;quot;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The recently proposed OpenTags concept would be an open, community-owned version of the popular &lt;a href=&quot;http://www.technorati.com/tag/&quot; target=&quot;_blank&quot;&gt;Technorati Tags service&lt;/a&gt;. It would aggregate the usage of tags across a wide range of services, sites, and content tools. In addition to Technorati&amp;#39;s current tag features, OpenTags would let groups of people share their tags in &amp;quot;&lt;a href=&quot;http://www.zeldman.com/daily/0405d.shtml/&quot; target=&quot;_blank&quot;&gt;TagClouds&lt;/a&gt;.&amp;quot; Open tagging is likely to include some of the open identity features discussed above, to create a tag system that is resilient to spam, and yet trustable across sites all over the web.&lt;br /&gt;&lt;br /&gt;OpenTags owes a debt to earlier versions of shared tagging systems, which include &lt;a href=&quot;http://www.topicexchange.com/&quot; target=&quot;_blank&quot;&gt;Topic Exchange&lt;/a&gt; and something called the &lt;a href=&quot;http://www.evectors.com/itkcollector/&quot; target=&quot;_blank&quot;&gt;k-collector&lt;/a&gt; (a knowledge management tag aggregator) from the Italian company eVectors. 
&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Movers &amp;amp; Shakers:&lt;/b&gt;&lt;br /&gt;&lt;a href=&quot;http://www.myelin.co.nz/notes/&quot; target=&quot;_blank&quot;&gt;Phil Pearson&lt;/a&gt;, &lt;a href=&quot;http://matt.blogs.it/&quot; target=&quot;_blank&quot;&gt;Matt Mower&lt;/a&gt;, &lt;a href=&quot;http://paolo.evectors.it/&quot; target=&quot;_blank&quot;&gt;Paolo Valdemarin&lt;/a&gt;, and &lt;a href=&quot;http://marc.blogs.it/archives/2005/03/opentopics.html&quot; target=&quot;_blank&quot;&gt;Mary Hodder&lt;/a&gt; and &lt;a href=&quot;http://www.equalsdrummond.name/index.php?p=39&quot; target=&quot;_blank&quot;&gt;Drummond Reed&lt;/a&gt; again, among others.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;7. Pinging&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Websites used to be mostly static. Search engines that &lt;a href=&quot;http://en.wikipedia.org/wiki/Web_crawler&quot; target=&quot;_blank&quot;&gt;crawled&lt;/a&gt; (or &amp;quot;spidered&amp;quot;) them every so often did a good enough job to show reasonably current versions of your cousin&amp;#39;s homepage or even &lt;i&gt;Time&lt;/i&gt; magazine&amp;#39;s weekly headlines. But when blogging took off, it became hard for search engines to keep up. (Google has only &lt;a href=&quot;http://searchenginewatch.com/searchday/article.php/3548411&quot; target=&quot;_blank&quot;&gt;just managed&lt;/a&gt; to offer &lt;a href=&quot;http://www.google.com/help/about_blogsearch.html&quot; target=&quot;_blank&quot;&gt;blog-search functionality&lt;/a&gt;, despite &lt;a href=&quot;http://www.alwayson-network.com/comments.php?id=325_0_2_0_C&quot; target=&quot;_blank&quot;&gt;buying Blogger&lt;/a&gt; back in early 2003.)&lt;br /&gt;&lt;br /&gt;To know what was new in the blogosphere, users couldn&amp;#39;t depend on services that spidered webpages once in a while. The solution: a way for blogs themselves to automatically notify blog-tracking sites that they&amp;#39;d been updated. 
&lt;a href=&quot;http://weblogs.com/&quot; target=&quot;_blank&quot;&gt;Weblogs.com&lt;/a&gt; was the first blog &amp;quot;ping service&amp;quot;: it displayed the name of a blog whenever that blog was updated. Pinging sites helped the blogosphere grow, and &lt;a href=&quot;http://blo.gs/&quot; target=&quot;_blank&quot;&gt;more tools&lt;/a&gt;, services, and portals started using pinging in new and different ways. Dozens of pinging services and sites (most of which can&amp;#39;t talk to each other) sprang up. &lt;br /&gt;&lt;br /&gt;Matt Mullenweg (the creator of open source blogging software WordPress) decided that a one-stop service for pinging was needed. He created &lt;a href=&quot;http://pingomatic.com/&quot; target=&quot;_blank&quot;&gt;Ping-o-Matic&lt;/a&gt;, which aggregates ping services and simplifies the pinging process for bloggers and tool developers. With Ping-o-Matic, any developer can alert all of the industry&amp;#39;s blogging tools and tracking sites at once. This new kind of open standard, with shared infrastructure, is critical to the scalability of Web 2.0 services.&lt;br /&gt;&lt;br /&gt;As &lt;a href=&quot;http://pingomatic.com/about/&quot; target=&quot;_blank&quot;&gt;Matt said&lt;/a&gt;:&lt;br /&gt;&lt;blockquote&gt;There are a number of services designed specifically for tracking and connecting blogs. However it would be expensive for all the services to crawl all the blogs in the world all the time. By sending a small ping to each service you let them know you&amp;#39;ve updated so they can come check you out. They get the freshest data possible, you don&amp;#39;t get a thousand robots spidering your site all the time. 
Everybody wins.&lt;/blockquote&gt;&lt;br /&gt;&lt;b&gt;Movers and Shakers:&lt;/b&gt;&lt;br /&gt;&lt;a href=&quot;http://photomatt.net/about/&quot; target=&quot;_blank&quot;&gt;Matt Mullenweg&lt;/a&gt;, &lt;a href=&quot;http://trainedmonkey.com/entry/2251&quot; target=&quot;_blank&quot;&gt;Jim Winstead&lt;/a&gt;, &lt;a href=&quot;http://newhome.weblogs.com/faq&quot; target=&quot;_blank&quot;&gt;Dave Winer&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;8. Routing&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Bloggers used to have to manually enter the links and content snippets of blog posts or news items they wanted to blog. Today, some RSS aggregators can send a specified post directly into an associated blogging tool: as bloggers browse through the feeds they subscribe to, they can easily specify and send any post they wish to &amp;quot;&lt;a href=&quot;http://www.microsoftmonitor.com/archives/010209.html&quot; target=&quot;_blank&quot;&gt;reblog&lt;/a&gt;&amp;quot; from their news aggregator or feed reader into their blogging tool. (This is usually referred to as &amp;quot;&lt;a href=&quot;http://help.blogger.com/bin/answer.py?answer=152&amp;topic=17&quot; target=&quot;_blank&quot;&gt;BlogThis&lt;/a&gt;.&amp;quot;) As structured blogging comes into its own (see the section on Microcontent Publishing), it will be increasingly important to maintain the structural integrity of these pieces of microcontent when reblogging them. &lt;br /&gt;&lt;br /&gt;Promising standard &lt;a href=&quot;http://redirectthis.com/&quot; target=&quot;_blank&quot;&gt;RedirectThis&lt;/a&gt; will combine a &amp;quot;BlogThis&amp;quot;-like capability while maintaining the integrity of the microcontent. RedirectThis will let bloggers and content developers attach a simple &amp;quot;PostThis&amp;quot; button to their posts. 
Clicking on that button will send that post to the reader/blogger&amp;#39;s favorite &lt;a href=&quot;http://ecto.kung-foo.tv/archives/000990.php&quot; target=&quot;_blank&quot;&gt;blogging tool&lt;/a&gt;. This favorite tool is specified at the RedirectThis web service, where users register their blogging tool of choice. RedirectThis also helps maintain the integrity and structure of microcontent; then it&amp;#39;s just up to the user to prefer a blogging tool that also attains that lofty goal of microcontent integrity. &lt;br /&gt;&lt;br /&gt;OutputThis is another nascent web services standard, to let bloggers specify what &amp;quot;destinations&amp;quot; they&amp;#39;d like to have as options in their blogging tool. As new destinations are added to the service, more checkboxes would get added to their blogging tool, allowing them to route their published microcontent to additional destinations.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Movers and Shakers:&lt;/b&gt;&lt;br /&gt;&lt;a href=&quot;http://reblog.org/&quot; target=&quot;_blank&quot;&gt;Michael Migurski&lt;/a&gt;, &lt;a href=&quot;http://www.gonze.com/about&quot; target=&quot;_blank&quot;&gt;Lucas Gonze&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;9. Open Communications&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Likely, you&amp;#39;ve experienced the joys of finding friends on AIM or Yahoo Messenger, or the convenience of Skyping with someone overseas. Not that you&amp;#39;re about to throw away your mobile phone or BlackBerry, but for many, also having access to Instant Messaging (IM) and Voice over IP (VoIP) is crucial. &lt;br /&gt;&lt;br /&gt;IM and VoIP are mainstream technologies that already enjoy the benefits of open standards. Entire industries are born, right this second, based around these open standards. 
&lt;a href=&quot;http://www.jabber.org/&quot; target=&quot;_blank&quot;&gt;Jabber&lt;/a&gt; has been an open IM technology for years; in fact, &lt;a href=&quot;http://www.xmpp.org/history.html&quot; target=&quot;_blank&quot;&gt;as XMPP&lt;/a&gt;, it was officially dubbed a standard by &lt;a href=&quot;http://www.ietf.org/overview.html&quot; target=&quot;_blank&quot;&gt;the IETF&lt;/a&gt;. Although becoming an &lt;a href=&quot;http://en.wikipedia.org/wiki/IETF&quot; target=&quot;_blank&quot;&gt;official IETF standard&lt;/a&gt; is usually the kiss of death, Jabber looks like it&amp;#39;ll be around for a while, as entire generations of collaborative, work-group applications and services have been built on top of its messaging protocol. For VoIP, &lt;a href=&quot;http://skype.com/helloagain.html&quot; target=&quot;_blank&quot;&gt;Skype&lt;/a&gt; is clearly the leading standard today, though one could &lt;a href=&quot;http://socialsoftware.weblogsinc.com/entry/1234000923058521/&quot; target=&quot;_blank&quot;&gt;argue just how &amp;quot;open&amp;quot; it is&lt;/a&gt; (and defenders of the IETF&amp;#39;s &lt;a href=&quot;http://www.cs.columbia.edu/sip/&quot; target=&quot;_blank&quot;&gt;SIP standard&lt;/a&gt; often do). But it is free and user-friendly, so there won&amp;#39;t be much argument from &lt;i&gt;users&lt;/i&gt; about it being insufficiently open. Yet there may be a cloud on Skype&amp;#39;s horizon: web behemoth Google recently released a beta of &lt;a href=&quot;http://www.google.com/talk/developer.html&quot; target=&quot;_blank&quot;&gt;Google Talk, an IM client committed to open standards&lt;/a&gt;. 
It currently &lt;a href=&quot;http://radar.oreilly.com/archives/2005/08/google_talk_rel.html&quot; target=&quot;_blank&quot;&gt;supports XMPP, and will support SIP&lt;/a&gt; for VoIP calls.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Movers and Shakers:&lt;/b&gt;&lt;br /&gt;&lt;a href=&quot;http://www.jabber.org/people/jer.shtml&quot; target=&quot;_blank&quot;&gt;Jeremie Miller&lt;/a&gt;, &lt;a href=&quot;http://www.cs.columbia.edu/~hgs/&quot; target=&quot;_blank&quot;&gt;Henning Schulzrinne&lt;/a&gt;, &lt;a href=&quot;http://www.von.com/schedule_eos11114704148.html&quot; target=&quot;_blank&quot;&gt;Jon Peterson&lt;/a&gt;, &lt;a href=&quot;http://www.pulver.com/jeff/&quot; target=&quot;_blank&quot;&gt;Jeff Pulver&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;10. Device Management and Control&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;To access online content, we&amp;#39;re using more and more devices. BlackBerrys, iPods, Treos, you name it. As the web evolves, more and more different devices will have to communicate with each other to give us the content we want when and where we want it. No one wants to be dependent on one vendor anymore (like, &lt;a href=&quot;http://www.alwayson-network.com/comments.php?id=P9409_0_6_0_C&quot; target=&quot;_blank&quot;&gt;say, Sony&lt;/a&gt;) for their laptop, phone, MP3 player, PDA, and digital camera, so that it all works together. We need fully interoperable devices, and the standards to make that work. And to take full advantage of online content and innovative web services, those standards need to be open.&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://en.wikipedia.org/wiki/Midi&quot; target=&quot;_blank&quot;&gt;MIDI (musical instrument digital interface)&lt;/a&gt;, one of the very first open standards in music, connected disparate vendors&amp;#39; instruments, post-production equipment, and recording devices. 
But MIDI is limited, and &lt;a href=&quot;http://www.oreillynet.com/pub/wlg/8015&quot; target=&quot;_blank&quot;&gt;MIDI II has been very slow to arrive&lt;/a&gt;. Now a new standard for controlling musical devices has emerged: &lt;a href=&quot;http://www.cnmat.berkeley.edu/OpenSoundControl/&quot; target=&quot;_blank&quot;&gt;OSC (Open SoundControl)&lt;/a&gt;. This protocol is optimized for modern networking technology and inter-connects music, video and controller devices with &amp;quot;other multimedia devices.&amp;quot; OSC is used by a wide range of developers, and is being taken up in the mainstream MIDI marketplace.&lt;br /&gt;&lt;br /&gt;Another open-standards-based device management technology is &lt;a href=&quot;http://www.zigbee.org&quot; target=&quot;_blank&quot;&gt;ZigBee&lt;/a&gt;, for building wireless intelligence and network monitoring into all kinds of devices. ZigBee is supported by many networking, consumer electronics, and mobile device companies.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;· · · · · ·&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The Change to Openness&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The rise of open source software and its &amp;quot;&lt;a href=&quot;http://www.oreillynet.com/pub/a/oreilly/tim/articles/architecture_of_participation.html&quot; target=&quot;_blank&quot;&gt;architecture of participation&lt;/a&gt;&amp;quot; are completely shaking up the old proprietary-web-services-and-standards approach. Sun Microsystems, whose proprietary Java standard helped define the Web 1.0, is opening its Solaris OS and has even announced the apparent paradox of an &lt;a href=&quot;http://blogs.zdnet.com/open-source/?p=418&quot; target=&quot;_blank&quot;&gt;open-source Digital Rights Management&lt;/a&gt; system.&lt;br /&gt;&lt;br /&gt;Today&amp;#39;s incumbents will have to adapt to the new openness of the Web 2.0. 
If they stick to their &lt;a href=&quot;http://www.gartner.com/DisplayDocument?doc_cd=131038&quot; target=&quot;_blank&quot;&gt;proprietary standards&lt;/a&gt;, code, and content, they&amp;#39;ll become the new walled gardens: places users visit briefly to retrieve data and content from enclosed data silos, but not where users &amp;quot;live.&amp;quot; The incumbents&amp;#39; revenue models will have to change. Instead of &amp;quot;owning&amp;quot; their users, users will know they own themselves, and will expect a return on their valuable identity and attention. Instead of being locked into incompatible media formats, users will expect easy access to digital content across many platforms. &lt;br /&gt;&lt;br /&gt;Yesterday&amp;#39;s web giants and tomorrow&amp;#39;s users will need to find a mutually beneficial new balance: between open and proprietary, developer and user, hierarchical and horizontal, owned and shared, and compatible and closed. &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Marc Canter is an active evangelist and developer of open standards. Early in his career, Marc founded MacroMind, which became Macromedia. These days, he is CEO of Broadband Mechanics, a founding member of the Identity Gang and of ourmedia.org. 
Broadband Mechanics is currently developing the &lt;a href=&quot;http://www.alwayson-network.com/comments.php?id=11262_0_1_0_C&quot; target=&quot;_blank&quot;&gt;GoingOn Network&lt;/a&gt; (with the AlwaysOn Network), as well as an open platform for social networking called the PeopleAggregator.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;A version of the above post appears in the Fall 2005 issue of AlwaysOn&amp;#39;s quarterly print blogozine, and ran as &lt;a href=&quot;http://www.alwayson-network.com/comments.php?id=12063_0_1_0_C&quot; target=&quot;_blank&quot;&gt;a four-part series&lt;/a&gt; on the AlwaysOn Network website.&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;&lt;p&gt;(Via &lt;a href=&quot;http://marc.blogs.it/&quot;&gt;Marc&amp;#39;s Voice&lt;/a&gt;.)&lt;/p&gt;&lt;/blockquote&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Yet Another RSS History</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2005-10-25#880</atom:id>
  <atom:published>2005-10-25T22:23:48Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;&lt;a href=&quot;http://dannyayers.com/archives/2005/10/24/yet-another-rss-history/&quot;&gt;Yet Another RSS History&lt;/a&gt;: &amp;quot;&lt;/p&gt;&lt;p&gt;&lt;em&gt;[You don&amp;#39;t expect me to work out the CSS right after making it semantic, do you?] &lt;/em&gt;&lt;/p&gt;&lt;p&gt;Shift to another universe. It&amp;#39;s sometime in the late 1990s. &lt;a href=&quot;http://www.guha.com/cv.html&quot;&gt;Ramanathan Guha&lt;/a&gt;, &lt;a href=&quot;http://www.tbray.org/ongoing/&quot;&gt;Tim Bray&lt;/a&gt;, &lt;a href=&quot;http://scripting.com&quot;&gt;Dave Winer&lt;/a&gt;, &lt;a href=&quot;http://tantek.com&quot;&gt;Tantek Çelik&lt;/a&gt;, &lt;a href=&quot;http://dan.libby.com/&quot;&gt;Dan Libby&lt;/a&gt; and &lt;a href=&quot;http://www.w3.org/People/Connolly/&quot;&gt;Dan Connolly&lt;/a&gt; are sharing a jacuzzi*. As they sip Margaritas, their conversation goes like this: &lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;cite&gt;DanL&lt;/cite&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;p&gt;So, we&amp;#39;ve got this idea for publishing content that&amp;#39;s a bit like &lt;a href=&quot;http://www.w3.org/TR/NOTE-CDFsubmit.html&quot;&gt;CDF&lt;/a&gt;, but we&amp;#39;ve made the system more of a service than just a desktop thing.&lt;/p&gt;&lt;/blockquote&gt;&lt;/li&gt;&lt;li&gt;&lt;cite&gt;Guha&lt;/cite&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;p&gt;Sounds cool. Might be a good fit with this RDF thing I&amp;#39;ve been working on.&lt;/p&gt;&lt;/blockquote&gt;&lt;/li&gt;&lt;li&gt;&lt;cite&gt;Dave&lt;/cite&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;p&gt;Hmm, Dan&amp;#39;s stuff does sound cool, but with all due respect dude, RDF does seem a bit complicated. I really don&amp;#39;t think the folks out in userland would get it. 
And they majored in graphs.&lt;/p&gt;&lt;/blockquote&gt;&lt;/li&gt;&lt;li&gt;&lt;cite&gt;Tim&lt;/cite&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;p&gt;Maybe we could make it a bit more straightforward, you know, like put pointy brackets around it?&lt;/p&gt;&lt;/blockquote&gt;&lt;/li&gt;&lt;li&gt;&lt;cite&gt;Dave&lt;/cite&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;p&gt;Straightforward&amp;#39;s good. Better still, simple. They like simple.&lt;/p&gt;&lt;/blockquote&gt;&lt;/li&gt;&lt;li&gt;&lt;cite&gt;Tantek&lt;/cite&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;p&gt;But what about the rest of the Web, you know, like HTML?&lt;/p&gt;&lt;/blockquote&gt;&lt;/li&gt;&lt;li&gt;&lt;cite&gt;DanL&lt;/cite&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;p&gt;Hmm, but how do we do the timestamping kind of thing, and wrap it up in a &amp;quot;microposty&amp;quot; way, the things that make this distribution mode work?&lt;/p&gt;&lt;/blockquote&gt;&lt;/li&gt;&lt;li&gt;&lt;cite&gt;Guha&lt;/cite&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;p&gt;Yeah, metadata is cool. Keep the metadata.&lt;/p&gt;&lt;/blockquote&gt;&lt;/li&gt;&lt;li&gt;&lt;cite&gt;Tim&lt;/cite&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;p&gt;Not cheap though. The Web must be cheap. Did Andreessen show you his pictures..?&lt;/p&gt;&lt;/blockquote&gt;&lt;/li&gt;&lt;li&gt;&lt;cite&gt;Dave&lt;/cite&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;p&gt;...&amp;quot;Microposty&amp;quot;? 
you mean like my newsletter thing, but on the Web?&lt;/p&gt;&lt;/blockquote&gt;&lt;/li&gt;&lt;li&gt;&lt;cite&gt;DanL&lt;/cite&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;p&gt;Yep, like Cool Diary Entry of the Day&lt;/p&gt;&lt;/blockquote&gt;&lt;/li&gt;&lt;li&gt;&lt;cite&gt;Tim&lt;/cite&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;p&gt;But do we really need 1000 pages of spec for that?&lt;/p&gt;&lt;/blockquote&gt;&lt;/li&gt;&lt;li&gt;&lt;cite&gt;Tantek&lt;/cite&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;p&gt;...Incidentally, did you see my &lt;a href=&quot;http://tantek.com/CSS/Examples/boxmodelhack.html&quot;&gt;Box Model Hack?&lt;/a&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;/li&gt;&lt;li&gt;&lt;cite&gt;Guha&lt;/cite&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;p&gt;Yup.&lt;/p&gt;&lt;/blockquote&gt;&lt;/li&gt;&lt;li&gt;&lt;cite&gt;DanL&lt;/cite&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;p&gt;Yup.&lt;/p&gt;&lt;/blockquote&gt;&lt;/li&gt;&lt;li&gt;&lt;cite&gt;Tim&lt;/cite&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;p&gt;Yup.&lt;/p&gt;&lt;/blockquote&gt;&lt;/li&gt;&lt;li&gt;&lt;cite&gt;Dave&lt;/cite&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;p&gt;Yup. I explained that on DaveNet last year.&lt;/p&gt;&lt;/blockquote&gt;&lt;/li&gt;&lt;li&gt;&lt;cite&gt;MarcC&lt;/cite&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;p&gt;Hey! I&amp;#39;ve got it: &amp;quot;MyDigitalCocktail&amp;quot;..?&lt;/p&gt;&lt;/blockquote&gt;&lt;/li&gt;&lt;li&gt;&lt;cite&gt;DanC&lt;/cite&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;p&gt;Hang on, that gives me &lt;a href=&quot;http://lists.w3.org/Archives/Public/www-rdf-interest/2000Mar/0103&quot;&gt;an idea&lt;/a&gt;...&lt;/p&gt;&lt;/blockquote&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;em&gt;There was a tangible outcome to this conversation: a document format which supports content and unambiguous, explicit, data and metadata, timestamping and much, much more. It&amp;#39;s viewable in a regular browser. Can be syndicated; can be aggregated. Unlike forgetful RSS, archives are almost always retrievable using regular HTTP methods. 
In this universe there was no RSS. No syndication wars. No talking-at-cross-purposes conflict between docheads and dataheads, syntax fans and model fans. No-one had to publish simple data in Byzantine RDF/XML. No-one had to deal with doubly-escaped content and silent data loss. There was no need for any new format for business cards, calendars, blogs, link lists, reviews, pet profiles. XHTML with CSS was more than enough. DanL got the MyNetscape he wanted. Tim got the simple, tight format he wanted. Guha got the AI. Tantek got to do presentations in a cool black raincoat. DanC finally got his schedule on his Palm Pilot. Dave got the credit. MarcC got the parasols and a grass skirt none of the others would admit to having brought. &lt;/em&gt;&lt;/p&gt;&lt;p&gt;Shift back to this universe. Check out &lt;a href=&quot;http://microformats.org/wiki/hatom&quot;&gt;hAtom&lt;/a&gt;. It&amp;#39;s not finished yet, but &lt;a href=&quot;http://blogmatrix.com&quot;&gt;David&lt;/a&gt;&amp;#39;s been methodically working through the (utterly sound) &lt;a href=&quot;http://microformats.org/wiki/process&quot;&gt;microformats process&lt;/a&gt;. Looks good to me. &lt;/p&gt;&lt;p&gt;&lt;em&gt;* apologies for the imagery, but how else do you think Silicon Valley might seem to someone raised in the cowpat-coated hills of Derbyshire?&lt;/em&gt;&lt;/p&gt;&lt;p&gt;PS. Apologies to everyone mentioned. And before you suggest it, blogging *is* therapy.&lt;/p&gt;&amp;quot; &lt;p&gt;(Via &lt;a href=&quot;http://dannyayers.com&quot;&gt;Raw&lt;/a&gt;.)&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Microsoft Gadgets, Start.com and Innovation</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2005-09-16#868</atom:id>
  <atom:published>2005-09-16T17:54:52Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;blockquote&gt;&lt;p&gt;&lt;a href=&quot;http://www.25hoursaday.com/weblog/PermaLink.aspx?guid=88270766-b9e1-407b-937f-ab41edce97de&quot;&gt;Microsoft Gadgets, Start.com and Innovation&lt;/a&gt;: &amp;quot;&lt;/p&gt;&lt;p&gt; A lot of &lt;a href=&quot;http://microsoftgadgets.com/blogs/gadgetnews/archive/2005/09/13/3.aspx#comments&quot;&gt;the comments in the initial post on the Microsoft Gadgets blog&lt;/a&gt; are complaints that Microsoft is copying ideas from &lt;a href=&quot;http://www.apple.com/macosx/features/dashboard/&quot;&gt;Apple&amp;#39;s dashboard&lt;/a&gt;. First of all, people should give credit where it is due and acknowledge that &lt;a href=&quot;http://www.konfabulator.com/&quot;&gt;Konfabulator&lt;/a&gt; is the real pioneer when it comes to desktop widgets. More importantly, the core ideas in Microsoft Gadgets were pioneered by Microsoft, not Apple or Konfabulator. &lt;/p&gt;&lt;p&gt; From the post &lt;a href=&quot;http://microsoftgadgets.com/blogs/gadgetnews/archive/2005/09/15/181.aspx&quot;&gt;A Brief History of Windows Sidebar&lt;/a&gt; by Sean Alexander &lt;/p&gt;&lt;blockquote dir=&quot;ltr&quot; style=&quot;MARGIN-RIGHT: 0px&quot;&gt;&lt;p class=&quot;MsoNormal&quot;&gt;&lt;b&gt;&lt;span&gt;Microsoft &amp;#39;Sideshow*&amp;#39; Research Project (2000-2001)&lt;/span&gt;&lt;/b&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&lt;/p&gt;&lt;p class=&quot;MsoNormal&quot;&gt;&lt;span&gt;While work started prior, in September 2001, a team of Microsoft researchers &lt;a href=&quot;http://research.microsoft.com/research/pubs/view.aspx?tr_id=488&quot;&gt;published a paper&lt;/a&gt; entitled, &amp;#39;Sideshow: Providing peripheral awareness of important information&amp;#39; including findings of their project. &lt;/span&gt;&lt;br /&gt; ...&lt;br /&gt;&lt;span&gt;The research paper provides screenshots that bear a striking resemblance to the Windows Sidebar. 
The paper is a good read for anyone thinking about Gadget development. For folks who have visited Microsoft campuses, you may recall the posters in elevator hallways and Sidebar running on many employees&amp;#39; desktops. Technically one of the first teams to implement this concept &lt;/span&gt;&lt;/p&gt;&lt;span&gt;&lt;p class=&quot;MsoNormal&quot;&gt;&lt;i&gt;&lt;span&gt;*Internal code-name, not directly related to the official, &amp;quot;Windows SideShow&amp;trade;&amp;quot; auxiliary display feature in Windows Vista.&lt;/span&gt;&lt;/i&gt;&lt;/p&gt;&lt;p class=&quot;MsoNormal&quot;&gt;&lt;b&gt;&lt;span&gt;Microsoft &amp;quot;Longhorn&amp;quot; Alpha Release (2003) &lt;/span&gt;&lt;/b&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&lt;/p&gt;&lt;/span&gt;&lt;p class=&quot;MsoNormal&quot;&gt;&lt;span&gt;In 2003, Microsoft unveiled a new feature called, &amp;#39;Sidebar&amp;#39; at the Microsoft Professional Developer&amp;#39;s Conference. This feature took the best concepts from Microsoft Research and applied them to a new platform code-named, &amp;#39;Avalon&amp;#39;, now formally known as Windows Presentation Foundation... &lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&lt;/p&gt;&lt;p class=&quot;MsoNormal&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt; &lt;/p&gt;&lt;b&gt;&lt;span&gt;Microsoft Windows Vista PDC Release (2005)&lt;p xmlns=&quot;o&quot;&gt;&lt;/p&gt;&lt;/span&gt;&lt;/b&gt;&lt;p class=&quot;MsoNormal&quot;&gt;&lt;span&gt;While removed from public eye during the Longhorn plan change in 2004, a small team was formed to continue to incubate Windows Sidebar as a concept, dating back to its roots in 2000/2001 as a research exercise. Now Windows Sidebar will be a feature of Windows Vista. 
Feedback from customers and hardware industry dynamics are being taken into account, particularly adding support for DHTML-based Gadgets to support a broader range of developers and designers, enhanced security infrastructure, and better support for Widescreen (16:10, 16:9) displays. Additionally a new feature in Windows Sidebar is support for hosting of Web Gadgets which can be hosted on sites such as Start.com or run locally. Gadgets that run on the Windows desktop will also be available for Windows XP customers - more details to be shared here in the future.&lt;/span&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p class=&quot;MsoNormal&quot; dir=&quot;ltr&quot;&gt;&lt;span&gt;So the desktop version of &amp;#39;Microsoft Gadgets&amp;#39; is the shipping version of Microsoft Research&amp;#39;s &amp;#39;Sideshow&amp;#39; project. Since the research paper was published a number of parties have shipped products inspired by that research including &lt;a href=&quot;http://www.activewin.com/reviews/software/apps/msn/msn8/interface.shtml&quot;&gt;MSN Dashboard&lt;/a&gt;, &lt;a href=&quot;http://desktop.google.com/features.html#sidebar&quot;&gt;Google Desktop&lt;/a&gt; and &lt;a href=&quot;http://www.desktopsidebar.com/&quot;&gt;Desktop Sidebar&lt;/a&gt; but this doesn&amp;#39;t change the fact that Microsoft is the pioneer in this space. 
&lt;/span&gt;&lt;/p&gt;&lt;p class=&quot;MsoNormal&quot; dir=&quot;ltr&quot;&gt;&lt;span&gt;From the post &lt;a href=&quot;http://microsoftgadgets.com/blogs/gadgetnews/archive/2005/09/15/177.aspx&quot;&gt;Gadgets and Start.com&lt;/a&gt; by Sanaz Ahari &lt;/span&gt;&lt;/p&gt;&lt;blockquote dir=&quot;ltr&quot; style=&quot;MARGIN-RIGHT: 0px&quot;&gt;&lt;span&gt;&lt;p&gt;&lt;a href=&quot;http://start.com/&quot;&gt;Start.com &lt;/a&gt;was initially released in February 2005, on &lt;a href=&quot;http://start.com/1&quot;&gt;start.com/1&lt;/a&gt; - since then we&amp;#39;ve been innovating regularly (&lt;a href=&quot;http://start.com/2&quot;&gt;start.com/2&lt;/a&gt;, &lt;a href=&quot;http://start.com/3&quot;&gt;start.com/3&lt;/a&gt;, &lt;a href=&quot;http://start.com/&quot;&gt;start.com &lt;/a&gt;and &lt;a href=&quot;http://start.com/pdc&quot;&gt;start.com/pdc&lt;/a&gt;) working towards accomplishing our goals: &lt;/p&gt;&lt;ul&gt;&lt;li&gt; To bring the web&amp;#39;s content to users through: &lt;ul&gt;&lt;li&gt; Rich DHTML components (Gadgets) &lt;/li&gt;&lt;li&gt; RSS and behaviors associated with RSS &lt;/li&gt;&lt;li&gt; High customizability and personalization&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;li&gt; To enable developers to extend their start experience by building their own Gadgets&lt;/li&gt;&lt;/ul&gt;&lt;p&gt; Yesterday marked a humble yet significant milestone for us - we opened our &amp;#39;Atlas&amp;#39; framework enabling developers to extend their start.com experience. You can read more about it here: &lt;a href=&quot;http://start.com/developer&quot;&gt;http://start.com/developer&lt;/a&gt;. The key differentiators about our Gadgets are: &lt;/p&gt;&lt;ul&gt;&lt;li&gt; Most web applications were designed as closed systems rather than as a web platform. For example, most customizable &amp;#39;aggregator&amp;#39; web-sites consume feeds and provide a fair amount of layout customization. However, the systems were not extensible by developers. 
With start.com, the experience is now an integrated and extensible application platform. &lt;/li&gt;&lt;li&gt; We will be enriching the gadgets experience even further, enabling these gadgets to seamlessly work on Windows Sidebar&lt;/li&gt;&lt;/ul&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;p class=&quot;MsoNormal&quot; dir=&quot;ltr&quot;&gt;&lt;span&gt;The Start.com stuff is really cool. Currently with traditional portal sites like &lt;a href=&quot;http://my.msn.com/&quot;&gt;MyMSN&lt;/a&gt; or &lt;a href=&quot;http://my.yahoo.com/&quot;&gt;MyYahoo&lt;/a&gt;, I can customize my data sources by subscribing to RSS feeds but not how they look. Instead all my RSS feeds always look like a list of headlines. These portal sites usually use different widgets to display richer data like stock quotes or weather reports but there is no way for me to subscribe to a stock quote or weather report feed and have it look the same as the one provided by the site. &lt;a href=&quot;http://www.start.com/developer&quot;&gt;Start.com&lt;/a&gt; fundamentally changes this model by turning it on its head. I can create a custom RSS feed and specify how it should render in &lt;a href=&quot;http://www.start.com/&quot;&gt;Start.com&lt;/a&gt; using JavaScript which basically makes it a &lt;a href=&quot;http://www.start.com/&quot;&gt;Start.com&lt;/a&gt; gadget, no different from the default ones provided by the site. &lt;/span&gt;&lt;/p&gt;&lt;p class=&quot;MsoNormal&quot; dir=&quot;ltr&quot;&gt;&lt;span&gt;From my perspective, we&amp;#39;re shipping really innovative stuff but because of branding that has attempted to cash in on the &amp;#39;widgets&amp;#39; hype, we end up looking like followers and copycats. &lt;/span&gt;&lt;/p&gt;&lt;p class=&quot;MsoNormal&quot; dir=&quot;ltr&quot;&gt;&lt;span&gt;Marketing sucks. 
&lt;/span&gt;&lt;/p&gt;&amp;quot; &lt;p&gt;(Via &lt;a href=&quot;http://www.25hoursaday.com/weblog/&quot;&gt;Dare Obasanjo aka Carnage4Life&lt;/a&gt;.)&lt;/p&gt;&lt;/blockquote&gt; Posted for historic annotation purposes (re. Widgets as Microsoft didn&amp;#39;t copy Apple here at all; Apple just packaged this better at the expense of Konfabulator as already noted above). And yes, Marketing sucks big time!!</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Bill Gates: Cell Phones Will Overtake MP3 Players, Calls iPod &#39;Unsustainable&#39;</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2005-05-13#837</atom:id>
  <atom:published>2005-05-13T03:53:20Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;blockquote dir=&quot;ltr&quot; style=&quot;MARGIN-RIGHT: 0px&quot;&gt; &lt;p&gt;&lt;a href=&quot;http://www.macobserver.com/article/2005/05/12.12.shtml&quot;&gt;Bill Gates: Cell Phones Will Overtake MP3 Players, Calls iPod &#39;Unsustainable&#39;&lt;/a&gt; Microsoft&#39;s chairman draws on computing history to make his proclamation that the iPod phenomenon won&#39;t... &lt;/p&gt;&lt;/blockquote&gt; &lt;div align=&quot;right&quot;&gt;[via &lt;a href=&quot;http://www.macobserver.com/&quot;&gt;The Mac Observer&lt;/a&gt;]&lt;/div&gt; &lt;div align=&quot;left&quot;&gt;Hmm..!&lt;/div&gt; &lt;div align=&quot;left&quot;&gt;&amp;nbsp;&lt;/div&gt; &lt;div align=&quot;left&quot;&gt;I think this one speaks for itself! Kind of reminds me of&amp;nbsp;the&amp;nbsp;ominous round during the &lt;a href=&quot;http://www.eastsideboxing.com/news.php?p=2100&amp;more=1&quot;&gt;rumble in the jungle&lt;/a&gt;&amp;nbsp;when Ali asked Foreman: &quot;Is that all you got George!&quot;.&lt;/div&gt; &lt;div align=&quot;left&quot;&gt;&amp;nbsp;&lt;/div&gt; &lt;div align=&quot;left&quot;&gt;Again,&amp;nbsp;Mac OS X&amp;nbsp;vs&amp;nbsp;Windows&amp;nbsp;is a rendition of Ali vs Foreman (circa 1974) as stated in an earlier &lt;a href=&quot;http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/index.vspx?id=793&quot;&gt;post&lt;/a&gt;; very much in line with the essence of the post fight analysis&amp;nbsp;expressed below:&lt;/div&gt; &lt;blockquote dir=&quot;ltr&quot; style=&quot;MARGIN-RIGHT: 0px&quot;&gt; &lt;div align=&quot;left&quot;&gt;&quot;Why did Foreman lose to Ali? The fact is Ali beat Foreman because he was tougher and stronger than he&#39;s ever given credit for. Ali didn&#39;t box Foreman! He went to the ropes and allowed Foreman to hit on him, is that boxing? What if Foreman had knocked him out while he was stationary against the ropes. It would&#39;ve been said for the rest of time, why did Ali remain stationary letting Foreman get off on him? 
How come he didn&#39;t use the ring and box? Which is exactly what those watching the fight were thinking and saying during rounds two through eight. That&#39;s not boxing, that&#39;s being forced to fight because your opponent will not allow you to box.&quot;&lt;/div&gt; &lt;div align=&quot;left&quot;&gt;&amp;nbsp;&lt;/div&gt;&lt;/blockquote&gt; &lt;div align=&quot;left&quot; dir=&quot;ltr&quot;&gt;The point I am trying to make here is simple: Bill&#39;s comments are more about hope than facts. The iPod does not define Apple, the company&#39;s future isn&#39;t inextricably linked to the iPod.&amp;nbsp;The company&#39;s&amp;nbsp;future (as I see it) isn&#39;t solely about Desktop Computing (the battle Microsoft won many years ago) or the use of the iPod to ramp up its future growth in this realm. Apple is&amp;nbsp;clearly focused on &quot;Digital Life Style&quot;,&amp;nbsp;&amp;nbsp;a broader incarnation of what&amp;nbsp;Bill &lt;a href=&quot;http://alia.org.au/advocacy/alw/1998/gates.response.html&quot;&gt;described&lt;/a&gt; as &quot;Web Life Style&quot;&amp;nbsp;in the late 90&#39;s.&lt;/div&gt; &lt;div align=&quot;left&quot; dir=&quot;ltr&quot;&gt;&amp;nbsp;&lt;/div&gt; &lt;div align=&quot;left&quot; dir=&quot;ltr&quot;&gt;Apple clearly understands that the Internet is the new Operating System (OS). It also understands that this OS isn&#39;t solely about personal Desktop Computing. 
Most important of all, it understands that it cannot own this OS (so it won&#39;t&amp;nbsp;repeat the fatal mistake of not licensing it to potential partners :-) ).&amp;nbsp; In a sense, the new OS protects Apple from itself (I&amp;nbsp;would certainly understand Bill&#39;s &lt;a href=&quot;http://www.openlinksw.com/blog/search.vspx?blogid=127&amp;q=apple+ipod%0D%0A%0D%0A&amp;type=text&amp;output=html&quot;&gt;point&lt;/a&gt; if Apple were just about the iPod).&lt;/div&gt; &lt;div align=&quot;left&quot; dir=&quot;ltr&quot;&gt;&amp;nbsp;&lt;/div&gt; &lt;div align=&quot;left&quot; dir=&quot;ltr&quot;&gt;Apple&amp;nbsp;is&amp;nbsp;using its significant prowess in technology, aesthetics and user experience innovation to provide great solutions (hardware and software)&amp;nbsp;that empower users of this new OS. Tiger (nee. &lt;a href=&quot;http://binarybonsai.com/archives/2005/01/29/jobs-nextstep-os/&quot;&gt;OpenStep / NeXTSTEP OS&lt;/a&gt;; the platform on which the first Web Browser was &lt;a href=&quot;http://www.w3.org/People/Berners-Lee/WorldWideWeb.html&quot;&gt;created&lt;/a&gt;)&amp;nbsp;is a great example, what a &lt;a href=&quot;http://mlagazine.com/modules.php?op=modload&amp;name=News&amp;file=article&amp;sid=142&quot;&gt;journey&lt;/a&gt;! &lt;/div&gt; &lt;div align=&quot;left&quot; dir=&quot;ltr&quot;&gt;&amp;nbsp;&lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>A History Of Communications</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2005-04-27#813</atom:id>
  <atom:published>2005-04-27T21:17:55Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;a href=&quot;http://www.nixlog.com/archives/2005/03/20_a_history_of_communications.php&quot;&gt;A History Of Communications&lt;/a&gt; &lt;a href=&quot;http://www.nathan.com/projects/current/comtimeline.html&quot;&gt;A History of Communications Timeline&lt;/a&gt; (via &lt;a href=&quot;http://www.xplane.com/xblog/&quot;&gt;xBlog&lt;/a&gt;) &lt;div align=&quot;right&quot;&gt;[via &lt;a href=&quot;http://www.nixlog.com/&quot;&gt;nixlog&lt;/a&gt;]&lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Condemned To Repeat The Past?</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2005-04-13#779</atom:id>
  <atom:published>2005-04-13T16:05:11Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;a href=&quot;http://techdirt.com/articles/20050413/0054217_F.shtml&quot;&gt;Condemned To Repeat The Past?&lt;/a&gt; Last week, I mentioned one of the &lt;a href=&quot;http://www.techdirt.com/books/20050407/022236_F.shtml&quot;&gt;lessons learned&lt;/a&gt; from Andy Kessler&#39;s newest book, &lt;a href=&quot;http://www.amazon.com/exec/obidos/ASIN/0060840978/techdirtcom/&quot;&gt;&lt;i&gt;How We Got Here: A Slightly Irreverent History of Technology and Markets&lt;/i&gt;&lt;/a&gt;. I&#39;ve just finished reading it, and just in time, as well, since you can now check it out for free. While the hard copy version doesn&#39;t come out until June, Kessler is releasing the book for &lt;a href=&quot;http://andykessler.com/hwgh.html&quot;&gt;free download&lt;/a&gt; off his site in e-book form. If you read his previous book, &lt;a href=&quot;http://www.amazon.com/exec/obidos/ASIN/0060740647/techdirtcom/&quot;&gt;&lt;i&gt;Running Money&lt;/i&gt;&lt;/a&gt;, you might even recognize a few short passages in the new book. In &lt;i&gt;Running Money&lt;/i&gt;, Kessler goes through his own experience figuring out the mental model that guided his investment philosophy in technology -- and part of that included a brief history lesson in the start of the industrial revolution. HWGH is basically an extended version of that history lesson, written in the same light tone -- designed to be the basic, quick history manual that anyone in the tech world (in just about any capacity, from engineer to business to investing) should read. Indeed, we&#39;re already seeing startups and companies making business decisions that seem like they&#39;re following the same bad footsteps companies took only five years ago. Is it any surprise that some are repeating the mistakes of 200 years ago as well? 
One of the worst things in the tech and business world these days is that many people can&#39;t view trends out past a quarter (or they simply extract one single trend, without recognizing how others impact them). Any intelligent business person needs to recognize how trends play out in the long term, and how they interact with each other. HWGH gives you plenty of trends from the past few &lt;i&gt;centuries&lt;/i&gt; to help guide you into that longer term thinking. &lt;div align=&quot;right&quot;&gt;[via &lt;a href=&quot;http://www.techdirt.com/&quot;&gt;Techdirt&lt;/a&gt;]&lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>IDMS and its role in general DBMS History</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2005-03-28#772</atom:id>
  <atom:published>2005-03-28T16:34:21Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;A&amp;nbsp;great piece of DBMS history conveyed through the&amp;nbsp;&lt;a href=&quot;http://users.senet.com.au/~cherlet/idmshist.html&quot;&gt;story of IDMS&lt;/a&gt;.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>The Lost 1984 Mac Video</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2005-03-22#760</atom:id>
  <atom:published>2005-03-22T20:20:47Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;A great piece&amp;nbsp;that reminds us of &lt;a href=&quot;http://en.wikipedia.org/wiki/Apple_Computer&quot;&gt;Apple Computer&#39;s&lt;/a&gt;&amp;nbsp;contributions to&amp;nbsp;&lt;a href=&quot;http://www.industrial-technology-and-witchcraft.de/1984.html&quot;&gt;desktop computing&lt;/a&gt;&amp;nbsp;history.&amp;nbsp;&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Longhorn: Fixing Your Own Mess?</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2005-03-08#745</atom:id>
  <atom:published>2005-03-08T17:42:00Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;Via the &lt;a href=&quot;http://www.alwayson-network.com/&quot;&gt;always-on&lt;/a&gt; network I stumbled across a great &lt;a href=&quot;http://www.alwayson-network.com/comments.php?id=9035_0_11_0_C&quot;&gt;article&lt;/a&gt; by Pip Coburn&amp;nbsp;that posed the following question: &quot;should Microsoft benefit from the mess it helped create?&quot;.&lt;/p&gt; &lt;p&gt;The article&amp;nbsp;discusses most of the key&amp;nbsp;issues, but it should also have included and discussed the following question: &quot;should Microsoft benefit from the mess that we let them create?&quot;.&amp;nbsp;By &quot;we&quot; I mean&amp;nbsp;the extensive pool of Microsoft product consumers, developers, partners, etc.&lt;/p&gt; &lt;blockquote dir=&quot;ltr&quot; style=&quot;MARGIN-RIGHT: 0px&quot;&gt; &lt;p&gt;&lt;em&gt;I have worked with Microsoft products (as a developer and user) for more years than I would like to remember; I have personally experienced the journey from Windows 2.0 to Windows XP (and played around with Longhorn).&lt;/em&gt; &lt;/p&gt;&lt;/blockquote&gt; &lt;p&gt;I added my question to this dialog&amp;nbsp;because without its resultant perspective,&amp;nbsp;history will simply repeat itself. If IT decision makers don&#39;t change their product selection and acquisition habits, then why should Microsoft or any other vendor change their ways? Especially&amp;nbsp;when a perpetual promise, under-deliver, re-promise cycle works absolutely fine. This isn&#39;t rocket science; it&#39;s basic common sense (but we know that common sense ain&#39;t that common).&lt;/p&gt; &lt;p&gt;Microsoft, like most software companies,&amp;nbsp;seeks a significant portion of its revenue growth&amp;nbsp;from product upgrades. In a sense, this&amp;nbsp;inherently implies that&amp;nbsp;these products will always be millions of miles away from the &quot;silver bullet&quot; promises espoused in the pre-release marketing and PR hype. 
Sadly, there was a time when Marketing and PR hype used to be about new features; a time when there was a clear line between a new feature and a fundamental product bug. &lt;/p&gt; &lt;p&gt;Buying products from any company simply because they have the largest market share&amp;nbsp;is dumb! All it does is encourage other vendors to focus on product market share rather than product quality, which ultimately results in the following:&lt;/p&gt; &lt;ol&gt; &lt;li&gt;You basically end up paying (rather than at least being credited) for opportunity costs arising from all the time lost&amp;nbsp;now that your PC works slower than you&amp;nbsp;do. &lt;br&gt;&lt;/li&gt; &lt;li&gt;You pay for bug fixes and architectural flaws instead of new features.&lt;/li&gt;&lt;/ol&gt; &lt;p&gt;Microsoft isn&#39;t a unique source of this problem, but hey! They are the largest Software Company (the one with the vital market share), and their software products are&amp;nbsp;on some 80-90% of desktops on this planet, and the planet isn&#39;t at its most productive at the current time, and no matter how you look at it, this loss of productivity has something to do with the increased nuisance of desktop computing. &lt;/p&gt; &lt;p&gt;If Microsoft could just focus on its core competence (BTW - I can&#39;t quite pinpoint this anymore&amp;nbsp;since they are in every software market that exists today), it would at least have an iota of a chance in hell of cleaning up this mess.&lt;/p&gt; &lt;p&gt;&amp;nbsp;&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>OpenSolaris: Great Business Strategy or Dumb Luck?</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2005-03-03#725</atom:id>
  <atom:published>2005-03-03T18:46:13Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;blockquote&gt; &lt;p&gt;&lt;a href=&quot;http://blogs.sun.com/roller/page/jimgris/20050302#great_business_strategy_or_dumb&quot;&gt;Great Business Strategy or Dumb Luck&lt;/a&gt; Interesting read here today at ZDNet -- &lt;a href=&quot;http://news.zdnet.com/2100-9590_22-5596710.html&quot;&gt;Open Solaris and strategic consequences&lt;/a&gt;. Here&#39;s a bit of the conclusion:&lt;br&gt;&lt;br&gt;&lt;/p&gt; &lt;div style=&quot;MARGIN-LEFT: 40px&quot;&gt;&lt;span style=&quot;COLOR: rgb(153,0,0)&quot;&gt;Open Solaris may go down in history as one of the finest examples of business strategy ever -- unless, of course, it&#39;s just dumb luck.&lt;/span&gt;&lt;br&gt;&lt;/div&gt;&lt;br&gt;So, we are so brilliant -- &lt;span style=&quot;FONT-STYLE: italic&quot;&gt;to the extreme&lt;/span&gt; -- that when &lt;a href=&quot;http://www.opensolaris.org/&quot;&gt;OpenSolaris&lt;/a&gt; succeeds it will be characterized as &quot;one of the finest examples of business strategy &lt;span style=&quot;TEXT-DECORATION: underline&quot;&gt;ever&lt;/span&gt;.&quot; Ever? That would be quite an achievement. But even if we are successful -- &lt;span style=&quot;FONT-STYLE: italic&quot;&gt;shifting to the extreme polar opposite now&lt;/span&gt; -- we could just as easily be considered &quot;dumb&quot; and that our achievement was &quot;just lucky.&quot; What? Why the extremes? Sorry. I just can&#39;t factor that. I realize I&#39;m a pretty simple guy, but this makes no sense to me. Why do people look at issues this way? I think this is why some conversations are so confusing. People argue to the extremes. Why can&#39;t Sun&#39;s &lt;a href=&quot;http://www.technorati.com/tag/OpenSolaris&quot;&gt;open sourcing&lt;/a&gt; of &lt;a href=&quot;http://www.technorati.com/tag/Solaris&quot;&gt;Solaris&lt;/a&gt; be seen as simply the natural evolution of a company, a development team, a product, and a market? 
Or the genuine attempt of the &lt;a href=&quot;http://www.sun.com/software/solaris/&quot;&gt;Solaris&lt;/a&gt; &lt;a href=&quot;http://www.samag.com/documents/s=9427/sam0414a/0414a.htm&quot;&gt;kernel engineers&lt;/a&gt; to engage with external developers in a community co-development model to improve the system for everyone involved? Why can&#39;t it be that simple? What am I missing here?&lt;/blockquote&gt; &lt;div align=&quot;right&quot;&gt;[via &lt;a href=&quot;http://blogs.sun.com/roller/page/jimgris&quot;&gt;Jim Grisanzio&lt;/a&gt;]&lt;/div&gt; &lt;div align=&quot;left&quot;&gt;&amp;nbsp;&lt;/div&gt; &lt;div align=&quot;left&quot;&gt;Jim makes a great point!&lt;/div&gt; &lt;div align=&quot;left&quot;&gt;&amp;nbsp;&lt;/div&gt; &lt;div align=&quot;left&quot;&gt;Also note that Open Source Solaris is a huge contribution to the Open Source community&amp;nbsp;from a company that (IMHO) has actually been one of the largest Open Source contributors in history, period. We just don&#39;t track history very well these days, thanks to the kind of zealotry written about &lt;a href=&quot;http://caustictech.typepad.com/caustictech/2004/06/the_open_source.html&quot;&gt;here&lt;/a&gt;&amp;nbsp;(*strong language*) and&amp;nbsp;&lt;a href=&quot;http://neopoleon.com/blog/posts/4343.aspx&quot;&gt;here&lt;/a&gt;&amp;nbsp;(in this case by&amp;nbsp;&lt;a href=&quot;http://neopoleon.com/blog/&quot;&gt;Rory Blyth&lt;/a&gt;).&lt;/div&gt; &lt;div align=&quot;left&quot;&gt;&amp;nbsp;&lt;/div&gt; &lt;div align=&quot;left&quot;&gt;BTW - &lt;a href=&quot;http://www.openlinksw.com/blog/search.vspx?blogid=127&amp;q=open+source%0D%0A&amp;type=text&amp;output=html&quot;&gt;Here&lt;/a&gt; are some of my previous posts on the subject of Open Source.&lt;/div&gt; &lt;div align=&quot;left&quot;&gt;&amp;nbsp;&lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Bloglines</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2004-07-26#600</atom:id>
  <atom:published>2004-07-26T19:02:19Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;a href=&quot;http://weblog.infoworld.com/udell/2004/07/25.html#a1047&quot;&gt;Bloglines&lt;/a&gt; &lt;p&gt;Since last fall, I&amp;#39;ve been recommending &lt;a href=&quot;http://www.blogines.com/&quot;&gt;Bloglines&lt;/a&gt; to first-timers as the fastest and easiest introduction to the subscription side of the blogosphere. Remarkably, this same application also meets the needs of some of the most &lt;a href=&quot;http://www.intertwingly.net/blog/1716.html&quot;&gt;advanced&lt;/a&gt; &lt;a href=&quot;http://jeremy.zawodny.com/blog/archives/001829.html&quot;&gt;users&lt;/a&gt;. I&amp;#39;ve now added myself to that list. Hats off to &lt;a href=&quot;http://www.wingedpig.com/&quot;&gt;Mark Fletcher&lt;/a&gt; for putting all the pieces together in such a masterful way. &lt;/p&gt; &lt;p&gt;What goes around comes around. Five years ago, centralized feed aggregators -- my.netscape.com and my.userland.com -- were the only game in town. Fat-client feedreaders only arrived on the scene later. Because of the well-known rich-versus-reach tradeoffs, I never really settled in with one of those. Most of the time I&amp;#39;ve used the Radio UserLand reader. It is browser-based, and it normally points to localhost, but I&amp;#39;ve been parking Radio UserLand on a secure server so that I can read the feeds it aggregates for me from anywhere. &lt;/p&gt; &lt;p&gt;Bloglines takes that idea and runs with it. Like the Radio UserLand reader, it supports the all-important (to me) consolidated view of new items. But its two-pane interface also shows me the list of feeds, highlighting those with new entries, so you can switch between a linear scan of all new items and random access to particular feeds. 
Once you&amp;#39;ve read an item it vanishes, but you can recall already-read items like so: &lt;/p&gt; &lt;p align=&quot;center&quot;&gt; &lt;/p&gt;&lt;form action=&quot;&quot; method=&quot;get&quot;&gt;&lt;input name=&quot;sub&quot; type=&quot;hidden&quot; /&gt; Display items within the last &lt;select name=&quot;since&quot;&gt;&lt;option value=&quot;0&quot;&gt;Session&lt;/option&gt;&lt;option value=&quot;1&quot;&gt;1 Hour&lt;/option&gt;&lt;option value=&quot;2&quot;&gt;6 Hours&lt;/option&gt;&lt;option value=&quot;3&quot;&gt;12 Hours&lt;/option&gt;&lt;option value=&quot;4&quot;&gt;24 Hours&lt;/option&gt;&lt;option value=&quot;5&quot;&gt;48 Hours&lt;/option&gt;&lt;option value=&quot;6&quot;&gt;72 Hours&lt;/option&gt;&lt;option value=&quot;7&quot;&gt;Week&lt;/option&gt;&lt;option value=&quot;8&quot;&gt;Month&lt;/option&gt;&lt;option value=&quot;9&quot;&gt;All Items&lt;/option&gt;&lt;/select&gt; &lt;input name=&quot;Display&quot; type=&quot;submit&quot; value=&quot;Display&quot; /&gt; &lt;/form&gt; &lt;p&gt;If a month&amp;#39;s worth of some blog&amp;#39;s entries produces too much stuff to easily scan, you can switch that blog to a titles-only view. The titles expand to reveal all the content transmitted in the feed for that item. &lt;/p&gt; &lt;p&gt;I haven&amp;#39;t gotten around to organizing my feeds into folders, the way &lt;a href=&quot;http://www.bloglines.com/public/yoz&quot;&gt;other&lt;/a&gt; &lt;a href=&quot;http://www.bloglines.com/public/marccanter&quot;&gt;users&lt;/a&gt; of Bloglines do, but I&amp;#39;ve poked around enough to see that Bloglines, like Zope, handles foldering about as well as you can in a Web UI -- which is to say, well enough. With an intelligent local cache it could be really good; more on that later. &lt;/p&gt; &lt;p&gt;Bloglines does two kinds of data mining that are especially noteworthy. First, it counts and reports the number of Bloglines users subscribed to each blog. 
In the case of &lt;a href=&quot;http://www.bloglines.com/preview?siteid=297235&quot;&gt;Jonathan Schwartz&amp;#39;s weblog&lt;/a&gt;, for example, there are (as of this moment) &lt;a href=&quot;http://www.bloglines.com/userdir?siteid=297235&quot;&gt;253 subscribers&lt;/a&gt;. &lt;/p&gt; &lt;p&gt;Second, Bloglines is currently managing references to items more effectively than the competition. I was curious, for example, to gauge the reaction to the latest salvo in Schwartz&amp;#39;s ongoing campaign to turn up the heat on Red Hat. Bloglines reports &lt;a href=&quot;http://www.bloglines.com/citations?siteid=297235&amp;itemid=14&quot; target=&quot;_blank&quot; title=&quot;References To This Item From Other Blogs&quot;&gt;10 References&lt;/a&gt;. In this case, the comparable query on Feedster yields a &lt;a href=&quot;http://feedster.net//links.php?url=http%3A//blogs.sun.com/roller/page/jonathan/20040721%23competing_against_a_social_movement&quot;&gt;comparable result&lt;/a&gt;, but on the whole I&amp;#39;m finding Bloglines&amp;#39; assembly of conversations to be more reliable than Feedster&amp;#39;s (which, however, is still marked as &amp;#39;beta&amp;#39;). Meanwhile Technorati, though it casts a much wider net than either, is &lt;a href=&quot;http://www.technorati.com/cosmos/search.html?url=http://blogs.sun.com/roller/page/jonathan/20040721#competing_against_a_social_movement&quot;&gt;currently struggling&lt;/a&gt; with conversation assembly. &lt;/p&gt; &lt;p&gt;I love how Bloglines weaves everything together to create a dense web of information. For example, the list of &lt;a href=&quot;http://www.bloglines.com/userdir?siteid=297235&quot;&gt;subscribers to the Schwartz blog&lt;/a&gt; includes: &lt;i&gt;&lt;a href=&quot;http://www.bloglines.com/public/judell&quot;&gt;judell&lt;/a&gt; - subscribed since July 23, 2004&lt;/i&gt;. Click that link and you&amp;#39;ll see my Bloglines subscriptions. 
Which you can &lt;a href=&quot;http://www.bloglines.com/export?id=judell&quot;&gt;export&lt;/a&gt; and then -- if you&amp;#39;d like to see the world through my filter -- turn around and import. &lt;/p&gt; &lt;p&gt;Moving my 265 subscriptions into Bloglines wasn&amp;#39;t a complete no-brainer. I imported my &lt;a href=&quot;http://weblog.infoworld.com/udell/gems/mySubscriptions.opml&quot;&gt;Radio UserLand-generated OPML file&lt;/a&gt; without any trouble, but catching up on unread items -- that is, marking all of each feed&amp;#39;s sometimes lengthy history of items as having been read -- was painful. In theory you can do that by clicking once on the top-level folder containing all the feeds, which generates the consolidated view of unread items. In practice, that kept timing out. I finally had to touch a number of the larger feeds, one after another, in order to get everything caught up. A &lt;b&gt;Catch Up All Feeds&lt;/b&gt; feature would solve this problem. &lt;/p&gt; &lt;p&gt;Another feature I&amp;#39;d love to see is &lt;b&gt;Move To Next Unread Item&lt;/b&gt; -- wired to a link in the HTML UI, or to a keystroke, or ideally both. &lt;/p&gt; &lt;p&gt;Finally, I&amp;#39;d love it if Bloglines cached everything in a local database, not only for offline reading but also to make the UI more responsive and to accelerate queries that reach back into the archive. &lt;/p&gt; &lt;p&gt;Like Gmail, Bloglines is the kind of Web application that surprises you with what it can do, and makes you crave more. Some argue that to satisfy that craving, you&amp;#39;ll need to abandon the browser and switch to RIA (rich Internet application) technology -- Flash, Java, Avalon (someday), whatever. Others are concluding that perhaps the 80/20 solution that the browser is today can become a 90/10 or 95/5 solution tomorrow with some incremental changes. 
&lt;/p&gt; &lt;p&gt;Dare Obasanjo wondered, over the weekend, &amp;quot;What is Google building?&amp;quot; He wrote: &lt;/p&gt;&lt;blockquote class=&quot;personQuote DareObasanjo&quot;&gt;In the past couple of months Google has hired four people who used to work on Internet Explorer in various capacities [especially its XML support] who then moved to BEA; &lt;a href=&quot;http://davidbau.com/about/david_bau.html&quot;&gt;David Bau&lt;/a&gt;, &lt;a href=&quot;http://www.oreillynet.com/pub/au/1303&quot;&gt;Rod Chavez&lt;/a&gt;, &lt;a href=&quot;http://gary.burd.info/&quot;&gt;Gary Burd&lt;/a&gt; and most recently &lt;a href=&quot;http://www.eweek.com/article2/0,1759,1627319,00.asp&quot;&gt;Adam Bosworth&lt;/a&gt;. A number of my coworkers used to work with these guys since our team, the Microsoft XML team, was once part of the Internet Explorer team. It&amp;#39;s been interesting chatting in the hallways with folks contemplating what Google would want to build that requires folks with a background in building XML data access technologies both on the client side, Internet Explorer and on the server, BEA&amp;#39;s WebLogic. [&lt;a href=&quot;http://www.25hoursaday.com/weblog/PermaLink.aspx?guid=1524b97e-f8b1-4e42-ac07-455337f299b4&quot;&gt;Dare Obasanjo&lt;/a&gt;] &lt;/blockquote&gt;It seems pretty clear to me. Web applications such as Gmail and Bloglines are already hard to beat. With a touch of &lt;a href=&quot;http://weblog.infoworld.com/udell/2004/06/15.html#a1023&quot;&gt;alchemy&lt;/a&gt; they just might become unstoppable. &lt;p&gt;&lt;/p&gt; &lt;div align=&quot;right&quot;&gt;[via &lt;a href=&quot;http://weblog.infoworld.com/udell/&quot;&gt;Jon&amp;#39;s Radio&lt;/a&gt;]&lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>History leading up to today&#39;s IE Security and Backwardness Debacle</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2004-07-07#584</atom:id>
  <atom:published>2004-07-07T19:30:28Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;Another &lt;a href=&quot;http://reviews.cnet.com/4520-3513_7-5142439-1.html&quot;&gt;insightful piece&lt;/a&gt; on the same painful subject of IE and the costs of vendor monoculture. The IE debacle is an&amp;nbsp;important harbinger of what&#39;s to come, for those who hope this is simply a storm in a teacup. It isn&#39;t inconceivable that a Longhorn upgrade will be pitched as the way out of this deepening dysfunctional-web quagmire.&amp;nbsp;Unfortunately, many will bite if history and the current mindset are any barometer :-(&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>XML, the New Database Heresy</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2004-06-04#555</atom:id>
  <atom:published>2004-06-04T04:04:48Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;p dir=&quot;ltr&quot;&gt;A great &lt;a href=&quot;http://www.25hoursaday.com/weblog/PermaLink.aspx?guid=d28ce1fb-7b27-407d-b1a3-0b9a34831ca1&quot;&gt;post&lt;/a&gt; by Dare, especially his bringing into context the essence of this matter, referred to by C.J. Date as &quot;XML the New Database Heresy&quot;.&lt;/p&gt; &lt;p dir=&quot;ltr&quot;&gt;I have little to add to this matter as our&amp;nbsp;understanding and vision are aptly expressed via the architecture and feature set of &lt;a href=&quot;http://www.openlinksw.com/virtuoso&quot;&gt;Virtuoso&lt;/a&gt; (this area was actually addressed circa 1999).&lt;/p&gt; &lt;p dir=&quot;ltr&quot;&gt;We are heading into an era of multi-model databases: single database engines that are capable of effectively serving the requirements of the Hierarchical, Network, Relational, and Object database &lt;a href=&quot;http://www.web-dictionary.org/encyclopedia/db/DBMS.html#Navigational_databases&quot;&gt;models&lt;/a&gt;. As we get closer to the unravelling of universal storage, hopefully this will get clearer.&lt;/p&gt; &lt;p dir=&quot;ltr&quot;&gt;Back to Dare&#39;s commentary:&lt;/p&gt; &lt;blockquote dir=&quot;ltr&quot; style=&quot;MARGIN-RIGHT: 0px&quot;&gt; &lt;p&gt;&lt;a href=&quot;http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/d/Date:C=_J=.html&quot;&gt;C.J. Date&lt;/a&gt;, one of the most influential names in the relational database world, had some harsh words about XML&#39;s encroachment into the world of relational databases in a recent article entitled &lt;a href=&quot;http://searchdatabase.techtarget.com/originalContent/0,289142,sid13_gci962948,00.html&quot;&gt;Date defends relational model&lt;/a&gt;&amp;nbsp;that appeared on SearchDatabase.com. 
Key parts of the article are excerpted below &lt;/p&gt; &lt;blockquote dir=&quot;ltr&quot; style=&quot;MARGIN-RIGHT: 0px&quot;&gt; &lt;p&gt;Date reserved his harshest criticism for the competition, namely object-oriented and XML-based DBMSs. Calling them &quot;the latest fashions in the computer world,&quot; Date said he rejects the argument that relational DBMSs are yesterday&#39;s news. Fans of object-oriented database systems &quot;see flaws in the relational model because they don&#39;t fully understand it,&quot; he said. &lt;/p&gt; &lt;p&gt;Date also said that XML enthusiasts have gone overboard. &lt;/p&gt; &lt;p&gt;&quot;XML was invented to solve the problem of data interchange, but having solved that, they now want to take over the world,&quot; he said. &quot;With XML, it&#39;s like we forget what we are supposed to be doing, and focus instead on how to do it.&quot; &lt;/p&gt; &lt;p&gt;Craig S. Mullins, the director of technology planning at BMC Software and a SearchDatabase.com expert, shares Date&#39;s opinion of XML. It can be worthwhile, Mullins said, as long as XML is only used as a method of taking data and putting it into a DBMS. But Mullins cautioned that XML data that is stored in relational DBMSs as whole documents will be useless if the data needs to be queried, and he stressed Date&#39;s point that XML is not a real data model. &lt;/p&gt;&lt;/blockquote&gt; &lt;p dir=&quot;ltr&quot;&gt;Craig Mullins points are more straightforward to answer since his comments don&#39;t jibe with the current state of the art in the XML world. He states that you can&#39;t query XML documents stored in databases but this is untrue. Almost three years ago, I was writing articles about &lt;a href=&quot;http://features.slashdot.org/article.pl?sid=01/10/29/0725214&amp;mode=thread&amp;tid=156&quot;&gt;querying XML documents stored in relational databases&lt;/a&gt;. 
Storing XML in a relational database doesn&#39;t mean it has to be stored as an opaque&amp;nbsp;binary BLOB or as a big bunch of text which cannot effectively be queried. The next version of SQL Server will have extensive capabilities for querying XML data in relational databases and doing joins across relational and XML data; a lot of this functionality is&amp;nbsp;described in the article on &lt;a href=&quot;http://msdn.microsoft.com/xml/default.aspx?pull=/library/en-us/dnsql90/html/sql2k5xml.asp&quot;&gt;XML Support in SQL Server 2005&lt;/a&gt;. As for XML not having a data model, I beg to differ. There is a data model for XML that many applications and people adhere to, often without realizing that they are doing so. This data model is the &lt;a href=&quot;http://www.w3.org/TR/1999/REC-xpath-19991116#data-model&quot;&gt;XPath 1.0 data model&lt;/a&gt;, which is being updated to handle typed data as the &lt;a href=&quot;http://www.w3.org/TR/2003/WD-xpath-datamodel-20031112/&quot;&gt;XQuery and XPath 2.0 data model&lt;/a&gt;. &lt;/p&gt; &lt;p dir=&quot;ltr&quot;&gt;Now to tackle the meat of C.J. Date&#39;s criticisms, which is that XML solves the problem of data interchange but now is showing up in the database. The first point I&#39;d like to make is that there are two broad usage patterns of XML: it is used to represent both rigidly structured tabular data&amp;nbsp;(e.g., relational data or serialized objects) and semi-structured data (e.g., office documents). The latter type of data will only grow now that office productivity software like &lt;a href=&quot;http://www.microsoft.com/office&quot;&gt;Microsoft Office&lt;/a&gt; has enabled users to save their&amp;nbsp;documents as XML instead of proprietary binary formats. In many cases, these documents cannot simply be shredded into relational tables. 
Sure you can shred an Excel spreadsheet written in&amp;nbsp;spreadsheetML into relational tables but is the same really feasible for a Word document written in WordprocessingML? Many enterprises would rather have their important business data stored and queried from a unified location instead of the current situation where some data is in document management systems, some hangs around as random files in people&#39;s folders while some sits in a database management system. &lt;/p&gt; &lt;p dir=&quot;ltr&quot;&gt;As for stating that critics of the relational model don&#39;t understand it,&amp;nbsp;I disagree. One of the major&amp;nbsp;benefits of using XML&amp;nbsp;in relational databases is that it is a lot easier to deal with fluid schemas or&amp;nbsp;data with sparse entries with XML. When the shape of the data tends to change or is not fixed, the relational model is simply not&amp;nbsp;designed to deal with this. Constantly changing your database schema is simply not feasible and there is no easy way to provide the extensibility of XML where one can say &quot;after the &lt;font face=&quot;Courier New&quot;&gt;X &lt;/font&gt;element, any element from any namespace can appear&quot;. How would one describe the capacity to store &quot;any data&quot; in a traditional relational database without resorting to an opaque blob? &lt;/p&gt; &lt;p dir=&quot;ltr&quot;&gt;I do tend to agree that some people are going overboard and trying to model their data hierarchically instead of relationally, which experience has taught us is a bad idea. Recently there was a thread on the XML-DEV mailing list entitled &lt;a href=&quot;http://lists.xml.org/archives/xml-dev/200405/msg00216.html&quot;&gt;Designing XML to Support Information Evolution&lt;/a&gt; in which Roger L. Costello described his travails trying to model his data which was being transferred as XML in a hierarchical manner. 
Michael Champion accurately described the process Roger Costello went through as having &quot;rediscovered the relational model&quot;. In&amp;nbsp;a response to that thread I wrote &quot;Hierarchical databases failed for a reason&quot;. &lt;/p&gt; &lt;p dir=&quot;ltr&quot;&gt;Using hierarchy as a primary way to model data is bad for at least the following reasons &lt;/p&gt; &lt;ol dir=&quot;ltr&quot;&gt; &lt;li&gt; &lt;div&gt;Hierarchies tend to encourage redundancy. Imagine I have a &amp;lt;Customer&amp;gt; element that has one or more &amp;lt;ShippingAddress&amp;gt; elements as children as well as one or more &amp;lt;Order&amp;gt; elements as children. Each order was shipped to an address, so if modelled hierarchically each &amp;lt;Order&amp;gt; element also will have a &amp;lt;ShippingAddress&amp;gt; element which leads to a lot of unnecessary duplication of data. &lt;/div&gt;&lt;/li&gt; &lt;li&gt; &lt;div&gt;In the real world, there are often multiple&amp;nbsp;groups to which a piece of data belongs which often cannot be modelled with a single hierarchy. &amp;nbsp; &lt;/div&gt;&lt;/li&gt; &lt;li&gt; &lt;div&gt;Data is too tightly coupled. If I delete a &amp;lt;Customer&amp;gt; element, this means I&#39;ve automatically deleted his entire order history since all&amp;nbsp;the &amp;lt;Order&amp;gt; elements are children of &amp;lt;Customer&amp;gt;. Similarly if I query for a &amp;lt;Customer&amp;gt;, I end up getting all the &amp;lt;Order&amp;gt; information as well. &lt;/div&gt;&lt;/li&gt;&lt;/ol&gt; &lt;p&gt;To put it simply, experience has taught the software world that the relational model is a better way to model data than the hierarchical model. Unfortunately, in the rush to embrace XML many are repeating the mistakes from decades ago in the new millennium. &lt;/p&gt;&lt;/blockquote&gt; &lt;div align=&quot;right&quot;&gt;[via &lt;a href=&quot;http://www.25hoursaday.com/weblog/&quot;&gt;Dare Obasanjo aka Carnage4Life&lt;/a&gt;]&lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Essay about current and past trends -- Joi Ito</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2004-04-24#528</atom:id>
  <atom:published>2004-04-24T21:27:24Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;Here are some thoughts on where I think things are going in the mobile and content space.&lt;/p&gt; &lt;p&gt;&lt;em&gt;I wrote this essay before reading &lt;a href=&quot;http://en.wikipedia.org/wiki/Free_Culture&quot;&gt;Free Culture&lt;/a&gt; so I&amp;#39;m saying a lot of stuff that &lt;a href=&quot;http://www.lessig.org/&quot;&gt;Larry&lt;/a&gt; says better...&lt;/em&gt;&lt;/p&gt; &lt;p&gt;Several crucial shifts in technology are emerging that will drastically affect the relationship between users and technology in the near future. Wireless Internet is becoming ubiquitous and economically viable. Internet capable devices are becoming smaller and more powerful.&lt;/p&gt; &lt;p&gt;Alongside technological shifts, new social trends are emerging. Users are shifting their attention from packaged content to social information about location, presence and community. Tools for identity, trust, relationship management and navigating social networks are becoming more popular. Mobile communication tools are shifting away from a 1-1 model, allowing for increased many-to-many interactions; such a shift is even being used to permit new forms of democracy and citizen participation in global dialog.&lt;/p&gt; &lt;p&gt;While new technological and social trends are occurring, it is not without resistance, often by the developers and distributors of technology and content. In order to empower the consumer as a community member and producer, communication carriers, hardware manufacturers and content providers must understand and build models that focus less on the content and more on the relationships. &lt;/p&gt; &lt;p&gt;&lt;strong&gt;Smaller faster&lt;/strong&gt;&lt;/p&gt; &lt;p&gt;Computing started out as large mainframe computers, software developers and companies &amp;#8220;time sharing&amp;#8221; for slices of computing time on the large machines. 
The minicomputer was cheaper and smaller, allowing companies and labs to own their own computers. The minicomputer allowed a much greater number of people to have access to computers and even use them in real time. The minicomputer led to a burst in software and networking technologies. In the early 80&amp;#8217;s, the personal computer increased the number of computers by an order of magnitude and, again, led to an explosion in new software and technology while lowering the cost even more. Console gaming companies proved once again that unit costs could be decreased significantly by dramatically increasing the number of units sold. Today, we have over a billion cell phones in the market. There are tens of millions of camera phones. The incredible number of these devices has continued to lower the unit cost of computing as well as of components embedded in them, such as small cameras. High-end phones have the computing power of the personal computers of the 80&amp;#8217;s and the game consoles of the 90&amp;#8217;s.&lt;/p&gt; &lt;p&gt;&lt;strong&gt;History repeats with WiFi&lt;/strong&gt;&lt;/p&gt; &lt;p&gt;There are parallels in the history of communications and computing. In the 1980&amp;#8217;s the technology of packet switched networks became widely deployed. Two standards competed. X.25 was a packet switched network technology being promoted by CCITT (a large, formal international standards body) and the telephone companies. It involved a system run by telephone companies including metered tariffs and multiple bilateral agreements between carriers to hook up.&lt;/p&gt; &lt;p&gt;Concurrently, universities and research labs were promoting TCP/IP and the Internet, which was governed by loosely organized standards meetings and operated with flat-rate tariffs and little or no agreement between the carriers. 
People just connected to the closest node and everyone agreed to freely carry traffic for others.&lt;/p&gt; &lt;p&gt;There were several &amp;#8220;free Internet&amp;#8221; services such as &amp;#8220;The Little Garden&amp;#8221; in San Francisco. Commercial service providers, particularly the telephone company operators such as SprintNet, tried to shut down such free services by threatening not to carry this free traffic.&lt;/p&gt; &lt;p&gt;Eventually, large ISPs began providing high quality Internet connectivity and finally the telephone companies realized that the Internet was the dominant standard and shut down or acquired the ISPs.&lt;/p&gt; &lt;p&gt;A similar trend is happening in wireless data services. GPRS is currently the dominant technology among mobile telephone carriers. GPRS allows users to transmit packets of data across the carrier network to the Internet. One can roam to other networks as long as the mobile operators have agreements with each other. Just like in the days of X.25, the system requires many bilateral agreements between the carriers; their goal is to track and bill for each packet of information.&lt;/p&gt; &lt;p&gt;Competing with this standard is WiFi. WiFi is just a simple wireless extension to the current Internet and many hotspots provide people with free access to the Internet in cafes and other public areas. WiFi service providers have emerged, while telephone operators &amp;#8211; such as T-Mobile and Vodafone &amp;#8211; are capitalizing on paid WiFi services. Just as with the Internet, network operators are threatening to shut down free WiFi providers, citing a violation of terms of service. &lt;/p&gt; &lt;p&gt;Just as with X.25, the GPRS data network and the future data networks planned by the telephone carriers (e.g. 
3G) are crippled with unwieldy standards bodies, bilateral agreements, and inherently complicated and expensive plant operations.&lt;/p&gt; &lt;p&gt;It is clear that the simplicity of WiFi and the Internet is more efficient than the networks planned by the telephone companies. That said, the availability of low cost phones is controlled by mobile telephone carriers, their distribution networks and their subsidies.&lt;/p&gt; &lt;p&gt;&lt;strong&gt;Content vs Context&lt;/strong&gt;&lt;/p&gt; &lt;p&gt;Many of the mobile telephone carriers are hoping that users will purchase branded content manufactured in Hollywood and packaged and distributed by the telephone companies using sophisticated technology to thwart copying.&lt;/p&gt; &lt;p&gt;Broadband in the home will always be cheaper than mobile broadband. Therefore it will be cheaper for people to download content at home and use storage devices to carry it with them rather than downloading or viewing content over a mobile phone network. Most entertainment content is not so time sensitive that it requires real time network access.&lt;/p&gt; &lt;p&gt;The mobile carriers are making the same mistake that many of the network service providers made in the 80s. Consider Delphi, a joint venture between IBM and Sears Roebuck. Delphi assumed that branded content was going to be the main use of their system and designed the architecture of the network to provide users with such content. Instead, the users ended up primarily using email and communications, and the system failed to provide such services effectively due to the mis-design.&lt;/p&gt; &lt;p&gt;Similarly, it is clear that mobile computing is about communication. Not only are mobile phones being used for 1-1 communications, as expected through voice conversations; people are learning new forms of communication because of SMS, email and presence technologies. 
Often, the value of these communication processes is the transmission of &amp;#8220;state&amp;#8221; or &amp;#8220;context&amp;#8221; information; the content of the messages is less important.&lt;/p&gt; &lt;p&gt;&lt;strong&gt;Copyright and the Creative Commons&lt;/strong&gt;&lt;/p&gt; &lt;p&gt;In addition to the constant flow of traffic keeping groups of people in touch with each other, significant changes are emerging in multimedia creation and sharing. The low cost of cameras and the nearly television studio quality capability of personal computers have caused an explosion in the amount and quality of content being created by amateurs. Not only is this content easier to develop, people are using the power of weblogs and phones to distribute their creations to others. &lt;/p&gt; &lt;p&gt;The network providers and many of the hardware providers are trying to build systems that make it difficult for users to share and manipulate multimedia content. Such regulation drastically stifles the users&amp;#8217; ability to produce, share and communicate. This is particularly surprising given that such activities are considered the primary &amp;#8220;killer application&amp;#8221; for networks.&lt;/p&gt; &lt;p&gt;It may seem unintuitive to argue that packaged commercial content can co-exist alongside consumer content while concurrently stimulating content creation and sharing. In order to understand how this can work, it is crucial to understand how the current system of copyright is broken and can be fixed.&lt;/p&gt; &lt;p&gt;First of all, copyright in the multimedia digital age is inherently broken. Historically, copyright worked because it was difficult to copy or edit works and because only a few people produced new works over a very long period of time. Today, technology allows us to find, sample, edit and share very quickly. 
The problem is that the current notion of copyright is not capable of addressing the complexity and the speed of what technology enables artists to create. Large copyright holders, notably Hollywood studios, have aggressively extended and strengthened their copyright protections to try to keep the ability to produce and distribute creative works in the realm of large corporations.&lt;/p&gt; &lt;p&gt;Hollywood asserts &amp;#8220;all rights reserved&amp;#8221; on works that they own. Sampling music, having a TV show running in the background in a movie scene or quoting lyrics to a song in a book about the history of music all require payment to and a negotiation with the copyright holder. Even though the Internet makes available a wide palette of wonderful works based on content from all over the world, the current copyright practices forbid most of such creation.&lt;/p&gt; &lt;p&gt;However, most artists are happy to have their music sampled if they receive attribution. Most writers are happy to be quoted or have their books copied for non-commercial use. Most creators of content realize that all content builds on the past and the ability for people to build on what one has created is a natural and extremely important part of the creative process.&lt;/p&gt; &lt;p&gt;Creative Commons tries to give artists that choice. By providing a more flexible copyright than the standard &amp;#8220;all rights reserved&amp;#8221; copyright of commercial content providers, Creative Commons allows artists to set a variety of rights to their works. This includes the ability to reuse for commercial use, copy, sample, require attribution, etc. Such an approach allows artists to decide how their work can be used, while providing people with the materials necessary for increased creation and sharing. &lt;/p&gt; &lt;p&gt;Creative Commons also provides for a way to make the copyright of pieces of content machine-readable. 
This means that a search engine or other tool to manipulate content is able to read the copyright. As such, an artist can search for songs, images and text to use while having the information to provide the necessary attribution.&lt;/p&gt; &lt;p&gt;Creative Commons can co-exist with the stringent copyright regimes of the Hollywood studios while allowing professional and amateur artists to take more control of how much they want their works to be shared and integrated into the commons. Until copyright law itself is fundamentally changed, the Creative Commons will be an essential tool that provides an alternative to the completely inflexible copyright of commercial content. &lt;/p&gt; &lt;p&gt;Content is not like some lump of gold to be hoarded and owned, which diminishes in value each time it is shared. Content is a foundation upon which community and relationships are formed. Content is the foundation for culture. We must evolve beyond the current copyright regime that was developed in a world where the creation and transmission of content was unwieldy and expensive, reserved to those privileged artists who were funded by commercial enterprises. This will provide the emerging wireless networks and mobile devices with the freedom necessary for them to become the community building tools of sharing that is their destiny.&lt;br /&gt;&lt;/p&gt; &lt;div align=&quot;right&quot;&gt;[via &lt;a href=&quot;http://joi.ito.com/&quot;&gt;Joi Ito&amp;#39;s Web&lt;/a&gt;]&lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Enterprise Databases get a grip on XML</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2004-01-06#442</atom:id>
  <atom:published>2004-01-06T23:17:07Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;blockquote style=&quot;margin-right: 0px;&quot; dir=&quot;ltr&quot;&gt; &lt;p&gt;&lt;a class=&quot;listLinkLrg&quot; title=&quot;http://newsletter.infoworld.com/t?ctl=4FEDB6:1F3948D&quot; href=&quot;http://newsletter.infoworld.com/t?ctl=4FEDB6:1F3948D&quot; target=&quot;_new&quot;&gt;&lt;strong&gt;&lt;font face=&quot;Verdana&quot;&gt;Databases get a grip on XML&lt;/font&gt;&lt;/strong&gt;&lt;/a&gt;&lt;br /&gt;&lt;font size=&quot;2&quot;&gt;&lt;/font&gt;&lt;font face=&quot;Verdana&quot;&gt;From &lt;a href=&quot;http://newsletter.infoworld.com/t?ctl=4FEDB6:1F3948D&quot;&gt;InfoWorld&lt;/a&gt;.&lt;/font&gt;&lt;br /&gt;&lt;br /&gt;&lt;font face=&quot;Verdana,Geneva,Arial,sans-serif&quot; size=&quot;2&quot;&gt;The next iteration of the SQL standard was supposed to arrive in 2003. But SQL standardization has always been a glacially slow process, so nobody should be surprised that SQL:2003, now known as SQL:200n, isn&amp;#39;t ready yet. Even so, 2003 was a year in which XML-oriented data management, one of the areas addressed by the forthcoming standard, showed up on more and more developers&amp;#39; radar screens. &lt;a title=&quot;http://newsletter.infoworld.com/t?ctl=4FEDB6:1F3948D&quot; href=&quot;http://newsletter.infoworld.com/t?ctl=4FEDB6:1F3948D&quot; target=&quot;_blank&quot;&gt;&amp;gt;&amp;gt; READ MORE&lt;/a&gt;&lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt; &lt;p dir=&quot;ltr&quot;&gt;&lt;font face=&quot;Verdana&quot; size=&quot;2&quot;&gt;This article rounds up the products of 2003 in the critical area of Enterprise Database Technology. It certainly provides an apt reflection of how Virtuoso compares with offerings from some of the larger (but certainly slower to implement) database vendors in this space. 
As usual, Jon Udell&amp;#39;s quote pretty much sums this up:&lt;/font&gt;&lt;/p&gt; &lt;blockquote style=&quot;margin-right: 0px;&quot; dir=&quot;ltr&quot;&gt; &lt;p dir=&quot;ltr&quot;&gt;&lt;!--StartFragment --&gt;&lt;span class=&quot;artText&quot;&gt;&lt;em&gt;&amp;quot;While the spotlight shone on the heavyweight contenders, a couple of agile innovators made noteworthy advances in 2003. &lt;/em&gt;&lt;a class=&quot;regularArticleU&quot; href=&quot;http://www.infoworld.com/699&quot;&gt;&lt;em&gt;OpenLink Software&amp;#39;s Virtuoso 3.0&lt;/em&gt;&lt;/a&gt;&lt;em&gt;, which we reviewed in March, stole thunder from all three major players. Like Oracle, it offers a WebDAV-accessible XML repository. Like DB2 Information Integrator, it functions as database middleware that can perform federated &amp;#39;joins&amp;#39; across SQL and XML sources. And like the forthcoming Yukon, it embeds the .Net CLR (Common Language Runtime), or in the case of Linux, Novell/Ximian&amp;#39;s Mono.&amp;quot;&lt;/em&gt;&lt;/span&gt; &lt;/p&gt;&lt;/blockquote&gt; &lt;p dir=&quot;ltr&quot;&gt;&lt;font face=&quot;Verdana&quot; size=&quot;2&quot;&gt;Although still somewhat unknown to the broader industry, we have remained true to our &amp;quot;innovator&amp;quot; discipline, which remains our chosen path to market leadership. 
Thus, it&amp;#39;s worth a quick recap of Virtuoso&amp;#39;s release history and features as we get set to up the ante even further in 2004:&lt;/font&gt;&lt;/p&gt; &lt;p dir=&quot;ltr&quot;&gt;&lt;font face=&quot;Verdana&quot; size=&quot;2&quot;&gt;&lt;a href=&quot;http://www.openlinksw.com/press/virtuoso.htm&quot;&gt;1998 - Virtuoso&amp;#39;s initial public beta&lt;/a&gt; release with functional emphasis on Virtual Database Engine for ODBC and JDBC Data Sources.&lt;/font&gt;&lt;/p&gt; &lt;p dir=&quot;ltr&quot;&gt;&lt;font face=&quot;Verdana&quot; size=&quot;2&quot;&gt;&lt;a href=&quot;http://www.openlinksw.com/press/virtuoso1.htm&quot;&gt;1999 - Virtuoso&amp;#39;s official commercial&lt;/a&gt; release, with emphasis still on Virtual Database functionality for ODBC, JDBC accessible SQL Databases.&lt;/font&gt;&lt;/p&gt; &lt;p dir=&quot;ltr&quot;&gt;&lt;font face=&quot;Verdana&quot; size=&quot;2&quot;&gt;&lt;a href=&quot;http://www.openlinksw.com/press/v2releas.htm&quot;&gt;2000 - Virtuoso 2.0&lt;/a&gt; adds XML Storage, XPath, XML Schema, XQuery, XSL-T, WebDAV, SOAP, UDDI, HTTP, Replication, Free Text Indexing (*feature update*), POP3, and NNTP support.&lt;/font&gt;&lt;/p&gt; &lt;p dir=&quot;ltr&quot;&gt;&lt;font face=&quot;Verdana&quot; size=&quot;2&quot;&gt;&lt;a href=&quot;http://www.openlinksw.com/press/v27releas.htm&quot;&gt;2002 - Virtuoso 2.7&lt;/a&gt; extends Virtualization prowess beyond data access via enhancements to its Web Services protocol stack implementation by enabling SQL Stored Procedures to be published as Web Services. 
It also debuts its Object-Relational engine enhancements that include the incorporation of Java and Microsoft .NET Objects into its User Defined Type, User Defined Functions, and Stored Procedure offerings.&lt;/font&gt;&lt;/p&gt; &lt;p dir=&quot;ltr&quot;&gt;&lt;font face=&quot;Verdana&quot; size=&quot;2&quot;&gt;&lt;a href=&quot;http://www.openlinksw.com/press/virt3beta.htm&quot;&gt;2003 - Virtuoso 3.0&lt;/a&gt; extends data and application logic virtualization into the Application Server realm (basically a Virtual Application server too!), by adding support for ASP.NET, PHP, Java Server Pages runtime hosting (making applications built using any of these languages deployable using Virtuoso across all supported platforms).&lt;/font&gt;&lt;/p&gt; &lt;p dir=&quot;ltr&quot;&gt;&lt;font face=&quot;Verdana&quot; size=&quot;2&quot;&gt;Collectively, these releases have contributed to a very premeditated architecture and vision that will ultimately unveil the inherent power of critical IS infrastructure virtualization along the following lines: data storage, data access, and application logic via coherent integration of SQL, XML, Web Services, and Persistent Stored Modules (.NET, Java, and other object based component building blocks).&lt;/font&gt;&lt;/p&gt; &lt;p dir=&quot;ltr&quot;&gt;&lt;font face=&quot;Verdana&quot;&gt;&lt;/font&gt;&amp;nbsp;&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Borland&#39;s Early Years: A Wild Ride</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2003-11-05#415</atom:id>
  <atom:published>2003-11-05T17:31:38Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;A href=&quot;http://weblogs.asp.net/ssivakumar/posts/35896.aspx&quot;&gt;Borland&#39;s Early Years: A Wild Ride&lt;/A&gt; &lt;FONT face=Verdana size=2&gt;A nice article about history of Borland &lt;/FONT&gt;&lt;A href=&quot;http://www.eweek.com/article2/0,4149,1370757,00.asp&quot;&gt;&lt;FONT face=Verdana size=2&gt;http://www.eweek.com/article2/0,4149,1370757,00.asp&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Verdana size=2&gt; &lt;/FONT&gt; &lt;DIV align=right&gt;[via &lt;A href=&quot;http://weblogs.asp.net/&quot;&gt;WebLogs @ ASP.NET&lt;/A&gt;]&lt;/DIV&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>The Nigerian SCO Connection</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2003-10-01#888</atom:id>
  <atom:published>2003-10-01T22:44:03Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;I am a Nigerian reminiscing as my country turns 43 today (as a post-colonial independent nation). &lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&lt;/p&gt; &lt;p&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;October the 1st is an emotional day for many Nigerians, especially those of us in the Diaspora. Our country remains a paradox as the excerpts below attest:&lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&lt;/p&gt; &lt;p&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;The more popular view of Nigerians, a result of the proliferation of 419 scams (the mangled by-product of misdirected intellectual prowess and the boundless depths of greed -- which applies to perpetrators and victims alike):&lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&lt;/p&gt; &lt;p style=&quot;margin-left: 0.5in;&quot;&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;&lt;a href=&quot;http://www.beblogging.com/blog/20031001-214515&quot;&gt;&lt;span style=&quot;font-size: 12pt; font-family: &#39;Times New Roman&#39;;&quot;&gt;&lt;font face=&quot;Arial&quot; size=&quot;2&quot;&gt;The Nigerian SCO Connection&lt;/font&gt;&lt;/span&gt;&lt;/a&gt; &amp;quot;I AM MR. DARL MCBRIDE CURRENTLY SERVING AS THE PRESIDENT AND CHIEF EXECUTIVE OFFICER OF THE SCO GROUP ...&amp;quot; [via &lt;a href=&quot;http://www.beblogging.com/blog/&quot;&gt;&lt;span style=&quot;font-size: 12pt; font-family: &#39;Times New Roman&#39;;&quot;&gt;&lt;font face=&quot;Arial&quot; size=&quot;2&quot;&gt;Be Blogging&lt;/font&gt;&lt;/span&gt;&lt;/a&gt;]&lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&lt;/p&gt; &lt;p&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;Funny! 
But many a truth is told in jest (I think that&amp;#39;s how the quote goes); this one is pretty damned poignant. &lt;/span&gt;&lt;/p&gt; &lt;p&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;Unbeknownst to many, there are other views of Nigeria (unfortunately these aren&amp;#39;t the norm).&lt;/span&gt;&lt;/p&gt; &lt;p class=&quot;MsoNormal&quot; style=&quot;margin: 0in 0in 0pt;&quot;&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;The call for optimism by our president (he doesn&amp;#39;t support or condone the 419 nonsense):&lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&lt;/p&gt; &lt;p class=&quot;MsoNormal&quot; style=&quot;margin: 0in 0in 0pt 0.5in;&quot;&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;&lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&amp;nbsp;&lt;/p&gt; &lt;p class=&quot;MsoNormal&quot; style=&quot;margin: 0in 0in 0pt 0.5in;&quot;&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;President Olusegun Obasanjo urged Nigerians &lt;a href=&quot;http://odili.net/news/source/2003/oct/1/40.html&quot;&gt;to change their ways and be optimistic about the future&lt;/a&gt; as &lt;country-region xmlns=&quot;st1&quot; xmlns:n0=&quot;w&quot; n0:st=&quot;on&quot;&gt;Nigeria&lt;/country-region&gt; marks its 43rd &lt;place xmlns=&quot;st1&quot; xmlns:n0=&quot;w&quot; n0:st=&quot;on&quot;&gt;&lt;city n0:st=&quot;on&quot;&gt;Independence&lt;/city&gt;&lt;/place&gt; anniversary. Read on &lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&lt;/p&gt; &lt;p class=&quot;MsoNormal&quot; style=&quot;margin: 0in 0in 0pt 0.5in;&quot;&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;[via &lt;a href=&quot;http://nigeriaworld.com/&quot;&gt;&lt;span style=&quot;font-size: 12pt; font-family: &#39;Times New Roman&#39;;&quot;&gt;&lt;font face=&quot;Arial&quot; size=&quot;2&quot;&gt;Odili.net &lt;/font&gt;&lt;/span&gt;&lt;/a&gt;- this site desperately needs 
RSS!]&lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&lt;/p&gt; &lt;p class=&quot;MsoNormal&quot; style=&quot;margin: 0in 0in 0pt;&quot;&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;&lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&amp;nbsp;&lt;/p&gt; &lt;p class=&quot;MsoNormal&quot; style=&quot;margin: 0in 0in 0pt;&quot;&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;There is an increasing pool of key high-tech players of Nigerian descent (and nationality) making a constructive impact on the high-tech industry (making it &lt;span style=&quot;font-size: 12pt; font-family: &#39;Times New Roman&#39;;&quot;&gt;&lt;font face=&quot;Arial&quot; size=&quot;2&quot;&gt;&lt;a href=&quot;http://www.infoworld.com/article/03/05/23/21FEinnovidehen_1.html&quot;&gt;less lonely for myself&lt;/a&gt; &lt;/font&gt;&lt;/span&gt;and other Nigerians in the high-tech arena):&lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&lt;/p&gt; &lt;p class=&quot;MsoNormal&quot; style=&quot;margin: 0in 0in 0pt;&quot;&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;&lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&amp;nbsp;&lt;/p&gt; &lt;p class=&quot;MsoNormal&quot; style=&quot;margin: 0in 0in 0pt 0.5in;&quot;&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;&lt;a href=&quot;http://www.kuro5hin.org/user/Carnage4Life/diary&quot;&gt;&lt;span style=&quot;font-size: 12pt; font-family: &#39;Times New Roman&#39;;&quot;&gt;&lt;font face=&quot;Arial&quot; size=&quot;2&quot;&gt;Dare Obasanjo&lt;/font&gt;&lt;/span&gt;&lt;/a&gt; is a member of Microsoft&amp;#39;s WebData team, which among other things develops the components within the System.Xml and System.Data namespaces of the .NET Framework, Microsoft XML Core Services (MSXML), and Microsoft Data Access Components (MDAC). 
More of Dare&amp;#39;s writings on XML can be found on his &lt;a href=&quot;http://msdn.microsoft.com/voices/xml.asp&quot;&gt;&lt;span style=&quot;font-size: 12pt; font-family: &#39;Times New Roman&#39;;&quot;&gt;&lt;font face=&quot;Arial&quot; size=&quot;2&quot;&gt;Extreme XML column&lt;/font&gt;&lt;/span&gt;&lt;/a&gt; on MSDN. &lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&lt;/p&gt; &lt;p class=&quot;MsoNormal&quot; style=&quot;margin: 0in 0in 0pt;&quot;&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&lt;/p&gt; &lt;p class=&quot;MsoNormal&quot; style=&quot;margin: 0in 0in 0pt 0.5in;&quot;&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;&lt;a href=&quot;http://uche.ogbuji.net/uche.ogbuji.net/caramusis/&quot;&gt;&lt;!--StartFragment --&gt;&lt;span style=&quot;font-size: 12pt; font-family: &#39;Times New Roman&#39;;&quot;&gt;&lt;font face=&quot;Arial&quot; size=&quot;2&quot;&gt;Uche Ogbuji&lt;/font&gt;&lt;/span&gt;&lt;/a&gt; is a consultant and co-founder of Fourthought Inc., a consulting firm specializing in XML solutions for enterprise knowledge management applications. Fourthought develops 4Suite, the open source platform for XML middleware. Mr. Ogbuji is a Computer Engineer and writer born in &lt;country-region xmlns=&quot;st1&quot; xmlns:n0=&quot;w&quot; n0:st=&quot;on&quot;&gt;Nigeria&lt;/country-region&gt;, living and working in &lt;place xmlns=&quot;st1&quot; xmlns:n0=&quot;w&quot; n0:st=&quot;on&quot;&gt;&lt;city n0:st=&quot;on&quot;&gt;Boulder&lt;/city&gt;, &lt;state n0:st=&quot;on&quot;&gt;Colorado&lt;/state&gt;, &lt;country-region n0:st=&quot;on&quot;&gt;USA&lt;/country-region&gt;&lt;/place&gt;. 
&lt;br /&gt;&lt;b&gt;Website&lt;/b&gt;: &lt;a href=&quot;http://www.fourthought.com/&quot;&gt;&lt;span style=&quot;font-size: 12pt; font-family: &#39;Times New Roman&#39;;&quot;&gt;&lt;font face=&quot;Arial&quot; size=&quot;2&quot;&gt;http://www.fourthought.com/&lt;/font&gt;&lt;/span&gt;&lt;/a&gt; &lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&lt;/p&gt; &lt;p class=&quot;MsoNormal&quot; style=&quot;margin: 0in 0in 0pt;&quot;&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&lt;/p&gt; &lt;p class=&quot;MsoNormal&quot; style=&quot;margin: 0in 0in 0pt 0.5in;&quot;&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;&lt;a href=&quot;http://www.emeagwali.com/index.shtml&quot;&gt;&lt;!--StartFragment --&gt;&lt;!--StartFragment --&gt;&lt;span style=&quot;font-size: 12pt; font-family: &#39;Times New Roman&#39;;&quot;&gt;&lt;font face=&quot;Arial&quot; size=&quot;2&quot;&gt;Philip Emeagwali&lt;/font&gt;&lt;/span&gt;&lt;/a&gt;, a computer scientist, is &amp;quot;one of the &lt;a href=&quot;http://emeagwali.com/history/internet/index.html&quot; _base_href=&quot;http://radioafrica.biz&quot;&gt;&lt;span style=&quot;font-size: 12pt; font-family: &#39;Times New Roman&#39;;&quot;&gt;&lt;font face=&quot;Arial&quot; size=&quot;2&quot;&gt;fathers of the Internet&lt;/font&gt;&lt;/span&gt;&lt;/a&gt; and a &lt;a href=&quot;http://www.emeagwali.com/printed-articles/upstream/natures-own-numbers-man_upstream_january-27-1997.html&quot; target=&quot;new&quot; _base_href=&quot;http://radioafrica.biz&quot;&gt;&lt;span style=&quot;font-size: 12pt; font-family: &#39;Times New Roman&#39;;&quot;&gt;&lt;font face=&quot;Arial&quot; size=&quot;2&quot;&gt;trailblazer in petroleum extraction&lt;/font&gt;&lt;/span&gt;&lt;/a&gt;,&amp;quot; as quoted by &lt;a href=&quot;http://fyi.cnn.com/fyi/interactive/specials/bhm/story/black.innovators.html&quot; target=&quot;new&quot; 
_base_href=&quot;http://radioafrica.biz&quot;&gt;&lt;i&gt;&lt;span style=&quot;font-size: 12pt; font-family: &#39;Times New Roman&#39;;&quot;&gt;&lt;font face=&quot;Arial&quot; size=&quot;2&quot;&gt;CNN&lt;/font&gt;&lt;/span&gt;&lt;/i&gt;&lt;/a&gt;. &lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&lt;/p&gt; &lt;p class=&quot;MsoNormal&quot; style=&quot;margin: 0in 0in 0pt;&quot;&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;&lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&amp;nbsp;&lt;/p&gt; &lt;p class=&quot;MsoNormal&quot; style=&quot;margin: 0in 0in 0pt;&quot;&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;Philip leaves all Nigerians with this &lt;a href=&quot;http://emeagwali.com/speeches/nigeria/43rd-independence-anniversary-message/index.html&quot;&gt;important message&lt;/a&gt; on this special day (key excerpt below):&lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&lt;/p&gt; &lt;p style=&quot;margin-left: 0.5in;&quot;&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;&amp;quot;Our investments in education and technology will be our legacy to our children. They are investments that will bring the best out of the next generation of Nigerians and enable us to reach our potential as individuals, as communities, as a nation.&amp;quot; &lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&lt;/p&gt; &lt;p class=&quot;MsoNormal&quot; style=&quot;margin: 0in 0in 0pt;&quot;&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;Happy Birthday dear motherland!&lt;/span&gt;&lt;/p&gt; &lt;a href=&quot;index.vspx?tag=Africa&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;Africa&lt;/a&gt;&lt;a href=&quot;index.vspx?tag=Nigeria&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;Nigeria&lt;/a&gt;&lt;a href=&quot;index.vspx?tag=xml&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;xml&lt;/a&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>The Future of Weblogging</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2003-09-18#250</atom:id>
  <atom:published>2003-09-19T02:21:02Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;P&gt;Nico Macdonald: &lt;A href=&quot;http://www.spy.co.uk/Articles/Spiked/Weblogging/&quot;&gt;The Future of Weblogging&lt;/A&gt;. &lt;/P&gt; &lt;DIV class=SpyStandfirst&gt;Nico Macdonald puts Weblogging in the context of the history of online publishing, explaining its novelty and value, and indicating where it needs to innovate. He concludes with a proposal encouraging publishers to properly embrace the Weblogging model. &lt;/DIV&gt; &lt;DIV class=SpyStandfirst&gt;&amp;nbsp;&lt;/DIV&gt; &lt;DIV class=SpyStandfirst&gt;&lt;A href=&quot;http://www.spy.co.uk/Articles/Spiked/Weblogging/&quot;&gt;More.&lt;/A&gt;&lt;/DIV&gt; &lt;DIV align=right&gt;[via &lt;A href=&quot;http://www.scripting.com/&quot;&gt;Scripting News&lt;/A&gt;] &lt;DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>RSS: INJAN (It&#39;s not just about news)</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2003-08-21#241</atom:id>
  <atom:published>2003-08-21T15:41:25Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;When Virtuoso first unleashed support for XML (in-built XSL, Native XML Storage, Validating XML Parser, XPath, and XQuery) the core message was the delivery of a single server solution that would address the challenges of creating XML data.&lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&lt;/p&gt; &lt;p&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;In the year 2000 the question of the shape and form of XML data was unclear to many, and reading the article below basically took me back in time to when we released &lt;a href=&quot;http://www.it-director.com/article.php?articleid=916&quot;&gt;Virtuoso 2.0&lt;/a&gt; (we are now at &lt;a href=&quot;http://www.openlinksw.com/virtuoso&quot;&gt;release 3.0&lt;/a&gt; commercially with a &lt;a href=&quot;http://www.openlinksw.com/press/virt32_wwdc1.htm&quot;&gt;3.2 beta &lt;/a&gt;dropping any minute).&lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&lt;/p&gt; &lt;p&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;RSS is a great XML application, and it does a great job of demonstrating how XML --the new data access foundation layer-- will galvanize the next generation Web (I refer to this as Web 2.0). 
&lt;/span&gt;&lt;/p&gt; &lt;blockquote dir=&quot;ltr&quot; style=&quot;margin-right: 0px;&quot;&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt; &lt;p&gt;&lt;a href=&quot;http://jena.hpl.hp.com:3030/blojsom-hp/blog/technologies/blogging/metadata/?permalink=1214847A10C1966396472E816A7A4243.textile&quot;&gt;RSS: INJAN (It&amp;#39;s not just about news)&lt;/a&gt; &lt;/p&gt; &lt;p&gt;&lt;span class=&quot;caps&quot;&gt;RSS&lt;/span&gt; is not just about news, according to &lt;a href=&quot;http://groups.yahoo.com/group/rss-dev/message/5764&quot;&gt;Ian Davis on rss-dev&lt;/a&gt;.&lt;br /&gt;He presents a nice list of alternatives, which I reproduce here (and to which I&amp;#39;d add, of course, bibliography management)&lt;/p&gt; &lt;ul&gt; &lt;li&gt;Sitemaps: one of the S&amp;#39;s in &lt;span class=&quot;caps&quot;&gt;RSS&lt;/span&gt; stands for summary. A sitemap is a summary of the content on a site, the items are pages or content areas. This is clearly a non-chronological ordering of items. Is a hierarchy of &lt;span class=&quot;caps&quot;&gt;RSS&lt;/span&gt; sitemaps implied here - how would the linking between them work? How hard would it be to hack a web browser to pick up the &lt;span class=&quot;caps&quot;&gt;RSS&lt;/span&gt; sitemap and display it in a sidebar when you visit the site?&lt;/li&gt; &lt;li&gt;Small ads: also known as classifieds. These expire so there&amp;#39;s some kind of dynamic going on here but the ordering of items isn&amp;#39;t necessarily chronological. How to describe the location of the seller, or the condition of the item or even the price. Not every ad is selling something - perhaps it&amp;#39;s to rent out a room.&lt;/li&gt; &lt;li&gt;Personals: similar model to the small ads. No prices though (I hope). Comes with a ready made vocabulary of terms that could be converted to an &lt;span class=&quot;caps&quot;&gt;RDF&lt;/span&gt; schema. 
Probably should do that just for the hell of it anyway - gsoh&lt;/li&gt; &lt;li&gt;Weather reports: how about a week&amp;#39;s worth of weather in an &lt;span class=&quot;caps&quot;&gt;RSS&lt;/span&gt; channel. If an item is dated in the future, should an aggregator display it before time? Alternate representations include maps of temperature and pressure etc.&lt;/li&gt; &lt;li&gt;Auctions: again, related to small ads, but these are much more time limited since there is a hard cutoff after which the auction is closed. The sequence of bids could be interesting - would it make sense to thread them like a discussion so you can see the tactics?&lt;/li&gt; &lt;li&gt;TV listings: this is definitely chronological but with a twist - the items have durations. They also have other metadata such as cast lists, classification ratings, widescreen, stereo, program type. Some types have additional information such as director and production year.&lt;/li&gt; &lt;li&gt;Top ten listings: top ten singles, books, dvds, richest people, ugliest, rear of the year etc. Not chronological, but has definite order. May update from day to day or even more often.&lt;/li&gt; &lt;li&gt;Sales reporting: imagine if every department of a company reported their sales figures via &lt;span class=&quot;caps&quot;&gt;RSS&lt;/span&gt;. Then the divisions aggregate the departmental figures and republish to the regional offices, who aggregate and add value up the chain. The chairman of the company subscribes to one super-aggregate feed.&lt;/li&gt; &lt;li&gt;Membership lists / buddy lists: could I publish my buddy list from Jabber or other instant messengers? Maybe as an interchange format or perhaps could be used to look for shared contacts. Lots of potential overlap with &lt;span class=&quot;caps&quot;&gt;FOAF&lt;/span&gt; here.&lt;/li&gt; &lt;li&gt;Mailing lists: or in fact any messaging system such as usenet. There are some efforts at doing this already (e.g. 
yahoogroups) but we need more information - threads; references; headers; links into archives.&lt;/li&gt; &lt;li&gt;Price lists / inventory: the items here are products or services. No particular ordering but it&amp;#39;d be nice to be able to subscribe to a catalog of products and prices from a company. The aggregator should be able to pick out price rises or bargains given enough history.&lt;/li&gt; &lt;div align=&quot;right&quot;&gt;[via &lt;a href=&quot;http://jena.hpl.hp.com:3030/blojsom-hp/blog/&quot;&gt;Semantic Blogging Demonstrator&lt;/a&gt;] &lt;/div&gt;&lt;/ul&gt;&lt;/span&gt;&lt;/blockquote&gt; &lt;p&gt;&lt;span style=&quot;font-size: 10pt; font-family: Arial;&quot;&gt;Thus, if we can comprehend RSS (the blog article below does a great job) we should be able to see the fundamental challenges that are before any organization seeking to exploit the potential of the imminent Web 2.0 inflection; how will you cost-effectively create XML data from existing data sources? Without upgrading or switching database engines, operating systems, programming languages? Put differently how can you exploit this phenomenon without losing your ever dwindling technology choices (believe me choices are dwindling fast but most are oblivious to this fact).&lt;/span&gt;&lt;/p&gt;&lt;p xmlns=&quot;o&quot;&gt;&lt;/p&gt; &lt;p&gt;&amp;nbsp;&lt;/p&gt; &lt;a href=&quot;index.vspx?tag=xml&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;xml&lt;/a&gt;&lt;a href=&quot;index.vspx?tag=rss&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;rss&lt;/a&gt;&lt;a href=&quot;index.vspx?tag=syndication&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;syndication&lt;/a&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Interesting Database History: INFORMIX</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2003-06-26#192</atom:id>
  <atom:published>2003-06-26T23:45:45Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;!--StartFragment --&gt;&lt;A href=&quot;http://www.wikipedia.org/wiki/Informix&quot;&gt;Interesting Database History: INFORMIX&lt;/A&gt; &lt;P class=subtitle&gt;From &lt;A href=&quot;http://www.wikipedia.org/&quot;&gt;Wikipedia&lt;/A&gt;, the free encyclopedia. &lt;/P&gt; &lt;P&gt;&lt;STRONG&gt;Informix&lt;/STRONG&gt; is a &lt;A class=internal title=&quot;Relational database&quot; href=&quot;http://www.wikipedia.org/wiki/Relational_database&quot;&gt;relational database&lt;/A&gt; and for almost 20 years was also the name of the company who developed it. Informix DBMS was a development of the pioneering &lt;A class=internal title=Ingres href=&quot;http://www.wikipedia.org/wiki/Ingres&quot;&gt;Ingres&lt;/A&gt; system that also led to &lt;A class=internal title=Sybase href=&quot;http://www.wikipedia.org/wiki/Sybase&quot;&gt;Sybase&lt;/A&gt; and &lt;A class=internal title=&quot;SQL Server&quot; href=&quot;http://www.wikipedia.org/wiki/SQL_Server&quot;&gt;SQL Server&lt;/A&gt;, and was the #2 database system behind &lt;A class=internal title=Oracle href=&quot;http://www.wikipedia.org/wiki/Oracle&quot;&gt;Oracle&lt;/A&gt; for some time in the 1990s. Their brush with success was surprisingly short-lived however, and by 2000 a series of management blunders had all but destroyed the company. In &lt;A class=internal title=2001 href=&quot;http://www.wikipedia.org/wiki/2001&quot;&gt;2001&lt;/A&gt; they were purchased by &lt;A class=internal title=IBM href=&quot;http://www.wikipedia.org/wiki/IBM&quot;&gt;IBM&lt;/A&gt; in order to gain access to Informix&#39;s existing market share and customer base. Long term plans to merge Informix technology with &lt;A class=internal title=DB2 href=&quot;http://www.wikipedia.org/wiki/DB2&quot;&gt;DB2&lt;/A&gt; are in place, since the Informix Arrowhead project is now called DB2 Arrowhead. 
&lt;A class=internal title=IBM href=&quot;http://www.wikipedia.org/wiki/IBM&quot;&gt;IBM&lt;/A&gt; is also committed to supporting older versions. &lt;/P&gt; &lt;P&gt;&lt;A href=&quot;http://www.wikipedia.org/wiki/Informix&quot;&gt;Read on.&lt;/A&gt;&lt;/P&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>How Databases Changed The World</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2003-06-09#113</atom:id>
  <atom:published>2003-06-09T09:28:17Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;a href=&quot;http://searchdatabase.techtarget.com/bestWebLinks/0,289521,sid13_tax281575,00.html&quot;&gt;&lt;b&gt;How Databases Changed The World&lt;/b&gt;&lt;/a&gt; by Tim DiChiara, Site Editor (&lt;a href=&quot;http://www.searchdatabase.com&quot;&gt;SearchDatabase.com&lt;/a&gt;) How did the database industry get started? How has it changed the face of business? What were the key milestones, the big obstacles and the lessons learned? I recently came across an interesting panel discussion addressing these very issues, featuring many of the database pioneers and leaders of the last 30 years: Chris Date, Herb Edelstein, Bob Epstein, Ken Jacobs, Pat Selinger, Roger Sippl and Michael Stonebraker. It&#39;s available via streaming &lt;a href=&quot;http://www.computerhistory.org/events/lectures/db_02102003/&quot;&gt;video&lt;/a&gt; and was recorded in February at the Computer History Museum in Mountain View, California. After a chatty and lengthy (45 minutes!) introduction only interesting to hardcore insiders, you can see Chris Date waxing eloquent about Ted Codd (complete with quotes from Shakespeare, no less), Herb Edelstein waxing eloquent about Chris Date, and Michael Stonebraker at his geeky best. There&#39;s also interesting trivia about the beginnings of SQL, the role of INGRES, why the relational model will stand the test of time and some friendly Oracle and IBM bashing (and Microsoft and Sybase and...). I urge all you data management pros interested in broadening your knowledge of the field to check it out! If you&#39;re still not satiated, don&#39;t forget about our collection of backgrounders about the DBMS and the data management industry.</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>How Databases Changed The World</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2003-06-09#266</atom:id>
  <atom:published>2003-06-09T09:28:17Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;&lt;a href=&quot;http://searchdatabase.techtarget.com/bestWebLinks/0,289521,sid13_tax281575,00.html&quot;&gt;&lt;b&gt;How Databases Changed The World&lt;/b&gt;&lt;/a&gt; by Tim DiChiara, Site Editor (&lt;a href=&quot;http://www.searchdatabase.com&quot;&gt;SearchDatabase.com&lt;/a&gt;) &lt;/p&gt;&lt;p&gt;How did the database industry get started? How has it changed the face of business? What were the key milestones, the big obstacles and the lessons learned? I recently came across an interesting panel discussion addressing these very issues, featuring many of the database pioneers and leaders of the last 30 years:&lt;/p&gt;&lt;a href=&quot;http://en.wikipedia.org/wiki/Chris_Date&quot;&gt;Chris Date&lt;/a&gt;&lt;br /&gt;&lt;a href=&quot;http://www.computerhistory.org/events/lectures/db_02102003/edelstein/&quot;&gt;Herb Edelstein&lt;/a&gt; &lt;br /&gt;&lt;a href=&quot;http://www.computerhistory.org/events/lectures/db_02102003/epstein/&quot;&gt;Bob Epstein&lt;/a&gt; (&lt;a href=&quot;http://en.wikipedia.org/wiki/Sybase&quot;&gt;Sybase&lt;/a&gt;, which shared code with Microsoft for remarketing as SQL Server on OS/2, which inevitably led to the &lt;a href=&quot;http://en.wikipedia.org/wiki/Microsoft_SQL_Server&quot;&gt;Microsoft SQL Server&lt;/a&gt; we know today)&lt;br /&gt;&lt;a href=&quot;http://www.oracle.com/corporate/pressroom/html/kjacobs.html&quot;&gt;Ken Jacobs&lt;/a&gt; (&lt;a href=&quot;http://en.wikipedia.org/wiki/Oracle_database&quot;&gt;Oracle&lt;/a&gt;&amp;#39;s Dr. 
DBA)&lt;br /&gt;&lt;a href=&quot;http://www.witi.com/center/witimuseum/halloffame/2004/pselinger.php&quot;&gt;Pat Selinger &lt;/a&gt; (&lt;a href=&quot;http://en.wikipedia.org/wiki/DB2&quot;&gt;DB2&lt;/a&gt; precursor called System R) &lt;br /&gt;&lt;a href=&quot;http://en.wikipedia.org/wiki/Informix&quot;&gt;Roger Sippl&lt;/a&gt; (&lt;a href=&quot;http://en.wikipedia.org/wiki/Informix&quot;&gt;Informix&lt;/a&gt;)&lt;br /&gt;&lt;a href=&quot;http://en.wikipedia.org/wiki/Michael_Stonebraker&quot;&gt;Michael Stonebraker&lt;/a&gt; (&lt;a href=&quot;http://en.wikipedia.org/wiki/Ingres&quot;&gt;Ingres&lt;/a&gt;, &lt;a href=&quot;http://en.wikipedia.org/wiki/PostgreSQL&quot;&gt;Postgres&lt;/a&gt;, and &lt;a href=&quot;http://mariposa.cs.berkeley.edu/about.html&quot;&gt;Mariposa&lt;/a&gt;)&lt;br /&gt;&lt;br /&gt;The event is available via streaming &lt;a href=&quot;http://www.computerhistory.org/events/lectures/db_02102003/&quot;&gt;video&lt;/a&gt; and was recorded in February at the Computer History Museum in Mountain View, California. After a chatty and lengthy (45 minutes!) introduction only interesting to hardcore insiders, you can see Chris Date waxing eloquent about Ted Codd (complete with quotes from Shakespeare, no less), Herb Edelstein waxing eloquent about Chris Date, and Michael Stonebraker at his geeky best. There&amp;#39;s also interesting trivia about the beginnings of SQL, the role of INGRES, why the relational model will stand the test of time and some friendly Oracle and IBM bashing (and Microsoft and Sybase and...). I urge all you data management pros interested in broadening your knowledge of the field to check it out! If you&amp;#39;re still not satiated, don&amp;#39;t forget about our collection of backgrounders about the DBMS and the data management industry. 
&lt;a href=&quot;index.vspx?tag=sql&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;sql&lt;/a&gt;&lt;a href=&quot;index.vspx?tag=rdbms&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;rdbms&lt;/a&gt;&lt;a href=&quot;index.vspx?tag=database&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;database&lt;/a&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>The Market For Money</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2003-06-05#254</atom:id>
  <atom:published>2003-06-06T03:06:53Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;a href=&quot;http://www.ventureblog.com/articles/indiv/2003/000119.html&quot;&gt;The Market For Money&lt;/a&gt; &lt;p&gt;I just read this on &lt;a href=&quot;http://www.loftesness.com/radio/&quot;&gt;Scott Loftesness&amp;#39;s blog&lt;/a&gt; and thought it was worth sharing. &lt;a href=&quot;http://www.loftesness.com/career.html&quot;&gt;Scott&lt;/a&gt; was an EVP at Visa in the early 90&amp;#39;s and his blog is an unbelievably comprehensive discussion of the payments space. Here is his discussion of Visa&amp;#39;s recent announcement that the Visa system had reached $1 Trillion in annual United States transaction volume. It is an amazing growth curve and reminds me of a comment Peter Thiel, the former CEO of PayPal, made to me one day when we were having lunch -- he said that when he was pitching PayPal to VCs he was tempted to describe his market opportunity as the &amp;quot;market for money.&amp;quot; Visa&amp;#39;s numbers prove that that is precisely their market. VCs are always looking for big markets to penetrate and the market for money certainly qualifies. Here are Scott&amp;#39;s thoughts:&lt;/p&gt; &lt;blockquote&gt;Visa USA announced this morning that, for the first time, its annual sales volume exceeded $1 trillion. &lt;/blockquote&gt; &lt;blockquote&gt;The record usage means that an average of $32,000 went through the Visa system every second of every day over the 12-month period that ended March 31 - or nearly 10 percent of the 2002 U.S. Gross Domestic Product. &lt;/blockquote&gt; &lt;blockquote&gt;&amp;quot;One trillion dollars is an almost incomprehensible number, but it represents clear evidence of the silent revolution we&amp;#39;re witnessing in the way consumers pay for goods and services. It means $12 of every $100 consumers spent in the U.S. is spent using a Visa card,&amp;quot; said Carl Pascarella, president and CEO of Visa USA. &amp;quot;This is an important milestone in the history of U.S. 
commerce. Clearly, more and more people rely upon the security and convenience of Visa credit, debit and other payment products. To put it into context, $1 trillion could buy 162,000 Harley-Davidson motorcycles every day for a year.&amp;quot;&lt;/blockquote&gt; &lt;blockquote&gt;By comparison, $1 trillion is greater than the combined volume of all other U.S. payment organizations, a field that includes MasterCard, American Express, Discover and others.&lt;/blockquote&gt; &lt;blockquote&gt;Just before I left Visa in 1994, I remember having a discussion with a colleague about growth in sales volume. 1993 had just ended with $500 billion in annual Visa sales on an international basis. We were focused on that total growing to $1 trillion globally over the next five years. As I recall, the US in 1993 was about 40+% of the global total -- so the growth in US volume over the last nine years has been pretty amazing. Of course, this is also one of those statistics that has a nice built-in inflation hedge too (the numbers just keep growing!). $32,000 a second -- at a $50 average ticket that works out to an average of 640 Visa transactions per second.[via &lt;a href=&quot;http://www.ventureblog.com/&quot;&gt;VentureBlog&lt;/a&gt;]&lt;/blockquote&gt; &lt;div&gt;&lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Who&#39;s handing out the crack at Microsoft?</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2003-06-03#357</atom:id>
  <atom:published>2003-06-03T19:22:15Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;A href=&quot;http://www.surfmind.com/musings///2003/05/31/&quot;&gt;Who&#39;s handing out the crack at Microsoft?&lt;/A&gt; IE6 is the last non-OS based release of the browser?&amp;nbsp; &quot;Further improvements to IE will require enhancements to the underlying OS&quot;.&amp;nbsp; &lt;A href=&quot;http://www.microsoft.com/technet/treeview/default.asp?url=/technet/itcommunity/chats/trans/ie/ie0507.asp&quot;&gt;source&lt;/A&gt;, via &lt;A href=&quot;http://techno-weenie.com/archives/2003/05/30/003134.php&quot;&gt;techno-weenie&lt;/A&gt;.&lt;BR&gt;&lt;BR&gt;You gotta be kidding.&amp;nbsp; This is great news for Mozilla, even given the recent &lt;A href=&quot;http://www.mozillazine.org/talkback.html?article=3226&quot;&gt;AOL prostitution&lt;/A&gt;. There are huge strides left to be made in the browser UI -- and they have huge potential impact.&amp;nbsp; There is no other software paradigm in the history of computers that&#39;s used and usable by as many people.&lt;BR&gt;&lt;BR&gt;Others in the blogdom have questioned the economic payoff of improving IE, given an 88%+ market share. The abandonment of standards-based progress by MSoft is reprehensible.&amp;nbsp; [via &lt;A href=&quot;http://surfmind.com/musings/&quot;&gt;Surf*Mind*Musings&lt;/A&gt;] &lt;DIV&gt;&lt;/DIV&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>&lt;a href=&quot;http://blogs.law.harvard.edu/isItTheSyntax&quot;&gt;My thoughts&lt;/a&gt;</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2003-05-23#327</atom:id>
  <atom:published>2003-05-23T15:39:41Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;P&gt;&lt;A href=&quot;http://blogs.law.harvard.edu/isItTheSyntax&quot;&gt;My thoughts&lt;/A&gt; re Tim Bray&#39;s thread on RDF. [via &lt;A href=&quot;http://www.scripting.com/&quot;&gt;Scripting News&lt;/A&gt;]&lt;/P&gt; &lt;P&gt;&lt;EM&gt;Key excerpt of relevance to us (as potential providers of an application that demonstrates RDF&#39;s value proposition):&lt;/EM&gt;&lt;/P&gt; &lt;P&gt;It&#39;s not the syntax that makes the difference, it&#39;s the app. History supports this view. How many people tried to pry apart the &lt;A href=&quot;http://dictionary.reference.com/search?q=obscure&quot;&gt;&lt;STRONG&gt;&lt;FONT color=#920011&gt;obscure&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/A&gt; Excel file format on the Mac? Or the Lotus file format on the PC? Name all the market leaders of the past, and only the Web had both the killer app and a transparent format. Maybe the relationship is multiplicative. Maybe Excel would have &lt;I&gt;been the Web&lt;/I&gt; if it had used an open file format that anyone could understand. What if you could have created a spreadsheet with BBEdit or a HyperTalk script? The mind boggles at the possibilities (it never happened, of course).&lt;/P&gt; &lt;P&gt;&lt;EM&gt;Even in Office 2003 there is a failure to really open things up.&lt;/EM&gt;&lt;BR&gt;&lt;BR&gt;An aside, &lt;A href=&quot;http://scriptingnews.userland.com/weblogsearch/?q=paoli&quot;&gt;&lt;STRONG&gt;&lt;FONT color=#920011&gt;Jean Paoli&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/A&gt; rushes into the room, jumping up and down and saying &quot;That&#39;s what I&#39;m doing that&#39;s what I&#39;m doing.&quot;&lt;BR&gt;&lt;BR&gt;Anyway, I don&#39;t see any killer apps in the RDF crowd. I see lots of people with strong opinions and not much software. Killer apps are not something you wish into existence. Lots of people have said that RDF models a relational database. Okay that tells me something important, the killer app is a relational database. 
&lt;/P&gt; &lt;P&gt;&lt;EM&gt;Ha Ha!&lt;/EM&gt;&lt;/P&gt; &lt;P&gt;But we already have relational databases. They were new when I was a grad student, and that was a &lt;I&gt;long&lt;/I&gt; time ago. &lt;IMG src=&quot;http://static.userland.com/shortcuts/images/qbullets/sidesmiley.gif&quot;&gt;&lt;BR&gt;&lt;BR&gt;&lt;EM&gt;Yeah, but what we don&#39;t have is a relational database that incorporates RDF as part of the database technology evolution roadmap. Of course many will get it (and FUD-emulate) when we unveil something via Virtuoso.&lt;/EM&gt;&lt;/P&gt; &lt;DIV&gt;&lt;/DIV&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>&lt;a href=&quot;http://blogs.law.harvard.edu/isItTheSyntax&quot;&gt;My thoughts&lt;/a&gt;</atom:title>
  <atom:id>http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2003-05-23#55</atom:id>
  <atom:published>2003-05-23T15:39:41Z</atom:published>
  <atom:updated>2006-06-22T08:56:58-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;&lt;a href=&quot;http://blogs.law.harvard.edu/isItTheSyntax&quot;&gt;My thoughts&lt;/a&gt; re Tim Bray&amp;#39;s thread on RDF. [via &lt;a href=&quot;http://www.scripting.com/&quot;&gt;Scripting News&lt;/a&gt;]&lt;/p&gt; &lt;p&gt;&lt;em&gt;Key excerpt of relevance to us (as potential providers of an application that demonstrates RDF&amp;#39;s value proposition):&lt;/em&gt;&lt;/p&gt; &lt;p&gt;It&amp;#39;s not the syntax that makes the difference, it&amp;#39;s the app. History supports this view. How many people tried to pry apart the &lt;a href=&quot;http://dictionary.reference.com/search?q=obscure&quot;&gt;&lt;strong&gt;&lt;font color=&quot;#920011&quot;&gt;obscure&lt;/font&gt;&lt;/strong&gt;&lt;/a&gt; Excel file format on the Mac? Or the Lotus file format on the PC? Name all the market leaders of the past, and only the Web had both the killer app and a transparent format. Maybe the relationship is multiplicative. Maybe Excel would have &lt;i&gt;been the Web&lt;/i&gt; if it had used an open file format that anyone could understand. What if you could have created a spreadsheet with BBEdit or a HyperTalk script? The mind boggles at the possibilities (it never happened, of course).&lt;/p&gt; &lt;p&gt;&lt;em&gt;Even in Office 2003 there is a failure to really open things up.&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;An aside, &lt;a href=&quot;http://scriptingnews.userland.com/weblogsearch/?q=paoli&quot;&gt;&lt;strong&gt;&lt;font color=&quot;#920011&quot;&gt;Jean Paoli&lt;/font&gt;&lt;/strong&gt;&lt;/a&gt; rushes into the room, jumping up and down and saying &amp;quot;That&amp;#39;s what I&amp;#39;m doing that&amp;#39;s what I&amp;#39;m doing.&amp;quot;&lt;br /&gt;&lt;br /&gt;Anyway, I don&amp;#39;t see any killer apps in the RDF crowd. I see lots of people with strong opinions and not much software. Killer apps are not something you wish into existence. Lots of people have said that RDF models a relational database. 
Okay that tells me something important, the killer app is a relational database. &lt;/p&gt; &lt;p&gt;&lt;em&gt;Ha Ha!&lt;/em&gt;&lt;/p&gt; &lt;p&gt;But we already have relational databases. They were new when I was a grad student, and that was a &lt;i&gt;long&lt;/i&gt; time ago. &lt;img src=&quot;http://static.userland.com/shortcuts/images/qbullets/sidesmiley.gif&quot; /&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Yeah, but what we don&amp;#39;t have is a relational database that incorporates RDF as part of the database technology evolution roadmap. Of course many will get it (and FUD-emulate) when we unveil something via Virtuoso.&lt;/em&gt;&lt;/p&gt; &lt;div&gt;&lt;/div&gt;</atom:content>
 </atom:entry>
</atom:feed>