<?xml version="1.0" encoding="UTF-8" ?>
<!--RDF based XML document generated By OpenLink Virtuoso-->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
 <rss:channel xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/weblog/dav/dav-blog-1/">
  <rss:title>OpenLink Community Blog</rss:title>
  <rss:link>http://www.openlinksw.com/weblog/dav/dav-blog-1/</rss:link>
  <rss:description>A Collection of blogs by OpenLink Staff</rss:description>
  <dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">kidehen@openlinksw.com</dc:creator>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2009-11-23T14:14:34Z</dc:date>
  <rss:items>
   <rdf:Seq>
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2009-01-08#1514" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2008-08-27#1424" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2008-04-10#1334" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/vdb/blog/?date=2007-08-27#1245" />
      <rdf:li rdf:resource="http://www.openlinksw.com/weblog/oerling/?date=2007-08-27#1244" />
      <rdf:li rdf:resource="http://www.openlinksw.com/weblog/oerling/?date=2007-05-23#1198" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/vdb/blog/?date=2007-05-23#1201" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2006-10-20#1065" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2006-05-05#968" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2006-04-13#957" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2005-04-28#819" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2005-04-25#807" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2005-02-28#698" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2005-01-04#657" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2004-12-09#648" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2004-05-17#546" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2004-04-26#531" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2004-01-06#442" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2003-06-26#192" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2003-06-17#279" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2003-06-09#266" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2003-05-21#319" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2003-05-21#48" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2003-05-16#301" />
   </rdf:Seq>
  </rss:items>
 </rss:channel>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2009-01-08#1514">
  <rss:title>New ADO.NET 3.x Provider for Virtuoso Released (Update 2)</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2009-01-08T04:36:47Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">I am pleased to announce the immediate availability of the Virtuoso ADO.NET 3.5 data provider for Microsoft&#39;s .NET platform. What is it? A data access driver/provider that provides conceptual entity oriented access to RDBMS data managed by Virtuoso. Naturally, it also uses Virtuoso&#39;s in-built virtual / federated database layer to provide access to ODBC and JDBC accessible RDBMS engines such as: Oracle (7.x to latest), SQL Server (4.2 to latest), Sybase, IBM Informix (5.x to latest), IBM DB2, Ingres (6.x to latest), Progress (7.x to OpenEdge), MySQL, PostgreSQL, Firebird, and others using our ODBC or JDBC bridge drivers. Benefits? Technical: It delivers an Entity-Attribute-Value + Classes &amp; Relationships model over disparate data sources that are materialized as .NET Entity Framework Objects, which are then consumable via ADO.NET Data Object Services, LINQ for Entities, and other ADO.NET data consumers. The provider is fully integrated into Visual Studio 2008 and delivers the same &quot;ease of use&quot; offered by Microsoft&#39;s own SQL Server provider, but across Virtuoso, Oracle, Sybase, DB2, Informix, Ingres, Progress (OpenEdge), MySQL, PostgreSQL, Firebird, and others. The same benefits also apply uniformly to Entity Frameworks compatibility. Bearing in mind that Virtuoso is a multi-model (hybrid) data manager, this also implies that you can use .NET Entity Frameworks against all data managed by Virtuoso. Remember, Virtuoso&#39;s SQL channel is a conduit to Virtuoso&#39;s core; thus, RDF (courtesy of SPASQL as already implemented re. Jena/Sesame/Redland providers), XML, and other data forms stored in Virtuoso also become accessible via .NET&#39;s Entity Frameworks. Strategic: You can choose which entity oriented data access model works best for you: RDF Linked Data &amp; SPARQL or .NET Entity Frameworks &amp; Entity SQL. 
Either way, Virtuoso delivers a commercial grade, high-performance, secure, and scalable solution. How do I use it? Simply follow one of the guides below: (1) Using Visual Studio 2008 &amp; Virtuoso to build an Entity Frameworks based Windows forms application; (2) Using Visual Studio 2008 &amp; Virtuoso to build an ADO.NET Data Services based application. Note: When working with external or 3rd party databases, simply use the Virtuoso Conductor to link the external data source into Virtuoso. Once linked, the remote tables are treated as though they were native Virtuoso tables, leaving the virtual database engine to handle the rest. This is similar to the role the Microsoft JET engine played in the early days of ODBC, so if you&#39;ve ever linked an ODBC data source into Microsoft Access, you are ready to do the same using Virtuoso. Related: Entity Oriented Data Access; Yoda &amp; the Data FORCE.</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>I am pleased to announce the immediate availability of the <a href="http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtAdoNet35Provider" id="link-id142e7390">Virtuoso ADO.NET 3.5 data provider</a> for Microsoft&#39;s .NET platform.</p>

<h3>What is it?</h3>
<p>A data access driver/provider that provides conceptual <a href="http://dbpedia.org/resource/Entity" id="link-id11c36c00">entity</a> oriented access to <a href="http://dbpedia.org/resource/Relational_database_management_system" id="link-id12fb8618">RDBMS</a> data managed by Virtuoso. Naturally, it also uses Virtuoso&#39;s in-built virtual / <a href="http://dbpedia.org/resource/federated_database_system" id="link-id115bedc8">federated database</a> layer to provide access to <a href="http://dbpedia.org/resource/Open_Database_Connectivity" id="link-id15153c08">ODBC</a> and <a href="http://dbpedia.org/resource/Java_Database_Connectivity" id="link-id13418908">JDBC</a> accessible RDBMS engines such as: <a href="http://dbpedia.org/resource/Oracle_Database" id="link-id134d72f0">Oracle</a> (7.x to latest), <a href="http://dbpedia.org/resource/SQL" id="link-id15757b88">SQL</a> Server (4.2 to latest), <a href="http://dbpedia.org/resource/Sybase" id="link-id15ef8d48">Sybase</a>, IBM <a href="http://dbpedia.org/resource/IBM_Informix" id="link-id12f56aa0">Informix</a> (5.x to latest), IBM <a href="http://dbpedia.org/resource/IBM_DB2" id="link-id119feb38">DB2</a>, <a href="http://dbpedia.org/resource/Ingres" id="link-id14e3d6c8">Ingres</a> (6.x to latest), Progress (7.x to OpenEdge), <a href="http://dbpedia.org/resource/MySQL" id="link-id11295630">MySQL</a>, PostgreSQL, <a href="http://dbpedia.org/resource/Firebird_database_server" id="link-id12f40448">Firebird</a>, and others using our ODBC or JDBC bridge drivers.</p>

<h3>Benefits?</h3>
<h4>Technical:</h4>
<p>It delivers an <a href="http://dbpedia.org/resource/Entity-attribute-value_model" id="link-id14012040">Entity-Attribute-Value + Classes &amp; Relationships model</a> over disparate data sources that are materialized as .NET Entity Framework Objects, which are then consumable via ADO.NET Data Object Services, LINQ for Entities, and other ADO.NET data consumers.</p> 

<p>The provider is fully integrated into Visual Studio 2008 and delivers the same &quot;ease of use&quot; offered by Microsoft&#39;s own SQL Server provider, but across Virtuoso, Oracle, Sybase, DB2, Informix, Ingres, <a href="http://dbpedia.org/resource/Progress_4GL" id="link-id158d1fe8">Progress (OpenEdge)</a>, MySQL, PostgreSQL, Firebird, and others. The same benefits also apply uniformly to Entity Frameworks compatibility.</p>
<p>
Bearing in mind that Virtuoso is a multi-model (hybrid) data manager, this also implies that you can use .NET Entity Frameworks against all data managed by Virtuoso. Remember, Virtuoso&#39;s SQL channel is a conduit to Virtuoso&#39;s core; thus, RDF (courtesy of <a href="http://esw.w3.org/topic/SPASQL" id="link-id133c9b70">SPASQL</a> as already implemented re. <a href="http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtJenaProvider" id="link-id11380b80">Jena</a>/<a href="http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtSesame2Provider" id="link-id10fc0c88">Sesame</a>/<a href="http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtRDFDriverRedland" id="link-id1390f730">Redland</a> providers), XML, and other data forms stored in Virtuoso also become accessible via .NET&#39;s Entity Frameworks.</p>
<br />
<h4>Strategic:</h4>
<p>You can choose which entity oriented data access model works best for you: RDF <a href="http://dbpedia.org/resource/Linked_Data" id="link-id151354f0">Linked Data</a> &amp; <a href="http://dbpedia.org/resource/SPARQL" id="link-id15dc5eb0">SPARQL</a> or .NET Entity Frameworks &amp; <a href="http://en.wikipedia.org/wiki/ADO.NET_Entity_Framework#Entity_SQL" id="link-id14404e80">Entity SQL</a>. Either way, Virtuoso delivers a commercial grade, high-performance, secure, and scalable solution.</p>
<br />
<h3>How do I use it?</h3>

Simply follow one of the guides below:
<ul>
<li>
  <a href="http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtEntityFrameworkSchoolDbWinFormApp" id="link-id15e5c580">Using Visual Studio 2008 &amp; Virtuoso to build an Entity Frameworks based Windows forms application</a>
</li>
<li>
  <a href="http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtUsingMsAdoNetDataServicesWithVirtuoso" id="link-id157912b0">Using Visual Studio 2008 &amp; Virtuoso to build an ADO.NET Data Services based application</a>
</li>
</ul>

<p>
<b>Note:</b> When working with external or 3rd party databases, simply use the Virtuoso Conductor to link the external data source into Virtuoso. Once linked, the remote tables are treated as though they were native Virtuoso tables, leaving the <a href="http://dbpedia.org/resource/Virtual_Database" id="link-id15b04b18">virtual database</a> engine to handle the rest. This is similar to the role the Microsoft JET engine played in the early days of ODBC, so if you&#39;ve ever linked an ODBC data source into Microsoft Access, you are ready to do the same using Virtuoso.</p>

<h3>Related</h3>
<ul>
<li>
  <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1420" id="link-id160afdd0">Entity Oriented Data Access</a>
</li>
<li>
  <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1474" id="link-id113eeb50">Yoda &amp; the Data FORCE.</a>
</li>
</ul>
]]></content:encoded>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2008-08-27#1424">
  <rss:title>Crunchbase &amp; Semantic Web Interview (Remix - Update 1)</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2008-08-27T18:16:37Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">After reading Bengee&#39;s interview with CrunchBase, I decided to knock up a quick interview remix as part of my usual attempt to add to the developing discourse. CrunchBase: When we released the CrunchBase API, you were one of the first developers to step up and quickly released a CrunchBase Sponger Cartridge. Can you explain what a CrunchBase Sponger Cartridge is? Me: A Sponger Cartridge is a data access driver for Web Resources that plugs into our Virtuoso Universal Server (DBMS and Linked Data Web Server combo amongst other things). It uses the internal structure of a resource and/or a web service associated with a resource, to materialize an RDF based Linked Data graph that essentially describes the resource via its properties (Attributes &amp; Relationships). CrunchBase: And what inspired you to create it? Me: Bengee built a new space with your data, and we&#39;ve built a space on the fly from your data which still resides in your domain. Either solution extols the virtues of Linked Data i.e. the ability to explore relationships across data items with high degrees of serendipity (also colloquially known as: following-your-nose pattern in Semantic Web circles). Bengee posted a notice to the Linking Open Data Community&#39;s public mailing list announcing his effort. Bearing in mind the fact that we&#39;ve been using middleware to mesh the realms of Web 2.0 and the Linked Data Web for a while, it was a no-brainer to knock something up based on the conceptual similarities between Wikicompany and CrunchBase. In a sense, a quadrant of orthogonality is what immediately came to mind re. Wikicompany, CrunchBase, Bengee&#39;s RDFization efforts, and ours. Bengee created an RDF based Linked Data warehouse based on the data exposed by your API, which is exposed via the Semantic CrunchBase data space. 
In our case we&#39;ve taken the &quot;RDFization on the fly&quot; approach which produces a transient Linked Data View of the CrunchBase data exposed by your APIs. Our approach is in line with our world view: all resources on the Web are data sources, and the Linked Data Web is about incorporating HTTP into the naming scheme of these data sources so that the conventional URL based hyperlinking mechanism can be used to access a structured description of a resource, which is then transmitted using a range of negotiable representation formats. In addition, based on the fact that we house and publish a lot of Linked Data on the Web (e.g. DBpedia, PingTheSemanticWeb, and others), we&#39;ve also automatically meshed CrunchBase data with related data in DBpedia and Wikicompany. CrunchBase: Do you know of any apps that are using CrunchBase Cartridge to enhance their functionality? Me: Yes, the OpenLink Data Explorer which provides CrunchBase site visitors with the option to explore the Linked Data in the CrunchBase data space. It also allows them to &quot;Mesh&quot; (rather than &quot;Mash&quot;) CrunchBase data with other Linked Data sources on the Web without writing a single line of code. CrunchBase: You have been immersed in the Semantic Web movement for a while now. How did you first get interested in the Semantic Web? Me: We saw the Semantic Web as a vehicle for standardizing conceptual views of heterogeneous data sources via context lenses (URIs). In 1998 as part of our strategy to expand our business beyond the development and deployment of ODBC, JDBC, and OLE-DB data providers, we decided to build a Virtual Database Engine (see: Virtuoso History), and in doing so we sought a standards based mechanism for the conceptual output of the data virtualization effort. 
As of the time of the seminal unveiling of the Semantic Web in 1998 we were clear about two things, in relation to the effects of the Web and Internet data management infrastructure inflections: 1) Existing DBMS technology had reached its limits; 2) Web Servers would ultimately hit their functional limits. These fundamental realities compelled us to develop Virtuoso with an eye to leveraging the Semantic Web as a vehicle for completing its technical roadmap. CrunchBase: Can you put into layman&#39;s terms exactly what RDF and SPARQL are and why they are important? Do they only matter for developers or will they extend past developers at some point and be used by website visitors as well? Me: RDF (Resource Description Framework) is a Graph based Data Model that facilitates resource description using the Subject, Predicate, and Object principle. Associated with the core data model, as part of the overall framework, are a number of markup languages for expressing your descriptions (just as you express presentation markup semantics in HTML or document structure semantics in XML) that include: RDFa (simple extension of HTML markup for embedding descriptions of things in a page), N3 (a human friendly markup for describing resources), RDF/XML (a machine friendly markup for describing resources). SPARQL is the query language associated with the RDF Data Model, just as SQL is a query language associated with the Relational Database Model. Thus, when you have RDF based structured and linked data on the Web, you can query against the Web using SPARQL just as you would against an Oracle/SQL Server/DB2/Informix/Ingres/MySQL/etc. DBMS using SQL. That&#39;s it in a nutshell. CrunchBase: On your website you wrote that &quot;RDF and SPARQL as productivity boosters in everyday web development&quot;. Can you elaborate on why you believe that to be true? Me: I think the ability to discern a formal description of anything via its discrete properties is of immense value re. 
productivity, especially when the capability in question results in a graph of Linked Data that isn&#39;t confined to a specific host operating system, database engine, application or service, programming language, or development framework. RDF Linked Data is about infrastructure for the true materialization of the &quot;Information at Your Fingertips&quot; vision of yore. Even though it&#39;s taken the emergence of RDF Linked Data to make the aforementioned vision tractable, the comprehension of the vision&#39;s intrinsic value has been clear for a very long time. Most organizations and/or individuals are quite familiar with the adage: Knowledge is Power. Well, there isn&#39;t any knowledge without accessible Information, and there isn&#39;t any accessible Information without accessible Data. The Web has always been grounded in accessibility to data (albeit via compound container documents called Web Pages). Bottom line, RDF based Linked Data is about Open Data access by reference using URIs (HTTP based Entity IDs / Data Object IDs / Data Source Names), and as I said earlier, the intrinsic value is pretty obvious bearing in mind the costs associated with integrating disparate and heterogeneous data sources -- across intranets, extranets, and the Internet. CrunchBase: In his definition of Web 3.0, Nova Spivack proposes that the Semantic Web, or Semantic Web technologies, will be the force behind much of the innovation that will occur during Web 3.0. Do you agree with Nova Spivack? What role, if any, do you feel the Semantic Web will play in Web 3.0? Me: I agree with Nova. But I see Web 3.0 as a phase within the Semantic Web innovation continuum. Web 3.0 exists because Web 2.0 exists. Both of these Web versions express usage and technology focus patterns. Web 2.0 is about the use of Open Source technologies to fashion Web Services that are ultimately used to drive proprietary Software as a Service (SaaS) style solutions. 
Web 3.0 is about the use of &quot;Smart Data Access&quot; to fashion a new generation of Linked Data aware Web Services and solutions that exploit the federated nature of the Web to maximum effect; proprietary branding will simply be conveyed via quality of data (cleanliness, context fidelity, and comprehension of privacy) exposed by URIs. Here are some examples of the CrunchBase Linked Data Space, as projected via our CrunchBase Sponger Cartridge: Amazon.com, Microsoft, Google, and Apple.</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>After reading <a href="http://blog.crunchbase.com/2008/08/26/building-a-semantic-web-interview-with-benjamin-nowack/" id="link-id16b8e0e0">Bengee&#39;s interview with CrunchBase</a>, I decided to knock up a quick interview remix as part of my usual attempt to add to the developing discourse.</p>
<blockquote>
<cite><a href="http://www.crunchbase.com/" id="link-id17c8e7b8">CrunchBase</a>: When we released the <a href="http://www.crunchbase.com/help/api" id="link-id16681f68">CrunchBase API</a>, you were one of the first developers to step up and quickly released a <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com's%20BLOG%20%5B127%5D/1395" id="link-id1016d5f0">CrunchBase Sponger Cartridge</a>. Can you explain what a CrunchBase Sponger Cartridge is?</cite>
</blockquote>

<blockquote>
<a href="http://myopenlink.net/dataspace/person/kidehen#this" id="link-id13243300">Me</a>: A Sponger Cartridge is a <a href="http://dbpedia.org/resource/Data">data</a> access driver for <a href="http://dbpedia.org/resource/World_Wide_Web">Web</a> Resources that plugs into our <a href="http://virtuoso.openlinksw.com" id="link-id17042f08">Virtuoso</a> <a href="http://dbpedia.org/resource/Virtuoso_Universal_Server" id="link-id1399b588">Universal Server</a> (DBMS and <a href="http://dbpedia.org/resource/Linked_Data" id="link-id137fd188">Linked Data</a> <a href="http://dbpedia.org/resource/Giant_Global_Graph" id="link-id100b23d8">Web</a> Server combo amongst other things). It uses the internal structure of a resource and/or a web service associated with a resource, to materialize an RDF based <a href="http://dbpedia.org/resource/Resource_Description_Framework" id="link-id10418750">Linked Data graph</a> that essentially describes the resource via its properties (Attributes &amp; Relationships).
</blockquote>
<br />
<img src="http://virtuoso.openlinksw.com/presentations/Creating_Deploying_Exploiting_Linked_Data2/images/ldp4.png" />
<br />
<br />
<br />
<blockquote>
<cite>CrunchBase: And what inspired you to create it?</cite>
</blockquote>
<blockquote>
<a href="http://myopenlink.net/dataspace/person/kidehen#this" id="link-id12fa60c0">Me</a>: Bengee built a new space with your data, and we&#39;ve built a space on the fly from your data which still resides in your domain. Either solution extols the virtues of <a href="http://dbpedia.org/resource/Linked_Data" id="link-id101a8d28">Linked Data</a> i.e. the ability to explore relationships across data items with high degrees of serendipity (also colloquially known as: following-your-nose pattern in <a href="http://dbpedia.org/resource/Semantic_Web" id="link-id14a3ff30">Semantic Web</a> circles).</blockquote>

<blockquote>
<a href="http://cb.semsol.org/" id="link-id182a0170">Bengee</a> posted a notice to the <a href="http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData" id="link-id131e8d10">Linking Open Data Community</a>&#39;s public <a href="http://lists.w3.org/Archives/Public/public-lod/2008Jul/0110.html" id="link-id11dd0720">mailing list announcing his effort</a>. Bearing in mind the fact that we&#39;ve been using <a href="http://www.openlinksw.com/blog/~kidehen/?id=1144" id="link-id117cf6e8">middleware to mesh the realms of Web 2.0 and the Linked Data Web</a> for a while, it was a no-brainer to knock something up based on the conceptual similarities between <a href="http://wikicompany.org/wiki/Main_Page" id="link-id13b87a68">Wikicompany</a> and CrunchBase. In a sense, a quadrant of orthogonality is what immediately came to mind re. Wikicompany, CrunchBase, Bengee&#39;s RDFization efforts, and ours.</blockquote>

<blockquote>Bengee created an RDF based <a href="http://dbpedia.org/resource/Linked_Data" id="link-id133c8fc8">Linked Data</a> warehouse based on the data exposed by your API, which is exposed via the <a href="http://cb.semsol.org/" id="link-id1826f928">Semantic CrunchBase</a> <a href="http://en.wikipedia.org/wiki/Data_Spaces" id="link-id102d8890">data space</a>. In our case we&#39;ve taken the &quot;RDFization on the fly&quot; approach which produces a transient <a href="http://dbpedia.org/resource/Linked_Data" id="link-id16a0b8d0">Linked Data</a> View of the CrunchBase data exposed by your APIs. Our approach is in line with our world view: all resources on the Web are data sources, and the <a href="http://dbpedia.org/resource/Linked_Data" id="link-id1668e6c8">Linked Data</a> <a href="http://dbpedia.org/resource/Giant_Global_Graph" id="link-id188e7da0">Web</a> is about incorporating HTTP into the naming scheme of these data sources so that the conventional <a href="http://dbpedia.org/resource/Uniform_Resource_Locator" id="link-id13490710">URL</a> based hyperlinking mechanism can be used to access a structured description of a resource, which is then transmitted using a range of negotiable representation formats. In addition, based on the fact that we house and publish a lot of <a href="http://dbpedia.org/resource/Linked_Data" id="link-id169aa568">Linked Data</a> on the Web (e.g. <a href="http://dbpedia.org/resource/DBpedia" id="link-id10af10e8">DBpedia</a>, <a href="http://www.pingthesemanticweb.com/about/" id="link-id10a2b710">PingTheSemanticWeb</a>, and others), we&#39;ve also automatically meshed CrunchBase data with related data in <a href="http://dbpedia.org/resource/DBpedia" id="link-id1403cd40">DBpedia</a> and Wikicompany.</blockquote>
<br />

<blockquote>
<cite>CrunchBase: Do you know of any apps that are using CrunchBase Cartridge to enhance their functionality?</cite>
</blockquote>
<blockquote>
<a href="http://myopenlink.net/dataspace/person/kidehen#this" id="link-id177d24c8">Me</a>: Yes, the <a href="http://ode.openlinksw.com" id="link-id10725ca0">OpenLink Data Explorer</a> which provides CrunchBase site visitors with the option to explore the <a href="http://dbpedia.org/resource/Linked_Data" id="link-id17dedea8">Linked Data</a> in the CrunchBase <a href="http://en.wikipedia.org/wiki/Data_Spaces" id="link-id13f02a00">data space</a>. It also allows them to &quot;Mesh&quot; (rather than &quot;Mash&quot;) CrunchBase data with other <a href="http://dbpedia.org/resource/Linked_Data" id="link-id11fb3ba0">Linked Data</a> sources on the Web without writing a single line of code. </blockquote>
<br />

<blockquote>
<cite>CrunchBase: You have been immersed in the <a href="http://dbpedia.org/resource/Semantic_Web" id="link-id12e18a00">Semantic Web</a> movement for a while now. How did you first get interested in the <a href="http://dbpedia.org/resource/Semantic_Web" id="link-id15132110">Semantic Web</a>?</cite>
</blockquote>
<blockquote>
<a href="http://myopenlink.net/dataspace/person/kidehen#this" id="link-id0xddaa9c8">Me</a>: We saw the <a href="http://dbpedia.org/resource/Semantic_Web" id="link-id188b3330">Semantic Web</a> as a vehicle for standardizing conceptual views of heterogeneous data sources via <a href="http://dbpedia.org/resource/Context_%28language_use%29" id="link-id10350978">context</a> lenses (URIs). In 1998 as part of our strategy to expand our business beyond the development and deployment of <a href="http://dbpedia.org/resource/Open_Database_Connectivity" id="link-id171d6798">ODBC</a>, <a href="http://dbpedia.org/resource/Java_Database_Connectivity" id="link-id138120a0">JDBC</a>, and OLE-DB data providers, we decided to build a <a href="http://dbpedia.org/resource/Virtual_Database" id="link-id13ea6618">Virtual Database</a> Engine (see: <a href="http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VOSHistory" id="link-id11a4fa30">Virtuoso History</a>), and in doing so we sought a standards based mechanism for the conceptual output of the <a href="http://dbpedia.org/resource/Federated_database_system" id="link-id101a1248">data virtualization</a> effort. As of the time of the <a href="http://www.w3.org/DesignIssues/Semantic.html" id="link-id18882cf8">seminal unveiling of the Semantic Web in 1998</a> we were clear about two things, in relation to the effects of the Web and <a href="http://dbpedia.org/resource/Internet" id="link-id12fa2c58">Internet</a> data management infrastructure inflections: 1) Existing DBMS technology had reached its limits; 2) Web Servers would ultimately hit their functional limits. These fundamental realities compelled us to develop <a href="http://virtuoso.openlinksw.com" id="link-id102b09a0">Virtuoso</a> with an eye to leveraging the <a href="http://dbpedia.org/resource/Semantic_Web" id="link-id11984d98">Semantic Web</a> as a vehicle for completing its technical roadmap.</blockquote>
<br />

<blockquote>
<cite>CrunchBase: Can you put into layman&#39;s terms exactly what RDF and <a href="http://dbpedia.org/resource/SPARQL" id="link-id1066dcf0">SPARQL</a> are and why they are important? Do they only matter for developers or will they extend past developers at some point and be used by website visitors as well?</cite>
</blockquote>
<blockquote>Me: RDF (Resource Description Framework) is a Graph based Data Model that facilitates resource description using the <a href="http://www.eslincanada.com/englishlesson2.html" id="link-id178b94a8">Subject, Predicate, and Object principle</a>. Associated with the core data model, as part of the overall framework,  are a number of markup languages for expressing your descriptions (just as you express presentation markup semantics in HTML or document structure semantics in XML) that include: <a href="http://dbpedia.org/resource/RDFa" id="link-id188db0a8">RDFa</a> (simple extension of HTML markup for embedding descriptions of things in a page), N3 (a human friendly markup for describing resources), RDF/XML (a machine friendly markup for describing resources).</blockquote> 
<blockquote>
<a href="http://dbpedia.org/resource/SPARQL" id="link-id188c2030">SPARQL</a> is the query language associated with the RDF Data Model, just as <a href="http://dbpedia.org/resource/SQL" id="link-id13f0ffe0">SQL</a> is a query language associated with the Relational Database Model. Thus, when you have RDF based structured and <a href="http://dbpedia.org/resource/Linked_Data" id="link-id166874d0">linked data</a> on the Web, you can query against the Web using <a href="http://dbpedia.org/resource/SPARQL" id="link-id1016cc98">SPARQL</a> just as you would against an <a href="http://dbpedia.org/resource/Oracle_Database" id="link-id101c9708">Oracle</a>/<a href="http://dbpedia.org/resource/SQL" id="link-id11cb0b18">SQL</a> Server/<a href="http://dbpedia.org/resource/IBM_DB2" id="link-id10760ec0">DB2</a>/<a href="http://dbpedia.org/resource/IBM_Informix" id="link-id1066c8c0">Informix</a>/<a href="http://dbpedia.org/resource/Ingres" id="link-id18894f40">Ingres</a>/<a href="http://dbpedia.org/resource/MySQL" id="link-iddc9ebb0">MySQL</a>/etc. DBMS using <a href="http://dbpedia.org/resource/SQL" id="link-id1030d120">SQL</a>. That&#39;s it in a nutshell.</blockquote>
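<p>To make the Subject, Predicate, Object idea concrete, here is a minimal Python sketch. It is not Virtuoso code and not a real SPARQL engine: the <code>ex:</code> identifiers, the sample facts, and the <code>match</code> helper are all made up for illustration, and the pattern lookup only loosely mimics what a single SPARQL triple pattern does.</p>
<pre>
```python
# Hypothetical sample data: every fact is one (subject, predicate, object) triple.
TRIPLES = [
    ("ex:Virtuoso", "rdf:type", "ex:DatabaseServer"),
    ("ex:Virtuoso", "ex:developedBy", "ex:OpenLink"),
    ("ex:OpenLink", "rdf:type", "ex:Company"),
    ("ex:CrunchBase", "rdf:type", "ex:Company"),
]


def match(pattern, triples=TRIPLES):
    """Return the triples that fit a pattern.

    None acts as a wildcard, loosely like a ?variable
    in a SPARQL triple pattern.
    """
    return [
        triple
        for triple in triples
        if all(want is None or want == have
               for want, have in zip(pattern, triple))
    ]


# Roughly analogous to: SELECT ?s WHERE { ?s rdf:type ex:Company }
companies = [s for s, _, _ in match((None, "rdf:type", "ex:Company"))]
print(companies)
```
</pre>
<p>Against real Linked Data you would send an actual SPARQL query to an endpoint instead; the sketch only shows the shape of the data model and of a pattern-style lookup.</p>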
<br />

<blockquote>
<cite>CrunchBase: On your website you wrote that &quot;RDF and <a href="http://dbpedia.org/resource/SPARQL" id="link-id168e9ad0">SPARQL</a> as productivity boosters in everyday web development&quot;. Can you elaborate on why you believe that to be true?</cite>
</blockquote>
<blockquote>Me: I think the ability to discern a formal description of anything via its discrete properties is of immense value re. productivity, especially when the capability in question results in a graph of <a href="http://dbpedia.org/resource/Linked_Data" id="link-id0x179f6328">Linked Data</a> that isn&#39;t confined to a specific host operating system, database engine, application or service, programming language, or development framework. RDF <a href="http://dbpedia.org/resource/Linked_Data">Linked Data</a> is about infrastructure for the true materialization of the &quot;<a href="http://dbpedia.org/resource/Information" id="link-id13e475b8">Information</a> at Your Fingertips&quot; vision of yore. Even though it&#39;s taken the emergence of RDF Linked Data to make the aforementioned vision tractable, the comprehension of the vision&#39;s intrinsic value have been clear for a very long time. Most organizations and/or individuals are quite familiar with the adage: <a href="http://dbpedia.org/resource/Knowledge" id="link-id13e38a30">Knowledge</a> is Power, well there isn&#39;t any <a href="http://dbpedia.org/resource/Knowledge" id="link-id188b7348">knowledge</a> without accessible <a href="http://dbpedia.org/resource/Information" id="link-id140415d0">Information</a>, and there isn&#39;t any accessible <a href="http://dbpedia.org/resource/Information" id="link-id11a976e8">Information</a> without accessible Data. 
The Web has always been grounded in accessibility to data (albeit via compound container documents called Web Pages).</blockquote> <blockquote>Bottom line, RDF based Linked Data is about Open <a href="http://dbpedia.org/resource/Reference_(computer_science)" id="link-id1206bfb8">Data access by reference</a> using URIs (HTTP based <a href="http://dbpedia.org/resource/Entity" id="link-idfaa6ce0">Entity</a> IDs / Data Object IDs / Data Source Names), and as I said earlier, the intrinsic value is pretty obvious bearing in mind the costs associated with integrating disparate and heterogeneous data sources -- across intranets, extranets, and the <a href="http://dbpedia.org/resource/Internet" id="link-id188ecc68">Internet</a>.</blockquote>
<br />

<blockquote>
<cite>CrunchBase: In his definition of Web 3.0, Nova Spivack proposes that the <a href="http://dbpedia.org/resource/Semantic_Web" id="link-id12e2d968">Semantic Web</a>, or Semantic Web technologies, will be the force behind much of the innovation that will occur during Web 3.0. Do you agree with Nova Spivack? What role, if any, do you feel the <a href="http://dbpedia.org/resource/Semantic_Web" id="link-id13fa4218">Semantic Web</a> will play in Web 3.0?</cite>
</blockquote>
<blockquote>Me: I agree with Nova. But I see Web 3.0 as a phase within the <a href="http://dbpedia.org/resource/Semantic_Web" id="link-id188c9000">Semantic Web</a> innovation continuum. Web 3.0 exists because Web 2.0 exists. Both of these Web versions express usage and technology focus patterns. Web 2.0 is about the use of Open Source technologies to fashion Web Services that are ultimately used to drive proprietary Software as a Service (SaaS) style solutions. Web 3.0 is about the use of &quot;Smart Data Access&quot; to fashion a new generation of Linked Data aware Web Services and solutions that exploit the federated nature of the Web to maximum effect; proprietary branding will simply be conveyed via quality of data (cleanliness, <a href="http://dbpedia.org/resource/Context_%28language_use%29" id="link-id188d2ef8">context</a> fidelity, and comprehension of privacy) exposed by URIs.</blockquote>
<p>Here are some examples of the CrunchBase Linked Data <a href="http://en.wikipedia.org/wiki/Data_Spaces" id="link-id122756f8">Space</a>, as projected via our CrunchBase Sponger Cartridge:</p>
<ol>
<li>
  <a href="http://demo.openlinksw.com/rdfbrowser2/?uri=http%3A%2F%2Fwww.crunchbase.com%2Fcompany%2Famazon" id="link-id13e0fd18">Amazon.com</a>
</li>
<li>
  <a href="http://demo.openlinksw.com/rdfbrowser2/?uri=http%3A%2F%2Fwww.crunchbase.com%2Fcompany%2Fmicrosoft" id="link-id13eef9e0">Microsoft</a>
</li>
<li>
  <a href="http://demo.openlinksw.com/rdfbrowser2/?uri=http%3A%2F%2Fwww.crunchbase.com%2Fcompany%2Fgoogle" id="link-id13fe47a0">Google</a>
</li>
<li>
  <a href="http://demo.openlinksw.com/rdfbrowser2/?uri=http%3A%2F%2Fwww.crunchbase.com%2Fcompany%2Fapple" id="link-id170c73b8">Apple</a>
</li>
</ol>
]]></content:encoded>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2008-04-10#1334">
  <rss:title>Linked Data enabling PHP Applications</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2008-04-10T18:09:49Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">Daniel Lewis has penned a variation of my post about Linked Data enabling PHP applications such as: WordPress, phpBB3, MediaWiki etc. Daniel simplifies my post by using diagrams to depict the different paths for PHP based applications exposing Linked Data - especially those that already provide a significant amount of the content that drives Web 2.0. If all the content in Web 2.0 information resources is distillable into discrete data objects endowed with HTTP based IDs (URIs), with zero &quot;RDF handcrafting Tax&quot;, what do we end up with? A Giant Global Graph of Linked Data; the Web as a Database. So, what used to apply exclusively within enterprise settings re. Oracle, DB2, Informix, Ingres, Sybase, Microsoft SQL Server, MySQL, PostgreSQL, Progress OpenEdge, Firebird, and others, now applies to the Web. The Web becomes the &quot;Distributed Database Bus&quot; that connects database records across disparate databases (or Data Spaces). These databases manage and expose records that are remotely accessible &quot;by reference&quot; via HTTP. As I&#39;ve stated at every opportunity in the past, Web 2.0 is the greatest thing that ever happened to the Semantic Web vision :-) Without the &quot;Web 2.0 Data Silo Conundrum&quot; we wouldn&#39;t have the cry for &quot;Data Portability&quot; that brings a lot of clarity to some fundamental Web 2.0 limitations that end-users ultimately find unacceptable. In the late &#39;80s, the SQL Access Group (now part of X/Open) addressed a similar problem with RDBMS silos within the enterprise that led to the SAG CLI, which exists today as Open Database Connectivity. 
In a sense we now have WODBC (Web Open Database Connectivity), comprised of Web Services based CLIs and/or traditional back-end DBMS CLIs (ODBC, JDBC, ADO.NET, OLE-DB, or Native), Query Language (SPARQL Query Language), and a Wire Protocol (HTTP based SPARQL Protocol) delivering Web infrastructure equivalents of SQL and RDA, but much better, and with much broader scope for delivering profound value due to the Web&#39;s inherent openness. Today&#39;s PHP, Python, Ruby, Tcl, Perl, ASP.NET developer is the enterprise 4GL developer of yore, without enterprise confinement. We could even be talking about 5GL development once the Linked Data interaction is meshed with dynamic languages (delivering higher levels of abstraction at the language and data interaction levels). Even the underlying schemas and basic design will evolve from Closed World (solely) to a mesh of Closed &amp; Open World view schemas.</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>
<a href="http://myopenlink.net/dataspace/person/danieljohnlewis#this" id="link-id10820610">Daniel Lewis</a> has penned a variation of my post about <a href="http://vanirsystems.com/danielsblog/2008/04/10/simplified-adding-wordpress-blogs-into-the-linked-data-web-using-virtuoso/" id="link-id10827948">Linked Data enabling PHP applications</a> such as: <a href="http://dbpedia.org/resource/WordPress" id="link-id10426278">WordPress</a>, <a href="http://dbpedia.org/resource/PhpBB" id="link-id13f431c0">phpBB3</a>, <a href="http://dbpedia.org/resource/MediaWiki" id="link-id10dd8760">MediaWiki</a> etc.</p>

<p>Daniel simplifies my post by using diagrams to depict the different paths for <a href="http://dbpedia.org/resource/PHP" id="link-id10adcc08">PHP</a> based applications exposing <a href="http://dbpedia.org/resource/Linked_Data" id="link-id107b4e60">Linked Data</a> - especially those that already provide a significant amount of the content that drives <a href="http://dbpedia.org/resource/World_Wide_Web" id="link-id13b0ab48">Web</a> 2.0.</p>

<p>If all the content in <a href="http://dbpedia.org/resource/World_Wide_Web" id="link-id0x1d499470">Web</a> 2.0 <a href="http://dbpedia.org/resource/Information" id="link-id12bd3b10">information</a> resources is distillable into discrete <a href="http://dbpedia.org/resource/Data" id="link-id10962060">data</a> objects endowed with <a href="http://dbpedia.org/resource/Hypertext_Transfer_Protocol" id="link-id176a30e8">HTTP</a> based IDs (URIs), with zero &quot;<a href="http://www.openlinksw.com/weblog/public/search.vspx?blogid=127&q=rdf%20tax&type=text&output=html" id="link-id1098bcd8">RDF handcrafting Tax</a>&quot;, what do we end up with? A <a href="http://dbpedia.org/resource/Giant_Global_Graph" id="link-id1372ce88">Giant Global Graph</a> of <a href="http://dbpedia.org/resource/Linked_Data" id="link-id0xa29f0658">Linked Data</a>; the <a href="http://dbpedia.org/resource/World_Wide_Web">Web</a> as a Database.</p> <p>So, what used to apply exclusively within enterprise settings re. <a href="http://dbpedia.org/resource/Oracle_Database" id="link-id12d91448">Oracle</a>, <a href="http://dbpedia.org/resource/IBM_DB2" id="link-id13dd27d8">DB2</a>, <a href="http://dbpedia.org/resource/IBM_Informix" id="link-id108e6b98">Informix</a>, <a href="http://dbpedia.org/resource/Ingres" id="link-id13383708">Ingres</a>, <a href="http://dbpedia.org/resource/Sybase" id="link-idfed8aa8">Sybase</a>, <a href="http://dbpedia.org/resource/Microsoft_SQL_Server" id="link-id10b8b190">Microsoft SQL Server</a>, <a href="http://dbpedia.org/resource/MySQL" id="link-id13066ea8">MySQL</a>, PostgreSQL, Progress OpenEdge, <a href="http://dbpedia.org/resource/Firebird_database_server" id="link-id104f0a78">Firebird</a>, and others, now applies to the Web. 
The Web becomes the &quot;<a href="http://dbpedia.org/resource/federated_database_system" id="link-id105a5340">Distributed Database</a> Bus&quot; that connects database records across disparate databases (or <a href="http://dbpedia.org/resource/Data" id="link-id0xc706c68">Data</a> Spaces). These databases manage and expose records that are remotely accessible &quot;by reference&quot; via <a href="http://dbpedia.org/resource/Hypertext_Transfer_Protocol" id="link-id0x1c8f7fe0">HTTP</a>.</p>

<p>As I&#39;ve stated at every opportunity in the past, Web 2.0 is the greatest thing that ever happened to the <a href="http://dbpedia.org/resource/Semantic_Web" id="link-id13d65278">Semantic Web</a> vision :-) Without the &quot;<a href="http://www.openlinksw.com/weblog/public/search.vspx?blogid=127&q=Web%202.0%20%20conundrum&type=text&output=html" id="link-id100d16d0">Web 2.0 Data Silo Conundrum</a>&quot; we wouldn&#39;t have the cry for &quot;<a href="http://dbpedia.org/resource/Data">Data</a> Portability&quot; that brings a lot of clarity to some fundamental Web 2.0 limitations that end-users ultimately find unacceptable.</p> 
<p>
In the late &#39;80s, the <a href="http://dbpedia.org/resource/SQL" id="link-idff4f0d0">SQL</a> <a href="http://dbpedia.org/resource/SQL_Access_Group" id="link-id138fbd40">Access Group</a> (now part of <a href="http://dbpedia.org/resource/X/Open" id="link-id104ee010">X</a>/<a href="http://dbpedia.org/resource/X/Open" id="link-id0xac9eab8">Open</a>) addressed a similar problem with <a href="http://dbpedia.org/resource/Relational_database_management_system" id="link-id106d2008">RDBMS</a> silos within the enterprise that led to the SAG <a href="http://dbpedia.org/resource/Call_Level_Interface" id="link-id105d45d0">CLI</a>, which exists today as Open Database Connectivity.</p>

<p>In a sense we now have WODBC (Web Open Database Connectivity), comprised of Web Services based CLIs and/or traditional back-end DBMS CLIs (<a href="http://dbpedia.org/resource/Open_Database_Connectivity" id="link-id13f58708">ODBC</a>, <a href="http://dbpedia.org/resource/Java_Database_Connectivity" id="link-id10aa81e0">JDBC</a>, <a href="http://dbpedia.org/resource/ADO.NET" id="link-id5fddb68">ADO</a>.<a href="http://dbpedia.org/resource/ADO.NET" id="link-id0x9f085a10">NET</a>, OLE-DB, or Native),  Query Language (<a href="http://dbpedia.org/resource/SPARQL" id="link-id10adb5c8">SPARQL</a> Query Language), and a Wire Protocol (<a href="http://dbpedia.org/resource/Hypertext_Transfer_Protocol">HTTP</a> based <a href="http://www.w3.org/TR/rdf-sparql-protocol/" id="link-id126fa068">SPARQL Protocol</a>) delivering Web infrastructure equivalents of <a href="http://dbpedia.org/resource/SQL" id="link-id0x1d0a5fc8">SQL</a> and RDA, but much better, and with much broader scope for delivering profound value due to the Web&#39;s inherent openness. Today&#39;s <a href="http://dbpedia.org/resource/PHP" id="link-id0xc88ed68">PHP</a>, <a href="http://dbpedia.org/resource/Python_programming_language" id="link-id10a70530">Python</a>, <a href="http://dbpedia.org/resource/Ruby_programming_language" id="link-id13d9da18">Ruby</a>, <a href="http://dbpedia.org/resource/Tcl" id="link-id10a3c2a8">Tcl</a>, <a href="http://dbpedia.org/resource/Perl" id="link-id13e1b6f0">Perl</a>, <a href="http://dbpedia.org/resource/ASP.NET" id="link-id10810388">ASP</a>.<a href="http://dbpedia.org/resource/ASP.NET" id="link-id0xa22ce378">NET</a>  developer is the enterprise <a href="http://dbpedia.org/resource/4GL" id="link-id1396a500">4GL</a> developer of yore, without enterprise confinement. 
We could even be talking about <a href="http://dbpedia.org/resource/5GL" id="link-id1077f250">5GL</a> development once the <a href="http://dbpedia.org/resource/Linked_Data">Linked Data</a> interaction is meshed with dynamic languages (delivering higher levels of abstraction at the language and data interaction levels). Even the underlying schemas and  basic design will evolve from <a href="http://dbpedia.org/resource/Closed_world_assumption" id="link-id10b280c8">Closed World</a> (solely) to a mesh of Closed &amp; <a href="http://dbpedia.org/resource/Open_world_assumption" id="link-id104b9978">Open World</a> view schemas.</p>]]></content:encoded>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/vdb/blog/?date=2007-08-27#1245">
  <rss:title>Virtuoso Cluster Preview</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2007-08-27T09:51:45Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">Virtuoso Cluster Preview I wrote the basics of the Virtuoso clustering support over the past three weeks. It can now manage connections, decide where things go, do two phase commits, and insert and select data from tables partitioned over multiple Virtuoso instances. It works well enough to be measured, of which I will blog more over the next two weeks. I will in the following give a features preview of what will be in the Virtuoso clustering support when it is released in the fall of this year (2007). Data Partitioning A Virtuoso database consists of indices only, so that the row of a table is stored together with the primary key. Blobs are stored on separate pages when they do not fit inline within the row. With clustering, partitioning can be specified index by index. Partitioning means that values of specific columns are used for determining where the containing index entry will be stored. Virtuoso partitions by hash and allows specifying what parts of partitioning columns are used for the hash, for example bits 14-6 of an integer or the first 5 characters of a string. This way, key compression gains are not lost, as they would be if consecutive values were stored on different partitions. Once the partitioning is specified, we specify which set of cluster nodes stores this index. Not every index has to be split evenly across all nodes. Also, all nodes do not have to have equal slices of the partitioned index, accommodating differences in capacity between cluster nodes. Each Virtuoso instance can manage up to 32TB of data. A cluster has no definite size limit. Load Balancing and Fault Tolerance When data is partitioned, an operation on the data goes where the data is. This provides a certain natural parallelism but we will discuss this further below. 
Some data may be stored multiple times in the cluster, either for fail-over or for splitting read load. Some data, such as database schema, is replicated on all nodes. When specifying a set of nodes for storing the partitions of a key, it is possible to specify multiple nodes for the same partition. If this is the case, updates go to all nodes and reads go to a randomly picked node from the group. If one of the nodes in the group fails, operation can resume with the surviving node. The failed node can be brought back online from the transaction logs of the surviving nodes. A few transactions may be rolled back at the time of failure and again at the time of the failed node rejoining the cluster but these are aborts as in the case of deadlock and lose no committed data. Shared Nothing The Virtuoso architecture does not require a SAN for disk sharing across nodes. This is reasonable since a few disks on a local controller can easily provide 300MB/s of read and passing this over an interconnect fabric that would also have to carry inter-node messages could saturate even a fast network. Client View A SQL or HTTP client can connect to any node of the cluster and get an identical view of all data with full transactional semantics. DDL operations like table creation and package installation are limited to one node, though. Applications such as ODS will run unmodified. They are installed on all nodes with a single install command. After this, the data partitioning must be declared, which is a one time operation to be done cluster by cluster. The only application change is specifying the partitioning columns for each index. The gain is optional redundant storage and capacity not limited to a single machine. The penalty is that single operations may take a little longer when not all data is managed by the same process but then the parallel throughput is increased. We note that the main ODS performance factor is web page logic and not database access. 
Thus splitting the web server logic over multiple nodes gives basically linear scaling. Parallel Query Execution Message latency is the principal performance factor in a clustered database. Due to this, Virtuoso packs the maximum number of operations in a single message. For example, when doing a loop join that reads one table sequentially and retrieves a row of another table for each row of the outer table, a large number of the inner-loop lookups are run in parallel. So, if there is a join of five tables that gets one row from each table and all rows are on different nodes, the time will be spent on message latency. If each step of the join gets 10 rows, for a total of 100000 results, the message latency is not a significant factor and the cluster will clearly outperform a single node. Also, if the workload consists of large numbers of concurrent short updates or queries, the message latencies will even out and throughput will scale up even if doing a single transaction were faster on a single node. Parallel SQL There are SQL extensions for stored procedures allowing parallelizing operations. For example, if a procedure has a loop doing inserts, the inserted rows can be buffered until a sufficient number is available, at which point they are sent in batches to the nodes concerned. Transactional semantics are kept but error detection is deferred to the actual execution. Transactions Each transaction is owned by one node of the cluster, the node to which the client is connected. When more than one node besides the owner of the transaction is updated, two phase commit is used. This is transparent to the application code. No external transaction monitor is required; the Virtuoso instances perform these functions internally. There is a distributed deadlock detection scheme based on the nodes periodically sharing transaction waiting information. 
Since read transactions can operate without locks, reading the last committed state of uncommitted updated rows, waiting for locks is not very common. Interconnect and Threading Virtuoso uses TCP to connect between instances. A single instance can have multiple listeners at different network interfaces for cluster activity. The interfaces will be used in a round-robin fashion by the peers, spreading the load over all network interfaces. A separate thread is created for monitoring each interface. Long messages, such as transfers of blobs, are done on a separate thread, thus allowing normal service on the cluster node while the transfer is proceeding. We will have to test the performance of TCP over Infiniband to see if there is clear gain in going to a lower level interface like MPI. The Virtuoso architecture is based on streams connecting cluster nodes point to point. The design does not per se gain from remote DMA or other features provided by MPI. Typically, messages are quite short, under 100K. Flow control for transfer of blobs is however nice to have but can be written at the application level if needed. We will get real data on the performance of different interconnects in the coming weeks. Deployment and Management Configuring is quite simple, with each process sharing a copy of the same configuration file. One line in the file differs from host to host, telling it which one it is. Otherwise the database configuration files are individual per host, accommodating different file system layouts etc. Setting up a node requires copying the executable and two configuration files, no more. All functionality is contained in a single process. There are no installers to be run or such. 
Changing the number or network interface of cluster nodes requires a cluster restart. Changing data partitioning requires copying the data into a new table and renaming this over the old one. This is time consuming and does not mix well with updates. Splitting an existing cluster node requires no copying with repartitioning but shifting data between partitions does. A consolidated status report shows the general state and level of intra-cluster traffic as count of messages and count of bytes. Start, shutdown, backup, and package installation commands can only be issued from a single master node. Otherwise all is symmetrical. Present State and Next Developments The basics are now in place. Some code remains to be written for such things as distributed deadlock detection, 2-phase commit recovery cycle, management functions, etc. Some SQL operations like text index, statistics sampling, and index intersection need special support, yet to be written. The RDF capabilities are not specifically affected by clustering except in a couple of places. Loading will be slightly revised to use larger batches of rows to minimize latency, for example. There is a pretty much infinite world of SQL optimizations for splitting aggregates, taking advantage of co-located joins etc. These will be added gradually. These are however not really central to the first application of RDF storage but are quite important for business intelligence, for example. We will run some benchmarks for comparing single host and clustered Virtuoso instances over the coming weeks. Some of this will be with real data, giving an estimate on when we can move some of the RDF data we presently host to the new platform. We will benchmark against Oracle and DB2 later but first we get things to work and compare against ourselves. 
We roughly expect a halving in space consumption and a significant increase in single query performance and linearly scaling parallel throughput through addition of cluster nodes. The next update will be on this blog within two weeks.</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<div>
<div style="display:none;">Virtuoso Cluster Preview</div>
<p>
 <b><i>I wrote the basics of the <a href="http://www.openlinksw.com/dataspace/oerling/weblog/Orri%20Erling%27s%20Blog/1229" id="link-id1383c310">Virtuoso clustering support</a> over the past three weeks. It can now manage connections, decide where things go, do two phase commits, and insert and select <a href="http://dbpedia.org/resource/Data" id="link-id0xad603e0">data</a> from tables partitioned over multiple <a href="http://virtuoso.openlinksw.com" id="link-id0xbc64f48">Virtuoso</a> instances. It works well enough to be measured, of which I will <a href="http://dbpedia.org/resource/Blog" id="link-id0xb958e90">blog</a> more over the next two weeks.</i>
 </b>
</p>
<p>
 <b><i>I will in the following give a features preview of what will be in the Virtuoso clustering support when it is released in the fall of this year (2007).</i>
 </b>
</p>
<h3>Data Partitioning</h3>
<p>A Virtuoso database consists of indices only, so that the row of a table is stored together with the primary key. Blobs are stored on separate pages when they do not fit inline within the row. With clustering, partitioning can be specified index by index. Partitioning means that values of specific columns are used for determining where the containing index entry will be stored. Virtuoso partitions by hash and allows specifying what parts of partitioning columns are used for the hash, for example bits 14-6 of an integer or the first 5 characters of a string. This way, key compression gains are not lost, as they would be if consecutive values were stored on different partitions.</p>
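<p>As a rough illustration of partitioning on part of a column value, the following Python sketch keeps only bits 14-6 of an integer key, or the first 5 characters of a string key, before hashing. It is not Virtuoso code: the function names are hypothetical and CRC32 merely stands in for whatever hash function Virtuoso actually uses.</p>
<pre>
```python
import zlib


def int_partition_key(value, high_bit=14, low_bit=6):
    """Keep only bits high_bit..low_bit of an integer key (bit 0 is the lowest)."""
    width = high_bit - low_bit + 1
    return (value // 2 ** low_bit) % 2 ** width


def str_partition_key(value, prefix_len=5):
    """Keep only the first prefix_len characters of a string key."""
    return value[:prefix_len]


def partition_of(key_part, n_partitions):
    """Map the retained part of the key to a partition number.

    Assumption: CRC32 is a stand-in hash chosen for this sketch only.
    """
    return zlib.crc32(str(key_part).encode("utf-8")) % n_partitions


if __name__ == "__main__":
    # Keys 64..127 agree on bits 14-6, so a run of consecutive keys stays together.
    print({partition_of(int_partition_key(k), 4) for k in range(64, 128)})
```
</pre>
<p>Because the low 6 bits are ignored, each run of 64 consecutive integers lands on one partition, which is what keeps neighbouring keys compressible.</p>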
<p>Once the partitioning is specified, we specify which set of cluster nodes stores this index. Not every index has to be split evenly across all nodes. Also, all nodes do not have to have equal slices of the partitioned index, accommodating differences in capacity between cluster nodes.</p>
<p>Each Virtuoso instance can manage up to 32TB of data. A cluster has no definite size limit.</p>
<h3>Load Balancing and Fault Tolerance</h3>
<p>When data is partitioned, an operation on the data goes where the data is. This provides a certain natural parallelism but we will discuss this further below.</p>
<p>Some data may be stored multiple times in the cluster, either for fail-over or for splitting read load. Some data, such as database schema, is replicated on all nodes. When specifying a set of nodes for storing the partitions of a key, it is possible to specify multiple nodes for the same partition. If this is the case, updates go to all nodes and reads go to a randomly picked node from the group.</p>
<p>If one of the nodes in the group fails, operation can resume with the surviving node. The failed node can be brought back online from the transaction logs of the surviving nodes. A few transactions may be rolled back at the time of failure and again at the time of the failed node rejoining the cluster but these are aborts as in the case of deadlock and lose no committed data.</p>
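<p>The routing rule for a replicated partition can be sketched in a few lines of Python. This is only a model of the behaviour described above, with a hypothetical <code>ReplicaGroup</code> class and made-up node names; log-based recovery of a failed node is left out.</p>
<pre>
```python
import random


class ReplicaGroup:
    """Hypothetical model of the nodes that all store one partition."""

    def __init__(self, nodes, rng=None):
        self.live = list(nodes)
        self.rng = rng or random.Random()

    def write_targets(self):
        # Updates go to every live node of the group.
        return list(self.live)

    def read_target(self):
        # Reads go to one randomly picked live node.
        return self.rng.choice(self.live)

    def mark_failed(self, node):
        # After a failure, operation resumes with the surviving nodes.
        self.live.remove(node)


if __name__ == "__main__":
    group = ReplicaGroup(["node-a", "node-b"])
    print(group.write_targets(), group.read_target())
```
</pre>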
<h3>Shared Nothing</h3>
<p>The Virtuoso architecture does not require a SAN for disk sharing across nodes. This is reasonable since a few disks on a local controller can easily provide 300MB/s of read and passing this over an interconnect fabric that would also have to carry inter-node messages could saturate even a fast network. </p>
<h3>Client View</h3>
<p>A <a href="http://dbpedia.org/resource/SQL" id="link-id0xbb32b58">SQL</a> or <a href="http://dbpedia.org/resource/Hypertext_Transfer_Protocol" id="link-id0xaa4e2b8">HTTP</a> client can connect to any node of the cluster and get an identical view of all data with full transactional semantics. DDL operations like table creation and package installation are limited to one node, though.</p>
<p>Applications such as <a href="http://dbpedia.org/resource/OpenLink_Data_Spaces" id="link-id0x1bc18300">ODS</a> will run unmodified. They are installed on all nodes with a single install command. After this, the data partitioning must be declared, which is a one time operation to be done cluster by cluster. The only application change is specifying the partitioning columns for each index. The gain is optional redundant storage and capacity not limited to a single machine. The penalty is that single operations may take a little longer when not all data is managed by the same process but then the parallel throughput is increased. We note that the main ODS performance factor is web page logic and not database access. Thus splitting the web server logic over multiple nodes gives basically linear scaling.</p>
<h3>Parallel Query Execution</h3>
<p>Message latency is the principal performance factor in a clustered database. Due to this, Virtuoso packs the maximum number of operations in a single message. For example, when doing a loop join that reads one table sequentially and retrieves a row of another table for each row of the outer table, a large number of the inner-loop lookups are run in parallel. So, if there is a join of five tables that gets one row from each table and all rows are on different nodes, the time will be spent on message latency. If each step of the join gets 10 rows, for a total of 100000 results, the message latency is not a significant factor and the cluster will clearly outperform a single node.</p>
<p>Also, if the workload consists of large numbers of concurrent short updates or queries, the message latencies will even out and throughput will scale up even if doing a single transaction were faster on a single node.</p> <h3>Parallel SQL</h3> <p>There are SQL extensions for stored procedures allowing parallelizing operations. For example, if a procedure has a loop doing inserts, the inserted rows can be buffered until a sufficient number is available, at which point they are sent in batches to the nodes concerned. Transactional semantics are kept but error detection is deferred to the actual execution.</p>
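<p>The buffering of inserts can be pictured with the following Python sketch. It does not show the SQL extensions themselves; the <code>BatchingInserter</code> class, the <code>send</code> callback and the modulo placement of keys are all assumptions made for the illustration.</p>
<pre>
```python
from collections import defaultdict


class BatchingInserter:
    """Buffer rows per target node and ship them in batches."""

    def __init__(self, n_nodes, batch_size, send):
        self.n_nodes = n_nodes
        self.batch_size = batch_size
        self.send = send  # hypothetical callback: send(node, rows)
        self.buffers = defaultdict(list)

    def insert(self, key, row):
        # Assumption: integer keys placed by simple modulo, for illustration only.
        node = key % self.n_nodes
        rows = self.buffers[node]
        rows.append(row)
        if len(rows) == self.batch_size:
            self.flush_node(node)

    def flush_node(self, node):
        rows = self.buffers.pop(node, [])
        if rows:
            # Any error for these rows would surface here, not at insert() time.
            self.send(node, rows)

    def flush(self):
        # Ship whatever is still buffered, e.g. before commit.
        for node in sorted(self.buffers):
            self.flush_node(node)


if __name__ == "__main__":
    inserter = BatchingInserter(2, 2, lambda node, rows: print(node, rows))
    for k in range(5):
        inserter.insert(k, ("row", k))
    inserter.flush()
```
</pre>
<p>With two nodes and a batch size of two, five inserts result in three messages instead of five, which is the latency saving the batching is after.</p>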
<h3>Transactions</h3>
<p>Each transaction is owned by one node of the cluster, the node to which the client is connected. When more than one node besides the owner of the transaction is updated, two phase commit is used. This is transparent to the application code. No external transaction monitor is required; the Virtuoso instances perform these functions internally. There is a distributed deadlock detection scheme based on the nodes periodically sharing transaction waiting <a href="http://dbpedia.org/resource/Information" id="link-id0xb78b5c0">information</a>.</p>
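<p>For readers unfamiliar with two phase commit, here is a textbook-style Python sketch of the protocol as the owning node might drive it. It is a generic illustration and not the Virtuoso implementation: the <code>Participant</code> class and its state names are invented, and logging and recovery are omitted.</p>
<pre>
```python
class Participant:
    """Hypothetical stand-in for one cluster node touched by a transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1: vote on whether this node is able to commit.
        if self.can_commit:
            self.state = "prepared"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Commit on all participants, or roll back everywhere if any vote is no."""
    for p in participants:
        if not p.prepare():
            for q in participants:
                q.rollback()
            return False
    # Phase 2: every node voted yes, so the decision is commit.
    for p in participants:
        p.commit()
    return True


if __name__ == "__main__":
    nodes = [Participant("n1"), Participant("n2", can_commit=False)]
    print(two_phase_commit(nodes), [n.state for n in nodes])
```
</pre>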
<p>Since read transactions can operate without locks, reading the last committed state of uncommitted updated rows, waiting for locks is not very common.</p>
<h3>Interconnect and Threading</h3>
<p>Virtuoso uses TCP to connect between instances. A single instance can have multiple listeners at different network interfaces for cluster activity. The interfaces will be used in a round-robin fashion by the peers, spreading the load over all network interfaces. A separate thread is created for monitoring each interface. Long messages, such as transfers of blobs, are done on a separate thread, thus allowing normal service on the cluster node while the transfer is proceeding.</p>
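<p>The round-robin use of a peer's listeners amounts to cycling through its advertised addresses. A minimal Python sketch, with a hypothetical <code>PeerLink</code> class and made-up addresses and port:</p>
<pre>
```python
from itertools import cycle


class PeerLink:
    """Pick the next listener address of a peer in round-robin order."""

    def __init__(self, addresses):
        self._next = cycle(list(addresses))

    def pick_interface(self):
        return next(self._next)


if __name__ == "__main__":
    # Hypothetical addresses: one peer reachable over two network interfaces.
    link = PeerLink(["10.0.0.1:22201", "10.0.1.1:22201"])
    print([link.pick_interface() for _ in range(4)])
```
</pre>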
<p>We will have to test the performance of TCP over <i>Infiniband</i> to see if there is clear gain in going to a lower level interface like <i>MPI</i>. The Virtuoso architecture is based on streams connecting cluster nodes point to point. The design does not per se gain from remote DMA or other features provided by MPI. Typically, messages are quite short, under 100K. Flow control for transfer of blobs is however nice to have but can be written at the application level if needed. We will get real data on the performance of different interconnects in the coming weeks. </p>
<h3>Deployment and Management</h3>
<p>Configuring is quite simple, with each process sharing a copy of the same configuration file. One line in the file differs from host to host, telling it which one it is. Otherwise the database configuration files are individual per host, accommodating different file system layouts etc. Setting up a node requires copying the executable and two configuration files, no more. All functionality is contained in a single process. There are no installers to be run or such.</p>
<p>Changing the number or network interface of cluster nodes requires a cluster restart. Changing data partitioning requires copying the data into a new table and renaming this over the old one. This is time consuming and does not mix well with updates. Splitting an existing cluster node requires no copying with repartitioning but shifting data between partitions does.</p>
<p>A consolidated status report shows the general state and level of intra-cluster traffic as count of messages and count of bytes.</p>
<p>Start, shutdown, backup, and package installation commands can only be issued from a single master node. Otherwise all is symmetrical.</p>
<h3>Present State and Next Developments</h3>
<p>The basics are now in place. Some code remains to be written for such things as distributed deadlock detection, 2-phase commit recovery cycle, management functions, etc. Some SQL operations like text index, statistics sampling, and index intersection need special support, yet to be written.</p>
<p>The <a href="http://dbpedia.org/resource/Resource_Description_Framework" id="link-id0xbca4e90">RDF</a> capabilities are not specifically affected by clustering except in a couple of places. Loading will be slightly revised to use larger batches of rows to minimize latency, for example.</p>
<p>There is a pretty much infinite world of SQL optimizations for splitting aggregates, taking advantage of co-located joins etc. These will be added gradually. These are however not really central to the first application of RDF storage but are quite important for business intelligence, for example.</p>
<p>We will run some benchmarks for comparing single host and clustered Virtuoso instances over the next weeks. Some of this will be with real data, giving an estimate on when we can move some of the RDF data we presently host to the new platform. We will benchmark against <a href="http://dbpedia.org/resource/Oracle_Database" id="link-id0xa9cc1b8">Oracle</a> and <a href="http://dbpedia.org/resource/IBM_DB2" id="link-id0x1be5abb0">DB2</a> later but first we get things to work and compare against ourselves.</p>
<p>We roughly expect a halving in space consumption and a significant increase in single query performance and linearly scaling parallel throughput through addition of cluster nodes.</p>
<p>
<i>The <a href="http://www.openlinksw.com/dataspace/oerling/weblog/Orri%20Erling%27s%20Blog/1246" id="link-id106de430">next update</a> will be on this blog within two weeks.</i>
</p>
</div>]]></content:encoded>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/weblog/oerling/?date=2007-08-27#1244">
  <rss:title>Virtuoso Cluster Preview</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2007-08-27T09:44:40Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">I wrote the basics of the Virtuoso clustering support over the past three weeks. It can now manage connections, decide where things go, do two phase commits, insert and select data from tables partitioned over multiple Virtuoso instances. It works about enough to be measured, of which I will blog more over the next two weeks. I will in the following give a features preview of what will be in the Virtuoso clustering support when it is released in the fall of this year (2007). Data Partitioning A Virtuoso database consists of indices only, so that the row of a table is stored together with the primary key. Blobs are stored on separate pages when they do not fit inline within the row. With clustering, partitioning can be specified index by index. Partitioning means that values of specific columns are used for determining where the containing index entry will be stored. Virtuoso partitions by hash and allows specifying what parts of partitioning columns are used for the hash, for example bits 14-6 of an integer or the first 5 characters of a string. Like this, key compression gains are not lost by storing consecutive values on different partitions. Once the partitioning is specified, we specify which set of cluster nodes stores this index. Not every index has to be split evenly across all nodes. Also, all nodes do not have to have equal slices of the partitioned index, accommodating differences in capacity between cluster nodes. Each Virtuoso instance can manage up to 32TB of data. A cluster has no definite size limit. Load Balancing and Fault Tolerance When data is partitioned, an operation on the data goes where the data is. This provides a certain natural parallelism but we will discuss this further below. 
Some data may be stored multiple times in the cluster, either for fail-over or for splitting read load. Some data, such as database schema, is replicated on all nodes. When specifying a set of nodes for storing the partitions of a key, it is possible to specify multiple nodes for the same partition. If this is the case, updates go to all nodes and reads go to a randomly picked node from the group. If one of the nodes in the group fails, operation can resume with the surviving node. The failed node can be brought back online from the transaction logs of the surviving nodes. A few transactions may be rolled back at the time of failure and again at the time of the failed node rejoining the cluster but these are aborts as in the case of deadlock and lose no committed data. Shared Nothing The Virtuoso architecture does not require a SAN for disk sharing across nodes. This is reasonable since a few disks on a local controller can easily provide 300MB/s of read and passing this over an interconnect fabric that would also have to carry inter-node messages could saturate even a fast network. Client View A SQL or HTTP client can connect to any node of the cluster and get an identical view of all data with full transactional semantics. DDL operations like table creation and package installation are limited to one node, though. Applications such as ODS will run unmodified. They are installed on all nodes with a single install command. After this, the data partitioning must be declared, which is a one time operation to be done cluster by cluster. The only application change is specifying the partitioning columns for each index. The gain is optional redundant storage and capacity not limited to a single machine. The penalty is that single operations may take a little longer when not all data is managed by the same process but then the parallel throughput is increased. We note that the main ODS performance factor is web page logic and not database access. 
Thus splitting the web server logic over multiple nodes gives basically linear scaling. Parallel Query Execution Message latency is the principal performance factor in a clustered database. Due to this, Virtuoso packs the maximum number of operations in a single message. For example, when doing a loop join that reads one table sequentially and retrieves a row of another table for each row of the outer table, a large number of the join of the inner loop are run in parallel. So, if there is a join of five tables that gets one row from each table and all rows are on different nodes, the time will be spent on message latency. If each step of the join gets 10 rows, for a total of 100000 results, the message latency is not a significant factor and the cluster will clearly outperform a single node. Also, if the workload consists of large numbers of concurrent short updates or queries, the message latencies will even out and throughput will scale up even if doing a single transaction were faster on a single node. Parallel SQL There are SQL extensions for stored procedures allowing parallelizing operations. For example, if a procedure has a loop doing inserts, the inserted rows can be buffered until a sufficient number is available, at which point they are sent in batches to the nodes concerned. Transactional semantics are kept but error detection is deferred to the actual execution. Transactions Each transaction is owned by one node of the cluster, the node to which the client is connected. When more than one node besides the owner of the transaction is updated, two phase commit is used. This is transparent to the application code. No external transaction monitor is required, the Virtuoso instances perform these functions internally. There is a distributed deadlock detection scheme based on the nodes periodically sharing transaction waiting information. 
Since read transactions can operate without locks, reading the last committed state of uncommitted updated rows, waiting for locks is not very common. Interconnect and Threading Virtuoso uses TCP to connect between instances. A single instance can have multiple listeners at different network interfaces for cluster activity. The interfaces will be used in a round-robin fashion by the peers, spreading the load over all network interfaces. A separate thread is created for monitoring each interface. Long messages, such as transfers of blobs are done on a separate thread, thus allowing normal service on the cluster node while the transfer is proceeding. We will have to test the performance of TCP over Infiniband to see if there is clear gain in going to a lower level interface like MPI. The Virtuoso architecture is based on streams connecting cluster nodes point to point. The design does not per se gain from remote DMA or other features provided by MPI. Typically, messages are quite short, under 100K. Flow control for transfer of blobs is however nice to have but can be written at the application level if needed. We will get real data on the performance of different interconnects in the next weeks. Deployment and Management Configuring is quite simple, with each process sharing a copy of the same configuration file. One line in the file differs from host to host, telling it which one it is. Otherwise the database configuration files are individual per host, accommodating different file system layouts etc. Setting up a node requires copying the executable and two configuration files, no more. All functionality is contained in a single process. There are no installers to be run or such. 
Changing the number or network interface of cluster nodes requires a cluster restart. Changing data partitioning requires copying the data into a new table and renaming this over the old one. This is time consuming and does not mix well with updates. Splitting an existing cluster node requires no copying with repartitioning but shifting data between partitions does. A consolidated status report shows the general state and level of intra-cluster traffic as count of messages and count of bytes. Start, shutdown, backup, and package installation commands can only be issued from a single master node. Otherwise all is symmetrical. Present State and Next Developments The basics are now in place. Some code remains to be written for such things as distributed deadlock detection, 2-phase commit recovery cycle, management functions, etc. Some SQL operations like text index, statistics sampling, and index intersection need special support, yet to be written. The RDF capabilities are not specifically affected by clustering except in a couple of places. Loading will be slightly revised to use larger batches of rows to minimize latency, for example. There is a pretty much infinite world of SQL optimizations for splitting aggregates, taking advantage of co-located joins etc. These will be added gradually. These are however not really central to the first application of RDF storage but are quite important for business intelligence, for example. We will run some benchmarks for comparing single host and clustered Virtuoso instances over the next weeks. Some of this will be with real data, giving an estimate on when we can move some of the RDF data we presently host to the new platform. We will benchmark against Oracle and DB2 later but first we get things to work and compare against ourselves. 
We roughly expect a halving in space consumption and a significant increase in single query performance and linearly scaling parallel throughput through addition of cluster nodes. The next update will be on this blog within two weeks.</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>
 <b><i>I wrote the basics of the <a href="http://www.openlinksw.com/dataspace/oerling/weblog/Orri%20Erling%27s%20Blog/1229" id="link-id1383c310">Virtuoso clustering support</a> over the past three weeks. It can now manage connections, decide where things go, do two phase commits, insert and select <a href="http://dbpedia.org/resource/Data" id="link-id0xbbf5988">data</a> from tables partitioned over multiple <a href="http://virtuoso.openlinksw.com" id="link-id0x1da47d98">Virtuoso</a> instances. It works about enough to be measured, of which I will <a href="http://dbpedia.org/resource/Blog" id="link-id0xabf4a10">blog</a> more over the next two weeks.</i>
 </b>
</p>
<p>
 <b><i>I will in the following give a features preview of what will be in the Virtuoso clustering support when it is released in the fall of this year (2007).</i>
 </b>
</p>
<h3>Data Partitioning</h3>
<p>A Virtuoso database consists of indices only, so that the row of a table is stored together with the primary key. Blobs are stored on separate pages when they do not fit inline within the row. With clustering, partitioning can be specified index by index. Partitioning means that values of specific columns are used for determining where the containing index entry will be stored. Virtuoso partitions by hash and allows specifying what parts of partitioning columns are used for the hash, for example bits 14-6 of an integer or the first 5 characters of a string. This way, consecutive values stay on the same partition and key compression gains are not lost.</p>
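<p>As a rough illustration of hashing on part of a column, here is a minimal Python sketch. It is not Virtuoso code: the function names are hypothetical and CRC32 merely stands in for whatever hash the server really uses. It only shows how taking bits 14-6 of an integer, or the first 5 characters of a string, keeps neighboring keys together.</p>

```python
import zlib


def partition_key(value, bit_range=None, prefix_len=None):
    """Reduce a column value to the part that takes part in the hash.

    bit_range=(14, 6) keeps bits 14 down to 6 of an integer;
    prefix_len=5 keeps the first 5 characters of a string.
    Both parameter names are hypothetical, chosen for this sketch.
    """
    if isinstance(value, int) and bit_range is not None:
        high, low = bit_range
        width = high - low + 1
        return (value >> low) & ((1 << width) - 1)
    if isinstance(value, str) and prefix_len is not None:
        return value[:prefix_len]
    return value


def partition_of(value, n_partitions, **kw):
    """Map a value to a partition number; CRC32 is a stand-in hash."""
    key = partition_key(value, **kw)
    return zlib.crc32(str(key).encode("utf-8")) % n_partitions


# Integers 0..63 share the same bits 14-6, so they land together.
same = {partition_of(i, 4, bit_range=(14, 6)) for i in range(64)}
print(len(same))
```

<p>With this choice of bits, each run of 64 consecutive integers hashes to one partition, which is the property the paragraph above relies on for key compression.</p>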
<p>Once the partitioning is specified, we specify which set of cluster nodes stores this index. Not every index has to be split evenly across all nodes. Also, nodes do not all have to hold equal slices of the partitioned index, which accommodates differences in capacity between cluster nodes.</p>
<p>Each Virtuoso instance can manage up to 32TB of data. A cluster has no definite size limit.</p>
<h3>Load Balancing and Fault Tolerance</h3>
<p>When data is partitioned, an operation on the data goes where the data is. This provides a certain natural parallelism but we will discuss this further below.</p>
<p>Some data may be stored multiple times in the cluster, either for fail-over or for splitting read load. Some data, such as database schema, is replicated on all nodes. When specifying a set of nodes for storing the partitions of a key, it is possible to specify multiple nodes for the same partition. If this is the case, updates go to all nodes and reads go to a randomly picked node from the group.</p>
<p>If one of the nodes in the group fails, operation can resume with the surviving node. The failed node can be brought back online from the transaction logs of the surviving nodes. A few transactions may be rolled back at the time of failure and again at the time of the failed node rejoining the cluster but these are aborts as in the case of deadlock and lose no committed data.</p>
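<p>The routing rule for a replicated partition can be sketched in a few lines of Python. The class and method names are hypothetical and nothing here reflects the actual server code. It only models "updates go to all nodes, reads go to a random one, survivors carry on after a failure".</p>

```python
import random


class ReplicaGroup:
    """Nodes that all hold a copy of the same partition (illustrative only)."""

    def __init__(self, nodes, rng=None):
        self.nodes = list(nodes)
        self.rng = rng or random.Random()

    def write_targets(self):
        # An update must reach every live copy of the partition.
        return list(self.nodes)

    def read_target(self):
        # A read can be served by any one copy, picked at random.
        return self.rng.choice(self.nodes)

    def fail(self, node):
        # After a failure the surviving nodes keep serving the partition.
        self.nodes.remove(node)


group = ReplicaGroup(["node1", "node2"])
group.fail("node1")
print(group.read_target())
```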
<h3>Shared Nothing</h3>
<p>The Virtuoso architecture does not require a SAN for disk sharing across nodes. This is reasonable since a few disks on a local controller can easily provide 300MB/s of read throughput, and passing this over an interconnect fabric that would also have to carry inter-node messages could saturate even a fast network.</p>
<h3>Client View</h3>
<p>A <a href="http://dbpedia.org/resource/SQL" id="link-id0x9fc302a0">SQL</a> or <a href="http://dbpedia.org/resource/Hypertext_Transfer_Protocol" id="link-id0x19faa348">HTTP</a> client can connect to any node of the cluster and get an identical view of all data with full transactional semantics. DDL operations like table creation and package installation are limited to one node, though.</p>
<p>Applications such as <a href="http://dbpedia.org/resource/OpenLink_Data_Spaces" id="link-id0x20cd9e98">ODS</a> will run unmodified. They are installed on all nodes with a single install command. After this, the data partitioning must be declared, which is a one-time operation to be done cluster by cluster. The only application change is specifying the partitioning columns for each index. The gain is optional redundant storage and capacity not limited to a single machine. The penalty is that single operations may take a little longer when not all data is managed by the same process, but in exchange the parallel throughput is increased. We note that the main ODS performance factor is web page logic and not database access. Thus splitting the web server logic over multiple nodes gives basically linear scaling.</p>
<h3>Parallel Query Execution</h3>
<p>Message latency is the principal performance factor in a clustered database. Due to this, Virtuoso packs the maximum number of operations in a single message. For example, when doing a loop join that reads one table sequentially and retrieves a row of another table for each row of the outer table, a large number of the inner-loop lookups are run in parallel. So, if there is a join of five tables that gets one row from each table and all rows are on different nodes, the time will be spent on message latency. If each step of the join gets 10 rows, for a total of 100000 results, the message latency is not a significant factor and the cluster will clearly outperform a single node.</p>
<p>Also, if the workload consists of large numbers of concurrent short updates or queries, the message latencies will even out and throughput will scale up even if doing a single transaction were faster on a single node.</p>
<h3>Parallel SQL</h3>
<p>There are SQL extensions for stored procedures allowing parallelizing operations. For example, if a procedure has a loop doing inserts, the inserted rows can be buffered until a sufficient number is available, at which point they are sent in batches to the nodes concerned. Transactional semantics are kept but error detection is deferred to the actual execution.</p>
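<p>The buffering of inserts described above can be pictured with the following Python sketch. This is not the SQL extension itself, whose syntax is not shown in this post. The class, the <code>node_of</code> and <code>send</code> callbacks, and the batch size are all assumptions made for illustration.</p>

```python
from collections import defaultdict


class BatchingInserter:
    """Buffer rows per destination node and ship them in batches."""

    def __init__(self, node_of, send, batch_size=100):
        self.node_of = node_of        # hypothetical: row -> node id
        self.send = send              # hypothetical: (node id, rows) -> None
        self.batch_size = batch_size
        self.pending = defaultdict(list)

    def insert(self, row):
        node = self.node_of(row)
        buf = self.pending[node]
        buf.append(row)
        if len(buf) >= self.batch_size:
            # Enough rows for this node: one message instead of many.
            self.send(node, buf)
            self.pending[node] = []

    def flush(self):
        # Ship whatever is left, e.g. before commit. Errors such as
        # duplicate keys would only surface here, at actual execution.
        for node, buf in self.pending.items():
            if buf:
                self.send(node, buf)
        self.pending.clear()


messages = []
inserter = BatchingInserter(lambda row: row % 2,
                            lambda node, rows: messages.append((node, list(rows))),
                            batch_size=3)
for i in range(7):
    inserter.insert(i)
inserter.flush()
print(len(messages))
```

<p>Seven rows go out in three messages here rather than seven, which is the whole point when message latency dominates.</p>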
<h3>Transactions</h3>
<p>Each transaction is owned by one node of the cluster, the node to which the client is connected. When more than one node besides the owner of the transaction is updated, two phase commit is used. This is transparent to the application code. No external transaction monitor is required; the Virtuoso instances perform these functions internally. There is a distributed deadlock detection scheme based on the nodes periodically sharing transaction waiting <a href="http://dbpedia.org/resource/Information" id="link-id0xbcc0a50">information</a>.</p>
<p>Since read transactions can operate without locks, reading the last committed state of uncommitted updated rows, waiting for locks is not very common.</p>
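<p>For readers unfamiliar with two phase commit, here is a textbook-style Python sketch of the protocol with the owner node acting as coordinator. It is a generic illustration with hypothetical names, not the Virtuoso implementation, and it leaves out logging and the recovery cycle entirely.</p>

```python
class Participant:
    """A node that was updated within the transaction (illustrative only)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1: promise to commit, or vote no.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Run both phases from the owner node; return True if committed."""
    votes = [p.prepare() for p in participants]
    if all(votes):
        # Phase 2: everyone voted yes, so the decision is commit.
        for p in participants:
            p.commit()
        return True
    # Any no vote aborts the transaction on every participant.
    for p in participants:
        p.rollback()
    return False


nodes = [Participant("node2"), Participant("node3")]
print(two_phase_commit(nodes))
```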
<h3>Interconnect and Threading</h3>
<p>Virtuoso uses TCP to connect between instances. A single instance can have multiple listeners at different network interfaces for cluster activity. The interfaces will be used in a round-robin fashion by the peers, spreading the load over all network interfaces. A separate thread is created for monitoring each interface. Long messages, such as transfers of blobs, are done on a separate thread, thus allowing normal service on the cluster node while the transfer is proceeding.</p>
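<p>The round-robin use of a peer's interfaces amounts to something like the Python sketch below. The class name and the addresses are made up for the example; no actual connection is opened.</p>

```python
import itertools


class PeerLink:
    """Cycle through the listener addresses a peer advertises."""

    def __init__(self, addresses):
        addresses = list(addresses)
        if not addresses:
            raise ValueError("a peer needs at least one listener address")
        self._cycle = itertools.cycle(addresses)

    def next_address(self):
        # Each new message goes to the next interface in turn.
        return next(self._cycle)


# Hypothetical addresses of one peer with two network interfaces.
link = PeerLink(["10.0.0.1:22201", "10.0.1.1:22201"])
picked = [link.next_address() for _ in range(4)]
print(picked)
```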
<p>We will have to test the performance of TCP over <i>Infiniband</i> to see if there is a clear gain in going to a lower-level interface like <i>MPI</i>. The Virtuoso architecture is based on streams connecting cluster nodes point to point. The design does not per se gain from remote DMA or other features provided by MPI. Typically, messages are quite short, under 100K. Flow control for transfer of blobs is nice to have, however, and can be written at the application level if needed. We will get real data on the performance of different interconnects in the next weeks.</p>
<h3>Deployment and Management</h3>
<p>Configuration is quite simple: each process shares a copy of the same configuration file, in which one line differs from host to host, telling the process which host it is. Apart from this, the database configuration files are individual per host, accommodating different file system layouts etc. Setting up a node requires copying the executable and two configuration files, no more. All functionality is contained in a single process. There are no installers to be run or such.</p>
<p>Changing the number or network interface of cluster nodes requires a cluster restart. Changing data partitioning requires copying the data into a new table and renaming this over the old one. This is time-consuming and does not mix well with updates. Splitting an existing cluster node requires no copying with repartitioning, but shifting data between partitions does.</p>
<p>A consolidated status report shows the general state and level of intra-cluster traffic as count of messages and count of bytes.</p>
<p>Start, shutdown, backup, and package installation commands can only be issued from a single master node. Otherwise all is symmetrical.</p>
<h3>Present State and Next Developments</h3>
<p>The basics are now in place. Some code remains to be written for such things as distributed deadlock detection, 2-phase commit recovery cycle, management functions, etc. Some SQL operations like text index, statistics sampling, and index intersection need special support, yet to be written.</p>
<p>The <a href="http://dbpedia.org/resource/Resource_Description_Framework" id="link-id0xb919200">RDF</a> capabilities are not specifically affected by clustering except in a couple of places. Loading will be slightly revised to use larger batches of rows to minimize latency, for example.</p>
<p>There is a pretty much infinite world of SQL optimizations for splitting aggregates, taking advantage of co-located joins etc. These will be added gradually. These are however not really central to the first application of RDF storage but are quite important for business intelligence, for example.</p>
<p>We will run some benchmarks for comparing single host and clustered Virtuoso instances over the next weeks. Some of this will be with real data, giving an estimate on when we can move some of the RDF data we presently host to the new platform. We will benchmark against <a href="http://dbpedia.org/resource/Oracle_Database" id="link-id0x1ddf8288">Oracle</a> and <a href="http://dbpedia.org/resource/IBM_DB2" id="link-id0xa04b6ae8">DB2</a> later but first we get things to work and compare against ourselves.</p>
<p>We roughly expect a halving in space consumption and a significant increase in single query performance and linearly scaling parallel throughput through addition of cluster nodes.</p>
<p>
<i>The <a href="http://www.openlinksw.com/dataspace/oerling/weblog/Orri%20Erling%27s%20Blog/1246" id="link-id106de430">next update</a> will be on this blog within two weeks.</i>
</p>]]></content:encoded>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/weblog/oerling/?date=2007-05-23#1198">
  <rss:title>Virtuoso Cluster</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2007-05-23T14:11:38Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">We often get questions on clustering support, especially around RDF, where databases quickly get rather large. So we will answer them here. But first on some support technology. We have an entire new disk allocation and IO system. It is basically operational but needs some further tuning. It offers much better locality and much better sequential access speeds. Especially for dealing with large RDF databases, we will introduce data compression. We have over the years looked at different key compression possibilities but have never been very excited by them since they complicate random access to index pages and make for longer execution paths, require scraping data for one logical thing from many places, and so on. Anyway, now we will compress pages before writing them to disk, so the cache is in machine byte order and alignment and disk is compressed. Since multiple processors are commonplace on servers, they can well be used for compression, that being such a nicely local operation, all in cache and requiring no serialization with other things. Of course, what was fixed length now becomes variable length, but if the compression ratio is fairly constant, we reserve space for the expected compressed size, and deal with the rare overflows separately. So no complicated shifting data around when something grows. Once we are done with this, this could well be a separate intermediate release. Now about clusters. We have for a long time had various plans for clusters but have not seen the immediate need for execution. With the rapid growth in the Linking Open Data movement and questions on web scale knowledge systems, it is time to get going. How will it work? Virtuoso remains a generic DBMS, thus the clustering support is an across the board feature, not something for RDF only. So we can join Oracle, IBM DB2, and others at the multi-terabyte TPC races. 
We introduce hash partitioning at the index level and allow for redundancy, where multiple nodes can serve the same partition, allowing for load balancing read and replacement of failing nodes and growth of cluster without interruption of service. The SQL compiler, SPARQL, and database engine all stay the same. There is a little change in the SQL run time, not so different from what we do with remote databases at present in the context of our virtual database federation. There is a little extra complexity for distributed deadlock detection and sometimes multiple threads per transaction. We remember that one RPC round trip is 3-4 index lookups, so we pipeline things so as to move requests in batches, a few dozen at a time. The cluster support will be in the same executable and will be enabled by configuration file settings. Administration is limited to one node, but Web and SQL clients can connect to any node and see the same data. There is no balancing between storage and control nodes because clients can simply be allocated round robin for statistically even usage. In relational applications, as exemplified by TPC-C, if one partitions by fields with an application meaning (such as warehouse ID), and if clients have an affinity to a particular chunk of data, they will of course preferentially connect to nodes hosting this data. With RDF, such affinity is unlikely, so nodes are basically interchangeable. In practice, we develop in June and July. Then we can rent a supercomputer maybe from Amazon EC2 and experiment away. We should just come up with a name for this. Maybe something astronomical, like star cluster. Big, bright but in this case not far away.</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>We often get questions on clustering support, especially around <a href="http://dbpedia.org/resource/Resource_Description_Framework" id="link-id0x1dfb7228">RDF</a>, where databases quickly get rather large. So we will answer them here.</p>
<p>But first on some support technology. We have an entire new disk allocation and IO system. It is basically operational but needs some further tuning. It offers much better locality and much better sequential access speeds.</p>
<p>Especially for dealing with large RDF databases, we will introduce <a href="http://dbpedia.org/resource/Data" id="link-id0x1e5ba388">data</a> compression. We have over the years looked at different key compression possibilities but have never been very excited by them since they complicate random access to index pages, make for longer execution paths, require scraping data for one logical thing from many places, and so on. Anyway, now we will compress pages before writing them to disk, so the cache is in machine byte order and alignment and disk is compressed. Since multiple processors are commonplace on servers, they can well be used for compression, that being such a nicely local operation, all in cache and requiring no serialization with other things.</p>
<p>Of course, what was fixed length now becomes variable length, but if the compression ratio is fairly constant, we reserve space for the expected compressed size, and deal with the rare overflows separately. So no complicated shifting data around when something grows.</p>
<p>Once we are done with this, this could well be a separate intermediate release.</p>
<p>Now about clusters. We have for a long time had various plans for clusters but have not seen the immediate need for execution. With the rapid growth in the Linking Open Data movement and questions on web scale <a href="http://dbpedia.org/resource/Knowledge" id="link-id0x1df3a540">knowledge</a> systems, it is time to get going.</p>
<p>How will it work? <a href="http://virtuoso.openlinksw.com" id="link-id0x1e0b6e28">Virtuoso</a> remains a generic DBMS, thus the clustering support is an across the board feature, not something for RDF only. So we can join <a href="http://dbpedia.org/resource/Oracle_Database" id="link-id0xab13a90">Oracle</a>, IBM <a href="http://dbpedia.org/resource/IBM_DB2" id="link-id0x1c2c2258">DB2</a>, and others at the multi-terabyte TPC races.</p>
<p>We introduce hash partitioning at the index level and allow for redundancy, where multiple nodes can serve the same partition. This allows for load balancing of reads, replacement of failing nodes, and growth of the cluster without interruption of service.</p>
<p>The <a href="http://dbpedia.org/resource/SQL" id="link-id0x1c3ef770">SQL</a> compiler, <a href="http://dbpedia.org/resource/SPARQL" id="link-id0x1e0beba0">SPARQL</a>, and database engine all stay the same. There is a little change in the SQL run time, not so different from what we do with remote databases at present in the context of our <a href="http://dbpedia.org/resource/Virtual_Database" id="link-id0x1bc05f00">virtual database</a> federation. There is a little extra complexity for distributed deadlock detection and sometimes multiple threads per transaction. We remember that one RPC round trip costs as much as 3-4 index lookups, so we pipeline things so as to move requests in batches, a few dozen at a time.</p>
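<p>A back-of-the-envelope model shows why batching pays off. The sketch below assumes, purely for illustration, that a batch pays for one round trip (3.5 lookup units, the middle of the 3-4 range above) and that the lookups in it then run one after the other at one unit each. The function name is hypothetical.</p>

```python
def cost_per_lookup(batch_size, round_trip_cost=3.5):
    """Average cost of one remote lookup, in units of one index lookup.

    Assumes one round trip per batch plus one unit per lookup in the batch.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return (round_trip_cost + batch_size) / batch_size


for n in (1, 12, 36):
    print(n, round(cost_per_lookup(n), 2))
```

<p>Under these assumptions a lookup sent alone costs several times a local one, while in a batch of a few dozen the round trip nearly disappears from the average.</p>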
<p>The cluster support will be in the same executable and will be enabled by configuration file settings. Administration is limited to one node, but Web and SQL clients can connect to any node and see the same data. There is no balancing between storage and control nodes because clients can simply be allocated round robin for statistically even usage. In relational applications, as exemplified by <a href="http://dbpedia.org/resource/TPC-C" id="link-id0x1dfbe2b0">TPC-C</a>, if one partitions by fields with an application meaning (such as warehouse ID), and if clients have an affinity to a particular chunk of data, they will of course preferentially connect to nodes hosting this data. With RDF, such affinity is unlikely, so nodes are basically interchangeable.</p>
<p>In practice, we develop in June and July. Then we can rent a supercomputer maybe from Amazon EC2 and experiment away.</p>
<p>We should just come up with a name for this. Maybe something astronomical, like star cluster. Big, bright but in this case not far away.</p>
]]></content:encoded>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/vdb/blog/?date=2007-05-23#1201">
  <rss:title>Virtuoso Cluster</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2007-05-23T14:09:37Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">Virtuoso Cluster We often get questions on clustering support, especially around RDF, where databases quickly get rather large. So we will answer them here. But first on some support technology. We have an entire new disk allocation and IO system. It is basically operational but needs some further tuning. It offers much better locality and much better sequential access speeds. Specially for dealing with large RDF databases, we will introduce data compression. We have over the years looked at different key compression possibilities but have never been very excited by them since thy complicate random access to index pages and make for longer execution paths, require scraping data for one logical thing from many places, and so on. Anyway, now we will compress pages before writing them to disk, so the cache is in machine byte order and alignment and disk is compressed. Since multiple processors are commonplace on servers, they can well be used for compression, that being such a nicely local operation, all in cache and requiring no serialization with other things. Of course, what was fixed length now becomes variable length, but if the compression ratio is fairly constant, we reserve space for the expected compressed size, and deal with the rare overflows separately. So no complicated shifting data around when something grows. Once we are done with this, this could well be a separate intermediate release. Now about clusters. We have for a long time had various plans for clusters but have not seen the immediate need for execution. With the rapid growth in the Linking Open Data movement and questions on web scale knowledge systems, it is time to get going. How will it work? Virtuoso remains a generic DBMS, thus the clustering support is an across the board feature, not something for RDF only. So we can join Oracle, IBM DB2, and others at the multi-terabyte TPC races. 
We introduce hash partitioning at the index level and allow for redundancy, where multiple nodes can serve the same partition, which allows read load balancing, replacement of failing nodes, and growth of the cluster without interruption of service. The SQL compiler, SPARQL, and database engine all stay the same. There is a little change in the SQL run time, not so different from what we do with remote databases at present in the context of our virtual database federation. There is a little extra complexity for distributed deadlock detection and sometimes multiple threads per transaction. We remember that one RPC round trip is 3-4 index lookups, so we pipeline things so as to move requests in batches, a few dozen at a time. The cluster support will be in the same executable and will be enabled by configuration file settings. Administration is limited to one node, but Web and SQL clients can connect to any node and see the same data. There is no balancing between storage and control nodes because clients can simply be allocated round robin for statistically even usage. In relational applications, as exemplified by TPC-C, if one partitions by fields with an application meaning (such as warehouse ID), and if clients have an affinity to a particular chunk of data, they will of course preferentially connect to nodes hosting this data. With RDF, such affinity is unlikely, so nodes are basically interchangeable. In practice, we develop in June and July. Then we can rent a supercomputer maybe from Amazon EC2 and experiment away. We should just come up with a name for this. Maybe something astronomical, like star cluster. Big, bright but in this case not far away.</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<div>
<div style="display:none;">Virtuoso Cluster</div>
<p>We often get questions on clustering support, especially around <a href="http://dbpedia.org/resource/Resource_Description_Framework" id="link-id0x1e53d008">RDF</a>, where databases quickly get rather large. So we will answer them here.</p>
<p>But first on some support technology. We have an entire new disk allocation and IO system. It is basically operational but needs some further tuning. It offers much better locality and much better sequential access speeds.</p>
<p>Especially for dealing with large RDF databases, we will introduce <a href="http://dbpedia.org/resource/Data" id="link-id0x1e042690">data</a> compression. We have over the years looked at different key compression possibilities but have never been very excited by them since they complicate random access to index pages and make for longer execution paths, require scraping data for one logical thing from many places, and so on. Anyway, now we will compress pages before writing them to disk, so the cache is in machine byte order and alignment and disk is compressed. Since multiple processors are commonplace on servers, they can well be used for compression, that being such a nicely local operation, all in cache and requiring no serialization with other things.</p>
<p>Of course, what was fixed length now becomes variable length, but if the compression ratio is fairly constant, we reserve space for the expected compressed size, and deal with the rare overflows separately. So no complicated shifting data around when something grows.</p>
<p>Once we are done with this, this could well be a separate intermediate release.</p>
<p>Now about clusters. We have for a long time had various plans for clusters but have not seen the immediate need for execution. With the rapid growth in the Linking Open Data movement and questions on web scale <a href="http://dbpedia.org/resource/Knowledge" id="link-id0x1e7714f0">knowledge</a> systems, it is time to get going.</p>
<p>How will it work? <a href="http://virtuoso.openlinksw.com" id="link-id0x1e3caea8">Virtuoso</a> remains a generic DBMS, thus the clustering support is an across the board feature, not something for RDF only. So we can join <a href="http://dbpedia.org/resource/Oracle_Database" id="link-id0x1ac67648">Oracle</a>, IBM <a href="http://dbpedia.org/resource/IBM_DB2" id="link-id0x1c2267d0">DB2</a>, and others at the multi-terabyte TPC races.</p>
<p>We introduce hash partitioning at the index level and allow for redundancy, where multiple nodes can serve the same partition, which allows read load balancing, replacement of failing nodes, and growth of the cluster without interruption of service.</p>
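<p>To make the partitioning idea concrete, here is a minimal Python sketch. It is not Virtuoso code: the names <code>partition_of</code> and <code>PartitionMap</code>, the use of SHA-1 as the hash, and the node names are all assumptions made for illustration. It only shows how a key can map to a partition, how several nodes can serve one partition for reads, and how a failing node could be swapped out.</p>

```python
import hashlib


def partition_of(key, n_partitions):
    """Map a key to a partition number with a stable hash (SHA-1 is an assumption)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_partitions


class PartitionMap:
    """Hypothetical routing table: partition number to the nodes that hold a copy."""

    def __init__(self, replicas):
        # replicas[i] lists the nodes serving partition i
        self.replicas = [list(nodes) for nodes in replicas]
        self.next_read = [0] * len(self.replicas)

    def nodes_for_write(self, key):
        # A write has to reach every copy of the partition.
        return list(self.replicas[partition_of(key, len(self.replicas))])

    def node_for_read(self, key):
        # Reads rotate over the copies, which spreads the read load.
        p = partition_of(key, len(self.replicas))
        nodes = self.replicas[p]
        node = nodes[self.next_read[p] % len(nodes)]
        self.next_read[p] += 1
        return node

    def replace_node(self, failed, spare):
        # Swap a failing node for a spare in every partition it served.
        for nodes in self.replicas:
            for i, name in enumerate(nodes):
                if name == failed:
                    nodes[i] = spare
```

<p>With two partitions held by two nodes each, consecutive reads of the same key alternate between the two copies, while a write is sent to both.</p>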
<p>The <a href="http://dbpedia.org/resource/SQL" id="link-id0x1daea638">SQL</a> compiler, <a href="http://dbpedia.org/resource/SPARQL" id="link-id0x1ddb8c50">SPARQL</a>, and database engine all stay the same. There is a little change in the SQL run time, not so different from what we do with remote databases at present in the context of our <a href="http://dbpedia.org/resource/Virtual_Database" id="link-id0x1e13a880">virtual database</a> federation. There is a little extra complexity for distributed deadlock detection and sometimes multiple threads per transaction. We remember that one RPC round trip is 3-4 index lookups, so we pipeline things so as to move requests in batches, a few dozen at a time.</p>
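<p>The batching point can be sketched in a few lines of Python. Again this is an illustration under assumptions, not the actual run time: <code>batch_by_node</code>, its <code>node_of</code> callback, and the default batch size of 32 are made up for the example. Lookups are grouped by destination node and shipped a batch at a time, so the cost of one round trip is shared by many index lookups.</p>

```python
from collections import defaultdict


def batch_by_node(keys, node_of, batch_size=32):
    """Group lookup keys by destination node and cut a batch when it is full.

    Returns a list of (node, keys) pairs; each pair stands for one round trip.
    """
    pending = defaultdict(list)
    batches = []
    for key in keys:
        node = node_of(key)
        pending[node].append(key)
        if len(pending[node]) == batch_size:
            # Batch is full: ship it and start a fresh one for this node.
            batches.append((node, pending.pop(node)))
    # Ship whatever is left over, one partial batch per node.
    for node, rest in pending.items():
        batches.append((node, rest))
    return batches
```

<p>For example, ten lookups spread over two nodes with a batch size of three go out as four messages instead of ten.</p>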
<p>The cluster support will be in the same executable and will be enabled by configuration file settings. Administration is limited to one node, but Web and SQL clients can connect to any node and see the same data. There is no balancing between storage and control nodes because clients can simply be allocated round robin for statistically even usage. In relational applications, as exemplified by <a href="http://dbpedia.org/resource/TPC-C" id="link-id0x1c236bb0">TPC-C</a>, if one partitions by fields with an application meaning (such as warehouse ID), and if clients have an affinity to a particular chunk of data, they will of course preferentially connect to nodes hosting this data. With RDF, such affinity is unlikely, so nodes are basically interchangeable.</p>
<p>In practice, we develop in June and July. Then we can rent a supercomputer maybe from Amazon EC2 and experiment away.</p>
<p>We should just come up with a name for this. Maybe something astronomical, like star cluster. Big, bright but in this case not far away.</p>
</div>]]></content:encoded>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2006-10-20#1065">
  <rss:title>Birds of a Feather Flock Together - Mac OS X &amp; Rails</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2006-10-20T23:55:40Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">A very cool video promo for Ruby on Rails and Mac OS X, or should I say: 37 Signals &amp; Apple :-) Either way, very cool! BTW - We have just released a collection of High-Performance Data Providers for ActiveRecord. Our providers deliver Consistent Functionality to RoR developers across Virtuoso, Oracle, SQL Server, Sybase, DB2, Ingres, Informix, and others without compromising performance or cross platform portability.</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>A very cool <a href="http://www.apple.com/education/whymac/compsci/video.html">video promo for Ruby on Rails and Mac OS X</a>, or should I say: 37 Signals &amp; Apple :-) Either way, very cool!</p>

<p>BTW - We have just released a collection of <a href="http://rubyforge.org/projects/odbc-rails/">High-Performance Data Providers for ActiveRecord</a>. Our providers deliver </p>
<blockquote>Consistent Functionality</blockquote> to RoR developers across <a href="http://virtuoso.openlinksw.com/wiki/main/">Virtuoso</a>, Oracle, SQL Server, Sybase, DB2, Ingres, Informix, and others without compromising performance or cross platform portability.]]></content:encoded>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2006-05-05#968">
  <rss:title>&quot;Free&quot; Databases: Express vs. Open-Source RDBMSs</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2006-05-05T16:02:17Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">Very detailed and insightful peek into the state of affairs re. database engines (Open &amp; Closed Source). I added the missing piece regarding the &quot;Virtuoso Conductor&quot; (the Web based Admin UI for Virtuoso) to the original post below. I also added a link to our live SPARQL Demo so that anyone interested can start playing around with SPARQL and SPARQL integrated into SQL right away. Another good thing about this post is the vast amount of valuable links that it contains. To really appreciate this point simply visit my Linkblog (excuse the current layout :-) - a Tab if you come in via the front door of this Data Space (what I used to call My Weblog Home Page). &quot;Free&quot; Databases: Express vs. Open-Source RDBMSs: &quot;Open-source relational database management systems (RDBMSs) are gaining IT mindshare at a rapid pace. As an example, BusinessWeek&#39;s February 6, 2006 &#39; Taking On the Database Giants &#39; article asks &#39;Can open-source upstarts compete with Oracle, IBM, and Microsoft?&#39; and then provides the answer: &#39;It&#39;s an uphill battle, but customers are starting to look at the alternatives.&#39; There&#39;s no shortage of open-source alternatives to look at. The BusinessWeek article concentrates on MySQL, which BW says &#39;is trying to be the Ikea of the database world: cheap, needs some assembly, but has a sleek, modern design and does the job.&#39; The article also discusses Postgre[SQL] and Ingres, as well as EnterpriseDB, an Oracle clone created from PostgreSQL code*. Sun includes PostgreSQL with Solaris 10 and, as of April 6, 2006, with Solaris Express.** *Frank Batten, Jr., the investor who originally funded Red Hat, invested a reported $16 million into Great Bridge with the hope of making a business out of providing paid support to PostgreSQL users. 
Great Bridge stayed in business only 18 months, having missed an opportunity to sell the business to Red Hat and finding that selling $50,000-per-year support packages for an open-source database wasn&#39;t easy. As Batten concluded, &#39;We could not get customers to pay us big dollars for support contracts.&#39; Perhaps EnterpriseDB will be more successful with a choice of $5,000, $3,000, or $1,000 annual support subscriptions. **Interestingly, Oracle announced in November 2005 that Solaris 10 is &#39;its preferred development and deployment platform for most x64 architectures, including x64 (x86, 64-bit) AMD Opteron and Intel Xeon processor-based systems and Sun&#39;s UltraSPARC(R)-based systems.&#39; There is a surfeit of reviews of current MySQL, PostgreSQL and, to a lesser extent, Ingres implementations. These three open-source RDBMSs come with their own or third-party management tools. These systems compete against free versions of commercial (proprietary) databases: SQL Server 2005 Express Edition (and its MSDE 2000 and 1.0 predecessors), Oracle Database 10g Express Edition, IBM DB2 Express-C, and Sybase ASE Express Edition for Linux where database size and processor count limitations aren&#39;t important. Click here for a summary of recent InfoWorld reviews of the full versions of these four databases plus MySQL, which should be valid for Express editions also. The FTPOnline Special Report article, &#39;Microsoft SQL Server Turns 17,&#39; that contains the preceding table is here (requires registration.) SQL Server 2005 Express Edition SP-1 Advanced Features SQL Server 2005 Express Edition with Advanced Features enhances SQL Server 2005 Express Edition (SQL Express or SSX) dramatically, so it deserves special treatment here. SQL Express gains full text indexing and now supports SQL Server Reporting Services (SSRS) on the local SSX instance. 
The SP-1 with Advanced Features setup package, which Microsoft released on April 18, 2006, installs the release version of SQL Server Management Studio Express (SSMSE) and the full version of Business Intelligence Development Studio (BIDS) for designing and editing SSRS reports. My &#39;Install SP-1 for SQL Server 2005 and Express&#39; article for FTPOnline&#39;s SQL Server Special Report provides detailed, illustrated installation instructions for and related information about the release version of SP-1. SP-1 makes SSX the most capable of all currently available Express editions of commercial RDBMSs for Windows. OpenLink Software&#39;s Virtuoso Open-Source Edition OpenLink Software announced an open-source version of its Virtuoso Universal Server commercial DBMS on April 11, 2006. On the initial date of this post, May 2, 2006, Virtuoso Open-Source Edition (VOS) was virtually under the radar as an open-source product. According to this press release, the new edition includes: SPARQL compliant RDF Triple Store SQL-200n Object-Relational Database Engine (SQL, XML, and Free Text) Integrated BPEL Server and Enterprise Service Bus WebDAV and Native File Server Web Application Server that supports PHP, Perl, Python, ASP.NET, JSP, etc. Runtime Hosting for Microsoft .NET, Mono, and Java VOS only lacks the virtual server and replication features that are offered by the commercial edition. VOS includes a Web-based administration tool called the &quot;Virtuoso Conductor&quot; According to Kingsley Idehen&#39;s Weblog, &#39;The Virtuoso build scripts have been successfully tested on Mac OS X (Universal Binary Target), Linux, FreeBSD, and Solaris (AIX, HP-UX, and True64 UNIX will follow soon). A Windows Visual Studio project file is also in the works (ETA some time this week).&#39; InfoWorld&#39;s Jon Udell has tracked Virtuoso&#39;s progress since 2002, with an additional article in 2003 and a one-hour podcast with Kingsley Idehen on April 26, 2006. 
A major talking point for Virtuoso is its support for Atom 0.3 syndication and publication, Atom 1.0 syndication and (forthcoming) publication, and future support for Google&#39;s GData protocol, as mentioned in this Idehen post. Yahoo!&#39;s Jeremy Zawodny points out that the &#39;fingerprints&#39; of Adam Bosworth, Google&#39;s VP of Engineering and the primary force behind the development of Microsoft Access, &#39;are all over GData.&#39; Click here to display a list of all OakLeaf posts that mention Adam Bosworth. One application for the GData protocol is querying and updating the Google Base database independently of the Google Web client, as mentioned by Jeremy: &#39;It&#39;s not about building an easier onramp to Google Base. ... Well, it is. But, again, that&#39;s the small stuff.&#39; Click here for a list of posts about my experiences with Google Base. Watch for a future OakLeaf post on the subject as the GData APIs gain ground. Open-Source and Free Embedded Database Contenders Open-source and free embedded SQL databases are gaining importance as the number and types of mobile devices and OSs proliferate. Embedded databases usually consist of Java classes or Windows DLLs that are designed to minimize file size and memory consumption. Embedded databases avoid the installation hassles, heavy resource usage and maintenance cost associated with client/server RDBMSs that run as an operating system service. Andrew Hudson&#39;s December 2005 &#39;Open Source databases rounded up and rodeoed&#39; review for The Enquirer provides brief descriptions of one commercial and eight open source database purveyors/products: Sleepycat, MySQL, PostgreSQL, Ingres, InnoBase, Firebird, IBM Cloudscape (a.k.a, Derby), Genezzo, and Oracle. Oracle Sleepycat* isn&#39;t an SQL Database, Oracle InnoDB* is an OEM database engine that&#39;s used by MySQL, and Genezzo is a multi-user, multi-server distributed database engine written in Perl. 
These special-purpose databases are beyond the scope of this post. * Oracle purchased Sleepycat Software, Inc. in February 2006 and purchased Innobase OY in October 2005. The press release states: &#39;Oracle intends to continue developing the InnoDB technology and expand our commitment to open source software.&#39; Derby is an open-source release by the Apache Software Foundation of the Cloudscape Java-based database that IBM acquired when it bought Informix in 2001. IBM offers a commercial release of Derby as IBM Cloudscape 10.1. Derby is a Java class library that has a relatively light footprint (2 MB), which makes it suitable for client/server synchronization with the IBM DB2 Everyplace Sync Server in mobile applications. The IBM DB2 Everyplace Express Edition isn&#39;t open source or free*, so it doesn&#39;t qualify for this post. The same is true for the corresponding Sybase SQL Anywhere components.** * IBM DB2 Everyplace Express Edition with synchronization costs $379 per server (up to two processors) and $79 per user. DB2 Everyplace Database Edition (without DB2 synchronization) is $49 per user. (Prices are based on those when IBM announced version 8 in November 2003.) ** Sybase&#39;s iAnywhere subsidiary calls SQL Anywhere &#39;the industry&#39;s leading mobile database.&#39; A Sybase SQL Anywhere Personal DB seat license with synchronization to SQL Anywhere Server is $119; the cost without synchronization wasn&#39;t available from the Sybase Web site. Sybase SQL Anywhere and IBM DB2 Everyplace perform similar replication functions. Sun&#39;s Java DB, another commercial version of Derby, comes with the Solaris Enterprise Edition, which bundles Solaris 10, the Java Enterprise System, developer tools, desktop infrastructure and N1 management software. A recent Between the Lines blog entry by ZDNet&#39;s David Berlind waxes enthusiastic over the use of Java DB embedded in a browser to provide offline persistence. 
RedMonk analyst James Governor and eWeek&#39;s Lisa Vaas wrote about the use of Java DB as a local data store when Tim Bray announced Sun&#39;s Derby derivative and Francois Orsini demonstrated Java DB embedded in the Firefox browser at the ApacheCon 2005 conference. Firebird is derived from Borland&#39;s InterBase 6.0 code, the first commercial relational database management system (RDBMS) to be released as open source. Firebird has excellent support for SQL-92 and comes in three versions: Classic, SuperServer and Embedded for Windows, Linux, Solaris, HP-UX, FreeBSD and MacOS X. The embedded version has a 1.4-MB footprint. Release Candidate 1 for Firebird 2.0 became available on March 30, 2006 and is a major improvement over earlier versions. Borland continues to promote InterBase, now at version 7.5, as a small-footprint, embedded database with commercial Server and Client licenses. SQLite is a featherweight C library for an embedded database that implements most SQL-92 entry- and transitional-level requirements (some through the JDBC driver) and supports transactions within a tiny 250-KB code footprint. Wrappers support a multitude of languages and operating systems, including Windows CE, SmartPhone, Windows Mobile, and Win32. SQLite&#39;s primary SQL-92 limitations are lack of nested transactions, inability to alter a table design once committed (other than with RENAME TABLE and ADD COLUMN operations), and foreign-key constraints. SQLite provides read-only views, triggers, and 256-bit encryption of database files. A downside is that the entire database file is locked while a transaction is in progress. SQLite uses file access permissions in lieu of GRANT and REVOKE commands. Using SQLite involves no license; its code is entirely in the public domain. The Mozilla Foundation&#39;s Unified Storage wiki says this about SQLite: &#39;SQLite will be the back end for the unified store [for Firefox]. 
Because it implements a SQL engine, we get querying &#39;for free&#39;, without having to invent our own query language or query execution system. Its code-size footprint is moderate (250k), but it will hopefully simplify much existing code so that the net code-size change should be smaller. It has exceptional performance, and supports concurrent access to the database. Finally, it is released into the public domain, meaning that we will have no licensing issues.&#39; Vieka Technology, Inc.&#39;s eSQL 2.11 is a port of SQLite to Windows Mobile (Pocket PC and Smartphone) and Win32, and includes development tools for Windows devices and PCs, as well as a .NET native data provider. A conventional ODBC driver also is available. eSQL for Windows (Win32) is free for personal and commercial use; eSQL for Windows Mobile requires a license for commercial (for-profit or business) use. HSQLDB isn&#39;t on most reviewers&#39; radar, which is surprising because it&#39;s the default database for OpenOffice.org (OOo) 2.0&#39;s Base suite member. HSQLDB 1.8.0.1 is an open-source (BSD license) Java embedded database engine based on Thomas Mueller&#39;s original Hypersonic SQL Project. Using OOo&#39;s Base feature requires installing the Java 2.0 Runtime Engine (which is not open-source) or the presence of an alternative open-source engine, such as Kaffe. My prior posts about OOo Base and HSQLDB are here, here and here. The HSQLDB 1.8.0 documentation on SourceForge states the following regarding SQL-92 and later conformance: HSQLDB 1.8.0 supports the dialect of SQL defined by SQL standards 92, 99 and 2003. This means where a feature of the standard is supported, e.g. left outer join, the syntax is that specified by the standard text. Many features of SQL92 and 99 up to Advanced Level are supported and there is support for most of SQL 2003 Foundation and several optional features of this standard. 
However, certain features of the Standards are not supported so no claim is made for full support of any level of the standards. Other less well-known embedded databases designed for or suited to mobile deployment are Mimer SQL Mobile and VistaDB 2.1. Neither product is open-source, and both require paid licensing; VistaDB requires a small up-front payment by developers but offers royalty-free distribution. Java DB, Firebird embedded, SQLite and eSQL 2.11 are contenders for lightweight PC and mobile device database projects that aren&#39;t Windows-only. SQL Server 2005 Everywhere If you&#39;re a Windows developer, SQL Server Mobile is the logical embedded database choice for mobile applications for Pocket PCs and Smartphones. Microsoft&#39;s April 19, 2006 press release delivered the news that SQL Server 2005 Mobile Edition (SQL Mobile or SSM) would gain a big brother: SQL Server 2005 Everywhere Edition. Currently, the SSM client is licensed (at no charge) to run in production on devices with Windows CE 5.0, Windows Mobile 2003 for Pocket PC or Windows Mobile 5.0, or on PCs with Windows XP Tablet Edition only. SSM also is licensed for development purposes on PCs running Visual Studio 2005. Smart Device replication with SQL Server 2000 SP3 and later databases has been the most common application so far for SSM. By the end of 2006, Microsoft will license SSE for use on all PCs running any Win32 version or the preceding device OSs. A version of SQL Server Management Studio Express (SSMSE), updated to support SSE, is expected to release by the end of the year. These features will qualify SSE as the universal embedded database for Windows client and smart-device applications. For more details on SSE, read John Galloway&#39;s April 11, 2006 blog post and my &#39;SQL Server 2005 Mobile Goes Everywhere&#39; article for the FTPOnline Special Report on SQL Server.&quot; (Via OakLeaf Systems.)</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[
 <p>Very detailed and insightful peek into the state of affairs re. database engines (Open &amp; Closed Source).</p>   <p>I added the missing piece regarding the &quot;Virtuoso Conductor&quot; (the Web based Admin UI for Virtuoso) to the original post below. I also added a link to our live SPARQL Demo so that anyone interested can start playing around with SPARQL and SPARQL integrated into SQL right away.</p>  <p>Another good thing about this post is the vast amount of valuable links that it contains. To really appreciate this point simply visit my Linkblog (excuse the current layout :-) - a Tab if you come in via the front door of this <a href="http://www.infoworld.com/opinions/index.html">Data Space</a> (what I used to call <a href="http://www.openlinksw.com/blog/%7Ekidehen/">My Weblog Home Page</a>).</p>   <blockquote>  <p>   <a href="http://oakleafblog.blogspot.com/2006/05/free-databases-express-vs-open-source.html">&quot;Free&quot; Databases: Express vs. Open-Source RDBMSs</a>: &quot;<span style="font-family: verdana;">Open-source relational database management systems (RDBMSs) are gaining IT mindshare at a rapid pace. As an example, <em>BusinessWeek</em>&#39;s February 6, 2006 &#39;</span>   <a href="http://www.businessweek.com/technology/content/feb2006/tc20060206_918648.htm"><span style="font-family: verdana;">Taking On the Database Giants</span>   </a><span style="font-family: verdana;">&#39; article asks &#39;Can open-source upstarts compete with Oracle, IBM, and Microsoft?&#39; and then provides the answer: &#39;It&#39;s an uphill battle, but customers are starting to look at the alternatives.&#39;</span>   <br />   <span style="font-family: Verdana;"></span>   <br />   <span style="font-family: Verdana;">There&#39;s no shortage of open-source alternatives to look at. 
The <em>BusinessWeek</em> article concentrates on <a href="http://www.mysql.com/">MySQL</a>, which <em>BW</em> says &#39;is trying to be the Ikea of the database world: cheap, needs some assembly, but has a sleek, modern design and does the job.&#39; The article also discusses <a href="http://www.postgresql.org/">Postgre[SQL]</a> and <a href="http://www.ingres.com/products/Prod_Ingres_2006.html">Ingres</a>, as well as <a href="http://www.enterprisedb.com/">EnterpriseDB</a>, an Oracle clone created from PostgreSQL code*. Sun includes <a href="http://www.sun.com/software/solaris/postgres.jsp">PostgreSQL with Solaris 10</a> and, as of April 6, 2006, with <a href="http://docs.sun.com/app/docs/doc/819-2183/6n4g726uc?a=view">Solaris Express</a>.**</span>   <br />   <span style="font-family: Verdana;"></span>   <br />   <span style="font-family: Verdana;"><span style="font-size: 85%;">*Frank Batten, Jr., the investor who originally funded Red Hat, invested a reported </span>    <a href="http://www.theinquirer.net/?article=28201"><span style="font-size: 85%;">$16 million into Great Bridge</span>    </a><span style="font-size: 85%;"> with the hope of making a business out of providing paid support to PostgreSQL users. </span>    <a href="http://news.com.com/2100-1001-272715.html"><span style="font-size: 85%;">Great Bridge stayed in business only 18 months</span>    </a><span style="font-size: 85%;">, having </span>    <a href="http://news.com.com/2100-1001-268915.html"><span style="font-size: 85%;">missed an opportunity to sell the business to Red Hat</span>    </a><span style="font-size: 85%;"> and finding that selling </span>    <a href="http://news.com.com/2100-1001-269729.html"><span style="font-size: 85%;">$50,000-per-year support packages</span>    </a><span style="font-size: 85%;"> for an open-source database wasn&#39;t easy. 
As Batten concluded, &#39;We could not get customers to pay us big dollars for support contracts.&#39; Perhaps EnterpriseDB will be more successful with a choice of </span>    <a href="http://www.enterprisedb.com/shop.do?cID=10000&pID=10001"><span style="font-size: 85%;">$5,000, $3,000, or $1,000 annual support subscriptions</span>    </a><span style="font-size: 85%;">.</span>   </span>   <br />   <span style="font-family: Verdana;"></span><span style="font-family: Verdana;"></span><span style="font-family: Verdana;"></span>   <br />   <span style="font-family: Verdana; font-size: 85%;">**Interestingly, <a href="http://www.sun.com/smi/Press/sunflash/2005-11/sunflash.20051115.4.xml">Oracle announced in November 2005</a> that Solaris 10 is &#39;its preferred development and deployment platform for most x64 architectures, including x64 (x86, 64-bit) AMD Opteron and Intel Xeon processor-based systems and Sun&#39;s UltraSPARC(R)-based systems.&#39;</span>   <br />   <br />   <span style="font-family: Verdana;">There is a surfeit of reviews of current MySQL, PostgreSQL and, to a lesser extent, Ingres implementations. These three open-source RDBMSs come with their own or third-party management tools. These systems compete against free versions of commercial (proprietary) databases: <a href="http://msdn.microsoft.com/vstudio/express/sql/">SQL Server 2005 Express Edition</a> (and its MSDE 2000 and 1.0 predecessors), <a href="http://www.oracle.com/technology/products/database/xe/index.html" target="_blank">Oracle Database 10g Express Edition</a>, <a href="http://www-306.ibm.com/software/data/db2/udb/db2express/download.html" target="_blank">IBM DB2 Express-C</a>, and <a href="http://www.sybase.com/linux_promo" target="_blank">Sybase ASE Express Edition for Linux</a> where database size and processor count limitations aren&#39;t important. 
Click <a href="http://www.ftponline.com/special/sqlserver/rjennings-overview/table4.aspx">here</a> for a summary of recent <em>InfoWorld</em> reviews of the full versions of these four databases plus MySQL, which should be valid for Express editions also. The <a href="http://www.ftponline.com/special/sqlserver/">FTPOnline Special Report</a> article, &#39;Microsoft SQL Server Turns 17,&#39; that contains the preceding table is <a href="http://www.ftponline.com/special/sqlserver/rjennings-overview/">here</a> (requires registration.)</span>   <br />   <br />  </p>  <p>   <strong><span style="font-family: verdana;">SQL Server 2005 Express Edition SP-1 Advanced Features</span>   </strong>  </p>  <p>   <span style="font-family: Verdana;"><a href="http://www.microsoft.com/downloads/details.aspx?familyid=4C6BA9FD-319A-4887-BC75-3B02B5E48A40&displaylang=en">SQL Server 2005 Express Edition with Advanced Features</a> enhances SQL Server 2005 Express Edition (SQL Express or SSX) dramatically, so it deserves special treatment here. SQL Express gains full text indexing and now supports SQL Server Reporting Services (SSRS) on the local SSX instance. The SP-1 with Advanced Features setup package, which Microsoft released on April 18, 2006, installs the release version of SQL Server Management Studio Express (SSMSE) and the full version of Business Intelligence Development Studio (BIDS) for designing and editing SSRS reports. My &#39;<a href="http://www.ftponline.com/special/sqlserver/rjennings-sp1/">Install SP-1 for SQL Server 2005 and Express</a>&#39; article for FTPOnline&#39;s <a href="http://www.ftponline.com/special/sqlserver/">SQL Server Special Report</a> provides detailed, illustrated installation instructions for and related information about the release version of SP-1. 
SP-1 makes SSX the most capable of all currently available Express editions of commercial RDBMSs for Windows.</span>  </p>  <p>   <strong><span style="font-family: verdana;">OpenLink Software&#39;s Virtuoso Open-Source Edition</span>   </strong>   <br />   <span style="font-family: verdana;"></span>   <br />   <span style="font-family: verdana;"><a href="http://openlinksw.com/">OpenLink Software</a> announced an <a href="http://virtuoso.openlinksw.com/wiki/main/Main/">open-source version</a> of its <a href="http://virtuoso.openlinksw.com/">Virtuoso Universal Server</a> commercial DBMS on April 11, 2006. On the initial date of this post, May 2, 2006, Virtuoso Open-Source Edition (VOS) was virtually under the radar as an open-source product. According to <a href="http://www.openlinksw.com/press/VOSPressRelease.htm">this press release</a>, the new edition includes:</span> <span style="font-family: Verdana;"></span>  </p>  <blockquote>   <span style="font-family: Verdana;"></span>  </blockquote> <blockquote></blockquote> <blockquote></blockquote>  <ul>   <li>     <a href="http://demo.openlinksw.com/sparql_demo/">SPARQL compliant RDF Triple Store</a> </li>   <li>SQL-200n Object-Relational Database Engine (SQL, XML, and Free Text) </li>   <li>Integrated BPEL Server and Enterprise Service Bus</li>   <li>WebDAV and Native File Server </li>   <li>Web Application Server that supports PHP, Perl, Python, ASP.NET, JSP, etc. </li>   <li>Runtime Hosting for Microsoft .NET, Mono, and Java </li>  </ul>VOS only lacks the virtual server and replication features that are offered by the commercial edition. 
VOS includes a Web-based administration tool called the &quot;Virtuoso Conductor&quot; According to <a href="http://www.openlinksw.com/blog/%7Ekidehen/index.vspx?page=&id=951&sid=&realm=">Kingsley Idehen&#39;s Weblog</a>, &#39;The Virtuoso build scripts have been successfully tested on Mac OS X (Universal Binary Target), Linux, FreeBSD, and Solaris (AIX, HP-UX, and True64 UNIX will follow soon). A Windows Visual Studio project file is also in the works (ETA some time this week).&#39;<br /> <br /> <em>InfoWorld</em>&#39;s Jon Udell has tracked Virtuoso&#39;s progress since <a href="http://www.infoworld.com/article/02/04/12/020415plvirtuoso_1.html">2002</a>, with an <a href="http://www.infoworld.com/article/03/03/21/12virtuoso_1.html">additional article in 2003</a> and a <a href="http://weblog.infoworld.com/udell/2006/04/28.html#a1437">one-hour podcast with Kingsley Idehen</a> on April 26, 2006. A major talking point for Virtuoso is its support for Atom 0.3 syndication and publication, Atom 1.0 syndication and (forthcoming) publication, and future support for Google&#39;s <a href="http://code.google.com/apis/gdata/overview.html">GData protocol</a>, as mentioned in <a href="http://www.openlinksw.com/blog/%7Ekidehen/index.vspx?page=&id=965">this Idehen post</a>. 
Yahoo!&#39;s <a href="http://jeremy.zawodny.com/blog/archives/006687.html">Jeremy Zawodny</a> points out that the &#39;fingerprints&#39; of <a href="http://oakleafblog.blogspot.com/2005/11/adam-bosworth-learning-from-web-and.html">Adam Bosworth</a>, Google&#39;s VP of Engineering and the primary force behind the development of Microsoft Access, &#39;are all over GData.&#39; Click <a href="http://search.blogger.com/?as_q=bosworth&ie=UTF-8&ui=blg&amp;bl_url=oakleafblog.blogspot.com&x=50&y=10">here</a> to display a list of all OakLeaf posts that mention Adam Bosworth.<br /> <br />One application for the GData protocol is querying and updating the Google Base database independently of the Google Web client, as mentioned by Jeremy: &#39;It&#39;s not about building an easier onramp to Google Base. ... Well, it is. But, again, that&#39;s the small stuff.&#39; Click <a href="http://search.blogger.com/?as_q=%22google+base%22&ie=UTF-8&x=50&y=9&q=%22google+base%22+blogurl:oakleafblog.blogspot.com&filter=0&ui=blg&sa=N&start=0">here</a> for a list of posts about my experiences with Google Base. Watch for a future OakLeaf post on the subject as the GData APIs gain ground.<br /> <span style="font-family: Verdana;"></span> <br />  <span style="font-family: Verdana;"><strong>Open-Source and Free Embedded Database Contenders</strong>  </span> <br /> <span style="font-family: Verdana;"></span> <br /> <span style="font-family: Verdana;">Open-source and free embedded SQL databases are gaining importance as the number and types of mobile devices and OSs proliferate. Embedded databases usually consist of Java classes or Windows DLLs that are designed to minimize file size and memory consumption. 
Embedded databases avoid the installation hassles, heavy resource usage and maintenance cost associated with client/server RDBMSs that run as an operating system service.</span> <br /> <br /> <span style="font-family: Verdana;">Andrew Hudson&#39;s December 2005 &#39;<a href="http://www.theinquirer.net/?article=28201">Open Source databases rounded up and rodeoed</a>&#39; review for The Inquirer provides brief descriptions of one commercial and eight open source database purveyors/products: Sleepycat, MySQL, PostgreSQL, Ingres, InnoBase, Firebird, IBM Cloudscape (a.k.a. Derby), Genezzo, and Oracle. Oracle <a href="http://www.sleepycat.com/">Sleepycat</a>* isn&#39;t an SQL database, Oracle <a href="http://www.innodb.com/index.php">InnoDB</a>* is an OEM database engine that&#39;s used by MySQL, and <a href="http://www.genezzo.com/">Genezzo</a> is a multi-user, multi-server distributed database engine written in Perl. These special-purpose databases are beyond the scope of this post.</span> <br /> <br />  <span style="font-family: Verdana;"><span style="font-size: 85%;">* Oracle <a href="http://www.oracle.com/sleepycat/index.html">purchased Sleepycat Software, Inc. in February 2006</a> and </span>   <a href="http://www.oracle.com/innodb/index.html"><span style="font-size: 85%;">purchased Innobase OY in October 2005</span>   </a><span style="font-size: 85%;">. The press release states: &#39;Oracle intends to continue developing the InnoDB technology and expand our commitment to open source software.&#39; </span>  </span> <br /> <span style="font-family: Verdana; font-size: 85%;"></span> <br /> <span style="font-family: Verdana;">   <a href="http://db.apache.org/derby/"><strong>Derby</strong>   </a> is an open-source release by the <a href="http://www.apache.org/">Apache Software Foundation</a> of the <a href="http://www.infoworld.com/article/04/08/03/HNcloudscape_1.html">Cloudscape Java-based database that IBM acquired</a> when it bought Informix in 2001. 
IBM offers a commercial release of Derby as <a href="http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0408cline/">IBM Cloudscape 10.1</a>. Derby is a Java class library that has a relatively light footprint (2 MB), which makes it suitable for <a href="http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0503stumpf/">client/server synchronization</a> with the IBM DB2 Everyplace Sync Server in <a href="http://www-128.ibm.com/developerworks/library/wi-cloud/">mobile applications</a>. The IBM DB2 Everyplace Express Edition isn&#39;t open source or free*, so it doesn&#39;t qualify for this post. The same is true for the corresponding Sybase SQL Anywhere components.**</span> <br /> <br /> <br />  <p>   <span style="font-family: verdana; font-size: 85%;">* IBM DB2 Everyplace Express Edition with synchronization costs $379 per server (up to two processors) and $79 per user. DB2 Everyplace Database Edition (without DB2 synchronization) is $49 per user. (Prices are based on those when </span>   <a href="http://news.earthweb.com/wireless/article.php/3107101"><span style="font-family: verdana; font-size: 85%;">IBM announced version 8</span>   </a><span style="font-family: verdana; font-size: 85%;"> in November 2003.)</span>  </p>  <p>   <span style="font-family: verdana; font-size: 85%;">** Sybase&#39;s iAnywhere subsidiary calls SQL Anywhere &#39;the industry&#39;s leading mobile database.&#39; A Sybase SQL Anywhere Personal DB seat license with synchronization to SQL Anywhere Server is $119; the cost without synchronization wasn&#39;t available from the Sybase Web site. 
Sybase SQL Anywhere and IBM DB2 Everyplace perform similar replication functions.</span>  </p>  <p>   <span style="font-family: Verdana;">Sun&#39;s <a href="http://developers.sun.com/prodtech/javadb/"><strong>Java DB</strong></a>, another commercial version of Derby, comes with the <a href="http://www.sun.com/software/solaris/">Solaris Enterprise Edition</a>, which bundles Solaris 10, the Java Enterprise System, developer tools, desktop infrastructure and N1 management software. A recent Between the Lines blog entry by ZDNet&#39;s David Berlind waxes enthusiastic over the use of <a href="http://blogs.zdnet.com/BTL/?p=2298">Java DB embedded in a browser</a> to provide offline persistence. RedMonk analyst <a href="http://www.redmonk.com/jgovernor/archives/001151.html">James Governor</a> and <em>eWeek</em>&#39;s <a href="http://www.eweek.com/article2/0,1895,1902407,00.asp">Lisa Vaas</a> wrote about the use of Java DB as a local data store when <a href="http://www.sauria.com/blog/2005/12/13#1440">Tim Bray announced Sun&#39;s Derby derivative</a> and <a href="http://blogs.sun.com/roller/page/FrancoisOrsini?entry=derby_apachecon_demo">Francois Orsini</a> demonstrated Java DB embedded in the Firefox browser at the ApacheCon 2005 conference.</span>   <br />   <span style="font-family: Verdana;"></span>   <br />   <span style="font-family: Verdana;">    <a href="http://www.firebirdsql.org/"><strong>Firebird</strong>    </a> is derived from Borland&#39;s InterBase 6.0 code, the first commercial relational database management system (RDBMS) to be released as open source. Firebird has excellent support for SQL-92 and comes in three versions: Classic, SuperServer and Embedded for Windows, Linux, Solaris, HP-UX, FreeBSD and MacOS X. The embedded version has a 1.4-MB footprint. Release Candidate 1 for Firebird 2.0 became available on March 30, 2006 and is a major improvement over earlier versions. 
<a href="http://www.borland.com/us/products/interbase/index.html">Borland continues to promote InterBase</a>, now at version 7.5, as a small-footprint, embedded database with commercial Server and Client licenses.</span>   <br />   <span style="font-family: Verdana;"></span>   <br />   <span style="font-family: Verdana;">    <a href="http://www.sqlite.org/index.html"><strong>SQLite</strong>    </a> is a featherweight C library for an embedded database that implements most SQL-92 entry- and transitional-level requirements (some through the JDBC driver) and supports transactions within a tiny 250-KB code footprint. <a href="http://www.sqlite.org/cvstrac/wiki?p=SqliteWrappers">Wrappers</a> support a multitude of languages and operating systems, including Windows CE, SmartPhone, Windows Mobile, and Win32. SQLite&#39;s primary <a href="http://www.sqlite.org/omitted.html">SQL-92 limitations</a> are the lack of nested transactions, the inability to alter a table design once committed (other than with RENAME TABLE and ADD COLUMN operations), and the lack of enforced foreign-key constraints. SQLite provides read-only views, triggers, and 256-bit encryption of database files. A downside is that the entire database file is <a href="http://weblogs.asp.net/jgalloway/archive/2006/04/12/442615.aspx">locked while a transaction is in progress</a>. SQLite uses file access permissions in lieu of GRANT and REVOKE commands. Using SQLite involves no license; its code is entirely in the public domain.</span>  </p>  <p>   <span style="font-family: Verdana; font-size: 85%;">The Mozilla Foundation&#39;s <a href="http://wiki.mozilla.org/Mozilla2:Unified_Storage">Unified Storage wiki</a> says this about SQLite: &#39;SQLite will be the back end for the unified store [for Firefox]. Because it implements a SQL engine, we get querying &#39;for free&#39;, without having to invent our own query language or query execution system. 
Its code-size footprint is moderate (250k), but it will hopefully simplify much existing code so that the net code-size change should be smaller. It has exceptional performance, and supports concurrent access to the database. Finally, it is released into the public domain, meaning that we will have no licensing issues.&#39;</span>  </p>  <p>   <span style="font-family: verdana;">Vieka Technology, Inc.&#39;s <a href="http://vieka.com/esql.htm"><strong>eSQL 2.11</strong></a> is a port of SQLite to Windows Mobile (Pocket PC and Smartphone) and Win32, and includes development tools for Windows devices and PCs, as well as a .NET native data provider. A conventional ODBC driver also is available. eSQL for Windows (Win32) is free for personal and commercial use; eSQL for Windows Mobile requires a license for commercial (for-profit or business) use.</span>  </p>  <p>   <span style="font-family: verdana;">    <a href="http://hsqldb.org/"><strong>HSQLDB</strong>    </a> isn&#39;t on most reviewers&#39; radar, which is surprising because it&#39;s the default database for <a href="http://www.openoffice.org/">OpenOffice.org</a> (OOo) 2.0&#39;s <a href="http://www.openoffice.org/product/base.html">Base</a> suite member. HSQLDB 1.8.0.1 is an open-source (BSD license) Java embedded database engine based on Thomas Mueller&#39;s original Hypersonic SQL Project. Using OOo&#39;s Base feature requires installing the Java 2.0 Runtime Engine (which is not open-source) or the presence of an alternative open-source engine, such as Kaffe. 
My prior posts about OOo Base and HSQLDB are <a href="http://oakleafblog.blogspot.com/2005/10/openoffice-base-20-vs-microsoft-access.html">here</a>, <a href="http://oakleafblog.blogspot.com/2005/10/openoffice-base-20-vs-microsoft-access_22.html">here</a> and <a href="http://oakleafblog.blogspot.com/2005/10/openoffice-20-base-matches-microsoft.html">here</a>.</span>  </p>  <p>   <span style="font-family: verdana;">The <a href="http://hsqldb.sourceforge.net/web/hsqlDocsFrame.html">HSQLDB 1.8.0 documentation</a> on SourceForge states the following regarding SQL-92 and later conformance:</span>  </p>  <span style="font-family: verdana;">   <blockquote>    <p>     <span style="font-family: verdana;">HSQLDB 1.8.0 supports the dialect of SQL defined by SQL standards 92, 99 and 2003. This means where a feature of the standard is supported, e.g. left outer join, the syntax is that specified by the standard text. Many features of SQL92 and 99 up to Advanced Level are supported and there is support for most of SQL 2003 Foundation and several optional features of this standard. However, certain features of the Standards are not supported so no claim is made for full support of any level of the standards. </span>    </p>   </blockquote>   <span style="font-family: verdana;"><span style="font-size: 85%;">Other less well-known embedded databases designed for or suited to mobile deployment are </span>    <a href="http://www.mimer.com/leftright.asp?secId=172"><span style="font-size: 85%;">Mimer SQL Mobile</span>    </a><span style="font-size: 85%;"> and </span>    <a href="http://www.vistadb.net/"><span style="font-size: 85%;">VistaDB 2.1</span>    </a><span style="font-size: 85%;">. 
Neither product is open-source, and both require paid licensing; VistaDB requires a small up-front payment by developers but offers royalty-free distribution.</span>   </span> <br /> <br /> <span style="font-family: Verdana;">Java DB, Firebird embedded, SQLite and eSQL 2.11 are contenders for lightweight PC and mobile device database projects that aren&#39;t Windows-only.</span> <br /> <br />   <strong>    <span style="font-family: verdana;">SQL Server 2005 Everywhere<br />    </span><span style="font-family: Verdana;"></span>   </strong> <br /> <span style="font-family: verdana;">If you&#39;re a Windows developer, SQL Server Mobile is the logical embedded database choice for mobile applications for Pocket PCs and Smartphones. Microsoft&#39;s April 19, 2006 press release delivered the news that SQL Server 2005 Mobile Edition (SQL Mobile or SSM) would gain a big brother: SQL Server 2005 Everywhere Edition (SSE). </span> <br /> <span style="font-family: verdana;"></span> <br /> <span style="font-family: verdana;">Currently, the SSM client is licensed (at no charge) to run in production on devices with Windows CE 5.0, Windows Mobile 2003 for Pocket PC or Windows Mobile 5.0, or on PCs with Windows XP Tablet Edition only. SSM also is licensed for development purposes on PCs running Visual Studio 2005.</span>   <span style="font-family: verdana;"> Smart Device replication with SQL Server 2000 SP3 and later databases has been the most common application so far for SSM.<br /> <br />   </span><span style="font-family: verdana;">By the end of 2006, Microsoft will license SSE for use on <em>all</em> PCs running any Win32 version or the preceding device OSs. A version of SQL Server Management Studio Express (SSMSE), updated to support SSE, is expected to be released by the end of the year. These features will qualify SSE as <em>the universal embedded database</em> for Windows client and smart-device applications. 
</span> <br /> <br /> <span style="font-family: verdana;">For more details on SSE, read <a href="http://weblogs.asp.net/jgalloway/archive/2006/04/11/442451.aspx">Jon Galloway&#39;s April 11, 2006 blog post</a> and my &#39;<a href="http://www.ftponline.com/special/sqlserver/rjennings-mobile/">SQL Server 2005 Mobile Goes Everywhere</a>&#39; article for the <a href="http://www.ftponline.com/special/sqlserver/">FTPOnline Special Report on SQL Server</a>.</span><span style="font-family: verdana;"></span>&quot;  <p>(Via <a href="http://oakleafblog.blogspot.com">OakLeaf Systems</a>.)</p>  </span> </blockquote> 
]]></content:encoded>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2006-04-13#957">
  <rss:title>Prerelational DBMS vendors - a quick overview</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2006-04-13T19:04:34Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">Prerelational DBMS vendors - a quick overview: &quot; IBM. With BOMP and D-BOMP, IBM was probably the first company to commercialize precursors to DBMS. (BOMP stood for Bill Of Materials Planning, foreshadowing the hierarchical architecture of IMS.) Out of those grew DL/1 and IMS, IBM’s flagship hierarchical DBMS, and the world’s first dominant DBMS product(s). Of course, IBM also innovated relational DBMS, via the research of E. F. “Ted” Codd, then some prototype products, and eventually the mainframe version of DB2. To this day DB2 on the mainframe remains one of the world’s major DBMS, as does the separate but related product of DB2 for “open systems.” Cincom. In the 1970s, Cincom was probably the most successful independent software product company. Its flagship product was Total, a shallow-network DBMS that was a little more general than the strictly hierarchical IMS. What’s more, Total ran on almost any brand of computer hardware. Cincom remains independent and privately held to this day. Cullinane/Cullinet. Charlie Bachman innovated a true network DBMS at Honeywell, but it didn’t turn into a serious product at that time. B. F. Goodrich, however, ran a version. This is what John Cullinane’s company bought and turned into IDMS, which at least on the mainframe supplanted Total as the technical, mind share, and probably revenue market leader. Cullinet (as it was then called) ran into technical difficulties, however, losing ground to the more flexible index-based DBMS. It was eventually sold to Computer Associates. A lot of software industry leaders cut their teeth at Cullinet, notably Andrew “Flip” Filipowski, later the colorful founder of Platinum. Other alumni include Renato “Ron” Zambonini, Dave Litwack, Dave Ireland, and the original PowerBuilder development team. 
John Landry and Bob Weiler ran the firm for a while toward the end, but they don’t really count; rather, they’re the most prominent alumni of applications pioneer McCormack &amp; Dodge. Note: Index-based is a term I used in and probably coined for my first report in 1982, comprising both inverted-list and relational DBMS, as opposed to the link(ed)-list hierarchical and network products such as IMS, Total, and IDMS. The companies that beat Cullinet were long-time rival Software AG, and then especially Applied Data Research; then all three of those independents were blown out by IBM’s DB2. And then the whole mainframe DBMS business was in turn obsoleted by the rise of UNIX … but I’m getting ahead of my story. Software AG. Like Cincom, Germany-based Software AG is a 1970s DBMS pioneer that has always remained independent and privately held. Sort of. Twice, Software AG of North America was spun off as a separate, eventually public company. Software AG’s flagship DBMS was the inverted-list product ADABAS. SAP’s MaxDB was also owned by Software AG for a while (and seemingly by every other significant German computer company as well - or more precisely, by Nixdorf where it was developed, and by Siemens after it bought Nixdorf). I actually visited Software AG in Darmstadt once. Founder Peter Schnell and key techie Peter Page were both gracious hosts. Schnell was proud of their new building, and especially of the hexagon-based wooden dual desks he’d personally designed. General analytic rule - when the CEO is focused on the décor, this is not a good sign for the company’s near-term prospects. (I call this having an “edifice complex.”) Applied Data Research (ADR). ADR is often credited as being the first independent software company, having introduced products in the late 1960s and prevailed in antitrust struggles against IBM to allow the business to survive. Basically, it sold programmer productivity tools. 
This led it to acquire Datacom/DB, an inverted-list DBMS developed in the Dallas area. In the early 1980s, Datacom/DB began to boom, and was on a track to surpass both IDMS and ADABAS in market share until DB2 showed up and blew them all away. ADR was particularly aided by its fourth-generation language (4GL) IDEAL, which was an excellent product notwithstanding the famous State of New Jersey fiasco. (As John Landry said to me about that one, “4GLs are powerful tools. In particular, they allow you to write bad programs really quickly.”) ADR was an underappreciated powerhouse, boasting all of the Fortune 100 as customers way back in the early 1980s (yes, even archrival IBM). When the DBMS business stalled, however, ADR was quickly sold - first to Ameritech (the Illinois-based Baby Bell company), and soon thereafter to Computer Associates. Computer Corporation of America (CCA). CCA’s DBMS Model 204 may have been the best of the prerelational products, boasting an inverted-list architecture akin to that of ADABAS and Datacom/DB. The company was also interesting in that it was first and foremost a government contract research shop, and hence did all sorts of interesting prototype work that sadly never got commercialized. In about 1983 it became clear that the company wasn’t going anywhere, and it put itself up for sale. I was personally instrumental in that decision. Our investment banker pretended he was considering taking CCA public. CCA President Jim Rothnie showed us revenue projections. I asked how he had gotten them. He replied that he had taken the market size projection 5 years out, assumed 10%, and drawn a “plausible curve.” However, I quickly got Socratic with him. “How many salesmen do you have?” “How much revenue does the average experienced salesman produce?” “How many experienced salesmen do you expect to have next year?” “How high do you think their average productivity can grow?” “Let us multiply.” (Yes, I really said that. 
I can be a jerk. And anyway Jim was the sort of analytic guy one can say that to without giving serious offense.) CCA was sold to a Canadian insurance company whose name I’ve now forgotten. Eventually, it was spun back out (perhaps after some intermediate changes of ownership), and resurfaced as primarily a data integration company, called Praxis. In the real old days (mid 1970s, perhaps), Model 204 was resold by Informatics (later Informatics General, later the hostile takeover that became the guts of Sterling Software, which like so many other companies was eventually absorbed into Computer Associates). I know this because Richard Currier used to sell the product when he worked at Informatics. That probably makes Richard and me about the only two people who still remember the fact. Hmm. I forgot to mention Intel’s System 2000. Well, truth be told it was a dying product even back when I first became an analyst in 1981, and I recall nothing about it, except Gene Lowenthal’s observation that Intel had had trouble selling chips and DBMS through the same salesforce. I think Al Sisto, who I probably met when he was head of sales at RTI (Relational Technology, Inc. - later called Ingres), came out of that business, but I’m not 100% sure. I remember Pete Tierney from that RTI management team more clearly anyway, although that’s mainly because we stayed in touch at subsequent companies over the years.&quot; (Via Software Memories.)</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>
<a href="http://www.softwarememories.com/2006/02/09/prerelational-dbms-vendors-a-quick-overview/">Prerelational DBMS vendors - a quick overview</a>: &quot;</p>
<p>
<strong>IBM. </strong> With BOMP and D-BOMP, IBM was probably the first company to commercialize precursors to DBMS.  (BOMP stood for Bill Of Materials Planning, foreshadowing the hierarchical architecture of IMS.)  Out of those grew DL/1 and IMS, IBM’s flagship hierarchical DBMS, and the world’s first dominant DBMS product(s).  Of course, IBM also innovated relational DBMS, via the research of E. F. “Ted” Codd, then some prototype products, and eventually the mainframe version of DB2.  To this day DB2 on the mainframe remains one of the world’s major DBMS, as does the separate but related product of DB2 for “open systems.”</p>
	<p>
<strong>Cincom. </strong> In the 1970s, Cincom was probably the most successful independent software product company.  Its flagship product was Total, a shallow-network DBMS that was a little more general than the strictly hierarchical IMS.  What’s more, Total ran on almost any brand of computer hardware.  Cincom remains independent and privately held to this day.</p>
	<p>
<strong>Cullinane/Cullinet.</strong>  Charlie Bachman innovated a true network DBMS at Honeywell, but it didn’t turn into a serious product at that time.  B. F. Goodrich, however, ran a version.  This is what John Cullinane’s company bought and turned into IDMS, which at least on the mainframe supplanted Total as the technical, mind share, and probably revenue market leader. Cullinet (as it was then called) ran into technical difficulties, however, losing ground to the more flexible index-based DBMS.  It was eventually sold to Computer Associates.   </p>
	<p>A lot of software industry leaders cut their teeth at Cullinet, notably Andrew “Flip” Filipowski, later the colorful founder of Platinum.  Other alumni include Renato “Ron” Zambonini, Dave Litwack, Dave Ireland, and the original PowerBuilder development team.  John Landry and Bob Weiler ran the firm for a while toward the end, but they don’t really count; rather, they’re the most prominent alumni of applications pioneer McCormack &amp; Dodge.</p>
	<p>
<strong>Note:</strong>  <em>Index-based</em> is a term I used in and probably coined for my first report in 1982, comprising both inverted-list and relational DBMS, as opposed to the link(ed)-list hierarchical and network products such as IMS, Total, and IDMS.  The companies that beat Cullinet were long-time rival Software AG, and then especially Applied Data Research; then all three of those independents were blown out by IBM’s DB2.  And then the whole mainframe DBMS business was in turn obsoleted by the rise of UNIX … but I’m getting ahead of my story.</p>
	<p>
<strong>Software AG.</strong>   Like Cincom, Germany-based Software AG is a 1970s DBMS pioneer that has always remained independent and privately held.  Sort of.  Twice, Software AG of North America was spun off as a separate, eventually public company.  Software AG’s flagship DBMS was the inverted-list product ADABAS.  SAP’s MaxDB was also owned by Software AG for a while (and seemingly by every other significant German computer company as well - or more precisely, by Nixdorf where it was developed, and by Siemens after it bought Nixdorf).</p>
	<p>I actually visited Software AG in Darmstadt once.  Founder Peter Schnell and key techie Peter Page were both gracious hosts.  Schnell was proud of their new building, and especially of the hexagon-based wooden dual desks he’d personally designed.    General analytic rule - when the CEO is focused on the décor, this is not a good sign for the company’s near-term prospects.  (I call this having an “edifice complex.”)</p>
	<p>
<strong>Applied Data Research (ADR). </strong> ADR is often credited as being the first independent software company, having introduced products in the late 1960s and prevailed in antitrust struggles against IBM to allow the business to survive.  Basically, it sold programmer productivity tools.  This led it to acquire Datacom/DB, an inverted-list DBMS developed in the Dallas area.   In the early 1980s, Datacom/DB began to boom, and was on a track to surpass both IDMS and ADABAS in market share until DB2 showed up and blew them all away.  ADR was particularly aided by its fourth-generation language (4GL) IDEAL, which was an excellent product notwithstanding the famous State of New Jersey fiasco.  (As John Landry said to me about that one, “4GLs are powerful tools.  In particular, they allow you to write bad programs really quickly.”)</p>
	<p>ADR was an underappreciated powerhouse, boasting all of the Fortune 100 as customers way back in the early 1980s (yes, even archrival IBM).  When the DBMS business stalled, however, ADR was quickly sold - first to Ameritech (the Illinois-based Baby Bell company), and soon thereafter to Computer Associates.</p>
	<p>
<strong>Computer Corporation of America (CCA). </strong> CCA’s DBMS Model 204 may have been the best of the prerelational products, boasting an inverted-list architecture akin to that of ADABAS and Datacom/DB.  The company was also interesting in that it was first and foremost a government contract research shop, and hence did all sorts of interesting prototype work that sadly never got commercialized.  In about 1983 it became clear that the company wasn’t going anywhere, and it put itself up for sale.  </p>
	<p>I was personally instrumental in that decision.  Our investment banker pretended he was considering taking CCA public.  CCA President Jim Rothnie showed us revenue projections.  I asked how he had gotten them.  He replied that he had taken the market size projection 5 years out, assumed 10%, and drawn a “plausible curve.”  However, I quickly got Socratic with him.  “How many salesmen do you have?” “How much revenue does the average experienced salesman produce?”  “How many experienced salesmen do you expect to have next year?” “How high do you think their average productivity can grow?”  “Let us multiply.”  (Yes, I really said that.  I can be a jerk.  And anyway Jim was the sort of analytic guy one can say that to without giving serious offense.)</p>
	<p>CCA was sold to a Canadian insurance company whose name I’ve now forgotten.  Eventually, it was spun back out (perhaps after some intermediate changes of ownership), and resurfaced as primarily a data integration company, called Praxis.</p>
	<p>In the real old days (mid 1970s, perhaps), Model 204 was resold by Informatics (later Informatics General, later the hostile takeover that became the guts of Sterling Software, which like so many other companies was eventually absorbed into Computer Associates).  I know this because Richard Currier used to sell the product when he worked at Informatics.  That probably makes Richard and me about the only two people who still remember the fact.</p>
	<p>Hmm.  I forgot to mention <strong>Intel’s System 2000. </strong> Well, truth be told it was a dying product even back when I first became an analyst in 1981, and I recall nothing about it, except Gene Lowenthal’s observation that Intel had had trouble selling chips and DBMS through the same salesforce.  I think Al Sisto, who I probably met when he was head of sales at RTI (Relational Technology, Inc. - later called Ingres), came out of that business, but I’m not 100% sure.  I remember Pete Tierney from that RTI management team more clearly anyway, although that’s mainly because we stayed in touch at subsequent companies over the years.</p>&quot;

<p>(Via <a href="http://www.softwarememories.com">Software Memories</a>.)</p>]]></content:encoded>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2005-04-28#819">
  <rss:title>SAP, IBM Make Play for Oracle Database Customers With New DB2 Version</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2005-04-28T21:52:55Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">CNET reports: There are a whopping 44,000 SAP customers running on Oracle databases, and IBM wants them. To get them, for the first time ever, it&#39;s optimized its enterprise database for a specific vendor&#39;s applications. The new version of DB, 8.2.2, will include a slew of SAP-optimized features, including self-tuning, self-configuration, silent install, dynamic storage allocation and more. Wouldn&#39;t SAP be better served by simply making their application database independent via ODBC? This process really could have commenced years ago and prevented today&#39;s dilemma: Your Partner has become Your most aggressive Competitor! SAP tuned specifically for DB2 or SAP tuned likewise for Microsoft SQL simply reeks of: &quot;Same Sh*t different Pile&quot;.  Microsoft and IBM will emulate Oracle in due course regarding their assault on SAP&#39;s market if DBMS specificity remains the SAP data access API strategy (this is a simple fact). SAP should be using its quest for DBMS independence to stimulate or contribute ODBC enhancements (should ODBC be lacking in areas critical to its application needs; it is available in Open Source form and across all major platforms). Should the ODBC API not be the problem, then it can push ODBC Driver vendors (DBMS vendors such as IBM included) to get their Drivers in shape (should they be lacking, I know our ODBC Drivers are absolutely fine for this kind of task). Database specificity gets application vendors nowhere. You can only control your business development destiny by being database independent. When applications are database independent the intellectual capital that drives your applications is preserved. This is akin to building physical and logical firewalls around the ecosystem created by your products. This is much better than being a pseudo DBMS engine reseller for a future competitor.</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<font size="2">
<p><a href="http://ct.enews.eweek.com/rd/cts?d=186-1965-11-96-81585-221904-0-0-0-1">CNET reports</a>:</p>
<blockquote dir="ltr" style="MARGIN-RIGHT: 0px">
<p>There are a whopping <u>44,000 SAP customers</u> running on Oracle databases, and IBM wants them. To get them, for the first time ever, it's optimized its enterprise database for a specific vendor's applications. The new version of DB, 8.2.2, will include a slew of SAP-optimized features, including self-tuning, self-configuration, silent install, dynamic storage allocation and more. </p></blockquote>
<p dir="ltr">Wouldn't SAP be better served by simply making their application database independent via <a href="http://uda.openlinksw.com/odbc/">ODBC</a>? This process really could have&nbsp;commenced years ago and prevented today's dilemma: Your Partner has&nbsp;become Your most aggressive Competitor! </p>
<p dir="ltr">SAP tuned specifically for DB2 or SAP tuned likewise for Microsoft SQL&nbsp;simply reeks of: "Same Sh*t different Pile".&nbsp; Microsoft and IBM will emulate Oracle in due course regarding their assault on SAP's market if DBMS specificity remains the SAP data access API strategy (this is a simple fact).</p>
<p dir="ltr">SAP should be using its quest for DBMS independence to stimulate or contribute ODBC enhancements&nbsp;(should ODBC be lacking in areas critical to its application needs; it is available in <a href="http://www.iodbc.org">Open Source form</a>&nbsp;and across all major platforms).&nbsp;Should the ODBC API not be the problem, then it can&nbsp;push ODBC Driver vendors (DBMS vendors such as IBM included) to get their Drivers in shape (should they be lacking, I know <a href="http://uda.openlinksw.com/odbc ">our ODBC Drivers</a> are absolutely fine for this kind of task).</p>
<p dir="ltr">Database specificity gets application vendors nowhere.&nbsp;You can only control your business development destiny by being database independent. When applications are database independent&nbsp;the intellectual capital&nbsp;that drives your applications is preserved. This is akin to building physical and logical firewalls around the ecosystem created by your products. This is much better than being a pseudo DBMS engine reseller for a future competitor.</p>
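<p dir="ltr">As a rough illustration of what database independence looks like in application code, here is a minimal Python sketch. It rests on assumptions: Python's built-in sqlite3 module stands in for an ODBC driver so that the example runs anywhere, and the names open_connection, setup_demo_data, customers_in_region and the customers table are hypothetical. The point is only that the application logic touches a generic connect/execute/fetch interface and portable SQL, so swapping the DBMS should mean changing the connection factory and little else.</p>

```python
import sqlite3
from typing import List


def open_connection() -> sqlite3.Connection:
    # Hypothetical connection factory: the only place that knows which DBMS
    # is in use. sqlite3 is a stand-in here; a bridge that returns a
    # DB-API style connection could be swapped in at this single point.
    return sqlite3.connect(":memory:")


def setup_demo_data(conn) -> None:
    # Plain, portable SQL only: no vendor-specific types or functions.
    cur = conn.cursor()
    cur.execute("CREATE TABLE customers (name VARCHAR(50), region VARCHAR(20))")
    cur.executemany(
        "INSERT INTO customers (name, region) VALUES (?, ?)",
        [("Acme", "EMEA"), ("Globex", "APAC"), ("Initech", "EMEA")],
    )
    conn.commit()


def customers_in_region(conn, region: str) -> List[str]:
    # Application logic sees only the generic cursor/execute/fetch calls.
    # The "?" placeholder is also the parameter marker used by ODBC.
    cur = conn.cursor()
    cur.execute(
        "SELECT name FROM customers WHERE region = ? ORDER BY name", (region,)
    )
    return [row[0] for row in cur.fetchall()]


if __name__ == "__main__":
    conn = open_connection()
    setup_demo_data(conn)
    print(customers_in_region(conn, "EMEA"))
    conn.close()
```

<p dir="ltr">In this sketch, nothing below open_connection names a vendor, which is the property the argument above is asking application vendors to protect.</p>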
<p dir="ltr"></font>&nbsp;</p>]]></content:encoded>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2005-04-25#807">
  <rss:title>Oracle To Support .NET Runtime Hosting</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2005-04-25T22:54:03Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">Better late than never! Oracle has announced the commencement of a journey that we completed in 2002 (across Microsoft .NET and Mono). Hopefully, their support of CLR Runtime Hosting will bring added clarity to the intrinsic value of the multi-language bindings via the ECMA-CLI that facilitate the development and deployment of DBMS Stored Procedures using a plethora of languages (ditto creation of User Defined Types, Functions, and Table Valued Functions). I also hope that Oracle will support Mono -off the bat- rather than sending the typical &quot;we will port to Mono sometime in the future...&quot; type of message, which will not be acceptable, especially as we pulled this off first time around in 2002 (atop Mono, even then). Thus, I am sure they can do it in 2005 :-) Hopefully we should be able to add Oracle 10g Release 2 and DB2 to our SQL CLR hosting features comparison document that currently only covers SQL Server 2005 and Virtuoso.</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>Better late than never! Oracle&nbsp;has <a href="http://www.oracle.com/technology/oramag/oracle/05-may/o35briefs.html">announced</a> the commencement of a journey&nbsp;that we <a href="http://www.novell.com/news/press/archive/2002/ximian_archive/pr112502.html">completed in 2002 </a>(across Microsoft .NET and Mono). Hopefully, their support of CLR Runtime Hosting will bring added clarity to the intrinsic value of the multi-language bindings via the ECMA-CLI that facilitate the development and deployment of DBMS Stored Procedures using a plethora of languages (ditto creation of User Defined Types, Functions, and Table Valued Functions).</p>
<p>I also hope that Oracle will support Mono -off the bat- rather than&nbsp;sending the typical&nbsp;"we will port to Mono sometime in the future..." type of message, which&nbsp;will not be acceptable, especially as we pulled this off first time around in&nbsp;2002 (atop Mono, even then). Thus,&nbsp;I am sure they can do it in 2005 :-)</p>
<p>Hopefully we should be able to add Oracle 10g Release 2 and <a href="http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0406evans/index.html">DB2</a> to our SQL CLR hosting features <a href="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/index.vspx?id=138">comparison document</a> that currently only covers SQL Server 2005 and <a href="http://virtuoso.openlinksw.com/">Virtuoso</a>.</p>
<p>&nbsp;</p>]]></content:encoded>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2005-02-28#698">
  <rss:title>The Cost of Database Specificity</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2005-02-28T15:57:24Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">The cost of writing database specific applications (Open or Closed Source) adversely affects application developers/vendors and end users alike. This article in Network Computing (regarding Oracle and PeopleSoft&#39;s DB2 user base) provides great insight into the time-tested problem of writing or acquiring database driven applications that are database specific.  DB2 users of PeopleSoft and IBM (the DB2 developer and vendor) suspect that Oracle will obviously try to use its ownership of PeopleSoft to covertly coerce DB2 users into becoming Oracle DBMS users. This strategy would take the form of new features and fixes discrimination as somewhat echoed in these excerpts: &quot;..In the crescendo surrounding the Oracle-PeopleSoft merger, one question has been repeatedly drowned out: What happens to users of PeopleSoft&#39;s DB2 database? Oracle chief Larry Ellison has repeatedly assured DB2 users--and IBM--that Oracle will continue to support DB2 and PeopleSoft&#39;s interfaces to IBM&#39;s WebSphere platform. But IBM isn&#39;t taking any chances, announcing an initiative to alter DB2 to work with products from Oracle rival SAP.&quot; &quot;..IBM has good reason to be concerned. Oracle vies with SAP as the leading vendor for enterprise applications, but it&#39;s under pressure to show concrete benefits from the merger by combining assets and pumping up revenue. One obvious tactic will be to use the PeopleSoft applications to steer enterprise customers toward the Oracle database by optimizing performance and features toward the Oracle back end.&quot; If PeopleSoft&#39;s application core were ODBC-based, the vulnerability to this predictable competitive tactic would at the very least be significantly alleviated. 
DB2 end-users and IBM the product vendor would have a much stronger basis for countering Oracle by taking them to task about their claimed inability to implement new application functionality enhancements against DB2 etc. especially as this would have morphed into a generic database issue as opposed to a DB2 specific issue -- by virtue of the application and data access layer separation provided by ODBC&#39;s architecture.  </dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p dir="ltr" style="MARGIN-RIGHT: 0px">The cost of writing database specific applications (Open or Closed Source) adversely affects application developers/vendors and end users alike. <a href="http://www.nwc.com/showitem.jhtml?docid=1603buzz3">This</a> article in <a href="http://www.nwc.com">Network Computing</a> (regarding Oracle and PeopleSoft's DB2 user base) provides great insight into&nbsp;the time-tested problem of writing or acquiring database&nbsp;driven applications that&nbsp;are database specific.</p>
<p dir="ltr" style="MARGIN-RIGHT: 0px">DB2 users of PeopleSoft and IBM (the DB2 developer and vendor) suspect that Oracle will obviously try to use its ownership of PeopleSoft to covertly coerce&nbsp;DB2 users&nbsp;into becoming Oracle DBMS users. This strategy would take the form of&nbsp;new features and fixes discrimination&nbsp;as somewhat&nbsp;echoed in these excerpts:</p>
<blockquote dir="ltr" style="MARGIN-RIGHT: 0px">
<p dir="ltr" style="MARGIN-RIGHT: 0px"><span class="grey12">"..In the crescendo surrounding the Oracle-PeopleSoft merger, one question has been repeatedly drowned out: What happens to users of PeopleSoft's DB2 database? Oracle chief Larry Ellison has repeatedly assured DB2 users--and IBM--that Oracle will continue to support DB2 and PeopleSoft's interfaces to IBM's WebSphere platform. But IBM isn't taking any chances, announcing an initiative to alter DB2 to work with products from Oracle rival SAP." </span></p>
<p dir="ltr" style="MARGIN-RIGHT: 0px"><span class="grey12">"..IBM has good reason to be concerned. Oracle vies with SAP as the leading vendor for enterprise applications, but it's under pressure to show concrete benefits from the merger by combining assets and pumping up revenue. One obvious tactic will be to use the PeopleSoft applications to steer enterprise customers toward the Oracle database by optimizing performance and features toward the Oracle back end."</span></p></blockquote>
<p dir="ltr" style="MARGIN-RIGHT: 0px">If PeopleSoft's application core were ODBC-based, the vulnerability to this predictable competitive tactic would at the very least be significantly alleviated. DB2 end-users and IBM the product vendor would have a much stronger&nbsp;basis for countering Oracle by taking them&nbsp;to task about their claimed inability to implement new application functionality enhancements against DB2 etc. especially as this would&nbsp;have morphed into&nbsp;a generic database issue as opposed to a DB2 specific issue --&nbsp;by virtue of the application and data access layer separation provided by <a href="http://uda.openlinksw.com/odbc/">ODBC's architecture</a>. </p>
<p dir="ltr" style="MARGIN-RIGHT: 0px">&nbsp;</p>]]></content:encoded>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2005-01-04#657">
  <rss:title>IBM Flexes XML Muscle</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2005-01-04T17:18:36Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">Here is another article titled &quot;IBM Flexes XML Muscle&quot; that covers the same general theme: IBM&#39;s appreciation of Unified Storage. As indicated in an earlier post: IBM is clearly validating what we have done with Virtuoso (as was the case initially with their Virtual / Federated DBMS initiative ala DB2 Integrator). Here is an excerpt from today&#39;s eWeek article supporting this position: To achieve maximum XML performance, bolstered indexing attributes in the technology will enable advanced search functions and a higher degree of filtering. IBM is also adding support for XPath and XQuery data models. This will allow users to create views that involve SQL and XQuery by sending the protocol through DB2&#39;s query optimizer for a unified query plan. Read on.. Virtuoso has been doing this since 2000; unfortunately a lot of</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>Here is another article titled &quot;<a href="http://www.eweek.com/article2/0,1759,1747224,00.asp?kc=ewnws010305dtx1k0000599">IBM Flexes XML Muscle</a>&quot; that covers the same general theme: IBM&#39;s appreciation of Unified Storage.</p>
<p>As indicated in an earlier <a href="http://www.openlinksw.com/blog/~kidehen/index.vspx?id=648">post</a>: IBM is clearly validating what we have done with Virtuoso (as was the case initially with their Virtual / Federated DBMS initiative ala DB2 Integrator). Here is an excerpt from today&#39;s <a href="http://www.eweek.com/article2/0,1759,1747224,00.asp?kc=ewnws010305dtx1k0000599">eWeek article</a> supporting this position:</p>
<blockquote dir="ltr" style="MARGIN-RIGHT: 0px">
<p>To achieve maximum XML performance, bolstered indexing attributes in the technology will enable advanced search functions and a higher degree of filtering. IBM is also adding support for XPath and XQuery data models. This will allow users to create views that involve SQL and XQuery by sending the protocol through DB2&#39;s query optimizer for a unified query plan. </p>
<p><a href="http://www.eweek.com/article2/0,1759,1747224,00.asp?kc=ewnws010305dtx1k0000599">Read on..</a></p></blockquote>
<p dir="ltr"><a href="http://virtuoso.openlinksw.com/">Virtuoso</a> has been doing this since 2000; unfortunately a lot of</p>]]></content:encoded>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2004-12-09#648">
  <rss:title>IBM Moves Database Goal Posts</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2004-12-09T13:20:52Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">Taking the Butler Research consultant quote in this piece from &quot;The Register&quot; at face value, one can only assume that IBM is basically throwing in the towel re. DB2 and its ability to handle XML :-) The excerpt below certainly implies this: ..So, using relational storage is inadequate for one reason or another, and IBM has concluded that another approach is necessary. The company’s next generation database will therefore have two storage engines: one relational store and one native XML store. And let me be quite clear about this: these engines will be completely separate, with separate tablespaces, separate indexes (Btrees and so forth on the one hand, and hierarchical on the other), and so on... Hold on here! IBM only</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>Taking the Butler Research consultant quote in this <a href="http://www.theregister.co.uk/2004/12/09/ibm_database_goalposts/">piece</a> from &quot;<a href="http://www.theregister.co.uk">The Register</a>&quot; at face value, one can only assume that IBM is basically throwing in the towel re. DB2 and its ability to handle XML :-) The excerpt below certainly implies this:</p>
<blockquote dir="ltr" style="MARGIN-RIGHT: 0px">
<p>..So, using relational storage is inadequate for one reason or another, and IBM has concluded that another approach is necessary. The company&#8217;s next generation database will therefore have two storage engines: one relational store and one native XML store. And let me be quite clear about this: these engines will be completely separate, with separate tablespaces, separate indexes (Btrees and so forth on the one hand, and hierarchical on the other), and so on...</p></blockquote>
<p dir="ltr">Hold on here! IBM only</p>]]></content:encoded>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2004-05-17#546">
  <rss:title>Preventable SQL DBMS Vulnerabilities</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2004-05-18T00:42:08Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">Here are some excerpts (inlined) with my comments (outlined) from an interesting article on SQL DBMS exploits and vulnerabilities by Aaron C. Newman, for DB2 Magazine titled &quot;6 Security Secrets Attackers don&#39;t want You To Know&quot;. How secure is your data? Looking at your information management resources through a would-be intruder&#39;s eyes can help you find (and fix) vulnerabilities. Naturally :-) When E. F. Codd developed his relational data model in 1970, the business world was a different place. Almost 35 years after his seminal work appeared, RDBMSs that sprung from Codd&#39;s ideas are the standard for storing corporate information. And, with government and industry regulations dictating what kinds of information companies have to store, manage, and audit (and for how long), protecting this information is more important than ever. Unfortunately, it&#39;s also more challenging. Even in 1985, when Dr. Codd published 12 guidelines for RDBMSs, there was little concern for data security. In those days, gaining access to a database was so difficult that advanced security features on the database were irrelevant. Today, RDBMSs carry the lifeblood of every organization. Note the use of the plural: Organizations now have many databases that are decentralized in terms of use and security controls. E-business demands that data access be extended to customers, partners, suppliers, and other parties who were rarely considered in the early data management days. With all this availability -- not to mention pressure from an array of government and industry regulations (see the sidebar, &quot;Security and Compliance&quot;) -- the need to control exactly who can access or modify data is becoming paramount. Absolute facts that are still partially understood at best. 
For instance we are still in a so-called &quot;Information Age&quot; in which standards-based data access remains an issue of contempt instead of absolute necessity. There are a number of prevailing myths about standards-based data access that continue to cloak reality: ODBC, JDBC, ADO.NET, OLEDB all deliver poor performance (compared to their native, proprietary, and database specific counterparts; native interfaces) You can&#39;t really write generic database applications with these standards due to inconsistencies in the DBMS implementations of SQL (not true! there are many aspects of the specs that address these concerns if only a majority of driver vendors would implement these features, and the application developers actually used them by seeking drivers with full implementations). Even if the above were true (which I refute strongly), how about the general security vulnerabilities that affect both Native, and Standards compliant, data access interfaces? Aaron&#39;s article does a good job of highlighting 6 areas of vulnerability: DBMS Defaults (usernames and passwords) Authentication (at connect time) Database Privileges Fixpaks Buffer Overflows SQL Injection What I have been able to do very quickly (thanks to blogging, and the power of a blog engine that supports WebDAV), is write a tabulated response to each of the items (bar Fixpaks) indicating how the OpenLink Multi-Tier Data Access Drivers (for ODBC, JDBC, ADO.NET, and OLEDB) protect corporate databases from each of these vulnerabilities. To cut a long story short, we are increasingly living a contradiction where the terms &quot;simple&quot; and &quot;free&quot; are supposed to lead us to products that can adequately handle the challenges of an increasingly sophisticated grid of inter-connecting points. I have been asked on numerous occasions, &quot;How can you build a company and business based on data access technology?&quot;. 
My reply is the same as usual, &quot;because everything comes down to data&quot;. If the data is compromised in any way, then kiss Information, Knowledge, and everything else goodbye!  </dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>Here are some excerpts (inlined) with my comments (outlined)&nbsp;from an <a href="http://www.db2mag.com/showArticle.jhtml?articleID=18901175">interesting article</a>&nbsp;on SQL DBMS exploits and vulnerabilities by <a href="http://www.appsecinc.com/">Aaron C. Newman</a>, for <a href="http://www.db2mag.com/show">DB2 Magazine</a>&nbsp;titled "6 Security Secrets Attackers don't want You To Know".</p>
<blockquote dir="ltr" style="MARGIN-RIGHT: 0px">
<p>How secure is your data? Looking at your information management resources through a would-be intruder's eyes can help you find (and fix) vulnerabilities.</p></blockquote>
<p dir="ltr">Naturally :-)</p>
<p></p>
<blockquote dir="ltr" style="MARGIN-RIGHT: 0px">
<p>When E. F. Codd developed his relational data model in 1970, the business world was a different place. Almost 35 years after his seminal work appeared, RDBMSs that sprung from Codd's ideas are the standard for storing corporate information. And, with government and industry regulations dictating what kinds of information companies have to store, manage, and audit (and for how long), protecting this information is more important than ever. Unfortunately, it's also more challenging.</p>
<p>Even in 1985, when <a href="http://www.databaseanswers.com/codds_rules.htm">Dr. Codd published 12 guidelines for RDBMSs</a>, there was little concern for data security. In those days, gaining access to a database was so difficult that advanced security features on the database were irrelevant. </p>
<p>Today, RDBMSs carry the lifeblood of every organization. Note the use of the plural: Organizations now have many databases that are decentralized in terms of use and security controls. E-business demands that data access be extended to customers, partners, suppliers, and other parties who were rarely considered in the early data management days. With all this availability -- not to mention pressure from an array of government and industry regulations (see the sidebar, <a href="http://www.db2mag.com/showArticle.jhtml?articleID=18901175#sidebar">"Security and Compliance"</a>) -- the need to control exactly who can access or modify data is becoming paramount. </p></blockquote>
<p dir="ltr">Absolute facts that are still partially understood at best. For instance we are still in a so-called "Information Age" in which standards-based data access remains an issue of contempt instead of absolute necessity. </p>
<p dir="ltr">There are a number of prevailing myths about standards-based data access that continue to cloak reality:</p>
<ol dir="ltr">
<li>
<div>ODBC, JDBC, ADO.NET, OLEDB all deliver poor performance (compared to their native, proprietary, and database specific counterparts; native interfaces)<br></div></li>
<li>
<div>You can't really write generic database applications with these standards due to inconsistencies in the DBMS implementations of SQL (not true! there are many aspects of the specs that address these concerns if only a majority of driver vendors would implement these features, and the application developers actually used them by seeking drivers with full implementations).</div></li></ol>
<p>Even if the above were true (which I refute strongly), how about the general security vulnerabilities that affect both Native, and Standards compliant, data access interfaces?</p>
<p>Aaron's article does a good job of highlighting 6 areas of vulnerability:</p>
<ol>
<li>
<div>DBMS Defaults (usernames and passwords)</div></li>
<li>
<div>Authentication (at connect time)</div></li>
<li>
<div>Database Privileges</div></li>
<li>
<div>Fixpaks </div></li>
<li>
<div>Buffer Overflows</div></li>
<li>
<div>SQL Injection</div></li></ol>
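<p>The last item, SQL injection, is easy to show in a few lines. The following is a minimal Python sketch, not taken from Aaron's article or from any driver discussed here. It assumes Python's built-in sqlite3 module as a stand-in database, and the users table and the function names make_demo_db, lookup_unsafe and lookup_safe are hypothetical. It contrasts building SQL by string concatenation with binding user input as a parameter.</p>

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    # Hypothetical demo table; sqlite3 stands in for any SQL DBMS.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_unsafe(conn, name):
    # VULNERABLE: user input is pasted into the SQL text, so a crafted
    # value can change the meaning of the statement.
    sql = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(sql)]


def lookup_safe(conn, name):
    # The input is bound as a parameter and is never parsed as SQL.
    return [
        row[0]
        for row in conn.execute(
            "SELECT secret FROM users WHERE name = ?", (name,)
        )
    ]


if __name__ == "__main__":
    conn = make_demo_db()
    attack = "nobody' OR '1'='1"
    print(sorted(lookup_unsafe(conn, attack)))  # every secret leaks
    print(lookup_safe(conn, attack))            # no rows match
```

<p>With the crafted input, the concatenated query returns every row, while the parameterized query treats the same input as an ordinary (non-matching) name.</p>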
<p>What I have been able to do very quickly (thanks to blogging, and the power of a blog engine that supports <a href="http://www.openlinksw.com/blog/~kidehen/index.vspx?id=543">WebDAV</a>), is write a <a href="http://www.openlinksw.com/blog/~kidehen/articles/uda_rule_book_sql_attacks.htm">tabulated response to each of the items </a>(bar Fixpaks) indicating how the <a href="http://www.openlinksw.com/info/mtproduct.htm">OpenLink Multi-Tier Data Access Drivers </a>(for ODBC, JDBC, ADO.NET, and OLEDB) protect corporate databases from each of these vulnerabilities.</p>
<p>To cut a long story short, we are increasingly living a contradiction where the terms "simple" and "free" are supposed to lead us to products that can adequately handle the challenges of an increasingly sophisticated grid of inter-connecting points. </p>
<p>I have been asked on numerous occasions, "How can you build a company and business based on data access technology?". My reply is the same as usual, "because everything comes down to data". If the data is compromised in any way, then kiss Information, Knowledge, and everything else goodbye!</p>
<p>&nbsp;</p>]]></content:encoded>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2004-04-26#531">
  <rss:title>SQL-XML Evaluation and Comparison By InfoWorld</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2004-04-27T00:15:05Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">There is a new evaluation and comparison article from InfoWorld that compares the SQL-XML integration offerings of the major DBMS vendors (Oracle 10g, DB2 8.1, Sybase ASE 12.5, and SQL Server 2000)</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>There is a <a href="http://www.infoworld.com/infoworld/article/04/04/23/17FExml_2.html">new evaluation and comparison </a>article from InfoWorld that compares the SQL-XML integration offerings of the major DBMS vendors (Oracle 10g, DB2 8.1, Sybase ASE 12.5, and SQL Server 2000)</p>]]></content:encoded>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2004-01-06#442">
  <rss:title>Enterprise Databases get a grip on XML</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2004-01-06T23:17:07Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">Databases get a grip on XML. From InfoWorld: The next iteration of the SQL standard was supposed to arrive in 2003. But SQL standardization has always been a glacially slow process, so nobody should be surprised that SQL:2003 -- now known as SQL:200n -- isn&#39;t ready yet. Even so, 2003 was a year in which XML-oriented data management, one of the areas addressed by the forthcoming standard, showed up on more and more developers&#39; radar screens. &gt;&gt; READ MORE This article rounds up products for 2003 in the critical area of Enterprise Database Technology. It certainly provides an apt reflection of how Virtuoso compares with offerings from some of the larger (but certainly slower to implement) database vendors in this space. As usual Jon Udell&#39;s quote pretty much sums this up: &quot;While the spotlight shone on the heavyweight contenders, a couple of agile innovators made noteworthy advances in 2003. OpenLink Software&#39;s Virtuoso 3.0, which we reviewed in March, stole thunder from all three major players. Like Oracle, it offers a WebDAV-accessible XML repository. Like DB2 Information Integrator, it functions as database middleware that can perform federated &quot;joins&quot; across SQL and XML sources. And like the forthcoming Yukon, it embeds the .Net CLR (Common Language Runtime), or in the case of Linux, Novell/Ximian&#39;s Mono.&quot; Albeit still somewhat unknown to the broader industry we have remained true to our &quot;innovator&quot; discipline, which still remains our chosen path to market leadership. Thus, it&#39;s worth a quick Virtuoso release history, and features recap as we get set to up the ante even further in 2004: 1998 - Virtuoso&#39;s initial public beta release with functional emphasis on Virtual Database Engine for ODBC and JDBC Data Sources. 1999 - Virtuoso&#39;s official commercial release, with emphasis still on Virtual Database functionality for ODBC, JDBC accessible SQL Databases. 
2000 - Virtuoso 2.0 adds XML Storage, XPath, XML Schema, XQuery, XSL-T, WebDAV, SOAP, UDDI, HTTP, Replication, Free Text Indexing (*feature update*), POP3, and NNTP support. 2002 - Virtuoso 2.7 extends Virtualization prowess beyond data access via enhancements to its Web Services protocol stack implementation by enabling SQL Stored Procedures to be published as Web Services. It also debuts its Object-Relational engine enhancements that include the incorporation of Java and Microsoft .NET Objects into its User Defined Type, User Defined Functions, and Stored Procedure offerings. 2003 - Virtuoso 3.0 extends data and application logic virtualization into the Application Server realm (basically a Virtual Application server too!), by adding support for ASP.NET, PHP, Java Server Pages runtime hosting (making applications built using any of these languages deployable using Virtuoso across all supported platforms). Collectively, these releases have contributed to a very premeditated architecture and vision that will ultimately unveil the inherent power of critical I.S. infrastructure virtualization along the following lines: data storage, data access, and application logic via coherent integration of SQL, XML, Web Services, and Persistent Stored Modules (.NET, Java, and other object based component building blocks).</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[
<blockquote style="margin-right: 0px;" dir="ltr"> <p><a class="listLinkLrg" title="http://newsletter.infoworld.com/t?ctl=4FEDB6:1F3948D" href="http://newsletter.infoworld.com/t?ctl=4FEDB6:1F3948D" target="_new"><strong><font face="Verdana">Databases get a grip on XML</font></strong></a><br /><font size="2"></font><font face="Verdana">From <a href="http://newsletter.infoworld.com/t?ctl=4FEDB6:1F3948D">InfoWorld</a>.</font><br /><br /><font face="Verdana,Geneva,Arial,sans-serif" size="2">The
next iteration of the SQL standard was supposed to arrive in 2003. But
SQL standardization has always been a glacially slow process, so nobody
should be surprised that SQL:2003 (now known as SQL:200n) isn&#39;t ready
yet. Even so, 2003 was a year in which XML-oriented data management,
one of the areas addressed by the forthcoming standard, showed up on
more and more developers&#39; radar screens. <a title="http://newsletter.infoworld.com/t?ctl=4FEDB6:1F3948D" href="http://newsletter.infoworld.com/t?ctl=4FEDB6:1F3948D" target="_blank">&gt;&gt; READ MORE</a></font></p></blockquote> <p dir="ltr"><font face="Verdana" size="2">This
article rounds up products for 2003 in the critical area of Enterprise
Database Technology. It certainly provides an apt reflection of how
Virtuoso compares with offerings from some of the larger (but certainly
slower to implement) database vendors in this space. As usual, Jon
Udell&#39;s quote pretty much sums this up:</font></p> <blockquote style="margin-right: 0px;" dir="ltr"> <p dir="ltr"><span class="artText"><em>&quot;While the spotlight shone on the heavyweight contenders, a couple of agile innovators made noteworthy advances in 2003. </em><a class="regularArticleU" href="http://www.infoworld.com/699"><em>OpenLink Software&#39;s Virtuoso 3.0</em></a><em>,
which we reviewed in March, stole thunder from all three major players.
Like Oracle, it offers a WebDAV-accessible XML repository. Like DB2
Information Integrator, it functions as database middleware that can
perform federated &#39;joins&#39; across SQL and XML sources. And like the
forthcoming Yukon, it embeds the .Net CLR (Common Language Runtime), or
in the case of Linux, Novell/Ximian&#39;s Mono.&quot;</em></span> </p></blockquote> <p dir="ltr"><font face="Verdana" size="2">Although
still somewhat unknown to the broader industry, we have remained true
to our &quot;innovator&quot; discipline, which remains our chosen path to
market leadership. Thus, it&#39;s worth a quick recap of Virtuoso&#39;s release
history and features as we get set to up the ante even further in
2004:</font></p> <p dir="ltr"><font face="Verdana" size="2"><a href="http://www.openlinksw.com/press/virtuoso.htm">1998 - Virtuoso&#39;s initial public beta</a> release with functional emphasis on Virtual Database Engine for ODBC and JDBC Data Sources.</font></p> <p dir="ltr"><font face="Verdana" size="2"><a href="http://www.openlinksw.com/press/virtuoso1.htm">1999 - Virtuoso&#39;s official commercial</a> release, with emphasis still on Virtual Database functionality for ODBC- and JDBC-accessible SQL Databases.</font></p> <p dir="ltr"><font face="Verdana" size="2"><a href="http://www.openlinksw.com/press/v2releas.htm">2000 - Virtuoso 2.0</a>
adds XML Storage, XPath, XML Schema, XQuery, XSL-T, WebDAV, SOAP, UDDI,
HTTP, Replication, Free Text Indexing (*feature update*), POP3, and
NNTP support.</font></p> <p dir="ltr"><font face="Verdana" size="2"><a href="http://www.openlinksw.com/press/v27releas.htm">2002 - Virtuoso 2.7</a>
extends Virtualization prowess beyond data access via enhancements to
its Web Services protocol stack implementation by enabling SQL Stored
Procedures to be published as Web Services. It also debuts its
Object-Relational engine enhancements, which include
the incorporation of Java and Microsoft .NET Objects into its User
Defined Type, User Defined Function, and Stored
Procedure offerings.</font></p> <p dir="ltr"><font face="Verdana" size="2"><a href="http://www.openlinksw.com/press/virt3beta.htm">2003 - Virtuoso 3.0</a>
extends data and application logic virtualization into the Application
Server realm (basically a Virtual Application Server too!) by adding
support for ASP.NET, PHP, and Java Server Pages runtime hosting (making
applications built using any of these languages deployable using
Virtuoso across all supported platforms).</font></p> <p dir="ltr"><font face="Verdana" size="2">Collectively,
these releases have contributed to a very premeditated
architecture and vision that will ultimately unveil the inherent power
of critical I.S. infrastructure virtualization along the following
lines: data storage, data access, and application logic, via coherent
integration of SQL, XML, Web Services, and Persistent Stored Modules
(.NET, Java, and other object-based component building blocks).</font></p> <p dir="ltr"><font face="Verdana"></font>&nbsp;</p>
]]></content:encoded>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2003-06-26#192">
  <rss:title>Interesting Database History: INFORMIX</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2003-06-26T23:45:45Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">Interesting Database History: INFORMIX From Wikipedia, the free encyclopedia. Informix is a relational database and for almost 20 years was also the name of the company that developed it. Informix DBMS was a development of the pioneering Ingres system that also led to Sybase and SQL Server, and was the #2 database system behind Oracle for some time in the 1990s. Their brush with success was surprisingly short-lived, however, and by 2000 a series of management blunders had all but destroyed the company. In 2001 they were purchased by IBM in order to gain access to Informix&#39;s existing market share and customer base. Long term plans to merge Informix technology with DB2 are in place, since the Informix Arrowhead project is now called DB2 Arrowhead. IBM is also committed to supporting older versions. Read on.</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<!--StartFragment --><A href="http://www.wikipedia.org/wiki/Informix">Interesting Database History: INFORMIX</A>
<P class=subtitle>From <A href="http://www.wikipedia.org/">Wikipedia</A>, the free encyclopedia. </P>
<P><STRONG>Informix</STRONG> is a <A class=internal title="Relational database" href="http://www.wikipedia.org/wiki/Relational_database">relational database</A> and for almost 20 years was also the name of the company that developed it. Informix DBMS was a development of the pioneering <A class=internal title=Ingres href="http://www.wikipedia.org/wiki/Ingres">Ingres</A> system that also led to <A class=internal title=Sybase href="http://www.wikipedia.org/wiki/Sybase">Sybase</A> and <A class=internal title="SQL Server" href="http://www.wikipedia.org/wiki/SQL_Server">SQL Server</A>, and was the #2 database system behind <A class=internal title=Oracle href="http://www.wikipedia.org/wiki/Oracle">Oracle</A> for some time in the 1990s. Their brush with success was surprisingly short-lived, however, and by 2000 a series of management blunders had all but destroyed the company. In <A class=internal title=2001 href="http://www.wikipedia.org/wiki/2001">2001</A> they were purchased by <A class=internal title=IBM href="http://www.wikipedia.org/wiki/IBM">IBM</A> in order to gain access to Informix's existing market share and customer base. Long term plans to merge Informix technology with <A class=internal title=DB2 href="http://www.wikipedia.org/wiki/DB2">DB2</A> are in place, since the Informix Arrowhead project is now called DB2 Arrowhead. <A class=internal title=IBM href="http://www.wikipedia.org/wiki/IBM">IBM</A> is also committed to supporting older versions. </P>
<P><A href="http://www.wikipedia.org/wiki/Informix">Read on.</A></P>]]></content:encoded>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2003-06-17#279">
  <rss:title>Ingres - A Forgotten Database, the untold story</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2003-06-17T11:18:57Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">Ingres - A Forgotten Database: The Untold Story. Ingres (technically, Advantage Ingres Enterprise) is, arguably, the forgotten database. There used to be five major databases: Oracle, DB2, Sybase, Informix and Ingres. Then along came Microsoft and, if you listened to most press comment (or the lack of it), you would think that there were only two of these left, plus SQL Server. [From IT-Director] Oracle, Microsoft, and IBM would certainly like the illusion of a three-horse race, as this is the only way they can induce Ingres, Informix, and Sybase users to jump ship, even though database migrations are by far the most risk-prone and problematic aspects of any IT infrastructure. Here is the interesting logic from the self-made big three: if you want to take advantage of new paradigms and technologies such as XML, Web Services, and anything else in the pipeline, you have to move all your data out of these databases and then get all the mission-critical applications re-associated with one of the big three&#39;s databases. And by the way, when you do so it is advisable that you use native interfaces (so that sometime in the future you have no chance whatsoever of repeating this folly at their expense). The simple fact of the matter (which the self-made big three do not want you to know) is that you can put ODBC, JDBC, or even platform-specific data access APIs such as OLE DB and ADO.NET atop any of these databases, and then explore and exploit the benefits of new technologies and paradigms, as long as the tool pool supports one or more of these standards. Unfortunately, the no-brainer above appears to be the more difficult of the choices before decision makers. In other words, many would rather dig themselves into a deeper hole (unknowingly, I can only presume) that ultimately leads to technology lock-in. 
The biggest challenge before any RDBMS-based infrastructure today isn&#39;t which of the self-made big three to migrate to wholesale, but rather how to make progressive use of the pool of disparate applications and application databases that proliferate across the enterprise. This is another way of understanding the burgeoning market for Virtual Databases, which in my opinion present the new frontier in database technology.</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<P><A href="http://www.it-director.com/article.php?articleid=10951">Ingres - A Forgotten Database: The Untold Story</A></P>
<P><EM>Ingres (technically, Advantage Ingres Enterprise) is, arguably, the forgotten database. There used to be five major databases: Oracle, DB2, Sybase, Informix and Ingres. Then along came Microsoft and, if you listened to most press comment (or the lack of it), you would think that there were only two of these left, plus SQL Server</EM>. [From <A href="http://www.it-director.com/article.php?articleid=10951">IT-Director</A>]</P>
<BLOCKQUOTE dir=ltr style="MARGIN-RIGHT: 0px">
<P><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: Arial">Oracle, Microsoft, and IBM would certainly like the illusion of a three-horse race, as this is the only way they can induce Ingres, Informix, and Sybase users to jump ship, even though database migrations are by far the most risk-prone and problematic aspects of any IT infrastructure.</SPAN></P>
<P><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: Arial">Here is the interesting logic from the self-made big three: if you want to take advantage of new paradigms and technologies such as XML, Web Services, and anything else in the pipeline, you have to move all your data out of these databases and then get all the mission-critical applications re-associated with one of the big three's databases. And by the way, when you do so it is advisable that you use native interfaces (so that sometime in the future you have no chance whatsoever of repeating this folly at their expense).</SPAN></P>
<P><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: Arial">The simple fact of the matter (which the self-made big three do not want you to know) is that you can put ODBC, JDBC, or even platform-specific data access APIs such as OLE DB and ADO.NET atop any of these databases, and then explore and exploit the benefits of new technologies and paradigms, as long as the tool pool supports one or more of these standards.</SPAN></P>
<P><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: Arial">Unfortunately, the no-brainer above appears to be the more difficult of the choices before decision makers. In other words, many would rather dig themselves into a deeper hole (unknowingly, I can only presume) that ultimately leads to technology lock-in.</SPAN></P>
<P><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: Arial">The biggest challenge before any RDBMS-based infrastructure today isn't which of the self-made big three to migrate to wholesale, but rather how to make progressive use of the pool of disparate applications and application databases that proliferate across the enterprise.</SPAN></P>
<P><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: Arial">This is another way of understanding the burgeoning market for Virtual Databases, which in my opinion present the new frontier in database technology.</SPAN></P>
<P>&nbsp;</P></BLOCKQUOTE>]]></content:encoded>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2003-06-09#266">
  <rss:title>How Databases Changed The World</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2003-06-09T09:28:17Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">How Databases Changed The World by Tim DiChiara, Site Editor (SearchDatabase.com) How did the database industry get started? How has it changed the face of business? What were the key milestones, the big obstacles and the lessons learned? I recently came across an interesting panel discussion addressing these very issues, featuring many of the database pioneers and leaders of the last 30 years: Chris Date; Herb Edelstein; Bob Epstein (Sybase, who shared code with Microsoft for remarketing as SQL Server on OS/2, which inevitably led to the Microsoft SQL Server we know today); Ken Jacobs (Oracle&#39;s Dr. DBA); Pat Selinger (DB2 precursor called System R); Roger Sippl (Informix); Michael Stonebraker (Ingres, Postgres, and Mariposa). The event is available via streaming video and was recorded in February at the Computer History Museum in Mountain View, California. After a chatty and lengthy (45 minutes!) introduction only interesting to hardcore insiders, you can see Chris Date waxing eloquent about Ted Codd (complete with quotes from Shakespeare, no less), Herb Edelstein waxing eloquent about Chris Date, and Michael Stonebraker at his geeky best. There&#39;s also interesting trivia about the beginnings of SQL, the role of INGRES, why the relational model will stand the test of time and some friendly Oracle and IBM bashing (and Microsoft and Sybase and...). I urge all you data management pros interested in broadening your knowledge of the field to check it out! If you&#39;re still not satiated, don&#39;t forget about our collection of backgrounders about the DBMS and the data management industry.</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[
 <p><a href="http://searchdatabase.techtarget.com/bestWebLinks/0,289521,sid13_tax281575,00.html"><b>How Databases Changed The World</b></a>     by Tim DiChiara, Site Editor (<a href="http://www.searchdatabase.com">SearchDatabase.com</a>)      </p><p>How did the database industry get started? How has it changed the face of business? What were the key milestones, the big obstacles and the lessons learned?    I recently came across an interesting panel discussion addressing these very issues, featuring many of the database pioneers and leaders of the last 30 years:</p><a href="http://en.wikipedia.org/wiki/Chris_Date">Chris Date</a><br /><a href="http://www.computerhistory.org/events/lectures/db_02102003/edelstein/">Herb Edelstein</a> <br /><a href="http://www.computerhistory.org/events/lectures/db_02102003/epstein/">Bob Epstein</a> (<a href="http://en.wikipedia.org/wiki/Sybase">Sybase</a>, who shared code with Microsoft for remarketing as SQL Server on OS/2, which inevitably led to the <a href="http://en.wikipedia.org/wiki/Microsoft_SQL_Server">Microsoft SQL Server</a> we know today)<br /><a href="http://www.oracle.com/corporate/pressroom/html/kjacobs.html">Ken Jacobs</a> (<a href="http://en.wikipedia.org/wiki/Oracle_database">Oracle</a>&#39;s Dr. 
DBA)<br /><a href="http://www.witi.com/center/witimuseum/halloffame/2004/pselinger.php">Pat Selinger </a> (<a href="http://en.wikipedia.org/wiki/DB2">DB2</a> precursor called System R) <br /><a href="http://en.wikipedia.org/wiki/Informix">Roger Sippl</a> (<a href="http://en.wikipedia.org/wiki/Informix">Informix</a>)<br /><a href="http://en.wikipedia.org/wiki/Michael_Stonebraker">Michael Stonebraker</a> (<a href="http://en.wikipedia.org/wiki/Ingres">Ingres</a>, <a href="http://en.wikipedia.org/wiki/PostgreSQL">Postgres</a>, and <a href="http://mariposa.cs.berkeley.edu/about.html">Mariposa</a>)<br /><br />The event is available via streaming <a href="http://www.computerhistory.org/events/lectures/db_02102003/">video</a> and was recorded in February at the Computer History Museum in Mountain View, California.    After a chatty and lengthy (45 minutes!) introduction only interesting to hardcore insiders, you can see Chris Date waxing eloquent about Ted Codd (complete with quotes from Shakespeare, no less), Herb Edelstein waxing eloquent about Chris Date, and Michael Stonebraker at his geeky best. There&#39;s also interesting trivia about the beginnings of SQL, the role of INGRES, why the relational model will stand the test of time and some friendly Oracle and IBM bashing (and Microsoft and Sybase and...).     I urge all you data management pros interested in broadening your knowledge of the field to check it out! If you&#39;re still not satiated, don&#39;t forget about our collection of backgrounders about the DBMS and the data management industry.      
<a href="index.vspx?tag=sql" rel="tag" style="display:none;">sql</a><a href="index.vspx?tag=rdbms" rel="tag" style="display:none;">rdbms</a><a href="index.vspx?tag=database" rel="tag" style="display:none;">database</a>]]></content:encoded>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2003-05-21#319">
  <rss:title>SQL Injection FAQ</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2003-05-21T22:27:45Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">SQL Injection FAQ http://www.sqlsecurity.com/DesktopDefault.aspx?tabindex=2&amp;tabid=3 Are other SQL Servers (Sybase, Oracle, DB2) subject to SQL injection? Yes, to varying degrees. Here is a site that can get you more details on some of the issues with other SQL Servers. http://www.owasp.org What is SQL Injection and why is all this information not included in the regular FAQ? SQL Injection is simply a term describing the act of passing SQL code into an application that was not intended by the developer. Since this topic is not specifically restricted to SQL Server it is not included in the normal FAQ. In fact, many of the problems that allow SQL injection are not the fault of the database server per se but rather are due to poor input validation and coding at other code layers. However, due to the serious nature and prevalence of this problem I feel its inclusion in a thorough discussion of SQL Server security is warranted. What causes SQL Injection? 
SQL injection is usually caused by developers who use &quot;string-building&quot; techniques in order to execute SQL code. For example, in a search page, the developer may use the following code to execute a query (VBScript/ASP sample shown): Set myRecordset = myConnection.execute(&quot;SELECT * FROM myTable WHERE someText =&#39;&quot; &amp; request.form(&quot;inputdata&quot;) &amp; &quot;&#39;&quot;) The reason this statement is likely to introduce an SQL injection problem is that the developer has made a classic mistake - poor input validation. We are trusting that the user has not entered something malicious - something like the innocent-looking single quote (&#39;). Let&#39;s consider what would happen if a user entered the following text into the search form: &#39; exec master..xp_cmdshell &#39;net user test testpass /ADD&#39; -- Then, when the query string is assembled and sent to SQL Server, the server will process the following code: SELECT * FROM myTable WHERE someText =&#39;&#39; exec master..xp_cmdshell &#39;net user test testpass /ADD&#39;--&#39; Notice, the first single quote entered by the user closed the string and SQL Server eagerly executes the next SQL statements in the batch including a command to add a new user to the local accounts database. If this application were running as &#39;sa&#39; and the MSSQLSERVER service is running with sufficient privileges we would now have an account with which to access this machine. Also note the use of the comment operator (--) to force the SQL Server to ignore the trailing quote placed by the developer&#39;s code. More. Very interesting that these are all native-interface-based exploits. So the security issue isn&#39;t specific to ODBC, JDBC, ADO.NET, or OLE DB (although these interfaces certainly increase the potential damage that can be unleashed via metadata analysis en route to that huge Cartesian product, the mother of all exploits!). 
Our Session Rules Book was devised in 1993 with many of these issues in mind, and to this date no other ODBC/JDBC/OLE DB products come close to acknowledging this reality.</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p align="center"><font color="#0080c0" size="2"><big><strong><big>SQL Injection FAQ </big></strong></big></font></p>
<p align="center">
</p><p align="center"><strong><font color="red"></font></strong></p><strong><font color="red">&nbsp;</font></strong><a href="http://www.sqlsecurity.com/DesktopDefault.aspx?tabindex=2&amp;tabid=3"><strong><font color="red"><a href="http://www.sqlsecurity.com/DesktopDefault.aspx?tabindex=2&amp;tabid=3">http://www.sqlsecurity.com/DesktopDefault.aspx?tabindex=2&amp;tabid=3</a><br /></font></strong></a><strong><font color="red">&nbsp;</font></strong> <br />
<div align="center">
<center>
<table width="80%" border="0">
<tbody>
<tr>
<td width="100%">
<p><big><strong><font size="2">Are other SQL Servers (Sybase, Oracle, DB2) subject to SQL injection?</font></strong></big></p>
<p><font size="2">Yes, to varying degrees. Here is a site that can get you more details on some of the issues with other SQL Servers. </font><a href="http://www.owasp.org/" target="_blank"><a href="http://www.owasp.org/"><font size="2">http://www.owasp.org</font></a></a></p><font size="2"><b>What is SQL Injection and why is all this information not included in the regular FAQ?</b> </font>
<p><font size="2">SQL Injection is simply a term describing the act of passing SQL code into an application that was not intended by the developer. Since this topic is not specifically restricted to SQL Server it is not included in the normal FAQ. In fact, many of the problems that allow SQL injection are not the fault of the database server per se but rather are due to poor input validation and coding at other code layers. However, due to the serious nature and prevalence of this problem I feel its inclusion in a thorough discussion of SQL Server security is warranted.</font></p>
<p><big><strong><font size="2">What causes SQL Injection?</font></strong></big></p>
<p><font size="2">SQL injection is usually caused by developers who use &quot;string-building&quot; techniques in order to execute SQL code. For example, in a search page, the developer may use the following code to execute a query (VBScript/ASP sample shown):</font></p>
<p><font face="Courier New" color="#ff0000" size="2">Set myRecordset = myConnection.execute(&quot;SELECT * FROM myTable WHERE someText =&#39;&quot; &amp; request.form(&quot;inputdata&quot;) &amp; &quot;&#39;&quot;)</font></p>
<p><font size="2">The reason this statement is likely to introduce an SQL injection problem is that the developer has made a classic mistake - poor input validation. We are trusting that the user has not entered something malicious - something like the innocent-looking single quote (&#39;). Let&#39;s consider what would happen if a user entered the following text into the search form:</font></p>
<p><font size="2">&#39; exec master..xp_cmdshell &#39;net user test testpass /ADD&#39; --</font></p>
<p><font size="2">Then, when the query string is assembled and sent to SQL Server, the server will process the following code:</font></p>
<p><font face="Courier New" color="#ff0000" size="2">SELECT * FROM myTable WHERE someText =&#39;&#39; exec master..xp_cmdshell &#39;net user test testpass /ADD&#39;--&#39;</font></p>
<p><font size="2">Notice, the first single quote entered by the user closed the string and SQL Server eagerly executes the next SQL statements in the batch including a command to add a new user to the local accounts database. If this application were running as &#39;sa&#39; and the MSSQLSERVER service is running with sufficient privileges we would now have an account with which to access this machine. Also note the use of the comment operator (--) to force the SQL Server to ignore the trailing quote placed by the developer&#39;s code.</font></p>
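<p><font size="2">As a rough illustration of the alternative to string-building, the sketch below binds the search text as a query parameter instead of pasting it into the SQL. It is written in Python against an in-memory SQLite database rather than ASP and SQL Server, which is purely an assumption made to keep the example self-contained. The helper names build_unsafe_sql and search_safely are hypothetical, and the payload could never actually run on SQLite, since xp_cmdshell is specific to SQL Server. The point is only that a bound value is compared as data and is never parsed as SQL.</font></p>
<pre>
```python
import sqlite3


def build_unsafe_sql(user_text):
    # Mirrors the FAQ's string-building mistake: the input is pasted
    # straight into the SQL text, so a quote in it ends the literal.
    return "SELECT * FROM myTable WHERE someText ='" + user_text + "'"


def search_safely(conn, user_text):
    # The ? placeholder passes user_text to the driver as a value,
    # so quotes inside it cannot terminate the string literal.
    cur = conn.execute(
        "SELECT someText FROM myTable WHERE someText = ?", (user_text,)
    )
    return cur.fetchall()


# Hypothetical stand-in for the FAQ's myTable, held in memory.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (someText TEXT)")
conn.execute("INSERT INTO myTable VALUES ('hello')")
conn.commit()

# The payload from the FAQ, used here only as inert search text.
payload = "' exec master..xp_cmdshell 'net user test testpass /ADD' --"

unsafe_sql = build_unsafe_sql(payload)   # built but deliberately never executed
print(unsafe_sql)
print(search_safely(conn, payload))      # payload is treated as plain text
print(search_safely(conn, "hello"))      # an ordinary search still works
```
</pre>
<p><font size="2">With the placeholder, the payload is matched against someText as an ordinary string and finds no rows, while the string-built statement closely mirrors the batch shown above. Parameter binding is one mitigation among several; it does not replace running the application under a low-privilege account instead of &#39;sa&#39;.</font></p>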
<p><a href="http://www.sqlsecurity.com/faq-inj.asp"><font size="2">More</font></a></p>
<p><em><font color="#000000" size="2">Very interesting that these are all native-interface-based exploits. So the security issue isn&#39;t specific to ODBC, JDBC, ADO.NET, or OLE DB (although these interfaces certainly increase the potential damage that can be unleashed via metadata analysis en route to that huge Cartesian product, the mother of all exploits!). Our Session Rules Book was devised in 1993 with many of these issues in mind, and to this date no other ODBC/JDBC/OLE DB products come close to acknowledging this reality.</font></em></p></td></tr></tbody></table></center></div>]]></content:encoded>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2003-05-21#48">
  <rss:title>SQL Injection FAQ</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2003-05-21T22:27:45Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">SQL Injection FAQ http://www.sqlsecurity.com/DesktopDefault.aspx?tabindex=2&amp;tabid=3 Are other SQL Servers (Sybase, Oracle, DB2) subject to SQL injection? Yes, to varying degrees. Here is a site that can get you more details on some of the issues with other SQL Servers. http://www.owasp.org What is SQL Injection and why is all this information not included in the regular FAQ? SQL Injection is simply a term describing the act of passing SQL code into an application that was not intended by the developer. Since this topic is not specifically restricted to SQL Server it is not included in the normal FAQ. In fact, many of the problems that allow SQL injection are not the fault of the database server per se but rather are due to poor input validation and coding at other code layers. However, due to the serious nature and prevalence of this problem I feel its inclusion in a thorough discussion of SQL Server security is warranted. What causes SQL Injection? SQL injection is usually caused by developers who use &quot;string-building&quot; techniques in order to execute SQL code. For example, in a search page, the developer may use the following code to execute a query (VBScript/ASP sample shown): Set myRecordset = myConnection.execute(&quot;SELECT * FROM myTable WHERE someText =&#39;&quot; &amp; request.form(&quot;inputdata&quot;) &amp; &quot;&#39;&quot;) The reason this statement is likely to introduce an SQL injection problem is that the developer has made a classic mistake - poor input validation. We are trusting that the user has not entered something malicious - something like the innocent-looking single quote (&#39;). 
Let&#39;s consider what would happen if a user entered the following text into the search form: &#39; exec master..xp_cmdshell &#39;net user test testpass /ADD&#39; -- Then, when the query string is assembled and sent to SQL Server, the server will process the following code: SELECT * FROM myTable WHERE someText =&#39;&#39; exec master..xp_cmdshell &#39;net user test testpass /ADD&#39;--&#39; Notice, the first single quote entered by the user closed the string and SQL Server eagerly executes the next SQL statements in the batch including a command to add a new user to the local accounts database. If this application were running as &#39;sa&#39; and the MSSQLSERVER service is running with sufficient privileges we would now have an account with which to access this machine. Also note the use of the comment operator (--) to force the SQL Server to ignore the trailing quote placed by the developer&#39;s code. More. Very interesting that these are all native-interface-based exploits. So the security issue isn&#39;t specific to ODBC, JDBC, ADO.NET, or OLE DB (although these interfaces certainly increase the potential damage that can be unleashed via metadata analysis en route to that huge Cartesian product, the mother of all exploits!). Our Session Rules Book was devised in 1993 with many of these issues in mind, and to this date no other ODBC/JDBC/OLE DB products come close to acknowledging this reality.</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<P align=center><FONT color=#0080c0 size=2><BIG><STRONG><BIG>SQL Injection FAQ </BIG></STRONG></BIG></FONT></P>
<P align=center>
<P align=center><STRONG><FONT color=red></FONT></STRONG></P><STRONG><FONT color=red>&nbsp;</FONT></STRONG><A href="http://www.sqlsecurity.com/DesktopDefault.aspx?tabindex=2&amp;tabid=3"><STRONG><FONT color=red><A href="http://www.sqlsecurity.com/DesktopDefault.aspx?tabindex=2&amp;tabid=3">http://www.sqlsecurity.com/DesktopDefault.aspx?tabindex=2&amp;tabid=3</A><BR></FONT></STRONG></A><STRONG><FONT color=red>&nbsp;</FONT></STRONG> <BR>
<DIV align=center>
<CENTER>
<TABLE width="80%" border=0>
<TBODY>
<TR>
<TD width="100%">
<P><BIG><STRONG><FONT size=2>Are other SQL Servers (Sybase, Oracle, DB2) subject to SQL injection?</FONT></STRONG></BIG></P>
<P><FONT size=2>Yes, to varying degrees. Here is a site that can get you more details on some of the issues with other SQL Servers. </FONT><A href="http://www.owasp.org/" target=_blank><A href="http://www.owasp.org/"><FONT size=2>http://www.owasp.org</FONT></A></A></P><FONT size=2><B>What is SQL Injection and why is all this information not included in the regular FAQ?</B> </FONT>
<P><FONT size=2>SQL Injection is simply a term describing the act of passing SQL code into an application that was not intended by the developer. &nbsp; Since this topic is not specifically restricted to SQL Server it is not included in the normal FAQ.&nbsp; In fact, many of the problems that allow SQL injection are not the fault of the database server per se but rather are due to poor input validation and coding at other code layers.&nbsp; However, due to the serious nature and prevalence of this problem I feel its inclusion in a thorough discussion of SQL Server security is warranted.</FONT></P>
<P><BIG><STRONG><FONT size=2>What causes SQL Injection?</FONT></STRONG></BIG></P>
<P><FONT size=2>SQL injection is usually caused by developers who use "string-building" techniques in order to execute SQL code.&nbsp; For example, in a search page, the developer may use the following code to execute a query (VBScript/ASP sample shown):</FONT></P>
<P><FONT face="Courier New" color=#ff0000 size=2>Set myRecordset = myConnection.execute("SELECT * FROM myTable WHERE someText ='" &amp; request.form("inputdata") &amp; "'")</FONT></P>
<P><FONT size=2>The reason this statement is likely to introduce an SQL injection problem is that the developer has made a classic mistake - poor input validation.&nbsp; We are trusting that the user has not entered something malicious - something like the innocent-looking single quote (').&nbsp; Let's consider what would happen if a user entered the following text into the search form:</FONT></P>
<P><FONT size=2>' exec master..xp_cmdshell 'net user test testpass /ADD' --</FONT></P>
<P><FONT size=2>Then, when the query string is assembled and sent to SQL Server, the server will process the following code:</FONT></P>
<P><FONT face="Courier New" color=#ff0000 size=2>SELECT * FROM myTable WHERE someText ='' exec master..xp_cmdshell 'net user test testpass /ADD' --'</FONT></P>
<P><FONT size=2>Notice that the first single quote entered by the user closes the string, and SQL Server eagerly executes the next SQL statements in the batch, including a command to add a new user to the local accounts database.&nbsp; If this application were running as 'sa' and the MSSQLSERVER service were running with sufficient privileges, we would now have an account with which to access this machine.&nbsp; Also note the use of the comment operator (--) to force SQL Server to ignore the trailing quote placed by the developer's code.</FONT></P>
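<P><FONT size=2>The usual remedy for the string-building mistake described above is a parameterized query. The following is only a minimal sketch, not part of the original FAQ: it assumes Python's built-in sqlite3 module as a stand-in for ASP and SQL Server, and the helper names build_query_unsafely and search_safely are hypothetical. It first shows the query text that string-building would produce from the malicious input, then passes the same input as a bound parameter, where it is treated as plain data.</FONT></P>
<PRE>
```python
import sqlite3


def build_query_unsafely(user_input):
    # Mirrors the string-building pattern of the ASP sample: the input is
    # pasted straight into the SQL text, so a quote in it ends the literal.
    return "SELECT * FROM myTable WHERE someText ='" + user_input + "'"


def search_safely(connection, user_input):
    # The ? placeholder sends the input as a bound value, never as SQL text.
    cursor = connection.execute(
        "SELECT someText FROM myTable WHERE someText = ?", (user_input,)
    )
    return cursor.fetchall()


# Hypothetical in-memory table standing in for myTable (an assumption).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE myTable (someText TEXT)")
connection.execute("INSERT INTO myTable (someText) VALUES (?)", ("hello",))

malicious = "' exec master..xp_cmdshell 'net user test testpass /ADD' --"

# String-building: the attacker's text becomes part of the statement.
unsafe_sql = build_query_unsafely(malicious)
print(unsafe_sql)

# Parameter binding: the same text is only compared against column values.
safe_rows = search_safely(connection, malicious)
print(safe_rows)
```
</PRE>
<P><FONT size=2>With binding, the malicious input matches no rows and nothing in it is executed, while an ordinary search term such as "hello" still returns its row. Most database interfaces offer an equivalent mechanism, although the placeholder syntax varies from one driver to another.</FONT></P>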
<P><A href="http://www.sqlsecurity.com/faq-inj.asp"><FONT size=2>More</FONT></A></P>
<P><EM><FONT color=#000000 size=2>Very interesting that these are all native-interface-based exploits.&nbsp; So the security issue isn't specific to ODBC, JDBC, ADO.NET, or OLE DB (although they certainly increase the potential damage that can be unleashed via metadata analysis en route to that huge Cartesian product, the mother of all exploits!). Our Session Rules Book was devised in 1993 with many of these issues in mind, and to this date there are no other ODBC/JDBC/OLE DB products out there that even come close to acknowledging this reality.</FONT></EM></P></TD></TR></TBODY></TABLE></CENTER></DIV>]]></content:encoded>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?date=2003-05-16#301">
  <rss:title>IBM TO SHIP DB2 INTEGRATION SOFTWARE</rss:title>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2003-05-16T20:34:07Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">IBM TO SHIP DB2 INTEGRATION SOFTWARE Posted May 15, 2003 4:46 PM Pacific Time IBM on Tuesday plans to announce availability of its DB2 Information Integrator software, for integrating and analyzing multiple forms of information, the company acknowledged on Thursday. In beta since February, the software is intended to enable customers to manage centrally data, text, images, photos, video and audio files stored in different databases, according to IBM. XML content and Web services also are supported. Interesting Quote: &quot;If we move to information as a utility for giant data grids, this is key technology for hiding or making unimportant the location and type of data. This software enables the data to be accessed transparently wherever it might be,&quot; Jones said. Product Pricing: DB2 Information Integrator will be available for $20,000 per processor and $15,000 per data source connector. Detail will also be available on Tuesday. The cost for a bulk adapter license is about $75,000. If change capture is involved, the adapter license costs about $150,000. Real-time integration costs are mips-based, with a starting cost of about $300,000. One adapter can be used to translate and make native calls to all environments. Very interesting pricing! For the full story: http://www.infoworld.com/article/03/05/15/HNdb2integrate_1.html</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<FONT size=2>
<P>IBM TO SHIP DB2 INTEGRATION SOFTWARE</P>
<P>Posted May 15, 2003 4:46 PM Pacific Time</P>
<P>IBM on Tuesday plans to announce availability of its DB2 Information Integrator software, for integrating and analyzing multiple forms of information, the company acknowledged on Thursday.</P>
<P>In beta since February, the software is intended to enable customers to manage centrally data, text, images, photos, video and audio files stored in different databases, according to IBM. XML content and Web services also are supported.</P>
<P><EM><STRONG>Interesting Quote:</STRONG></EM></P>
<P class=ArticleBody page="1">"If we move to information as a utility for giant data grids, this is key technology for hiding or making unimportant the location and type of data. This software enables the data to be accessed transparently wherever it might be," Jones said. </P>
<P class=ArticleBody page="1"><EM><STRONG>Product Pricing</STRONG></EM><BR>DB2 Information Integrator will be available for $20,000 per processor and $15,000 per data source connector.<BR>Detail will also be available on Tuesday. </P>
<P class=ArticleBody page="1">The cost for a bulk adapter license is about $75,000. If change capture is involved, the adapter license costs about $150,000. Real-time integration costs are mips-based, with a starting cost of about $300,000. One adapter can be used to translate and make native calls to all environments. <BR><BR><EM>Very interesting pricing!&nbsp; </EM></P>
<P class=ArticleBody page="1">For the full story: </FONT><A href="http://www.infoworld.com/article/03/05/15/HNdb2integrate_1.html"><U><FONT color=#0000ff size=2>http://www.infoworld.com/article/03/05/15/HNdb2integrate_1.html</FONT></U></A></P>]]></content:encoded>
 </rss:item>
</rdf:RDF>