<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
<channel>

<title>OpenLink Virtuoso (Product Blog)</title><link>http://www.openlinksw.com/blog/vdb/blog/</link><description>A great place to track Virtuoso&#39;s rapid evolution.</description><managingEditor>kidehen@openlinksw.com</managingEditor><pubDate>Mon, 23 Nov 2009 12:30:52 GMT</pubDate><generator>Virtuoso Universal Server 05.12.3041</generator><webMaster>kidehen@openlinksw.com</webMaster><image><title>OpenLink Virtuoso (Product Blog)</title><url>http://www.openlinksw.com/weblog/public/images/vbloglogo.gif</url><link>http://www.openlinksw.com/blog/vdb/blog/</link><description>A great place to track Virtuoso&#39;s rapid evolution.</description><width>88</width><height>31</height></image>
<item><title>Virtuoso update</title><guid>http://www.openlinksw.com/blog/vdb/blog/?date=2007-09-24#1263</guid><comments>http://www.openlinksw.com/blog/vdb/blog/?id=1263#comments</comments><pubDate>Mon, 24 Sep 2007 15:03:10 GMT</pubDate><n0:modified xmlns:n0="http://www.openlinksw.com/weblog/">2007-09-24T11:03:10.000003-04:00</n0:modified><description>&lt;div&gt;&lt;div style=&quot;display:none;&quot;&gt;Virtuoso update&lt;/div&gt;
 &lt;font face=&quot;Arial&quot;&gt; &lt;font size=&quot;2&quot;&gt;I have been occupied with getting the Virtuoso cluster to full functionality. It now does insert/delete/update and select over partitions spread over multiple instances and can send co-located joins and other co-located query fragments over to partner nodes. &lt;span class=&quot;093250614-24092007&quot;&gt; Of course, the more functionality is transferred per message, the better.&lt;/span&gt; &lt;/font&gt; &lt;/font&gt; &lt;div&gt; &lt;font face=&quot;Arial&quot; size=&quot;2&quot;&gt;&lt;br /&gt;Most of the code is definitely written. Now it is a question of testing and adding optimizations here and there, as well as detecting distributed deadlocks.&lt;/font&gt; &lt;/div&gt; &lt;div&gt; &lt;font face=&quot;Arial&quot; size=&quot;2&quot;&gt;&lt;br /&gt;I&amp;#39;ll go to the &lt;a href=&quot;http://www.sabre-conference.com/&quot;&gt;SABRE&lt;/a&gt; conference in Leipzig this week, where we will present an update on Virtuoso. This is an updated version of the Virtuoso intro paper, which is now also on the Virtuoso web site.&lt;/font&gt; &lt;/div&gt; &lt;div&gt; &lt;font face=&quot;Arial&quot; size=&quot;2&quot;&gt;In another month, I&amp;#39;ll be at the RDF Relational Mapping workshop in Cambridge, MA. There will be more updates on our results before then.&lt;/font&gt; &lt;/div&gt;
&lt;a href=&quot;index.vspx?tag=Database&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;Database&lt;/a&gt;&lt;a href=&quot;index.vspx?tag=Databases&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;Databases&lt;/a&gt;&lt;a href=&quot;index.vspx?tag=Virtuoso&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;Virtuoso&lt;/a&gt;&lt;a href=&quot;index.vspx?tag=Scalability&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;Scalability&lt;/a&gt;&lt;a href=&quot;index.vspx?tag=Clustering&quot; rel=&quot;tag&quot; style=&quot;display:none;&quot;&gt;Clustering&lt;/a&gt;&lt;/div&gt;</description></item><item><title>More on RDF and Vertical Storage</title><guid>http://www.openlinksw.com/blog/vdb/blog/?date=2007-06-11#1223</guid><comments>http://www.openlinksw.com/blog/vdb/blog/?id=1223#comments</comments><pubDate>Mon, 11 Jun 2007 08:35:00 GMT</pubDate><n0:modified xmlns:n0="http://www.openlinksw.com/weblog/">2007-06-11T04:36:17.000003-04:00</n0:modified><description>
We actually did the experiment I mentioned a couple of posts back, about storing RDF triples column-wise. &lt;br /&gt; &lt;br /&gt;The test loads 4.8 million triples of LUBM data, reads the whole set on one index, and then checks whether it finds the same row on another index.&lt;br /&gt; &lt;br /&gt;Reading GSPO and checking OGPS takes 27 seconds. Doing the same with column-wise bitmap indices on S, G, P and O takes 86 seconds. The latter checks the existence of the row by AND&amp;#39;ing 4 bitmap indices, and the former checks its existence by a single lookup in a multi-part index whose last part is a bitmap. The result is approximately what one would expect. The bitmap AND could be optimized a bit, dropping the time to maybe 70 seconds. &lt;br /&gt; &lt;br /&gt;Now speaking of compression, it is true that column storage will work better. For example, the G and P columns will compress to pretty much nothing. On a row layout they compress too, but not to nothing: even if a value is not unique, you have to store the place where the value is if you want to read rows in constant time per row. &lt;br /&gt; &lt;br /&gt;What is nice with the 4 bitmaps is that no combination of search conditions is penalized. But the trick of using bitmaps for self-join is lost: You can&amp;#39;t evaluate {?s a Person . ?s name &amp;quot;Mary&amp;quot;} by AND&amp;#39;ing the S bitmaps for persons and for subjects named &amp;quot;Mary&amp;quot;. &lt;br /&gt; &lt;br /&gt;The 4 bitmap indices are remarkably compact, though: 
8840 pages altogether.&lt;br /&gt;We could probably get the G, S, P, O columns in 3000 pages or so, using very little compression.&lt;br /&gt;The OGPS index is 5169 pages and the GSPO index is 21243 pages. &lt;br /&gt; &lt;br /&gt;None of these figures involve any compression, except what a bitmap naturally produces. &lt;br /&gt; &lt;br /&gt;We have now figured out a modified row layout that will roughly double the working set that fits in the same memory while keeping things in rows. We will try that. The GSPO index will be about 10000 pages and OGPS will be about 4500. We do not expect much impact on search or insert times.&lt;br /&gt; &lt;br /&gt;We looked at using gzip for database pages. They shrink to between 1/4 and 1/3 of a page. But this does not improve the working set, and having variable-length pages generates all kinds of special cases you don’t want. So we will improve the working set first and deal with somewhat compressed data in the execution engine. &lt;br /&gt;After that, maybe gzip will cut the size to 1/2 or so, but that will be good for disk only. And it does not matter so much how much you transfer as how many seeks you do.&lt;br /&gt; &lt;br /&gt;Still, column-wise storage will likely win for size. So if the working set is much larger than memory, this may have an edge. To keep all bases covered, we will eventually add this as an option.&lt;br /&gt; &lt;br /&gt;
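As a rough illustration of the two existence checks compared above, here is a minimal Python sketch. It is not Virtuoso code: the toy quads, the use of Python integers as bitmaps, the set standing in for a multi-part index, and all function names are assumptions made for the example.&lt;br /&gt; &lt;br /&gt;

```python
from functools import reduce
from operator import and_  # bitwise AND on integers

# Hypothetical toy data: each row is one quad (graph, subject, predicate, object).
QUADS = [
    ("g1", "s1", "type", "Person"),
    ("g1", "s1", "name", "Mary"),
    ("g1", "s2", "type", "Person"),
    ("g1", "s2", "name", "Ann"),
]


def build_column_bitmaps(quads, column):
    """Return {value: bitmap}; bit n of a bitmap is set if row n holds that value."""
    bitmaps = {}
    for row_no, quad in enumerate(quads):
        bitmaps[quad[column]] = bitmaps.get(quad[column], 0) | 2 ** row_no
    return bitmaps


# One bitmap index per column, in G, S, P, O order.
COLUMN_INDEXES = [build_column_bitmaps(QUADS, col) for col in range(4)]

# Stand-in for a multi-part index such as GSPO: one probe on the whole key.
COMPOSITE_INDEX = set(QUADS)


def exists_by_bitmap_and(quad):
    """Check existence by AND-ing the four per-column bitmaps."""
    bitmaps = [index.get(value, 0) for index, value in zip(COLUMN_INDEXES, quad)]
    return reduce(and_, bitmaps) != 0


def exists_by_composite_lookup(quad):
    """Check existence with a single lookup of the full key."""
    return quad in COMPOSITE_INDEX


if __name__ == "__main__":
    probe = ("g1", "s1", "name", "Mary")
    print(exists_by_bitmap_and(probe), exists_by_composite_lookup(probe))
```

The column-wise check touches four indices and combines them per probe, while the row-wise check is a single lookup, which is the kind of difference the timings above reflect. The sketch says nothing about page counts or compression.&lt;br /&gt; &lt;br /&gt;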
&lt;p&gt; &lt;a href=&quot;http://www.technorati.com/tags/semantic%20web&quot; rel=&quot;tag&quot;&gt;Semantic Web&lt;/a&gt; |&lt;a href=&quot;http://www.technorati.com/tags/database&quot; rel=&quot;tag&quot;&gt;Database&lt;/a&gt; |&lt;a href=&quot;http://www.technorati.com/tags/databases&quot; rel=&quot;tag&quot;&gt;Databases&lt;/a&gt; |&lt;a href=&quot;http://www.technorati.com/tags/virtuoso&quot; rel=&quot;tag&quot;&gt;Virtuoso&lt;/a&gt; |&lt;a href=&quot;http://www.technorati.com/tags/sparql&quot; rel=&quot;tag&quot;&gt;SPARQL&lt;/a&gt;|&lt;a href=&quot;http://www.technorati.com/tag/rdf&quot; rel=&quot;tag&quot;&gt;RDF&lt;/a&gt; &lt;/p&gt;
&lt;br /&gt;
</description></item><item><title>Virtuoso Open Source 5.0 Release Imminent</title><guid>http://www.openlinksw.com/blog/vdb/blog/?date=2007-03-16#1160</guid><comments>http://www.openlinksw.com/blog/vdb/blog/?id=1160#comments</comments><pubDate>Fri, 16 Mar 2007 09:55:29 GMT</pubDate><n0:modified xmlns:n0="http://www.openlinksw.com/weblog/">2007-03-16T05:55:29-04:00</n0:modified><description>&lt;div&gt;
&lt;div style=&quot;display:none;&quot;&gt;Virtuoso Open Source 5.0 Release Imminent&lt;/div&gt;
&lt;p&gt;We are a couple of days from releasing the Virtuoso Open Source 5.0 cut. This will make the technology that we are showing with DBpedia and the various OpenLink web sites available to the public.&lt;/p&gt;
&lt;p&gt;The updates involve:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Significant database engine improvements, as discussed in previous posts.&lt;/li&gt;
&lt;li&gt;Tons of RDF-related bug fixes.&lt;/li&gt;
&lt;li&gt;Text index extension to SPARQL.&lt;/li&gt;
&lt;li&gt;New SQL data type capturing the whole XML Schema scalar type system used in RDF.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Soon to follow are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Basic inference for RDF, including type and property subsumption.&lt;/li&gt;
&lt;li&gt;Whole new disk IO system with much better disk locality.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Existing databases will be automatically upgraded when started with the new Virtuoso 5.0 server. Note that after the upgrade, the RDF data is not backward compatible.&lt;/p&gt;
&lt;p&gt;We will be rolling out more Virtuoso-hosted Semantic Web content in the &lt;a href=&quot;http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/&quot;&gt;Linking Open Data project&lt;/a&gt;, as part of our participation in the Semantic Web Education and Outreach activity at W3C.&lt;/p&gt;
&lt;p&gt;
&lt;a href=&quot;http://www.technorati.com/tags/semantic%20web&quot; rel=&quot;tag&quot;&gt;Semantic Web&lt;/a&gt; |&lt;a href=&quot;http://www.technorati.com/tags/database&quot; rel=&quot;tag&quot;&gt;Database&lt;/a&gt; |&lt;a href=&quot;http://www.technorati.com/tags/databases&quot; rel=&quot;tag&quot;&gt;Databases&lt;/a&gt; |&lt;a href=&quot;http://www.technorati.com/tags/virtuoso&quot; rel=&quot;tag&quot;&gt;Virtuoso&lt;/a&gt; |&lt;a href=&quot;http://www.technorati.com/tags/sparql&quot; rel=&quot;tag&quot;&gt;SPARQL&lt;/a&gt;|&lt;a href=&quot;http://www.technorati.com/tag/rdf&quot; rel=&quot;tag&quot;&gt;RDF&lt;/a&gt; &lt;/p&gt;    
&lt;/div&gt;</description></item><item><title>Recent Virtuoso Developments</title><guid>http://www.openlinksw.com/blog/vdb/blog/?date=2006-09-19#1044</guid><comments>http://www.openlinksw.com/blog/vdb/blog/?id=1044#comments</comments><pubDate>Tue, 19 Sep 2006 11:45:29 GMT</pubDate><n0:modified xmlns:n0="http://www.openlinksw.com/weblog/">2006-09-19T07:45:29.000003-04:00</n0:modified><description>&lt;div&gt;
&lt;div style=&quot;display:none;&quot;&gt;Recent Virtuoso Developments&lt;/div&gt;
&lt;p&gt;We have been working extensively on virtual database refinements. There are many SQL cost model adjustments to better model distributed queries, and we now support direct access to Oracle and Informix statistics system tables. Thus, when you attach a table from one or the other, you automatically get up-to-date statistics. This helps &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/&quot;&gt;Virtuoso&lt;/a&gt; optimize distributed queries. The documentation has also been updated to cover these, with a new section on distributed query optimization.&lt;/p&gt;
&lt;p&gt;On the applications side, we have been keeping up with the SIOC RDF ontology developments. All ODS applications now make their data available as SIOC graphs for download and SPARQL query access.&lt;/p&gt;
&lt;p&gt;What is most exciting, however, is our advance in mapping relational data into RDF. We now have a mapping language that makes arbitrary legacy data in &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/&quot;&gt;Virtuoso&lt;/a&gt; or elsewhere in the relational world queryable as RDF. We will put out a white paper on this in a few days.&lt;/p&gt;
&lt;p&gt;We also have some innovations in mind for optimizing the physical storage of RDF triples. We keep experimenting, now with our sights set on the high end of triple storage, towards billion-triple data sets. We are experimenting with a new, more space-efficient index structure for better working set behavior. Next week will yield the first results.&lt;/p&gt;
&lt;a href=&quot;http://www.technorati.com/tags/sql&quot; rel=&quot;tag&quot;&gt;SQL&lt;/a&gt; &lt;a href=&quot;http://www.technorati.com/tags/database&quot; rel=&quot;tag&quot;&gt;Database&lt;/a&gt; &lt;a href=&quot;http://www.technorati.com/tags/databases&quot; rel=&quot;tag&quot;&gt;Databases&lt;/a&gt; &lt;a href=&quot;http://www.technorati.com/tags/virtuoso&quot; rel=&quot;tag&quot;&gt;Virtuoso&lt;/a&gt; &lt;a href=&quot;http://www.technorati.com/tags/programming&quot; rel=&quot;tag&quot;&gt;Programming&lt;/a&gt; &lt;a href=&quot;http://www.technorati.com/tags/semantic%20web&quot; rel=&quot;tag&quot;&gt;Semantic Web&lt;/a&gt; &lt;a href=&quot;http://www.technorati.com/tags/sparql&quot; rel=&quot;tag&quot;&gt;SPARQL&lt;/a&gt; &lt;/div&gt;</description></item>
</channel>
</rss>
