<?xml version="1.0" encoding="UTF-8" ?>
<!--ATOM based XML document generated By OpenLink Virtuoso-->
<atom:feed xmlns:atom="http://www.w3.org/2005/Atom" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:vi="http://www.openlinksw.com/weblog/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:openSearch="http://a9.com/-/spec/opensearchrss/1.0/" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:dc="http://purl.org/dc/elements/1.1/">
<atom:id>http://www.openlinksw.com/blog/vdb/blog/</atom:id>
<atom:title>OpenLink Virtuoso (Product Blog)</atom:title>
<atom:link href="http://www.openlinksw.com/blog/vdb/blog/" type="text/html" rel="alternate" />
<atom:link href="http://www.openlinksw.com/blog/vdb/blog/gems/atom_tag_arch.xml?:tag=virtuoso&amp;:bid=136" type="application/atom+xml" rel="self" />
<atom:subtitle>A great place to track Virtuoso&#39;s rapid evolution.</atom:subtitle>
 <atom:author>
  <atom:name>kidehen@openlinksw.com</atom:name>
  <atom:email>kidehen@openlinksw.com</atom:email>
  </atom:author>
<atom:updated>2013-06-19T11:06:53Z</atom:updated>
<atom:generator>Virtuoso Universal Server 06.04.3136</atom:generator>
<atom:logo>http://www.openlinksw.com/weblog/public/images/vbloglogo.gif</atom:logo>
 <atom:entry>
  <atom:title>RDF and Transactions</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2011-03-22#1690</atom:id>
  <atom:published>2011-03-22T22:52:56Z</atom:published>
  <atom:updated>2011-03-22T17:44:21-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;I will here talk about &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x249bc940&quot;&gt;RDF&lt;/a&gt; and transactions for developers in general. The next one talks about specifics and is for specialists.&lt;/p&gt; &lt;p&gt;Transactions are certainly not the first thing that comes to mind when one hears &amp;quot;RDF&amp;quot;. We have at times used a recruitment questionnaire where we ask applicants to define a transaction. Many vaguely remember that it is a unit of work, but usually not more than that. We sometimes get questions from users about why they get an error message that says &amp;quot;deadlock&amp;quot;. &amp;quot;Deadlock&amp;quot; is what happens when multiple users concurrently update balances on multiple bank accounts in the wrong order. What does this have to do with RDF?&lt;/p&gt; &lt;p&gt;There are in fact users who even use XA with a &lt;a class=&quot;auto-href&quot; href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x22c8dbc8&quot;&gt;Virtuoso&lt;/a&gt;-based RDF application. &lt;a class=&quot;auto-href&quot; href=&quot;http://semanticweb.org/id/Franz_Inc&quot; id=&quot;link-id0x27bd0c08&quot;&gt;Franz&lt;/a&gt; also has publicized their development of full &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/ACID&quot; id=&quot;link-id0x283985c8&quot;&gt;ACID&lt;/a&gt; capabilities for &lt;a class=&quot;auto-href&quot; href=&quot;http://semanticweb.org/id/AllegroGraph&quot; id=&quot;link-id0x238ba438&quot;&gt;AllegroGraph&lt;/a&gt;. 
RDF is a database &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Database_schema&quot; id=&quot;link-id0x2864fef8&quot;&gt;schema&lt;/a&gt; model, and transactions will inevitably become an issue in databases.&lt;/p&gt; &lt;p&gt;At the same time, the developer population trained with &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/MySQL&quot; id=&quot;link-id0x284d2d80&quot;&gt;MySQL&lt;/a&gt; and &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/PHP&quot; id=&quot;link-id0x237230e8&quot;&gt;PHP&lt;/a&gt; is not particularly transaction-aware. Transactions have gone out of style, declares the No-&lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x2920cc88&quot;&gt;SQL&lt;/a&gt; crowd. Well, it is not so much SQL they object to but ACID, i.e., transactional guarantees. We will talk more about this in the next post. The &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x283f0588&quot;&gt;SPARQL&lt;/a&gt; language and protocol do not go into transactions, except for expressing the wish that an &lt;code&gt;UPDATE&lt;/code&gt; request to an end-point be atomic. But beware -- atomicity is a gateway drug, and soon one finds oneself on full ACID. &lt;/p&gt; &lt;p&gt;If one says that a thing will either happen &lt;i&gt;in its entirety&lt;/i&gt; or &lt;i&gt;not at all,&lt;/i&gt; which is what (A) atomicity means, then the question arises of (I) isolation; that is, what happens if somebody else does something to the same &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x238280f8&quot;&gt;data&lt;/a&gt; at the same time? Then comes the question of whether a thing, once having happened, will stay that way; i.e., (D) durability. 
Finally, there is (&lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/C%2B%2B&quot; id=&quot;link-id0x276714b8&quot;&gt;C&lt;/a&gt;) consistency, which means that the transaction&amp;#39;s result must not contradict restrictions the database is supposed to enforce. RDF usually has no restrictions; thus consistency mostly means that the internal state of the DBMS must be consistent, e.g., different indices on triples/quads should contain the same data.&lt;/p&gt; &lt;p&gt;There are, of course, database-like consistency criteria that one can express in RDF Schema and &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Web_Ontology_Language&quot; id=&quot;link-id0x28625a90&quot;&gt;OWL&lt;/a&gt;, concerning data types, mandatory presence of properties, or restrictions on cardinality (i.e., one may only have one spouse at a time, and the like). &lt;/p&gt; &lt;p&gt;If one indeed did enforce them all, then RDF would be very like the relational model -- with all the restrictions, but without the 40 years of work on &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Relational_database_management_system&quot; id=&quot;link-id0x249bf4f8&quot;&gt;RDBMS&lt;/a&gt; performance. For this reason, RDF use tends to involve data that is not structured enough to be a good fit for RDBMS.&lt;/p&gt; &lt;p&gt;There is of course the OWL side, where consistency is important but is defined in such complex ways that they again are not a good fit for RDBMS. RDF could be seen to be split between the schema-last world and the &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id0x249504f8&quot;&gt;knowledge&lt;/a&gt; representation world. I will here focus on the schema-last side.&lt;/p&gt; &lt;p&gt;Transactions are relevant in RDF in two cases: 1. If data is trickle loaded in small chunks, one likes to know that the chunks do not get lost or corrupted; 2. 
If the application has any semantics that reserve resources, then these operations need transactions. The latter is not so common with RDF but examples include read-write situations, like checking if a seat is available and then reserving it. Transactionality guarantees that the same seat does not get reserved twice.&lt;/p&gt; &lt;p&gt;Web people argue with some justification that since the four cardinal virtues of database never existed on the web to begin with, applying strict ACID to web data is beside the point, like locking the stable after the horse has long since run away. This may be so; yet the systems used for processing data, whether that data is dirty or not, benefit from predictable operation under concurrency and from not losing data.&lt;/p&gt; &lt;p&gt;Analytics workloads are not primarily about transactions, but still need to specify what happens with updates. Analyzing data from measurements may not have concurrent updates, but there the transaction issue is replaced by the question of making explicit how the data was acquired and what processing has been applied to it before storage.&lt;/p&gt; &lt;p&gt;As mentioned before, the &lt;a class=&quot;auto-href&quot; href=&quot;http://lod2.eu/&quot; id=&quot;link-id0x27d952d0&quot;&gt;LOD2&lt;/a&gt; project is at the crossroads of RDF and database. I construe its mission to be the making of RDF into a respectable database discipline. Database respectability in turn is as good as inconceivable without addressing the very bedrock on which this science was founded: transactions.&lt;/p&gt; &lt;p&gt;As previously argued, we need well-defined and auditable benchmarks. This again brings up the topic of transactions. Once we embark on the database benchmark route, there is no way around this. 
&lt;a class=&quot;auto-href&quot; href=&quot;http://www.tpc.org/&quot; id=&quot;link-id0x2359d2d0&quot;&gt;TPC&lt;/a&gt;-&lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/TPC-H&quot; id=&quot;link-id0x28edb770&quot;&gt;H&lt;/a&gt; mandates that the system under test support transactions, and the audit involves a test for this. We can do no less.&lt;/p&gt; &lt;p&gt;This has led me to more closely examine the issue of RDF and transactions, and whether there exist differences between transactions applied to RDF and to relational data. &lt;/p&gt; &lt;p&gt;As concerns Virtuoso, our position has been that one can get full ACID in Virtuoso, whether in SQL or SPARQL, by using a connected client (e.g., &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Open_Database_Connectivity&quot; id=&quot;link-id0x23a55698&quot;&gt;ODBC&lt;/a&gt;, &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Java_Database_Connectivity&quot; id=&quot;link-id0x235cecf0&quot;&gt;JDBC&lt;/a&gt;, or the &lt;a class=&quot;auto-href&quot; href=&quot;http://jena.sourceforge.net/&quot; id=&quot;link-id0x23213900&quot;&gt;Jena&lt;/a&gt; or &lt;a class=&quot;auto-href&quot; href=&quot;http://sourceforge.net/projects/sesame/&quot; id=&quot;link-id0x277874d0&quot;&gt;Sesame&lt;/a&gt; frameworks), and setting the isolation options on the connection. Having taken this step, one then must take the next step, which consists of dealing with deadlocks; i.e., with concurrent utilization, it may happen that the database at any time notifies the client that the transaction got aborted and the client must retry.&lt;/p&gt; &lt;p&gt;Web developers especially do not like this, because this is not what MySQL has taught them to expect. MySQL does have transactional back-ends like InnoDB, but often gets used without transactions.&lt;/p&gt; &lt;p&gt;With the March 2011 Virtuoso releases, we have taken a closer look at transactions with RDF. 
It is more practical to reduce the possibility of errors than to require developers to pay attention. For this reason we have automated isolation settings for RDF, greatly reduced the incidence of deadlocks, and even incorporated automatic deadlock retries where applicable.&lt;/p&gt; &lt;p&gt;If all users lock resources they need in the same order, there will be no deadlocks. This is what we do with RDF load in Virtuoso 7; thus any mix of concurrent &lt;code&gt;INSERTs&lt;/code&gt; and &lt;code&gt;DELETEs&lt;/code&gt; is guaranteed never to fail due to locking, provided each is under a certain size (normally 10000 quads). These could still fail due to running out of space, though. With previous versions, there was always a possibility of having an &lt;code&gt;INSERT&lt;/code&gt; or &lt;code&gt;DELETE&lt;/code&gt; fail because of deadlock with multiple users. Vectored &lt;code&gt;INSERT&lt;/code&gt; and &lt;code&gt;DELETE&lt;/code&gt; are sufficient for making web crawling or archive maintenance practically deadlock-free, since there the primary transaction is the &lt;code&gt;INSERT&lt;/code&gt; or &lt;code&gt;DELETE&lt;/code&gt; of a small graph. &lt;/p&gt; &lt;p&gt;Furthermore, since the &lt;a class=&quot;auto-href&quot; href=&quot;http://www.w3.org/TR/rdf-sparql-protocol/&quot; id=&quot;link-id0x23eadf50&quot;&gt;SPARQL protocol&lt;/a&gt; has no way of specifying transactions consisting of multiple client-server exchanges, the SPARQL end-point may deal with deadlocks by itself. If all else fails, it can simply execute requests one after the other, thus eliminating any possibility of locking. We note that many statements will be intrinsically free of deadlocks by virtue of always locking in key order, but this cannot be universally guaranteed with arbitrary-size operations; thus concurrent operations might still sometimes deadlock. 
Anyway, vectored execution as introduced in Virtuoso 7, besides getting easily double-speed random access, also greatly reduces deadlocks by virtue of ordering operations.&lt;/p&gt; &lt;p&gt;In the next post we will talk about what transactions mean with RDF and whether there is any difference with the relational model.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Benchmarks, Redux (part 15): BSBM Test Driver Enhancements</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2011-03-22#1688</atom:id>
  <atom:published>2011-03-22T22:32:28Z</atom:published>
  <atom:updated>2011-03-22T17:04:43-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;This article covers the changes we have made to the &lt;a class=&quot;auto-href&quot; href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x2361bf18&quot;&gt;BSBM&lt;/a&gt; test driver during our series of experiments.&lt;/p&gt; &lt;ul&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Drill-down mode&lt;/b&gt; - For queries that have a product type as a parameter, the test driver will invoke the query multiple times, each time with a random subtype of the product type of the previous invocation. The starting point of the drill-down is a random type from a settable level in the hierarchy. The rationale for the drill-down mode is that depending on the parameter choice, there can be 1000x differences in query run time. Thus run times of consecutive query mixes will be incomparable unless we guarantee that each mix has a predictable number of queries with a product type from each level in the hierarchy.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;b&gt;Permutation of query mix&lt;/b&gt; - In the BI workload, the queries are run in a random order on each thread in multiuser mode. Doing exactly the same thing on many threads is not realistic for large queries. The &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x2834cec8&quot;&gt;data&lt;/a&gt; access patterns must be spread out in order to evaluate how bulk IO is organized with differing concurrent demands. The permutations are deterministic on consecutive runs and do not depend on the non-deterministic timing of concurrent activities. For queries with a drill-down, the individual executions that make up the drill-down are still consecutive.&lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;New metrics&lt;/b&gt; - The BI Power is the geometric mean of query run times scaled to queries per hour and multiplied by the scale factor, where 100 Mt is considered the unit scale. 
The BI Throughput is the arithmetic mean of the run times scaled to QPH and adjusted to scale as with the Power metric. These are analogous to the &lt;a class=&quot;auto-href&quot; href=&quot;http://www.tpc.org/&quot; id=&quot;link-id0x236c5158&quot;&gt;TPC&lt;/a&gt;-&lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/TPC-H&quot; id=&quot;link-id0x28814950&quot;&gt;H&lt;/a&gt; Power and Throughput metrics. &lt;/p&gt; &lt;p&gt;The &lt;i&gt;Power&lt;/i&gt; is defined as&lt;/p&gt; &lt;blockquote&gt;(scale_factor / 284826) * 3600 / ((t0 * t1 * ... * tn) ^(1 / n)) &lt;/blockquote&gt; &lt;p&gt;The &lt;i&gt;Throughput&lt;/i&gt; is defined as&lt;/p&gt; &lt;blockquote&gt;(scale_factor / 284826) * 3600 / ((t0 + t1 + ... + tn) / n)&lt;/blockquote&gt; &lt;p&gt;The magic number 284826 is the scale that generates approximately 100 million triples (100 Mt). We consider this &amp;quot;scale one.&amp;quot; The reason for the multiplication is that scores at different scales should get similar numbers, otherwise 10x larger scale would result roughly in 10x lower throughput with the BI queries.&lt;/p&gt; &lt;p&gt;We also show the percentage each query represents of the total time the test driver waits for responses. &lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Deadlock retry&lt;/b&gt; - When running update mixes, it is possible that a transaction gets aborted by a deadlock. We have added retry logic for this.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Cluster mode&lt;/b&gt; - Cluster databases may have multiple interchangeable &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Hypertext_Transfer_Protocol&quot; id=&quot;link-id0x240f9008&quot;&gt;HTTP&lt;/a&gt; listeners. With this mode, one can specify multiple end-points so a multi-user workload can divide itself evenly over these.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Identifying matter&lt;/b&gt; - A version number was added to test driver output. 
Use of the new switches is also indicated in the test driver output.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;SUT &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Central_processing_unit&quot; id=&quot;link-id0x249b7208&quot;&gt;CPU&lt;/a&gt;&lt;/b&gt; - In comparing results it is crucial to differentiate between in-memory runs and IO-bound runs. To make this easier, we have added an option to report server CPU times over the timed portion (excluding warm-ups). A pluggable self-script determines the CPU times for the system; thus clusters can be handled, too. The time is given as a sum of the time the server processes have aged during the run and as a percentage over the wall-clock time.&lt;/p&gt; &lt;/li&gt; &lt;/ul&gt; &lt;p&gt;These changes will soon be available &lt;a href=&quot;http://blogs.usnet.private:8893/RPC2&quot; id=&quot;link-id0x1f9a57c0&quot;&gt;as a diff&lt;/a&gt; and &lt;a href=&quot;http://blogs.usnet.private:8893/RPC2&quot; id=&quot;link-id0x1f2fea08&quot;&gt;as a source tree&lt;/a&gt;. This version is labeled &lt;b&gt;&lt;code&gt;BSBM Test Driver 1.1-opl&lt;/code&gt;&lt;/b&gt;; the &lt;b&gt;&lt;code&gt;-opl&lt;/code&gt;&lt;/b&gt; signifies OpenLink additions. &lt;/p&gt; &lt;p&gt;We invite FU Berlin to incorporate these enhancements into their SourceForge repository of the BSBM test driver. 
There is more precise documentation of these options in the README file in the above distribution.&lt;/p&gt; &lt;p&gt;The next planned upgrade of the test driver concerns adding support for &amp;quot;&lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x2865ac68&quot;&gt;RDF&lt;/a&gt;-H&amp;quot;, the RDF adaptation of the industry standard TPC-H decision support benchmark for &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Relational_database_management_system&quot; id=&quot;link-id0x23597bb0&quot;&gt;RDBMS&lt;/a&gt;.&lt;/p&gt; &lt;h3&gt; &lt;i&gt;Benchmarks, Redux&lt;/i&gt; Series&lt;/h3&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1658&quot; id=&quot;link-id0x1db2be00&quot;&gt;Benchmarks, Redux (part 1): On RDF Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1660&quot; id=&quot;link-id0x1dfcc038&quot;&gt;Benchmarks, Redux (part 2): A Benchmarking Story&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1663&quot; id=&quot;link-id0x197c26d0&quot;&gt;Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1665&quot; id=&quot;link-id0x1d149cf0&quot;&gt;Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1667&quot; id=&quot;link-id0x1ab69450&quot;&gt;Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1669&quot; id=&quot;link-id0x1e67d688&quot;&gt;Benchmarks, Redux (part 6): BSBM and I/O, continued&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1671&quot; 
id=&quot;link-id0x1dad87c8&quot;&gt;Benchmarks, Redux (part 7): What Does BSBM Explore Measure?&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1673&quot; id=&quot;link-id0x1cc73830&quot;&gt;Benchmarks, Redux (part 8): BSBM Explore and Update &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1675&quot; id=&quot;link-id0x1d6879a8&quot;&gt;Benchmarks, Redux (part 9): BSBM With Cluster&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1677&quot; id=&quot;link-id0x1dfae510&quot;&gt;Benchmarks, Redux (part 10): LOD2 and the Benchmark Process&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1678&quot; id=&quot;link-id0x1ef052a0&quot;&gt;Benchmarks, Redux (part 11): The Substance of Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1dadddb0&quot;&gt;Benchmarks, Redux (part 12): Our Own BSBM Results Report&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1e662ef0&quot;&gt;Benchmarks, Redux (part 13): BSBM BI Modifications &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1df6fa70&quot;&gt;Benchmarks, Redux (part 14): BSBM BI Mix &lt;/a&gt; &lt;/li&gt; &lt;li&gt; Benchmarks, Redux (part 15): BSBM Test Driver Enhancements &lt;i&gt;(this post)&lt;/i&gt; &lt;/li&gt; &lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Benchmarks, Redux (part 14): BSBM BI Mix</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2011-03-22#1687</atom:id>
  <atom:published>2011-03-22T22:31:32Z</atom:published>
  <atom:updated>2011-03-22T17:04:38-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;In this post, we look at how we run the &lt;a class=&quot;auto-href&quot; href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x236dcda8&quot;&gt;BSBM&lt;/a&gt;-BI mix. We consider the 100 Mt and 1000 Mt scales with &lt;a class=&quot;auto-href&quot; href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x284893c0&quot;&gt;Virtuoso&lt;/a&gt; 7 using the same hardware and software as in the previous posts. The changes to workload and metric are given in the previous post.&lt;/p&gt; &lt;p&gt;Our intent here is to look at whether the metric works, and to see what results will look like in general. We are as much testing the benchmark as we are testing the system-under-test (SUT). The results shown here will likely not be comparable with future ones because we will most likely change the composition of the workload since it seems a bit out of balance. Anyway, for the sake of disclosure, we attach the query templates. The test driver we used will be made available soon, so the interested may still try a comparison with their systems. If you practice with this workload for the coming races, the effort will surely not be wasted.&lt;/p&gt; &lt;p&gt;Once we have come up with a rules document, we will redo all that we have published so far by-the-book, and have it audited as part of the &lt;a class=&quot;auto-href&quot; href=&quot;http://lod2.eu/&quot; id=&quot;link-id0x23724860&quot;&gt;LOD2&lt;/a&gt; service we plan for this (see previous posts in this series). 
This will introduce comparability; but before we get that far with the BI workload, the workload needs to evolve a bit.&lt;/p&gt; &lt;p&gt;Below we show samples of test driver output; the whole output is &lt;a href=&quot;http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/BenchmarksReduxSupportingFiles/br.tar.gz&quot; id=&quot;link-id0x1b703ad8&quot;&gt;downloadable&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;100 Mt Single User&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt; bsbm/testdriver -runs 1 -w 0 -idir /bs/1 -drill \ -ucf bsbm/usecases/businessIntelligence/&lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x2385eb48&quot;&gt;sparql&lt;/a&gt;.txt \ -dg &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Hypertext_Transfer_Protocol&quot; id=&quot;link-id0x22e2f508&quot;&gt;http&lt;/a&gt;://bsbm.org http://localhost:8604/sparql &lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt; 0: 43348.14ms, total: 43440ms Scale factor: 284826 Explore Endpoints: 1 Update Endpoints: 1 Drilldown: on Number of warmup runs: 0 Seed: 808080 Number of query mix runs (without warmups): 1 times min/max Querymix runtime: 43.3481s / 43.3481s Elapsed runtime: 43.348 seconds QMpH: 83.049 query mixes per hour CQET: 43.348 seconds average runtime of query mix CQET (geom.): 43.348 seconds geometric mean runtime of query mix AQET (geom.): 0.492 seconds geometric mean runtime of query Throughput: 1494.874 BSBM-BI throughput: qph*scale BI Power: 7309.820 BSBM-BI Power: qph*scale (geom) &lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;100 Mt 8 User &lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt; Thread 6: query mix 3: 195793.09ms, total: 196086.18ms Thread 8: query mix 0: 197843.84ms, total: 198010.50ms Thread 7: query mix 4: 201806.28ms, total: 201996.26ms Thread 2: query mix 5: 221983.93ms, total: 222105.96ms Thread 4: query mix 7: 225127.55ms, total: 225317.49ms Thread 3: 
query mix 6: 225860.49ms, total: 226050.17ms Thread 5: query mix 2: 230884.93ms, total: 231067.61ms Thread 1: query mix 1: 237836.61ms, total: 237959.11ms Benchmark run completed in 237.985427s Scale factor: 284826 Explore Endpoints: 1 Update Endpoints: 1 Drilldown: on Number of warmup runs: 0 Number of clients: 8 Seed: 808080 Number of query mix runs (without warmups): 8 times min/max Querymix runtime: 195.7931s / 237.8366s Total runtime (sum): 1737.137 seconds Elapsed runtime: 1737.137 seconds QMpH: 121.016 query mixes per hour CQET: 217.142 seconds average runtime of query mix CQET (geom.): 216.603 seconds geometric mean runtime of query mix AQET (geom.): 2.156 seconds geometric mean runtime of query Throughput: 2178.285 BSBM-BI throughput: qph*scale BI Power: 1669.745 BSBM-BI Power: qph*scale (geom) &lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;1000 Mt Single User&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt; 0: 608707.03ms, total: 608768ms Scale factor: 2848260 Explore Endpoints: 1 Update Endpoints: 1 Drilldown: on Number of warmup runs: 0 Seed: 808080 Number of query mix runs (without warmups): 1 times min/max Querymix runtime: 608.7070s / 608.7070s Elapsed runtime: 608.707 seconds QMpH: 5.914 query mixes per hour CQET: 608.707 seconds average runtime of query mix CQET (geom.): 608.707 seconds geometric mean runtime of query mix AQET (geom.): 5.167 seconds geometric mean runtime of query Throughput: 1064.552 BSBM-BI throughput: qph*scale BI Power: 6967.325 BSBM-BI Power: qph*scale (geom) &lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;1000 Mt 8 User &lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt; bsbm/testdriver -runs 8 -mt 8 -w 0 -idir /bs/10 -drill \ -ucf bsbm/usecases/businessIntelligence/sparql.txt \ -dg http://bsbm.org http://localhost:8604/sparql &lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt; Thread 3: query mix 4: 2211275.25ms, total: 2211371.60ms Thread 4: query mix 0: 
2212316.87ms, total: 2212417.99ms Thread 8: query mix 3: 2275942.63ms, total: 2276058.03ms Thread 5: query mix 5: 2441378.35ms, total: 2441448.66ms Thread 6: query mix 7: 2804001.05ms, total: 2804098.81ms Thread 2: query mix 2: 2808374.66ms, total: 2808473.71ms Thread 1: query mix 6: 2839407.12ms, total: 2839510.63ms Thread 7: query mix 1: 2889199.23ms, total: 2889263.17ms Benchmark run completed in 2889.302566s Scale factor: 2848260 Explore Endpoints: 1 Update Endpoints: 1 Drilldown: on Number of warmup runs: 0 Number of clients: 8 Seed: 808080 Number of query mix runs (without warmups): 8 times min/max Querymix runtime: 2211.2753s / 2889.1992s Total runtime (sum): 20481.895 seconds Elapsed runtime: 20481.895 seconds QMpH: 9.968 query mixes per hour CQET: 2560.237 seconds average runtime of query mix CQET (geom.): 2544.284 seconds geometric mean runtime of query mix AQET (geom.): 13.556 seconds geometric mean runtime of query Throughput: 1794.205 BSBM-BI throughput: qph*scale BI Power: 2655.678 BSBM-BI Power: qph*scale (geom) Metrics for Query: 1 Count: 8 times executed in whole run Time share 2.120884% of total execution time AQET: 54.299656 seconds (arithmetic mean) AQET(geom.): 34.607302 seconds (geometric mean) QPS: 0.13 Queries per second minQET/maxQET: 11.71547600s / 148.65379700s Metrics for Query: 2 Count: 8 times executed in whole run Time share 0.207382% of total execution time AQET: 5.309462 seconds (arithmetic mean) AQET(geom.): 2.737696 seconds (geometric mean) QPS: 1.34 Queries per second minQET/maxQET: 0.78729800s / 25.80948200s Metrics for Query: 3 Count: 8 times executed in whole run Time share 17.650472% of total execution time AQET: 451.893890 seconds (arithmetic mean) AQET(geom.): 410.481088 seconds (geometric mean) QPS: 0.02 Queries per second minQET/maxQET: 171.07262500s / 721.72939200s Metrics for Query: 5 Count: 32 times executed in whole run Time share 6.196565% of total execution time AQET: 39.661685 seconds (arithmetic mean) AQET(geom.): 
6.849882 seconds (geometric mean) QPS: 0.18 Queries per second minQET/maxQET: 0.15696500s / 189.00906200s Metrics for Query: 6 Count: 8 times executed in whole run Time share 0.119916% of total execution time AQET: 3.070136 seconds (arithmetic mean) AQET(geom.): 2.056059 seconds (geometric mean) QPS: 2.31 Queries per second minQET/maxQET: 0.41524400s / 7.55655300s Metrics for Query: 7 Count: 40 times executed in whole run Time share 1.577963% of total execution time AQET: 8.079921 seconds (arithmetic mean) AQET(geom.): 1.342079 seconds (geometric mean) QPS: 0.88 Queries per second minQET/maxQET: 0.02205800s / 40.27761500s Metrics for Query: 8 Count: 40 times executed in whole run Time share 72.126818% of total execution time AQET: 369.323481 seconds (arithmetic mean) AQET(geom.): 114.431863 seconds (geometric mean) QPS: 0.02 Queries per second minQET/maxQET: 5.94377300s / 1824.57867400s &lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;The &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Central_processing_unit&quot; id=&quot;link-id0x2809d998&quot;&gt;CPU&lt;/a&gt; for the multiuser runs stays above 1500% for the whole run. The CPU for the single user 100 Mt run is 630%; for the 1000 Mt run, this is 574%. This can be improved since the queries usually have a lot of &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x22cf75b8&quot;&gt;data&lt;/a&gt; to work on. But final &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Program_optimization&quot; id=&quot;link-id0x238b94c8&quot;&gt;optimization&lt;/a&gt; is not our goal yet; we are just surveying the race track. The difference between a warm single user run and a cold single user run is about 15% with data on SSD; with data on disk, this would be more. 
The numbers shown are with warm &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Cache&quot; id=&quot;link-id0x23ad8c08&quot;&gt;cache&lt;/a&gt;. The single-user and multi-user Throughput difference, 1064 single-user vs. 1794 multi-user, is about what one would expect from the CPU utilization.&lt;/p&gt; &lt;p&gt;With these numbers, the CPU does not appear badly memory-bound, else the increase would be less; also core multi-threading seems to bring some benefit. If the single-user run were at 800%, the Throughput would be 1488. The speed in excess of this may be attributed to core multi-threading, although we must remember that not every query mix is exactly the same length, so the figure is not exact. Core multi-threading does not seem to hurt, at the very least. Comparison of the same numbers with the column store will be interesting since it misses the cache a lot less and accordingly has better SMP scaling. The &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Intel_Corporation&quot; id=&quot;link-id0x23568308&quot;&gt;Intel&lt;/a&gt; Nehalem memory subsystem is really pretty good.&lt;/p&gt; &lt;p&gt; &lt;/p&gt; &lt;p&gt;For reference, we show a run with Virtuoso 6 at 100 Mt. 
&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt; 0: 424754.40ms, total: 424829ms Scale factor: 284826 Explore Endpoints: 1 Update Endpoints: 1 Drilldown: on Number of warmup runs: 0 Seed: 808080 Number of query mix runs (without warmups): 1 times min/max Querymix runtime: 424.7544s / 424.7544s Elapsed runtime: 424.754 seconds QMpH: 8.475 query mixes per hour CQET: 424.754 seconds average runtime of query mix CQET (geom.): 424.754 seconds geometric mean runtime of query mix AQET (geom.): 1.097 seconds geometric mean runtime of query Throughput: 152.559 BSBM-BI throughput: qph*scale BI Power: 3281.150 BSBM-BI Power: qph*scale (geom) &lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;and 8 user &lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt; Thread 5: query mix 3: 616997.86ms, total: 617042.83ms Thread 7: query mix 4: 625522.18ms, total: 625559.09ms Thread 3: query mix 7: 626247.62ms, total: 626304.96ms Thread 1: query mix 0: 629675.17ms, total: 629724.98ms Thread 4: query mix 6: 667633.36ms, total: 667670.07ms Thread 8: query mix 2: 674206.07ms, total: 674256.72ms Thread 6: query mix 5: 695020.21ms, total: 695052.29ms Thread 2: query mix 1: 701824.67ms, total: 701864.91ms Benchmark run completed in 701.909341s Scale factor: 284826 Explore Endpoints: 1 Update Endpoints: 1 Drilldown: on Number of warmup runs: 0 Number of clients: 8 Seed: 808080 Number of query mix runs (without warmups): 8 times min/max Querymix runtime: 616.9979s / 701.8247s Total runtime (sum): 5237.127 seconds Elapsed runtime: 5237.127 seconds QMpH: 41.031 query mixes per hour CQET: 654.641 seconds average runtime of query mix CQET (geom.): 653.873 seconds geometric mean runtime of query mix AQET (geom.): 2.557 seconds geometric mean runtime of query Throughput: 738.557 BSBM-BI throughput: qph*scale BI Power: 1408.133 BSBM-BI Power: qph*scale (geom) &lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;Having the numbers, let us look at the metric and its scaling. 
We take the geometric mean of the single-user Power and the multiuser Throughput.&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt; 100 Mt: sqrt ( 7771 * 2178 ); = 4114 1000 Mt: sqrt ( 6967 * 1794 ); = 3535 &lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;Scaling seems to work; the results are in the same general ballpark. The real times for the 1000 Mt run are a bit over 10x the times for the 100Mt run, as expected. The relative percentages of the queries are about the same on both scales, with the drill-down in Q8 alone being 77% and 72% respectively. The Q8 drill-down starts at the root of the product hierarchy. If we made this start one level from the top, its share would drop. This seems reasonable.&lt;/p&gt; &lt;p&gt;Conversely, Q2 is out of place, with far too little share of the time. It takes a product as a starting point and shows a list of products with common features, sorted by descending count of common features. This would more appropriately be applied to a leaf product category instead, measuring how many of the products in the category have the top 20 features found in this category, to name an example.&lt;/p&gt; &lt;p&gt;Also there should be more queries.&lt;/p&gt; &lt;p&gt;At present it appears that BSBM-BI is definitely runnable, but a cursory look suffices to show that the workload needs more development and variety. We remember that I dreamt up the business questions last fall without much analysis, and that these questions were subsequently translated to SPARQL by FU Berlin. So, on one hand, BSBM-BI is of crucial importance because it is the first attempt at doing a benchmark with long running queries in SPARQL. 
On the other hand, BSBM-BI is not very good as a benchmark; &lt;a class=&quot;auto-href&quot; href=&quot;http://www.tpc.org/&quot; id=&quot;link-id0x23872a10&quot;&gt;TPC&lt;/a&gt;-&lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/TPC-H&quot; id=&quot;link-id0x28487d98&quot;&gt;H&lt;/a&gt; is a lot better. This stands to reason, as TPC-H has had years and years of development and participation by many people.&lt;/p&gt; &lt;p&gt;Benchmark queries are trick questions: For example, TPC-H Q18 cannot be done without changing an &lt;code&gt;IN&lt;/code&gt; into a &lt;code&gt;JOIN&lt;/code&gt; with the &lt;code&gt;IN&lt;/code&gt; subquery in the outer loop and doing streaming aggregation. Q13 cannot be done without a well-optimized &lt;code&gt;&lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Hash_join&quot; id=&quot;link-id0x24974830&quot;&gt;HASH JOIN&lt;/a&gt;&lt;/code&gt; which besides must be partitioned at the larger scales.&lt;/p&gt; &lt;p&gt;Having such trick questions in an important benchmark eventually results in everybody doing the optimizations that the benchmark clearly calls for. Making benchmarks thus entails a responsibility ultimately to the end user, because an irrelevant benchmark might in the worst case send developers chasing things that are beside the point.&lt;/p&gt; &lt;p&gt;In the following, we will look at what BSBM-BI requires from the database and how these requirements can be further developed and extended.&lt;/p&gt; &lt;p&gt;BSBM-BI does not have any clear trick questions, at least not premeditatedly. BSBM-BI just requires a cost model that can guess the fanout of a &lt;code&gt;JOIN&lt;/code&gt; and the cardinality of a &lt;code&gt;GROUP BY&lt;/code&gt;; it is enough to distinguish smaller from greater; the guess does not otherwise have to be very good. 
Further, the queries are written in the benchmark text so that joining from left to right would work, so not even a cost-based optimizer is strictly needed. I did, however, have to add some cardinality statistics to get reasonable &lt;code&gt;JOIN&lt;/code&gt; order since we always reorder the query regardless of the source formulation.&lt;/p&gt; &lt;p&gt;BSBM-BI does have variable selectivity from the drill-downs; thus these may call for different &lt;code&gt;JOIN&lt;/code&gt; orders for different parameter values. I have not looked into whether this really makes a difference, though.&lt;/p&gt; &lt;p&gt;There are places in BSBM-BI where using a &lt;code&gt;HASH JOIN&lt;/code&gt; makes sense. We do not use &lt;code&gt;HASH JOINs&lt;/code&gt; with &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x23cbf908&quot;&gt;RDF&lt;/a&gt; because there is an index for everything and making a &lt;code&gt;HASH JOIN&lt;/code&gt; in the wrong place can have a large up-front cost, so one is more robust against cost model errors if one does not do &lt;code&gt;HASH JOINs&lt;/code&gt;. This said, a &lt;code&gt;HASH JOIN&lt;/code&gt; in the right place is a lot better than an index lookup. With TPC-H Q13, our best &lt;code&gt;HASH JOIN&lt;/code&gt; is over 2x better than the best &lt;code&gt;INDEX&lt;/code&gt;-based &lt;code&gt;JOIN&lt;/code&gt;, both being well tuned. For questions like &amp;quot;count the hairballs made in &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Germany&quot; id=&quot;link-id0x249d3e28&quot;&gt;Germany&lt;/a&gt; reviewed by Japanese Hello Kitty fans,&amp;quot; where two ends of a &lt;code&gt;JOIN&lt;/code&gt; path are fairly selective, doing the other as a &lt;code&gt;HASH JOIN&lt;/code&gt; is good. This can, if the &lt;code&gt;JOIN&lt;/code&gt; is always cardinality-reducing, even be merged inside an &lt;code&gt;INDEX&lt;/code&gt; lookup. 
We have such capabilities since we have been gearing up for the relational races for a while, but are not using any of these with BSBM-BI, although they would be useful.&lt;/p&gt; &lt;p&gt;Let us see the profile for a single user 100 Mt run.&lt;/p&gt; &lt;p&gt;The database activity summary is --&lt;/p&gt; &lt;p&gt; &lt;code&gt;select db_activity (0, &amp;#39;http&amp;#39;);&lt;/code&gt; &lt;/p&gt; &lt;p&gt; &lt;code&gt; 161.3M rnd 210.2M seq 0 same seg 104.5M same pg 45.08M same par 0 disk 0 spec disk 0B / 0 messages 2.393K fork&lt;/code&gt; &lt;/p&gt; &lt;p&gt;See the post &amp;quot;&lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1671&quot; id=&quot;link-id0x1b1f3068&quot;&gt;What Does BSBM Explore Measure&lt;/a&gt;&amp;quot; for an explanation of the numbers. We see that there is more sequential access than random, and the random access has fair locality, with over half on the same page as the previous access and a lot of the rest falling under the same parent. Funnily enough, the explore mix has more locality. Running with a larger vector size would probably increase performance by getting better locality. There is an optimization that adjusts vector size on the fly if locality is not sufficient, but this is not being used here. So we manually set vector size to 100000 instead of the default 10000. We get --&lt;/p&gt; &lt;p&gt; &lt;code&gt; 172.4M rnd 220.8M seq 0 same seg 149.6M same pg 10.99M same par 21 disk 861 spec disk 0B / 0 messages 754 fork&lt;/code&gt; &lt;/p&gt; &lt;p&gt;The throughput goes from 1494 to 1779. We see more hits on the same page, as expected. We do not make this setting a default since it raises the cost for small queries; therefore the vector size must be self-adjusting -- besides, expecting a DBA to tune this is not reasonable. 
We will just have to correctly tune the self-adjust logic, and we have again clear gains.&lt;/p&gt; &lt;p&gt;Let us now go back to the first run with vector size 10000.&lt;/p&gt; &lt;p&gt;The top of the CPU &lt;code&gt;oprofile&lt;/code&gt; is as follows:&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt; 722309 15.4507 cmpf_iri64n_iri64n 434791 9.3005 cmpf_iri64n_iri64n_anyn_iri64n 294712 6.3041 itc_next_set 273488 5.8501 itc_vec_split_search 203970 4.3631 itc_dive_transit 199687 4.2714 itc_page_rcf_search 181614 3.8848 dc_itc_append_any 173043 3.7015 itc_bm_vec_row_check 146727 3.1386 cmpf_int64n 128224 2.7428 itc_vec_row_check 113515 2.4282 dk_alloc 97296 2.0812 page_wait_access 62523 1.3374 qst_vec_get_int64 59014 1.2623 itc_next_set_parent 53589 1.1463 sslr_qst_get 48003 1.0268 ds_add 46641 0.9977 dk_free_tree 44551 0.9530 kc_var_col 43650 0.9337 page_col_cmp_1 35297 0.7550 cmpf_iri64n_iri64n_anyn_gt_lt 34589 0.7399 dv_compare 25864 0.5532 cmpf_iri64n_anyn_iri64n_iri64n_lte 23088 0.4939 dk_free &lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;The top 10 are all index traversal, with the key compare for two leading IRI keys in the lead, corresponding to a lookup with &lt;code&gt;P&lt;/code&gt; and &lt;code&gt;S&lt;/code&gt; given. The one after that is with all parts given, corresponding to an existence test. The existence tests could probably be converted to &lt;code&gt;HASH JOIN&lt;/code&gt; lookups to good advantage. Aggregation and arithmetic are absent. We should probably add a query like TPC-H Q1 that does nothing but these two. Considering the overall profile, &lt;code&gt;GROUP BY&lt;/code&gt; seems to be around 3%. 
We should probably put in a query that makes a very large number of groups and could make use of streaming aggregation, i.e., take advantage of a situation where aggregation input comes already grouped by the grouping columns.&lt;/p&gt; &lt;p&gt;A BI use case should offer no problem with including arithmetic, but there are not that many numbers in the BSBM set. Some code sections in the queries with conditional execution and costly tests inside &lt;code&gt;ANDs&lt;/code&gt; and &lt;code&gt;ORs&lt;/code&gt; would be good. TPC-H has such in Q21 and Q19. An &lt;code&gt;OR&lt;/code&gt; with existences where there would be gain from good guesses of a subquery&amp;#39;s selectivity would be appropriate. Also, there should be conditional expressions somewhere with a lot of data, like the &lt;code&gt;CASE-WHEN&lt;/code&gt; in TPC-H Q12.&lt;/p&gt; &lt;p&gt;We can make BSBM-BI more interesting by putting in the above. Also we will have to see where we can profit from &lt;code&gt;HASH JOIN&lt;/code&gt;, both small and large. There should be such places in the workload already so this is a matter of just playing a bit more.&lt;/p&gt; &lt;p&gt;This post amounts to a cheat sheet for the BSBM-BI runs a bit farther down the road. 
By then we should be operational with the column store and Virtuoso 7 Cluster, though, so not everything is yet on the table.&lt;/p&gt; &lt;h3&gt; &lt;i&gt;Benchmarks, Redux&lt;/i&gt; Series&lt;/h3&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1658&quot; id=&quot;link-id0x1fd1d4e0&quot;&gt;Benchmarks, Redux (part 1): On RDF Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1660&quot; id=&quot;link-id0x1d5b07d8&quot;&gt;Benchmarks, Redux (part 2): A Benchmarking Story&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1663&quot; id=&quot;link-id0x1dfe6c48&quot;&gt;Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1665&quot; id=&quot;link-id0x197fce30&quot;&gt;Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1667&quot; id=&quot;link-id0x1fbf4210&quot;&gt;Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1669&quot; id=&quot;link-id0x1beeb1e0&quot;&gt;Benchmarks, Redux (part 6): BSBM and I/O, continued&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1671&quot; id=&quot;link-id0x1d7e1818&quot;&gt;Benchmarks, Redux (part 7): What Does BSBM Explore Measure?&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1673&quot; id=&quot;link-id0x1dfc1730&quot;&gt;Benchmarks, Redux (part 8): BSBM Explore and Update &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1675&quot; id=&quot;link-id0x1ea819a8&quot;&gt;Benchmarks, Redux (part 9): BSBM With Cluster&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a 
href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1677&quot; id=&quot;link-id0x1ec73da0&quot;&gt;Benchmarks, Redux (part 10): LOD2 and the Benchmark Process&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1678&quot; id=&quot;link-id0x1fbdce90&quot;&gt;Benchmarks, Redux (part 11): The Substance of Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x19928618&quot;&gt;Benchmarks, Redux (part 12): Our Own BSBM Results Report&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1f3d8710&quot;&gt;Benchmarks, Redux (part 13): BSBM-BI Modifications &lt;/a&gt; &lt;/li&gt; &lt;li&gt; Benchmarks, Redux (part 14): BSBM-BI Mix &lt;i&gt;(this post)&lt;/i&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1e627400&quot;&gt;Benchmarks, Redux (part 15): BSBM Test Driver Enhancements &lt;/a&gt; &lt;/li&gt; &lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Benchmarks, Redux (part 13): BSBM BI Modifications</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2011-03-22#1686</atom:id>
  <atom:published>2011-03-22T22:30:44Z</atom:published>
  <atom:updated>2011-03-22T17:04:34.000003-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;In this post we introduce changes to the &lt;a class=&quot;auto-href&quot; href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x234e0ca0&quot;&gt;BSBM&lt;/a&gt; BI queries and metric. These changes are motivated by prevailing benchmark practice and by our experiences in optimizing for the BSBM BI workload.&lt;/p&gt; &lt;p&gt;We will publish results according to the definitions given here and recommend that any interested parties do likewise. The rationales are given in the text.&lt;/p&gt; &lt;h3&gt;Query Mix&lt;/h3&gt; &lt;p&gt;We have removed Q4 from the mix because it is quadratic to the scale factor. The other queries are roughly &lt;code&gt;n * log (n)&lt;/code&gt;. &lt;/p&gt; &lt;h3&gt;Parameter Substitution &lt;/h3&gt; &lt;p&gt;All queries that take a product type as parameter are run in flights of several query invocations where the product type goes from broader to more specific. The initial product type specifies either the root product type or an immediate subtype of this, and the last in the drill-down is a leaf type.&lt;/p&gt; &lt;p&gt;The rationale for this is that the choice of product type may make several orders of magnitude difference in the run time of a query. In order to make consecutive query mixes roughly comparable in execution time, all mixes should have a predictable number of query invocations with product types of each level.&lt;/p&gt; &lt;h3&gt;Query Order &lt;/h3&gt; &lt;p&gt;In the BI mix, when running multiple concurrent clients, each query mix is submitted in a random order. Queries which do drill-downs always have the steps of the drill-down as consecutive in the session, but the query templates are permuted. 
This is done to make it less likely that two concurrent queries access exactly the same &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x23be8d28&quot;&gt;data&lt;/a&gt;. In this way, scans cannot be trivially shared between queries -- but there are still opportunities for reuse of results and adapting execution to working set, e.g., starting with what is in memory.&lt;/p&gt; &lt;h3&gt;Metrics &lt;/h3&gt; &lt;p&gt;We use a &lt;a class=&quot;auto-href&quot; href=&quot;http://www.tpc.org/&quot; id=&quot;link-id0x238c81a0&quot;&gt;TPC&lt;/a&gt;-&lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/TPC-H&quot; id=&quot;link-id0x28c6bbd8&quot;&gt;H&lt;/a&gt;-like metric. This metric consists of a single-user part and a multi-user part, called respectively &lt;i&gt;Power&lt;/i&gt; and &lt;i&gt;Throughput.&lt;/i&gt; The &lt;i&gt;Power&lt;/i&gt; metric is based on the geometric mean of the query run-times. The &lt;i&gt;Throughput&lt;/i&gt; is based on their arithmetic mean, i.e., the total run-time divided by the number of queries completed. After taking the mean, the time is converted into queries-per-hour. This rate is then multiplied by the scale factor divided by the scale factor for 100 Mt. In other words, we consider the 100 Mt data set as the unit scale.&lt;/p&gt; &lt;p&gt;The &lt;i&gt;Power&lt;/i&gt; is defined as&lt;/p&gt; &lt;blockquote&gt;( scale_factor / 284826 ) * 3600 / ( ( t1 * t2 * ... * tn ) ^ ( 1 / n ) ) &lt;/blockquote&gt; &lt;p&gt;The &lt;i&gt;Throughput&lt;/i&gt; is defined as&lt;/p&gt; &lt;blockquote&gt;( scale_factor / 284826 ) * 3600 / ( ( t1 + t2 + ... + tn ) / n ) &lt;/blockquote&gt; &lt;p&gt;The magic number &lt;b&gt;&lt;code&gt;284826&lt;/code&gt;&lt;/b&gt; is the scale that generates approximately 100 million triples (100 Mt). We consider this scale &amp;quot;one&amp;quot;. 
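&lt;/p&gt; &lt;p&gt;As a minimal sketch, the two definitions above can be written out in Python. The function names and the sample run times are ours, chosen for illustration only; they are not part of the BSBM test driver and the times are not from any measured run.&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt;
```python
import math

# Scale factor that yields roughly 100 million triples; treated as scale "one".
UNIT_SCALE = 284826


def power(scale_factor, times):
    """Scaled queries per hour, from the geometric mean of run times in seconds."""
    # Averaging logarithms avoids overflow when many run times are multiplied.
    geo_mean = math.exp(sum(math.log(t) for t in times) / len(times))
    return (scale_factor / UNIT_SCALE) * 3600 / geo_mean


def throughput(scale_factor, times):
    """Scaled queries per hour, from the arithmetic mean of run times in seconds."""
    arith_mean = sum(times) / len(times)
    return (scale_factor / UNIT_SCALE) * 3600 / arith_mean


# Hypothetical run times in seconds (illustration only).
sample_times = [1.0, 4.0, 16.0]
print(power(UNIT_SCALE, sample_times))       # geometric mean is 4 s
print(throughput(UNIT_SCALE, sample_times))  # arithmetic mean is 7 s
```
&lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;With these sample times at the unit scale, the sketch gives a Power of about 900 and a Throughput of about 514: the one long query pulls the arithmetic mean up more than it does the geometric mean.&lt;/p&gt; &lt;p&gt;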
The reason for the multiplication is that scores at different scales should get similar numbers; otherwise 10x larger scale would result roughly in 10x lower throughput with the BI queries.&lt;/p&gt; &lt;p&gt;The &lt;i&gt;Composite&lt;/i&gt; metric is the geometric mean of the &lt;i&gt;Power&lt;/i&gt; and &lt;i&gt;Throughput&lt;/i&gt; metrics. A complete report shows both &lt;i&gt;Power&lt;/i&gt; and &lt;i&gt;Throughput&lt;/i&gt; metrics, as well as individual query times for all queries. The rationale for using a geometric mean is to give an equal importance to long and short queries. Halving the execution time of either a long query or a short query will have the same effect on the metric. This is good for encouraging research into all aspects of query processing. On the other hand, real-life users are more interested in halving the time of queries that take one hour than of queries that take one second; therefore, the throughput metric considers run times.&lt;/p&gt; &lt;p&gt;Taking the geometric mean of the two metrics gives more weight to the lower of the two than an arithmetic mean, hence we pay more attention to the worse of the two.&lt;/p&gt; &lt;p&gt;Single-user and multi-user metrics are separate because of the relative importance of intra-query parallelization in BI workloads: There may not be large numbers of concurrent users, yet queries are still complex, and it is important to have maximum parallelization. 
Therefore the metric rewards single-user performance.&lt;/p&gt; &lt;p&gt;In the next post we will look at the use of this metric and the actual content of BSBM BI.&lt;/p&gt; &lt;h3&gt; &lt;i&gt;Benchmarks, Redux&lt;/i&gt; Series&lt;/h3&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1658&quot; id=&quot;link-id0x1b02d528&quot;&gt;Benchmarks, Redux (part 1): On RDF Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1660&quot; id=&quot;link-id0x1d65f740&quot;&gt;Benchmarks, Redux (part 2): A Benchmarking Story&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1663&quot; id=&quot;link-id0x1a797860&quot;&gt;Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1665&quot; id=&quot;link-id0x1d3538e0&quot;&gt;Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1667&quot; id=&quot;link-id0x1e566f60&quot;&gt;Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1669&quot; id=&quot;link-id0x1dedffd8&quot;&gt;Benchmarks, Redux (part 6): BSBM and I/O, continued&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1671&quot; id=&quot;link-id0x1eb11528&quot;&gt;Benchmarks, Redux (part 7): What Does BSBM Explore Measure?&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1673&quot; id=&quot;link-id0x1db46c38&quot;&gt;Benchmarks, Redux (part 8): BSBM Explore and Update &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1675&quot; id=&quot;link-id0x1c8174e8&quot;&gt;Benchmarks, Redux (part 9): BSBM With 
Cluster&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1677&quot; id=&quot;link-id0x1dfa9338&quot;&gt;Benchmarks, Redux (part 10): LOD2 and the Benchmark Process&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1678&quot; id=&quot;link-id0x1e6dd7b0&quot;&gt;Benchmarks, Redux (part 11): The Substance of Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1d154bb0&quot;&gt;Benchmarks, Redux (part 12): Our Own BSBM Results Report&lt;/a&gt; &lt;/li&gt; &lt;li&gt; Benchmarks, Redux (part 13): BSBM BI Modifications &lt;i&gt;(this post)&lt;/i&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1f242ae0&quot;&gt;Benchmarks, Redux (part 14): BSBM BI Mix &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1ebf2f98&quot;&gt;Benchmarks, Redux (part 15): BSBM Test Driver Enhancements &lt;/a&gt; &lt;/li&gt; &lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2011-03-10#1680</atom:id>
  <atom:published>2011-03-10T23:30:11Z</atom:published>
  <atom:updated>2011-03-14T19:37:28-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;Let us talk about what ought to be benchmarked in the context of &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x2a84d3c0&quot;&gt;RDF&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;A point that often gets brought up by RDF-ers when talking about benchmarks is that there already exist systems which perform very well at &lt;a class=&quot;auto-href&quot; href=&quot;http://www.tpc.org/&quot; id=&quot;link-id0x2a9758e8&quot;&gt;TPC&lt;/a&gt;-&lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/TPC-H&quot; id=&quot;link-id0x2a8fa2a0&quot;&gt;H&lt;/a&gt; and similar workloads, and therefore there is no need for RDF to go there. It is, as it were, somebody else&amp;#39;s problem; besides, it is a solved one.&lt;/p&gt; &lt;p&gt;On the other hand, being able to express what is generally expected of a query language might not be a core competence or a competitive edge, but it certainly is a checklist item.&lt;/p&gt; &lt;p&gt; &lt;a class=&quot;auto-href&quot; href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x29c75a30&quot;&gt;BSBM&lt;/a&gt; seems to have been adopted as a de facto RDF benchmark, as there indeed is almost nothing else. But we should not lose sight of the fact that this is in fact a relational &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Database_schema&quot; id=&quot;link-id0x2a0565b8&quot;&gt;schema&lt;/a&gt; and workload that has just been straightforwardly transformed to RDF. BSBM was made, after all, in part for measuring RDB to RDF mapping. Thus BSBM is no more RDF-ish than a trivially RDF-ized TPC-H would be. TPC-H is, however, a bit more difficult, if also better thought out, than the BSBM BI Mix proposal. 
But I do not expect an RDF audience to have any enthusiasm for this as this is indeed a very tough race by now, and besides one in which RDB and &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x29c44d50&quot;&gt;SQL&lt;/a&gt; will keep some advantage. However, using this as a validation test is meaningful, as there exists a validation dataset and queries that we already have RDF-ized. We could publish these and call this &amp;quot;RDF-H&amp;quot;. &lt;/p&gt; &lt;p&gt;In the following I will outline what would constitute an RDF-friendly, scientifically interesting benchmark. The points are in part based on discussions with &lt;a class=&quot;auto-href&quot; href=&quot;http://nl.linkedin.com/in/peterboncz&quot; id=&quot;link-id0x2ac282f0&quot;&gt;Peter Boncz&lt;/a&gt; of &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/National_Research_Institute_for_Mathematics_and_Computer_Science&quot; id=&quot;link-id0x2a1c9e10&quot;&gt;CWI&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;The &lt;a class=&quot;auto-href&quot; href=&quot;http://www.w3.org/wiki/Social_Network_Intelligence_BenchMark&quot; id=&quot;link-id0x29e7d3d8&quot;&gt;Social Network Intelligence Benchmark&lt;/a&gt; (&lt;a class=&quot;auto-href&quot; href=&quot;http://www.w3.org/wiki/Social_Network_Intelligence_BenchMark&quot; id=&quot;link-id0x2a70e3c0&quot;&gt;SNIB&lt;/a&gt;) takes the social web Facebook-style schema Ivan Mikhailov and I made last year under the name of Botnet BM. 
In &lt;a class=&quot;auto-href&quot; href=&quot;http://lod2.eu/&quot; id=&quot;link-id0x2a9a70f0&quot;&gt;LOD2&lt;/a&gt;, CWI is presently working on this.&lt;/p&gt; &lt;p&gt;The &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x2ad04408&quot;&gt;data&lt;/a&gt; includes &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/DBpedia&quot; id=&quot;link-id0x29d5eeb0&quot;&gt;DBpedia&lt;/a&gt; as a base component used for providing conversation topics, &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x2ac97c40&quot;&gt;information&lt;/a&gt; about geographical locales of simulated users, etc. DBpedia is not very large, around 200M-300M triples, but it is diverse enough.&lt;/p&gt; &lt;p&gt;The data will have correlations, e.g., people who talk about sports tend to know other people who talk about the same sport, and they are more likely to know people from their geographical area than from elsewhere. &lt;/p&gt; &lt;p&gt;The bulk of the data consists of a rich history of interactions including messages to individuals and groups, linking to people, dropping links, joining and leaving groups, and so forth. The messages are tagged using real-world concepts from DBpedia, and there is correlation between tagging and textual content since both are generated from DBpedia articles. Since there is such correlation, &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Natural_language_processing&quot; id=&quot;link-id0x2ac359c0&quot;&gt;NLP&lt;/a&gt; techniques like &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Entity&quot; id=&quot;link-id0x2a1c8ed0&quot;&gt;entity&lt;/a&gt; and relationship extraction can be used with the data even though this is not the primary thrust of SNIB.&lt;/p&gt; &lt;p&gt;There is variation in frequency of online interaction, and this interaction consists of sessions. 
For example, one could analyze user behavior per time of day for online ad placement.&lt;/p&gt; &lt;p&gt;The data probably should include propagating memes, fashions, and trends that travel on the social network. With this, one could query about their origin and speed of propagation.&lt;/p&gt; &lt;p&gt;There should probably be cases of duplicate identities in the data, i.e., one real person using many online accounts to push an agenda. Resolving duplicate identities makes for nice queries.&lt;/p&gt; &lt;p&gt;Ragged data with half-filled profiles and misspelled identifiers like person and place names are a natural part of the social web use case. The data generator should take this into account.&lt;/p&gt; &lt;ul&gt; &lt;li&gt; &lt;p&gt;Distribution of popularity and activity should follow a power-law-like pattern; actual measures of popularity can be sampled from existing social networks even though large quantities of data cannot easily be extracted.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;The dataset should be predictably scalable. For the workload considered, the relative importance of the queries or other measured tasks should not change dramatically with the scale.&lt;/p&gt; &lt;/li&gt; &lt;/ul&gt; &lt;p&gt;For example, some queries are logarithmic to data size (e.g., find connections to a person), some are linear (e.g., find average online time of sports fans on Sundays), and some are quadratic or worse (e.g., find two extremists of the same ideology that are otherwise unrelated). Making a single metric from such parts may not be meaningful. Therefore, SNIB might be structured into different workloads.&lt;/p&gt; &lt;p&gt;The first would be an online mix with typically short lookups and updates, around &lt;code&gt;O ( log ( n ) )&lt;/code&gt;. &lt;/p&gt; &lt;p&gt;The Business Intelligence Mix would be composed of queries around &lt;code&gt;O ( n log ( n ) )&lt;/code&gt;. 
Even so, with real data, choice of parameters will produce dramatic changes in query run-time. Therefore a run should be specified to have a predictable distribution of &amp;quot;hard&amp;quot; and &amp;quot;easy&amp;quot; parameter choices. In the BSBM BI mix modification, I did this by defining some queries to be drill-downs from a more general to a more specific level of a hierarchy. This could be done here too in some cases; other cases would have to be defined with buckets of values. &lt;/p&gt; &lt;p&gt;Both the real world and LOD2 are largely concerned with data integration. The SNIB workload can have aspects of this, for example, in resolving duplicate identities. These operations are more complex than typical database queries, as the attributes used for joining might not even match in the initial data.&lt;/p&gt; &lt;p&gt;One characteristic of these is the production of sometimes large intermediate results that need to be materialized. Doing these operations in practice requires procedural control. Further, running algorithms like network analytics (e.g., PageRank, centrality, etc.) involves aggregation of intermediate results that is not very well expressible in a query language. Some basic graph operations like shortest path are expressible in a query language, but not in unextended &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x29d26588&quot;&gt;SPARQL&lt;/a&gt; 1.1, since these would, for example, involve returning paths, which are explicitly excluded from the spec.&lt;/p&gt; &lt;p&gt;These are however the areas where we need to go for a benchmark that is more than a repackaging of a relational BI workload.&lt;/p&gt; &lt;p&gt;We find that such a workload will have procedural sections either in application code or stored procedures. Map-reduce is sometimes used for scaling these. As one would expect, many cluster databases have their own version of these control structures. 
Therefore some of the SNIB workload could even be implemented as map-reduce jobs alongside parallel database implementations. We might here touch base with the &lt;a class=&quot;auto-href&quot; href=&quot;http://www.larkc.eu/&quot; id=&quot;link-id0x29b69640&quot;&gt;LarKC&lt;/a&gt; map-reduce work to see if it could be applied to SNIB workloads. &lt;/p&gt; &lt;p&gt;We see a three-level structure emerging. There is an &lt;i&gt;Online&lt;/i&gt; mix which is a bit like the BSBM &lt;i&gt;Explore&lt;/i&gt; mix, and an &lt;i&gt;Analytics&lt;/i&gt; mix which is on the same order of complexity as TPC-H. These may have a more-or-less fixed query formulation and test driver. Beyond these, yet working on the same data, we have a set of &lt;i&gt;Predefined Tasks&lt;/i&gt; which the test sponsor may implement in a manner of their choice.&lt;/p&gt; &lt;p&gt;We would finally get to the &amp;quot;raging conflict&amp;quot; between the &amp;quot;declarativists&amp;quot; and the &amp;quot;map reductionists.&amp;quot; Last year&amp;#39;s VLDB had a lot of map-reduce papers. 
I know of comparisons between &lt;a class=&quot;auto-href&quot; href=&quot;http://www.vertica.com/&quot; id=&quot;link-id0x2a8c4510&quot;&gt;Vertica&lt;/a&gt; and map reduce for doing a fairly simple SQL query on a lot of data, but here we would be talking about much more complex jobs on more interesting (i.e., less uniform) data.&lt;/p&gt; &lt;p&gt;We might even interest some of the cluster &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Relational_database_management_system&quot; id=&quot;link-id0x2995aaa8&quot;&gt;RDBMS&lt;/a&gt; players (&lt;a class=&quot;auto-href&quot; href=&quot;http://www.teradata.com/&quot; id=&quot;link-id0x29c9af10&quot;&gt;Teradata&lt;/a&gt;, Vertica, &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Greenplum&quot; id=&quot;link-id0x29c9af38&quot;&gt;Greenplum&lt;/a&gt;, &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/page/Oracle_Exadata&quot; id=&quot;link-id0x29d48b78&quot;&gt;Oracle Exadata&lt;/a&gt;, &lt;a class=&quot;auto-href&quot; href=&quot;http://www.paraccel.com/&quot; id=&quot;link-id0x29d48ba0&quot;&gt;ParAccel&lt;/a&gt;, and/or &lt;a class=&quot;auto-href&quot; href=&quot;http://www.asterdata.com/&quot; id=&quot;link-id0x29bf8fb0&quot;&gt;Aster Data&lt;/a&gt;, to name a few) in running this workload using their map-reduce analogs.&lt;/p&gt; &lt;p&gt;We see that as we get to topics beyond relational BI, we do not find ourselves in an RDF-only world but very much at a crossroads of many technologies, e.g., map-reduce and its database analogs, various custom built databases, graph libraries, data integration and cleaning tools, and so forth.&lt;/p&gt; &lt;p&gt;There is not, nor ought there to be, a sheltered, RDF-only enclave. RDF will have to justify itself in a world of alternatives.&lt;/p&gt; &lt;p&gt;This must be reflected in our benchmark development, so relational BI is not irrelevant; in fact, it is what everybody does. 
RDF cannot be a total failure at this, even if this were not RDF&amp;#39;s claim to fame. The claim to fame comes after we pass this stage, which is what we intend to explore in SNIB.&lt;/p&gt; &lt;h3&gt; &lt;i&gt;Benchmarks, Redux&lt;/i&gt; Series&lt;/h3&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1658&quot; id=&quot;link-id0x1c9f7ab8&quot;&gt;Benchmarks, Redux (part 1): On RDF Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1660&quot; id=&quot;link-id0x1dd17b28&quot;&gt;Benchmarks, Redux (part 2): A Benchmarking Story&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1663&quot; id=&quot;link-id0x1eb20620&quot;&gt;Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1665&quot; id=&quot;link-id0x1f8a5ae8&quot;&gt;Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1667&quot; id=&quot;link-id0x1ac14a08&quot;&gt;Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1669&quot; id=&quot;link-id0x1d1f8d58&quot;&gt;Benchmarks, Redux (part 6): BSBM and I/O, continued&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1671&quot; id=&quot;link-id0x1ea83308&quot;&gt;Benchmarks, Redux (part 7): What Does BSBM Explore Measure?&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1673&quot; id=&quot;link-id0x1b548028&quot;&gt;Benchmarks, Redux (part 8): BSBM Explore and Update &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1675&quot; id=&quot;link-id0x1c3d9c58&quot;&gt;Benchmarks, Redux (part 9): 
BSBM With Cluster&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1677&quot; id=&quot;link-id0x1f5e6978&quot;&gt;Benchmarks, Redux (part 10): LOD2 and the Benchmark Process&lt;/a&gt; &lt;/li&gt; &lt;li&gt; Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks &lt;i&gt;(this post)&lt;/i&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1c082a28&quot;&gt;Benchmarks, Redux (part 12): Our Own BSBM Results Report&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1ec73578&quot;&gt;Benchmarks, Redux (part 13): BSBM BI Modifications &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1eb25d48&quot;&gt;Benchmarks, Redux (part 14): BSBM BI Mix &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1b261958&quot;&gt;Benchmarks, Redux (part 15): BSBM Test Driver Enhancements &lt;/a&gt; &lt;/li&gt; &lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Benchmarks, Redux (part 10): LOD2 and the Benchmark Process</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2011-03-10#1679</atom:id>
  <atom:published>2011-03-10T23:29:41Z</atom:published>
  <atom:updated>2011-03-14T19:37:14.000001-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;I have in the previous posts generally argued for and demonstrated the usefulness of benchmarks.&lt;/p&gt; &lt;p&gt;Here I will talk about how this could be organized in a way that is tractable, and takes vendor and end user interests into account. These are my views on the subject and do not represent a &lt;a class=&quot;auto-href&quot; href=&quot;http://lod2.eu/&quot; id=&quot;link-id0x2acb0760&quot;&gt;LOD2&lt;/a&gt; members consensus, but have been discussed in the consortium. &lt;/p&gt; &lt;p&gt;My colleague Ivan Mikhailov once proposed that the only way to get benchmarks run right is to package them as a single script that does everything, like instant noodles -- just add water! But even instant noodles can be abused: Cook too long, add too much water, maybe forget to light the stove, and complain that the result is unsatisfyingly hard and brittle, lacking the suppleness one has grown to expect from this delicacy. No, the answer lies at the other end of the culinary spectrum, in gourmet cooking. Let the best cooks show what they can do, and let them work at it; let those who in fact have capacity and motivation for creating &lt;i&gt;le chef d&amp;#39;oeuvre culinaire&lt;/i&gt; (&amp;quot;the culinary masterpiece&amp;quot;) create it. Even so, there are many value points along the dimensions of preparation time, cost, and esthetic layout, not to forget taste and nutritional values. Indeed, an intimate &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id0x2aca6a30&quot;&gt;knowledge&lt;/a&gt; &lt;i&gt;de la vie secrete du canard&lt;/i&gt; (&amp;quot;the secret life of duck&amp;quot;) is required in order to liberate the aroma that it might take flight and soar. 
In the previous, I have shed some light on how we prepare &lt;i&gt;le canard&lt;/i&gt;, and if &lt;i&gt;le canard&lt;/i&gt; be such then &lt;i&gt;la dinde&lt;/i&gt; (turkey) might in some ways be analogous; who is to say?&lt;/p&gt; &lt;p&gt;In other words, as a vendor, we want to have complete control over the benchmarking process, and have it take place in our environment at a time of our choice. In exchange for this, we are ready to document and observe possibly complicated rules, document how the runs are made, and let others monitor and repeat them on the equipment on which the results are obtained. This is the &lt;a class=&quot;auto-href&quot; href=&quot;http://www.tpc.org/&quot; id=&quot;link-id0x2b847818&quot;&gt;TPC&lt;/a&gt; (Transaction Processing Performance Council) model.&lt;/p&gt; &lt;p&gt;Another culture of doing benchmarks is the periodic challenge model used in TREC, the &lt;a class=&quot;auto-href&quot; href=&quot;http://challenge.semanticweb.org/&quot; id=&quot;link-id0x2ac3a6f8&quot;&gt;Billion Triples Challenge&lt;/a&gt;, the Semantic Search Challenge and others. In this model, vendors prepare the benchmark submission and agree to joint publication.&lt;/p&gt; &lt;p&gt;A third party performing benchmarks by itself is uncommon in databases. Licenses even often explicitly prohibit this, for understandable reasons.&lt;/p&gt; &lt;p&gt;The LOD2 project has an outreach activity called Publink where we offer to help owners of &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x2aea5930&quot;&gt;data&lt;/a&gt; to publish it as &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x2a790128&quot;&gt;Linked Data&lt;/a&gt;. 
Similarly, since FP 7s are supposed to offer a visible service to their communities, I proposed that LOD2 offer to serve a role in disseminating and auditing &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x29babb00&quot;&gt;RDF&lt;/a&gt; store benchmarks.&lt;/p&gt; &lt;p&gt;One representative of an RDF store vendor I talked to, in relation to setting up a benchmark configuration of their product, told me that we could do this and that they would give some advice but that such an exercise was by its nature fundamentally flawed and could not possibly produce worthwhile results. The reason for this was that OpenLink engineers could not possibly learn enough about the other products nor unlearn enough of their own to make this a meaningful comparison.&lt;/p&gt; &lt;p&gt;Isn&amp;#39;t this the very truth? Let the chefs mix their own spices.&lt;/p&gt; &lt;p&gt;This does not mean that there would not be comparability of results. If the benchmarks and processes are well defined, documented, and checked by a third party, these can be considered legitimate and not just one-off best-case results without further import.&lt;/p&gt; &lt;p&gt;In order to stretch the envelope, which is very much a LOD2 goal, this benchmarking should be done on a variety of equipment -- whatever works best at the scale in question. Increasing the scale remains a stated objective. LOD2 even promised to run things with a trillion triples in another 3 years. &lt;/p&gt; &lt;p&gt;Imagine that the unimpeachably impartial Berliners made house calls. Would this debase Justice to be a servant of mere show-off? Or would this on the contrary combine strict Justice with edifying Charity? 
Who indeed is in greater need of the light of objective evaluation than the vendor whose very nature makes it a being of bias and prejudice?&lt;/p&gt; &lt;p&gt;Even better, &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/National_Research_Institute_for_Mathematics_and_Computer_Science&quot; id=&quot;link-id0x2a21d108&quot;&gt;CWI&lt;/a&gt;, with its &lt;a href=&quot;http://monetdb.cwi.nl/Development/Research/Articles/&quot; id=&quot;link-id0x1d6479d0&quot;&gt;stellar database pedigree&lt;/a&gt;, agreed in principle to audit RDF benchmarks in LOD2. &lt;/p&gt; &lt;p&gt;In this way, one could get a stamp of approval for one&amp;#39;s results regardless of when they were produced, and be free of the arbitrary schedule of third-party benchmarking runs. On the relational side, this is a process of some cost and complexity, but since the RDF side is still young and more on mutually friendly terms, the process can be somewhat lighter here. I did promise to draft some extra descriptions of process and result disclosure so that we could see how this goes.&lt;/p&gt; &lt;p&gt;We could even do this unilaterally -- just publish &lt;a class=&quot;auto-href&quot; href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x2a0d73d8&quot;&gt;Virtuoso&lt;/a&gt; results according to a predefined reporting and verification format. If others wished to publish by the same rules, LOD2 could use some of the benchmarking funds for auditing the proceedings. This could all take place over the &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/.NET_Framework&quot; id=&quot;link-id0x2a6b44a0&quot;&gt;net&lt;/a&gt;, so we are not talking about any huge cost or prohibitive amount of trouble. 
It would be in the FP7 spirit that LOD2 provide this service for free, naturally within reason.&lt;/p&gt; &lt;p&gt;Then there is the matter of the &lt;a class=&quot;auto-href&quot; href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x2a1722a8&quot;&gt;BSBM&lt;/a&gt; Business Intelligence (BI) mix. At present, it seems everybody has chosen to defer the matter to another round of BSBM runs in the summer. This seems to fit the pattern of a public challenge with a few months given for contenders to prepare their submissions. Here we certainly should look at bigger scales and more diverse hardware than in the Berlin runs published this time around. The BI workload is in fact fairly cluster friendly, with big joins and aggregations that parallelize well. There it would definitely make sense to reserve an actual cluster, and have all contenders set up their gear on it. If all have access to the run environment and to monitoring tools, we can be reasonably sure that things will be done in a transparent manner. &lt;/p&gt; &lt;p&gt;(I will talk about the BI mix in more detail in &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1dfcc038&quot;&gt;part 13&lt;/a&gt; and &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1edaa388&quot;&gt;part 14&lt;/a&gt; of this series.)&lt;/p&gt; &lt;p&gt;Once the BI mix has settled and there are a few interoperable implementations, likely in the summer, we could pass from the challenge model to a situation where vendors may publish results as they become available, with LOD2 offering its services for audit. &lt;/p&gt; &lt;p&gt;Of course, this could be done even before then, but the content of the mix might not be settled. 
We likely need to check it on a few implementations first.&lt;/p&gt; &lt;p&gt;For equipment, people can use their own, or LOD2 partners might on a case-by-case basis make some equipment available for running on the same hardware on which, say, the Virtuoso results were obtained. For example, FU Berlin could give people a login to get their recently published results fixed. Now this might or might not happen, so I will not hold my breath waiting for this but instead close with a proposal.&lt;/p&gt; &lt;p&gt;As a unilateral diplomatic overture, I put forth the following: If other vendors are interested in a 1:1 comparison of their results with our publications, we can offer them a login to the same equipment. They can set up and tune their systems, and perform the runs. We will just watch. As an extra quid pro quo, they can try Virtuoso as configured for the results we have published, with the same data. Like this, both parties get to see the other&amp;#39;s technology with proper tuning and installation. What, if anything, is reported about this activity is up to the owner of the technology being tested. We will publish a set of benchmark rules that can serve as a guideline for mutually comparable reporting, but we cannot force anybody to use these. This all will function as a catalyst for technological advance, all to the ultimate benefit of the end user. If you wish to take advantage of this offer, you may contact &lt;a href=&quot;mailto:hwilliams@openlinksw.com?subject=Collaborative RDF Benchmark&quot; id=&quot;link-id0x1c071100&quot;&gt;Hugh Williams&lt;/a&gt; at OpenLink Software, and we will see how this can be arranged in practice. &lt;/p&gt; &lt;p&gt;The next post will talk about the &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1678&quot; id=&quot;link-id0x19933fd8&quot;&gt;actual content of benchmarks&lt;/a&gt;. 
The milestone after this will be when we publish the measurement and reporting protocols.&lt;/p&gt; &lt;h3&gt; &lt;i&gt;Benchmarks, Redux&lt;/i&gt; Series&lt;/h3&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1658&quot; id=&quot;link-id0x1c554800&quot;&gt;Benchmarks, Redux (part 1): On RDF Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1660&quot; id=&quot;link-id0x1ec159e8&quot;&gt;Benchmarks, Redux (part 2): A Benchmarking Story&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1663&quot; id=&quot;link-id0x1dd5eb10&quot;&gt;Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1665&quot; id=&quot;link-id0x18f05940&quot;&gt;Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1667&quot; id=&quot;link-id0x1ed5ef10&quot;&gt;Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1669&quot; id=&quot;link-id0x1e9cb130&quot;&gt;Benchmarks, Redux (part 6): BSBM and I/O, continued&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1671&quot; id=&quot;link-id0x1dfa79d8&quot;&gt;Benchmarks, Redux (part 7): What Does BSBM Explore Measure?&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1673&quot; id=&quot;link-id0x1eb6f478&quot;&gt;Benchmarks, Redux (part 8): BSBM Explore and Update &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1675&quot; id=&quot;link-id0x1de5a918&quot;&gt;Benchmarks, Redux (part 9): BSBM With Cluster&lt;/a&gt; &lt;/li&gt; &lt;li&gt; Benchmarks, Redux (part 10): LOD2 and the 
Benchmark Process &lt;i&gt;(this post)&lt;/i&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1678&quot; id=&quot;link-id0x1dae9060&quot;&gt;Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1f45fa10&quot;&gt;Benchmarks, Redux (part 12): Our Own BSBM Results Report&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1f49d2b8&quot;&gt;Benchmarks, Redux (part 13): BSBM BI Modifications &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1e68e4c8&quot;&gt;Benchmarks, Redux (part 14): BSBM BI Mix &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1e353858&quot;&gt;Benchmarks, Redux (part 15): BSBM Test Driver Enhancements &lt;/a&gt; &lt;/li&gt; &lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Benchmarks, Redux (part 9): BSBM With Cluster</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2011-03-09#1676</atom:id>
  <atom:published>2011-03-09T22:54:50Z</atom:published>
  <atom:updated>2011-03-14T19:36:58-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;This post is dedicated to our brothers in horizontal partitioning (or sharding), &lt;a class=&quot;auto-href&quot; href=&quot;http://freebase.com/guid/9202a8c04000641f8000000005c908d6&quot; id=&quot;link-id0x2a1e9010&quot;&gt;Garlik&lt;/a&gt; and &lt;a class=&quot;auto-href&quot; href=&quot;http://www.systap.com/bigdata.htm&quot; id=&quot;link-id0x2acd5218&quot;&gt;Bigdata&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;At first sight, the &lt;a class=&quot;auto-href&quot; href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x2bb33648&quot;&gt;BSBM&lt;/a&gt; &lt;i&gt;Explore&lt;/i&gt; mix appears very cluster-unfriendly, as it contains short queries that access &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x2b8fffb8&quot;&gt;data&lt;/a&gt; at random. There is every opportunity for latency and few opportunities for parallelism.&lt;/p&gt; &lt;p&gt;For this reason we had not even run the BSBM mix with &lt;a class=&quot;auto-href&quot; href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x2a84b780&quot;&gt;Virtuoso&lt;/a&gt; Cluster. We were not surprised to learn that &lt;a href=&quot;http://steveharris.tumblr.com/post/3453040647/bsbm-v3-post-mortem&quot; id=&quot;link-id0x1c4ef8d8&quot;&gt;Garlik hadn&amp;#39;t run BSBM either&lt;/a&gt;. 
We have understood from &lt;a class=&quot;auto-href&quot; href=&quot;http://www.systap.com/&quot; id=&quot;link-id0x2ad3d050&quot;&gt;Systap&lt;/a&gt; that their Bigdata BSBM experiments were on a single-process configuration.&lt;/p&gt; &lt;p&gt;But the 4Store results in the &lt;a href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/results/V6/index.html&quot; id=&quot;link-id0x1f8090f8&quot;&gt;recent Berlin report&lt;/a&gt; were obtained with a distributed setup, as 4Store always runs a multiprocess configuration, even on a single server. It therefore seemed interesting to us to see how Virtuoso Cluster compares with Virtuoso Single on this workload. These tests were run on a different box than the recent BSBM tests, so those 4Store figures are not directly comparable.&lt;/p&gt; &lt;p&gt;The setup here consists of 8 partitions, each managed by its own process, all running on the same box. Any of these processes can have its own &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Hypertext_Transfer_Protocol&quot; id=&quot;link-id0x2ac28380&quot;&gt;HTTP&lt;/a&gt; and &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x2bba8720&quot;&gt;SQL&lt;/a&gt; listeners and can provide the same service. Most access to data goes over the interconnect, except when the data is co-resident in the process which is coordinating the query. 
The interconnect is Unix domain sockets since all 8 processes are on the same box.&lt;/p&gt; &lt;table border=&quot;1&quot; cellspacing=&quot;2&quot; cellpadding=&quot;2&quot; align=&quot;center&quot; width=&quot;90%&quot;&gt; &lt;tr&gt; &lt;th colspan=&quot;4&quot; align=&quot;center&quot;&gt;6 Cluster - Load Rates and Times&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;Scale&lt;/th&gt; &lt;th align=&quot;center&quot;&gt;Rate &lt;br /&gt; (quads per second)&lt;/th&gt; &lt;th align=&quot;center&quot;&gt;Load time &lt;br /&gt; (seconds)&lt;/th&gt; &lt;th align=&quot;center&quot;&gt;Checkpoint time &lt;br /&gt; (seconds)&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;100 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 119,204 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 749 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 89 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;200 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 121,607 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 1486 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 157 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;1000 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 102,694 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 8737 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 979 &lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;br /&gt; &lt;table border=&quot;1&quot; cellspacing=&quot;2&quot; cellpadding=&quot;2&quot; align=&quot;center&quot; width=&quot;90%&quot;&gt; &lt;tr&gt; &lt;th colspan=&quot;4&quot; align=&quot;center&quot;&gt;6 Single - Load Rates and Times&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;Scale&lt;/th&gt; &lt;th align=&quot;center&quot;&gt;Rate &lt;br /&gt; (quads per second)&lt;/th&gt; &lt;th align=&quot;center&quot;&gt;Load time &lt;br /&gt; (seconds)&lt;/th&gt; &lt;th align=&quot;center&quot;&gt;Checkpoint time &lt;br /&gt; (seconds)&lt;/th&gt; &lt;/tr&gt; 
&lt;tr&gt; &lt;th align=&quot;center&quot;&gt;100 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 74,713 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 1192 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 145 &lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;p&gt;The load times are systematically better than for 6 Single. This is also not bad compared to the 7 Single vectored load rates of 220 Kt/s or so. We note that loading is a cluster friendly operation, going at a steady 1400+% &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Central_processing_unit&quot; id=&quot;link-id0x296b03b8&quot;&gt;CPU&lt;/a&gt; utilization with an aggregate message throughput of 40MB/s. 7 Single is faster because of vectoring at the index level, not because the clusters were hitting communication overheads. 6 Cluster is faster than 6 Single because scale-out in this case diminishes contention, even on a single box.&lt;/p&gt; &lt;p&gt;Throughput is as follows:&lt;/p&gt; &lt;table border=&quot;1&quot; cellspacing=&quot;2&quot; cellpadding=&quot;2&quot; align=&quot;center&quot; width=&quot;90%&quot;&gt; &lt;tr&gt; &lt;th colspan=&quot;3&quot; align=&quot;center&quot;&gt; 6 Cluster - Throughput &lt;br /&gt; (QMpH, query mixes per hour) &lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;Scale&lt;/th&gt; &lt;th align=&quot;center&quot;&gt; Single User &lt;/th&gt; &lt;th align=&quot;center&quot;&gt; 16 User &lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;100 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 7318 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 43120 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;200 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 6222 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 29981 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;1000 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 2526 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 11156 
&lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;br /&gt; &lt;table border=&quot;1&quot; cellspacing=&quot;2&quot; cellpadding=&quot;2&quot; align=&quot;center&quot; width=&quot;90%&quot;&gt; &lt;tr&gt; &lt;th colspan=&quot;3&quot; align=&quot;center&quot;&gt; 6 Single - Throughput &lt;br /&gt; (QMpH, query mixes per hour) &lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;Scale&lt;/th&gt; &lt;th align=&quot;center&quot;&gt; Single User &lt;/th&gt; &lt;th align=&quot;center&quot;&gt; 16 User &lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;100 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 7641 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 29433 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;200 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 6017 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 13335 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;1000 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 1770 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 2487 &lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;p&gt;Below is a snapshot of status during the 6 Cluster 100 Mt run.&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt;
Cluster 8 nodes, 15 s. 25784 m/s 25682 KB/s 1160% cpu 0% read 740% clw threads 18r 0w 10i buffers 1133459 12 d 4 w 0 pfs
cl 1: 10851 m/s 3911 KB/s 597% cpu 0% read 668% clw threads 17r 0w 10i buffers 143992 4 d 0 w 0 pfs
cl 2: 2194 m/s 7959 KB/s 107% cpu 0% read 9% clw threads 1r 0w 0i buffers 143616 3 d 2 w 0 pfs
cl 3: 2186 m/s 7818 KB/s 107% cpu 0% read 9% clw threads 0r 0w 0i buffers 140787 0 d 0 w 0 pfs
cl 4: 2174 m/s 2804 KB/s 77% cpu 0% read 10% clw threads 0r 0w 0i buffers 140654 0 d 2 w 0 pfs
cl 5: 2127 m/s 1612 KB/s 71% cpu 0% read 9% clw threads 0r 0w 0i buffers 140949 1 d 0 w 0 pfs
cl 6: 2060 m/s 544 KB/s 66% cpu 0% read 10% clw threads 0r 0w 0i buffers 141295 2 d 0 w 0 pfs
cl 7: 2072 m/s 517 KB/s 65% cpu 0% read 11% clw threads 0r 0w 0i buffers 141111 1 d 0 w 0 pfs
cl 8: 2105 m/s 522 KB/s 66% cpu 0% read 10% clw threads 0r 0w 0i buffers 141055 1 d 0 w 0 pfs
&lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;The main meters for cluster execution are the messages-per-second (m/s), the message volume (KB/s), and the total CPU% of the processes. &lt;/p&gt; &lt;p&gt;We note that CPU utilization is highly uneven and messages are short, about 1K on the average, compared to about 100K during the load. CPU would be evenly divided between the nodes if each got a share of the HTTP requests. We changed the test driver to round-robin requests between multiple end points. The work does then get evenly divided, but the speed is not affected. Also, this does not improve the message sizes since the workload consists mostly of short lookups. However, with the processes spread over multiple servers, the round-robin would be essential for CPU and especially for interconnect throughput. &lt;/p&gt; &lt;p&gt;Then we try 6 Cluster at 1000 Mt. For Single User, we get 1180 m/s, 6955 KB/s, and 173% cpu. 
For 16 User, this is 6573 m/s, 44366 KB/s, 1470% cpu.&lt;/p&gt; &lt;p&gt;This is a lot better than the figures with 6 Single, due to lower contention on the index tree, as discussed in &lt;i&gt;&lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1660&quot; id=&quot;link-id0x1e9a0b58&quot;&gt;A Benchmarking Story&lt;/a&gt;&lt;/i&gt;. Single User throughput on 6 Cluster also outperforms 6 Single, thanks to the natural parallelism of running the Q5 joins concurrently in each partition. The larger the scale, the more weight this has in the metric. We see this also in the average message size: compared with the 16 User snapshot at 100 Mt, the KB/s throughput is almost double while the messages/s is a bit under a third.&lt;/p&gt; &lt;p&gt;The small-scale 6 Cluster run is about even with the 6 Single figure. Looking at the details, we see that the qps for Q1 in 6 Cluster is half of that on 6 Single, whereas the qps for Q5 on 6 Cluster is about double that of the 6 Single. This is as one might expect; longer queries are favored, and single-row lookups are penalized.&lt;/p&gt; &lt;p&gt;Looking further at the 6 Cluster status, we see the cluster wait (&lt;code&gt;clw&lt;/code&gt;) to be 740%. For 16 Users, i.e., out of a possible 1600%, this means that about half of the execution real time is spent waiting for responses from other partitions. A high figure means uneven distribution between partitions; a low figure means even distribution. This is as expected, since many queries are concerned with just one S and its related objects.&lt;/p&gt; &lt;p&gt;We will update this section once 7 Cluster is ready. 
This will implement vectored execution and column store inside the cluster nodes.&lt;/p&gt; &lt;h3&gt; &lt;i&gt;Benchmarks, Redux&lt;/i&gt; Series&lt;/h3&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1658&quot; id=&quot;link-id0x1d7894d0&quot;&gt;Benchmarks, Redux (part 1): On RDF Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1660&quot; id=&quot;link-id0x1e434888&quot;&gt;Benchmarks, Redux (part 2): A Benchmarking Story&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1663&quot; id=&quot;link-id0x1f6b5260&quot;&gt;Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1665&quot; id=&quot;link-id0x1dd29460&quot;&gt;Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1667&quot; id=&quot;link-id0x1f0d78b8&quot;&gt;Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1669&quot; id=&quot;link-id0x1f9a9670&quot;&gt;Benchmarks, Redux (part 6): BSBM and I/O, continued&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1671&quot; id=&quot;link-id0x1c055370&quot;&gt;Benchmarks, Redux (part 7): What Does BSBM Explore Measure?&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1673&quot; id=&quot;link-id0x1dc06cd0&quot;&gt;Benchmarks, Redux (part 8): BSBM Explore and Update &lt;/a&gt; &lt;/li&gt; &lt;li&gt; Benchmarks, Redux (part 9): BSBM With Cluster &lt;i&gt;(this post)&lt;/i&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1677&quot; id=&quot;link-id0x18f04db0&quot;&gt;Benchmarks, Redux (part 10): 
LOD2 and the Benchmark Process&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1678&quot; id=&quot;link-id0x1ee729b8&quot;&gt;Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1e2e76b8&quot;&gt;Benchmarks, Redux (part 12): Our Own BSBM Results Report&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1d75ef48&quot;&gt;Benchmarks, Redux (part 13): BSBM BI Modifications &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1ee518c0&quot;&gt;Benchmarks, Redux (part 14): BSBM BI Mix &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1d9244b0&quot;&gt;Benchmarks, Redux (part 15): BSBM Test Driver Enhancements &lt;/a&gt; &lt;/li&gt; &lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Benchmarks, Redux (part 8): BSBM Explore and Update</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2011-03-09#1674</atom:id>
  <atom:published>2011-03-09T17:32:47Z</atom:published>
  <atom:updated>2011-03-15T17:18:32-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;We will here look at the &lt;i&gt;Explore and Update&lt;/i&gt; scenario of &lt;a class=&quot;auto-href&quot; href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x1c064218&quot;&gt;BSBM&lt;/a&gt;. This presents us with a novel problem as the specification does not address any aspect of &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/ACID&quot; id=&quot;link-id0x1c1852b0&quot;&gt;ACID&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;A transaction benchmark ought to have something to say about this. The &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/page/SPARUL&quot; id=&quot;link-id0x1dbca228&quot;&gt;SPARUL&lt;/a&gt; (also known as &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x1eaa4fd0&quot;&gt;SPARQL&lt;/a&gt;/&lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/page/SPARUL&quot; id=&quot;link-id0x1dd12bb0&quot;&gt;Update&lt;/a&gt;) language does not say anything about transactionality, but I suppose it is in the spirit of the SPARUL protocol to promise atomicity and durability.&lt;/p&gt; &lt;p&gt;We begin by running &lt;a class=&quot;auto-href&quot; href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1c5f4830&quot;&gt;Virtuoso&lt;/a&gt; 7 Single, with Single User and 16 User, each at scales of 100 Mt, 200 Mt, and 1000 Mt. The transactionality is default, meaning &lt;code&gt;SERIALIZABLE&lt;/code&gt; isolation between &lt;code&gt;INSERTs&lt;/code&gt; and &lt;code&gt;DELETEs&lt;/code&gt;, and &lt;code&gt;READ COMMITTED&lt;/code&gt; isolation between &lt;code&gt;READ&lt;/code&gt; and any &lt;code&gt;UPDATE&lt;/code&gt; transaction. 
(Figures for Virtuoso 6 will also be presented here in the near future, as they are the currently shipping production versions.)&lt;/p&gt; &lt;table border=&quot;1&quot; cellspacing=&quot;2&quot; cellpadding=&quot;2&quot; align=&quot;center&quot; width=&quot;90%&quot;&gt; &lt;tr&gt; &lt;th colspan=&quot;3&quot; align=&quot;center&quot;&gt; Virtuoso 7 Single, Full ACID &lt;br /&gt; (QMpH, query mixes per hour) &lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;Scale&lt;/th&gt; &lt;th align=&quot;center&quot;&gt; Single User &lt;/th&gt; &lt;th align=&quot;center&quot;&gt; 16 User &lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;100 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 9,969 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 65,537 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;200 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 8,646 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 40,527 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;1000 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 5,512 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 17,293 &lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;br /&gt; &lt;table border=&quot;1&quot; cellspacing=&quot;2&quot; cellpadding=&quot;2&quot; align=&quot;center&quot; width=&quot;90%&quot;&gt; &lt;tr&gt; &lt;th colspan=&quot;3&quot; align=&quot;center&quot;&gt; Virtuoso 6 Cluster, Full ACID &lt;br /&gt; (QMpH, query mixes per hour) &lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt; Scale &lt;/th&gt; &lt;th align=&quot;center&quot;&gt; Single User &lt;/th&gt; &lt;th align=&quot;center&quot;&gt; 16 User &lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt; 100 Mt &lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 5604.520 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 34079.019 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt; 1000 Mt &lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 
2866.616 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 10028.325 &lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;br /&gt; &lt;table border=&quot;1&quot; cellspacing=&quot;2&quot; cellpadding=&quot;2&quot; align=&quot;center&quot; width=&quot;90%&quot;&gt; &lt;tr&gt; &lt;th colspan=&quot;3&quot; align=&quot;center&quot;&gt; Virtuoso 6 Single, Full ACID &lt;br /&gt; (QMpH, query mixes per hour) &lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;Scale&lt;/th&gt; &lt;th align=&quot;center&quot;&gt; Single User &lt;/th&gt; &lt;th align=&quot;center&quot;&gt; 16 User &lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;100 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 7,152 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 21,065 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;200 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 5,862 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 16,895 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;1000 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 1,542 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 4,548 &lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;p&gt;Each run is preceded by a warm-up of 500 or 300 mixes (the exact number is not material), resulting in a warm &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Cache&quot; id=&quot;link-id0x1d4f13d8&quot;&gt;cache&lt;/a&gt;; see &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1669&quot; id=&quot;link-id0x1f8ac510&quot;&gt;previous post on read-ahead&lt;/a&gt; for details. All runs do 1000 &lt;i&gt;Explore and Update&lt;/i&gt; mixes. The initial database is in the state following the &lt;i&gt;Explore&lt;/i&gt; only runs.&lt;/p&gt; &lt;p&gt;The results are in line with the &lt;i&gt;Explore&lt;/i&gt; results. There is a fair amount of variability between consecutive runs; the 16 User run at 1000 Mt varies between 14K and 19K QMpH depending on the measurement. 
The smaller runs exhibit less variability.&lt;/p&gt; &lt;p&gt;In the following we will look at transactions and at how the definition of the workload and reporting could be made complete.&lt;/p&gt; &lt;p&gt;Full ACID means serializable semantics for concurrent inserts and deletes of the same quad. Non-transactional means that the result is undefined when concurrent inserts and deletes touch overlapping sets of quads. Furthermore, if one logged such &amp;quot;transactions,&amp;quot; the replay would be serialized although the initial execution was not, which confuses the issue further. Considering the hypothetical use case of an e-commerce information portal, there is little chance of deletes and inserts actually needing serialization. An insert-only workload does not need serializability because an insert cannot fail. If the &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1ec05c10&quot;&gt;data&lt;/a&gt; already exists, the insert does nothing; if the quad does not previously exist, it is created. The same applies to deletes alone. If a delete and an insert overlap, serialization would be needed, but the semantics implicit in the use case make this improbable.&lt;/p&gt; &lt;p&gt;Read-only transactions (i.e., the &lt;i&gt;Explore&lt;/i&gt; mix in the &lt;i&gt;Explore and Update&lt;/i&gt; scenario) will be run as &lt;code&gt;READ COMMITTED&lt;/code&gt;. These do not see uncommitted data and never block for lock wait. The reads may not be repeatable.&lt;/p&gt; &lt;p&gt;Our first port of call is to determine the cost of ACID. We run 1000 mixes of &lt;i&gt;Explore and Update&lt;/i&gt; at 1000 Mt. The throughput is 19214 QMpH after a warm-up of 500 mixes. 
This is pretty good in comparison with the diverse read-only results at this scale.&lt;/p&gt; &lt;p&gt;We look at the pertinent statistics:&lt;/p&gt; &lt;pre&gt; SELECT TOP 5 * FROM sys_l_stat ORDER BY waits DESC; &lt;/pre&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt;
KEY_TABLE       INDEX_NAME    LOCKS  WAITS WAIT_PCT DEADLOCKS LOCK_ESC WAIT_MSECS
=============== ============= ====== ===== ======== ========= ======== ==========
DB.DBA.&lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x180837c8&quot;&gt;RDF&lt;/a&gt;_QUAD RDF_QUAD_POGS 179205 934   0        0         0        35164
DB.DBA.RDF_IRI  RDF_IRI       20752  217   1        0         0        16445
DB.DBA.RDF_QUAD RDF_QUAD_SP   9244   3     0        0         0        235
&lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;We see 934 waits with a total duration of 35 seconds on the index with the most contention. The run was 187 seconds, real time. The lock wait time is not real time since this is the total elapsed wait time summed over all threads. The lock wait frequency is a little over one per query mix, meaning a little over one per five locking transactions. &lt;/p&gt; &lt;p&gt;We note that we do not get deadlocks since all inserts and deletes are in ascending key order due to vectoring. This guarantees the absence of deadlocks for single-insert transactions, as long as the transaction stays within the vector size. This is always the case since the inserts are a few hundred triples at the maximum. The waits concentrate on POGS, because this is a bitmap index where the locking resolution is less than a row, and the values do not correlate with insert order. The locking behavior could be better with the column store, where we would have row-level locking also for this index. This remains to be seen. 
The column store would otherwise tend to have higher cost per random insert.&lt;/p&gt; &lt;p&gt;Considering these results, it does not seem crucial to &amp;quot;drop ACID,&amp;quot; though doing so would save &lt;i&gt;some&lt;/i&gt; time. We will now run measurements for all scales with 16 Users and ACID. &lt;/p&gt; &lt;p&gt;Let us now see what the benchmark writes:&lt;/p&gt; &lt;pre&gt; SELECT TOP 10 * FROM sys_d_stat ORDER BY n_dirty DESC; &lt;/pre&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt;
KEY_TABLE                   INDEX_NAME                   TOUCHES   READS   READ_PCT N_DIRTY N_BUFFERS
=========================== ============================ ========= ======= ======== ======= =========
DB.DBA.RDF_QUAD             RDF_QUAD_POGS                763846891 237436  0        58040   228606
DB.DBA.RDF_QUAD             RDF_QUAD                     213282706 1991836 0        30226   1940280
DB.DBA.RDF_OBJ              RO_VAL                       15474     17837   115      13438   17431
DB.DBA.RO_START             RO_START                     10573     11195   105      10228   11227
DB.DBA.RDF_IRI              RDF_IRI                      61902     125711  203      7705    121300
DB.DBA.RDF_OBJ              RDF_OBJ                      23809053  3205963 13       636     3072517
DB.DBA.RDF_IRI              DB_DBA_RDF_IRI_UNQC_RI_ID    3237687   504486  15       340     488797
DB.DBA.RDF_QUAD             RDF_QUAD_SP                  89995     70446   78       99      68340
DB.DBA.RDF_QUAD             RDF_QUAD_OP                  19440     47541   244      66      45583
DB.DBA.VTLOG_DB_DBA_RDF_OBJ VTLOG_DB_DBA_RDF_OBJ         3014      1       0        11      11
DB.DBA.RDF_QUAD             RDF_QUAD_GS                  1261      801     63       10      751
DB.DBA.RDF_PREFIX           RDF_PREFIX                   14        168     1120     1       153
DB.DBA.RDF_PREFIX           DB_DBA_RDF_PREFIX_UNQC_RP_ID 1807      200     11       1       200
&lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;Most of the dirty pages are on the &lt;code&gt;POGS&lt;/code&gt; index, which is reasonable; values are spread out at random. After this we have the &lt;code&gt;PSOG&lt;/code&gt; index, likely because of random deletes. New IRIs tend to get consecutive numbers and do not make many dirty pages. Literals come next, led by the index from the leading string or hash of the literal to its id; this is as one would expect, again because the values are distributed at random. After this come IRIs. 
The distribution of updates is generally as one would expect.&lt;/p&gt; &lt;p align=&quot;center&quot;&gt;* * *&lt;/p&gt; &lt;p&gt;Going back to BSBM, at least the following aspects of the benchmark have to be further specified:&lt;/p&gt; &lt;ul&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Disclosure of ACID properties.&lt;/b&gt; If the benchmark required full ACID, many would not run it at all. Besides, full ACID is not necessarily an absolute requirement based on the hypothetical usage scenario of the benchmark. However, when publishing numbers, the guarantees that go with the numbers must be made explicit. This includes logging, checkpoint frequency or its equivalent, and so on.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Steady state.&lt;/b&gt; The working set of the &lt;i&gt;Update&lt;/i&gt; mix is different from that of the &lt;i&gt;Explore&lt;/i&gt; mixes. The &lt;i&gt;Update&lt;/i&gt; mix touches more indices than &lt;i&gt;Explore&lt;/i&gt;. The &lt;i&gt;Explore&lt;/i&gt; warm-up is useful in part but does not represent steady state.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Checkpoint and sustained throughput.&lt;/b&gt; Benchmarks involving update generally have rules for checkpointing the state and for sustained throughput. Specifically, the throughput of an update benchmark cannot rely on never flushing to persistent storage. Even bulk load must be timed with a checkpoint guaranteeing durability at the end. A steady update stream should be timed with a test interval of sufficient length involving a few checkpoints; for example, a minimum duration of 30 minutes with no fewer than 3 completed checkpoints in the interval and at least 9 minutes between the end of one and the start of the next. 
Not all DBMSs work with logs and checkpoints, but if an alternate scheme is used then this needs to be described.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Memory and warm-up issues.&lt;/b&gt; We have seen the test data generator run out of memory when trying to generate update streams of meaningful length. Also, the test driver should allow running updates in timed and non-timed mode (warm-up).&lt;/p&gt; &lt;/li&gt; &lt;/ul&gt; &lt;p&gt;With an update benchmark, many more things need to be defined than with a read-only workload, and the set-up becomes more system-specific. We will address these shortcomings in the measurement rules proposal to come. Especially with update workloads, the vendors need to provide tuning expertise; however, this will not happen if the benchmark does not properly set the expectations. If benchmarks serve as a catalyst for clearly defining how things are to be set up, then they will have served the end user.&lt;/p&gt; &lt;h3&gt; &lt;i&gt;Benchmarks, Redux&lt;/i&gt; Series&lt;/h3&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1658&quot; id=&quot;link-id0x1de61db8&quot;&gt;Benchmarks, Redux (part 1): On RDF Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1660&quot; id=&quot;link-id0x1f9f96f8&quot;&gt;Benchmarks, Redux (part 2): A Benchmarking Story&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1663&quot; id=&quot;link-id0x1f89eeb0&quot;&gt;Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1665&quot; id=&quot;link-id0x1ad83f30&quot;&gt;Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1667&quot; id=&quot;link-id0x1de62178&quot;&gt;Benchmarks, Redux (part 5): BSBM and I/O; 
HDDs and SSDs &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1669&quot; id=&quot;link-id0x1b2ec018&quot;&gt;Benchmarks, Redux (part 6): BSBM and I/O, continued&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1671&quot; id=&quot;link-id0x1ae6f028&quot;&gt;Benchmarks, Redux (part 7): What Does BSBM Explore Measure?&lt;/a&gt; &lt;/li&gt; &lt;li&gt; Benchmarks, Redux (part 8): BSBM Explore and Update &lt;i&gt;(this post)&lt;/i&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1675&quot; id=&quot;link-id0x132605c0&quot;&gt;Benchmarks, Redux (part 9): BSBM With Cluster&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1677&quot; id=&quot;link-id0x1a9871b0&quot;&gt;Benchmarks, Redux (part 10): LOD2 and the Benchmark Process&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1678&quot; id=&quot;link-id0x1baa20f8&quot;&gt;Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1e25a840&quot;&gt;Benchmarks, Redux (part 12): Our Own BSBM Results Report&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1b53db20&quot;&gt;Benchmarks, Redux (part 13): BSBM BI Modifications &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1e7ce520&quot;&gt;Benchmarks, Redux (part 14): BSBM BI Mix &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1b18f400&quot;&gt;Benchmarks, Redux (part 15): BSBM Test Driver Enhancements &lt;/a&gt; &lt;/li&gt; &lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Benchmarks, Redux (part 7): What Does BSBM Explore Measure?</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2011-03-07#1672</atom:id>
  <atom:published>2011-03-07T23:39:22Z</atom:published>
  <atom:updated>2011-03-14T17:57:20-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;We will here analyze what the &lt;a class=&quot;auto-href&quot; href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x1db49f28&quot;&gt;BSBM&lt;/a&gt; Explore workload does. This is necessary in order to compare benchmark results at different scales. Historically, BSBM had a Query 6 whose share of the metric approached 100% as scale increased. The present mix does not have this query, but different queries still have different relative importance at different scales.&lt;/p&gt; &lt;p&gt;We will here look at database-running statistics for BSBM at different scales. Finally, we look at &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Central_processing_unit&quot; id=&quot;link-id0x1f150460&quot;&gt;CPU&lt;/a&gt; profiles.&lt;/p&gt; &lt;p&gt;But first, let us see what BSBM reads in general. The system is in steady state after around 1500 query mixes; after this the working set does not shift much. 
After several thousand query mixes, we have:&lt;/p&gt; &lt;p&gt; &lt;code&gt;SELECT TOP 10 * FROM sys_d_stat ORDER BY reads DESC;&lt;/code&gt; &lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt;
KEY_TABLE         INDEX_NAME                   TOUCHES    READS   READ_PCT N_DIRTY N_BUFFERS
================= ============================ ========== ======= ======== ======= =========
DB.DBA.&lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1ddb0b50&quot;&gt;RDF&lt;/a&gt;_OBJ    RDF_OBJ                      114105938  3302150 2        0       3171275
DB.DBA.RDF_QUAD   RDF_QUAD                     977426773  2041156 0        0       1970712
DB.DBA.RDF_IRI    DB_DBA_RDF_IRI_UNQC_RI_ID    8250414    509239  6        15      491631
DB.DBA.RDF_QUAD   RDF_QUAD_POGS                3677233812 183860  0        0       175386
DB.DBA.RDF_IRI    RDF_IRI                      32         99710   302151   5       95353
DB.DBA.RDF_QUAD   RDF_QUAD_OP                  30597      51593   168      0       48941
DB.DBA.RDF_QUAD   RDF_QUAD_SP                  265474     47210   17       0       46078
DB.DBA.RDF_PREFIX DB_DBA_RDF_PREFIX_UNQC_RP_ID 6020       212     3        0       212
DB.DBA.RDF_PREFIX RDF_PREFIX                   0          167     16700    0       157
&lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;The first column is the table, then the index, then the number of times a row was found. The fourth number is the count of disk pages read. The last number is the count of 8K buffer pool pages in use for caching pages of the index in question. Note that the index is clustered, i.e., there is no table &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1d4f9808&quot;&gt;data&lt;/a&gt; structure separate from the index. Most of the reads are for strings or RDF literals. After this comes the &lt;code&gt;PSOG&lt;/code&gt; index for getting a property value given the subject. After this, but much lower, we have lookups of IRI strings given the ID. The index from object value to subject is used the most but the number of pages is small; only a few properties seem to be concerned. 
The rest is minimal in comparison.&lt;/p&gt; &lt;p&gt;Now let us reset the counts and see what the steady state I/O profile is.&lt;/p&gt; &lt;p&gt; &lt;code&gt;SELECT key_stat (key_table, name_part (key_name, 2), &amp;#39;reset&amp;#39;) FROM sys_keys WHERE key_migrate_to IS NULL;&lt;/code&gt; &lt;/p&gt; &lt;p&gt; &lt;code&gt;SELECT TOP 10 * FROM sys_d_stat ORDER BY reads DESC;&lt;/code&gt; &lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt;
KEY_TABLE         INDEX_NAME                   TOUCHES    READS   READ_PCT N_DIRTY N_BUFFERS
================= ============================ ========== ======= ======== ======= =========
DB.DBA.RDF_OBJ    RDF_OBJ                      30155789   79659   0        0       3191391
DB.DBA.RDF_QUAD   RDF_QUAD                     259008064  8904    0        0       1948707
DB.DBA.RDF_QUAD   RDF_QUAD_SP                  68002      7730    11       0       53360
DB.DBA.RDF_IRI    RDF_IRI                      12         5415    41653    6       98804
DB.DBA.RDF_QUAD   RDF_QUAD_POGS                975147136  1597    0        0       173459
DB.DBA.RDF_IRI    DB_DBA_RDF_IRI_UNQC_RI_ID    2213525    1286    0        17      485093
DB.DBA.RDF_QUAD   RDF_QUAD_OP                  7999       904     11       0       48568
DB.DBA.RDF_PREFIX DB_DBA_RDF_PREFIX_UNQC_RP_ID 1494       1       0        0       213
&lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;Literal strings dominate. The &lt;code&gt;SP&lt;/code&gt; index is used only for situations where the &lt;code&gt;P&lt;/code&gt; is not specified, i.e., the &lt;code&gt;DESCRIBE&lt;/code&gt; query. Based on this, I/O seems to be attributable mostly to this. The first &lt;code&gt;RDF_IRI&lt;/code&gt; represents translations from string to IRI id; the second represents translations from IRI id to string. The touch count for the first &lt;code&gt;RDF_IRI&lt;/code&gt; is not properly recorded, hence the miss % is out of line. We see &lt;code&gt;SP&lt;/code&gt; missing the &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Cache&quot; id=&quot;link-id0x17d2e670&quot;&gt;cache&lt;/a&gt; the most since its use is infrequent in the mix.&lt;/p&gt; &lt;p&gt;We will next look at query processing statistics. 
For this we introduce a new meter.&lt;/p&gt; &lt;p&gt;The &lt;code&gt;db_activity&lt;/code&gt; &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x1d4915b8&quot;&gt;SQL&lt;/a&gt; function provides a session-by-session cumulative statistic of activity. The fields are: &lt;/p&gt; &lt;ul&gt; &lt;li&gt; &lt;b&gt;&lt;code&gt;rnd&lt;/code&gt; &lt;/b&gt; - Count of &lt;i&gt;random index lookups&lt;/i&gt;. Each first row of a select or insert counts as one, regardless of whether something was found.&lt;/li&gt; &lt;li&gt; &lt;b&gt;&lt;code&gt;seq&lt;/code&gt; &lt;/b&gt; - Count of &lt;i&gt;sequential rows&lt;/i&gt;. Every move to next row on a cursor counts as 1, regardless of whether conditions match.&lt;/li&gt; &lt;li&gt; &lt;b&gt;&lt;code&gt;same seg&lt;/code&gt; &lt;/b&gt; - For column store only; counts how many times the next row in a vectored join using an index falls in the &lt;i&gt;same segment&lt;/i&gt; as the previous random access. A segment is the stretch of rows between entries in the sparse top level index on the column projection.&lt;/li&gt; &lt;li&gt; &lt;b&gt;&lt;code&gt;same pg&lt;/code&gt; &lt;/b&gt; - Counts how many times a vectored index join finds the next match on the &lt;i&gt;same page&lt;/i&gt; as the previous one.&lt;/li&gt; &lt;li&gt; &lt;b&gt;&lt;code&gt;same par&lt;/code&gt; &lt;/b&gt; - Counts how many times the next lookup in a vectored index join falls on a different page than the previous but still under the &lt;i&gt;same parent&lt;/i&gt;.&lt;/li&gt; &lt;li&gt; &lt;b&gt;&lt;code&gt;disk&lt;/code&gt; &lt;/b&gt; - Counts how many &lt;i&gt;disk reads&lt;/i&gt; were made, including any speculative reads initiated.&lt;/li&gt; &lt;li&gt; &lt;b&gt;&lt;code&gt;spec disk&lt;/code&gt; &lt;/b&gt; - Counts &lt;i&gt;speculative disk reads&lt;/i&gt;.&lt;/li&gt; &lt;li&gt; &lt;b&gt;&lt;code&gt;messages&lt;/code&gt; &lt;/b&gt; - Counts &lt;i&gt;cluster interconnect messages&lt;/i&gt; &lt;/li&gt; &lt;li&gt; 
&lt;b&gt;&lt;code&gt;B (KB, MB, GB)&lt;/code&gt; &lt;/b&gt; - is the &lt;i&gt;total length&lt;/i&gt; of the cluster interconnect messages.&lt;/li&gt; &lt;li&gt; &lt;b&gt;&lt;code&gt;fork&lt;/code&gt; &lt;/b&gt; - Counts how many times a &lt;i&gt;thread was forked (started)&lt;/i&gt; for query parallelization.&lt;/li&gt; &lt;/ul&gt; &lt;p&gt;The numbers are given with 4 significant digits and a scale suffix. G is 10^9 (1,000,000,000); M is 10^6 (1,000,000), K is 10^3 (1,000).&lt;/p&gt; &lt;p&gt;We run 2000 query mixes with 16 Users. The special &lt;code&gt;&lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Hypertext_Transfer_Protocol&quot; id=&quot;link-id0x1bf7f318&quot;&gt;http&lt;/a&gt;&lt;/code&gt; account keeps a cumulative account of all activity on web server threads.&lt;/p&gt; &lt;blockquote&gt; &lt;p&gt; &lt;code&gt;SELECT db_activity (2, &amp;#39;http&amp;#39;);&lt;/code&gt; &lt;/p&gt; &lt;p&gt; &lt;code&gt;1.674G rnd  3.223G seq  0 same seg  1.286G same pg  314.8M same par  6.186M disk  6.461M spec disk  0B / 0 messages  298.6K fork&lt;/code&gt; &lt;/p&gt; &lt;/blockquote&gt; &lt;p&gt;We see that random access dominates. The &lt;code&gt;seq&lt;/code&gt; number is about twice the &lt;code&gt;rnd&lt;/code&gt; number, meaning that the average random lookup gets two rows. Getting a row at random obviously takes more time than getting the next row. Since the index used is row-wise, the &lt;code&gt;same seg&lt;/code&gt; is 0; the &lt;code&gt;same pg&lt;/code&gt; indicates that 77% of the random accesses fall on the same page as the previous random access; most of the remaining random accesses fall under the same parent as the previous one.&lt;/p&gt; &lt;p&gt;There are more speculative reads than disk reads which is an artifact of counting some concurrently speculated reads twice. This does indicate that speculative reads dominate. 
This is because a large part of the run was in the warm-up state with aggressive speculative reading. We reset the counts and run another 2000 mixes.&lt;/p&gt; &lt;p&gt;Now let us look at the same reading after 2000 mixes, 16 Users at 100 Mt.&lt;/p&gt; &lt;blockquote&gt; &lt;p&gt; &lt;code&gt;234.3M rnd  420.5M seq  0 same seg  188.8M same pg  29.09M same par  808.9K disk  919.9K spec disk  0B / 0 messages  76K fork&lt;/code&gt; &lt;/p&gt; &lt;/blockquote&gt; &lt;p&gt;We note that the ratios between the random and sequential and same page/parent counts are about the same. The sequential number looks to be even a bit smaller in proportion. The count of random accesses for the 100 Mt run is 14% of the count for the 1000 Mt run. The count of query parallelization threads is also much lower since it is worthwhile to schedule a new thread only if there are at least a few thousand operations to perform on it. The precise criterion for making a thread is that according to the cost model guess, the thread must have at least 5ms worth of work.&lt;/p&gt; &lt;p&gt;We note that the 100 Mt throughput is a little over three times that of the 1000 Mt throughput, as reported before. We might justifiably ask why the 100 Mt run is not seven times faster instead, for this much less work. &lt;/p&gt; &lt;p&gt;We note that for one-off random access, it makes no real difference whether the tree has 100 M or 1000 M rows; this translates to roughly 27 vs 30 comparisons, so the depth of the tree is not a factor &lt;i&gt;per se&lt;/i&gt;. Besides, vectoring makes the tree often look only one or two levels deep, so the total row count matters even less there.&lt;/p&gt; &lt;p&gt;To elucidate this last question, we look at the CPU profiles. 
We take an &lt;a href=&quot;http://oprofile.sourceforge.net/about/&quot; id=&quot;link-id0x1efb3360&quot;&gt;oprofile&lt;/a&gt; of 100 Single User mixes at both scales.&lt;/p&gt; For 100 Mt: &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt;
61161  10.1723 cmpf_iri64n_iri64n_anyn_gt_lt
31321  5.2093  box_equal
19027  3.1646  sqlo_parse_tree_has_node
15905  2.6453  dk_alloc
15647  2.6024  itc_next_set_neq
12702  2.1126  itc_vec_split_search
12487  2.0768  itc_dive_transit
11450  1.9044  itc_bm_vec_row_check
10646  1.7706  itc_page_rcf_search
9223   1.5340  id_hash_get
9215   1.5326  gen_qsort
8867   1.4748  sqlo_key_part_best
8807   1.4648  itc_param_cmp
8062   1.3409  cmpf_iri64n_iri64n
6820   1.1343  sqlo_in_list
6005   0.9987  dc_iri_id_cmp
5905   0.9821  dk_free_tree
5801   0.9648  box_hash
5509   0.9163  dks_esc_write
5444   0.9054  sql_tree_hash_1
&lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; For 1000 Mt: &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt;
754331 31.4149 cmpf_iri64n_iri64n_anyn_gt_lt
146165 6.0872  itc_vec_split_search
144795 6.0301  itc_next_set_neq
131671 5.4836  itc_dive_transit
110870 4.6173  itc_page_rcf_search
66780  2.7811  gen_qsort
66434  2.7667  itc_param_cmp
58450  2.4342  itc_bm_vec_row_check
55213  2.2994  dk_alloc
47793  1.9904  cmpf_iri64n_iri64n
44277  1.8440  dc_iri_id_cmp
39489  1.6446  cmpf_int64n
36880  1.5359  dc_append_bytes
36601  1.5243  dv_compare
31286  1.3029  dc_any_value_prefetch
25457  1.0602  itc_next_set
20852  0.8684  box_equal
19895  0.8285  dk_free_tree
19698  0.8203  itc_page_insert_search
19367  0.8066  dc_copy
&lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;The top function in both is the compare for an equality of two leading IRIs and a range for the trailing any. This corresponds to the range check in Q5. At the larger scale this is three times more important. At the smaller scale, the share of query &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Program_optimization&quot; id=&quot;link-id0x1bf8ca38&quot;&gt;optimization&lt;/a&gt; is about 6.5 times greater. 
The top function in this category is &lt;code&gt;box_equal&lt;/code&gt; with 5.2% vs 0.87%. The remaining SQL compiler functions are all in proportion to this, totaling 14.3% of the 100 Mt top-20 profile.&lt;/p&gt; &lt;p&gt;From this sample, it appears that ten times the scale means about seven times the database operations. This is not taken into account in the metric. Query compilation is significant at the small end, and no longer significant at 1000 Mt. From these numbers, we could say that &lt;a class=&quot;auto-href&quot; href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1be12350&quot;&gt;Virtuoso&lt;/a&gt; is about two times more efficient in terms of database operation throughput at 1000 Mt than at 100 Mt.&lt;/p&gt; &lt;p&gt;We may conclude that different BSBM scales measure different things. The &lt;a class=&quot;auto-href&quot; href=&quot;http://www.tpc.org/&quot; id=&quot;link-id0x17eb98a0&quot;&gt;TPC&lt;/a&gt; workloads fare better in this respect: the balance between their metric components stays relatively constant across a large range of scales.&lt;/p&gt; &lt;p&gt;This is not necessarily something that should be fixed in the BSBM Explore mix. We must, however, take these factors better into account in developing the BI mix.&lt;/p&gt; &lt;p&gt;Let us also remember that BSBM Explore is a relational workload. Future posts in this series will outline how we propose to make RDF-friendlier benchmarks. 
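&lt;/p&gt; &lt;p&gt;As a rough cross-check of the arithmetic in this post, the ratios quoted above can be recomputed from the two &lt;code&gt;db_activity&lt;/code&gt; readouts. The following Python sketch is only an illustration: the helper names &lt;code&gt;parse_count&lt;/code&gt; and &lt;code&gt;ratios&lt;/code&gt; are hypothetical and not part of Virtuoso, and modeling a one-off index lookup as a plain binary search is a simplifying assumption.&lt;/p&gt; &lt;pre&gt;
```python
import math

# Scale suffixes as described in the post: K = 10^3, M = 10^6, G = 10^9.
SUFFIX = {'K': 1e3, 'M': 1e6, 'G': 1e9}


def parse_count(text):
    """Convert a db_activity-style figure such as '1.674G' to a float.

    Hypothetical helper for this sketch; it is not a Virtuoso function.
    """
    text = text.strip()
    if text and text[-1] in SUFFIX:
        return float(text[:-1]) * SUFFIX[text[-1]]
    return float(text)


# Figures copied from the two readouts in this post.
run_1000mt = {'rnd': '1.674G', 'seq': '3.223G',
              'same pg': '1.286G', 'same par': '314.8M'}
run_100mt = {'rnd': '234.3M', 'seq': '420.5M',
             'same pg': '188.8M', 'same par': '29.09M'}


def ratios(run):
    """Express seq, same pg and same par relative to the rnd count."""
    rnd = parse_count(run['rnd'])
    return {
        'rows_per_lookup': parse_count(run['seq']) / rnd,
        'same_pg_share': parse_count(run['same pg']) / rnd,
        'same_par_share': parse_count(run['same par']) / rnd,
    }


big = ratios(run_1000mt)
small = ratios(run_100mt)

# Random accesses at 100 Mt as a fraction of those at 1000 Mt.
rnd_fraction = parse_count(run_100mt['rnd']) / parse_count(run_1000mt['rnd'])

# Assumption: a one-off lookup among n rows costs about log2(n) comparisons.
cmp_100m = math.log2(100e6)
cmp_1000m = math.log2(1000e6)

print('1000 Mt:', {k: round(v, 3) for k, v in big.items()})
print('100 Mt: ', {k: round(v, 3) for k, v in small.items()})
print('rnd fraction 100 Mt / 1000 Mt:', round(rnd_fraction, 2))
print('comparisons per lookup:', round(cmp_100m), 'vs', round(cmp_1000m))
```
&lt;/pre&gt; &lt;p&gt;Under these assumptions, the sketch gives a same-page share of about 77% for the 1000 Mt run, a 100 Mt random-access count of about 14% of the 1000 Mt count, and roughly 27 vs 30 comparisons per lookup, in agreement with the figures in the text.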
&lt;/p&gt; &lt;h3&gt; &lt;i&gt;Benchmarks, Redux&lt;/i&gt; Series&lt;/h3&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1658&quot; id=&quot;link-id0x1a9bcff8&quot;&gt;Benchmarks, Redux (part 1): On RDF Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1660&quot; id=&quot;link-id0x1d3e5470&quot;&gt;Benchmarks, Redux (part 2): A Benchmarking Story&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1663&quot; id=&quot;link-id0x1de94770&quot;&gt;Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1665&quot; id=&quot;link-id0x1ea66470&quot;&gt;Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1667&quot; id=&quot;link-id0x1f1118d8&quot;&gt;Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1669&quot; id=&quot;link-id0x1d1c0cd8&quot;&gt;Benchmarks, Redux (part 6): BSBM and I/O, continued&lt;/a&gt; &lt;/li&gt; &lt;li&gt; Benchmarks, Redux (part 7): What Does BSBM Explore Measure? 
&lt;i&gt;(this post)&lt;/i&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1673&quot; id=&quot;link-id0x1aaf4180&quot;&gt;Benchmarks, Redux (part 8): BSBM Explore and Update &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1675&quot; id=&quot;link-id0x1a957610&quot;&gt;Benchmarks, Redux (part 9): BSBM With Cluster&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1677&quot; id=&quot;link-id0x127e75c8&quot;&gt;Benchmarks, Redux (part 10): LOD2 and the Benchmark Process&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1678&quot; id=&quot;link-id0x1c9400f0&quot;&gt;Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1d2c1d68&quot;&gt;Benchmarks, Redux (part 12): Our Own BSBM Results Report&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1ea1fb40&quot;&gt;Benchmarks, Redux (part 13): BSBM BI Modifications &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1c073a10&quot;&gt;Benchmarks, Redux (part 14): BSBM BI Mix &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1c5541e8&quot;&gt;Benchmarks, Redux (part 15): BSBM Test Driver Enhancements &lt;/a&gt; &lt;/li&gt; &lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Benchmarks, Redux (part 6): BSBM and I/O, continued</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2011-03-07#1670</atom:id>
  <atom:published>2011-03-07T22:36:24Z</atom:published>
  <atom:updated>2011-03-14T17:57:06.000001-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;In the words of Jim Gray, disks have become tapes. By this he means that a disk is really only good for sequential access. For this reason, the SSD extent read ahead was incomparably better. We note that in the experiment, every page in the general area of the database the experiment touched would in time be touched, and that the whole working set would end up in memory. Therefore no speculative read would be wasted. Therefore it stands to reason to read whole extents.&lt;/p&gt; &lt;p&gt;So I changed the default behavior to use a very long window for triggering read-ahead as long as the buffer pool was not full. After the initial filling of the buffer pool, the read ahead would require more temporal locality before kicking in. &lt;/p&gt; &lt;p&gt;Still, the scheme was not really good since the rest of the extent would go for background-read and the triggering read would be done right then, leading to extra seeks. Well, this is good for latency but bad for throughput. So I changed this too, going to an &amp;quot;elevator only&amp;quot; scheme where reads that triggered read-ahead would go with the read-ahead batch. Reads that did not trigger read-ahead would still be done right in place, thus favoring latency but breaking any sequentiality with its attendant 10+ ms penalty.&lt;/p&gt; &lt;p&gt;We keep in mind that the test we target is &lt;a class=&quot;auto-href&quot; href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x17c88010&quot;&gt;BSBM&lt;/a&gt; warm-up time, which is purely a throughput business. 
One could have timeouts and could penalize queries that sacrificed too much latency to throughput.&lt;/p&gt; &lt;p&gt;We note that even for this very simple metric, just reading the allocated database pages from start to end is not good since a large number of pages in fact never get read during a run.&lt;/p&gt; &lt;p&gt;We further note that the vectored read-ahead without any speculation will be useful as-is for cases with few threads and striping, since at least one thread&amp;#39;s random I/Os get to go to multiple threads. The benefit is less in multiuser situations where disks are randomly busy anyhow. &lt;/p&gt; &lt;p&gt;In the previous I/O experiments, we saw that with vectored read ahead and no speculation, there were around 50 pages waiting for I/O at all times. With an easily-triggered extent read-ahead, there were around 4000 pages waiting. The more pages are waiting for I/O, the greater the benefit from the elevator algorithm of servicing I/O in order of file offset. &lt;/p&gt; &lt;p&gt;In &lt;a class=&quot;auto-href&quot; href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1c51fae0&quot;&gt;Virtuoso&lt;/a&gt; 5 we had a trick that would, if the buffer pool was not full, speculatively read every uncached sibling of every index tree node it visited. This filled the &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Cache&quot; id=&quot;link-id0x1d6a0cf0&quot;&gt;cache&lt;/a&gt; quite fast, but was useless after the cache was full. The extent read ahead first implemented in 6 was less aggressive, but would continue working with full cache and did in fact help with shifts in the working set.&lt;/p&gt; &lt;p&gt;The next logical step is to combine the vector and extent read-ahead modes. 
We see what pages we will be getting, then take the distinct extents; if we have been to this extent within the time window, we just add all the uncached allocated pages of the extent to the batch.&lt;/p&gt; &lt;p&gt;With this setting, especially at the start of the run, we get large read-ahead batches and maintain I/O queues of 5000 to 20000 pages. The SSD starting time drops to about 120 seconds from cold start to reach 1200% &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Central_processing_unit&quot; id=&quot;link-id0x1d295448&quot;&gt;CPU&lt;/a&gt;. We see transfer rates of up to 150 MB/s per SSD. With HDDs, we see transfer rates around 14 MB/s per drive, mostly reading chunks of an average of seventy-one (71) 8K pages.&lt;/p&gt; &lt;p&gt;The BSBM workload does not offer better possibilities for &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Program_optimization&quot; id=&quot;link-id0x1aca8b40&quot;&gt;optimization&lt;/a&gt;, short of pre-reading the whole database, which is not practical at large scales. &lt;/p&gt; &lt;h2&gt;Some Details&lt;/h2&gt; &lt;p&gt;First we start from cold disk, with and without mandatory read of the whole extent on the touch.&lt;/p&gt; &lt;p&gt;Without any speculation but with vectored read-ahead, here are the times for the first 11 query mixes:&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt;
 0: 151560.82 ms, total: 151718 ms
 1: 179589.08 ms, total: 179648 ms
 2: 71974.49 ms, total: 72017 ms
 3: 102701.73 ms, total: 102729 ms
 4: 58834.41 ms, total: 58856 ms
 5: 65926.34 ms, total: 65944 ms
 6: 68244.69 ms, total: 68274 ms
 7: 39197.15 ms, total: 39215 ms
 8: 45654.93 ms, total: 45674 ms
 9: 34850.30 ms, total: 34878 ms
10: 100061.30 ms, total: 100079 ms
&lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;The average CPU during this time was 5%. The best read throughput was 2.5 MB/s; the average was 1.35 MB/s. The average disk read was 16 ms. 
&lt;/p&gt; &lt;p&gt;With vectored read-ahead and full extents only, i.e., max speculation:&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt;
 0: 178854.23 ms, total: 179034 ms
 1: 110826.68 ms, total: 110887 ms
 2: 19896.11 ms, total: 19941 ms
 3: 36724.43 ms, total: 36753 ms
 4: 21253.70 ms, total: 21285 ms
 5: 18417.73 ms, total: 18439 ms
 6: 21668.92 ms, total: 21690 ms
 7: 12236.49 ms, total: 12267 ms
 8: 14922.74 ms, total: 14945 ms
 9: 11502.96 ms, total: 11523 ms
10: 15762.34 ms, total: 15792 ms
...
90: 1747.62 ms, total: 1761 ms
91: 1701.01 ms, total: 1714 ms
92: 1300.62 ms, total: 1318 ms
93: 1873.15 ms, total: 1886 ms
94: 1508.24 ms, total: 1524 ms
95: 1748.15 ms, total: 1761 ms
96: 2076.92 ms, total: 2090 ms
97: 2199.38 ms, total: 2212 ms
98: 2305.75 ms, total: 2319 ms
99: 1771.91 ms, total: 1784 ms

Scale factor: 2848260
Number of warmup runs: 0
Seed: 808080
Number of query mix runs (without warmups): 100 times
min/max Querymix runtime: 1.3006s / 178.8542s
Elapsed runtime: 872.993 seconds
QMpH: 412.374 query mixes per hour
&lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;The peak throughput is 91 MB/s, with average around 50 MB/s; CPU average around 50%.&lt;/p&gt; &lt;p&gt;We note that the latency of the first query mix is hardly greater than in the non-speculative run, but starting from mix 3 the speed is clearly better. &lt;/p&gt; &lt;p&gt;Then the same with cold SSDs. First with no speculation:&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt;
 0: 5177.68 ms, total: 5302 ms
 1: 2570.16 ms, total: 2614 ms
 2: 1353.06 ms, total: 1391 ms
 3: 1957.63 ms, total: 1978 ms
 4: 1371.13 ms, total: 1386 ms
 5: 1765.55 ms, total: 1781 ms
 6: 1658.23 ms, total: 1673 ms
 7: 1273.87 ms, total: 1289 ms
 8: 1355.19 ms, total: 1380 ms
 9: 1152.78 ms, total: 1167 ms
10: 1787.91 ms, total: 1802 ms
...
90: 1116.25 ms, total: 1128 ms
91: 989.50 ms, total: 1001 ms
92: 833.24 ms, total: 844 ms
93: 1137.83 ms, total: 1150 ms
94: 969.47 ms, total: 982 ms
95: 1138.04 ms, total: 1149 ms
96: 1155.98 ms, total: 1168 ms
97: 1178.15 ms, total: 1193 ms
98: 1120.18 ms, total: 1132 ms
99: 1013.16 ms, total: 1025 ms

Scale factor: 2848260
Number of warmup runs: 0
Seed: 808080
Number of query mix runs (without warmups): 100 times
min/max Querymix runtime: 0.8201s / 5.1777s
Elapsed runtime: 127.555 seconds
QMpH: 2822.321 query mixes per hour
&lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;The peak I/O is 45 MB/s, with average 28.3 MB/s; CPU average is 168%.&lt;/p&gt; &lt;p&gt;Now, SSDs with max speculation.&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt;
 0: 44670.34 ms, total: 44809 ms
 1: 18490.44 ms, total: 18548 ms
 2: 7306.12 ms, total: 7353 ms
 3: 9452.66 ms, total: 9485 ms
 4: 5648.56 ms, total: 5668 ms
 5: 5493.21 ms, total: 5511 ms
 6: 5951.48 ms, total: 5970 ms
 7: 3815.59 ms, total: 3834 ms
 8: 4560.71 ms, total: 4579 ms
 9: 3523.74 ms, total: 3543 ms
10: 4724.04 ms, total: 4741 ms
...
90: 673.53 ms, total: 685 ms
91: 534.62 ms, total: 545 ms
92: 730.81 ms, total: 742 ms
93: 1358.14 ms, total: 1370 ms
94: 1098.64 ms, total: 1110 ms
95: 1232.20 ms, total: 1243 ms
96: 1259.57 ms, total: 1273 ms
97: 1298.95 ms, total: 1310 ms
98: 1156.01 ms, total: 1166 ms
99: 1025.45 ms, total: 1034 ms

Scale factor: 2848260
Number of warmup runs: 0
Seed: 808080
Number of query mix runs (without warmups): 100 times
min/max Querymix runtime: 0.4725s / 44.6703s
Elapsed runtime: 269.323 seconds
QMpH: 1336.683 query mixes per hour
&lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;The peak I/O is 339 MB/s, with average 192 MB/s; average CPU is 121%.&lt;/p&gt; &lt;p&gt;The above was measured with the read-ahead thread doing single-page reads. We repeated the test with read merging enabled, with only small differences in the results. 
The max I/O was 353 MB/s, and the average 173 MB/s; average CPU 113%.&lt;/p&gt; &lt;p&gt;We see that the start latency is quite a bit longer than without speculation and the CPU % is lower due to higher latency of individual I/O. The I/O rate is fair. We would expect more throughput, however. &lt;/p&gt; &lt;p&gt;We find that a supposedly better use of the API, doing single requests of up to 100 pages instead of consecutive requests of 1 page, does not make a lot of difference. The peak I/O is a bit higher; overall throughput is a bit lower.&lt;/p&gt; &lt;p&gt;We will have to retry these experiments with a better controller. We have at no point seen anything like the 50K 4KB random I/Os per second promised for the SSDs by the manufacturer. We know for a fact that the controller gives about 700 MB/s sequential read with &lt;code&gt;cat file &amp;gt; /dev/null&lt;/code&gt; and two drives busy. With 4 drives busy, this does not get better. The best 30-second stretch we saw in a multiuser BSBM warm-up was 590 MB/s, which is consistent with the &lt;code&gt;cat&lt;/code&gt; to &lt;code&gt;/dev/null&lt;/code&gt; figure. We will later test with 8 SSDs with better controllers. &lt;/p&gt; &lt;p&gt;Note that the average I/O and CPU are averages over 30-second measurement windows; thus for short-running tests, there is some error from the window during which the activity ended. &lt;/p&gt; &lt;p&gt;Let us now see if we can make a BSBM instance warm up from disk in a reasonable time. We run 16 users with max speculation. We note that after reading 7,500,000 buffers we are not entirely free of disk. The max speculation read-ahead filled the cache in 17 minutes, with an average of 58 MB/s. After the cache is filled, the system shifts to a more conservative policy on extent read-ahead; one which in fact never gets triggered with the BSBM &lt;i&gt;Explore&lt;/i&gt; in steady state. The vectored read-ahead is kept on since this by itself does not read pages that are not needed. 
However, the vectored read-ahead does not run either, because the &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1c9bca60&quot;&gt;data&lt;/a&gt; that is accessed in larger batches is already in memory. Thus there remains a trickle averaging 0.49 MB/s from disk. This keeps CPU around 350%. With SSDs, the trickle is about 1.5 MB/s and CPU is around 1300% in steady state. Thus SSDs give approximately triple the throughput in a situation where there is a tiny amount of continuous random disk access. The disk access in question is 80% for retrieving &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1c05e280&quot;&gt;RDF&lt;/a&gt; literal strings, presumably on behalf of the &lt;code&gt;DESCRIBE&lt;/code&gt; query in the mix. This query touches things no other query touches and does so one subject at a time, in a way that can neither be anticipated nor optimized.&lt;/p&gt; &lt;p&gt;The Virtuoso 7 column store will deal with this better because it is more space efficient overall. If we apply stream-compression to literals, these will go in under half the space, while quads will go in maybe one-quarter the space. Thus 3000 Mt all from memory should be possible with 72 GB RAM. 1000 Mt row-wise did fit in 72 GB RAM except for the random literals accessed by the &lt;code&gt;DESCRIBE&lt;/code&gt;. This alone drops throughput to under a third of the memory-only throughput if using HDDs. SSDs, on the other hand, can largely neutralize this effect.&lt;/p&gt; &lt;h2&gt;Conclusions&lt;/h2&gt; &lt;p&gt;We have looked at basics of I/O. SSDs have been found to be a readily available solution to I/O bottlenecks without need for reconfiguration or complex I/O policies. 
We have been able to get a decent read rate under conditions of server warm-up or shift of working set even with HDDs.&lt;/p&gt; &lt;p&gt;More advanced I/O matters will be covered with the column store. We note that the techniques discussed here apply identically to rows and columns.&lt;/p&gt; &lt;p&gt;As concerns BSBM, it seems appropriate to include a warm-up time. In practice, this means that the store just must eagerly pre-read. This is not hard to do and can be quite useful.&lt;/p&gt; &lt;h3&gt; &lt;i&gt;Benchmarks, Redux&lt;/i&gt; Series&lt;/h3&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1658&quot; id=&quot;link-id0x1b4342b0&quot;&gt;Benchmarks, Redux (part 1): On RDF Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1660&quot; id=&quot;link-id0x1d3e7388&quot;&gt;Benchmarks, Redux (part 2): A Benchmarking Story&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1663&quot; id=&quot;link-id0x153c7ba8&quot;&gt;Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1665&quot; id=&quot;link-id0x1da11d98&quot;&gt;Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1667&quot; id=&quot;link-id0x1d25d630&quot;&gt;Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs&lt;/a&gt; &lt;/li&gt; &lt;li&gt; Benchmarks, Redux (part 6): BSBM and I/O, continued &lt;i&gt;(this post)&lt;/i&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1671&quot; id=&quot;link-id0x1f1f5ee8&quot;&gt;Benchmarks, Redux (part 7): What Does BSBM Explore Measure?&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1673&quot; 
id=&quot;link-id0x1cd44938&quot;&gt;Benchmarks, Redux (part 8): BSBM Explore and Update &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1675&quot; id=&quot;link-id0x1d51f848&quot;&gt;Benchmarks, Redux (part 9): BSBM With Cluster&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1677&quot; id=&quot;link-id0x13d333c0&quot;&gt;Benchmarks, Redux (part 10): LOD2 and the Benchmark Process&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1678&quot; id=&quot;link-id0x1e77a5e8&quot;&gt;Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1ea1fb40&quot;&gt;Benchmarks, Redux (part 12): Our Own BSBM Results Report&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1e7786c8&quot;&gt;Benchmarks, Redux (part 13): BSBM BI Modifications &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1f8a37f8&quot;&gt;Benchmarks, Redux (part 14): BSBM BI Mix &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1c69e018&quot;&gt;Benchmarks, Redux (part 15): BSBM Test Driver Enhancements &lt;/a&gt; &lt;/li&gt; &lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2011-03-07#1668</atom:id>
  <atom:published>2011-03-07T19:17:36Z</atom:published>
  <atom:updated>2011-03-14T17:56:52-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;In the context of database benchmarks we cannot ignore I/O, as pretty much has been done so far by &lt;a class=&quot;auto-href&quot; href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x1ea17348&quot;&gt;BSBM&lt;/a&gt;. &lt;/p&gt; &lt;p&gt;There are two approaches:&lt;/p&gt; &lt;ol&gt; &lt;li&gt; &lt;p&gt;run twice or otherwise make sure one runs from memory and forget about I/O, or&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;make rules and metrics for warm-up.&lt;/p&gt; &lt;/li&gt; &lt;/ol&gt; &lt;p&gt;We will see if the second is possible with BSBM.&lt;/p&gt; &lt;p&gt;From this starting point, we look at various ways of scheduling I/O in &lt;a class=&quot;auto-href&quot; href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x125c4f90&quot;&gt;Virtuoso&lt;/a&gt; using a 1000 Mt BSBM database on sets of each of HDDs (hard disk devices) and SSDs (solid-state storage devices). We will see that SSDs in this specific application can make a significant difference. 
&lt;/p&gt; &lt;p&gt;In this test we have the same 4 stripes of a 1000 Mt BSBM database on each of two storage arrays.&lt;/p&gt; &lt;table border=&quot;1&quot; cellspacing=&quot;2&quot; cellpadding=&quot;2&quot; align=&quot;center&quot; width=&quot;90%&quot;&gt; &lt;tr&gt; &lt;th colspan=&quot;9&quot; align=&quot;center&quot;&gt;Storage Arrays&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt; Type &lt;/th&gt; &lt;th align=&quot;center&quot;&gt; Quantity &lt;/th&gt; &lt;th align=&quot;center&quot;&gt; Maker &lt;/th&gt; &lt;th align=&quot;center&quot;&gt; Size &lt;/th&gt; &lt;th align=&quot;center&quot;&gt; Speed &lt;/th&gt; &lt;th align=&quot;center&quot;&gt; Interface speed &lt;/th&gt; &lt;th align=&quot;center&quot;&gt; Controller &lt;/th&gt; &lt;th align=&quot;center&quot;&gt; Drive &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Cache&quot; id=&quot;link-id0x1cab1358&quot;&gt;Cache&lt;/a&gt; &lt;/th&gt; &lt;th align=&quot;center&quot;&gt; RAID &lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt; SSD &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 4 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; Crucial &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 128 GB &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; N/A &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 6Gbit SATA &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; RocketRaid 640 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 128 MB &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; None &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt; HDD &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 4 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; Samsung &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 1000 GB &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 7200 RPM &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 3Gbit SATA &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; &lt;a class=&quot;auto-href&quot; 
href=&quot;http://dbpedia.org/resource/Intel_Corporation&quot; id=&quot;link-id0x1ab6edd8&quot;&gt;Intel&lt;/a&gt; ICH on Supermicro motherboard &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 16 MB &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; None &lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;p&gt;We make sure that the files are not in OS cache by filling it with other big files, reading a total of 120 GB off SSDs with &lt;code&gt;`cat file &amp;gt; /dev/null`&lt;/code&gt;. &lt;/p&gt; &lt;p&gt;The configuration files are as in the report on the 1000 Mt run. We note as significant that we have a few file descriptors for each stripe, and that read-ahead for each is handled by its own thread.&lt;/p&gt; &lt;p&gt;Two different read-ahead schemes are used: &lt;/p&gt; &lt;ul&gt; &lt;li&gt; &lt;p&gt;With 6 Single, if a 2MB extent gets a second read within a given time after the first, the whole extent is scheduled for background read.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;With 7 Single, as an index search is vectored, we know a large number of values to fetch at one time and these values are sorted into an ascending sequence. Therefore, by looking at a node in an index tree, we can determine which sub-trees will be accessed and schedule these for read-ahead, skipping any that will not be accessed.&lt;/p&gt; &lt;/li&gt; &lt;/ul&gt; &lt;p&gt;In either model, a sequential scan touching more than a couple of consecutive index leaf pages triggers a read-ahead, to the end of the scanned range or to the next 3000 index leaves, whichever comes first. However, there are no sequential scans of significant size in BSBM.&lt;/p&gt; &lt;p&gt;There are a few different possibilities for the physical I/O: &lt;/p&gt; &lt;ol&gt; &lt;li&gt; &lt;p&gt;Using a separate read system call for each page. 
There may be several open file descriptors on a file so that many such calls can proceed concurrently on different threads; the OS will order the operations.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;A thread finds it needs a page and reads it.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;Using Unix asynchronous I/O, &lt;code&gt;aio.h&lt;/code&gt;, with the &lt;code&gt;aio_*&lt;/code&gt; and &lt;code&gt;lio_listio&lt;/code&gt; functions.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;Using a single read system call for a run of adjacent pages. In this way, the drive sees longer requests and should give better throughput. If there are short gaps in the sequence, the gaps are also read, wasting bandwidth but saving on latency.&lt;/p&gt; &lt;/li&gt; &lt;/ol&gt; &lt;p&gt;The latter two apply only to bulk I/Os that are scheduled on background threads, one per independently-addressable device (HDD, SSD, or RAID-set). These bulk-reads operate on an elevator model, keeping a sorted queue of things to read or write and moving through this queue from start to end. At any time, the queue may get more work from other threads.&lt;/p&gt; &lt;p&gt;There is a further choice when seeing single-page random requests. They can either go to the elevator or they can be done in place. Taking the elevator is presumably good for throughput but bad for latency. In general, the elevator should have a notion of fairness; these matters are discussed in the &lt;a href=&quot;http://www.cwi.nl/&quot; id=&quot;link-id0x1f62abb8&quot;&gt;CWI collaborative scan paper&lt;/a&gt;. Here we do not have long queries, so we do not have to talk about elevator policies or scan sharing; there are no scans. 
We may touch on these questions later with the column store, the BSBM BI mix, and &lt;a class=&quot;auto-href&quot; href=&quot;http://www.tpc.org/&quot; id=&quot;link-id0x1bfb17c0&quot;&gt;TPC&lt;/a&gt;-&lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/TPC-H&quot; id=&quot;link-id0x1e76bfc8&quot;&gt;H&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;While we may know principles, I/O has always given us surprises; the only way to optimize this is to measure.&lt;/p&gt; &lt;p&gt;The metric we try to optimize here is the time it takes for a multiuser BSBM run starting from cold cache to get to 1200% &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Central_processing_unit&quot; id=&quot;link-id0x1d7b1d10&quot;&gt;CPU&lt;/a&gt;. When running from memory, the CPU is around 1350% for the system in question. &lt;/p&gt; &lt;p&gt;This depends on getting I/O throughput, which in turn depends on having a lot of speculative reading since the workload itself does not give any long stretches to read. &lt;/p&gt; &lt;p&gt;The test driver is set at 16 clients, and the run continues for 2000 query mixes or until target throughput is reached. Target throughput is deemed reached after the first 20 second stretch with CPU at 1200% or higher.&lt;/p&gt; &lt;p&gt;The meter is a stored procedure that records the CPU time, count of reads, cumulative elapsed time spent waiting for I/O, and other metrics. The code for this procedure (for 7 Single; this file will not work on Virtuoso 6 or earlier) is &lt;a href=&quot;http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/BenchmarksReduxSupportingFiles/ldmeter.sql&quot; id=&quot;link-id0x1b5adb08&quot;&gt;available here&lt;/a&gt;. &lt;/p&gt; &lt;p&gt;The database space allocation gives each index a number of 2MB segments, each with 256 8K pages. When a page splits, the new page is allocated from the same extent if possible, or from a specific second extent which is designated as the overflow extent of this extent. 
This scheme provides for a sort of pseudo-locality within extents over random insert order. Thus there is a chance that pre-reading an extent will get key values in the same range as the ones on the page being requested in the first place. At least the pre-read pages will be from the same index tree. There are insertion orders that do not create good locality with this allocation scheme, though. In order to generally improve locality, one could shuffle pages of an all-dirty subtree before writing it out so as to have physical order match key order. We will look at some tricks in this vein with the column store.&lt;/p&gt; &lt;p&gt;For the sake of simplicity we only run 7 Single with the 1000 Mt scale.&lt;/p&gt; &lt;p&gt;The first experiment was with SSDs and the vectored read-ahead. The target throughput was reached after 280 seconds. &lt;/p&gt; &lt;p&gt;The next test was with HDDs and extent read-ahead. One hour into the experiment, the CPU was about 70% after processing around 1000 query mixes. It might have been hours before HDD reads became rare enough for hitting 1200% CPU. The test was not worth continuing.&lt;/p&gt; &lt;p&gt;The result with HDDs and vectored read-ahead would be worse since vectored read-ahead leads to smaller read-ahead batches and to less contiguous read patterns. The individual read times here are over twice the individual read times with per-extent read-ahead. The fact that vectored read-ahead does not read potentially unneeded pages makes no difference. Hence this test is also not worth running to completion.&lt;/p&gt; &lt;p&gt;There are other possibilities for improving HDD I/O. If only 2MB read requests are made, a transfer will be about 20 ms at a sequential transfer speed of 50 MB/s. 
Then seeking to the next 2MB extent will be a few ms, most often less than 20, so the HDD should give at least half the nominal throughput.&lt;/p&gt; &lt;p&gt;We note that, when reading sequential 8K pages inside a single 2MB (256 page) extent, the seek latency is not 0 as one would expect but an extreme 5 ms. One would think that the drive would buffer a whole track, and a track would hold a large number of 2MB sections, but apparently this is not so. &lt;/p&gt; &lt;p&gt;Therefore, now if we have a sequential read pattern that is more dense than 1 page out of 10, we read all the pages and just keep the ones we want.&lt;/p&gt; &lt;p&gt;So now we set the read-ahead to merge reads that fall within 10 pages. This wastes bandwidth, but supposedly saves on latency. We will see. &lt;/p&gt; &lt;p&gt;So we try, and we find that read-ahead does not account for most pages since it does not get triggered. Thus, we change the triggering condition to be the 2nd read to fall in the extent within 20 seconds of the first.&lt;/p&gt; &lt;p&gt;The HDDs were in all cases 700% busy for 4 HDDs. But with the new setting we get longer requests, most often full extents, which gets a per-HDD transfer rate of about 5 MB/s. With the looser condition for starting read-ahead, 89% of all pages were read in a read-ahead batch. We see the I/O throughput decrease during the run because there are more single-page reads that do not trigger extent read-ahead. So HDDs have 1.7 concurrent operations pending, but the batch size drops, dropping the throughput.&lt;/p&gt; &lt;p&gt; &lt;/p&gt; &lt;p&gt;Thus with the best settings, the test with 2000 query mixes finishes in 46 minutes, and the CPU utilization is steadily increasing, hitting 392% for the last minute. In comparison, with SSDs and our worst read-ahead setting we got 1200% CPU in under 5 minutes from cold start. The I/O system can be further tuned; for example, by only reading full extents as long as the buffer pool is not full. 
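&lt;/p&gt; &lt;p&gt;The rule of merging reads that fall within 10 pages of each other can be illustrated with a minimal Python sketch. This is not Virtuoso code: the function name &lt;code&gt;merge_page_reads&lt;/code&gt;, the representation of a request as a (first page, page count) pair, and the reading of the rule as a page-number difference of at most 10 are all assumptions made for illustration.&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt;

```python
from typing import Iterable, List, Tuple

# Assumption for illustration: the post only says that reads falling
# within 10 pages are merged; the exact boundary rule is hypothetical.
MERGE_WINDOW_PAGES = 10


def merge_page_reads(pages: Iterable[int],
                     window: int = MERGE_WINDOW_PAGES) -> List[Tuple[int, int]]:
    """Turn wanted page numbers into (first_page, page_count) requests.

    Pages are visited in ascending order, as an elevator would. A wanted
    page joins the previous request when it lies no more than `window`
    pages past that request's last page; the unwanted pages in the gap
    are read too and simply discarded by the caller.
    """
    requests: List[Tuple[int, int]] = []
    for page in sorted(set(pages)):
        if requests:
            first, count = requests[-1]
            last = first + count - 1
            if not page - last > window:
                # Close enough: extend the previous request over the gap.
                requests[-1] = (first, page - first + 1)
                continue
        # Too far away (or first page seen): start a new request.
        requests.append((page, 1))
    return requests


if __name__ == "__main__":
    print(merge_page_reads([3, 4, 9, 40, 41, 200]))
```

&lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;Under these assumptions, the wanted pages 3, 4, 9, 40, 41, and 200 become three requests covering pages 3 to 9, pages 40 to 41, and page 200. Pages 5 through 8 are read and thrown away, which is the bandwidth traded for fewer, longer requests.&lt;/p&gt; &lt;p&gt;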
In the next post we will measure some more. &lt;/p&gt; &lt;p&gt; &lt;/p&gt; &lt;h3&gt;BSBM Note &lt;/h3&gt; &lt;p&gt;We look at query times with semi-warm cache, with CPU around 400%. We note that Q8-Q12 are especially bad. Q5 runs at about half speed. Q12 runs at under 1/10th speed. The relatively slowest queries appear to be single-instance lookups. Nothing short of the most aggressive speculative reading can help there. Neither query nor workload has any exploitable pattern. Therefore if an I/O component is to be included in a BSBM metric, the only way to score well on it is to use speculative read to the maximum.&lt;/p&gt; &lt;p&gt;Some of the queries take consecutive property values of a single instance. One could parallelize this pipeline, but this would be a one-off and would make sense only when reading from storage (whether HDD, SSD, or otherwise). Multithreading for single rows is not worth the overhead.&lt;/p&gt; &lt;p&gt;A metric for BSBM warm-up is not interesting for database science, but may still be of practical interest in the specific case of &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1d6371d8&quot;&gt;RDF&lt;/a&gt; stores. In particular, reading large chunks at startup time is good, so putting a section in BSBM that would force one to implement this would be a service to most end users. Measuring and reporting such I/O performance would favor space efficiency in general. Space efficiency is generally a good thing, especially at larger scales, so we can put an optional section in the report for warm-up. This is also good for comparing HDDs and SSDs, and for testing read-ahead, which is still something a database is expected to do. 
Implementors have it easy; just speculatively read everything.&lt;/p&gt; &lt;p&gt;Looking at the BSBM fictional use case, anybody running such a portal would do this from RAM only, so it makes sense to define the primary metric as running from warm cache, in practice 100% from memory.&lt;/p&gt; &lt;h3&gt; &lt;i&gt;Benchmarks, Redux&lt;/i&gt; Series&lt;/h3&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1658&quot; id=&quot;link-id0x1ecb2af0&quot;&gt;Benchmarks, Redux (part 1): On RDF Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1660&quot; id=&quot;link-id0x19d05678&quot;&gt;Benchmarks, Redux (part 2): A Benchmarking Story&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1663&quot; id=&quot;link-id0x1d542328&quot;&gt;Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1665&quot; id=&quot;link-id0x13947e08&quot;&gt;Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire&lt;/a&gt; &lt;/li&gt; &lt;li&gt; Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs &lt;i&gt;(this post)&lt;/i&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1669&quot; id=&quot;link-id0x1a7f6b30&quot;&gt;Benchmarks, Redux (part 6): BSBM and I/O, continued&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1671&quot; id=&quot;link-id0x1d67dd40&quot;&gt;Benchmarks, Redux (part 7): What Does BSBM Explore Measure?&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1673&quot; id=&quot;link-id0x1ebcee68&quot;&gt;Benchmarks, Redux (part 8): BSBM Explore and Update &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1675&quot; id=&quot;link-id0x1a855ba0&quot;&gt;Benchmarks, 
Redux (part 9): BSBM With Cluster&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1677&quot; id=&quot;link-id0x1b081e70&quot;&gt;Benchmarks, Redux (part 10): LOD2 and the Benchmark Process&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1678&quot; id=&quot;link-id0x1d7a7940&quot;&gt;Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1d7e2cd0&quot;&gt;Benchmarks, Redux (part 12): Our Own BSBM Results Report&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1e375338&quot;&gt;Benchmarks, Redux (part 13): BSBM BI Modifications &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1d199728&quot;&gt;Benchmarks, Redux (part 14): BSBM BI Mix &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1e808818&quot;&gt;Benchmarks, Redux (part 15): BSBM Test Driver Enhancements &lt;/a&gt; &lt;/li&gt; &lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2011-03-04#1666</atom:id>
  <atom:published>2011-03-04T20:28:28Z</atom:published>
  <atom:updated>2011-03-14T17:56:38.000002-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;Below is a questionnaire I sent to the &lt;a class=&quot;auto-href&quot; href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x17f62428&quot;&gt;BSBM&lt;/a&gt; participants in order to get tuning instructions for the runs we were planning. I have filled in the answers for &lt;a class=&quot;auto-href&quot; href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1d48ed28&quot;&gt;Virtuoso&lt;/a&gt; here. This can be a checklist for pretty much any &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1e11b228&quot;&gt;RDF&lt;/a&gt; database tuning.&lt;/p&gt; &lt;ol&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Threading - What settings should be used (e.g., for query parallelization, I/O parallelization [e.g., prefetch, flush of dirty], thread pools [e.g., web server], anything else thread-related)? We will run with 8 and 32 cores, so if there are settings controlling number of read/write (R/W) locks or mutexes or such for serializing diverse things, these should be set accordingly to minimize contention.&lt;/b&gt; &lt;/p&gt; &lt;p&gt;The following three settings are all &lt;a href=&quot;http://docs.openlinksw.com/virtuoso/databaseadmsrv.html#ini_Parameters&quot; id=&quot;link-id0x1ed4fe10&quot;&gt;in the &lt;code&gt;[Parameters]&lt;/code&gt; section of the &lt;code&gt;virtuoso.ini&lt;/code&gt; file&lt;/a&gt;. &lt;/p&gt; &lt;ul&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;&lt;code&gt;AsyncQueueMaxThreads&lt;/code&gt; &lt;/b&gt; controls the size of a pool of extra threads that can be used for query parallelization. 
This should be set to either &lt;b&gt;1.5 * the number of cores&lt;/b&gt; or &lt;b&gt;1.5 * the number of core threads&lt;/b&gt;; see which works better.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;&lt;code&gt;ThreadsPerQuery&lt;/code&gt; &lt;/b&gt; is the maximum number of threads a single query will take. This should be set to either &lt;b&gt;the number of cores&lt;/b&gt; or &lt;b&gt;the number of core threads&lt;/b&gt;; see which works better. &lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;&lt;code&gt;IndexTreeMaps&lt;/code&gt; &lt;/b&gt; is the number of mutexes over which control for buffering an index tree is split. This can generally be left at default (&lt;b&gt;256&lt;/b&gt; in normal operation; valid settings are powers of 2 from 2 to 1024), but setting to &lt;b&gt;64, 128, or 512&lt;/b&gt; may be beneficial.&lt;/p&gt; &lt;p&gt;A low number will lead to frequent contention; upwards of 64 will have little contention. We have sometimes seen a multiuser workload go 10% faster when setting this to 64 (down from 256), which seems counter-intuitive. This may be a &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Cache&quot; id=&quot;link-id0x1e12b618&quot;&gt;cache&lt;/a&gt; artifact.&lt;/p&gt; &lt;/li&gt; &lt;/ul&gt; &lt;p&gt;&lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://docs.openlinksw.com/virtuoso/databaseadmsrv.html#ini_HTTPServer&quot; id=&quot;link-id0x1f8960a0&quot;&gt;In the &lt;code&gt;[HTTPServer]&lt;/code&gt; section of the &lt;code&gt;virtuoso.ini&lt;/code&gt; file&lt;/a&gt;, the &lt;b&gt;&lt;code&gt;ServerThreads&lt;/code&gt;&lt;/b&gt; setting is the number of web server threads, i.e., the maximum number of concurrent &lt;a class=&quot;auto-href&quot; href=&quot;http://www.w3.org/TR/rdf-sparql-protocol/&quot; id=&quot;link-id0x17e4d690&quot;&gt;SPARQL protocol&lt;/a&gt; requests. 
Having a value larger than the number of concurrent clients is OK; for large numbers of concurrent clients a lower value may be better, in which case requests wait for a thread to become available.&lt;/p&gt; &lt;p&gt;Note: The &lt;code&gt;[HTTPServer] ServerThreads&lt;/code&gt; are taken from the total pool made available by the &lt;code&gt;[Parameters] ServerThreads&lt;/code&gt;. Thus, the &lt;code&gt;[Parameters] ServerThreads&lt;/code&gt; should always be at least as large as (and is best set greater than) the &lt;code&gt;[HTTPServer] ServerThreads&lt;/code&gt;, and if using the closed-source Commercial Version, &lt;code&gt;[Parameters] ServerThreads&lt;/code&gt; cannot exceed the licensed thread count. &lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;File layout - Are there settings for striping over multiple devices? Settings for other file access parallelism? Settings for SSDs (e.g., SSD based cache of hot set of larger db files on disk)? The target config is for 4 independent disks and 4 independent SSDs. If you depend on RAID, are there settings for this? If you need RAID to be set up, please provide the settings/script for doing this with 4 SSDs on Linux (RH and Debian). This will be software RAID, as we find the hardware RAID to be much worse than an independent disk setup on the system in question.&lt;/b&gt; &lt;/p&gt; &lt;p&gt;It is best to stripe database files over all available disks, and to not use RAID. If RAID is desired, then stripe database files across many RAID sets. Use the &lt;code&gt;segment&lt;/code&gt; declaration in the &lt;code&gt;virtuoso.ini&lt;/code&gt; file. It is very important to give each independently seekable device its own I/O queue thread. 
See the documentation on the &lt;a class=&quot;auto-href&quot; href=&quot;http://www.tpc.org/&quot; id=&quot;link-id0x1e9a6bc0&quot;&gt;TPC&lt;/a&gt;-&lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/C%2B%2B&quot; id=&quot;link-id0x1ebdf210&quot;&gt;C&lt;/a&gt; sample for examples. &lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://docs.openlinksw.com/virtuoso/databaseadmsrv.html#ini_Parameters&quot; id=&quot;link-id0x1f893f48&quot;&gt;In the &lt;code&gt;[Parameters]&lt;/code&gt; section of the &lt;code&gt;virtuoso.ini&lt;/code&gt; file&lt;/a&gt;, set &lt;code&gt;FDsPerFile&lt;/code&gt; to be &lt;code&gt; (the number of concurrent threads * 1.5) / the number of distinct database files&lt;/code&gt;.&lt;/p&gt; &lt;p&gt;There are no SSD-specific settings.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Loading - How many parallel streams work best? We are looking for non-transactional bulk load, with no inference materialization. For partitioned cluster settings, do we divide the load streams over server processes? &lt;/b&gt; &lt;/p&gt; &lt;p&gt;Use one stream per core (not per core thread). In the case of a cluster, divide load streams evenly across all processes. 
The total number of streams on a cluster can equal the total number of cores; adjust up or down depending on what is observed.&lt;/p&gt; &lt;p&gt;Use the built-in bulk load facility, i.e., &lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;ld_dir (&amp;#39;&amp;lt;source-filename-or-directory&amp;gt;&amp;#39;, &amp;#39;&amp;lt;file name pattern&amp;gt;&amp;#39;, &amp;#39;&amp;lt;destination graph iri&amp;gt;&amp;#39;);&lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;For example,&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x1dc52c58&quot;&gt;SQL&lt;/a&gt;&amp;gt; ld_dir (&amp;#39;/path/to/files&amp;#39;, &amp;#39;*.n3&amp;#39;, &amp;#39;&lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Hypertext_Transfer_Protocol&quot; id=&quot;link-id0x1e76bfc8&quot;&gt;http&lt;/a&gt;://&lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/DBpedia&quot; id=&quot;link-id0x1e9a6ad8&quot;&gt;dbpedia&lt;/a&gt;.org&amp;#39;);&lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;Then do a &lt;code&gt;rdf_loader_run ()&lt;/code&gt; on enough connections. For example, you can use the shell command &lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;isql rdf_loader_run () &amp;amp;&lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;to start one in a background isql process. When starting background load commands from the shell, you can use the shell &lt;code&gt;wait&lt;/code&gt; command to wait for completion. If starting from isql, use the &lt;code&gt;wait_for_children;&lt;/code&gt; command (see &lt;a href=&quot;http://docs.openlinksw.com/virtuoso/isql.html&quot; id=&quot;link-id0x1ae0f230&quot;&gt;isql documentation&lt;/a&gt; for details). 
&lt;/p&gt; &lt;p&gt;See the &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1d635820&quot;&gt;BSBM disclosure report&lt;/a&gt; for an example load script.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What command should be used after non-transactional bulk load, to ensure a consistent persistent state on disk, like a log checkpoint or similar? Load and checkpoint will be timed separately, load being &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Central_processing_unit&quot; id=&quot;link-id0x1e6f1000&quot;&gt;CPU&lt;/a&gt;-bound and checkpoint being I/O-bound. No roll-forward log or similar is required; the load does not have to recover if it fails before the checkpoint.&lt;/b&gt; &lt;/p&gt; &lt;p&gt;Execute &lt;/p&gt; &lt;blockquote&gt; &lt;code&gt; CHECKPOINT;&lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;through a SQL client, e.g., &lt;code&gt;isql&lt;/code&gt;. This is not a &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x1c2401d8&quot;&gt;SPARQL&lt;/a&gt; statement and cannot be executed over the SPARQL protocol.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What settings should be used for trickle load of small triple sets into a pre-existing graph? This should be as transactional as supported; at least there should be a roll forward log, unlike the case for the bulk load.&lt;/b&gt; &lt;/p&gt; &lt;p&gt;No special settings are needed for load testing; defaults will produce transactional behavior with a roll forward log. 
Default transaction isolation is &lt;b&gt;&lt;code&gt;REPEATABLE READ&lt;/code&gt;&lt;/b&gt;, but this may be altered via SQL session settings or at Virtuoso server start-up through &lt;a href=&quot;http://docs.openlinksw.com/virtuoso/databaseadmsrv.html#ini_Parameters&quot; id=&quot;link-id0x1a791b80&quot;&gt;the &lt;code&gt;[Parameters]&lt;/code&gt; section of the &lt;code&gt;virtuoso.ini&lt;/code&gt; file&lt;/a&gt;, with&lt;/p&gt; &lt;blockquote&gt; &lt;b&gt;&lt;code&gt;&lt;a href=&quot;http://wikis.openlinksw.com/dataspace/owiki/wiki/VirtuosoWikiWeb/ChangeVirtuosoSDefaultTransactionIsolationLevel&quot; id=&quot;link-id0x1e5536b8&quot;&gt;DefaultIsolation&lt;/a&gt; = 4&lt;/code&gt; &lt;/b&gt; &lt;/blockquote&gt; &lt;p&gt; Transaction isolation cannot be set over the SPARQL protocol.&lt;/p&gt; &lt;p&gt; NOTE: When testing full CRUD operations, other isolation settings may be preferable, due to &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/ACID&quot; id=&quot;link-id0x1ce6a310&quot;&gt;ACID&lt;/a&gt; considerations. See answer #12, below, and detailed discussion in part 8 of this series, &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1673&quot; id=&quot;link-id0x1b7eb5f0&quot;&gt;BSBM &lt;i&gt;Explore and Update&lt;/i&gt;&lt;/a&gt;.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What settings control allocation of memory for database caching? We will be running mostly from memory, so we need to make sure that there is enough memory configured. &lt;/b&gt; &lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://docs.openlinksw.com/virtuoso/databaseadmsrv.html#ini_Parameters&quot; id=&quot;link-id0x1acd8fe8&quot;&gt;In the &lt;code&gt;[Parameters]&lt;/code&gt; section of the &lt;code&gt;virtuoso.ini&lt;/code&gt; file&lt;/a&gt;, &lt;b&gt;&lt;code&gt;NumberOfBuffers&lt;/code&gt;&lt;/b&gt; controls the amount of RAM used by Virtuoso to cache database files. One buffer caches an 8KB database page. 
In practice, count 10KB of memory per page. If &amp;quot;swappiness&amp;quot; on Linux is low (e.g., 2), two-thirds or more of physical memory can be used for database buffers. If swapping occurs, decrease the setting.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What command gives status on memory allocation (e.g., number of buffers, number of dirty buffers, etc.) so that we can verify that things are indeed in server memory and not, for example, being served from OS disk cache. If the cached format is different from the disk layout (e.g., decompression after disk read), is there a command for space statistics for database cache? &lt;/b&gt; &lt;/p&gt; &lt;p&gt;In an &lt;code&gt;isql&lt;/code&gt; session, execute &lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;STATUS ( ? ? );&lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;The second result paragraph gives counts of total, used, and dirty buffers. If used buffers is steady and less than total, and if the disk read count on the line below does not increase, the system is running from memory. The cached format is the same as the disk based format.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What command gives &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x1c185f28&quot;&gt;information&lt;/a&gt; on disk allocation for different things? We are looking for the total size of allocated database pages for quads (including table, indices, anything else associated with quads) and dictionaries for literals, IRI names, etc. If there is a text index on literals, what command gives space stats for this? We count used pages, excluding any preallocated unused pages or other gaps. 
There is one number for quads and another for the dictionaries or other such structures, optionally a third for text index.&lt;/b&gt; &lt;/p&gt; &lt;p&gt;Execute in an &lt;code&gt;isql&lt;/code&gt; session: &lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt; CHECKPOINT; SELECT TOP 20 * FROM sys_index_space_stats ORDER BY iss_pages DESC; &lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;The &lt;code&gt;iss_pages&lt;/code&gt; column is the total pages for each index, including blob pages. Pages are 8KB. Only used pages are reported; gaps and unused pages are not counted. The rows pertaining to &lt;code&gt;RDF_QUAD&lt;/code&gt; are for quads; &lt;code&gt;RDF_IRI&lt;/code&gt;, &lt;code&gt;RDF_PREFIX&lt;/code&gt;, &lt;code&gt;RO_START&lt;/code&gt;, &lt;code&gt;RDF_OBJ&lt;/code&gt; are for dictionaries; &lt;code&gt;RDF_OBJ_RO_FLAGS_WORDS&lt;/code&gt; and &lt;code&gt;VTLOG_DB_DBA_RDF_OBJ&lt;/code&gt; are for the text index. &lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;If there is a choice between triples and quads, we will run with quads. How do we ascertain that the run is with quads? How do we find out the index scheme? Should we use an alternate index scheme? Most of the &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1c573780&quot;&gt;data&lt;/a&gt; will be in a single big graph.&lt;/b&gt; &lt;/p&gt; &lt;p&gt;The default scheme uses quads. The default index layout is &lt;code&gt;PSOG&lt;/code&gt;, &lt;code&gt;POGS&lt;/code&gt;, &lt;code&gt;GS&lt;/code&gt;, &lt;code&gt;SP&lt;/code&gt;, &lt;code&gt;OP&lt;/code&gt;. To see the current index scheme, use an &lt;code&gt;isql&lt;/code&gt; session to execute&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;STATISTICS DB.DBA.RDF_QUAD;&lt;/code&gt; &lt;/blockquote&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;For partitioned cluster settings, are there partitioning-related settings to control even distribution of data between partitions? 
For example, is there a way to set partitioning by &lt;code&gt;S&lt;/code&gt; or &lt;code&gt;O&lt;/code&gt; depending on which is first in key order for each index? &lt;/b&gt; &lt;/p&gt; &lt;p&gt;The default partitioning settings are good, i.e., partitioning is on &lt;code&gt;O&lt;/code&gt; or &lt;code&gt;S&lt;/code&gt;, whichever is first in key order.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;For partitioned clusters, are there settings to control message batching or similar? What are the statistics available for checking interconnect operation, e.g. message counts, latencies, total aggregate throughput of interconnect?&lt;/b&gt; &lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://docs.openlinksw.com/virtuoso/clusteroperation.html#clusteroperationgeneralclusterinifields&quot; id=&quot;link-id0x1ec6dff0&quot;&gt;In the &lt;code&gt;[Cluster]&lt;/code&gt; section of the &lt;code&gt;cluster.ini&lt;/code&gt; file&lt;/a&gt;, &lt;b&gt;&lt;code&gt;ReqBatchSize&lt;/code&gt;&lt;/b&gt; is the number of query states dispatched between cluster nodes per message round trip. This may be incremented from the default of &lt;code&gt;10000&lt;/code&gt; to &lt;code&gt;50000&lt;/code&gt; or so if this is seen to be useful. &lt;/p&gt; &lt;p&gt;To change this on the fly, the following can be issued through an &lt;code&gt;isql&lt;/code&gt; session:&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;cl_exec ( &amp;#39; __dbf_set (&amp;#39;&amp;#39;cl_request_batch_size&amp;#39;&amp;#39;, 50000) &amp;#39; ); &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;The commands below may be executed through an &lt;code&gt;isql&lt;/code&gt; session to get a summary of CPU and message traffic for the whole cluster or process-by-process, respectively. The documentation &lt;a href=&quot;http://docs.openlinksw.com/virtuoso/clusteroperation.html#clusteroperationadminstdispl&quot; id=&quot;link-id0x1dfccec0&quot;&gt;details the fields&lt;/a&gt;. 
&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt; &lt;code&gt;STATUS (&amp;#39;cluster&amp;#39;) ;; whole cluster&lt;/code&gt; &lt;br /&gt; &lt;code&gt;STATUS (&amp;#39;cluster_d&amp;#39;) ;; process-by-process&lt;/code&gt; &lt;/pre&gt;&lt;/blockquote&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Other settings - Are there settings for limiting query planning, when appropriate? For example, the BSBM &lt;i&gt;Explore&lt;/i&gt; mix has a large component of unnecessary query optimizer time, since the queries themselves access almost no data. Any other relevant settings?&lt;/b&gt; &lt;/p&gt; &lt;ul&gt; &lt;li&gt; &lt;p&gt;For BSBM, needless query &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Program_optimization&quot; id=&quot;link-id0x1f0ffab8&quot;&gt;optimization&lt;/a&gt; should be capped at Virtuoso server start-up through the &lt;code&gt;[Parameters]&lt;/code&gt; section of the &lt;code&gt;virtuoso.ini&lt;/code&gt;, with&lt;/p&gt; &lt;blockquote&gt; &lt;b&gt;&lt;code&gt;StopCompilerWhenXOverRun = 1&lt;/code&gt; &lt;/b&gt; &lt;/blockquote&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;When testing full CRUD operations (not simply CREATE, i.e., load, as discussed in #5, above), it is essential to make queries run with transaction isolation of &lt;code&gt;READ COMMITTED&lt;/code&gt;, to remove most lock contention. Transaction isolation cannot be adjusted via SPARQL. 
This can be changed through SQL session settings, or at Virtuoso server start-up &lt;a href=&quot;http://docs.openlinksw.com/virtuoso/databaseadmsrv.html#ini_Parameters&quot; id=&quot;link-id0x1f3a43c8&quot;&gt;through the &lt;code&gt;[Parameters]&lt;/code&gt; section of the &lt;code&gt;virtuoso.ini&lt;/code&gt; file&lt;/a&gt;, with&lt;/p&gt; &lt;blockquote&gt; &lt;b&gt;&lt;code&gt;&lt;a href=&quot;http://wikis.openlinksw.com/dataspace/owiki/wiki/VirtuosoWikiWeb/ChangeVirtuosoSDefaultTransactionIsolationLevel&quot; id=&quot;link-id0x1a5a51e0&quot;&gt;DefaultIsolation&lt;/a&gt; = 2&lt;/code&gt; &lt;/b&gt; &lt;/blockquote&gt; &lt;/li&gt; &lt;/ul&gt; &lt;/li&gt; &lt;/ol&gt; &lt;h3&gt; &lt;i&gt;Benchmarks, Redux&lt;/i&gt; Series&lt;/h3&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1658&quot; id=&quot;link-id0x1d6e5428&quot;&gt;Benchmarks, Redux (part 1): On RDF Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1660&quot; id=&quot;link-id0x1c3ea770&quot;&gt;Benchmarks, Redux (part 2): A Benchmarking Story&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1663&quot; id=&quot;link-id0x1efeca30&quot;&gt;Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore&lt;/a&gt; &lt;/li&gt; &lt;li&gt; Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire &lt;i&gt;(this post)&lt;/i&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1667&quot; id=&quot;link-id0x1bda5158&quot;&gt;Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1669&quot; id=&quot;link-id0x1ec74808&quot;&gt;Benchmarks, Redux (part 6): BSBM and I/O, continued&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1671&quot; id=&quot;link-id0x1ea253a0&quot;&gt;Benchmarks, Redux (part 
7): What Does BSBM Explore Measure?&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1673&quot; id=&quot;link-id0x1b02d528&quot;&gt;Benchmarks, Redux (part 8): BSBM Explore and Update &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1675&quot; id=&quot;link-id0x1ae81fc0&quot;&gt;Benchmarks, Redux (part 9): BSBM With Cluster&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1677&quot; id=&quot;link-id0x197515c0&quot;&gt;Benchmarks, Redux (part 10): LOD2 and the Benchmark Process&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1678&quot; id=&quot;link-id0x1a78db90&quot;&gt;Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1d32ae10&quot;&gt;Benchmarks, Redux (part 12): Our Own BSBM Results Report&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1e8fcc18&quot;&gt;Benchmarks, Redux (part 13): BSBM BI Modifications &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1ae95050&quot;&gt;Benchmarks, Redux (part 14): BSBM BI Mix &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1dbf3158&quot;&gt;Benchmarks, Redux (part 15): BSBM Test Driver Enhancements &lt;/a&gt; &lt;/li&gt; &lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2011-03-02#1664</atom:id>
  <atom:published>2011-03-02T23:23:16Z</atom:published>
  <atom:updated>2011-03-14T17:16:56-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;In this post I will summarize the figures for &lt;a class=&quot;auto-href&quot; href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x1dcf58f8&quot;&gt;BSBM&lt;/a&gt; Load and &lt;i&gt;Explore&lt;/i&gt; mixes at 100 Mt, 200 Mt, and 1000 Mt. (1 Mt = 1 Megatriple, or one million triples.) The measurements were made on a 72GB 2xXeon 5520 with 4 SSDs. The exact specifications and configurations are in the raw reports to follow.&lt;/p&gt; &lt;p&gt;The load time in &lt;a href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/results/V6/index.html&quot; id=&quot;link-id0x1f3716d8&quot;&gt;the recent Berlin report&lt;/a&gt; was measured with &lt;a href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/results/V6/index.html#resultsExplore&quot; id=&quot;link-id0x1dd37f80&quot;&gt;the wrong function&lt;/a&gt;, and so far as we can tell, without multiple threads. The intermediate cut of &lt;a class=&quot;auto-href&quot; href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1ddb0c90&quot;&gt;Virtuoso&lt;/a&gt; they tested also &lt;a href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/results/V6/index.html#resultsExploreAndUpdate&quot; id=&quot;link-id0x1e5fcf40&quot;&gt; had broken&lt;/a&gt; &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x1e1d2b70&quot;&gt;SPARQL&lt;/a&gt;/&lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/page/SPARUL&quot; id=&quot;link-id0x1bfb00c0&quot;&gt;Update&lt;/a&gt; (also known as &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/page/SPARUL&quot; id=&quot;link-id0x1e0d5fd8&quot;&gt;SPARUL&lt;/a&gt;) features. 
We have fixed this since, and give &lt;a href=&quot;http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/BenchmarksReduxSupportingFiles/results.zip&quot; id=&quot;link-id0x1edf36b0&quot;&gt;here the right numbers&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;In the course of the discussion to follow, we talk about 3 different kinds of Virtuoso:&lt;/p&gt; &lt;ul&gt; &lt;li&gt; &lt;p&gt; &lt;i&gt;6 Single&lt;/i&gt; is the generally available single server configuration of Virtuoso. Whether this is open source or not does not make a difference.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;i&gt;6 Cluster&lt;/i&gt; is the generally available commercial only cluster-capable Virtuoso.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;i&gt;7 Single&lt;/i&gt; is the next generation single server Virtuoso, about to be released as a preview.&lt;/p&gt; &lt;/li&gt; &lt;/ul&gt; &lt;p&gt;To understand the numbers, we must explain how these differ from each other in execution:&lt;/p&gt; &lt;ul&gt; &lt;li&gt; &lt;p&gt; &lt;i&gt;6 Single&lt;/i&gt; has one thread-per-query, and operates on one state of the query at a time.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;i&gt;6 Cluster&lt;/i&gt; has one thread-per-query-per-process, and between processes it operates on batches of some tens-of-thousands of simultaneous query states. Within each node, these batches run through the execution pipeline one state at a time. Aggregation is distributed, and the query optimizer is generally smart about shipping colocated functions together.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;i&gt;7 Single&lt;/i&gt; has multiple threads-per-query and in all situations operates on batches of 10,000 or more simultaneous query states. 
This means, for example, that index lookups get large numbers of parameters which then are sorted to get an ascending search pattern which benefits from locality, so the &lt;code&gt;n * log(n)&lt;/code&gt; index access for the batch becomes more like linear if the &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1ceca188&quot;&gt;data&lt;/a&gt; accessed has any locality. Furthermore, if there are many operands to an operator, these can be split on multiple threads. Also, scans of consecutive rows can be split before the scan on multiple threads, each doing a range of the scan. These features are called &lt;i&gt;vectored execution&lt;/i&gt; and &lt;i&gt;query parallelization&lt;/i&gt;. These techniques will also be applied to the cluster variant in due time.&lt;/p&gt; &lt;/li&gt; &lt;/ul&gt; &lt;p&gt;The version 6 and 7 variants discussed here use the same physical storage layout with row-wise &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Data_compression&quot; id=&quot;link-id0x1e521fa0&quot;&gt;key compression&lt;/a&gt;. Additionally, there exists a column-wise storage option in 7 that can fit 4x the number of quads in the same space. This column store option is not used here because it still has some problems with random order inserts.&lt;/p&gt; &lt;p&gt; We will first consider loading. 
Below are the load times and rates for 7 at each scale.&lt;/p&gt; &lt;table border=&quot;1&quot; cellspacing=&quot;2&quot; cellpadding=&quot;2&quot; align=&quot;center&quot; width=&quot;90%&quot;&gt; &lt;tr&gt; &lt;th colspan=&quot;4&quot; align=&quot;center&quot;&gt;7 Single&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;Scale&lt;/th&gt; &lt;th align=&quot;center&quot;&gt;Rate &lt;br /&gt; (quads per second)&lt;/th&gt; &lt;th align=&quot;center&quot;&gt;Load time &lt;br /&gt; (seconds)&lt;/th&gt; &lt;th align=&quot;center&quot;&gt;Checkpoint time &lt;br /&gt; (seconds)&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;100 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 261,366 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 301 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 82 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;200 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 216,000 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 802 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 123 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;1000 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 130,378 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 6641 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 1012 &lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;p&gt;In each case the load was made on 8 concurrent streams, each reading a file from a pool of 80 files for the two smaller scales and 360 files for the larger scale.&lt;/p&gt; &lt;p&gt;We also loaded the smallest data set with 6 Single using the same load script. 
&lt;/p&gt; &lt;table border=&quot;1&quot; cellspacing=&quot;2&quot; cellpadding=&quot;2&quot; align=&quot;center&quot; width=&quot;90%&quot;&gt; &lt;tr&gt; &lt;th colspan=&quot;4&quot; align=&quot;center&quot;&gt;6 Single&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;Scale&lt;/th&gt; &lt;th align=&quot;center&quot;&gt;Rate &lt;br /&gt; (quads per second)&lt;/th&gt; &lt;th align=&quot;center&quot;&gt;Load time &lt;br /&gt; (seconds)&lt;/th&gt; &lt;th align=&quot;center&quot;&gt;Checkpoint time &lt;br /&gt; (seconds)&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;100 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 74,713 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 1192 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 145 &lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;p&gt; &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Central_processing_unit&quot; id=&quot;link-id0x1132ad18&quot;&gt;CPU&lt;/a&gt; time with 6 Single was 8047 seconds. We compare this to 4453 seconds of CPU for the same load on 7 Single. The CPU% during the run was on either side of 700% for 6 Single and 1300% for 7 Single. Note that percentages this high count core threads, not physical cores. &lt;/p&gt; &lt;p&gt;The difference is mostly attributable to vectoring and the introduction of a non-transactional insert. The 6 Single inserts transactionally but makes very frequent commits and writes no log; the result is &lt;i&gt;de facto&lt;/i&gt; non-transactional behavior, but there is still a lock and commit cycle. Inserts in an &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1c750368&quot;&gt;RDF&lt;/a&gt; load usually exhibit locality on all of SPOG. Sorting by value gives an ascending insert order and eliminates much of the lookup time for deciding where the next row will go. 
Contention on page read-write locks is less because the engine stays longer on a page, inserting multiple values in one go, instead of re-acquiring the read-write lock and possible transaction locks for each row.&lt;/p&gt; &lt;p&gt;Furthermore, for single stream loading the non-transactional mode can serve one thread doing the parsing with many threads doing the inserting; hence, in practice the speed is bounded by the parsing speed. In multi-stream load this parallelization also happens but is less significant, as adding threads past the count of core threads is not useful. Writes are all in-place, and no delta-merge mechanism is involved. For transactional inserts, the uncommitted rows are not visible to read-committed readers, which do not block. Repeatable and serializable readers would block before an uncommitted insert.&lt;/p&gt; &lt;p&gt;Now for the run (larger numbers indicate more queries executed, and are therefore better):&lt;/p&gt; &lt;table border=&quot;1&quot; cellspacing=&quot;2&quot; cellpadding=&quot;2&quot; align=&quot;center&quot; width=&quot;90%&quot;&gt; &lt;tr&gt; &lt;th colspan=&quot;3&quot; align=&quot;center&quot;&gt; 6 Single Throughput &lt;br /&gt; (QMpH, query mixes per hour) &lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;Scale&lt;/th&gt; &lt;th align=&quot;center&quot;&gt; Single User &lt;/th&gt; &lt;th align=&quot;center&quot;&gt; 16 User &lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;100 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 7641 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 29433 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;200 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 6017 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 13335 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;1000 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 1770 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 2487 &lt;/td&gt; &lt;/tr&gt; 
&lt;/table&gt; &lt;br /&gt; &lt;table border=&quot;1&quot; cellspacing=&quot;2&quot; cellpadding=&quot;2&quot; align=&quot;center&quot; width=&quot;90%&quot;&gt; &lt;tr&gt; &lt;th colspan=&quot;3&quot; align=&quot;center&quot;&gt; 7 Single Throughput &lt;br /&gt; (QMpH, query mixes per hour) &lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;Scale&lt;/th&gt; &lt;th align=&quot;center&quot;&gt; Single User &lt;/th&gt; &lt;th align=&quot;center&quot;&gt; 16 User &lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;100 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 11742 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 72278 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;200 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 10225 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 60951 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th align=&quot;center&quot;&gt;1000 Mt&lt;/th&gt; &lt;td align=&quot;center&quot;&gt; 6262 &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 24672 &lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;p&gt;The 100 Mt and 200 Mt runs are entirely in memory; the 1000 Mt run is mostly in memory, with about a 1.6 MB/s trickle from SSD in steady state. Accordingly, the 1000 Mt run is longer, with 2000 query mixes in the timed period, preceded by a warm-up of 2000 mixes with a different seed. For the memory-only scales, we run 500 mixes twice, and take the timing of the second run.&lt;/p&gt; &lt;p&gt;Looking at single user speeds, 6 Single and 7 Single are closest at the small end and drift farther apart at the larger scales. This comes from the increased opportunity to parallelize Q5, since this works on more data and is relatively more important as the scale gets larger. The 100 Mt run of 7 Single has about 130% CPU, and the 1000 Mt run has about 270%. This also explains why adding clients gives a larger boost at the smaller scale. 
&lt;/p&gt; &lt;p&gt;Now let us look at the relative effects of parallelizing and vectoring in 7 Single. We run 50 mixes of Single User &lt;i&gt;Explore&lt;/i&gt;: 6132 QMpH with both parallelizing and vectoring on; 2805 QMpH with execution limited to a single thread. Then we set the vector size to 1, meaning that the query pipeline runs one row at a time. This gets us 1319 QMpH which is a bit worse than 6 Single. This is to be expected since there is some overhead to running vectored with single-element vectors. Q5 on 7 Single with vectoring and a single thread runs at 1.9 qps; with single-element vectors, at 0.8 qps. The 6 Single engine runs Q5 at 1.13 qps.&lt;/p&gt; &lt;p&gt;The 100 Mt scale 7 Single gains the most from adding clients; the 1000 Mt 6 Single gains the least. The reason for the latter is covered in detail in &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1660&quot; id=&quot;link-id0x1b9ed390&quot;&gt;A Benchmarking Story&lt;/a&gt;. We note that while vectoring is primarily geared to better single-thread speed and better &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Cache&quot; id=&quot;link-id0x1ddc2f48&quot;&gt;cache&lt;/a&gt; hit rates, it delivers a huge multithreaded benefit by eliminating the mutex contention at the index tree top which stops 6 Single dead at 1000 Mt.&lt;/p&gt; &lt;p&gt;In conclusion, we see that even with a workload of short queries and little opportunity for parallelism, we get substantial benefits from query parallelization and vectoring. When moving to more complex workloads, the benefits become more pronounced. For a single user complex query load, we can get 7x speed-up from parallelism (8 core), plus up to 3x from vectoring. 
These numbers do not take into account the benefits of the column store; those will be analyzed separately a bit later.&lt;/p&gt; &lt;p&gt;The full run details will be supplied at the end of this &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Blog&quot; id=&quot;link-id0x1e9f6960&quot;&gt;blog&lt;/a&gt; series.&lt;/p&gt; &lt;h3&gt; &lt;i&gt;Benchmarks, Redux&lt;/i&gt; Series&lt;/h3&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1658&quot; id=&quot;link-id0x1d0bb988&quot;&gt;Benchmarks, Redux (part 1): On RDF Benchmarks &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1660&quot; id=&quot;link-id0x155fc700&quot;&gt;Benchmarks, Redux (part 2): A Benchmarking Story&lt;/a&gt; &lt;/li&gt; &lt;li&gt;Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore &lt;i&gt;(this post)&lt;/i&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1665&quot; id=&quot;link-id0x1d96e218&quot;&gt;Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1667&quot; id=&quot;link-id0x1d7a5170&quot;&gt;Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1669&quot; id=&quot;link-id0x1def9ca0&quot;&gt;Benchmarks, Redux (part 6): BSBM and I/O, continued&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1671&quot; id=&quot;link-id0x1a7a7800&quot;&gt;Benchmarks, Redux (part 7): What Does BSBM Explore Measure?&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1673&quot; id=&quot;link-id0x1e9c6c68&quot;&gt;Benchmarks, Redux (part 8): BSBM Explore and Update &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a 
href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1675&quot; id=&quot;link-id0x1e80c208&quot;&gt;Benchmarks, Redux (part 9): BSBM With Cluster&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1677&quot; id=&quot;link-id0x1dafd290&quot;&gt;Benchmarks, Redux (part 10): LOD2 and the Benchmark Process&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1678&quot; id=&quot;link-id0x1f34f7f8&quot;&gt;Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1df24f50&quot;&gt;Benchmarks, Redux (part 12): Our Own BSBM Results Report&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1f4b19c8&quot;&gt;Benchmarks, Redux (part 13): BSBM BI Modifications &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1de90cf8&quot;&gt;Benchmarks, Redux (part 14): BSBM BI Mix &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1ebefbe8&quot;&gt;Benchmarks, Redux (part 15): BSBM Test Driver Enhancements &lt;/a&gt; &lt;/li&gt; &lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Benchmarks, Redux (part 2): A Benchmarking Story</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2011-02-28#1661</atom:id>
  <atom:published>2011-02-28T21:12:28Z</atom:published>
  <atom:updated>2011-03-14T17:16:40-04:00</atom:updated>
  <atom:content type="html">&lt;blockquote&gt; &lt;i&gt;Caeterum censeo, benchmarks are for vendors...&lt;/i&gt; &lt;/blockquote&gt; &lt;p&gt;This is an edifying story about benchmarks and how databases work. I will show how one detail makes a 5+x difference, and how one really must understand how things work in order to make sense of benchmarks.&lt;/p&gt; &lt;p&gt;We begin right after the publication of the &lt;a href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/results/V6/index.html&quot; id=&quot;link-id0x1df843f8&quot;&gt;recent Berlin report&lt;/a&gt;. This report gives us OK performance for queries and very bad performance for loading. Trickle updates were not measurable. This comes as a consequence of testing intermediate software cuts and having incomplete instructions for operating them. I will cover the whole &lt;a class=&quot;auto-href&quot; href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x1d0b6ea0&quot;&gt;BSBM&lt;/a&gt; matter and the general benchmarking question in forthcoming posts; for now, let&amp;#39;s talk about specifics.&lt;/p&gt; &lt;p&gt;In the course of the discussion to follow, we talk about 3 different kinds of &lt;a class=&quot;auto-href&quot; href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1e09ee88&quot;&gt;Virtuoso&lt;/a&gt;:&lt;/p&gt; &lt;ul&gt; &lt;li&gt; &lt;p&gt; &lt;i&gt;6 Single&lt;/i&gt; is the generally available single-instance-server configuration of Virtuoso. 
Whether this is open source or not does not make a difference.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;i&gt;6 Cluster&lt;/i&gt; is the generally available, commercial-only, cluster-capable Virtuoso.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;i&gt;7 Single&lt;/i&gt; is the next-generation single-instance-server Virtuoso, about to be released as a preview.&lt;/p&gt; &lt;/li&gt; &lt;/ul&gt; &lt;p&gt;We began by running the various parts of BSBM at different scales with different Virtuoso variants. In so doing, we noticed that the BSBM &lt;i&gt;Explore&lt;/i&gt; mix at one scale got better throughput as we added more clients, approximately as one would expect based on &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Central_processing_unit&quot; id=&quot;link-id0x1c1b4860&quot;&gt;CPU&lt;/a&gt; usage and number of cores, while at another scale this was not so.&lt;/p&gt; &lt;p&gt;At the 1-billion-triple scale (1000 Mt; 1 Mt = 1 Megatriple, or one million triples) we saw CPU going from 200% with 1 client to 1400% with 16 clients but throughput increased by less than 20%. &lt;/p&gt; &lt;p&gt;When we ran the same scale with our shared-nothing 6 Cluster, running 8 processes on the same box, throughput increased normally with the client count. We have not previously tried BSBM with 6 Cluster simply because there is little to gain and a lot to lose by distributing this workload. But here we got a multiuser throughput with 6 Cluster that is easily 3 times that of the single server, even with a cluster-unfriendly workload. &lt;/p&gt; &lt;p&gt; See, sometimes scaling out even within a shared memory multiprocessor pays! Still, what we saw was rather anomalous.&lt;/p&gt; &lt;p&gt;Over the years we have looked at performance any number of times and have a lot of built-in meters. For cases of high CPU with no throughput, the prime suspect is contention on critical sections. 
Sure enough, when building with the mutex meter enabled, which counts how many times each mutex is acquired and how many of these acquisitions result in a wait, we found a mutex which gets acquired 600M times in the run, of which an insane 450M result in a wait. One can count roughly a microsecond of real time each time a mutex wait results in the kernel switching tasks. The run took 500 s or so, of which 450 s of real time were attributable to the overhead of waiting for this one mutex.&lt;/p&gt; &lt;p&gt;Waiting for a mutex is a real train wreck. We have tried spinning a few times before waiting, which the OS does anyhow, but this does not help. Using spin locks is good only if waits are extremely rare; with any frequency of waiting, even for very short waits, a mutex is still a lot better.&lt;/p&gt; &lt;p&gt;Now, the mutex in question happens to serialize the buffer &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Cache&quot; id=&quot;link-id0x1e542088&quot;&gt;cache&lt;/a&gt; for one specific page of &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x17c853d8&quot;&gt;data&lt;/a&gt;, one level down from the root of the index for &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1d64a658&quot;&gt;RDF&lt;/a&gt; PSOG. By the luck of the draw, the Ps falling on that page are commonly accessed Ps pertaining to product features. In order to get any product feature value, one must pass via this page. At the smaller scale, the different properties go their different ways starting from the index root.&lt;/p&gt; &lt;p&gt;One might here ask why the problem is one level down from the root and not in the root. The index root is already handled specially, so the read-write locks for buffers usually apply only from the first level down. One might also ask why there is a mutex in the first place. 
Well, unless one is read-only and all in memory, there simply must be a way to say that a buffer must not get written to by one thread while another is reading it. Same for cache replacement. Some in-memory people fork a whole copy of the database process to do a large query and so can forget about serialization. But one must have long queries for this and have all in memory. One can make writes less frequent by keeping deltas, but this does not remove the need to merge the deltas at some point, which cannot happen without serializing this with the readers.&lt;/p&gt; &lt;p&gt;Most of the time the offending mutex is acquired for getting a property of a product in Q5, the one that looks for products with similar values of a numeric property. We retrieve this property for a number of products in one go, due to vectoring. Vectoring is supposed to save us from constantly hitting the index tree top when getting the next match. So how come there is contention in the index tree top? As it happens, the vectored index lookup checks for locality only when all search conditions on key parts are equalities. Here however there is equality on P and S and a range on O; hence, the lookup starts from the index root every time.&lt;/p&gt; &lt;p&gt;So I changed this. The effect was Q5 getting over twice as fast, with the single user throughput at 1000 Mt going from 2000 to 5200 QMpH (Query Mixes per Hour) and the 16-user throughput going from 3800 to over 21000 QMpH. The previously &amp;quot;good&amp;quot; throughput of 40K QMpH at 100 Mt went to 66K QMpH. &lt;/p&gt; &lt;p&gt;Vectoring can make a real difference. The throughputs for the same workload on 6 Single, without vectoring, thus unavoidably hitting the page with the crazy contention, are 1770 QMpH single user and 2487 QMpH with 16 users. 
The 6 Cluster throughput, avoiding the contention but without the increased locality from vectoring and with the increased latency of going out-of-process for most of the data, was about 11.5K QMpH with 16 users. Each partition had a page getting the hits, but since the partitioning was on S and S was roughly evenly distributed, each partition got 1/8 of the load; thus waiting on the mutex did not become a killer issue. &lt;/p&gt; &lt;p&gt;We see how detailed analysis of benchmarks can lead to improvements of almost an order of magnitude in a short time. This analysis is however both difficult and tedious. It is not readily delegable; one needs real &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id0x1e7249d0&quot;&gt;knowledge&lt;/a&gt; of how things work and of how they ought to work in order to get anywhere with this. Experience tends to show that a competitive situation is needed in order to motivate one to go to the trouble. Unless something really sticks out in an obvious manner, one is most likely not going to look deep enough. Of course, this is seen in applications too, but application &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Program_optimization&quot; id=&quot;link-id0x1d429e80&quot;&gt;optimization&lt;/a&gt; tends to stop at a point where the application is usable. Also, stored procedures and specially-tweaked queries will usually help. In most application scenarios, we are not simultaneously looking at multiple different implementations, except maybe at the start of development, but then this falls under benchmarking and evaluation.&lt;/p&gt; &lt;p&gt;So, the usefulness of benchmarks is again confirmed. 
There is likely great unexplored space for improvement as we move to more interesting and diverse scenarios.&lt;/p&gt; &lt;h3&gt; &lt;i&gt;Benchmarks, Redux&lt;/i&gt; Series&lt;/h3&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1658&quot; id=&quot;link-id0x1f619550&quot;&gt;Benchmarks, Redux (part 1): On RDF Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt;Benchmarks, Redux (part 2): A Benchmarking Story &lt;i&gt;(this post)&lt;/i&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1663&quot; id=&quot;link-id0x1caa7cd8&quot;&gt;Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1665&quot; id=&quot;link-id0x1d8b7648&quot;&gt;Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1667&quot; id=&quot;link-id0x1f2a6ba8&quot;&gt;Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1669&quot; id=&quot;link-id0x17b425f0&quot;&gt;Benchmarks, Redux (part 6): BSBM and I/O, continued&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1671&quot; id=&quot;link-id0x1a7f6b30&quot;&gt;Benchmarks, Redux (part 7): What Does BSBM Explore Measure?&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1673&quot; id=&quot;link-id0x1ee5ec98&quot;&gt;Benchmarks, Redux (part 8): BSBM Explore and Update &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1675&quot; id=&quot;link-id0x1b7c5af8&quot;&gt;Benchmarks, Redux (part 9): BSBM With Cluster&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1677&quot; 
id=&quot;link-id0x1dad7588&quot;&gt;Benchmarks, Redux (part 10): LOD2 and the Benchmark Process&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1678&quot; id=&quot;link-id0x1c5520a0&quot;&gt;Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1eb19bf8&quot;&gt;Benchmarks, Redux (part 12): Our Own BSBM Results Report&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1eb2c398&quot;&gt;Benchmarks, Redux (part 13): BSBM BI Modifications &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1fb6a118&quot;&gt;Benchmarks, Redux (part 14): BSBM BI Mix &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1f160580&quot;&gt;Benchmarks, Redux (part 15): BSBM Test Driver Enhancements &lt;/a&gt; &lt;/li&gt; &lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Benchmarks, Redux (part 1): On RDF Benchmarks</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2011-02-28#1659</atom:id>
  <atom:published>2011-02-28T20:20:22Z</atom:published>
  <atom:updated>2011-03-14T17:16:34.000002-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;This post introduces a series on &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1e724ae0&quot;&gt;RDF&lt;/a&gt; benchmarking. In these posts I will cover the following:&lt;/p&gt; &lt;ul&gt; &lt;li&gt; &lt;p&gt;Correct misleading &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x1e325480&quot;&gt;information&lt;/a&gt; about us in the &lt;a href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/results/V6/index.html&quot; id=&quot;link-id0x1ded41d0&quot;&gt;recent Berlin report&lt;/a&gt;: The load rate is off the wall and the update mix is missing. We supply the right numbers and explain how to load things so that one gets decent performance.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;Discuss configuration options for &lt;a class=&quot;auto-href&quot; href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1e0a2548&quot;&gt;Virtuoso&lt;/a&gt;.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;Tell a story about multithreading and its perils and how vectoring and scale-out can save us.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;Analyze the run-time behavior of Virtuoso 6 Single, 6 Cluster, and 7 Single.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;Look at the benefits of SSDs (solid-state storage devices) over HDDs (hard disk devices; spinning platters), and I/O matters in general.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;Talk in general about modalities of benchmark running, and how to reconcile vendors doing what they know best with the air of legitimacy of a third party. Whether to do things a la &lt;a class=&quot;auto-href&quot; href=&quot;http://www.tpc.org/&quot; id=&quot;link-id0x1e0ef4f0&quot;&gt;TPC&lt;/a&gt; or a la TREC? 
We will hopefully try a bit of both, at least so I have proposed to our partners in &lt;a class=&quot;auto-href&quot; href=&quot;http://lod2.eu/&quot; id=&quot;link-id0x1e54d3d8&quot;&gt;LOD2&lt;/a&gt;, the EU FP7 project that also funded the recent Berlin report.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;Outline the desiderata for an RDF benchmark that is not just an RDF-ized relational workload, the Social Intelligence Benchmark.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;Talk about &lt;a class=&quot;auto-href&quot; href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x1e730bc8&quot;&gt;BSBM&lt;/a&gt; specifically. What does it measure?&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;Discuss some experiments with the BI use case of BSBM.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;Document how the results mentioned here were obtained and suggest practices for benchmark running and disclosure.&lt;/p&gt; &lt;/li&gt; &lt;/ul&gt; &lt;p&gt;The background is that the LOD2 FP7 project is supposed to deliver a report about the state of the art and benchmark laboratory by March 1. The Berlin report is a part thereof. In the project proposal we talk about an ongoing benchmarking activity and about having up-to-date installations of the relevant RDF stores and &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Relational_database_management_system&quot; id=&quot;link-id0x1c1551e0&quot;&gt;RDBMS&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Since this is taxpayer money, supposedly for the common good, I see no reason why such a useful thing should be restricted to the project participants. On the other hand, running a display window of stuff for benchmarking when, at least in some cases, licenses prohibit unauthorized publishing of benchmark results, might be seen to conflict with the spirit of the license if not its letter. 
We will see.&lt;/p&gt; &lt;p&gt;For now, my take is that we want to run benchmarks of all interesting software, inviting the vendors to tell us how to do that if they will, and maybe even letting them perform those runs themselves. Then we promise not to disclose results without the vendor&amp;#39;s permission. Access to the installations is limited to whoever operates the equipment. Configuration files and detailed hardware specs and such, on the other hand, will be made public. If a run is published, it will be with permission and in a format that includes full information for replicating the experiment.&lt;/p&gt; &lt;p&gt;In the LOD2 proposal we also say, in so many words, that we will stretch the limits of the state of the art. This stretching is surely not limited to the project&amp;#39;s own products but should also include the general benchmarking aspect. I will say with confidence that running single server benchmarks at a maximum of 200 Mtriples of &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x11327f10&quot;&gt;data&lt;/a&gt; is not stretching anything.&lt;/p&gt; &lt;p&gt;So to ameliorate this situation, I thought to run the same at 10x the scale on a couple of large boxes we have access to. 1 and 2 billion triples are still comfortably single server scales. Then we could go for example to Giovanni&amp;#39;s cluster at &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Digital_Enterprise_Research_Institute&quot; id=&quot;link-id0x1bfaffa0&quot;&gt;DERI&lt;/a&gt; and do 10 and 20 billion triples; this should fly reasonably on 8 or 16 nodes of the DERI gear. Or we might talk to SEALS who by now should have their own lab. 
Even Amazon &lt;a class=&quot;auto-href&quot; href=&quot;http://aws.amazon.com/ec2/&quot; id=&quot;link-id0x1bfafef8&quot;&gt;EC2&lt;/a&gt; might be an option, although not the preferred one.&lt;/p&gt; &lt;p&gt;So I asked everybody about config instructions, which produced a certain amount of dismay as I might be said to be biased and to be skirting the edges of conflict of interest. The inquiry was not altogether negative though since &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Ontotext&quot; id=&quot;link-id0x1eccc1e0&quot;&gt;Ontotext&lt;/a&gt; and &lt;a class=&quot;auto-href&quot; href=&quot;http://freebase.com/guid/9202a8c04000641f8000000005c908d6&quot; id=&quot;link-id0x1eccc208&quot;&gt;Garlik&lt;/a&gt; provided some information. We will look into these this and next week. We will not publish any information without asking first.&lt;/p&gt; &lt;p&gt;In this series of posts I will only talk about &lt;a class=&quot;auto-href&quot; href=&quot;http://www.openlinksw.com/dataspace/organization/openlink#this&quot; id=&quot;link-id0x1bfa4030&quot;&gt;OpenLink Software&lt;/a&gt;.&lt;/p&gt; &lt;h3&gt; &lt;i&gt;Benchmarks, Redux&lt;/i&gt; Series&lt;/h3&gt; &lt;ul&gt; &lt;li&gt;Benchmarks, Redux (part 1): On RDF Benchmarks &lt;i&gt;(this post)&lt;/i&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1660&quot; id=&quot;link-id0x1b668d10&quot;&gt;Benchmarks, Redux (part 2): A Benchmarking Story&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1663&quot; id=&quot;link-id0x1b3a0c08&quot;&gt;Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1665&quot; id=&quot;link-id0x1f9f1740&quot;&gt;Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1667&quot; 
id=&quot;link-id0x1ad929f8&quot;&gt;Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1669&quot; id=&quot;link-id0x1db437c0&quot;&gt;Benchmarks, Redux (part 6): BSBM and I/O, continued&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1671&quot; id=&quot;link-id0x17138c38&quot;&gt;Benchmarks, Redux (part 7): What Does BSBM Explore Measure?&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1673&quot; id=&quot;link-id0x1c0e74f8&quot;&gt;Benchmarks, Redux (part 8): BSBM Explore and Update &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1675&quot; id=&quot;link-id0x1f297d10&quot;&gt;Benchmarks, Redux (part 9): BSBM With Cluster&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1677&quot; id=&quot;link-id0x1e4994b8&quot;&gt;Benchmarks, Redux (part 10): LOD2 and the Benchmark Process&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1678&quot; id=&quot;link-id0x1ebea6d0&quot;&gt;Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1d5c86c0&quot;&gt;Benchmarks, Redux (part 12): Our Own BSBM Results Report&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1efec0e0&quot;&gt;Benchmarks, Redux (part 13): BSBM BI Modifications &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1a9941f8&quot;&gt;Benchmarks, Redux (part 14): BSBM BI Mix &lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=&quot; id=&quot;link-id0x1ea26de8&quot;&gt;Benchmarks, 
Redux (part 15): BSBM Test Driver Enhancements &lt;/a&gt; &lt;/li&gt; &lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso Directions for 2011</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2011-01-19#1650</atom:id>
  <atom:published>2011-01-19T16:29:37Z</atom:published>
  <atom:updated>2011-01-20T12:54:42.000002-05:00</atom:updated>
  <atom:content type="html">&lt;p&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1603&quot; id=&quot;link-id0x1d584720&quot;&gt;At the start of 2010, I wrote&lt;/a&gt; that 2010 would be the year when &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x2007b778&quot;&gt;RDF&lt;/a&gt; became performance- and cost-competitive with relational technology for &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x7f5bf68&quot;&gt;data&lt;/a&gt; warehousing and analytics. More specifically, RDF would shine where data was heterogenous and/or where there was a high frequency of &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Database_schema&quot; id=&quot;link-id0x1ffa18b0&quot;&gt;schema&lt;/a&gt; change.&lt;/p&gt; &lt;p&gt;I will now discuss what we have done towards this end in 2010 and how you will gain by this in 2011.&lt;/p&gt; &lt;p&gt;At the start of 2010, we had internally demonstrated 4x space efficiency gains from column-wise compression and 3x loop join speed gains from vectored execution. To recap, &lt;i&gt;column-wise compression&lt;/i&gt; means a column-wise storage layout where values of consecutive rows of a single column are consecutive in memory/disk and are compressed in a manner that benefits from the homogenous data type and possible sort order of the column. &lt;i&gt;Vectored execution&lt;/i&gt; means passing large numbers of query variable bindings between query operators and possibly sorting inputs to joins for improving locality. 
Furthermore, always operating on large sets of values gives extra opportunities for parallelism, from instruction level to threads to scale out.&lt;/p&gt; &lt;p&gt;So, during 2010, we integrated these technologies into &lt;a class=&quot;auto-href&quot; href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1fdf3f90&quot;&gt;Virtuoso&lt;/a&gt;, for relational- and graph-based applications alike. Further, even if we say that RDF will be close to relational speed in Virtuoso, the point is moot if Virtuoso&amp;#39;s relational speed is not up there with the best of analytics-oriented &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Relational_database_management_system&quot; id=&quot;link-id0x7bf0d40&quot;&gt;RDBMS&lt;/a&gt;. RDF performance does rest on the basis of general-purpose database performance; what is sauce for the goose is sauce for the gander. So we reimplemented &lt;code&gt;&lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Hash_join&quot; id=&quot;link-id0x7815c60&quot;&gt;HASH JOIN&lt;/a&gt;&lt;/code&gt; and &lt;code&gt;GROUP BY&lt;/code&gt;, and fine-tuned many of the tricks required by &lt;a class=&quot;auto-href&quot; href=&quot;http://www.tpc.org/&quot; id=&quot;link-id0x213d6de8&quot;&gt;TPC&lt;/a&gt;-&lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/TPC-H&quot; id=&quot;link-id0x1fd92690&quot;&gt;H&lt;/a&gt;. TPC-H is not the sole final destination, but it is a step on the way and a valuable checklist for what a database ought to do.&lt;/p&gt; &lt;p&gt;At the Semdata workshop of &lt;a class=&quot;auto-href&quot; href=&quot;http://www.vldb2010.org/&quot; id=&quot;link-id0x21178a50&quot;&gt;VLDB 2010&lt;/a&gt;, &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1632&quot; id=&quot;link-id0x1de8fee8&quot;&gt;we presented some results&lt;/a&gt; of our column store applied to RDF and relational tasks. 
As noted in the paper, the implementation did demonstrate significant gains over the previous row-wise architecture but was not yet well optimized, so not ready to be compared with the best of the relational analytics world. A good part of the fall of 2010 went into optimizing the column store and completing functionality such as transaction support with columns.&lt;/p&gt; &lt;p&gt;A lot of this work is not specifically RDF oriented, but all of this work is constantly informed by the specific requirements of RDF. For example, the general idea of vectored execution is to eliminate overheads and optimize &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Central_processing_unit&quot; id=&quot;link-id0x7ae0d58&quot;&gt;CPU&lt;/a&gt; &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Cache&quot; id=&quot;link-id0x7bb7150&quot;&gt;cache&lt;/a&gt; and other locality by doing single query operations on arrays of operands so that the whole batch runs more or less in CPU cache. Are the gains not lost if data is typed at run time, as in RDF? In fact, the cost of run-time-typing turns out to be small, since data in practice tends to be of homogenous type and with locality of reference in values. 
Virtuoso&amp;#39;s column store implementation resembles in broad outline other column stores like &lt;a class=&quot;auto-href&quot; href=&quot;http://www.vertica.com/&quot; id=&quot;link-id0x7f61080&quot;&gt;Vertica&lt;/a&gt; or &lt;a class=&quot;auto-href&quot; href=&quot;http://www.ingres.com/vectorwise/&quot; id=&quot;link-id0x2154ce38&quot;&gt;VectorWise&lt;/a&gt;, the main difference being the built-in support for run-time heterogenous types.&lt;/p&gt; &lt;p&gt;The &lt;a class=&quot;auto-href&quot; href=&quot;http://lod2.eu/&quot; id=&quot;link-id0x755e668&quot;&gt;LOD2&lt;/a&gt; EU FP 7 project &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1630&quot; id=&quot;link-id0x1d8eaf28&quot;&gt;started in September 2010&lt;/a&gt;. In this project OpenLink and the celebrated heroes of the column store, &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/National_Research_Institute_for_Mathematics_and_Computer_Science&quot; id=&quot;link-id0x1feba470&quot;&gt;CWI&lt;/a&gt; of &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/MonetDB&quot; id=&quot;link-id0x223bbe70&quot;&gt;MonetDB&lt;/a&gt; and VectorWise fame, represent the database side.&lt;/p&gt; &lt;p&gt;The first database task of LOD2 is making a survey of the state of the art and a round of benchmarking of RDF stores. The &lt;a class=&quot;auto-href&quot; href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x20f50c20&quot;&gt;Berlin SPARQL Benchmark&lt;/a&gt; (&lt;a class=&quot;auto-href&quot; href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x780c430&quot;&gt;BSBM&lt;/a&gt;) has accordingly evolved to include a business intelligence section and an update stream. Initial results from running these will become available in February/March, 2011. 
The specifics of this process merit another post; let it for now be said that benchmarking is making progress. In the end, it is our conviction that we need a situation where vendors may publish results as and when they are available and where there exists a well defined process for documenting and checking results.&lt;/p&gt; &lt;p&gt;LOD2 will continue by &lt;i&gt;linking the universe,&lt;/i&gt; as I half-facetiously put it on a presentation slide. This means alignment of anything from schema to instance identifiers, with and without supervision, and always with provenance, summarization, visualization, and so forth. In fact, putting it this way, this gets to sound like the old chimera of generating applications from data or allowing users to derive actionable intelligence from data of which they do not even know the structure. No, we are not that unrealistic. But we are moving toward more ad-hoc discovery and faster time to answer. And since we provide an infrastructure element under all this, we want to do away with the &amp;quot;RDF tax,&amp;quot; by which we mean any significant extra cost of RDF compared to an alternate technology. To put it another way, you ought to pay for unpredictable heterogeneity or complex inference only when you actually use them, not as a fixed up-front overhead.&lt;/p&gt; &lt;p&gt;So much for promises. When will you see something? It is safe to say that we cannot very well publish benchmarks of systems that are not generally available in some form. This places an initial technology preview cut of Virtuoso 7 with vectored execution somewhere in January or early February. The column store feature will be built in, but more than likely the row-wise compressed RDF format of Virtuoso 6 will still be the default. 
Version 6 and 7 databases will be interchangeable unless column-store structures are used.&lt;/p&gt; &lt;p&gt;For now, our priority is to release the substantial gains that have already been accomplished.&lt;/p&gt; &lt;p&gt;After an initial preview cut, we will return to the agenda of making sure Virtuoso is up there with the best in relational analytics, and that the equivalent workload with an RDF data model runs as close as possible to relational performance. As a first step this means taking TPC-H as is, and then converting the data and queries to the trivially equivalent RDF and &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x25716618&quot;&gt;SPARQL&lt;/a&gt; and seeing how it goes. In &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1627&quot; id=&quot;link-id0x1af60d40&quot;&gt;the September paper&lt;/a&gt; we dabbled a little with the data at a small scale but now we must run the full set of queries at 100GB and 300GB scales, which come to about 14 billion and 42 billion triples, respectively. A well done analysis of the issues encountered, covering similarities and dissimilarities of the implementation of the workload as &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x223b0a88&quot;&gt;SQL&lt;/a&gt; and SPARQL, should make a good VLDB paper.&lt;/p&gt; &lt;p&gt;Database performance is an entirely open-ended quest and the bag of potentially applicable tricks is as good as infinite. Having said this, it seems that the scales comfortably reached in the TPC benchmarks are more than adequate for pretty much anything one is likely to encounter in real world applications involving comparable workloads. 
Businesses getting over 6 million new order transactions per minute (the high score of TPC-&lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/C%2B%2B&quot; id=&quot;link-id0x1f72a180&quot;&gt;C&lt;/a&gt;) or analyzing a warehouse of 60 billion orders shipped to 6 billion customers over 7 years (10000GB or 10TB TPC-H) are not very common if they exist at all.&lt;/p&gt; &lt;p&gt;The real world frontier has moved on. Scaling up the TPC workloads remains a generally useful exercise that continues to contribute to the state of the art but the applications requiring this advance are changing.&lt;/p&gt; &lt;p&gt;Someone once said that for a new technology to become mainstream, it needs to solve a new class of problem. Yes, while it is a preparatory step to run TPC-H translated to SPARQL without dying of overheads, there is little point in doing this in production since SQL is anyway likely better and already known, proven, and deployed.&lt;/p&gt; &lt;p&gt;The new class of problem, as LOD2 sees it, is the matter of web-wide cross-organizational data integration. Web-wide does not necessarily mean crawling the whole web, but does tend to mean running into significant heterogeneity of sources, both in terms of modeling and in terms of usage of more-or-less standard data models. Around this topic we hear two messages. The database people say that inference beyond what you can express in SQL views is theoretically nice but practically not needed; on the other side, we hear that the inference now being standardized in efforts like &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Rule_Interchange_Format&quot; id=&quot;link-id0x22b3ad68&quot;&gt;RIF&lt;/a&gt; and &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Web_Ontology_Language&quot; id=&quot;link-id0x22b3ad90&quot;&gt;OWL&lt;/a&gt; is not expressive enough for the real world. 
As one expert put it, &lt;i&gt;if enterprise data integration in the 1980s was between a few databases, today it is more like between 1000 databases,&lt;/i&gt; which makes this matter similar to searching the web. How can one know in such a situation that the data being aggregated is in fact meaningfully aggregate-able?&lt;/p&gt; &lt;p&gt;Add to this the prevalence of unstructured data in the world and the need to mine it for actionable intelligence. Think of combining data from CRM, worldwide media coverage of own and competitive brands, and in-house emails for assessing organizational response to events on the market.&lt;/p&gt; &lt;p&gt;These are the actual use cases for which we need RDF at relational DW performance and scale. This is not limited to RDF and OWL profiles, since we fully believe that inference needs are more diverse. The reason why this is RDF and not SQL plus some extension of &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Datalog&quot; id=&quot;link-id0x7ee5130&quot;&gt;Datalog&lt;/a&gt;, is the widespread adoption of RDF and &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x2111f968&quot;&gt;linked data&lt;/a&gt; as a data publishing format, with all the schema-last and &lt;a class=&quot;auto-href&quot; href=&quot;http://dbpedia.org/resource/Open_world_assumption&quot; id=&quot;link-id0x2111f990&quot;&gt;open world&lt;/a&gt; aspects that have been there from the start.&lt;/p&gt; &lt;p&gt;Stay tuned for more news later this month!&lt;/p&gt; &lt;h3&gt;Related&lt;/h3&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1603&quot; id=&quot;link-id0x1de6b370&quot;&gt;Linked Data and Virtuoso in 2010&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1510&quot; id=&quot;link-id0x1b031180&quot;&gt;Linked Data &amp;amp; The Year 2009&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a 
href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1286&quot; id=&quot;link-id0x1a582d10&quot;&gt;Retrospective and Outlook for 2008&lt;/a&gt; &lt;/li&gt; &lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>VLDB Semdata Workshop</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2010-09-21#1635</atom:id>
  <atom:published>2010-09-21T21:14:14Z</atom:published>
  <atom:updated>2010-09-21T16:22:18-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;I will begin by extending my thanks to the organizers, in particular &lt;a href=&quot;http://members.deri.at/~retok&quot; id=&quot;link-id0x236ebfd0&quot;&gt;Reto Krummenacher&lt;/a&gt; of &lt;a href=&quot;http://www.sti-innsbruck.at/&quot; id=&quot;link-id0x2371aca8&quot;&gt;STI&lt;/a&gt; and Atanas Kiryakov of &lt;a href=&quot;http://dbpedia.org/resource/Ontotext&quot; id=&quot;link-id0x22e24190&quot;&gt;Ontotext&lt;/a&gt;, for inviting me to give a position paper at the workshop. Indeed, it is the builders of bridges, the pontiffs (pontifex) amongst us, who shall be remembered by history. The idea of organizing a semantic &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x23781ba8&quot;&gt;data&lt;/a&gt; management workshop at VLDB is a laudable attempt at rapprochement between two communities to the advantage of all concerned.&lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://semanticweb.org/id/Franz_Inc&quot; id=&quot;link-id0x22e09fa8&quot;&gt;Franz&lt;/a&gt;, Ontotext, and OpenLink were the vendors present at the workshop. To summarize very briefly, &lt;a href=&quot;http://data.semanticweb.org/person/jans-aasman&quot; id=&quot;link-id0x2380e7c8&quot;&gt;Jans Aasman&lt;/a&gt; of Franz talked about the telco call center automation solution by Amdocs, where the &lt;a href=&quot;http://semanticweb.org/id/AllegroGraph&quot; id=&quot;link-id0x237c9408&quot;&gt;AllegroGraph&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x236f96a8&quot;&gt;RDF&lt;/a&gt; store is integrated. On the technical side, AllegroGraph has JavaScript as a stored procedure language, which is certainly a good idea. Naso of Ontotext talked about the BBC FIFA World Cup site. 
The technical proposition was that materialization is good and data partitioning is not needed; a set of replicated read-only copies is good enough.&lt;/p&gt; &lt;p&gt;I talked about making RDF cost competitive with relational for data integration and BI. The crux is space efficiency and column store techniques.&lt;/p&gt; &lt;p&gt;One question that came up was that maybe RDF could approach relational in some things, but what about string literals being stored in a separate table? Or &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Identifier&quot; id=&quot;link-id0x22ff2c78&quot;&gt;URI&lt;/a&gt; strings being stored in a separate table?&lt;/p&gt; &lt;p&gt;The answer is that if one accesses a lot of these literals the access will be local and fairly efficient. If one accesses just a few, it does not matter. For user-facing reports, there is no point in returning a million strings that the user will not read anyhow. But then it turned out that there in fact exist reports in bioinformatics where there are 100,000 strings. Now taking the worst abuse of &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x236e43f8&quot;&gt;SPARQL&lt;/a&gt;, a regexp over all literals in a property of a given class. With a column store this is a scan of the column; with RDF, a three table join. The join is about 10x slower than the column scan. Quite OK, considering that a full text index is the likely solution for such workloads anyway. Besides, a sensible relational &lt;a href=&quot;http://dbpedia.org/resource/Database_schema&quot; id=&quot;link-id0x22e31050&quot;&gt;schema&lt;/a&gt; will also not use strings for foreign keys, and will therefore incur a similar burden from fetching the strings before returning the result.&lt;/p&gt; &lt;p&gt;Another question was about whether the attitude was one of confrontation between RDF and relational and whether it would not be better to join forces. 
Well, as said in my talk, sauce for the goose is sauce for the gander and, generally speaking, relational techniques apply equally to RDF. There are a few RDB tricks that have no RDF equivalent, like clustering a fact table on dimension values, e.g., sales ordered by country, manufacturer, month. But by and large, column-store techniques apply. The execution engine can be essentially identical, just needing a couple of extra data types and some run-time typing and in some cases producing nulls instead of errors. Query &lt;a href=&quot;http://dbpedia.org/resource/Program_optimization&quot; id=&quot;link-id0x237d76e0&quot;&gt;optimization&lt;/a&gt; is much the same, except that RDB stats are not applicable as such; one needs to sample the data in the cost model. All in all, these adaptations to an RDB are not so large, even though they do require changes to source code.&lt;/p&gt; &lt;p&gt;Another question was about combining data models, e.g., relational (rows and columns), RDF (graph), &lt;a href=&quot;http://dbpedia.org/resource/XML&quot; id=&quot;link-id0x23845418&quot;&gt;XML&lt;/a&gt; (tree), and full text. Here I would say that it is a fault of our messaging that we do not constantly repeat the necessity of this combining, as we take it for granted. Most RDF stores have a full text index on literal values. OWLIM and a &lt;a href=&quot;http://dbpedia.org/resource/National_Research_Institute_for_Mathematics_and_Computer_Science&quot; id=&quot;link-id0x22feefa0&quot;&gt;CWI&lt;/a&gt; prototype even have it for URIs. XML is a valid data type for an RDF literal, even though this does not get used very much. 
So doing SPARQL to select the values, and then doing &lt;a href=&quot;http://dbpedia.org/resource/XPath&quot; id=&quot;link-id0x235b5890&quot;&gt;XPath&lt;/a&gt; and XSLT on the values, is entirely possible, at least in &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x237f6428&quot;&gt;Virtuoso&lt;/a&gt;, which has an XPath/XSLT engine built in. Same for invoking SPARQL from an XSLT sheet. Colocating a native &lt;a href=&quot;http://dbpedia.org/resource/Relational_database_management_system&quot; id=&quot;link-id0x238265a8&quot;&gt;RDBMS&lt;/a&gt; with local and federated &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x236f7bc8&quot;&gt;SQL&lt;/a&gt; is what Virtuoso has always done. One can, for example, map tables in heterogeneous remote RDBs into tables in Virtuoso, then map these into RDF, and run SPARQL queries that get translated into SQL against the original tables, thereby getting SPARQL access without any materialization. Alongside this, one can ETL relational data into RDF via the same declarative mapping.&lt;/p&gt; &lt;p&gt;Further, there are RDF extensions for geospatial queries in Virtuoso and AllegroGraph, and soon also in others.&lt;/p&gt; &lt;p&gt;With all this cross-model operation, RDF is definitely not a closed island. We&amp;#39;ll have to repeat this more.&lt;/p&gt; &lt;p&gt;Of the academic papers, SpiderStore (&lt;a href=&quot;http://dbis-informatik.uibk.ac.at/5-1-Publications.html&quot; id=&quot;link-id0x19ecd3f0&quot;&gt;paper&lt;/a&gt; is not yet available at time of writing, but should be soon) and &lt;a href=&quot;http://www.few.vu.nl/~jui200/webpie.html&quot; id=&quot;link-id0x1d60a498&quot;&gt;Webpie&lt;/a&gt; should be specially noted.&lt;/p&gt; &lt;p&gt;Let us talk about SpiderStore first.&lt;/p&gt; &lt;h2&gt;SpiderStore&lt;/h2&gt; &lt;p&gt;The SpiderStore from the University of Innsbruck is a main-memory-only system that has a record for each distinct IRI. 
The IRI record has one array of pointers to all IRI records that are objects where the referencing record is the subject, and a similar array of pointers to all records where the referencing record is the object. Both sets of pointers are clustered based on the predicate labeling the edge.&lt;/p&gt; &lt;p&gt;According to the authors (Robert Binna, Wolfgang Gassler, Eva Zangerle, Dominic Pacher, and Günther Specht), a distinct IRI is 5 pointers and each triple is 3 pointers. This would make about 4 pointers per triple, i.e., 32 bytes with 64-bit pointers.&lt;/p&gt; &lt;p&gt;This is not particularly memory efficient, since one must count unused space after growing the lists, fragmentation, etc., which will make the space consumption closer to 40 bytes per triple. Moreover, should one add a graph to the mix, one would need another pointer per distinct predicate, adding another 1-4 bytes per triple. Supporting non-IRI types in the object position is not a problem, as long as all distinct values have a chunk of memory to them with a type &lt;a href=&quot;http://dbpedia.org/resource/Tag&quot; id=&quot;link-id0x236fe4d0&quot;&gt;tag&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;We get a few times better memory efficiency with column compressed quads, plus we are not limited to main memory.&lt;/p&gt; &lt;p&gt;But SpiderStore has a point. Making the traversal of an edge in the graph into a pointer dereference is not such a bad deal, especially if the data set is not that big. Furthermore, compiling the queries into &lt;a href=&quot;http://dbpedia.org/resource/C%2B%2B&quot; id=&quot;link-id0x235a2228&quot;&gt;C&lt;/a&gt; procedures playing with the pointers alone would give performance to match or exceed any hard coded graph traversal library and would not be very difficult. 
Supporting multithreaded updates would spoil much of the gain but allowing single threaded updates and forking read-only copies for reading would be fine.&lt;/p&gt; &lt;p&gt;SpiderStore as such is not attractive for what we intend to do, this being aggregating RDF quads in volumes far exceeding main memory and scaling to clusters. We note that SpiderStore hits problems with distributed memory, since SpiderStore executes depth first, which is manifestly impossible if significant latencies are involved. In other words, if there can be latency, one must amortize by having a lot of other possible work available. Running with long vectors of values is one way, as in &lt;a href=&quot;http://dbpedia.org/resource/MonetDB&quot; id=&quot;link-id0x236e14a0&quot;&gt;MonetDB&lt;/a&gt; or Virtuoso Cluster. The other way is to have a massively multithreaded platform which favors code with few instructions but little memory locality. SpiderStore could be a good fit for massive multithreading, specially if queries were compiled to C, dramatically cutting down on the count of instructions to execute.&lt;/p&gt; &lt;p&gt;We too could adopt some ideas from SpiderStore. Namely, if running vectored, one just in passing, without extra overhead, generates an array of links to the next IRI, a bit like the array that SpiderStore has for each predicate for the incoming and outgoing edges of a given IRI. Of course, here these would be persistent IDs and not pointers, but a hash from one to the other takes almost no time. So, while SpiderStore alone may not be what we are after for data warehousing, Spiderizing parts of the working set would not be so bad. This is especially so since the Spiderizable data structure almost gets made as a by-product of query evaluation.&lt;/p&gt; &lt;p&gt;If an algorithm made several passes over a relatively small subgraph of the whole database, Spiderizing it would accelerate things. 
The memory overhead could have a fixed cap so as not to ruin the working set if locality happened not to hold.&lt;/p&gt; &lt;p&gt;Running a SpiderStore-like execution model on vectors instead of single values would likely do no harm and might even result in better &lt;a href=&quot;http://dbpedia.org/resource/Cache&quot; id=&quot;link-id0x237eb508&quot;&gt;cache&lt;/a&gt; behavior. The exception is in the event of completely unpredictable patterns of connections which may only be amortized by massive multithreading.&lt;/p&gt; &lt;h2&gt;Webpie&lt;/h2&gt; &lt;p&gt;Webpie from &lt;a href=&quot;http://www.vu.nl/&quot; id=&quot;link-id0x23811bf8&quot;&gt;VU Amsterdam&lt;/a&gt; and the &lt;a href=&quot;http://www.larkc.eu/&quot; id=&quot;link-id0x22ff8fe8&quot;&gt;LarKC&lt;/a&gt; EU FP 7 project is, as it were, the opposite of SpiderStore. This is a map-reduce-based RDFS and &lt;a href=&quot;http://dbpedia.org/resource/Web_Ontology_Language&quot; id=&quot;link-id0x238482a0&quot;&gt;OWL&lt;/a&gt; Horst inference engine which is all about breadth-first passes over the data in a map-reduce framework with intermediate disk-based storage.&lt;/p&gt; &lt;p&gt;Webpie is not however a database. After the inference result has been materialized, it must be loaded into a SPARQL engine in order to evaluate a query against the result.&lt;/p&gt; &lt;p&gt;The execution plan of Webpie is made from the ontology whose consequences must be materialized. The steps are sorted and run until a fixed point is reached for each. This is similar to running SPARQL &lt;code&gt;INSERT … SELECT&lt;/code&gt; statements until no new inserts are produced. The only requirement is that the &lt;code&gt;INSERT&lt;/code&gt; statement should report whether new inserts were actually made. This is easy to do. In this way, a comparison between map-reduce plus memory-based joining and a parallel RDF database could be made.&lt;/p&gt; &lt;p&gt;We have suggested such an experiment to the LarKC people. 
We will see.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>LOD2 Kick Off</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2010-09-21#1633</atom:id>
  <atom:published>2010-09-21T21:13:03Z</atom:published>
  <atom:updated>2010-09-21T16:22:12-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;The &lt;a href=&quot;http://lod2.eu/&quot; id=&quot;link-id0x22e06810&quot;&gt;LOD2&lt;/a&gt; &lt;a href=&quot;http://lod2.eu/BlogPost/9-press-release-lod2-project-launch.html&quot; id=&quot;link-id0x18c0c770&quot;&gt;kick off meeting&lt;/a&gt; was held in Leipzig on Sept 6-8. I will here talk about OpenLink plans as concerns LOD2; hence this is not to be taken as representative of the whole project. I will first discuss the immediate and conclude with the long term.&lt;/p&gt; &lt;p&gt;As concerns OpenLink specifically, we have two short term activities, namely publishing the initial LOD2 repository in December and publishing a set of RDB and &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x22f9ba70&quot;&gt;RDF&lt;/a&gt; benchmarks in February.&lt;/p&gt; &lt;p&gt;The LOD2 repository is a fusion of the OpenLink &lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id0x2378d288&quot;&gt;LOD&lt;/a&gt; &lt;a href=&quot;http://lod.openlinksw.com/&quot; id=&quot;link-id0x23908828&quot;&gt;Cloud&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Cache&quot; id=&quot;link-id0x2378e6c8&quot;&gt;Cache&lt;/a&gt; (which includes &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x237d7d20&quot;&gt;data&lt;/a&gt; from &lt;a href=&quot;http://uriburner.com/&quot; id=&quot;link-id0x237c9408&quot;&gt;URIBurner&lt;/a&gt; and &lt;a href=&quot;http://www.pingthesemanticweb.com/&quot; id=&quot;link-id0x235b03b0&quot;&gt;PingTheSemanticWeb&lt;/a&gt;) and &lt;a href=&quot;http://sindice.com/&quot; id=&quot;link-id0x22e24190&quot;&gt;Sindice&lt;/a&gt;, both hosted at &lt;a href=&quot;http://dbpedia.org/resource/Digital_Enterprise_Research_Institute&quot; id=&quot;link-id0x237b80f8&quot;&gt;DERI&lt;/a&gt;. 
The value-add compared to Sindice or the &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x237b63c0&quot;&gt;Virtuoso&lt;/a&gt;-based LOD Cloud Cache alone is the merger of the timeliness and ping-ping crawling of Sindice with the &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x237f7568&quot;&gt;SPARQL&lt;/a&gt; of Virtuoso.&lt;/p&gt; &lt;p&gt;Further down the road, after we migrate the system to the Virtuoso column store, we will also see gains in performance, primarily due to a much better working set, as data is many times more compact than with the present row-wise &lt;a href=&quot;http://dbpedia.org/resource/Data_compression&quot; id=&quot;link-id0x235b0c38&quot;&gt;key compression&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Still further, but before next September, we will have dynamic repartitioning; the time of availability is fixed because this is part of the LOD2 project roadmap. The operational need for this is pushed back somewhat by the compression gains from column-wise storage.&lt;/p&gt; &lt;p&gt;As for benchmarks, I just compiled &lt;a href=&quot;http://www.openlinksw.com/weblogs/oerling/&quot; id=&quot;link-id0x1c29e720&quot;&gt;a draft of suggested extensions to the BSBM&lt;/a&gt; (&lt;a href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x22e31050&quot;&gt;Berlin SPARQL Benchmark&lt;/a&gt;). I talked about this with &lt;a href=&quot;http://nl.linkedin.com/in/peterboncz&quot; id=&quot;link-id0x237c90b0&quot;&gt;Peter Boncz&lt;/a&gt; and &lt;a href=&quot;http://data.semanticweb.org/person/christian-bizer&quot; id=&quot;link-id0x23813340&quot;&gt;Chris Bizer&lt;/a&gt;, to the effect that some extensions of BSBM could be done but that the time was a bit short for making an RDF-specific benchmark. 
We do recall that BSBM is fully feasible with a relational &lt;a href=&quot;http://dbpedia.org/resource/Database_schema&quot; id=&quot;link-id0x236f7ef8&quot;&gt;schema&lt;/a&gt; and that RDF offers no fundamental edge for the workload.&lt;/p&gt; &lt;p&gt;There was a graph benchmark talk at the &lt;a href=&quot;http://www.tpc.org/&quot; id=&quot;link-id0x236f8170&quot;&gt;TPC&lt;/a&gt; workshop at &lt;a href=&quot;http://www.vldb2010.org/&quot; id=&quot;link-id0x235c6b90&quot;&gt;VLDB 2010&lt;/a&gt;. There too, the authors were suggesting a social network use case for benchmarking anything from RDF stores to graph libraries. The presentation did not include any specification of test data, so it may be that some cooperation is possible there. The need for such a benchmark is well acknowledged. The final form of this is not yet set but LOD2 will in time publish results from such.&lt;/p&gt; &lt;p&gt;We did informally talk about a process for publishing with our colleagues from &lt;a href=&quot;http://semanticweb.org/id/Franz_Inc&quot; id=&quot;link-id0x23781d28&quot;&gt;Franz&lt;/a&gt; and &lt;a href=&quot;http://dbpedia.org/resource/Ontotext&quot; id=&quot;link-id0x23782740&quot;&gt;Ontotext&lt;/a&gt; at VLDB 2010. The idea is that vendors tune their own systems and do the runs and that the others check on this, preferably all using the same hardware.&lt;/p&gt; &lt;p&gt;Now, the LOD2 benchmarks will also include relational-to-RDF comparisons, for example TPC-&lt;a href=&quot;http://dbpedia.org/resource/TPC-H&quot; id=&quot;link-id0x235a3568&quot;&gt;H&lt;/a&gt; in &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x22e67370&quot;&gt;SQL&lt;/a&gt; and SPARQL. 
The SQL systems will be Virtuoso, &lt;a href=&quot;http://dbpedia.org/resource/MonetDB&quot; id=&quot;link-id0x22e70db0&quot;&gt;MonetDB&lt;/a&gt;, and possibly &lt;a href=&quot;http://www.ingres.com/vectorwise/&quot; id=&quot;link-id0x2378f750&quot;&gt;VectorWise&lt;/a&gt; and others, depending on what legal restrictions apply at the time. This will give an RDF-to-SQL comparison of TPC-H at least on Virtuoso, later also on MonetDB, depending on the schedule for a MonetDB SPARQL front-end.&lt;/p&gt; &lt;p&gt;In the immediate term, this of course focuses our efforts on productizing the Virtuoso column store extension and the optimizations that go with it.&lt;/p&gt; &lt;p&gt;LOD2 is, however, about much more than database benchmarks. Over the longer term, we plan to apply suitable parts of the ground-breaking database research done at &lt;a href=&quot;http://dbpedia.org/resource/National_Research_Institute_for_Mathematics_and_Computer_Science&quot; id=&quot;link-id0x23911830&quot;&gt;CWI&lt;/a&gt; to RDF use cases.&lt;/p&gt; &lt;p&gt;This involves anything from adaptive indexing, to reuse and caching of intermediate results, to adaptive execution. This is, however, more than just mapping column store concepts to RDF. New challenges are posed by running on clusters and dealing with more expressive queries than just SQL, specifically queries with Datalog-like rules and recursion.&lt;/p&gt; &lt;p&gt;LOD2 is principally about integration and alignment, from the schema to the instance level. This involves complex batch processing, close to the data, on large volumes of data. Map-reduce is not the be-all and end-all of this. Of course, a parallel database like Virtuoso, &lt;a href=&quot;http://dbpedia.org/resource/Greenplum&quot; id=&quot;link-id0x22feb520&quot;&gt;Greenplum&lt;/a&gt;, or &lt;a href=&quot;http://www.vertica.com/&quot; id=&quot;link-id0x237f7428&quot;&gt;Vertica&lt;/a&gt; can do map-reduce style operations under control of the SQL engine. 
After all, the SQL engine needs to do map-reduce and a lot more to provide good throughput for parallel, distributed SQL. Something like the &lt;a href=&quot;http://www.eecs.berkeley.edu/Research/Projects/Data/105733.html&quot; id=&quot;link-id0x235c2e28&quot;&gt;Berkeley Orders Of Magnitude&lt;/a&gt; (&lt;a href=&quot;http://www.eecs.berkeley.edu/Research/Projects/Data/105733.html&quot; id=&quot;link-id0x2380e7c8&quot;&gt;BOOM&lt;/a&gt;) distributed Datalog implementation (Overlog, Dedalus, BLOOM) could be a parallel computation framework that would subsume any map-reduce-style functionality under a more elegant declarative framework while still leaving control of execution to the developer for the cases where this is needed.&lt;/p&gt; &lt;p&gt;From our viewpoint, the project&amp;#39;s gains include:&lt;/p&gt; &lt;ul&gt; &lt;li&gt; &lt;p&gt;Significant narrowing of the RDB to RDF performance gap. RDF will be an option for large-scale warehousing, cutting down on time to integration by providing greater schema flexibility.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;Ready-to-use toolbox for data integration, including schema alignment and resolution of coreference.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;Data discovery, summarization and visualization&lt;/p&gt; &lt;/li&gt; &lt;/ul&gt; &lt;p&gt;Integrating this into a relatively unified stack of tools is possible, since these all cluster around the task of linking the universe with RDF and &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x236e14a0&quot;&gt;linked data&lt;/a&gt;. 
In this respect the integration of results may be stronger than often seen in European large scale integrating projects.&lt;/p&gt; &lt;p&gt;The use cases fit the development profile well: &lt;/p&gt; &lt;ul&gt; &lt;li&gt; &lt;p&gt; &lt;a href=&quot;http://dbpedia.org/resource/Wolters_Kluwer&quot; id=&quot;link-id0x23820568&quot;&gt;Wolters Kluwer&lt;/a&gt; will develop an application for integrating resources around law, from the actual laws to court cases to media coverage. The content is modeled in a fine grained legal ontology.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;a href=&quot;http://dbpedia.org/resource/Exalead&quot; id=&quot;link-id0x22e50ba0&quot;&gt;Exalead&lt;/a&gt; will implement the linked data enterprise, addressing enterprise search and any typical enterprise data integration plus generating added value from open sources.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;The Open &lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id0x236fb248&quot;&gt;Knowledge&lt;/a&gt; Foundation will create a portal of all government published data for easy access by citizens.&lt;/p&gt; &lt;/li&gt; &lt;/ul&gt; &lt;p&gt;In all these cases, the integration requirements of schema alignment, resolution of identity, &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x2381ebb0&quot;&gt;information&lt;/a&gt; extraction, and efficient storage and retrieval play a significant role. The end user interfaces will be task-specific but developer interfaces around integration tools and query formulation may be quite generic and suited for generic RDF application development.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Perseus, Andromeda, and RDF</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2010-09-13#1629</atom:id>
  <atom:published>2010-09-13T22:10:12Z</atom:published>
  <atom:updated>2010-09-13T17:35:04-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;It has been several months since &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1622&quot; id=&quot;link-id0x1c86c3a0&quot;&gt;my last&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Blog&quot; id=&quot;link-id0x15fa418&quot;&gt;blog&lt;/a&gt; post. In this day and age of the attention economy, what gives me the insolence so to neglect my duty to mindshare?&lt;/p&gt; &lt;p&gt;Well, Perseus wasn&amp;#39;t blogging or checking his email either, when he went to fetch the Gorgon&amp;#39;s head. As Joseph Campbell puts it, the hero breaks into a world separate from the ordinary in order to bring back a blessing which will revitalize the community.&lt;/p&gt; &lt;p&gt;Thus, I deliberately withdrew from the public conversation, in faith that it would take care of itself and that I would still not be altogether forgotten. As it happens, I was confirmed in this when recently invited to submit a talk for the &lt;a href=&quot;http://semdata.org/events/2010/vldb&quot; id=&quot;link-id0x1c319338&quot;&gt;Semdata workshop&lt;/a&gt; at &lt;a href=&quot;http://www.vldb2010.org&quot; id=&quot;link-id0x1b334640&quot;&gt;VLDB 2010&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Great deeds are not only personal accomplishments but also play a role in a broader context. The quest may appear remote and difficult to execute but its outcome can be quite tangible: Andromeda needed no elaborate sales pitch to convince her of the advantages of not being eaten by the sea serpent.&lt;/p&gt; &lt;p&gt;Thus right after &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1614&quot; id=&quot;link-id0x15fd6968&quot;&gt;the meeting in Sofia last March&lt;/a&gt;, I followed the vertical treasure map into the realm of first principles. As Perseus received advice from Athena, so was I informed by the Platonic ideas of locality and concurrency.&lt;/p&gt; &lt;p&gt;The great quests have an outer and inner aspect. 
Likewise here, bringing the ideas to physical reality gave me a great deal of material on cognitive function itself. For human and computer alike, it appears that the main reason why anything at all works is &lt;a href=&quot;http://dbpedia.org/resource/Cache&quot; id=&quot;link-id0x140f1dc0&quot;&gt;cache&lt;/a&gt;. Locality and parallelism again. Maybe I will say something more about memory, attention, interface, and paradigm some other time. On the other hand, such material is bound to be unpopular even if valid.&lt;/p&gt; &lt;p&gt;By now, you may ask yourself what I am talking about.&lt;/p&gt; &lt;p&gt;We remember that Andromeda&amp;#39;s fix was due to her mother, Cassiopeia, having claimed greater beauty than the daughters of the sea-god Poseidon. To transpose the archetype into the present, it is like Tim B-L saying that OWLs (by the way sacred to Athena) are more semantic than Codd&amp;#39;s brainchild. Yet the relational community sees &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1d241648&quot;&gt;RDF&lt;/a&gt; as something not quite serious. A matter of scale(s): just think of the sea serpent.&lt;/p&gt; &lt;p&gt;So, I am talking about what I alluded to in the &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1603&quot; id=&quot;link-id0x19182b88&quot;&gt;2010 New Year&amp;#39;s statement on this blog&lt;/a&gt;: RDF as a viable alternative to relational for big &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1b896350&quot;&gt;data&lt;/a&gt;. 
This means that RDF is no longer a specialty niche where, due to the hopeless task of bringing everything into a relational model, the fact of everything taking several times both the time and space is tolerated because there is no real alternative.&lt;/p&gt; &lt;p&gt;The value proposition is that for any current RDF user, the present assets will go four times farther than before with the next release of &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x6a223b0&quot;&gt;Virtuoso&lt;/a&gt;. For a prospective RDF user, the cost of keeping an ETLed RDF integration warehouse is now in the same ballpark as the relational cost, except that &lt;a href=&quot;http://dbpedia.org/resource/Database_schema&quot; id=&quot;link-id0x15f8ed8&quot;&gt;schema&lt;/a&gt; is now flexible, and the time to integrate and answer is accordingly shorter. For users of analytics-oriented &lt;a href=&quot;http://dbpedia.org/resource/Relational_database_management_system&quot; id=&quot;link-id0x8bc44d8&quot;&gt;RDBMS&lt;/a&gt;, the next Virtuoso is a full cluster-capable &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x127faf40&quot;&gt;SQL&lt;/a&gt; column store. Its merits compared to others in this space will be published later with benchmarks like &lt;a href=&quot;http://www.tpc.org/&quot; id=&quot;link-id0x6af7ae0&quot;&gt;TPC&lt;/a&gt;-&lt;a href=&quot;http://dbpedia.org/resource/TPC-H&quot; id=&quot;link-id0x1d46f230&quot;&gt;H&lt;/a&gt;. As an extra bonus for such users, Virtuoso brings SQL federation and a growth path to RDF, should this become interesting.&lt;/p&gt; &lt;p&gt;This is accomplished by introducing a new column-wise compressed-storage engine with corresponding changes to query execution. The general principles are explained in &lt;a href=&quot;http://cs-www.cs.yale.edu/homes/dna/papers/abadiphd.pdf&quot; id=&quot;link-id0x1d259a88&quot;&gt;Daniel Abadi&amp;#39;s famous Ph.D. thesis&lt;/a&gt;. 
The compression is tuned by the data itself, without user intervention. Further, our implementation remains capable of run-time typing; thus the column-store advantages to RDF are obtained without going to a task-specific schema. But since data types, even if determined at run-time, are still in practice repetitive, the advantages of running on homogeneous vectors are not lost.&lt;/p&gt; &lt;p&gt;When storing an RDF extraction of TPC-H data, we get a storage usage of 6.3 bytes per quad. If you do not care about queries where the predicate is unspecified, the storage requirement drops to 4.7 bytes per quad. Whether storing the data as RDF quads or as Vertica-style multicolumn projections, the working set is about the same. Since having enough of the data in memory is the &lt;i&gt;sine qua non&lt;/i&gt; of flexible querying, the point is made. QED.&lt;/p&gt; &lt;p&gt;In Virtuoso also, relational remains a bit faster but a penalty of 1.3x or so for RDF is quite tolerable, considering that an &lt;i&gt;a priori&lt;/i&gt; schema is no longer needed.&lt;/p&gt; &lt;p&gt;This means that we are coming into an age where the warehouse becomes an &lt;i&gt;ad hoc&lt;/i&gt; asset, to be filled with RDF, without the need to develop an &lt;i&gt;a priori&lt;/i&gt; universal schema for all data one may ever wish to integrate, now or in the future. The data can be stored as RDF and projected from there into any form that may be needed at any time, whether the target format is more RDF or a task-specific relational schema.&lt;/p&gt; &lt;p&gt;Availability is planned for late 2010, first as a Virtuoso Open Source preview.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>VLDB Semdata Workshop - The New Frontier of Semdata</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2010-09-13#1628</atom:id>
  <atom:published>2010-09-13T22:09:24Z</atom:published>
  <atom:updated>2010-09-21T10:52:19.000004-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;This is a revised version of the talk I will be giving at the &lt;a href=&quot;http://semdata.org/events/2010/vldb&quot; id=&quot;link-id0x1d137fe0&quot;&gt;Semdata workshop&lt;/a&gt; at &lt;a href=&quot;http://www.vldb2010.org/&quot; id=&quot;link-id0x2533b280&quot;&gt;VLDB 2010&lt;/a&gt;.&lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtDirectionsChallengesSemdata&quot; id=&quot;link-id0x1cff6678&quot;&gt;The paper&lt;/a&gt; shows how we store &lt;a href=&quot;http://www.tpc.org/&quot; id=&quot;link-id0x244a65c0&quot;&gt;TPC&lt;/a&gt;-&lt;a href=&quot;http://dbpedia.org/resource/TPC-H&quot; id=&quot;link-id0x25136af8&quot;&gt;H&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x259a6460&quot;&gt;data&lt;/a&gt; as &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x268767b0&quot;&gt;RDF&lt;/a&gt; with relational-level efficiency and how we query both RDF and relational versions in comparable time. We also compare row-wise and column-wise storage formats as implemented in &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x2596dbc8&quot;&gt;Virtuoso&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;A question that has come up a few times during the Semdata initiative is how semantic data will avoid the fate of other would-be database revolutions like OODBMS and deductive databases.&lt;/p&gt; &lt;p&gt;The need and opportunity are driven by the explosion of data in quantity and diversity of structure. 
The competition consists of analytics &lt;a href=&quot;http://dbpedia.org/resource/Relational_database_management_system&quot; id=&quot;link-id0x2681bb10&quot;&gt;RDBMS&lt;/a&gt;, point solutions done with map-reduce or the like, and lastly, in some cases, key-value stores with relaxed &lt;a href=&quot;http://dbpedia.org/resource/Database_schema&quot; id=&quot;link-id0x2493ca50&quot;&gt;schema&lt;/a&gt; but limited querying.&lt;/p&gt; &lt;p&gt;The benefits of RDF are the ever-expanding volume of data published in it, reuse of vocabulary, and well-defined semantics. The downside is efficiency. This is not so much a matter of absolute scalability (you can run an RDF database on a cluster) but a question of relative cost as opposed to alternatives.&lt;/p&gt; &lt;p&gt;The baseline is that for relational-style queries, one should get relational performance or close enough. We outline in the paper how RDF reduces to a run-time-typed relational column-store, and gets all the compression and locality advantages traditionally associated with such. After memory is no longer the differentiator, the rest is engineering. So much for the scalability barrier to adoption.&lt;/p&gt; &lt;p&gt;I do not need to talk here about the benefits of &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x245f72e8&quot;&gt;linked data&lt;/a&gt; and more or less &lt;i&gt;ad hoc&lt;/i&gt; integration &lt;i&gt;per se&lt;/i&gt;. But again, to make these practical, there are logistics to resolve: How to keep data up to date? How to distribute it incrementally? How to monetize freshness? We propose some solutions for these, looking at diverse-RDF replication and RDB-to-RDF replication in Virtuoso.&lt;/p&gt; &lt;p&gt;But to realize the ultimate promise of RDF/Linked Data/Semdata, however we call it, we must look farther into the landscape of what is being done with big data. 
Here we are no longer so much running against the RDBMS, but against map-reduce and key-value stores.&lt;/p&gt; &lt;p&gt;Given the psychology of geekdom, the charm of map-reduce is understandable: One controls what is going on, can work in the usual languages, can run on big iron without being picked to pieces by the endless concurrency and timing and order-of-events issues one gets when programming a cluster. Tough for the best, and unworkable for the rest.&lt;/p&gt; &lt;p&gt;The key-value store has some of the same appeal, as it is the DBMS laid bare, so to say, made understandable, without the again intractably-complex questions of fancy query planning and distributed &lt;a href=&quot;http://dbpedia.org/resource/ACID&quot; id=&quot;link-id0x25c9a008&quot;&gt;ACID&lt;/a&gt; transactions. The psychological rewards of the sense of control are there, never mind the complex query; one can always hard-code a point solution for the business question, if one really must, maybe even in map-reduce.&lt;/p&gt; &lt;p&gt;Besides, for some things that go beyond &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x25149078&quot;&gt;SQL&lt;/a&gt; (for example, with graph structures), there really isn&amp;#39;t a good solution.&lt;/p&gt; &lt;p&gt;Now, enter &lt;a href=&quot;http://www.vertica.com/&quot; id=&quot;link-id0x268ecb90&quot;&gt;Vertica&lt;/a&gt;, &lt;a href=&quot;http://dbpedia.org/resource/Greenplum&quot; id=&quot;link-id0x25954eb8&quot;&gt;Greenplum&lt;/a&gt;, &lt;a href=&quot;http://www.ingres.com/vectorwise/&quot; id=&quot;link-id0x28cac500&quot;&gt;VectorWise&lt;/a&gt; (a &lt;a href=&quot;http://dbpedia.org/resource/MonetDB&quot; id=&quot;link-id0x28c239f8&quot;&gt;MonetDB&lt;/a&gt; project derivative from &lt;a href=&quot;http://dbpedia.org/resource/Ingres&quot; id=&quot;link-id0x24a2f498&quot;&gt;Ingres&lt;/a&gt;) and Virtuoso, maybe others, who all propose some combination of SQL- and explicit map-reduce-style control structures. 
This is nice but better is possible.&lt;/p&gt; &lt;p&gt;Here we find the next frontier of Semdata. Take &lt;a href=&quot;http://dbpedia.org/resource/Joseph_M._Hellerstein&quot; id=&quot;link-id0x257db7c0&quot;&gt;Joe Hellerstein&lt;/a&gt; et al.&amp;#39;s work on &lt;a href=&quot;http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-90.html&quot; id=&quot;link-id0x1c64ba98&quot;&gt;declarative logic for the data centric data center&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;We have heard it many times: when the data is big, the logic must go to it. We can take declarative, location-conscious rules, &lt;i&gt;&amp;#224; la&lt;/i&gt; &lt;a href=&quot;http://www.eecs.berkeley.edu/Research/Projects/Data/105733.html&quot; id=&quot;link-id0x29affc18&quot;&gt;BOOM&lt;/a&gt; and BLOOM, and combine these with the declarative query, well-defined semantics, and parallel-database capability of the leading RDF stores. Merge this with locality, compression, and throughput from the best analytics DBMS.&lt;/p&gt; &lt;p&gt;Here we have a data infrastructure that subsumes map-reduce as a special case of arbitrary distributed-parallel control flow, can send the processing to the data, and has flexible queries and schema-last capability.&lt;/p&gt; &lt;p&gt;Further, since RDF more or less reduces to relational columns, the techniques of caching and reuse and materialized joins and demand-driven indexing, &lt;i&gt;&amp;#224; la&lt;/i&gt; MonetDB, are applicable with minimal if any adaptation.&lt;/p&gt; &lt;p&gt;Such a hybrid database-fusion frontier is relevant because it addresses heterogeneous, large-scale data, with operations that are not easy to reduce to SQL, still without loss of the advantages of SQL. 
Apply this to anything from enhancing the business intelligence process by faster integration, including integration with &lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id0x268168c8&quot;&gt;linked open data&lt;/a&gt; to the map-reduce bulk processing of today. Do it with strong semantics and inference close to the data.&lt;/p&gt; &lt;p&gt;In short, RDF stays relevant by tackling real issues, with scale second to none, and decisive advantages in time-to-integrate and expressive power.&lt;/p&gt; &lt;p&gt;Last week I was at the &lt;a href=&quot;http://lod2.eu/&quot; id=&quot;link-id0x29e23be8&quot;&gt;LOD2&lt;/a&gt; &lt;a href=&quot;http://lod2.eu/BlogPost/9-press-release-lod2-project-launch.html&quot; id=&quot;link-id0x1aec1c10&quot;&gt;kick off&lt;/a&gt; and a &lt;a href=&quot;http://www.larkc.eu/&quot; id=&quot;link-id0x245f1168&quot;&gt;LarKC&lt;/a&gt; meeting. The capabilities envisioned in this and the following post mirror our commitments to the EU co-funded LOD2 project. This week is VLDB and the Semdata workshop. I will talk more about how these trends are taking shape within the Virtuoso product development roadmap in future posts.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Transactional High Availability in Virtuoso Cluster Edition</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2010-04-14#1623</atom:id>
  <atom:published>2010-04-14T22:21:52Z</atom:published>
  <atom:updated>2010-04-14T19:13:00-04:00</atom:updated>
  <atom:content type="html">&lt;h2&gt;Introduction&lt;/h2&gt; &lt;p&gt;This post discusses the technical specifics of how we accomplish smooth transactional operation in a database server cluster under different failure conditions. (&lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1621&quot; id=&quot;link-id0x198e8e68&quot;&gt;A higher-level short version&lt;/a&gt; was posted last week.) The reader is expected to be familiar with the basics of &lt;a href=&quot;http://dbpedia.org/resource/Distributed_transaction&quot; id=&quot;link-id0x25088028&quot;&gt;distributed transactions&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Someone on a cloud computing discussion list called &lt;a href=&quot;http://dbpedia.org/resource/Two-phase_commit_protocol&quot; id=&quot;link-id0x21addd50&quot;&gt;two-phase commit&lt;/a&gt; (&lt;a href=&quot;http://dbpedia.org/resource/Two-phase_commit_protocol&quot; id=&quot;link-id0x1eb6bc90&quot;&gt;2PC&lt;/a&gt;) the &amp;quot;anti-availability protocol.&amp;quot; There is indeed a certain anti-&lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x28e1dbd0&quot;&gt;SQL&lt;/a&gt; and anti-2PC sentiment out there, with key-value stores and &amp;quot;eventual consistency&amp;quot; being talked about a lot. Indeed, if we are talking about wide-area replication over high-latency connections, then 2PC with synchronously-sharp transaction boundaries over all copies is not really workable.&lt;/p&gt; &lt;p&gt;For multi-site operations, a level of &lt;i&gt;eventual&lt;/i&gt; consistency is indeed quite unavoidable. Exactly what the requirements are depends on the application, so I will focus here on operations inside one site.&lt;/p&gt; &lt;p&gt;The key-value store culture seems to focus on workloads where a record is relatively self-contained. The record can be quite long, with repeating fields, different selections of fields in consecutive records, and so forth. 
Such a record would typically be split over many tables of a relational &lt;a href=&quot;http://dbpedia.org/resource/Database_schema&quot; id=&quot;link-id0x216479f8&quot;&gt;schema&lt;/a&gt;. In the &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x672d740&quot;&gt;RDF&lt;/a&gt; world, such a record would be split even wider, with the &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x72c8ec0&quot;&gt;information&lt;/a&gt; needed to reconstitute the full record almost invariably split over many servers. This comes from the mapping between the text of URIs and their internal IDs being partitioned in one way, and the many indices on the RDF quads each in yet another way.&lt;/p&gt; &lt;p&gt;So it comes to pass that in the &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x216c6280&quot;&gt;data&lt;/a&gt; models we are most interested in, the application-level &lt;a href=&quot;http://dbpedia.org/resource/Entity&quot; id=&quot;link-id0x224444c0&quot;&gt;entity&lt;/a&gt; (&lt;i&gt;e.g.,&lt;/i&gt; a user account in a social network) is not a contiguous unit with a single global identifier. The social network user account, that the key-value store would consider a unit of replication mastering and eventual consistency, will be in RDF or SQL a set of maybe hundreds of tuples, each with more than one index, nearly invariably spanning multiple nodes of the database cluster.&lt;/p&gt; &lt;p&gt;So, before we can talk about wide-area replication and eventual consistency with application-level semantics, we need a database that can run on a fair-sized cluster and have cast-iron consistency within its bounds. 
If such a cluster is to be large and is to operate continuously, it must have some form of redundancy to cover for hardware failures, software upgrades, reboots, etc., without interruption of service.&lt;/p&gt; &lt;p&gt;This is the point of the design space we are tackling here.&lt;/p&gt; &lt;h2&gt;Non Fault-Tolerant Operation&lt;/h2&gt; &lt;p&gt;There are two basic modes of operation we cover: bulk load, and online transactions.&lt;/p&gt; &lt;p&gt;In the case of bulk load, we start with a consistent image of the database; load data; and finish by making another consistent image. If there is a failure during load, we lose the whole load, and restart from the initial consistent image. This is quite simple and is not properly transactional. It is quicker for filling a warehouse but is not to be used for anything else. In the remainder, we will only talk about online transactions.&lt;/p&gt; &lt;p&gt;When all cluster nodes are online, operation is relatively simple. Each entry of each index belongs to a partition that is determined by the values of one or more partitioning columns of said index. There are no tables separate from indices; the relational row is on the index leaf of its primary key. Secondary indices reference the row by including the primary key. Blobs are in the same partition as the row which contains the blob. Each partition is then stored on a &amp;quot;cluster node.&amp;quot; In non fault-tolerant operations, each such cluster node is a single process with exclusive access to its own permanent storage, consisting of database files and logs; &lt;i&gt;i.e.,&lt;/i&gt; each node is a single server instance. It does not matter if the storage is local or on a SAN, the cluster node is still the only one accessing it.&lt;/p&gt; &lt;p&gt;When things are not fault tolerant, transactions work as follows:&lt;/p&gt; &lt;p&gt;When there are updates, two-phase commit is used to guarantee a consistent result. 
Each transaction is coordinated by one cluster node, which issues the updates in parallel to all cluster nodes concerned. Sending two update messages instead of one does not significantly impact latency. The coordinator of each transaction is the primary authority for the transaction&amp;#39;s outcome. If the coordinator of the transaction dies between the phases of the commit, the transaction branches stay in the prepared state until the coordinator is recovered and can be asked again about the outcome of the transaction. Likewise, if a non-coordinating cluster node with a transaction branch dies between the phases, it will do a roll-forward and ask the coordinator for the outcome of the transaction.&lt;/p&gt; &lt;p&gt;If cluster nodes occasionally crash and then recover relatively quickly, without ever losing transaction logs or database files, this is resilient enough. Everything is symmetrical; there are no cluster nodes with special functions, except for one master node that has the added task of resolving distributed deadlocks.&lt;/p&gt; &lt;p&gt;I suppose our anti-SQL person called 2PC &amp;quot;anti-availability&amp;quot; because in the above situation we have the following problems: if any one cluster node is offline, it is quite likely that no transaction can be committed. This is so unless the data is partitioned on a key with application semantics, and all data touched by a transaction usually stays within a single partition. Then operations could proceed on most of the data while one cluster node was recovering. But, especially with RDF, this is never the case, since keys are partitioned in ways that have nothing to do with application semantics. Further, if one uses XA or &lt;a href=&quot;http://dbpedia.org/resource/Microsoft&quot; id=&quot;link-id0x785bc50&quot;&gt;Microsoft&lt;/a&gt; DTC with the monitor on a single box, this box can become a bottleneck and/or a single point of failure. 
(Among other considerations, this is why &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x72a1ea8&quot;&gt;Virtuoso&lt;/a&gt; does not rely on any such monitor.) Further, if a cluster node dies never to be heard of again, leaving prepared but uncommitted transaction branches, the rest of the system has no way of telling what to do with them, again unless relying on a monitor that is itself liable to fail.&lt;/p&gt; &lt;p&gt;If transactions have a real world counterpart, it is possible, at least in theory, to check the outcome against the real world state: One can ask a customer if an order was actually placed or a shipment delivered. But when a transaction has to do with internal identifiers of things, for example whether &lt;b&gt;&lt;code&gt;mailto://plaidskirt@hotdate.com&lt;/code&gt;&lt;/b&gt; has internal ID &lt;b&gt;&lt;code&gt;0xacebabe&lt;/code&gt;&lt;/b&gt;, such a check against external reality is not possible.&lt;/p&gt; &lt;h2&gt;Fault-Tolerant Operation&lt;/h2&gt; &lt;p&gt;In a fault tolerant setting, we introduce the following extra elements: Cluster nodes are comprised of &amp;quot;quorums&amp;quot; of mutually-mirroring server instances. Each such quorum holds a partition of the data. Such a quorum typically consists of two server instances, but may have three for extra safety. If all server instances in the quorum are offline, then the cluster node is offline, and the cluster is not fully operational. If at least one server instance in a quorum is online, then the cluster node is online, and the cluster is operational and can process new transactions.&lt;/p&gt; &lt;p&gt;We designate one cluster node (&lt;i&gt;i.e.,&lt;/i&gt; one quorum of 2 or 3 server instances) to act as a master node, and we set an order of precedence among its member instances. 
In addition to arbitrating distributed deadlocks, the master instance on duty will handle reports of server instance failures, and answer questions about any transactions left hanging in prepared state by a dead transaction coordinator. If the master on duty fails, the next master in line will either notice this itself in the line of normal business or get a complaint from another server instance about not being able to contact the previous master.&lt;/p&gt; &lt;p&gt;There is no global heartbeat messaging &lt;i&gt;per se,&lt;/i&gt; but since connections between server instances are reused long-term, a dropped connection will be noticed and the master on duty will be notified. If all masters are unavailable, that entire quorum (&lt;i&gt;i.e.,&lt;/i&gt; the master node) is offline and thus (as with any entire node going offline) most operations will fail anyway, unless by chance they do not hit any data managed by that failed quorum.&lt;/p&gt; &lt;p&gt;When it receives a notice of unavailability, the master instance on duty tries to contact the unavailable server instance and if it fails, it will notify all remaining instances that that server instance is removed from the cluster. The effect is that the remaining server instances will stop attempting to access the failed instance. Updates to the partitions managed by the failed server instance are no longer sent to it, which results in updates to this data succeeding, as they are made against the other server instances in that quorum. Updates to the data of the failed server instance &lt;i&gt;will&lt;/i&gt; fail in the window of time between the actual failure and the removal, which is typically well under a second. 
The removal of a failed server instance is delegated to a central authority in order not to have everybody get in each other&amp;#39;s way when trying to effect the removal.&lt;/p&gt; &lt;p&gt;If the failed server instance left prepared uncommitted transactions behind, the server instances having such branches will in due order contact the transaction coordinator to ask what should be done. This is a normal procedure for dealing with possibly dropped commit or rollback messages. When they discover that the coordinator has been removed, the master on duty will be contacted instead. Each prepare message of a transaction lists all the server instances participating in the transaction; thus the master can check whether each has received the prepare. If all have the prepare and none has an abort, the transaction is committed. The dead coordinator may not know this or may indeed not have the transaction logged, since it sends the prepares before logging its own prepare. The recovery will handle this though. We note that of the remaining branches, there is at least one copy of the branch with the failed server instance, or else we would have a whole quorum failed. In cases where there are branches participating in an unresolved transaction where all the quorum members have failed, the system cannot decide the outcome, and will periodically retry until at least one member of the failed quorum becomes available.&lt;/p&gt; &lt;p&gt;The most complex part of the protocol is the recovery of a failed server instance. The recovery starts with a normal roll forward from the local transaction log. After this, the server instance will contact the master on duty to ask for its state. Typically, the master will reply that the recovering server instance had been removed and is out of date. When this is established, the recovering server instance will contact a live member of its quorum and ask for sync. 
The failed server instance has an approximate timestamp of its last received transaction. It knows this from the roll forward, where time markers are interspersed now and then between transaction records. The live partner then sends its transaction log(s) covering the time from a few seconds before the last transaction of the failed partner up to the present. A few transactions may get rolled forward twice but this does no harm, since these records have absolute values and no deltas and the second insert of a key is simply ignored. When the sender of the log reaches its last committed log entry, it asks the recovering server instance to confirm successful replay of the log so far. Having the confirmation, the sender will abort all unprepared transactions affecting it and will not accept any new ones until the sync is completed. If new transactions were committed between sending the last of the log and killing the uncommitted new transactions, these too are shipped to the recovering server instance in their committed or prepared state. When these are also confirmed replayed, the recovering server instance is in exact sync up to the transaction. The sender then notifies the rest of the cluster that the sync is complete and that the recovered server instance will be included in any updates of its slice of the data. The time between freeze and re-enable of transactions is the time to replay what came in between the first sync and finishing the freeze. Typically nothing came in, so the time is in milliseconds. If an application got its transaction killed in this maneuver, it will be seen as a deadlock.&lt;/p&gt; &lt;p&gt;If the recovering server instance received transactions in prepared state, it will ask about their outcome as a part of the periodic sweep through pending transactions. One of these transactions could have been one originally prepared by itself, where the prepares had gone out before it had time to log the transaction. 
Thus, this eventuality too is covered and has a consistent outcome. Failures can interrupt the recovery process. The recovering server instance will have logged as far as it got, and will pick up from this point onward. Real-time clocks on the host nodes of the cluster will have to be in approximate sync, within a margin of a minute or so. This is not a problem in a closely connected network.&lt;/p&gt; &lt;p&gt;For simultaneous failure of an entire quorum of server instances (&lt;i&gt;i.e.,&lt;/i&gt; a set of mutually-mirroring partners; a cluster node), the rule is that the last one to fail must be the first to come back up. In order to have uninterrupted service across arbitrary double failures, one must store things in triplicate; statistically, however, most double failures will not hit server instances of the same quorum.&lt;/p&gt; &lt;p&gt;The protocol for recovery of failed server instances of the master quorum (&lt;i&gt;i.e.,&lt;/i&gt; the master cluster node) is identical, except that a recovering master will have to ask the other master(s) which one is more up to date. If the recovering master has a log entry of having excluded all other masters in its quorum from the cluster, it can come back online without asking anybody. If there is no such entry, it must ask the other master(s). If all had failed at the exact same instant, none has an entry of the other(s) being excluded and all will know that they are in the same state since any update to one would also have been sent to the other(s).&lt;/p&gt; &lt;h2&gt;Failure of Storage Media&lt;/h2&gt; &lt;p&gt;When a server instance fails, its permanent storage may or may not survive. Especially with mirrored disks, storage most often survives a failure. However, the survival of the database does not depend on any single server instance retaining any permanent storage over failure. 
If storage is left in place, as in the case of an OS reboot or replacing a faulty memory chip, rejoining the cluster is done based on the existing copy of the database on the server instance. If there is no existing copy, a copy can be taken from any surviving member of the same quorum. This consists of the following steps: First, a log checkpoint is forced on the surviving instance. Normally log checkpoints are done at regular intervals, independently on each server instance. The log checkpoint writes a consistent state of the database to permanent storage. The disk pages forming this consistent image will not be written to until the next log checkpoint. Therefore copying the database file is safe and consistent as long as a log checkpoint does not take place between the start and end of copy. Thus checkpoints are disabled right after the initial checkpoint. The copy can take a relatively long time; consider 20s per gigabyte on a 1GbE network a good day. At the end of copy, checkpoints are re-enabled on the surviving server instance. The recovering database starts without a log, sees the timestamp of the checkpoint in the database, and asks for transactions from just before this time up to present. The recovery then proceeds as outlined above.&lt;/p&gt; &lt;h2&gt;Network Failures&lt;/h2&gt; &lt;p&gt;The CAP theorem states that a distributed system cannot guarantee Consistency, Availability, and Partition-tolerance all at once. &amp;quot;Partition&amp;quot; here means the split of a network.&lt;/p&gt; &lt;p&gt;It is trivially true that if the network splits so that on both sides there is a copy of each partition of the data, both sides will think themselves the live copy left online after the other died, and each will thus continue to accumulate updates. Such an event is not very probable within one site where all machines are redundantly connected to two independent switches. 
Most servers have dual 1GbE on the motherboard, and both ports should be used for cluster interconnect for best performance, with each attached to an independent switch. Both switches would have to fail in such a way as to split their respective network for a single-site network split to happen. Of course, the likelihood of a network split in multi-site situations is higher.&lt;/p&gt; &lt;p&gt;One way of guarding against network splits is to require that at least one partition of the data have all copies online. Additionally, the master on duty can request each cluster node or server instance it expects to be online to connect to every other node or instance, and to report which they could reach. If the reports differ, there is a network problem. This procedure can be performed using both interfaces or only the first or second interface of each server to determine if one of the switches selectively blocks some paths. These simple sanity checks protect against arbitrary network errors. Using TCP for inter-cluster-node communication in principle protects against random message loss, but the Virtuoso cluster protocols do not rely on this. Instead, there are protocols for retry of any transaction messages and for using keep-alive messages on any long-running functions sent across the cluster. Failure to get a keep-alive message within a certain period will abort a query even if the network connections look OK. &lt;/p&gt; &lt;h2&gt;Backups, and Recovery from Loss of Entire Site&lt;/h2&gt; &lt;p&gt;For a constantly-operating distributed system, it is hard to define what exactly constitutes a consistent snapshot. The checkpointed state on each cluster node is consistent as far as this cluster node is concerned (&lt;i&gt;i.e.,&lt;/i&gt; it contains no uncommitted data), but the checkpointed states on all the cluster nodes are not from exactly the same moment in time. 
The complete state of a cluster is the checkpoint state of each cluster node plus the current transaction log of each. If the logs were shipped in real time to off-site storage, a consistent image could be reconstructed from them. Since such shipping cannot be synchronous due to latency considerations, some transactions could be received only in part in the event of a failure of the off-site link. Such partial transactions can however be detected at reconstruction time because each record contains the list of all participants of the transaction. If some piece is found missing, the whole can be discarded. In this way integrity is guaranteed but it is possible that a few milliseconds worth of transactions get lost. In these cases, the online client will almost certainly fail to get the final success message and will recheck the status after recovery.&lt;/p&gt; &lt;p&gt;For business continuity purposes, a live feed of transactions can be constantly streamed off-site, for example to a cloud infrastructure provider. One low-cost virtual machine on the cloud will typically be enough for receiving the feed. In the event of long-term loss of the whole site, replacement servers can be procured on the cloud; thus, capital is not tied up in an aging inventory of spare servers. The cloud-based substitute can be maintained for the time it takes to rebuild an owned infrastructure, which is still at present more economical than a cloud-only solution.&lt;/p&gt; &lt;p&gt;Switching a cluster from an owned site to the cloud could be accomplished in a few hours. The prerequisite of this is that there are reasonably recent snapshots of the database files, so that replay of logs does not take too long. 
The bulk of the time taken by such a switch would be in transferring the database snapshots from S3 or similar to the newly provisioned machines, formatting the newly provisioned virtual disks, etc.&lt;/p&gt; &lt;p&gt;Rehearsing such a maneuver beforehand is quite necessary for predictable execution. We do not presently have a productized set of tools for such a switch, but can advise any interested parties on implementing and testing such a disaster recovery scheme.&lt;/p&gt; &lt;h2&gt;Conclusions&lt;/h2&gt; &lt;p&gt;In conclusion, we have shown how we can have strong transactional guarantees in a database cluster without single points of failure or performance penalties when compared with a non fault-tolerant cluster. Operator intervention is not required for anything short of hardware failure. Recovery procedures are simple, at most consisting of installing software and copying database files from a surviving cluster node. Unless permanent storage is lost in the failure, not even this is required. Real-time off-site log shipment can easily be added to these procedures to protect against site-wide failures.&lt;/p&gt; &lt;p&gt;Future work may be directed toward concurrent operation of geographically-distributed data centers with eventual consistency. Such a setting would allow for migration between sites in the event of whole-site failures, and for reconciliation between inconsistent histories of different halves of a temporarily split network. Such schemes are likely to require application-level logic for reconciliation and cannot consist of an out-of-the-box DBMS alone. 
All techniques discussed here are application-agnostic and will work equally well for Graph Model (&lt;i&gt;e.g.,&lt;/i&gt; RDF) and Relational Model (&lt;i&gt;e.g.,&lt;/i&gt; SQL) workloads.&lt;/p&gt; &lt;h3&gt; &lt;a href=&quot;http://dbpedia.org/resource/Glossary&quot; id=&quot;link-id0x24f4e378&quot;&gt;Glossary&lt;/a&gt; &lt;/h3&gt; &lt;ul&gt; &lt;li&gt; &lt;b&gt;Virtuoso Cluster (VC)&lt;/b&gt; -- a collection of Virtuoso Cluster Nodes on one or more machines, working in parallel as part of a Virtuoso Cluster.&lt;/li&gt; &lt;li&gt; &lt;b&gt;Virtuoso Cluster Node (VCN)&lt;/b&gt; -- a Virtuoso Server Instance (Non Fault-Tolerant Operations), or a Quorum of Server Instances (Fault Tolerant Operations), which is a member of a collection of Virtuoso Cluster Nodes working in parallel as part of a Virtuoso Cluster.&lt;/li&gt; &lt;li&gt; &lt;b&gt;Virtuoso Host Cluster (VHC)&lt;/b&gt; -- a collection of machines, each hosting one or more Virtuoso Server Instances, making up a Virtuoso Cluster.&lt;/li&gt; &lt;li&gt; &lt;b&gt;Virtuoso Host Cluster Node (VHCN)&lt;/b&gt; -- a machine hosting one or more Virtuoso Server Instances that are members of a Virtuoso Cluster.&lt;/li&gt; &lt;li&gt; &lt;b&gt;Virtuoso Server Instance (VSI)&lt;/b&gt; -- a single Virtuoso process with exclusive access to its own permanent storage, consisting of database files and logs. 
May comprise an entire Virtuoso Cluster Node (Non Fault-Tolerant Operations), or be one member of a quorum which comprises a Virtuoso Cluster Node (Fault Tolerant Operations).&lt;/li&gt; &lt;/ul&gt; &lt;h3&gt;Also see&lt;/h3&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://www.gbcacm.org/sites/www.gbcacm.org/files/slides/SpecialRelativity[1]_0.pdf&quot; id=&quot;link-id0x16cb22d8&quot;&gt;Special Relativity and the Problem of Database Scalability (PDF)&lt;/a&gt;, by James Starkey of &lt;a href=&quot;http://www.nimbusdb.com/&quot; id=&quot;link-id0x18f30d58&quot;&gt;NimbusDB, Inc.&lt;/a&gt; &lt;/li&gt; &lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Fault Tolerance in Virtuoso Cluster Edition (Short Version)</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2010-04-07#1621</atom:id>
  <atom:published>2010-04-07T16:40:02Z</atom:published>
  <atom:updated>2010-04-14T19:12:47.000003-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;We have for some time had the option of storing &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x28eb2178&quot;&gt;data&lt;/a&gt; in a cluster in multiple copies, in the Commercial Edition of &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x25178ed0&quot;&gt;Virtuoso&lt;/a&gt;. (This feature is not in and is not planned to be added to the Open Source Edition.)&lt;/p&gt; &lt;p&gt;Based on some feedback from the field, we decided to make this feature more user friendly. The gist of the matter is that failure and recovery processes have been automated so that neither application developer nor operating personnel needs any &lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id0x21fea428&quot;&gt;knowledge&lt;/a&gt; of how things actually work.&lt;/p&gt; &lt;p&gt;So I will here make a few high level statements about what we offer for fault tolerance. I will follow up with technical specifics in another post.&lt;/p&gt; &lt;p&gt;Three types of individuals need to know about fault tolerance:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;Executives: What does it cost? Will it really eliminate downtime?&lt;/li&gt; &lt;li&gt;System Administrators: Is it hard to configure? What do I do when I get an alert?&lt;/li&gt; &lt;li&gt;Application Developers/Programmers: Will I need to write extra code? Can old applications get fault tolerance with no changes?&lt;/li&gt; &lt;/ul&gt; &lt;p&gt;I will explain the matter to each of these three groups:&lt;/p&gt; &lt;h2&gt;Executives&lt;/h2&gt; &lt;p&gt;The value gained is elimination of downtime. The cost is in purchasing twice (or thrice) the hardware and software licenses. In reality, the cost is less since you get the whole money&amp;#39;s worth of read throughput and half the money&amp;#39;s worth of write throughput. Since most applications are about reading, this is a good deal. 
You do not end up paying for unused capacity.&lt;/p&gt; &lt;p&gt;Server instances are grouped in &amp;quot;quorums&amp;quot; of two or, for extra safety, three; as long as one member of each quorum is available, the system keeps running and nobody sees a difference, except maybe for slower response. This does not protect against widespread power outage or the building burning down; the scope is limited to hardware and software failures at one site.&lt;/p&gt; &lt;p&gt;The most basic site-wide disaster recovery plan consists of constantly streaming updates off-site. Using an off-site backup plus update stream, one can reconstitute the failed data center on a cloud provider in a few hours. Details will vary; please &lt;a href=&quot;http://www.openlinksw.com/contact/&quot; id=&quot;link-id0x2bdb0db8&quot;&gt;contact us&lt;/a&gt; for specifics.&lt;/p&gt; &lt;p&gt;Running multiple sites in parallel is also possible but specifics will depend on the application. Again, please contact us if you have a specific case in mind.&lt;/p&gt; &lt;h2&gt; System Administrators&lt;/h2&gt; &lt;p&gt;To configure, divide your server instances into quorums of 2 or 3, according to which will be mirrors of each other, with each quorum member on a different host from the others in its quorum. These things are declared in a configuration file. Table definitions do not have to be altered for fault tolerance. It is enough for tables and indices to specify partitioning. Use two switches, and two NICs per machine, and connect one of each server&amp;#39;s network cables to each switch, to cover switch failures.&lt;/p&gt; &lt;p&gt;When things break, as long as there is at least one server instance up from each quorum, things will continue to work. Reboots and the like are handled without operator intervention; if there is a broken host, then remove it and put a spare in its place. If the disks are OK, put the old disks in the replacement host and start. 
If the disks are gone, then copy the database files from the live copy. Finally, start the replacement database, and the system will do the rest. The system is online in read-write mode during all this time, including during copying.&lt;/p&gt; &lt;p&gt;Having mirrored disks in individual hosts is optional since data will be in two copies anyway. Mirrored disks will shorten the vulnerability window of running a partition on a single server instance since this will for the most part eliminate the need to copy hundreds of GB of database files when recovering a failed instance.&lt;/p&gt; &lt;h2&gt;Application Developers/Programmers&lt;/h2&gt; &lt;p&gt;An application can connect to any server instance in the cluster and have access to the same data, with full &lt;a href=&quot;http://dbpedia.org/resource/ACID&quot; id=&quot;link-id0x6451870&quot;&gt;ACID&lt;/a&gt; properties.&lt;/p&gt; &lt;p&gt;There are two types of errors that can occur in any database application: The database server instance may be offline or otherwise unreachable; and a transaction may be aborted due to a deadlock.&lt;/p&gt; &lt;p&gt;For the missing server instance, the application should try to reconnect. An &lt;a href=&quot;http://dbpedia.org/resource/Open_Database_Connectivity&quot; id=&quot;link-id0x28e859b8&quot;&gt;ODBC&lt;/a&gt;/&lt;a href=&quot;http://dbpedia.org/resource/Java_Database_Connectivity&quot; id=&quot;link-id0x28e11940&quot;&gt;JDBC&lt;/a&gt; connect string can specify a list of alternate server instances; thus as long as the application is written to try to reconnect as best practices dictate, there is no new code needed.&lt;/p&gt; &lt;p&gt;For the deadlock, the application is supposed to retry the transaction. Sometimes when a server instance drops out or rejoins a running cluster, some transactions will have to be retried. To the application, these conditions look like a deadlock. 
If the application handles deadlocks (&lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x2bda4e40&quot;&gt;SQL&lt;/a&gt; State 40001) as best practices dictate, there is no change needed.&lt;/p&gt; &lt;h2&gt;Conclusion&lt;/h2&gt; &lt;p&gt;In summary...&lt;/p&gt; &lt;ul&gt; &lt;li&gt;Limited extra cost for fault tolerance; no equipment sitting idle.&lt;/li&gt; &lt;li&gt;Easy operation: Replace servers when they fail; the cluster does the rest.&lt;/li&gt; &lt;li&gt;No changes needed to most applications.&lt;/li&gt; &lt;li&gt;No proprietary SQL APIs or special fault tolerance logic needed in applications.&lt;/li&gt; &lt;li&gt;Fully transactional programming model.&lt;/li&gt; &lt;/ul&gt; &lt;p&gt;All the above applies to both the Graph Model (&lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x22606f10&quot;&gt;RDF&lt;/a&gt;) and Relational (SQL) sides of Virtuoso. These features will be in the commercial release of Virtuoso to be publicly available in the next 2-3 weeks. 
Please &lt;a href=&quot;http://www.openlinksw.com/contact/&quot; id=&quot;link-id0x24f35648&quot;&gt;contact OpenLink Software&lt;/a&gt; Sales for details of availability or for getting advance evaluation copies.&lt;/p&gt; &lt;h3&gt; &lt;a href=&quot;http://dbpedia.org/resource/Glossary&quot; id=&quot;link-id0x6648890&quot;&gt;Glossary&lt;/a&gt; &lt;/h3&gt; &lt;ul&gt; &lt;li&gt; &lt;b&gt;Virtuoso Cluster (VC)&lt;/b&gt; -- a collection of Virtuoso Cluster Nodes on one or more machines, working in parallel as part of a Virtuoso Cluster.&lt;/li&gt; &lt;li&gt; &lt;b&gt;Virtuoso Cluster Node (VCN)&lt;/b&gt; -- a Virtuoso Server Instance (Non Fault-Tolerant Operations), or a Quorum of Server Instances (Fault Tolerant Operations), which is a member of a collection of Virtuoso Cluster Nodes working in parallel as part of a Virtuoso Cluster.&lt;/li&gt; &lt;li&gt; &lt;b&gt;Virtuoso Host Cluster (VHC)&lt;/b&gt; -- a collection of machines, each hosting one or more Virtuoso Server Instances, making up a Virtuoso Cluster.&lt;/li&gt; &lt;li&gt; &lt;b&gt;Virtuoso Host Cluster Node (VHCN)&lt;/b&gt; -- a machine hosting one or more Virtuoso Server Instances that are members of a Virtuoso Cluster.&lt;/li&gt; &lt;li&gt; &lt;b&gt;Virtuoso Server Instance (VSI)&lt;/b&gt; -- a single Virtuoso process with exclusive access to its own permanent storage, consisting of database files and logs. 
May comprise an entire Virtuoso Cluster Node (Non Fault-Tolerant Operations), or be one member of a quorum which comprises a Virtuoso Cluster Node (Fault Tolerant Operations).&lt;/li&gt; &lt;/ul&gt; &lt;h3&gt;Also see&lt;/h3&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://www.gbcacm.org/sites/www.gbcacm.org/files/slides/SpecialRelativity[1]_0.pdf&quot; id=&quot;link-id0x1320f1e8&quot;&gt;Special Relativity and the Problem of Database Scalability (PDF)&lt;/a&gt;, by James Starkey of &lt;a href=&quot;http://www.nimbusdb.com/&quot; id=&quot;link-id0x1320f2b0&quot;&gt;NimbusDB, Inc.&lt;/a&gt; &lt;/li&gt; &lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>&quot;The Acquired, The Innate, and the Semantic&quot; or &quot;Teaching Sem Tech&quot;</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2010-04-05#1619</atom:id>
  <atom:published>2010-04-05T15:21:19Z</atom:published>
  <atom:updated>2010-05-05T13:49:57-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;I was recently asked to write a section for a policy document touching the intersection of database and semantics, as a follow up to the meeting in Sofia I &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1614&quot; id=&quot;link-id0x19c4f938&quot;&gt;blogged about earlier&lt;/a&gt;. I will write about technology, but this same document also touches the matter of education and computer science curricula. Since the matter came up, I will share a few thoughts on the latter topic.&lt;/p&gt; &lt;p&gt;I have over the years trained a few truly excellent engineers and managed a heterogeneous lot of people. These days, since what we are doing is in fact quite difficult and the world is not totally without competition, I find that I must stick to core competence, which is hardcore tech and leave management to those who have time for it.&lt;/p&gt; &lt;p&gt;When younger, I thought that I could, through sheer personal charisma, transfer either technical skills, sound judgment, or drive and ambition to people I was working with. Well, to the extent I believed this, my own judgment was not sound. Transferring anything at all is difficult and chancy. I must here think of a fantasy novel where a wizard said that, &amp;quot;working such magic that makes things do what they already want to do is easy.&amp;quot; There is a grain of truth in that.&lt;/p&gt; &lt;p&gt;In order to build or manage organizations, we must work, as the wizard put it, &lt;i&gt;with&lt;/i&gt; nature, not against it. There are also counter-examples, for example my wife&amp;#39;s grandmother had decided to transform a regular willow into a weeping one by tying down the branches. Such &amp;quot;magic,&amp;quot; needless to say, takes constant maintenance; else the spell breaks.&lt;/p&gt; &lt;p&gt;To operate efficiently, either in business or education, we need to steer away from such endeavors. This is a valuable lesson, but now consider teaching this to somebody. 
Those who would most benefit from this wisdom are the least receptive to it. So again, we are reminded to stay away from the fantasy of being able to transfer some understanding we think we have and to have this take root. It will if it will and if it does not, it will take constant follow up, like the would-be weeping willow.&lt;/p&gt; &lt;p&gt;Now, in more specific terms, what can we realistically expect to teach about computer science?&lt;/p&gt; &lt;p&gt;Complexity of algorithms would be the first thing. Understanding the relative throughputs and latencies of the memory hierarchy (i.e., &lt;a href=&quot;http://dbpedia.org/resource/Cache&quot; id=&quot;link-id0x13fcc8b8&quot;&gt;cache&lt;/a&gt;, memory, local network, disk, wide area network) is the second. Understanding the difference between synchronous and asynchronous operation and the cost of synchronization (i.e., anything from waiting for a mutex to waiting for a network message) is the third.&lt;/p&gt; &lt;p&gt;Understanding how a database works would be immensely helpful for almost any application development task but this is probably asking too much.&lt;/p&gt; &lt;p&gt;Then there is the question of engineering. Where do we put interfaces and what should these interfaces expose? Well, they certainly should expose multiple instances of whatever it is they expose, since passing through an interface takes time.&lt;/p&gt; &lt;p&gt;I tried once to tell the &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x72d7490&quot;&gt;SPARQL&lt;/a&gt; committee that parameterized queries and array parameters are a self-evident truism on the database side. This is an example of an interface that exposes multiple instances of what it exposes. But the committee decided not to standardize these. There is something in the &amp;quot;semanticist&amp;quot; mind that is irrationally antagonistic to what is self-evident for databasers. 
This is further an example of ignoring precept 2 above, the point about the throughputs and latencies in the memory hierarchy. Nature is a better and more patient teacher than I; the point will become clear of itself in due time, no worry.&lt;/p&gt; &lt;p&gt;Interfaces seem to be overvalued in education. This is tricky because we should not teach that interfaces are bad either. Nature has islands of tightly intertwined processes, separated by fairly narrow interfaces. People are taught to think in block diagrams, so they probably project this also where it does not apply, thereby missing some connections and porosity of interfaces.&lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://www.larkc.eu/&quot; id=&quot;link-id0x1c5591f0&quot;&gt;LarKC&lt;/a&gt; (EU FP7 Large &lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id0x15fae798&quot;&gt;Knowledge&lt;/a&gt; Collider project) is an exercise in interfaces. The lessons so far are that coupling needs to be tight, and that the roles of the components are not always as neatly separable as the block diagram suggests.&lt;/p&gt; &lt;p&gt;Recognizing the points where interfaces are naturally narrow is very difficult. Teaching this in a curriculum is likely impossible. This is not to say that the matter should not be mentioned and examples of over-&amp;quot;paradigmatism&amp;quot; given. The geek mind likes to latch on to a paradigm (e.g., object orientation), and then they try to put it everywhere. It is safe to say that taking block diagrams too naively or too seriously makes for poor performance and needless code. 
In some cases, block diagrams can serve as tactical disinformation; i.e., you pay lip service to the values of structure, &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x6f03e90&quot;&gt;information&lt;/a&gt; hiding, and reuse, which one is not allowed to challenge, ever, and at the same time you do not disclose the competitive edge, which is pretty much always a breach of these same principles.&lt;/p&gt; &lt;p&gt;I was once at a &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1d524ce0&quot;&gt;data&lt;/a&gt; integration workshop in the US where some very qualified people talked about the process of science. They had this delightfully American metaphor for it:&lt;/p&gt; &lt;blockquote&gt; &lt;i&gt;The edge is created in the &amp;quot;Wild West&amp;quot;: there are no standards or hard-and-fast rules, and paradigmatism for paradigmatism&amp;#39;s sake is a laughing matter with the cowboys in the fringe where new ground is broken. Then there is the OK Corral, where the cowboys shoot it out to see who prevails. Then there is Dodge City, where the lawman already reigns, and compliance, standards, and paradigms are not to be trifled with, lest one get the tar-and-feather treatment and be &amp;quot;driven out o&amp;#39;Dodge.&amp;quot;&lt;/i&gt; &lt;/blockquote&gt; &lt;p&gt;So, if reality is like this, what attitude should the curriculum have towards it? Do we make innovators or followers? Well, as said before, they are not made. Or if they are made, they are at least not made in the university but much before that. I never made any of either, in spite of trying, but did meet many of both kinds. The education system needs to recognize individual differences, even though this is against the trend of turning out a standardized product. Enforced mediocrity makes mediocrity. The world has an amazing tolerance for mediocrity, it is true. 
But the edge is not created with this, if edge is what we are after.&lt;/p&gt; &lt;p&gt;But let us move to specifics of semantic technology. What are the core precepts, the equivalent of the complexity/memory/synchronization triangle of general purpose CS basics? Let us not forget that, especially in semantic technology, when we have complex operations, lots of data, and almost always multiple distributed data sources, forgetting the laws of physics carries an especially high penalty.&lt;/p&gt; &lt;ul&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Know when to ontologize, when to folksonomize.&lt;/b&gt; The history of standards has examples of &amp;quot;stacks of Babel,&amp;quot; sky-high and all-encompassing, which just result in non-communication and non-adoption. Lighter weight, community driven, &lt;a href=&quot;http://dbpedia.org/resource/Tag&quot; id=&quot;link-id0x1dbd9018&quot;&gt;tag&lt;/a&gt; folksonomy, VoCamp-style approaches can be better. But this is a judgment call, entirely contextual, having to do with the maturity of the domain of discourse, etc.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Answer only questions that are actually asked.&lt;/b&gt; This precept is two-pronged. The literal interpretation is not to do inferential closure for its own sake, materializing all implied facts of the knowledge base.&lt;/p&gt; &lt;p&gt;The broader interpretation is to take real-world problems. Expanding RDFS semantics with map-reduce and proving how many iterations this will take is a thing one can do but real-world problems will be more complex and less neat.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Deal with ambiguity.&lt;/b&gt; Data on which semantic technologies will be applied will be dirty, with errors from machine processing of natural language to erroneous human annotations. The knowledge bases will not be contradiction free. 
Michael Witbrock of CYC said many good things about this in Sofia; he would have something to say about a curriculum, no doubt.&lt;/p&gt; &lt;/li&gt; &lt;/ul&gt; &lt;p&gt;Here we see that semantic technology is a younger discipline than computer science. We can outline some desirable skills and directions to follow but the idea of core precepts is not as well formed.&lt;/p&gt; &lt;p&gt;So we can approach the question from the angle of needed skills more than of precepts of science. What should the certified semantician be able to do?&lt;/p&gt; &lt;ul&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Data integration.&lt;/b&gt; Given heterogeneous relational schemas talking about the same entities, the semantician should find existing ontologies for the domain, possibly extend these, and then map the relational data to them. After the mapping is conceptually done, the semantician must know what combination of ETL and on-the-fly mapping fits the situation. This does mean that the semantician indeed must understand databases, which I classified above as an almost unreachable ideal. But there is no getting around this. Data is increasingly what makes the world go round. From this it follows that everybody must increasingly publish, consume, and refine, i.e., integrate. The anti-database attitude of the &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id0x2038d520&quot;&gt;semantic web&lt;/a&gt; community simply has to go.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Design and implement workflows for content extraction,&lt;/b&gt; e.g., &lt;a href=&quot;http://dbpedia.org/resource/Natural_language_processing&quot; id=&quot;link-id0x713cdc0&quot;&gt;NLP&lt;/a&gt; or information extraction from images. 
This also means familiarity with NLP, desirably to the point of being able to tune the extraction rule sets of various NLP frameworks.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Design SOA workflows.&lt;/b&gt; The semantician should be able to extract and represent the semantics of business transactions and the data involved therein.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Lightweight knowledge engineering.&lt;/b&gt; The experience of building expert systems from the early days of AI is not the best possible, but with semantics attached to data, some sort of rules seem about inevitable. The rule systems will merge into the DBMS in time. Some ability to work with these, short of making expert systems, will be desirable.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Understand information quality&lt;/b&gt; in the sense of trust, provenance, errors in the information, etc. If the world is run based on data analytics, then one must know what the data in the warehouse means, what accidental and deliberate errors it contains, etc.&lt;/p&gt; &lt;/li&gt; &lt;/ul&gt; &lt;p&gt;Of course, most of these tasks take place at some sort of organizational crossroads or interface. This means that the semantician must have some project management skills; must be capable of effectively communicating with different publics and simply getting the job done, always in the face of organizational inertia and often in the face of active resistance from people who view the semantician as some kind of intruder on their turf.&lt;/p&gt; &lt;p&gt;Now, this is a tall order. The semantician will have to be reasonably versatile technically, reasonably clever, and a self-starter on top. The self-starter aspect is the hardest.&lt;/p&gt; &lt;p&gt;The semanticists I have met are more of the scholar than the IT consultant profile. 
I say &lt;i&gt;semanticist&lt;/i&gt; for the semantic web research people and &lt;i&gt;semantician&lt;/i&gt; for the practitioner we are trying to define.&lt;/p&gt; &lt;p&gt;We could start by taking people who already do data integration projects and educating them in some semantic technology. We are here talking about a different breed than the one that by nature gravitates to description logics and AI. Projecting semanticist interests or attributes on this public is a source of bias and error.&lt;/p&gt; &lt;p&gt;If we talk about a university curriculum, the part that cannot be taught is the leadership and self-starter aspect, or whatever makes a good IT consultant. Thus the semantic technology studies must be profiled so as to attract people with this profile. As quoted before, the dream job for each era is a scarce skill that makes value from something that is plentiful in the environment. At this moment and for a few moments to come, this is the data geek, or maybe even semantician profile, if we take data geek past statistics and traditional business intelligence skills.&lt;/p&gt; &lt;p&gt;The semantic tech community, especially the academic branch of it, needs to reinvent itself in order to rise to this occasion. 
The flavor of the dream job curriculum will be away from the theoretical computer science towards the hands-on of database, large systems performance, and the practicalities of getting data intensive projects delivered.&lt;/p&gt; &lt;p&gt; &lt;b&gt;Related&lt;/b&gt; &lt;/p&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://virtuoso.openlinksw.com/presentations/Linked_Data_Virtualization/Linked_Data_Virtualization.html&quot; id=&quot;link-id0x199aca78&quot;&gt;Linked Data Driven Data Virtualization for Web-scale Integration (presentation)&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1603&quot; id=&quot;link-id0x13297a70&quot;&gt;Linked Data and Virtuoso in 2010&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/blog/~kidehen/?id=1595&quot; id=&quot;link-id0x1a3d0bd0&quot;&gt;Getting The Linked Data Value Pyramid Layers Right&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1572&quot; id=&quot;link-id0x1802b170&quot;&gt;Provenance and Reification in Virtuoso&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/blog/~kidehen/?id=1519&quot; id=&quot;link-id0x19af4220&quot;&gt;The Time for RDBMS Primacy Downgrade is Nigh!&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1375&quot; id=&quot;link-id0x1a07a378&quot;&gt;Aspects of RDF to RDF Mapping&lt;/a&gt; &lt;/li&gt; &lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Upcoming RDF Loader in Unclustered Virtuoso loads Uniprot at 279 Ktriples/s!</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2010-04-02#1617</atom:id>
  <atom:published>2010-04-02T14:15:01Z</atom:published>
  <atom:updated>2010-04-02T12:59:15-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;We recently heard that &lt;a href=&quot;http://dbpedia.org/resource/Oracle_Database&quot; id=&quot;link-id0x21414c58&quot;&gt;Oracle&lt;/a&gt; 11G loaded &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x28281e50&quot;&gt;RDF&lt;/a&gt; faster than we did. Now, we never thought the speed of loading a database was as important as the speed of query results, but since this is the &lt;b&gt;&lt;i&gt;sole&lt;/i&gt;&lt;/b&gt; area where they have reportedly been tested as faster, we decided it was time loading was addressed. Indeed, without Oracle to challenge us on query performance, we would not be half as good as we are. So, spurred on by the Oracular influence, we did something about our RDF loading.&lt;/p&gt; &lt;p&gt;Performance, I have said before, is a matter of locality and parallelism. So we applied both to the otherwise quite boring exercise of loading RDF. The recipe is this: Take a large set of triples; resolve the IRIs and literals into their IDs; then insert each index of the triple table on its own thread. All the lookups and inserts are first sorted in key order to get the locality. Running the indices in parallel gets the parallelism. Then run the parser on its own thread, fetching chunks of consecutive triples and queueing them for a pool of loader threads. Then run several parsers concurrently on different files so as to make sure there is work enough at all times. Do not make many more process threads than available &lt;a href=&quot;http://dbpedia.org/resource/Central_processing_unit&quot; id=&quot;link-id0x30f3b20&quot;&gt;CPU&lt;/a&gt; threads, since they would just get in each other&amp;#39;s way.&lt;/p&gt; &lt;p&gt;The whole process is non-transactional, starting from a checkpoint and ending with a checkpoint.&lt;/p&gt; &lt;p&gt;The test system was a dual-Xeon 5520 with 72G RAM. 
The &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x3256138&quot;&gt;Virtuoso&lt;/a&gt; instance was a single server; no cluster capability was used.&lt;/p&gt; &lt;p&gt;We loaded English &lt;a href=&quot;http://dbpedia.org/resource/DBpedia&quot; id=&quot;link-id0x33b3e58&quot;&gt;DBpedia&lt;/a&gt;, 179M triples, in 15 minutes, for a rate of 198 Kt/s. Uniprot with 1.33G triples loaded in 79 minutes, for 279 Kt/s.&lt;/p&gt; &lt;p&gt;The source files were the DBpedia 3.4 English files and the &lt;a href=&quot;http://www.bio2rdf.org/&quot; id=&quot;link-id0x28266c20&quot;&gt;Bio2RDF&lt;/a&gt; copy of Uniprot, both in Turtle syntax. The uniref, uniparc and uniprot files from the Bio2RDF set were sliced into smaller chunks so as to have more files to load in parallel; the taxonomy file was loaded as is; and no other Bio2RDF files were loaded. Both experiments ran with 8 load streams, 1 per core. The CPU utilization was mostly between 1400% and 1500%, i.e., 14-15 of 16 CPU threads busy. Top load speed for a measurement window of 2 minutes was 383 Kt/s.&lt;/p&gt; &lt;p&gt;The index scheme for RDF quads was the default Virtuoso 6 configuration of 5 indices: GS, SP, OP, PSOG, and POGS. (We call this &amp;quot;3+2&amp;quot; indexing, because there are 3 partial and 2 full indices, delivering massive performance benefits over most other index schemes.) IRIs and literals reside in their own tables, each indexed from string to ID and vice versa. A full-text index on literals was not used.&lt;/p&gt; &lt;p&gt;Compared to previous performance, we have more than tripled our best single server multi-stream load speed, and multiplied our single stream load speed by a factor of 8. Some further gains may be reached by adjusting thread counts and matching vector sizes to CPU &lt;a href=&quot;http://dbpedia.org/resource/Cache&quot; id=&quot;link-id0x20403130&quot;&gt;cache&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;This will be available in a forthcoming release; this is not for download yet. 
Now that you know this, you may guess what we are doing with queries. More on this another time.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>SemData@Sofia Roundtable write-up</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2010-03-15#1615</atom:id>
  <atom:published>2010-03-15T14:46:57Z</atom:published>
  <atom:updated>2010-03-22T12:34:40.000010-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;There was last week an &lt;a href=&quot;http://www.semdata.org/&quot; id=&quot;link-id11a83cf98&quot;&gt;invitation-based roundtable&lt;/a&gt; about semantic &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1d37f598&quot;&gt;data&lt;/a&gt; management in &lt;a href=&quot;http://www.dbpedia.org/resource/Sofia&quot; id=&quot;link-id0x1ba4a208&quot;&gt;Sofia, Bulgaria&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Lots of smart people together. The meeting was hosted by &lt;a href=&quot;http://dbpedia.org/resource/Ontotext&quot; id=&quot;link-id0x1cfc83f8&quot;&gt;Ontotext&lt;/a&gt; and chaired by &lt;a href=&quot;http://www.dbpedia.org/resource/Dieter_Fensel&quot; id=&quot;link-id0x1dc6e0d0&quot;&gt;Dieter Fensel&lt;/a&gt;. On the database side we had Ontotext, &lt;a href=&quot;http://www.systap.com/&quot; id=&quot;link-id0x1cda77f0&quot;&gt;SYSTAP&lt;/a&gt; (&lt;a href=&quot;http://www.systap.com/bigdata.htm&quot; id=&quot;link-id0x1dba6a30&quot;&gt;Bigdata&lt;/a&gt;), &lt;a href=&quot;http://dbpedia.org/resource/National_Research_Institute_for_Mathematics_and_Computer_Science&quot; id=&quot;link-id0x1d8e1d88&quot;&gt;CWI&lt;/a&gt; (&lt;a href=&quot;http://dbpedia.org/resource/MonetDB&quot; id=&quot;link-id0x1d8cbcf0&quot;&gt;MonetDB&lt;/a&gt;), &lt;a href=&quot;http://www.dbpedia.org/resource/Karlsruhe_Institute_of_Technology&quot; id=&quot;link-id0x1e204cb0&quot;&gt;Karlsruhe Institute of Technology&lt;/a&gt; (YARS2/&lt;a href=&quot;http://swse.deri.ie/&quot; id=&quot;link-id0x1e653bf0&quot;&gt;SWSE&lt;/a&gt;). &lt;a href=&quot;http://www.larkc.eu/&quot; id=&quot;link-id0x1e6a4408&quot;&gt;LarKC&lt;/a&gt; was well represented, being our hosts, with STI, Ontotext, CYC, and &lt;a href=&quot;http://www.vu.nl/&quot; id=&quot;link-id0x1c8a6090&quot;&gt;VU Amsterdam&lt;/a&gt;. 
Notable absences were &lt;a href=&quot;http://dbpedia.org/resource/Oracle_Database&quot; id=&quot;link-id0x1e5ab690&quot;&gt;Oracle&lt;/a&gt;, &lt;a href=&quot;http://freebase.com/guid/9202a8c04000641f8000000005c908d6&quot; id=&quot;link-id0x1f5e5ff0&quot;&gt;Garlik&lt;/a&gt;, &lt;a href=&quot;http://semanticweb.org/id/Franz_Inc&quot; id=&quot;link-id0x1d9c08f0&quot;&gt;Franz&lt;/a&gt;, and &lt;a href=&quot;http://www.talis.com/&quot; id=&quot;link-id0x1d338b30&quot;&gt;Talis&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Now of semantic data management... What is the difference between a relational database and a semantic repository, a triple/quad store, a whatever-you-call-them?&lt;/p&gt; &lt;p&gt;I had last fall a meeting at CWI with Martin Kersten, Peter Boncz and Lefteris Sidirourgos from CWI, and Frank van Harmelen and Spiros Kotoulas of VU Amsterdam, to start a dialogue between semanticists and databasers. Here we were with many more people trying to discover what the case might be. What are the differences?&lt;/p&gt; &lt;p&gt;Michael &lt;a href=&quot;http://dbpedia.org/resource/Michael_Stonebraker&quot; id=&quot;link-id0x1da55730&quot;&gt;Stonebraker&lt;/a&gt; and Martin Kersten have basically said that what is sauce for the goose is sauce for the gander, and that there is no real difference between relational DB and &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1d828310&quot;&gt;RDF&lt;/a&gt; storage, except maybe for a little tuning in some data structures or parameters. Semantic repository implementors on the other hand say that when they tried putting triples inside an RDB it worked so poorly that they did everything from scratch. 
(It is a geekly penchant to do things from scratch, but then this is not always unjustified.)&lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://www.openlinksw.com/dataspace/organization/openlink#this&quot; id=&quot;link-id0x1cf1e620&quot;&gt;OpenLink Software&lt;/a&gt; and &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1cfbc1d8&quot;&gt;Virtuoso&lt;/a&gt; are in agreement with both sides, contradictory as this might sound. We took our &lt;a href=&quot;http://dbpedia.org/resource/Relational_database_management_system&quot; id=&quot;link-id0x1e1f6a20&quot;&gt;RDBMS&lt;/a&gt; and added data types and structures and cost model alterations to an existing platform. Oracle did the same. MonetDB considers doing this and time will tell the extent of their RDF-oriented alterations. Right now the estimate is that this will be small and not in the kernel.&lt;/p&gt; &lt;p&gt;I would say with confidence that without source code access to the RDB, RDF will not be particularly convenient or efficient to accommodate. With source access, we found that what serves RDB also serves RDF. For example, execution engine and data compression considerations are the same, with minimal tweaks for RDF&amp;#39;s run time typing needs.&lt;/p&gt; &lt;p&gt;So now we are founding a platform for continuing this discussion. There will be workshops and calls for papers and the beginnings of a research community.&lt;/p&gt; &lt;p&gt;After the initial meeting at CWI, I tried to figure what the difference was between the databaser and semanticist minds. Really, the things are close but there is still a disconnect. Database is about big sets and semantics is about individuals, maybe. The databaser discovers that the operation on each member of the set is not always the same, and the semanticist discovers that the operation on each member of the set is often the same.&lt;/p&gt; &lt;p&gt;So the semanticist says that big joins take time. 
The databaser tells the semanticist not to repeat what&amp;#39;s been obvious for 40 years and for which there is anything from partitioned hashes to merges to various vectored execution models. Not to mention columns.&lt;/p&gt; &lt;p&gt;Spiros of VU Amsterdam/LarKC says that map-reduce materializes inferential closure really fast. Lefteris of CWI says that while he is not a semantic person, he does not understand what the point of all this materializing is, nobody is asking the question, right? So why answer? I say that computing inferential closure is a semanticist tradition; this is just what they do. Atanas Kiryakov of Ontotext says that this is not just a tradition whose start and justification is in the forgotten mists of history, but actually a clear and present need; just look at all the joining you would need.&lt;/p&gt; &lt;p&gt;Michael Witbrock of CYC says that it is not about forward or backward inference on toy rule sets, but that both will be needed and on massively bigger rule sets at that. Further, there can be machine learning to direct the inference, doing the meta-reasoning merged with the reasoning itself.&lt;/p&gt; &lt;p&gt;I say that there is nothing wrong with materialization if it is guided by need, in the vein of memo-ization or cracking or recycling as is done in MonetDB. Do the work when it is needed, and do not do it again.&lt;/p&gt; &lt;p&gt;Brian Thompson of Systap/Bigdata asks whether it is not a contradiction in terms to both want pluggability and merging inference into the data, like LarKC would be doing. I say that this is difficult but not impossible and that when you run joins in a cluster database, as you decide based on the data where the next join step will be, so it will be with inference. Right there, between join steps, integrated with whatever data partitioning logic you have, for partitioning you &lt;i&gt;will&lt;/i&gt; have, data being bigger and bigger. 
And if you have reuse of intermediates and demand-driven indexing &lt;i&gt;à la&lt;/i&gt; MonetDB, this too integrates and applies to inference results.&lt;/p&gt; &lt;p&gt;So then, LarKC and CYC, can you picture a pluggable inference interface at this level of granularity? So far, I have received some more detail as to the needs of inference and database integration, essentially validating our previous intuitions and plans.&lt;/p&gt; &lt;p&gt;Aside from talking of inference, we have the more immediate issue of creating an industry out of the semantic data management offerings of today.&lt;/p&gt; &lt;p&gt;What do we need for this? We need close-to-parity with relational: doing your warehouse in RDF, with the attendant agility thereof, can&amp;#39;t cost 10x more to deploy than the equivalent relational solution.&lt;/p&gt; &lt;p&gt;We also want to tell the key-value, anti-&lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x172e8c80&quot;&gt;SQL&lt;/a&gt; people, who throw away transactions and queries, that there is a better way. And for this, we need to improve our gig just a little bit. Then you have the union of some level of &lt;a href=&quot;http://dbpedia.org/resource/ACID&quot; id=&quot;link-id0x1e0de2e8&quot;&gt;ACID&lt;/a&gt; (at least consistent read), availability, complex queries, and large scale.&lt;/p&gt; &lt;p&gt;And to do this, we need a benchmark. It needs to differentiate between online queries and browsing on one side and analytics, graph algorithms, and such on the other. We are getting there. We will soon propose a social web benchmark for RDF which has both online and analytical aspects, a data generator, a test driver, and so on, with a &lt;a href=&quot;http://www.tpc.org/&quot; id=&quot;link-id0x1e3cb130&quot;&gt;TPC&lt;/a&gt;-style set of rules. If there is agreement on this, we will all get a few times faster. At this point, RDF will be a lot more competitive with mainstream and we will cross another qualitative threshold. &lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>RDF Geography With Virtuoso</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2009-11-11#1588</atom:id>
  <atom:published>2009-11-11T17:17:27Z</atom:published>
  <atom:updated>2010-02-01T09:14:29.000012-05:00</atom:updated>
  <atom:content type="html">&lt;p&gt;We have just added a geometry &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1c0e02b0&quot;&gt;data&lt;/a&gt; type and corresponding &lt;a href=&quot;http://dbpedia.org/resource/R-tree&quot; id=&quot;link-id0x1e093220&quot;&gt;R&lt;/a&gt;-tree index to &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1ddccfe8&quot;&gt;Virtuoso&lt;/a&gt;. This follows the general scheme of &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x1b88a580&quot;&gt;SQL&lt;/a&gt;/MM, as is implemented by &lt;a href=&quot;http://dbpedia.org/resource/PostGIS&quot; id=&quot;link-id0x1d271a90&quot;&gt;PostGIS&lt;/a&gt; and many others. We have all the engine-side stuff, including optimizer support for geometry cardinality sampling and good execution plans for combinations of spatial and other joins. We have however not yet implemented all the different geometry types and library function support for them, like shortest distance between two arbitrary shapes.&lt;/p&gt; &lt;p&gt;The geometry support is for both SQL and &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x1b8d4ca8&quot;&gt;SPARQL&lt;/a&gt;. On the SQL side, it works with the ISO/IEC 13249 SQL/MM API; with &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1ed69318&quot;&gt;RDF&lt;/a&gt;, a geometry can occur as the object of a quad. If the object is a typed-literal of the &lt;code&gt;virtrdf:Geometry&lt;/code&gt; type, it gets indexed in a geometry index over all geometries in quads; no special declarations are needed. 
After this, SQL/MM predicates and functions can be used with SPARQL, like this:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;PREFIX geo: &amp;lt;&lt;a href=&quot;http://dbpedia.org/resource/Hypertext_Transfer_Protocol&quot; id=&quot;link-id0x1d2d0ae0&quot;&gt;http&lt;/a&gt;://www.w3.org/2003/01/geo/wgs84_pos#&amp;gt;
SELECT ?class COUNT (*)
WHERE
  {
    ?m geo:geometry ?geo .
    ?m a ?class .
    FILTER ( &amp;lt;bif:st_intersects&amp;gt; ( ?geo, &amp;lt;bif:st_point&amp;gt; (0, 52), 100 ) )
  }
GROUP BY ?class
ORDER BY DESC 2
&lt;/code&gt; &lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;This returns the counts of objects of each class occurring within 100 km of (0, 52), a point near London.&lt;/p&gt; &lt;p&gt;For any data set with &lt;a href=&quot;http://dbpedia.org/resource/World_Geodetic_System&quot; id=&quot;link-id0x1ec00578&quot;&gt;WGS 84&lt;/a&gt; &lt;code&gt;geo:long&lt;/code&gt; and &lt;code&gt;geo:lat&lt;/code&gt; values, a simple SQL function makes a point geometry for each such coordinate pair and adds it as the &lt;code&gt;geo:geometry&lt;/code&gt; property of the subject with the long/lat. This then enables fast spatial access to arbitrary location data in RDF.&lt;/p&gt; &lt;p&gt;Right now, we hardly see any geometries other than points in RDF data, even though there are some efforts for vocabularies for more complex entities. As these get adopted we will support them.&lt;/p&gt; &lt;p&gt;For scalability, we tried the implementation with &lt;a href=&quot;http://www.openstreetmap.org/&quot; id=&quot;link-id0x1c781e68&quot;&gt;OpenStreetMap&lt;/a&gt;&amp;#39;s 350 million or so points. The geometry implementation partitions well over a cluster, similarly to a full text index, i.e., every server has its slice of the geometries, partitioned by the geometry object&amp;#39;s key, thus not by range of coordinates or such. 
Like this, the items are evenly spread even though the coordinate distribution is highly uneven.&lt;/p&gt; &lt;p&gt;We can do spatial joins like this:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;SELECT ?s ( &amp;lt;sql:num_or_null&amp;gt; (?p) ) COUNT (*)
WHERE
  {
    ?s &amp;lt;http://&lt;a href=&quot;http://dbpedia.org/resource/DBpedia&quot; id=&quot;link-id0x1f885868&quot;&gt;dbpedia&lt;/a&gt;.org/ontology/populationTotal&amp;gt; ?p .
    FILTER ( &amp;lt;sql:num_or_null&amp;gt; (?p) &amp;gt; 1000000 ) .
    ?s geo:geometry ?geo .
    FILTER ( &amp;lt;bif:st_intersects&amp;gt; ( ?pt, ?geo, 5 ) ) .
    ?xx geo:geometry ?pt
  }
GROUP BY ?s ( &amp;lt;sql:num_or_null&amp;gt; (?p) )
ORDER BY DESC 3
LIMIT 20
&lt;/code&gt; &lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;This takes the DBpedia subjects that have a population over 1 million and a geometry. We then count all the geometries within 5 km of the point location of the first geometry. With DBpedia (about 5 million points), &lt;a href=&quot;http://www.geonames.org/&quot; id=&quot;link-id0x1d4279b0&quot;&gt;GeoNames&lt;/a&gt; (7 million points), and OpenStreetMap (350 million points), we get the result:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;http://dbpedia.org/resource/Munich 1356594 117280
http://dbpedia.org/resource/London 7355400 81486
http://dbpedia.org/resource/Davao_City 1363337 58640
http://dbpedia.org/resource/Belo_Horizonte 2412937 58640
http://dbpedia.org/resource/Chengde 3610000 58640
http://dbpedia.org/resource/Hamburg 1769117 51664
http://dbpedia.org/resource/San_Diego%2C_California 1266731 47685
http://dbpedia.org/resource/Bursa 1562828 47685
http://dbpedia.org/resource/Port-au-Prince 1082800 47685
http://dbpedia.org/resource/Oakland_County%2C_Michigan 1194156 45636
http://dbpedia.org/resource/Sana%27a 1747627 40923
http://dbpedia.org/resource/Milan 1303437 40923
http://dbpedia.org/resource/Campinas 1059420 40923
http://dbpedia.org/resource/Hohhot 2580000 40923
http://dbpedia.org/resource/Brussels 1031215 40923
http://dbpedia.org/resource/Bogra_District 2988567 40923
http://dbpedia.org/resource/Cort%C3%A9s_Department 1202510 40923
http://dbpedia.org/resource/Berlin 3416300 35668
http://dbpedia.org/resource/New_York_City 8274527 30810
http://dbpedia.org/resource/Los_Angeles%2C_California 3849378 25614&lt;br /&gt; 20 Rows. -- 1733 msec.&lt;br /&gt; Cluster 8 nodes, 1 s. 358 m/s 1596 KB/s 664% &lt;a href=&quot;http://dbpedia.org/resource/Central_processing_unit&quot; id=&quot;link-id0x1e6403b0&quot;&gt;cpu&lt;/a&gt; 2% read 16% clw threads 1r 0w 0i buffers 1124351 0 d 0 w 0 pfs &lt;/code&gt;&lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;This takes 1.7 seconds on a Virtuoso Cluster configured with 8 processes on a single dual-Xeon 5520 box, running at about 664% CPU with warm &lt;a href=&quot;http://dbpedia.org/resource/Cache&quot; id=&quot;link-id0x1e81f610&quot;&gt;cache&lt;/a&gt;. Fair enough for a first crack, this can obviously be optimized further. Still, the geo part of the processing is already as good as instantaneous.&lt;/p&gt; &lt;p&gt;We will shortly have the geography features installed on DBpedia and the other data sets we host. As these come online we will show more demo queries.&lt;/p&gt; &lt;p&gt;For more about SQL/MM, you can look to a couple of PDFs:&lt;/p&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://www.fer.hr/_download/repository/SQLMM_Spatial-_The_Standard_to_Manage_Spatial_Data_in_Relational_Database_Systems.pdf&quot; id=&quot;link-id133775f0&quot;&gt;SQL/MM Spatial: The Standard to Manage Spatial Data in Relational Database Systems&lt;/a&gt; by Knut Stolze&lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.sigmod.org/record/issues/0112/standards.pdf&quot; id=&quot;link-id1433c5e0&quot;&gt;SQL Multimedia and Application Packages (SQL/MM)&lt;/a&gt; by Jim Melton and Andrew Eisenberg&lt;/li&gt; &lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>European Commission and the Data Overflow</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2009-10-27#1586</atom:id>
  <atom:published>2009-10-27T18:29:51Z</atom:published>
  <atom:updated>2009-10-27T14:57:31-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;The European Commission recently circulated a questionnaire to selected experts on what could be done for the future of big &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x43bae00&quot;&gt;data&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Since the &lt;a href=&quot;http://cordis.europa.eu/fp7/ict/content-knowledge/consultation_en.html&quot; id=&quot;link-id1191c0f8&quot;&gt;questionnaire is public&lt;/a&gt;, I am publishing my answers below.&lt;/p&gt; &lt;ol type=&quot;1&quot; start=&quot;1&quot;&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Data and data types&lt;/b&gt; &lt;/p&gt; &lt;ol type=&quot;a&quot; start=&quot;1&quot;&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What volumes of data are we dealing with today? What is the growth rate? Where can we expect to be in 2015? &lt;/b&gt; &lt;/p&gt; &lt;p&gt;Private data warehouses of corporations have more than doubled yearly for the past years; hundreds of TB is not exceptional. This will continue. The real shift is in structured data being published in increasing quantities with a minimum level of integrate-ability through use of &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x5c7add0&quot;&gt;RDF&lt;/a&gt; and &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x5c7adb8&quot;&gt;linked data&lt;/a&gt; principles. There are rewards for use of standard vocabularies and identifiers through search engines recognizing such data. There is convergence around &lt;a href=&quot;http://dbpedia.org/resource/DBpedia&quot; id=&quot;link-id0x5c7ada0&quot;&gt;DBpedia&lt;/a&gt; identifiers for real-world entities, e.g., most things that would be in the news.&lt;/p&gt; &lt;p&gt;This also means that internal data processes and silos may be enriched with this content. 
There is consequent pressure for accommodating more diversity of data, with more flexible &lt;a href=&quot;http://dbpedia.org/resource/Database_schema&quot; id=&quot;link-id0x7d87a88&quot;&gt;schema&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Ultimately, all content presently stored in RDBs and presented in publicly accessible dynamic web pages will end up on the web of linked data. Examples are product catalogs, price lists, event schedules and the like.&lt;/p&gt; &lt;p&gt;The volume of the well-known linked data sets is around 10 billion statements. With the above-mentioned trends, growth by two or three orders of magnitude by 2015 seems reasonable. This is especially so if explicit semantics are extracted from the document web and if there is some further progress in the precision/recall of such extraction.&lt;/p&gt; &lt;p&gt;Relevant sections of this mass of data are a potential addition to any present or future analytics application.&lt;/p&gt; &lt;p&gt;Since arbitrary analytics over the database which is the web cannot be economically provided by a centralized search engine, a cloud model may be used for on-demand selection of relevant data and mixing it with private data. This will drive database innovation for the next years even more than the continued classical warehouse growth.&lt;/p&gt; &lt;p&gt;Science data is another driver of the data overflow. For example, faster gene sequencing, more accurate measurements in high energy physics, better imaging, and remote sensing will produce large volumes of data. This data has highly regular structure but labeling this data with source and lineage calls for a flexible, schema-last, self-describing model, such as RDF and linked data. 
Data and &lt;a href=&quot;http://dbpedia.org/resource/Metadata&quot; id=&quot;link-id0x7a3fb40&quot;&gt;metadata&lt;/a&gt; should travel together but may have different data models.&lt;/p&gt; &lt;p&gt;By and large, the metadata of science data will be another stream to the web of linked data, at least to the degree it is publicly accessible. Restricted circles can and likely will implement similar ideas.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What types of data can we deal with intelligently due to their inherent structure (geospatial, temporal, social or &lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id0x5a48058&quot;&gt;knowledge&lt;/a&gt; graphs, 3D, sensor streams...)?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;All the above types should be supported inside one DBMS so as to allow efficient querying combining conditions on all these types of data, e.g., &lt;i&gt;photos of sunsets taken last summer in Ibiza, with over 20 megapixels, by people I know.&lt;/i&gt; &lt;/p&gt; &lt;p&gt;Note that the test for being a sunset is an operation on the image blob that should be taken to the data; the images cannot be economically transferred.&lt;/p&gt; &lt;p&gt;Interleaving of all database functions and types becomes increasingly important.&lt;/p&gt; &lt;/li&gt; &lt;/ol&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Industries, communities&lt;/b&gt; &lt;/p&gt; &lt;ol type=&quot;a&quot; start=&quot;1&quot;&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Who is producing these data and why? Could they do it better? How?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;Right now, projects such as &lt;a href=&quot;http://www.bio2rdf.org/&quot; id=&quot;link-id0x2a29de8&quot;&gt;Bio2RDF&lt;/a&gt;, &lt;a href=&quot;http://neurocommons.org/page/Main_Page&quot; id=&quot;link-id0x7ddaed0&quot;&gt;Neurocommons&lt;/a&gt;, and DBPedia produce this data. The processes are in place and are reasonable. Incremental improvement is to be expected. 
These processes, along with the &lt;a href=&quot;http://www.w3.org/DesignIssues/LinkedData.html&quot; id=&quot;link-id0xbab4dfd0&quot;&gt;linked data meme&lt;/a&gt; generally taking off, drive demand for better &lt;a href=&quot;http://dbpedia.org/resource/Natural_language_processing&quot; id=&quot;link-id0x51f4e0&quot;&gt;NLP&lt;/a&gt; (&lt;a href=&quot;http://dbpedia.org/resource/Natural_language_processing&quot; id=&quot;link-id0x51a1b48&quot;&gt;Natural Language Processing&lt;/a&gt;), e.g., &lt;a href=&quot;http://dbpedia.org/resource/Entity&quot; id=&quot;link-id0x956680&quot;&gt;entity&lt;/a&gt; and relationship extraction, especially extraction that can produce instance data in given ontologies (e.g., events) using common identifiers (e.g., DBPedia URIs).&lt;/p&gt; &lt;p&gt;Mapping of RDBs to RDF is possible, and a W3C working group is developing standards for this. The required baseline level has been reached; the rest is a matter of automating deployment. Within the enterprise, there are advantages to be gained for &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x7da9e80&quot;&gt;information&lt;/a&gt; integration; e.g., all entities in the CRM space can be integrated with all email and support tickets through giving everything a &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Identifier&quot; id=&quot;link-id0x71673f8&quot;&gt;URI&lt;/a&gt;. Some of this information may even be published on an &lt;a href=&quot;http://dbpedia.org/resource/Extranet&quot; id=&quot;link-id0x9aa6e0&quot;&gt;extranet&lt;/a&gt; for self-service and web-service interfaces. This has been done at small scales and the rest is a matter of spreading adoption and lowering the entry barrier. Incremental progress will take place, eventually resulting in qualitatively better integration along the value chain when adoption is sufficiently widespread.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Who is consuming these data and why? 
Could they do it better? How?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;Consumers are various. The greatest need is for tools that summarize complex data and allow getting a bird&amp;#39;s eye view of what data is in the first instance available. Consuming the data is hindered by the user not even necessarily knowing what data there is. This is somewhat new, as traditionally the business analyst did know the schema of the warehouse and was proficient with &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x7f7b148&quot;&gt;SQL&lt;/a&gt; report generators and statistics packages.&lt;/p&gt; &lt;p&gt;Where Web 2.0 made the &lt;i&gt;citizen journalist&lt;/i&gt;, the web of linked data will make the &lt;i&gt;citizen analyst&lt;/i&gt;. For this to happen, with benefits for individuals, enterprises, and governments alike, more work in user interfaces, knowledge discovery, and query composition will be useful. We may envision a &amp;quot;meshup economy&amp;quot; where data is plentiful, but the unit of value and exchange is the smart report that crystallizes actionable value from this ocean.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What industrial sectors in Europe could become more competitive if they became much better at managing data?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;Any sector could benefit. Early adopters are seen in the biomedical field and to an extent in media. &lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Is the regulation landscape imposing constraints (privacy, compliance ...) that don&amp;#39;t have today good tool support?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;The regulation landscape drives database demand through data retention requirements and the like.&lt;/p&gt; &lt;p&gt;With data integration, especially with privacy-sensitive data (as in medicine), there are issues of whether one dares put otherwise-shareable information online. 
Regulation is needed to protect individuals, but integration should still be possible for science.&lt;/p&gt; &lt;p&gt;For this, we see a need for progress in applying policy-based approaches (e.g., row level security) to relatively schema-last data such as RDF. This is possible but needs some more work. Also, creating on-the-fly-anonymizing views on data might help.&lt;/p&gt; &lt;p&gt;More research is needed for reconciling the need for security with the advantages of broad-based &lt;i&gt;ad hoc&lt;/i&gt; integration. Ideally, data should be intelligent, aware of its origins and classification and cautious of whom it interacts with, all of this supported under the covers so that the user could ask anything but the data might refuse to answer or might restrict answers according to the user&amp;#39;s profile. This is a tall order and implementing something of the sort is an open question.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What are the main practical problem identified for individuals and organizations? Please give examples and tell us about the main obstacles and barriers.&lt;/b&gt; &lt;/p&gt; &lt;p&gt;We have come across the following:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;Knowing that the data exists in the first place.&lt;/li&gt; &lt;li&gt;If the data is found, figuring out the provenance, units and precision of measurement, identifiers, and the like.&lt;/li&gt; &lt;li&gt;Compatible subject matter but incompatible representation: For example, one has numbers on a map with different maps for different points in time; another has time series of instrument data with geo-location for the instrument. It is only to be expected that the time interval between measurements is not the same. So there is need for a lot of one-off programming to align data.&lt;/li&gt; &lt;/ul&gt; &lt;p&gt;Other problems have to do with sheer volume, i.e., transfer of data even in a local area network is too slow, let alone over a wide area network. 
Computation needs to go to the data, and databases need to support this.&lt;/p&gt; &lt;/li&gt; &lt;/ol&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Services, software stacks, protocols, standards, benchmarks&lt;/b&gt; &lt;/p&gt; &lt;ol type=&quot;a&quot; start=&quot;1&quot;&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What combinations of components are needed to deal with these problems?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;Recent times have seen a proliferation of special purpose databases. Since the data needs of the future are about combining data with maximum agility and minimum performance hit, there is a need to gather the currently-separate functionality into an integrated system with sufficient flexibility. We see some of this in integration of map-reduce and scale-out databases. The former antagonists have become partners. Vertica, &lt;a href=&quot;http://dbpedia.org/resource/Greenplum&quot; id=&quot;link-id0x7a94e70&quot;&gt;Greenplum&lt;/a&gt;, and OpenLink &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x2ab2868&quot;&gt;Virtuoso&lt;/a&gt; are examples of DBMSs featuring work in this direction.&lt;/p&gt; &lt;p&gt;Interoperability and at least &lt;i&gt;de facto&lt;/i&gt; standards in ways of doing this will emerge.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What data exchange and processing mechanisms will be needed to work across platforms and programming languages?&lt;/b&gt; &lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://dbpedia.org/resource/Hypertext_Transfer_Protocol&quot; id=&quot;link-id0x78a0458&quot;&gt;HTTP&lt;/a&gt;, &lt;a href=&quot;http://dbpedia.org/resource/XML&quot; id=&quot;link-id0x7ff2360&quot;&gt;XML&lt;/a&gt;, and RDF are in fact very verbose, yet these are the formats and models that have uptake. 
Thus, these will continue to be used even though one might think binary formats to be more efficient.&lt;/p&gt; &lt;p&gt;There are of course science data set standards that are more compressed and these will continue, hopefully adding a practice of rich metadata in RDF.&lt;/p&gt; &lt;p&gt;For internals of systems, MPI and TCP/IP with proprietary optimized wire formats will continue. Inter-system communication will likely continue to be HTTP, XML, and RDF as appropriate.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What data environments are today so wastefully messy that they would benefit from the development of standards?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;RDF and &lt;a href=&quot;http://dbpedia.org/resource/Web_Ontology_Language&quot; id=&quot;link-id0x5643d70&quot;&gt;OWL&lt;/a&gt; are not messy but they could use some more performance; we are working on this. &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x152ab18&quot;&gt;SPARQL&lt;/a&gt; is finally acquiring the capabilities of a serious query language, so things are slowly coming together.&lt;/p&gt; &lt;p&gt;Community process for developing application-domain-specific vocabularies works quite well, even though one could argue it is &lt;i&gt;ad hoc&lt;/i&gt; and not up to what a modeling purist might wish.&lt;/p&gt; &lt;p&gt;Top-down imposition of standards has a mixed history, with long and expensive development and sometimes little or no uptake; consider some WS* standards, for example.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What kind of performance is expected or required of these systems? Who will measure it reliably? 
How?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;Relational databases have a history of substantial investment in &lt;a href=&quot;http://dbpedia.org/resource/Program_optimization&quot; id=&quot;link-id0xecc100&quot;&gt;optimization&lt;/a&gt; and some of them are very good for what they do, e.g., the newer generation of analytics databases.&lt;/p&gt; &lt;p&gt;The very large schema-last, no-SQL, sometimes eventually consistent key-value stores have a somewhat shorter history but do fill a real need.&lt;/p&gt; &lt;p&gt;These trends will merge: Extreme scale, schema-last, complex queries, even more complex inference, custom code for in-database machine learning and other bulk processing.&lt;/p&gt; &lt;p&gt;We find RDF augmented with some binary types at this crossroads. This point of the design space will have to provide performance roughly on the level of today&amp;#39;s best relational solution for workloads that fit the relational model. The added cost of schema-last and inference must come down. We are working on this. Research work such as carried out with &lt;a href=&quot;http://dbpedia.org/resource/MonetDB&quot; id=&quot;link-id0x7ae2890&quot;&gt;MonetDB&lt;/a&gt; gives clues as to how these aims can be reached.&lt;/p&gt; &lt;p&gt;The separation of query language and inference is artificial. After the concepts are mature, these functions will merge and execute close to the data; there are clear evolutionary pressures in this direction.&lt;/p&gt; &lt;p&gt;Benchmarks are key. Some gain can be had even from repurposing standard relational benchmarks like &lt;a href=&quot;http://www.tpc.org/&quot; id=&quot;link-id0x71eb528&quot;&gt;TPC&lt;/a&gt;-&lt;a href=&quot;http://dbpedia.org/resource/TPC-H&quot; id=&quot;link-id0x5e16a40&quot;&gt;H&lt;/a&gt;. But the TPC-H rules do not allow official reporting of such.&lt;/p&gt; &lt;p&gt;Development of benchmarks for RDF, complex queries, and inference is needed. 
A bold challenge to the community, it should be rooted in real-life integration needs and involve high heterogeneity. A key-value store benchmark might also be conceived. A transaction benchmark like TPC-C might be the basis, maybe augmented with massive user-generated content like reviews and blogs.&lt;/p&gt; &lt;p&gt;If benchmarks exist and are not too easy nor inaccessibly difficult nor too expensive to run (think of the high-end TPC-C results), then TPC-style rules and processes would be quite adequate. The threshold to publish should be lowered: Everybody runs the TPC workloads internally but few publish.&lt;/p&gt; &lt;p&gt;Some EC initiative for benchmarking could make sense, similar to the TREC initiative of the US government. Industry should be consulted for the specific content; possibly the answers to the present questionnaire can provide an approximate direction.&lt;/p&gt; &lt;p&gt;Benchmarks should be run by software vendors on their own systems, tuned by themselves. But there should be a process of disclosure and auditing; the TPC rules give an example. Compliance should not be too expensive or time consuming. Some community development for automating these things would be a worthwhile target for EC funding.&lt;/p&gt; &lt;/li&gt; &lt;/ol&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Usability and training&lt;/b&gt; &lt;/p&gt; &lt;ol type=&quot;a&quot; start=&quot;1&quot;&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;How difficult will it be for a developer of average competence to deploy components whose core is based on rather deep computer science? Do we all need to understand Monads and Continuations? What can be done to make it ever easier?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;In the database world, huge advances in technology have taken place behind a relatively simple and stable interface: SQL. 
For the linked data &lt;a href=&quot;http://dbpedia.org/resource/Giant_Global_Graph&quot; id=&quot;link-id0x7761e50&quot;&gt;web&lt;/a&gt;, the same will take place behind SPARQL.&lt;/p&gt; &lt;p&gt;Beyond these, for example, programming with MPI with good utilization of a cluster platform for an arbitrary algorithm, is quite difficult. The casual amateur is hereby warned.&lt;/p&gt; &lt;p&gt;There is no single solution. For automatic parallelization, since explicit, programmatic parallelization of things with MPI for example is very unscalable in terms of required skill, we should favor declarative and/or functional approaches.&lt;/p&gt; &lt;p&gt;Developing a debugger and explanation engine for rule-based and description-logics-based inference would be an idea.&lt;/p&gt; &lt;p&gt;For procedural workloads, things like Erlang may be good in cases and are not overly difficult in principle, especially if there are good debugging facilities.&lt;/p&gt; &lt;p&gt;For shipping functions in a cluster or cloud, the &lt;a href=&quot;http://www.eecs.berkeley.edu/Research/Projects/Data/105733.html&quot; id=&quot;link-id0x5494b0&quot;&gt;BOOM&lt;/a&gt; (&lt;a href=&quot;http://www.eecs.berkeley.edu/Research/Projects/Data/105733.html&quot; id=&quot;link-id0x7f1f148&quot;&gt;Berkeley Orders Of Magnitude&lt;/a&gt;) approach or logic programming with explicit specification of compute location seem promising, surely more flexible than map-reduce. The question is whether a &lt;a href=&quot;http://dbpedia.org/resource/PHP&quot; id=&quot;link-id0x5c758c8&quot;&gt;PHP&lt;/a&gt; developer can be made to do logic programming.&lt;/p&gt; &lt;p&gt;This bridge will be crossed only with actual need and even then reluctantly. We may look at the Web 2.0 practice of sharding &lt;a href=&quot;http://dbpedia.org/resource/MySQL&quot; id=&quot;link-id0x432f868&quot;&gt;MySQL&lt;/a&gt;, inconvenient as this may be, for an example. 
There is inertia and thus re-architecting is a constant process that is generally in reaction to facts, &lt;i&gt;post hoc&lt;/i&gt;, often a point solution. One could argue that planning ahead would be smarter but by and large the world does not work so.&lt;/p&gt; &lt;p&gt;One part of the answer is an infinitely-scalable SQL database that expands and shrinks in the clouds, with the usual semantics, maybe optional eventual consistency and built-in map reduce. If such a thing is inexpensive enough and syntax-level-compatible with present installed base, many developers do not have to learn very much more.&lt;/p&gt; &lt;p&gt;This is maybe good for the bread-and-butter IT, but European competitiveness should not rest on this. Therefore we wish to go for bold new application types for which the client-server database application is not the model. Data-centric languages like BOOM, if they can be made very efficient and have good debugging support, are attractive there. These do require more intellectual investment but that is not a problem since the less-inquisitive part of the developer community is served by the first part of the answer.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;How is a developer of average skills going to learn about these new advanced tools? How can we plan for excellent documentation and training, community mentoring, exchange of good practices, etc... across all EU countries?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;For the most part, developers do not learn things for the sake of learning. When they have learned something and it is adequate, they stay with it for the most part and are even reluctant to engage in cross-camps interaction. The research world is often similarly insular. A new inflection in the application landscape is needed to drive learning. 
This inflection is provided by the &lt;a href=&quot;https://wiki.mozilla.org/Labs/Ubiquity&quot; id=&quot;link-id0x7f051c8&quot;&gt;ubiquity&lt;/a&gt; of mobile devices, sensor data, explicit semantics, NLP concept extraction, web of linked data, and such factors.&lt;/p&gt; &lt;p&gt;RDFa is a good example of a new technique piggybacking on something everybody uses, namely HTML. These new things should, within possibility, be deployed in the usual technology stack, &lt;a href=&quot;http://en.wikipedia.org/wiki/LAMP_%28software_bundle%29&quot; id=&quot;link-id0x77151e0&quot;&gt;LAMP&lt;/a&gt; or Java. Of course these do not have to be LAMP or Java or HTML or HTTP themselves but they must manifest through these.&lt;/p&gt; &lt;p&gt;A lot of the &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id0x7940cd0&quot;&gt;semantic web&lt;/a&gt; potential can be realized within the client-server database application model, thus no fundamental re-architecting, just some new data types and queries.&lt;/p&gt; &lt;p&gt;For data- or processing-intensive tasks, an on-demand hookup to cloud-based servers with Erlang and/or BOOM for programming model would be easy enough to learn and utilize.&lt;/p&gt; &lt;p&gt;The question is one of providing challenges. Addressing actual challenges with these techniques will lead to maturity, documentation, examples, and training. With virtual, Europe-wide distributed teams a reality in many places, Europe-wide dissemination is no longer insurmountable.&lt;/p&gt; &lt;p&gt;As the data overflow proceeds, its victims will multiply and create demand for solutions. 
The EC could here encourage research project use cases gaining an extended life past the end of research projects, possibly being maintained and multiplied and spun off.&lt;/p&gt; &lt;p&gt;If such things could be mutated into self-sustaining service businesses with pay-per-use revenue, say through a cloud SaaS business model, still primarily leveraging an open source technology stack, we could have self-propagating and self-supporting models for exploiting advanced IT. This would create interest, and interest would drive training and dissemination.&lt;/p&gt; &lt;p&gt;The problem is creating the pull.&lt;/p&gt; &lt;/li&gt; &lt;/ol&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Challenges&lt;/b&gt; &lt;/p&gt; &lt;ol type=&quot;a&quot; start=&quot;1&quot;&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What should be, in this domain, the equivalent of the Netflix challenge, Ansari X Prize, &lt;a href=&quot;http://dbpedia.org/resource/Google&quot; id=&quot;link-id0x7e72f40&quot;&gt;Google&lt;/a&gt; Lunar X Prize, etc. ... ?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;The EC itself no doubt suffers from data overflow in one function or another. Unless security/secrecy prohibits, simply publishing a large data set and a description of what operations should be done on it would be a start. The more real the data, the better; reality is consistently more complex and surprising than imagination. 
Since many interesting problems touch on fraud detection and law enforcement, there may be some security obstacles for using these application domains as subject matters of open challenges.&lt;/p&gt; &lt;p&gt;Once there is a good benchmark, as discussed above, there can be some prize money allocated for the winners, especially if the race is tight.&lt;/p&gt; &lt;p&gt;The Semantic Web Challenge and the Billion Triples Challenge exist and are useful as such, but do not seem to have any huge impact.&lt;/p&gt; &lt;p&gt;The incentives should be sufficient and part of the expenses arising from running for such challenges could be funded. Otherwise investing in existing business development will be more interesting to industry. Some industry participation seems necessary; we would wish academia and industry to work more closely together. Also, having industry supply the baseline guarantees that academia actually does further the state of the art. This is not always certain.&lt;/p&gt; &lt;p&gt;If challenges are based on actual problems, whether of the EC, its member governments, or private entities, and winning the challenge may lead to a contract for supplying an actual solution, these will naturally become more interesting for consortia involving integrators, specialist software vendors, and academia. Such a model would build actual capacity to deploy leading edge technologies in production, which is sorely needed.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;What should one do to set up such a challenge, administer, and monitor it?&lt;/b&gt; &lt;/p&gt; &lt;p&gt;The EC should probably circulate a call for actual problem scenarios involving big data. If the matter of the overflow is as dire as represented, cases should be easy to find. A few should be selected and then anonymized if needed.&lt;/p&gt; &lt;p&gt;The party with the use case would benefit by having hopefully the best work on it. The contestants would benefit from having real world needs guide R&amp;amp;D. 
The EC would not have to do very much, except possibly use some money for funding the best proposals. The winner would possibly get a large account and related sales and service income. The contestants would have to be teams possibly involving many organizations; for example, development and first-line services and support could come from different companies along a systems integrator model such as is widely used in the US.&lt;/p&gt; &lt;p&gt;There may be a good benchmark at the time, possibly resulting from FP7 itself. In such a case, the EC could offer a prize for winners. Details would have to be worked out case by case. Such a challenge could be repeated a few times, as benchmark-driven progress in databases or TREC for example have taken some years to reach a point of slowdown in progress.&lt;/p&gt; &lt;p&gt;Administrating such an activity should not be prohibitive, as most of the expertise can be found with the stakeholders.&lt;/p&gt; &lt;/li&gt; &lt;/ol&gt; &lt;/li&gt; &lt;/ol&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Provenance and Reification in Virtuoso</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2009-09-01#1573</atom:id>
  <atom:published>2009-09-01T14:44:08Z</atom:published>
  <atom:updated>2009-09-01T11:20:46.000006-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;These days, &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x37019c8&quot;&gt;data&lt;/a&gt; provenance is a big topic across the board, ranging from the &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x53c3620&quot;&gt;linked data&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Giant_Global_Graph&quot; id=&quot;link-id0x4aa3848&quot;&gt;web&lt;/a&gt;, to &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x385aff0&quot;&gt;RDF&lt;/a&gt; in general, to any kind of data integration, with or without RDF. Especially with scientific data we encounter the need for metadata and provenance, repeatability of experiments, etc. Data without context is worthless, yet the producers of said data do not always have a model or budget for metadata. And if they do, the approach is often a proprietary relational schema with web services in front.&lt;/p&gt; &lt;p&gt;RDF and linked data principles could evidently be a great help. This is a large topic that goes into the culture of doing science and will deserve a more extensive treatment down the road.&lt;/p&gt; &lt;p&gt;For now, I will talk about possible ways of dealing with provenance annotations in &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x51c4da0&quot;&gt;Virtuoso&lt;/a&gt; at a fairly technical level.&lt;/p&gt; &lt;p&gt;If data comes many-triples-at-a-time from some source (e.g., library catalogue, user of a social network), then it is often easiest to put the data from each source/user into its own graph. Annotations can then be made on the graph. The graph IRI will simply occur as the subject of a triple in the same or some other graph. 
For example, all such annotations could go into a special annotations graph.&lt;/p&gt; &lt;p&gt;On the query side, having lots of distinct graphs does not have to be a problem if the index scheme is the right one, i.e., the 4-index scheme &lt;a href=&quot;http://docs.openlinksw.com/virtuoso/rdfperformancetuning.html#rdfperfindexes&quot; id=&quot;link-id142a0798&quot;&gt;discussed in the Virtuoso documentation&lt;/a&gt;. If the query does not specify a graph, then triples in any graph will be considered when evaluating the query.&lt;/p&gt; &lt;p&gt;One could write queries like:&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt;SELECT ?pub WHERE { GRAPH ?g { ?person foaf:knows ?contact } ?contact foaf:name &amp;quot;Alice&amp;quot; . ?g xx:has_publisher ?pub }&lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;This would return the publishers of graphs that assert that somebody knows Alice.&lt;/p&gt; &lt;p&gt;Of course, the &lt;a href=&quot;http://www.w3.org/TR/2004/REC-rdf-primer-20040210/#reification&quot; id=&quot;link-id14fa9488&quot;&gt;RDF reification vocabulary&lt;/a&gt; can be used as-is to say things about single triples. It is however very inefficient and is not supported by any specific optimization. Further, reification does not seem to get used very much; thus there is no great pressure to specially optimize it.&lt;/p&gt; &lt;p&gt;If we have to say things about specific triples and this occurs frequently (i.e., for more than 10% or so of the triples), then modifying the quad table becomes an option. For all its inefficiency, the RDF reification vocabulary is applicable if reification is a rarity.&lt;/p&gt; &lt;p&gt;Virtuoso&amp;#39;s &lt;code&gt;RDF_QUAD&lt;/code&gt; table can be altered to have more columns. The problem with this is that space usage is increased and the RDF loading and query functions will not know about the columns. 
A &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x4784bf0&quot;&gt;SQL&lt;/a&gt; update statement can be used to set values for these additional columns if one knows the &lt;code&gt;G,S,P,O&lt;/code&gt;. &lt;/p&gt; &lt;p&gt;Suppose we annotated each quad with the user who inserted it and a timestamp. These would be columns in the &lt;code&gt;RDF_QUAD&lt;/code&gt; table. The next choice would be whether these were primary key parts or dependent parts. If primary key parts, these would be non-&lt;code&gt;NULL&lt;/code&gt; and would occur on every index. The same quad would then exist once for each distinct user and time at which it had been inserted. For loading functions to work, these columns would need a default. In practice, we think that having such metadata as a dependent part is more likely, so that &lt;code&gt;G,S,P,O&lt;/code&gt; are the unique identifier of the quad. Whether one would then include these columns on indices other than the primary key would depend on how frequently they were accessed.&lt;/p&gt; &lt;p&gt;In &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x4a8a7c0&quot;&gt;SPARQL&lt;/a&gt;, one could use an extension syntax like:&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt;SELECT * WHERE { ?person foaf:knows ?connection OPTION ( time ?ts ) . ?connection foaf:name &amp;quot;Alice&amp;quot; . FILTER ( ?ts &amp;gt; &amp;quot;2009-08-08&amp;quot;^^xsd:dateTime ) }&lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;This would return everybody whose &lt;code&gt;foaf:knows&lt;/code&gt; link to Alice was inserted after 2009-08-08. This presupposes that the quad table has been extended with a datetime column.&lt;/p&gt; &lt;p&gt;The &lt;code&gt;OPTION (time ?ts)&lt;/code&gt; syntax is not presently supported but we can easily add something of the sort if there is user demand for it. 
In practice, this would be an extension mechanism enabling one to access extension columns of &lt;code&gt;RDF_QUAD&lt;/code&gt; via a column &lt;code&gt;?variable&lt;/code&gt; syntax in the &lt;code&gt;OPTION&lt;/code&gt; clause.&lt;/p&gt; &lt;p&gt;If quad metadata were not for every quad but still relatively frequent, another possibility would be making a separate table with a key of &lt;code&gt;GSPO&lt;/code&gt; and a dependent part of &lt;code&gt;R&lt;/code&gt;, where &lt;code&gt;R&lt;/code&gt; would be the reification &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Identifier&quot; id=&quot;link-id0x49e6108&quot;&gt;URI&lt;/a&gt; of the quad. Reification statements would then be made with &lt;code&gt;R&lt;/code&gt; as a subject. This would be more compact than the reification vocabulary and would not modify the &lt;code&gt;RDF_QUAD&lt;/code&gt; table. The syntax for referring to this could be something like:&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt;SELECT * WHERE { ?person foaf:knows ?contact OPTION ( reify ?r ) . ?r xx:assertion_time ?ts . ?contact foaf:name &amp;quot;Alice&amp;quot; . FILTER ( ?ts &amp;gt; &amp;quot;2008-8-8&amp;quot;^^xsd:dateTime ) }&lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;We could even recognize the reification vocabulary and convert it into the reify option if this were really necessary. But since it is so unwieldy, I don&amp;#39;t think there would be huge demand. Who knows? You tell us.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>More On Parallel RDF/Text Query Evaluation</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2009-08-19#1571</atom:id>
  <atom:published>2009-08-19T17:28:50Z</atom:published>
  <atom:updated>2009-08-19T14:00:32.000006-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;We have received some more questions about &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x15ca9a30&quot;&gt;Virtuoso&lt;/a&gt;&amp;#39;s parallel query evaluation model.&lt;/p&gt; &lt;p&gt;In answer, we will here explain how we do search engine style processing by writing &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x1574c560&quot;&gt;SPARQL&lt;/a&gt;. There is no need for custom procedural code because the query optimizer does all the partitioning and the equivalent of map reduce.&lt;/p&gt; &lt;p&gt;The point is that what used to require programming can often be done in a generic query language. The technical detail is that the implementation must be smart enough with respect to parallelizing queries for this to be of practical benefit. But by combining these two things, we are a step closer to the web being the database.&lt;/p&gt; &lt;p&gt;I will here show how we do some joins combining full text, &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x15949970&quot;&gt;RDF&lt;/a&gt; conditions, and aggregates and &lt;code&gt;ORDER BY&lt;/code&gt;. The sample task is finding the top 20 entities with New York in some attribute value. Then we specify the search further by only taking actors associated with New York. 
The results are returned in the order of a composite of &lt;a href=&quot;http://dbpedia.org/resource/Entity&quot; id=&quot;link-id0x213bf310&quot;&gt;entity&lt;/a&gt; rank and text match score.&lt;/p&gt; &lt;p&gt;The basic query is:&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt; SELECT ( &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x23632230&quot;&gt;sql&lt;/a&gt;:s_sum_page ( &amp;lt;sql:vector_agg&amp;gt; ( &amp;lt;bif:vector&amp;gt; ( ?c1 , ?sm ) ), bif:vector ( &amp;#39;new&amp;#39;, &amp;#39;york&amp;#39; ) ) ) AS ?res WHERE { { SELECT ( &amp;lt;SHORT_OR_LONG::&amp;gt;(?s1) ) AS ?c1 ( &amp;lt;sql:S_SUM&amp;gt; ( &amp;lt;SHORT_OR_LONG::IRI_RANK&amp;gt; ( ?s1 ) , &amp;lt;SHORT_OR_LONG::&amp;gt; ( ?s1textp ) , &amp;lt;SHORT_OR_LONG::&amp;gt; ( ?o1 ) , ?sc ) ) AS ?sm WHERE { ?s1 ?s1textp ?o1 . ?o1 bif:contains &amp;quot;new AND york&amp;quot; OPTION ( SCORE ?sc ) } ORDER BY DESC ( &amp;lt;sql:sum_rank&amp;gt; (( &amp;lt;sql:S_SUM&amp;gt; ( &amp;lt;SHORT_OR_LONG::IRI_RANK&amp;gt; ( ?s1 ) , &amp;lt;SHORT_OR_LONG::&amp;gt; ( ?s1textp ) , &amp;lt;SHORT_OR_LONG::&amp;gt; ( ?o1 ) , ?sc ) )) ) LIMIT 20 } } &lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;This takes some explaining. The basic part is:&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt;{ ?s1 ?s1textp ?o1 . ?o1 bif:contains &amp;quot;new AND york&amp;quot; OPTION ( SCORE ?sc ) }&lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;This just makes tuples where &lt;code&gt;?s1&lt;/code&gt; is the subject, &lt;code&gt;?s1textp&lt;/code&gt; the property, and &lt;code&gt;?o1&lt;/code&gt; the literal which contains &amp;quot;New York&amp;quot;. 
For a single &lt;code&gt;?s1&lt;/code&gt;, there can of course be many properties which all contain &amp;quot;New York&amp;quot;.&lt;/p&gt; &lt;p&gt;The rest of the query gathers all the properties of an entity that contain &amp;quot;New York&amp;quot; into a single aggregate, and then gets the entity ranks of all such entities.&lt;/p&gt; &lt;p&gt;After this, the aggregates are sorted by a sum of the entity rank and a combined text score calculated based on the individual text match scores between &amp;quot;New York&amp;quot; and the strings containing &amp;quot;New York&amp;quot;. The text hit score is higher if the words repeat often and in close proximity.&lt;/p&gt; &lt;p&gt;The &lt;code&gt;s_sum&lt;/code&gt; function is a user-defined aggregate which takes 4 arguments: the rank of the subject of the triple; the predicate of the triple containing the text; the object of the triple containing the text; and the text match score.&lt;/p&gt; &lt;p&gt;These are grouped by the subject of the triple. After this, these are sorted by &lt;code&gt;sum_rank&lt;/code&gt; of the aggregate constructed with &lt;code&gt;s_sum&lt;/code&gt;. The &lt;code&gt;sum_rank&lt;/code&gt; is a SQL function combining the entity rank with the text scores of the different literals.&lt;/p&gt; &lt;p&gt;This executes as one would expect: All partitions make a text index lookup, retrieving the object of the triple. The text index entries of an object are stored in the same partition as the object. But the entity rank is a property of the subject and is partitioned by the subject. Also the &lt;code&gt;GROUP BY&lt;/code&gt; is by the subject. Thus the &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x15da01b8&quot;&gt;data&lt;/a&gt; is produced from all partitions, then streamed into the receiving partitions, determined by the subject. Each receiving partition can then get the score and group the matches by the subject. 
Since all these partial aggregates are partitioned by the subject, there is no need to merge them; thus, the top &lt;code&gt;k&lt;/code&gt; sort can be done for each partition separately. Finally, the top 20 of each partition are merged into the global top 20. This is then passed to a final function &lt;code&gt;s_sum_page&lt;/code&gt; that turns this all into an &lt;a href=&quot;http://dbpedia.org/resource/XML&quot; id=&quot;link-id0x15d59fc8&quot;&gt;XML&lt;/a&gt; fragment that can be processed with XSLT for inclusion on a web page.&lt;/p&gt; &lt;p&gt;This differs from a text search engine in that the query pipeline can contain arbitrary cross-partition joins. Also, the string &amp;quot;New York&amp;quot; is a common label that occurs in many distinct entities. Thus one text match, to one document (in this case, one containing only the string &amp;quot;New York&amp;quot;), will yield many entities, likely all from different partitions.&lt;/p&gt; &lt;p&gt;So, if we only want actors with a mention of &amp;quot;New York&amp;quot;, we need to write the inner part of the query as:&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt;{ ?s1 ?s1textp ?o1 . ?o1 bif:contains &amp;quot;new AND york&amp;quot; OPTION ( SCORE ?sc ) . ?s1 a &amp;lt;&lt;a href=&quot;http://dbpedia.org/resource/Hypertext_Transfer_Protocol&quot; id=&quot;link-id0x15befb10&quot;&gt;http&lt;/a&gt;://&lt;a href=&quot;http://umbel.org/about/&quot; id=&quot;link-id0x15c92330&quot;&gt;umbel&lt;/a&gt;.org/umbel/sc/Actor&amp;gt; }&lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;Whether an entity is an actor can be checked in the same partition as the rank of the entity. Thus the query plan performs this check right before getting the rank. 
This is natural since there is no point in getting the rank of something that is not an actor.&lt;/p&gt; &lt;p&gt;The &lt;code&gt;&amp;lt;short_or_long::sql:func&amp;gt;&lt;/code&gt; notation means that we call &lt;code&gt;func&lt;/code&gt;, which is a SQL stored procedure with the arguments in their internal form. Thus, if a variable bound to an IRI is passed, the &lt;code&gt;short_or_long&lt;/code&gt; specifies that it is passed as its internal ID and is not converted into its text form. This is essential, since there is no point getting the text of half a million IRIs when only 20 at most will be shown in the end.&lt;/p&gt; &lt;p&gt;Now, when we run this on a collection of 4.5 billion triples of &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x153772e8&quot;&gt;linked data&lt;/a&gt;, once we have the working set, we can get the top 20 &amp;quot;New York&amp;quot; occurrences, with text summaries and all, in just 1.1s, with 12 of 16 cores busy. (The hardware is two boxes with two quad-core Xeon 5345 each.)&lt;/p&gt; &lt;p&gt;If we run this query in two parallel sessions, we get both results in 1.9s, with 14 of 16 cores busy. This gets about 200K &amp;quot;New York&amp;quot; strings, which becomes about 400K entities with New York somewhere, for which a rank then has to be retrieved. After this, all the possibly-many occurrences of New York in the title, text, and other properties of the entity are aggregated together, resolving into some 220K groups. These are then sorted. This is internally over 1.5 million random lookups and some 40MB of traffic between processes. Restricting the type of the entity to actor drops the execution time of one query to 0.8s because there are then fewer ranks to retrieve and less data to aggregate and sort.&lt;/p&gt; &lt;p&gt;By adding partitions and cores, we scale horizontally, as evaluating the query involves almost no central control, even though data are swapped between partitions. 
There is some flow control to avoid constructing overly-large intermediate results but generally partitions run independently and asynchronously. In the above case, there is just one fence at the point where all aggregates are complete, so that they can be sorted; otherwise, all is asynchronous.&lt;/p&gt; &lt;p&gt;Doing &lt;code&gt;JOINs&lt;/code&gt; between partitions and partitioned &lt;code&gt;GROUP BY&lt;/code&gt;/&lt;code&gt;ORDER BY&lt;/code&gt; is pretty regular database stuff. Applying this to RDF is a most natural thing.&lt;/p&gt; &lt;p&gt;If we do not parallelize the user-defined aggregate for grouping all the &amp;quot;New York&amp;quot; occurrences, the query takes 8s instead of 1.1s. If we could not put SQL procedures as user-defined aggregates to be parallelized with the query, we&amp;#39;d have to either bring all the data to a central point before the top k, which would destroy performance, or write procedures with explicit parallel procedure calls, which is hard, surely too hard for &lt;i&gt;ad hoc&lt;/i&gt; queries.&lt;/p&gt; &lt;p&gt;&lt;a href=&quot;http://bit.ly/4jAVHC&quot; id=&quot;link-id114d58f0&quot;&gt;Results of live execution&lt;/a&gt; may not be complete on initial load, as this link includes a &amp;quot;Virtuoso Anytime&amp;quot; timeout of 10 seconds. Running against a cold cache, these results may take much longer to return; a warm cache will deliver response times along the lines of those discussed above.&lt;/p&gt; &lt;p&gt;Engineering matters. If we wish to commoditize queries on a lot of data, such intelligence in the DBMS is necessary; requiring people to write procedural code or give query parallelization hints does not scale. If you need to optimize a workload of 10 different transactions, this is of course possible and even desirable, but for the infinity of all search or analysis, this will not happen.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Updated hardware improves LUBM 8000 load rate in Virtuoso 6</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2009-08-14#1569</atom:id>
  <atom:published>2009-08-14T19:01:30Z</atom:published>
  <atom:updated>2009-08-15T15:27:29-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;We repeated the &lt;a href=&quot;http://www.openlinksw.com/dataspace/oerling/weblog/Orri%20Erling%27s%20Blog/1562&quot; id=&quot;link-id173d3068&quot;&gt;earlier LUBM 8000 experiment&lt;/a&gt; on a newer machine, with 2 x Xeon 5520 and 72G 1333MHz memory, and once again with the 2 machines as a networked cluster. Otherwise the settings were the same.&lt;/p&gt; &lt;p&gt;The load rate is now 160,739 triples-per-second.&lt;/p&gt; &lt;table&gt; &lt;tr&gt; &lt;th&gt;&lt;/th&gt; &lt;td&gt;   &lt;/td&gt; &lt;th align=&quot;center&quot;&gt;&lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x240daf38&quot;&gt;Virtuoso&lt;/a&gt; 6 &lt;br /&gt; (previous run)&lt;/th&gt; &lt;td&gt;   &lt;/td&gt; &lt;th align=&quot;center&quot;&gt;Virtuoso 6 &lt;br /&gt; (new run)&lt;/th&gt; &lt;td&gt;   &lt;/td&gt; &lt;th align=&quot;center&quot;&gt;Virtuoso 6 &lt;br /&gt; (newest run)&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;left&quot;&gt;blades&lt;/td&gt; &lt;td&gt;   &lt;/td&gt; &lt;td align=&quot;center&quot;&gt;1 &lt;/td&gt; &lt;td&gt;   &lt;/td&gt; &lt;td align=&quot;center&quot;&gt;1 &lt;/td&gt; &lt;td&gt;   &lt;/td&gt; &lt;td align=&quot;center&quot;&gt;2&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;left&quot;&gt;processors&lt;/td&gt; &lt;td&gt;   &lt;/td&gt; &lt;td align=&quot;center&quot;&gt;2 x Xeon 5410&lt;/td&gt; &lt;td&gt;   &lt;/td&gt; &lt;td align=&quot;center&quot;&gt;2 x Xeon 5520&lt;/td&gt; &lt;td&gt;   &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 2 x Xeon 5520 &lt;br /&gt;+ &lt;br /&gt;2 x Xeon 5410 &lt;br /&gt;with 1x1GigE &lt;br /&gt;interconnect &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;left&quot;&gt;memory&lt;/td&gt; &lt;td&gt;   &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 16G 667 MHz&lt;/td&gt; &lt;td&gt;   &lt;/td&gt; &lt;td align=&quot;center&quot;&gt;72G 1333 MHz&lt;/td&gt; &lt;td&gt;   &lt;/td&gt; &lt;td align=&quot;center&quot;&gt;72G 1333 MHz &lt;br /&gt;+ &lt;br /&gt; 
16G 667 MHz &lt;br /&gt; respectively&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;left&quot;&gt;reported load rate&lt;br /&gt;triples-per-second&lt;/td&gt; &lt;td&gt;   &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 110,532 &lt;/td&gt; &lt;td&gt;   &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 160,739 &lt;/td&gt; &lt;td&gt;   &lt;/td&gt; &lt;td align=&quot;center&quot;&gt; 214,188 &lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;p&gt;Again, if others talk about loading LUBM, so must we. Otherwise, this metric is rather uninteresting.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Single Virtuoso host loads 110,500 triples-per-second on LUBM 8000</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2009-06-29#1563</atom:id>
  <atom:published>2009-06-29T16:12:34Z</atom:published>
  <atom:updated>2009-08-15T16:06:45.000001-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;LUBM load speed still seems to be a metric that is quoted in comparisons of &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id142df6e8&quot;&gt;RDF&lt;/a&gt; stores. Consequently, we too measured the load time of LUBM 8000, 1,068-million triples, on the newest &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id1389dfa0&quot;&gt;Virtuoso&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;The real time for the load was 161m 3s. The rate was 110,532 triples-per-second. The hardware was one machine with 2 x Xeon 5410 (quad core, 2.33 GHz) and 16G 667 MHz RAM. The software was Virtuoso 6 Cluster, configured into 8 partitions (processes) — one partition per CPU core. Each partition had its database striped over 6 disks total; the 6 disks on the system were shared between the 8 database processes.&lt;/p&gt; &lt;p&gt;The load was done on 8 streams, one per server process. At the beginning of the load, the CPU usage was 740% with no disk; at the end, it was around 700% with 25% disk wait. 100% counts here for one CPU core or one disk being constantly busy.&lt;/p&gt; &lt;p&gt;The RDF store was configured with the default two indices over quads, these being GSPO and OGPS. Text indexing of literals was not enabled. No materialization of entailed triples was made.&lt;/p&gt; &lt;p&gt;We think that LUBM loading is not a realistic benchmark for the world but since other people publish such numbers, so do we.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Comparing Virtuoso Performance on Different Processors</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2009-05-28#1558</atom:id>
  <atom:published>2009-05-28T14:54:59Z</atom:published>
  <atom:updated>2009-05-28T11:15:41.000006-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;Over the years we have run &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x16735e20&quot;&gt;Virtuoso&lt;/a&gt; on different hardware. We will here give a few figures that help identify the best price point for machines running Virtuoso.&lt;/p&gt; &lt;p&gt;Our test is very simple: &lt;i&gt;Load 20 warehouses of &lt;a href=&quot;http://dbpedia.org/resource/TPC-C&quot; id=&quot;link-id0x16e0dba8&quot;&gt;TPC-C&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x14ff4f80&quot;&gt;data&lt;/a&gt;, and then run one client per warehouse for 10,000 new orders&lt;/i&gt;. The way this is set up, disk I/O does not play a role and lock contention between the clients is minimal.&lt;/p&gt; &lt;p&gt;The test essentially has 20 server and 20 client threads running the same workload in parallel. The load time gives the single thread number; the 20 clients run gives the multi-threaded number. The test uses about 2-3 GB of data, so all is in RAM but is large enough not to be all in processor cache.&lt;/p&gt; &lt;p&gt;All times reported are real times, starting from the start of the first client and ending with the completion of the last client.&lt;/p&gt; &lt;p&gt;Do not confuse these results with official TPC-C. 
The measurement protocols are entirely incomparable.&lt;/p&gt; &lt;style type=&quot;text/css&quot;&gt; TABLE { background: none; border: none } TH { text-align: center; font-weight: bold } TR.top { background: } TD { text-align: center; border: none } &lt;/style&gt; &lt;table align=&quot;center&quot; cellspacing=&quot;10&quot;&gt; &lt;tr&gt; &lt;th&gt;Test&lt;/th&gt; &lt;th&gt;Platform&lt;/th&gt; &lt;th&gt;Load&lt;br /&gt;(seconds)&lt;/th&gt; &lt;th&gt;Run&lt;br /&gt;(seconds)&lt;/th&gt; &lt;th&gt;GHz / cores / threads&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt;1&lt;/td&gt; &lt;td&gt;Amazon &lt;a href=&quot;http://aws.amazon.com/ec2/&quot; id=&quot;link-id0x15d68e20&quot;&gt;EC2&lt;/a&gt; Extra Large&lt;br /&gt;(4 virtual cores)&lt;/td&gt; &lt;td&gt;340&lt;/td&gt; &lt;td&gt;42&lt;/td&gt; &lt;td&gt;1.2 GHz? / 4 / 1&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt;1&lt;/td&gt; &lt;td&gt;Amazon EC2 Extra Large&lt;br /&gt;(4 virtual cores)&lt;/td&gt; &lt;td&gt;305&lt;/td&gt; &lt;td&gt;43.3&lt;/td&gt; &lt;td&gt;1.2 GHz? / 4 / 1&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt;2&lt;/td&gt; &lt;td&gt;1 x dual-core AMD 5900&lt;/td&gt; &lt;td&gt;263&lt;/td&gt; &lt;td&gt;58.2&lt;/td&gt; &lt;td&gt;2.9 GHz / 2 / 1&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt;3&lt;/td&gt; &lt;td&gt;2 x dual-core Xeon 5130 (&amp;quot;Woodcrest&amp;quot;)&lt;/td&gt; &lt;td&gt;245&lt;/td&gt; &lt;td&gt;35.7&lt;/td&gt; &lt;td&gt;2.0 GHz / 4 / 1&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt;4&lt;/td&gt; &lt;td&gt;2 x quad-core Xeon 5410 (&amp;quot;Harpertown&amp;quot;)&lt;/td&gt; &lt;td&gt;237&lt;/td&gt; &lt;td&gt;18.0&lt;/td&gt; &lt;td&gt;2.33 GHz / 8 / 1&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt;5&lt;/td&gt; &lt;td&gt;2 x quad-core Xeon 5520 (&amp;quot;Nehalem&amp;quot;)&lt;/td&gt; &lt;td&gt;162&lt;/td&gt; &lt;td&gt;18.3&lt;/td&gt; &lt;td&gt;2.26 GHz / 8 / 2&lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;p&gt;We tried two different EC2 instances to see if there would be variation. The variation was quite small. 
The tested EC2 instances cost 20 US cents per hour. The AMD dual-core costs 550 US dollars with 8 GB of memory. The 3 Xeon configurations are Supermicro boards with 667MHz memory for the Xeon 5130 (&amp;quot;Woodcrest&amp;quot;) and Xeon 5410 (&amp;quot;Harpertown&amp;quot;), and 800MHz memory for the Nehalem. The Xeon systems cost between 4000 and 7000 US dollars, with 5000 for a configuration with 2 x Xeon 5520 (&amp;quot;Nehalem&amp;quot;), 72 GB RAM, and 8 x 500 GB SATA disks.&lt;/p&gt; &lt;p&gt; &lt;i&gt;Caveat: Due to slow memory (we could not get faster within the available time), the results for the Nehalem do not take full advantage of its principal edge over the previous generation, i.e., the memory subsystem. We&amp;#39;ll see another time with faster memories.&lt;/i&gt; &lt;/p&gt; &lt;p&gt;The operating systems were various 64-bit Linux distributions.&lt;/p&gt; &lt;p&gt;We did some further measurements comparing Harpertown and Nehalem processors. The Nehalem chip was a bit faster for a slightly lower clock but we did not see any of the twofold and greater differences advertised by Intel.&lt;/p&gt; &lt;p&gt;We tried some &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1460b688&quot;&gt;RDF&lt;/a&gt; operations on the last two systems:&lt;/p&gt; &lt;table align=&quot;center&quot; cellspacing=&quot;10&quot;&gt; &lt;tr&gt; &lt;th&gt;operation&lt;/th&gt; &lt;th&gt; Harpertown&lt;/th&gt; &lt;th&gt;Nehalem&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th&gt;Build text index for &lt;a href=&quot;http://dbpedia.org/resource/DBpedia&quot; id=&quot;link-id0x16a94590&quot;&gt;DBpedia&lt;/a&gt;&lt;/th&gt; &lt;td&gt;1080s&lt;/td&gt; &lt;td&gt;770s&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th&gt;&lt;a href=&quot;http://dbpedia.org/resource/Entity&quot; id=&quot;link-id0xc37f380&quot;&gt;Entity&lt;/a&gt; Rank iteration&lt;/th&gt; &lt;td&gt;263s&lt;/td&gt; &lt;td&gt;251s&lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;p&gt;Then we tried to see if the core 
multithreading of Nehalem could be seen anywhere. To this effect, we ran the Fibonacci function in &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x15842a20&quot;&gt;SQL&lt;/a&gt; to serve as an example of an all in-cache integer operation. 16 concurrent operations took exactly twice as long as 8 concurrent ones, as expected.&lt;/p&gt; &lt;p&gt;For something that used memory, we took a count of RDF quads on two different indices, getting the same count. The database was a cluster setup with one process per core, so a count involved one thread per core. The counts in series took 5.02s and in parallel they took 4.27s.&lt;/p&gt; &lt;p&gt;Then we took a more memory intensive piece that read the RDF quads table in the order of one index and for each row checked that there is the equal row on another, differently-partitioned index. This is a cross-partition join. One of the indices is read sequentially and the other at random. The throughput can be reported as random-lookups-per-second. The data was English DBpedia, about 140M triples. One such query takes a couple of minutes with a 650% CPU utilization. 
Running multiple such queries should show effects of core multithreading since we expect frequent cache misses.&lt;/p&gt; &lt;ol&gt; &lt;li&gt;On the host OS of the Nehalem system: &lt;table align=&quot;center&quot; cellspacing=&quot;10&quot;&gt; &lt;tr&gt; &lt;th&gt;n&lt;/th&gt; &lt;th&gt;cpu%&lt;/th&gt; &lt;th&gt;rows per second&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th&gt;1 query&lt;/th&gt; &lt;td&gt;503&lt;/td&gt; &lt;td&gt;906,413&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th&gt;2 queries&lt;/th&gt; &lt;td&gt;1263&lt;/td&gt; &lt;td&gt;1,578,585&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th&gt;3 queries&lt;/th&gt; &lt;td&gt;1204&lt;/td&gt; &lt;td&gt;1,566,849&lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;/li&gt; &lt;li&gt;In a VM under Xen, on the Nehalem system: &lt;table align=&quot;center&quot; cellspacing=&quot;10&quot;&gt; &lt;tr&gt; &lt;th&gt;n&lt;/th&gt; &lt;th&gt;cpu%&lt;/th&gt; &lt;th&gt;rows per second&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th&gt;1 query&lt;/th&gt; &lt;td&gt;652&lt;/td&gt; &lt;td&gt;799,293&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th&gt;2 queries&lt;/th&gt; &lt;td&gt;1266&lt;/td&gt; &lt;td&gt;1,486,710&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th&gt;3 queries&lt;/th&gt; &lt;td&gt;1222&lt;/td&gt; &lt;td&gt;1,484,093&lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;/li&gt; &lt;li&gt; On the host OS of the Harpertown system: &lt;table align=&quot;center&quot; cellspacing=&quot;10&quot;&gt; &lt;tr&gt; &lt;th&gt;n&lt;/th&gt; &lt;th&gt;cpu%&lt;/th&gt; &lt;th&gt;rows per second&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th&gt;1 query&lt;/th&gt; &lt;td&gt; 648 &lt;/td&gt; &lt;td&gt; 1,041,448 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th&gt;2 queries&lt;/th&gt; &lt;td&gt; 708 &lt;/td&gt; &lt;td&gt; 1,124,866 &lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;/li&gt; &lt;/ol&gt; &lt;p&gt;The CPU percentages are as reported by the OS: user + system CPU divided by real time.&lt;/p&gt; &lt;p&gt;So, Nehalem is in general somewhat faster, around 20-30%, than Harpertown. 
The effect of core multithreading can be noticed but is not huge, another 20% or so for situations with more threads than cores. The join where Harpertown did better could be attributed to its larger cache: 12 MB vs 8 MB.&lt;/p&gt; &lt;p&gt;We see that Xen has a measurable but not prohibitive overhead; count a little under 10% for everything, including tasks with no I/O. The VM was set up to have all CPU for the test and the queries did not do disk I/O.&lt;/p&gt; &lt;p&gt;The executables were compiled with &lt;code&gt;gcc&lt;/code&gt; with default settings. Specifying &lt;code&gt;-march=nocona&lt;/code&gt; (Core 2 target) dropped the cross-partition join time mentioned above from 128s to 122s on Harpertown. We did not try this on Nehalem but presume the effect would be the same, since the out-of-order unit is not much different. We did not do anything about process-to-memory affinity on Nehalem, which is a non-uniform architecture. We would expect this to increase performance since we have many equal size processes with even load.&lt;/p&gt; &lt;p&gt;The mainstay of the Nehalem value proposition is a better memory subsystem. Since the unit we got was at 800 MHz memory, we did not see any great improvement. So if you buy Nehalem, you should make sure it is with 1333 MHz memory; otherwise the best case will be no more than 50% better than a 667 MHz Core 2-based Xeon.&lt;/p&gt; &lt;p&gt;Nehalem remains a better deal for us because of more memory per board. One Nehalem box with 72 GB costs less than two Harpertown boxes with 32 GB and offers almost the same performance. Having a lot of memory in a small space is key. With faster memory, it might even outperform two Harpertown boxes, but this remains to be seen.&lt;/p&gt; &lt;p&gt;If space were not a constraint, we could make a cluster of 12 small workstations for the price of our largest system and get still more memory and more processor power per unit of memory. 
The Nehalem box was almost 4x faster than the AMD box but then it has 9x the memory, so the CPU to memory ratio might be better with the smaller boxes.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Short Recap of Virtuoso Basics (#3 of 5)</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2009-04-30#1552</atom:id>
  <atom:published>2009-04-30T15:49:53Z</atom:published>
  <atom:updated>2009-04-30T12:11:45-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;(Third of five posts related to the &lt;a href=&quot;http://www2009.org/&quot; id=&quot;link-id0x14b582b8&quot;&gt;WWW 2009&lt;/a&gt; conference, held the week of April 20, 2009.) &lt;/p&gt; &lt;p&gt;There are some points that came up in conversation at WWW 2009 that I will reiterate here. We find there is still some lack of clarity in the product image, so I will here condense it.&lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x14bf48b8&quot;&gt;Virtuoso&lt;/a&gt; is a DBMS. We pitch it primarily to the &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x16bc4490&quot;&gt;data&lt;/a&gt; web space because this is where we see the emerging frontier. Virtuoso does both &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x1223dc30&quot;&gt;SQL&lt;/a&gt; and &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x170eec88&quot;&gt;SPARQL&lt;/a&gt; and can do both at large scale and high performance. The popular perception of &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x15a05fc0&quot;&gt;RDF&lt;/a&gt; and Relational models as mutually exclusive and antagonistic poles is based on the poor scalability of early RDF implementations. What we do is to have all the RDF specifics, like IRIs and typed literals as native SQL types, and to have a cost based optimizer that knows about this all.&lt;/p&gt; &lt;p&gt;If you want application-specific data structures as opposed to a schema-agnostic quad-store model (triple + graph-name), then Virtuoso can give you this too. 
&lt;a href=&quot;http://docs.openlinksw.com/virtuoso/rdfsparqlintegrationmiddleware.html#rdfviews&quot; id=&quot;link-id14ddc7c8&quot;&gt;Rendering application-specific data structures as RDF&lt;/a&gt; applies equally to relational data in non-Virtuoso databases because Virtuoso SQL can &lt;a href=&quot;http://docs.openlinksw.com/virtuoso/qsvdbsrv.html&quot; id=&quot;link-id14aaea70&quot;&gt;federate tables from heterogeneous DBMS&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;On top of this, there is a &lt;a href=&quot;http://docs.openlinksw.com/virtuoso/qswebserver.html&quot; id=&quot;link-id16fcde60&quot;&gt;web server built in&lt;/a&gt;, so that no extra server is needed for web services, web pages, and the like.&lt;/p&gt; &lt;p&gt;Installation is simple: just one executable and one config file. There is a huge amount of code in &lt;a href=&quot;http://docs.openlinksw.com/virtuoso/installation.html&quot; id=&quot;link-id16767b40&quot;&gt;installers&lt;/a&gt; (application code, test suites, and such) but none of this is needed when you deploy. 
Scale goes from a 25MB memory footprint on the desktop to hundreds of gigabytes of RAM and endless terabytes of disk on shared-nothing clusters.&lt;/p&gt; &lt;p&gt;Clusters (coming in Release 6) and SQL federation are &lt;a href=&quot;http://download.openlinksw.com/download/product_matrix.vsp?p=l_os&amp;amp;c=39&amp;amp;df=16&quot; id=&quot;link-id16722550&quot;&gt;commercial only&lt;/a&gt;; the rest can be had &lt;a href=&quot;http://sourceforge.net/project/showfiles.php?group_id=161622&quot; id=&quot;link-id131080a8&quot;&gt;under GPL&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;To condense further:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;Scalable Delivery of &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x1060ad98&quot;&gt;Linked Data&lt;/a&gt; &lt;/li&gt; &lt;li&gt;SPARQL and SQL &lt;ul&gt; &lt;li&gt;Arbitrary RDF Data + Relational&lt;/li&gt; &lt;li&gt;Also From 3rd Party &lt;a href=&quot;http://dbpedia.org/resource/Relational_database_management_system&quot; id=&quot;link-id0x16bbce60&quot;&gt;RDBMS&lt;/a&gt; &lt;/li&gt; &lt;/ul&gt; &lt;/li&gt; &lt;li&gt;Easy Deployment &lt;/li&gt; &lt;li&gt;Standard Interfaces &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://dbpedia.org/resource/Open_Database_Connectivity&quot; id=&quot;link-id0x12e284d8&quot;&gt;ODBC&lt;/a&gt;, &lt;a href=&quot;http://dbpedia.org/resource/Java_Database_Connectivity&quot; id=&quot;link-id0xb5e1400&quot;&gt;JDBC&lt;/a&gt;, OLE DB, &lt;a href=&quot;http://dbpedia.org/resource/ADO.NET&quot; id=&quot;link-id0x15a55db8&quot;&gt;ADO&lt;/a&gt;.&lt;a href=&quot;http://dbpedia.org/resource/.NET_Framework&quot; id=&quot;link-id0x16beb070&quot;&gt;NET&lt;/a&gt;, XMLA&lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://jena.sourceforge.net/&quot; id=&quot;link-id0x122b5008&quot;&gt;Jena&lt;/a&gt;, &lt;a href=&quot;http://sourceforge.net/projects/sesame/&quot; id=&quot;link-id0x148d4078&quot;&gt;Sesame&lt;/a&gt;, etc.&lt;/li&gt; &lt;li&gt;All Web Protocols &lt;/li&gt; &lt;/ul&gt; &lt;/li&gt; 
&lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Linked Data at WWW 2009 (#1 of 5)</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2009-04-27#1545</atom:id>
  <atom:published>2009-04-27T21:28:11Z</atom:published>
  <atom:updated>2009-04-28T11:27:57-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;(First of five posts related to the &lt;a href=&quot;http://www2009.org/&quot; id=&quot;link-id0x12d8ed90&quot;&gt;WWW 2009&lt;/a&gt; conference, held the week of April 20, 2009.)&lt;/p&gt; &lt;p&gt;We gave a talk at the &lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id0x152bf430&quot;&gt;Linked Open Data&lt;/a&gt; workshop, &lt;a href=&quot;http://events.linkeddata.org/ldow2009/&quot; id=&quot;link-id0x191721c8&quot;&gt;LDOW 2009&lt;/a&gt;, at WWW 2009. I did not go very far into the technical points in the talk, as there was almost no time and the points are rather complex. Instead, I emphasized what new things had become possible with recent developments.&lt;/p&gt; &lt;p&gt;The problem we do not cease hearing about is scale. We have solved most of it. There is scale in the schema: Put together, ontologies go over a million classes/properties. Which ones are relevant depends, and the user should have the choice. The instance &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x17c8f998&quot;&gt;data&lt;/a&gt; is in the tens of billions of triples, much derived from Web 2.0 sources but also much published as &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0xd562090&quot;&gt;RDF&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;To make sense of this all, we need quick summaries and search. Without navigation via joins, the value will be limited. Fast joining, counting, grouping, and ranking are key.&lt;/p&gt; &lt;p&gt;People will use different terms for the same thing. The issue of identity is philosophical. In order to do reasoning one needs strong identity; a statement like &lt;i&gt;x is a bit like y&lt;/i&gt; is not very useful in a database context. Whether any x and y can be considered the same depends on the context. So leave this for query time. 
The conditions under which two people are considered the same will depend on whether you are doing marketing analysis or law enforcement. A general purpose data store cannot anticipate all the possibilities, so smush on demand, as you go, as has been said many times.&lt;/p&gt; &lt;p&gt;Against this backdrop, we offer a solution with which anybody who so chooses can play with big data, whether a search or analytics player.&lt;/p&gt; &lt;p&gt;We are going in the direction of more and more ad hoc processing at larger and larger scale. With good query parallelization, we can do big joins without complex programming. No explicit Map Reduce jobs or the like. What was done with special code with special parallel programming models, can now be done in &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x60bd0c48&quot;&gt;SQL&lt;/a&gt; and &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x13db1ff0&quot;&gt;SPARQL&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;To showcase this, we do &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x10a5dde8&quot;&gt;linked data&lt;/a&gt; search, browsing, and so on, but are essentially a platform provider.&lt;/p&gt; &lt;p&gt;Entry costs into relatively high end databases have dropped significantly. A cluster with 1 TB of RAM sells for $75K or so at today&amp;#39;s retail prices and fits under a desk. For intermittent use, the rent for 1TB RAM is $1228 per day on &lt;a href=&quot;http://aws.amazon.com/ec2/&quot; id=&quot;link-id0xa59039d8&quot;&gt;EC2&lt;/a&gt;. With this on one side and &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x19f86c10&quot;&gt;Virtuoso&lt;/a&gt; on the other, a lot that was impractical in the past is now within reach. 
As &lt;a href=&quot;http://g1o.net/foaf.rdf#me&quot; id=&quot;link-id0xa1853af8&quot;&gt;Giovanni Tummarello&lt;/a&gt; put it for airplanes, the physics are as they were for &lt;a href=&quot;http://dbpedia.org/resource/Leonardo_da_Vinci&quot; id=&quot;link-id0x12df02e0&quot;&gt;da Vinci&lt;/a&gt; but materials and engines had to develop a bit before there was commercial potential. So it is also with analytics for everyone.&lt;/p&gt; &lt;p&gt;A remark from the audience was that all the stuff being shown, not limited to Virtuoso, was non-standard, having to do with text search, with ranking, with extensions, and was in fact not SPARQL and pure linked data principles. Further, by throwing this all together, one got something overcomplicated, too heavy.&lt;/p&gt; &lt;p&gt;I answered as follows, which apparently cannot be repeated too often:&lt;/p&gt; &lt;p&gt;First, everybody expects a text search box, and is conditioned to having one. No text search and no ranking is a non-starter. &lt;i&gt;Ceterum censeo&lt;/i&gt;, for databases, the next generation cannot be less expressive than the previous. All of SQL and then some is where SPARQL must be. The barest minimum is being able to say anything one can say in SQL, and then justify SPARQL by saying that it is better for heterogeneous data, schema last, and so on. On top of this, transitivity and rules will not hurt. For now, the current SPARQL working group will at least reach basic SQL parity; the edge will still remain implementation dependent.&lt;/p&gt; &lt;p&gt;Another remark was that joining is slow. Depends. Anything involving more complex disk access than linear reading of a blob is generally not good for interactive use. But with adequate memory, and with all hot spots in memory, we do some 3.2 million random accesses per second on 12 cores, with easily 80% platform utilization for a single large query. 
The high utilization means that times drop as processing gets divided over more partitions.&lt;/p&gt; &lt;p&gt;There was a talk about &lt;a href=&quot;http://semanticweb.org/wiki/MashQL&quot; id=&quot;link-id0x1642a780&quot;&gt;MashQL&lt;/a&gt; by &lt;a href=&quot;http://data.semanticweb.org/person/mustafa-jarrar&quot; id=&quot;link-id0x116e5af8&quot;&gt;Mustafa Jarrar&lt;/a&gt;, concerning an abstraction on top of SPARQL for easy composition of tree-structured queries. The idea was that such queries can be evaluated &amp;quot;on the fly&amp;quot; as they are being composed. As it happens, we already have an &lt;a href=&quot;http://dbpedia.org/resource/XML&quot; id=&quot;link-id0x11442520&quot;&gt;XML&lt;/a&gt;-based query abstraction layer incorporated into Virtuoso 6.0&amp;#39;s built-in &lt;a href=&quot;http://lod.openlinksw.com/fct/facet.vsp&quot; id=&quot;link-id0x6a9ebfe0&quot;&gt;Faceted Data Browser Service&lt;/a&gt;, and the effects are probably quite similar. The most important point here is that by using XML, both of these approaches are interoperable against a Virtuoso back-end. Along similar lines, we did not get to talk to the G Facets people but our message to them is the same: &lt;i&gt;Use the &lt;a href=&quot;http://lod.openlinksw.com/fct/facet.vsp&quot; id=&quot;link-id0x1676e158&quot;&gt;faceted browser service&lt;/a&gt; to get vastly higher performance when querying against Linked Data, be it &lt;a href=&quot;http://dbpedia.org/resource/DBpedia&quot; id=&quot;link-id0x12653418&quot;&gt;DBpedia&lt;/a&gt; or the &lt;a href=&quot;http://dbpedia.org/resource/Entity&quot; id=&quot;link-id0x10a61e78&quot;&gt;entity&lt;/a&gt; &lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id0x164150d8&quot;&gt;LOD&lt;/a&gt; &lt;a href=&quot;http://lod.openlinksw.com/&quot; id=&quot;link-id0xc5ec918&quot;&gt;Cloud&lt;/a&gt;. 
Virtuoso 6.0 (Open Source Edition) &amp;quot;&lt;a href=&quot;http://sourceforge.net/project/showfiles.php?group_id=161622&amp;amp;package_id=319652&amp;amp;release_id=677866&quot; id=&quot;link-id12159728&quot;&gt;TP1&lt;/a&gt;&amp;quot; is now publicly available as a Technology Preview (beta).&lt;/i&gt; &lt;/p&gt; &lt;p&gt;We heard that there is an effort for porting Freebase&amp;#39;s Parallax to SPARQL. The same thing applies to this. With a number of different data viewers on top of SPARQL, we come closer to broad-audience linked-data applications. These viewers are still too generic for the end user, though. We fully believe that for both search and transactions, application-domain-specific workflows will stay relevant. But these can be made to a fair degree by specializing generic linked-data-bound controls and gluing them together with some scripting.&lt;/p&gt; &lt;p&gt;As said before, the application will interface the user to the vocabulary. The vocabulary development takes the modeling burden from the application and makes for interchangeable experience on the same data. The data in turn is &amp;quot;virtualized&amp;quot; into the database cloud or the local secure server, as the use case may require. &lt;/p&gt; &lt;p&gt;For ease of adoption, open competition, and safety from lock-in, the community needs a SPARQL whose usability is not totally dependent on vendor extensions. But we might &lt;i&gt;de facto&lt;/i&gt; have that in just a bit, whenever there is a working draft from the SPARQL WG.&lt;/p&gt; &lt;p&gt;Another topic that we encounter often is the question of integration (or lack thereof) between communities. For example, database conferences reject &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id0x12563ea0&quot;&gt;semantic web&lt;/a&gt; papers and vice versa. Such politics would seem to emerge naturally but are nonetheless detrimental. 
We really should partner with people who write papers as their principal occupation. We ourselves do software products and use very little time for papers, so some of the bad reviews we have received do make a legitimate point. By rights, we should go for database venues but we cannot have this take too much time. So we are open to partnering for splitting the opportunity cost of multiple submissions.&lt;/p&gt; &lt;p&gt;For future work, there is nothing radically new. We continue testing and productization of cluster databases. Just deliver what is in the pipeline. The essential nature of this is adding more and more cases of better and better parallelization in different query situations. The present usage patterns work well for finding bugs and performance bottlenecks. For presentation, our goal is to have third party viewers operate with our platform. We cannot completely leave data browsing and UI to third parties since we must from time to time introduce various unique functionality. Most interaction should however go via third party applications.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Web Scale and Fault Tolerance</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2009-04-01#1541</atom:id>
  <atom:published>2009-04-01T15:18:06Z</atom:published>
  <atom:updated>2009-04-01T11:18:54.000012-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;One concern about &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x719d2f8&quot;&gt;Virtuoso&lt;/a&gt; Cluster is fault tolerance. This post talks about the basics of fault tolerance and what we can do with this, from improving resilience and optimizing performance to accommodating bulk loads without impacting interactive response. We will see that this is yet another step towards a 24/7 web-scale &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0xa9a1d8d8&quot;&gt;Linked Data&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Giant_Global_Graph&quot; id=&quot;link-id0x25201030&quot;&gt;Web&lt;/a&gt;. We will see how large scale, continuous operation, and redundancy are related.&lt;/p&gt; &lt;p&gt;It has been said many times: when things are large enough, failures become frequent. In view of this, basic storage of partitions in multiple copies is built into the Virtuoso cluster from the start. Until now, this feature has not been tested or used very extensively, aside from the trivial case of keeping all schema &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x4548898&quot;&gt;information&lt;/a&gt; in synchronous replicas on all servers.&lt;/p&gt; &lt;h2&gt;Approaches to Fault Tolerance&lt;/h2&gt; &lt;p&gt;Fault tolerance has many aspects but it starts with keeping &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x18757400&quot;&gt;data&lt;/a&gt; in at least two copies. There are shared-disk cluster databases like &lt;a href=&quot;http://dbpedia.org/resource/Oracle_Database&quot; id=&quot;link-id0x711c900&quot;&gt;Oracle&lt;/a&gt; RAC that do not depend on partitioning. With these, as long as the disk image is intact, servers can come and go. The fault tolerance of the disk in turn comes from mirroring done by the disk controller. 
RAID levels other than mirroring are not really good for databases because of write speed.&lt;/p&gt; &lt;p&gt;With shared-nothing setups like Virtuoso, fault tolerance is based on multiple servers keeping the same logical data. The copies are synchronized transaction-by-transaction but are not bit-for-bit identical nor write-by-write synchronous as is the case with mirrored disks.&lt;/p&gt; &lt;p&gt;There are asynchronous replication schemes generally based on log shipping, where the replica replays the transaction log of the master copy. The master copy gets the updates, the replica replays them. Both can take queries. These do not guarantee an entirely ACID fail-over but for many applications they come close enough.&lt;/p&gt; &lt;p&gt;In a tightly coupled cluster, it is possible to do synchronous, transactional updates on multiple copies without great added cost. Sending the message to two places instead of one does not make much difference since it is the latency that counts. But once we go to wide area networks, this becomes as good as unworkable for any sort of update volume. Thus, wide area replication must in practice be asynchronous.&lt;/p&gt; &lt;p&gt;This is a subject for another discussion. For now, the short answer is that wide area log shipping must be adapted to the application&amp;#39;s requirements for synchronicity and consistency. Also, exactly what content is shipped and to where depends on the application. 
Some application-specific logic will likely be involved; more than this one cannot say without a specific context.&lt;/p&gt; &lt;h2&gt;Basics of Partition Fail-Over&lt;/h2&gt; &lt;p&gt;For now, we will be concerned with redundancy protecting against broken hardware, software slowdown, or crashes inside a single site.&lt;/p&gt; &lt;p&gt;The basic idea is simple: Writes go to all copies; reads that must be repeatable or serializable (i.e., locking) go to the first copy; reads that refer to committed state without guarantee of repeatability can be balanced among all copies. When a copy goes offline, nobody needs to know, as long as there is at least one copy online for each partition. The exception in practice is when there are open cursors or such stateful things as aggregations pending on a copy that goes down. Then the query or transaction will abort and the application can retry. This looks like a deadlock to the application.&lt;/p&gt; &lt;p&gt;Coming back online is more complicated. This requires establishing that the recovering copy is actually in sync. In practice this requires a short window during which no transactions have uncommitted updates. Sometimes, forcing this can require aborting some transactions, which again looks like a deadlock to the application.&lt;/p&gt; &lt;p&gt;When an error is seen, such as a process no longer accepting connections and dropping existing cluster connections, we in practice go via two stages. First, the operations that directly depended on this process are aborted, as well as any computation being done on behalf of the disconnected server. At this stage, attempting to read data from the partition of the failed server will go to another copy but writes will still try to update all copies and will fail if the failed copy continues to be offline. 
After it is established that the failed copy will stay off for some time, writes may be re-enabled, but now having the failed copy rejoin the cluster will be more complicated, requiring an atomic window to ensure sync, as mentioned earlier.&lt;/p&gt; &lt;p&gt;For the DBA, there can be intermittent software crashes where a failed server automatically restarts itself, and there can be prolonged failures where this does not happen. Both are alerts but the first kind can wait. Since a system must essentially run itself, it will wait for some time for the failed server to restart itself. During this window, all reads of the failed partition go to the spare copy and writes give an error. If the failed server does not come back up in time, the system will automatically re-enable writes on the spare but now the failed server may no longer rejoin the cluster without a complex sync cycle. This all can happen in well under a minute, faster than a human operator can react. The diagnostics can be done later.&lt;/p&gt; &lt;p&gt;If the situation was a hardware failure, recovery consists of taking a spare server and copying the database from the surviving online copy. This done, the spare server can come online. Copying the database can be done while online and accepting updates but this may take some time, maybe an hour for every 200 GB of data copied over a network. In principle this could be automated by scripting, but we would normally expect a human DBA to be involved.&lt;/p&gt; &lt;p&gt;As a general rule, reacting to the failure goes automatically without disruption of service but bringing the failed copy online will usually require some operator action.&lt;/p&gt; &lt;h2&gt;Levels of Tolerance and Performance&lt;/h2&gt; &lt;p&gt;The only way to make failures totally invisible is to have everything in duplicate and provisioned so that the system never runs at more than half the total capacity. This is often not economical or necessary. 
This is why we can do better, using the spare capacity for more than standby.&lt;/p&gt; &lt;p&gt;Imagine keeping a repository of linked data. Most of the content will come in through periodic bulk replacement of data sets. Some data will come in through pings from applications publishing FOAF and similar. Some data will come through on-demand RDFization of resources.&lt;/p&gt; &lt;p&gt;The performance of such a repository essentially depends on having enough memory. Having this memory in duplicate is just added cost. What we can do instead is have all copies store the whole partition but when routing queries, apply range partitioning on top of the basic hash partitioning. If one partition stores IDs 64K - 128K, the next partition 128K - 192K, and so forth, and all partitions are stored in two full copies, we can route reads to the first 32K IDs to the first copy and reads to the second 32K IDs to the second copy. In this way, the copies will keep different working sets. The RAM is used to full advantage.&lt;/p&gt; &lt;p&gt;Of course, if there is a failure, then the working set will degrade, but if this is not often and not for long, this can be quite tolerable. The alternate expense is buying twice as much RAM, likely meaning twice as many servers. This workload is memory intensive, thus servers should have the maximum memory they can have without going to parts that are so expensive one gets a new server for the price of doubling memory.&lt;/p&gt; &lt;h2&gt;Background Bulk Processing&lt;/h2&gt; &lt;p&gt;When loading data, the system is online in principle, but query response can be quite bad. A large &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x19fd9c18&quot;&gt;RDF&lt;/a&gt; load will involve most memory and queries will miss the cache. The load will further keep most disks busy, so response is not good. 
This is the case as soon as a server&amp;#39;s partition of the database is four times the size of RAM or greater. Whether the work is bulk-load or bulk-delete makes little difference.&lt;/p&gt; &lt;p&gt;But if partitions are replicated, we can temporarily split the database so that the first copies serve queries and the second copies do the load. If the copies serving online activities do some updates also, these updates will be committed on both copies. But the load will be committed on the second copy only. This is fully appropriate as long as the data are different. When the bulk load is done, the second copy of each partition will have the full up-to-date state, including changes that came in during the bulk load. The online activity can now be redirected to the second copies and the first copies can be overwritten in the background by the second copies, so as to again have all data in duplicate.&lt;/p&gt; &lt;p&gt;Failures during such operations are not dangerous. If the copies doing the bulk load fail, the bulk load will have to be restarted. If the front end copies fail, the front end load goes to the copies doing the bulk load. Response times will be bad until the bulk load is stopped, but no data is lost.&lt;/p&gt; &lt;p&gt;This technique applies to all data-intensive background tasks: calculation of &lt;a href=&quot;http://dbpedia.org/resource/Entity&quot; id=&quot;link-id0x20b7a568&quot;&gt;entity&lt;/a&gt; search ranks, data cleansing, consistency checking, and so on. If two copies are needed to keep up with the online load, then data can be kept just as well in three copies instead of two. This method applies to any data-warehouse-style workload which must coexist with online access and occasional low volume updating.&lt;/p&gt; &lt;h2&gt;Configurations of Redundancy&lt;/h2&gt; &lt;p&gt;Right now, we can declare that two or more server processes in a cluster form a group. All data managed by one member of the group is stored by all others. 
The members of the group are interchangeable. Thus, if there is four-servers-worth of data, then there will be a minimum of eight servers. Each of these servers will have one server process per core. The first hardware failure will not affect operations. For the second failure, there is a 1/7 chance that it stops the whole system, if it falls on the server whose pair is down. If groups consist of three servers, for a total of 12, the two first failures are guaranteed not to interrupt operations; for the third, there is at most a 1/10 chance that it will, namely when the first two failures fell in the same group.&lt;/p&gt; &lt;p&gt;We note that for big databases, as said before, the RAM cache capacity is the sum of all the servers&amp;#39; RAM when in normal operation.&lt;/p&gt; &lt;p&gt;There are other, more dynamic ways of splitting data among servers, so that partitions migrate between servers and spawn extra copies of themselves if not enough copies are online. The Google File System (GFS) does something of this sort at the file system level; Amazon&amp;#39;s Dynamo does something similar at the database level. The analogies are not exact, though.&lt;/p&gt; &lt;p&gt;If data is partitioned in this manner, for example into 1K slices, each in duplicate, with the rule that the two duplicates will not be on the same physical server, the first failure will not break operations but the second probably will. Without extra logic, the partitions formerly hosted by the failed server are likely to have their second copies randomly spread over the remaining servers, so almost any second failure takes the last copy of some partition offline. This scheme equalizes load better but is less resilient.&lt;/p&gt; &lt;h2&gt;Maintenance and Continuity&lt;/h2&gt; &lt;p&gt;Databases may benefit from defragmentation, rebalancing of indices, and so on. While these are possible online, by definition they affect the working set and make response times quite bad as soon as the database is significantly larger than RAM. With duplicate copies, the problem is largely solved. 
Also, software version changes need not involve downtime.&lt;/p&gt; &lt;h2&gt;Present Status&lt;/h2&gt; &lt;p&gt;The basics of replicated partitions are operational. The items to finalize are about system administration procedures and automatic synchronization of recovering copies. This must be automatic because if it is not, the operator will find a way to forget something or do some steps in the wrong order. This also requires a management view that shows what the different processes are doing and whether something is hung or failing repeatedly. All this is for the recovery part; taking failed partitions offline is easy.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>An Update on Virtuoso Development</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2009-03-05#1529</atom:id>
  <atom:published>2009-03-05T10:23:49Z</atom:published>
  <atom:updated>2009-03-05T09:58:16-05:00</atom:updated>
  <atom:content type="html">&lt;p&gt;It is time for an update on &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x151e89e8&quot;&gt;Virtuoso&lt;/a&gt; developments.&lt;/p&gt; &lt;p&gt;We continue enhancing our hosting of the &lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id0x1464e168&quot;&gt;Linked Open Data&lt;/a&gt; (&lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id0x151e7f38&quot;&gt;LOD&lt;/a&gt;) cloud at &lt;a href=&quot;http://lod.openlinksw.com&quot; id=&quot;link-id11ac2448&quot;&gt;http://lod.openlinksw.com&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;We have now added result ranking for both text and URIs. Text hit scores are based on word frequency and proximity; &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Identifier&quot; id=&quot;link-id0x14f9fae0&quot;&gt;URI&lt;/a&gt; scores are based on link density.&lt;/p&gt; &lt;p&gt;We calculate each URI&amp;#39;s rank by adding up references and weighting these by the score of the referrer. This is like in web search. Each iteration of the ranking joins every referenced URI to each of its referrers. We do about 1.2 million such joins per second, across partitions, over 2.2 billion triples and 400M distinct subjects without any great optimization, just using &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0xaa36a458&quot;&gt;SQL&lt;/a&gt; stored procedures and partitioned function calls. This is a sort of SQL map-reduce. This would run more than twice as fast if it were all in &lt;a href=&quot;http://dbpedia.org/resource/C%2B%2B&quot; id=&quot;link-id0x1571e270&quot;&gt;C&lt;/a&gt; but this is adequate for now. The more interesting bit will be tuning the scoring based on what type of link we have. 
This is what the web search engines cannot do as well, since document links are untyped.&lt;/p&gt; &lt;p&gt;We are moving toward a decent user interface for the LOD hosting, including offering ready-made domain-specific queries, e.g., biomedical.&lt;/p&gt; &lt;p&gt;Things like &amp;quot;URI finding with autocomplete&amp;quot; are done and just have to be put online.&lt;/p&gt; &lt;p&gt;With &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x14325e08&quot;&gt;linked data&lt;/a&gt;, there is the whole question of identifier choice. We will have a special page just for this. There we show reference statistics, synonyms declared by &lt;code&gt;&lt;a href=&quot;http://dbpedia.org/resource/Web_Ontology_Language&quot; id=&quot;link-id0x638b3900&quot;&gt;owl&lt;/a&gt;:sameAs&lt;/code&gt;, synonyms determined by shared property values, etc. In this way we become a terminology lookup service.&lt;/p&gt; &lt;p&gt;Copies of the LOD cluster system are available for evaluators, on a case by case basis. We will make this publicly available on EC2 also in not too long.&lt;/p&gt; &lt;p&gt;Otherwise, we continue working on productization, primarily things like reliability and recovery. One exercise is running &lt;a href=&quot;http://dbpedia.org/resource/TPC-C&quot; id=&quot;link-id0x144d00f0&quot;&gt;TPC-C&lt;/a&gt; with intentionally stupid partitioning, so that almost all joins and deadlocks are distributed. Then we simulate a cluster interconnect that drops messages now and then, sometimes kill server processes, and still keep full ACID properties. Cloud capable, also in bad weather.&lt;/p&gt; &lt;p&gt;The open source release of Virtuoso 6 (no cluster) is basically ready to go, mostly this is a question of logistics.&lt;/p&gt; &lt;p&gt;I will talk about these things in greater individual detail next week.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Facets and Large Ontologies of the LOD Cloud</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2009-02-16#1527</atom:id>
  <atom:published>2009-02-16T11:21:05Z</atom:published>
  <atom:updated>2009-02-17T16:24:34.000004-05:00</atom:updated>
  <atom:content type="html">&lt;p&gt;We have just submitted &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/lodw.pdf&quot; id=&quot;link-id13d9bc68&quot;&gt;this paper&lt;/a&gt; to the WWW09 &lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id0x51c2fd00&quot;&gt;Linked Open Data&lt;/a&gt; Workshop.&lt;/p&gt; &lt;p&gt;The thing is intermittently live, with &lt;a href=&quot;http://dbpedia.org/resource/DBpedia&quot; id=&quot;link-id0xa16f87a0&quot;&gt;DBpedia&lt;/a&gt; on one instance and a &lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id0x1a8212d8&quot;&gt;LOD&lt;/a&gt; Cloud &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0xc1097a8&quot;&gt;data&lt;/a&gt; collection of about 2 billion triples on another. We will give out the links once we have tested a bit more.&lt;/p&gt; &lt;p&gt;The present activity is all about testing &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x17d64f90&quot;&gt;Virtuoso&lt;/a&gt; 6 for release, cluster and otherwise.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Faceted Search: Unlimited Data in Interactive Time</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2009-01-09#1516</atom:id>
  <atom:published>2009-01-09T22:03:11Z</atom:published>
  <atom:updated>2009-01-09T17:15:39-05:00</atom:updated>
  <atom:content type="html">&lt;p&gt;Why not see the whole world of &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0xc3f6b38&quot;&gt;data&lt;/a&gt; as facets? Well, we&amp;#39;d like to, but there is the feeling that this is not practical.&lt;/p&gt; &lt;p&gt;The old problem has been that it is not really practical to pre-compute counts of everything for all possible combinations of search conditions and counting/grouping/sorting. The actual matches take time.&lt;/p&gt; &lt;p&gt;Well, neither is in fact necessary. When there are large numbers of items matching the conditions, counting them can take time but then this is the beginning of the search, and the user is not even likely to look very closely at the counts. It is enough to see that there are many of one and few of another. If the user already knows the precise predicate or class to look for, then the top-level faceted view is not even needed. The faceted view for guiding search and precise analytics are two different problems.&lt;/p&gt; &lt;p&gt;There are client-side faceted views like Exhibit or our own &lt;a href=&quot;http://ode.openlinksw.com/&quot; id=&quot;link-id0x1bc1cfe0&quot;&gt;ODE&lt;/a&gt;. The problem with these is that there are a few orders of magnitude difference between the actual database size and what fits on the user agent. This is compounded by the fact that one does not know what to cache on the user agent because of the open nature of the data web. If this were about a fixed workflow, then a good guess would be possible, but we are talking about the data web, the very soul of serendipity and unexpected discovery.&lt;/p&gt; &lt;p&gt;So we made a web service that will do faceted search on arbitrary &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0xbb62170&quot;&gt;RDF&lt;/a&gt;. 
If it does not get complete results within a timeout, it will return what it has counted so far, using &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xb122b00&quot;&gt;Virtuoso&lt;/a&gt;&amp;#39;s &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1494&quot; id=&quot;link-id117b0df0&quot;&gt;&lt;b&gt;Anytime&lt;/b&gt;&lt;/a&gt; feature. Looking for subjects with some specific combination of properties is however a bit limited, so this will also do &lt;code&gt;JOINs&lt;/code&gt;. Many features are one or two &lt;code&gt;JOINs&lt;/code&gt; away; take geographical locations or social networks, for example.&lt;/p&gt; &lt;p&gt;Yet a faceted search should be point-and-click, and should not involve a full query construction. We put the compromise at starting with full text or property or class, then navigating down properties or classes, to arbitrary depth, tree-wise. At each step, one can see the matching instances or their classes or properties, all with counts, faceted-style.&lt;/p&gt; &lt;p&gt;This is good enough for queries like &amp;#39;what do Harry Potter fans also like&amp;#39; or &amp;#39;who are the authors of articles tagged &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id0xbee32d8&quot;&gt;semantic web&lt;/a&gt; and machine learning and published in 2008&amp;#39;. For complex grouping, sub-queries, arithmetic or such, one must write the actual query.&lt;/p&gt; &lt;p&gt;But one can begin with facets, and then continue refining the query by hand since the service also returns &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0xbcc9f38&quot;&gt;SPARQL&lt;/a&gt; text. We made a small web interface on top of the service with all logic server side. This proves that the web service is usable and that an interface with no AJAX, and no problems with browser interoperability or such, is possible and easy. 
Also, the problem of syncing between a user-agent-based store and a database is entirely gone.&lt;/p&gt; &lt;p&gt;If we are working with a known data structure, the user interface should choose the display by the data type and offer links to related reports. This is all easy to build as web pages or AJAX. We show how the generic interface is done in Virtuoso PL, and you can adapt that or rewrite it in &lt;a href=&quot;http://dbpedia.org/resource/PHP&quot; id=&quot;link-id0xcdbe268&quot;&gt;PHP&lt;/a&gt;, Java, JavaScript, or anything else, to accommodate use-case specific navigation needs such as data format.&lt;/p&gt; &lt;p&gt;The web service takes an &lt;a href=&quot;http://dbpedia.org/resource/XML&quot; id=&quot;link-id0xc019c08&quot;&gt;XML&lt;/a&gt; representation of the search, which is more restricted and easier to process by machine than the SPARQL syntax. The web service returns the results, the SPARQL query it generated, whether the results are complete or not, and some resource use statistics.&lt;/p&gt; &lt;p&gt;The source of the PL functions, Web Service and Virtuoso Server Page (HTML UI) will be available as part of Virtuoso 6.0 and higher. A Programmer&amp;#39;s Guide will be available as part of the standard Virtuoso Documentation collection, including the Virtuoso Open Source Edition Website.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Linked Data &amp; The Year 2009 (updated)</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2009-01-02#1511</atom:id>
  <atom:published>2009-01-02T16:17:06Z</atom:published>
  <atom:updated>2009-01-02T13:26:42.000003-05:00</atom:updated>
  <atom:content type="html">&lt;p&gt;As is fitting for the season, I will editorialize a bit about what has gone before and what is to come.&lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://www.w3.org/People/Berners-Lee/card#i&quot; id=&quot;link-id1119f250&quot;&gt;Sir Tim&lt;/a&gt; said it at WWW08 in &lt;a href=&quot;http://www2008.org/&quot; id=&quot;link-id0x1dcb93a0&quot;&gt;Beijing&lt;/a&gt;: &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x13a3efb8&quot;&gt;linked data&lt;/a&gt; and the linked data &lt;a href=&quot;http://dbpedia.org/resource/Giant_Global_Graph&quot; id=&quot;link-id0x13a44cd0&quot;&gt;web&lt;/a&gt; is the &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id0x10d25788&quot;&gt;semantic web&lt;/a&gt; and the Web done right.&lt;/p&gt; &lt;p&gt;The grail of &lt;i&gt;ad hoc&lt;/i&gt; analytics on infinite &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0xa201d518&quot;&gt;data&lt;/a&gt; has lost none of its appeal. We have seen fresh evidence of this in the realm of data warehousing products, as well as storage in general.&lt;/p&gt; &lt;p&gt;The benefits of a data model more abstract than the relational are being increasingly appreciated also outside the data web circles. Microsoft&amp;#39;s &lt;a href=&quot;http://dbpedia.org/resource/Entity&quot; id=&quot;link-id0x12fa4e40&quot;&gt;Entity&lt;/a&gt; Framework technology is an example. Agility has been a buzzword for a long time. Everything should be offered in a service based business model and should interoperate and integrate with everything else: business needs first; schema last.&lt;/p&gt; &lt;p&gt;Not to forget that when money is tight, reuse of existing assets and paying on a usage basis are naturally emphasized. &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x175b32e8&quot;&gt;Information&lt;/a&gt;, as the asset it is, is no less important; quite the contrary. 
But even with information, value should be realized economically, which, among other things, entails not reinventing the wheel.&lt;/p&gt; &lt;p&gt;It is against this backdrop that this year will play out.&lt;/p&gt; &lt;p&gt;As concerns research, I will &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1374&quot; id=&quot;link-id1151b128&quot;&gt;again quote&lt;/a&gt; &lt;a href=&quot;http://www.ibiblio.org/hhalpin/#&quot; id=&quot;link-id141cb740&quot;&gt;Harry Halpin&lt;/a&gt; at &lt;a href=&quot;http://www.eswc2008.org/&quot; id=&quot;link-id0x18a8a858&quot;&gt;ESWC 2008&lt;/a&gt;: &amp;quot;Men will fight in a war, and even lose a war, for what they believe just. And it may come to pass that later, even though the war were lost, the things then fought for will emerge under another name and establish themselves as the prevailing reality&amp;quot; [or words to this effect].&lt;/p&gt; &lt;p&gt;Something like the data web, and even the semantic web, will happen. Harry&amp;#39;s question was whether this would be the descendant of what is today called semantic web research.&lt;/p&gt; &lt;p&gt;I heard in conversation about a project for making a very large metadata store. I also heard that the makers did not particularly insist on this being &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x3c39ed80&quot;&gt;RDF&lt;/a&gt;-based, though.&lt;/p&gt; &lt;p&gt;Why should such a thing be RDF-based? If it is already accepted that there will be &lt;i&gt;ad hoc&lt;/i&gt; schema and that queries ought to be able to view the data from all angles, not be limited by having indices one way and not another way, then why not RDF?&lt;/p&gt; &lt;p&gt;The justification of RDF is in reusing and linking-to data and terminology out there. 
Another justification is that by using an RDF store, one is spared a lot of work and tons of compromises which attend making an &lt;a href=&quot;http://dbpedia.org/resource/Entity-attribute-value_model&quot; id=&quot;link-id0x14a77880&quot;&gt;entity&lt;/a&gt;-attribute-value (&lt;a href=&quot;http://dbpedia.org/resource/Entity-attribute-value_model&quot; id=&quot;link-id0x5f978e88&quot;&gt;EAV&lt;/a&gt;, i.e., triple) store on a generic &lt;a href=&quot;http://dbpedia.org/resource/Relational_database_management_system&quot; id=&quot;link-id0x391bdcd8&quot;&gt;RDBMS&lt;/a&gt;. The sem-web world has been there, trust me. We came out well because we put all inside the RDBMS, lowest level, which you can&amp;#39;t do unless you own the RDBMS. Source access is not enough; you also need the &lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id0x138a3a00&quot;&gt;knowledge&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Technicalities aside, the question is one of proprietary vs. standards-based. This is not only so with software components, where standards have consistently demonstrated benefits, but now also with the data. &lt;a href=&quot;http://www.zemanta.com/&quot; id=&quot;link-id0x5f92cb38&quot;&gt;Zemanta&lt;/a&gt; and &lt;a href=&quot;http://www.opencalais.com/&quot; id=&quot;link-id0x139c3200&quot;&gt;OpenCalais&lt;/a&gt; serving &lt;a href=&quot;http://dbpedia.org/resource/DBpedia&quot; id=&quot;link-id0x1731dc78&quot;&gt;DBpedia&lt;/a&gt; URIs are examples. Even in entirely closed applications, there is benefit in reusing open vocabularies and identifiers: One does not need to create a secret language for writing a secret memo.&lt;/p&gt; &lt;p&gt;Where data is a carrier of value, its value is enhanced by it being easy to repurpose (i.e., standard vocabularies) and to discover (i.e., data set metadata). As on the web, so on the enterprise &lt;a href=&quot;http://dbpedia.org/resource/Intranet&quot; id=&quot;link-id0x1324ada8&quot;&gt;intranet&lt;/a&gt;. 
In this lies the strength of RDF as opposed to proprietary flexible database schemes. This is a qualitative distinction.&lt;/p&gt; &lt;p align=&quot;center&quot;&gt; &lt;a href=&quot;http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData&quot; id=&quot;link-id117178a8&quot;&gt;&lt;img src=&quot;http://www.openlinksw.com/images/logos/LoDLogo.gif&quot; alt=&quot;Linking Open Data project logo&quot; /&gt; &lt;/a&gt; &lt;br /&gt; &lt;a href=&quot;http://dbpedia.org/resource/In_hoc_signo_vinces&quot; id=&quot;link-id115f47e8&quot;&gt;&lt;i&gt;In hoc signo vinces.&lt;/i&gt; &lt;/a&gt; &lt;/p&gt; &lt;p&gt;In this light, we welcome the &lt;a href=&quot;http://semanticweb.org/wiki/VoiD&quot; id=&quot;link-id0x67cf560&quot;&gt;voiD&lt;/a&gt; (&lt;a href=&quot;http://semanticweb.org/wiki/VoiD&quot; id=&quot;link-id0x1898c908&quot;&gt;VOcabulary of Interlinked Data&lt;/a&gt;), which is the first promise of making federatable data discoverable. Now that there is a point of focus for these efforts, the needed expressivity will no doubt accrete around the voiD core.&lt;/p&gt; &lt;p&gt;For data as a service, we clearly see the value of open terminologies as prerequisites for service interchangeability, i.e., creating a marketplace. &lt;a href=&quot;http://dbpedia.org/resource/XML&quot; id=&quot;link-id0x1588d6a8&quot;&gt;XML&lt;/a&gt; is for the transaction; RDF is for the discovery, query, and analytics. As with databases in general, first there was the transaction; then there was the query. Same here. For monetizing the query, there are models ranging from renting data sets and server capacity in the clouds to hosted services where one pays for processing past a certain quota. For the hosted case, we just removed a major barrier to offering unlimited query against unlimited data when we completed the &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1374&quot; id=&quot;link-id110b8668&quot;&gt;Virtuoso Anytime&lt;/a&gt; feature. 
With this, the user gets what is found within a set time, which is already something, and in case of needing more, one can pay for the usage. Of course, we do not forget advertising. When data has explicit semantics, contextuality is better than with keywords.&lt;/p&gt; &lt;p&gt;For these visions to materialize on top of the linked data platform, linked data must join the world of data. This means messaging that is geared towards the database public. They know the problem, but the RDF proposition is still not well enough understood for it to connect.&lt;/p&gt; &lt;p&gt;For the relational IT world, we offer passage to the data web and its promise of integration through RDF mapping. We are also bringing out new Microsoft Entity &lt;a href=&quot;http://dbpedia.org/resource/ADO.NET_Entity_Framework&quot; id=&quot;link-id0x13a50fd8&quot;&gt;Framework&lt;/a&gt; components. This goes in the direction of defining a unified database frontier with RDF and non-RDF entity models side by side.&lt;/p&gt; &lt;p&gt;For &lt;a href=&quot;http://www.openlinksw.com/dataspace/organization/openlink#this&quot; id=&quot;link-id0x1d2ea7f0&quot;&gt;OpenLink Software&lt;/a&gt;, 2008 was about developing technology for scale, RDF as well as generic relational. We did show a tiny preview with the &lt;a href=&quot;http://challenge.semanticweb.org/&quot; id=&quot;link-id0x658fbc8&quot;&gt;Billion Triples Challenge&lt;/a&gt; demo. Now we are set to come out with the real thing, featuring, among other things, faceted search at the billion triple scale. We &lt;a href=&quot;http://www.openlinksw.com/blog/kidehen@openlinksw.com/blog/?id=1489&quot; id=&quot;link-id150c6090&quot;&gt;started offering ready-to-go Virtuoso-hosted linked open data sets&lt;/a&gt; on Amazon EC2 in December. Now we continue doing this based on our next-generation server, as well as make Virtuoso 6 Cluster commercially available. 
Technical specifics are amply discussed on this &lt;a href=&quot;http://dbpedia.org/resource/Blog&quot; id=&quot;link-id0x1424ec20&quot;&gt;blog&lt;/a&gt;. There are still some new technology things to be developed this year; first among these are strong &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x14b8ca88&quot;&gt;SPARQL&lt;/a&gt; federation, and on-the-fly resizing of server clusters. On the research partnerships side, we have an EU grant for working with the OntoWiki project from the University of Leipzig, and we are partners in DERI&amp;#39;s &lt;a href=&quot;https://lion.deri.ie/&quot; id=&quot;link-id115c02f8&quot;&gt;Líon project&lt;/a&gt;. These will provide platforms for further demonstrating the &amp;quot;web&amp;quot; in data web, as in web-scale smart databasing.&lt;/p&gt; &lt;p&gt;2009 will see change through scale. The things that exist will start interconnecting and there will be emergent value. Deployments will be larger and scale will be readily available through a services model or by installation at one&amp;#39;s own facilities. We may see the start of Search becoming Find, like &lt;a href=&quot;http://myopenlink.net/dataspace/person/kidehen#this&quot; id=&quot;link-id14e43050&quot;&gt;Kingsley&lt;/a&gt; says, meaning semantics of data guiding search. Entity extraction will multiply data volumes and bring parts of the data web to real time.&lt;/p&gt; &lt;p&gt;Exciting 2009 to all.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso 6 FAQ directory</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-12-18#1507</atom:id>
  <atom:published>2008-12-18T15:46:18Z</atom:published>
  <atom:updated>2008-12-22T14:30:07.000004-05:00</atom:updated>
  <atom:content type="html">&lt;p&gt;We have received various inquiries on high-end metadata stores. I will here go through some salient questions. The requested features include:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;Scaling to trillions of triples&lt;/li&gt; &lt;li&gt;Running on clusters of commodity servers&lt;/li&gt; &lt;li&gt;Running in federated environments, possibly over wide area networks&lt;/li&gt; &lt;li&gt;Built-in inference&lt;/li&gt; &lt;li&gt;Transactions&lt;/li&gt; &lt;li&gt;Security&lt;/li&gt; &lt;li&gt;Support for extra triple level metadata, such as security attributes&lt;/li&gt; &lt;/ul&gt; &lt;p&gt;Q: What is the storage cost per triple? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#StorageCostPerTriple&quot; id=&quot;link-id147f61e8&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: What is the cost to insert a triple? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#TripleInsertionCost&quot; id=&quot;link-id112e2488&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: What is the cost to delete a triple? (For the deletion itself, as well as for updating any indices) &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#TripleDeletionCost&quot; id=&quot;link-id11728528&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: What is the cost to search on a given property? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#PropertySearchCost&quot; id=&quot;link-id1586e360&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: What &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id14688e38&quot;&gt;data&lt;/a&gt; types are supported? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#SupportedDataTypes&quot; id=&quot;link-id1593dbf0&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: What inferencing is supported? 
&lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#SupportedInferencing&quot; id=&quot;link-id112f3248&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: Is the inferencing dynamic or is an extra step required before inferencing can be used? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#InferencingDynamism&quot; id=&quot;link-id1477e2e0&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: Do you support &lt;a href=&quot;http://dbpedia.org/resource/Full_text_search&quot; id=&quot;link-id1177b198&quot;&gt;full text search&lt;/a&gt;? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#FullTextSearchSupport&quot; id=&quot;link-id1543b170&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: What programming interfaces are supported? Do you support standard &lt;a href=&quot;http://www.w3.org/TR/rdf-sparql-protocol/&quot; id=&quot;link-id14bb69c0&quot;&gt;SPARQL protocol&lt;/a&gt;? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#SupportedProgrammingInterfaces&quot; id=&quot;link-id14d4eb18&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: How can data be partitioned across multiple servers? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#MultipleServerDataPartitioning&quot; id=&quot;link-id13722e00&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: How many triples can a single server handle? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#SingleServerTripleLimits&quot; id=&quot;link-id14046e58&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: What is the performance impact of going from the billion to the trillion triples? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#PerformanceImpactBillionToTrillion&quot; id=&quot;link-id113cfc10&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: Do you support additional metadata for triples, such as timestamps, security tags etc? 
&lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#TripleMetadataSupport&quot; id=&quot;link-id14c75fa8&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: Should we use &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id11342010&quot;&gt;RDF&lt;/a&gt; for our large metadata store? What are the alternatives? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#LargeMetadataStoreFormat&quot; id=&quot;link-id1478db38&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: How multithreaded is &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id1651d028&quot;&gt;Virtuoso&lt;/a&gt;? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#VirtuosoMultiThreading&quot; id=&quot;link-id152ad310&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: Can multiple servers run off a single shared disk database? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#MultipleServersOneDiskDatabase&quot; id=&quot;link-id14d9d528&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: Can Virtuoso run on a SAN? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#VirtuosoOnSAN&quot; id=&quot;link-id111b55d0&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: How does Virtuoso join across partitions? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#CrossPartitionJoins&quot; id=&quot;link-id11094db8&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: Does Virtuoso support federated triple stores? If there are multiple &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id19156b48&quot;&gt;SPARQL&lt;/a&gt; end points, can Virtuoso be used to do queries joining between these? 
&lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#FederatedTripleStoresAndQueries&quot; id=&quot;link-id15447ef8&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: How many servers can a cluster contain? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#ClusterServerLimit&quot; id=&quot;link-id125fe0d0&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: How do I reconfigure a cluster, adding and removing machines, etc? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#ClusterReconfiguration&quot; id=&quot;link-id1150c448&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: How will Virtuoso handle regional clusters? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#RegionalClustering&quot; id=&quot;link-id1596ca48&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: Is there a mechanism for terminating long running queries? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#TerminatingLongRunningQueries&quot; id=&quot;link-id116bbd60&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: Can the user be asynchronously notified when a long running query terminates? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#AsynchNotificationOfQueryTermination&quot; id=&quot;link-id15a59a50&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: How many concurrent queries can Virtuoso handle? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#ConcurrentQueryLimits&quot; id=&quot;link-id110a8c00&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: What is the relative performance of SPARQL queries vs. native relational queries &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#RelativePerformanceSparqlVsSql&quot; id=&quot;link-id110914f8&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: Does Virtuoso support property tables? 
&lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#PropertyTableSupport&quot; id=&quot;link-id1581f8c8&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: What performance metrics does Virtuoso offer? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#PerformanceMetricSupport&quot; id=&quot;link-id14e92300&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: What support do you provide for concurrency/multithreading operation? Is your interface thread-safe? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#ConcurrencyAndThreadSafety&quot; id=&quot;link-id15964b80&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: What level of ACID properties are supported? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#AcidComplianceLevel&quot; id=&quot;link-id11035ac0&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: Do you provide the ability to atomically add a set of triples, where either all are added or none are added? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#AtomicTripleInsertion&quot; id=&quot;link-id15290e68&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: Do you provide the ability to add a set of triples, respecting the isolation property (so concurrent accessors either see none of the triple values, or all of them)? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#IsolationDuringInsertion&quot; id=&quot;link-id15855df0&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: What is the time to start a database, create/open a graph? &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#StartupTimes&quot; id=&quot;link-id14227f40&quot;&gt;answer&lt;/a&gt; &lt;/p&gt; &lt;p&gt;Q: What sort of security features are built into Virtuoso? 
&lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/Virt6FAQ.html#BuiltInSecurity&quot; id=&quot;link-id11927810&quot;&gt;answer&lt;/a&gt; &lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso RDF: A Getting Started Guide for the Developer</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-12-17#1505</atom:id>
  <atom:published>2008-12-17T12:31:34Z</atom:published>
  <atom:updated>2008-12-17T12:41:27.000006-05:00</atom:updated>
  <atom:content type="html">&lt;p&gt;It is a long standing promise of mine to dispel the false impression that using &lt;a href=&quot;http://virtuoso.openlinksw.com/&quot; id=&quot;link-id113506d0&quot;&gt;Virtuoso&lt;/a&gt; to work with &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id115d9528&quot;&gt;RDF&lt;/a&gt; is complicated.&lt;/p&gt; &lt;p&gt;The purpose of this presentation is to show a programmer how to put RDF into Virtuoso and how to query it. This is done programmatically, with no confusing user interfaces.&lt;/p&gt; &lt;p&gt;You should have a Virtuoso Open Source tree built and installed. We will look at the LUBM benchmark demo that comes with the package. All you need is a Unix shell. Running the shell under emacs (&lt;code&gt;m-x shell&lt;/code&gt;) is the best. But the open source &lt;code&gt;isql&lt;/code&gt; utility should have command line editing also. The emacs shell is however convenient for cutting and pasting things between shell and files.&lt;/p&gt; &lt;p&gt;To get started, cd into &lt;code&gt;binsrc/tests/lubm&lt;/code&gt;.&lt;/p&gt; &lt;p&gt;To verify that this works, you can do &lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;./test_server.sh virtuoso-t&lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;This will test the server with the LUBM queries. This should report 45 tests passed. After this we will do the tests step-by-step.&lt;/p&gt; &lt;h2&gt;Loading the &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id10f7bd90&quot;&gt;Data&lt;/a&gt; &lt;/h2&gt; &lt;p&gt;The file &lt;code&gt;lubm-load.sql&lt;/code&gt; contains the commands for loading the LUBM single university qualification database.&lt;/p&gt; &lt;p&gt;The data files themselves are in &lt;code&gt;lubm_8000&lt;/code&gt;, 15 files in RDFXML.&lt;/p&gt; &lt;p&gt;There is also a little ontology called &lt;code&gt;inf.nt&lt;/code&gt;. 
This declares the subclass and subproperty relations used in the benchmark.&lt;/p&gt; &lt;p&gt;So now let&amp;#39;s go through this procedure.&lt;/p&gt; &lt;p&gt;Start the server:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;$ virtuoso-t -f &amp;amp; &lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;This starts the server in foreground mode, and puts it in the background of the shell.&lt;/p&gt; &lt;p&gt;Now we connect to it with the isql utility.&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;$ isql 1111 dba dba &lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;This gives a &lt;code&gt;SQL&amp;gt;&lt;/code&gt; prompt. The default username and password are both &lt;code&gt;dba&lt;/code&gt;.&lt;/p&gt; &lt;p&gt;When a command is &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id1176ce70&quot;&gt;SQL&lt;/a&gt;, it is entered directly. If it is &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id156df468&quot;&gt;SPARQL&lt;/a&gt;, it is prefixed with the keyword &lt;code&gt;sparql&lt;/code&gt;. This is how all the SQL clients work. Any SQL client, such as any &lt;a href=&quot;http://dbpedia.org/resource/Open_Database_Connectivity&quot; id=&quot;link-id152d0a00&quot;&gt;ODBC&lt;/a&gt; or &lt;a href=&quot;http://dbpedia.org/resource/Java_Database_Connectivity&quot; id=&quot;link-id157ad6a0&quot;&gt;JDBC&lt;/a&gt; application, can use SPARQL if the SQL string starts with this keyword.&lt;/p&gt; &lt;p&gt;The &lt;code&gt;lubm-load.sql&lt;/code&gt; file is quite self-explanatory. 
It begins with defining an SQL procedure that calls the RDF/XML load function, &lt;code&gt;DB..RDF_LOAD_RDFXML&lt;/code&gt;, for each file in a directory.&lt;/p&gt; &lt;p&gt;Next it calls this procedure, &lt;code&gt;load_lubm&lt;/code&gt;, for the &lt;code&gt;lubm_8000&lt;/code&gt; directory under the server&amp;#39;s working directory.&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;sparql CLEAR GRAPH &amp;lt;lubm&amp;gt;;
sparql CLEAR GRAPH &amp;lt;inf&amp;gt;;
load_lubm ( server_root() || &amp;#39;/lubm_8000/&amp;#39; );&lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;Then it verifies that the right number of triples is found in the &amp;lt;lubm&amp;gt; graph.&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;sparql SELECT COUNT(*) FROM &amp;lt;lubm&amp;gt; WHERE { ?x ?y ?z } ;&lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;The echo commands below this are interpreted by the isql utility, and produce output to show whether the test was passed. They can be ignored for now.&lt;/p&gt; &lt;p&gt;Then it adds some implied &lt;code&gt;subOrganizationOf&lt;/code&gt; triples. This is part of setting up the LUBM test database.&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;sparql PREFIX ub: &amp;lt;http://www.lehigh.edu/~zhp2/2004/0401/univ-bench.owl#&amp;gt;
INSERT INTO GRAPH &amp;lt;lubm&amp;gt; { ?x ub:subOrganizationOf ?z }
FROM &amp;lt;lubm&amp;gt;
WHERE { ?x ub:subOrganizationOf ?y . ?y ub:subOrganizationOf ?z . };&lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;Then it loads the ontology file, &lt;code&gt;inf.nt&lt;/code&gt;, using the Turtle load function, &lt;code&gt;DB.DBA.TTLP&lt;/code&gt;. 
The arguments of the function are the text to load, the default namespace prefix, and the &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Identifier&quot; id=&quot;link-id15835550&quot;&gt;URI&lt;/a&gt; of the target graph.&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;DB.DBA.TTLP ( file_to_string ( &amp;#39;inf.nt&amp;#39; ), &amp;#39;http://www.lehigh.edu/~zhp2/2004/0401/univ-bench.owl&amp;#39;, &amp;#39;inf&amp;#39; ) ; sparql SELECT COUNT(*) FROM &amp;lt;inf&amp;gt; WHERE { ?x ?y ?z } ; &lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;Then we declare that the triples in the &lt;code&gt;&amp;lt;inf&amp;gt;&lt;/code&gt; graph can be used for inference at run time. To enable this, a SPARQL query will declare that it uses the &lt;code&gt;&amp;#39;inft&amp;#39;&lt;/code&gt; rule set. Otherwise this has no effect.&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;rdfs_rule_set (&amp;#39;inft&amp;#39;, &amp;#39;inf&amp;#39;); &lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;This is just a log checkpoint to finalize the work and truncate the transaction log. The server would also eventually do this in its own time.&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;checkpoint; &lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;Now we are ready for querying.&lt;/p&gt; &lt;h2&gt;Querying the Data&lt;/h2&gt; &lt;p&gt;The queries are given in 3 different versions: The first file, &lt;code&gt;lubm.sql&lt;/code&gt;, has the queries with most inference open coded as &lt;code&gt;UNIONs&lt;/code&gt;. The second file, &lt;code&gt;lubm-inf.sql&lt;/code&gt;, has the inference performed at run time using the ontology &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id1109faf0&quot;&gt;information&lt;/a&gt; in the &lt;code&gt;&amp;lt;inf&amp;gt;&lt;/code&gt; graph we just loaded. The last, &lt;code&gt;lubm-phys.sql&lt;/code&gt;, relies on having the entailed triples physically present in the &lt;code&gt;&amp;lt;lubm&amp;gt;&lt;/code&gt; graph. 
These entailed triples are inserted by the SPARUL commands in the &lt;code&gt;lubm-cp.sql&lt;/code&gt; file.&lt;/p&gt; &lt;p&gt;If you wish to run all the commands in a SQL file, you can type &lt;code&gt;load &amp;lt;filename&amp;gt;;&lt;/code&gt; (e.g., &lt;code&gt;load lubm-cp.sql;&lt;/code&gt;) at the &lt;code&gt;SQL&amp;gt;&lt;/code&gt; prompt. If you wish to try individual statements, you can paste them at the command line.&lt;/p&gt; &lt;p&gt;For example: &lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;SQL&amp;gt; sparql PREFIX ub: &amp;lt;http://www.lehigh.edu/~zhp2/2004/0401/univ-bench.owl#&amp;gt; SELECT * FROM &amp;lt;lubm&amp;gt; WHERE { ?x a ub:Publication . ?x ub:publicationAuthor &amp;lt;http://www.Department0.University0.edu/AssistantProfessor0&amp;gt; }; VARCHAR _______________________________________________________________________ http://www.Department0.University0.edu/AssistantProfessor0/Publication0 http://www.Department0.University0.edu/AssistantProfessor0/Publication1 http://www.Department0.University0.edu/AssistantProfessor0/Publication2 http://www.Department0.University0.edu/AssistantProfessor0/Publication3 http://www.Department0.University0.edu/AssistantProfessor0/Publication4 http://www.Department0.University0.edu/AssistantProfessor0/Publication5 6 Rows. -- 4 msec. &lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;To stop the server, simply type &lt;code&gt;shutdown;&lt;/code&gt; at the &lt;code&gt;SQL&amp;gt;&lt;/code&gt; prompt.&lt;/p&gt; &lt;p&gt;If you wish to use a &lt;a href=&quot;http://www.w3.org/TR/rdf-sparql-protocol/&quot; id=&quot;link-id11384668&quot;&gt;SPARQL protocol&lt;/a&gt; end point, just enable the HTTP listener. This is done by adding a stanza like this:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;[HTTPServer] ServerPort = 8421 ServerRoot = . ServerThreads = 2 &lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;Add it to the end of the &lt;code&gt;virtuoso.ini&lt;/code&gt; file in the &lt;code&gt;lubm&lt;/code&gt; directory. 
Then shut down and restart (type &lt;code&gt;shutdown;&lt;/code&gt; at the &lt;code&gt;SQL&amp;gt;&lt;/code&gt; prompt and then &lt;code&gt;virtuoso-t -f &amp;amp;&lt;/code&gt; at the shell prompt).&lt;/p&gt; &lt;p&gt;Now you can connect to the end point with a web browser. The &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Locator&quot; id=&quot;link-id113d02d8&quot;&gt;URL&lt;/a&gt; is &lt;code&gt;http://localhost:8421/sparql&lt;/code&gt;. Without parameters, this will show a human readable form. With parameters, this will execute SPARQL.&lt;/p&gt; &lt;p&gt;We have shown how to load and query RDF with Virtuoso using the most basic SQL tools. Next you can access RDF from, for example, &lt;a href=&quot;http://dbpedia.org/resource/PHP&quot; id=&quot;link-id142d0ba0&quot;&gt;PHP&lt;/a&gt;, using the PHP ODBC interface.&lt;/p&gt; &lt;p&gt;To see how to use &lt;a href=&quot;http://jena.sourceforge.net/&quot; id=&quot;link-id117074f0&quot;&gt;Jena&lt;/a&gt; or &lt;a href=&quot;http://sourceforge.net/projects/sesame/&quot; id=&quot;link-id1103c9b0&quot;&gt;Sesame&lt;/a&gt; with Virtuoso, look at &lt;a href=&quot;http://docs.openlinksw.com/virtuoso/rdfnativestorageproviders.html&quot; id=&quot;link-id15488ce8&quot;&gt;Native RDF Storage Providers&lt;/a&gt;. To see how RDF data types are supported, see &lt;a href=&quot;http://docs.openlinksw.com/virtuoso/VirtuosoDriverJDBC.html#jdbcrdf&quot; id=&quot;link-id15784a40&quot;&gt;Extension datatype for RDF&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;To work with large volumes of data, you must increase the memory settings in the configuration file and use the row-autocommit mode, i.e., do &lt;code&gt;log_enable (2);&lt;/code&gt; before the load command. Otherwise Virtuoso will do the entire load as a single transaction, and will run out of rollback space. See the &lt;a href=&quot;http://docs.openlinksw.com/virtuoso/&quot; id=&quot;link-id111410f0&quot;&gt;documentation&lt;/a&gt; for more.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>See the Lite: Embeddable/Background Virtuoso starts at 25MB</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-12-17#1503</atom:id>
  <atom:published>2008-12-17T09:34:12Z</atom:published>
  <atom:updated>2008-12-17T12:03:49-05:00</atom:updated>
  <atom:content type="html">&lt;p&gt;We have received many requests for an embeddable-scale &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1cd69650&quot;&gt;Virtuoso&lt;/a&gt;. In response to this, we have added a Lite mode, where the initial size of a server process is a tiny fraction of what the initial size would be with default settings. With 2MB of disk cache buffers (ini file setting, &lt;code&gt;NumberOfBuffers = 256&lt;/code&gt;), the process size stays under 30MB on 32-bit Linux.&lt;/p&gt; &lt;p&gt;The value of this is that one can now have &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1ce89340&quot;&gt;RDF&lt;/a&gt; and full text indexing on the desktop without running a Java VM or any other memory-intensive software. And of course, all of &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x1cfc9288&quot;&gt;SQL&lt;/a&gt; (transactions, stored procedures, etc.) is in the same embeddably-sized container.&lt;/p&gt; &lt;p&gt;The Lite executable is a full Virtuoso executable; the Lite mode is controlled by a switch in the configuration file. The executable size is about 10MB for 32-bit Linux. A database created in the Lite mode will be converted into a fully-featured database (tables and indexes are added, among other things) if the server is started with the Lite setting &amp;quot;off&amp;quot;; functionality can be reverted to Lite mode, though it will now consume somewhat more memory, etc.&lt;/p&gt; &lt;p&gt;Lite mode offers full SQL and &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x1c511da8&quot;&gt;SPARQL&lt;/a&gt;/SPARUL (via SPASQL), but disables all &lt;a href=&quot;http://dbpedia.org/resource/Hypertext_Transfer_Protocol&quot; id=&quot;link-id0x1dac1950&quot;&gt;HTTP&lt;/a&gt;-based services (WebDAV, application hosting, etc.). 
Clients can still use all typical database access mechanisms (i.e., &lt;a href=&quot;http://dbpedia.org/resource/Open_Database_Connectivity&quot; id=&quot;link-id0xb19a488&quot;&gt;ODBC&lt;/a&gt;, &lt;a href=&quot;http://dbpedia.org/resource/Java_Database_Connectivity&quot; id=&quot;link-id0x1d93ee40&quot;&gt;JDBC&lt;/a&gt;, OLE-DB, &lt;a href=&quot;http://dbpedia.org/resource/ADO.NET&quot; id=&quot;link-id0x1ce391c0&quot;&gt;ADO&lt;/a&gt;.&lt;a href=&quot;http://dbpedia.org/resource/.NET_Framework&quot; id=&quot;link-id0xacf1168&quot;&gt;NET&lt;/a&gt;, and XMLA) to connect, including the &lt;a href=&quot;http://jena.sourceforge.net/&quot; id=&quot;link-id0xaaf5b58&quot;&gt;Jena&lt;/a&gt; and &lt;a href=&quot;http://sourceforge.net/projects/sesame/&quot; id=&quot;link-id0x1b1e4328&quot;&gt;Sesame&lt;/a&gt; frameworks for RDF. ODBC now offers full support of RDF &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1cfc9f78&quot;&gt;data&lt;/a&gt; types for &lt;a href=&quot;http://dbpedia.org/resource/C%2B%2B&quot; id=&quot;link-id0xa6059d8&quot;&gt;C&lt;/a&gt;-based clients. A Redland-compatible API also exists, for use with Redland v1.0.8 and later. &lt;/p&gt; &lt;p&gt;Especially for embedded use, we now allow restricting the listener to be a Unix socket, which allows client connections only from the localhost.&lt;/p&gt; &lt;p&gt;Shipping an embedded Virtuoso is easy. It just takes one executable and one configuration file. Performance is generally comparable to &amp;quot;normal&amp;quot; mode, except that Lite will be somewhat less scalable on multicore systems.&lt;/p&gt; &lt;p&gt;The Lite mode will be included in the next Virtuoso 5 Open Source release.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>&quot;E Pluribus Unum&quot;, or &quot;Inversely Functional Identity&quot;, or &quot;Smooshing Without the Stickiness&quot; (re-updated)</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-12-16#1499</atom:id>
  <atom:published>2008-12-16T14:14:43Z</atom:published>
  <atom:updated>2008-12-16T15:01:36.000003-05:00</atom:updated>
  <atom:content type="html">&lt;p&gt;What a terrible word, smooshing... I have understood it to mean that when you have two names for one thing, you give each all the attributes of the other. This smooshes them together, makes them interchangeable.&lt;/p&gt; &lt;p&gt;This is complex, so I will begin with the point and the interested may read on for the details and implications. Starting with soon to be released version 6, &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id15718cb8&quot;&gt;Virtuoso&lt;/a&gt; allows you to say that two things, if they share a uniquely identifying property, are the same. Examples of uniquely identifying properties would be a book&amp;#39;s ISBN number, or a person&amp;#39;s social security plus full name. In relational language this is a &lt;i&gt;unique key&lt;/i&gt;, and in &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id145ed998&quot;&gt;RDF&lt;/a&gt; parlance, an &lt;i&gt;inverse functional property&lt;/i&gt;.&lt;/p&gt; &lt;p&gt;In most systems, such problems are dealt with as a preprocessing step before querying. For example, all the items that are considered the same will get the same properties or at load time all identifiers will be normalized according to some application rules. This is good if the rules are clear and understood. This is so in closed situations, where things tend to have standard identifiers to begin with. But on the open web this is not so clear cut.&lt;/p&gt; &lt;p&gt;In this post, we show how to do these things &lt;i&gt;ad hoc&lt;/i&gt;, without materializing anything. At the end, we also show how to materialize identity and what the consequences of this are with open web &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id11726358&quot;&gt;data&lt;/a&gt;. 
We use real live web crawls from the &lt;a href=&quot;http://challenge.semanticweb.org/&quot; id=&quot;link-id14f40448&quot;&gt;Billion Triples Challenge&lt;/a&gt; data set.&lt;/p&gt; &lt;p&gt;On the &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id156e2b10&quot;&gt;linked data&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Giant_Global_Graph&quot; id=&quot;link-id1106ce08&quot;&gt;web&lt;/a&gt;, there are independently arising descriptions of the same thing and thus arises the need to smoosh, if these are to be somehow integrated. But this is only the beginning of the problems.&lt;/p&gt; &lt;p&gt;To address these, we have added the option of specifying that some property will be considered inversely functional in a query. This is done at run time and the property does not really have to be inversely functional in the pure sense. &lt;code&gt;foaf:name&lt;/code&gt; will do for an example. This simply means that for purposes of the query concerned, two subjects which have at least one &lt;code&gt;foaf:name&lt;/code&gt; in common are considered the same. In this way, we can join between FOAF files. With the same database, a query about music preferences might consider having the same name as &amp;quot;same enough,&amp;quot; but a query about criminal prosecution would obviously need to be more precise about sameness.&lt;/p&gt; &lt;p&gt;Our ontology is defined like this:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;-- Populate a named graph with the triples you want to use in query time inferencing&lt;br /&gt; ttlp ( &amp;#39; @prefix foaf: &amp;lt;http://xmlns.com/foaf/0.1/&amp;gt; . @prefix owl: &amp;lt;http://www.w3.org/2002/07/owl#&amp;gt; . foaf:mbox_sha1sum a owl:InverseFunctionalProperty . foaf:name a owl:InverseFunctionalProperty . 
&amp;#39;, &amp;#39;xx&amp;#39;, &amp;#39;b3sifp&amp;#39; );&lt;br /&gt; -- Declare that the graph contains an ontology for use in query time inferencing &lt;br /&gt; rdfs_rule_set ( &amp;#39;http://example.com/rules/b3sifp#&amp;#39;, &amp;#39;b3sifp&amp;#39; ); &lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;Then use it:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;sparql DEFINE input:inference &amp;quot;http://example.com/rules/b3sifp#&amp;quot; SELECT DISTINCT ?k ?f1 ?f2 WHERE { ?k foaf:name ?n . ?n bif:contains &amp;quot;&amp;#39;Kjetil Kjernsmo&amp;#39;&amp;quot; . ?k foaf:knows ?f1 . ?f1 foaf:knows ?f2 };&lt;br /&gt; VARCHAR VARCHAR VARCHAR ______________________________________ _______________________________________________ ______________________________&lt;br /&gt; http://www.kjetil.kjernsmo.net/foaf#me http://norman.walsh.name/knows/who/robin-berjon http://twitter.com/dajobe http://www.kjetil.kjernsmo.net/foaf#me http://norman.walsh.name/knows/who/robin-berjon http://twitter.com/net_twitter http://www.kjetil.kjernsmo.net/foaf#me http://norman.walsh.name/knows/who/robin-berjon http://twitter.com/amyvdh http://www.kjetil.kjernsmo.net/foaf#me http://norman.walsh.name/knows/who/robin-berjon http://twitter.com/pom http://www.kjetil.kjernsmo.net/foaf#me http://norman.walsh.name/knows/who/robin-berjon http://twitter.com/mattb http://www.kjetil.kjernsmo.net/foaf#me http://norman.walsh.name/knows/who/robin-berjon http://twitter.com/davorg http://www.kjetil.kjernsmo.net/foaf#me http://norman.walsh.name/knows/who/robin-berjon http://twitter.com/distobj http://www.kjetil.kjernsmo.net/foaf#me http://norman.walsh.name/knows/who/robin-berjon http://twitter.com/perigrin .... &lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;Without the inference, we get no matches. This is because the data in question has one graph per FOAF file, and blank nodes for persons. No graph references any person outside the ones in the graph. 
So if somebody is mentioned as known, then without the inference there is no way to get to what that person&amp;#39;s FOAF file says, since the same individual will be a different blank node there. The declaration in the context named &lt;code&gt;b3sifp&lt;/code&gt; just means that all things with a matching &lt;code&gt;foaf:name&lt;/code&gt; or &lt;code&gt;foaf:mbox_sha1sum&lt;/code&gt; are the same.&lt;/p&gt; &lt;p&gt;Sameness means that two are the same for purposes of &lt;code&gt;DISTINCT&lt;/code&gt; or &lt;code&gt;GROUP BY&lt;/code&gt;, and if two are the same, then both have the &lt;code&gt;UNION&lt;/code&gt; of all of the properties of both.&lt;/p&gt; &lt;p&gt;If this were a naive smoosh, then the individuals would have all the same properties but would not be the same for &lt;code&gt;DISTINCT&lt;/code&gt;.&lt;/p&gt; &lt;p&gt;If we have complex application rules for determining whether individuals are the same, then we can materialize &lt;code&gt;owl:sameAs&lt;/code&gt; triples and keep them in a separate graph. In this way, the original data is not contaminated and the materialized volume stays reasonable, nothing like the blow-up of duplicating properties across instances.&lt;/p&gt; &lt;p&gt;The pro-smoosh argument is that if every duplicate makes exactly the same statements, then there is no great blow-up. Best and worst cases will always depend on the data. In rough terms, the more &lt;i&gt;ad hoc&lt;/i&gt; the use, the less desirable the materialization. If the usage pattern is really set, then a relational-style application-specific representation with identity resolved at load time will perform best. We can do that too, but so can others.&lt;/p&gt; &lt;p&gt;The principal point is about agility as concerns the inference. Run time is more agile than materialization, and if the rules change or if different users have different needs, then materialization runs into trouble. 
When talking web scale, having multiple users is a given; it is very uneconomical to give everybody their own copy, and the likelihood of a user accessing any significant part of the corpus is minimal. Even if the queries were not limited, the user would typically not wait for the answer of a query doing a scan or aggregation over 1 billion &lt;a href=&quot;http://dbpedia.org/resource/Blog&quot; id=&quot;link-id1156a550&quot;&gt;blog&lt;/a&gt; posts or something of the sort. So queries will typically be selective. Selective means that they do not access all of the data, hence do not benefit from ready-made materialization for things they do not even look at. &lt;/p&gt; &lt;p&gt;The exception is corpus-wide statistics queries. But these will not be done in interactive time anyway, and will not be done very often. Plus, since these do not typically run all in memory, these are disk bound. And when things are disk bound, size matters. Reading extra entailment on the way is just a performance penalty.&lt;/p&gt; &lt;p&gt;Enough talk. Time for an experiment. We take the Yahoo and Falcon web crawls from the Billion Triples Challenge set, and do two things with the FOAF data in them:&lt;/p&gt; &lt;ol&gt; &lt;li&gt;Resolve identity at insert time. We remove duplicate person URIs, and give the single &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Identifier&quot; id=&quot;link-id11317008&quot;&gt;URI&lt;/a&gt; all the properties of all the duplicate URIs. We expect these to be most often repeats. If a person references another person, we normalize this reference to go to the single URI of the referenced person.&lt;/li&gt; &lt;li&gt;Give every duplicate URI of a person all the properties of all the duplicates. 
If these are the same value, the data should not get much bigger, or so we think.&lt;/li&gt; &lt;/ol&gt; &lt;p&gt;For the experiment, we will consider two people the same if they have the same &lt;code&gt;foaf:name&lt;/code&gt; and are both instances of &lt;code&gt;foaf:Person&lt;/code&gt;. This gets some extra hits but should not be statistically significant.&lt;/p&gt; &lt;p&gt;The following is a commented &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id110945b0&quot;&gt;SQL&lt;/a&gt; script performing the smoosh. We play with internal IDs of things, thus some of these operations cannot be done in SPARQL alone. We use SPARQL where possible for readability. As the documentation states, &lt;code&gt;iri_to_id&lt;/code&gt; converts from the qualified name of an IRI to its ID and &lt;code&gt;id_to_iri&lt;/code&gt; does the reverse.&lt;/p&gt; &lt;p&gt;We count the triples that enter into the smoosh:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;-- the name is an existence because else we&amp;#39;d get several times more due to -- the names occurring in many graphs &lt;br /&gt; sparql SELECT COUNT(*) WHERE { { SELECT DISTINCT ?person WHERE { ?person a foaf:Person } } . FILTER ( bif:exists ( SELECT (1) WHERE { ?person foaf:name ?nn } ) ) . 
?person ?p ?o };&lt;br /&gt; -- We get 3284674 &lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;We make a few tables for intermediate results.&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;-- For each distinct name, gather the properties and objects from -- all subjects with this name &lt;br /&gt; CREATE TABLE name_prop ( np_name ANY, np_p IRI_ID_8, np_o ANY, PRIMARY KEY ( np_name, np_p, np_o ) ); ALTER INDEX name_prop ON name_prop PARTITION ( np_name VARCHAR (-1, 0hexffff) );&lt;br /&gt; -- Map from name to canonical IRI used for the name &lt;br /&gt; CREATE TABLE name_iri ( ni_name ANY PRIMARY KEY, ni_s IRI_ID_8 ); ALTER INDEX name_iri ON name_iri PARTITION ( ni_name VARCHAR (-1, 0hexffff) );&lt;br /&gt; -- Map from person IRI to canonical person IRI&lt;br /&gt; CREATE TABLE pref_iri ( i IRI_ID_8, pref IRI_ID_8, PRIMARY KEY ( i ) ); ALTER INDEX pref_iri ON pref_iri PARTITION ( i INT (0hexffff00) );&lt;br /&gt; -- a table for the materialization where all aliases get all properties of every other &lt;br /&gt; CREATE TABLE smoosh_ct ( s IRI_ID_8, p IRI_ID_8, o ANY, PRIMARY KEY ( s, p, o ) ); ALTER INDEX smoosh_ct ON smoosh_ct PARTITION ( s INT (0hexffff00) );&lt;br /&gt; -- disable transaction log and enable row auto-commit. This is necessary, otherwise -- bulk operations are done transactionally and they will run out of rollback space.&lt;br /&gt; LOG_ENABLE (2);&lt;br /&gt; -- Gather all the properties of all persons with a name under that name. -- INSERT SOFT means that duplicates are ignored &lt;br /&gt; INSERT SOFT name_prop SELECT &amp;quot;n&amp;quot;, &amp;quot;p&amp;quot;, &amp;quot;o&amp;quot; FROM ( sparql DEFINE output:valmode &amp;quot;LONG&amp;quot; SELECT ?n ?p ?o WHERE { ?x a foaf:Person . ?x foaf:name ?n . 
?x ?p ?o } ) xx ;&lt;br /&gt; -- Now choose for each name the canonical IRI &lt;br /&gt; INSERT INTO name_iri SELECT np_name, ( SELECT MIN (s) FROM rdf_quad WHERE o = np_name AND p = IRI_TO_ID (&amp;#39;http://xmlns.com/foaf/0.1/name&amp;#39;) ) AS mini FROM name_prop WHERE np_p = IRI_TO_ID (&amp;#39;http://xmlns.com/foaf/0.1/name&amp;#39;) ;&lt;br /&gt; -- For each person IRI, map to the canonical IRI of that person &lt;br /&gt; INSERT SOFT pref_iri (i, pref) SELECT s, ni_s FROM name_iri, rdf_quad WHERE o = ni_name AND p = IRI_TO_ID (&amp;#39;http://xmlns.com/foaf/0.1/name&amp;#39;) ;&lt;br /&gt; -- Make a graph where all persons have one iri with all the properties of all aliases -- and where person-to-person refs are canonicalized&lt;br /&gt; INSERT SOFT rdf_quad (g,s,p,o) SELECT IRI_TO_ID (&amp;#39;psmoosh&amp;#39;), ni_s, np_p, COALESCE ( ( SELECT pref FROM pref_iri WHERE i = np_o ), np_o ) FROM name_prop, name_iri WHERE ni_name = np_name OPTION ( loop, quietcast ) ;&lt;br /&gt; -- A little explanation: The properties of names are copied into rdf_quad with the name -- replaced with its canonical IRI. If the object has a canonical IRI, this is used as -- the object, else the object is unmodified. This is the COALESCE with the sub-query.&lt;br /&gt; -- This takes a little time. To check on the progress, take another connection to the -- server and do &lt;br /&gt; STATUS (&amp;#39;cluster&amp;#39;);&lt;br /&gt; -- It will return something like -- Cluster 4 nodes, 35 s. 108 m/s 1001 KB/s 75% cpu 186% read 12% clw threads 5r 0w 0i -- buffers 549481 253929 d 8 w 0 pfs&lt;br /&gt; -- Now finalize the state; this makes it permanent. 
Else the work will be lost on server -- failure, since there was no transaction log &lt;br /&gt; CL_EXEC (&amp;#39;checkpoint&amp;#39;);&lt;br /&gt; -- See what we got&lt;br /&gt; sparql SELECT COUNT (*) FROM &amp;lt;psmoosh&amp;gt; WHERE {?s ?p ?o};&lt;br /&gt; -- This is 2253102&lt;br /&gt; -- Now make the copy where all have the properties of all synonyms. This takes so much -- space we do not insert it as RDF quads, but make a special table for it so that we can -- run some statistics. This saves time.&lt;br /&gt; INSERT SOFT smoosh_ct (s, p, o) SELECT s, np_p, np_o FROM name_prop, rdf_quad WHERE o = np_name AND p = IRI_TO_ID (&amp;#39;http://xmlns.com/foaf/0.1/name&amp;#39;) ;&lt;br /&gt; -- as above, INSERT SOFT so as to ignore duplicates &lt;br /&gt; SELECT COUNT (*) FROM smoosh_ct;&lt;br /&gt; -- This is 167360324&lt;br /&gt; -- Find out where the bloat comes from &lt;br /&gt; SELECT TOP 20 COUNT (*), ID_TO_IRI (p) FROM smoosh_ct GROUP BY p ORDER BY 1 DESC; &lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;The results are:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;54728777 http://www.w3.org/2002/07/owl#sameAs 48543153 http://xmlns.com/foaf/0.1/knows 13930234 http://www.w3.org/2000/01/rdf-schema#seeAlso 12268512 http://xmlns.com/foaf/0.1/interest 11415867 http://xmlns.com/foaf/0.1/nick 6683963 http://xmlns.com/foaf/0.1/weblog 6650093 http://xmlns.com/foaf/0.1/depiction 4231946 http://xmlns.com/foaf/0.1/mbox_sha1sum 4129629 http://xmlns.com/foaf/0.1/homepage 1776555 http://xmlns.com/foaf/0.1/holdsAccount 1219525 http://xmlns.com/foaf/0.1/based_near 305522 http://www.w3.org/1999/02/22-rdf-syntax-ns#type 274965 http://xmlns.com/foaf/0.1/name 155131 http://xmlns.com/foaf/0.1/dateOfBirth 153001 http://xmlns.com/foaf/0.1/img 111130 http://www.w3.org/2001/vcard-rdf/3.0#ADR 52930 http://xmlns.com/foaf/0.1/gender 48517 http://www.w3.org/2004/02/skos/core#subject 45697 http://www.w3.org/2000/01/rdf-schema#label 44860 http://purl.org/vocab/bio/0.1/olb 
&lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;Now compare with the predicate distribution of the smoosh with identities canonicalized:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;sparql SELECT COUNT (*) ?p FROM &amp;lt;psmoosh&amp;gt; WHERE { ?s ?p ?o } GROUP BY ?p ORDER BY 1 DESC LIMIT 20;&lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;Results are:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;748311 http://xmlns.com/foaf/0.1/knows 548391 http://xmlns.com/foaf/0.1/interest 140531 http://www.w3.org/2000/01/rdf-schema#seeAlso 105273 http://www.w3.org/1999/02/22-rdf-syntax-ns#type 78497 http://xmlns.com/foaf/0.1/name 48099 http://www.w3.org/2004/02/skos/core#subject 45179 http://xmlns.com/foaf/0.1/depiction 40229 http://www.w3.org/2000/01/rdf-schema#comment 38272 http://www.w3.org/2000/01/rdf-schema#label 37378 http://xmlns.com/foaf/0.1/nick 37186 http://dbpedia.org/property/abstract 34003 http://xmlns.com/foaf/0.1/img 26182 http://xmlns.com/foaf/0.1/homepage 23795 http://www.w3.org/2002/07/owl#sameAs 17651 http://xmlns.com/foaf/0.1/mbox_sha1sum 17430 http://xmlns.com/foaf/0.1/dateOfBirth 15586 http://xmlns.com/foaf/0.1/page 12869 http://dbpedia.org/property/reference 12497 http://xmlns.com/foaf/0.1/weblog 12329 http://blogs.yandex.ru/schema/foaf/school &lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;We can drop the &lt;code&gt;owl:sameAs&lt;/code&gt; triples from the count, which reduces the bloat somewhat, but the result is still tens of times larger than the canonicalized copy or the initial state.&lt;/p&gt; &lt;p&gt;Now, when we try using the psmoosh graph, we still get different results from the results with the original data. This is because &lt;code&gt;foaf:knows&lt;/code&gt; relations to things with no &lt;code&gt;foaf:name&lt;/code&gt; are not represented in the smoosh. They exist:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;sparql SELECT COUNT (*) WHERE { ?s foaf:knows ?thing . 
FILTER ( !bif:exists ( SELECT (1) WHERE { ?thing foaf:name ?nn } ) ) };&lt;br /&gt; -- 1393940 &lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;So the smoosh graph is not an accurate rendition of the social network. It would have to be smooshed further to be that, since the data in the sample is quite irregular. But we do not go that far here.&lt;/p&gt; &lt;p&gt;Finally, we calculate the smoosh blow-up factors. We do not include &lt;code&gt;owl:sameAs&lt;/code&gt; triples in the counts, so the canonicalized figure is 2253102 - 23795 = 2229307.&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;select (167360324 - 54728777) / 3284674.0; = 34.290022997716059&lt;br /&gt; select 2229307 / 3284674.0; = 0.678699621332284 &lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;So, to get a smoosh that is not really the equivalent of the original, multiply the original triple count by 0.68 if synonyms are collapsed, or by 34 if they are not.&lt;/p&gt; &lt;p&gt;Making the smooshes does not take very long, some minutes for the small one. Inserting the big one would take longer, a couple of hours maybe. Filling the &lt;code&gt;smoosh_ct&lt;/code&gt; table took 33 minutes. The measurements were not made with optimal tuning, so the performance numbers just serve to show that smooshing takes time. Probably more time than allowable in an interactive situation, no matter how the process is optimized.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso Anytime: No Query Is Too Complex (updated)</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-12-11#1495</atom:id>
  <atom:published>2008-12-11T16:13:10Z</atom:published>
  <atom:updated>2008-12-12T10:29:23-05:00</atom:updated>
  <atom:content type="html">&lt;p&gt;A persistent argument against the &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id1199d5f8&quot;&gt;linked data&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Giant_Global_Graph&quot; id=&quot;link-id116f2730&quot;&gt;web&lt;/a&gt; has been the cost, scalability, and vulnerability of &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id14e423c0&quot;&gt;SPARQL&lt;/a&gt; end points, should the linked data web gain serious mass and traffic.&lt;/p&gt; &lt;p&gt;As we are on the brink of hosting the whole &lt;a href=&quot;http://dbpedia.org/resource/DBpedia&quot; id=&quot;link-id1376a8b0&quot;&gt;DBpedia&lt;/a&gt; &lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id113c8d20&quot;&gt;Linked Open Data&lt;/a&gt; cloud in &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id11425a78&quot;&gt;Virtuoso&lt;/a&gt; Cluster, we have had to think of what we&amp;#39;ll do if, for example, somebody decides to count all the triples in the set.&lt;/p&gt; &lt;p&gt;How can we encourage clever use of &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id116f1210&quot;&gt;data&lt;/a&gt;, yet not die if somebody, whether through malice, lack of understanding, or simple bad luck, submits impossible queries?&lt;/p&gt; &lt;p&gt;Restricting the language is not the way; any language beyond text search can express queries that will take forever to execute. Also, just returning a timeout after the first second (or whatever arbitrary time period) leaves people in the dark and does not produce an impression of responsiveness. 
So we decided to allow arbitrary queries, and if a quota of time or resources is exceeded, we return partial results and indicate how much processing was done.&lt;/p&gt; &lt;p&gt;Here we are looking for the top 10 people whom people claim to know without being known in return, like this:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;SQL&amp;gt; sparql SELECT ?celeb, COUNT (*) WHERE { ?claimant foaf:knows ?celeb . FILTER (!bif:exists ( SELECT (1) WHERE { ?celeb foaf:knows ?claimant } ) ) } GROUP BY ?celeb ORDER BY DESC 2 LIMIT 10;&lt;br /&gt; celeb callret-1 VARCHAR VARCHAR ________________________________________ _________&lt;br /&gt; http://twitter.com/BarackObama 252 http://twitter.com/brianshaler 183 http://twitter.com/newmediajim 101 http://twitter.com/HenryRollins 95 http://twitter.com/wilw 81 http://twitter.com/stevegarfield 78 http://twitter.com/cote 66 mailto:adam.westerski@deri.org 66 mailto:michal.zaremba@deri.org 66 http://twitter.com/dsifry 65&lt;br /&gt; *** Error S1TAT: [Virtuoso Driver][Virtuoso Server]RC...: Returning incomplete results, query interrupted by result timeout. Activity: 1R rnd 0R seq 0P disk 1.346KB / 3 messages&lt;br /&gt; SQL&amp;gt; sparql SELECT ?celeb, COUNT (*) WHERE { ?claimant foaf:knows ?celeb . FILTER (!bif:exists ( SELECT (1) WHERE { ?celeb foaf:knows ?claimant } ) ) } GROUP BY ?celeb ORDER BY DESC 2 LIMIT 10;&lt;br /&gt; celeb callret-1 VARCHAR VARCHAR ________________________________________ _________&lt;br /&gt; http://twitter.com/JasonCalacanis 496 http://twitter.com/Twitterrific 466 http://twitter.com/ev 442 http://twitter.com/BarackObama 356 http://twitter.com/laughingsquid 317 http://twitter.com/gruber 294 http://twitter.com/chrispirillo 259 http://twitter.com/ambermacarthur 224 http://twitter.com/t 219 http://twitter.com/johnedwards 188&lt;br /&gt; *** Error S1TAT: [Virtuoso Driver][Virtuoso Server]RC...: Returning incomplete results, query interrupted by result timeout. 
Activity: 329R rnd 44.6KR seq 342P disk 638.4KB / 46 messages&lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;The first query read all data from disk; the second run had the working set from the first and could read some more before time ran out, hence the results were better. But the response time was the same.&lt;/p&gt; &lt;p&gt;If one has a query that just loops over consecutive joins, like in basic SPARQL, interrupting the processing after a set time period is simple. But such queries are not very interesting. To give meaningful partial answers with nested aggregation and sub-queries requires some more tricks. The basic idea is to terminate the innermost active sub-query/aggregation at the first timeout, and extend the timeout a bit so that accumulated results get fed to the next aggregation, like from the &lt;code&gt;GROUP BY&lt;/code&gt; to the &lt;code&gt;ORDER BY&lt;/code&gt;. If this again times out, we continue with the next outer layer. This guarantees that results are delivered if there were any results found for which the query pattern is true. False results are not produced, except in cases where there is comparison with a count and the count is smaller than it would be with the full evaluation.&lt;/p&gt; &lt;p&gt;One can also use this as a basis for paid services. The cutoff does not have to be time; it can also be in other units, making it insensitive to concurrent usage and variations of working set.&lt;/p&gt; &lt;p&gt;This system will be deployed on our &lt;a href=&quot;http://challenge.semanticweb.org/&quot; id=&quot;link-id11500a58&quot;&gt;Billion Triples Challenge&lt;/a&gt; &lt;a href=&quot;http://b3s.openlinksw.com/&quot; id=&quot;link-id11683120&quot;&gt;demo instance&lt;/a&gt; in a few days, after some more testing. 
When Virtuoso 6 ships, all &lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id1157a500&quot;&gt;LOD&lt;/a&gt; Cloud AMIs and OpenLink-hosted LOD Cloud SPARQL endpoints will have this enabled by default. (AMI users will be able to disable the feature, if desired.) The feature works with Virtuoso 6 in both single server and cluster deployment.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>An Example of RDF Scalability</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-11-27#1488</atom:id>
  <atom:published>2008-11-27T11:23:47Z</atom:published>
  <atom:updated>2008-12-01T12:09:55.000008-05:00</atom:updated>
  <atom:content type="html">&lt;p&gt;We hear the question to exhaustion: where is &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x14e828d8&quot;&gt;RDF&lt;/a&gt; scalability? We have been suggesting for a while that this is a solved question. I will here give some concrete numbers to back this up.&lt;/p&gt; &lt;p&gt;The scalability dream is to add hardware and get increased performance in proportion to the power the added component has when measured by itself. A corollary dream is to take scalability effects that are measured in a simple task and see them in a complex task.&lt;/p&gt; &lt;p&gt;Below we show how we do 3.3 million random triple lookups per second on two 8 core commodity servers producing complete results, joining across partitions. On a single 4 core server, the figure is about 1 million lookups per second. With a single thread, it is about 250K lookups per second. This is the good case. But even our worst case is quite decent.&lt;/p&gt; &lt;p&gt;We took a simple &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x14fef850&quot;&gt;SPARQL&lt;/a&gt; query, counting how many people say they reciprocally know each other. In the &lt;a href=&quot;http://challenge.semanticweb.org/&quot; id=&quot;link-id0x1bca04d0&quot;&gt;Billion Triples Challenge&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1be84e88&quot;&gt;data&lt;/a&gt; set, there are 25M &lt;code&gt;foaf:knows&lt;/code&gt; quads of which 92K are reciprocal. &lt;i&gt;Reciprocal&lt;/i&gt; here means that when x knows y in some graph, y knows x in the same or any other graph.&lt;/p&gt; &lt;pre&gt;SELECT COUNT (*) WHERE { ?p1 foaf:knows ?p2 . ?p2 foaf:knows ?p1 }&lt;/pre&gt; &lt;p&gt;There is no guarantee that the triple of &lt;code&gt;x knows y&lt;/code&gt; is in the same partition as the triple of &lt;code&gt;y knows x&lt;/code&gt;. 
Thus the join is randomly distributed, n partitions to n partitions.&lt;/p&gt; &lt;p&gt;We left this out of the Billion Triples Challenge demo because this did not run fast enough for our liking. Since then, we have corrected this.&lt;/p&gt; &lt;p&gt;If run on a single thread, this query would be a loop over all the quads with a predicate of &lt;code&gt;foaf:knows&lt;/code&gt;, and an inner loop looking for a quad with 3 of 4 fields given (&lt;code&gt;SPO&lt;/code&gt;). If we have a partitioned situation, we have a loop over all the &lt;code&gt;foaf:knows&lt;/code&gt; quads in each partition, and an inner lookup looking for the reciprocal &lt;code&gt;foaf:knows&lt;/code&gt; quad in whatever partition it may be found.&lt;/p&gt; &lt;p&gt;We have implemented this with two different message patterns: &lt;/p&gt; &lt;ol&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Centralized:&lt;/b&gt; One process reads all the &lt;code&gt;foaf:knows&lt;/code&gt; quads from all processes. Every 50K quads, it sends a batch of reciprocal quad checks to each partition that could contain a reciprocal quad. Each partition keeps the count of found reciprocal quads, and these are gathered and added up at the end.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Symmetrical:&lt;/b&gt; Each process reads the &lt;code&gt;foaf:knows&lt;/code&gt; quads in its partition, and sends a batch of checks to each process that could have the reciprocal &lt;code&gt;foaf:knows&lt;/code&gt; quad every 50K quads. At the end, the counts are gathered from all partitions. There is some additional control traffic but we do not go into its details here.&lt;/p&gt; &lt;/li&gt; &lt;/ol&gt; &lt;p&gt;Below is the result measured on 2 machines each with 2 x Xeon 5345 (quad core; total 8 cores), 16G RAM, and each machine running 6 &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x16642a90&quot;&gt;Virtuoso&lt;/a&gt; instances. The interconnect is dual 1-Gbit ethernet. 
Numbers are with warm cache.&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;Centralized: 35,543 msec, 728,634 sequential + random lookups per second &lt;br /&gt; Cluster 12 nodes, 35 s. 1072 m/s 39,085 KB/s 316% cpu ... &lt;br /&gt; &lt;br /&gt; Symmetrical: 7706 msec, 3,360,740 sequential + random lookups per second &lt;br /&gt; Cluster 12 nodes, 7 s. 572 m/s 16,983 KB/s 1137% cpu ...&lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;The second line is the summary from the cluster status report for the duration of the query. The interesting numbers are the KB/s and the %CPU. The former is the cross-sectional data transfer rate for intra-cluster communication; the latter is the consolidated CPU utilization, where a constantly-busy core counts for 100%. The point to note is that the symmetrical approach takes 4x less real time with under half the data transfer rate. Further, when using multiple machines, the speed of a single interface does not limit the overall throughput as it does in the centralized situation.&lt;/p&gt; &lt;p&gt;These figures represent the best and worst cases of distributed &lt;code&gt;JOIN&lt;/code&gt;ing. If we have a straight sequence of &lt;code&gt;JOIN&lt;/code&gt;s, with single pattern optionals and existences and the order in which results are produced is not significant (i.e., there is aggregation, existence test, or &lt;code&gt;ORDER BY&lt;/code&gt;), the symmetrical pattern is applicable. On the other hand, if there are multiple triple pattern optionals, complex sub-queries, &lt;code&gt;DISTINCT&lt;/code&gt;s in the middle of the query, or results have to be produced in the order of an index, then the centralized approach must be used at least part of the time.&lt;/p&gt; &lt;p&gt;Also, if we must make transitive closures, which can be thought of as an extension of a &lt;code&gt;DISTINCT&lt;/code&gt; in a subquery, we must pass the data through a single point before moving the bindings to the next &lt;code&gt;JOIN&lt;/code&gt; in the sequence. 
This happens for example in resolving &lt;code&gt;&lt;a href=&quot;http://dbpedia.org/resource/Web_Ontology_Language&quot; id=&quot;link-id0x14e1a160&quot;&gt;owl&lt;/a&gt;:sameAs&lt;/code&gt; at run time. However, the good news is that performance does not fall much below the centralized figure even when there are complex nested structures with intermediate transitive closures, &lt;code&gt;DISTINCT&lt;/code&gt;s, complex existence tests, etc., that require passing all intermediate results through a central point. No matter the complexity, it is always possible to vector some tens-of-thousands of variable bindings into a single message exchange. And if there are not that many intermediate results, then single query execution time is not a problem anyhow.&lt;/p&gt; &lt;p&gt;For our sample query, we would get still more speed by using a partitioned hash join, filling the hash from the &lt;code&gt;foaf:knows&lt;/code&gt; relations and then running the &lt;code&gt;foaf:knows&lt;/code&gt; relations through the hash. If the hash size is right, a hash lookup is somewhat better than an index lookup. The problem is that when the hash join is not the right solution, it is an expensive mistake: the best case is good; the worst case is very bad. But if there is no index then hash join is better than nothing. One problem of hash joins is that they make temporary data structures which, if large, will skew the working set. One must be quite sure of the cardinality before it is safe to try a hash join. So we do not do hash joins with RDF, but we do use them sometimes with relational data. &lt;/p&gt; &lt;p&gt;These same methods apply to relational data just as well. This does not make generic RDF storage outperform an application-specific relational representation on the same platform, as the latter benefits from all the same optimizations, but in terms of sheer numbers, this makes RDF representation an option where it was not an option before. 
RDF is all about not needing to design the schema around the queries, and not needing to limit what joins with what else.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso Vs. MySQL: Setting the Berlin Record Straight (update 2)</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-11-20#1485</atom:id>
  <atom:published>2008-11-20T11:06:11Z</atom:published>
  <atom:updated>2008-11-24T10:15:11.000021-05:00</atom:updated>
  <atom:content type="html">&lt;p&gt;In the context of the &lt;a href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0xa322b58&quot;&gt;Berlin SPARQL Benchmark&lt;/a&gt;, I have repeatedly written about measurement procedures and steady state. The point is that the numbers at larger scales are unreliable due to cache behavior if one is not careful about measurement and does not have adequate warmup. Thus it came to pass that one cut of the &lt;a href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x9524730&quot;&gt;BSBM&lt;/a&gt; paper had 3 seconds for &lt;a href=&quot;http://dbpedia.org/resource/MySQL&quot; id=&quot;link-id0x2ba8db0&quot;&gt;MySQL&lt;/a&gt; and 100 for &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xa9137d0&quot;&gt;Virtuoso&lt;/a&gt;, basically through ignoring cache effects.&lt;/p&gt; &lt;p&gt;So we decided to do it ourselves.&lt;/p&gt; &lt;p&gt;The score is (updated with revised &lt;code&gt;innodb_buffer_pool_size&lt;/code&gt; setting, based on advice noted down below):&lt;/p&gt; &lt;table border=&quot;1&quot; cellspacing=&quot;2&quot; cellpadding=&quot;5&quot;&gt; &lt;tr&gt; &lt;th&gt;n-clients&lt;/th&gt; &lt;th&gt;Virtuoso&lt;/th&gt; &lt;th&gt;MySQL &lt;br /&gt; (with increased buffer pool size)&lt;/th&gt; &lt;th&gt;MySQL &lt;br /&gt; (with default buffer pool size)&lt;/th&gt; &lt;/tr&gt; &lt;tr align=&quot;right&quot;&gt; &lt;td&gt;1&lt;/td&gt; &lt;td&gt; 41,161.33&lt;/td&gt; &lt;td&gt; 27,023.11 &lt;/td&gt; &lt;td&gt; 12,171.41&lt;/td&gt; &lt;/tr&gt; &lt;tr align=&quot;right&quot;&gt; &lt;td&gt;4&lt;/td&gt; &lt;td&gt; 127,918.30&lt;/td&gt; &lt;td&gt; (pending) &lt;/td&gt; &lt;td&gt; 37,566.82&lt;/td&gt; &lt;/tr&gt; &lt;tr align=&quot;right&quot;&gt; &lt;td&gt;8&lt;/td&gt; &lt;td&gt; 218,162.29 &lt;/td&gt; &lt;td&gt; 105,524.23 &lt;/td&gt; &lt;td&gt; 51,104.39 &lt;/td&gt; &lt;/tr&gt; &lt;tr 
align=&quot;right&quot;&gt; &lt;td&gt;16&lt;/td&gt; &lt;td&gt; 214,763.58 &lt;/td&gt; &lt;td&gt; 98,852.42 &lt;/td&gt; &lt;td&gt; 47,589.18 &lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;p&gt;The metric is the query mixes per hour from the BSBM test driver output. For the interested, the complete output is &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/texts/bsbmres.txt&quot; id=&quot;link-id1119f770&quot;&gt;here&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;The benchmark is pure &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x2b61c88&quot;&gt;SQL&lt;/a&gt;, nothing to do with &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x17a6d408&quot;&gt;SPARQL&lt;/a&gt; or &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x9a0a968&quot;&gt;RDF&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;The hardware is 2 x Xeon 5345 (2 x quad core, 2.33 GHz), 16 G RAM. The OS is 64-bit Debian Linux.&lt;/p&gt; &lt;p&gt;The benchmark was run at a scale of 200,000. Each run had 2000 warm-up query mixes and 500 measured query mixes, which gives steady state, eliminating any effects of OS disk cache and the like. Both databases were configured to use 8G for disk cache. The test effectively runs from memory. We ran an analyze table on each MySQL table but noticed that this had no effect. Virtuoso does the stats sampling on the go; possibly MySQL also since the explicit stats did not make any difference. The MySQL tables were served by the InnoDB engine. MySQL appears to cache results of queries in some cases. This was not apparent in the tests.&lt;/p&gt; &lt;p&gt;The versions are 5.09 for Virtuoso and 5.1.29 for MySQL. 
You can download and examine:&lt;/p&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/texts/virtuoso.ini&quot; id=&quot;link-id14fe17f0&quot;&gt;Virtuoso configuration file&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/texts/my.cnf&quot; id=&quot;link-id116fe490&quot;&gt;MySQL configuration file&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/texts/create_tables_and_rdf_view.sql&quot; id=&quot;link-id14ce9268&quot;&gt;Table definitions &amp;amp; RDF views&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/texts/mysqlinx.sql&quot; id=&quot;link-id1535e298&quot;&gt;Indexes on MySQL tables&lt;/a&gt; &lt;/li&gt; &lt;/ul&gt; &lt;p&gt; &lt;strike&gt;MySQL ought to do better. We suspect that here, just as in the TPC-D experiment we made way back, the query plans are not quite right. Also we rarely saw over 300% CPU utilization for MySQL. It is possible there is a config parameter that affects this. The public is invited to tell us about such.&lt;/strike&gt; &lt;/p&gt; &lt;p&gt; &lt;b&gt;Update:&lt;/b&gt; &lt;/p&gt; &lt;p&gt;Andreas Schultz of the BSBM team advised us to increase the &lt;code&gt;innodb_buffer_pool_size&lt;/code&gt; setting in the MySQL config. We did and it produced some improvement. Indeed, this is more like it, as we now see CPU utilization around 700% instead of the 300% in the previously published run, which rendered it suspect. Also, our experiments with TPC-D led us to expect better. We ran these things a few times so as to have warm cache.&lt;/p&gt; &lt;p&gt;On the first run, we noticed that the InnoDB warm-up time was somewhere well in excess of 2000 query mixes. Another time, we should make a graph of throughput as a function of time for both MySQL and Virtuoso. We recently made a greedy prefetch hack that should give us some mileage there. 
For the next BSBM, all we can advise is to run a larger-scale system for half an hour first, then measure, and then measure again. If the second measurement is the same as the first, then it is good.&lt;/p&gt; &lt;p&gt;As always, since MySQL is not our specialty, we confidently invite the public to tell us how to make it run faster. So, unless something more turns up, our next trial is a revisit of &lt;a href=&quot;http://dbpedia.org/resource/TPC-H&quot; id=&quot;link-id0x17a20498&quot;&gt;TPC-H&lt;/a&gt;.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>ISWC 2008: Some Questions</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-11-04#1481</atom:id>
  <atom:published>2008-11-04T15:54:42Z</atom:published>
  <atom:updated>2008-11-04T14:37:01-05:00</atom:updated>
  <atom:content type="html">&lt;h2&gt;Inference: Is it always forward chaining?&lt;/h2&gt; &lt;p&gt;We got a number of questions about &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x13c64b60&quot;&gt;Virtuoso&lt;/a&gt;&amp;#39;s inference support. It seems that we are the odd one out, as we do not take it for granted that inference ought to consist of materializing entailment.&lt;/p&gt; &lt;p&gt;Firstly, of course one can materialize all one wants with Virtuoso. The simplest way to do this is using SPARUL. With the recent transitivity extensions to &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x14d17778&quot;&gt;SPARQL&lt;/a&gt;, it is also easy to materialize implications of transitivity with a single statement. Our point is that for trivial entailment such as subclass, sub-property, single transitive property, and &lt;a href=&quot;http://dbpedia.org/resource/Web_Ontology_Language&quot; id=&quot;link-id0x128e55d0&quot;&gt;owl&lt;/a&gt;:sameAs, we do not require materialization, as we can resolve these at run time also, with backward-chaining built into the engine.&lt;/p&gt; &lt;p&gt;For more complex situations, one needs to materialize the entailment. At the present time, we know how to generalize our transitive feature to run arbitrary backward-chaining rules, including recursive ones. We could have a sort of Datalog backward-chaining embedded in our &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x12614770&quot;&gt;SQL&lt;/a&gt;/SPARQL and could run this with good parallelism, as the transitive feature already works with clustering and partitioning without dying of message latency. Exactly when and how we do this will be seen. 
Even if users want entailment to be materialized, such a rule system could be used for producing the materialization at good speed.&lt;/p&gt; &lt;p&gt;We had a word with &lt;a href=&quot;http://web.comlab.ox.ac.uk/people/Ian.Horrocks/&quot; id=&quot;link-id117c99d0&quot;&gt;Ian Horrocks&lt;/a&gt; on the question. He noted that it is often naive of the community to equate description of semantics with description of algorithm. The &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x145b2980&quot;&gt;data&lt;/a&gt; need not always be blown up.&lt;/p&gt; &lt;p&gt;The advantage of not always materializing is that the working set stays better. Once the working set is no longer in memory, response times jump disproportionately. Also, if the data changes or is retracted or is unreliable, one can end up doing a lot of extra work with materialization. Consider the effect of one malicious sameAs statement. This can lead to a lot of effects that are hard to retract. On the other hand, if running in memory with static data such as the LUBM benchmark, the queries run some 20% faster if entailment subclasses and sub-properties are materialized rather than done at run time.&lt;/p&gt; &lt;h2&gt;Genetic Algorithms for SPARQL?&lt;/h2&gt; &lt;p&gt;Our compliments for the wildest idea of the conference go to &lt;a href=&quot;http://www.eyaloren.org/&quot; id=&quot;link-id1a203af8&quot;&gt;Eyal Oren&lt;/a&gt;, &lt;a href=&quot;http://www.few.vu.nl/~cgueret/&quot; id=&quot;link-id16208758&quot;&gt;Christophe Guéret&lt;/a&gt;, and &lt;a href=&quot;http://www.few.vu.nl/~schlobac/&quot; id=&quot;link-id111923e0&quot;&gt;Stefan Schlobach&lt;/a&gt;, &lt;i&gt;et al.&lt;/i&gt;, for their &lt;a href=&quot;http://www.informatik.uni-trier.de/~ley/db/conf/semweb/iswc2008.html#OrenGS08&quot; id=&quot;link-id11793540&quot;&gt;paper on using genetic algorithms for guessing how variables in a SPARQL query ought to be instantiated&lt;/a&gt;. 
Prisoners of our &amp;quot;conventional wisdom&amp;quot; as we are, this might never have occurred to us.&lt;/p&gt; &lt;h2&gt;Schema Last?&lt;/h2&gt; &lt;p&gt;It is interesting to see how the industry comes to the &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id0x12b57e90&quot;&gt;semantic web&lt;/a&gt; conferences talking about schema last while at the same time the traditional semantic web people stress enforcing schema constraints and making more predictably performing and database friendlier logics. So do the extremes converge.&lt;/p&gt; &lt;p&gt;There is a point to schema last. &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x12a8ff48&quot;&gt;RDF&lt;/a&gt; is very good for getting a view of ad hoc or unknown data. One can just load and look at what there is. Also, additions of unforeseen optional properties or relations to the schema are easy and efficient. However, it seems that a really high traffic online application would always benefit from having some application specific data structures. Such could also save considerably in hardware.&lt;/p&gt; &lt;p&gt;It is not a sharp divide between RDF and relational application oriented representation. We have the capabilities in our RDB to RDF mapping. We just need to show this and have SPARUL and data loading&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>ISWC 2008: RDB2RDF Face-to-Face</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-11-04#1477</atom:id>
  <atom:published>2008-11-04T13:26:19Z</atom:published>
  <atom:updated>2008-11-04T17:20:35-05:00</atom:updated>
  <atom:content type="html">&lt;p&gt;The W3C&amp;#39;s RDB-to-&lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x153bdcf8&quot;&gt;RDF&lt;/a&gt; mapping incubator group (&lt;a href=&quot;http://www.w3.org/2005/Incubator/rdb2rdf/&quot; id=&quot;link-id0x13e3e6b8&quot;&gt;RDB2RDF XG&lt;/a&gt;) met in &lt;a href=&quot;http://dbpedia.org/resource/Karlsruhe&quot; id=&quot;link-id0x15236b08&quot;&gt;Karlsruhe&lt;/a&gt; after &lt;a href=&quot;http://iswc2008.semanticweb.org/&quot; id=&quot;link-id0x2450fba8&quot;&gt;ISWC 2008&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;The meeting was about writing a charter for a working group that would define a standard for mapping relational databases to RDF, either for purposes of import into RDF stores or of query mapping from &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x14c84338&quot;&gt;SPARQL&lt;/a&gt; to &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x146db368&quot;&gt;SQL&lt;/a&gt;. There was a lot of agreement and the meeting even finished ahead of the allotted time.&lt;/p&gt; &lt;h2&gt;Whose Identifiers?&lt;/h2&gt; &lt;p&gt;There was discussion concerning using the &lt;a href=&quot;http://dbpedia.org/resource/Entity&quot; id=&quot;link-id0x12c15e58&quot;&gt;Entity&lt;/a&gt; Name Service from the Okkam project for assigning URIs to entities mapped from relational databases. This makes sense when talking about long-lived, legal entities, such as people or companies or geography. Of course, there are cases where this makes no sense; for example, a purchase order or maintenance call hardly needs an identifier registered with the ENS. The problem is, in practice, a CRM could mention customers that have an ENS registered ID (or even several such IDs) and others that have none. Of course, the CRM&amp;#39;s reference cannot depend on any registration. 
Also, even when there is a stable &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Identifier&quot; id=&quot;link-id0x12b7b5c0&quot;&gt;URI&lt;/a&gt; for the entity, a CRM may need a key that specifies some administrative subdivision of the customer.&lt;/p&gt; &lt;p&gt;Also we note that an on-demand RDB-to-RDF mapping may have some trouble dealing with &amp;quot;same as&amp;quot; assertions. If names that are anything other than string forms of the keys in the system must be returned, there will have to be a lookup added to the RDB. This is an administrative issue. Certainly going over the network to ask for names of items returned by queries has a prohibitive cost. It would be good for ad hoc integration to use shared URIs when possible. The trouble of adding and maintaining lookups for these, however, makes this more expensive than just mapping to RDF and using literals for joining between independently maintained systems.&lt;/p&gt; &lt;h2&gt; &lt;a href=&quot;http://dbpedia.org/resource/XML&quot; id=&quot;link-id0x14bf7da0&quot;&gt;XML&lt;/a&gt; or RDF?&lt;/h2&gt; &lt;p&gt;We talked about having a language for human consumption and another for discovery and machine processing of mappings. Would this latter be XML or RDF based? Describing every detail of syntax for a mapping as RDF is really tedious. Also such descriptions are very hard to query, just as &lt;a href=&quot;http://dbpedia.org/resource/Web_Ontology_Language&quot; id=&quot;link-id0x1493ffc0&quot;&gt;OWL&lt;/a&gt; ontologies are. One solution is to have opaque strings embedded into RDF, just like XSLT has &lt;a href=&quot;http://dbpedia.org/resource/XPath&quot; id=&quot;link-id0x1400fe98&quot;&gt;XPath&lt;/a&gt; in string form embedded into XML. Maybe it will end up in this way here also. 
Having a complete XML mapping of the parse tree for mappings, XQueryX-style, could be nice for automatic generation of mappings with XSLT from an XML view of the &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x14c846d8&quot;&gt;information&lt;/a&gt; schema. But then XSLT can also produce text, so an XML syntax that has every detail of a mapping language as distinct elements is not really necessary for this.&lt;/p&gt; &lt;p&gt;Another matter is then describing the RDF generated by the mapping in terms of RDFS or OWL. This would be a by-product of declaring the mapping. Most often, I would presume the target ontology to be given, though, reducing the need for this feature. But if RDF mapping is used for discovery of &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x14f6f128&quot;&gt;data&lt;/a&gt;, such a description of the exposed data is essential.&lt;/p&gt; &lt;h2&gt;Interoperability&lt;/h2&gt; &lt;p&gt;We agreed with &lt;a href=&quot;http://www.informatik.uni-leipzig.de/~auer/foaf.rdf#me&quot; id=&quot;link-id0x1e776730&quot;&gt;Sören Auer&lt;/a&gt; that we could make &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1477ad18&quot;&gt;Virtuoso&lt;/a&gt;&amp;#39;s mapping language compatible with &lt;a href=&quot;http://triplify.org/&quot; id=&quot;link-id0x15514388&quot;&gt;Triplify&lt;/a&gt;. Triplify is very simple, extraction only, no SPARQL, but does have the benefit of expressing everything in SQL. As it happens, I would be the last person to tell a web developer what language to program in. So if it is SQL, then let it stay SQL. Technically, a lot of the information the Virtuoso mapping expresses is contained in the Triplify SQL statements, but not all. Some extra declarations are needed still but can have reasonable defaults.&lt;/p&gt; &lt;p&gt;There are two ways of stating a mapping. Virtuoso starts with the triple and says which tables and columns will produce the triple. 
Triplify starts with the SQL statement and says what triples it produces. These are fairly equivalent. For the web developer, the latter is likely more self-evident, while the former may be more compact and have less repetition.&lt;/p&gt; &lt;p&gt;Virtuoso and Triplify alone would give us the two interoperable implementations required from a working group, supposing the language were annotations on top of SQL. This would be a guarantee of delivery, as we would be close enough to the result from the get go.&lt;/p&gt; &lt;h2&gt;Related Web resources&lt;/h2&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VOSSQL2RDF&quot; id=&quot;link-id14e27040&quot;&gt;OpenLink Virtuoso: Open-Source Edition: Mapping SQL Data to RDF&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/pdf/Virtuoso_SQL_to_RDF_Mapping.pdf&quot; id=&quot;link-id1baad3a8&quot;&gt;Virtuoso RDF Views - Getting Started Guide (PDF)&lt;/a&gt; &lt;/li&gt; &lt;/ul&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>ISWC 2008: The Scalable Knowledge Systems Workshop</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-11-03#1473</atom:id>
  <atom:published>2008-11-03T13:16:47Z</atom:published>
  <atom:updated>2008-11-03T12:33:54-05:00</atom:updated>
  <atom:content type="html">&lt;p&gt;Mike Dean of &lt;a href=&quot;http://dbpedia.org/resource/BBN_Technologies&quot; id=&quot;link-id0x25699878&quot;&gt;BBN Technologies&lt;/a&gt; opened the Scalable &lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id0x1ed01750&quot;&gt;Knowledge&lt;/a&gt; Systems Workshop with an invited talk. He reminded us of the facts of nature as concern the cost of distributed computing and running out of space for the working set. Developers in the &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id0x21fbb9a8&quot;&gt;semantic web&lt;/a&gt; field deplorably often ignore these facts, or alternatively recognize them and admit that they are unbeatable, that one just can&amp;#39;t join across partitions.&lt;/p&gt; &lt;p&gt;I gave a talk about the &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x20b6e020&quot;&gt;Virtuoso&lt;/a&gt; Cluster edition, wherein I repeated essentially the same ground facts as Mike and outlined how we (in spite of these) profit from distributed memory multiprocessing. To those not intimate with these questions, let me affirm that deriving benefit from threading in a symmetric multiprocessor box, let alone a cluster connected by a network, totally depends on having many relatively long running things going at a time and blocking as seldom as possible.&lt;/p&gt; &lt;p&gt;Further, Mike Dean talked about &lt;a href=&quot;http://www.asio.bbn.com/&quot; id=&quot;link-id0x222252f0&quot;&gt;ASIO&lt;/a&gt;, the BBN suite of semantic web tools. His most challenging statement was about the storage engine, a network-database-inspired triple-store using memory-mapped files. &lt;/p&gt; &lt;p&gt;Will the &lt;a href=&quot;http://dbpedia.org/resource/CODASYL&quot; id=&quot;link-id0x222d8730&quot;&gt;CODASYL&lt;/a&gt; days come back, and will the linked list on disk be the way to store triples/quads? 
I would say that this will have, especially with a memory-mapped file, probably a better best case than a B-tree, but that this also will be less predictable with fragmentation. With Virtuoso, using a B-tree index, we see about 20-30% of CPU time spent on index lookup when running LUBM queries. With a disk-based memory-mapped linked-list storage, we would see some improvements in this while getting hit probably worse than now in the case of fragmentation. Plus compaction on the fly would not be nearly as easy and surely far less local, if there were pointers between pages. So it is my intuition that trees are a safer bet with varying workloads while linked lists can be faster in a query-dominated in-memory situation.&lt;/p&gt; &lt;p&gt;Chris Bizer presented the &lt;a href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x22e41c40&quot;&gt;Berlin SPARQL Benchmark&lt;/a&gt; (&lt;a href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x1c909960&quot;&gt;BSBM&lt;/a&gt;), which has already been discussed here in some detail. He did acknowledge that the next round of the race must have a real steady-state rule. This just means that the benchmark must be run long enough for the system under test to reach a state where the cache is full and the performance remains indefinitely at the same level. 
Reaching steady state can take 20-30 minutes in some cases.&lt;/p&gt; &lt;p&gt;Regardless of steady state, BSBM has two generally valid conclusions: &lt;/p&gt; &lt;ol&gt; &lt;li&gt;mapping relational to &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x21d01890&quot;&gt;RDF&lt;/a&gt;, where possible, is faster than triple storage; and &lt;/li&gt; &lt;li&gt;the equivalent relational solution can be some 10x faster than the pure triples representation.&lt;/li&gt; &lt;/ol&gt; &lt;p&gt;Mike Dean asked whether BSBM was a case of a setup to have triple stores fail. Not necessarily, I would say; we should understand that one motivation of BSBM is testing mapping technologies. Therefore it must have a workload where mapping makes sense. Of course there are workloads where triples are unchallenged; take the &lt;a href=&quot;http://challenge.semanticweb.org/&quot; id=&quot;link-id0x1feb9250&quot;&gt;Billion Triples Challenge&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1fe12b60&quot;&gt;data&lt;/a&gt; set for one.&lt;/p&gt; &lt;p&gt;Also, with BSBM, one should note that the query optimization time plays a fairly large role since most queries touch relatively little data. Also, even if the scale is large, the working set is not nearly the size of the database. This in fact penalizes mapping technologies against native &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x1e275c88&quot;&gt;SQL&lt;/a&gt; since the difference there is compiling the query, especially since parameters are not used. 
So, Chris, since we both like to map, let&amp;#39;s make a benchmark that shows mapping closer to native SQL.&lt;/p&gt; &lt;h2&gt;Bridging the 10x Gap?&lt;/h2&gt; &lt;p&gt;When we run Virtuoso relational against Virtuoso triple store with the &lt;a href=&quot;http://dbpedia.org/resource/TPC-H&quot; id=&quot;link-id0x22046d88&quot;&gt;TPC-H&lt;/a&gt; workload, we see that the relational case is significantly faster. These are long queries, thus query optimization time is negligible; we are here comparing memory-based access times. Why is this? The answer is that a single index lookup gives multiple column values with almost no penalty for the extra column. Also, since the number of total joins is lower, the overhead coming from moving from join to next join is likewise lower. This is just a matter of the count of executed instructions.&lt;/p&gt; &lt;p&gt;A column store joins in principle just as much as a triple store. However, since the BI workload often consists of scanning over large tables, the joins tend to be local; the needed lookup can often use the previous location as a starting point. A triple store can do the same if queries have high locality. We do this in some SQL situations and can try this with triples also. The RDF workload is typically more random in its access pattern, though. The other factor is the length of control path. A column store has a simpler control flow if it knows that the column will have exactly one value per row. With RDF, this is not a given. Also, the column store&amp;#39;s row is identified by a single number and not a multipart key. These two factors give the column store running with a fixed schema some edge over the more generic RDF quad store.&lt;/p&gt; &lt;p&gt;There was some discussion on how much closer a triple store could come to a relational one. Some gains are undoubtedly possible. We will see. 
For the ideal row store workload, the &lt;a href=&quot;http://dbpedia.org/resource/Relational_database_management_system&quot; id=&quot;link-id0x22f837c0&quot;&gt;RDBMS&lt;/a&gt; will continue to have some edge. Large online systems typically have a large part of the workload that is simple and repetitive. There is nothing to prevent one from having special indices for supporting such a workload, even while retaining the possibility of arbitrary triples elsewhere. Some degree of application-specific data structure does make sense. We just need to show how this is done. In this way, we have a continuum and not an either/or choice of triples vs. tables.&lt;/p&gt; &lt;h2&gt;Scale, Where Next?&lt;/h2&gt; &lt;p&gt;Concerning the future direction of the workshop, there were a few directions suggested. One of the more interesting ones was Mike Dean&amp;#39;s suggestion about dealing with a large volume of same-as assertions, specifically a volume where materializing all the entailed triples was no longer practical. Of course, there is the question of scale. This time, we were the only ones focusing on a parallel database with no restrictions on joining.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso - Are We Too Clever for Our Own Good? (updated)</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-10-26#1467</atom:id>
  <atom:published>2008-10-26T12:15:35Z</atom:published>
  <atom:updated>2008-10-27T12:07:58-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;&amp;quot;Physician, heal thyself,&amp;quot; it is said. We profess to say what the messaging of the &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id0x1b4a25f0&quot;&gt;semantic web&lt;/a&gt; ought to be, but is our own perfect?&lt;/p&gt; &lt;p&gt;I will here engage in some critical introspection as well as amplify on some answers given to &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1e4f9928&quot;&gt;Virtuoso&lt;/a&gt;-related questions in recent times.&lt;/p&gt; &lt;p&gt;I use some conversations from the &lt;a href=&quot;http://dbpedia.org/resource/Vienna&quot; id=&quot;link-id0x1e6c0ca8&quot;&gt;Vienna&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x1e56df88&quot;&gt;Linked Data&lt;/a&gt; Practitioners meeting as a starting point. These views are mine and are limited to the Virtuoso server. These do not apply to the &lt;a href=&quot;http://dbpedia.org/resource/OpenLink_Data_Spaces&quot; id=&quot;link-id0x1e680440&quot;&gt;ODS&lt;/a&gt; (&lt;a href=&quot;http://dbpedia.org/resource/OpenLink_Data_Spaces&quot; id=&quot;link-id0x1e140068&quot;&gt;OpenLink Data Spaces&lt;/a&gt;) applications line, &lt;a href=&quot;http://oat.openlinksw.com/&quot; id=&quot;link-id0x1f4ba630&quot;&gt;OAT&lt;/a&gt; (&lt;a href=&quot;http://oat.openlinksw.com/&quot; id=&quot;link-id0x1ba4bac8&quot;&gt;OpenLink Ajax Toolkit&lt;/a&gt;), or &lt;a href=&quot;http://ode.openlinksw.com/&quot; id=&quot;link-id0x1d4159b0&quot;&gt;ODE&lt;/a&gt; (&lt;a href=&quot;http://ode.openlinksw.com/&quot; id=&quot;link-id0x1e973c80&quot;&gt;OpenLink Data Explorer&lt;/a&gt;).&lt;/p&gt; &lt;h3&gt;&amp;quot;It is not always clear what the main thrust is, we get the impression that you are spread too thin,&amp;quot; said &lt;a href=&quot;http://www.informatik.uni-leipzig.de/~auer/foaf.rdf#me&quot; id=&quot;link-id0x1f8bafe0&quot;&gt;Sören Auer&lt;/a&gt;.&lt;/h3&gt; 
&lt;p&gt;Well, personally, I am all for core competence. This is why I do not participate in all the online conversations and groups as much as I could, for example. Time and energy are critical resources and must be invested where they make a difference. In this case, the real core competence is running in the database race. This in itself, come to think of it, is a pretty broad concept.&lt;/p&gt; &lt;p&gt;This is why we put a lot of emphasis on Linked Data and the &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x200bd1f0&quot;&gt;Data&lt;/a&gt; Web for now, as this is the emerging game. This is a deliberate choice, not an outside imperative or built-in limitation. More specifically, this means exposing any pre-existing relational data as linked data plus being the definitive &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1fb03528&quot;&gt;RDF&lt;/a&gt; store.&lt;/p&gt; &lt;p&gt;We can do this because we own our database and &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x1e7dcc70&quot;&gt;SQL&lt;/a&gt; and data access middleware and have a history of connecting to any &lt;a href=&quot;http://dbpedia.org/resource/Relational_database_management_system&quot; id=&quot;link-id0x1e9baf18&quot;&gt;RDBMS&lt;/a&gt; out there.&lt;/p&gt; &lt;p&gt;The principal message we have been hearing from the RDF field is the call for scale of triple storage. This is even louder than the call for relational mapping. We believe that in time mapping will exceed triple storage as such, once we get some real production strength mappings deployed, enough to outperform RDF warehousing.&lt;/p&gt; &lt;p&gt;There are also RDF middleware things like RDF-ization and demand-driven web harvesting (i.e, the so-called Sponger). These are &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x1f5f6b78&quot;&gt;SPARQL&lt;/a&gt; options, thus accessed via standard interfaces. 
We have little desire to create our own languages or APIs, or to tell people how to program. This is why we recently introduced &lt;a href=&quot;http://sourceforge.net/projects/sesame/&quot; id=&quot;link-id0x206818c8&quot;&gt;Sesame&lt;/a&gt;- and &lt;a href=&quot;http://jena.sourceforge.net/&quot; id=&quot;link-id0x202b3348&quot;&gt;Jena&lt;/a&gt;-compatible APIs to our RDF store. From what we hear, these work. On the other hand, we do not hesitate to move beyond the standards when there is obvious value or necessity. This is why we brought SPARQL up to and beyond SQL expressivity. It is not a case of E3 (Embrace, Extend, Extinguish).&lt;/p&gt; &lt;p&gt;Now, this message could be better reflected in our material on the web. This &lt;a href=&quot;http://dbpedia.org/resource/Blog&quot; id=&quot;link-id0x1c82e508&quot;&gt;blog&lt;/a&gt; is a rather informal step in this direction; more is to come. For now we concentrate on delivering.&lt;/p&gt; &lt;p&gt;The conventional communications wisdom is to split the message by target audience. For this, we should split the RDF, relational, and web services messages from each other. We believe that a challenger, like the semantic web technology stack, must have a compelling message to tell for it to be interesting. This is not a question of research prototypes. The new technology cannot lack something the installed technology takes for granted.&lt;/p&gt; &lt;p&gt;This is why we do not tend to show things like how to insert and query a few triples: No business out there will insert and query triples for the sake of triples. There must be a more compelling story: for example, turning the whole world into a database. This is why our examples start with things like turning the &lt;a href=&quot;http://dbpedia.org/resource/TPC-H&quot; id=&quot;link-id0x20832510&quot;&gt;TPC-H&lt;/a&gt; database into RDF, queries and all. Anything less is not interesting. 
Why would an enterprise that has business intelligence and integration issues way more complex than the rather stereotypical TPC-H even look at a technology that pretends to be all for integration and all for expressivity of queries, yet cannot answer the first question of the entry exam?&lt;/p&gt; &lt;p&gt;The world out there is complex. But maybe we ought to make some simple tutorials? So, as a call to the people out there, tell us what a good tutorial would be. The question is more about figuring out what is out there and adapting these and making a sort of compatibility list. Jena and Sesame stuff ought to run as is. We could offer a webinar to all the data web luminaries showing how to promote the data web message with Virtuoso. After all, why not show it on the best platform?&lt;/p&gt; &lt;h3&gt;&amp;quot;You are arrogant. When I read your papers or documentation, the impression I get is that you say you are smart and the reader is stupid.&amp;quot;&lt;/h3&gt; &lt;p&gt;We should answer in multiple parts.&lt;/p&gt; &lt;p&gt;For general collateral, like web sites and documentation:&lt;/p&gt; &lt;p&gt;The web site gives a confused product image. For the Virtuoso product, we should divide at the top into&lt;/p&gt; &lt;ul&gt; &lt;li&gt; Data web and RDF - Host linked data, expose relational assets as linked data;&lt;/li&gt; &lt;li&gt; Relational Database - Full function, high performance, open source, Federated/Virtual Relational DBMS, expose heterogeneous RDB assets through one point of contact for integration;&lt;/li&gt; &lt;li&gt; Web Services - access all the above over standard protocols, dynamic web pages, web hosting.&lt;/li&gt; &lt;/ul&gt; &lt;p&gt;For each point, one simple statement. We all know what the above things mean?&lt;/p&gt; &lt;p&gt;Then we add a new point about scalability that impacts all the above, namely the Virtuoso version 6 Cluster, meaning that you can do all these things at 10 to 1000 times the scale. 
This means this much more data or in some cases this many more requests per second. This too is clear.&lt;/p&gt; &lt;p&gt;As far as I am concerned, hosting Java or .&lt;a href=&quot;http://dbpedia.org/resource/.NET_Framework&quot; id=&quot;link-id0x20283a88&quot;&gt;NET&lt;/a&gt; does not have to be on the front page. Also, we have no great interest in going against &lt;a href=&quot;http://dbpedia.org/resource/Apache&quot; id=&quot;link-id0x2024a068&quot;&gt;Apache&lt;/a&gt; when it comes to a web-server-only situation. The fact that we have a web listener is important for some things but our claim to fame does not rest on this.&lt;/p&gt; &lt;p&gt;Then for documentation and training materials: The documentation should be better. Specifically it should have more of a how-to dimension since nobody reads the whole thing anyhow. About online tutorials, the order of presentation should be different. They do not really reflect what is important at the present moment either.&lt;/p&gt; &lt;p&gt;Now for conference papers: Since taking the data web as a focus area, we have submitted some papers and had some rejected because these do not have enough references and do not explain what is obvious to ourselves.&lt;/p&gt; &lt;p&gt;I think that the communications failure in this case is that we want to talk about end to end solutions and the reviewers expect research. For us, the solution is interesting and exists only if there is an adequate functionality mix for addressing a specific use case. This is why we do not make a paper about query cost model alone because the cost model, while indispensable, is a thing that is taken for granted where we come from. So we mention RDF adaptations to cost model, as these are important to the whole but do not find these to be the justification for a whole paper. If we made papers on this basis, we would have to make five times as many. 
Maybe we ought to.&lt;/p&gt; &lt;h3&gt;&amp;quot;Virtuoso is very big and very difficult&amp;quot;&lt;/h3&gt; &lt;p&gt;One thing that is not obvious from the Virtuoso packaging is that the minimum installation is an executable under 10MB and a config file. Two files.&lt;/p&gt; &lt;p&gt;This gives you SQL and SPARQL out of the box. Adding &lt;a href=&quot;http://dbpedia.org/resource/Open_Database_Connectivity&quot; id=&quot;link-id0x1ee61058&quot;&gt;ODBC&lt;/a&gt; and &lt;a href=&quot;http://dbpedia.org/resource/Java_Database_Connectivity&quot; id=&quot;link-id0x1b8c31c0&quot;&gt;JDBC&lt;/a&gt; clients is as simple as it gets. After this, there is basic database functionality. Tuning is a matter of a few parameters that are explained on this blog and elsewhere. Also, the full scale installation is available as an Amazon EC2 image, so no installation required.&lt;/p&gt; &lt;p&gt;Now for the difficult side:&lt;/p&gt; &lt;p&gt;Use SQL and SPARQL; use stored procedures whenever there is server side business logic. For some time critical web pages, use VSP. Do not use VSPX. Otherwise, use whatever you are used to: &lt;a href=&quot;http://dbpedia.org/resource/PHP&quot; id=&quot;link-id0x20a13c00&quot;&gt;PHP&lt;/a&gt; or Java or anything else. For web services, simple is best. Stick to basics. &amp;quot;The engineer is one who can invent a simple thing.&amp;quot; Use SQL statements rather than admin UI.&lt;/p&gt; &lt;p&gt;Know that you can start a server with no database file and you get an initial database with nothing extra. The demo database, the way it is produced by installers is cluttered.&lt;/p&gt; &lt;p&gt;We should put this into a couple of use case oriented how-tos.&lt;/p&gt; &lt;p&gt;Also, we should create a network of &amp;quot;friendly local virtuoso geeks&amp;quot; for providing basic training and services so we do not have to explain these things all the time. To all you data-web-ers out there: please sign up and we will provide instructions, etc. 
Contact Yrjänä Rankka (ghard[at-sign]openlinksw.com), or go through the mailing lists; do not contact me directly.&lt;/p&gt; &lt;h3&gt;&amp;quot;OK, we understand that you may be good at the large end of the spectrum but how do you reconcile this with the lightweight or embedded end, like the semantic desktop?&amp;quot;&lt;/h3&gt; &lt;p&gt;Now, what is good for one end is usually good for the other. Namely, a database, no matter the scale, needs to have space efficient storage, fast index lookup, and correct query plans. Then there are things that occur only at the high-end, like clustering, but these are separate things. For embedding, the initial memory footprint needs to be small. With Virtuoso, this is accomplished by leaving out some 200 built-in tables and 100,000 lines of SQL procedures that are normally in by default, supporting things such as DAV and diverse other protocols. After all, if SPARQL is all one wants these are not needed.&lt;/p&gt; &lt;p&gt;If one really wants to do one&amp;#39;s server logic (like web listener and thread dispatching) oneself, this is not impossible but requires some advice from us. On the other hand, if one wants to have logic for security close to the data, then using stored procedures is recommended; these execute right next to the data, and support inline SPARQL and SQL. Depending on the license status of the other code, some special licensing arrangements may apply.&lt;/p&gt; &lt;p&gt;We are talking about such things with different parties at present.&lt;/p&gt; &lt;h3&gt;&amp;quot;How webby are you? What is webby?&amp;quot;&lt;/h3&gt; &lt;p&gt;&amp;quot;Webby means distributed, heterogeneous, open; not monolithic consolidation of everything.&amp;quot;&lt;/p&gt; &lt;p&gt;We are philosophically webby. We come from open standards; we are after all called OpenLink; our history consists of connecting things. We believe in choice: the user should be able to pick the best of breed for components and have them work together. 
We cannot and do not wish to force replacement of existing assets. Transforming data on the fly and connecting systems, leaving data where it originally resides, is the first preference. For the data web, the first preference is a federation of independent SPARQL end points. When there is harvesting, we prefer to do it on demand, as with our Sponger. With the immense amount of data out there we believe in finding what is relevant &lt;i&gt;when&lt;/i&gt; it is relevant, preferably close at hand, leveraging things like social networks. With a data web, many things which are now siloized, such as marketplaces and social networks, will return to the open.&lt;/p&gt; &lt;p&gt;Google-style crawling of everything becomes less practical if one needs to run complex &lt;i&gt;ad hoc&lt;/i&gt; queries against the mass of data. For these types of scenarios, if one needs to warehouse, the data cloud will offer solutions where one pays for database on demand. While we believe in loosely coupled federation where possible, we have serious work on the scalability side for the data center and the compute-on-demand cloud.&lt;/p&gt; &lt;h3&gt;&amp;quot;How does OpenLink see the next five years unfolding?&amp;quot;&lt;/h3&gt; &lt;p&gt;Personally, I think we have the basics for the birth of a new inflection in the &lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id0x1fb9ae58&quot;&gt;knowledge&lt;/a&gt; economy. The &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Identifier&quot; id=&quot;link-id0x1f07c648&quot;&gt;URI&lt;/a&gt; is the unit of exchange; its value and competitive edge lie in the data it links you with. A name without context is worth little, but as a name gets more use, more &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x1f007d60&quot;&gt;information&lt;/a&gt; can be found through that name. This is anything from financial statistics, to legal precedents, to news reporting or government data. 
Right now, if the SEC just added one line of markup to the XBRL template, this would instantaneously make all SEC-mandated reporting into linked data via GRDDL.&lt;/p&gt; &lt;p&gt;The URI is a carrier of brand. An information brand gets traffic and references, and this can be monetized in diverse ways. The key word is &lt;i&gt;context&lt;/i&gt;. Information overload is here to stay, and only better context offers the needed increase in productivity to stay ahead of the flood.&lt;/p&gt; &lt;p&gt;Semantic technologies on the whole can help with this. Why these should be semantic web or data web technologies as opposed to just semantic is the linked data value proposition. Even smart islands are still islands. Agility, scale, and scope depend on the possibility of combining things. Therefore common terminologies and dereferenceability and discoverability are important. Without these, we are at best dealing with closed systems even if they were smart. The expert systems of the 1980s are a case in point.&lt;/p&gt; &lt;p&gt;Ever since the .com era, the &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Locator&quot; id=&quot;link-id0x2048e670&quot;&gt;URL&lt;/a&gt; has been a brand. Now it becomes a URI. Thus, entirely hiding the URI from the user experience is not always desirable. The URI is a sort of handle on the provenance and where more can be found; besides, people are already used to these.&lt;/p&gt; &lt;p&gt;With linked data, information value-add products become easy to build and deploy. They can be basically just canned SPARQL queries combining data in a useful and insightful manner. And where there is traffic there can be monetization, whether by advertising, subscription, or other means. Such possibilities are a natural adjunct to the blogosphere. To publish analysis, one no longer needs to be a think tank or media company. We could call this scenario the birth of a meshup economy.&lt;/p&gt; &lt;p&gt;For OpenLink itself, this is our roadmap. 
The immediate future is about getting our high end offerings like clustered RDF storage generally available, both on the cloud and for private data centers. Ourselves, we will offer the whole &lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id0x1c696170&quot;&gt;Linked Open Data&lt;/a&gt; cloud as a database. The single feature to come in version 2 of this is fully automatic partitioning and repartitioning for on-demand scale; now, you have to choose how many partitions you have.&lt;/p&gt; &lt;p&gt;This makes some things possible that were hard thus far.&lt;/p&gt; &lt;p&gt;On the mapping front, we go for real-scale data integration scenarios where we can show that SPARQL can unify terms and concepts across databases, yet bring no added cost for complex queries. Enterprises can use their existing warehouses and have an added level of abstraction, the possibility of cross systems interlinking, the advantages of using the same taxonomies and ontologies across systems, and so forth.&lt;/p&gt; &lt;p&gt;Then there will be developments in the direction of smarter web harvesting on demand with the Virtuoso &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/VirtSpongerWhitePaper.html&quot; id=&quot;link-id0x206ab780&quot;&gt;Sponger&lt;/a&gt;, and federation of heterogeneous SPARQL end points. The federation is not so unlike clustering, except the time scales are 2 orders of magnitude longer. The work on SPARQL end point statistics and data set description and discovery is a good development in the community.&lt;/p&gt; &lt;p&gt;Then there will be NLP integration, as exemplified by the Open Calais linked data wrapper and more.&lt;/p&gt; &lt;p&gt;Can we pull this off or is this being spread too thin? We know from experience that all this can be accomplished. Scale is already here; we show it with the billion triples set. Mapping is here; we showed it last in the Berlin Benchmark. 
We will also show some TPC-H results after we get a little quiet after the ISWC event. Then there is ongoing maintenance but with this we have shown a steady turnaround and quick time to fix for pretty much anything.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>State of the Semantic Web, Part 2 - The Technical Questions (updated)</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-10-26#1466</atom:id>
  <atom:published>2008-10-26T12:02:43Z</atom:published>
  <atom:updated>2008-10-27T11:28:14-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;Here I will talk about some more technical questions that came up. This is mostly general; &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x205901a0&quot;&gt;Virtuoso&lt;/a&gt; specific questions and answers are separate. &lt;/p&gt; &lt;h3&gt;&amp;quot;How to Bootstrap? Where will the triples come from?&amp;quot;&lt;/h3&gt; &lt;p&gt;There are already wrappers producing &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x13519ac8&quot;&gt;RDF&lt;/a&gt; from many applications. Since any structured or semi-structured &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1c93b418&quot;&gt;data&lt;/a&gt; can be converted to RDF and often there is even a pre-existing terminology for the application domain, the availability of the data &lt;i&gt;per se&lt;/i&gt; is not the concern.&lt;/p&gt; &lt;p&gt;The triples may come from any application or database, but they will not come from the end user directly. There was a good talk about photograph annotation in &lt;a href=&quot;http://dbpedia.org/resource/Vienna&quot; id=&quot;link-id0x1ea9d150&quot;&gt;Vienna&lt;/a&gt;, describing many ways of deriving metadata for photos. The essential wisdom is annotating on the spot and wherever possible doing so automatically. The consumer is very unlikely to go annotate photos after the fact. Further, one can infer that photos made with the same camera around the same time are from the same location. There are other such heuristics. In this use case, the end user does not need to see triples. There is some benefit though in using commonly used geographical terminology for linking to other data sources.&lt;/p&gt; &lt;h3&gt;&amp;quot;How will one develop applications?&amp;quot;&lt;/h3&gt; &lt;p&gt;I&amp;#39;d say one will develop them much the same way as thus far. 
In &lt;a href=&quot;http://dbpedia.org/resource/PHP&quot; id=&quot;link-id0x207fca00&quot;&gt;PHP&lt;/a&gt;, for example. Whether one&amp;#39;s query language is &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x20a5fde0&quot;&gt;SPARQL&lt;/a&gt; or &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x1a0bb5e0&quot;&gt;SQL&lt;/a&gt; does not make a large difference in how basic web UI is made.&lt;/p&gt; &lt;p&gt;A SPARQL end-point is no more an end-user item than a SQL command-line is.&lt;/p&gt; &lt;p&gt;A common mistake among techies is that they think the data structure and user experience can or ought to be of the same structure. The UI dialogs do not, for example, have to have a 1:1 correspondence with SQL tables.&lt;/p&gt; &lt;p&gt;The idea of generating UI from data, whether relational or data-web, is so seductive that generation upon generation of developers fall for it, repeatedly. Even I, at OpenLink, after supposedly having been around the block a couple of times made some experiments around the topic. What does make sense is putting a thin wrapper or HTML around the application, using XSLT and such for formatting. Since the model does allow for unforeseen properties of data, one can build a viewer for these alongside the regular forms. For this, Ajax technologies like &lt;a href=&quot;http://oat.openlinksw.com/&quot; id=&quot;link-id0x1e91d118&quot;&gt;OAT&lt;/a&gt; (the &lt;a href=&quot;http://oat.openlinksw.com/&quot; id=&quot;link-id0x174b7950&quot;&gt;OpenLink AJAX Toolkit&lt;/a&gt;) will be good.&lt;/p&gt; &lt;p&gt;The UI ought not to completely hide the URIs of the data from the user. It should offer a drill down to faceted views of the triples for example. Remember when Xerox talked about graphical user interfaces in 1980? &amp;quot;Don&amp;#39;t mode me in&amp;quot; was the slogan, as I recall.&lt;/p&gt; &lt;p&gt;Since then, we have vacillated between modal and non-modal interaction models. 
Repetitive workflows like order entry go best modally and are anyway being replaced by web services. Also workflows that are very infrequent benefit from modality; take personal network setup wizards, for example. But enabling the &lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id0x1ea14610&quot;&gt;knowledge&lt;/a&gt; worker is a domain that by its nature must retain some respect for human intelligence and not kill this by denying access to the underlying data, including provenance and URIs. Face it: the world is not getting simpler. It is increasingly data dependent and when this is so, having semantics and flexibility of access for the data is important.&lt;/p&gt; &lt;p&gt;For a real-time task-oriented user interface like a fighter plane cockpit, one will not show URIs unless specifically requested. For planning fighter sorties though, there is some potential benefit in having all data such as friendly and hostile assets, geography, organizational structure, etc., as &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x207bcd20&quot;&gt;linked data&lt;/a&gt;. It makes for more flexible querying. Linked data does not &lt;i&gt;per se&lt;/i&gt; mean open, so one can be joinable with open data through using the same identifiers even while maintaining arbitrary levels of security and compartmentalization.&lt;/p&gt; &lt;p&gt;For automating tasks that every time involve the same data and queries, RDF has no intrinsic superiority. Thus the user interfaces in places where RDF will have real edge must be more capable of &lt;i&gt;ad hoc&lt;/i&gt; viewing and navigation than regular real-time or line of business user interfaces.&lt;/p&gt; &lt;p&gt;The &lt;a href=&quot;http://ode.openlinksw.com/&quot; id=&quot;link-id0x2083a6f0&quot;&gt;OpenLink Data Explorer&lt;/a&gt; idea of a &amp;quot;data behind the web page&amp;quot; view goes in this direction. Read the web as before, then hit a switch to go to the data view. 
There are and will be separate clarifications and demos about this.&lt;/p&gt; &lt;h3&gt;&amp;quot;What of the proliferation of standards? Does this not look too tangled, no clear identity? How would one know where to begin?&amp;quot;&lt;/h3&gt; &lt;p&gt;When &lt;a href=&quot;http://www.w3.org/2001/sw/sweo/&quot; id=&quot;link-id0x1e8eac68&quot;&gt;SWEO&lt;/a&gt; was beginning, there was an endlessly protracted discussion of the so-called layer cake. This acronym jungle is not good messaging. Just say linked, flexibly repurpose-able data, and rich vocabularies and structure. Just the right amount of structure for the application, less rigid and easier to change than relational.&lt;/p&gt; &lt;p&gt;Do not even mention the different serialization formats. Just say that it fits on top of the accepted web infrastructure: &lt;a href=&quot;http://dbpedia.org/resource/Hypertext_Transfer_Protocol&quot; id=&quot;link-id0x1e3806b8&quot;&gt;HTTP&lt;/a&gt;, URIs, and &lt;a href=&quot;http://dbpedia.org/resource/XML&quot; id=&quot;link-id0x1f547288&quot;&gt;XML&lt;/a&gt; where desired.&lt;/p&gt; &lt;p&gt;It is misleading to say inference is a box at some specific place in the diagram. Inference of different types may or may not take place at diverse points, whether presentation or storage, on demand or as a preprocessing step. Since there is structure and semantics, inference is possible if desired.&lt;/p&gt; &lt;h3&gt;&amp;quot;Can I make a social network application in RDF only, with no &lt;a href=&quot;http://dbpedia.org/resource/Relational_database_management_system&quot; id=&quot;link-id0x20553ee0&quot;&gt;RDBMS&lt;/a&gt;?&amp;quot;&lt;/h3&gt; &lt;p&gt;Yes, in principle, but what do you have in mind? The answer is very context dependent. The person posing the question had an E-learning system in mind, with things such as course catalogues, course material, etc. In such a case, RDF is a great match, especially since the user count will not be in the millions. 
No university has that many students and anyway they do not hang around online browsing the course catalogue.&lt;/p&gt; &lt;p&gt;On the other hand, if I think of making a social network site with RDF as the exclusive data model, I see things that would be very inefficient. For example, keeping a count of logins or the last time of login would be by default several times less efficient than with an RDBMS.&lt;/p&gt; &lt;p&gt;If some application is really large scale and has a knowable workload profile, like any social network does, then some task-specific data structure is simply economical. This does not mean that the application language cannot be SPARQL, but it does mean that the storage format must be tuned to favor some operations over others, relational style. This is a matter of cost more than of feasibility. Ten servers cost less than a hundred and have failures ten times less frequently.&lt;/p&gt; &lt;p&gt;In the near term we will see the birth of an application paradigm for the data web. The data will be an open, exposed, first-class citizen; yet the user experience will not have to be a 1:1 image of the data.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>State of the Semantic Web, Part 1 - Sociology, Business, and Messaging (update 2)</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-10-24#1460</atom:id>
  <atom:published>2008-10-24T10:19:03Z</atom:published>
  <atom:updated>2008-10-27T11:28:07-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;I was in &lt;a href=&quot;http://dbpedia.org/resource/Vienna&quot; id=&quot;link-id0x1f18a540&quot;&gt;Vienna&lt;/a&gt; for the &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x1ec788a0&quot;&gt;Linked Data&lt;/a&gt; Practitioners gathering this week. Danny Ayers asked me if I would &lt;a href=&quot;http://dbpedia.org/resource/Blog&quot; id=&quot;link-id0x20838238&quot;&gt;blog&lt;/a&gt; about the State of the &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id0x20694ed8&quot;&gt;Semantic Web&lt;/a&gt; or write the &lt;i&gt;This Week&amp;#39;s Semantic Web&lt;/i&gt; column. I don&amp;#39;t have the time to cover all that may have happened during the past week but I will editorialize about the questions that again were raised in Vienna. How these things relate to &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x20b1cd38&quot;&gt;Virtuoso&lt;/a&gt; will be covered separately. This is about the overarching questions of the times, not the finer points of geek craft.&lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://www.informatik.uni-leipzig.de/~auer/foaf.rdf#me&quot; id=&quot;link-id0x1ff31b30&quot;&gt;Sören Auer&lt;/a&gt; asked me to say a few things about relational to &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1f8118e0&quot;&gt;RDF&lt;/a&gt; mapping. I will cite some highlights from this, as they pertain to the general scene. There was an &amp;quot;open hacking&amp;quot; session Wednesday night featuring lightning talks. I will use some of these too as a starting point.&lt;/p&gt; &lt;h3&gt;The messaging?&lt;/h3&gt; &lt;p&gt;The &lt;a href=&quot;http://www.w3.org/2001/sw/sweo/&quot; id=&quot;link-id0x1dc39210&quot;&gt;SWEO&lt;/a&gt; (Semantic Web Education and Outreach) interest group of the W3C spent some time looking for an elevator pitch for the Semantic Web. 
It became &amp;quot;&lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1f24dd98&quot;&gt;Data&lt;/a&gt; Unleashed.&amp;quot; Why not? Let&amp;#39;s give this some context.&lt;/p&gt; &lt;p&gt;So, if we are holding a &lt;i&gt;Semantic Web 101&lt;/i&gt; session, where should we begin? I hazard to guess that we should not begin by writing a FOAF file in Turtle by hand, as this is one thing that is not likely to happen in the real world.&lt;/p&gt; &lt;p&gt;Of course, the social aspect of the Data Web is the most immediately engaging, so a demo might be to go make an account with &lt;a href=&quot;http://myopenlink.net/&quot; id=&quot;link-id0x1f5e0198&quot;&gt;myopenlink&lt;/a&gt;.&lt;a href=&quot;http://dbpedia.org/resource/.NET_Framework&quot; id=&quot;link-id0x1ec49a00&quot;&gt;net&lt;/a&gt; and see that after one has entered the data one normally enters for any social network, one has become a Data Web citizen. This means that one can be found, just like this, with a query against the set of data spaces hosted on the system. Then we just need a few pages that repurpose this data and relate it to other data. We show some samples of queries like this in our &lt;a href=&quot;http://challenge.semanticweb.org/&quot; id=&quot;link-id0x1ee35f70&quot;&gt;Billion Triples Challenge&lt;/a&gt; demo. We will make a webcast about this to make it all clearer.&lt;/p&gt; &lt;p&gt;Behold: The Data Web is about the world becoming a database; writing &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x20644808&quot;&gt;SPARQL&lt;/a&gt; queries or triples is incidental. 
You will write FOAF files by hand just as little as you now write &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x1fd9fbc0&quot;&gt;SQL&lt;/a&gt; insert statements for filling in your account &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x1dfd3540&quot;&gt;information&lt;/a&gt; on Myspace.&lt;/p&gt; &lt;p&gt;Every time there is a major shift in technology, this shift needs to be motivated by addressing a new class of problem. This means doing something that could not be done before. The last time this happened was when the relational database became the dominant IT technology. At that time, the questions involved putting the enterprise in the database and building a cluster of Line Of Business (LOB) applications around the database. The argument for the &lt;a href=&quot;http://dbpedia.org/resource/Relational_database_management_system&quot; id=&quot;link-id0x1e920868&quot;&gt;RDBMS&lt;/a&gt; was that you did not have to constrain the set of queries that might later be made, when designing the database. In other words, it was making things more &lt;i&gt;ad hoc&lt;/i&gt;. This was opposed then on grounds of being less efficient than the hierarchical and network databases which the relational eventually replaced.&lt;/p&gt; &lt;p&gt;Today, the point of the Data Web is that you do not have to constrain what your data can join or integrate with, when you design your database. The counter-argument is that this is slow and geeky and not scalable. See the similarity?&lt;/p&gt; &lt;p&gt;A difference is that we are not specifically aiming at replacing the RDBMS. In fact, if you know exactly what you will query and have a well defined workload, a relational representation optimized for the workload will give you about 10x the performance of the equivalent RDF warehouse. 
OLTP remains a relational-only domain.&lt;/p&gt; &lt;p&gt;However, when we are talking about doing queries and analytics against the Web, or even against more than a handful of relational systems, the things which make RDBMS good become problematic.&lt;/p&gt; &lt;h3&gt;What is the business value of this?&lt;/h3&gt; &lt;p&gt;The most reliable of human drives is the drive to make oneself known. This drives all, from any social scene to business communications to politics. Today, when you want to proclaim you exist, you do so first on the Web. The Web did not become the prevalent media because business loved it for its own sake, it became prevalent because business could not afford not to assert their presence there. If anything, the Web eroded the communications dominance of a lot of players, which was not welcome but still had to be dealt with, by embracing the Web.&lt;/p&gt; &lt;p&gt;Today, in a world driven by data, the Data Web will be catalyzed by similar factors: If your data is not there, you will not figure in query results. Search engines will play some role there but also many social applications will have reports that are driven by published data. Also consider any e-commerce, any marketplace, and so forth. The Data Portability movement is a case in point: Users want to own their own content; silo operators want to capitalize on holding it. Right now, we see these things in silos; the Data Web will create bridges between these, and what is now in silo data centers will be increasingly available on an ad hoc basis with Open Data.&lt;/p&gt; &lt;p&gt;Again, we see a movement from the specialized to the generic: What LinkedIn does in its data center can be done with ad hoc queries with &lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id0x1e715138&quot;&gt;linked open data&lt;/a&gt;. 
Of course, LinkedIn does these things somewhat more efficiently because their system is built just for this task, but the linked data approach has the built-in readiness to join with everything else at almost no cost, without making a new data warehouse for each new business question.&lt;/p&gt; &lt;p&gt;We could call this the sociological aspect of the thing. Getting to more concrete business, we see an economy that, we could say, without being alarmists, is confronted with some issues. Well, generally when times are bad, this results in consolidation of property and power. Businesses fail and get split up and sold off in pieces, government adds controls and regulations and so forth. This means ad hoc data integration, as control without data is just pretense. If times are lean, this also means that there is little readiness to do wholesale replacement of systems, which will take years before producing anything. So we must play with what there is and make it deliver, in ways and conditions that were not necessarily anticipated. The agility of the Data Web, if correctly understood, can be of great benefit there, especially on the reporting and business intelligence side. Specifically mapping line-of-business systems into RDF on the fly will help with integration, making the specialized warehouse the slower and more expensive alternative. But this too is needed at times.&lt;/p&gt; &lt;p&gt;But for the RDF community to be taken seriously there, the messaging must be geared in this direction. Writing FOAF files by hand is not where you begin the pitch. Well, what is more natural than having a global, queryable information space, when you have a global information driven economy?&lt;/p&gt; &lt;p&gt;The Data Web is about making this happen. 
First with doing this in published generally available data; next with the enterprises having their private data for their own use but still linking toward the outside, even though private data stays private: You can still use standard terms and taxonomies, where they apply, when talking of proprietary information.&lt;/p&gt; &lt;h3&gt;But let&amp;#39;s get back to more specific issues&lt;/h3&gt; &lt;p&gt;At the lightning talks in Vienna, one participant said, &amp;quot;Man&amp;#39;s enemy is not the lion that eats men, it&amp;#39;s his own brother. Semantic Web&amp;#39;s enemy is the &lt;a href=&quot;http://dbpedia.org/resource/XML&quot; id=&quot;link-id0x1aeb61b8&quot;&gt;XML&lt;/a&gt; Web services stack that ate its lunch.&amp;quot; There is some truth to the first part. The second part deserves some comment. The Web services stack is about transactions. When you have a fixed, often repeating task, it is a natural thing to make this a Web service. Even though SOA is not really prevalent in enterprise IT, it has value in things like managing supply-chain logistics with partners, etc. Lots of standard messages with unambiguous meaning. To make a parallel with the database world: first there was OLTP; then there was business intelligence. Of course, you must first have the transactions, to have something to analyze.&lt;/p&gt; &lt;p&gt;SOA is for the transactions; the Data Web is for integration, analysis, and discovery. It is the &lt;i&gt;ad hoc&lt;/i&gt; component of the real time enterprise, if you will. It is not a competitor against a transaction oriented SOA. In fact, RDF has no special genius for transactions. Another mistake that often gets made is stretching things beyond their natural niche. Doing transactions in RDF is this sort of over-stretching without real benefit.&lt;/p&gt; &lt;p&gt;&amp;quot;I made an ontology and it really did solve a problem. 
How do I convince the enterprise people, the MBA who says it&amp;#39;s too complex, the developer who says it is not what he&amp;#39;s used to, and so on?&amp;quot;&lt;/p&gt; &lt;p&gt;This is an education question. One of the findings of SWEO&amp;#39;s enterprise survey was that there was awareness that difficult problems existed. There were and are corporate ontologies and taxonomies, diversely implemented. Some of these needs are recognized. RDF based technologies offer to make these more open standards based. Open standards have proven economical in the past. What we also hear is that major enterprises do not even know what their information and human resources assets are: Experts can&amp;#39;t be found even when they are in the next department, or reports and analyses get buried in wikis, spreadsheets, and emails.&lt;/p&gt; &lt;p&gt;Just as when SQL took off, we need vendors to do workshops on getting started with a technology. The affair in Vienna was a step in this direction. Another type of event specially focusing on vertical problems and their Data Web solutions is a next step. For example, one could do a workshop on integrating supply chain information with Data Web technologies. Or one on making enterprise &lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id0x1fbd3398&quot;&gt;knowledge&lt;/a&gt; bases from HR, CRM, office automation, wikis, etc. The good thing is that all these things are additions to, not replacements of, the existing mission-critical infrastructure. And better use of what you already have ought to be the theme of the day.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso Cluster Paper Update</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-10-02#1451</atom:id>
  <atom:published>2008-10-02T10:02:33Z</atom:published>
  <atom:updated>2008-10-03T04:38:06-04:00</atom:updated>
  <atom:content type="html">&lt;p&gt;An updated version of the paper about &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xc0abc50&quot;&gt;Virtuoso&lt;/a&gt; Cluster is available at &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/2008webscale_rdf.pdf&quot; id=&quot;link-id16459248&quot;&gt;2008webscale_rdf.pdf&lt;/a&gt; &lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso Update, Billion Triples and Outlook</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-10-02#1450</atom:id>
  <atom:published>2008-10-02T10:02:32Z</atom:published>
  <atom:updated>2008-10-02T12:47:07.000004-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Virtuoso Update, Billion Triples and Outlook&lt;/div&gt; &lt;p&gt;I will say a few things about what we have been doing and where we can go.&lt;/p&gt; &lt;p&gt;Firstly, we have a fairly scalable platform with &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1aa82dc0&quot;&gt;Virtuoso&lt;/a&gt; 6 Cluster. It was most recently tested with the workload discussed in the previous &lt;a href=&quot;http://www.openlinksw.com/dataspace/oerling/weblog/Orri%20Erling%27s%20Blog/1445&quot; id=&quot;link-id1638a5b8&quot;&gt;Billion Triples post&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;There is an updated version of &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/2008webscale_rdf.pdf&quot; id=&quot;link-id16280a68&quot;&gt;the paper about this&lt;/a&gt;. This will be presented at the web scale workshop of ISWC 2008 in Karlsruhe.&lt;/p&gt; &lt;p&gt;Right now, we are polishing some things in Virtuoso 6 -- some optimizations for smarter balancing of interconnect traffic over multiple network interfaces, and some more &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x1abd3f38&quot;&gt;SQL&lt;/a&gt; optimizations specific to &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1adbe410&quot;&gt;RDF&lt;/a&gt;. The must-have basics, like parallel running of sub-queries and aggregates, and all-around unrolling of loops of every kind into large partitioned batches, are all there and proven to work.&lt;/p&gt; &lt;p&gt;We spent a lot of time around the &lt;a href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x1aaa0e78&quot;&gt;Berlin SPARQL Benchmark&lt;/a&gt; story, so we got to the more advanced stuff like the &lt;a href=&quot;http://challenge.semanticweb.org/&quot; id=&quot;link-id0x1a860a50&quot;&gt;Billion Triples Challenge&lt;/a&gt; rather late. 
We did along the way also run &lt;a href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x1a27f2a8&quot;&gt;BSBM&lt;/a&gt; with an &lt;a href=&quot;http://dbpedia.org/resource/Oracle_Database&quot; id=&quot;link-id0x1ad5c918&quot;&gt;Oracle&lt;/a&gt; back-end, with Virtuoso mapping &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x1cf0e4a0&quot;&gt;SPARQL&lt;/a&gt; to SQL. This merits its own analysis in the near future. This will be the basic how-to of mapping OLTP systems to RDF. Depending on the case, one can use this for lookups in real-time or ETL.&lt;/p&gt; &lt;p&gt;RDF will deliver value in complex situations. An example of a complex relational mapping use case came from Ordnance Survey, presented at the &lt;a href=&quot;http://www.w3.org/2005/Incubator/rdb2rdf/&quot; id=&quot;link-id0x1ab96bb0&quot;&gt;RDB2RDF XG&lt;/a&gt;. Examples of complex warehouses include the &lt;a href=&quot;http://neurocommons.org/page/Main_Page&quot; id=&quot;link-id0x1adb2db0&quot;&gt;Neurocommons&lt;/a&gt; database, the Billion Triples Challenge, and the &lt;a href=&quot;http://www.garlik.com/&quot; id=&quot;link-id0x1925c7b0&quot;&gt;Garlik DataPatrol&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;In comparison, the Berlin workload is really simple and one where RDF is not at its best, as amply discussed on the &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x1c6d1480&quot;&gt;Linked Data&lt;/a&gt; forum. BSBM&amp;#39;s primary value is as a demonstrator for the basic mapping tasks that will be repeated over and over for pretty much any online system when presence on the &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1a937400&quot;&gt;data&lt;/a&gt; web becomes as indispensable as presence on the HTML web.&lt;/p&gt; &lt;p&gt;I will now talk about the complex warehouse/web-harvesting side. 
I will come to the mapping in another post.&lt;/p&gt; &lt;p&gt;Now, all the things shown in the &lt;a href=&quot;http://www.openlinksw.com/dataspace/oerling/weblog/Orri%20Erling%27s%20Blog/1445&quot; id=&quot;link-id14de1d18&quot;&gt;Billion Triples post&lt;/a&gt; can be done with a relational system specially built for each purpose. Since we are a general purpose &lt;a href=&quot;http://dbpedia.org/resource/Relational_database_management_system&quot; id=&quot;link-id0x1a457c70&quot;&gt;RDBMS&lt;/a&gt;, we use this capability where it makes sense. For example, storing statistics about which tags or interests occur with which other tags or interests as RDF blank nodes makes no sense. We do not even make the experiment; we know ahead of time that the result is at least an order of magnitude in favor of the relational row-oriented solution in both space and time.&lt;/p&gt; &lt;p&gt;Whenever there is a data structure specially made for answering one specific question, like joint occurrence of tags, RDB and mapping is the way to go. With Virtuoso, this can fully-well coexist with physical triples, and can still be accessed in SPARQL and mixed with triples. This is territory that we have not extensively covered yet, but we will be giving some examples about this later.&lt;/p&gt; &lt;p&gt;The real value of RDF is in agility. When there is no time to design and load a new warehouse for every new question, RDF is unparalleled. Also SPARQL, once it has the necessary extensions of aggregating and sub-queries, is nicer than SQL, especially when we have sub-classes and sub-properties, transitivity, and &amp;quot;same as&amp;quot; enabled. These things have some run time cost and if there is a report one is hitting absolutely all the time, then chances are that resolving terms and identity at load-time and using materialized views in SQL is the reasonable thing. 
If one is inventing a new report every time, then RDF has a lot more convenience and flexibility.&lt;/p&gt; &lt;p&gt;We are just beginning to explore what we can do with data sets such as the online conversation space, linked data, and the open ontologies of &lt;a href=&quot;http://umbel.org/about/&quot; id=&quot;link-id0x1aa5ea18&quot;&gt;UMBEL&lt;/a&gt; and &lt;a href=&quot;http://dbpedia.org/resource/Cyc&quot; id=&quot;link-id0x1a631a20&quot;&gt;OpenCyc&lt;/a&gt;. It is safe to say that we can run with real world scale without loss of query expressivity. There is an incremental cost for performance but this is not prohibitive. Serving the whole billion triples set from memory would cost about $32K in hardware. $8K will do if one can wait for disk part of the time. One can use these numbers as a basis for costing larger systems. For online search applications, one will note that running the indexes pretty much from memory is necessary for flat response time. For back office analytics this is not necessarily as critical. It all depends on the use case.&lt;/p&gt; &lt;p&gt;We expect to be able to combine geography, social proximity, subject matter, and &lt;a href=&quot;http://dbpedia.org/resource/Named_entity_recognition&quot; id=&quot;link-id0x1aebdcc8&quot;&gt;named entities&lt;/a&gt;, with hierarchical taxonomies and traditional full text, and to present this through a simple user interface.&lt;/p&gt; &lt;p&gt;We expect to do this with online response times if we have a limited set of starting points and do not navigate more than 2 or 3 steps from each starting point. An example would be to have a full text pattern and news group, and get the cloud of interests from the authors of matching posts. 
Another would be to make a faceted view of the properties of the 1000 people most closely connected to one person.&lt;/p&gt; &lt;p&gt;Queries like finding the fastest online responders to questions about romance across the global board-scape, or finding the person who initiates the most long running conversations about crime, take a bit longer but are entirely possible.&lt;/p&gt; &lt;p&gt;The genius of RDF is to be able to do these things within a general purpose database, ad hoc, in a single query language, mostly without materializing intermediate results. Any of these things could be done with arbitrary efficiency in a custom built system. But what is special now is that the cost of access to this type of &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x1ab88490&quot;&gt;information&lt;/a&gt; and far beyond drops dramatically as we can do these things in a far less labor intensive way, with a general purpose system, with no redesigning and reloading of warehouses at every turn. The query becomes a commodity.&lt;/p&gt; &lt;p&gt;Still, one must know what to ask. In this respect, the self-describing nature of RDF is unmatched. A query like &lt;i&gt;list the top 10 attributes with the most distinct values for all persons&lt;/i&gt; cannot be done in SQL. SQL simply does not allow the columns to be variable.&lt;/p&gt; &lt;p&gt;Further, we can accept queries as text, the way people are used to supplying them, and use structure for drill-down or result-relevance, and also recognize named entities and subject matter concepts in query text. Very simple NLP will go a long way towards keeping SPARQL out of the user experience.&lt;/p&gt; &lt;p&gt;The other way of keeping query complexity hidden is to publish hand-written SPARQL as parameter-fed canned reports.&lt;/p&gt; &lt;p&gt;Between now and ISWC 2008, the last week of October, we will put out demos showing some of these things. Stay tuned.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>OpenLink Software&#39;s Virtuoso Submission to the Billion Triples Challenge</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-09-30#1446</atom:id>
  <atom:published>2008-09-30T16:24:34Z</atom:published>
  <atom:updated>2008-10-03T06:20:48.000094-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;h2&gt;Introduction&lt;/h2&gt; &lt;p&gt;We use &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xb03e418&quot;&gt;Virtuoso&lt;/a&gt; 6 Cluster Edition to demonstrate the following:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;Text and structured &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0xbd9dae8&quot;&gt;information&lt;/a&gt; based lookups&lt;/li&gt; &lt;li&gt;Analytics queries&lt;/li&gt; &lt;li&gt;Analysis of co-occurrence of features like interests and tags.&lt;/li&gt; &lt;li&gt;Dealing with identity of multiple IRI&amp;#39;s (&lt;a href=&quot;http://dbpedia.org/resource/Web_Ontology_Language&quot; id=&quot;link-id0xb383dd8&quot;&gt;owl&lt;/a&gt;:sameAs)&lt;/li&gt; &lt;/ul&gt; &lt;p&gt;The demo is based on a set of canned &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0xbda6298&quot;&gt;SPARQL&lt;/a&gt; queries that can be invoked using the &lt;a href=&quot;http://ode.openlinksw.com/&quot; id=&quot;link-id0xbb292f0&quot;&gt;OpenLink Data Explorer&lt;/a&gt; (&lt;a href=&quot;http://ode.openlinksw.com/&quot; id=&quot;link-id0xc263528&quot;&gt;ODE&lt;/a&gt;) Firefox extension.&lt;/p&gt; &lt;p&gt;The demo queries can also be run directly against the SPARQL end point.&lt;/p&gt; &lt;p&gt;The demo is being worked on at the time of submission and may be shown online by appointment.&lt;/p&gt; &lt;p&gt;Automatic annotation of the &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0xa173378&quot;&gt;data&lt;/a&gt; based on &lt;a href=&quot;http://dbpedia.org/resource/Named_entity_recognition&quot; id=&quot;link-id0xbdda558&quot;&gt;named entity extraction&lt;/a&gt; is being worked on at the time of this submission. 
By the time of ISWC 2008 the set of sample queries will be enhanced with queries based on extracted &lt;a href=&quot;http://dbpedia.org/resource/Named_entity_recognition&quot; id=&quot;link-id0xa66fbe0&quot;&gt;named entities&lt;/a&gt; and their relationships in the &lt;a href=&quot;http://umbel.org/about/&quot; id=&quot;link-id0xa06e2c8&quot;&gt;UMBEL&lt;/a&gt; and OpenCyc ontologies. &lt;/p&gt; &lt;p&gt;Also examples involving owl:sameAs are being added, likewise with similarity metrics and search hit scores.&lt;/p&gt; &lt;h2&gt;The Data&lt;/h2&gt; &lt;p&gt;The database consists of the billion triples data sets and some additions like UMBEL. Also the Freebase extract is newer than the challenge original.&lt;/p&gt; &lt;p&gt;The triple count is 1115 million.&lt;/p&gt; &lt;p&gt;In the case of web harvested resources, the data is loaded in one graph per resource.&lt;/p&gt; &lt;p&gt;In the case of larger data sets like &lt;a href=&quot;http://dbpedia.org/resource/DBpedia&quot; id=&quot;link-id0xc2bf770&quot;&gt;DBpedia&lt;/a&gt; or the US census, all triples of the provenance share a data set specific graph.&lt;/p&gt; &lt;p&gt;All string literals are additionally indexed in a full text index. No stop words are used.&lt;/p&gt; &lt;p&gt;Most queries do not specify a graph. Thus they are evaluated against the union of all the graphs in the database. The indexing scheme is SPOG, GPOS, POGS, OPGS. All indices ending in S are bitmap indices. &lt;/p&gt; &lt;h2&gt;The Queries &lt;/h2&gt; &lt;p&gt;The demo uses Virtuoso SPARQL extensions in most queries. These extensions consist on one hand of well known &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0xaf8cb40&quot;&gt;SQL&lt;/a&gt; features like aggregation with grouping and existence and value subqueries and on the other of &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0xafdceb8&quot;&gt;RDF&lt;/a&gt; specific features. 
The latter include run time RDFS and OWL inferencing support and backward chaining subclasses and transitivity. &lt;/p&gt; &lt;h3&gt;Simple Lookups&lt;/h3&gt; &lt;pre&gt;sparql select ?s ?p (bif:search_excerpt (bif:vector (&amp;#39;&lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id0xbb64dd0&quot;&gt;semantic&amp;#39;, &amp;#39;web&lt;/a&gt;&amp;#39;), ?o)) where { ?s ?p ?o . filter (bif:contains (?o, &amp;quot;&amp;#39;semantic web&amp;#39;&amp;quot;)) } limit 10 ; &lt;/pre&gt; &lt;p&gt;This looks up triples with semantic web in the object and makes a search hit summary of the literal, highlighting the search terms. &lt;/p&gt; &lt;pre&gt;sparql select ?tp count(*) where { ?s ?p2 ?o2 . ?o2 a ?tp . ?s foaf:nick ?o . filter (bif:contains (?o, &amp;quot;plaid_skirt&amp;quot;)) } group by ?tp order by desc 2 limit 40 ; &lt;/pre&gt; &lt;p&gt;This looks at what sorts of things are referenced by the properties of the foaf handle plaid_skirt.&lt;/p&gt; &lt;p&gt;What are these things called?&lt;/p&gt; &lt;pre&gt;sparql select ?lbl count(*) where { ?s ?p2 ?o2 . ?o2 rdfs:label ?lbl . ?s foaf:nick ?o . filter (bif:contains (?o, &amp;quot;plaid_skirt&amp;quot;)) } group by ?lbl order by desc 2 ; &lt;/pre&gt; &lt;p&gt;Many of these things do not have an rdfs:label. Let us use a more general concept of label which groups dc:title, foaf:name and other name-like properties together. The subproperties are resolved at run time; there is no materialization. &lt;/p&gt; &lt;pre&gt;sparql define input:inference &amp;#39;b3s&amp;#39; select ?lbl count(*) where { ?s ?p2 ?o2 . ?o2 b3s:label ?lbl . ?s foaf:nick ?o . filter (bif:contains (?o, &amp;quot;plaid_skirt&amp;quot;)) } group by ?lbl order by desc 2 ; &lt;/pre&gt; &lt;p&gt;We can list sources by the topics they contain. Below we look for graphs that mention terrorist bombing. &lt;/p&gt; &lt;pre&gt;sparql select ?g count(*) where { graph ?g { ?s ?p ?o . 
filter (bif:contains (?o, &amp;quot;&amp;#39;terrorist bombing&amp;#39;&amp;quot;)) } } group by ?g order by desc 2 ; &lt;/pre&gt; &lt;p&gt;Now some web 2.0 tagging of search results. The &lt;a href=&quot;http://dbpedia.org/resource/Tag&quot; id=&quot;link-id0xa8b89f8&quot;&gt;tag&lt;/a&gt; cloud of &amp;quot;computer&amp;quot;:&lt;/p&gt; &lt;pre&gt;sparql select ?lbl count (*) where { ?s ?p ?o . ?o bif:contains &amp;quot;computer&amp;quot; . ?s sioc:topic ?tg . optional { ?tg rdfs:label ?lbl } } group by ?lbl order by desc 2 limit 40 ; &lt;/pre&gt; &lt;p&gt;This query will find the posters who talk the most about sex.&lt;/p&gt; &lt;pre&gt;sparql select ?auth count (*) where { ?d dc:creator ?auth . ?d ?p ?o filter (bif:contains (?o, &amp;quot;sex&amp;quot;)) } group by ?auth order by desc 2 ; &lt;/pre&gt; &lt;h3&gt;Analytics &lt;/h3&gt; &lt;p&gt;We look for people who are joined by having relatively uncommon interests but do not know each other.&lt;/p&gt; &lt;pre&gt;sparql select ?i ?cnt ?n1 ?n2 ?p1 ?p2 where { { select ?i count (*) as ?cnt where { ?p foaf:interest ?i } group by ?i } filter ( ?cnt &amp;gt; 1 &amp;amp;&amp;amp; ?cnt &amp;lt; 10) . ?p1 foaf:interest ?i . ?p2 foaf:interest ?i . filter (?p1 != ?p2 &amp;amp;&amp;amp; !bif:exists ((select (1) where {?p1 foaf:knows ?p2 })) &amp;amp;&amp;amp; !bif:exists ((select (1) where {?p2 foaf:knows ?p1 }))) . ?p1 foaf:nick ?n1 . ?p2 foaf:nick ?n2 . } order by ?cnt limit 50 ; &lt;/pre&gt; &lt;p&gt;The query takes a fairly long time, mostly spent counting the interested people per interest across 25M interest triples. It then takes people that share the interest and checks that neither claims to know the other. It then sorts the results with the rarest interest first. The query can be written more efficiently but is here just to show that database-wide scans of the population are possible ad hoc. &lt;/p&gt; &lt;p&gt;Now we go to SQL to make a tag co-occurrence matrix. 
This can be used for showing a Technorati-style related tags line at the bottom of a search result page. This showcases the use of SQL together with SPARQL. The half-matrix of tags t1, t2 with the co-occurrence count at the intersection is computed much more efficiently in SQL, especially since it gets updated as the data changes. This is an example of materialized intermediate results based on warehoused RDF. &lt;/p&gt; &lt;pre&gt;create table tag_count (tcn_tag iri_id_8, tcn_count int, primary key (tcn_tag)); alter index tag_count on tag_count partition (tcn_tag int (0hexffff00)); create table tag_coincidence (tc_t1 iri_id_8, tc_t2 iri_id_8, tc_count int, tc_t1_count int, tc_t2_count int, primary key (tc_t1, tc_t2)); alter index tag_coincidence on tag_coincidence partition (tc_t1 int (0hexffff00)); create index tc2 on tag_coincidence (tc_t2, tc_t1) partition (tc_t2 int (0hexffff00)); &lt;/pre&gt; &lt;p&gt;How many times is each topic mentioned?&lt;/p&gt; &lt;pre&gt; insert into tag_count select * from (sparql define output:valmode &amp;quot;LONG&amp;quot; select ?t count (*) as ?cnt where { ?s sioc:topic ?t } group by ?t) xx option (quietcast); &lt;/pre&gt; &lt;p&gt;Take all t1, t2 where t1 and t2 are tags of the same subject, and store only the permutation where the internal id of t1 &amp;lt; that of t2.&lt;/p&gt; &lt;pre&gt;insert into tag_coincidence (tc_t1, tc_t2, tc_count) select &amp;quot;t1&amp;quot;, &amp;quot;t2&amp;quot;, cnt from (select &amp;quot;t1&amp;quot;, &amp;quot;t2&amp;quot;, count (*) as cnt from (sparql define output:valmode &amp;quot;LONG&amp;quot; select ?t1 ?t2 where { ?s sioc:topic ?t1 . ?s sioc:topic ?t2 }) tags where &amp;quot;t1&amp;quot; &amp;lt; &amp;quot;t2&amp;quot; group by &amp;quot;t1&amp;quot;, &amp;quot;t2&amp;quot;) xx where isiri_id (&amp;quot;t1&amp;quot;) and isiri_id (&amp;quot;t2&amp;quot;) option (quietcast); &lt;/pre&gt; &lt;p&gt;Now put the individual occurrence counts into the same table with the co-occurrence. 
This denormalization makes the related tags lookup faster. &lt;/p&gt; &lt;pre&gt;update tag_coincidence set tc_t1_count = (select tcn_count from tag_count where tcn_tag = tc_t1), tc_t2_count = (select tcn_count from tag_count where tcn_tag = tc_t2); &lt;/pre&gt; &lt;p&gt;Now each tag_coincidence row has the joint occurrence count and individual occurrence counts. A single select will return a Technorati-style related tags listing. &lt;/p&gt; &lt;p&gt;To show the &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Identifier&quot; id=&quot;link-id0x9d4bc60&quot;&gt;URI&lt;/a&gt;s of the tags: &lt;/p&gt; &lt;pre&gt;select top 10 id_to_iri (tc_t1), id_to_iri (tc_t2), tc_count from tag_coincidence order by tc_count desc; &lt;/pre&gt; &lt;h3&gt;Social Networks &lt;/h3&gt; &lt;p&gt;We look at what interests people have: &lt;/p&gt; &lt;pre&gt;sparql select ?o ?cnt where { { select ?o count (*) as ?cnt where { ?s foaf:interest ?o } group by ?o } filter (?cnt &amp;gt; 100) } order by desc 2 limit 100 ; &lt;/pre&gt; &lt;p&gt;Now the same for the Harry Potter fans: &lt;/p&gt; &lt;pre&gt;sparql select ?i2 count (*) where { ?p foaf:interest &amp;lt;&lt;a href=&quot;http://dbpedia.org/resource/Hypertext_Transfer_Protocol&quot; id=&quot;link-id0xba0b390&quot;&gt;http&lt;/a&gt;://www.livejournal.com/interests.bml?int=harry+potter&amp;gt; . ?p foaf:interest ?i2 } group by ?i2 order by desc 2 limit 20 ; &lt;/pre&gt; &lt;p&gt;We check whether foaf:knows relations are symmetrical. We return the top n people whom others claim to know but who do not claim to know them in return.&lt;/p&gt; &lt;pre&gt;sparql select ?celeb, count (*) where { ?claimant foaf:knows ?celeb . 
filter (!bif:exists ((select (1) where { ?celeb foaf:knows ?claimant }))) } group by ?celeb order by desc 2 limit 10 ; &lt;/pre&gt; &lt;p&gt;We look for a well-connected person to start from.&lt;/p&gt; &lt;pre&gt;sparql select ?p count (*) where { ?p foaf:knows ?k } group by ?p order by desc 2 limit 50 ; &lt;/pre&gt; &lt;p&gt;We look for the most connected of the many online identities of Stefan Decker.&lt;/p&gt; &lt;pre&gt;sparql select ?sd count (distinct ?xx) where { ?sd a foaf:Person . ?sd ?name ?ns . filter (bif:contains (?ns, &amp;quot;&amp;#39;Stefan Decker&amp;#39;&amp;quot;)) . ?sd foaf:knows ?xx } group by ?sd order by desc 2 ; &lt;/pre&gt; &lt;p&gt;We count the transitive closure of Stefan Decker&amp;#39;s connections. &lt;/p&gt; &lt;pre&gt;sparql select count (*) where { { select * where { ?s foaf:knows ?o } } option (transitive, t_distinct, t_in(?s), t_out(?o)) . filter (?s = &amp;lt;mailto:stefan.decker@deri.org&amp;gt;) } ; &lt;/pre&gt; &lt;p&gt;Now we do the same while following owl:sameAs links.&lt;/p&gt; &lt;pre&gt;sparql define input:same-as &amp;quot;yes&amp;quot; select count (*) where { { select * where { ?s foaf:knows ?o } } option (transitive, t_distinct, t_in(?s), t_out(?o)) . filter (?s = &amp;lt;mailto:stefan.decker@deri.org&amp;gt;) } ; &lt;/pre&gt; &lt;h2&gt;Demo System&lt;/h2&gt; &lt;p&gt;The system runs on Virtuoso 6 Cluster Edition. The database is partitioned into 12 partitions, each served by a distinct server process. The demonstrated system hosts these 12 servers on 2 machines, each with 2 x Xeon 5345, 16GB memory, and 4 SATA disks. For scaling, the processes and corresponding partitions can be spread over a larger number of machines. If each ran on its own server with 16GB RAM, the whole data set could be served from memory. This is desirable for search engine or fast analytics applications. Most of the demonstrated queries run in memory on second invocation. 
The timing difference between first and second run is easily an order of magnitude. &lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Requirements for Relational-to-RDF Mapping</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-09-08#1436</atom:id>
  <atom:published>2008-09-08T09:41:25Z</atom:published>
  <atom:updated>2008-09-08T15:03:09-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Requirements for Relational-to-RDF Mapping&lt;/div&gt; &lt;p&gt;Many of you will know about the W3C relational-to-&lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1e1be0a8&quot;&gt;RDF&lt;/a&gt; mapping incubator activity. The group is planning to suggest forming a working group for drawing up a specification for relational-to-RDF mapping.&lt;/p&gt; &lt;p&gt;To this end, I recently summarized the group discussions and some of our own experiences around the topic at &amp;lt;&lt;a href=&quot;http://esw.w3.org/topic/Rdb2RdfXG/ReqForMappingByOErling&quot; id=&quot;link-id146030e8&quot;&gt;http://esw.w3.org/topic/Rdb2RdfXG/ReqForMappingByOErling&lt;/a&gt;&amp;gt;.&lt;/p&gt; &lt;p&gt;I will here discuss this less formally and more in the light of our own experience. A working group goal statement must be neutral vis-à-vis the following points, even if any working group will unavoidably encounter these issues on the way. A &lt;a href=&quot;http://dbpedia.org/resource/Blog&quot; id=&quot;link-id0x1e6b3950&quot;&gt;blog&lt;/a&gt; post, on the other hand, can be more specific.&lt;/p&gt; &lt;p&gt;I gave a talk to the &lt;a href=&quot;http://www.w3.org/2005/Incubator/rdb2rdf/&quot; id=&quot;link-id0xa0932c68&quot;&gt;RDB2RDF XG&lt;/a&gt; this spring, with these &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main/VirtPresentations/Relational2RDF.ppt&quot; id=&quot;link-id14572540&quot;&gt;slides&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;The main point is that people would really like to map on-the-fly, if only they could. Making an RDF warehouse is not of value in itself, but it is true that in some cases this cannot be avoided.&lt;/p&gt; &lt;p&gt;At first sight, one would think that a mapping specification could be neutral as regards whether one stores the mapped triples as triples or makes them on demand. 
However, there is almost no comparison between the complexity of doing non-trivial mappings on-the-fly versus mapping as ETL. Some of this complexity spills over into the requirements for a mapping language. &lt;/p&gt; &lt;h2&gt;Eliminating JOINs&lt;/h2&gt; &lt;p&gt;We expect to have a situation where one virtual triple can have many possible sources. The mapping is a union of mapped databases. Any integration scenario will have this feature. In such a situation, if we are &lt;code&gt;JOIN&lt;/code&gt;ing using such triples, we end up with &lt;code&gt;UNION&lt;/code&gt;s of all databases that could produce the triples in question. This is generally not desired. Therefore, in the on-demand mapping case, there must be a lot of type inference logic that is not relevant in the ETL scenario.&lt;/p&gt; &lt;p&gt;To make the point clearer, consider a query like &amp;quot;list the organizations whose representatives have published about &lt;i&gt;xx&lt;/i&gt;.&amp;quot; Suppose that there are three databases mapped, all of which have a table of organizations, a table of persons with affiliation to organizations, a table of publications by these persons, and finally a table of tags for the publications. Now, we want the laboratories that have published articles with &lt;a href=&quot;http://dbpedia.org/resource/Tag&quot; id=&quot;link-id0xa0977bf0&quot;&gt;tag&lt;/a&gt; &lt;i&gt;XX&lt;/i&gt;. It is a matter of common sense in this scenario that a publication will have the author and the author&amp;#39;s affiliation in the same database. However, the RDB-to-RDF mapping does not necessarily know this, if all that it is told is that a table makes IRIs of publications by applying a certain pattern to the primary key of the publications table. To infer what needs to be inferred, the system must realize that IRIs from one mapping are disjoint from IRIs from another: A paper in database &lt;i&gt;X&lt;/i&gt; will usually not have an author in database &lt;i&gt;Y&lt;/i&gt;. 
The IDs in database &lt;i&gt;Y&lt;/i&gt;, even if perchance equal to the IDs in &lt;i&gt;X&lt;/i&gt;, do not mean the same thing, and there is no point joining across databases by them.&lt;/p&gt; &lt;p&gt;This entire question is a non-issue in the ETL scenario, but is absolutely vital in the real-time mapping. This is also something that must be stated, at least implicitly, in any mapping. If a mapping translates keys of one place to IRIs with one pattern, and keys from another using another pattern, it must be inferable from the patterns whether the sets of IRIs will be disjoint.&lt;/p&gt; &lt;p&gt;This is critical. Otherwise we will be joining everything to everything else, and there will be orders of magnitude of penalty compared to hand-crafted &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0xa09490f8&quot;&gt;SQL&lt;/a&gt; over the same &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0xa095efd0&quot;&gt;data&lt;/a&gt; sources.&lt;/p&gt; &lt;h2&gt;Expectations and Limitations on Queries&lt;/h2&gt; &lt;p&gt; &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x1e360230&quot;&gt;SPARQL&lt;/a&gt; queries translate quite well to SQL when there is only one table that can produce a triple with a subject of a given class, when there are few columns that can map to a given predicate, and when classes and predicates are literals in the query.&lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1f5edb30&quot;&gt;Virtuoso&lt;/a&gt; has some SQL extensions for dealing with breaking a wide table into a row per column. This facilitates dealing with predicates that are not known at query compile time. If the table in question is not managed by Virtuoso, Virtuoso&amp;#39;s SQL virtualization/federation takes care of the matter. 
If a mapping system goes directly to third-party SQL, no such tricks can be used.&lt;/p&gt; &lt;p&gt;The above example suggests that for supporting on-the-fly mapping without relying on owning the SQL underneath, some subsets of SPARQL may have to be defined. For example, one will probably have to require that all predicates be literals. The alternative is prohibitive run-time cost and complexity.&lt;/p&gt; &lt;p&gt;But we must not throw the baby out with the bathwater. Aside from offering global identifiers, RDF&amp;#39;s attractions include subclasses and sub-predicates. In relational terms, these translate to &lt;code&gt;UNION&lt;/code&gt;s and do involve some added cost. A mapping system just has to have means of dealing with this cost, and of recognizing cases where this cost is prohibitive. Some further work is likely to be required for defining well-behaved subsets of SPARQL and mappings.&lt;/p&gt; &lt;h2&gt;To ETL or Not to ETL?&lt;/h2&gt; &lt;p&gt;Whether to warehouse or not? If one has hundreds of sources, of which some are not even relational, some ETL would seem necessary. Vipul Kashyap presented a position paper at last year&amp;#39;s RDB-to-RDF mapping workshop in Cambridge, Massachusetts, about a system of relational mapping and on-demand RDF-izers of diverse semi-structured biomedical data, e.g., spreadsheets. The issue certainly exists, and any mapping work will likely encounter integration scenarios where one part is fairly neatly mapped from relational stores, and another part comes from a less structured repository of ETLed physical triples.&lt;/p&gt; &lt;p&gt;Our take is that if something is a large or very large relational store, then map; else, ETL. 
With Virtuoso, we can mix mapped and local triples, but this is not a generally available feature of triple stores and standardization will likely have to wait until there are more implementations.&lt;/p&gt; &lt;h2&gt;Conclusions&lt;/h2&gt; &lt;ul&gt; &lt;li&gt;If you map on demand, watch out for an explosion of &lt;code&gt;UNION&lt;/code&gt;s when integrating sources that talk of similar things.&lt;/li&gt; &lt;li&gt;If you integrate lots of sources, some ETL is likely unavoidable. Look for ways of dealing with part ETL, part mapping. ETLing everything is not always best or even possible.&lt;/li&gt; &lt;li&gt;If you map a single fairly clean RDB to RDF, mapping will work well, potentially much faster than triple storage, because the relational side offers higher storage density and more data per index lookup.&lt;/li&gt; &lt;li&gt;If you map on demand, some restrictions to SPARQL may be practically necessary. These have to do with variables in predicate position, variables in class position, etc. Individual implementations may support these, but standardization will likely have to put limits on them.&lt;/li&gt; &lt;/ul&gt; &lt;p&gt;This was a quick summary, by no means comprehensive, of what an eventual RDB2RDF working group would come across. This is a sort of addendum to the requirements I outlined on the ESW wiki.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Transitivity and Graphs for SQL</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-09-08#1435</atom:id>
  <atom:published>2008-09-08T09:41:24Z</atom:published>
  <atom:updated>2008-09-08T15:43:07-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Transitivity and Graphs for SQL&lt;/div&gt; &lt;h2&gt;Background&lt;/h2&gt; &lt;p&gt;I have mentioned on a couple of prior occasions that basic graph operations ought to be integrated into the &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0xa1a18c58&quot;&gt;SQL&lt;/a&gt; query language.&lt;/p&gt; &lt;p&gt;The history of databases is by and large about moving from specialized applications toward a generic platform. The introduction of the DBMS itself is the archetypal example. It is all about extracting the common features of applications and making these the features of a platform instead.&lt;/p&gt; &lt;p&gt;It is now time to apply this principle to graph traversal.&lt;/p&gt; &lt;p&gt;The rationale is that graph operations are somewhat tedious to write in a parallelize-able, latency-tolerant manner. Writing them as one would for memory-based &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0xaf8c730&quot;&gt;data&lt;/a&gt; structures is easier but totally unscalable as soon as there is any latency involved, i.e., disk reads or messages between cluster peers.&lt;/p&gt; &lt;p&gt;The ad-hoc nature and very large volume of &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0xae41ef0&quot;&gt;RDF&lt;/a&gt; data makes this a timely question. Up until now, the answer to this question has been to materialize any implied facts in RDF stores. 
If &lt;i&gt;a&lt;/i&gt; was part of &lt;i&gt;b&lt;/i&gt;, and &lt;i&gt;b&lt;/i&gt; part of &lt;i&gt;c&lt;/i&gt;, the implied fact that &lt;i&gt;a&lt;/i&gt; is part of &lt;i&gt;c&lt;/i&gt; would be inserted explicitly into the database as a pre-query step.&lt;/p&gt; &lt;p&gt;This is simple and often efficient, but tends to have the downside that one makes a specialized warehouse for each new type of query. The activity becomes less ad-hoc.&lt;/p&gt; &lt;p&gt;Also, this becomes next to impossible when the scale approaches web scale, or if some of the data is liable to be included in or excluded from the set being analyzed from one moment to the next. This is why with &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xb68f9d0&quot;&gt;Virtuoso&lt;/a&gt; we have tended to favor inference on demand (&amp;quot;backward chaining&amp;quot;) and mapping of relational data into RDF without copying.&lt;/p&gt; &lt;p&gt;The SQL world has taken steps towards dealing with recursion with the &lt;code&gt;WITH - UNION&lt;/code&gt; construct which allows definition of recursive views. The idea there is to define, for example, a tree walk as a &lt;code&gt;UNION&lt;/code&gt; of the data of the starting node plus the recursive walk of the starting node&amp;#39;s immediate children.&lt;/p&gt; &lt;p&gt;The main problem with this is that I do not see how a SQL optimizer could effectively rearrange queries involving &lt;code&gt;JOIN&lt;/code&gt;s between such recursive views. This model of recursion seems to lose SQL&amp;#39;s non-procedural nature. One can no longer easily rearrange &lt;code&gt;JOIN&lt;/code&gt;s based on what data is given and what is to be retrieved. If the recursion is written from root to leaf, it is not obvious how to do this from leaf to root. 
At any rate, queries of this kind are so complex to write, let alone optimize, that I decided to take another approach.&lt;/p&gt; &lt;p&gt;Take a question like &amp;quot;list the parts of products of category &lt;i&gt;C&lt;/i&gt; which have materials that are classified as toxic.&amp;quot; Suppose that the product categories are a tree, the product parts are a tree, and the materials classification is a tree taxonomy where &amp;quot;toxic&amp;quot; has a multilevel substructure.&lt;/p&gt; &lt;p&gt;Depending on the count of products and materials, the query can be evaluated by going from products to parts to materials and then climbing up the materials tree to see if the material is toxic. Or one could do it in reverse, starting with the different toxic materials, looking up the parts containing these, going up the part tree to the product, and up the product hierarchy to see if the product is in the right category. One should be able to evaluate the identical query either way depending on what indices exist, what the cardinalities of the relations are, and so forth, as in regular cost-based optimization.&lt;/p&gt; &lt;p&gt;Especially with RDF, there are many problems of this type. In regular SQL, it is a long-standing cultural practice to flatten hierarchies, but this is not the case with RDF.&lt;/p&gt; &lt;p&gt;In Virtuoso, we see &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0xb3bdcc0&quot;&gt;SPARQL&lt;/a&gt; as reducing to SQL. Any RDF-oriented database-engine or query-optimization feature is accessed via SQL. Thus, if we address run-time recursion in the Virtuoso query engine, this becomes, &lt;i&gt;ipso facto&lt;/i&gt;, an SQL feature. Besides, we remember that SQL is a much more mature and expressive language than the current SPARQL recommendation.&lt;/p&gt; &lt;h2&gt; SQL and Transitivity &lt;/h2&gt; &lt;p&gt;We will here look at some simple social network queries. 
A later article will show how to do more general graph operations. We extend the SQL derived table construct, i.e., &lt;code&gt;SELECT&lt;/code&gt; in another &lt;code&gt;SELECT&lt;/code&gt;&amp;#39;s &lt;code&gt;FROM&lt;/code&gt; clause, with a &lt;code&gt;TRANSITIVE&lt;/code&gt; clause.&lt;/p&gt; &lt;p&gt;Consider the data:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;CREATE TABLE &amp;quot;knows&amp;quot; (&amp;quot;p1&amp;quot; INT, &amp;quot;p2&amp;quot; INT, PRIMARY KEY (&amp;quot;p1&amp;quot;, &amp;quot;p2&amp;quot;) ); ALTER INDEX &amp;quot;knows&amp;quot; ON &amp;quot;knows&amp;quot; PARTITION (&amp;quot;p1&amp;quot; INT); CREATE INDEX &amp;quot;knows2&amp;quot; ON &amp;quot;knows&amp;quot; (&amp;quot;p2&amp;quot;, &amp;quot;p1&amp;quot;) PARTITION (&amp;quot;p2&amp;quot; INT); &lt;/code&gt; &lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;We represent a social network with the many-to-many relation &amp;quot;knows&amp;quot;. The persons are identified by integers.&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;INSERT INTO &amp;quot;knows&amp;quot; VALUES (1, 2); INSERT INTO &amp;quot;knows&amp;quot; VALUES (1, 3); INSERT INTO &amp;quot;knows&amp;quot; VALUES (2, 4);&lt;/code&gt; &lt;/pre&gt; &lt;pre&gt;&lt;code&gt;SELECT * FROM (SELECT TRANSITIVE T_IN (1) T_OUT (2) T_DISTINCT &amp;quot;p1&amp;quot;, &amp;quot;p2&amp;quot; FROM &amp;quot;knows&amp;quot; ) &amp;quot;k&amp;quot; WHERE &amp;quot;k&amp;quot;.&amp;quot;p1&amp;quot; = 1;&lt;/code&gt;&lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;We obtain the result:&lt;/p&gt; &lt;blockquote&gt; &lt;table width=&quot;100&quot;&gt; &lt;tr&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p1&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p2&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;3&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;2&lt;/td&gt; 
&lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;/blockquote&gt; &lt;p&gt;The operation is reversible:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;SELECT * FROM (SELECT TRANSITIVE T_IN (1) T_OUT (2) T_DISTINCT &amp;quot;p1&amp;quot;, &amp;quot;p2&amp;quot; FROM &amp;quot;knows&amp;quot; ) &amp;quot;k&amp;quot; WHERE &amp;quot;k&amp;quot;.&amp;quot;p2&amp;quot; = 4; &lt;/code&gt; &lt;/pre&gt; &lt;table width=&quot;100&quot;&gt; &lt;tr&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p1&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p2&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;2&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;/blockquote&gt; &lt;p&gt;Since now we give &lt;i&gt;p2&lt;/i&gt;, we traverse from &lt;i&gt;p2&lt;/i&gt; towards &lt;i&gt;p1&lt;/i&gt;. 
The result set states that 4 is known by 2 and 2 is known by 1.&lt;/p&gt; &lt;p&gt;To see what would happen if &lt;i&gt;x&lt;/i&gt; knowing &lt;i&gt;y&lt;/i&gt; also meant &lt;i&gt;y&lt;/i&gt; knowing &lt;i&gt;x&lt;/i&gt;, one could write:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;SELECT * FROM (SELECT TRANSITIVE T_IN (1) T_OUT (2) T_DISTINCT &amp;quot;p1&amp;quot;, &amp;quot;p2&amp;quot; FROM (SELECT &amp;quot;p1&amp;quot;, &amp;quot;p2&amp;quot; FROM &amp;quot;knows&amp;quot; UNION ALL SELECT &amp;quot;p2&amp;quot;, &amp;quot;p1&amp;quot; FROM &amp;quot;knows&amp;quot; ) &amp;quot;k2&amp;quot; ) &amp;quot;k&amp;quot; WHERE &amp;quot;k&amp;quot;.&amp;quot;p2&amp;quot; = 4;&lt;/code&gt; &lt;/pre&gt; &lt;table width=&quot;100&quot;&gt; &lt;tr&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p1&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p2&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;2&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;3&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;/blockquote&gt; &lt;p&gt;Now, since we know that 1 and 4 are related, we can ask how they are related.&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;SELECT * FROM (SELECT TRANSITIVE T_IN (1) T_OUT (2) T_DISTINCT &amp;quot;p1&amp;quot;, &amp;quot;p2&amp;quot;, T_STEP (1) AS &amp;quot;via&amp;quot;, T_STEP (&amp;#39;step_no&amp;#39;) AS &amp;quot;step&amp;quot;, T_STEP (&amp;#39;path_id&amp;#39;) AS &amp;quot;path&amp;quot; FROM &amp;quot;knows&amp;quot; ) &amp;quot;k&amp;quot; WHERE &amp;quot;p1&amp;quot; = 1 AND &amp;quot;p2&amp;quot; = 4;&lt;/code&gt; &lt;/pre&gt; &lt;table width=&quot;250&quot;&gt; &lt;tr&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p1&lt;/th&gt; &lt;th 
align=&quot;center&quot; width=&quot;50&quot;&gt;p2&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;via&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;step&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;path&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;0&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;0&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;2&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;0&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;2&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;0&lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;/blockquote&gt; &lt;p&gt;The first two columns are the ends of the path. The next column is the person that is a step on the path. The next one is the number of the step, counting from 0, so that the end of the path that corresponds to the end condition on the column designated as input, i.e., &lt;i&gt;p1&lt;/i&gt;, has number 0. 
Since there can be multiple solutions, the last column is a sequence number that distinguishes alternative paths from each other.&lt;/p&gt; &lt;p&gt;For LinkedIn users: the query listing friends ordered by distance and descending friend count, which underlies most LinkedIn search result views, can be written as: &lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;SELECT &amp;quot;p2&amp;quot;, &amp;quot;dist&amp;quot;, (SELECT COUNT (*) FROM &amp;quot;knows&amp;quot; &amp;quot;c&amp;quot; WHERE &amp;quot;c&amp;quot;.&amp;quot;p1&amp;quot; = &amp;quot;k&amp;quot;.&amp;quot;p2&amp;quot; ) FROM (SELECT TRANSITIVE t_in (1) t_out (2) t_distinct &amp;quot;p1&amp;quot;, &amp;quot;p2&amp;quot;, t_step (&amp;#39;step_no&amp;#39;) AS &amp;quot;dist&amp;quot; FROM &amp;quot;knows&amp;quot; ) &amp;quot;k&amp;quot; WHERE &amp;quot;p1&amp;quot; = 1 ORDER BY &amp;quot;dist&amp;quot;, 3 DESC;&lt;/code&gt; &lt;/pre&gt; &lt;table width=&quot;150&quot;&gt; &lt;tr&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;p2&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;dist&lt;/th&gt; &lt;th align=&quot;center&quot; width=&quot;50&quot;&gt;aggregate&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;2&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;3&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;1&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;0&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td align=&quot;center&quot;&gt;4&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;2&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;0&lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;/blockquote&gt; &lt;h2&gt;How?&lt;/h2&gt; &lt;p&gt;The queries shown above work on Virtuoso v6. When running in cluster mode, several thousand graph traversal steps may be proceeding at the same time, meaning that all database access is parallelized and that the algorithm is internally latency-tolerant. 
By default, all results are produced in a deterministic order, permitting predictable slicing of result sets.&lt;/p&gt; &lt;p&gt;Furthermore, for queries where both ends of a path are given, the optimizer may decide to attack the path from both ends simultaneously. So, supposing that every member of a social network has an average of 30 contacts, and we need to find a path between two users that are no more than 6 steps apart, we begin at both ends, expanding each up to 3 levels, and we stop when we find the first intersection. Thus, we reach 2 * 30^3 = 54,000 nodes, and not 30^6 = 729,000,000 nodes.&lt;/p&gt; &lt;p&gt;Writing a generic database driven graph traversal framework on the application side, say in Java over &lt;a href=&quot;http://dbpedia.org/resource/Java_Database_Connectivity&quot; id=&quot;link-id0xa8a9ef8&quot;&gt;JDBC&lt;/a&gt;, would easily be over a thousand lines. This is much more work than can be justified just for a one-off, ad-hoc query. Besides, the traversal order in such a case could not be optimized by the DBMS.&lt;/p&gt; &lt;h2&gt;Next&lt;/h2&gt; &lt;p&gt;In a future &lt;a href=&quot;http://dbpedia.org/resource/Blog&quot; id=&quot;link-id0xb526a40&quot;&gt;blog&lt;/a&gt; post I will show how this feature can be used for common graph tasks like critical path, itinerary planning, traveling salesman, the 8 queens chess problem, etc. There are lots of switches for controlling different parameters of the traversal. This is just the beginning. I will also give examples of the use of this in SPARQL.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Epistemology of the Sponger, or How Virtuoso Drives a Web Query</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-09-05#1432</atom:id>
  <atom:published>2008-09-05T09:20:56Z</atom:published>
  <atom:updated>2008-09-05T16:04:28-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Epistemology of the Sponger, or How Virtuoso Drives a Web Query&lt;/div&gt; &lt;p&gt; &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1ed6cf28&quot;&gt;Virtuoso&lt;/a&gt; has an extensive collection of &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1f8d1f78&quot;&gt;RDF&lt;/a&gt;-izers called Sponger Cartridges. These take a web resource in one of 30+ formats (so far) and extract RDF from it. The Virtuoso &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/VirtSpongerWhitePaper.html&quot; id=&quot;link-id0x1edc90e8&quot;&gt;Sponger&lt;/a&gt; is a device which evaluates a query and, along the way, finds dereferenceable links, dereferences them, and iteratively re-evaluates the query until either nothing new is found or some limit is reached.&lt;/p&gt; &lt;p&gt;We could call this &lt;i&gt;query-driven crawling&lt;/i&gt;. The idea is intuitive: what one looks for determines what one finds.&lt;/p&gt; &lt;p&gt;This does however raise certain questions pertaining to the nature and ultimate possibility of &lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id0x1f836b68&quot;&gt;knowledge&lt;/a&gt;, i.e., epistemology.&lt;/p&gt; &lt;p&gt;The process of querying could be said to go from the few to the many, just like the process of harvesting &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1edb1648&quot;&gt;data&lt;/a&gt; from the web, the way any search engine does. One follows links or makes joins and thereby increases one&amp;#39;s reach.&lt;/p&gt; &lt;p&gt;The difference is that a query has no &lt;i&gt;a priori&lt;/i&gt; direction. If I ask for the phone numbers of my friends and there are no phone numbers in the database, then it is valid to give an empty result without looking at my friends at all. 
&lt;a href=&quot;http://dbpedia.org/resource/Closed_world_assumption&quot; id=&quot;link-id0x1edf1f30&quot;&gt;Closed world&lt;/a&gt;, as it is said. Never mind that the friends would have had a &amp;quot;see also&amp;quot; link to a retrievable document that did have a phone number.&lt;/p&gt; &lt;p&gt;The problem is that a query execution plan determines what possible dereferenceable material the query will encounter during its execution. What is worse, a query plan tends toward the minimal, i.e., toward minimizing the chances of encountering something dereferenceable along the way. Query and crawl appeared to be similar, but in fact they have opposite goals.&lt;/p&gt; &lt;p&gt;The user generally has no idea of the execution plan. In the general case, the user &lt;i&gt;cannot&lt;/i&gt; have an idea of this plan. There are valid reasons, over 40 years old, for leaving the query planning to the database. In exceptional situations, the user can read or direct the plan, but this is really quite tedious and requires understanding that is basically never present.&lt;/p&gt; &lt;p&gt;So, given a query, how do we find data that will match it, short of having a pre-loaded database of absolutely everything? This is certainly a desirable goal, and all in the &lt;a href=&quot;http://dbpedia.org/resource/Open_world_assumption&quot; id=&quot;link-id0x1eb46548&quot;&gt;open world&lt;/a&gt;, distributed spirit of the web.&lt;/p&gt; &lt;p&gt;Let us limit ourselves to queries that have some literals in the object or subject positions. A &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x1ed293f8&quot;&gt;SPARQL&lt;/a&gt; query is basically a graph. Its vertices are variables and literals, and its edges are triple patterns. An edge is labeled by a predicate. For now, we will consider the predicate to always be a literal. From each literal, we can draw a tree, following each edge starting at this literal and descending until we find another literal. 
An individual tree is not always a spanning tree of the graph, but collectively the trees span the graph.&lt;/p&gt; &lt;p&gt;Consider the query &lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;{ &amp;lt;john&amp;gt; knows ?x . &amp;lt;mary&amp;gt; knows ?x . ?x label ?l }.&lt;/code&gt; &lt;/blockquote&gt; The starting points are the literals &lt;code&gt;john&lt;/code&gt; and &lt;code&gt;mary&lt;/code&gt;. The &lt;code&gt;john&lt;/code&gt; tree has one child, &lt;code&gt;?x&lt;/code&gt;, which has the children &lt;code&gt;mary&lt;/code&gt; and &lt;code&gt;?l&lt;/code&gt;. One could notate it as &lt;blockquote&gt; &lt;code&gt;{ &amp;lt;john&amp;gt; knows ?x . {{ &amp;lt;mary&amp;gt; knows ?x} UNION {?x label ?l}}}&lt;/code&gt; &lt;/blockquote&gt; That is, the head first, and if it has more than one child, a union listing them, recursively. &lt;p&gt;If one composed such queries for each literal in the original pattern, evaluated each as a breadth-first walk of the tree with no query optimization tricks, and, for each binding of each variable, recorded whether there was something to dereference, one would in finite time have reached all the directly reachable data. Then one could evaluate the original query, using whatever plan was preferred.&lt;/p&gt; &lt;p&gt;The check for dereferenceable data, applied to each IRI-valued binding formed in the above evaluation, would consist of looking for &amp;quot;see also&amp;quot;, &amp;quot;same as&amp;quot;, and other such properties of the IRI. It could also consult text-based search engines. Since the evaluation is breadth-first, it generates a large number of parallel tasks and is fairly latency tolerant, i.e., it will not die if it must retrieve a few pages from remote sources. 
We will leave the exact rewrite rules for unions, optionals, aggregates, subqueries, and so on, as an exercise; the general idea should be clear enough.&lt;/p&gt; &lt;p&gt;We have here shown a way of transforming SPARQL queries in such a way as to guarantee dereferencing of findable links, without requiring the end user to either explicitly specify or understand query plans.&lt;/p&gt; &lt;p&gt;The present Sponger does not work exactly in this manner but it will be developed in this direction. Fortunately, the algorithms outlined above are nothing complicated.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>A quick look at SP2B, the SPARQL Performance Benchmark</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-08-27#1423</atom:id>
  <atom:published>2008-08-27T16:03:40Z</atom:published>
  <atom:updated>2008-09-02T09:49:57-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;A quick look at SP2B, the SPARQL Performance Benchmark&lt;/div&gt; &lt;p&gt;I finally got around to running the &lt;a href=&quot;http://dbis.informatik.uni-freiburg.de/index.php?project=SP2B&quot; id=&quot;link-id17bac628&quot;&gt;SP&lt;sup&gt;2&lt;/sup&gt;B SPARQL Performance Benchmark&lt;/a&gt; on the current &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1dcaaa48&quot;&gt;Virtuoso&lt;/a&gt; Open Source Edition, v5.0.8.&lt;/p&gt; &lt;p&gt;I ran it with the 5M triples scale, which is the highest scale for which the authors give numbers.&lt;/p&gt; &lt;p&gt;I got a run time of 25 minutes for the 12 queries, giving an arithmetic mean of the query time of 125 seconds. This is better than the 800 or so seconds that the authors had measured. Also, Q6 of the set had failed for the authors, but we have since fixed this; the fix is in the v5.0.8 cut.&lt;/p&gt; &lt;p&gt;I also tried it with a scale of 25M, but this became I/O bound and took a bit longer. I will try this with v6 and v7 cluster later, which are vastly better at anything I/O bound.&lt;/p&gt; &lt;p&gt;The machine was a 2GHz Xeon with 8G RAM. The query text was the one from the authors, with an explicit &lt;code&gt;FROM&lt;/code&gt; clause added; the client was the command line Interactive &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x1be2c808&quot;&gt;SQL&lt;/a&gt; (iSQL).&lt;/p&gt; &lt;p&gt;If one does the test with the default index layout without specifying a graph, things will not work very well. Also, returning the million-row results of these queries over the &lt;a href=&quot;http://www.w3.org/TR/rdf-sparql-protocol/&quot; id=&quot;link-id0x1d7ac018&quot;&gt;SPARQL protocol&lt;/a&gt; is not practical.&lt;/p&gt; &lt;p&gt;I will say something more about SP&lt;sup&gt;2&lt;/sup&gt;B when I get to have a closer look.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Configuring Virtuoso for Benchmarking</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-08-25#1419</atom:id>
  <atom:published>2008-08-25T14:06:11Z</atom:published>
  <atom:updated>2008-08-25T15:29:06.000036-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Configuring Virtuoso for Benchmarking&lt;/div&gt; &lt;p&gt;I will here summarize what should be known about running benchmarks with &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xc152cf0&quot;&gt;Virtuoso&lt;/a&gt;.&lt;/p&gt; &lt;h2&gt;Physical Memory&lt;/h2&gt; &lt;p&gt;For 8G RAM, in the &lt;code&gt;[Parameters]&lt;/code&gt; stanza of &lt;code&gt;virtuoso.ini&lt;/code&gt;, set:&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt; [Parameters]&lt;br /&gt; ...&lt;br /&gt; NumberOfBuffers = 550000 &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;For 16G RAM, double this:&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt; [Parameters]&lt;br /&gt; ...&lt;br /&gt; NumberOfBuffers = 1100000 &lt;/code&gt; &lt;/blockquote&gt; &lt;h2&gt;Transaction Isolation&lt;/h2&gt; &lt;p&gt;For most cases, certainly all &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0xb7ba270&quot;&gt;RDF&lt;/a&gt; cases, &lt;i&gt;Read Committed&lt;/i&gt; should be the default transaction isolation. In the &lt;code&gt;[Parameters]&lt;/code&gt; stanza of &lt;code&gt;virtuoso.ini&lt;/code&gt;, set:&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt; [Parameters]&lt;br /&gt; ...&lt;br /&gt; DefaultIsolation = 2 &lt;/code&gt; &lt;/blockquote&gt; &lt;h2&gt;Multiuser Workload&lt;/h2&gt; &lt;p&gt;If &lt;a href=&quot;http://dbpedia.org/resource/Open_Database_Connectivity&quot; id=&quot;link-id0x1a40f308&quot;&gt;ODBC&lt;/a&gt;, &lt;a href=&quot;http://dbpedia.org/resource/Java_Database_Connectivity&quot; id=&quot;link-id0x1e003cf8&quot;&gt;JDBC&lt;/a&gt;, or similarly connected client applications are used, there must be more &lt;code&gt;ServerThreads&lt;/code&gt; available than there will be client connections. 
In the &lt;code&gt;[Parameters]&lt;/code&gt; stanza of &lt;code&gt;virtuoso.ini&lt;/code&gt;, set:&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt; [Parameters]&lt;br /&gt; ...&lt;br /&gt; ServerThreads = 100 &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;With web clients (unlike ODBC, JDBC, or similar clients), it may be justified to have fewer &lt;code&gt;ServerThreads&lt;/code&gt; than there are concurrent clients. The &lt;code&gt;MaxKeepAlives&lt;/code&gt; should be the maximum number of expected web clients. This can be more than the &lt;code&gt;ServerThreads&lt;/code&gt; count. In the &lt;code&gt;[HTTPServer]&lt;/code&gt; stanza of &lt;code&gt;virtuoso.ini&lt;/code&gt;, set:&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt; [HTTPServer]&lt;br /&gt; ...&lt;br /&gt; ServerThreads = 100 &lt;br /&gt; MaxKeepAlives = 1000 &lt;br /&gt; KeepAliveTimeout = 10 &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt; &lt;i&gt;&lt;b&gt;Note&lt;/b&gt;: The &lt;code&gt;[HTTPServer] ServerThreads&lt;/code&gt; are taken from the total pool made available by the &lt;code&gt;[Parameters] ServerThreads&lt;/code&gt;. Thus, the &lt;code&gt;[Parameters] ServerThreads&lt;/code&gt; should always be at least as large as (and is best set greater than) the &lt;code&gt;[HTTPServer] ServerThreads&lt;/code&gt;, and if using the closed-source Commercial Version, should not exceed the licensed thread count.&lt;/i&gt; &lt;/p&gt; &lt;h2&gt;Disk Use&lt;/h2&gt; &lt;p&gt;The basic rule is to use one stripe (file) per distinct physical device (not per file system), using no RAID. For example, one might stripe a database over 6 files (6 physical disks), with an initial size of 60000 pages (the files will grow as needed). 
&lt;/p&gt; &lt;p&gt;For the above described example, in the &lt;code&gt;[Database]&lt;/code&gt; stanza of &lt;code&gt;virtuoso.ini&lt;/code&gt;, set:&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt; [Database]&lt;br /&gt; ...&lt;br /&gt; Striping = 1&lt;br /&gt; MaxCheckpointRemap = 2000000 &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;Then, in the &lt;code&gt;[Striping]&lt;/code&gt; stanza, on one line per &lt;code&gt;SegmentName&lt;/code&gt;, set:&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt; [Striping]&lt;br /&gt; ...&lt;br /&gt; Segment1 = 60000 , /virtdev/db/virt-seg1.db = q1 , /data1/db/virt-seg1-str2.db = q2 , /data2/db/virt-seg1-str3.db = q3 , /data3/db/virt-seg1-str4.db = q4 , /data4/db/virt-seg1-str5.db = q5 , /data5/db/virt-seg1-str6.db = q6&lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;As can be seen here, each file gets a background IO thread (the &lt;code&gt;= q&lt;i&gt;xxx&lt;/i&gt;&lt;/code&gt; clause). It should be noted that all files on the same physical device should have the same &lt;code&gt;q&lt;i&gt;xxx&lt;/i&gt;&lt;/code&gt; value. This is not directly relevant to the benchmarking scenario above, because we have only one file per device, and thus only one file per IO queue.&lt;/p&gt; &lt;h2&gt; &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0xc8b97c0&quot;&gt;SQL&lt;/a&gt; Optimization&lt;/h2&gt; &lt;p&gt;If queries have lots of joins but access little &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x193b2fa8&quot;&gt;data&lt;/a&gt;, as with the &lt;a href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x1b283ca0&quot;&gt;Berlin SPARQL Benchmark&lt;/a&gt;, the SQL compiler must be told not to look for better plans if the best plan so far is quicker than the compilation time expended so far. 
Thus, in the &lt;code&gt;[Parameters]&lt;/code&gt; stanza of &lt;code&gt;virtuoso.ini&lt;/code&gt;, set:&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt; [Parameters]&lt;br /&gt; ...&lt;br /&gt; StopCompilerWhenXOverRunTime = 1 &lt;/code&gt; &lt;/blockquote&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>BSBM With Triples and Mapped Relational Data</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-08-06#1410</atom:id>
  <atom:published>2008-08-06T19:41:50Z</atom:published>
  <atom:updated>2008-08-06T16:29:44.000003-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;BSBM With Triples and Mapped Relational Data&lt;/div&gt; &lt;p&gt;The special contribution of the &lt;a href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id10039db0&quot;&gt;Berlin SPARQL Benchmark&lt;/a&gt; (&lt;a href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id106b2538&quot;&gt;BSBM&lt;/a&gt;) to the &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id101a75f8&quot;&gt;RDF&lt;/a&gt; world is to raise the question of doing OLTP with &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0xae54170&quot;&gt;RDF&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Of course, here we immediately hit the question of comparisons with relational databases. To this effect, &lt;a href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0x1e847b08&quot;&gt;BSBM&lt;/a&gt; also specifies a relational schema and can generate the &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id1206c378&quot;&gt;data&lt;/a&gt; as either triples or &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id1667f040&quot;&gt;SQL&lt;/a&gt; inserts.&lt;/p&gt; &lt;p&gt;The benchmark effectively simulates the case of exposing an existing &lt;a href=&quot;http://dbpedia.org/resource/Relational_database_management_system&quot; id=&quot;link-id10a93518&quot;&gt;RDBMS&lt;/a&gt; as RDF. &lt;a href=&quot;http://www.openlinksw.com/dataspace/organization/openlink#this&quot; id=&quot;link-id13e46d80&quot;&gt;OpenLink Software&lt;/a&gt; calls this &lt;i&gt;RDF Views&lt;/i&gt;. &lt;a href=&quot;http://dbpedia.org/resource/Oracle_Database&quot; id=&quot;link-id12027578&quot;&gt;Oracle&lt;/a&gt; is beginning to call this &lt;i&gt;semantic covers&lt;/i&gt;. 
The &lt;a href=&quot;http://www.w3.org/2005/Incubator/rdb2rdf/&quot; id=&quot;link-id161dc678&quot;&gt;RDB2RDF XG&lt;/a&gt;, a W3C incubator group, has been active in this area since Spring, 2008.&lt;/p&gt; &lt;h3&gt;But why an OLTP workload with RDF to begin with?&lt;/h3&gt; &lt;p&gt;We believe this is relevant because RDF promises to be the interoperability factor between potentially all of traditional IS. If &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1e7119d8&quot;&gt;data&lt;/a&gt; is online for human consumption, it may be online via a &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id106a8908&quot;&gt;SPARQL&lt;/a&gt; end-point as well. The economic justification will come from discoverability and from applications integrating multi-source structured data. Online shopping is a fine use case.&lt;/p&gt; &lt;p&gt;Warehousing all the world&amp;#39;s publishable data as RDF is not our first preference, nor would it be the publisher&amp;#39;s. Considerations of duplicate infrastructure and maintenance are reason enough. Consequently, we need to show that mapping can outperform an RDF warehouse, which is what we&amp;#39;ll do here.&lt;/p&gt; &lt;h3&gt;What We Got &lt;/h3&gt; &lt;p&gt;First, we found that &lt;a href=&quot;http://www.openlinksw.com/dataspace/oerling/weblog/Orri%20Erling%27s%20Blog/1400&quot; id=&quot;link-id150ea748&quot;&gt;making the query plan took much too long&lt;/a&gt; in proportion to the run time. With BSBM this is an issue because the queries have lots of joins but access relatively little data. So we made a faster compiler and along the way retouched the cost model a bit.&lt;/p&gt; &lt;p&gt;But the really interesting part with BSBM is mapping relational data to RDF. For us, BSBM is a great way of showing that mapping can outperform even the best triple store. A relational row store is as good as unbeatable with the query mix. 
And when there is a clear mapping, there is no reason the &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0xae5aff0&quot;&gt;SPARQL&lt;/a&gt; could not be directly translated.&lt;/p&gt; &lt;p&gt;If Chris Bizer et al. launched the mapping ship, we will be the ones to pilot it to harbor!&lt;/p&gt; &lt;p&gt;We filled two &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id12dbdc70&quot;&gt;Virtuoso&lt;/a&gt; instances with a BSBM200000 data set, for 100M triples. One was filled with physical triples; the other was filled with the equivalent relational data plus mapping to triples. Performance figures are given in &amp;quot;query mixes per hour&amp;quot;. (An update or follow-on to this post will provide elapsed times for each test run.)&lt;/p&gt; &lt;p&gt;With the unmodified benchmark we got:&lt;/p&gt; &lt;blockquote&gt; &lt;table&gt; &lt;tr&gt; &lt;td&gt;&lt;i&gt;Physical Triples:&lt;/i&gt; &lt;/td&gt; &lt;td&gt;&amp;#160;&lt;/td&gt; &lt;td&gt;1297 qmph&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt;&lt;i&gt;Mapped Triples:&lt;/i&gt; &lt;/td&gt; &lt;td&gt;&amp;#160;&lt;/td&gt; &lt;td&gt;&lt;b&gt;3144 qmph&lt;/b&gt; &lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;/blockquote&gt; &lt;p&gt;In both cases, most of the time was spent on Q6, which looks for products with one of three words in the label. We altered Q6 to use a text index for the mapping, and altered the databases accordingly. 
(There is no such thing as an e-commerce site without a text index, so we are amply justified in making this change.)&lt;/p&gt; &lt;p&gt;The following were measured on the second run of a 100 query mix series, single test driver, warm cache.&lt;/p&gt; &lt;blockquote&gt; &lt;table&gt; &lt;tr&gt; &lt;td&gt;&lt;i&gt;Physical Triples:&lt;/i&gt; &lt;/td&gt; &lt;td&gt;&amp;#160;&lt;/td&gt; &lt;td&gt; 5746 qmph&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt;&lt;i&gt;Mapped Triples:&lt;/i&gt; &lt;/td&gt; &lt;td&gt;&amp;#160;&lt;/td&gt; &lt;td&gt; &lt;b&gt;7525 qmph&lt;/b&gt; &lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;/blockquote&gt; &lt;p&gt;We then ran the same with 4 concurrent instances of the test driver. The qmph here is the total of 400 query mixes divided by the longest run time, in hours.&lt;/p&gt; &lt;blockquote&gt; &lt;table&gt; &lt;tr&gt; &lt;td&gt;&lt;i&gt;Physical Triples:&lt;/i&gt; &lt;/td&gt; &lt;td&gt;&amp;#160;&lt;/td&gt; &lt;td&gt; 19459 qmph&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt;&lt;i&gt;Mapped Triples:&lt;/i&gt; &lt;/td&gt; &lt;td&gt;&amp;#160;&lt;/td&gt; &lt;td&gt; &lt;b&gt;24531 qmph&lt;/b&gt; &lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;/blockquote&gt; &lt;p&gt;The system used was 64-bit Linux, 2GHz dual-Xeon 5130 (8 cores) with 8G RAM. The concurrent throughputs are a little under 4 times the single thread throughput (about 3.4 times for physical triples and 3.3 times for mapped triples), which is normal for SMP due to memory contention. The numbers do not evidence significant overhead from thread synchronization.&lt;/p&gt; &lt;p&gt;The query compilation represents about 1/3 of total server side CPU. In an actual online application of this type, queries would be parameterized, so the throughputs would be accordingly higher. 
We used the &lt;code&gt;StopCompilerWhenXOverRunTime = 1&lt;/code&gt; option here to cut needless compiler overhead, the queries being straightforward enough.&lt;/p&gt; &lt;p&gt;We also see that the advantage of mapping can be further increased by more compiler optimizations, so we expect in the end mapping will lead RDF warehousing by a factor of 4 or so.&lt;/p&gt; &lt;h3&gt;Suggestions for BSBM&lt;/h3&gt; &lt;ul&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Reporting Rules.&lt;/b&gt; The benchmark spec should specify a form for disclosure of test run data, TPC style. This includes things like configuration parameters and exact text of queries. There should be accepted variants of query text, as with the TPC.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Multiuser operation.&lt;/b&gt; The test driver should get a stream number as parameter, so that each client makes a different query sequence. Also, disk performance in this type of benchmark can only be reasonably assessed with a naturally parallel multiuser workload.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Add business intelligence.&lt;/b&gt; SPARQL has aggregates now, at least with &lt;a href=&quot;http://jena.sourceforge.net/&quot; id=&quot;link-id11a25ac0&quot;&gt;Jena&lt;/a&gt; and &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xb003180&quot;&gt;Virtuoso&lt;/a&gt;, so let&amp;#39;s use these. The BSBM business intelligence metric should be a separate metric off the same data. Adding synthetic sales figures would make more interesting queries possible. For example, producing recommendations like &amp;quot;customers who bought this also bought xxx.&amp;quot;&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;For the SPARQL community&lt;/b&gt;, BSBM sends the message that one ought to support parameterized queries and stored procedures. 
This would be a &lt;a href=&quot;http://www.w3.org/TR/rdf-sparql-protocol/&quot; id=&quot;link-id109e2448&quot;&gt;SPARQL protocol&lt;/a&gt; extension; the SPARUL syntax should also have a way of calling a procedure. Something like &lt;code&gt;select proc (??, ??)&lt;/code&gt; would be enough, where &lt;code&gt;??&lt;/code&gt; is a parameter marker, like &lt;code&gt;?&lt;/code&gt; in &lt;a href=&quot;http://dbpedia.org/resource/Open_Database_Connectivity&quot; id=&quot;link-id13febf48&quot;&gt;ODBC&lt;/a&gt;/&lt;a href=&quot;http://dbpedia.org/resource/Java_Database_Connectivity&quot; id=&quot;link-id120416a8&quot;&gt;JDBC&lt;/a&gt;.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Add transactions.&lt;/b&gt; Especially if we are contrasting mapping vs. storing triples, having an update flow is relevant. In practice, this could be done by having the test driver send web service requests for order entry, and the SUT could implement these as updates to the triples or to a mapped relational store. This could use stored procedures or logic in an app server.&lt;/p&gt; &lt;/li&gt; &lt;/ul&gt; &lt;h3&gt;Comments on Query Mix&lt;/h3&gt; &lt;p&gt;The time of most queries grows less than linearly with the scale factor. Q6 is an exception if it is not implemented using a text index. Without the text index, Q6 will inevitably come to dominate query time as the scale is increased, and thus will make the benchmark less relevant at larger scales.&lt;/p&gt; &lt;h2&gt;Next&lt;/h2&gt; &lt;p&gt;We include the sources of our RDF view definitions and other material for running BSBM with our forthcoming Virtuoso Open Source 5.0.8 release. This also includes all the query optimization work done for BSBM. This will be available in the coming days.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso Optimizations for the Berlin SPARQL Benchmark</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-07-30#1401</atom:id>
  <atom:published>2008-07-30T18:52:11Z</atom:published>
  <atom:updated>2008-08-06T16:29:42-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Virtuoso Optimizations for the Berlin SPARQL Benchmark &lt;/div&gt; &lt;p&gt;We had a look at Chris Bizer&amp;#39;s initial results with the &lt;a href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id105c9f78&quot;&gt;Berlin SPARQL Benchmark&lt;/a&gt; (&lt;a href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id102d62b0&quot;&gt;BSBM&lt;/a&gt;) on &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id13eb9780&quot;&gt;Virtuoso&lt;/a&gt;. The first results were rather bad, as nearly all of the run time was spent optimizing the &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id14a51258&quot;&gt;SPARQL&lt;/a&gt; statements and under 10% actually running them.&lt;/p&gt; &lt;p&gt;So I spent a couple of days on the &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0xaad28d0&quot;&gt;SPARQL&lt;/a&gt;/&lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id108745b0&quot;&gt;SQL&lt;/a&gt; compiler, to the effect of making it do a better guess of initial execution plan and streamlining some operations. In fact, many of the queries in &lt;a href=&quot;http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html&quot; id=&quot;link-id0xaa230b8&quot;&gt;BSBM&lt;/a&gt; are not particularly sensitive to execution plan, as they access a very small portion of the database. So to close the matter, I put in a flag that makes the &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x1e9e8e28&quot;&gt;SQL&lt;/a&gt; compiler give up on devising new plans if the time of the best plan so far is less than the time spent compiling so far.&lt;/p&gt; &lt;p&gt;With these changes, available now as a diff on top of 5.0.7, we run quite well, several times better than initially. 
With the compiler time cut-off in place (ini parameter &lt;code&gt;StopCompilerWhenXOverRunTime = 1&lt;/code&gt;), we get the following times, output from the BSBM test driver:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;
Starting test...
0: 1031.22 ms, total: 1151 ms
1: 982.89 ms, total: 1040 ms
2: 923.27 ms, total: 968 ms
3: 898.37 ms, total: 932 ms
4: 855.70 ms, total: 865 ms

Scale factor: 10000
Number of query mix runs: 5 times
min/max Query mix runtime: 0.8557 s / 1.0312 s
Total runtime: 4.691 seconds
QMpH: 3836.77 query mixes per hour
CQET: 0.93829 seconds average runtime of query mix
CQET (geom.): 0.93625 seconds geometric mean runtime of query mix

Metrics for Query 1:
Count: 5 times executed in whole run
AQET: 0.012212 seconds (arithmetic mean)
AQET(geom.): 0.009934 seconds (geometric mean)
QPS: 81.89 Queries per second
minQET/maxQET: 0.00684000s / 0.03115700s
Average result count: 7.0
min/max result count: 3 / 10

Metrics for Query 2:
Count: 35 times executed in whole run
AQET: 0.030490 seconds (arithmetic mean)
AQET(geom.): 0.029776 seconds (geometric mean)
QPS: 32.80 Queries per second
minQET/maxQET: 0.02467300s / 0.06753000s
Average result count: 22.5
min/max result count: 15 / 30

Metrics for Query 3:
Count: 5 times executed in whole run
AQET: 0.006947 seconds (arithmetic mean)
AQET(geom.): 0.006905 seconds (geometric mean)
QPS: 143.95 Queries per second
minQET/maxQET: 0.00580000s / 0.00795100s
Average result count: 4.0
min/max result count: 0 / 10

Metrics for Query 4:
Count: 5 times executed in whole run
AQET: 0.008858 seconds (arithmetic mean)
AQET(geom.): 0.008829 seconds (geometric mean)
QPS: 112.89 Queries per second
minQET/maxQET: 0.00804400s / 0.01019500s
Average result count: 3.4
min/max result count: 0 / 10

Metrics for Query 5:
Count: 5 times executed in whole run
AQET: 0.087542 seconds (arithmetic mean)
AQET(geom.): 0.087327 seconds (geometric mean)
QPS: 11.42 Queries per second
minQET/maxQET: 0.08165600s / 0.09889200s
Average result count: 5.0
min/max result count: 5 / 5

Metrics for Query 6:
Count: 5 times executed in whole run
AQET: 0.131222 seconds (arithmetic mean)
AQET(geom.): 0.131216 seconds (geometric mean)
QPS: 7.62 Queries per second
minQET/maxQET: 0.12924200s / 0.13298200s
Average result count: 3.6
min/max result count: 3 / 5

Metrics for Query 7:
Count: 20 times executed in whole run
AQET: 0.043601 seconds (arithmetic mean)
AQET(geom.): 0.040890 seconds (geometric mean)
QPS: 22.94 Queries per second
minQET/maxQET: 0.01984400s / 0.06012600s
Average result count: 26.4
min/max result count: 5 / 96

Metrics for Query 8:
Count: 10 times executed in whole run
AQET: 0.018168 seconds (arithmetic mean)
AQET(geom.): 0.016205 seconds (geometric mean)
QPS: 55.04 Queries per second
minQET/maxQET: 0.01097600s / 0.05066900s
Average result count: 12.8
min/max result count: 6 / 20

Metrics for Query 9:
Count: 20 times executed in whole run
AQET: 0.043813 seconds (arithmetic mean)
AQET(geom.): 0.043807 seconds (geometric mean)
QPS: 22.82 Queries per second
minQET/maxQET: 0.04274900s / 0.04504100s
Average result count: 0.0
min/max result count: 0 / 0

Metrics for Query 10:
Count: 15 times executed in whole run
AQET: 0.030697 seconds (arithmetic mean)
AQET(geom.): 0.029651 seconds (geometric mean)
QPS: 32.58 Queries per second
minQET/maxQET: 0.02072000s / 0.03975700s
Average result count: 1.1
min/max result count: 0 / 4

real 0 m 5.485 s
user 0 m 2.233 s
sys 0 m 0.170 s
&lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;Of the approximately 5.5 seconds of running five query mixes, the test driver spends 2.2 s. The server side processing time is 3.1 s, of which SQL compilation is 1.35 s. The rest is miscellaneous system time. The measurement is on 64-bit Linux, 2GHz dual-Xeon 5130 (8 cores) with 8G RAM. 
&lt;/p&gt; &lt;p&gt;We note that this type of workload would be done with stored procedures or prepared, parameterized queries in the SQL world.&lt;/p&gt; &lt;p&gt;There will be some further tuning still but this addresses the bulk of the matter. There will be a separate message about the patch containing these improvements.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso 5.0.7 Release, Now With Jena and Sesame APIs</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-07-17#1393</atom:id>
  <atom:published>2008-07-17T17:18:09Z</atom:published>
  <atom:updated>2008-07-17T15:28:22.000002-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Virtuoso 5.0.7 Release, Now With Jena and Sesame APIs&lt;/div&gt; &lt;h2&gt;Improvements&lt;/h2&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://docs.openlinksw.com:80/virtuoso/rdfnativestorageproviders.html&quot; id=&quot;link-id13e54d98&quot;&gt;Full operation&lt;/a&gt; with &lt;a href=&quot;http://jena.sourceforge.net/&quot; id=&quot;link-id0x11a3d360&quot;&gt;Jena&lt;/a&gt; and &lt;a href=&quot;http://sourceforge.net/projects/sesame/&quot; id=&quot;link-id0x1108d428&quot;&gt;Sesame&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1288aa00&quot;&gt;RDF&lt;/a&gt; Frameworks. This fully replaces any previous attempts at interop, and introduces samples and test suites.&lt;/li&gt; &lt;li&gt;Better support for alternate RDF indexing schemes&lt;/li&gt; &lt;li&gt;Parallel operation of the RDF Sponger, importing multiple sources concurrently.&lt;/li&gt; &lt;li&gt;New &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x128a9810&quot;&gt;data&lt;/a&gt; formats supported for on-demand RDF-ization in the Sponger&lt;/li&gt; &lt;li&gt;More efficient support for inference of subclass and sub-property; now capable of efficiently handling taxonomies of tens of thousands of classes&lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://dbpedia.org/resource/Web_Ontology_Language&quot; id=&quot;link-id0x6af0678&quot;&gt;OWL&lt;/a&gt; &lt;a href=&quot;http://docs.openlinksw.com:80/virtuoso/rdfsparqlrule.html#rdfsparqlruleintro&quot; id=&quot;link-id104d58d8&quot;&gt;equivalentClass and equivalentProperty&lt;/a&gt; support.&lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://docs.openlinksw.com:80/virtuoso/rdfdatarepresentation.html#rdfdynamiclocal&quot; id=&quot;link-id109606a8&quot;&gt;Dynamic IRI host part&lt;/a&gt; support for mapped data and for metadata of local resources. 
Renaming the host or using multiple virtual hosts will accept URIs with the right host part and refer to the same thing, no duplicate storage required.&lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x12e0cc38&quot;&gt;SPARQL&lt;/a&gt; optimizations for &lt;code&gt;LIMIT&lt;/code&gt; and &lt;code&gt;OFFSET&lt;/code&gt; &lt;/li&gt; &lt;/ul&gt; &lt;h2&gt;Documentation&lt;/h2&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://docs.openlinksw.com:80/virtuoso/perfdiag.html#perfdiagqueryplans&quot; id=&quot;link-id10a56dd0&quot;&gt;How to read query plans and how to use the key performance meters&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://docs.openlinksw.com:80/virtuoso/rdfperformancetuning.html#rdfperfcost&quot; id=&quot;link-id106cb5c0&quot;&gt;How to diagnose SPARQL queries and how to decide what indexing scheme is right for each RDF use case&lt;/a&gt; &lt;/li&gt; &lt;li&gt;How to debug RDF views&lt;/li&gt; &lt;ul&gt; &lt;li&gt; &lt;a href=&quot;http://docs.openlinksw.com:80/virtuoso/sparqldebug.html&quot; id=&quot;link-id133b4420&quot;&gt;Better documentation of SPARQL extensions and options&lt;/a&gt; &lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://docs.openlinksw.com:80/virtuoso/rdfviews.html#rdfviewnorthwindexample1&quot; id=&quot;link-id1060fdd8&quot;&gt;A sample of correct RDF view usage with the Northwind demo data&lt;/a&gt; &lt;/li&gt; &lt;/ul&gt; &lt;/ul&gt; &lt;h2&gt;Bug Fixes&lt;/h2&gt; &lt;ul&gt; &lt;li&gt;Generally improved safety of built-in functions, better argument checking.&lt;/li&gt; &lt;li&gt;Verified UTF8 international character support in all RDF use cases, &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x12839fd0&quot;&gt;SQL&lt;/a&gt; client/&lt;a href=&quot;http://www.w3.org/TR/rdf-sparql-protocol/&quot; id=&quot;link-id0x1288f350&quot;&gt;SPARQL protocol&lt;/a&gt;/all data formats.&lt;/li&gt; &lt;/ul&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>The DARQ Matter of Federation</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-06-09#1381</atom:id>
  <atom:published>2008-06-09T14:02:19Z</atom:published>
  <atom:updated>2008-06-11T15:15:14-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;The DARQ Matter of Federation&lt;/div&gt; &lt;p&gt;Astronomers propose that the universe is held together, so to speak, by the gravity of invisible &amp;quot;dark matter&amp;quot; spread in interstellar and intergalactic space.&lt;/p&gt; &lt;p&gt;For the &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x19dbf410&quot;&gt;data&lt;/a&gt; web, it will be held together by federation, also an invisible factor. As in Minkowski space, so in &lt;a href=&quot;http://dbpedia.org/resource/Cyberspace&quot; id=&quot;link-id0x9fc13ff8&quot;&gt;cyberspace&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;To take the astronomical analogy further, putting too much visible stuff in one place makes a black hole, whose chief properties are that it is very heavy, can only get heavier and that nothing comes out.&lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://darq.sourceforge.net/&quot; id=&quot;link-id0x1d06bd88&quot;&gt;DARQ&lt;/a&gt; is Bastian Quilitz&amp;#39;s federated extension of the &lt;a href=&quot;http://jena.sourceforge.net/&quot; id=&quot;link-id0x1cf28f70&quot;&gt;Jena&lt;/a&gt; &lt;a href=&quot;http://jena.sourceforge.net/ARQ/&quot; id=&quot;link-id0x1cba22c8&quot;&gt;ARQ&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x171c7dc8&quot;&gt;SPARQL&lt;/a&gt; processor. It has existed for a while and was also presented at &lt;a href=&quot;http://www.eswc2008.org/&quot; id=&quot;link-id0x1ed53cd0&quot;&gt;ESWC2008&lt;/a&gt;. There is also SPARQL FED from Andy Seaborne, an explicit means of specifying which end point will process which fragment of a distributed SPARQL query. Still, for federation to deliver in an open, decentralized world, it must be transparent. 
For a specific application, with a predictable workload, it is of course OK to partition queries explicitly.&lt;/p&gt; &lt;p&gt;Bastian had split &lt;a href=&quot;http://dbpedia.org/resource/DBpedia&quot; id=&quot;link-id0x1ce846c0&quot;&gt;DBpedia&lt;/a&gt; among five &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1cad0640&quot;&gt;Virtuoso&lt;/a&gt; servers and was querying this set with DARQ. The end result was that there was a rather frightful cost of federation as opposed to all the data residing in a single Virtuoso. The other result was that if selectivity of predicates was not correctly guessed by the federation engine, the proposition was a non-starter. With correct join order it worked, though.&lt;/p&gt; &lt;p&gt;Yet, we really want federation. Looking further down the road, we simply must make federation work. This is just as necessary as running on a server cluster for mid-size workloads.&lt;/p&gt; &lt;p&gt;Since we are convinced of the cause, let&amp;#39;s talk about the means.&lt;/p&gt; &lt;p&gt;For DARQ as it now stands, there&amp;#39;s probably an order of magnitude or even more to gain from a couple of simple tricks. If going to a SPARQL end point that is not the outermost in the loop join sequence, batch the requests together in one &lt;a href=&quot;http://dbpedia.org/resource/Hypertext_Transfer_Protocol&quot; id=&quot;link-id0x19a48280&quot;&gt;HTTP&lt;/a&gt;/1.1 message. So, if the query is &amp;quot;get me my friends living in cities of over a million people,&amp;quot; there will be the fragment &amp;quot;get city where x lives&amp;quot; and later &amp;quot;ask if population of x greater than 1000000&amp;quot;. If I have 100 friends, I send the 100 requests in a batch to each eligible server.&lt;/p&gt; &lt;p&gt;Further, if running against a server of known brand, use a client-server connection and prepared statements with array parameters. 
This can well improve the processing speed at the remote end point by another order of magnitude. This gain may however not be as great as the latency savings from message batching. We will provide a sample of how to do this with Virtuoso over &lt;a href=&quot;http://dbpedia.org/resource/Java_Database_Connectivity&quot; id=&quot;link-id0x1cf18278&quot;&gt;JDBC&lt;/a&gt; so Bastian can try this if interested.&lt;/p&gt; &lt;p&gt;These simple things will give a lot of mileage and may even decide whether federation is an option in specific applications. For the open web however, these measures will not yet win the day.&lt;/p&gt; &lt;p&gt;When federating &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x1cf7d0e8&quot;&gt;SQL&lt;/a&gt;, colocation of data is sort of explicit. If two tables are joined and they are in the same source, then the join can go to the source. For SPARQL this is also so but with a twist:&lt;/p&gt; &lt;p&gt;If a foaf:Person is found on a given server, this does not mean that the Person&amp;#39;s geek code or email hash will be on the same server. Thus &lt;code&gt;{?p name &amp;quot;Johnny&amp;quot; . ?p geekCode ?g . ?p emailHash ?h }&lt;/code&gt; does not necessarily denote a colocated join if many servers serve items of the vocabulary.&lt;/p&gt; &lt;p&gt;However, in most practical cases, for obtaining a rapid answer, treating this as a colocated fragment will be appropriate. Thus, it may be necessary to be able to declare that geek codes will be assumed colocated with names. This will save a lot of message passing and offer decent, if not theoretically total recall. For search style applications, starting with such assumptions will make sense. 
If nothing is found, then we can partition each join step separately for the unlikely case that there were a server that gave geek codes but not names.&lt;/p&gt; &lt;p&gt;For Virtuoso, we find that a federated query&amp;#39;s asynchronous, parallel evaluation model is not so different from that on a local cluster. So the cluster version could have the option of federated query. The difference is that a cluster is local and tightly coupled and predictably partitioned but a federated setting is none of these.&lt;/p&gt; &lt;p&gt;For description, we would take DARQ&amp;#39;s description model and maybe extend it a little where needed. Also we would enhance the protocol to allow just asking for the query cost estimate given a query with literals specified. We will do this eventually.&lt;/p&gt; &lt;p&gt;We would like to talk to Bastian about large improvements to DARQ, especially when working with Virtuoso. We&amp;#39;ll see.&lt;/p&gt; &lt;p&gt;Of course, one mode of federating is the crawl-as-you-go approach of the Virtuoso &lt;a href=&quot;http://virtuoso.openlinksw.com/Whitepapers/html/VirtSpongerWhitePaper.html&quot; id=&quot;link-id0x1e163140&quot;&gt;Sponger&lt;/a&gt;. This will bring in fragments following seeAlso or sameAs declarations or other references. This will however not have the recall of a warehouse or federation over well-described SPARQL end-points. But up to a certain volume it has the speed of local storage.&lt;/p&gt; &lt;p&gt;The emergence of voiD (Vocabulary of Interlinked Datasets) is a step in the direction of making federation a reality. There is &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1377&quot; id=&quot;link-id1109a4c8&quot;&gt;a separate post&lt;/a&gt; about this.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Aspects of RDF to RDF Mapping</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-06-09#1380</atom:id>
  <atom:published>2008-06-09T14:02:18Z</atom:published>
  <atom:updated>2008-06-11T13:15:39-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Aspects of RDF to RDF Mapping&lt;/div&gt; &lt;p&gt;The W3C has recently launched an &lt;a href=&quot;http://www.w3.org/2005/Incubator/rdb2rdf/&quot; id=&quot;link-idd763f48&quot;&gt;incubator group about mapping relational data to RDF&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;From participating in the group for the few initial sessions, I get the following impressions.&lt;/p&gt; &lt;p&gt;There is a segment of users, for example from the biomedical community, who do heavy duty &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x17f9e6f8&quot;&gt;data&lt;/a&gt; integration and look to &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x17eabf48&quot;&gt;RDF&lt;/a&gt; for managing complexity. Unifying heterogeneous data under OWL ontologies, reasoning, and data integrity, are points of interest.&lt;/p&gt; &lt;p&gt;There is another segment that is concerned with semantifying the document web, which topic includes initiatives such as &lt;a href=&quot;http://triplify.org/&quot; id=&quot;link-id0x1a25cd28&quot;&gt;Triplify&lt;/a&gt; and &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id0x182c41e8&quot;&gt;semantic web&lt;/a&gt; search such as &lt;a href=&quot;http://sindice.org/&quot; id=&quot;link-id0x1a29c5e8&quot;&gt;Sindice&lt;/a&gt;. The emphasis there is on minimizing entry cost and creating critical mass. 
The next one to come will clean up the semantics, if these need be cleaned up at all.&lt;/p&gt; &lt;p&gt;(Some cleanup is taking place with &lt;a href=&quot;http://www.mpi-inf.mpg.de/~suchanek/downloads/yago/&quot; id=&quot;link-id0x17fd2b70&quot;&gt;Yago&lt;/a&gt; and &lt;a href=&quot;http://zitgist.com/about/&quot; id=&quot;link-id0x17e6ab88&quot;&gt;Zitgist&lt;/a&gt;, but this is a matter for a different post.)&lt;/p&gt; &lt;p&gt;Thus, technically speaking, the mapping landscape is diverse, but ETL (extract-transform-load) seems to predominate. The biomedical people make data warehouses for answering specific questions. The web people are interested in putting data out in the expectation that the next player will warehouse it and allow running complex meshups against the whole of the RDF-ized web.&lt;/p&gt; &lt;p&gt;As one would expect, these groups see different issues and needs. Roughly speaking, one is about quality and structure and the other is about volume.&lt;/p&gt; &lt;p&gt;Where do we stand?&lt;/p&gt; &lt;p&gt;We are with the research data warehousers in saying that the mapping question is very complex and that it would indeed be nice to bypass ETL and go to the source &lt;a href=&quot;http://dbpedia.org/resource/Relational_database_management_system&quot; id=&quot;link-id0x182acd68&quot;&gt;RDBMS&lt;/a&gt;(s) on demand. Projects in this direction are ongoing.&lt;/p&gt; &lt;p&gt;We are with the web people in building large RDF stores with scalable query answering for arbitrary RDF, for example, hosting a lot of the Linking Open Data sets, and working with Zitgist.&lt;/p&gt; &lt;p&gt;These things are somewhat different.&lt;/p&gt; &lt;p&gt;At present, both the research warehousers and the web scalers predominantly go for ETL.&lt;/p&gt; &lt;p&gt;This is fine by us as we definitely are in the large RDF store race.&lt;/p&gt; &lt;p&gt;Still, mapping has its point. 
A relational store will perform quite a bit faster than a quad store if it has the right covering indices or application-specific compressed columnar layout. Thus, there is nothing to block us from querying analytics in &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x16c91438&quot;&gt;SPARQL&lt;/a&gt;, once the obviously necessary extensions of sub-query, expressions and aggregation are in place.&lt;/p&gt; &lt;p&gt;To cite an example, the Ordnance Survey of the UK has a GIS system running on &lt;a href=&quot;http://dbpedia.org/resource/Oracle_Database&quot; id=&quot;link-id0x17ee37c8&quot;&gt;Oracle&lt;/a&gt; with an entry pretty much for each mailbox, lamp post, and hedgerow in the country. According to Ordnance Survey, this would be 1 petatriple, 1e15 triples. &amp;quot;Such a big server farm that we&amp;#39;d have to put it on our map,&amp;quot; as Jenny Harding put it at &lt;a href=&quot;http://www.eswc2008.org/&quot; id=&quot;link-id0x1cab6330&quot;&gt;ESWC2008&lt;/a&gt;. I&amp;#39;d add that an even bigger map entry would be the power plant needed to run the 100,000 or so PCs this would take. This is counting 10 gigatriples per PC, which would not even give very good working sets.&lt;/p&gt; &lt;p&gt;So, on-the-fly RDBMS-to-RDF mapping in some cases is simply necessary. Still, the benefits of RDF for integration can be preserved if the translation middleware is smart enough. 
Specifically, this entails knowing what tables can be joined with what other tables and pushing maximum processing to the RDBMS(s) involved in the query.&lt;/p&gt; &lt;p&gt;You can download the slide set I used for the &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xa1fb7e8&quot;&gt;Virtuoso&lt;/a&gt; presentation for the RDB to RDF mapping incubator group (&lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main/VirtPresentations/Relational2RDF.ppt&quot; id=&quot;link-id106f9e88&quot;&gt;PPT&lt;/a&gt;; &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main/VirtPresentations&quot; id=&quot;link-id10a8dc90&quot;&gt;other formats&lt;/a&gt; coming soon). The main point is that real integration is hard and needs smart query splitting and optimization, as well as real understanding of the databases and subject matter from the &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x17ee38a0&quot;&gt;information&lt;/a&gt; architect. Sometimes in the web space it can suffice to put data out there with trivial RDF translation and hope that a search engine or such will figure out how to join this with something else. For the enterprise, things are not so. Benefits are clear if one can navigate between disjoint silos but making this accurate enough for deriving business conclusions, as well as efficient enough for production, is a soluble but non-trivial question.&lt;/p&gt; &lt;p&gt;We will show the basics of this with the &lt;a href=&quot;http://dbpedia.org/resource/TPC-H&quot; id=&quot;link-id0x1844d718&quot;&gt;TPC-H&lt;/a&gt; mapping, and by joining this with physical triples. We will also make a set of TPC-H format table sets, make mappings between keys in one to keys in the other, and show joins between the two. The SPARQL querying of one such data store is a done deal, including the SPARQL extensions for this. 
There is even a demo paper, Business Intelligence Extensions for SPARQL (&lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main/VirtPresentations/RDFAndMapped_BI.pdf&quot; id=&quot;link-id12ea4b18&quot;&gt;PDF&lt;/a&gt;; &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main/VirtPresentations&quot; id=&quot;link-id106e1810&quot;&gt;other formats&lt;/a&gt; coming soon), by us on the subject in the ESWC 2008 proceedings. If there is an issue left, it is just the technicality of always producing &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x17fc8d60&quot;&gt;SQL&lt;/a&gt; that looks hand-crafted and hence is better understood by the target RDBMS(s). For example, Oracle works better if one uses an &lt;code&gt;IN&lt;/code&gt; sub-query instead of the equivalent existence test.&lt;/p&gt; &lt;p&gt;Follow this &lt;a href=&quot;http://dbpedia.org/resource/Blog&quot; id=&quot;link-id0xa9bcef8&quot;&gt;blog&lt;/a&gt; for more on the topic; published papers are always a limited view on the matter.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>ESWC 2008</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-06-09#1379</atom:id>
  <atom:published>2008-06-09T14:02:16Z</atom:published>
  <atom:updated>2008-06-11T13:15:33-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;ESWC 2008&lt;/div&gt; &lt;p&gt;Yrjänä Rankka and I attended &lt;a href=&quot;http://www.eswc2008.org/&quot; id=&quot;link-id10b7a038&quot;&gt;ESWC2008&lt;/a&gt; on behalf of OpenLink.&lt;/p&gt; &lt;p&gt;We were invited at the last minute to give a &lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id105df758&quot;&gt;Linked Open Data&lt;/a&gt; talk at Paolo Bouquet&amp;#39;s Identity and Reference workshop. We also had a demo of &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id12eacca0&quot;&gt;SPARQL&lt;/a&gt; BI (&lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main/VirtPresentations/ESWC2008%20SPARQL%20BI%20OpenLink.ppt&quot; id=&quot;link-id10b43e58&quot;&gt;PPT&lt;/a&gt;; &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main/VirtPresentations&quot; id=&quot;link-id1116d8f0&quot;&gt;other formats coming soon&lt;/a&gt;), our business intelligence extensions to &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x16c9bfc8&quot;&gt;SPARQL&lt;/a&gt; as well as joining between relational &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id10badc40&quot;&gt;data&lt;/a&gt; mapped to &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id108edaf8&quot;&gt;RDF&lt;/a&gt; and native &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x181a5ed8&quot;&gt;RDF&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x17e69910&quot;&gt;data&lt;/a&gt;. 
I also spoke on the social networks panel chaired by Harry Halpin.&lt;/p&gt; &lt;p&gt;I have gathered a few impressions that I will share in the next few posts (&lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1375&quot; id=&quot;link-id107298e0&quot;&gt;1 - RDF Mapping&lt;/a&gt;, &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1376&quot; id=&quot;link-id10b3a530&quot;&gt;2 - DARQ&lt;/a&gt;, &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1377&quot; id=&quot;link-id107290e0&quot;&gt;3 - voiD&lt;/a&gt;, &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1378&quot; id=&quot;link-id1071a950&quot;&gt;4 - Paradigmata&lt;/a&gt;). &lt;i&gt;Caveat: This is not meant to be complete or impartial press coverage of the event but rather some quick comments on issues of personal/OpenLink interest. The fact that I do not mention something does not mean that it is unimportant.&lt;/i&gt; &lt;/p&gt; &lt;h2&gt;The voiD Graph&lt;/h2&gt; &lt;p&gt; &lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id0x1a87f110&quot;&gt;Linked Open Data&lt;/a&gt; was well represented, with Chris Bizer, Tom Heath, ourselves and many others. The great advance for &lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id108f3c48&quot;&gt;LOD&lt;/a&gt; this time around is &lt;a href=&quot;http://community.linkeddata.org/MediaWiki/index.php?MetaLOD#Kick-off_meeting_at_ESWC08&quot; id=&quot;link-id10df9830&quot;&gt;voiD, the Vocabulary of Interlinked Datasets&lt;/a&gt;, a means to describe what in fact is inside the &lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id0x1a089980&quot;&gt;LOD&lt;/a&gt; cloud, how to join it with what and so forth. 
Big time important if there is to be a &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1377&quot; id=&quot;link-iddf74578&quot;&gt;web of federatable data sources&lt;/a&gt;, feeding directly into what we have been saying for a while about SPARQL end-point self-description and discovery. There is reasonable hope of having something by the date of &lt;a href=&quot;http://www.linkeddataplanet.com/&quot; id=&quot;link-id10dd0848&quot;&gt;Linked Data Planet&lt;/a&gt; in a couple of weeks.&lt;/p&gt; &lt;h2&gt;Federating&lt;/h2&gt; &lt;p&gt;Bastian Quilitz gave a talk about his &lt;a href=&quot;http://darq.sourceforge.net/&quot; id=&quot;link-id108746e8&quot;&gt;DARQ&lt;/a&gt;, a federated version of Jena&amp;#39;s ARQ.&lt;/p&gt; &lt;p&gt;Something like &lt;a href=&quot;http://darq.sourceforge.net/&quot; id=&quot;link-id0x1a2d9860&quot;&gt;DARQ&lt;/a&gt;&amp;#39;s optimization statistics should make their way into the &lt;a href=&quot;http://www.w3.org/TR/rdf-sparql-protocol/&quot; id=&quot;link-id10992348&quot;&gt;SPARQL protocol&lt;/a&gt; as well as the voiD data set description.&lt;/p&gt; &lt;p&gt;We really need federation but more on this in &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1376&quot; id=&quot;link-id1059d688&quot;&gt;a separate post&lt;/a&gt;.&lt;/p&gt; &lt;h2&gt; &lt;a href=&quot;http://xsparql.deri.ie/&quot; id=&quot;link-id10314308&quot;&gt;XSPARQL&lt;/a&gt; &lt;/h2&gt; &lt;p&gt;Axel Polleres et al had a paper about &lt;a href=&quot;http://xsparql.deri.ie/&quot; id=&quot;link-id0x1ad77490&quot;&gt;XSPARQL&lt;/a&gt;, a merge of &lt;a href=&quot;http://dbpedia.org/resource/XQuery&quot; id=&quot;link-id10b98e90&quot;&gt;XQuery&lt;/a&gt; and SPARQL. While visiting DERI a couple of weeks back and again at the conference, we talked about OpenLink implementing the spec. 
It is evident that the engines must be in the same process and not communicate via the &lt;a href=&quot;http://www.w3.org/TR/rdf-sparql-protocol/&quot; id=&quot;link-id0x17e75190&quot;&gt;SPARQL protocol&lt;/a&gt; for this to be practical. We could do this. We&amp;#39;ll have to see when.&lt;/p&gt; &lt;p&gt;Politically, using &lt;a href=&quot;http://dbpedia.org/resource/XQuery&quot; id=&quot;link-id0x18a9bf10&quot;&gt;XQuery&lt;/a&gt; to give expressions and XML synthesis to SPARQL would be fitting. These things are needed anyhow, as surely as aggregation and sub-queries but the latter would not so readily come from XQuery. Some rapprochement between RDF and XML folks is desirable anyhow.&lt;/p&gt; &lt;h2&gt;Panel: Will the Sem Web Rise to the Challenge of the Social Web?&lt;/h2&gt; &lt;p&gt;The social web panel presented the question of whether the sem web was ready for prime time with data portability.&lt;/p&gt; &lt;p&gt;The main thrust was expressed in Harry Halpin&amp;#39;s rousing closing words: &amp;quot;Men will fight in a battle and lose a battle for a cause they believe in. Even if the battle is lost, the cause may come back and prevail, this time changed and under a different name. Thus, there may well come to be something like our &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id122f4da0&quot;&gt;semantic web&lt;/a&gt;, but it may not be the one we have worked all these years to build if we do not rise to the occasion before us right now.&amp;quot;&lt;/p&gt; &lt;p&gt;So, how to do this? Dan Brickley asked the audience how many supported, or were aware of, the latest Web 2.0 things, such as &lt;a href=&quot;http://dbpedia.org/page/OAuth&quot; id=&quot;link-idf300bc0&quot;&gt;OAuth&lt;/a&gt; and &lt;a href=&quot;http://dbpedia.org/page/OpenID&quot; id=&quot;link-id10ce7a40&quot;&gt;OpenID&lt;/a&gt;. A few were. 
The general idea was that research (after all, this was a research event) should be more integrated and open to the world at large, not living at the &amp;quot;outdated pace&amp;quot; of a 3 year funding cycle. Stefan Decker of DERI acquiesced in principle. Of course there is impedance mismatch between specialization and interfacing with everything.&lt;/p&gt; &lt;p&gt;I said that triples and vocabularies existed, that OpenLink had &lt;a href=&quot;http://dbpedia.org/resource/OpenLink_Data_Spaces&quot; id=&quot;link-id1210dbf8&quot;&gt;ODS&lt;/a&gt; (&lt;a href=&quot;http://dbpedia.org/resource/OpenLink_Data_Spaces&quot; id=&quot;link-id11076be8&quot;&gt;OpenLink Data Spaces&lt;/a&gt;, &lt;a href=&quot;http://community.linkeddata.org/&quot; id=&quot;link-id10d46710&quot;&gt;Community LinkedData&lt;/a&gt;) for managing one&amp;#39;s data-web presence, but that scale would be the next thing. Rather large scale even, with 100 gigatriples (Gtriples) reached before one even noticed. It takes a lot of PCs to host this, maybe $400K worth at today&amp;#39;s prices, without replication. Count 16G ram and a few cores per Gtriple so that one is not waiting for disk all the time.&lt;/p&gt; &lt;p&gt;The tricks that Web 2.0 silos do with app-specific data structures and app-specific partitioning do not really work for RDF without compromising the whole point of smooth schema evolution and tolerance of ragged data.&lt;/p&gt; &lt;p&gt;So, simple vocabularies, minimal inference, minimal blank nodes. Besides, note that the inference will have to be done at run time, not forward-chained at load time, if only because users will not agree on what sameAs and other declarations they want for their queries. Not to mention spam or malicious sameAs declarations!&lt;/p&gt; &lt;p&gt;As always, there was the question of business models for the open data web and for semantic technologies in general. 
As we see it, &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id108b7688&quot;&gt;information&lt;/a&gt; overload is the factor driving the demand. Better contextuality will justify semantic technologies. Due to the large volumes and complex processing, a data-as-service model will arise. The data may be open, but its query infrastructure, cleaning, and keeping up-to-date, can be monetized as services.&lt;/p&gt; &lt;h2&gt;Identity and Reference&lt;/h2&gt; &lt;p&gt;For the identity and reference workshop, the ultimate question is metaphysical and has no single universal answer, even though people, ever since the dawn of time and earlier, have occupied themselves with the issue. Consequently, I started with the Genesis quote where Adam called things by &lt;i&gt;nominibus suis&lt;/i&gt;, off-hand implying that things would have some intrinsic ontologically-due names. This would be among the older references to the question, at least in widely known sources.&lt;/p&gt; &lt;p&gt;For present purposes, the consensus seemed to be that what would be considered the same as something else depended entirely on the application. What was similar enough to warrant a sameAs for cooking purposes might not warrant a sameAs for chemistry. In fact, complete and exact sameness for URIs would be very rare. So, instead of making generic weak similarity assertions like similarTo or seeAlso, one would choose a set of strong sameAs assertions and have these in effect for query answering if they were appropriate to the granularity demanded by the application.&lt;/p&gt; &lt;p&gt;Therefore sameAs is our permanent companion, and there will in time be malicious and spam sameAs. So, nothing much should be materialized on the basis of sameAs assertions in an &lt;a href=&quot;http://dbpedia.org/resource/Open_world_assumption&quot; id=&quot;link-id10c4dfd0&quot;&gt;open world&lt;/a&gt;. 
For an app-specific warehouse, sameAs can be resolved at load time.&lt;/p&gt; &lt;p&gt;There was naturally some apparent tension between the Occam camp of &lt;a href=&quot;http://dbpedia.org/resource/Entity&quot; id=&quot;link-id105fd240&quot;&gt;entity&lt;/a&gt; name services and the LOD camp. I would say that the issue is more a perceived polarity than a real one. People will, inevitably, continue giving things names regardless of any centralized authority. Just look at natural language. But having a dictionary that is commonly accepted for established domains of discourse is immensely helpful.&lt;/p&gt; &lt;h2&gt;CYC and NLP&lt;/h2&gt; &lt;p&gt;The semantic search workshop was interesting, especially CYC&amp;#39;s presentation. CYC is, as it were, the grand old man of &lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id10568158&quot;&gt;knowledge&lt;/a&gt; representation. Over the long term, I would have support of the CYC inference language inside a database query processor. This would mostly be for repurposing the huge &lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id0x1acff9d0&quot;&gt;knowledge&lt;/a&gt; base for helping in search type queries. If it is for transactions or financial reporting, then queries will be &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id130a0a80&quot;&gt;SQL&lt;/a&gt; and make little or no use of any sort of inference. If it is for summarization or finding things, the opposite holds. For scaling, the issue is just making correct cardinality guesses for query planning, which is harder when inference is involved. 
We&amp;#39;ll see.&lt;/p&gt; &lt;p&gt;I will also have a closer look at natural language one of these days, quite inevitably, since &lt;a href=&quot;http://zitgist.com/about/&quot; id=&quot;link-id10795828&quot;&gt;Zitgist&lt;/a&gt; (for example) is into &lt;a href=&quot;http://dbpedia.org/resource/Entity&quot; id=&quot;link-id0x18a12918&quot;&gt;entity&lt;/a&gt; disambiguation.&lt;/p&gt; &lt;h2&gt;Scale&lt;/h2&gt; &lt;p&gt;Garlic gave a talk about their Data Patrol and QDOS. We agree that storing the data for these as triples instead of 1000 or so constantly changing relational tables could well make the difference between next-to-unmanageable and efficiently adaptive.&lt;/p&gt; &lt;p&gt;Garlic probably has the largest triple collection in constant online use to date. We will soon join them with our hosting of the whole LOD cloud and &lt;a href=&quot;http://sindice.org/&quot; id=&quot;link-id0x17f18a38&quot;&gt;Sindice&lt;/a&gt;/&lt;a href=&quot;http://zitgist.com/about/&quot; id=&quot;link-id0x184e9e90&quot;&gt;Zitgist&lt;/a&gt; as triples.&lt;/p&gt; &lt;h2&gt;Conclusions&lt;/h2&gt; &lt;p&gt;There is a mood to deliver applications. Consequently, scale remains a central, even the principal topic. So for now we make bigger centrally-managed databases. At the next turn around the corner we will have to turn to federation. The point here is that a planetary-scale, centrally-managed, online system can be made when the workload is uniform and anticipatable, but if it is free-form queries and complex analysis, we have a problem. So we move in the direction of federating and charging based on usage whenever the workload is more complex than making simple lookups now and then.&lt;/p&gt; &lt;p&gt;For the &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id1026ac28&quot;&gt;Virtuoso&lt;/a&gt; roadmap, this changes little. Next we make data sets available on Amazon EC2, as widely promised at ESWC. 
With big scale also comes rescaling and repartitioning, so this gets additional weight, as does further parallelizing of single user workloads. As it happens, the same medicine helps for both. At &lt;a href=&quot;http://www.linkeddataplanet.com/&quot; id=&quot;link-id0x17ff5c20&quot;&gt;Linked Data Planet&lt;/a&gt;, we will make more announcements.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso Cluster Paper</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-05-30#1369</atom:id>
  <atom:published>2008-05-30T10:02:04Z</atom:published>
  <atom:updated>2008-05-30T06:02:05-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Virtuoso Cluster Paper&lt;/div&gt; &lt;div&gt; &lt;div&gt;We have a new article on &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id10424890&quot;&gt;Virtuoso&lt;/a&gt; cluster, submitted to ISWC 2008.&lt;/div&gt; &lt;div&gt; Right now we are working on hosting the billion triples challenge &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id1077f800&quot;&gt;data&lt;/a&gt; set at Amazon EC2 using &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id102117f0&quot;&gt;Virtuoso&lt;/a&gt; Cluster. This will be the first publicly available instance of &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x20387e80&quot;&gt;Virtuoso&lt;/a&gt; Cluster and all interested may then instantiate their own copy on the EC2 infrastructure. &lt;/div&gt; &lt;br /&gt; &lt;div&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/2008iswc_webscale_rdf.pdf&quot; id=&quot;link-id10af2f30&quot;&gt;Towards Web Scale RDF&lt;/a&gt; &lt;br /&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/RDFAndMapped_BI.pdf&quot; id=&quot;link-idfedf9f0&quot;&gt;Integrating Open Sources and Relational Data with SPARQL&lt;/a&gt; &lt;br /&gt; &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/bisparql2.pdf&quot; id=&quot;link-id106e5418&quot;&gt;Business Intelligence Extensions for SPARQL&lt;/a&gt; &lt;br /&gt; &lt;/div&gt; &lt;br /&gt; &lt;div&gt; Look for a separate announcement in the near future. &lt;/div&gt; &lt;/div&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>DBpedia Benchmark Revisited</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-05-09#1359</atom:id>
  <atom:published>2008-05-09T19:33:42Z</atom:published>
  <atom:updated>2008-05-12T11:24:43-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;DBpedia Benchmark Revisited&lt;/div&gt; &lt;p&gt;We ran the &lt;a href=&quot;http://dbpedia.org/resource/DBpedia&quot; id=&quot;link-id0x1cd6d0c8&quot;&gt;DBpedia&lt;/a&gt; benchmark queries again with different configurations of &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1bf01048&quot;&gt;Virtuoso&lt;/a&gt;. I had not studied the details of the matter previously but now did have a closer look at the queries.&lt;/p&gt; &lt;p&gt;Comparing numbers given by different parties is a constant problem. In the case reported here, we loaded the full DBpedia 3, all languages, with about 198M triples, onto Virtuoso v5 and Virtuoso Cluster v6, all on the same 4 core 2GHz Xeon with 8G RAM. All databases were striped on 6 disks. The Cluster configuration was with 4 processes in the same box.&lt;/p&gt; &lt;p&gt;We ran the queries in two variants:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;With graph specified in the &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x1b9d3ca0&quot;&gt;SPARQL&lt;/a&gt; &lt;code&gt;FROM&lt;/code&gt; clause, using the default indices.&lt;/li&gt; &lt;li&gt;With no graph specified anywhere, using an alternate indexing scheme.&lt;/li&gt; &lt;/ul&gt; &lt;p&gt;The times below are for the sequence of 5 queries; individual query times are not reported. I did not do a line-by-line review of the execution plans since they seem to run well enough. We could get some extra mileage from cost model tweaks, especially for the numeric range conditions, but we will do this when somebody comes up with better times.&lt;/p&gt; &lt;p&gt;First, about Virtuoso v5: Because there is a query in the set that specifies no condition on S or O and only P, this simply cannot be done with the default indices. 
With Virtuoso Cluster v6 it sort-of can, because v6 is more space efficient.&lt;/p&gt; &lt;p&gt;So we added the index:&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt; create bitmap index &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1c364a58&quot;&gt;rdf&lt;/a&gt;_quad_pogs on rdf_quad (p, o, g, s); &lt;/code&gt; &lt;/blockquote&gt; &lt;table&gt; &lt;tr&gt; &lt;td&gt;&amp;#160;&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;&lt;b&gt;Virtuoso v5 with&lt;br /&gt; gspo, ogps, pogs&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;center&quot;&gt;&lt;b&gt;Virtuoso Cluster v6 with &lt;br /&gt;gspo, ogps&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;center&quot;&gt;&lt;b&gt;Virtuoso Cluster v6 with &lt;br /&gt;gspo, ogps, pogs&lt;/b&gt; &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt;&lt;b&gt;cold&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;center&quot;&gt;210 s&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;136 s&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;33.4 s&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt;&lt;b&gt;warm&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;center&quot;&gt;0.600 s&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;4.01 s&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;0.628 s&lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;p&gt;OK, so now let us do it without a graph being specified. 
For all platforms, we drop any existing indices, and --&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt; create table r2 (g iri_id_8, s iri_id_8, p iri_id_8, o any, primary key (s, p, o, g)); &lt;br /&gt; alter index R2 on R2 partition (s int (0hexffff00)); &lt;br /&gt; &lt;br /&gt; log_enable (2); &lt;br /&gt; insert into r2 (g, s, p, o) select g, s, p, o from rdf_quad; &lt;br /&gt; &lt;br /&gt; drop table rdf_quad; &lt;br /&gt; alter table r2 rename RDF_QUAD; &lt;br /&gt; create bitmap index rdf_quad_opgs on rdf_quad (o, p, g, s) partition (o varchar (-1, 0hexffff)); &lt;br /&gt; create bitmap index rdf_quad_pogs on rdf_quad (p, o, g, s) partition (o varchar (-1, 0hexffff)); &lt;br /&gt; create bitmap index rdf_quad_gpos on rdf_quad (g, p, o, s) partition (o varchar (-1, 0hexffff)); &lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;The code is identical for v5 and v6, except that with v5 we use &lt;code&gt;iri_id (32 bit)&lt;/code&gt; for the type, not &lt;code&gt;iri_id_8 (64 bit)&lt;/code&gt;. We note that we run out of IDs with v5 around a few billion triples, so with v6 we have double the ID length and still manage to be vastly more space efficient.&lt;/p&gt; &lt;p&gt;With the above 4 indices, we can query the &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1bae4cd8&quot;&gt;data&lt;/a&gt; pretty much in any combination without hitting a full scan of any index. We note that all indices that do not begin with s end with s as a bitmap. 
This takes about 60% of the space of a non-bitmap index for data such as DBpedia.&lt;/p&gt; &lt;p&gt;If you intend to do completely arbitrary RDF queries in Virtuoso, then chances are you are best off with the above index scheme.&lt;/p&gt; &lt;table&gt; &lt;tr&gt; &lt;td&gt;&amp;#160;&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;&lt;b&gt; Virtuoso v5 with&lt;br /&gt; gspo, ogps, pogs&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;center&quot;&gt;&lt;b&gt; Virtuoso Cluster v6 with &lt;br /&gt; spog, pogs, opgs, gpos &lt;/b&gt; &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt;&lt;b&gt;warm&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;center&quot;&gt;0.595 s&lt;/td&gt; &lt;td align=&quot;center&quot;&gt;0.617 s&lt;/td&gt; &lt;/tr&gt; &lt;/table&gt; &lt;p&gt;The cold times were about the same as above, so not reproduced.&lt;/p&gt; &lt;h3&gt;Graph or No Graph?&lt;/h3&gt; &lt;p&gt;It is in the SPARQL spirit to specify a graph and for pretty much any application, there are entirely sensible ways of keeping the data in graphs and specifying which ones are concerned by queries. This is why Virtuoso is set up for this by default.&lt;/p&gt; &lt;p&gt;On the other hand, for the open web scenario, dealing with an unknown large number of graphs, enumerating graphs is not possible and questions like which graph of which source asserts x become relevant. We have two distinct use cases which warrant different setups of the database, simple as that.&lt;/p&gt; &lt;p&gt;The latter use case is not really within the SPARQL spec, so implementations may or may not support this. For example &lt;a href=&quot;http://dbpedia.org/resource/Oracle_Database&quot; id=&quot;link-id0x1cd2db78&quot;&gt;Oracle&lt;/a&gt; or Vertica would not do this well since they partition data according to graph or predicate, respectively. 
On the other hand, stores that work with one quad table, which is most of the ones out there, should do it maybe with some configuring, as shown above.&lt;/p&gt; &lt;p&gt;Frameworks like Jena are not to my &lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id0x1b300390&quot;&gt;knowledge&lt;/a&gt; geared towards having a wildcard for graph, although I would suppose this can be arranged by adding some &amp;quot;super-graph&amp;quot; object, a graph of all graphs. I don&amp;#39;t think this is directly supported and besides most apps would not need it.&lt;/p&gt; &lt;p&gt;Once the indices are right, there is no difference between specifying a graph and not specifying a graph with the queries considered. With more complex queries, specifying a graph or set of graphs does allow some optimizations that cannot be done with no graph specified. For example, bitmap intersections are possible only when all leading key parts are given.&lt;/p&gt; &lt;h3&gt;Conclusions&lt;/h3&gt; &lt;p&gt;The best warm cache time is with v5; the five queries run under 600 ms after the first go. This is noted to show that all-in-memory with a single thread of execution is hard to beat.&lt;/p&gt; &lt;p&gt;Cluster v6 performs the same queries in 623 ms. What is gained in parallelism is lost in latency if all operations complete in microseconds. On the other hand, Cluster v6 leaves v5 in the dust in any situation that has less than 100% hit rate. This is due to actual benefit from parallelism if operations take longer than a few microseconds, such as in the case of disk reads. 
Cluster v6 has substantially better data layout on disk, as well as fewer pages to load for the same content.&lt;/p&gt; &lt;p&gt;This makes it possible to run the queries without the pogs index on Cluster v6 even when v5 takes prohibitively long.&lt;/p&gt; &lt;p&gt;The moral of the story is to have a lot of RAM and space-efficient data representation.&lt;/p&gt; &lt;p&gt;The DBpedia benchmark does not specify any random access pattern that would give a measure of sustained throughput under load, so we are left with the extremes of cold and warm cache of which neither is quite realistic.&lt;/p&gt; &lt;p&gt;Chris Bizer and I have talked on and off about benchmarks and I have made suggestions that we will see incorporated into the Berlin SPARQL benchmark, which will, I believe, be much more informative.&lt;/p&gt; &lt;h3&gt;Appendix: Query Text&lt;/h3&gt; &lt;p&gt;For reference, the query texts specifying the graph are below. To run without specifying the graph, just drop the &lt;code&gt;FROM &amp;lt;&lt;a href=&quot;http://dbpedia.org/resource/Hypertext_Transfer_Protocol&quot; id=&quot;link-id0x1c371db0&quot;&gt;http&lt;/a&gt;://dbpedia.org&amp;gt;&lt;/code&gt; from each query. The returned row counts are indicated below each query&amp;#39;s text.&lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;&lt;pre&gt; sparql SELECT ?p ?o FROM &amp;lt;http://dbpedia.org&amp;gt; WHERE { &amp;lt;http://dbpedia.org/resource/Metropolitan_Museum_of_Art&amp;gt; ?p ?o }; -- 1337 rows sparql PREFIX p: &amp;lt;http://dbpedia.org/property/&amp;gt; SELECT ?film1 ?actor1 ?film2 ?actor2 FROM &amp;lt;http://dbpedia.org&amp;gt; WHERE { ?film1 p:starring &amp;lt;http://dbpedia.org/resource/Kevin_Bacon&amp;gt; . ?film1 p:starring ?actor1 . ?film2 p:starring ?actor1 . ?film2 p:starring ?actor2 . }; -- 23910 rows sparql PREFIX p: &amp;lt;http://dbpedia.org/property/&amp;gt; SELECT ?artist ?artwork ?museum ?director FROM &amp;lt;http://dbpedia.org&amp;gt; WHERE { ?artwork p:artist ?artist . 
?artwork p:museum ?museum . ?museum p:director ?director }; -- 303 rows sparql PREFIX geo: &amp;lt;http://www.w3.org/2003/01/geo/wgs84_pos#&amp;gt; PREFIX foaf: &amp;lt;http://xmlns.com/foaf/0.1/&amp;gt; PREFIX xsd: &amp;lt;http://www.w3.org/2001/XMLSchema#&amp;gt; SELECT ?s ?homepage FROM &amp;lt;http://dbpedia.org&amp;gt; WHERE { &amp;lt;http://dbpedia.org/resource/Berlin&amp;gt; geo:lat ?berlinLat . &amp;lt;http://dbpedia.org/resource/Berlin&amp;gt; geo:long ?berlinLong . ?s geo:lat ?lat . ?s geo:long ?long . ?s foaf:homepage ?homepage . FILTER ( ?lat &amp;lt;= ?berlinLat + 0.03190235436 &amp;amp;&amp;amp; ?long &amp;gt;= ?berlinLong - 0.08679199218 &amp;amp;&amp;amp; ?lat &amp;gt;= ?berlinLat - 0.03190235436 &amp;amp;&amp;amp; ?long &amp;lt;= ?berlinLong + 0.08679199218) }; -- 56 rows sparql PREFIX geo: &amp;lt;http://www.w3.org/2003/01/geo/wgs84_pos#&amp;gt; PREFIX foaf: &amp;lt;http://xmlns.com/foaf/0.1/&amp;gt; PREFIX xsd: &amp;lt;http://www.w3.org/2001/XMLSchema#&amp;gt; PREFIX p: &amp;lt;http://dbpedia.org/property/&amp;gt; SELECT ?s ?a ?homepage FROM &amp;lt;http://dbpedia.org&amp;gt; WHERE { &amp;lt;http://dbpedia.org/resource/New_York_City&amp;gt; geo:lat ?nyLat . &amp;lt;http://dbpedia.org/resource/New_York_City&amp;gt; geo:long ?nyLong . ?s geo:lat ?lat . ?s geo:long ?long . ?s p:architect ?a . ?a foaf:homepage ?homepage . FILTER ( ?lat &amp;lt;= ?nyLat + 0.3190235436 &amp;amp;&amp;amp; ?long &amp;gt;= ?nyLong - 0.8679199218 &amp;amp;&amp;amp; ?lat &amp;gt;= ?nyLat - 0.3190235436 &amp;amp;&amp;amp; ?long &amp;lt;= ?nyLong + 0.8679199218) }; -- 13 rows &lt;/pre&gt; &lt;/code&gt; &lt;/blockquote&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>SPARQL at WWW 2008</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-04-30#1354</atom:id>
  <atom:published>2008-04-30T16:28:10Z</atom:published>
  <atom:updated>2008-08-28T11:26:06.000004-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;SPARQL at WWW 2008&lt;/div&gt; &lt;p&gt;Andy Seaborne and Eric Prud&amp;#39;hommeaux, editors of the &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x183b13a8&quot;&gt;SPARQL&lt;/a&gt; recommendation, convened a SPARQL birds of a feather session at &lt;a href=&quot;http://www2008.org/&quot; id=&quot;link-id0xd31c2d0&quot;&gt;WWW 2008&lt;/a&gt;. The administrative outcome was that implementors could now experiment with extensions, hopefully keeping each other current about their efforts and that towards the end of 2008, a new W3C working group might begin formalizing the experiences into a new SPARQL spec.&lt;/p&gt; &lt;p&gt;The session drew a good crowd, including many users and developers. The wishes were largely as expected, with a few new ones added. Many of the wishes already had diverse implementations, however most often without interop. I will below give some comments on the main issues discussed.&lt;/p&gt; &lt;/div&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;SPARQL Update&lt;/b&gt; - This is likely the most universally agreed upon extension. Implementations exist, largely along the lines of Andy Seaborne&amp;#39;s SPARUL spec, which is also likely material for a W3C member submission. The issue is without much controversy; transactions fall outside the scope, which is reasonable enough. With triple stores, we can define things as combinations of inserts and deletes, and isolation we just leave aside. If anything, operating on a transactional platform such as &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1d442cd8&quot;&gt;Virtuoso&lt;/a&gt;, one wishes to disable transactions for any operations such as bulk loads and long-running inserts and deletes. Transactionality has pretty much no overhead for a few hundred rows, but for a few hundred million rows the cost of locking and rollback is prohibitive. 
With Virtuoso, we have a row auto-commit mode which we recommend for use with &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0xec62c58&quot;&gt;RDF&lt;/a&gt;: It commits by itself now and then, optionally keeping a roll forward log, and is transactional enough not to leave half triples around, i.e., inserted in one index but not another.&lt;/p&gt; &lt;p&gt;As far as we are concerned, updating physical triples along the SPARUL lines is pretty much a done deal.&lt;/p&gt; &lt;p&gt;The matter of updating relational &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x19f995c8&quot;&gt;data&lt;/a&gt; mapped to RDF is a whole other kettle of fish. On this, I should say that RDF has no special virtues for expressing transactions but rather has a special genius for integration. Updating is best left to web service interfaces that use &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x125b2b98&quot;&gt;SQL&lt;/a&gt; on the inside. Anyway, updating union views, which most mappings will be, is complicated. Besides, for transactions, one usually knows exactly what one wishes to update.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Full Text&lt;/b&gt; - Many people expressed a desire for full text access. Here we run into a deplorable confusion with regexps. The closest SPARQL has to full text in its native form is regexps, but these are not really mappable to full text except in rare special cases and I would despair of explaining to an end user what exactly these cases are. So, in principle, some regexps are equivalent to full text but in practice I find it much preferable to keep these entirely separate.&lt;/p&gt; &lt;p&gt;It was noted that what the users want is a text box for search words. This is a front end to the CONTAINS predicate of most SQL implementations. Ours is MS SQL Server compatible and has a SPARQL version called &lt;code&gt;bif:contains&lt;/code&gt;. 
One must still declare which triples one wants indexed for full text, though. This admin overhead seems inevitable, as text indexing is a large overhead and not needed by all applications.&lt;/p&gt; &lt;p&gt;Also, text hits are not boolean; usually they come with a hit score. Thus, a SPARQL extension for this could look like &lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;select * where { ?thing has_description ?d . ?d ftcontains &amp;quot;gizmo&amp;quot; ftand &amp;quot;widget&amp;quot; score ?score . }&lt;/code&gt; &lt;/blockquote&gt; &lt;p&gt;This would return all the subjects, descriptions, and scores, from subjects with a has_description property containing widget and gizmo. Extending the basic pattern is better than having the match in a filter, since the match binds a variable.&lt;/p&gt; &lt;p&gt;The &lt;a href=&quot;http://dbpedia.org/resource/XQuery&quot; id=&quot;link-id0xfec6788&quot;&gt;XQuery&lt;/a&gt;/&lt;a href=&quot;http://dbpedia.org/resource/XPath&quot; id=&quot;link-id0x1a789e38&quot;&gt;XPath&lt;/a&gt; groups have recently come up with a full-text spec, so I used their style of syntax above. We already have a full-text extension, as do some others, but for standardization, it is probably most appropriate to take the XQuery work as a basis. The XQuery full-text spec is quite complex, but I would expect most uses to get by with a small subset, and the structure seems better thought out, at first glance, than the more ad-hoc implementations in diverse SQLs.&lt;/p&gt; &lt;p&gt;Again, declaring any text index to support the search, as well as its timeliness or transactionality, are best left to implementations.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Federation&lt;/b&gt; - This is a tricky matter. &lt;a href=&quot;http://jena.sourceforge.net/ARQ/&quot; id=&quot;link-id0xba487f0&quot;&gt;ARQ&lt;/a&gt; has a SPARQL extension for sending a nested set of triple patterns to a specific end-point. 
The &lt;a href=&quot;http://darq.sourceforge.net/&quot; id=&quot;link-id0xf8a5ab0&quot;&gt;DARQ&lt;/a&gt; project has something more, including a selectivity model for SPARQL.&lt;/p&gt; &lt;p&gt;With federated SQL, life is simpler since after the views are expanded, we have a query where each table is at a known server and has more or less known statistics. Generally, execution plans where as much work as possible is pushed to the remote servers are preferred, and modeling the latencies is not overly hard. With SPARQL, each triple pattern could in principle come from any of the federated servers. Associating a specific end-point to a fragment of the query just passes the problem to the user. It is my guess that this is the best we can do without getting very elaborate, and possibly buggy, end-point content descriptions for routing federated queries.&lt;/p&gt; &lt;p&gt;Having said this, there remains the problem of join order. I suggested that we enhance the protocol by allowing asking an end-point for the query cost for a given SPARQL query. Since they all must have a cost model for optimization, this should not be an impossible request. A time cost and estimated cardinality would be enough. Making statistics available &lt;i&gt;à la&lt;/i&gt; DARQ was also discussed. Being able to declare cardinalities expected of a remote end-point is probably necessary anyway, since not all will implement the cost model interface. For standardization, agreeing on what is a proper description of content and cardinality and how fine grained this must be will be so difficult that I would not wait for it. A cost model interface would nicely hide this within the end-point itself.&lt;/p&gt; &lt;p&gt;With Virtuoso, we do not have a federated SPARQL scheme but we could have the ARQ-like service construct. We&amp;#39;d use our own cost model with explicit declarations of cardinalities of the remote data for guessing a join order. Still, this is a bit of work. 
We&amp;#39;ll see.&lt;/p&gt; &lt;p&gt;For practicality, the service construct coupled with join order hints is the best short term bet. Making this pretty enough for standardization is not self-evident, as it requires end-point description and/or cost model hooks for things to stay declarative.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;End-point description&lt;/b&gt; - This question has been around for a while; I have &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1085&quot; id=&quot;link-id101d3440&quot;&gt;blogged about it earlier&lt;/a&gt;, but we are not really at a point where there would be even rough consensus about an end-point ontology. We should probably do something on our own to demonstrate some application of this, as we host lots of &lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id0x162d0de8&quot;&gt;linked open data&lt;/a&gt; sets.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;SQL equivalence&lt;/b&gt; - There were many requests for aggregation, some for subqueries and nesting, expressions in select, negation, existence and so on. I would call these all SQL equivalence. One use case was taking all the teams in the database and for all with over 5 members, add the big_team class and a property for member count.&lt;/p&gt; &lt;p&gt;With Virtuoso, we could write this as -- &lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;construct { ?team a big_team . ?team member_count ?ct } from ... where {?team a team . { select ?team2 count (*) as ?ct where { ?m member_of ?team2 } . filter (?team = ?team2 and ?ct &amp;gt; 5) }}&lt;/code&gt; &lt;/blockquote&gt; 
&lt;p&gt;We have pretty much all the SQL equivalence features, as we have been working for some time at translating the &lt;a href=&quot;http://dbpedia.org/resource/TPC-H&quot; id=&quot;link-id0x11c870c8&quot;&gt;TPC-H&lt;/a&gt; workload into SPARQL.&lt;/p&gt; &lt;p&gt;The usefulness of these things is uncontested but standardization could be hard as there are subtle questions about variable scope and the like.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Inference&lt;/b&gt; - The SPARQL spec does not deal with transitivity or such matters because it is assumed that these are handled by an underlying inference layer. This is however most often not so. There was interest in more fine grained control of inference, for example declaring that just one property in a query would be transitive or that subclasses should be taken into account in only one triple pattern. As far as I am concerned, this is very reasonable, and we even offer extensions for this sort of thing in Virtuoso&amp;#39;s SPARQL. This however only makes sense if the inference is done at query time and pattern by pattern. For instance, if forward chaining is used, this no longer makes sense. Specifying that some forward chaining ought to be done at query time is impractical, as the operation can be very large and time consuming and it is the DBA&amp;#39;s task to determine what should be stored and for how long, how changes should be propagated, and so on. All these are application dependent and standardizing will be difficult.&lt;/p&gt; &lt;p&gt;Support for RDF features like lists and bags would all fall into the functions an underlying inference layer should perform. 
These things are of special interest when querying &lt;a href=&quot;http://dbpedia.org/resource/Web_Ontology_Language&quot; id=&quot;link-id0x156f3830&quot;&gt;OWL&lt;/a&gt; models, for example.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt; &lt;b&gt;Path expressions&lt;/b&gt; - Path expressions were requested by a few people. We have implemented some, as in &lt;/p&gt; &lt;blockquote&gt; &lt;code&gt;?product+?has_supplier+&amp;gt;s_name = &amp;quot;Gizmos, Inc.&amp;quot;.&lt;/code&gt; &lt;/blockquote&gt; This means that one supplier of the product has the name &amp;quot;Gizmos, Inc.&amp;quot;. This is a nice shorthand but we run into problems if we start supporting repetitive steps, optional steps, and the like.&lt;/li&gt; &lt;p&gt;In conclusion, update, full text, and basic counting and grouping would seem straightforward at this point. Nesting queries, value subqueries, views, and the like should not be too hard if an agreement is reached on scope rules. Inference and federation will probably need more experimentation but a lot can be had already with very simple fine grained control of backward chaining, if such applies, or with explicit end-point references and explicit join order. These are practical but not pretty enough for committee consensus, would be my guess. Anyway, it will be a few months before anything formal will happen.&lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Linked Data and Information Architecture</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-04-29#1350</atom:id>
  <atom:published>2008-04-29T14:37:22Z</atom:published>
  <atom:updated>2008-04-29T17:18:21.000048-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Linked Data and Information Architecture&lt;/div&gt; &lt;p&gt;We had a workshop on &lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id0x1437ac70&quot;&gt;Linked Open Data&lt;/a&gt; (&lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id0x1315f788&quot;&gt;LOD&lt;/a&gt;) last week in &lt;a href=&quot;http://www2008.org/&quot; id=&quot;link-id0x13737468&quot;&gt;Beijing&lt;/a&gt;. You can see the papers in &lt;a href=&quot;http://events.linkeddata.org/ldow2008/#program&quot; id=&quot;link-id10651ab8&quot;&gt;the program&lt;/a&gt;. The event was a success with plenty of good talks and animated conversation. I will not go into every paper here but will comment a little on the conversation and draw some technology requirements going forward.&lt;/p&gt; &lt;p&gt;Tim Berners-Lee showed a read-write version of &lt;a href=&quot;http://dig.csail.mit.edu/2005/ajar/release/tabulator/0.8/tab.html&quot; id=&quot;link-id0x15633520&quot;&gt;Tabulator&lt;/a&gt;. This raises the question of updating on the &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1350a178&quot;&gt;Data&lt;/a&gt; Web. The consensus was that one could assert what one wanted in one&amp;#39;s own space but that others&amp;#39; spaces would be read-only. What spaces one considered relevant would be the user&amp;#39;s or developer&amp;#39;s business, as in the document web.&lt;/p&gt; &lt;p&gt;It seems to me that a significant use case of LOD is an open-web situation where the user picks a broad read-only &amp;quot;data wallpaper&amp;quot; or backdrop of assertions, and then uses this combined with a much smaller, local, writable data set. This is certainly the case when editing data for publishing, as in Tim&amp;#39;s demo. 
This will also be the case when developing mesh-ups combining multiple distinct data sets bound together by sets of SameAs assertions, for example. Questions like, &amp;quot;What is the minimum subset of n data sets needed for deriving the result?&amp;quot; will be common. This will also be the case in applications using proprietary data combined with open data.&lt;/p&gt; &lt;p&gt;This means that databases will have to deal with queries that specify large lists of included graphs, all graphs in the store or all graphs with an exclusion list. All this is quite possible but again should be considered when architecting systems for an open &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0xa27bae8&quot;&gt;linked data&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Giant_Global_Graph&quot; id=&quot;link-id0x155c3f18&quot;&gt;web&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;&amp;quot;There is data but what can we really do with it? How far can we trust it, and what can we confidently decide based on it?&amp;quot;&lt;/p&gt; &lt;p&gt;As an answer to this question, &lt;a href=&quot;http://zitgist.com/about/&quot; id=&quot;link-id0xd447580&quot;&gt;Zitgist&lt;/a&gt; has compiled the &lt;a href=&quot;http://umbel.org/about/&quot; id=&quot;link-id0x14735008&quot;&gt;UMBEL&lt;/a&gt; taxonomy using &lt;a href=&quot;http://dbpedia.org/resource/SKOS&quot; id=&quot;link-id0x15ab1c48&quot;&gt;SKOS&lt;/a&gt;. This draws on Wikipedia, Open CYC, Wordnet, and &lt;a href=&quot;http://www.mpi-inf.mpg.de/~suchanek/downloads/yago/&quot; id=&quot;link-id0x15d5aa88&quot;&gt;YAGO&lt;/a&gt;, hence the acronym WOWY. UMBEL is both a taxonomy and a set of instance data, containing a large set of &lt;a href=&quot;http://dbpedia.org/resource/Named_entity_recognition&quot; id=&quot;link-id0x9fe45d98&quot;&gt;named entities&lt;/a&gt;, including persons, organizations, geopolitical entities, and so forth. 
By extracting references to this set of named entities from documents and correlating this to the taxonomy, one gets a good idea of what a document (or part thereof) is about.&lt;/p&gt; &lt;p&gt;Kingsley presented this in the Zitgist demo. This is our answer to the criticism about &lt;a href=&quot;http://dbpedia.org/resource/DBpedia&quot; id=&quot;link-id0xa1920800&quot;&gt;DBpedia&lt;/a&gt; having errors in classification. DBpedia, as a bootstrap stage, is about giving names to all things. Subsequent efforts like UMBEL are about refining the relationships.&lt;/p&gt; &lt;p&gt;&amp;quot;Should there be a global &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Identifier&quot; id=&quot;link-id0x12cd5290&quot;&gt;URI&lt;/a&gt; dictionary?&amp;quot;&lt;/p&gt; &lt;p&gt;There was a talk by Paolo Bouquet about &lt;a href=&quot;http://dbpedia.org/resource/Entity&quot; id=&quot;link-id0x12d03400&quot;&gt;Entity&lt;/a&gt; Name System, a sort of data DNS, with the purpose of associating some description and rough classification to URIs. This would allow discovering URIs for reuse. I&amp;#39;d say that this is good if it can cut down on the SameAs proliferation and if this can be widely distributed and replicated for resilience, &lt;i&gt;à la&lt;/i&gt; DNS. On the other hand, it was pointed out that this was not quite in the LOD spirit, where parties would mint their own dereferenceable URIs, in their own domains. We&amp;#39;ll see.&lt;/p&gt; &lt;p&gt;&amp;quot;What to do when identity expires?&amp;quot;&lt;/p&gt; &lt;p&gt;Giovanni of Sindice said that a document should be removed from search if it was no longer available. Kingsley pointed out that resilience of reference requires some way to recover data. The data web cannot be less resilient than the document web, and there is a point to having access to history. 
He recommended hooking up with the &lt;a href=&quot;http://dbpedia.org/resource/Internet&quot; id=&quot;link-id0x143e4130&quot;&gt;Internet&lt;/a&gt; Archive, since they make long term persistence their business. In this way, if an application depends on data, and the URIs on which it depends are no longer dereferenceable or provide content from a new owner of the domain, those who need the old version can still get it and host it themselves.&lt;/p&gt; &lt;p&gt;It is increasingly clear that OWL SameAs is both the blessing and bane of linked data. We can easily have tens of URIs for the same thing, especially with people. Still, these should be considered the same.&lt;/p&gt; &lt;p&gt;Returning every synonym in a query answer hardly makes sense but accepting them as input seems almost necessary. This is what we do with &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x15a2a930&quot;&gt;Virtuoso&lt;/a&gt;&amp;#39;s SameAs support. Even so, this can easily double query times even when there are no synonyms.&lt;/p&gt; &lt;p&gt;Be that as it may, SameAs is here to stay; just consider the mapping of DBpedia to Geonames, for example.&lt;/p&gt; &lt;p&gt;Also, making aberrant SameAs statements can completely poison a data set and lead to absurd query results. Hence choosing which SameAs assertions from which source will be considered seems necessary. In an open web scenario, this leads inevitably to multi-graph queries that can be complex to write with regular &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x12bb8ce8&quot;&gt;SPARQL&lt;/a&gt;. By extension, it seems that a good query would also include the graphs actually used for deriving each result row. This is of course possible but has some implications on how databases should be organized.&lt;/p&gt; &lt;p&gt;Yves Raimond gave a talk about deriving identity between Musicbrainz and Jamendo. I see the issue as a core question of linked data in general. 
The algorithm Yves presented started with attribute value similarities and then followed related entities. Artists would be the same if they had similar names and similar names of albums with similar song titles, for example. We can find the same basic question in any analysis, for example, looking at how news reporting differs between media, supposing there is adequate entity extraction.&lt;/p&gt; &lt;p&gt;There is basic graph diffing in &lt;a href=&quot;http://data.semanticweb.org/conference/iswc-aswc/2007/tracks/research/papers/533/html&quot; id=&quot;link-id0x153c1fa8&quot;&gt;RDFSync&lt;/a&gt;, for example. But here we are expanding the context significantly. We will traverse references to some depth, allow similarity matches, SameAs, and so forth. Having presumed identity of two URIs, we can then look at the difference in their environment to produce a human readable summary. This could then be evaluated for purposes of analysis or of combining content.&lt;/p&gt; &lt;p&gt;At first sight, these algorithms seem well parallelizable, as long as all threads have access to all data. For scaling, this means a probably message-bound distributed algorithm. This is something to look into for the next stage of linked data.&lt;/p&gt; &lt;p&gt;Some inference is needed, but if everybody has their own choice of data sets to query, then everybody would also have their own entailed triples. This will make for an explosion of entailed graphs if forward chaining is used. Forward chaining is very nice because it keeps queries simple and easy to optimize. With Virtuoso, we still favor backward chaining since we expect a great diversity of graph combinations and near infinite volume in the open web scenario. With private repositories of slowly changing data put together for a special application, the situation is different.&lt;/p&gt; &lt;p&gt;In conclusion, we have a real LOD movement with actual momentum and a good idea of what to do next. 
The next step is promoting this to the broader community, starting with &lt;a href=&quot;http://www.linkeddataplanet.com/&quot; id=&quot;link-id0x155d1d00&quot;&gt;Linked Data Planet&lt;/a&gt; in New York in June.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>On Sem Web Search</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-04-29#1349</atom:id>
  <atom:published>2008-04-29T14:37:21Z</atom:published>
  <atom:updated>2008-10-02T11:37:12.000008-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;On Sem Web Search&lt;/div&gt; &lt;p&gt; &lt;i&gt;&amp;quot;I give the search keywords and you give me a &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x1a603f18&quot;&gt;SPARQL&lt;/a&gt; end-point and a query that will get the &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1bda5c00&quot;&gt;data&lt;/a&gt;.&amp;quot;&lt;/i&gt; &lt;/p&gt; &lt;p&gt;Thus did one SPARQL user describe the task of a semantic/data web search engine.&lt;/p&gt; &lt;p&gt;In &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1336&quot; id=&quot;link-idff98750&quot;&gt;a previous post&lt;/a&gt;, I suggested that if the data web were the size of the document web, we&amp;#39;d be looking at two orders of magnitude more search complexity. It just might be so.&lt;/p&gt; &lt;p&gt;In the conversation, I pointed out that a search engine might have a copy of everything and even a capability to do SPARQL and full text on it all, yet still the users would be better off doing the queries against the SPARQL end-points of the data publishers. It is a bit like the fact that not all web browsing runs off Google&amp;#39;s cache. With the data web, the point is even more pronounced, as serving a hit from Google&amp;#39;s cache is a small operation but a complex query might be a very large one.&lt;/p&gt; &lt;p&gt;Yet, the data web is about ad-hoc joining between data sets of different origins. Thus a search engine of the data web ought to be capable of joining also, even if large queries ought to be run against individual publishers&amp;#39; end-points or the user&amp;#39;s own data warehouse.&lt;/p&gt; &lt;p&gt;For ranking, the general consensus was that no single hit-ranking would be good for the data web. Word-frequency-based hit scores are OK for text hits, but what to use beyond that is not obvious. 
I would think that some link analysis could apply but this will take some more experimentation.&lt;/p&gt; &lt;p&gt;For search summaries, if we have splitting of data sets into small fragments &lt;i&gt;à la&lt;/i&gt; &lt;a href=&quot;http://sindice.com/&quot; id=&quot;link-id0x1d2b7288&quot;&gt;Sindice&lt;/a&gt;, search summaries are pretty much the same as with just text search. If we store triples, then we can give text style summaries of text hits in literals and Fresnel lens views of the structured data around the literal. For showing a page of hits, the lenses must abbreviate heavily but this is still feasible. The engine would know about the most common ontologies and summarize instance data accordingly.&lt;/p&gt; &lt;p&gt;Chris Bizer pointed out that trust and provenance are critical, especially if an answer is arrived at by joining multiple data sets. The trust of the conclusion is no greater than that of the weakest participating document. Different users will have different trusted sources.&lt;/p&gt; &lt;p&gt;A mature data web search engine would combine a provenance/trust specification, a search condition consisting of SPARQL or full text or both, and a specification for hit rank. Again, most searches would use defaults, but these three components should in principle be orthogonally specifiable.&lt;/p&gt; &lt;p&gt;Many places may host the same data set either for download or SPARQL access. The &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Identifier&quot; id=&quot;link-id0x1b2317d0&quot;&gt;URI&lt;/a&gt; of the data set is not its &lt;a href=&quot;http://dbpedia.org/resource/Uniform_Resource_Locator&quot; id=&quot;link-id0x1c55dd68&quot;&gt;URL&lt;/a&gt;. Different places may further host multiple data sets on one end-point. Thus the search engine ought to return all end-points where the set is to be found. The end-points themselves ought to be able to say what data sets they contain, under what graph IRIs. 
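&lt;/p&gt; &lt;p&gt;As a minimal sketch of what a client can already do without any self-description vocabulary, plain SPARQL can enumerate the named graphs that an end-point exposes. This yields only graph IRIs, not which data set each graph holds, and the scan may be expensive on a large store:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;# List the IRI of every named graph that holds at least one triple.
SELECT DISTINCT ?g
WHERE { GRAPH ?g { ?s ?p ?o } }&lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;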
Since there is no consensus about end-point self description, this too would be left to the search engine. In practice, this could be accomplished by extending Sindice&amp;#39;s semantic site map specification. A possible query would be to find an end-point containing a set of named data sets. If none were found, the search engine itself could run a query joining all the sets since it at least would hold them all.&lt;/p&gt; &lt;p&gt;Since many places will host sets like Wordnet or Uniprot, indexing these once for each copy hardly makes sense. Thus a site should identify its data by the data set&amp;#39;s URI and not the copy&amp;#39;s URL.&lt;/p&gt; &lt;p&gt;It came up in the discussion that search engines should share a ping format so that a single message format would be enough to notify any engine about data being updated. This is already partly the case with Sindice and &lt;a href=&quot;http://www.pingthesemanticweb.com/&quot; id=&quot;link-id0xa405ebd0&quot;&gt;PTSW&lt;/a&gt; (&lt;a href=&quot;http://www.pingthesemanticweb.com/&quot; id=&quot;link-id0x1c051a00&quot;&gt;PingTheSemanticWeb&lt;/a&gt;) sharing a ping format. &lt;/p&gt; &lt;p&gt;Further, since it is no trouble to publish a copy of the 45G Uniprot file but a fair amount of work to index it, search engines should be smart about processing requests to index things, since these can amount to a denial of service attack. &lt;/p&gt; &lt;p&gt;Probably very large data sets should be indexed only in the form supplied by their publisher, and others hosting copies would just state that they hold a copy. If the claim to the copy proved false, users could complain and the search engine administrator would remove the listing. It seems that some manual curating cannot be avoided here. &lt;/p&gt; &lt;h2&gt;On Data Web Search Business Model&lt;/h2&gt; &lt;p&gt;It seems there can be an overlap between the data web search and the data web hosting businesses. 
For example, Talis rents space for hosting &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1a60c7e0&quot;&gt;RDF&lt;/a&gt; data with SPARQL access. A search engine should offer basic indexing of everything for free, but could charge either data publishers or end users for running SPARQL queries across data sets. These do not have the nicely anticipatable and fairly uniform resource consumption of text lookups. In this manner, a search provider could cost-justify the capacity for allowing arbitrary queries. &lt;/p&gt; &lt;p&gt;The value of the data web consists of unexpected joining. Such joining takes place most efficiently if the sources are at least in some proximity, for example in the same data center. Thus the search provider could monetize functioning as the database provider for mesh-ups. In the document web, publishing pages is very simple and there is no great benefit from co-locating search and pages, rather the opposite. For the data web, the hosting with SPARQL and all is more complex and resembles providing search. Thus providing search can combine with providing SPARQL hosting, once we accept in principle that search should have arbitrary inter-document joining, even if it is at an extra premium.&lt;/p&gt; &lt;p&gt;The present search business model is advertising. If the data web is to be accessed by automated agents such as mesh-up code, display of ads is not self-evident. This is quite separate from the fact that semantics can lead to better ad targeting.&lt;/p&gt; &lt;p&gt;One model would be to do text lookups for free from a regular web page but show ads, just a la Google search ads. 
Using the service via web services for text or SPARQL would have a cost paid by the searching or publishing party and would not be financed by advertising.&lt;/p&gt; &lt;p&gt;In the case of data used in value-add data products (mesh-ups) that have financial value to their users, the original publisher of the data could even be paid for keeping the data up-to-date. This would hold for any time-sensitive feeds like news or financial feeds. Thus the hosting/search provider would be a broker of data-use fees and the data producer would be in the position of an AdSense inventory owner, i.e., a web site which shows AdSense ads. Organizing this under a hub providing back-office functions similar to an ad network could make sense even if the actual processing were divided among many sites.&lt;/p&gt; &lt;p&gt;Kingsley has repeatedly formulated the core value proposition of the &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id0x3728a2f8&quot;&gt;semantic web&lt;/a&gt; in terms of dealing with &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x1bbcbeb8&quot;&gt;information&lt;/a&gt; overload: There is the real-time enterprise and the real-time individual and both are beasts of perception. Their image is won and lost in the &lt;a href=&quot;http://dbpedia.org/resource/Internet&quot; id=&quot;link-id0x1843b020&quot;&gt;Internet&lt;/a&gt; online conversation space. We know that allegations, even if later proven false, will stick if left unchallenged. The function of semantics on the web is to allow one to track and manage where one stands. In fact, Garlik has made a business of just this, but now from a privacy and security angle. 
The &lt;a href=&quot;http://www.garlik.com/&quot; id=&quot;link-id0x1aa76ab0&quot;&gt;Garlik DataPatrol&lt;/a&gt; harvests data from diverse sources and allows assessing vulnerability to identity theft, for example.&lt;/p&gt; &lt;p&gt;If one is in the business of collating all the structured data in the world, as a data web search engine is, then providing custom alerts for either security or public image management is quite natural. This can be a very valuable service if it works well.&lt;/p&gt; &lt;p&gt;At OpenLink, we will now experiment with the Sindice/&lt;a href=&quot;http://zitgist.com/about/&quot; id=&quot;link-id0x18800228&quot;&gt;Zitgist&lt;/a&gt;/PingTheSemanticWeb content. This is a regular part of the productization of &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1adf39c8&quot;&gt;Virtuoso&lt;/a&gt;&amp;#39;s cluster edition. We expect to release some results in the next 4 weeks.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>WWW 2008</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-04-29#1348</atom:id>
  <atom:published>2008-04-29T14:37:20Z</atom:published>
  <atom:updated>2008-04-29T13:35:23-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;WWW 2008&lt;/div&gt; &lt;p&gt;Following my return from WWW 2008 in &lt;a href=&quot;http://www2008.org/&quot; id=&quot;link-id0x9ff7d5d0&quot;&gt;Beijing&lt;/a&gt;, I will write a series of &lt;a href=&quot;http://dbpedia.org/resource/Blog&quot; id=&quot;link-id0x9e4a7650&quot;&gt;blog&lt;/a&gt; posts discussing diverse topics that were brought up in presentations and conversations during the week.&lt;/p&gt; &lt;p&gt;&lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x9e7ae398&quot;&gt;Linked data&lt;/a&gt; was our main interest in the conference and there was a one day workshop on this, unfortunately overlapping with a day of W3C Advisory Committee meetings. Hence Tim Berners-Lee, one of the chairs of the workshop, could not attend for most of the day. Still, he was present to say that &amp;quot;&lt;a href=&quot;http://community.linkeddata.org/dataspace/organization/lod#this&quot; id=&quot;link-id0xa287d38&quot;&gt;Linked open data&lt;/a&gt; is the &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id0x15372940&quot;&gt;semantic web&lt;/a&gt; and the web done as it ought to be done.&amp;quot;&lt;/p&gt; &lt;p&gt;For my part, I will draw some architecture conclusions from the different talks and extrapolate about the requirements on database platforms for linked data.&lt;/p&gt; &lt;p&gt;Chris Bizer predicted that 2008 would be the year of &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0xa1454c58&quot;&gt;data&lt;/a&gt; web search, if 2007 was the year of &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0xa0f73c50&quot;&gt;SPARQL&lt;/a&gt;. This may be the case, as linked data is now pretty much a reality and the questions of discovery become prevalent. 
There was a birds-of-a-feather session on this and I will make some comments on what we intend to explore in bridging between the text index based semantic web search engines and SPARQL.&lt;/p&gt; &lt;p&gt;Andy Seaborne convened a birds-of-a-feather session on the future of SPARQL. Many of the already anticipated and implemented requirements were confirmed and a few were introduced. A separate blog post will discuss these further.&lt;/p&gt; &lt;p&gt;From the various discussions held throughout the conference, we conclude that plug-and-play operation with the major semantic web frameworks of Jena, Sesame, and Redland, is our major immediate-term deliverable. Our efforts in this direction thus far are insufficient and we will next have these done with the right supervision and proper interop testing. The issues are fortunately simple but doing things totally right requires some small server side support and some &lt;a href=&quot;http://dbpedia.org/resource/Java_Database_Connectivity&quot; id=&quot;link-id0xa5d4d5b8&quot;&gt;JDBC&lt;/a&gt;/&lt;a href=&quot;http://dbpedia.org/resource/Open_Database_Connectivity&quot; id=&quot;link-id0x9dc28d10&quot;&gt;ODBC&lt;/a&gt; tweaks, so we advise those interested to wait for an update to be published on this blog.&lt;/p&gt; &lt;p&gt;I further had a conversation with Andy Seaborne about using Jena reasoning capabilities with &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xa2754050&quot;&gt;Virtuoso&lt;/a&gt; and generally the issues of &amp;quot;impedance mismatch&amp;quot; between reasoning and typical database workloads. More on this later. &lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>TPC H as Linked Data (Updated 2)</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-03-06#1322</atom:id>
  <atom:published>2008-03-06T16:34:12Z</atom:published>
  <atom:updated>2008-08-28T11:26:03.000004-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;TPC H as Linked Data (Updated 2)&lt;/div&gt; &lt;p&gt;We have a new demo online at &lt;a href=&quot;http://demo.openlinksw.com/tpc-h&quot; id=&quot;link-id1829c9a0&quot;&gt;http://demo.openlinksw.com/tpc-h&lt;/a&gt;. This takes the industry standard &lt;a href=&quot;http://dbpedia.org/resource/TPC-H&quot; id=&quot;link-id0xf0fff10&quot;&gt;TPC-H&lt;/a&gt; benchmark &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1387cd90&quot;&gt;data&lt;/a&gt; and presents it as &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; id=&quot;link-id0x152407c0&quot;&gt;linked data&lt;/a&gt; with a &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x11657940&quot;&gt;SPARQL&lt;/a&gt; end point and dereferenceable URIs. &lt;/p&gt; &lt;p&gt;This is an example of using &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xe560628&quot;&gt;Virtuoso&lt;/a&gt;&amp;#39;s relational-to-&lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0xa07e2588&quot;&gt;RDF&lt;/a&gt; mapping for publishing business data, for browsing using the linked data principles and opening it to analytics queries in SPARQL.&lt;/p&gt; &lt;p&gt; As noted before, we have extended SPARQL with aggregation and nested queries, thus making it a viable &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0xa3666b00&quot;&gt;SQL&lt;/a&gt; substitute for decision support queries. 
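&lt;/p&gt; &lt;p&gt;To give a feel for what such a query looks like, here is a sketch in the spirit of TPC-H Q1, written in the aggregate syntax later standardized in SPARQL 1.1 rather than in any product-specific form. The &lt;code&gt;ex:&lt;/code&gt; namespace and the class and property names are hypothetical placeholders, not the vocabulary used by the demo:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;# Hypothetical vocabulary: ex:LineItem, ex:returnFlag, ex:lineStatus, ex:quantity.
PREFIX ex: &amp;lt;http://example.org/tpch#&amp;gt;

SELECT ?returnflag ?linestatus
       (SUM(?quantity) AS ?sum_qty)
       (COUNT(*) AS ?count_order)
WHERE {
  ?li a ex:LineItem ;
      ex:returnFlag ?returnflag ;
      ex:lineStatus ?linestatus ;
      ex:quantity   ?quantity .
}
GROUP BY ?returnflag ?linestatus
ORDER BY ?returnflag ?linestatus&lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;The grouping and aggregation here correspond to what SQL expresses with &lt;code&gt;GROUP BY&lt;/code&gt;, &lt;code&gt;SUM&lt;/code&gt;, and &lt;code&gt;COUNT&lt;/code&gt;. 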
&lt;/p&gt; &lt;p&gt;The article at &lt;a href=&quot;http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VOSTPCHLinkedData&quot; id=&quot;link-id10799d10&quot;&gt;http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VOSTPCHLinkedData&lt;/a&gt; gives details and the source code for the implementation.&lt;/p&gt; &lt;p&gt; We are still working on some aspects of the more complex TPC-H queries, thus the demo is not complete with all the 22 queries. This is however enough to see a representative sample of how analytics queries work with SPARQL and Virtuoso&amp;#39;s SQL-to-RDF mapping. The demo will be part of the next Virtuoso Open Source download, probably out next week.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>LUBM results with Virtuoso 6.0</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-02-04#1309</atom:id>
  <atom:published>2008-02-04T10:26:17Z</atom:published>
  <atom:updated>2008-08-28T12:06:11-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;LUBM results with Virtuoso 6.0&lt;/div&gt; &lt;p&gt;We have now run the LUBM benchmark on &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1c2197e8&quot;&gt;Virtuoso&lt;/a&gt; v6, with the same configuration &lt;a href=&quot;http://www.openlinksw.com/dataspace/oerling/weblog/Orri%20Erling%27s%20Blog/1302&quot; id=&quot;link-id107f0238&quot;&gt;as discussed last Friday&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;We had a database of 8000 universities, and we ran 8 clients on slices of 100, 1000 and 8000 universities — same &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x7d0e498&quot;&gt;data&lt;/a&gt; but different sizes of working set.&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt; 100 universities: 35.3 qps
1000 universities: 26.3 qps
8000 universities: 13.1 qps&lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;The 100 universities slice is about the same as with v5.0.5 (35.3 vs. 33.1 qps). &lt;br /&gt;The 8000 universities set is almost 3x better (13.1 vs. 4.8 qps).&lt;/p&gt; &lt;p&gt;This comes from the fact that the v6 database takes half of the space of the v5.0.5 one. Further, this is with 64-bit IDs for everything. If the v5.0.5 database were with 64-bit IDs, we&amp;#39;d have a difference of over 3x. This is worth something if it lets you get by with only 1 terabyte of RAM for the 100 billion triple application, instead of 3 TB.&lt;/p&gt; &lt;p&gt; &lt;a href=&quot;http://www.openlinksw.com/dataspace/oerling/weblog/Orri%20Erling%27s%20Blog/1358&quot; id=&quot;link-id15fb4d38&quot;&gt;In a few more days&lt;/a&gt;, we&amp;#39;ll give the results for Virtuoso v6 Cluster.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Latest LUBM Benchmark results for Virtuoso</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2008-02-01#1305</atom:id>
  <atom:published>2008-02-01T15:16:59Z</atom:published>
  <atom:updated>2008-08-28T12:06:08-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Latest LUBM Benchmark results for Virtuoso&lt;/div&gt; &lt;p&gt;We have now taken a close look at the query side of the LUBM benchmark, &lt;a href=&quot;http://www.openlinksw.com/dataspace/oerling/weblog/Orri%20Erling%27s%20Blog/1296&quot; id=&quot;link-id10a98120&quot;&gt;as promised a couple of blog posts ago.&lt;/a&gt; &lt;/p&gt; &lt;p&gt;We load 8000 universities and run a query mix consisting of the 14 LUBM queries with different numbers of clients against different portions of the database.&lt;/p&gt; &lt;p&gt;When it is all in memory, we get 33 queries per second with 8 concurrent clients; when it is so I/O bound that 7.7 of 8 threads wait for disk, we get 5 qps. This was run in 8G RAM with 2 Xeon 5130.&lt;/p&gt; &lt;p&gt;We adapted some of the queries so that they do not run over the whole database. In terms of retrieving triples per second, this would be about 330000 for the rate of 33 qps, with 4 cores at 2GHz. This is a combination of random access and linear scans and bitmap merge intersections; lookups for non-found triples are not counted. The rate of random lookups alone based on known G, S, P, O, without any query logic, is about 250000 random lookups per core per second.&lt;/p&gt; &lt;p&gt;The article &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main/VOSArticleLUBMBenchmark&quot; id=&quot;link-id10237708&quot;&gt;LUBM and Virtuoso&lt;/a&gt; gives the details.&lt;/p&gt; &lt;p&gt;In the process of going through the workload we made some cost model adjustments and optimized the bitmap intersection join. In this way we can quickly determine which subjects are, for example, professors holding a degree from a given university. 
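&lt;/p&gt; &lt;p&gt;A query of that shape, sketched against the LUBM vocabulary, could look as follows. The choice of class, property, and university IRI is illustrative and assumes the usual univ-bench namespace; it is not one of the 14 benchmark queries:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;PREFIX ub: &amp;lt;http://www.lehigh.edu/~zhp2/2004/0401/univ-bench.owl#&amp;gt;

# Two patterns sharing ?s: subjects typed as full professors,
# intersected with subjects holding a doctorate from University0.
SELECT ?s
WHERE {
  ?s a ub:FullProfessor .
  ?s ub:doctoralDegreeFrom &amp;lt;http://www.University0.edu&amp;gt; .
}&lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;Each of the two patterns fixes the predicate and object and leaves the subject open, so answering the query comes down to intersecting two sets of subjects. 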
So the benchmark served us well in that it provided an incentive to further optimize some things.&lt;/p&gt; &lt;p&gt;Now, what has been said about &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x12cba3d8&quot;&gt;RDF&lt;/a&gt; benchmarking previously still holds. What does it mean to do so many LUBM queries per second? What does this say about the capacity to run an online site off RDF &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x9e043f38&quot;&gt;data&lt;/a&gt;? Or about &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x10c1a4f8&quot;&gt;information&lt;/a&gt; integration? Not very much. But then this was not the aim of the authors either.&lt;/p&gt; &lt;p&gt;So we still need to make a benchmark for online queries and search, and another for E-science and business intelligence. But we are getting there.&lt;/p&gt; &lt;p&gt;In the immediate future, we have the general availability of &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x9ec5e620&quot;&gt;Virtuoso&lt;/a&gt; Open Source 5.0.5 early next week. This comes with a LUBM test driver and a test suite running against the LUBM qualification database.&lt;/p&gt; &lt;p&gt;After this we will give some numbers for the cluster edition with LUBM and &lt;a href=&quot;http://dbpedia.org/resource/TPC-H&quot; id=&quot;link-id0xcd7ec48&quot;&gt;TPC-H&lt;/a&gt;.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>RDBMS to RDF Mapping Workshop, and Benchmarks</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2007-11-21#1271</atom:id>
  <atom:published>2007-11-21T13:07:03Z</atom:published>
  <atom:updated>2008-04-25T16:29:53-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;RDBMS to RDF Mapping Workshop, and Benchmarks&lt;/div&gt; &lt;p&gt;I was recently in Boston for the &lt;a href=&quot;http://www.w3.org/2007/03/RdfRDB/&quot; id=&quot;link-id10f990b0&quot;&gt;Mapping Relational Data to RDF workshop&lt;/a&gt; of the W3C.&lt;/p&gt; &lt;p&gt;The common feeling was that mapping everything to &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1c343278&quot;&gt;RDF&lt;/a&gt; and querying it in terms of a generic domain ontology, mapped on demand into whatever line of business systems, would be very good if it only could be done. However, since this is not so easily done, the next best is to extract the &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0xb6f01d0&quot;&gt;data&lt;/a&gt; and then warehouse it as RDF.&lt;/p&gt; &lt;p&gt;The obstacles perceived were of the following types:&lt;/p&gt; &lt;ul&gt; &lt;li&gt; &lt;p&gt;Lack of quality in the data. The different line of business systems do not in and of themselves hold enough semantics. If the meaning of data columns in relational tables were really known and explicit, these could be meaningfully used for joining across systems. But this is more complex than just mapping the metal &lt;i&gt;lead&lt;/i&gt; to the chemical symbol &lt;i&gt;Pb&lt;/i&gt; and back.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;Lack of performance in RDF storage. Data sets even in the tens-of-millions of triples do not run very well in some stores. 
Well, we had the Banff life sciences demo with 450M triples in a small server box running &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1ca1c488&quot;&gt;Virtuoso&lt;/a&gt;, so this is not universal, plus of course we are coming up with a whole different order of magnitude, as often discussed on this &lt;a href=&quot;http://dbpedia.org/resource/Blog&quot; id=&quot;link-id0xb4dc850&quot;&gt;blog&lt;/a&gt;.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;Lack of functionality in mapping and possibly lack of pushing through enough of the query processing to the underlying data stores.&lt;/p&gt; &lt;/li&gt; &lt;/ul&gt; &lt;p&gt;Personally, I am quite aware of what to do with regard to performance of mapping and storage, and see these as eminently solvable issues. After all, we have a great investment of talent in databases in general and it can be well deployed towards RDF, as we have been doing these past couple of years. So we talk about the promise of a 360-degree view of &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x1ae64448&quot;&gt;information&lt;/a&gt;, with RDF being the top layer. Everybody agrees that this is a nice concept. But this is a nice concept especially when it can do the things that are the most common baseline expectation of any regular DBMS, i.e., aggregation, grouping, sub-queries, VIEWs. 
Now, I would not go sell a DBMS that has no &lt;code&gt;COUNT&lt;/code&gt; operator to a data warehousing shop.&lt;/p&gt; &lt;p&gt;The fact that OpenLink and &lt;a href=&quot;http://dbpedia.org/resource/Oracle_Database&quot; id=&quot;link-id0x1aa10fd8&quot;&gt;Oracle&lt;/a&gt; allow RDF inside &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0xa26d330&quot;&gt;SQL&lt;/a&gt;, and OpenLink even adds native aggregates and grouping to &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x1d81d990&quot;&gt;SPARQL&lt;/a&gt;, fixes the problem with regard to specific products, but leaves the standardization issue open. Of course, any vendor will solve these questions one way or another because a database with no aggregation is a non-starter.&lt;/p&gt; &lt;p&gt;I talked to Lee Feigenbaum, chair of the W3C DAWG, about the question of aggregates and general BI capabilities in SPARQL. He told me that, prior to his time with the DAWG, these were left out because they conflicted with the &lt;a href=&quot;http://dbpedia.org/resource/Open_world_assumption&quot; id=&quot;link-id0x1be3ab98&quot;&gt;open-world&lt;/a&gt; assumption around RDF: You cannot count a set because by definition you do not know that you have all the members, the world being open and all that.&lt;/p&gt; &lt;p&gt;Say what? Talk about the road to hell being paved with good intentions. Now, this is in no way Lee&amp;#39;s or the present day DAWG&amp;#39;s fault; as a member myself, I can attest to the good work and would under no circumstances wish any delays or revisions to SPARQL at this point. I am just pointing out a matter that all implementations should address, as a sort of precondition of entry into the real world IS space. 
If this can be done interoperably, so much the better.&lt;/p&gt; &lt;p&gt;Now, out of the deliberations at the Boston workshop arose at least two ideas for follow-up activity.&lt;/p&gt; &lt;p&gt;The first was an incubator group for RDF store and mapping benchmarking. This is very appropriate in order to dispel the bad name RDF storage and querying performance has been saddled with. As a first step in this direction, I will outline a &lt;a href=&quot;http://www.openlinksw.com/dataspace/oerling/weblog/Orri%20Erling%27s%20Blog/1269&quot; id=&quot;link-id10306200&quot;&gt;social web oriented benchmark&lt;/a&gt; on this blog.&lt;/p&gt; &lt;p&gt;The second activity was an &lt;a href=&quot;http://www.w3.org/2005/Incubator/rdb2rdf/&quot; id=&quot;link-id10150a58&quot;&gt;incubator group for preparing standardization of mapping methodologies from relational schemas to RDF&lt;/a&gt;. We will be active on this as well.&lt;/p&gt; &lt;p&gt;The two offshoots appear logically separate but are not necessarily so in practice. A benchmark is after all something that is supposed to promote a technology to a user base. The user base seems to wish to put all online systems and data warehouses under a common top level RDF model and then query away, introducing no further replication of data or performance cost or ETL latencies.&lt;/p&gt; &lt;p&gt;Updating would also be nice but even query only would be very good. Personally, I&amp;#39;d say the RDF strength is all on the query side. Transactions are taken care of well enough by what there already is; RDF stands out in integration and the ad-hoc and discovery side of the matter. Given this, we expect the value to be consumed in a heterogeneous, multi-database, federated environment. Thus a benchmark should measure this aspect of the use-case. 
With the right mapping and queries, we could probably demonstrate the added cost of RDF to be very low, as long as we could push all queries that can be answered by a single source to the responsible DBMS. For distributed joins, we are back at the question of optimizing distributed queries but this is a familiar one and RDF is not the principal cost factor.&lt;/p&gt; &lt;p&gt;The subject does become quite complex at this point. We would have to take supposedly representative synthetic OLTP and BI data sets (like the ones in TPC-D, TPC-E, and &lt;a href=&quot;http://dbpedia.org/resource/TPC-H&quot; id=&quot;link-id0xb576e78&quot;&gt;TPC-H&lt;/a&gt;), and invent queries across them that would both make sense and be implementable in SPARQL extended with aggregates and sub-queries. Reliance on SPARQL extensions is simply unavoidable. Setting up the test systems would be non-trivial, even though there is a lot of industry experience in these matters on the database side.&lt;/p&gt; &lt;p&gt;So, while this is probably the benchmark most relevant to the target audience, we may have to start with a simpler one. I will next &lt;a href=&quot;http://www.openlinksw.com/dataspace/oerling/weblog/Orri%20Erling%27s%20Blog/1269&quot; id=&quot;link-id10fa7a50&quot;&gt;outline something to that effect&lt;/a&gt;.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso Cluster Stage 1</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2007-09-06#1251</atom:id>
  <atom:published>2007-09-06T11:11:52Z</atom:published>
  <atom:updated>2008-04-25T15:57:25-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Virtuoso Cluster Stage 1&lt;/div&gt; &lt;p&gt;I recall a quote from a stock car racing movie.&lt;/p&gt; &lt;p&gt;&amp;quot;What is the necessary prerequisite for winning a race?&amp;quot; asked the racing team boss.&lt;/p&gt; &lt;p&gt;&amp;quot;Being the fastest,&amp;quot; answered the hotshot driver, after yet another wrecked engine.&lt;/p&gt; &lt;p&gt;&amp;quot;No. It is finishing the race.&amp;quot;&lt;/p&gt; &lt;p&gt;In the interest of finishing, we&amp;#39;ll now leave optimizing the cluster traffic and scheduling and move to completing functionality. Our next stop is TPC-D. After this &lt;a href=&quot;http://dbpedia.org/resource/TPC-C&quot; id=&quot;link-id0x1c3d9cb8&quot;&gt;TPC-C&lt;/a&gt;, which adds the requirement of handling distributed deadlocks. After this we add &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1be00c10&quot;&gt;RDF&lt;/a&gt;-specific optimizations.&lt;/p&gt; &lt;p&gt;This will be &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1f91c098&quot;&gt;Virtuoso&lt;/a&gt; 6 with the first stage of clustering support. This is with fixed partitions, which is just like a single database, except it runs on multiple machines. The stage after this is Virtuoso Cloud, the database with all the space filling properties of foam, expanding and contracting to keep an even &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1e1aeff8&quot;&gt;data&lt;/a&gt; density as load and resource availability change.&lt;/p&gt; &lt;p&gt;Right now, we have a pretty good idea of the final form of evaluating loop joins in a cluster, which after all is the main function of the thing. It makes sense to tune this to a point before going further. 
You want the pipes and pumps and turbines to have known properties and fittings before building a power plant.&lt;/p&gt; &lt;p&gt;To test this, we took a table of a million short rows and made one copy partitioned over 4 databases and one copy with all rows in one database. We ran all the instances in a 4 core Xeon box. We used Unix sockets for communication.&lt;/p&gt; &lt;p&gt;We joined the table to itself, like &lt;code&gt;&lt;b&gt;SELECT COUNT (*) FROM ct a, ct b WHERE b.row_no = a.row_no + 3&lt;/b&gt;&lt;/code&gt;. The &lt;code&gt;&lt;b&gt;+ 3&lt;/b&gt;&lt;/code&gt; causes the joined rows never to be on the same partition.&lt;/p&gt; &lt;p&gt;With cluster, the single operation takes 3s and with a single process it takes 4s. The overall CPU time for cluster is about 30% higher, some of which is inevitable since it must combine results, serialize them, and so forth. Some real time is gained by doing multiple iterations of the inner loop (getting the row for b) in parallel. This can be further optimized to maybe 2x better with cluster but this can wait a little.&lt;/p&gt; &lt;p&gt;Then we make a stream of 10 such queries. The stream with cluster is 14s; with the single process, it is 22s. Then we run 4 streams in parallel. The time with cluster is 39s and with a single process 36s. With 16 streams in parallel, cluster gets 2m51 and single process 3m21.&lt;/p&gt; &lt;p&gt;The conclusion is that clustering overhead is not significant in a CPU-bound situation. Note that all the runs were at 4 cores at 98-100%, except for the first, single-client run, which had one process at 98% and 3 at 32%.&lt;/p&gt; &lt;p&gt;The SMP single process loses by having more contention for mutexes serializing index access. Each wait carries an entirely ridiculous penalty of up to 6µs or so, &lt;a href=&quot;http://www.openlinksw.com/dataspace/oerling/weblog/Orri%20Erling%27s%20Blog/1229&quot; id=&quot;link-id106dca10&quot;&gt;as discussed earlier on this blog&lt;/a&gt;. 
The cluster wins by less contention due to distributed data and loses due to having to process messages and remember larger intermediate results. These balance out, or close enough.&lt;/p&gt; &lt;p&gt;For the case with a single client, we can cut down on the coordination overhead by simply optimizing the code some more. This is quite possible, so we could get one process at 100% and 3 at 50%.&lt;/p&gt; &lt;p&gt;The numbers are only relevant as ballpark figures and the percentages will vary between different queries. The point is to prove that we actually win and do not jump from the frying pan into the fire by splitting queries across processes. As a point of comparison, running the query clustered just as one would run it locally took 53s.&lt;/p&gt; &lt;p&gt;We will later look at &lt;a href=&quot;http://www.openlinksw.com/dataspace/oerling/weblog/Orri%20Erling%27s%20Blog/1336&quot; id=&quot;link-id108d9868&quot;&gt;the effects of different networks&lt;/a&gt;, as we get to revisit the theme with some real benchmarks.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso and cluster capacity allocation</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2007-08-28#1248</atom:id>
  <atom:published>2007-08-28T11:54:30Z</atom:published>
  <atom:updated>2008-04-30T15:08:20-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Virtuoso and cluster capacity allocation&lt;/div&gt; &lt;p&gt;I just read &lt;a href=&quot;http://labs.google.com/papers/bigtable.html&quot; id=&quot;link-id140a68b0&quot;&gt;Google&amp;#39;s Bigtable&lt;/a&gt; paper. It is relevant here because it talks about keeping petabyte scale (1024TB) tables on a variable size cluster of machines.&lt;/p&gt; &lt;p&gt;I have talked about partitioning versus distributed cache in the &lt;a href=&quot;http://www.openlinksw.com/weblog/oerling/?id=1229&quot; id=&quot;link-id13f70dc8&quot;&gt;second to last post&lt;/a&gt;. The problem in short is that you do not expect a DBA to really know how to partition things, and even if the indices are correctly partitioned initially, repartitioning them is so bad that doing it online can be a problem. And repartitioning is needed whenever adding machines, unless the size increment is a doubling, which it will never be.&lt;/p&gt; &lt;p&gt;So &lt;a href=&quot;http://dbpedia.org/resource/Oracle_Database&quot; id=&quot;link-id0xa2957e38&quot;&gt;Oracle&lt;/a&gt; has really elegantly stepped around the whole problem by not partitioning for clustering in the first place. So incremental capacity change does not require repartitioning. Oracle has partitioning for other purposes but this is not tied to their cluster proposition.&lt;/p&gt; &lt;p&gt;I did not go the cache fusion route because I could not figure a way to know with near certainty where to send a request for a given key value. In the case we are interested in, the job simply must go to the &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0xa20ce8e0&quot;&gt;data&lt;/a&gt; and not the other way around. Besides, not being totally dependent on a microsecond latency interconnect and a SAN for performance enhances deployment options. 
Sending large batches of functions tolerates latency better than cache consistency messages which are a page at a time, unless of course you kill yourself with extra trickery for batching these too.&lt;/p&gt; &lt;p&gt;So how to adapt to capacity change? Well, by making the unit of capacity allocation much smaller than a machine, of course.&lt;/p&gt; &lt;p&gt;Google has done this in Bigtable by a scheme of dynamic range partitioning. The partition size is in the tens to hundreds of megabytes, something that can be moved around within reason. When the partition, called a tablet, gets too big, it splits. Just like a Btree index. The tree top must be common &lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id0xa20cebb0&quot;&gt;knowledge&lt;/a&gt;, as well as the allocation of partitions to servers but these can be cached here and there and do not change all the time.&lt;/p&gt; &lt;p&gt;So how could we do something of the sort here? I know for an experiential fact that when people cannot change the server memory pool size, let alone correctly set up disk striping, they simply cannot be expected to deal with partitioning. Besides, even if you know exactly what you are doing and why, configuring and refilling large numbers of partitions by hand is error prone, tedious, time consuming, and will run out of disk and require restoring backups and all sorts of DBA activity that will have everything down for a long time, unless of course you have MIS staff such as is not easily found.&lt;/p&gt; &lt;p&gt;The solution is not so complex. We start with a set number of machines and make a file group on each. A file group has a bunch of disk stripes and a log file and can be laid out on the local file system in the usual manner. The data goes into the file group, partitioned as defined. You still specify partitioning columns but not where each partition goes. The system will decide this by itself. When a server&amp;#39;s file group gets too big, it splits. 
One half of each key&amp;#39;s partition in the original stays where it was and the other half goes to the copy. The copies will hold rows that no longer belong there but these can be removed in the background. The new file group will be managed by the same server process and the partitioning &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0xc803148&quot;&gt;information&lt;/a&gt; on all servers gets updated to reflect the existence of the new file group and the range of hash values that belong there.&lt;/p&gt; &lt;p&gt;If a file group is kept at some reasonable size, under a few GB, it can be moved around between servers, even dynamically. &lt;/p&gt; &lt;p&gt;If data is kept replicated, then the replicas have to split at the same time and the system will have to make sure that the replicas are kept on separate machines.&lt;/p&gt; &lt;p&gt;So what happens to disk locality when file groups split? Nothing much. Firstly, partitioning will be set up so that consecutive values go to the same hash value, so that key compression is not ruined. Thus, consecutive numbers will be on the same page. Imagine an integer key partitioned two ways on bits 10-20. Values 0-1K go together, values 1K-2K go another way, values 2K-3K go the first way etc. &lt;/p&gt; &lt;p&gt;Now let us suppose the first partition, the even Ks, splits. It could split so that multiples of 4K go one way and the rest another way. Now we&amp;#39;d have 0-1K in place, 2K-3K in the new partition, 4K-5K in place and so on. A sequential disk read, with some read ahead, would scan the partitions in parallel but the disk access would be made sequential by the read ahead logic; remember that these are controlled by the same server process.&lt;/p&gt; &lt;p&gt;For purposes of sending functions, the file group would be the recipient, not the host, per se. The allocation of file groups to hosts could change. 
&lt;/p&gt; &lt;p&gt;Now picture a transaction that touches multiple file groups. The requests going to collocated file groups can travel in the same batch and the recipient server process can run them sequentially or with a thread per file group, as may be convenient. Multiple threads per query on the same index make contention and needless thread switches. But since distinct file groups have their distinct mutexes there is less interference.&lt;/p&gt; &lt;p&gt;For purposes of transactions, we might view a file group as deserving its own branch. In this way we would not have to abort transactions if file groups moved. A file group split would probably have to kill all uncommitted transactions on it so as not to have to split one branch in two or deal with uncommitted data in the split. This is hardly a problem, the event being rare. For purposes of checkpoints, logging, log archival, recovery, and such, a file group is its own unit. The Bigtable paper had some ideas about combining transaction logs and such, all quite straightforward and intuitive.&lt;/p&gt; &lt;p&gt;Writing the clustering logic with the file group, not the database process, as the main unit of location is a good idea and an entirely trivial change. This will make it possible to adjust capacity in almost real time without bringing everything to a halt by re-inserting terabytes of data in system wide repartitioning runs.&lt;/p&gt; &lt;p&gt;Implementing this on the current &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xa21de0e0&quot;&gt;Virtuoso&lt;/a&gt; is not a real difficulty. There is already a concept of file group, although we use only two, one for the data and one for temp. Using multiple ones is not a big deal.&lt;/p&gt; &lt;p&gt;Supporting capacity allocation at the file group level instead of the server level can be introduced towards the middle of the clustering effort and will not greatly impact timetables.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso Cluster Preview</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2007-08-27#1245</atom:id>
  <atom:published>2007-08-27T09:51:45Z</atom:published>
  <atom:updated>2008-04-25T11:59:31-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Virtuoso Cluster Preview&lt;/div&gt; &lt;p&gt; &lt;b&gt;&lt;i&gt;I wrote the basics of the &lt;a href=&quot;http://www.openlinksw.com/dataspace/oerling/weblog/Orri%20Erling%27s%20Blog/1229&quot; id=&quot;link-id1383c310&quot;&gt;Virtuoso clustering support&lt;/a&gt; over the past three weeks. It can now manage connections, decide where things go, do two phase commits, insert and select &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0xad603e0&quot;&gt;data&lt;/a&gt; from tables partitioned over multiple &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xbc64f48&quot;&gt;Virtuoso&lt;/a&gt; instances. It works about enough to be measured, of which I will &lt;a href=&quot;http://dbpedia.org/resource/Blog&quot; id=&quot;link-id0xb958e90&quot;&gt;blog&lt;/a&gt; more over the next two weeks.&lt;/i&gt; &lt;/b&gt; &lt;/p&gt; &lt;p&gt; &lt;b&gt;&lt;i&gt;I will in the following give a features preview of what will be in the Virtuoso clustering support when it is released in the fall of this year (2007).&lt;/i&gt; &lt;/b&gt; &lt;/p&gt; &lt;h3&gt;Data Partitioning&lt;/h3&gt; &lt;p&gt;A Virtuoso database consists of indices only, so that the row of a table is stored together with the primary key. Blobs are stored on separate pages when they do not fit inline within the row. With clustering, partitioning can be specified index by index. 
Partitioning means that values of specific columns are used for determining where the containing index entry will be stored. Virtuoso partitions by hash and allows specifying what parts of partitioning columns are used for the hash, for example bits 14-6 of an integer or the first 5 characters of a string. Like this, key compression gains are not lost by storing consecutive values on different partitions.&lt;/p&gt; &lt;p&gt;Once the partitioning is specified, we specify which set of cluster nodes stores this index. Not every index has to be split evenly across all nodes. Also, all nodes do not have to have equal slices of the partitioned index, accommodating differences in capacity between cluster nodes.&lt;/p&gt; &lt;p&gt;Each Virtuoso instance can manage up to 32TB of data. A cluster has no definite size limit.&lt;/p&gt; &lt;h3&gt;Load Balancing and Fault Tolerance&lt;/h3&gt; &lt;p&gt;When data is partitioned, an operation on the data goes where the data is. This provides a certain natural parallelism but we will discuss this further below.&lt;/p&gt; &lt;p&gt;Some data may be stored multiple times in the cluster, either for fail-over or for splitting read load. Some data, such as database schema, is replicated on all nodes. When specifying a set of nodes for storing the partitions of a key, it is possible to specify multiple nodes for the same partition. If this is the case, updates go to all nodes and reads go to a randomly picked node from the group.&lt;/p&gt; &lt;p&gt;If one of the nodes in the group fails, operation can resume with the surviving node. The failed node can be brought back online from the transaction logs of the surviving nodes. 
A few transactions may be rolled back at the time of failure and again at the time of the failed node rejoining the cluster but these are aborts as in the case of deadlock and lose no committed data.&lt;/p&gt; &lt;h3&gt;Shared Nothing&lt;/h3&gt; &lt;p&gt;The Virtuoso architecture does not require a SAN for disk sharing across nodes. This is reasonable since a few disks on a local controller can easily provide 300MB/s of read and passing this over an interconnect fabric that would also have to carry inter-node messages could saturate even a fast network. &lt;/p&gt; &lt;h3&gt;Client View&lt;/h3&gt; &lt;p&gt;A &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0xbb32b58&quot;&gt;SQL&lt;/a&gt; or &lt;a href=&quot;http://dbpedia.org/resource/Hypertext_Transfer_Protocol&quot; id=&quot;link-id0xaa4e2b8&quot;&gt;HTTP&lt;/a&gt; client can connect to any node of the cluster and get an identical view of all data with full transactional semantics. DDL operations like table creation and package installation are limited to one node, though.&lt;/p&gt; &lt;p&gt;Applications such as &lt;a href=&quot;http://dbpedia.org/resource/OpenLink_Data_Spaces&quot; id=&quot;link-id0x1bc18300&quot;&gt;ODS&lt;/a&gt; will run unmodified. They are installed on all nodes with a single install command. After this, the data partitioning must be declared, which is a one time operation to be done cluster by cluster. The only application change is specifying the partitioning columns for each index. The gain is optional redundant storage and capacity not limited to a single machine. The penalty is that single operations may take a little longer when not all data is managed by the same process but then the parallel throughput is increased. We note that the main ODS performance factor is web page logic and not database access. 
Thus splitting the web server logic over multiple nodes gives basically linear scaling.&lt;/p&gt; &lt;h3&gt;Parallel Query Execution&lt;/h3&gt; &lt;p&gt;Message latency is the principal performance factor in a clustered database. Due to this, Virtuoso packs the maximum number of operations in a single message. For example, when doing a loop join that reads one table sequentially and retrieves a row of another table for each row of the outer table, a large number of iterations of the inner loop are run in parallel. So, if there is a join of five tables that gets one row from each table and all rows are on different nodes, the time will be spent on message latency. If each step of the join gets 10 rows, for a total of 100000 results, the message latency is not a significant factor and the cluster will clearly outperform a single node.&lt;/p&gt; &lt;p&gt;Also, if the workload consists of large numbers of concurrent short updates or queries, the message latencies will even out and throughput will scale up even if doing a single transaction were faster on a single node.&lt;/p&gt; &lt;h3&gt;Parallel SQL&lt;/h3&gt; &lt;p&gt;There are SQL extensions for stored procedures allowing parallelizing operations. For example, if a procedure has a loop doing inserts, the inserted rows can be buffered until a sufficient number is available, at which point they are sent in batches to the nodes concerned. 
Transactional semantics are kept but error detection is deferred to the actual execution.&lt;/p&gt; &lt;h3&gt;Transactions&lt;/h3&gt; &lt;p&gt;Each transaction is owned by one node of the cluster, the node to which the client is connected. When more than one node besides the owner of the transaction is updated, two phase commit is used. This is transparent to the application code. No external transaction monitor is required; the Virtuoso instances perform these functions internally. There is a distributed deadlock detection scheme based on the nodes periodically sharing transaction waiting &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0xb78b5c0&quot;&gt;information&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Since read transactions can operate without locks, reading the last committed state of uncommitted updated rows, waiting for locks is not very common.&lt;/p&gt; &lt;h3&gt;Interconnect and Threading&lt;/h3&gt; &lt;p&gt;Virtuoso uses TCP to connect between instances. A single instance can have multiple listeners at different network interfaces for cluster activity. The interfaces will be used in a round-robin fashion by the peers, spreading the load over all network interfaces. A separate thread is created for monitoring each interface. Long messages, such as transfers of blobs, are done on a separate thread, thus allowing normal service on the cluster node while the transfer is proceeding.&lt;/p&gt; &lt;p&gt;We will have to test the performance of TCP over &lt;i&gt;Infiniband&lt;/i&gt; to see if there is clear gain in going to a lower level interface like &lt;i&gt;MPI&lt;/i&gt;. The Virtuoso architecture is based on streams connecting cluster nodes point to point. The design does not per se gain from remote DMA or other features provided by MPI. Typically, messages are quite short, under 100K. 
Flow control for transfer of blobs is however nice to have but can be written at the application level if needed. We will get real data on the performance of different interconnects in the next weeks. &lt;/p&gt; &lt;h3&gt;Deployment and Management&lt;/h3&gt; &lt;p&gt;Configuring is quite simple, with each process sharing a copy of the same configuration file. One line in the file differs from host to host, telling it which one it is. Otherwise the database configuration files are individual per host, accommodating different file system layouts etc. Setting up a node requires copying the executable and two configuration files, no more. All functionality is contained in a single process. There are no installers to be run or such.&lt;/p&gt; &lt;p&gt;Changing the number or network interface of cluster nodes requires a cluster restart. Changing data partitioning requires copying the data into a new table and renaming this over the old one. This is time consuming and does not mix well with updates. Splitting an existing cluster node requires no copying with repartitioning but shifting data between partitions does.&lt;/p&gt; &lt;p&gt;A consolidated status report shows the general state and level of intra-cluster traffic as count of messages and count of bytes.&lt;/p&gt; &lt;p&gt;Start, shutdown, backup, and package installation commands can only be issued from a single master node. 
Otherwise all is symmetrical.&lt;/p&gt; &lt;h3&gt;Present State and Next Developments&lt;/h3&gt; &lt;p&gt;The basics are now in place. Some code remains to be written for such things as distributed deadlock detection, 2-phase commit recovery cycle, management functions, etc. Some SQL operations like text index, statistics sampling, and index intersection need special support, yet to be written.&lt;/p&gt; &lt;p&gt;The &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0xbca4e90&quot;&gt;RDF&lt;/a&gt; capabilities are not specifically affected by clustering except in a couple of places. Loading will be slightly revised to use larger batches of rows to minimize latency, for example.&lt;/p&gt; &lt;p&gt;There is a pretty much infinite world of SQL optimizations for splitting aggregates, taking advantage of co-located joins etc. These will be added gradually. These are however not really central to the first application of RDF storage but are quite important for business intelligence, for example.&lt;/p&gt; &lt;p&gt;We will run some benchmarks for comparing single host and clustered Virtuoso instances over the next weeks. Some of this will be with real data, giving an estimate on when we can move some of the RDF data we presently host to the new platform. We will benchmark against &lt;a href=&quot;http://dbpedia.org/resource/Oracle_Database&quot; id=&quot;link-id0xa9cc1b8&quot;&gt;Oracle&lt;/a&gt; and &lt;a href=&quot;http://dbpedia.org/resource/IBM_DB2&quot; id=&quot;link-id0x1be5abb0&quot;&gt;DB2&lt;/a&gt; later but first we get things to work and compare against ourselves.&lt;/p&gt; &lt;p&gt;We roughly expect a halving in space consumption and a significant increase in single query performance and linearly scaling parallel throughput through addition of cluster nodes.&lt;/p&gt; &lt;p&gt; &lt;i&gt;The &lt;a href=&quot;http://www.openlinksw.com/dataspace/oerling/weblog/Orri%20Erling%27s%20Blog/1246&quot; 
id=&quot;link-id106de430&quot;&gt;next update&lt;/a&gt; will be on this blog within two weeks.&lt;/i&gt; &lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>RDF, Clustered Databases, and Partitioning Vs. Cache Fusion</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2007-07-19#1230</atom:id>
  <atom:published>2007-07-19T12:29:12Z</atom:published>
  <atom:updated>2008-04-25T11:59:26-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;RDF, Clustered Databases, and Partitioning Vs. Cache Fusion&lt;/div&gt; &lt;p&gt;I recently read &lt;a href=&quot;http://dbpedia.org/resource/Oracle_Database&quot; id=&quot;link-id0xb279258&quot;&gt;Oracle&lt;/a&gt;&amp;#39;s papers about RAC, Real Application Clusters. This is relevant as we are presently working on the &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xba1d4e0&quot;&gt;Virtuoso&lt;/a&gt; equivalent.&lt;/p&gt; &lt;p&gt;Caveat: The following is quite technical and not the final word on the matter.&lt;/p&gt; &lt;p&gt;Oracle&amp;#39;s claim is roughly as follows: Take a number of machines with access to a shared pool of disks and get scalability in processing power and memory without having to explicitly partition the &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1a9a7ab8&quot;&gt;data&lt;/a&gt; or perform other complicated configuration.&lt;/p&gt; &lt;p&gt;This works through implementing a cache consistency protocol between the participating boxes and by parallelizing queries just as one would do on a shared memory SMP box. Each disk page has a box assigned to keep track of it and the responsibility migrates so that the box most often needing the page gets to be the page&amp;#39;s guardian, so as not to have to ask anybody else for permission to write the page.&lt;/p&gt; &lt;p&gt;This is a compelling proposition. Surely, it must be unrealistic to expect people to manually partition databases. This would require some understanding of first principles which is scarce out there.&lt;/p&gt; &lt;p&gt;So, should we implement clustering a la Oracle?&lt;/p&gt; &lt;p&gt;Let&amp;#39;s look at some basics. If we have an OLTP workload like &lt;a href=&quot;http://dbpedia.org/resource/TPC-C&quot; id=&quot;link-id0xbaa89d8&quot;&gt;TPC-C&lt;/a&gt;, we usually have affinity between clients and the data they access. 
This will make each client&amp;#39;s pages migrate to be managed by the box the client is connected to. This will work pretty well, no worse than with a single box. If two clients are updating the same data but are connected to two different boxes, this is quite bad since the box that does not have responsibility for the page must ask the other box for write access. This is a round trip, at least tens of microseconds (µs). Consider in comparison that finding a row out of a million takes some 3µs.&lt;/p&gt; &lt;p&gt;Would it not be better to have each partition in a known place and leave all processing to that place? The write contention would be resolved in the box owning the partition and there would be a message but now for requesting the update, not dealing with cache consistency. At what level should one communicate between cluster nodes? Talk about disk pages or about logical operations? If there is complete affinity between boxes and data, the RAC style shared cache needs no messages, each box ends up managing the pages of its clients and all works just as with a local situation. If on the other hand any client will update any page at random, most updates must request the write permission from another node. I will here presume that index tree tops get eventually cached on all nodes. If this were not so, even index lookups would have to most often request each index page from a remote box. Never forget that with a tree index, it takes about 1µs to descend one level and 50µs for a message round trip between processes, not counting any transport latency.&lt;/p&gt; &lt;p&gt;What of &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0xbe1bd98&quot;&gt;RDF&lt;/a&gt; query workloads? After all, we are in the first instance concerned with winning the RDF storage race and after this the TPC ones. 
We design for both but do RDF first since this is our chosen specialty.&lt;/p&gt; &lt;p&gt;The disadvantage of having to specify partitioning is less weighty with RDF since there are only a few big tables and they will be at default settings, pretty much always. We do not expect the application developer to ever change these settings although it is in principle possible.&lt;/p&gt; &lt;p&gt;What about queries? The RDF workload is mostly random access and loop joins. How would these run on RAC? For now, let&amp;#39;s make a thought experiment and compare cache fusion to a hash partitioned cluster. In the following, I do not describe how Oracle actually works but will just describe how I would do things if I implemented a RAC style clustering. With a RAC style cluster, I&amp;#39;d split the outer loop into equal partitions and run them in parallel on different boxes. Each would build a working set for its part of the query and pages that were needed by more than one box would be read once from disk and a second time from the node that had them in cache. The top nodes of index trees would end up cached on all boxes. It would seem that all boxes would fill their cache with the same data. Now it may be that RAC makes it so that a page is cached only on one box and other boxes wanting the page must go to that box to get access to the page. But this would be a disaster in index lookups. It is less than a microsecond per local index tree level, but if there is a round trip, it would be at least 50µs per level of the index tree. I don&amp;#39;t know for sure about Oracle but if I did RAC, I&amp;#39;d have to allow duplicate read content in caches. This would have the effect that the aggregate cache size would be closer to the single cache size than to the sum of the cache sizes. A physically partitioned database would not ship pages so caches would not overlap and the aggregate cache would indeed be the sum of the sizes. 
Now this is good only insofar as all boxes participate but with evenly distributed data this is a good possibility.&lt;/p&gt; &lt;p&gt;Of course, if RAC knew to split queries so that data and nodes had real affinity then the problem would be less. For indices one would need a map of key values to boxes. A little like a top level index shared among all nodes. The key value would give the node, just like with partitioning.&lt;/p&gt; &lt;p&gt;This would make partitioning on the fly. Joins that were made the most frequently would cause migration that would make these joins co-located.&lt;/p&gt; &lt;p&gt;We must optimize the number of messages needed to execute long series of loop joins. For parallelizing single queries, the most obvious approach would be to partition the first/outermost loop that is more than one iteration into equal size chunks. With RDF data, the join keys will mostly begin GS or GO with a possible P appended. If GO or GS specify the partition, partitioning by hash will yield the node that will provide the result.&lt;/p&gt; &lt;p&gt;The number of messages can be reduced to a minimum of the number of join steps times number of boxes minus one if the loops are short enough and multiple operations are carried by one message.&lt;/p&gt; &lt;p&gt;With RAC style clustering, each index lookup would have to be sent to the node most likely to hold the answer. If pages have to be fetched from other nodes, we have disastrous performance, at least 50µs for each non-local page. If there are two non-local pages in a lookup, the overhead will exceed the overhead of delegating the single lookup. Index page access in lookups cannot be easily batched, the way index lookups going to the same node can be batched. Batching multiple hopefully long operations into a single message is the only way to defeat the extreme cost of sending messages. An index lookup does not know what page it will need until it needs it. 
A way of batching these would be to run multiple lookups in parallel and to combine remote page requests grouping them by destination. This would not be impossible, simply we would have to run 100 index lookups in parallel on a thread, 100 first levels, 100 second levels and so forth. Suppose an outer loop that gives 100 rows and then an inner loop that retrieves 1 row for each. A query to get the email address of Mary would do this, supposing 100 Marys in the db. &lt;code&gt;{?person firstName &amp;quot;Mary&amp;quot; . ?person email ?m .}&lt;/code&gt;. Suppose a cluster of 10 nodes. The first node gets the 100 rows of the outer loop, splits these into 10x10, 10 on each node and then each node does 10 lookups in parallel, meaning 10 first levels, 10 second levels, 10 third levels. The index tree would be 4 deep, branching 300 ways. Running the query a second time would find all data in memory and run with only 18 messages after getting the 100 rows of the first loop. The first run would send lots of messages, almost two per page, for about 800 messages after getting the 100 rows of the first loop.&lt;/p&gt; &lt;p&gt;With partitioning, the situation would be sending 18 messages constant. 9 batches of 10 index lookups and their replies. The latency is 50µs and the lookup is 4µs. We would in fact gain in real time, counting 50µs for messages and 4µs per lookup, the time through the whole exercise of 10x10 random lookups would be 90µs.&lt;/p&gt; &lt;p&gt;If I did RAC style clustering, I&amp;#39;d have to allow replicating the tops of index trees to all caches, and I&amp;#39;d have to batch page request messages from index lookups, effectively doing the lookup vector processing style, meaning 100 first levels, 100 second levels etc. 
Given a key beginning, I&amp;#39;d have to know what node to send this to, meaning pretty much doing the first levels of the lookup before deciding where to send the lookup, only to have the lookup redone by the box ending up with the lookup. Doing things this way would make Oracle RAC style clustering work with the use case.&lt;/p&gt; &lt;p&gt;Given this, it appears that hash partitioning is easier to implement. Cache fusion clustering without the above mentioned gimmicks would be easiest of all but it would have a disastrous number of messages or it would fill all the caches with the same data. Avoiding this is possible but hard, as described above.&lt;/p&gt; &lt;p&gt;We will have to experiment with Oracle RAC itself a bit farther down the road. Deciding to use partitioning instead of cache fusion does bring along conversion cost and a very high cost for repartitioning.&lt;/p&gt; &lt;p&gt;Now let us look at the issue of co-location of joins. In a loop join this means that the node that holds the row from the outer loop also holds the row in the inner loop. For example, if order and order line are partitioned on order id, joining them on order id will be a co-located join. Such joins do not necessarily involve messages in partitioned clusters. In RAC, they do not involve messages if the pages have migrated to be managed by the node doing the join, otherwise they do, up to 20 or so for the worst case.&lt;/p&gt; &lt;p&gt;Do we get any benefit from co-location with RDF? Supposing joins that go from S to O to S (e.g., population of the city where Mary lives), we do not get much guarantee of co-location.&lt;/p&gt; &lt;p&gt;Suppose the indices GSPO partitioned on GS and OGPS on OG, we know the box with Marys and then we&amp;#39;d know the box where the residence of each Mary was, based on GS. Given the city as S, we would again know the box that had the population. All the three triples could be on different boxes. This cannot be helped at design time. 
At run time this can be helped by batching messages that go to the same node. Let&amp;#39;s see how this fans out. 100 Marys from the first node. To get the city, we get 10 batches of 10 messages. We get the 100 cities and then we get their populations, again 10 batches of 10. In this scenario, the scheduling is centrally done by one thread. Suppose it were done by the 10 batches of 10 for getting the city of each Mary. For 10 cities, we&amp;#39;d get 10 lookups for population, each potentially to a different node. For this case, managing the execution by one thread instead of several makes bigger batches and less messages, as one would expect.&lt;/p&gt; &lt;p&gt;It seems that with the RDF case, one may as well forget co-location. In the relational case, one must take advantage of it when there is co-location and when not, try to compensate with longer batches of function shipping.&lt;/p&gt; &lt;p&gt;Excellent as some of the RAC claims are, it still seems that making it work well for an RDF workload would take such magical heuristics of location choice that implementing them would be hard and the result not altogether certain. I could get it to work eventually but hash partitioning seems by far the more predictable route. Also hash partitioning will work in shared nothing scenarios whereas RAC requires shared disks. Shared nothing will not require a SAN, which may make it somewhat lower cost. Also, if messages are grouped in large batches, the performance of the interconnect is not so critical, meaning that maybe even gigabit ethernet might do in cases. RAC style cache maintenance is more sensitive to interconnect latency than batched function shipping. Batched cache consistency is conceivable as discussed above but tough to do.&lt;/p&gt; &lt;p&gt;For recovery and hot software updates, things can be arranged if there is non-local disk access or if partitions are mirrored. A RAC type cluster could use a SAN with internal mirroring. 
A hash partitioned system could mirror partitions to more than one box with local disk, thus using no mirrored disks. Repartitioning remains the bane of partitioning and not much can be done about that, it seems. The only easy repartitioning is doubling the cluster size. So it seems.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Storage News</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2007-07-12#1226</atom:id>
  <atom:published>2007-07-12T14:29:26Z</atom:published>
  <atom:updated>2008-04-24T13:22:37-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Storage News&lt;/div&gt; &lt;p&gt;I have been away from the world for a few weeks, concentrating on technology.&lt;/p&gt; &lt;p&gt;We have now implemented an entirely new storage layout. With &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1c53b838&quot;&gt;RDF&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1cc7af28&quot;&gt;data&lt;/a&gt;, we have now successfully doubled the working set.&lt;/p&gt; &lt;p&gt;This means that the number of triples that will fit in memory is doubled for any configuration. For any database in the hundreds of millions of triples, this is very significant. For LUBM data, we go from 75b to 35b per triple with the default indices.&lt;/p&gt; &lt;p&gt;This is obtained without using gzip or some other stream compression. Thus no decompression is needed at read time. Random access speeds are within 5% of those of &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1cf61820&quot;&gt;Virtuoso&lt;/a&gt; v5.0.1, but the space requirement is halved and you can still locate a random triple in cache in a few microseconds.&lt;/p&gt; &lt;p&gt;What is better still, when using 8-byte IDs for IRIs instead of 4-byte ones, the space consumption stays almost the same since unique values are stored only once per page.&lt;/p&gt; &lt;p&gt;When applying gzip to the new storage layout, we usually get 3x compression. This means that 99% of 8K pages fit in 3K after compression. This is no real surprise since an index is repetitive pretty much by definition, even if the repeated sections are now shorter than in v5.0.1.&lt;/p&gt; &lt;p&gt;Gzip applied to pages does nothing for the working set since a page must remain random accessible for fast search but will cut disk usage to between half and a third. We will make this an option later. 
There are other tricks to be done with compression, like using a separate dictionary for non key text columns in relational applications. This would improve the working set in &lt;a href=&quot;http://dbpedia.org/resource/TPC-C&quot; id=&quot;link-id0x1c019500&quot;&gt;TPC-C&lt;/a&gt; and TPC-D quite a bit so we may do this also while on the subject.&lt;/p&gt; &lt;p&gt;Right now we are writing the clustering support, revising all internal APIs to run with batches of rows instead of single rows. We will most likely release clustering and the new storage layout together, towards the end of summer, at least in internal deployments.&lt;/p&gt; &lt;p&gt;I will &lt;a href=&quot;http://dbpedia.org/resource/Blog&quot; id=&quot;link-id0x1d383588&quot;&gt;blog&lt;/a&gt; about results as and when they are obtained, over the next few weeks.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso Feature Update</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2007-05-23#1199</atom:id>
  <atom:published>2007-05-23T14:09:37Z</atom:published>
  <atom:updated>2008-04-18T16:58:50-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Virtuoso Feature Update&lt;/div&gt; &lt;p&gt;We have a few new features that we did for the &lt;a href=&quot;http://www2007.org/&quot; id=&quot;link-id12603130&quot;&gt;WWW 2007&lt;/a&gt; conference that we will be shortly adding to the open source release.&lt;/p&gt; &lt;ul&gt; &lt;li&gt;Optimization for &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0xe67fed8&quot;&gt;SQL&lt;/a&gt; &lt;code&gt;IN&lt;/code&gt; predicate. The IN predicate with a list of values will now use an index if available. This is useful for &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x18791f70&quot;&gt;SPARQL&lt;/a&gt; queries with multiple &lt;code&gt;FROM&lt;/code&gt; graphs, for example.&lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://docs.openlinksw.com/virtuoso/fn_key_estimate.html&quot; id=&quot;link-idffb5400&quot;&gt;API for index population estimates&lt;/a&gt;. There is an API for getting an approximate count of matches given one or more leading key parts of an index.&lt;/li&gt; &lt;li&gt; &lt;a href=&quot;http://docs.openlinksw.com/virtuoso/coredbengine.html#RowbyRowAutoCommit&quot; id=&quot;link-id1097c420&quot;&gt;Row-level autocommit mode&lt;/a&gt; – If one updates a huge table and the application does not require transaction isolation, it is possible to do this with an automatic commit after each row. This saves the server from having to keep rollback &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x18e3c340&quot;&gt;information&lt;/a&gt; on millions and billions of rows and saves it from temporary rollbacks of the uncommitted &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0xde26fe8&quot;&gt;data&lt;/a&gt; for checkpoints etc. 
These things can completely hang a server if there are a few tens of millions of uncommitted inserts/deletes/updates.&lt;/li&gt; &lt;li&gt;64-bit IDs for IRIs and &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1896f518&quot;&gt;RDF&lt;/a&gt; objects, 64-bit integer data type. With the growth of some RDF databases to the tens of billions of triples, we run out of the 32-bit range for IDs of distinct IRIs. To accommodate this before actually running out, we introduce a longer ID.&lt;/li&gt; &lt;li&gt;Some cost model adjustments.&lt;/li&gt; &lt;li&gt;SQL extension for producing multiple result set rows from a single table row. This is useful for mapping SPARQL queries like &lt;code&gt;SELECT * FROM graph WHERE {?s ?p ?o}&lt;/code&gt; into a &lt;code&gt;UNION&lt;/code&gt; of &lt;code&gt;SELECT *&lt;/code&gt;’s from multiple tables of different width. Each term of the &lt;code&gt;UNION&lt;/code&gt; will simply produce multiple 3 column result rows for each actual row while not having to run through the tables multiple times. Together with this, we have also fixed a number of things with the relational-to-RDF mapping. We have been testing this extensively with the &lt;a href=&quot;http://blog.musicbrainz.org/&quot; id=&quot;link-id12109cb0&quot;&gt;Musicbrainz&lt;/a&gt; mapping by &lt;a href=&quot;http://fgiasson.com/blog/&quot; id=&quot;link-idffd52f0&quot;&gt;Fred Giasson&lt;/a&gt;. &lt;/li&gt; &lt;/ul&gt; &lt;p&gt;These changes are small and to be released shortly.&lt;/p&gt; &lt;p&gt;There are also some larger things in the works, to be released during this summer, &lt;a href=&quot;http://www.openlinksw.com/dataspace/oerling/weblog/Orri%20Erling%27s%20Blog/1197&quot; id=&quot;link-id10cafde8&quot;&gt;the next post&lt;/a&gt; gives an overview of these.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso Cluster</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2007-05-23#1201</atom:id>
  <atom:published>2007-05-23T14:09:37Z</atom:published>
  <atom:updated>2008-04-24T09:52:23.000003-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Virtuoso Cluster&lt;/div&gt; &lt;p&gt;We often get questions on clustering support, especially around &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1e53d008&quot;&gt;RDF&lt;/a&gt;, where databases quickly get rather large. So we will answer them here.&lt;/p&gt; &lt;p&gt;But first on some support technology. We have an entirely new disk allocation and IO system. It is basically operational but needs some further tuning. It offers much better locality and much better sequential access speeds.&lt;/p&gt; &lt;p&gt;Specially for dealing with large RDF databases, we will introduce &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1e042690&quot;&gt;data&lt;/a&gt; compression. We have over the years looked at different key compression possibilities but have never been very excited by them since they complicate random access to index pages and make for longer execution paths, require scraping data for one logical thing from many places, and so on. Anyway, now we will compress pages before writing them to disk, so the cache is in machine byte order and alignment and disk is compressed. Since multiple processors are commonplace on servers, they can well be used for compression, that being such a nicely local operation, all in cache and requiring no serialization with other things.&lt;/p&gt; &lt;p&gt;Of course, what was fixed length now becomes variable length, but if the compression ratio is fairly constant, we reserve space for the expected compressed size, and deal with the rare overflows separately. So no complicated shifting data around when something grows.&lt;/p&gt; &lt;p&gt;Once we are done with this, this could well be a separate intermediate release.&lt;/p&gt; &lt;p&gt;Now about clusters. We have for a long time had various plans for clusters but have not seen the immediate need for execution. 
With the rapid growth in the Linking Open Data movement and questions on web scale &lt;a href=&quot;http://dbpedia.org/resource/Knowledge&quot; id=&quot;link-id0x1e7714f0&quot;&gt;knowledge&lt;/a&gt; systems, it is time to get going.&lt;/p&gt; &lt;p&gt;How will it work? &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1e3caea8&quot;&gt;Virtuoso&lt;/a&gt; remains a generic DBMS, thus the clustering support is an across the board feature, not something for RDF only. So we can join &lt;a href=&quot;http://dbpedia.org/resource/Oracle_Database&quot; id=&quot;link-id0x1ac67648&quot;&gt;Oracle&lt;/a&gt;, IBM &lt;a href=&quot;http://dbpedia.org/resource/IBM_DB2&quot; id=&quot;link-id0x1c2267d0&quot;&gt;DB2&lt;/a&gt;, and others at the multi-terabyte TPC races.&lt;/p&gt; &lt;p&gt;We introduce hash partitioning at the index level and allow for redundancy, where multiple nodes can serve the same partition, allowing for load balancing read and replacement of failing nodes and growth of cluster without interruption of service.&lt;/p&gt; &lt;p&gt;The &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x1daea638&quot;&gt;SQL&lt;/a&gt; compiler, &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x1ddb8c50&quot;&gt;SPARQL&lt;/a&gt;, and database engine all stay the same. There is a little change in the SQL run time, not so different from what we do with remote databases at present in the context of our &lt;a href=&quot;http://dbpedia.org/resource/Virtual_Database&quot; id=&quot;link-id0x1e13a880&quot;&gt;virtual database&lt;/a&gt; federation. There is a little extra complexity for distributed deadlock detection and sometimes multiple threads per transaction. We remember that one RPC round trip is 3-4 index lookups, so we pipeline things so as to move requests in batches, a few dozen at a time.&lt;/p&gt; &lt;p&gt;The cluster support will be in the same executable and will be enabled by configuration file settings. 
Administration is limited to one node, but Web and SQL clients can connect to any node and see the same data. There is no balancing between storage and control nodes because clients can simply be allocated round robin for statistically even usage. In relational applications, as exemplified by &lt;a href=&quot;http://dbpedia.org/resource/TPC-C&quot; id=&quot;link-id0x1c236bb0&quot;&gt;TPC-C&lt;/a&gt;, if one partitions by fields with an application meaning (such as warehouse ID), and if clients have an affinity to a particular chunk of data, they will of course preferentially connect to nodes hosting this data. With RDF, such affinity is unlikely, so nodes are basically interchangeable.&lt;/p&gt; &lt;p&gt;In practice, we develop in June and July. Then we can rent a supercomputer maybe from Amazon EC2 and experiment away.&lt;/p&gt; &lt;p&gt;We should just come up with a name for this. Maybe something astronomical, like star cluster. Big, bright but in this case not far away.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>WWW 2007</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2007-05-23#1195</atom:id>
  <atom:published>2007-05-23T13:31:38Z</atom:published>
  <atom:updated>2008-04-18T11:08:12.000003-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;WWW 2007&lt;/div&gt; &lt;p&gt;We were at the &lt;a href=&quot;http://www2007.org/&quot; id=&quot;link-id10a0aa08&quot;&gt;WWW 2007&lt;/a&gt; conference in Banff, Canada week before last. &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1963b400&quot;&gt;Virtuoso&lt;/a&gt; was a part of &lt;a href=&quot;http://network.nature.com/profile/U1ACB1466&quot; id=&quot;link-id1071a250&quot;&gt;Alan Ruttenberg&lt;/a&gt;’s &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id0xd98ab38&quot;&gt;semantic web&lt;/a&gt; in health care and life sciences presentation. Alan had a database of 350M triples extracted from different biology and publication databases running on Virtuoso. We will also be experimenting on other biomedical datasets, both with real &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0xe91aa88&quot;&gt;RDF&lt;/a&gt; and relational &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x181143b0&quot;&gt;data&lt;/a&gt; mapped to RDF on demand.&lt;/p&gt; &lt;p&gt;Linking Open Data was a big thing at WWW 2007. There is quite a bit of momentum gathering around publishing publicly available data as RDF and making these data sets mutually joinable. 
&lt;a href=&quot;http://www.bizer.de/&quot; id=&quot;link-id10142a40&quot;&gt;Chris Bizer&lt;/a&gt; of the &lt;a href=&quot;http://www.fu-berlin.de/en/&quot; id=&quot;link-id109d4f40&quot;&gt;Free University of Berlin&lt;/a&gt; will be demonstrating &lt;a href=&quot;http://dbpedia.org/resource/DBpedia&quot; id=&quot;link-id0xe9ab408&quot;&gt;Dbpedia&lt;/a&gt; linked with a number of other data sets such as &lt;a href=&quot;http://www.geonames.org/&quot; id=&quot;link-id100c46d0&quot;&gt;Geonames&lt;/a&gt; and &lt;a href=&quot;http://blog.musicbrainz.org/&quot; id=&quot;link-id106e1980&quot;&gt;Musicbrainz&lt;/a&gt; and others at &lt;a href=&quot;http://www.eswc2007.org/&quot; id=&quot;link-id10f3c060&quot;&gt;ESWC 2007&lt;/a&gt; in a couple of weeks, also running on Virtuoso.&lt;/p&gt; &lt;p&gt;The last month or so has been spent mostly on the conference preparation and follow up, not to mention taking part in two EU project proposals. But now we are returning to normal operations and can do some technology for a change. More on this in the next post.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Announcing Virtuoso Open-Source Edition v5.0.0</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2007-04-12#1184</atom:id>
  <atom:published>2007-04-12T13:48:34Z</atom:published>
  <atom:updated>2007-04-12T09:50:10-04:00</atom:updated>
  <atom:content type="html">All, OpenLink Software are pleased to announce a new release of &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/&quot;&gt;Virtuoso&lt;/a&gt;, Open-Source Edition, version 5.0.0. This version includes: &lt;ul&gt; &lt;li&gt;Significant rewrite of database engine resulting in 50%-100% improvement on single CPU and in some cases up to 300% on multiprocessor CPUs by decreasing resource-contention between threads and other optimizations.&lt;/li&gt; &lt;li&gt;Radical expansion of RDF support including&lt;/li&gt; &lt;/ul&gt; &lt;ul&gt; &lt;li&gt;In-built middleware (called the Sponger) for transforming non-RDF into RDF &amp;quot;on the fly&amp;quot; (e.g. producing Triples from Microformats, REST-style Web Services, and (X)HTML etc.)&lt;/li&gt; &lt;li&gt;Full Text Indexing of Literal Objects in Triple Patterns (via Filter or magic bif:contains predicate applied to Literal Objects)&lt;/li&gt; &lt;li&gt;Basic Inferencing (Subclass and Subproperty Support)&lt;/li&gt; &lt;li&gt;SPARQL Aggregate Functions&lt;/li&gt; &lt;li&gt;SPARQL Update Language Support (Updates, Inserts, Deletions in SPARQL)&lt;/li&gt; &lt;li&gt;Improved Support of XML Schema Type System (including the use of XML Schema Complex Types as Objects of bif:xcontains predicate)&lt;/li&gt; &lt;/ul&gt; &lt;ul&gt; &lt;li&gt;Enhancements to the in-built SPARQL to SQL Compiler&amp;#39;s Cost Optimizer&lt;/li&gt; &lt;li&gt;Performance Optimizations to RDF VIEWs (SQL to RDF Mapping)&lt;/li&gt; &lt;li&gt;Various bug-fixes&lt;/li&gt; &lt;/ul&gt; NOTE: Databases created with earlier versions of Virtuoso will be automatically upgraded to Virtuoso 5.0 but after upgrade will not be readable with older Virtuoso versions. 
For more information please see: Virtuoso Open Source Edition: Home Page: &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/&quot;&gt;http://virtuoso.openlinksw.com/wiki/main/&lt;/a&gt; Download Page: &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main/VOSDownload&quot;&gt;http://virtuoso.openlinksw.com/wiki/main/Main/VOSDownload&lt;/a&gt; OpenLink Data Spaces: Home Page: &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main/OdsIndex&quot;&gt;http://virtuoso.openlinksw.com/wiki/main/Main/OdsIndex&lt;/a&gt; SPARQL Usage Examples (re. SIOC, FOAF, AtomOWL, SKOS): &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main/ODSSIOCRef&quot;&gt;http://virtuoso.openlinksw.com/wiki/main/Main/ODSSIOCRef&lt;/a&gt; Interactive SPARQL Demo: &lt;a href=&quot;http://demo.openlinksw.com/isparql/&quot;&gt;http://demo.openlinksw.com/isparql/&lt;/a&gt; OpenLink AJAX Toolkit (OAT): Project Page: &lt;a href=&quot;http://sourceforge.net/projects/oat&quot;&gt;http://sourceforge.net/projects/oat&lt;/a&gt; Live Demonstration: &lt;a href=&quot;http://demo.openlinksw.com/DAV/JS/oat/index.html&quot;&gt;http://demo.openlinksw.com/DAV/JS/oat/index.html&lt;/a&gt; &lt;p style=&quot;text-align:right;font-size:10px;&quot;&gt;Technorati Tags: &lt;a href=&quot;http://www.technorati.com/tag/database&quot; rel=&quot;tag&quot;&gt;database&lt;/a&gt;, &lt;a href=&quot;http://www.technorati.com/tag/databases&quot; rel=&quot;tag&quot;&gt;databases&lt;/a&gt;, &lt;a href=&quot;http://www.technorati.com/tag/open-source&quot; rel=&quot;tag&quot;&gt;open-source&lt;/a&gt;, &lt;a href=&quot;http://www.technorati.com/tag/OpenLink&quot; rel=&quot;tag&quot;&gt;OpenLink&lt;/a&gt;, &lt;a href=&quot;http://www.technorati.com/tag/RDBMS&quot; rel=&quot;tag&quot;&gt;RDBMS&lt;/a&gt;, &lt;a href=&quot;http://www.technorati.com/tag/RDF&quot; rel=&quot;tag&quot;&gt;RDF&lt;/a&gt;, &lt;a href=&quot;http://www.technorati.com/tag/semantic web&quot; rel=&quot;tag&quot;&gt;semantic web&lt;/a&gt;, 
&lt;a href=&quot;http://www.technorati.com/tag/Semantic Web&quot; rel=&quot;tag&quot;&gt;Semantic Web&lt;/a&gt;, &lt;a href=&quot;http://www.technorati.com/tag/SPARQL&quot; rel=&quot;tag&quot;&gt;SPARQL&lt;/a&gt;, &lt;a href=&quot;http://www.technorati.com/tag/virtuoso&quot; rel=&quot;tag&quot;&gt;virtuoso&lt;/a&gt; &lt;/p&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Comparison of Open Source Databases with TPC D Queries</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2007-02-05#1132</atom:id>
  <atom:published>2007-02-05T11:44:34Z</atom:published>
  <atom:updated>2008-04-17T21:04:40-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Comparison of Open Source Databases with TPC D Queries&lt;/div&gt; &lt;p&gt; &lt;a href=&quot;http://www.openlinksw.com/dataspace/oerling/weblog/Orri%20Erling%27s%20Blog/1116&quot; id=&quot;link-id10598cc0&quot;&gt;Last time&lt;/a&gt; we talked about database engine and transactions. Now we have come to the realm of query processing in our revisiting of the DBMS side of &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xd3ecd30&quot;&gt;Virtuoso&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Now the well established, respectable standard benchmark for the basics of query processing is TPC D with its derivatives H and R. So we have, for testing how different &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x15cf6938&quot;&gt;SQL&lt;/a&gt; optimizers manage the 22 queries, run a mini version of the D queries with a 1% scale database, some 30M of &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x15ce3928&quot;&gt;data&lt;/a&gt;, all in memory. This basically catches whether SQL implementations miss some of the expected tricks and how efficient in memory loop and hash joins and aggregation are.&lt;/p&gt; &lt;p&gt;When we get to our next stop, high volume I/O, we will run the same with D databases in the 10G ballpark.&lt;/p&gt; &lt;p&gt;The databases were tested on the same machine, with warm cache, taking the best run of 3. All had full statistics and were running with read committed isolation, where applicable. The data was generated using the procedures from the Virtuoso test suite. The Virtuoso version tested was 5.0, to be released shortly. The &lt;a href=&quot;http://dbpedia.org/resource/MySQL&quot; id=&quot;link-id0xc952f58&quot;&gt;MySQL&lt;/a&gt; was 5.0.27, the PostgreSQL was 8.1.6. 
&lt;/p&gt; &lt;table style=&quot;width: 334px; height: 556px; &quot; border=&quot;1&quot;&gt; &lt;tbody&gt; &lt;tr&gt; &lt;th rowspan=&quot;2&quot;&gt;Query&lt;/th&gt; &lt;th colspan=&quot;4&quot;&gt;Query Times in Milliseconds&lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;th&gt; Virtuoso &lt;/th&gt; &lt;th&gt; PostgreSQL &lt;/th&gt; &lt;th&gt; MySQL &lt;/th&gt; &lt;th&gt; MySQL with InnoDB &lt;/th&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt; Q1 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; &lt;b&gt;206&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 763 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 312 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 198 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt; Q2 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 4 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 6 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; &lt;b&gt;3&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; &lt;b&gt;3&lt;/b&gt; &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt; Q3 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; &lt;b&gt;13&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 51 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 254 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 64 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt; Q4 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; &lt;b&gt;4&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 16 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 24 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 60 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt; Q5 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; &lt;b&gt;15&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 22 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 64 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 68 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt; Q6 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; &lt;b&gt;9&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 70 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 189 &lt;/td&gt; &lt;td 
align=&quot;right&quot;&gt; 65 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt; Q7 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; &lt;b&gt;52&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 143 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 211 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 84 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt; Q8 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 29 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 31 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 13 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; &lt;b&gt;11&lt;/b&gt; &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt; Q9 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; &lt;b&gt;36&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 114 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 97 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 61 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt; Q10 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; &lt;b&gt;32&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 51 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 117 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 57 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt; Q11 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 16 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; &lt;b&gt;9&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 12 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 10 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt; Q12 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; &lt;b&gt;8&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 21 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 18 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 130 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt; Q13 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; &lt;b&gt;18&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 74 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; - &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; - &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt; Q14 &lt;/td&gt; &lt;td 
align=&quot;right&quot;&gt; &lt;b&gt;7&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 21 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 418 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 1425 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt; Q15 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; &lt;b&gt;14&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 43 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 389 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 122 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt; Q16 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; &lt;b&gt;16&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 22 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 18 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 25 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt; Q17 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; &lt;b&gt;1&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 54 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 26 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 10 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt; Q18 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; &lt;b&gt;82&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 120 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; - &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; - &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt; Q19 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 19 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 8 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; &lt;b&gt;2&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 17 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt; Q20 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; &lt;b&gt;7&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 15 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 66 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 52 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt; Q21 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; &lt;b&gt;34&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 
86 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 524 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 278 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt; Q22 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; &lt;b&gt;4&lt;/b&gt; &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 323 &lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 3311&lt;/td&gt; &lt;td align=&quot;right&quot;&gt; 805 &lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt;Total (msec)&lt;/td&gt; &lt;td align=&quot;right&quot;&gt;&lt;b&gt;626&lt;/b&gt;&lt;/td&gt; &lt;td align=&quot;right&quot;&gt;2063&lt;/td&gt; &lt;td align=&quot;right&quot;&gt;6068&lt;/td&gt; &lt;td align=&quot;right&quot;&gt;3545&lt;/td&gt; &lt;/tr&gt; &lt;/tbody&gt; &lt;/table&gt; &lt;p&gt;We lead by a fair margin, but MySQL is hampered by evidently getting some execution plans wrong and by not completing Q13 and Q18 at all, at least not in under several tens of seconds; so we left these out of the table in the interest of having comparable totals.&lt;/p&gt; &lt;p&gt;As usual, we also ran the workload on &lt;a href=&quot;http://dbpedia.org/resource/Oracle_Database&quot; id=&quot;link-id0x167807c8&quot;&gt;Oracle&lt;/a&gt; 10g R2. Since Oracle does not like its numbers being published without explicit approval, we will just say that we are even with them within the parameters described above. Oracle has a more efficient decimal type, so it wins where that is central, as on Q1. It also seems to notice that the &lt;code&gt;GROUP BY&lt;/code&gt;s of Q18 are produced in order of grouping columns, so it needs no intermediate table for storing the aggregates. If we addressed these matters, we&amp;#39;d lead by some 15% whereas now we are even. 
A faster decimal arithmetic implementation may be in the release after next.&lt;/p&gt; &lt;p&gt;In the next posts, we will look at IO and disk allocation, and also return to &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0xbb63c20&quot;&gt;RDF&lt;/a&gt; and LUBM.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso 5.0 Preview</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2007-01-10#1117</atom:id>
  <atom:published>2007-01-10T14:58:29Z</atom:published>
  <atom:updated>2008-04-17T21:04:37-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Virtuoso 5.0 Preview&lt;/div&gt; &lt;p&gt;As &lt;a href=&quot;http://www.openlinksw.com/dataspace/oerling/weblog/Orri%20Erling%27s%20Blog/1108&quot; id=&quot;link-id10c66e68&quot;&gt;previously said&lt;/a&gt;, we have a &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x15b8d948&quot;&gt;Virtuoso&lt;/a&gt; with brand new engine multithreading. It is now complete and passes its regular test suite. This is the basis for Virtuoso 5.0, to be available as the open source and commercial cuts as before.&lt;/p&gt; &lt;p&gt;As one benchmark, we used the &lt;a href=&quot;http://dbpedia.org/resource/TPC-C&quot; id=&quot;link-id0x15fc4380&quot;&gt;TPC-C&lt;/a&gt; test driver that has always been bundled with Virtuoso. We ran 100000 new orders worth of the TPC-C transaction mix first with one client and then with 4 clients, each client going to its own warehouse, so there was not much lock contention. We did this on a 4 core Intel machine, with the working set in RAM. With the old engine, one client took 1m43 and 4 clients took 3m47. With the new one, one client took 1m30 and 4 clients took 2m37. So, 400000 new orders in 2m37, for 152820 new orders per minute as opposed to 105720 per minute previously. Do not confuse this with the official tpmC metric; that one involves a whole bunch of further rules.&lt;/p&gt; &lt;p&gt;TPC-C has activity spread over a few different tables. With tests dealing with fewer tables, improvements in parallelism are far greater.&lt;/p&gt; &lt;p&gt;Aside from better parallelism, we have other features. One of them is a change in the read committed isolation, so that we now return the previous committed state for uncommitted changed rows instead of waiting for the updating transaction to terminate. This is similar to what &lt;a href=&quot;http://dbpedia.org/resource/Oracle_Database&quot; id=&quot;link-id0xe8cc528&quot;&gt;Oracle&lt;/a&gt; does for read committed. 
Also we now do log checkpoints without having to abort pending write transactions.&lt;/p&gt; &lt;p&gt;When we have faster inserts, we actually see the &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0xe0e5ff8&quot;&gt;RDF&lt;/a&gt; bulk loader run slower. This is really backwards. The reason is that while one thread parses, other threads insert, and if the inserting threads are done they go to wait on a semaphore, and this whole business of context switching absolutely kills performance. With slower inserts, the parser keeps ahead so there is less context switching, hence better overall throughput. I still do not get how the OS can spend between 1.5 and 6 microseconds, several thousand instructions, deciding what to do next when there are only 3-4 eligible threads and all the rest is background which goes with a few dozen slices per second. Solaris is a little better than Linux at this but not dramatically so. Mac OS X is way worse.&lt;/p&gt; &lt;p&gt;As said, we use Oracle 10G2 on the same platform (Linux FC5 64 bit) for sparring. It is really a very good piece of software. We have written the TPC-C transactions in PL/&lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x183aac30&quot;&gt;SQL&lt;/a&gt;. What is surprising is that these procedures run amazingly slowly, even with a single client. Otherwise the Oracle engine is very fast. Well, as I recall, the official TPC-C runs with Oracle use an OCI client and no stored procedures. Strange. While Virtuoso for example fills the initial TPC-C state a little faster than Oracle, the procedures run 5-10 times slower with Oracle than with Virtuoso, with all &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1659fd20&quot;&gt;data&lt;/a&gt; in warm cache and a single client. 
While some parts of Oracle are really well optimized, such as all basic joins and aggregates, we are surprised at how they could have neglected such a central piece as the PL.&lt;/p&gt; &lt;p&gt;Also, we have looked at transaction semantics. Serializable is mostly serializable with Oracle but does not always keep a steady count. Also it does not prevent inserts into a space that has been found empty by a serializable transaction. True, it will not show these inserts to the serializable transaction, so in this it follows the rules. Also, to make a read really repeatable, it seems that the read has to be FOR UPDATE. Otherwise one cannot implement a reliable resource transaction, like changing the balance of an account.&lt;/p&gt; &lt;p&gt;Anyway, the Virtuoso engine overhaul is now mostly complete. This is of course an open ended topic but the present batch is nearing completion. We have gone through as many as 3 implementations of hash joins, and some things have yet to be finished there. Oracle has very good hash joins. The only way we could match that was to do it all in memory, dropping any persistent storage of the hash. This is of course OK if the hash is not very large, and anyway hash joins go sour if the hash does not fit in the working set.&lt;/p&gt; &lt;p&gt;As next topics, we have more RDF and the LUBM benchmark to finish. Also we should revisit TPC-D.&lt;/p&gt; &lt;p&gt;Databases are really quite complicated and extensive pieces of software. Much more so than the casual observer might think.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Season&amp;#39;s Greetings from Virtuoso Development</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2007-01-09#1113</atom:id>
  <atom:published>2007-01-09T07:05:07Z</atom:published>
  <atom:updated>2008-04-17T21:04:33.000001-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Season&amp;#39;s Greetings from Virtuoso Development&lt;/div&gt; &lt;p&gt;It&amp;#39;s been a long and very busy time since &lt;a href=&quot;http://www.openlinksw.com/dataspace/oerling/weblog/Orri%20Erling%27s%20Blog/1085&quot; id=&quot;link-id104d7ac0&quot;&gt;the last blog post&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Now and then, circumstances call for a return to the contemplation of first principles. I have lately beheld the Platonic ideal of database-ness and translated it into engineering elegance. No quest is static and no objective is permanently achieved.&lt;/p&gt; &lt;p&gt;Accordingly, I have redone all &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xe7f8d18&quot;&gt;Virtuoso&lt;/a&gt; core engine structures for control of parallel execution. As we now routinely get multiple cores per chip, this is more important than before. Aside from dramatic improvements in multiprocessor performance, there is also quite a bit of optimization for basic relational operations.&lt;/p&gt; &lt;p&gt;Of course, this is not for the pure pleasure of geek-craft; it serves a very practical purpose. &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0xde83868&quot;&gt;RDF&lt;/a&gt; opens a new database frontier, where these things make a significant difference. In application scenarios involving either federated/&lt;a href=&quot;http://dbpedia.org/resource/Virtual_Database&quot; id=&quot;link-id0x17791fa0&quot;&gt;virtual database&lt;/a&gt; or running typical web applications, the core concurrency of the DBMS is not really the determining factor. However, with RDF, we get a small number of very large tables and most processing goes to these tables. This is also often so with business intelligence but it is still more so with RDF. 
Thus the parallelism within a single index becomes essential.&lt;/p&gt; &lt;p&gt;We have also made a point by point comparison of Virtuoso and &lt;a href=&quot;http://dbpedia.org/resource/Oracle_Database&quot; id=&quot;link-id0x17a3aeb0&quot;&gt;Oracle&lt;/a&gt; 10g for basic relational operations. Oracle is very good, certainly in the basic relational operations like table scans and different kinds of joins. As a matter of principle, we will at the minimum match Oracle in all these things, in single and multiprocessor environments. The Virtuoso cut forthcoming in January will have all this inside. We are also considering making and publishing a basic &lt;a href=&quot;http://dbpedia.org/resource/Relational_database_management_system&quot; id=&quot;link-id0x181c4438&quot;&gt;RDBMS&lt;/a&gt; performance checklist, aimed at comparing specific aspects of relational engine performance. While the TPC tests give a good aggregate figure, it is sometimes interesting to look at a finer level of detail. We may not be allowed to give out numbers in all cases due to license terms but we can certainly make the test available and publish numbers for those who do not object to this.&lt;/p&gt; &lt;p&gt;Of course, RDF is the direct beneficiary of all these efforts, since RDF loading and querying basically rests on the performance of very relational things, such as diverse types of indices and joins.&lt;/p&gt; &lt;p&gt; More &lt;a href=&quot;http://dbpedia.org/resource/Information&quot; id=&quot;link-id0x1aa37c80&quot;&gt;information&lt;/a&gt; will be forthcoming in January.&lt;/p&gt; &lt;p&gt;Merry Christmas and productive new year to all.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso TPCC and Multiprocessor Linux and Mac</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2007-01-09#1110</atom:id>
  <atom:published>2007-01-09T06:35:06Z</atom:published>
  <atom:updated>2008-04-16T16:53:38.000001-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Virtuoso TPCC and Multiprocessor Linux and Mac&lt;/div&gt; &lt;p&gt;We have &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main/VOSScale&quot; rel=&quot;sql&quot; id=&quot;link-id10650ec0&quot;&gt;updated our article&lt;/a&gt; on &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xc41ba10&quot;&gt;Virtuoso&lt;/a&gt; scalability with two new platforms: a 2 x dual core Intel Xeon and a Mac Mini with an Intel Core Duo.&lt;/p&gt; &lt;p&gt;We have more than quadrupled the best result so far.&lt;/p&gt; &lt;p&gt;The best score so far is 83K transactions per minute with a 40 warehouse (about 4G) database. This is attributable to the process running mostly in memory, with 3 out of 4 cores busy on the database server. But even when doubling the database size and the number of clients, we stay at 49K transactions per minute, now with a little under 2 cores busy and an average of 20 disk reads pending at all times, split over 4 SATA disks. The measurement is the count of completed transactions during a 1h run. With the 80 warehouse database, it took about 18 minutes for the system to reach steady state, with a warm working set, hence the actual steady rate is somewhat higher than 49K, as the warm up period was included in the measurement.&lt;/p&gt; &lt;p&gt;The metric on the Mac Mini was 2.7K with 2G RAM and one disk. The CPU usage was about one third of one core. Since we have had rates of over 10K with 2G RAM, we attribute the low result to running on a single disk which is not very fast at that.&lt;/p&gt; &lt;p&gt;We have run tests in 64 and 32 bit modes but have found little difference as long as actual memory does not exceed 4G. If anything, 32 bit binaries should have an advantage in cache hit rate since most &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0xdf23240&quot;&gt;data&lt;/a&gt; structures take less space there. 
After the process size exceeds the 32 bit limit, there is a notable difference in favor of 64 bit. Having more than 4G of database buffers produces a marked advantage over letting the OS use the space for file system cache. So, 64 bit is worthwhile but only if there is enough memory. As for X86 having more registers in 64 bit mode, we have not specifically measured what effect that might have.&lt;/p&gt; &lt;p&gt;We also note that Linux has improved a great deal with respect to multiprocessor configurations. We use a very simple test with a number of threads acquiring and then immediately freeing the same mutex. On single CPU systems, the real time has pretty much increased linearly with the number of threads. On multiprocessor systems, we used to get very non-linear behavior, with 2 threads competing for the same mutex taking tens of times the real time as opposed to one thread. At last measurement, with a 64 bit FC 5, we saw 2 threads take 7x the real time when competing for the same mutex. This is in the same ballpark as Solaris 10 on a similar system. Mac OS X 10.4 Tiger on a 2x dual core Xeon Mac Pro did the worst so far, with two threads taking over 70x the time of one. With a Mac Mini with a single Core Duo, the factor between one thread and two was 73.&lt;/p&gt; &lt;p&gt;Also the proportion of system CPU on Tiger was consistently higher than on Solaris or Linux when running the same benchmarks. Of course for most applications this test is not significant but it is relevant for database servers, as there are many very short critical sections involved in multithreaded processing of indices and the like.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Recent Virtuoso Developments</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2007-01-09#1109</atom:id>
  <atom:published>2007-01-09T06:35:05Z</atom:published>
  <atom:updated>2008-04-16T16:53:37.000013-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Recent Virtuoso Developments&lt;/div&gt; &lt;p&gt;We have been extensively working on &lt;a href=&quot;http://dbpedia.org/resource/Virtual_Database&quot; id=&quot;link-id0x19aebcf8&quot;&gt;virtual database&lt;/a&gt; refinements. There are many &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0xa217cd28&quot;&gt;SQL&lt;/a&gt; cost model adjustments to better model distributed queries, and we now support direct access to &lt;a href=&quot;http://dbpedia.org/resource/Oracle_Database&quot; id=&quot;link-id0x1751b990&quot;&gt;Oracle&lt;/a&gt; and &lt;a href=&quot;http://dbpedia.org/resource/IBM_Informix&quot; id=&quot;link-id0x17393938&quot;&gt;Informix&lt;/a&gt; statistics system tables. Thus, when you attach a table from one or the other, you automatically get up-to-date statistics. This helps &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x19fb24f0&quot;&gt;Virtuoso&lt;/a&gt; optimize distributed queries. The documentation has also been updated to cover these, with a new section on distributed query optimization.&lt;/p&gt; &lt;p&gt;On the applications side, we have been keeping up with the SIOC &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0xd13efd8&quot;&gt;RDF&lt;/a&gt; ontology developments. All &lt;a href=&quot;http://dbpedia.org/resource/OpenLink_Data_Spaces&quot; id=&quot;link-id0x170b1630&quot;&gt;ODS&lt;/a&gt; applications now make their &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0xdc517d0&quot;&gt;data&lt;/a&gt; available as SIOC graphs for download and &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x19fec088&quot;&gt;SPARQL&lt;/a&gt; query access.&lt;/p&gt; &lt;p&gt;What is most exciting however is our advance in mapping relational data into RDF. 
We now have a mapping language that makes arbitrary legacy data in Virtuoso or elsewhere in the relational world RDF query-able. We will put out a white paper on this in a few days.&lt;/p&gt; &lt;p&gt;Also we have some innovations in mind for optimizing the physical storage of RDF triples. We keep experimenting, now with our sights set to the high end of triple storage, towards billion triple data sets. We are experimenting with a new more space efficient index structure for better working set behavior. Next week will yield the first results.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>More RDF scalability tests</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2006-11-01#1075</atom:id>
  <atom:published>2006-11-01T20:36:17Z</atom:published>
  <atom:updated>2008-04-16T16:53:39.000004-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;More RDF scalability tests&lt;/div&gt; &lt;p&gt;We have lately been busy with &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1aeda730&quot;&gt;RDF&lt;/a&gt; scalability. We work with the 8000 university LUBM &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x185e6e00&quot;&gt;data&lt;/a&gt; set, a little over a billion triples. We can load it in 23h 46m on a box with 8G RAM. With 16G we probably could get it in 16h.&lt;/p&gt; &lt;p&gt;The resulting database is 75G, 74 bytes per triple, which is not bad. It will shrink a little more if explicitly compacted by merging adjacent partly filled pages. See &lt;a href=&quot;http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VOSBitmapIndexing&quot; id=&quot;link-id105e5cf8&quot;&gt;Advances in Virtuoso RDF Triple Storage&lt;/a&gt; for an in-depth treatment of the subject.&lt;/p&gt; &lt;p&gt;The real question of RDF scalability is finding a way of having more than one CPU on the same index tree without them hitting the prohibitive penalty of waiting for a mutex. The sure solution is partitioning, which would probably have to be by range of the whole key. But before we go to so much trouble, we&amp;#39;ll look at dropping a couple of critical sections from index random access. Also some kernel parameters may be adjustable, like a spin count before calling the scheduler when trying to get an occupied mutex. Still we should not waste too much time on platform specifics. We&amp;#39;ll see.&lt;/p&gt; &lt;p&gt;We just updated the &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x17b07998&quot;&gt;Virtuoso&lt;/a&gt; Open Source cut. 
The latest RDF refinements are not in, so maybe the cut will have to be refreshed shortly.&lt;/p&gt; &lt;p&gt;We are also now applying the relational to RDF mapping discussed in &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main/VOSSQLRDF&quot; id=&quot;link-id10677bb8&quot;&gt;Declarative SQL Schema to RDF Ontology Mapping&lt;/a&gt; to the &lt;a href=&quot;http://dbpedia.org/resource/OpenLink_Data_Spaces&quot; id=&quot;link-id0x1732fa20&quot;&gt;ODS&lt;/a&gt; applications.&lt;/p&gt; &lt;p&gt;There is a form of the mapping in the VOS cut on the net but it is not quite ready yet. We must first finish testing it through mapping all the relational schemas of the ODS apps before we can really recommend it. This is another reason for a VOS update in the near future.&lt;/p&gt; &lt;p&gt;We will be looking at the query side of LUBM after the ISWC 2006 conference. So far, we find queries compile OK for many SIOC use cases with the cost model that there is now. A more systematic review of the cost model for &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0xc56b868&quot;&gt;SPARQL&lt;/a&gt; will come when we get to the queries.&lt;/p&gt; &lt;p&gt;We put some ideas about inferencing in the Advances in Triple Storage paper. The question is whether we should forward chain such things as class subsumption and subproperties. If we build these into the &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0xa25c7a70&quot;&gt;SQL&lt;/a&gt; engine used for running SPARQL, we probably can do these as unions at run time with good performance and better working set due to not storing trivial entailed triples. Some more thought and experimentation needs to go into this.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso and ODS Update</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2006-08-10#1025</atom:id>
  <atom:published>2006-08-10T11:55:26Z</atom:published>
  <atom:updated>2008-04-16T16:53:34.000008-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Virtuoso and ODS Update&lt;/div&gt; &lt;p&gt;We have released an update of &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x1b0d5100&quot;&gt;Virtuoso&lt;/a&gt; Open Source Edition and the &lt;a href=&quot;http://dbpedia.org/resource/OpenLink_Data_Spaces&quot; id=&quot;link-id0x1770ad30&quot;&gt;OpenLink Data Spaces&lt;/a&gt; suite.&lt;/p&gt; &lt;p&gt;This marks the coming of age of our &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x1a1c6800&quot;&gt;RDF&lt;/a&gt; and &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x1779b790&quot;&gt;SPARQL&lt;/a&gt; efforts. We have the new &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x170db778&quot;&gt;SQL&lt;/a&gt; cost model with SPARQL awareness, and we have applications that present much of their &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x18ab4600&quot;&gt;data&lt;/a&gt; as SIOC, FOAF, ATOM OWL and other formats.&lt;/p&gt; &lt;p&gt;We continue refining these technologies. Our next roadmap item is mapping relational data into RDF and offering SPARQL access to relational data without data duplication. Expect a white paper about this soon.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>New RDF Store White Paper</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2006-07-31#1022</atom:id>
  <atom:published>2006-07-31T15:49:49Z</atom:published>
  <atom:updated>2008-04-16T16:53:31.000012-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;New RDF Store White Paper&lt;/div&gt; &lt;p&gt;There is a new paper &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main/VOSRDFWP&quot; id=&quot;link-id104e1698&quot;&gt;Implementing an RDF Triple Store using an ORDBMS&lt;/a&gt; at the &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xc358098&quot;&gt;Virtuoso&lt;/a&gt; wiki.&lt;/p&gt; &lt;p&gt;This paper summarizes how we have extended Virtuoso&amp;#39;s &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x1ab48290&quot;&gt;SQL&lt;/a&gt; and database engine to better accommodate storing &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0xccf6878&quot;&gt;RDF&lt;/a&gt; triples and optimizing queries of RDF &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x19f1f538&quot;&gt;data&lt;/a&gt;. This is the first of a series. The next will concern mapping relational databases onto RDF ontologies for &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x19fd9cf8&quot;&gt;SPARQL&lt;/a&gt; access.&lt;/p&gt; &lt;p&gt;This paper concerns the next Virtuoso Open Source release, to be available for download a few days from this posting.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>More Thoughts on ORDBMS Clients, .NET and RDF</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2006-07-17#1008</atom:id>
  <atom:published>2006-07-17T12:16:02Z</atom:published>
  <atom:updated>2008-04-16T16:13:30.000001-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;More Thoughts on ORDBMS Clients, .NET and RDF&lt;/div&gt; &lt;p&gt;Continuing on from &lt;a href=&quot;http://www.openlinksw.com/dataspace/oerling/weblog/Orri%20Erling%27s%20Blog/1002&quot; id=&quot;link-id1064f0c8&quot;&gt;the previous post&lt;/a&gt;... If Microsoft opens the right interfaces for independent developers, we see many exciting possibilities for using &lt;a href=&quot;http://msdn2.microsoft.com/en-us/data/aa937699.aspx&quot; id=&quot;link-id10f3ab60&quot;&gt;ADO.NET&lt;/a&gt; 3.0 with &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x171ad660&quot;&gt;Virtuoso&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Microsoft quite explicitly states that their thrust is to decouple the client side representation of &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0xdaf01b0&quot;&gt;data&lt;/a&gt; as .NET objects from the relational schema on the database. This is a worthy goal.&lt;/p&gt; &lt;p&gt;But we can also see other possible applications of the technology when we move away from strictly relational back ends. This can go in two directions: Towards object oriented database (OODBMS) and towards making applications for the &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id0x175fa2f0&quot;&gt;semantic web&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;In the OODBMS direction, we could equate Virtuoso table hierarchies with .NET classes and create a tighter coupling between client and database, going as it were in the other direction from Microsoft&amp;#39;s intended decoupling. For example, we could do typical OODBMS tricks such as pre-fetch of objects based on storage clustering. The simplest case of this is like virtual memory, where the request for one byte brings in the whole page or group of pages. 
The basic idea is that what is created together probably gets used together, and if all objects are modeled as subclasses (sub-tables) of a common superclass, then, regardless of instance type, what is created together (has consecutive IDs) will indeed tend to cluster on the same page. These tricks can deliver good results in very navigational applications like GIS or CAD. But these are rather specialized things and we do not see OODBMS making any great comeback.&lt;/p&gt; &lt;p&gt;But what is more interesting and more topical at present is making clients for the &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0xc58f9f8&quot;&gt;RDF&lt;/a&gt; world. There, the OWL ontology could be used to make the .NET classes, and the DBMS could, when returning URIs serving as subjects of triples, include specified predicates on these subjects, enough to allow instantiating .NET instances as &amp;quot;proxies&amp;quot; of these RDF objects. Of course, only predicates for which the client has a representation are relevant, thus some client-server handshake is needed at the start. What data could be pre-fetched is like the intersection of a concise bounded description and what the client has classes for. The rest of the mapping would be very simple, with IRIs becoming pointers, multi-valued predicates becoming lists, and so on. 
IRIs for which the RDF type is not known or inferable could be left out or represented as a special class with name-value pairs for its attributes, same with blank nodes.&lt;/p&gt; &lt;p&gt;In this way, .NET&amp;#39;s considerable UI capabilities could directly be exploited for visualizing RDF data, only given that the data complies reasonably well with a known ontology.&lt;/p&gt; &lt;p&gt;If a &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0xc5d8728&quot;&gt;SPARQL&lt;/a&gt; query returned a result-set, IRI type columns would be returned as .NET instances and the server would pre-fetch enough data for filling them in. For a CONSTRUCT, a collection object could be returned with the objects materialized inside. If the interfaces allow passing an &lt;a href=&quot;http://dbpedia.org/resource/Entity&quot; id=&quot;link-id0x19a434e8&quot;&gt;Entity&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x1a146d30&quot;&gt;SQL&lt;/a&gt; string, these could possibly be specialized to allow for a SPARQL string instead. LINQ might have to be extended to allow for SPARQL type queries, though.&lt;/p&gt; &lt;p&gt;Many of these questions will be better answerable as we get more details on Microsoft&amp;#39;s forthcoming &lt;a href=&quot;http://dbpedia.org/resource/ADO.NET&quot; id=&quot;link-id0x985bc50&quot;&gt;ADO&lt;/a&gt; .NET release. We hope that sufficient latitude exists for exploring all these interesting avenues of development.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Object Relational Rediscovered?</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2006-07-13#1003</atom:id>
  <atom:published>2006-07-13T12:33:32Z</atom:published>
  <atom:updated>2008-04-16T16:13:26-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Object Relational Rediscovered?&lt;/div&gt; &lt;p&gt;I have recently read some of Microsoft&amp;#39;s &lt;a href=&quot;http://dbpedia.org/resource/ADO.NET&quot; id=&quot;link-id0x173cea20&quot;&gt;ADO&lt;/a&gt; .NET 3 papers. I am reminded of the distant past when I designed Kubl, which later became OpenLink &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x18bdfe68&quot;&gt;Virtuoso&lt;/a&gt;. So I will reminisce and speculate a little.&lt;/p&gt; &lt;p&gt;So now is the time when polymorphic queries and mixing relational style joins and object style navigation become politically acceptable and even recommended and there finally is a workable solution to having a foreign key in the database and a pointer or set of pointers in the client application. Not to mention change tracking so as to be able to update in-memory &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0xd6f0ae0&quot;&gt;data&lt;/a&gt; structures and commit a delta against the database without explicit update statements.&lt;/p&gt; &lt;p&gt;All these questions existed already in the mid 90s and earlier. Since I was coming from OO and LISP into the database world, I even felt these questions to be important. The solution in the earliest Kubl was to have inheritance between tables, what became the &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0xddcdac0&quot;&gt;SQL&lt;/a&gt; 2K &lt;code&gt;UNDER&lt;/code&gt; clause, and a virtual column called &lt;code&gt;_ROW&lt;/code&gt; that would select a serialization of the primary key entry. Then there was the function &lt;code&gt;row_key()&lt;/code&gt;, which when applied to a &lt;code&gt;_ROW&lt;/code&gt; virtual column would return a database-wide unique identifier of the row, containing the key info and the key part values plus which subtable of the table was at hand. 
Then there was a function for dereferencing a &lt;code&gt;row_key&lt;/code&gt; to get the &lt;code&gt;_ROW&lt;/code&gt;. And one could store &lt;code&gt;row_keys&lt;/code&gt; in columns and dereference these in queries. Within SQL, one could use the &lt;code&gt;row_column&lt;/code&gt; function to extract individual column values from a &lt;code&gt;row_key&lt;/code&gt; or &lt;code&gt;_ROW&lt;/code&gt;.&lt;/p&gt; &lt;p&gt;This was all fine server side. But we also had a client for Franz Inc.&amp;#39;s Allegro Common Lisp that talked to Kubl&amp;#39;s &lt;a href=&quot;http://dbpedia.org/resource/Open_Database_Connectivity&quot; id=&quot;link-id0xde2c348&quot;&gt;ODBC&lt;/a&gt; listener. This client had basic statements, prepared statements, result sets, parameters and array parameters, a little like &lt;a href=&quot;http://dbpedia.org/resource/Java_Database_Connectivity&quot; id=&quot;link-id0x156409f8&quot;&gt;JDBC&lt;/a&gt; has now. The extra was that we could define a mapping between a Lisp struct or object and a database key, so the &lt;code&gt;_ROW&lt;/code&gt; would automatically materialize into the Lisp struct or class instance. And the mapping between these materializations and the &lt;code&gt;row_keys&lt;/code&gt; identifying them in the database was kept in a thread environment called object space. 
Updates could be relational-style &lt;code&gt;UPDATEs&lt;/code&gt; or consist of putting a &lt;code&gt;_ROW&lt;/code&gt; serialization in database format back into the Kubl store with a single SQL function.&lt;/p&gt; &lt;p&gt;This was different from just storing object serializations in LOB columns, as is often done, insofar as the object classes and data members were really database tables and columns, thus native to the DBMS, not just opaque data to be processed client-side only.&lt;/p&gt; &lt;p&gt;So it was then possible to program a little like what is shown in the ADO .NET 3 demos today, some ten years later.&lt;/p&gt; &lt;p&gt;Some of these functions still exist in Virtuoso, albeit in a deprecated state, and there is no client that can use these to any advantage. Indeed, we dropped this line of work when Kubl became Virtuoso, mostly because there were no standards and no client applications that would use such features. Instead, we concentrated on virtual &lt;a href=&quot;http://dbpedia.org/resource/Relational_database_management_system&quot; id=&quot;link-id0x175a7b10&quot;&gt;RDBMS&lt;/a&gt;, transparently accessing any third party data via ODBC.&lt;/p&gt; &lt;p&gt;Now, however, as objects, both native SQL and Java and .NET, have become mainstream citizens of relational databases in general, Virtuoso and otherwise, and as Microsoft has legitimized accessing whole objects and not only scalar columns in result sets as part of ADO .NET 3, these things might be worth a second look.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso/PL</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2006-07-13#1001</atom:id>
  <atom:published>2006-07-13T11:23:26Z</atom:published>
  <atom:updated>2008-04-16T16:13:24-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;RDF Bulk Load and &lt;a href=&quot;http://docs.openlinksw.com/virtuoso/PLREF.html&quot; id=&quot;link-id0x16395e78&quot;&gt;Virtuoso/PL&lt;/a&gt; Parallelism&lt;/div&gt; &lt;p&gt;We have been playing with the Wikipedia3 &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x16341340&quot;&gt;RDF&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0xcc5b1d0&quot;&gt;data&lt;/a&gt; set, 48 million triples or so. We have long foreseen the need for a special bulk loader for RDF, but this data set made it immediately relevant.&lt;/p&gt; &lt;p&gt;So I wrote a generic parallel extension to &lt;a href=&quot;http://docs.openlinksw.com/virtuoso/PLREF.html&quot; id=&quot;link-id105ca910&quot;&gt;Virtuoso/PL&lt;/a&gt; and &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x161633c8&quot;&gt;SQL&lt;/a&gt;. This consists of a function for creating a queue that will feed async requests to be served on a thread pool of configurable size. Each worker thread has its own transaction, and the owner of the thread pool can inspect or block on the return states of individual requests. This is a generic means for delegating work to async threads from &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xdc6b680&quot;&gt;Virtuoso&lt;/a&gt;/PL. Of course, this can also be used at a lower level for parallelizing single SQL queries, for example aggregation of a large table or creating an index on a large table. 
Many applications, such as the &lt;a href=&quot;http://dbpedia.org/resource/OpenLink_Data_Spaces&quot; id=&quot;link-id0x191da620&quot;&gt;ODS&lt;/a&gt; &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main/OdsFeedManager&quot; id=&quot;link-id106793c0&quot;&gt;Feed Manager&lt;/a&gt;, will also benefit, since this makes it more convenient to schedule parallel downloads from news sources and the like. This extension will make its way into the release after next.&lt;/p&gt; &lt;p&gt;But back to RDF. We presently have the primary key of the triple store as GSPO and a second index as PGOS. Using this mechanism, we will experiment with different multithreaded loading configurations. One thread translates from the IRI text representation to the IRI IDs, one thread may insert into the GSPO index, which is typically local, and a few threads will share the inserting into the PGOS key. The latter key is inserted in random order, whereas the former is inserted mainly in ascending order when loading new data. In this way, we should be able to keep full load on several CPUs and even more disks.&lt;/p&gt; &lt;p&gt;It turns out that the new async queue plus thread pool construct is very handy for any pipeline or symmetric parallelization. When this is well tested, I will update the documentation and maybe do a technical article about this.&lt;/p&gt; &lt;p&gt;Transactionality is not an issue in the bulk load situation. The graph being loaded will be incomplete until loading finishes anyway, other graphs will not be affected, and no significant number of locks will be held at any time by the bulk loader threads.&lt;/p&gt; &lt;p&gt;Also later, when looking at within-query and other parallelization, we have many interesting possibilities. For example, we may measure the CPU and IO load and adjust the size of the shareable thread pool accordingly. 
All SQL or web requests get their thread just as they now do, and extra threads may be made available for opportunistic parallelization up until we have full CPU and IO utilization. Still, this will not lead to long queries preempting short ones, since all get at least one thread. I may post some results of parallel RDF loading later on this &lt;a href=&quot;http://dbpedia.org/resource/Blog&quot; id=&quot;link-id0x1b8b6ca8&quot;&gt;blog&lt;/a&gt;.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso SQL, SPARQL and Dynamic Statistics</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2006-07-11#999</atom:id>
  <atom:published>2006-07-11T19:39:29Z</atom:published>
  <atom:updated>2008-04-16T16:13:22-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Virtuoso SQL, SPARQL and Dynamic Statistics&lt;/div&gt; &lt;p&gt;The last couple of weeks have been very busy, dealing with updates to the &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xd616b98&quot;&gt;Virtuoso&lt;/a&gt; &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x1a3033d0&quot;&gt;SQL&lt;/a&gt; and &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x1a12d9f8&quot;&gt;SPARQL&lt;/a&gt; compiler cost model. &lt;/p&gt; &lt;p&gt;The new SQL compiler takes samples of index population on demand, thus always works with up-to-date statistics. Further, when there are constant leading key parts, it can get an estimate of the selectivity of the constant criteria with a single lookup. &lt;/p&gt; &lt;p&gt;This is especially important for processing &lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0xd547cb0&quot;&gt;RDF&lt;/a&gt;. Since all triples go to one table unless otherwise declared, normal SQL statistics are not very useful for determining the join order for a SPARQL query. However, nearly always, SPARQL queries have a constant graph, constant predicate, sometimes constant subjects and objects. For example, using the index P, G, O, S, the compiler can know how many triples will have a given predicate within a given graph. This is done with a single lookup, without needing to count the actual triples, which would defeat the purpose. Also there is no need to do periodic statistics collection runs or to maintain counts of distinct combinations for multiple key parts. This makes for virtual certainty of getting reasonable join orders even for recently inserted or fast changing &lt;a href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x163b39d8&quot;&gt;data&lt;/a&gt; sets. 
&lt;/p&gt; &lt;p&gt;This will be part of the next Virtuoso Open Source release, probably in the next couple of weeks. There will also be a technical article with examples of how the dynamic statistics feature helps with RDF queries. &lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso Open Source Edition now for Windows</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2006-04-27#964</atom:id>
  <atom:published>2006-04-27T08:56:00Z</atom:published>
  <atom:updated>2008-04-16T16:13:37.000002-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Virtuoso Open Source Edition now for Windows&lt;/div&gt; &lt;p&gt; &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main&quot; id=&quot;link-id10726730&quot;&gt;Virtuoso Open Source Edition&lt;/a&gt; has been updated to version 4.5.2. We have added a binary distribution for Windows and the source distribution comes with Visual Studio project files for building on Win32. The new source distribution is available from &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main/VOSUsageWindows&quot; id=&quot;link-id106bbbc8&quot;&gt;the Virtuoso wiki&lt;/a&gt;. Make sure that you get the distribution with version 4.5.2. Win64 support will be in the next update. &lt;/p&gt; &lt;p&gt;This release also enhances the &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x1be84360&quot;&gt;SPARQL&lt;/a&gt; support with better inlining of SPARQL in &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x1770e2f0&quot;&gt;SQL&lt;/a&gt; and other features and fixes.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Virtuoso and Database Scalability</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2006-04-24#962</atom:id>
  <atom:published>2006-04-24T16:06:00Z</atom:published>
  <atom:updated>2008-04-16T16:13:34.000004-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Virtuoso and Database Scalability&lt;/div&gt; &lt;p&gt;We have a new &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main/VOSScale&quot; id=&quot;link-id1068c3f8&quot;&gt;technical article&lt;/a&gt;, benchmarking &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xd413030&quot;&gt;Virtuoso&lt;/a&gt; on different hardware configurations.&lt;/p&gt; &lt;p&gt;This is useful reading for anyone interested in using Virtuoso as a database back end for online applications or simply anyone interested in relational database scalability, no matter what specific DBMS.&lt;/p&gt; &lt;p&gt;We use an adaptation of the well known &lt;a href=&quot;http://dbpedia.org/resource/TPC-C&quot; id=&quot;link-id0x1aed8170&quot;&gt;TPC-C&lt;/a&gt; benchmark to see what hardware configuration will give the best price/performance. We also explain how to tune Virtuoso and how and why different parameters affect the throughput.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:title>Introducing Virtuoso Open Source Edition</atom:title>
  <atom:id>http://www.openlinksw.com/blog/vdb/blog/?date=2006-04-11#950</atom:id>
  <atom:published>2006-04-11T16:33:07Z</atom:published>
  <atom:updated>2008-04-16T16:13:32-04:00</atom:updated>
  <atom:content type="html">&lt;div&gt; &lt;div style=&quot;display:none;&quot;&gt;Introducing Virtuoso Open Source Edition&lt;/div&gt; &lt;p&gt;I am Orri Erling, program manager for &lt;a href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0xd7d9bc0&quot;&gt;Virtuoso&lt;/a&gt; at &lt;a href=&quot;http://www.openlinksw.com/dataspace/organization/openlink#this&quot; id=&quot;link-id0xd9951b0&quot;&gt;OpenLink Software&lt;/a&gt;. This &lt;a href=&quot;http://dbpedia.org/resource/Blog&quot; id=&quot;link-id0x1775bac0&quot;&gt;blog&lt;/a&gt; is about any and all aspects of technology that have to do with Virtuoso.&lt;/p&gt; &lt;p&gt;The launch of &lt;a href=&quot;http://virtuoso.openlinksw.com/wiki/main/Main&quot; id=&quot;link-id10b0c208&quot;&gt;Virtuoso Open Source Edition (VOS)&lt;/a&gt; marks a new period in our participation in the database world. We will henceforth be much more active, publish much more material, have a faster release cycle and actively reach out to the various areas of the open source community.&lt;/p&gt; &lt;p&gt;We have years worth of demos, white papers, articles, a suite of Virtuoso based applications, and much more that we will be unveiling over the following months.&lt;/p&gt; &lt;p&gt;We will track different aspects of Virtuoso work on this and related blogs. In the middle term, we will talk about the following:&lt;/p&gt; &lt;ul&gt; &lt;li&gt; &lt;b&gt;&lt;a href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x171d1158&quot;&gt;RDF&lt;/a&gt;, &lt;a href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x1a5dbd50&quot;&gt;SPARQL&lt;/a&gt; and &lt;a href=&quot;http://dbpedia.org/resource/Semantic_Web&quot; id=&quot;link-id0x17106870&quot;&gt;semantic web&lt;/a&gt; work&lt;/b&gt; - The initial VOS release has SPARQL support and this will continue to be refined and optimized. 
We will introduce SPARQL benchmark suites and the like as these become ready.&lt;/li&gt; &lt;li&gt; &lt;b&gt;Relational database&lt;/b&gt; - Virtuoso&amp;#39;s extensible &lt;a href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0xd3a91b0&quot;&gt;SQL&lt;/a&gt; and relational storage engine is the platform on which all the rest stands. Thus this continues to be improved, ranging from low level database engine work to SQL optimizations to various developer convenience features. A database-only configuration of Virtuoso is another possibility.&lt;/li&gt; &lt;li&gt; &lt;b&gt;DAV and web services&lt;/b&gt; - Web services are the main entry point for all Virtuoso&amp;#39;s features. These may eventually become more significant than the traditional SQL client interfaces, of which Virtuoso supports several.&lt;/li&gt; &lt;/ul&gt; &lt;p&gt;There is a whole suite of next generation file server features to be unveiled. These include items such as automatic metadata extraction and logical views on content based on its metadata, permissions etc.&lt;/p&gt; &lt;p&gt;In the immediate future, we will:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;Keep enhancing the VOS wiki and edit the existing base of unpublished material to be ready for publication on this platform.&lt;/li&gt; &lt;li&gt;Keep adding to technical notes and FAQ&amp;#39;s on compiling and running on different platforms and using the different run time hosting options of Virtuoso.&lt;/li&gt; &lt;/ul&gt; &lt;p&gt;The VOS development CVS will be updated at high frequency, in some areas even weekly. Stable snapshots will be made available 3 or 4 times a year.&lt;/p&gt; &lt;p&gt;We will have a very exciting spring, with radically more participation in the database and open source worlds than ever. Look for frequent updates on this blog.&lt;/p&gt; &lt;/div&gt;</atom:content>
 </atom:entry>
</atom:feed>