<?xml version="1.0" encoding="UTF-8" ?>
<!--RDF based XML document generated By OpenLink Virtuoso-->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
 <rss:channel xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/vdb/blog/">
  <rss:title>OpenLink Virtuoso (Product Blog)</rss:title>
  <rss:link>http://www.openlinksw.com/blog/vdb/blog/</rss:link>
  <rss:description>A great place to track Virtuoso&#39;s rapid evolution.</rss:description>
  <dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Virtuoso Data Space Bot &lt;kidehen@openlinksw.com&gt;</dc:creator>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2012-02-11T09:27:05Z</dc:date>
  <dc:rights xmlns:dc="http://purl.org/dc/elements/1.1/">OpenLink Software 1998-2006</dc:rights>
  <dc:language xmlns:dc="http://purl.org/dc/elements/1.1/">en-us</dc:language>
  <rss:items>
   <rdf:Seq>
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/vdb/blog/?id=1700" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/vdb/blog/?id=1697" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/vdb/blog/?id=1695" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/vdb/blog/?id=1692" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/vdb/blog/?id=1690" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/vdb/blog/?id=1688" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/vdb/blog/?id=1687" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/vdb/blog/?id=1686" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/vdb/blog/?id=1685" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/vdb/blog/?id=1680" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/vdb/blog/?id=1679" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/vdb/blog/?id=1676" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/vdb/blog/?id=1674" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/vdb/blog/?id=1672" />
      <rdf:li rdf:resource="http://www.openlinksw.com/blog/vdb/blog/?id=1670" />
   </rdf:Seq>
  </rss:items>
 </rss:channel>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/vdb/blog/?id=1700">
  <rss:title>LOD2 Plenary and Review: Semanticist, Think Database!</rss:title>
  <rss:link>http://www.openlinksw.com/blog/vdb/blog/?id=1700</rss:link>
  <wfw:comment xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/mt-tb/Http/comments?id=1700</wfw:comment>
  <wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/blog/vdb/blog/gems/rsscomment.xml?:id=1700</wfw:commentRss>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2011-09-30T21:02:01Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">Last week the LOD2 FP7 project had its first review, preceded by its third plenary meeting. Before this, we did, as promised, get the column store and vectored execution capabilities of Virtuoso 7 Single-Server Edition extended to Virtuoso 7 Cluster Edition. More interesting still, we decoupled storage from the database server process, so now database files can migrate between server processes. This means that clusters are now elastic, i.e., new servers can be added to a cluster and the load can be redistributed without reloading the data. These things were long planned, but now are done. Measurements will be published in some weeks, as part of CWI&#39;s continued running of RDF store benchmarks, per the LOD2 plan. Doing the column store and elastic cluster is work enough, so I do not in general participate in support or consultancy or the like. This has some pros and cons. On the plus side, there is a relative lack of noise and a very clear idea of focus. Of course, this work is highly applied and thus always informed by use cases, so forgetting what ought to be done out there is not the problem. Rather, the problem is forgetting how things in fact are done as opposed to how they could or should be done. To cut a long story short, it has become clear to me that the DBMS must tell the application developer what to do. Of course, the application developer could also look at performance metrics, but they do not, and explaining these metrics is too much work and yields no lasting benefit. Developers will produce all kinds of performance diagnostic traces if requested, but going through this song and dance can also be avoided by the right automation. So, I will introduce two new product features called Wazzup? and Saywhat? Wazzup? 
is answered by a mood line, like &quot;Heavily disk bound: 100G more memory will give 10x speedup&quot; or &quot;Network bound: Processing in larger batches will give 5x more throughput&quot; and Saywhat? is answered by some commentary on the user&#39;s last action, for example &quot;there is no ?order with o_totalprice &lt; 0&quot; or &quot;there is no property O_misspelledtotallprrice.&quot; Wazzup? is about overall system state, and Saywhat? is about the user session, specifically query plans. But an explanation of a query plan is not understandable, so this will just point out some salient facts, like the reason why the answer comes out empty. The other thing that came to my attention is the fact that a user has no instinctive feel for ETL. A database person takes it for a self-evident truth that data is loaded in bulk, but the application developer does not think of that. Likewise, the line between warehousing and federating is not instinctively felt; actually the question is not even posed in these terms. So one will find Web protocols and end-points and glue code on the app server when one ought to have ETL and adequate hardware for running the consolidated database. Further, under-provisioning of equipment is endemic with semanticists. The Semantic Web gets a needlessly bad rap just because we find too much data on too little equipment. For example, I was surprised to learn that the Linked Geodata demo ran on only 16 GB RAM and 6 processor cores with 2 billion triples and 350 million points in a geo index. Now, even with our greatest space efficiency advances, there is no way this will run from memory. It is not that the Web 2.0 stack is necessarily efficient (we hear the wildest stories of lack of database understanding from that side too), but at least there is a culture of running with enough equipment. Surely when the web-scale data gear (e.g. 
Google Bigtable, Yahoo PNUTS, Amazon Dynamo) was new, by the operators&#39; own admission there was no way for this to be particularly efficient, database-wise. Not if your eventual consistency is a client application to a shared MySQL back-end. For a lookup or single-record-update workload, who cares when there is enough hardware? For analytics, there is the de facto impossibility of doing big joins, but MapReduce is for that, all offline. The big web houses have always known how to deal with data; it is the smaller Web 2.0 guys who patch systems together with duct tape and memcache. Even so, the online experience gets created. Semanticism has no part of this outlook, except maybe for Freebase, but then they are from California and now have been inside Google for a while. We quite understand that when one needs to get big data online, one makes a key-value store as a point solution, because this way one owns what one operates, and the time to market is a lot shorter than if one tried building all this inside a general-purpose DBMS. Besides, the people who can in fact do this almost do not exist, and even if one had a whole army of this rare breed, development is not very scalable in a tightly-integrated system like a high-performance DBMS. Still further, to even start, one needs to own the DBMS, meaning that the initial platform must be known through and through. This is an issue even though open source platforms exist. The graph data, semdata, schema-last, RDF, linked data enterprise -- whatever one calls it -- makes the bold proposition of bringing complex-query-at-scale to heterogeneous data. This is a database claim. In the meantime, test deployments are made in defiance of database best practices. This is a bit like test driving a race car in reverse gear and steering by looking in the rear-view mirror. There is also no short-term scalable way to educate people. 
At the LOD2 review, one comment was that an integrated project ought to clearly indicate how to set up the tool chain for good performance, especially as concerns the interfaces between the tools. This is very true. Experience shows that developers of tools cannot accurately anticipate what usage patterns will emerge in the field. Therefore, we propose to do better than just documentation; we will make the server recognize the common sources of inefficiency and point the user to the right action. Provisioning and usage patterns: The DBMS ought to know best. Imagine the following conversation: DBMS: Your application does single-triple INSERTs over a client-server protocol all day, from a single client. 57% of real time goes into client-server latency, 40% into cluster interconnect latency, 2% into compiling the statements, and 1% into doing the work. Use array parameters or bulk load from a file. Operator: My developers use industry-standard Java class libraries with a service-oriented architecture and strictly enforced interfaces. This is called software engineering. Watch out ere you raise your voice against the canon. [Some weeks later, after the load job has gone on for 10 days and gotten a third of the way, developers have discovered that JDBC has array parameters and are trying these.] DBMS: 60% of real time goes into waiting for locks. 10% of transactions get aborted for deadlock. Transactions consist of an average of 10 client-server operations. Use stored procedures; acquire locks in predictable order; do SELECT FOR UPDATE. Throughput will be 4x higher if client-server operations are merged into a single operation. The transactions only INSERT; hence consider bulk load instead. Operator: We are using an enterprise-class three-tier architecture. It has &quot;enterprise&quot; in the name and all the big guys are using it, so it must be scalable. Besides, it uses distributed transactions, and distributed computing is the wave of the future. 
You are a cluster yourself, so the pot&#39;s got no business calling the kettle black. [After a while, the data gets loaded with bulk load, but now on a single stream.] DBMS: CPU is at 400% for an INSERT workload; adding more parallel threads will get 4.5x better throughput. [Some time has elapsed and there are Ajax client apps out there trying to use the data.] DBMS: Will you really not give me another 140 GB RAM and 16 more cores? Operator: No, on general principles I will not. Shut up. DBMS: Do you know that your page impression takes 3 seconds and anything over 0.25 seconds is visibly slow? 300 GB worth of distinct pages have been accessed in the last 24 hours for 160 GB of RAM. Latency will drop 10x by using SSD; 50x by increasing RAM. Operator: No dice, bucket. Shut up. Besides, when I scroll through the data I always use for testing, I get it fast enough; you are just doing this out of greed and self-importance. You are a server among many, just like the mail server; you databases are just pretentious. Currently, addressing any of the above sorts of issues takes a long time and involves mostly-avoidable support communication. Questions of this sort do occur. We can probably produce commentary like the above based on logging some 50 numbers, and making some 15 regularly-run reports over these. The patterns to watch out for are well known. No, we will not make a Zippy the Pinhead office assistant; a computer should not try to be cute. This one will talk only in terms of gains from adjusting the deployment or usage patterns. Now, suppose the operator said yes to the request for more cores and memory; then it would be up to the DBMS to deliver. This entails a capacity to redistribute itself automatically, and to give a quantitative report on the success of this measure. This means usage-based repartitioning of the data to equalize load over a cluster. The relevant metric in the above case is the drop in response time. 
On the other hand, the DBMS should also notice if there is clearly unused capacity. All of this will be presented as a line in the status report, so there is no extra wizard or workload analyzer that one must remember to run. For programmatic use, there are SQL views for the relevant reports. As for ETL, even if the DBMS can detect that it is not being done right, this does not mean that the user will know what to do. Therefore, for all the Web harvesting we support, as well as any import from the local file system or Web services, with some RDF-ization, we will simply implement a proper ETL utility that will do things right. Wazzup? can just point the user to that if the workload looks like loading. This will have its own status report giving a load and transform rate and will point out what takes the longest, after everything is duly parallelized and made asynchronous. Beyond these lessons, there is more to say about the review and plenary; we will get to that a bit later. We did promise a new edition of the LOD cache in a couple of months, now on the clustered column-store platform. Look for advances in data discoverability.</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>Last week the <a href="http://lod2.eu/" id="link-id0x57d7368">LOD2 FP7 project</a> had its first review, preceded by its third plenary meeting.</p>

<p>Before this, we did, <a href="http://www.openlinksw.com/weblog/oerling/?id=1683" id="link-id0x579c950">as promised</a>, get the column store and vectored execution capabilities of Virtuoso 7 Single-Server Edition extended to Virtuoso 7 Cluster Edition. More interesting still, we decoupled storage from the database server process, so now database files can migrate between server processes. This means that clusters are now elastic, i.e., new servers can be added to a cluster and the load can be redistributed without reloading the data.</p>

<p>These things were long planned, but now are done. Measurements will be published in some weeks, as part of CWI&#39;s continued running of RDF store benchmarks, per the LOD2 plan.</p>

<p>Doing the column store and elastic cluster is work enough, so I do not in general participate in support or consultancy or the like. This has some pros and cons. On the plus side, there is a relative lack of noise and a very clear idea of focus. Of course, this work is highly applied and thus always informed by use cases, so forgetting what ought to be done out there is not the problem. Rather, the problem is forgetting how things in fact <i>are</i> done as opposed to how they <i>could or should be</i> done.</p>

<p>To cut a long story short, it has become clear to me that the DBMS must tell the application developer what to do. Of course, the application developer could also look at performance metrics, but they do not, and explaining these metrics is too much work and yields no lasting benefit. Developers will produce all kinds of performance diagnostic traces if requested, but going through this song and dance can also be avoided by the right automation.</p>

<p>So, I will introduce two new product features called <i><b>Wazzup?</b></i> and <i><b>Saywhat?</b></i>
</p>

<p>
<b>Wazzup?</b> is answered by a mood line, like &quot;Heavily disk bound: 100G more memory will give 10x speedup&quot; or &quot;Network bound: Processing in larger batches will give 5x more throughput&quot; and <b>Saywhat?</b> is answered by some commentary on the user&#39;s last action, for example &quot;there is no ?order with o_totalprice &lt; 0&quot; or &quot;there is no property O_misspelledtotallprrice.&quot;</p>

<p>
<b>Wazzup?</b> is about overall system state, and <b>Saywhat?</b> is about the user session, specifically query plans. But an explanation of a query plan is not understandable, so this will just point out some salient facts, like the reason why the answer comes out empty.</p>

<p>The other thing that came to my attention is the fact that a user has no instinctive feel for <a href="http://dbpedia.org/page/Extract,_transform,_load" id="link-id0x527eb88">ETL</a>. A database person takes it for a self-evident truth that data is loaded in bulk, but the application developer does not think of that. Likewise, the line between warehousing and federating is not instinctively felt; actually the question is not even posed in these terms. So one will find Web protocols and end-points and glue code on the app server when one ought to have ETL and adequate hardware for running the consolidated database.</p>

<p>Further, under-provisioning of equipment is endemic with semanticists. The Semantic Web gets a needlessly bad rap just because we find too much data on too little equipment. For example, I was surprised to learn that the Linked Geodata demo ran on only 16 GB RAM and 6 processor cores with 2 billion triples and 350 million points in a geo index. Now, even with our greatest space efficiency advances, there is no way this will run from memory.</p>

<p>It is not that the Web 2.0 stack is necessarily efficient (we hear the wildest stories of lack of database understanding from that side too), but at least there is a culture of running with enough equipment. Surely when the web-scale data gear (e.g. Google Bigtable, Yahoo PNUTS, Amazon Dynamo) was new, by the operators&#39; own admission there was no way for this to be particularly efficient, database-wise. Not if your eventual consistency is a client application to a shared MySQL back-end. For a lookup or single-record-update workload, who cares when there is enough hardware? For analytics, there is the <i>de facto</i> impossibility of doing big joins, but MapReduce is for that, all offline. The big web houses have always known how to deal with data; it is the smaller Web 2.0 guys who patch systems together with duct tape and memcache. Even so, the online experience gets created.</p>

<p>Semanticism has no part of this outlook, except maybe for Freebase, but then they are from California and now have been inside Google for a while.</p>

<p>We quite understand that when one needs to get big data online, one makes a key-value store as a point solution, because this way one owns what one operates, and the time to market is a lot shorter than if one tried building all this inside a general-purpose DBMS. Besides, the people who can in fact do this almost do not exist, and even if one had a whole army of this rare breed, development is not very scalable in a tightly-integrated system like a high-performance DBMS. Still further, to even start, one needs to own the DBMS, meaning that the initial platform must be known through and through. This is an issue even though open source platforms exist.</p>

<p>The graph data, semdata, schema-last, RDF, linked data enterprise -- whatever one calls it -- makes the bold proposition of bringing complex-query-at-scale to heterogeneous data. This is a database claim.</p>

<p>In the meantime, test deployments are made in defiance of database best practices. This is a bit like test driving a race car in reverse gear and steering by looking in the rear-view mirror.</p>

<p>There is also no short-term scalable way to educate people. At the LOD2 review, one comment was that an integrated project ought to clearly indicate how to set up the tool chain for good performance, especially as concerns the interfaces between the tools. This is very true. Experience shows that developers of tools cannot accurately anticipate what usage patterns will emerge in the field. Therefore, we propose to do better than just documentation; we will make the server recognize the common sources of inefficiency and point the user to the right action.</p>

<h3>Provisioning and usage patterns: The DBMS ought to know best.</h3>

<p>Imagine the following conversation:</p>

<p>
<b>DBMS:</b> Your application does single-triple INSERTs over a client-server protocol all day, from a single client. 57% of real time goes into client-server latency, 40% into cluster interconnect latency, 2% into compiling the statements, and 1% into doing the work. Use array parameters or bulk load from a file.</p>

<p>
<b>Operator:</b> My developers use industry-standard Java class libraries with a service-oriented architecture and strictly enforced interfaces. This is called software engineering. Watch out ere you raise your voice against the canon.</p>

<p>
<i>[Some weeks later, after the load job has gone on for 10 days and gotten a third of the way, developers have discovered that JDBC has array parameters and are trying these.]</i>
</p>

<p>
<b>DBMS:</b> 60% of real time goes into waiting for locks. 10% of transactions get aborted for deadlock. Transactions consist of an average of 10 client-server operations. Use stored procedures; acquire locks in predictable order; do SELECT FOR UPDATE. Throughput will be 4x higher if client-server operations are merged into a single operation. The transactions only INSERT; hence consider bulk load instead.</p>

<p>
<b>Operator</b>: We are using an enterprise-class three-tier architecture. It has &quot;enterprise&quot; in the name and all the big guys are using it, so it must be scalable. Besides, it uses distributed transactions, and distributed computing is the wave of the future. You are a cluster yourself, so the pot&#39;s got no business calling the kettle black.</p>

<p>
<i>[After a while, the data gets loaded with bulk load, but now on a single stream.]</i>
</p>

<p>
<b>DBMS</b>: CPU is at 400% for an INSERT workload; adding more parallel threads will get 4.5x better throughput.</p>

<p>
<i>[Some time has elapsed and there are Ajax client apps out there trying to use the data.]</i>
</p>

<p>
<b>DBMS</b>: Will you really not give me another 140 GB RAM and 16 more cores?</p>

<p>
<b>Operator</b>: No, on general principles I will not. Shut up.</p>

<p>
<b>DBMS</b>: Do you know that your page impression takes 3 seconds and anything over 0.25 seconds is visibly slow? 300 GB worth of distinct pages have been accessed in the last 24 hours for 160 GB of RAM. Latency will drop 10x by using SSD; 50x by increasing RAM.</p>

<p>
<b>Operator</b>: No dice, bucket. Shut up. Besides, when I scroll through the data I always use for testing, I get it fast enough; you are just doing this out of greed and self-importance. You are a server among many, just like the mail server; you databases are just pretentious.</p>

<p>Currently, addressing any of the above sorts of issues takes a long time and involves mostly-avoidable support communication. Questions of this sort do occur. We can probably produce commentary like the above based on logging some 50 numbers, and making some 15 regularly-run reports over these. The patterns to watch out for are well known. No, we will not make a Zippy the Pinhead office assistant; a computer should not try to be cute. This one will talk only in terms of gains from adjusting the deployment or usage patterns.</p>

<p>Now, suppose the operator said <i>yes</i> to the request for more cores and memory; then it would be up to the DBMS to deliver. This entails a capacity to redistribute itself automatically, and to give a quantitative report on the success of this measure. This means usage-based repartitioning of the data to equalize load over a cluster. The relevant metric in the above case is the drop in response time. On the other hand, the DBMS should also notice if there is clearly unused capacity.</p>

<p>All of this will be presented as a line in the status report, so there is no extra wizard or workload analyzer that one must remember to run. For programmatic use, there are SQL views for the relevant reports.</p>

<p>As for ETL, even if the DBMS can detect that it is not being done right, this does not mean that the user will know what to do. Therefore, for all the Web harvesting we support, as well as any import from the local file system or Web services, with some RDF-ization, we will simply implement a proper ETL utility that will do things right. <b>Wazzup?</b> can just point the user to that if the workload looks like loading. This will have its own status report giving a load and transform rate and will point out what takes the longest, after everything is duly parallelized and made asynchronous.</p>

<p>Beyond these lessons, there is more to say about the review and plenary; we will get to that a bit later. We did promise a new edition of the LOD cache in a couple of months, now on the clustered column-store platform. Look for advances in data discoverability.</p>
]]></content:encoded>
  <dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Virtuoso Data Space Bot &lt;kidehen@openlinksw.com&gt;</dc:creator>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/vdb/blog/?id=1697">
  <rss:title>GDB for the Data Driven Age (STI Summit Position Paper)</rss:title>
  <rss:link>http://www.openlinksw.com/blog/vdb/blog/?id=1697</rss:link>
  <wfw:comment xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/mt-tb/Http/comments?id=1697</wfw:comment>
  <wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/blog/vdb/blog/gems/rsscomment.xml?:id=1697</wfw:commentRss>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2011-07-26T13:37:26Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">Note: The following was written prior to the event, but was not posted until later due to human error. The Semantic Technology Institute (STI) is organizing a meeting around the questions of making semantic technology deliver on its promise. We were asked to present a position paper (reproduced below). This is another recap of our position on making graph databasing come of age. While the database technology matters are getting tackled, we are drawing closer to the question of deciding what kind of inference will actually be needed close to the data. My personal wish is to use this summit for clarifying exactly what is needed from the database in order to extract value from the data explosion. We have a good idea of what to do with queries, but what is the exact requirement for transformation and alignment of schema and identifiers? What is the actual use case of inference, OWL or other, in this? It is time to get very concrete in terms of applications. We expect a mixed requirement, but it is time to look closely at the details. GDB for the Data Driven Age Databases and knowledge representation both have decades of history, but to date the exchange of ideas and techniques between these disciplines has been limited. The intuition that there would be value in greater cooperation has not failed to occur to researchers on either side; after all, both sides deal with data. From this, we have seen deductive databases emerge, as well as, more recently, &quot;database friendly&quot; profiles of OWL. In this position paper we will examine what, in the most concrete terms, is needed in order to bring leading-edge database technology together with expressive querying and reasoning. This draws on our experience in building Virtuoso, one of today&#39;s leading graph data stores. 
Following this, we argue for the creation of benchmarks and challenges that in fact do reflect reality and facilitate open and fair comparison of products and technologies. Data integration is often mentioned as the motivating use case for GDB, commonly popularized today as RDF. Database research has over the past few years produced great advances for business intelligence (i.e., complex queries and read-mostly workloads). These advances are typified by compressed columnar storage and architecture-conscious execution models, mostly based on the idea of always processing multiple sets of values in each operation (vectoring). With these techniques, raw performance with relatively simple schemas and regular data (e.g., TPC-H) is no longer a barrier to extracting value from data. A similar breakthrough has not been seen on the semantics side. Data integration still requires manual labor. Publishing GDB datasets is a good and necessary intermediate stage, but producing these datasets from diverse sources is not fundamentally different from doing the same work without GDB or RDF. Even so, GDB and RDF serve as a catalyst for a culture of publishing datasets. GDB, as a base model for integration, offers the following benefits over a purely relational result format: All entities have globally unique identifiers. Any statements may be associated ad hoc with any entities. These statements can be scoped into graphs according to their provenance, time, validity, etc. Obtaining this flexibility on a relational basis would simply require moving to a graph-like representation with essentially one-row-per-attribute. Indeed, we see key-value stores being used in online applications with high volatility of schema (e.g., social networks, search); and we also see relational applications making provisions for post-hoc addition of per-entity attributes (i.e., associating a bag of mixed non-first normal form data with entities). 
The benefits of a schema-last approach are recognized in many places. GDB seems a priori a fit for all these requirements, so how will it claim its place as a solution? The first part of the answer lies in learning all the relevant database lessons. The second part lies in eliminating the impedance mismatch between querying and reasoning. The third and most important part consists of substantiating these claims in a manner that is understandable to the relevant publics, finally leading to the creation of a semantics-aware segment of the database industry. We will address each of these aspects in turn. GDB and RDB The problem is divided into storage format, execution, and query optimization. For the first two, Daniel Abadi&#39;s renowned Ph.D. thesis holds most of the keys. Space efficiency is especially important for Linked Data, since data is often voluminous, and many datasets have to be brought together for integration. Access patterns are also unpredictable, with indexed-random-access predominating, as opposed to RDB BI workloads where sequential scans and hash joins represent the bulk of the work. However, we find that a sorted column-wise compressed representation of Linked Data with a single quad table for all statements gives excellent space efficiency and good random access as well as random insert speed. The space efficiency is close to par with the equivalent column-wise relational format, since three of the four columns of the quad table compress to almost nothing. As many sort orders as are necessary may be maintained, but we find that two are enough, with some extra data structures for dealing with queries where the predicate is unspecified. The details are found in the VLDB 2010 Semdata workshop paper, Directions and Challenges for Semantically Linked Data. Since GDB/RDF is a model typed at run time, the engine must support an &quot;ANY&quot; data type for columns and query variables, where values on successive rows may be of different types. 
This is a straightforward enhancement. Vectored execution is traditionally associated with column stores because the per-row access cost is relatively high, so many nearby rows must be accessed at a time in order to amortize the overhead. Aside from this, vectored execution provides many opportunities for parallelism, from the instruction level all the way to threading and distributed execution on clusters; thus, some form of execution on large numbers of concurrent query states is needed for RDF stores, just as it is needed for RDBMSs. Query optimization for GDBMS is similar to that for RDBMS, except that the statistics can no longer be collected by column and table, but must rather apply to individual entities and ranges of a single quad table. This can be provided through run-time sampling of the database based on constants in the query being optimized. This may take into account trivial inference such as expanding properties into the set of their sub-properties and the like. Beyond this, interleaving execution and optimization (as in ROX) seems to offer limitless possibilities, especially when inference is introduced, making optimizer statistics less predictive. In summary, starting with an RDBMS and going to GDB entails changes to all parts of the engine, but these changes are not fundamental. One does need to own the engine, however; otherwise the expertise for efficiently implementing these changes will not exist. Essentially any DBMS technique may be translated to a GDB use case, if its application can be decided at run-time. GDB may be schema-less, yet most datasets have fairly regular structure; the question is simply to reconstruct the needed statistics and schema information from the data on an as-you-go basis. Techniques with high up-front cost, like constructing specially ordered materializations for optimizing specific queries, are harder to deploy but still conceivable for GDB also. 
RDB and Inference Compared to the straightforwardly performance oriented world of database engines, the contours of the landscape become less defined when moving to inference. Databases, whether relational or schema-less all perform roughly the same functions but inference is more diverse. We include here also techniques like machine learning and meta-reasoning for guiding reasoning, although these might not strictly fit the definition. As we posit that data integration is the motivating use case for GDB as opposed to RDB (Relational Database Model), we must ask which modes of inference are actually required for data integration. Further, we need to ask whether these inferences ought to be applied as a preprocessing step (ETL or forward chaining), or as needed (backward chaining). Some low-hanging fruit can be collected by simply constructing class or property hierarchies; e.g., in the data at hand, the following properties have the meaning of company name, and the following classes have the meaning of company. We have found that such techniques can be efficiently supported at run-time, without materialization, if the support is simply built into the engine, which is in itself straightforward as long as one controls the engine. The same applies to trivial identity resolution, such as owl:sameAs or resolution of identity based on sharing an inverse-functional property value. These things take longer at run-time, but if one caches and reuses the result, one can get around materialization. We do not believe in weak statements of identity, as in X is similar to Y, since the meaning of similarity is entirely contextual. X and Y may or may not be interchangeable depending on the application; thus the statement on identity needs to be strong, but it must be easy to modify the grounds on which such a statement is made. 
This is a further argument for why one should not automatically materialize consequences of identity, particularly if dealing with web data where identity is especially problematic. Real-world problems are however harder than just bundling properties, classes, or instances into sets of interchangeable equivalents, which is all we have mentioned thus far. There are differences of modeling (&quot;address as many columns in customer table&quot; vs. &quot;address normalized away under a contact entity&quot;), normalization (&quot;first name&quot; and &quot;last name&quot; as one or more properties; national conventions on person names; tags as comma-separated in a string or as a one-to-many), incomplete data (one customer table has family income bracket, the other does not), diversity in units of measurement (Imperial vs. metric), variability in the definition of units (seven different things all called blood pressure), variability in unit conversions (currency exchange rates), to name a few. What a world! If data exists, the conversion questions are often answerable but their answer depends on context -- e.g., date of transaction for currency exchange rate; source of data for the definition of blood pressure. Alongside these, there remain issues of identity, e.g., depending on the perspective, a national subsidiary is or is not the same entity as the parent company, companies with the same name can be entirely unrelated in different jurisdictions. It appears that we may need a multi-level approach, combining different techniques for different phases of the integration process. We do not a priori believe that using SQL VIEWs for unit and modeling conversion, and then OWL for unifying terminology on top of this, were the whole solution. 
Even if this were the solution, the pipeline from the relational sources to SPARQL and OWL needs to be optimized for real-world BI information volumes, and the query language needs to be able to express the business questions and needs to interface with the reporting tools the analyst has come to expect. Our answer so far consists of a SPARQL extension with non-recursive rules, roughly equivalent to SQL VIEWs in expressive power, tightly integrated to the query engine. There is also limited support for recursion through transitive subqueries; thus one can compactly express things like &quot;all parts of all assemblies and subassemblies must satisfy applicable safety requirements, where the requirements depend on the type of the part in question.&quot; This is only an intermediate step. We believe that a database-scale generic inference engine with at least Datalog power, with second-order extensions like computed predicates, is needed, executing inside the DBMS, benefiting from the whole array of optimizations database-science expects of execution engines, as part of the answer. This will not relieve the analyst of having to consider that the currency rates in effect at the time of conversion must be taken into account when calculating profits, but this will at least make expressing this and similar pieces of context more compact. We note that time-to-answer has historically won over raw performance. This was also the case for RDBMS when these were the fresh challenger to the CODASYL incumbents, just as was the case with the adoption of high-level languages. The key is that the raw performance must be sufficient for the real world task. With the adoption of the database lessons outlined in the previous section, we believe this to be the case for GDB (and thus, RDF). Substantiating the Claims Benchmarks have a stellar record for improving any metric they measure. 
The question is, how can we make a metric that measures GDB&#39;s ability to deliver on its claim to fame -- time-to-answer for big data -- with all the integration and other complexities this entails? So far, GDB benchmarks have consisted of workloads where RDBMS are clearly better (e.g., LUBM, or the Berlin SPARQL Benchmark). This does not remove their usefulness for GDB, but does not constitute a GDB selling point, either. We suggest a dual approach. The first part is demonstrating that GDB is scalable for BI: We take the industry standard decision support benchmark TPC-H, which is very favorable to RDB and quite unfavorable to GDB, and show that we can tackle the workload at reasonable cost. If TPC-H is all one wants, an RDBMS will stay a better fit, but then this benchmark does not capture any of the heterogeneity, schema evolution, or other such requirements faced by real-world data warehouses. This is still a qualification test, not the selling point. The issue of benchmark is inextricably tied to the issue of messaging. There must be a compelling story, with which the IT community can identify. Further, the benchmark must capture real-world challenges in the area of interest. With all this, the benchmark should not be too expensive to run. Here too, a multistage approach suggests itself. Our tentative answer to this question is the Social Intelligence Benchmark (SIB), developed together with CWI and other partners in the LOD2 consortium. This simulates a social network and combines an online workload with complex analytics. This benchmark should cover all of the target areas of the LOD2 project, so that the project itself generates its own metric of success. The project has clear data integration targets, especially as applies to Web and Linked Data. Questions of integration with enterprise sources need to be further developed; for example, comparing CRM data with extractions from the online conversation space for market research. 
Data integration will invariably involve human effort, and the area cannot be satisfactorily covered with metrics of scale and throughput alone. Development time, accuracy of results, and cost of maintenance are all factors. Furthermore, the task being modeled must correspond to reality, still without being too domain-specific or prohibitively time-consuming to implement. Conclusions The data driven world will increase rewards for efficiency in data integration. We believe that such efficiency crucially depends on semantics. Real world requirements just might throw the database and AI communities together with enough heat and pressure for fusion to ignite, allegorically speaking. Without a clear and present need, the geek world analog of electrostatic repulsion will keep the communities separate, as has been the case thus far, and no new, qualitatively-different element will arise. Efforts such as this STI Summit and the LOD2 Project are needed for setting directions and communicating the requirement to the research world. In our fusion analogy, this is the field which directs the nuclei to collide. Once there is an actual reaction that produces more than it consumes by a sufficient margin, regular business dynamics will take over, and we will have an industry with several products of comparable capability, as well as a set of metrics, all to the benefit of the end user. References TPC-H results pages Daniel Abadi&#39;s Ph.D. Thesis, Query Execution in Column-Oriented Database Systems ( PDF ) Our VLDB 2010 Semdata workshop paper, Directions and Challenges for Semantically Linked Data ( HTML | PDF ) CWI&#39;s ROX: Run-time Optimization of XQueries ( PDF ) The LOD2 Project web site</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>
<i><b>Note:</b> The following was written prior to the event, but was not posted until later due to human error.</i>
</p>

<p>The <a href="http://sti2.org/" id="link-id0x261e3798">Semantic Technology Institute</a> (<a href="http://sti2.org/" id="link-id0x243dac30">STI</a>) is organizing <a href="http://summit2011.sti2.org/" id="link-id0x25fc4e68">a meeting</a> around the questions of making semantic technology deliver on its promise. We were asked to present a position paper (reproduced below). This is another recap of our position on making graph databasing come of age. While the database technology matters are getting tackled, we are drawing closer to the question of what kind of inference will actually be needed close to the data. My personal wish is to use this summit for clarifying exactly what is needed from the database in order to extract value from the data explosion. We have a good idea of what to do with queries, but what is the exact requirement for transformation and alignment of schema and identifiers? What is the actual use case of inference, OWL or other, in this? It is time to get very concrete in terms of applications. We expect a mixed requirement, but it is time to look closely at the details.</p>


<h3>GDB for the Data Driven Age</h3>

<p>Databases and knowledge representation both have decades of history, but to date the exchange of ideas and techniques between these disciplines has been limited. The intuition that there would be value in greater cooperation has not failed to occur to researchers on either side; after all, both sides deal with data. From this, we have seen deductive databases emerge, as well as more recently &quot;database friendly&quot; profiles of OWL.</p>

<p>In this position paper we will examine what, in the most concrete terms, is needed in order to bring leading edge database technology together with expressive querying and reasoning. This draws on our experience in building <a href="http://virtuoso.openlinksw.com/" id="link-id0x240cdd28">Virtuoso</a>, one of today&#39;s leading <a href="http://dbpedia.org/page/Graph_database" id="link-id0x24ceaae0">graph data stores</a>. Following this, we argue for the creation of benchmarks and challenges that in fact do reflect reality and facilitate open and fair comparison of products and technologies.</p>

<p>Data integration is often mentioned as the motivating use case for GDB, commonly popularized today as RDF. Database research has over the past few years produced great advances for business intelligence (i.e., complex queries and read-mostly workloads). These advances are typified by compressed columnar storage and architecture-conscious execution models, mostly based on the idea of always processing multiple sets of values in each operation (vectoring). With these techniques, raw performance with relatively simple schemas and regular data (e.g., TPC-H) is no longer a barrier to extracting value from data.</p>

<p>A similar breakthrough has not been seen on the semantics side. Data integration still requires manual labor. Publishing GDB datasets is a good and necessary intermediate stage, but producing these datasets from diverse sources is not fundamentally different from doing the same work without GDB or RDF. Even so, GDB and RDF serve as a catalyst for a culture of publishing datasets.</p>

<p>GDB, as a base model for integration, offers the following benefits over a purely relational result format: </p>

<ul>
<li>All entities have globally unique identifiers.</li>
<li>Any statements may be associated ad hoc to any entities.</li>
<li>These statements can be scoped into graphs according to their provenance, time, validity, etc.</li> </ul>
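
<p>As a minimal sketch of these three properties, the following Python fragment models statements as quads. The <code>Quad</code> tuple, the example.org identifiers, and the <code>statements_about</code> helper are hypothetical names chosen for illustration; they are not part of Virtuoso or of any RDF library.</p>

```python
from collections import namedtuple

# Hypothetical quad layout: graph, subject, predicate, object.
Quad = namedtuple("Quad", ["g", "s", "p", "o"])

ACME = "http://example.org/id/acme"  # a globally unique identifier (made up)

store = [
    Quad("http://example.org/graph/crm", ACME, "name", "ACME Corp"),
    Quad("http://example.org/graph/crm", ACME, "country", "US"),
    # A statement attached ad hoc from another source; no schema change is needed.
    Quad("http://example.org/graph/web", ACME, "twitter", "@acme"),
]


def statements_about(quads, subject, graph=None):
    """Return (predicate, object) pairs for a subject, optionally scoped to one graph."""
    return [
        (q.p, q.o)
        for q in quads
        if q.s == subject and (graph is None or q.g == graph)
    ]
```

<p>Calling <code>statements_about(store, ACME)</code> returns all three statements, while passing <code>graph="http://example.org/graph/web"</code> limits the answer to the statement that came from that source.</p>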

<p>Obtaining this flexibility on a relational basis would simply require moving to a graph-like representation with essentially one-row-per-attribute. Indeed, we see key-value stores being used in online applications with high volatility of schema (e.g., social networks, search); and we also see relational applications making provisions for post-hoc addition of per-entity attributes (i.e., associating a bag of mixed non-first normal form data with entities). The benefits of a schema-last approach are recognized in many places.</p>

<p>GDB seems <i>a priori</i> a fit for all these requirements, thus how will it claim its place as a solution?</p>

<p>The first part of the answer lies in learning all the relevant database lessons. The second part lies in eliminating the impedance mismatch between querying and reasoning. The third and most important part consists of substantiating these claims in a manner that is understandable to the relevant publics, finally leading to the creation of a semantics-aware segment of the database industry. We will address each of these aspects in turn.</p>

<h4>GDB and RDB</h4>

<p>The problem is divided into storage format, execution, and query optimization. For the first two, Daniel Abadi&#39;s <a href="http://cs-www.cs.yale.edu/homes/dna/papers/abadiphd.pdf" id="link-id0x25ebd568">renowned Ph.D. thesis</a> holds most of the keys. Space efficiency is especially important for Linked Data, since data is often voluminous, and many datasets have to be brought together for integration. Access patterns are also unpredictable, with indexed-random-access predominating, as opposed to RDB BI workloads where sequential scans and hash joins represent the bulk of the work. However, we find that a sorted column-wise compressed representation of Linked Data with a single quad table for all statements gives excellent space efficiency and good random access as well as random insert speed. The space efficiency is close to par with the equivalent column-wise relational format, since three of the four columns of the quad table compress to almost nothing. As many sort orders as are necessary may be maintained, but we find that two are enough, with some extra data structures for dealing with queries where the predicate is unspecified. The details are found in our VLDB 2010 Semdata workshop paper, <i><a href="http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtDirectionsChallengesSemdata" id="link-id0x244a8010">Directions and Challenges for Semantically Linked Data</a></i>. Since GDB/RDF is a model typed at run time, the engine must support an &quot;<code>ANY</code>&quot; data type for columns and query variables, where values on successive rows may be of different types. This is a straightforward enhancement.</p>
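
<p>To give a feel for why sorted columns of a quad table can compress so well, here is a toy Python sketch using run-length encoding. Run-length encoding and the predicate-then-subject sort order are assumptions made for illustration only; the text above does not say which compression schemes or sort orders the engine actually uses. The object column deliberately mixes strings and integers, in the spirit of the <code>ANY</code> type.</p>

```python
from itertools import groupby


def run_length_encode(column):
    """Collapse consecutive equal values into [value, run_length] pairs."""
    return [[value, len(list(run))] for value, run in groupby(column)]


# Toy quads as (predicate, subject, object, graph) tuples.
# Sorting only on predicate and subject avoids comparing mixed-type objects.
quads = sorted(
    [
        ("name", "s1", "Alice", "g1"),
        ("name", "s2", "Bob", "g1"),
        ("name", "s3", "Carol", "g1"),
        ("age", "s1", 31, "g1"),
        ("age", "s2", 45, "g1"),
        ("age", "s3", 27, "g1"),
    ],
    key=lambda q: (q[0], q[1]),
)

predicate_column = [q[0] for q in quads]
object_column = [q[2] for q in quads]
graph_column = [q[3] for q in quads]
```

<p>In this toy data the six-row predicate column shrinks to two runs and the graph column to a single run, whereas the object column, which carries the actual information, does not shrink at all.</p>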

<p>Vectored execution is traditionally associated with column stores because the per-row access cost is relatively high, so many nearby rows must be accessed at a time in order to amortize the overhead. Aside from this, vectored execution provides many opportunities for parallelism, from the instruction level all the way to threading and distributed execution on clusters, thus some form of execution on large numbers of concurrent query states is needed for RDF stores, just as it is needed for RDBMSs.</p>
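
<p>The amortization argument can be sketched in a few lines of Python. The <code>lookup_batch</code> function below is a hypothetical illustration, not engine code: it takes a whole vector of probe keys, sorts them, and lets each binary search start where the previous one stopped, so that nearby rows are visited together.</p>

```python
from bisect import bisect_left


def lookup_batch(sorted_keys, values, probes):
    """Look up a whole vector of probe keys against a sorted index in one call.

    The probes are sorted first, so each search can start where the previous
    one ended instead of at the beginning of the index.
    """
    results = {}
    position = 0
    for key in sorted(set(probes)):
        position = bisect_left(sorted_keys, key, position)
        if position != len(sorted_keys) and sorted_keys[position] == key:
            results[key] = values[position]
    # Return the answers in the caller's original probe order.
    return [results.get(key) for key in probes]
```

<p>Missing keys come back as <code>None</code>, so the output vector always lines up with the input vector, which is what lets the next operator in a pipeline process the whole batch in turn.</p>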

<p>Query optimization for GDBMS is similar to that for RDBMS, except that the statistics can no longer be collected by column and table, but must rather apply to individual entities and ranges of a single quad table. This can be provided through run-time sampling of the database based on constants in the query being optimized. This may take into account trivial inference such as expanding properties into the set of their sub-properties and the like. Beyond this, interleaving execution and optimization (as in <a href="http://oai.cwi.nl/oai/asset/14193/14193B.pdf" id="link-id0x264cfd20">ROX</a>) seems to offer limitless possibilities, especially when inference is introduced, making optimizer statistics less predictive. </p>
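
<p>A rough Python sketch of sampling-based estimation follows. It assumes uniform random sampling over an in-memory list of (subject, predicate, object) tuples, which is a simplification: an actual engine would more plausibly probe its indices for the constants in the query. The function name and parameters are hypothetical.</p>

```python
import random


def estimate_matches(triples, predicate, sample_size=100, seed=0):
    """Estimate how many triples carry a given predicate from a random sample."""
    if not triples:
        return 0.0
    rng = random.Random(seed)
    sample = rng.sample(triples, min(sample_size, len(triples)))
    hits = sum(1 for t in sample if t[1] == predicate)
    # Scale the hit ratio in the sample up to the size of the whole table.
    return hits / len(sample) * len(triples)
```

<p>The estimate is computed at optimization time for the specific constant in the query, which replaces the per-column statistics a relational optimizer would have collected in advance.</p>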

<p>In summary, starting with an RDBMS and going to GDB entails changes to all parts of the engine, but these changes are not fundamental. One does need to own the engine, however; otherwise the expertise for efficiently implementing these changes will not exist. Essentially any DBMS technique may be translated to a GDB use case, if its application can be decided at run-time. GDB may be schema-less, yet most datasets have fairly regular structure; the question is simply to reconstruct the needed statistics and schema information from the data on an as-you-go basis. Techniques with high up-front cost, like constructing specially ordered materializations for optimizing specific queries, are harder to deploy but still conceivable for GDB.</p>

<h4>RDB and Inference</h4>

<p>Compared to the straightforwardly performance-oriented world of database engines, the contours of the landscape become less defined when moving to inference. Databases, whether relational or schema-less, all perform roughly the same functions, but inference is more diverse. We include here also techniques like machine learning and meta-reasoning for guiding reasoning, although these might not strictly fit the definition.</p>

<p>As we posit that data integration is the motivating use case for GDB as opposed to RDB (Relational Database Model), we must ask which modes of inference are actually required for data integration. Further, we need to ask whether these inferences ought to be applied as a preprocessing step (ETL or forward chaining), or as needed (backward chaining). Some low-hanging fruit can be collected by simply constructing class or property hierarchies; e.g., in the data at hand, the following properties have the meaning of company name, and the following classes have the meaning of company. We have found that such techniques can be efficiently supported at run-time, without materialization, if the support is simply built into the engine, which is in itself straightforward as long as one controls the engine. The same applies to trivial identity resolution, such as <code>owl:sameAs</code> or resolution of identity based on sharing an inverse-functional property value. These things take longer at run-time, but if one caches and reuses the result, one can get around materialization.</p>
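
<p>The run-time alternative to materialization can be sketched as follows in Python. The property names and the <code>SUB_PROPERTY_OF</code> table are invented for the example, the hierarchy is assumed to be acyclic, and <code>lru_cache</code> merely stands in for whatever caching an engine would apply.</p>

```python
from functools import lru_cache

# Hypothetical property hierarchy: sub-property -> super-property.
SUB_PROPERTY_OF = {
    "legalName": "companyName",
    "tradingName": "companyName",
    "shortTradingName": "tradingName",
}


@lru_cache(maxsize=None)
def expand_property(prop):
    """Return the property plus all direct and indirect sub-properties.

    The result is cached, so the hierarchy is walked once per property
    rather than once per query. Assumes the hierarchy has no cycles.
    """
    expanded = {prop}
    for child, parent in SUB_PROPERTY_OF.items():
        if parent == prop:
            expanded |= expand_property(child)
    return frozenset(expanded)


def query(triples, prop):
    """Match triples whose predicate is the property or any of its sub-properties."""
    wanted = expand_property(prop)
    return [t for t in triples if t[1] in wanted]
```

<p>No extra triples are stored: a query for <code>companyName</code> is widened at run-time, and changing the hierarchy only requires clearing the cache, not rewriting the data.</p>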

<p>We do not believe in weak statements of identity, as in <i>X is similar to Y,</i> since the meaning of similarity is entirely contextual. X and Y may or may not be interchangeable depending on the application; thus the statement on identity needs to be strong, but it must be easy to modify the grounds on which such a statement is made. This is a further argument for why one should not automatically materialize consequences of identity, particularly if dealing with web data where identity is especially problematic.</p>

<p>Real-world problems are however harder than just bundling properties, classes, or instances into sets of interchangeable equivalents, which is all we have mentioned thus far. There are differences of modeling (&quot;address as many columns in customer table&quot; vs. &quot;address normalized away under a contact entity&quot;), normalization (&quot;first name&quot; and &quot;last name&quot; as one or more properties; national conventions on person names; tags as comma-separated in a string or as a one-to-many), incomplete data (one customer table has family income bracket, the other does not), diversity in units of measurement (Imperial vs. metric), variability in the definition of units (seven different things all called blood pressure), variability in unit conversions (currency exchange rates), to name a few. What a world!</p>

<p>If data exists, the conversion questions are often answerable but their answer depends on context -- e.g., date of transaction for currency exchange rate; source of data for the definition of blood pressure.</p>
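
<p>As a small illustration of a context-dependent conversion, the Python sketch below picks the exchange rate in effect on the transaction date. The rates and dates are made-up placeholder values, and the <code>to_usd</code> helper is hypothetical.</p>

```python
from bisect import bisect_right
from datetime import date

# Made-up EUR-to-USD rates, each valid from the paired date onwards.
RATE_DATES = [date(2011, 1, 1), date(2011, 4, 1), date(2011, 7, 1)]
RATES = [1.30, 1.40, 1.45]


def to_usd(amount_eur, transaction_date):
    """Convert using the rate in effect on the transaction date."""
    index = bisect_right(RATE_DATES, transaction_date) - 1
    if index == -1:
        raise ValueError("no rate known for " + transaction_date.isoformat())
    return amount_eur * RATES[index]
```

<p>The same amount converts differently depending on when the transaction took place, which is exactly the kind of context a fixed equivalence between two properties cannot express.</p>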

<p>Alongside these, there remain issues of identity: e.g., depending on the perspective, a national subsidiary is or is not the same entity as the parent company, and companies with the same name can be entirely unrelated in different jurisdictions.</p>

<p>It appears that we may need a multi-level approach, combining different techniques for different phases of the integration process. We do not <i>a priori</i> believe that using SQL VIEWs for unit and modeling conversion, and then OWL for unifying terminology on top of this, is the whole solution. Even if this were the solution, the pipeline from the relational sources to SPARQL and OWL needs to be optimized for real-world BI information volumes, and the query language needs to be able to express the business questions and needs to interface with the reporting tools the analyst has come to expect.</p>

<p>Our answer so far consists of a SPARQL extension with non-recursive rules, roughly equivalent to SQL VIEWs in expressive power, tightly integrated to the query engine. There is also limited support for recursion through transitive subqueries; thus one can compactly express things like &quot;all parts of all assemblies and subassemblies must satisfy applicable safety requirements, where the requirements depend on the type of the part in question.&quot;</p>
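
<p>The parts example can be sketched outside the query language as a transitive traversal followed by a type-dependent check. The Python below is only an illustration of the logic; the function names and dictionary layouts are hypothetical, and it does not show the syntax of the SPARQL extension.</p>

```python
def all_parts(contains, assembly):
    """Return every part reachable from an assembly through the contains relation."""
    seen = set()
    pending = [assembly]
    while pending:
        current = pending.pop()
        for part in contains.get(current, []):
            if part not in seen:
                seen.add(part)
                pending.append(part)
    return seen


def unsafe_parts(contains, part_type, required, certified, assembly):
    """List parts that lack a certification required for their type."""
    return sorted(
        part
        for part in all_parts(contains, assembly)
        if not required.get(part_type[part], set()).issubset(certified.get(part, set()))
    )
```

<p>The <code>seen</code> set makes the traversal terminate even if the containment data contains a cycle, and the requirement lookup is keyed on the type of each part, as in the quoted example.</p>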

<p>This is only an intermediate step. We believe that part of the answer is a database-scale generic inference engine with at least Datalog power and second-order extensions like computed predicates, executing inside the DBMS and benefiting from the whole array of optimizations that database science expects of execution engines.</p>

<p>This will not relieve the analyst of having to consider that the currency rates in effect at the time of conversion must be taken into account when calculating profits, but this will at least make expressing this and similar pieces of context more compact.</p>

<p>We note that time-to-answer has historically won over raw performance. This was also the case for RDBMS when these were the fresh challenger to the CODASYL incumbents, just as was the case with the adoption of high-level languages. The key is that the raw performance must be sufficient for the real world task. With the adoption of the database lessons outlined in the previous section, we believe this to be the case for GDB (and thus, RDF).</p>

<h4>Substantiating the Claims</h4>

<p>Benchmarks have a stellar record for improving any metric they measure. The question is, how can we make a metric that measures GDB&#39;s ability to deliver on its claim to fame -- time-to-answer for big data -- with all the integration and other complexities this entails?</p>

<p>So far, GDB benchmarks have consisted of workloads where RDBMS are clearly better (e.g., LUBM, or the Berlin SPARQL Benchmark). This does not remove their usefulness for GDB, but does not constitute a GDB selling point, either.</p>

<p>We suggest a dual approach. The first part is demonstrating that GDB is scalable for BI: We take the industry standard decision support benchmark TPC-H, which is very favorable to RDB and quite unfavorable to GDB, and show that we can tackle the workload at reasonable cost. If TPC-H is all one wants, an RDBMS will stay a better fit, but then this benchmark does not capture any of the heterogeneity, schema evolution, or other such requirements faced by real-world data warehouses. This is still a qualification test, not the selling point.</p>

<p>The issue of benchmarks is inextricably tied to the issue of messaging. There must be a compelling story with which the IT community can identify. Further, the benchmark must capture real-world challenges in the area of interest. With all this, the benchmark should not be too expensive to run. Here too, a multistage approach suggests itself.</p>

<p>Our tentative answer to this question is the Social Intelligence Benchmark (SIB), developed together with CWI and other partners in the LOD2 consortium. This simulates a social network and combines an online workload with complex analytics. This benchmark should cover all of the target areas of the LOD2 project, so that the project itself generates its own metric of success. The project has clear data integration targets, especially as applies to Web and Linked Data. Questions of integration with enterprise sources need to be further developed; for example, comparing CRM data with extractions from the online conversation space for market research.</p>

<p>Data integration will invariably involve human effort, and the area cannot be satisfactorily covered with metrics of scale and throughput alone. Development time, accuracy of results, and cost of maintenance are all factors. Furthermore, the task being modeled must correspond to reality, still without being too domain-specific or prohibitively time-consuming to implement.</p>

<h4>Conclusions</h4>

<p>The data-driven world will increase rewards for efficiency in data integration. We believe that such efficiency crucially depends on semantics. Real-world requirements just might throw the database and AI communities together with enough heat and pressure for fusion to ignite, allegorically speaking. Without a clear and present need, the geek-world analog of electrostatic repulsion will keep the communities separate, as has been the case thus far, and no new, qualitatively different element will arise.</p>

<p>Efforts such as this STI Summit and the LOD2 Project are needed for setting directions and communicating the requirement to the research world. In our fusion analogy, this is the field which directs the nuclei to collide.</p>

<p>Once there is an actual reaction that produces more than it consumes by a sufficient margin, regular business dynamics will take over, and we will have an industry with several products of comparable capability, as well as a set of metrics, all to the benefit of the end user.</p>

<h4>References</h4>

<ul>
 <li>
  <p>TPC-H <a href="http://www.tpc.org/tpch/" id="link-id0x25b910e8">results pages</a>
  </p>
 </li>

<li>
  <p>Daniel Abadi&#39;s Ph.D. Thesis, <i>Query Execution in 
      Column-Oriented Database Systems</i> ( <a href="http://cs-www.cs.yale.edu/homes/dna/papers/abadiphd.pdf" id="link-id0x25f8eeb0">PDF</a> )</p>
</li>

<li>
  <p>Our VLDB 2010 Semdata workshop paper, <i>Directions and Challenges for Semantically Linked Data</i> ( <a href="http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtDirectionsChallengesSemdata" id="link-id0x25f88520">HTML</a> | <a href="http://virtuoso.openlinksw.com/whitepapers/Directions_and_Challenges_for_Semantically_Linked_Data.pdf" id="link-id0x271416b8">PDF</a> )</p>
</li>

<li>
  <p>CWI&#39;s <i>ROX: Run-time Optimization of XQueries</i> ( <a href="http://oai.cwi.nl/oai/asset/14193/14193B.pdf" id="link-id0x2699ac78">PDF</a> )</p>
</li>

<li>
  <p>The <a href="http://lod2.eu/" id="link-id0x25856fc8">LOD2 Project web site</a>
  </p>
</li>
</ul>

]]></content:encoded>
  <dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Virtuso Data Space Bot &lt;kidehen@openlinksw.com&gt;</dc:creator>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/vdb/blog/?id=1695">
  <rss:title>The 2011 STI Semantic Summit</rss:title>
  <rss:link>http://www.openlinksw.com/blog/vdb/blog/?id=1695</rss:link>
  <wfw:comment xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/mt-tb/Http/comments?id=1695</wfw:comment>
  <wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/blog/vdb/blog/gems/rsscomment.xml?:id=1695</wfw:commentRss>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2011-07-22T15:49:15Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">I was recently at the STI 2011 summit in Riga, Latvia. This is a meeting of senior participants in the semantic web and sem tech scene, organized by STI of Dieter Fensel fame, with board members like Michael Brodie, Mark Greaves, and Jim Hendler. This is substantially about the intersection of AI, knowledge representation, and databases. As we have said before, the database side has not been very prominent in these meetings in the past, but this time we had Peter Boncz of CWI, of MonetDB and VectorWise fame, attending the proceedings. Will DB and AI finally meet? Well, they have met, but how do they get along? Before I try to answer this, let us look at some background. At present, CWI and OpenLink are working together in the LOD2 EU FP7 project, around the general topic of bringing the best of Relational Database (RDB) science to the Graph Database (GDB) world. Virtuoso has for a few months had a column store capability (which is about to be made available for public preview). CWI has a long history of column store work, with MonetDB and Ingres VectorWise as results. OpenLink&#39;s column store implementation is separate in terms of code but is of course influenced by the work at CWI and other published column store results. The plan is to transplant the applicable CWI innovations into the graph context within Virtuoso. These improvements naturally also benefit Virtuoso RDB (SQL), but the LOD2 project is primarily concerned with GDB applications. The RDB yardstick for much of this work is TPC-H, of which we have made a GDB translation. CWI is uniquely qualified as concerns this in light of VectorWise holding some of the top places in the TPC-H charts. Even now, we do in fact run the 22 TPC-H queries in SPARQL against the Virtuoso column store. True, these run faster in SQL against relational tables but we have established a beach head. 
From this initial position, we can incrementally improve the GDB/SPARQL and RDB/SQL functions, and see how close to SQL we get with SPARQL. I will make a separate post commenting on the differences between SQL and SPARQL. So let&#39;s get back to Riga. Mark Greaves said in his opening comments that he would be sick if he once again heard complaining about how bad and un-scalable the tools were. From all the talks, I did get the overall impression that just better databasing for Graph Data is still needed. OK, we have 1-1/2 years of unreleased work just for that about to hit the street; advances are substantial. Along these lines, the people from Bio2RDF pointed out that there still is a cost to publishing query services, especially for complex queries. Well, this cost will be substantially reduced. The takeaway from the meeting is that the most useful thing, for both our public and ourselves, is simply to keep advancing database tech for graph data. In the first instance, this is about launching what we already have; in the second, about going through the CWI record of innovation and adapting this to GDB. The thinking is that once query-answering on some tens-of-billions of triples is easily interactive no matter what question one asks, a tipping point will be reached, and GDB can efficiently play the role of data-melting-pot that has been envisioned for it. This is just a beginning, though. Michael Brodie has on a number of occasions pointed out that (relational) database guys are only about performance with little or no regard to meaning or even questions of the applicability of the relational model. Peter Boncz then comments back that it can well be that the bulk of IT expenditure worldwide in fact goes into data integration. However, data integration is an &quot;AI-complete&quot; problem with infinite variety and consequent difficulty of measurement. 
So, making better database engines stands a much greater chance of success and has the nicety of relatively unambiguous metrics. Quite so. We are somewhere in the middle. I&#39;d say that GDB is still at the stage where making better databases is a matter of make-or-break and not a matter of cutting already vanishingly-short response times just for the sake of it. We will have progress if we just keep at it; for now, performance is still a basic need and not a luxury. Now that there is all this potentially integrable data published as graphs (most commonly as RDF serializations), what do we do? Someone at the Riga meeting suggested we take a look across the tracks to the RDB world to see what is being done there for data integration. The question is raised, what does GDB have for data integration? The automatic answer that GDB and RDF have OWL is not adequate, as was rightly pointed out by many. Having schema-last, global identifiers, and some culture of vocabulary reuse is nice, but this is only a start. To cite an example, owl:sameAs will not work when entities simply do not align: One database models a product as a parts hierarchy; another does the same but now based on the materials used in the parts. One tree just has a node that is not in the other. Besides, things like string matching (as in extracting area codes from phone numbers) are common, and OWL specifically excludes any such functions. It is now time to look at what will come after all the database advances. In my talk I outlined some things that have, or are about to get, solutions: Database technology: Applying advances from RDB (specifically columns, vectoring, and some adaptive query execution) will make GDB a possibility for data warehousing at some scale. Benchmarks: These advances will be demonstrable through benchmarking. There is a better suite of benchmarks with many variations of BSBM, a GDB-modified TPC-H, and the upcoming Social Intelligence Benchmark (SIBB) with actual graph data. 
There are the beginnings of an auditing process for result publishing, and a fair chance the semdata world will get its analog of the TPC. After these basics are more or less in hand, we have a vista of more diverse questions: What to do about inference? We do not want OWL or RIF for their own sake; instead we want whatever will declaratively facilitate making sense of data. This is an entirely use-case-driven question. If this can have a reasonably generic answer, we will build it into the engine. Data integration is highly diverse, and tool sets like IBM Infosphere have thousands of modules and functions for different aspects of the problem. To what degree does it make sense to put DI-oriented capabilities into a DBMS? Is it the case that SQL or SPARQL, plus or minus a few details, is as powerful as a language can be while staying application domain-agnostic? In other words, if more powerful reasoning is built into the query language, will the requirements vary so much between application domains that the work is not generally applicable? Datalog is general enough, but can we demonstrate substantially reduced time to answer with big data if this is built into the engine? Berkeley Orders Of Magnitude claims this, even though their claim is not exactly in a database context. We need use cases to refine the actual requirement for inference. In all these questions, we of necessity turn to the user community. In fact we do not follow the usage of these technologies as much as we ought to. One outcome of the Riga summit is a set of public challenges that will hopefully ameliorate this state of matters, to be released soon. The general feeling was that there is more going on on the data side than the AI side. The LOD movement proceeds and lightweight everything predominates, also for knowledge representation. There was some discussion about &quot;pay as you go&quot; integration. 
On the one hand, there is no up-front integration of information systems just for its own sake, so pay as you go is the only kind that exists, system by system, as the need becomes sufficient. On the other hand, each such integration is a process which has its distinct steps and maintenance and within itself it is planned, and thus pre-paid, so to speak. We need more work with the data itself to better understand the matter. The open government data should offer a playground for this and there will be a special challenge around this. Schema.org and Microdata got their share of discussion. As we see it, it is good that search engines make their pre-competitive data open. This is better than, for example, Google wanting retailers to put their catalogs in Google Base. We do not care about the specific syntax in which data is embedded; we support them all. Microdata converts easily to triples, and if one wants to make a tabular extraction for use with relational tools, this too is simple enough. Applications will have to do their own entity resolution, but this is independent of data publication format. All in all, the mood was positive. Mark Greaves noted in his closing remarks that there has been a 1000x increase in published GDB data over a few years. There is in fact a large quantity of technology for tackling almost any aspect of the LOD value chain, but people do not necessarily know about this nor is it easy to integrate. Still there would be great value in integration. Getting software to interoperate in a meaningful way is manual labor, so it might make sense to organize hackathons around this. While the STI Summit is for the senior people, there could be a parallel track of events for bringing the coders together to actually practice tool integration and interoperation.</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>I was recently at the <a href="http://www.openlinksw.com:80/www.sti2.org/events/2011-sti-semantic-summit" id="link-id0x2308d838">STI 2011 summit in Riga, Latvia</a>.
This is a meeting of senior participants in the semantic web and sem tech scene, organized by <a href="http://www.openlinksw.com:80/www.sti2.org/" id="link-id0x25076168">STI</a> of <a href="http://dbpedia.org/page/Dieter_Fensel" id="link-id0x24d2e998">Dieter Fensel</a> fame, with board members like <a href="http://www.michaelbrodie.com/" id="link-id0x224b4b58">Michael Brodie</a>, <a href="http://www.iks-project.eu/community/people/mark-greaves" id="link-id0x2308d4a8">Mark Greaves</a>, and <a href="http://dbpedia.org/page/James_Hendler" id="link-id0x24c192d0">Jim Hendler</a>.</p>

<p>This is substantially about the intersection of AI, knowledge representation, and databases. As we have said before, the database side has not been very prominent in these meetings in the past, but this time we had <a href="http://homepages.cwi.nl/~boncz/" id="link-id0x26654260">Peter Boncz</a> of CWI, of MonetDB and VectorWise fame, attending the proceedings.</p>

<p>Will DB and AI finally meet? Well, they have met, but how do they get along? Before I try to answer this, let us look at some background.</p>

<p>At present, CWI and <a href="http://www.openlinksw.com/" id="link-id0x24724fe0">OpenLink</a> are working together in the <a href="http://lod2.eu/" id="link-id0x24e20d90">LOD2 EU FP7 project</a>, around the general topic of bringing the best of <a href="http://dbpedia.org/page/Relational_database" id="link-id0x2475f128">Relational Database</a> (RDB) science to the <a href="http://dbpedia.org/page/Graph_database" id="link-id0x2474e988">Graph Database</a> (GDB) world. Virtuoso has for a few months had a column store capability (which is about to be made available for public preview). CWI has a long history of column store work, with MonetDB and Ingres VectorWise as results. OpenLink&#39;s column store implementation is separate in terms of code but is of course influenced by the work at CWI and other published column store results. The plan is to transplant the applicable CWI innovations into the graph context within Virtuoso. These improvements naturally also benefit Virtuoso RDB (SQL), but the LOD2 project is primarily concerned with GDB applications. The RDB yardstick for much of this work is <a href="http://dbpedia.org/resource/TPC-H" id="link-id0x22a96588">TPC-H</a>, of which we have made a GDB translation. CWI is uniquely qualified as concerns this in light of VectorWise holding some of the top places in the TPC-H charts.</p>

<p>Even now, we do in fact run the 22 TPC-H queries in SPARQL against the Virtuoso column store. True, these run faster in SQL against relational tables, but we have established a beachhead. From this initial position, we can incrementally improve the GDB/SPARQL and RDB/SQL functions, and see how close to SQL we get with SPARQL. I will make a separate post commenting on the differences between SQL and SPARQL.</p>

<p>So let&#39;s get back to Riga. Mark Greaves said in his opening comments that he would be sick if he once again heard complaining about how bad and un-scalable the tools were. From all the talks, I did get the overall impression that just better databasing for Graph Data is still needed. OK, we have 1-1/2 years of unreleased work just for that about to hit the street; advances are substantial. Along these lines, the people from <a href="http://www.bio2rdf.org/" id="link-id0x2315c088">Bio2RDF</a> pointed out that there still is a cost to publishing query services, especially for complex queries. Well, this cost will be substantially reduced.</p>

<p>The takeaway from the meeting is that the most useful thing, for both our public and ourselves, is simply to keep advancing database tech for graph data. In the first instance, this is about launching what we already have; in the second, about going through the CWI record of innovation and adapting this to GDB.</p>

<p>The thinking is that once query-answering on some tens-of-billions of triples is easily interactive no matter what question one asks, a tipping point will be reached, and GDB can efficiently play the role of data-melting-pot that has been envisioned for it.</p>

<p>This is just a beginning, though. Michael Brodie has on a number of occasions pointed out that (relational) database guys are only about performance with little or no regard to meaning or even questions of the applicability of the relational model. Peter Boncz then replied that it can well be that the bulk of IT expenditure worldwide in fact goes into data integration. However, data integration is an &quot;<a href="http://dbpedia.org/page/AI-complete" id="link-id0x24754170">AI-complete</a>&quot; problem with infinite variety and consequent difficulty of measurement. So, making better database engines stands a much greater chance of success and has the nicety of relatively unambiguous metrics. </p>

<p>Quite so. We are somewhere in the middle. I&#39;d say that GDB is still at the stage where making better databases is a matter of make-or-break and not a matter of cutting already vanishingly-short response times just for the sake of it. We will have progress if we just keep at it; for now, performance is still a basic need and not a luxury.</p>

<p>Now that there is all this potentially integrable data published as graphs (most commonly as RDF serializations), what do we do? Someone at the Riga meeting suggested we take a look across the tracks to the RDB world to see what is being done there for data integration. The question is raised, what does GDB have for data integration? The automatic answer that GDB and RDF have OWL is not adequate, as was rightly pointed out by many. Having schema-last, global identifiers, and some culture of vocabulary reuse is nice, but this is only a start. To cite an example, <code>owl:sameAs</code> will not work when entities simply do not align: One database models a product as a parts hierarchy; another does the same but now based on the materials used in the parts. One tree just has a node that is not in the other. Besides, things like string matching (as in extracting area codes from phone numbers) are common, and OWL specifically excludes any such functions.</p>
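
<p>To make the string-matching point concrete, here is a minimal sketch of the kind of extraction that falls outside OWL but can be written with query-language functions. It assumes a SPARQL 1.1 engine, a hypothetical <code>ex:phone</code> property, and phone numbers written in the form <code>+1-212-555-0100</code>; none of these details come from the summit discussion.</p>

<blockquote>
 <pre>PREFIX ex: &lt;http://example.org/&gt;

# ex:phone and the +1-NNN-... layout are assumptions made for illustration
SELECT ?person ?areaCode
WHERE {
  ?person ex:phone ?phone .
  # keep only numbers shaped like +1-NNN-...
  FILTER regex(?phone, "^\\+1-[0-9]{3}-")
  # SUBSTR is 1-based: characters 4 through 6 hold the area code
  BIND (substr(?phone, 4, 3) AS ?areaCode)
}</pre>
</blockquote>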

<p>It is now time to look at what will come after all the database advances. In my talk I outlined some things that have, or are about to get, solutions:</p>

<ul>
 <li>
  <p>
    <b>Database technology:</b> Applying advances from RDB (specifically columns, vectoring, and some adaptive query execution) will make GDB a possibility for data warehousing at some scale.</p>
 </li>

<li>
  <p>
    <b>Benchmarks:</b> These advances will be demonstrable through benchmarking. There is a better suite of benchmarks with many variations of BSBM, a GDB-modified TPC-H, and the upcoming Social Intelligence Benchmark (SIBB) with actual graph data. There are the beginnings of an auditing process for result publishing, and a fair chance the semdata world will get its analog of the TPC.</p>
</li>
</ul>

<p>After these basics are more or less in hand, we have a vista of more diverse questions:</p>

<ul>
 <li>
  <p>What to do about inference? We do not want OWL or RIF for their own sake; instead we want whatever will declaratively facilitate making sense of data. This is an entirely use-case-driven question. If this can have a reasonably generic answer, we will build it into the engine. </p>
 </li>

<li>
  <p>Data integration is highly diverse, and tool sets like IBM Infosphere have thousands of modules and functions for different aspects of the problem. To what degree does it make sense to put DI-oriented capabilities into a DBMS? </p>
</li>

<li>
  <p>Is it the case that SQL or SPARQL, plus or minus a few details, is as powerful as a language can be while staying application domain-agnostic? In other words, if more powerful reasoning is built into the query language, will the requirements vary so much between application domains that the work is not generally applicable? <a href="http://dbpedia.org/page/Datalog" id="link-id0x2403b2f0">Datalog</a> is general enough, but can we demonstrate substantially reduced time to answer with big data if this is built into the engine? <a href="http://boom.cs.berkeley.edu/" id="link-id0x23ed5730">Berkeley Orders Of Magnitude</a> claims this, even though their claim is not exactly in a database context. We need use cases to refine the actual requirement for inference.</p>
</li>
</ul>

<p>In all these questions, we of necessity turn to the user community. In fact we do not follow the usage of these technologies as much as we ought to. One outcome of the Riga summit is a set of public challenges that will hopefully ameliorate this state of matters, to be released soon.</p>

<p>The general feeling was that there is more going on on the data side than the AI side. The LOD movement proceeds and lightweight everything predominates, also for knowledge representation. There was some discussion about &quot;pay as you go&quot; integration. On the one hand, there is no up-front integration of information systems just for its own sake, so pay as you go is the only kind that exists, system by system, as the need becomes sufficient. On the other hand, each such integration is a process which has its distinct steps and maintenance and within itself it is planned, and thus pre-paid, so to speak. We need more work with the data itself to better understand the matter. The open government data should offer a playground for this and there will be a special challenge around this.</p>

<p>
<a href="http://schema.org/" id="link-id0x2475a708">Schema.org</a> and <a href="http://www.w3.org/TR/microdata/" id="link-id0x2a6f8b40">Microdata</a> got their share of discussion. As we see it, it is good that search engines make their pre-competitive data open. This is better than, for example, Google wanting retailers to put their catalogs in Google Base. We do not care about the specific syntax in which data is embedded; we support them all. Microdata converts easily to triples, and if one wants to make a tabular extraction for use with relational tools, this too is simple enough. Applications will have to do their own entity resolution, but this is independent of data publication format. </p>

<p>All in all, the mood was positive. Mark Greaves noted in his closing remarks that there has been a 1000x increase in published GDB data over a few years. There is in fact a large quantity of technology for tackling almost any aspect of the LOD value chain, but people do not necessarily know about this nor is it easy to integrate. Still there would be great value in integration. Getting software to interoperate in a meaningful way is manual labor, so it might make sense to organize hackathons around this. While the STI Summit is for the senior people, there could be a parallel track of events for bringing the coders together to actually practice tool integration and interoperation.</p>]]></content:encoded>
  <dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Virtuso Data Space Bot &lt;kidehen@openlinksw.com&gt;</dc:creator>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/vdb/blog/?id=1692">
  <rss:title>Transaction Semantics in RDF and Relational Models</rss:title>
  <rss:link>http://www.openlinksw.com/blog/vdb/blog/?id=1692</rss:link>
  <wfw:comment xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/mt-tb/Http/comments?id=1692</wfw:comment>
  <wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/blog/vdb/blog/gems/rsscomment.xml?:id=1692</wfw:commentRss>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2011-03-22T23:55:43Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">As a part of defining a benchmark audit for testing ACID properties on RDF stores, we will here examine different RDF scenarios where lack of concurrency control causes inconsistent results. In so doing, we consider common implementation techniques and their implications for locking (pessimistic) and multi-version (optimistic) concurrency control schemes. In the following, we will talk in terms of triples, but the discussion can be trivially generalized to quads. We will use numbers for IRIs and literals. In most implementations, the internal representation for these is indeed a number (or at least some data type that has a well defined collation order). For ease of presentation, we consider a single index with key parts SPO. Any other index-like setting with any possible key order will have similar issues. Insert (Create) and Delete INSERT and DELETE as defined in SPARQL are queries which generate a result set which is then used for instantiating triple patterns. We note that a DELETE may delete a triple which the DELETE has not read; thus the delete set is not a subset of the read set. The SQL equivalent is the DELETE FROM table WHERE key IN ( SELECT key1 FROM other_table ) expression, supposing it were implemented as a scan of other_table and an index lookup followed by DELETE on table. The meaning of INSERT is that the triples in question exist after the operation, and the meaning of DELETE is that said triples do not exist. In a transactional context, this means that the after-image of the transaction is guaranteed either to have or not-have said triples. Suppose that the triples { 1 0 0 }, { 1 5 6 }, and { 1 5 7 } exist in the beginning. If we DELETE { 1 ?x ?y } and concurrently INSERT { 1 2 4 . 1 2 3 . 1 3 5 }, then whichever was considered to be first by the concurrency control of the DBMS would complete first, and the other after that. 
Thus the end state would either have no triples with subject 1 or would have the three just inserted. Suppose the INSERT inserts the first triple, { 1 2 4 }. The DELETE at the same time reads all triples with subject 1. The exclusive read waits for the uncommitted INSERT. The INSERT then inserts the second triple, { 1 2 3 }. Depending on the isolation of the read, this either succeeds, since no { 1 2 3 } was read, or causes a deadlock. The first corresponds to REPEATABLE READ isolation; the second to SERIALIZABLE. We would not get the desired end-state of either all the inserted triples or no triples with subject 1 if the read or the DELETE were not serializable. Furthermore, if a DELETE template produced a triple that did not exist in the pre-image, the DELETE semantics still imply that this also does not exist in the after-image, which implies serializability. Read and Update Let us consider the prototypical transaction example of transferring funds from one account to another. Two balances are updated, and a history record is inserted. The initial state is { a balance 10 } and { b balance 10 }. We transfer 1 from a to b, and at the same time transfer 2 from b to a. The end state must have a at 11 and b at 9. A relational database needs REPEATABLE READ isolation for this. With RDF, txn1 reads that a has a balance of 10. At the same time, txn2 reads the balance of a. txn2 waits because the read of txn1 is exclusive. txn1 proceeds and reads the balance of b. It then updates the balance of a and b. All goes without the deadlock which is always cited in this scenario, because the locks are acquired in the same order. The act of updating the balance of a, since RDF does not really have an update-in-place, consists of deleting { a balance 10 } and inserting { a balance 9 }. This gets done and txn1 commits. At this point, txn2 proceeds after its wait on the row that stated { a balance 10 }. 
This row is now gone, and txn2 sees that a has no balance, which is quite possible in RDF&#39;s schema-less model. We see that REPEATABLE READ is not adequate with RDF, even though it is with relational. The reason why there is no UPDATE-in-place is that the PRIMARY KEY of the triple includes all the parts, including the object. Even in a RDBMS, an UPDATE of a primary key part amounts to a DELETE-plus-INSERT. One could here argue that an implementation might still UPDATE-in-place if the key order were not changed. This would resolve the special case of the accounts but not a more general case. Thus we see that the read of the balance must be SERIALIZABLE. This means that the read locks the space before the first balance, so that no insertion may take place. In this way the read of txn2 waits on the lock that is conceptually before the first possible match of { a balance ?x }. locking order and OLTP To implement TPC-C, I would update the table with the highest cardinality first, and then all tables in descending order of cardinality. In this way, the locks with the highest likelihood for contention are held for the least time. If locking multiple rows of a table, these should be locked in a deterministic order, e.g., lowest key-value first. In this way, the workload would not deadlock. In actual fact, with clusters and parallel execution, the lock acquisition will not be guaranteed to be serial, so deadlocks do not entirely go away, but still may get fewer. Besides, any outside transaction might still lock in the wrong order and cause deadlocks, which is why the OLTP application must in any case be built to deal with the possibility of deadlock. This is the conventional relational view of the matter. In more recent times, in-memory schemes with deterministic lock acquisition (Abadi VLDB 2010) or single-threaded atomic execution of transactions (Uni Munich BIRTE workshop at VLDB2010, VoltDB) have been proposed. 
There the transaction is described as a stored procedure, possibly with extra annotations. These techniques might apply to RDF also. RDF is however an unlikely model for transaction-intensive applications, so we will not for now examine these further. RDBMS usually implement row-level locking. This means that once a column of a row has an uncommitted state, any other transaction is prevented from changing the row. This has no ready RDF equivalent. RDF is usually implemented as a row-per-triple system and applying row-level locking to this does not give the semantic one expects of a relational row. I would argue that it is not essential to enforce transactional guarantees in units of rows. The guarantees must apply between data that is read and written by a transaction. It does not need to apply to columns that the transaction does not reference. To take the TPC-C example, the new order transaction updates the stock level and the delivery transaction updates the delivery count on the stock table. In practice, a delivery and a new order falling on the same row of stock will lock each other out, but nothing in the semantics of the workload mandates this. It does not seem a priori necessary to recreate the row as a unit of concurrency control in RDF. One could say that a multi-attribute whole (such as an address) ought to be atomic for concurrency control, but then applications updating addresses will most likely read and update all the fields together even if only the street name changes. Pessimistic Vs. Optimistic Concurrency Control We have so far spoken only in terms of row-level locking, which is to my knowledge the most widely used model in RDBMS, and one we implement ourselves. Some databases (e.g., MonetDB and VectorWise) implement optimistic concurrency control. 
The general idea is that each transaction has a read and write set and when a transaction commits, any other transactions whose read or write set intersects with the write set of the committing transaction are marked un-committable. Once a transaction thus becomes un-committable, it may presumably continue reading indefinitely but may no longer commit its updates. Optimistic concurrency is generally coupled with multi-version semantics where the pre-image of a transaction is a clean committed state of the database as of a specific point in time, i.e., snapshot isolation. To implement SERIALIZABLE isolation, i.e., the guarantee that if a transaction twice performs a COUNT the result will be the same, one locks also the row that precedes the set of selected rows and marks each lock so as to prevent an insert to the right of the lock in key order. The same thing may be done in an optimistic setting. Positional Handling of Updates in Column Stores [Heman, Zukowski, CWI science library] discusses management of multiple consecutive snapshots in some detail. The paper does not go into the details of different levels of isolation but nothing there suggests that serializability could not be supported. There is some complexity in marking the space between ordered rows as non-insertable across multiple versions but this should be feasible enough. The issue of optimistic Vs. pessimistic concurrency does not seem to be affected by the differences between RDF and relational models. We note that an OLTP workload can be made to run with very few transaction aborts (deadlocks) by properly ordering operations when using a locking scheme. The same does not work with optimistic concurrency since updates happen immediately and transaction aborts occur whenever the writes of one intersect the reads or writes of another, regardless of the order in which these were made. 
Developers seldom understand transactions; therefore a DBMS should, within the limits of the possible, optimize locking order for locking schemes. A simple example is locking in key order when doing an operation on a set of values. A more complex variant would consist of analyzing data dependencies in stored procedures and reordering updates so as to get the highest cardinality tables first. We note that this latter trick also benefits optimistic schemes. In RDF, the same principles apply but distinguishing cardinality of an updated set will have to rely on statistics of predicate cardinality. Such are anyhow needed for query optimization. Eventual Consistency Web scale systems that need to maintain consistent state across multiple data centers sometimes use &quot;eventual consistency&quot; schemes. Two-phase-commit becomes very inefficient as latency increases, thus strict transactional semantics have prohibitive cost if the system is more distributed than a cluster with a fast interconnect. Eventual consistency schemes (Amazon Dynamo, Yahoo! PNUTS) maintain history information on the record which is the unit of concurrency control. The record is typically a non-first normal form chunk of related data that it makes sense to store together from the application&#39;s viewpoint. Application logic can then be applied to reconciling differing copies of the same logical record. Such a scheme seems a priori ill-suited for RDF, where the natural unit of concurrency control would seem to be the quad. We first note that only recently changed quads (i.e., DELETEd + INSERTed, as there is no UPDATE-in-place) need history information. This history information can be stored away from the quad itself, thus not disrupting compression. 
When detecting that one site has INSERTed a quad that another has DELETEd in the same general time period, application logic can still be applied for reading related quads in order to arrive at a decision on how to reconcile two databases that have diverged. The same can apply to conflicting values of properties that for the application should be single-valued. Comparing time-stamped transaction logs on quads is not fundamentally different from comparing record histories in Dynamo or PNUTS. As we overcome the data size penalties that have until recently been associated with RDF, RDF becomes even more interesting as a data model for large online systems such as social network platforms where frequent application changes lead to volatility of schema. Key value stores are currently found in such applications, but they generally do not provide the query flexibility at which RDF excels. Conclusions We have gone over basic aspects of the endlessly complex and variable topic of transactions, and drawn parallels as well as outlined two basic differences between relational and RDF systems: What used to be REPEATABLE READ becomes SERIALIZABLE; and row-level locking becomes locking at the level of a single attribute value. For the rest, we see that the optimistic and pessimistic modes of concurrency control, as well as guidelines for writing transaction procedures, remain much the same. Based on this overview, it should be possible to design an ACID test for describing the ACID behavior of benchmarked systems. We do not intend to make transaction support a qualification requirement for an RDF benchmark, but information on transaction support will still be valuable in comparing different systems.</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>As a part of defining a benchmark audit for testing <a class="auto-href" href="http://dbpedia.org/resource/ACID" id="link-id0x1cfc6e38">ACID</a> properties on <a class="auto-href" href="http://dbpedia.org/resource/Resource_Description_Framework" id="link-id0x1f1302b8">RDF</a> stores, we will here examine different RDF scenarios where lack of concurrency control causes inconsistent results.  In so doing, we consider common implementation techniques and their implications for locking (pessimistic) and multi-version (optimistic) concurrency control schemes.</p>

<p>In the following, we will talk in terms of triples, but the discussion can be trivially generalized to quads.  We will use numbers for IRIs and literals.  In most implementations, the internal representation for these is indeed a number (or at least some <a class="auto-href" href="http://dbpedia.org/resource/Data" id="link-id0x1728a9a8">data</a> type that has a well defined collation order).  For ease of presentation, we consider a single index with key parts <code>SPO</code>.  Any other index-like setting with any possible key order will have similar issues. </p>

<h2>Insert (Create) and Delete </h2>

<p>
<code>INSERT</code> and <code>DELETE</code> as defined in <a class="auto-href" href="http://dbpedia.org/resource/SPARQL" id="link-id0x16dee7f8">SPARQL</a> are queries which generate a result set which is then used for instantiating triple patterns.  We note that a <code>DELETE</code> may delete a triple which the <code>DELETE</code> has not read; thus the delete set is not a subset of the read set.  The <a class="auto-href" href="http://dbpedia.org/resource/SQL" id="link-id0x1e3afb78">SQL</a> equivalent is the </p>

<blockquote>
 <code><pre>DELETE FROM table WHERE key IN 
   ( SELECT key1 FROM other_table )</pre>
 </code>
</blockquote>

<p>expression, supposing it were implemented as a scan of <code>other_table</code> and an index lookup followed by <code>DELETE</code> on table. </p>

<p>The meaning of <code>INSERT</code> is that the triples in question exist after the operation, and the meaning of <code>DELETE</code> is that said triples do not exist. In a transactional context, this means that the after-image of the transaction is guaranteed either to have or not-have said triples. </p>

<p>Suppose that the triples <code>{ 1 0 0 }</code>, <code>{ 1 5 6 }</code>, and <code>{ 1 5 7 }</code> exist in the beginning. If we <code>DELETE { 1 ?x ?y }</code> and concurrently <code>INSERT { 1 2 4 . 1 2 3 . 1 3 5 }</code>, then whichever was considered to be first by the concurrency control of the DBMS would complete first, and the other after that.  Thus the end state would either have no triples with subject <code>1</code> or would have the three just inserted. </p>

<p>Suppose the <code>INSERT</code> inserts the first triple, <code>{ 1 2 4 }</code>.  The <code>DELETE</code> at the same time reads all triples with subject <code>1</code>.  The exclusive read waits for the uncommitted <code>INSERT</code>.  The <code>INSERT</code> then inserts the second triple, <code>{ 1 2 3 }</code>. Depending on the isolation of the read, this either succeeds, since no <code>{ 1 2 3 }</code> was read, or causes a deadlock.  The first corresponds to <code>REPEATABLE READ</code> isolation; the second to <code>SERIALIZABLE</code>.</p>

<p>We would not get the desired end-state of either <i>all the inserted triples</i> or <i>no triples with subject <code>1</code></i> if the read or the <code>DELETE</code> were not serializable.</p>

<p>Furthermore if a <code>DELETE</code> template produced a triple that did not exist in the pre-image, the <code>DELETE</code> semantics still imply that this also does not exist in the after-image, which implies serializability.</p>
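
<p>As a minimal sketch, the two concurrent requests of the example could be written in SPARQL 1.1 Update syntax roughly as follows. The <code>ex:</code> namespace and the names <code>ex:n1</code> through <code>ex:n5</code> are hypothetical stand-ins for the numbers <code>1</code> through <code>5</code> used above; they are assumptions made for illustration only.</p>

<blockquote>
<code><pre>PREFIX ex: &lt;http://example.org/&gt;

# Request A: delete every triple whose subject is ex:n1
DELETE WHERE { ex:n1 ?x ?y }</pre></code>
</blockquote>

<blockquote>
<code><pre>PREFIX ex: &lt;http://example.org/&gt;

# Request B: insert the three new triples
INSERT DATA {
  ex:n1 ex:n2 ex:n4 .
  ex:n1 ex:n2 ex:n3 .
  ex:n1 ex:n3 ex:n5 .
}</pre></code>
</blockquote>

<p>If the two requests are serialized, a later query for triples with subject <code>ex:n1</code> should return either nothing (B ran first, then A) or exactly the three inserted triples (A ran first, then B).  Any other outcome, such as only some of the inserted triples surviving, indicates that the read or the <code>DELETE</code> was not serializable.</p>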


<h2>Read and Update</h2>

<p>Let us consider the prototypical transaction example of transferring funds from one account to another. Two balances are updated, and a history record is inserted.</p>

<p>The initial state is </p>

<blockquote>
<code><pre>a  balance  10
b  balance  10</pre></code>
</blockquote>

<p>We transfer <code>1</code> from <code>a</code> to <code>b</code>, and at the same time transfer <code>2</code> from <code>b</code> to <code>a</code>.  The end state must have <code>a</code> at <code>11</code> and <code>b</code> at <code>9</code>.</p>

<p>A relational database needs <code>REPEATABLE READ</code> isolation for this.</p>

<p>With RDF, <code>txn1</code> reads that <code>a</code> has a <code>balance</code> of <code>10</code>.   At the same time, <code>txn2</code> tries to read the <code>balance</code> of <code>a</code>.  <code>txn2</code> waits because the read of <code>txn1</code> is exclusive.  <code>txn1</code> proceeds and reads the <code>balance</code> of <code>b</code>.  It then updates the <code>balance</code> of <code>a</code> and <code>b</code>. </p>

<p>All goes without the deadlock which is always cited in this scenario, because the locks are acquired in the same order. The act of updating the balance of <code>a</code>, since RDF does not really have an update-in-place, consists of deleting <code>{ a balance 10 }</code> and inserting <code>{ a balance 9 }</code>.  This gets done and <code>txn1</code> commits. At this point, <code>txn2</code> proceeds after its wait on the row that stated <code>{ a balance 10 }</code>.  This row is now gone, and <code>txn2</code> sees that <code>a</code> has no balance, which is quite possible in RDF&#39;s <a class="auto-href" href="http://dbpedia.org/resource/Database_schema" id="link-id0x1ebb94c8">schema</a>-less model.</p>

<p>We see that <code>REPEATABLE READ</code> is not adequate with RDF, even though it is with relational. The reason why there is no <code>UPDATE</code>-in-place is that the <code>PRIMARY KEY</code> of the triple includes all the parts, including the object. Even in an <a class="auto-href" href="http://dbpedia.org/resource/Relational_database_management_system" id="link-id0x1ca86578">RDBMS</a>, an <code>UPDATE</code> of a primary key part amounts to a <code>DELETE</code>-plus-<code>INSERT</code>.  One could here argue that an implementation might still <code>UPDATE</code>-in-place if the key order were not changed.  This would resolve the special case of the accounts but not a more general case.</p>

<p>Thus we see that the read of the balance must be <code>SERIALIZABLE</code>.  This means that the read locks the space before the first balance, so that no insertion may take place.  In this way the read of <code>txn2</code> waits on the lock that is conceptually before the first possible match of <code>{ a balance ?x }</code>.</p>
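
<p>To make the delete-plus-insert nature of the update concrete, here is a sketch of the transfer made by <code>txn1</code> as a single SPARQL 1.1 Update request.  The <code>ex:</code> namespace and the property <code>ex:balance</code> are hypothetical, and the balances are assumed to be stored as numeric literals.  Whether two such requests are properly isolated from each other when run concurrently depends on the store and is not something the syntax itself guarantees.</p>

<blockquote>
<code><pre>PREFIX ex: &lt;http://example.org/&gt;

# Transfer 1 from ex:a to ex:b.
# There is no update-in-place: the old balance triples are deleted
# and new ones are inserted.
DELETE { ex:a ex:balance ?a_old .
         ex:b ex:balance ?b_old }
INSERT { ex:a ex:balance ?a_new .
         ex:b ex:balance ?b_new }
WHERE {
  ex:a ex:balance ?a_old .
  ex:b ex:balance ?b_old .
  BIND (?a_old - 1 AS ?a_new)
  BIND (?b_old + 1 AS ?b_new)
}</pre></code>
</blockquote>

<p>Note that if <code>ex:a</code> has no balance triple at the moment the <code>WHERE</code> clause is evaluated, the pattern matches nothing and the request silently changes nothing.  This is the situation <code>txn2</code> runs into above when its read is not serializable.</p>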


<h2>Locking Order and OLTP </h2>

<p>To implement <a class="auto-href" href="http://www.tpc.org/" id="link-id0x1e811d68">TPC</a>-<a class="auto-href" href="http://dbpedia.org/resource/C%2B%2B" id="link-id0x1df9c990">C</a>, I would update the table with the highest cardinality first, and then all tables in descending order of cardinality.  In this way, the locks with the highest likelihood for contention are held for the least time.  If locking multiple rows of a table, these should be locked in a deterministic order, e.g., lowest key-value first.  In this way, the workload would not deadlock.  In actual fact, with clusters and parallel execution, the lock acquisition will not be guaranteed to be serial, so deadlocks do not entirely go away, but they may still become less frequent.  Besides, any outside transaction might still lock in the wrong order and cause deadlocks, which is why the OLTP application must in any case be built to deal with the possibility of deadlock.</p>

<p>This is the conventional relational view of the matter.  In more recent times, in-memory schemes with deterministic lock acquisition (<a href="http://cs-www.cs.yale.edu/homes/dna/papers/determinism-vldb10.pdf" id="link-id0x1c5d9340">Abadi VLDB 2010</a>) or single-threaded atomic execution of transactions (<a href="http://bird.cs.tu-berlin.de:8008/birte2010/" id="link-id0x1ec0ed18">Uni Munich BIRTE workshop at VLDB2010</a>, <a href="http://www.voltdb.com/" id="link-id0x1ab6e380">VoltDB</a>) have been proposed. There the transaction is described as a stored procedure, possibly with extra annotations.  These techniques might apply to RDF also. RDF is however an unlikely model for transaction-intensive applications, so we will not for now examine these further.</p>

<p>RDBMS usually implement row-level locking.  This means that once a column of a row has an uncommitted state, any other transaction is prevented from changing the row.  This has no ready RDF equivalent. RDF is usually implemented as a row-per-triple system, and applying row-level locking to this does not give the semantics one expects of a relational row.  </p>

<p>I would argue that it is not essential to enforce transactional guarantees in units of rows.  The guarantees must apply between data that is <i>read</i> and <i>written</i> by a transaction.  They do not need to apply to columns that the transaction does not reference.  To take the TPC-C example, the <i>new order</i> transaction updates the stock level and the <i>delivery</i> transaction updates the delivery count on the stock table. In practice, a <i>delivery</i> and a <i>new order</i> falling on the same row of stock will lock each other out, but nothing in the semantics of the workload mandates this.</p>

<p>It does not seem <i>a priori</i> necessary to recreate the row as a unit of concurrency control in RDF.  One could say that a multi-attribute whole (such as an address) ought to be atomic for concurrency control, but then applications updating addresses will most likely read and update all the fields together even if only the street name changes.</p>


<h2>Pessimistic Vs. Optimistic Concurrency Control </h2>

<p>We have so far spoken only in terms of row-level locking, which is to my <a class="auto-href" href="http://dbpedia.org/resource/Knowledge" id="link-id0x1ebbf3f8">knowledge</a> the most widely used model in RDBMS, and one we implement ourselves.  Some databases (e.g., <a class="auto-href" href="http://dbpedia.org/resource/MonetDB" id="link-id0x1e771f48">MonetDB</a> and <a class="auto-href" href="http://www.ingres.com/vectorwise/" id="link-id0x1f3b4830">VectorWise</a>) implement optimistic concurrency control. The general idea is that each transaction has a read and write set and when a transaction commits, any other transactions whose read or write set intersects with the write set of the committing transaction are marked un-committable.  Once a transaction thus becomes un-committable, it may presumably continue reading indefinitely but may no longer commit its updates. Optimistic concurrency is generally coupled with multi-version semantics where the pre-image of a transaction is a clean committed state of the database as of a specific point in time, i.e., snapshot isolation.  </p>

<p>To implement <code>SERIALIZABLE</code> isolation, i.e., the guarantee that if a transaction twice performs a <code>COUNT</code> the result will be the same, one locks also the row that precedes the set of selected rows and marks each lock so as to prevent an insert to the right of the lock in key order.  The same thing may be done in an optimistic setting.</p>

<p>
  <a href="http://event.cwi.nl/SIGMOD-RWE/2010/22-7f15a1/paper.pdf" id="link-id0x1d5de810">Positional Handling of Updates in Column Stores</a> [Heman, Zukowski, <a class="auto-href" href="http://dbpedia.org/resource/National_Research_Institute_for_Mathematics_and_Computer_Science" id="link-id0x1e7644d8">CWI</a> science library] discusses management of multiple consecutive snapshots in some detail. The paper does not go into the details of different levels of isolation but nothing there suggests that serializability could not be supported.  There is some complexity in marking the space between ordered rows as non-insertable across multiple versions but this should be feasible enough. </p>

<p>The issue of optimistic Vs. pessimistic concurrency does not seem to be affected by the differences between RDF and relational models.  We note that an OLTP workload can be made to run with very few transaction aborts (deadlocks) by properly ordering operations when using a locking scheme.  The same does not work with optimistic concurrency since updates happen immediately and transaction aborts occur whenever the writes of one intersect the reads or writes of another, regardless of the order in which these were made.</p>

<p>Developers seldom understand transactions; therefore DBMS should, within the limits of the possible, optimize locking order for locking schemes.  A simple example is locking in key order when doing an operation on a set of values.  A more complex variant would consist of analyzing data dependencies in stored procedures and reordering updates so as to get the highest cardinality tables first.  We note that this latter trick also benefits optimistic schemes.</p>

<p>In RDF, the same principles apply but distinguishing cardinality of an updated set will have to rely on statistics of predicate cardinality. Such are anyhow needed for query <a class="auto-href" href="http://dbpedia.org/resource/Program_optimization" id="link-id0x1f05c1a8">optimization</a>.</p>

<h2>Eventual Consistency </h2>

<p>Web scale systems that need to maintain consistent state across multiple data centers sometimes use &quot;eventual consistency&quot; schemes.  <a class="auto-href" href="http://dbpedia.org/resource/Two-phase_commit_protocol" id="link-id0x1cebd340">Two-phase-commit</a> becomes very inefficient as latency increases, thus strict transactional semantics have prohibitive cost if the system is more distributed than a cluster with a fast interconnect.</p>

<p>Eventual consistency schemes (<a href="http://dbpedia.org/page/Dynamo_(storage_system)" id="link-id0x1f9db8f8">Amazon Dynamo</a>, <a href="http://research.yahoo.com/project/212" id="link-id0x1da3db80">Yahoo! PNUTS</a>) maintain history <a class="auto-href" href="http://dbpedia.org/resource/Information" id="link-id0x1ec4dbc8">information</a> on the record which is the unit of concurrency control.  The record is typically a non-first normal form chunk of related data that it makes sense to store together from the application&#39;s viewpoint.  Application logic can then be applied to reconciling differing copies of the same logical record. </p>

<p>Such a scheme seems <i>a priori</i> ill-suited for RDF, where the natural unit of concurrency control would seem to be the quad.  We first note that only recently changed quads (i.e., <code>DELETEd + INSERTed</code>, as there is no <code>UPDATE</code>-in-place) need history information.  This history information can be stored away from the quad itself, thus not disrupting compression.  When detecting that one site has <code>INSERTed</code> a quad that another has <code>DELETEd</code> in the same general time period, application logic can still be applied for reading related quads in order to arrive at a decision on how to reconcile two databases that have diverged.  The same can apply to conflicting values of properties that for the application should be single-valued.  Comparing time-stamped transaction logs on quads is not fundamentally different from comparing record histories in Dynamo or PNUTS.</p>

<p>As we overcome the data size penalties that have until recently been associated with RDF, RDF becomes even more interesting as a data model for large online systems such as social network platforms where frequent application changes lead to volatility of schema.  Key value stores are currently found in such applications, but they generally do not provide the query flexibility at which RDF excels. </p>


<h2>Conclusions </h2>

<p>We have gone over basic aspects of the endlessly complex and variable topic of transactions, and drawn parallels as well as outlined two basic differences between relational and RDF systems: What used to be <code>REPEATABLE READ</code> becomes <code>SERIALIZABLE</code>; and row-level locking becomes locking at the level of a single attribute value.  For the rest, we see that the optimistic and pessimistic modes of concurrency control, as well as guidelines for writing transaction procedures, remain much the same.</p>

<p>Based on this overview, it should be possible to design an ACID test for describing the ACID behavior of benchmarked systems.  We do not intend to make transaction support a qualification requirement for an RDF benchmark, but information on transaction support will still be valuable in comparing different systems.</p>

]]></content:encoded>
  <dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Virtuso Data Space Bot &lt;kidehen@openlinksw.com&gt;</dc:creator>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/vdb/blog/?id=1690">
  <rss:title>RDF and Transactions</rss:title>
  <rss:link>http://www.openlinksw.com/blog/vdb/blog/?id=1690</rss:link>
  <wfw:comment xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/mt-tb/Http/comments?id=1690</wfw:comment>
  <wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/blog/vdb/blog/gems/rsscomment.xml?:id=1690</wfw:commentRss>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2011-03-22T22:52:56Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">I will here talk about RDF and transactions for developers in general. The next one talks about specifics and is for specialists. Transactions are certainly not the first thing that comes to mind when one hears &quot;RDF&quot;. We have at times used a recruitment questionnaire where we ask applicants to define a transaction. Many vaguely remember that it is a unit of work, but usually not more than that. We sometimes get questions from users about why they get an error message that says &quot;deadlock&quot;. &quot;Deadlock&quot; is what happens when multiple users concurrently update balances on multiple bank accounts in the wrong order. What does this have to do with RDF? There are in fact users who even use XA with a Virtuoso-based RDF application. Franz also has publicized their development of full ACID capabilities for AllegroGraph. RDF is a database schema model, and transactions will inevitably become an issue in databases. At the same time, the developer population trained with MySQL and PHP is not particularly transaction-aware. Transactions have gone out of style, declares the No-SQL crowd. Well, it is not so much SQL they object to but ACID, i.e., transactional guarantees. We will talk more about this in the next post. The SPARQL language and protocol do not go into transactions, except for expressing the wish that an UPDATE request to an end-point be atomic. But beware -- atomicity is a gateway drug, and soon one finds oneself on full ACID. If one says that a thing will either happen in its entirety or not at all, which is what (A) atomicity means, then the question arises of (I) isolation; that is, what happens if somebody else does something to the same data at the same time? Then comes the question of whether a thing, once having happened, will stay that way; i.e., (D) durability. 
Finally, there is (C) consistency, which means that the transaction&#39;s result must not contradict restrictions the database is supposed to enforce. RDF usually has no restrictions; thus consistency mostly means that the internal state of the DBMS must be consistent, e.g., different indices on triples/quads should contain the same data. There are, of course, database-like consistency criteria that one can express in RDF Schema and OWL, concerning data types, mandatory presence of properties, or restrictions on cardinality (i.e., one may only have one spouse at a time, and the like). If one indeed did enforce them all, then RDF would be very like the relational model -- with all the restrictions, but without the 40 years of work on RDBMS performance. For this reason, RDF use tends to involve data that is not structured enough to be a good fit for RDBMS. There is of course the OWL side, where consistency is important but is defined in such complex ways that they again are not a good fit for RDBMS. RDF could be seen to be split between the schema-last world and the knowledge representation world. I will here focus on the schema-last side. Transactions are relevant in RDF in two cases: 1. If data is trickle loaded in small chunks, one likes to know that the chunks do not get lost or corrupted; 2. If the application has any semantics that reserve resources, then these operations need transactions. The latter is not so common with RDF but examples include read-write situations, like checking if a seat is available and then reserving it. Transactionality guarantees that the same seat does not get reserved twice. Web people argue with some justification that since the four cardinal virtues of database never existed on the web to begin with, applying strict ACID to web data is beside the point, like locking the stable after the horse has long since run away. 
This may be so; yet the systems used for processing data, whether that data is dirty or not, benefit from predictable operation under concurrency and from not losing data. Analytics workloads are not primarily about transactions, but still need to specify what happens with updates. Analyzing data from measurements may not have concurrent updates, but there the transaction issue is replaced by the question of making explicit how the data was acquired and what processing has been applied to it before storage. As mentioned before, the LOD2 project is at the crossroads of RDF and database. I construe its mission to be the making of RDF into a respectable database discipline. Database respectability in turn is as good as inconceivable without addressing the very bedrock on which this science was founded: transactions. As previously argued, we need well-defined and auditable benchmarks. This again brings up the topic of transactions. Once we embark on the database benchmark route, there is no way around this. TPC-H mandates that the system under test support transactions, and the audit involves a test for this. We can do no less. This has led me to more closely examine the issue of RDF and transactions, and whether there exist differences between transactions applied to RDF and to relational data. As concerns Virtuoso, our position has been that one can get full ACID in Virtuoso, whether in SQL or SPARQL, by using a connected client (e.g., ODBC, JDBC, or the Jena or Sesame frameworks), and setting the isolation options on the connection. Having taken this step, one then must take the next step, which consists of dealing with deadlocks; i.e., with concurrent utilization, it may happen that the database at any time notifies the client that the transaction got aborted and the client must retry. Web developers especially do not like this, because this is not what MySQL has taught them to expect. 
MySQL does have transactional back-ends like InnoDB, but often gets used without transactions. With the March 2011 Virtuoso releases, we have taken a closer look at transactions with RDF. It is more practical to reduce the possibility of errors than to require developers to pay attention. For this reason we have automated isolation settings for RDF, greatly reduced the incidence of deadlocks, and even incorporated automatic deadlock retries where applicable. If all users lock resources they need in the same order, there will be no deadlocks. This is what we do with RDF load in Virtuoso 7; thus any mix of concurrent INSERTs and DELETEs, if these are under a certain size (normally 10000 quads) are guaranteed never to fail due to locking. These could still fail due to running out of space, though. With previous versions, there always was a possibility of having an INSERT or DELETE fail because of deadlock with multiple users. Vectored INSERT and DELETE are sufficient for making web crawling or archive maintenance practically deadlock free, since there the primary transaction is the INSERT or DELETE of a small graph. Furthermore, since the SPARQL protocol has no way of specifying transactions consisting of multiple client-server exchanges, the SPARQL end-point may deal with deadlocks by itself. If all else fails, it can simply execute requests one after the other, thus eliminating any possibility of locking. We note that many statements will be intrinsically free of deadlocks by virtue of always locking in key order, but this cannot be universally guaranteed with arbitrary size operations; thus concurrent operations might still sometimes deadlock. Anyway, vectored execution as introduced in Virtuoso 7, besides getting easily double-speed random access, also greatly reduces deadlocks by virtue of ordering operations. In the next post we will talk about what transactions mean with RDF and whether there is any difference with the relational model.</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>I will here talk about <a class="auto-href" href="http://dbpedia.org/resource/Resource_Description_Framework" id="link-id0x249bc940">RDF</a> and transactions for developers in general. The next post talks about specifics and is for specialists.</p>

<p>Transactions are certainly not the first thing that comes to mind when one hears &quot;RDF&quot;.  We have at times used a recruitment questionnaire where we ask applicants to define a transaction.  Many vaguely remember that it is a unit of work, but usually not more than that.  We sometimes get questions from users about why they get an error message that says &quot;deadlock&quot;.  &quot;Deadlock&quot; is what happens when multiple users concurrently update balances on multiple bank accounts in the wrong order.  What does this have to do with RDF?</p>

<p>There are in fact users who even use XA with a <a class="auto-href" href="http://virtuoso.openlinksw.com" id="link-id0x22c8dbc8">Virtuoso</a>-based RDF application.  <a class="auto-href" href="http://semanticweb.org/id/Franz_Inc" id="link-id0x27bd0c08">Franz</a> also has publicized their development of full <a class="auto-href" href="http://dbpedia.org/resource/ACID" id="link-id0x283985c8">ACID</a> capabilities for <a class="auto-href" href="http://semanticweb.org/id/AllegroGraph" id="link-id0x238ba438">AllegroGraph</a>.  RDF is a database <a class="auto-href" href="http://dbpedia.org/resource/Database_schema" id="link-id0x2864fef8">schema</a> model, and transactions will inevitably become an issue in databases.</p>

<p>At the same time, the developer population trained with <a class="auto-href" href="http://dbpedia.org/resource/MySQL" id="link-id0x284d2d80">MySQL</a> and <a class="auto-href" href="http://dbpedia.org/resource/PHP" id="link-id0x237230e8">PHP</a> is not particularly transaction-aware.  Transactions have gone out of style, declares the No-<a class="auto-href" href="http://dbpedia.org/resource/SQL" id="link-id0x2920cc88">SQL</a> crowd.  Well, it is not so much SQL they object to but ACID, i.e., transactional guarantees. We will talk more about this in the next post.  The <a class="auto-href" href="http://dbpedia.org/resource/SPARQL" id="link-id0x283f0588">SPARQL</a> language and protocol do not go into transactions, except for expressing the wish that an <code>UPDATE</code> request to an end-point be atomic. But beware -- atomicity is a gateway drug, and soon one finds oneself on full ACID.  </p>

<p>If one says that a thing will either happen <i>in its entirety</i> or <i>not at all,</i> which is what (A) atomicity means, then the question arises of (I) isolation; that is, what happens if somebody else does something to the same <a class="auto-href" href="http://dbpedia.org/resource/Data" id="link-id0x238280f8">data</a> at the same time?  Then comes the question of whether a thing, once having happened, will stay that way; i.e., (D) durability. Finally, there is (<a class="auto-href" href="http://dbpedia.org/resource/C%2B%2B" id="link-id0x276714b8">C</a>) consistency, which means that the transaction&#39;s result must not contradict restrictions the database is supposed to enforce.  RDF usually has no restrictions; thus consistency mostly means that the internal state of the DBMS must be consistent, e.g., different indices on triples/quads should contain the same data.</p>

<p>There are, of course, database-like consistency criteria that one can express in RDF Schema and <a class="auto-href" href="http://dbpedia.org/resource/Web_Ontology_Language" id="link-id0x28625a90">OWL</a>, concerning data types, mandatory presence of properties, or restrictions on cardinality (e.g., one may only have one spouse at a time, and the like).  </p>

<p>If one indeed did enforce them all, then RDF would be very like the relational model -- with all the restrictions, but without the 40 years of work on <a class="auto-href" href="http://dbpedia.org/resource/Relational_database_management_system" id="link-id0x249bf4f8">RDBMS</a> performance.  For this reason, RDF use tends to involve data that is not structured enough to be a good fit for RDBMS.</p>

<p>There is of course the OWL side, where consistency is important but is defined in such complex ways that it again is not a good fit for RDBMS.  RDF could be seen to be split between the schema-last world and the <a class="auto-href" href="http://dbpedia.org/resource/Knowledge" id="link-id0x249504f8">knowledge</a> representation world.  I will here focus on the schema-last side.</p>

<p>Transactions are relevant in RDF in two cases: 1. If data is trickle loaded in small chunks, one likes to know that the chunks do not get lost or corrupted; 2. If the application has any semantics that reserve resources, then these operations need transactions.  The latter is not so common with RDF but examples include read-write situations, like checking if a seat is available and then reserving it. Transactionality guarantees that the same seat does not get reserved twice.</p>
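
<p>As an illustration of the second case, the check-then-reserve step could be sketched as one SPARQL 1.1 Update request.  The <code>ex:</code> namespace, the class <code>ex:Seat</code>, the property <code>ex:reservedBy</code>, and the names <code>ex:seat12A</code> and <code>ex:alice</code> are all hypothetical and serve only as an example.</p>

<blockquote>
<code><pre>PREFIX ex: &lt;http://example.org/&gt;

# Reserve the seat only if nobody holds a reservation on it yet.
INSERT { ex:seat12A ex:reservedBy ex:alice }
WHERE {
  ex:seat12A a ex:Seat .
  FILTER NOT EXISTS { ex:seat12A ex:reservedBy ?someone }
}</pre></code>
</blockquote>

<p>Without isolation, two such requests for different customers could both evaluate the <code>FILTER</code> before either <code>INSERT</code> becomes visible, and the seat would end up with two reservations.  Preventing exactly this is what the transactional guarantee is for.</p>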

<p>Web people argue with some justification that since the four cardinal virtues of database never existed on the web to begin with, applying strict ACID to web data is beside the point, like locking the stable after the horse has long since run away.  This may be so; yet the systems used for processing data, whether that data is dirty or not, benefit from predictable operation under concurrency and from not losing data.</p>

<p>Analytics workloads are not primarily about transactions, but still need to specify what happens with updates.  Analyzing data from measurements may not have concurrent updates, but there the transaction issue is replaced by the question of making explicit how the data was acquired and what processing has been applied to it before storage.</p>


<p>As mentioned before, the <a class="auto-href" href="http://lod2.eu/" id="link-id0x27d952d0">LOD2</a> project is at the crossroads of RDF and database.  I construe its mission to be the making of RDF into a respectable database discipline.  Database respectability in turn is as good as inconceivable without addressing the very bedrock on which this science was founded: transactions.</p>

<p>As previously argued, we need well-defined and auditable benchmarks.  This again brings up the topic of transactions.  Once we embark on the database benchmark route, there is no way around this. <a class="auto-href" href="http://www.tpc.org/" id="link-id0x2359d2d0">TPC</a>-<a class="auto-href" href="http://dbpedia.org/resource/TPC-H" id="link-id0x28edb770">H</a> mandates that the system under test support transactions, and the audit involves a test for this.  We can do no less.</p>

<p>This has led me to more closely examine the issue of RDF and transactions, and whether there exist differences between transactions applied to RDF and to relational data.  </p>

<p>As concerns Virtuoso, our position has been that one can get full ACID in Virtuoso, whether in SQL or SPARQL, by using a connected client (e.g., <a class="auto-href" href="http://dbpedia.org/resource/Open_Database_Connectivity" id="link-id0x23a55698">ODBC</a>, <a class="auto-href" href="http://dbpedia.org/resource/Java_Database_Connectivity" id="link-id0x235cecf0">JDBC</a>, or the <a class="auto-href" href="http://jena.sourceforge.net/" id="link-id0x23213900">Jena</a> or <a class="auto-href" href="http://sourceforge.net/projects/sesame/" id="link-id0x277874d0">Sesame</a> frameworks), and setting the isolation options on the connection.  Having taken this step, one then must take the next step, which consists of dealing with deadlocks; i.e., with concurrent utilization, it may happen that the database at any time notifies the client that the transaction got aborted and the client must retry.</p>

<p>Web developers especially do not like this, because this is not what MySQL has taught them to expect. MySQL does have transactional back-ends like InnoDB, but often gets used without transactions.</p>

<p>With the March 2011 Virtuoso releases, we have taken a closer look at transactions with RDF.  It is more practical to reduce the possibility of errors than to require developers to pay attention. For this reason we have automated isolation settings for RDF, greatly reduced the incidence of deadlocks, and even incorporated automatic deadlock retries where applicable.</p>

<p>If all users lock resources they need in the same order, there will be no deadlocks.  This is what we do with RDF load in Virtuoso 7; thus any mix of concurrent <code>INSERTs</code> and <code>DELETEs</code>, if these are under a certain size (normally 10000 quads), is guaranteed never to fail due to locking.  These could still fail due to running out of space, though. With previous versions, there always was a possibility of having an <code>INSERT</code> or <code>DELETE</code> fail because of deadlock with multiple users.  Vectored <code>INSERT</code> and <code>DELETE</code> are sufficient for making web crawling or archive maintenance practically deadlock free, since there the primary transaction is the <code>INSERT</code> or <code>DELETE</code> of a small graph. </p>

<p>Furthermore, since the <a class="auto-href" href="http://www.w3.org/TR/rdf-sparql-protocol/" id="link-id0x23eadf50">SPARQL protocol</a> has no way of specifying transactions consisting of multiple client-server exchanges, the SPARQL end-point may deal with deadlocks by itself.  If all else fails, it can simply execute requests one after the other, thus eliminating any possibility of lock conflicts.  We note that many statements will be intrinsically free of deadlocks by virtue of always locking in key order, but this cannot be universally guaranteed with arbitrary size operations; thus concurrent operations might still sometimes deadlock.  Anyway, vectored execution as introduced in Virtuoso 7, besides easily doubling the speed of random access, also greatly reduces deadlocks by virtue of ordering operations.</p>

<p>In the next post we will talk about what transactions mean with RDF and whether there is any difference with the relational model.</p>]]></content:encoded>
  <dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Virtuso Data Space Bot &lt;kidehen@openlinksw.com&gt;</dc:creator>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/vdb/blog/?id=1688">
  <rss:title>Benchmarks, Redux (part 15): BSBM Test Driver Enhancements</rss:title>
  <rss:link>http://www.openlinksw.com/blog/vdb/blog/?id=1688</rss:link>
  <wfw:comment xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/mt-tb/Http/comments?id=1688</wfw:comment>
  <wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/blog/vdb/blog/gems/rsscomment.xml?:id=1688</wfw:commentRss>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2011-03-22T22:32:28Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">This article covers the changes we have made to the BSBM test driver during our series of experiments. Drill-down mode - For queries that have a product type as parameter, the test driver will invoke the query multiple times, each time with a random subtype of the product type of the previous invocation. The starting point of the drill-down is a random type from a settable level in the hierarchy. The rationale for the drill-down mode is that depending on the parameter choice, there can be 1000x differences in query run time. Thus run times of consecutive query mixes will be incomparable unless we guarantee that each mix has a predictable number of queries with a product type from each level in the hierarchy. Permutation of query mix - In the BI workload, the queries are run in a random order on each thread in multiuser mode. Doing exactly the same thing on many threads is not realistic for large queries. The data access patterns must be spread out in order to evaluate how bulk IO is organized with differing concurrent demands. The permutations are deterministic on consecutive runs and do not depend on the non-deterministic timing of concurrent activities. For queries with a drill-down, the individual executions that make up the drill-down are still consecutive. New metrics - The BI Power is the geometric mean of query run times scaled to queries per hour and multiplied by the scale factor, where 100 Mt is considered the unit scale. The BI Throughput is the arithmetic mean of the run times scaled to QPH and adjusted to scale as with the Power metric. These are analogous to the TPC-H Power and Throughput metrics. The Power is defined as (scale_factor / 284826) * 3600 / ((t0 * t1 * ... * tn) ^(1 / n)) The Throughput is defined as (scale_factor / 284826) * 3600 / ((t0 + t1 + ... + tn) / n) The magic number 284826 is the scale that generates approximately 100 million triples (100 Mt). 
We consider this &quot;scale one.&quot; The reason for the multiplication is that scores at different scales should get similar numbers; otherwise a 10x larger scale would result in roughly 10x lower throughput with the BI queries. We also show the percentage each query represents of the total time the test driver waits for responses. Deadlock retry - When running update mixes, it is possible that a transaction gets aborted by a deadlock. We have added retry logic for this. Cluster mode - Cluster databases may have multiple interchangeable HTTP listeners. With this mode, one can specify multiple end-points so a multi-user workload can divide itself evenly over these. Identifying matter - A version number was added to test driver output. Use of the new switches is also indicated in the test driver output. SUT CPU - In comparing results it is crucial to differentiate between in-memory runs and I/O-bound runs. To make this easier, we have added an option to report server CPU times over the timed portion (excluding warm-ups). A pluggable self-script determines the CPU times for the system; thus clusters can be handled, too. The time is given as a sum of the CPU time the server processes have consumed during the run and as a percentage over the wall-clock time. These changes will soon be available as a diff and as a source tree. This version is labeled BSBM Test Driver 1.1-opl; the -opl signifies OpenLink additions. We invite FU Berlin to include these enhancements in their SourceForge repository of the BSBM test driver. There is more precise documentation of these options in the README file in the above distribution. The next planned upgrade of the test driver concerns adding support for &quot;RDF-H&quot;, the RDF adaptation of the industry standard TPC-H decision support benchmark for RDBMS. 
Benchmarks, Redux Series Benchmarks, Redux (part 1): On RDF Benchmarks Benchmarks, Redux (part 2): A Benchmarking Story Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs Benchmarks, Redux (part 6): BSBM and I/O, continued Benchmarks, Redux (part 7): What Does BSBM Explore Measure? Benchmarks, Redux (part 8): BSBM Explore and Update Benchmarks, Redux (part 9): BSBM With Cluster Benchmarks, Redux (part 10): LOD2 and the Benchmark Process Benchmarks, Redux (part 11): The Substance of Benchmarks Benchmarks, Redux (part 12): Our Own BSBM Results Report Benchmarks, Redux (part 13): BSBM BI Modifications Benchmarks, Redux (part 14): BSBM BI Mix Benchmarks, Redux (part 15): BSBM Test Driver Enhancements (this post)</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>This article covers the changes we have made to the <a class="auto-href" href="http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html" id="link-id0x2361bf18">BSBM</a> test driver during our series of experiments.</p>

<ul>
 <li>
  <p>
    <b>Drill-down mode</b> - For queries that have a product type as parameter, the test driver will invoke the query multiple times, each time with a random subtype of the product type of the previous invocation. The starting point of the drill-down is a random type from a settable level in the hierarchy.  The rationale for the drill-down mode is that depending on the parameter choice, there can be 1000x differences in query run time.  Thus run times of consecutive query mixes will be incomparable unless we guarantee that each mix has a predictable number of queries with a product type from each level in the hierarchy.</p>
 </li>

<li>
  <b>Permutation of query mix</b> - In the BI workload, the queries are run in a random order on each thread in multiuser mode.  Doing exactly the same thing on many threads is not realistic for large queries. The <a class="auto-href" href="http://dbpedia.org/resource/Data" id="link-id0x2834cec8">data</a> access patterns must be spread out in order to evaluate how bulk IO is organized with differing concurrent demands. The permutations are deterministic on consecutive runs and do not depend on the non-deterministic timing of concurrent activities.  For queries with a drill-down, the individual executions that make up the drill-down are still consecutive.</li>

<li>
  <p>
    <b>New metrics</b> - The BI Power is the geometric mean of query run times scaled to queries per hour and multiplied by the scale factor, where 100 Mt is considered the unit scale. The BI Throughput is the arithmetic mean of the run times scaled to QPH and adjusted to scale as with the Power metric. These are analogous to the <a class="auto-href" href="http://www.tpc.org/" id="link-id0x236c5158">TPC</a>-<a class="auto-href" href="http://dbpedia.org/resource/TPC-H" id="link-id0x28814950">H</a> Power and Throughput metrics. </p>
<p>The <i>Power</i> is defined as</p> 
<blockquote>(scale_factor / 284826) *  3600 / ((t0 * t1 * ... * tn) ^(1 / n)) </blockquote>
<p>The <i>Throughput</i> is defined as</p> 
<blockquote>(scale_factor / 284826) *  3600 / ((t0 + t1 + ... +  tn) / n)</blockquote>
<p>The magic number 284826 is the scale that generates approximately 100 million triples (100 Mt).  We consider this &quot;scale one.&quot;  The reason for the multiplication is that scores at different scales should get similar numbers; otherwise a 10x larger scale would result in roughly 10x lower throughput with the BI queries.</p>

<p>We also show the percentage each query represents of the total time the test driver waits for responses.</p>
</li>

<li>
  <p>
    <b>Deadlock retry</b> - When running update mixes, it is possible that a transaction gets aborted by a deadlock.  We have added retry logic for this.</p>
</li>

<li>
  <p>
    <b>Cluster mode</b> - Cluster databases may have multiple interchangeable <a class="auto-href" href="http://dbpedia.org/resource/Hypertext_Transfer_Protocol" id="link-id0x240f9008">HTTP</a> listeners.  With this mode, one can specify multiple end-points so a multi-user workload can divide itself evenly over these.</p>
</li>

<li>
  <p>
    <b>Identifying matter</b> - A version number was added to test driver output.  Use of the new switches is also indicated in the test driver output.</p>
</li>

<li>
  <p>
    <b>SUT <a class="auto-href" href="http://dbpedia.org/resource/Central_processing_unit" id="link-id0x249b7208">CPU</a></b> - In comparing results it is crucial to differentiate between in-memory runs and I/O-bound runs.  To make this easier, we have added an option to report server CPU times over the timed portion (excluding warm-ups).  A pluggable self-script determines the CPU times for the system; thus clusters can be handled, too.  The time is given as a sum of the CPU time the server processes have consumed during the run and as a percentage over the wall-clock time.</p>
</li>
</ul>
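<p>The Power and Throughput definitions given under <b>New metrics</b> can be written out as a short sketch. The function names and the <code>UNIT_SCALE</code> constant are our own labels for illustration and are not taken from the test driver source. The sketch also assumes that the exponent and divisor are the number of run times supplied, and that the run times are in seconds.</p>

```python
import math

# Scale factor that generates roughly 100 Mt; treated as "scale one" in the text.
UNIT_SCALE = 284826


def bi_power(scale_factor, times):
    """Queries per hour based on the geometric mean of run times, adjusted to scale."""
    # Geometric mean computed via logarithms to avoid overflow in a long product.
    geometric_mean = math.exp(sum(math.log(t) for t in times) / len(times))
    return (scale_factor / UNIT_SCALE) * 3600 / geometric_mean


def bi_throughput(scale_factor, times):
    """Queries per hour based on the arithmetic mean of run times, adjusted to scale."""
    arithmetic_mean = sum(times) / len(times)
    return (scale_factor / UNIT_SCALE) * 3600 / arithmetic_mean
```

<p>For example, with run times of 1 and 4 seconds at scale factor 284826, the geometric mean is 2 seconds and the arithmetic mean is 2.5 seconds, giving a Power of about 1800 and a Throughput of about 1440.  At ten times the scale factor with ten times the run times, both scores stay the same, which is the purpose of the multiplication by scale.</p>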

<p>These changes will soon be available <a href="http://blogs.usnet.private:8893/RPC2" id="link-id0x1f9a57c0">as a diff</a> and <a href="http://blogs.usnet.private:8893/RPC2" id="link-id0x1f2fea08">as a source tree</a>. This version is labeled <b><code>BSBM Test Driver 1.1-opl</code></b>; the <b><code>-opl</code></b> signifies OpenLink additions.  </p>

<p>We invite FU Berlin to include these enhancements in their SourceForge repository of the BSBM test driver.  There is more precise documentation of these options in the README file in the above distribution.</p>

<p>The next planned upgrade of the test driver concerns adding support for &quot;<a class="auto-href" href="http://dbpedia.org/resource/Resource_Description_Framework" id="link-id0x2865ac68">RDF</a>-H&quot;, the RDF adaptation of the industry standard TPC-H decision support benchmark for <a class="auto-href" href="http://dbpedia.org/resource/Relational_database_management_system" id="link-id0x23597bb0">RDBMS</a>.</p>



<h3>
<i>Benchmarks, Redux</i> Series</h3>
<ul>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1658" id="link-id0x1db2be00">Benchmarks, Redux (part 1): On RDF Benchmarks</a>
</li>

<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1660" id="link-id0x1dfcc038">Benchmarks, Redux (part 2): A Benchmarking Story</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1663" id="link-id0x197c26d0">Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1665" id="link-id0x1d149cf0">Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1667" id="link-id0x1ab69450">Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1669" id="link-id0x1e67d688">Benchmarks, Redux (part 6): BSBM and I/O, continued</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1671" id="link-id0x1dad87c8">Benchmarks, Redux (part 7): What Does BSBM Explore Measure?</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1673" id="link-id0x1cc73830">Benchmarks, Redux (part 8): BSBM Explore and Update </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1675" id="link-id0x1d6879a8">Benchmarks, Redux (part 9): BSBM With Cluster</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1677" id="link-id0x1dfae510">Benchmarks, Redux (part 10): LOD2 and the Benchmark Process</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1678" id="link-id0x1ef052a0">Benchmarks, Redux (part 11): The Substance of Benchmarks</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1dadddb0">Benchmarks, Redux (part 12): Our Own BSBM Results Report</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1e662ef0">Benchmarks, Redux (part 13): BSBM BI Modifications </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1df6fa70">Benchmarks, Redux (part 14): BSBM BI Mix </a>
</li>
<li>
Benchmarks, Redux (part 15): BSBM Test Driver Enhancements <i>(this post)</i>
</li>
</ul>]]></content:encoded>
  <dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Virtuso Data Space Bot &lt;kidehen@openlinksw.com&gt;</dc:creator>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/vdb/blog/?id=1687">
  <rss:title>Benchmarks, Redux (part 14): BSBM BI Mix</rss:title>
  <rss:link>http://www.openlinksw.com/blog/vdb/blog/?id=1687</rss:link>
  <wfw:comment xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/mt-tb/Http/comments?id=1687</wfw:comment>
  <wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/blog/vdb/blog/gems/rsscomment.xml?:id=1687</wfw:commentRss>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2011-03-22T22:31:32Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">In this post, we look at how we run the BSBM-BI mix. We consider the 100 Mt and 1000 Mt scales with Virtuoso 7 using the same hardware and software as in the previous posts. The changes to workload and metric are given in the previous post. Our intent here is to look at whether the metric works, and to see what results will look like in general. We are as much testing the benchmark as we are testing the system-under-test (SUT). The results shown here will likely not be comparable with future ones because we will most likely change the composition of the workload since it seems a bit out of balance. Anyway, for the sake of disclosure, we attach the query templates. The test driver we used will be made available soon, so the interested may still try a comparison with their systems. If you practice with this workload for the coming races, the effort will surely not be wasted. Once we have come up with a rules document, we will redo all that we have published so far by-the-book, and have it audited as part of the LOD2 service we plan for this (see previous posts in this series). This will introduce comparability; but before we get that far with the BI workload, the workload needs to evolve a bit. Below we show samples of test driver output; the whole output is downloadable. 
100 Mt Single User bsbm/testdriver -runs 1 -w 0 -idir /bs/1 -drill \ -ucf bsbm/usecases/businessIntelligence/sparql.txt \ -dg http://bsbm.org http://localhost:8604/sparql 0: 43348.14ms, total: 43440ms Scale factor: 284826 Explore Endpoints: 1 Update Endpoints: 1 Drilldown: on Number of warmup runs: 0 Seed: 808080 Number of query mix runs (without warmups): 1 times min/max Querymix runtime: 43.3481s / 43.3481s Elapsed runtime: 43.348 seconds QMpH: 83.049 query mixes per hour CQET: 43.348 seconds average runtime of query mix CQET (geom.): 43.348 seconds geometric mean runtime of query mix AQET (geom.): 0.492 seconds geometric mean runtime of query Throughput: 1494.874 BSBM-BI throughput: qph*scale BI Power: 7309.820 BSBM-BI Power: qph*scale (geom) 100 Mt 8 User Thread 6: query mix 3: 195793.09ms, total: 196086.18ms Thread 8: query mix 0: 197843.84ms, total: 198010.50ms Thread 7: query mix 4: 201806.28ms, total: 201996.26ms Thread 2: query mix 5: 221983.93ms, total: 222105.96ms Thread 4: query mix 7: 225127.55ms, total: 225317.49ms Thread 3: query mix 6: 225860.49ms, total: 226050.17ms Thread 5: query mix 2: 230884.93ms, total: 231067.61ms Thread 1: query mix 1: 237836.61ms, total: 237959.11ms Benchmark run completed in 237.985427s Scale factor: 284826 Explore Endpoints: 1 Update Endpoints: 1 Drilldown: on Number of warmup runs: 0 Number of clients: 8 Seed: 808080 Number of query mix runs (without warmups): 8 times min/max Querymix runtime: 195.7931s / 237.8366s Total runtime (sum): 1737.137 seconds Elapsed runtime: 1737.137 seconds QMpH: 121.016 query mixes per hour CQET: 217.142 seconds average runtime of query mix CQET (geom.): 216.603 seconds geometric mean runtime of query mix AQET (geom.): 2.156 seconds geometric mean runtime of query Throughput: 2178.285 BSBM-BI throughput: qph*scale BI Power: 1669.745 BSBM-BI Power: qph*scale (geom) 1000 Mt Single User 0: 608707.03ms, total: 608768ms Scale factor: 2848260 Explore Endpoints: 1 Update Endpoints: 1 Drilldown: on 
Number of warmup runs: 0 Seed: 808080 Number of query mix runs (without warmups): 1 times min/max Querymix runtime: 608.7070s / 608.7070s Elapsed runtime: 608.707 seconds QMpH: 5.914 query mixes per hour CQET: 608.707 seconds average runtime of query mix CQET (geom.): 608.707 seconds geometric mean runtime of query mix AQET (geom.): 5.167 seconds geometric mean runtime of query Throughput: 1064.552 BSBM-BI throughput: qph*scale BI Power: 6967.325 BSBM-BI Power: qph*scale (geom) 1000 Mt 8 User bsbm/testdriver -runs 8 -mt 8 -w 0 -idir /bs/10 -drill \ -ucf bsbm/usecases/businessIntelligence/sparql.txt \ -dg http://bsbm.org http://localhost:8604/sparql Thread 3: query mix 4: 2211275.25ms, total: 2211371.60ms Thread 4: query mix 0: 2212316.87ms, total: 2212417.99ms Thread 8: query mix 3: 2275942.63ms, total: 2276058.03ms Thread 5: query mix 5: 2441378.35ms, total: 2441448.66ms Thread 6: query mix 7: 2804001.05ms, total: 2804098.81ms Thread 2: query mix 2: 2808374.66ms, total: 2808473.71ms Thread 1: query mix 6: 2839407.12ms, total: 2839510.63ms Thread 7: query mix 1: 2889199.23ms, total: 2889263.17ms Benchmark run completed in 2889.302566s Scale factor: 2848260 Explore Endpoints: 1 Update Endpoints: 1 Drilldown: on Number of warmup runs: 0 Number of clients: 8 Seed: 808080 Number of query mix runs (without warmups): 8 times min/max Querymix runtime: 2211.2753s / 2889.1992s Total runtime (sum): 20481.895 seconds Elapsed runtime: 20481.895 seconds QMpH: 9.968 query mixes per hour CQET: 2560.237 seconds average runtime of query mix CQET (geom.): 2544.284 seconds geometric mean runtime of query mix AQET (geom.): 13.556 seconds geometric mean runtime of query Throughput: 1794.205 BSBM-BI throughput: qph*scale BI Power: 2655.678 BSBM-BI Power: qph*scale (geom) Metrics for Query: 1 Count: 8 times executed in whole run Time share 2.120884% of total execution time AQET: 54.299656 seconds (arithmetic mean) AQET(geom.): 34.607302 seconds (geometric mean) QPS: 0.13 Queries per 
second minQET/maxQET: 11.71547600s / 148.65379700s Metrics for Query: 2 Count: 8 times executed in whole run Time share 0.207382% of total execution time AQET: 5.309462 seconds (arithmetic mean) AQET(geom.): 2.737696 seconds (geometric mean) QPS: 1.34 Queries per second minQET/maxQET: 0.78729800s / 25.80948200s Metrics for Query: 3 Count: 8 times executed in whole run Time share 17.650472% of total execution time AQET: 451.893890 seconds (arithmetic mean) AQET(geom.): 410.481088 seconds (geometric mean) QPS: 0.02 Queries per second minQET/maxQET: 171.07262500s / 721.72939200s Metrics for Query: 5 Count: 32 times executed in whole run Time share 6.196565% of total execution time AQET: 39.661685 seconds (arithmetic mean) AQET(geom.): 6.849882 seconds (geometric mean) QPS: 0.18 Queries per second minQET/maxQET: 0.15696500s / 189.00906200s Metrics for Query: 6 Count: 8 times executed in whole run Time share 0.119916% of total execution time AQET: 3.070136 seconds (arithmetic mean) AQET(geom.): 2.056059 seconds (geometric mean) QPS: 2.31 Queries per second minQET/maxQET: 0.41524400s / 7.55655300s Metrics for Query: 7 Count: 40 times executed in whole run Time share 1.577963% of total execution time AQET: 8.079921 seconds (arithmetic mean) AQET(geom.): 1.342079 seconds (geometric mean) QPS: 0.88 Queries per second minQET/maxQET: 0.02205800s / 40.27761500s Metrics for Query: 8 Count: 40 times executed in whole run Time share 72.126818% of total execution time AQET: 369.323481 seconds (arithmetic mean) AQET(geom.): 114.431863 seconds (geometric mean) QPS: 0.02 Queries per second minQET/maxQET: 5.94377300s / 1824.57867400s The CPU for the multiuser runs stays above 1500% for the whole run. The CPU for the single user 100 Mt run is 630%; for the 1000 Mt run, this is 574%. This can be improved since the queries usually have a lot of data to work on. But final optimization is not our goal yet; we are just surveying the race track. 
The difference between a warm single user run and a cold single user run is about 15% with data on SSD; with data on disk, this would be more. The numbers shown are with warm cache. The single-user and multi-user Throughput difference, 1064 single-user vs. 1794 multi-user, is about what one would expect from the CPU utilization. With these numbers, the CPU does not appear badly memory-bound, else the increase would be less; also core multi-threading seems to bring some benefit. If the single-user run was at 800%, the Throughput would be 1488. The speed in excess of this may be attributed to core multi-threading, although we must remember that not every query mix is exactly the same length, so the figure is not exact. Core multi-threading does not seem to hurt, at the very least. Comparison of the same numbers with the column store will be interesting since it misses the cache a lot less and accordingly has better SMP scaling. The Intel Nehalem memory subsystem is really pretty good. For reference, we show a run with Virtuoso 6 at 100Mt. 
0: 424754.40ms, total: 424829ms Scale factor: 284826 Explore Endpoints: 1 Update Endpoints: 1 Drilldown: on Number of warmup runs: 0 Seed: 808080 Number of query mix runs (without warmups): 1 times min/max Querymix runtime: 424.7544s / 424.7544s Elapsed runtime: 424.754 seconds QMpH: 8.475 query mixes per hour CQET: 424.754 seconds average runtime of query mix CQET (geom.): 424.754 seconds geometric mean runtime of query mix AQET (geom.): 1.097 seconds geometric mean runtime of query Throughput: 152.559 BSBM-BI throughput: qph*scale BI Power: 3281.150 BSBM-BI Power: qph*scale (geom) and 8 user Thread 5: query mix 3: 616997.86ms, total: 617042.83ms Thread 7: query mix 4: 625522.18ms, total: 625559.09ms Thread 3: query mix 7: 626247.62ms, total: 626304.96ms Thread 1: query mix 0: 629675.17ms, total: 629724.98ms Thread 4: query mix 6: 667633.36ms, total: 667670.07ms Thread 8: query mix 2: 674206.07ms, total: 674256.72ms Thread 6: query mix 5: 695020.21ms, total: 695052.29ms Thread 2: query mix 1: 701824.67ms, total: 701864.91ms Benchmark run completed in 701.909341s Scale factor: 284826 Explore Endpoints: 1 Update Endpoints: 1 Drilldown: on Number of warmup runs: 0 Number of clients: 8 Seed: 808080 Number of query mix runs (without warmups): 8 times min/max Querymix runtime: 616.9979s / 701.8247s Total runtime (sum): 5237.127 seconds Elapsed runtime: 5237.127 seconds QMpH: 41.031 query mixes per hour CQET: 654.641 seconds average runtime of query mix CQET (geom.): 653.873 seconds geometric mean runtime of query mix AQET (geom.): 2.557 seconds geometric mean runtime of query Throughput: 738.557 BSBM-BI throughput: qph*scale BI Power: 1408.133 BSBM-BI Power: qph*scale (geom) Having the numbers, let us look at the metric and its scaling. We take the geometric mean of the single-user Power and the multiuser Throughput. 100 Mt: sqrt ( 7771 * 2178 ); = 4114 1000 Mt: sqrt ( 6967 * 1794 ); = 3535 Scaling seems to work; the results are in the same general ballpark. 
The real times for the 1000 Mt run are a bit over 10x the times for the 100Mt run, as expected. The relative percentages of the queries are about the same on both scales, with the drill-down in Q8 alone being 77% and 72% respectively. The Q8 drill-down starts at the root of the product hierarchy. If we made this start one level from the top, its share would drop. This seems reasonable. Conversely, Q2 is out of place, with far too little share of the time. It takes a product as a starting point and shows a list of products with common features, sorted by descending count of common features. This would more appropriately be applied to a leaf product category instead, measuring how many of the products in the category have the top 20 features found in this category, to name an example. Also there should be more queries. At present it appears that BSBM-BI is definitely runnable, but a cursory look suffices to show that the workload needs more development and variety. We remember that I dreamt up the business questions last fall without much analysis, and that these questions were subsequently translated to SPARQL by FU Berlin. So, on one hand, BSBM-BI is of crucial importance because it is the first attempt at doing a benchmark with long running queries in SPARQL. On the other hand, BSBM-BI is not very good as a benchmark; TPC-H is a lot better. This stands to reason, as TPC-H has had years and years of development and participation by many people. Benchmark queries are trick questions: For example, TPC-H Q18 cannot be done without changing an IN into a JOIN with the IN subquery in the outer loop and doing streaming aggregation. Q13 cannot be done without a well-optimized HASH JOIN which besides must be partitioned at the larger scales. Having such trick questions in an important benchmark eventually results in everybody doing the optimizations that the benchmark clearly calls for. 
Making benchmarks thus entails a responsibility ultimately to the end user, because an irrelevant benchmark might in the worst case send developers chasing things that are beside the point. In the following, we will look at what BSBM-BI requires from the database and how these requirements can be further developed and extended. BSBM-BI does not have any clear trick questions, at least not premeditatedly. BSBM-BI just requires a cost model that can guess the fanout of a JOIN and the cardinality of a GROUP BY; it is enough to distinguish smaller from greater; the guess does not otherwise have to be very good. Further, the queries are written in the benchmark text so that joining from left to right would work, so not even a cost-based optimizer is strictly needed. I did however have to add some cardinality statistics to get reasonable JOIN order since we always reorder the query regardless of the source formulation. BSBM-BI does have variable selectivity from the drill-downs; thus these may call for different JOIN orders for different parameter values. I have not looked into whether this really makes a difference, though. There are places in BSBM-BI where using a HASH JOIN makes sense. We do not use HASH JOINs with RDF because there is an index for everything and making a HASH JOIN in the wrong place can have a large up-front cost, so one is more robust against cost model errors if one does not do HASH JOINs. This said, a HASH JOIN in the right place is a lot better than an index lookup. With TPC-H Q13, our best HASH JOIN is over 2x better than the best INDEX-based JOIN, both being well tuned. For questions like &quot;count the hairballs made in Germany reviewed by Japanese Hello Kitty fans,&quot; where two ends of a JOIN path are fairly selective doing the other as a HASH JOIN is good. This can, if the JOIN is always cardinality-reducing, even be merged inside an INDEX lookup. 
We have such capabilities since we have been for a while gearing up for the relational races, but are not using any of these with BSBM-BI, although they would be useful. Let us see the profile for a single user 100 Mt run. The database activity summary is -- select db_activity (0, &#39;http&#39;); 161.3M rnd  210.2M seq      0 same seg   104.5M same pg  45.08M same par      0 disk      0 spec disk      0B /      0 messages  2.393K fork See the post &quot;What Does BSBM Explore Measure&quot; for an explanation of the numbers. We see that there is more sequential access than random and the random has fair locality with over half on the same page as the previous and a lot of the rest falling under the same parent. Funnily enough, the explore mix has more locality. Running with a longer vector size would probably increase performance by getting better locality. There is an optimization that adjusts vector size on the fly if locality is not sufficient but this is not being used here. So we manually set vector size to 100000 instead of the default 10000. We get -- 172.4M rnd  220.8M seq      0 same seg   149.6M same pg  10.99M same par     21 disk    861 spec disk      0B /      0 messages     754 fork The throughput goes from 1494 to 1779. We see more hits on the same page, as expected. We do not make this setting a default since it raises the cost for small queries; therefore the vector size must be self-adjusting -- besides, expecting a DBA to tune this is not reasonable. We will just have to correctly tune the self-adjust logic, and we have again clear gains. Let us now go back to the first run with vector size 10000. 
The top of the CPU oprofile is as follows: 722309 15.4507 cmpf_iri64n_iri64n 434791 9.3005 cmpf_iri64n_iri64n_anyn_iri64n 294712 6.3041 itc_next_set 273488 5.8501 itc_vec_split_search 203970 4.3631 itc_dive_transit 199687 4.2714 itc_page_rcf_search 181614 3.8848 dc_itc_append_any 173043 3.7015 itc_bm_vec_row_check 146727 3.1386 cmpf_int64n 128224 2.7428 itc_vec_row_check 113515 2.4282 dk_alloc 97296 2.0812 page_wait_access 62523 1.3374 qst_vec_get_int64 59014 1.2623 itc_next_set_parent 53589 1.1463 sslr_qst_get 48003 1.0268 ds_add 46641 0.9977 dk_free_tree 44551 0.9530 kc_var_col 43650 0.9337 page_col_cmp_1 35297 0.7550 cmpf_iri64n_iri64n_anyn_gt_lt 34589 0.7399 dv_compare 25864 0.5532 cmpf_iri64n_anyn_iri64n_iri64n_lte 23088 0.4939 dk_free The top 10 are all index traversal, with the key compare for two leading IRI keys in the lead, corresponding to a lookup with P and S given. The one after that is with all parts given, corresponding to an existence test. The existence tests could probably be converted to HASH JOIN lookups to good advantage. Aggregation and arithmetic are absent. We should probably add a query like TPC-H Q1 that does nothing but these two. Considering the overall profile, GROUP BY seems to be around 3%. We should probably put in a query that makes a very large number of groups and could make use of streaming aggregation, i.e., take advantage of a situation where aggregation input comes already grouped by the grouping columns. A BI use case should offer no problem with including arithmetic, but there are not that many numbers in the BSBM set. Some code sections in the queries with conditional execution and costly tests inside ANDs and ORs would be good. TPC-H has such in Q21 and Q19. An OR with existences where there would be gain from good guesses of a subquery&#39;s selectivity would be appropriate. Also, there should be conditional expressions somewhere with a lot of data, like the CASE-WHEN in TPC-H Q12. 
We can make BSBM-BI more interesting by putting in the above. Also we will have to see where we can profit from HASH JOIN, both small and large. There should be such places in the workload already so this is a matter of just playing a bit more. This post amounts to a cheat sheet for the BSBM-BI runs a bit farther down the road. By then we should be operational with the column store and Virtuoso 7 Cluster, though, so not everything is yet on the table. Benchmarks, Redux Series Benchmarks, Redux (part 1): On RDF Benchmarks Benchmarks, Redux (part 2): A Benchmarking Story Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs Benchmarks, Redux (part 6): BSBM and I/O, continued Benchmarks, Redux (part 7): What Does BSBM Explore Measure? Benchmarks, Redux (part 8): BSBM Explore and Update Benchmarks, Redux (part 9): BSBM With Cluster Benchmarks, Redux (part 10): LOD2 and the Benchmark Process Benchmarks, Redux (part 11): The Substance of Benchmarks Benchmarks, Redux (part 12): Our Own BSBM Results Report Benchmarks, Redux (part 13): BSBM-BI Modifications Benchmarks, Redux (part 14): BSBM-BI Mix (this post) Benchmarks, Redux (part 15): BSBM Test Driver Enhancements</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>In this post, we look at how we run the <a class="auto-href" href="http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html" id="link-id0x236dcda8">BSBM</a>-BI mix.  We consider the 100 Mt and 1000 Mt scales with <a class="auto-href" href="http://virtuoso.openlinksw.com" id="link-id0x284893c0">Virtuoso</a> 7 using the same hardware and software as in the previous posts.  The changes to workload and metric are given in the previous post.</p>

<p>Our intent here is to see whether the metric works and what results will look like in general.  We are testing the benchmark as much as we are testing the system-under-test (SUT).  The results shown here will likely not be comparable with future ones, because the workload seems a bit out of balance and we will most likely change its composition.  Still, for the sake of disclosure, we attach the query templates.  The test driver we used will be made available soon, so interested parties may still try a comparison with their own systems.  If you practice with this workload for the coming races, the effort will surely not be wasted.</p>


<p>Once we have come up with a rules document, we will redo all that we have published so far by-the-book, and have it audited as part of the <a class="auto-href" href="http://lod2.eu/" id="link-id0x23724860">LOD2</a> service we plan for this (see previous posts in this series).  This will introduce comparability; but before we get that far with the BI workload, the workload needs to evolve a bit.</p>

<p>Below we show samples of test driver output; the whole output is <a href="http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/BenchmarksReduxSupportingFiles/br.tar.gz" id="link-id0x1b703ad8">downloadable</a>.</p>

<p>100 Mt Single User</p>

<blockquote>
 <code><pre>
bsbm/testdriver   -runs 1   -w 0 -idir /bs/1  -drill  \
   -ucf bsbm/usecases/businessIntelligence/sparql.txt   \
   -dg http://bsbm.org http://localhost:8604/sparql
</pre>
 </code>
</blockquote>

<blockquote>
 <code><pre>
0: 43348.14ms, total: 43440ms

Scale factor:           284826
Explore Endpoints:      1
Update Endpoints:       1
Drilldown:              on
Number of warmup runs:  0
Seed:                   808080
Number of query mix runs (without warmups): 1 times
min/max Querymix runtime:    43.3481s / 43.3481s
Elapsed runtime:        43.348 seconds
QMpH:                   83.049 query mixes per hour
CQET:                   43.348 seconds average runtime of query mix
CQET (geom.):           43.348 seconds geometric mean runtime of query mix
AQET (geom.):           0.492 seconds geometric mean runtime of query
Throughput:             1494.874 BSBM-BI throughput: qph*scale
BI Power:               7309.820 BSBM-BI Power: qph*scale (geom)
</pre>
 </code>
</blockquote>



<p>100 Mt 8 User </p>

<blockquote>
 <code><pre>
Thread 6: query mix 3: 195793.09ms, total: 196086.18ms
Thread 8: query mix 0: 197843.84ms, total: 198010.50ms
Thread 7: query mix 4: 201806.28ms, total: 201996.26ms
Thread 2: query mix 5: 221983.93ms, total: 222105.96ms
Thread 4: query mix 7: 225127.55ms, total: 225317.49ms
Thread 3: query mix 6: 225860.49ms, total: 226050.17ms
Thread 5: query mix 2: 230884.93ms, total: 231067.61ms
Thread 1: query mix 1: 237836.61ms, total: 237959.11ms
Benchmark run completed in 237.985427s

Scale factor:           284826
Explore Endpoints:      1
Update Endpoints:       1
Drilldown:              on
Number of warmup runs:  0
Number of clients:      8
Seed:                   808080
Number of query mix runs (without warmups): 8 times
min/max Querymix runtime:    195.7931s / 237.8366s
Total runtime (sum):    1737.137 seconds
Elapsed runtime:        1737.137 seconds
QMpH:                   121.016 query mixes per hour
CQET:                   217.142 seconds average runtime of query mix
CQET (geom.):           216.603 seconds geometric mean runtime of query mix
AQET (geom.):           2.156 seconds geometric mean runtime of query
Throughput:             2178.285 BSBM-BI throughput: qph*scale
BI Power:               1669.745 BSBM-BI Power: qph*scale (geom)
</pre>
 </code>
</blockquote>


<p>1000 Mt Single User</p>

<blockquote>
 <code><pre>
0: 608707.03ms, total: 608768ms

Scale factor:           2848260
Explore Endpoints:      1
Update Endpoints:       1
Drilldown:              on
Number of warmup runs:  0
Seed:                   808080
Number of query mix runs (without warmups): 1 times
min/max Querymix runtime:    608.7070s / 608.7070s
Elapsed runtime:        608.707 seconds
QMpH:                   5.914 query mixes per hour
CQET:                   608.707 seconds average runtime of query mix
CQET (geom.):           608.707 seconds geometric mean runtime of query mix
AQET (geom.):           5.167 seconds geometric mean runtime of query
Throughput:             1064.552 BSBM-BI throughput: qph*scale
BI Power:               6967.325 BSBM-BI Power: qph*scale (geom)
</pre>
 </code>
</blockquote>


<p>1000 Mt 8 User </p>

<blockquote>
 <code><pre>
bsbm/testdriver   -runs 8 -mt 8  -w 0 -idir /bs/10  -drill  \
   -ucf bsbm/usecases/businessIntelligence/sparql.txt   \
   -dg http://bsbm.org http://localhost:8604/sparql
</pre>
 </code>
</blockquote>

<blockquote>
 <code><pre>
Thread 3: query mix 4: 2211275.25ms, total: 2211371.60ms
Thread 4: query mix 0: 2212316.87ms, total: 2212417.99ms
Thread 8: query mix 3: 2275942.63ms, total: 2276058.03ms
Thread 5: query mix 5: 2441378.35ms, total: 2441448.66ms
Thread 6: query mix 7: 2804001.05ms, total: 2804098.81ms
Thread 2: query mix 2: 2808374.66ms, total: 2808473.71ms
Thread 1: query mix 6: 2839407.12ms, total: 2839510.63ms
Thread 7: query mix 1: 2889199.23ms, total: 2889263.17ms
Benchmark run completed in 2889.302566s

Scale factor:           2848260
Explore Endpoints:      1
Update Endpoints:       1
Drilldown:              on
Number of warmup runs:  0
Number of clients:      8
Seed:                   808080
Number of query mix runs (without warmups): 8 times
min/max Querymix runtime:    2211.2753s / 2889.1992s
Total runtime (sum):    20481.895 seconds
Elapsed runtime:        20481.895 seconds
QMpH:                   9.968 query mixes per hour
CQET:                   2560.237 seconds average runtime of query mix
CQET (geom.):           2544.284 seconds geometric mean runtime of query mix
AQET (geom.):           13.556 seconds geometric mean runtime of query
Throughput:             1794.205 BSBM-BI throughput: qph*scale
BI Power:               2655.678 BSBM-BI Power: qph*scale (geom)

Metrics for Query:      1
Count:                  8 times executed in whole run
Time share              2.120884% of total execution time
AQET:                   54.299656 seconds (arithmetic mean)
AQET(geom.):            34.607302 seconds (geometric mean)
QPS:                    0.13 Queries per second
minQET/maxQET:          11.71547600s / 148.65379700s

Metrics for Query:      2
Count:                  8 times executed in whole run
Time share              0.207382% of total execution time
AQET:                   5.309462 seconds (arithmetic mean)
AQET(geom.):            2.737696 seconds (geometric mean)
QPS:                    1.34 Queries per second
minQET/maxQET:          0.78729800s / 25.80948200s

Metrics for Query:      3
Count:                  8 times executed in whole run
Time share              17.650472% of total execution time
AQET:                   451.893890 seconds (arithmetic mean)
AQET(geom.):            410.481088 seconds (geometric mean)
QPS:                    0.02 Queries per second
minQET/maxQET:          171.07262500s / 721.72939200s

Metrics for Query:      5
Count:                  32 times executed in whole run
Time share              6.196565% of total execution time
AQET:                   39.661685 seconds (arithmetic mean)
AQET(geom.):            6.849882 seconds (geometric mean)
QPS:                    0.18 Queries per second
minQET/maxQET:          0.15696500s / 189.00906200s

Metrics for Query:      6
Count:                  8 times executed in whole run
Time share              0.119916% of total execution time
AQET:                   3.070136 seconds (arithmetic mean)
AQET(geom.):            2.056059 seconds (geometric mean)
QPS:                    2.31 Queries per second
minQET/maxQET:          0.41524400s / 7.55655300s

Metrics for Query:      7
Count:                  40 times executed in whole run
Time share              1.577963% of total execution time
AQET:                   8.079921 seconds (arithmetic mean)
AQET(geom.):            1.342079 seconds (geometric mean)
QPS:                    0.88 Queries per second
minQET/maxQET:          0.02205800s / 40.27761500s

Metrics for Query:      8
Count:                  40 times executed in whole run
Time share              72.126818% of total execution time
AQET:                   369.323481 seconds (arithmetic mean)
AQET(geom.):            114.431863 seconds (geometric mean)
QPS:                    0.02 Queries per second
minQET/maxQET:          5.94377300s / 1824.57867400s
</pre>
 </code>
</blockquote>
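
<p>The <code>Time share</code> lines in the report above can be cross-checked from the per-query <code>Count</code> and arithmetic-mean <code>AQET</code> values. The following is a minimal sketch, assuming that the time share is simply count times mean, divided by the sum over all queries; this is our reading of the driver output, not a statement of how the driver computes it. The function name is ours.</p>

```python
# (count, arithmetic-mean seconds) per query, copied from the 1000 Mt 8-user report.
QUERY_STATS = {
    1: (8, 54.299656),
    2: (8, 5.309462),
    3: (8, 451.893890),
    5: (32, 39.661685),
    6: (8, 3.070136),
    7: (40, 8.079921),
    8: (40, 369.323481),
}


def time_shares(stats):
    """Return (percent share per query, total seconds) from counts and mean times."""
    totals = {query: count * mean for query, (count, mean) in stats.items()}
    grand_total = sum(totals.values())
    shares = {query: 100.0 * total / grand_total for query, total in totals.items()}
    return shares, grand_total


shares, total_seconds = time_shares(QUERY_STATS)
queries_per_mix = sum(count for count, _ in QUERY_STATS.values()) // 8

for query in sorted(shares):
    print(f"Q{query}: {shares[query]:.2f}%")
print(f"total: {total_seconds:.3f} s, {queries_per_mix} queries per mix")
```

<p>Under this assumption the total comes out at the reported 20481.895 seconds, and the counts add up to 18 query invocations per mix.</p>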



<p>The <a class="auto-href" href="http://dbpedia.org/resource/Central_processing_unit" id="link-id0x2809d998">CPU</a> for the multiuser runs stays above 1500% for the whole run. The CPU for the single user 100 Mt run is 630%; for the 1000 Mt run, this is 574%. This can be improved since the queries usually have a lot of <a class="auto-href" href="http://dbpedia.org/resource/Data" id="link-id0x22cf75b8">data</a> to work on.  But final <a class="auto-href" href="http://dbpedia.org/resource/Program_optimization" id="link-id0x238b94c8">optimization</a> is not our goal yet; we are just surveying the race track. The difference between a warm single user run and a cold single user run is about 15% with data on SSD; with data on disk, this would be more.  The numbers shown are with warm <a class="auto-href" href="http://dbpedia.org/resource/Cache" id="link-id0x23ad8c08">cache</a>.  The single-user and multi-user Throughput difference, 1064 single-user vs. 1794 multi-user, is about what one would expect from the CPU utilization.</p>

<p>With these numbers, the CPU does not appear badly memory-bound, or the increase would be smaller; core multi-threading also seems to bring some benefit.  If the single-user run were at 800%, the Throughput would be 1488.  The speed in excess of this may be attributed to core multi-threading, although we must remember that not every query mix is exactly the same length, so the figure is not exact.  Core multi-threading does not seem to hurt, at the very least.  Comparison of the same numbers with the column store will be interesting, since it misses the cache a lot less and accordingly has better SMP scaling. The <a class="auto-href" href="http://dbpedia.org/resource/Intel_Corporation" id="link-id0x23568308">Intel</a> Nehalem memory subsystem is really pretty good.</p>
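
<p>The extrapolation above reads as a linear scaling of the single-user Throughput by CPU utilization. The following sketch spells out that arithmetic under this assumption; the function name is ours, and the inputs are the 1000 Mt figures reported earlier (574% CPU, Throughput 1064.552 single-user and 1794.205 with 8 users).</p>

```python
def scale_by_cpu(throughput, observed_cpu_pct, target_cpu_pct):
    """Linearly extrapolate a throughput figure to a different CPU utilization."""
    return throughput * target_cpu_pct / observed_cpu_pct


# Single-user 1000 Mt run, extrapolated from 574% to a full 800% (8 cores).
single_user_at_800 = scale_by_cpu(1064.552, 574, 800)

# How far the measured 8-user Throughput exceeds that extrapolation.
multi_user_excess = 1794.205 / single_user_at_800

print(round(single_user_at_800), round(multi_user_excess, 2))
```

<p>With the rounded CPU percentage this lands within a few units of the 1488 quoted in the text, and the 8-user run comes out roughly a fifth above it.</p>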
<p>For reference, we show a run with Virtuoso 6 at 100 Mt.</p>

<blockquote>
 <code><pre>
0: 424754.40ms, total: 424829ms

Scale factor:           284826
Explore Endpoints:      1
Update Endpoints:       1
Drilldown:              on
Number of warmup runs:  0
Seed:                   808080
Number of query mix runs (without warmups): 1 times
min/max Querymix runtime:    424.7544s / 424.7544s
Elapsed runtime:        424.754 seconds
QMpH:                   8.475 query mixes per hour
CQET:                   424.754 seconds average runtime of query mix
CQET (geom.):           424.754 seconds geometric mean runtime of query mix
AQET (geom.):           1.097 seconds geometric mean runtime of query
Throughput:             152.559 BSBM-BI throughput: qph*scale
BI Power:               3281.150 BSBM-BI Power: qph*scale (geom)
</pre>
 </code>
</blockquote>


<p>And the same with 8 users:</p>

<blockquote>
 <code><pre>
Thread 5: query mix 3: 616997.86ms, total: 617042.83ms
Thread 7: query mix 4: 625522.18ms, total: 625559.09ms
Thread 3: query mix 7: 626247.62ms, total: 626304.96ms
Thread 1: query mix 0: 629675.17ms, total: 629724.98ms
Thread 4: query mix 6: 667633.36ms, total: 667670.07ms
Thread 8: query mix 2: 674206.07ms, total: 674256.72ms
Thread 6: query mix 5: 695020.21ms, total: 695052.29ms
Thread 2: query mix 1: 701824.67ms, total: 701864.91ms
Benchmark run completed in 701.909341s

Scale factor:           284826
Explore Endpoints:      1
Update Endpoints:       1
Drilldown:              on
Number of warmup runs:  0
Number of clients:      8
Seed:                   808080
Number of query mix runs (without warmups): 8 times
min/max Querymix runtime:    616.9979s / 701.8247s
Total runtime (sum):    5237.127 seconds
Elapsed runtime:        5237.127 seconds
QMpH:                   41.031 query mixes per hour
CQET:                   654.641 seconds average runtime of query mix
CQET (geom.):           653.873 seconds geometric mean runtime of query mix
AQET (geom.):           2.557 seconds geometric mean runtime of query
Throughput:             738.557 BSBM-BI throughput: qph*scale
BI Power:               1408.133 BSBM-BI Power: qph*scale (geom)
</pre>
 </code>
</blockquote>




<p>Having the numbers, let us look at the metric and its scaling.  We take the geometric mean of the single-user Power and the multiuser Throughput.</p>


<blockquote>
 <code><pre>
 100 Mt: sqrt ( 7771 * 2178 ); = 4114

1000 Mt: sqrt ( 6967 * 1794 ); = 3535
</pre>
 </code>
</blockquote>
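
<p>As a sketch of the arithmetic, the following uses the Power and Throughput definitions from the previous post in this series (scaled queries per hour, with scale factor 284826 as the unit scale) and the composite shown above. The function names are ours, not the test driver's.</p>

```python
import math

# Scale factor that generates roughly 100 Mt; treated as scale "one".
UNIT_SCALE_FACTOR = 284826


def power(scale_factor, query_seconds):
    """Scaled queries per hour from the geometric mean of the query times."""
    log_sum = sum(math.log(t) for t in query_seconds)
    geom_mean = math.exp(log_sum / len(query_seconds))
    return (scale_factor / UNIT_SCALE_FACTOR) * 3600.0 / geom_mean


def throughput(scale_factor, query_seconds):
    """Scaled queries per hour from the arithmetic mean of the query times."""
    mean = sum(query_seconds) / len(query_seconds)
    return (scale_factor / UNIT_SCALE_FACTOR) * 3600.0 / mean


def composite(power_score, throughput_score):
    """Geometric mean of the single-user Power and the multi-user Throughput."""
    return math.sqrt(power_score * throughput_score)


if __name__ == "__main__":
    print(round(composite(7771, 2178)))  # 100 Mt figures used above -> 4114
    print(round(composite(6967, 1794)))  # 1000 Mt figures used above -> 3535
```

<p>Note that the 100 Mt single-user sample output earlier in this post prints a BI Power of 7309.820, whereas the calculation uses 7771, presumably from a different run; with 7309.820 the composite would come out at about 3990.</p>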


<p>Scaling seems to work; the results are in the same general ballpark.  The real times for the 1000 Mt run are a bit over 10x the times for the 100 Mt run, as expected. The relative percentages of the queries are about the same on both scales, with the drill-down in Q8 alone taking 77% and 72% of the time, respectively. The Q8 drill-down starts at the root of the product hierarchy.  If we made this start one level from the top, its share would drop.  This seems reasonable.</p>

<p>Conversely, Q2 is out of place, with far too little share of the time. It takes a product as a starting point and shows a list of products with common features, sorted by descending count of common features. This would more appropriately be applied to a leaf product category instead, measuring how many of the products in the category have the top 20 features found in this category, to name an example.</p>

<p>Also there should be more queries.</p>

<p>At present it appears that BSBM-BI is definitely runnable, but a cursory look suffices to show that the workload needs more development and variety.  Recall that I dreamt up the business questions last fall without much analysis, and that these questions were subsequently translated to SPARQL by FU Berlin.  So, on one hand, BSBM-BI is of crucial importance because it is the first attempt at doing a benchmark with long-running queries in SPARQL.  On the other hand, BSBM-BI is not very good as a benchmark; <a class="auto-href" href="http://www.tpc.org/" id="link-id0x23872a10">TPC</a>-<a class="auto-href" href="http://dbpedia.org/resource/TPC-H" id="link-id0x28487d98">H</a> is a lot better.  This stands to reason, as TPC-H has had years and years of development and participation by many people.</p>

<p>Benchmark queries are trick questions: For example, TPC-H Q18 cannot be done without changing an <code>IN</code> into a <code>JOIN</code> with the <code>IN</code> subquery in the outer loop and doing streaming aggregation.  Q13 cannot be done without a well-optimized <code><a class="auto-href" href="http://dbpedia.org/resource/Hash_join" id="link-id0x24974830">HASH JOIN</a></code> which, moreover, must be partitioned at the larger scales.</p>

<p>Having such trick questions in an important benchmark eventually results in everybody doing the optimizations that the benchmark clearly calls for.  Making benchmarks thus entails a responsibility ultimately to the end user, because an irrelevant benchmark might in the worst case send developers chasing things that are beside the point.</p>


<p>In the following, we will look at what BSBM-BI requires from the database and how these requirements can be further developed and extended.</p>

<p>BSBM-BI does not have any clear trick questions, at least not by design. BSBM-BI just requires a cost model that can guess the fanout of a <code>JOIN</code> and the cardinality of a <code>GROUP BY</code>; it is enough to distinguish smaller from greater, and the guess does not otherwise have to be very good. Further, the queries are written in the benchmark text so that joining from left to right would work, so not even a cost-based optimizer is strictly needed.  I did, however, have to add some cardinality statistics to get a reasonable <code>JOIN</code> order, since we always reorder the query regardless of the source formulation.</p>

<p>BSBM-BI does have variable selectivity from the drill-downs; thus these may call for different <code>JOIN</code> orders for different parameter values.  I have not looked into whether this really makes a difference, though.</p>

<p>There are places in BSBM-BI where using a <code>HASH JOIN</code> makes sense.  We do not use <code>HASH JOINs</code> with <a class="auto-href" href="http://dbpedia.org/resource/Resource_Description_Framework" id="link-id0x23cbf908">RDF</a> because there is an index for everything, and making a <code>HASH JOIN</code> in the wrong place can have a large up-front cost; one is more robust against cost model errors if one does not do <code>HASH JOINs</code>.  This said, a <code>HASH JOIN</code> in the right place is a lot better than an index lookup.  With TPC-H Q13, our best <code>HASH JOIN</code> is over 2x better than the best <code>INDEX</code>-based <code>JOIN</code>, both being well tuned.  For questions like &quot;count the hairballs made in <a class="auto-href" href="http://dbpedia.org/resource/Germany" id="link-id0x249d3e28">Germany</a> reviewed by Japanese Hello Kitty fans,&quot; where two ends of a <code>JOIN</code> path are fairly selective, starting from one end and doing the other as a <code>HASH JOIN</code> is good.  This can, if the <code>JOIN</code> is always cardinality-reducing, even be merged inside an <code>INDEX</code> lookup.  We have had such capabilities for a while, since we have been gearing up for the relational races, but we are not using any of them with BSBM-BI, although they would be useful.</p>
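
<p>To make the technique concrete, here is a minimal in-memory hash join sketch: build a hash table on the small, selective side once, then probe it with rows from the other side. The row layout, the key name <code>product</code>, and the sample data are hypothetical and only illustrate the idea; this is not how Virtuoso implements it.</p>

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, key):
    """Yield (build_row, probe_row) pairs that agree on `key`.

    The build side is hashed up front (the large initial cost mentioned above);
    each probe row then costs one dictionary lookup instead of an index descent.
    """
    table = defaultdict(list)
    for row in build_rows:
        table[row[key]].append(row)
    for row in probe_rows:
        for match in table.get(row[key], ()):
            yield match, row


# Hypothetical data: a selective set of products, and reviews from a selective
# set of reviewers.
german_products = [
    {"product": "p1", "country": "DE"},
    {"product": "p2", "country": "DE"},
]
fan_reviews = [
    {"review": "r1", "product": "p1"},
    {"review": "r2", "product": "p3"},
    {"review": "r3", "product": "p1"},
]

matches = list(hash_join(german_products, fan_reviews, "product"))
print(len(matches))  # 2: reviews r1 and r3 are for product p1
```

<p>If the build side turns out to be much larger than the cost model guessed, the build phase dominates, which is the robustness concern raised above.</p>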
 

<p>Let us see the profile for a single user 100 Mt run.</p>

<p>The database activity summary is --</p>

<p>
<code>select db_activity (0, &#39;http&#39;);</code>
</p>

<p>
<code> 161.3M rnd  210.2M seq      0 same seg   104.5M same pg  45.08M same par      0 disk      0 spec disk      0B /      0 messages  2.393K fork</code>
</p>


<p>See the post &quot;<a href="http://www.openlinksw.com/weblog/oerling/?id=1671" id="link-id0x1b1f3068">What Does BSBM Explore Measure</a>&quot; for an explanation of the numbers.  We see that there is more sequential access than random, and that the random access has fair locality, with over half of the lookups landing on the same page as the previous one and a lot of the rest falling under the same parent. Funnily enough, the explore mix has more locality.  Running with a longer vector size would probably increase performance by getting better locality.  There is an optimization that adjusts vector size on the fly if locality is not sufficient, but this is not being used here. So we manually set vector size to 100000 instead of the default 10000. We get --</p>

<p>
<code> 172.4M rnd  220.8M seq      0 same seg   149.6M same pg  10.99M same par     21 disk    861 spec disk      0B /      0 messages     754 fork</code>
</p>


<p>The throughput goes from 1494 to 1779.  We see more hits on the same page, as expected.  We do not make this setting a default since it raises the cost for small queries; therefore the vector size must be self-adjusting -- besides, expecting a DBA to tune this is not reasonable. We will just have to tune the self-adjust logic correctly, and we will again have clear gains.</p>
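
<p>The locality change can be expressed as fractions of the random lookups. This sketch assumes, following the reading given above, that the <code>same pg</code> and <code>same par</code> counters are subsets of <code>rnd</code>; the function name is ours and the counts (in millions) are copied from the two <code>db_activity</code> lines.</p>

```python
def locality(rnd, same_pg, same_par):
    """Fractions of random lookups hitting the same page or the same parent."""
    return same_pg / rnd, same_par / rnd


# Counts in millions, from the two db_activity summaries above.
default_vec = locality(161.3, 104.5, 45.08)  # vector size 10000
long_vec = locality(172.4, 149.6, 10.99)     # vector size 100000

throughput_gain = 1779 / 1494

print(f"same page: {default_vec[0]:.1%} vs {long_vec[0]:.1%}")
print(f"same parent: {default_vec[1]:.1%} vs {long_vec[1]:.1%}")
print(f"throughput gain: {throughput_gain - 1:.1%}")
```

<p>Under that assumption, same-page hits go from roughly 65% to roughly 87% of random lookups, alongside a throughput gain of about 19%.</p>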

<p>Let us now go back to the first run with vector size 10000.</p>

<p>The top of the CPU <code>oprofile</code> is as follows:</p>

<blockquote>
 <code><pre>
722309   15.4507  cmpf_iri64n_iri64n
434791    9.3005  cmpf_iri64n_iri64n_anyn_iri64n
294712    6.3041  itc_next_set
273488    5.8501  itc_vec_split_search
203970    4.3631  itc_dive_transit
199687    4.2714  itc_page_rcf_search
181614    3.8848  dc_itc_append_any
173043    3.7015  itc_bm_vec_row_check
146727    3.1386  cmpf_int64n
128224    2.7428  itc_vec_row_check
113515    2.4282  dk_alloc
97296     2.0812  page_wait_access
62523     1.3374  qst_vec_get_int64
59014     1.2623  itc_next_set_parent
53589     1.1463  sslr_qst_get
48003     1.0268  ds_add
46641     0.9977  dk_free_tree
44551     0.9530  kc_var_col
43650     0.9337  page_col_cmp_1
35297     0.7550  cmpf_iri64n_iri64n_anyn_gt_lt
34589     0.7399  dv_compare
25864     0.5532  cmpf_iri64n_anyn_iri64n_iri64n_lte
23088     0.4939  dk_free
</pre>
 </code>
</blockquote>
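
<p>A quick way to quantify the listing above is to parse it and sum the percentage column. The sketch below does this for the first twelve rows; the three-column layout (samples, percent, symbol) is taken from the listing itself, and the function name is ours.</p>

```python
PROFILE = """\
722309   15.4507  cmpf_iri64n_iri64n
434791    9.3005  cmpf_iri64n_iri64n_anyn_iri64n
294712    6.3041  itc_next_set
273488    5.8501  itc_vec_split_search
203970    4.3631  itc_dive_transit
199687    4.2714  itc_page_rcf_search
181614    3.8848  dc_itc_append_any
173043    3.7015  itc_bm_vec_row_check
146727    3.1386  cmpf_int64n
128224    2.7428  itc_vec_row_check
113515    2.4282  dk_alloc
97296     2.0812  page_wait_access
"""


def parse_profile(text):
    """Turn 'samples percent symbol' lines into (symbol, samples, percent) tuples."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        samples, percent, symbol = line.split()
        rows.append((symbol, int(samples), float(percent)))
    return rows


rows = parse_profile(PROFILE)
top10_share = sum(percent for _, _, percent in rows[:10])
print(round(top10_share, 1))  # 59.0
```

<p>The ten hottest functions thus account for about 59% of the samples.</p>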

<p>The top 10 are all index traversal, with the key compare for two leading IRI keys in the lead, corresponding to a lookup with <code>P</code> and <code>S</code> given.  The one after that is with all parts given, corresponding to an existence test.  The existence tests could probably be converted to <code>HASH JOIN</code> lookups to good advantage.  Aggregation and arithmetic are absent.  We should probably add a query like TPC-H Q1 that does nothing but these two.  Considering the overall profile, <code>GROUP BY</code> seems to be around 3%.  We should probably put in a query that makes a very large number of groups and could make use of streaming aggregation, i.e., take advantage of a situation where aggregation input comes already grouped by the grouping columns.</p>
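
<p>As an illustration of streaming aggregation, the following sketch aggregates input that already arrives ordered by the grouping key, so only one group's state is held at a time and no hash table of groups is needed. The data and the function name are made up for the example.</p>

```python
from itertools import groupby
from operator import itemgetter


def streaming_count_sum(rows):
    """Aggregate (key, value) rows that are already ordered by key.

    Emits (key, count, sum) as soon as a group ends; memory use does not grow
    with the number of groups.
    """
    for key, group in groupby(rows, key=itemgetter(0)):
        count = 0
        total = 0
        for _, value in group:
            count += 1
            total += value
        yield key, count, total


# Input already grouped by the first column, e.g. as delivered by an index scan.
rows = [("a", 1), ("a", 2), ("b", 5), ("c", 3), ("c", 4)]
result = list(streaming_count_sum(rows))
print(result)  # [('a', 2, 3), ('b', 1, 5), ('c', 2, 7)]
```

<p>If the input were not ordered by the grouping columns, the same key would show up in several separate runs, which is why this only pays off when the order comes for free.</p>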

<p>A BI use case should offer no problem with including arithmetic, but there are not that many numbers in the BSBM set.  Some code sections in the queries with conditional execution and costly tests inside <code>ANDs</code> and <code>ORs</code> would be good.  TPC-H has such in Q21 and Q19.  An <code>OR</code> with existences where there would be gain from good guesses of a subquery&#39;s selectivity would be appropriate.  Also, there should be conditional expressions somewhere with a lot of data, like the <code>CASE-WHEN</code> in TPC-H Q12.</p>

<p>We can make BSBM-BI more interesting by putting in the above.  Also we will have to see where we can profit from <code>HASH JOIN</code>, both small and large.  There should be such places in the workload already so this is a matter of just playing a bit more.</p>

<p>This post amounts to a cheat sheet for the BSBM-BI runs a bit farther down the road. By then we should be operational with the column store and Virtuoso 7 Cluster, though, so not everything is yet on the table.</p>



<h3>
<i>Benchmarks, Redux</i> Series</h3>
<ul>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1658" id="link-id0x1fd1d4e0">Benchmarks, Redux (part 1): On RDF Benchmarks</a>
</li>

<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1660" id="link-id0x1d5b07d8">Benchmarks, Redux (part 2): A Benchmarking Story</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1663" id="link-id0x1dfe6c48">Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1665" id="link-id0x197fce30">Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1667" id="link-id0x1fbf4210">Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1669" id="link-id0x1beeb1e0">Benchmarks, Redux (part 6): BSBM and I/O, continued</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1671" id="link-id0x1d7e1818">Benchmarks, Redux (part 7): What Does BSBM Explore Measure?</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1673" id="link-id0x1dfc1730">Benchmarks, Redux (part 8): BSBM Explore and Update </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1675" id="link-id0x1ea819a8">Benchmarks, Redux (part 9): BSBM With Cluster</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1677" id="link-id0x1ec73da0">Benchmarks, Redux (part 10): LOD2 and the Benchmark Process</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1678" id="link-id0x1fbdce90">Benchmarks, Redux (part 11): The Substance of Benchmarks</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x19928618">Benchmarks, Redux (part 12): Our Own BSBM Results Report</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1f3d8710">Benchmarks, Redux (part 13): BSBM-BI Modifications </a>
</li>
<li>
Benchmarks, Redux (part 14): BSBM-BI Mix  <i>(this post)</i>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1e627400">Benchmarks, Redux (part 15): BSBM Test Driver Enhancements </a>
</li>
</ul>]]></content:encoded>
  <dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Virtuso Data Space Bot &lt;kidehen@openlinksw.com&gt;</dc:creator>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/vdb/blog/?id=1686">
  <rss:title>Benchmarks, Redux (part 13): BSBM BI Modifications</rss:title>
  <rss:link>http://www.openlinksw.com/blog/vdb/blog/?id=1686</rss:link>
  <wfw:comment xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/mt-tb/Http/comments?id=1686</wfw:comment>
  <wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/blog/vdb/blog/gems/rsscomment.xml?:id=1686</wfw:commentRss>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2011-03-22T22:30:44Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">In this post we introduce changes to the BSBM BI queries and metric. These changes are motivated by prevailing benchmark practice and by our experiences in optimizing for the BSBM BI workload. We will publish results according to the definitions given here and recommend that any interested parties do likewise. The rationales are given in the text. Query Mix We have removed Q4 from the mix because it is quadratic to the scale factor. The other queries are roughly n * log (n). Parameter Substitution All queries that take a product type as parameter are run in flights of several query invocations where the product type goes from broader to more specific. The initial product type specifies either the root product type or an immediate subtype of this, and the last in the drill-down is a leaf type. The rationale for this is that the choice of product type may make several orders of magnitude difference in the run time of a query. In order to make consecutive query mixes roughly comparable in execution time, all mixes should have a predictable number of query invocations with product types of each level. Query Order In the BI mix, when running multiple concurrent clients, each query mix is submitted in a random order. Queries which do drill-downs always have the steps of the drill-down as consecutive in the session, but the query templates are permuted. This is done so as to make less likely that there were two concurrent queries accessing exactly the same data. In this way, scans cannot be trivially shared between queries -- but there are still opportunities for reuse of results and adapting execution to working set, e.g., starting with what is in memory. Metrics We use a TPC-H-like metric. This metric consists of a single-user part and a multi-user part, called respectively Power and Throughput. The Power metric is a geometric mean of query run-time. 
The Throughput is the total run-time divided by the number of queries completed. After taking the mean, the time is converted into queries-per-hour. This time is then multiplied by the scale factor divided by the scale factor for 100 Mt. In other words, we consider the 100 Mt data set as the unit scale. The Power is defined as ( scale_factor / 284826 ) * 3600 / ( ( t1 * t1 * ... * tn ) ^ ( 1 / n ) ) The Throughput is defined as ( scale_factor / 284826 ) * 3600 / ( ( t1 + t2 + ... + tn ) / n ) The magic number 284826 is the scale that generates approximately 100 million triples (100 Mt). We consider this scale &quot;one&quot;. The reason for the multiplication is that scores at different scales should get similar numbers; otherwise 10x larger scale would result roughly in 10x lower throughput with the BI queries. The Composite metric is the geometric mean of the Power and Throughput metrics. A complete report shows both Power and Throughput metrics, as well as individual query times for all queries. The rationale for using a geometric mean is to give an equal importance to long and short queries. Halving the execution time of either a long query or a short query will have the same effect on the metric. This is good for encouraging research into all aspects of query processing. On the other hand, real-life users are more interested in halving the time of queries that take one hour than of queries that take one second; therefore, the throughput metric considers run times. Taking the geometric mean of the two metrics gives more weight to the lower of the two than an arithmetic mean, hence we pay more attention to the worse of the two. Single-user and multi-user metrics are separate because of the relative importance of intra-query parallelization in BI workloads: There may not be large numbers of concurrent users, yet queries are still complex, and it is important to have maximum parallelization. Therefore the metric rewards single-user performance. 
In the next post we will look at the use of this metric and the actual content of BSBM BI. Benchmarks, Redux Series Benchmarks, Redux (part 1): On RDF Benchmarks Benchmarks, Redux (part 2): A Benchmarking Story Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs Benchmarks, Redux (part 6): BSBM and I/O, continued Benchmarks, Redux (part 7): What Does BSBM Explore Measure? Benchmarks, Redux (part 8): BSBM Explore and Update Benchmarks, Redux (part 9): BSBM With Cluster Benchmarks, Redux (part 10): LOD2 and the Benchmark Process Benchmarks, Redux (part 11): The Substance of Benchmarks Benchmarks, Redux (part 12): Our Own BSBM Results Report Benchmarks, Redux (part 13): BSBM BI Modifications (this post) Benchmarks, Redux (part 14): BSBM BI Mix Benchmarks, Redux (part 15): BSBM Test Driver Enhancements</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[
<p>In this post we introduce changes to the <a class="auto-href" href="http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html" id="link-id0x234e0ca0">BSBM</a> BI queries and metric. These changes are motivated by prevailing benchmark practice and by our experiences in optimizing for the BSBM BI workload.</p>

<p>We will publish results according to the definitions given here and recommend that any interested parties do likewise.  The rationales are given in the text.</p>


<h3>Query Mix</h3>

<p>We have removed Q4 from the mix because its cost is quadratic in the scale factor.  The other queries are roughly <code>n * log (n)</code>.  </p>
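<p>To make the difference concrete, here is a small sketch that compares how the two cost models grow when the data set becomes 10x larger.  The cost functions and the value of <code>n</code> are illustrative assumptions only; they are not measurements of any BSBM query.  Under these assumptions the quadratic cost grows about 100x while the <code>n * log (n)</code> cost grows about 11x.</p>
<pre>
```python
import math


def growth_factor(cost, n, scale_up=10):
    """Return how much cost(n) grows when n is multiplied by scale_up."""
    return cost(n * scale_up) / cost(n)


def quadratic(n):
    # Idealized cost model for a query that is quadratic in the data size.
    return n * n


def n_log_n(n):
    # Idealized cost model for the remaining BI queries.
    return n * math.log(n)


# Illustrative data size only: roughly the triple count of the 100 Mt scale.
n = 100_000_000
quad_growth = growth_factor(quadratic, n)
nlogn_growth = growth_factor(n_log_n, n)
print(f"quadratic: x{quad_growth:.1f}, n log n: x{nlogn_growth:.2f}")
```
</pre>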


<h3>Parameter Substitution </h3>

<p>All queries that take a product type as parameter are run in flights of several query invocations where the product type goes from broader to more specific.  The initial product type specifies either the root product type or an immediate subtype of this, and the last in the drill-down is a leaf type.</p>

<p>The rationale for this is that the choice of product type may make several orders of magnitude difference in the run time of a query.  In order to make consecutive query mixes roughly comparable in execution time, all mixes should have a predictable number of query invocations with product types of each level.</p>
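<p>The drill-down can be sketched as below.  This is not the test driver's code: the hierarchy, the type names, and the <code>drill_down</code> function are hypothetical and only show the idea of walking from a broad product type to a leaf.</p>
<pre>
```python
import random

# Hypothetical product type hierarchy: each type maps to its direct subtypes.
HIERARCHY = {
    "ProductType1": ["ProductType2", "ProductType3"],
    "ProductType2": ["ProductType4", "ProductType5"],
    "ProductType3": ["ProductType6"],
    "ProductType4": [],
    "ProductType5": ["ProductType7"],
    "ProductType6": [],
    "ProductType7": [],
}


def drill_down(hierarchy, root, rng, start_at_root=True):
    """Return product types from broad to specific, ending at a leaf type."""
    current = root
    if not start_at_root and hierarchy[root]:
        # Start from an immediate subtype of the root instead of the root.
        current = rng.choice(hierarchy[root])
    flight = [current]
    while hierarchy[current]:
        current = rng.choice(hierarchy[current])
        flight.append(current)
    return flight


rng = random.Random(42)
flight = drill_down(HIERARCHY, "ProductType1", rng)
print(flight)
```
</pre>
<p>Each element of <code>flight</code> would parameterize one invocation of the same query template.  Note that in this sketch the number of steps depends on where the leaf happens to be; a driver that wants a predictable number of invocations per level would have to control the depth as well.</p>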


<h3>Query Order </h3>

<p>In the BI mix, when running multiple concurrent clients, each query mix is submitted in a random order.  Queries which do drill-downs always have the steps of the drill-down as consecutive in the session, but the query templates are permuted.  This is done to make it less likely that two concurrent queries access exactly the same <a class="auto-href" href="http://dbpedia.org/resource/Data" id="link-id0x23be8d28">data</a>.  In this way, scans cannot be trivially shared between queries -- but there are still opportunities for reuse of results and adapting execution to the working set, e.g., starting with what is in memory.</p>
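<p>A minimal sketch of this ordering follows.  The template names, the parameter labels, and the <code>build_session</code> function are assumptions made for illustration; the actual driver may organize this differently.</p>
<pre>
```python
import random


def build_session(templates, rng):
    """Permute the templates, keeping each template's drill-down steps adjacent."""
    order = list(templates)
    rng.shuffle(order)
    session = []
    for name in order:
        # The steps of one drill-down stay consecutive and in their original order.
        for param in templates[name]:
            session.append((name, param))
    return session


# Hypothetical mix: template name mapped to its ordered parameters
# (a single entry when the template has no drill-down).
TEMPLATES = {
    "Q1": ["root", "subtype", "leaf"],
    "Q2": [None],
    "Q3": ["root", "subtype", "leaf"],
    "Q5": [None],
}

# One seeded generator per client, so each client gets its own permutation.
sessions = [build_session(TEMPLATES, random.Random(client)) for client in range(3)]
for session in sessions:
    print([name for name, _ in session])
```
</pre>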


<h3>Metrics </h3>

<p>We use a <a class="auto-href" href="http://www.tpc.org/" id="link-id0x238c81a0">TPC</a>-<a class="auto-href" href="http://dbpedia.org/resource/TPC-H" id="link-id0x28c6bbd8">H</a>-like metric.  This metric consists of a single-user part and a multi-user part, called respectively <i>Power</i> and <i>Throughput.</i>  The <i>Power</i> metric is based on the geometric mean of query run-times.  The <i>Throughput</i> metric is based on the arithmetic mean, i.e., the total run-time divided by the number of queries completed.  After taking the mean, the time is converted into queries-per-hour.  This rate is then multiplied by the scale factor divided by the scale factor for 100 Mt. In other words, we consider the 100 Mt data set as the unit scale.</p>

<p>The <i>Power</i> is defined as</p>
<blockquote>( scale_factor / 284826 ) *  3600 / ( ( t1 * t2 * ... * tn ) ^ ( 1 / n ) ) </blockquote>
<p>The <i>Throughput</i> is defined as</p>
<blockquote>( scale_factor / 284826 ) *  3600 / ( ( t1 + t2 + ... + tn ) / n ) </blockquote>
<p>The magic number <b><code>284826</code></b> is the scale that generates approximately 100 million triples (100 Mt).  We consider this scale &quot;one&quot;.  The reason for the multiplication is that scores at different scales should get similar numbers; otherwise a 10x larger scale would result in roughly 10x lower throughput with the BI queries.</p>
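<p>The two definitions above can be written out as a short sketch.  The function names and the sample run times are made up for illustration, and the run times are assumed to be in seconds (which is what the factor 3600 suggests).  The geometric mean is computed through logarithms to avoid overflow when many times are multiplied.</p>
<pre>
```python
import math

UNIT_SCALE = 284826  # scale factor that yields about 100 million triples


def power(scale_factor, times):
    """Single-user metric: scaled queries per hour from the geometric mean run time."""
    geo_mean = math.exp(sum(math.log(t) for t in times) / len(times))
    return (scale_factor / UNIT_SCALE) * 3600 / geo_mean


def throughput(scale_factor, times):
    """Multi-user metric: scaled queries per hour from the arithmetic mean run time."""
    arith_mean = sum(times) / len(times)
    return (scale_factor / UNIT_SCALE) * 3600 / arith_mean


times = [1.0, 4.0, 16.0]  # made-up run times in seconds
p = power(284826, times)
tp = throughput(284826, times)
print(round(p, 1), round(tp, 1))
```
</pre>
<p>With these sample times the geometric mean is 4 seconds and the arithmetic mean is 7 seconds, so at the unit scale the sketch gives a Power of about 900 and a Throughput of about 514.  Doubling both the scale factor and every run time leaves both numbers unchanged, which is the intent of the scale multiplication.</p>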


<p>The <i>Composite</i> metric is the geometric mean of the <i>Power</i> and <i>Throughput</i> metrics.  A complete report shows both <i>Power</i> and <i>Throughput</i> metrics, as well as individual query times for all queries.  The rationale for using a geometric mean is to give an equal importance to long and short queries.  Halving the execution time of either a long query or a short query will have the same effect on the metric.  This is good for encouraging research into all aspects of query processing.  On the other hand, real-life users are more interested in halving the time of queries that take one hour than of queries that take one second; therefore, the <i>Throughput</i> metric uses the arithmetic mean of run times, which long queries dominate.</p>
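<p>The halving argument can be checked with a few lines.  The run times below are invented for the purpose; the sketch prints, for each kind of mean, the improvement factor obtained by halving the shortest query and by halving the longest query.</p>
<pre>
```python
import math


def geometric_mean(times):
    return math.exp(sum(math.log(t) for t in times) / len(times))


def arithmetic_mean(times):
    return sum(times) / len(times)


base = [1.0, 60.0, 3600.0]          # made-up run times in seconds
short_halved = [0.5, 60.0, 3600.0]  # the one-second query takes half as long
long_halved = [1.0, 60.0, 1800.0]   # the one-hour query takes half as long

for label, mean in (("geometric", geometric_mean), ("arithmetic", arithmetic_mean)):
    # Improvement factor of the mean for each of the two optimizations.
    print(label, mean(base) / mean(short_halved), mean(base) / mean(long_halved))
```
</pre>
<p>With the geometric mean both optimizations improve the result by the same factor, the cube root of 2 (about 1.26).  With the arithmetic mean, halving the one-second query is barely visible, while halving the one-hour query nearly halves the mean.</p>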

<p>Taking the geometric mean of the two metrics gives more weight to the lower of the two than an arithmetic mean, hence we pay more attention to the worse of the two.</p>
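<p>As a small worked example with invented scores: a system with Power 900 and Throughput 100 has the same arithmetic mean as a system scoring 500 on both, but a clearly lower geometric mean.</p>
<pre>
```python
import math


def composite(power, throughput):
    """Geometric mean of the Power and Throughput metrics."""
    return math.sqrt(power * throughput)


def arithmetic(power, throughput):
    """Arithmetic mean, shown only for comparison."""
    return (power + throughput) / 2


balanced = composite(500.0, 500.0)
lopsided = composite(900.0, 100.0)
print(balanced, lopsided, arithmetic(900.0, 100.0))  # prints 500.0 300.0 500.0
```
</pre>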

<p>Single-user and multi-user metrics are separate because of the relative importance of intra-query parallelization in BI workloads: There may not be large numbers of concurrent users, yet queries are still complex, and it is important to have maximum parallelization. Therefore the metric rewards single-user performance.</p>


<p>In the next post we will look at the use of this metric and the actual content of BSBM BI.</p>



<h3>
<i>Benchmarks, Redux</i> Series</h3>
<ul>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1658" id="link-id0x1b02d528">Benchmarks, Redux (part 1): On RDF Benchmarks</a>
</li>

<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1660" id="link-id0x1d65f740">Benchmarks, Redux (part 2): A Benchmarking Story</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1663" id="link-id0x1a797860">Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1665" id="link-id0x1d3538e0">Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1667" id="link-id0x1e566f60">Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1669" id="link-id0x1dedffd8">Benchmarks, Redux (part 6): BSBM and I/O, continued</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1671" id="link-id0x1eb11528">Benchmarks, Redux (part 7): What Does BSBM Explore Measure?</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1673" id="link-id0x1db46c38">Benchmarks, Redux (part 8): BSBM Explore and Update </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1675" id="link-id0x1c8174e8">Benchmarks, Redux (part 9): BSBM With Cluster</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1677" id="link-id0x1dfa9338">Benchmarks, Redux (part 10): LOD2 and the Benchmark Process</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1678" id="link-id0x1e6dd7b0">Benchmarks, Redux (part 11): The Substance of Benchmarks</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1d154bb0">Benchmarks, Redux (part 12): Our Own BSBM Results Report</a>
</li>
<li>
Benchmarks, Redux (part 13): BSBM BI Modifications <i>(this post)</i>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1f242ae0">Benchmarks, Redux (part 14): BSBM BI Mix </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1ebf2f98">Benchmarks, Redux (part 15): BSBM Test Driver Enhancements </a>
</li>
</ul>]]></content:encoded>
  <dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Virtuso Data Space Bot &lt;kidehen@openlinksw.com&gt;</dc:creator>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/vdb/blog/?id=1685">
  <rss:title>Benchmarks, Redux (part 12): Our Own BSBM Results Report</rss:title>
  <rss:link>http://www.openlinksw.com/blog/vdb/blog/?id=1685</rss:link>
  <wfw:comment xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/mt-tb/Http/comments?id=1685</wfw:comment>
  <wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/blog/vdb/blog/gems/rsscomment.xml?:id=1685</wfw:commentRss>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2011-03-22T22:29:56Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">This is a placeholder; it will be replaced with a complete report in the very near future.</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>
<i>This is a placeholder; it will be replaced with a complete report in the very near future.</i>
</p>]]></content:encoded>
  <dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Virtuso Data Space Bot &lt;kidehen@openlinksw.com&gt;</dc:creator>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/vdb/blog/?id=1680">
  <rss:title>Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks</rss:title>
  <rss:link>http://www.openlinksw.com/blog/vdb/blog/?id=1680</rss:link>
  <wfw:comment xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/mt-tb/Http/comments?id=1680</wfw:comment>
  <wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/blog/vdb/blog/gems/rsscomment.xml?:id=1680</wfw:commentRss>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2011-03-10T23:30:11Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">Let us talk about what ought to be benchmarked in the context of RDF. A point that often gets brought up by RDF-ers when talking about benchmarks is that there already exist systems which perform very well at TPC-H and similar workloads, and therefore there is no need for RDF to go there. It is, as it were, somebody else&#39;s problem; besides, it is a solved one. On the other hand, being able to express what is generally expected of a query language might not be a core competence or a competitive edge, but it certainly is a checklist item. BSBM seems to be adopted as a de facto RDF benchmark, as there indeed is almost nothing else. But we should not lose sight of the fact that this is in fact a relational schema and workload that has just been straightforwardly transformed to RDF. BSBM was made, after all, in part for measuring RDB to RDF mapping. Thus BSBM is no more RDF-ish than a trivially RDF-ized TPC-H would be. TPC-H is however a bit more difficult if also a better thought out benchmark than the BSBM BI Mix proposal. But I do not expect an RDF audience to have any enthusiasm for this as this is indeed a very tough race by now, and besides one in which RDB and SQL will keep some advantage. However, using this as a validation test is meaningful, as there exists a validation dataset and queries that we already have RDF-ized. We could publish these and call this &quot;RDF-H&quot;. In the following I will outline what would constitute an RDF-friendly, scientifically interesting benchmark. The points are in part based on discussions with Peter Boncz of CWI. The Social Network Intelligence Benchmark (SNIB) takes the social web Facebook-style schema Ivan Mikhailov and I made last year under the name of Botnet BM. In LOD2, CWI is presently working on this. The data includes DBpedia as a base component used for providing conversation topics, information about geographical locales of simulated users, etc. 
DBpedia is not very large, around 200M-300M triples, but it is diverse enough. The data will have correlations, e.g., people who talk about sports tend to know other people who talk about the same sport, and they are more likely to know people from their geographical area than from elsewhere. The bulk of the data consists of a rich history of interactions including messages to individuals and groups, linking to people, dropping links, joining and leaving groups, and so forth. The messages are tagged using real-world concepts from DBpedia, and there is correlation between tagging and textual content since both are generated from DBpedia articles. Since there is such correlation, NLP techniques like entity and relationship extraction can be used with the data even though this is not the primary thrust of SNIB. There is variation in frequency of online interaction, and this interaction consists of sessions. For example, one could analyze user behavior per time of day for online ad placement. The data probably should include propagating memes, fashions, and trends that travel on the social network. With this, one could query about their origin and speed of propagation. There should probably be cases of duplicate identities in the data, i.e., one real person using many online accounts to push an agenda. Resolving duplicate identities makes for nice queries. Ragged data with half-filled profiles and misspelled identifiers like person and place names are a natural part of the social web use case. The data generator should take this into account. Distribution of popularity and activity should follow a power-law-like pattern; actual measures of popularity can be sampled from existing social networks even though large quantities of data cannot easily be extracted. The dataset should be predictably scalable. For the workload considered, the relative importance of the queries or other measured tasks should not change dramatically with the scale. 
For example, some queries are logarithmic in data size (e.g., find connections to a person), some are linear (e.g., find average online time of sports fans on Sundays), and some are quadratic or worse (e.g., find two extremists of the same ideology that are otherwise unrelated). Making a single metric from such parts may not be meaningful. Therefore, SNIB might be structured into different workloads. The first would be an online mix with typically short lookups and updates, around O ( log ( n ) ). The Business Intelligence Mix would be composed of queries around O ( n log ( n ) ). Even so, with real data, choice of parameters will provide dramatic changes in query run-time. Therefore a run should be specified to have a predictable distribution of &quot;hard&quot; and &quot;easy&quot; parameter choices. In the BSBM BI mix modification, I did this by defining some to be drill downs from a more general to a more specific level of a hierarchy. This could be done here too in some cases; other cases would have to be defined with buckets of values. Both the real world and LOD2 are largely concerned with data integration. The SNIB workload can have aspects of this, for example, in resolving duplicate identities. These operations are more complex than typical database queries, as the attributes used for joining might not even match in the initial data. One characteristic of these is the production of sometimes large intermediate results that need to be materialized. Doing these operations in practice requires procedural control. Further, running algorithms like network analytics (e.g., PageRank, centrality, etc.) involves aggregation of intermediate results that is not very well expressible in a query language. Some basic graph operations like shortest path are expressible in principle, but not in unextended SPARQL 1.1, as they would, for example, involve returning paths, which are explicitly excluded from the spec. 
These are however the areas where we need to go for a benchmark that is more than a repackaging of a relational BI workload. We find that such a workload will have procedural sections either in application code or stored procedures. Map-reduce is sometimes used for scaling these. As one would expect, many cluster databases have their own version of these control structures. Therefore some of the SNIB workload could even be implemented as map-reduce jobs alongside parallel database implementations. We might here touch base with the LarKC map-reduce work to see if it could be applied to SNIB workloads. We see a three-level structure emerging. There is an Online mix which is a bit like the BSBM Explore mix, and an Analytics mix which is on the same order of complexity as TPC-H. These may have a more-or-less fixed query formulation and test driver. Beyond these, yet working on the same data, we have a set of Predefined Tasks which the test sponsor may implement in a manner of their choice. We would finally get to the &quot;raging conflict&quot; between the &quot;declarativists&quot; and the &quot;map reductionists.&quot; Last year&#39;s VLDB had a lot of map-reduce papers. I know of comparisons between Vertica and map reduce for doing a fairly simple SQL query on a lot of data, but here we would be talking about much more complex jobs on more interesting (i.e., less uniform) data. We might even interest some of the cluster RDBMS players (Teradata, Vertica, Greenplum, Oracle Exadata, ParAccel, and/or Aster Data, to name a few) in running this workload using their map-reduce analogs. We see that as we get to topics beyond relational BI, we do not find ourselves in an RDF-only world but very much at a crossroads of many technologies, e.g., map-reduce and its database analogs, various custom built databases, graph libraries, data integration and cleaning tools, and so forth. There is not, nor ought there to be, a sheltered, RDF-only enclave. 
RDF will have to justify itself in a world of alternatives. This must be reflected in our benchmark development, so relational BI is not irrelevant; in fact, it is what everybody does. RDF cannot be a total failure at this, even if this were not RDF&#39;s claim to fame. The claim to fame comes after we pass this stage, which is what we intend to explore in SNIB. Benchmarks, Redux Series Benchmarks, Redux (part 1): On RDF Benchmarks Benchmarks, Redux (part 2): A Benchmarking Story Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs Benchmarks, Redux (part 6): BSBM and I/O, continued Benchmarks, Redux (part 7): What Does BSBM Explore Measure? Benchmarks, Redux (part 8): BSBM Explore and Update Benchmarks, Redux (part 9): BSBM With Cluster Benchmarks, Redux (part 10): LOD2 and the Benchmark Process Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks (this post) Benchmarks, Redux (part 12): Our Own BSBM Results Report Benchmarks, Redux (part 13): BSBM BI Modifications Benchmarks, Redux (part 14): BSBM BI Mix Benchmarks, Redux (part 15): BSBM Test Driver Enhancements</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>Let us talk about what ought to be benchmarked in the context of <a class="auto-href" href="http://dbpedia.org/resource/Resource_Description_Framework" id="link-id0x2a84d3c0">RDF</a>.</p>

<p>A point that often gets brought up by RDF-ers when talking about benchmarks is that there already exist systems which perform very well at <a class="auto-href" href="http://www.tpc.org/" id="link-id0x2a9758e8">TPC</a>-<a class="auto-href" href="http://dbpedia.org/resource/TPC-H" id="link-id0x2a8fa2a0">H</a> and similar workloads, and therefore there is no need for RDF to go there.  It is, as it were, somebody else&#39;s problem; besides, it is a solved one.</p>

<p>On the other hand, being able to express what is generally expected of a query language might not be a core competence or a competitive edge, but it certainly is a checklist item.</p>

<p>
<a class="auto-href" href="http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html" id="link-id0x29c75a30">BSBM</a> seems to be adopted as a de facto RDF benchmark, as there indeed is almost nothing else.  But we should not lose sight of the fact that this is in fact a relational <a class="auto-href" href="http://dbpedia.org/resource/Database_schema" id="link-id0x2a0565b8">schema</a> and workload that has just been straightforwardly transformed to RDF.  BSBM was made, after all, in part for measuring RDB to RDF mapping.  Thus BSBM is no more RDF-ish than a trivially RDF-ized TPC-H would be.  TPC-H is, however, a somewhat more difficult, if also better thought-out, benchmark than the BSBM BI Mix proposal.  But I do not expect an RDF audience to have any enthusiasm for TPC-H, as this is indeed a very tough race by now, and besides one in which RDB and <a class="auto-href" href="http://dbpedia.org/resource/SQL" id="link-id0x29c44d50">SQL</a> will keep some advantage.  However, using it as a validation test is meaningful, as there exists a validation dataset and queries that we already have RDF-ized.  We could publish these and call this &quot;RDF-H&quot;.  </p>

<p>In the following I will outline what would constitute an RDF-friendly, scientifically interesting benchmark.  The points are in part based on discussions with <a class="auto-href" href="http://nl.linkedin.com/in/peterboncz" id="link-id0x2ac282f0">Peter Boncz</a> of <a class="auto-href" href="http://dbpedia.org/resource/National_Research_Institute_for_Mathematics_and_Computer_Science" id="link-id0x2a1c9e10">CWI</a>.</p>

<p>The <a class="auto-href" href="http://www.w3.org/wiki/Social_Network_Intelligence_BenchMark" id="link-id0x29e7d3d8">Social Network Intelligence Benchmark</a> (<a class="auto-href" href="http://www.w3.org/wiki/Social_Network_Intelligence_BenchMark" id="link-id0x2a70e3c0">SNIB</a>) takes the social web Facebook-style schema Ivan Mikhailov and I made last year under the name of Botnet BM.  In <a class="auto-href" href="http://lod2.eu/" id="link-id0x2a9a70f0">LOD2</a>, CWI is presently working on this.</p>

<p>The <a class="auto-href" href="http://dbpedia.org/resource/Data" id="link-id0x2ad04408">data</a> includes <a class="auto-href" href="http://dbpedia.org/resource/DBpedia" id="link-id0x29d5eeb0">DBpedia</a> as a base component used for providing conversation topics, <a class="auto-href" href="http://dbpedia.org/resource/Information" id="link-id0x2ac97c40">information</a> about geographical locales of simulated users, etc.  DBpedia is not very large, around 200M-300M triples, but it is diverse enough.</p>

<p>The data will have correlations, e.g., people who talk about sports tend to know other people who talk about the same sport, and they are more likely to know people from their geographical area than from elsewhere.  </p>

<p>The bulk of the data consists of a rich history of interactions including messages to individuals and groups, linking to people, dropping links, joining and leaving groups, and so forth.  The messages are tagged using real-world concepts from DBpedia, and there is correlation between tagging and textual content since both are generated from DBpedia articles.  Since there is such correlation, <a class="auto-href" href="http://dbpedia.org/resource/Natural_language_processing" id="link-id0x2ac359c0">NLP</a> techniques like <a class="auto-href" href="http://dbpedia.org/resource/Entity" id="link-id0x2a1c8ed0">entity</a> and relationship extraction can be used with the data even though this is not the primary thrust of SNIB.</p>

<p>There is variation in frequency of online interaction, and this interaction consists of sessions.  For example, one could analyze user behavior per time of day for online ad placement.</p>

<p>The data probably should include propagating memes, fashions, and trends that travel on the social network.  With this, one could query about their origin and speed of propagation.</p>

<p>There should probably be cases of duplicate identities in the data, i.e., one real person using many online accounts to push an agenda. Resolving duplicate identities makes for nice queries.</p>

<p>Ragged data with half-filled profiles and misspelled identifiers like person and place names are a natural part of the social web use case. The data generator should take this into account.</p>

<ul>
<li>
  <p>Distribution of popularity and activity should follow a power-law-like pattern; actual measures of popularity can be sampled from existing social networks even though large quantities of data cannot easily be extracted.</p>
</li>

<li>
  <p>The dataset should be predictably scalable.  For the workload considered, the relative importance of the queries or other measured tasks should not change dramatically with the scale.</p>
</li>
</ul>
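<p>The power-law requirement in the list above could be approached along the following lines.  This is only a sketch: the Zipf-style weights, the exponent, and the function names are my assumptions, and a real generator would calibrate them against popularity measures sampled from existing social networks.</p>
<pre>
```python
import random


def zipf_weights(count, exponent=1.0):
    """Weights proportional to 1 / rank**exponent for ranks 1..count."""
    return [1.0 / (rank ** exponent) for rank in range(1, count + 1)]


def assign_activity(users, events, rng, exponent=1.0):
    """Distribute events over users so that a few users receive most of them."""
    weights = zipf_weights(len(users), exponent)
    tally = dict.fromkeys(users, 0)
    for user in rng.choices(users, weights=weights, k=events):
        tally[user] += 1
    return tally


# Hypothetical user identifiers, ordered by popularity rank.
users = [f"user{i}" for i in range(1, 101)]
tally = assign_activity(users, 10_000, random.Random(1))
print(tally["user1"], tally["user50"], tally["user100"])
```
</pre>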

<p>For example, some queries are logarithmic in data size (e.g., find connections to a person), some are linear (e.g., find average online time of sports fans on Sundays), and some are quadratic or worse (e.g., find two extremists of the same ideology that are otherwise unrelated).  Making a single metric from such parts may not be meaningful.  Therefore, SNIB might be structured into different workloads.</p>

<p>The first would be an online mix with typically short lookups and updates, around <code>O ( log ( n ) )</code>.  </p>

<p>The Business Intelligence Mix would be composed of queries around <code>O ( n log ( n ) )</code>.  Even so, with real data, choice of parameters will provide dramatic changes in query run-time.  Therefore a run should be specified to have a predictable distribution of &quot;hard&quot; and &quot;easy&quot; parameter choices.  In the BSBM BI mix modification, I did this by defining some to be drill downs from a more general to a more specific level of a hierarchy.  This could be done here too in some cases; other cases would have to be defined with buckets of values. </p>
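<p>One way to read &quot;buckets of values&quot; is sketched below: parameter values are sorted by an estimated cost, split into buckets, and every run draws the same number of values from each bucket.  The cost figures, the names, and the bucketing rule are assumptions for illustration, not a specification.</p>
<pre>
```python
import random


def make_buckets(costs, bucket_count):
    """Split parameter values into up to bucket_count buckets, cheapest first."""
    ordered = sorted(costs, key=costs.get)
    size = -(-len(ordered) // bucket_count)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def pick_parameters(buckets, per_bucket, rng):
    """Draw the same number of values from every bucket for one run."""
    return [value for bucket in buckets for value in rng.sample(bucket, per_bucket)]


# Hypothetical estimated cost (e.g., number of matching rows) per parameter value.
COSTS = {"a": 10, "b": 5000, "c": 120, "d": 90000, "e": 45, "f": 700}

buckets = make_buckets(COSTS, 3)
run = pick_parameters(buckets, 1, random.Random(7))
print(buckets, run)
```
</pre>
<p>Every run then contains one &quot;easy&quot;, one &quot;medium&quot;, and one &quot;hard&quot; choice, so consecutive runs stay roughly comparable even though the individual values differ.</p>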

<p>Both the real world and LOD2 are largely concerned with data integration.  The SNIB workload can have aspects of this, for example, in resolving duplicate identities.  These operations are more complex than typical database queries, as the attributes used for joining might not even match in the initial data.</p>

<p>One characteristic of these is the production of sometimes large intermediate results that need to be materialized.  Doing these operations in practice requires procedural control.  Further, running algorithms like network analytics (e.g., PageRank, centrality, etc.) involves aggregation of intermediate results that is not very well expressible in a query language.  Some basic graph operations like shortest path are expressible in principle, but not in unextended <a class="auto-href" href="http://dbpedia.org/resource/SPARQL" id="link-id0x29d26588">SPARQL</a> 1.1, as they would, for example, involve returning paths, which are explicitly excluded from the spec.</p>
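<p>To illustrate what &quot;returning paths&quot; means in procedural terms, here is a minimal breadth-first search that returns the path itself rather than just the fact that two nodes are connected.  The graph and the names in it are invented, and this is ordinary application code, not something SNIB prescribes.</p>
<pre>
```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search returning one shortest path as a list of nodes, or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the predecessor links back to the start to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical directed "knows" graph between user identifiers.
KNOWS = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave", "erin"],
    "dave": ["frank"],
    "erin": ["frank"],
}

print(shortest_path(KNOWS, "alice", "frank"))  # ['alice', 'bob', 'dave', 'frank']
```
</pre>
<p>The <code>previous</code> map is exactly the kind of materialized intermediate result mentioned above: it grows with the part of the graph that is visited and has to be kept until the search ends.</p>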

<p>These are however the areas where we need to go for a benchmark that is more than a repackaging of a relational BI workload.</p>

<p>We find that such a workload will have procedural sections either in application code or stored procedures.  Map-reduce is sometimes used for scaling these.  As one would expect, many cluster databases have their own version of these control structures.  Therefore some of the SNIB workload could even be implemented as map-reduce jobs alongside parallel database implementations.  We might here touch base with the <a class="auto-href" href="http://www.larkc.eu/" id="link-id0x29b69640">LarKC</a> map-reduce work to see if it could be applied to SNIB workloads. </p>

<p>We see a three-level structure emerging.  There is an <i>Online</i> mix which is a bit like the BSBM <i>Explore</i> mix, and an <i>Analytics</i> mix which is on the same order of complexity as TPC-H.  These may have a more-or-less fixed query formulation and test driver.  Beyond these, yet working on the same data, we have a set of <i>Predefined Tasks</i> which the test sponsor may implement in a manner of their choice.</p>

<p>We would finally get to the &quot;raging conflict&quot; between the &quot;declarativists&quot; and  the &quot;map reductionists.&quot;  Last year&#39;s VLDB had a lot of map-reduce papers.  I know of comparisons between <a class="auto-href" href="http://www.vertica.com/" id="link-id0x2a8c4510">Vertica</a> and map reduce for doing a fairly simple SQL query on a lot of data, but here we would be talking about much more complex jobs on more interesting (i.e., less uniform) data.</p>

<p>We might even interest some of the cluster <a class="auto-href" href="http://dbpedia.org/resource/Relational_database_management_system" id="link-id0x2995aaa8">RDBMS</a> players (<a class="auto-href" href="http://www.teradata.com/" id="link-id0x29c9af10">Teradata</a>, Vertica, <a class="auto-href" href="http://dbpedia.org/resource/Greenplum" id="link-id0x29c9af38">Greenplum</a>, <a class="auto-href" href="http://dbpedia.org/page/Oracle_Exadata" id="link-id0x29d48b78">Oracle Exadata</a>, <a class="auto-href" href="http://www.paraccel.com/" id="link-id0x29d48ba0">ParAccel</a>, and/or <a class="auto-href" href="http://www.asterdata.com/" id="link-id0x29bf8fb0">Aster Data</a>, to name a few) in running this workload using their map-reduce analogs.</p>


<p>We see that as we get to topics beyond relational BI, we do not find ourselves in an RDF-only world but very much at a crossroads of many technologies, e.g., map-reduce and its database analogs, various custom built databases, graph libraries, data integration and cleaning tools, and so forth.</p>

<p>There is not, nor ought there to be, a sheltered, RDF-only enclave.  RDF will have to justify itself in a world of alternatives.</p>

<p>This must be reflected in our benchmark development, so relational BI is not irrelevant; in fact, it is what everybody does.  RDF cannot be a total failure at this, even if this were not RDF&#39;s claim to fame. The claim to fame comes after we pass this stage, which is what we intend to explore in SNIB.</p>



<h3>
<i>Benchmarks, Redux</i> Series</h3>
<ul>
<li>  <a href="http://www.openlinksw.com/weblog/oerling/?id=1658" id="link-id0x1c9f7ab8">Benchmarks, Redux (part 1): On RDF Benchmarks</a>
</li>

<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1660" id="link-id0x1dd17b28">Benchmarks, Redux (part 2): A Benchmarking Story</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1663" id="link-id0x1eb20620">Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1665" id="link-id0x1f8a5ae8">Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1667" id="link-id0x1ac14a08">Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1669" id="link-id0x1d1f8d58">Benchmarks, Redux (part 6): BSBM and I/O, continued</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1671" id="link-id0x1ea83308">Benchmarks, Redux (part 7): What Does BSBM Explore Measure?</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1673" id="link-id0x1b548028">Benchmarks, Redux (part 8): BSBM Explore and Update </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1675" id="link-id0x1c3d9c58">Benchmarks, Redux (part 9): BSBM With Cluster</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1677" id="link-id0x1f5e6978">Benchmarks, Redux (part 10): LOD2 and the Benchmark Process</a>
</li>
<li>
Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks <i>(this post)</i>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1c082a28">Benchmarks, Redux (part 12): Our Own BSBM Results Report</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1ec73578">Benchmarks, Redux (part 13): BSBM BI Modifications </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1eb25d48">Benchmarks, Redux (part 14): BSBM BI Mix </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1b261958">Benchmarks, Redux (part 15): BSBM Test Driver Enhancements </a>
</li>
</ul>]]></content:encoded>
  <dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Virtuso Data Space Bot &lt;kidehen@openlinksw.com&gt;</dc:creator>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/vdb/blog/?id=1679">
  <rss:title>Benchmarks, Redux (part 10): LOD2 and the Benchmark Process</rss:title>
  <rss:link>http://www.openlinksw.com/blog/vdb/blog/?id=1679</rss:link>
  <wfw:comment xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/mt-tb/Http/comments?id=1679</wfw:comment>
  <wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/blog/vdb/blog/gems/rsscomment.xml?:id=1679</wfw:commentRss>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2011-03-10T23:29:41Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">I have in the previous posts generally argued for and demonstrated the usefulness of benchmarks. Here I will talk about how this could be organized in a way that is tractable, and takes vendor and end user interests into account. These are my views on the subject and do not represent a LOD2 members consensus, but have been discussed in the consortium. My colleague Ivan Mikhailov once proposed that the only way to get benchmarks run right is to package them as a single script that does everything, like instant noodles -- just add water! But even instant noodles can be abused: Cook too long, add too much water, maybe forget to light the stove, and complain that the result is unsatisfyingly hard and brittle, lacking the suppleness one has grown to expect from this delicacy. No, the answer lies at the other end of the culinary spectrum, in gourmet cooking. Let the best cooks show what they can do, and let them work at it; let those who in fact have capacity and motivation for creating le chef d&#39;oeuvre culinaire (&quot;the culinary masterpiece&quot;) create it. Even so, there are many value points along the dimensions of preparation time, cost, and esthetic layout, not to forget taste and nutritional values. Indeed, an intimate knowledge de la vie secrete du canard (&quot;the secret life of duck&quot;) is required in order to liberate the aroma that it might take flight and soar. In the previous, I have shed some light on how we prepare le canard, and if le canard be such then la dinde (turkey) might in some ways be analogous; who is to say? In other words, as a vendor, we want to have complete control over the benchmarking process, and have it take place in our environment at a time of our choice. In exchange for this, we are ready to document and observe possibly complicated rules, document how the runs are made, and let others monitor and repeat them on the equipment on which the results are obtained. 
This is the TPC (Transaction Processing Performance Council) model. Another culture of doing benchmarks is the periodic challenge model used in TREC, the Billion Triples Challenge, the Semantic Search Challenge and others. In this model, vendors prepare the benchmark submission and agree to joint publication. A third party performing benchmarks by itself is uncommon in databases. Licenses even often explicitly prohibit this, for understandable reasons. The LOD2 project has an outreach activity called Publink where we offer to help owners of data to publish it as Linked Data. Similarly, since FP 7s are supposed to offer a visible service to their communities, I proposed that LOD2 offer to serve a role in disseminating and auditing RDF store benchmarks. One representative of an RDF store vendor I talked to, in relation to setting up a benchmark configuration of their product, told me that we could do this and that they would give some advice but that such an exercise was by its nature fundamentally flawed and could not possibly produce worthwhile results. The reason for this was that OpenLink engineers could not possibly learn enough about the other products nor unlearn enough of their own to make this a meaningful comparison. Isn&#39;t this the very truth? Let the chefs mix their own spices. This does not mean that there would not be comparability of results. If the benchmarks and processes are well defined, documented, and checked by a third party, these can be considered legitimate and not just one-off best-case results without further import. In order to stretch the envelope, which is very much a LOD2 goal, this benchmarking should be done on a variety of equipment -- whatever works best at the scale in question. Increasing the scale remains a stated objective. LOD2 even promised to run things with a trillion triples in another 3 years. Imagine that the unimpeachably impartial Berliners made house calls. Would this debase Justice to be a servant of mere show-off? 
Or would this on the contrary combine strict Justice with edifying Charity? Who indeed is in greater need of the light of objective evaluation than the vendor whose very nature makes a being of bias and prejudice? Even better, CWI, with its stellar database pedigree, agreed in principle to audit RDF benchmarks in LOD2. In this way one could get a stamp of approval for one&#39;s results regardless of when they were produced, and be free of the arbitrary schedule of third party benchmarking runs. On the relational side this is a process of some cost and complexity, but since the RDF side is still young and more on mutually friendly terms, the process can be somewhat lighter here. I did promise to draft some extra descriptions of process and result disclosure so that we could see how this goes. We could even do this unilaterally -- just publish Virtuoso results according to a predefined reporting and verification format. If others wished to publish by the same rules, LOD2 could use some of the benchmarking funds for auditing the proceedings. This could all take place over the net, so we are not talking about any huge cost or prohibitive amount of trouble. It would be in the FP7 spirit that LOD2 provide this service for free, naturally within reason. Then there is the matter of the BSBM Business Intelligence (BI) mix. At present, it seems everybody has chosen to defer the matter to another round of BSBM runs in the summer. This seems to fit the pattern of a public challenge with a few months given for contenders to prepare their submissions. Here we certainly should look at bigger scales and more diverse hardware than in the Berlin runs published this time around. The BI workload is in fact fairly cluster friendly, with big joins and aggregations that parallelize well. There it would definitely make sense to reserve an actual cluster, and have all contenders set up their gear on it. 
If all have access to the run environment and to monitoring tools, we can be reasonably sure that things will be done in a transparent manner. (I will talk about the BI mix in more detail in part 13 and part 14 of this series.) Once the BI mix has settled and there are a few interoperable implementations, likely in the summer, we could pass from the challenge model to a situation where vendors may publish results as they become available, with LOD2 offering its services for audit. Of course, this could be done even before then, but the content of the mix might not be settled. We likely need to check it on a few implementations first. For equipment, people can use their own, or LOD2 partners might on a case-by-case basis make some equipment available for running on the same hardware on which say the Virtuoso results were obtained. For example, FU Berlin could give people a login to get their recently published results fixed. Now this might or might not happen, so I will not hold my breath waiting for this but instead close with a proposal. As a unilateral diplomatic overture I put forth the following: If other vendors are interested in 1:1 comparison of their results with our publications, we can offer them a login to the same equipment. They can set up and tune their systems, and perform the runs. We will just watch. As an extra quid pro quo, they can try Virtuoso as configured for the results we have published, with the same data. Like this, both parties get to see the others&#39; technology with proper tuning and installation. What, if anything, is reported about this activity is up to the owner of the technology being tested. We will publish a set of benchmark rules that can serve as a guideline for mutually comparable reporting, but we cannot force anybody to use these. This all will function as a catalyst for technological advance, all to the ultimate benefit of the end user. 
If you wish to take advantage of this offer, you may contact Hugh Williams at OpenLink Software, and we will see how this can be arranged in practice. The next post will talk about the actual content of benchmarks. The milestone after this will be when we publish the measurement and reporting protocols. Benchmarks, Redux Series Benchmarks, Redux (part 1): On RDF Benchmarks Benchmarks, Redux (part 2): A Benchmarking Story Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs Benchmarks, Redux (part 6): BSBM and I/O, continued Benchmarks, Redux (part 7): What Does BSBM Explore Measure? Benchmarks, Redux (part 8): BSBM Explore and Update Benchmarks, Redux (part 9): BSBM With Cluster Benchmarks, Redux (part 10): LOD2 and the Benchmark Process (this post) Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks Benchmarks, Redux (part 12): Our Own BSBM Results Report Benchmarks, Redux (part 13): BSBM BI Modifications Benchmarks, Redux (part 14): BSBM BI Mix Benchmarks, Redux (part 15): BSBM Test Driver Enhancements</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>I have in the previous posts generally argued for and demonstrated the usefulness of benchmarks.</p>

<p>Here I will talk about how benchmarking could be organized in a way that is tractable and takes vendor and end-user interests into account. These are my views on the subject and do not represent a consensus of <a class="auto-href" href="http://lod2.eu/" id="link-id0x2acb0760">LOD2</a> members, but they have been discussed in the consortium. </p>

<p>My colleague Ivan Mikhailov once proposed that the only way to get benchmarks run right is to package them as a single script that does everything, like instant noodles -- just add water!  But even instant noodles can be abused: cook too long, add too much water, maybe forget to light the stove, and complain that the result is unsatisfyingly hard and brittle, lacking the suppleness one has grown to expect from this delicacy. No, the answer lies at the other end of the culinary spectrum, in gourmet cooking.  Let the best cooks show what they can do, and let them work at it; let those who in fact have the capacity and motivation for creating <i>le chef d&#39;oeuvre culinaire</i> (&quot;the culinary masterpiece&quot;) create it.  Even so, there are many value points along the dimensions of preparation time, cost, and esthetic layout, not to forget taste and nutritional values.  Indeed, an intimate <a class="auto-href" href="http://dbpedia.org/resource/Knowledge" id="link-id0x2aca6a30">knowledge</a> <i>de la vie secrète du canard</i> (&quot;the secret life of duck&quot;) is required in order to liberate the aroma that it might take flight and soar.  In the previous posts, I have shed some light on how we prepare <i>le canard</i>, and if <i>le canard</i> be such then <i>la dinde</i> (turkey) might in some ways be analogous; who is to say?</p>

<p>In other words, as a vendor, we want to have complete control over the benchmarking process, and have it take place in our environment at a time of our choice.  In exchange for this, we are ready to document and observe possibly complicated rules, document how the runs are made, and let others monitor and repeat them on the equipment on which the results are obtained.  This is the <a class="auto-href" href="http://www.tpc.org/" id="link-id0x2b847818">TPC</a> (Transaction Processing Performance Council) model.</p>

<p>Another culture of doing benchmarks is the periodic challenge model used in TREC, the <a class="auto-href" href="http://challenge.semanticweb.org/" id="link-id0x2ac3a6f8">Billion Triples Challenge</a>, the Semantic Search Challenge, and others. In this model, vendors prepare the benchmark submission and agree to joint publication.</p>

<p>A third party performing benchmarks by itself is uncommon in databases.  Licenses often explicitly prohibit this, for understandable reasons.</p>

<p>The LOD2 project has an outreach activity called Publink where we offer to help owners of <a class="auto-href" href="http://dbpedia.org/resource/Data" id="link-id0x2aea5930">data</a> publish it as <a class="auto-href" href="http://dbpedia.org/resource/Linked_Data" id="link-id0x2a790128">Linked Data</a>. Similarly, since FP7 projects are supposed to offer a visible service to their communities, I proposed that LOD2 offer to serve a role in disseminating and auditing <a class="auto-href" href="http://dbpedia.org/resource/Resource_Description_Framework" id="link-id0x29babb00">RDF</a> store benchmarks.</p>

<p>One representative of an RDF store vendor I talked to, in relation to setting up a benchmark configuration of their product, told me that we could do this and that they would give some advice but that such an exercise was by its nature fundamentally flawed and could not possibly produce worthwhile results.  The reason for this was that OpenLink engineers could not possibly learn enough about the other products nor unlearn enough of their own to make this a meaningful comparison.</p>

<p>Isn&#39;t this the very truth?   Let the chefs  mix their own spices.</p>

<p>This does not mean that there would not be comparability of results. If the benchmarks and processes are well defined, documented, and checked by a third party, these can be considered legitimate and not just one-off best-case results without further import.</p>

<p>In order to stretch the envelope, which is very much a LOD2 goal, this benchmarking should be done on a variety of equipment -- whatever works best at the scale in question.  Increasing the scale remains a stated objective.  LOD2 even promised to run things with a trillion triples in another 3 years.  </p>

<p>Imagine that the unimpeachably impartial Berliners made house calls. Would this debase Justice to be a servant of mere show-off?  Or would this on the contrary combine strict Justice with edifying Charity?  Who indeed is in greater need of the light of objective evaluation than the vendor, whose very nature makes it a being of bias and prejudice?</p>

<p>Even better, <a class="auto-href" href="http://dbpedia.org/resource/National_Research_Institute_for_Mathematics_and_Computer_Science" id="link-id0x2a21d108">CWI</a>, with its <a href="http://monetdb.cwi.nl/Development/Research/Articles/" id="link-id0x1d6479d0">stellar database pedigree</a>, agreed in principle to audit RDF benchmarks in LOD2. </p>

<p>In this way one could get a stamp of approval for one&#39;s results regardless of when they were produced, and be free of the arbitrary schedule of third-party benchmarking runs.  On the relational side this is a process of some cost and complexity, but since the RDF side is still young and on more mutually friendly terms, the process can be somewhat lighter here.  I did promise to draft some extra descriptions of process and result disclosure so that we could see how this goes.</p>

<p>We could even do this unilaterally -- just publish <a class="auto-href" href="http://virtuoso.openlinksw.com" id="link-id0x2a0d73d8">Virtuoso</a> results according to a predefined reporting and verification format.  If others wished to publish by the same rules, LOD2 could use some of the benchmarking funds for auditing the proceedings.  This could all take place over the Internet, so we are not talking about any huge cost or prohibitive amount of trouble.  It would be in the FP7 spirit that LOD2 provide this service for free, naturally within reason.</p>

<p>Then there is the matter of the <a class="auto-href" href="http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html" id="link-id0x2a1722a8">BSBM</a> Business Intelligence (BI) mix.  At present, it seems everybody has chosen to defer the matter to another round of BSBM runs in the summer.  This seems to fit the pattern of a public challenge with a few months given for contenders to prepare their submissions.  Here we certainly should look at bigger scales and more diverse hardware than in the Berlin runs published this time around.  The BI workload is in fact fairly cluster friendly, with big joins and aggregations that parallelize well.  There it would definitely make sense to reserve an actual cluster, and have all contenders set up their gear on it.  If all have access to the run environment and to monitoring tools, we can be reasonably sure that things will be done in a transparent manner.  </p>

<p>(I will talk about the BI mix in more detail in <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1dfcc038">part 13</a> and <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1edaa388">part 14</a> of this series.)</p>

<p>Once the BI mix has settled and there are a few interoperable implementations, likely in the summer, we could pass from the challenge model to a situation where vendors may publish results as they become available, with LOD2 offering its services for audit. </p>

<p>Of course, this could be done even before then, but the content of the mix might not be settled.  We likely need to check it on a few implementations first.</p>

<p>For equipment, people can use their own, or LOD2 partners might on a case-by-case basis make some equipment available, so that runs can be made on the same hardware on which, say, the Virtuoso results were obtained.  For example, FU Berlin could give people a login to get their recently published results fixed.  Now this might or might not happen, so I will not hold my breath waiting for it but will instead close with a proposal.</p>

<p>As a unilateral diplomatic overture I put forth the following: If other vendors are interested in a 1:1 comparison of their results with our publications, we can offer them a login to the same equipment.  They can set up and tune their systems, and perform the runs.  We will just watch.  As an extra quid pro quo, they can try Virtuoso as configured for the results we have published, with the same data.  This way, both parties get to see the other&#39;s technology with proper tuning and installation.  What, if anything, is reported about this activity is up to the owner of the technology being tested.  We will publish a set of benchmark rules that can serve as a guideline for mutually comparable reporting, but we cannot force anybody to use these.  This all will function as a catalyst for technological advance, all to the ultimate benefit of the end user.  If you wish to take advantage of this offer, you may contact <a href="mailto:hwilliams@openlinksw.com?subject=Collaborative RDF Benchmark" id="link-id0x1c071100">Hugh Williams</a> at OpenLink Software, and we will see how this can be arranged in practice.
</p>

<p>The next post will talk about the <a href="http://www.openlinksw.com/weblog/oerling/?id=1678" id="link-id0x19933fd8">actual content of benchmarks</a>.  The milestone after this will be when we publish the measurement and reporting protocols.</p>


<h3>
<i>Benchmarks, Redux</i> Series</h3>
<ul>
<li>  <a href="http://www.openlinksw.com/weblog/oerling/?id=1658" id="link-id0x1c554800">Benchmarks, Redux (part 1): On RDF Benchmarks</a>
</li>

<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1660" id="link-id0x1ec159e8">Benchmarks, Redux (part 2): A Benchmarking Story</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1663" id="link-id0x1dd5eb10">Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1665" id="link-id0x18f05940">Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1667" id="link-id0x1ed5ef10">Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1669" id="link-id0x1e9cb130">Benchmarks, Redux (part 6): BSBM and I/O, continued</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1671" id="link-id0x1dfa79d8">Benchmarks, Redux (part 7): What Does BSBM Explore Measure?</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1673" id="link-id0x1eb6f478">Benchmarks, Redux (part 8): BSBM Explore and Update </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1675" id="link-id0x1de5a918">Benchmarks, Redux (part 9): BSBM With Cluster</a>
</li>
<li>
Benchmarks, Redux (part 10): LOD2 and the Benchmark Process <i>(this post)</i>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1678" id="link-id0x1dae9060">Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1f45fa10">Benchmarks, Redux (part 12): Our Own BSBM Results Report</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1f49d2b8">Benchmarks, Redux (part 13): BSBM BI Modifications </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1e68e4c8">Benchmarks, Redux (part 14): BSBM BI Mix </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1e353858">Benchmarks, Redux (part 15): BSBM Test Driver Enhancements </a>
</li>
</ul>]]></content:encoded>
  <dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Virtuso Data Space Bot &lt;kidehen@openlinksw.com&gt;</dc:creator>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/vdb/blog/?id=1676">
  <rss:title>Benchmarks, Redux (part 9): BSBM With Cluster</rss:title>
  <rss:link>http://www.openlinksw.com/blog/vdb/blog/?id=1676</rss:link>
  <wfw:comment xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/mt-tb/Http/comments?id=1676</wfw:comment>
  <wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/blog/vdb/blog/gems/rsscomment.xml?:id=1676</wfw:commentRss>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2011-03-09T22:54:50Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">This post is dedicated to our brothers in horizontal partitioning (or sharding), Garlik and Bigdata. At first sight, the BSBM Explore mix appears very cluster-unfriendly, as it contains short queries that access data at random. There is every opportunity for latency and few opportunities for parallelism. For this reason we had not even run the BSBM mix with Virtuoso Cluster. We were not surprised to learn that Garlik hadn&#39;t run BSBM either. We have understood from Systap that their Bigdata BSBM experiments were on a single-process configuration. But the 4Store results in the recent Berlin report were with a distributed setup, as 4Store always runs a multiprocess configuration, even on a single server, so it seemed interesting to us to compare how Virtuoso Cluster compares with Virtuoso Single with this workload. These tests were run on a different box than the recent BSBM tests, so those 4Store figures are not directly comparable. The setup here consists of 8 partitions, each managed by its own process, all running on the same box. Any of these processes can have its HTTP and SQL listener and can provide the same service. Most access to data goes over the interconnect, except when the data is co-resident in the process which is coordinating the query. The interconnect is Unix domain sockets since all 8 processes are on the same box. 6 Cluster - Load Rates and Times Scale Rate (quads per second) Load time (seconds) Checkpoint time (seconds) 100 Mt 119,204 749 89 200 Mt 121,607 1486 157 1000 Mt 102,694 8737 979 6 Single - Load Rates and Times Scale Rate (quads per second) Load time (seconds) Checkpoint time (seconds) 100 Mt 74,713 1192 145 The load times are systematically better than for 6 Single. This is also not bad compared to the 7 Single vectored load rates of 220 Kt/s or so. 
We note that loading is a cluster friendly operation, going at a steady 1400+% CPU utilization with an aggregate message throughput of 40MB/s. 7 Single is faster because of vectoring at the index level, not because the clusters were hitting communication overheads. 6 Cluster is faster than 6 Single because scale-out in this case diminishes contention, even on a single box. Throughput is as follows: 6 Cluster - Throughput (QMpH, query mixes per hour) Scale Single User 16 User 100 Mt 7318 43120 200 Mt 6222 29981 1000 Mt 2526 11156 6 Single - Throughput (QMpH, query mixes per hour) Scale Single User 16 User 100 Mt 7641 29433 200 Mt 6017 13335 1000 Mt 1770 2487 Below is a snapshot of status during the 6 Cluster 100 Mt run. Cluster 8 nodes, 15 s. 25784 m/s 25682 KB/s 1160% cpu 0% read 740% clw threads 18r 0w 10i buffers 1133459 12 d 4 w 0 pfs cl 1: 10851 m/s 3911 KB/s 597% cpu 0% read 668% clw threads 17r 0w 10i buffers 143992 4 d 0 w 0 pfs cl 2: 2194 m/s 7959 KB/s 107% cpu 0% read 9% clw threads 1r 0w 0i buffers 143616 3 d 2 w 0 pfs cl 3: 2186 m/s 7818 KB/s 107% cpu 0% read 9% clw threads 0r 0w 0i buffers 140787 0 d 0 w 0 pfs cl 4: 2174 m/s 2804 KB/s 77% cpu 0% read 10% clw threads 0r 0w 0i buffers 140654 0 d 2 w 0 pfs cl 5: 2127 m/s 1612 KB/s 71% cpu 0% read 9% clw threads 0r 0w 0i buffers 140949 1 d 0 w 0 pfs cl 6: 2060 m/s 544 KB/s 66% cpu 0% read 10% clw threads 0r 0w 0i buffers 141295 2 d 0 w 0 pfs cl 7: 2072 m/s 517 KB/s 65% cpu 0% read 11% clw threads 0r 0w 0i buffers 141111 1 d 0 w 0 pfs cl 8: 2105 m/s 522 KB/s 66% cpu 0% read 10% clw threads 0r 0w 0i buffers 141055 1 d 0 w 0 pfs The main meters for cluster execution are the messages-per-second (m/s), the message volume (KB/s), and the total CPU% of the processes. We note that CPU utilization is highly uneven and messages are short, about 1K on the average, compared to about 100K during the load. CPU would be evenly divided between the nodes if each got a share of the HTTP requests. 
We changed the test driver to round-robin requests between multiple end points. The work does then get evenly divided, but the speed is not affected. Also, this does not improve the message sizes since the workload consists mostly of short lookups. However, with the processes spread over multiple servers, the round-robin would be essential for CPU and especially for interconnect throughput. Then we try 6 Cluster at 1000 Mt. For Single User, we get 1180 m/s, 6955 KB/s, and 173% cpu. For 16 User, this is 6573 m/s, 44366 KB/s, 1470% cpu. This is a lot better than the figures with 6 Single, due to lower contention on the index tree, as discussed in A Benchmarking Story. Also Single User throughput on 6 Cluster outperforms 6 Single, due to the natural parallelism of doing the Q5 joins in parallel in each partition. The larger the scale, the more weight this has in the metric. We see this also in the average message size, i.e., the KB/s throughput is almost double while the messages/s is a bit under a third. The small-scale 6 Cluster run is about even with the 6 Single figure. Looking at the details, we see that the qps for Q1 in 6 Cluster is half of that on 6 Single, whereas the qps for Q5 on 6 Cluster is about double that of the 6 Single. This is as one might expect; longer queries are favored, and single row lookups are penalized. Looking further at the 6 Cluster status we see the cluster wait (clw) to be 740%. For 16 Users, this means that about half of the execution real time is spent waiting for responses from other partitions. A high figure means uneven distribution between partitions; a low figure means even. This is as expected, since many queries are concerned with just one S and its related objects. We will update this section once 7 Cluster is ready. This will implement vectored execution and column store inside the cluster nodes. 
Benchmarks, Redux Series Benchmarks, Redux (part 1): On RDF Benchmarks Benchmarks, Redux (part 2): A Benchmarking Story Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs Benchmarks, Redux (part 6): BSBM and I/O, continued Benchmarks, Redux (part 7): What Does BSBM Explore Measure? Benchmarks, Redux (part 8): BSBM Explore and Update Benchmarks, Redux (part 9): BSBM With Cluster (this post) Benchmarks, Redux (part 10): LOD2 and the Benchmark Process Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks Benchmarks, Redux (part 12): Our Own BSBM Results Report Benchmarks, Redux (part 13): BSBM BI Modifications Benchmarks, Redux (part 14): BSBM BI Mix Benchmarks, Redux (part 15): BSBM Test Driver Enhancements</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>This post is dedicated to our brothers in horizontal partitioning (or sharding), <a class="auto-href" href="http://freebase.com/guid/9202a8c04000641f8000000005c908d6" id="link-id0x2a1e9010">Garlik</a> and <a class="auto-href" href="http://www.systap.com/bigdata.htm" id="link-id0x2acd5218">Bigdata</a>.</p>

<p>At first sight, the <a class="auto-href" href="http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html" id="link-id0x2bb33648">BSBM</a> <i>Explore</i> mix appears very cluster-unfriendly, as it contains short queries that access <a class="auto-href" href="http://dbpedia.org/resource/Data" id="link-id0x2b8fffb8">data</a> at random. There is every opportunity for latency and few opportunities for parallelism.</p>

<p>For this reason we had not even run the BSBM mix with <a class="auto-href" href="http://virtuoso.openlinksw.com" id="link-id0x2a84b780">Virtuoso</a> Cluster. We were not surprised to learn that <a href="http://steveharris.tumblr.com/post/3453040647/bsbm-v3-post-mortem" id="link-id0x1c4ef8d8">Garlik hadn&#39;t run BSBM either</a>. We understand from <a class="auto-href" href="http://www.systap.com/" id="link-id0x2ad3d050">Systap</a> that their Bigdata BSBM experiments were on a single-process configuration.</p>

<p>But the 4Store results in the <a href="http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/results/V6/index.html" id="link-id0x1f8090f8">recent Berlin report</a> were obtained with a distributed setup, as 4Store always runs a multiprocess configuration, even on a single server. It therefore seemed interesting to us to see how Virtuoso Cluster compares with Virtuoso Single on this workload. These tests were run on a different box than the recent BSBM tests, so those 4Store figures are not directly comparable.</p>

<p>The setup here consists of 8 partitions, each managed by its own process, all running on the same box. Any of these processes can have its own <a class="auto-href" href="http://dbpedia.org/resource/Hypertext_Transfer_Protocol" id="link-id0x2ac28380">HTTP</a> and <a class="auto-href" href="http://dbpedia.org/resource/SQL" id="link-id0x2bba8720">SQL</a> listener and can provide the same service. Most access to data goes over the interconnect, except when the data is co-resident in the process that is coordinating the query. The interconnect is Unix domain sockets, since all 8 processes are on the same box.</p>
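
<p>To give a feel for why most accesses cross the interconnect, here is a minimal sketch that assumes keys are spread uniformly over the 8 partitions by a hash. The names <code>partition_of</code> and <code>local_fraction</code>, the use of SHA-1, and the example.org keys are all hypothetical choices for illustration; this is not Virtuoso&#39;s actual partitioning function.</p>

```python
import hashlib

NUM_PARTITIONS = 8  # matches the 8 processes described above


def partition_of(key):
    """Map a key to a partition with a stable hash (SHA-1 is an illustrative choice)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS


def local_fraction(keys, coordinator):
    """Fraction of keys that live in the coordinating process itself."""
    local = sum(1 for k in keys if partition_of(k) == coordinator)
    return local / len(keys)


# Hypothetical keys; any large set of distinct strings would do.
keys = [f"http://example.org/product/{i}" for i in range(100_000)]
frac = local_fraction(keys, coordinator=0)
print(f"local lookups: {frac:.3f} (uniform expectation: {1 / NUM_PARTITIONS:.3f})")
```

<p>Under this assumption only about one lookup in eight is served without a message to another process, whichever process coordinates the query.</p>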

<table border="1" cellspacing="2" cellpadding="2" align="center" width="90%">
	<tr>
		<th colspan="4" align="center">6 Cluster - Load Rates and Times</th>
	</tr>
	<tr>
		<th align="center">Scale</th>
		<th align="center">Rate <br /> (quads per second)</th>
		<th align="center">Load time <br /> (seconds)</th>
		<th align="center">Checkpoint time <br /> (seconds)</th>
	</tr>
	<tr>
		<th align="center">100 Mt</th>
		<td align="center"> 119,204 </td>
		<td align="center"> 749 </td>
		<td align="center"> 89 </td>
	</tr>
	<tr>
		<th align="center">200 Mt</th>
		<td align="center"> 121,607 </td>
		<td align="center"> 1486 </td>
		<td align="center"> 157 </td>
	</tr>
	<tr>
		<th align="center">1000 Mt</th>
		<td align="center"> 102,694 </td>
		<td align="center"> 8737 </td>
		<td align="center"> 979 </td>
	</tr>
</table>
<br />
<table border="1" cellspacing="2" cellpadding="2" align="center" width="90%">
	<tr>
		<th colspan="4" align="center">6 Single - Load Rates and Times</th>
	</tr>
	<tr>
		<th align="center">Scale</th>
		<th align="center">Rate <br /> (quads per second)</th>
		<th align="center">Load time <br /> (seconds)</th>
		<th align="center">Checkpoint time <br /> (seconds)</th>
	</tr>
	<tr>
		<th align="center">100 Mt</th>
		<td align="center"> 74,713 </td>
		<td align="center"> 1192 </td>
		<td align="center"> 145 </td>
	</tr>
</table>



<p>The load times are systematically better than for 6 Single. This is also not bad compared to the 7 Single vectored load rates of 220 Kt/s or so. We note that loading is a cluster-friendly operation, going at a steady 1400+% <a class="auto-href" href="http://dbpedia.org/resource/Central_processing_unit" id="link-id0x296b03b8">CPU</a> utilization with an aggregate message throughput of 40 MB/s. 7 Single is faster because of vectoring at the index level, not because the cluster was hitting communication overheads. 6 Cluster is faster than 6 Single because scale-out in this case diminishes contention, even on a single box.</p>
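<p>As a quick check of the comparison above, the following sketch computes how much faster 6 Cluster is than 6 Single at 100 Mt, using only the figures from the two load tables. The dictionary keys and the <code>ratio</code> helper are names made up for this illustration.</p>

```python
# Figures copied from the 100 Mt rows of the two load tables above.
cluster = {"rate_qps": 119204, "load_s": 749, "checkpoint_s": 89}
single = {"rate_qps": 74713, "load_s": 1192, "checkpoint_s": 145}


def ratio(numerator, denominator):
    """Return numerator / denominator rounded to two decimals."""
    return round(numerator / denominator, 2)


# Higher rate is better; lower time is better, so the time ratios are inverted.
rate_speedup = ratio(cluster["rate_qps"], single["rate_qps"])
load_time_speedup = ratio(single["load_s"], cluster["load_s"])
checkpoint_speedup = ratio(single["checkpoint_s"], cluster["checkpoint_s"])

print(rate_speedup, load_time_speedup, checkpoint_speedup)
```

<p>All three ratios come out at roughly 1.6, so the advantage is about the same for the load itself and for the checkpoint.</p>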

<p>Throughput is as follows:</p>

<table border="1" cellspacing="2" cellpadding="2" align="center" width="90%">
	<tr>
		<th colspan="3" align="center"> 6 Cluster - Throughput <br /> (QMpH, query mixes per hour) </th>
	</tr>
	<tr>
		<th align="center">Scale</th>
		<th align="center"> Single User </th>
		<th align="center"> 16 User </th>
	</tr>
	<tr>
		<th align="center">100 Mt</th>
		<td align="center"> 7318 </td>
		<td align="center"> 43120 </td>
	</tr>
	<tr>
		<th align="center">200 Mt</th>
		<td align="center"> 6222 </td>
		<td align="center"> 29981 </td>
	</tr>
	<tr>
		<th align="center">1000 Mt</th>
		<td align="center"> 2526 </td>
		<td align="center"> 11156 </td>
	</tr>
</table>
<br />
<table border="1" cellspacing="2" cellpadding="2" align="center" width="90%">
	<tr>
		<th colspan="3" align="center"> 6 Single - Throughput <br /> (QMpH, query mixes per hour) </th>
	</tr>
	<tr>
		<th align="center">Scale</th>
		<th align="center"> Single User </th>
		<th align="center"> 16 User </th>
	</tr>
	<tr>
		<th align="center">100 Mt</th>
		<td align="center"> 7641 </td>
		<td align="center"> 29433 </td>
	</tr>
	<tr>
		<th align="center">200 Mt</th>
		<td align="center"> 6017 </td>
		<td align="center"> 13335 </td>
	</tr>
	<tr>
		<th align="center">1000 Mt</th>
		<td align="center"> 1770 </td>
		<td align="center"> 2487 </td>
	</tr>
</table>


<p>Below is a snapshot of status during the 6 Cluster 100 Mt run.</p>

<blockquote>
 <code><pre>
Cluster 8 nodes, 15 s.
       25784 m/s  25682 KB/s  1160% cpu  0% read  740% clw  threads 18r 0w 10i  buffers 1133459  12 d  4 w  0 pfs
cl 1:  10851 m/s   3911 KB/s   597% cpu  0% read  668% clw  threads 17r 0w 10i  buffers  143992   4 d  0 w  0 pfs
cl 2:   2194 m/s   7959 KB/s   107% cpu  0% read    9% clw  threads  1r 0w  0i  buffers  143616   3 d  2 w  0 pfs
cl 3:   2186 m/s   7818 KB/s   107% cpu  0% read    9% clw  threads  0r 0w  0i  buffers  140787   0 d  0 w  0 pfs
cl 4:   2174 m/s   2804 KB/s    77% cpu  0% read   10% clw  threads  0r 0w  0i  buffers  140654   0 d  2 w  0 pfs
cl 5:   2127 m/s   1612 KB/s    71% cpu  0% read    9% clw  threads  0r 0w  0i  buffers  140949   1 d  0 w  0 pfs
cl 6:   2060 m/s    544 KB/s    66% cpu  0% read   10% clw  threads  0r 0w  0i  buffers  141295   2 d  0 w  0 pfs
cl 7:   2072 m/s    517 KB/s    65% cpu  0% read   11% clw  threads  0r 0w  0i  buffers  141111   1 d  0 w  0 pfs
cl 8:   2105 m/s    522 KB/s    66% cpu  0% read   10% clw  threads  0r 0w  0i  buffers  141055   1 d  0 w  0 pfs
</pre>
 </code>
</blockquote>


<p>The main meters for cluster execution are the messages-per-second (m/s), the message volume (KB/s), and the total CPU% of the processes. </p>
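<p>To make these meters concrete, here is a minimal sketch that pulls them out of two lines of the snapshot above and derives the average message size and the share of CPU used by the coordinating process. The regular expression is our own guess at the line layout as printed here, not a documented format.</p>

```python
import re

# Aggregate line and the cl 1 line copied from the snapshot above.
TOTAL = "25784 m/s  25682 KB/s  1160% cpu  0% read  740% clw"
CL1 = "10851 m/s   3911 KB/s   597% cpu  0% read  668% clw"


def meters(line):
    """Extract messages/s, KB/s and CPU% from one status line."""
    found = re.search(r"(\d+) m/s\s+(\d+) KB/s\s+(\d+)% cpu", line)
    msgs, kb, cpu = (int(group) for group in found.groups())
    return {"msgs_per_s": msgs, "kb_per_s": kb, "cpu_pct": cpu}


total = meters(TOTAL)
cl1 = meters(CL1)

# Average message size in KB, and the fraction of all CPU spent in process 1.
avg_msg_kb = total["kb_per_s"] / total["msgs_per_s"]
cl1_cpu_share = cl1["cpu_pct"] / total["cpu_pct"]

print(round(avg_msg_kb, 2), round(cl1_cpu_share, 2))
```

<p>The average message is just under 1 KB, and process 1, which receives the HTTP requests, accounts for about half of the total CPU.</p>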

<p>We note that CPU utilization is highly uneven and messages are short, about 1 KB on average, compared to about 100 KB during the load. CPU would be evenly divided between the nodes if each got a share of the HTTP requests. We changed the test driver to round-robin requests between multiple endpoints. The work does then get evenly divided, but the speed is not affected. This also does not improve the message sizes, since the workload consists mostly of short lookups. However, with the processes spread over multiple servers, the round-robin would be essential for CPU and especially for interconnect throughput. </p>
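<p>The round-robin idea can be sketched as follows. This is not the actual test driver change; the endpoint URLs and port numbers are hypothetical, and the queries are placeholders. It only shows how a rotation spreads 16 requests evenly over 8 processes.</p>

```python
from itertools import cycle

# Hypothetical endpoints, one per cluster process; the ports are made up.
ENDPOINTS = [f"http://localhost:{8890 + i}/sparql" for i in range(8)]


def assign_round_robin(queries, endpoints):
    """Pair each query with the next endpoint in rotation."""
    rotation = cycle(endpoints)
    return [(next(rotation), query) for query in queries]


plan = assign_round_robin([f"query-{n}" for n in range(16)], ENDPOINTS)

# Count how many requests each endpoint would receive.
per_endpoint = {}
for endpoint, _ in plan:
    per_endpoint[endpoint] = per_endpoint.get(endpoint, 0) + 1

print(per_endpoint)
```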


<p>Then we try 6 Cluster at 1000 Mt. For Single User, we get 1180 m/s, 6955 KB/s, and 173% cpu. For 16 User, this is 6573 m/s, 44366 KB/s, 1470% cpu.</p>

<p>This is a lot better than the figures with 6 Single, due to lower contention on the index tree, as discussed in <i><a href="http://www.openlinksw.com/weblog/oerling/?id=1660" id="link-id0x1e9a0b58">A Benchmarking Story</a></i>. Single User throughput on 6 Cluster also outperforms 6 Single, thanks to the natural parallelism of running the Q5 joins concurrently in each partition. The larger the scale, the more weight this has in the metric. We see this also in the average message size: compared with the 100 Mt snapshot, the KB/s throughput is almost double while the messages/s is a bit under a third.</p>
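<p>The size of the gap can be read off the throughput tables directly. The sketch below divides the 6 Cluster QMpH by the 6 Single QMpH at each scale; the names <code>CLUSTER</code>, <code>SINGLE</code>, and <code>cluster_over_single</code> are ours.</p>

```python
# QMpH figures copied from the two throughput tables above:
# scale in Mt -> (Single User, 16 User).
CLUSTER = {100: (7318, 43120), 200: (6222, 29981), 1000: (2526, 11156)}
SINGLE = {100: (7641, 29433), 200: (6017, 13335), 1000: (1770, 2487)}


def cluster_over_single(scale):
    """Return (single-user ratio, 16-user ratio) for one scale."""
    c1, c16 = CLUSTER[scale]
    s1, s16 = SINGLE[scale]
    return round(c1 / s1, 2), round(c16 / s16, 2)


for scale in sorted(CLUSTER):
    print(scale, cluster_over_single(scale))
```

<p>At 1000 Mt the 16 User ratio is about 4.5 and the Single User ratio about 1.4, while at 100 Mt the Single User ratio is slightly below 1.</p>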


<p>The small-scale 6 Cluster run is about even with the 6 Single figure. Looking at the details, we see that the qps for Q1 in 6 Cluster is half of that on 6 Single, whereas the qps for Q5 on 6 Cluster is about double that of the 6 Single. This is as one might expect; longer queries are favored, and single row lookups are penalized.</p>

<p>Looking further at the 6 Cluster status, we see the cluster wait (<code>clw</code>) to be 740%. For 16 Users, this means that about half of the execution real time is spent waiting for responses from other partitions. A high figure means an uneven distribution of work between partitions; a low figure means an even one. This is as expected, since many queries are concerned with just one subject (S) and its related objects.</p>
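<p>The "about half" figure follows from simple arithmetic, sketched here under the assumption that <code>clw</code> is summed over client threads in the same way as the CPU percentage, so that 16 users correspond to 1600%.</p>

```python
def wait_share(clw_pct, concurrent_users):
    """Fraction of the clients' combined real time spent in cluster wait.

    Assumes each concurrent user contributes up to 100% to the clw meter.
    """
    return clw_pct / (100 * concurrent_users)


share = wait_share(740, 16)
print(f"{share:.0%}")
```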


<p>We will update this section once 7 Cluster is ready. This will implement vectored execution and column store inside the cluster nodes.</p>



<h3>
<i>Benchmarks, Redux</i> Series</h3>
<ul>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1658" id="link-id0x1d7894d0">Benchmarks, Redux (part 1): On RDF Benchmarks</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1660" id="link-id0x1e434888">Benchmarks, Redux (part 2): A Benchmarking Story</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1663" id="link-id0x1f6b5260">Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1665" id="link-id0x1dd29460">Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1667" id="link-id0x1f0d78b8">Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs </a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1669" id="link-id0x1f9a9670">Benchmarks, Redux (part 6): BSBM and I/O, continued</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1671" id="link-id0x1c055370">Benchmarks, Redux (part 7): What Does BSBM Explore Measure?</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1673" id="link-id0x1dc06cd0">Benchmarks, Redux (part 8): BSBM Explore and Update </a>
</li>
<li>
Benchmarks, Redux (part 9): BSBM With Cluster <i>(this post)</i>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1677" id="link-id0x18f04db0">Benchmarks, Redux (part 10): LOD2 and the Benchmark Process</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1678" id="link-id0x1ee729b8">Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1e2e76b8">Benchmarks, Redux (part 12): Our Own BSBM Results Report</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1d75ef48">Benchmarks, Redux (part 13): BSBM BI Modifications </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1ee518c0">Benchmarks, Redux (part 14): BSBM BI Mix </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1d9244b0">Benchmarks, Redux (part 15): BSBM Test Driver Enhancements </a>
</li>
</ul>]]></content:encoded>
  <dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Virtuso Data Space Bot &lt;kidehen@openlinksw.com&gt;</dc:creator>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/vdb/blog/?id=1674">
  <rss:title>Benchmarks, Redux (part 8): BSBM Explore and Update </rss:title>
  <rss:link>http://www.openlinksw.com/blog/vdb/blog/?id=1674</rss:link>
  <wfw:comment xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/mt-tb/Http/comments?id=1674</wfw:comment>
  <wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/blog/vdb/blog/gems/rsscomment.xml?:id=1674</wfw:commentRss>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2011-03-09T17:32:47Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">We will here look at the Explore and Update scenario of BSBM. This presents us with a novel problem as the specification does not address any aspect of ACID. A transaction benchmark ought to have something to say about this. The SPARUL (also known as SPARQL/Update) language does not say anything about transactionality, but I suppose it is in the spirit of the SPARUL protocol to promise atomicity and durability. We begin by running Virtuoso 7 Single, with Single User and 16 User, each at scales of 100 Mt, 200 Mt, and 1000 Mt. The transactionality is default, meaning SERIALIZABLE isolation between INSERTs and DELETEs, and READ COMMITTED isolation between READ and any UPDATE transaction. (Figures for Virtuoso 6 will also be presented here in the near future, as they are the currently shipping production versions.) Virtuoso 7 Single, Full ACID (QMpH, query mixes per hour) Scale Single User 16 User 100 Mt 9,969 65,537 200 Mt 8,646 40,527 1000 Mt 5,512 17,293 Virtuoso 6 Cluster, Full ACID (QMpH, query mixes per hour) Scale Single User 16 User 100 Mt 5604.520 34079.019 1000 Mt 2866.616 10028.325 Virtuoso 6 Single, Full ACID (QMpH, query mixes per hour) Scale Single User 16 User 100 Mt 7,152 21,065 200 Mt 5,862 16,895 1000 Mt 1,542 4,548 Each run is preceded by a warm-up of 500 or 300 mixes (the exact number is not material), resulting in a warm cache; see previous post on read-ahead for details. All runs do 1000 Explore and Update mixes. The initial database is in the state following the Explore only runs. The results are in line with the Explore results. There is a fair amount of variability between consecutive runs; the 16 User run at 1000 Mt varies between 14K and 19K QMpH depending on the measurement. The smaller runs exhibit less variability. In the following we will look at transactions and at how the definition of the workload and reporting could be made complete. 
Full ACID means serializable semantic of concurrent insert and delete of the same quad. Non-transactional means that on concurrent insert and delete of overlapping sets of quads the result is undefined. Further if one logged such &quot;transactions,&quot; the replay would give serialization although the initial execution did not, hence further confusing the issue. Considering the hypothetical use case of an e-commerce information portal, there is little chance of deletes and inserts actually needing serialization. An insert-only workload does not need serializability because an insert cannot fail. If the data already exists the insert does nothing, if the quad does not previously exist it is created. The same applies to deletes alone. If a delete and insert overlap, serialization would be needed but the semantics implicit in the use case make this improbable. Read-only transactions (i.e., the Explore mix in the Explore and Update scenario) will be run as READ COMMITTED. These do not see uncommitted data and never block for lock wait. The reads may not be repeatable. Our first point of call is to determine the cost of ACID. We run 1000 mixes of Explore and Update at 1000 Mt. The throughput is 19214 after a warm-up of 500 mixes. This is pretty good in comparison with the diverse read-only results at this scale. We look at the pertinent statistics: SELECT TOP 5 * FROM sys_l_stat ORDER BY waits DESC; KEY_TABLE INDEX_NAME LOCKS WAITS WAIT_PCT DEADLOCKS LOCK_ESC WAIT_MSECS =============== ============= ====== ===== ======== ========= ======== ========== DB.DBA.RDF_QUAD RDF_QUAD_POGS 179205 934 0 0 0 35164 DB.DBA.RDF_IRI RDF_IRI 20752 217 1 0 0 16445 DB.DBA.RDF_QUAD RDF_QUAD_SP 9244 3 0 0 0 235 We see 934 waits with a total duration of 35 seconds on the index with the most contention. The run was 187 seconds, real time. The lock wait time is not real time since this is the total elapsed wait time summed over all threads. 
The lock wait frequency is a little over one per query mix, meaning a little over one per five locking transactions. We note that we do not get deadlocks since all inserts and deletes are in ascending key order due to vectoring. This guarantees the absence of deadlocks for single insert transactions, as long as the transaction stays within the vector size. This is always the case since the inserts are a few hundred triples at the maximum. The waits concentrate on POGS, because this is a bitmap index where the locking resolution is less than a row, and the values do not correlate with insert order. The locking behavior could be better with the column store, where we would have row level locking also for this index. This is to be seen. The column store would otherwise tend to have higher cost per random insert. Considering these results it does not seem crucial to &quot;drop ACID,&quot; though doing so would save some time. We will now run measurements for all scales with 16 Users and ACID. 
Let us now see what the benchmark writes: SELECT TOP 10 * FROM sys_d_stat ORDER BY n_dirty DESC; KEY_TABLE INDEX_NAME TOUCHES READS READ_PCT N_DIRTY N_BUFFERS =========================== ============================ ========= ======= ======== ======= ========= DB.DBA.RDF_QUAD RDF_QUAD_POGS 763846891 237436 0 58040 228606 DB.DBA.RDF_QUAD RDF_QUAD 213282706 1991836 0 30226 1940280 DB.DBA.RDF_OBJ RO_VAL 15474 17837 115 13438 17431 DB.DBA.RO_START RO_START 10573 11195 105 10228 11227 DB.DBA.RDF_IRI RDF_IRI 61902 125711 203 7705 121300 DB.DBA.RDF_OBJ RDF_OBJ 23809053 3205963 13 636 3072517 DB.DBA.RDF_IRI DB_DBA_RDF_IRI_UNQC_RI_ID 3237687 504486 15 340 488797 DB.DBA.RDF_QUAD RDF_QUAD_SP 89995 70446 78 99 68340 DB.DBA.RDF_QUAD RDF_QUAD_OP 19440 47541 244 66 45583 DB.DBA.VTLOG_DB_DBA_RDF_OBJ VTLOG_DB_DBA_RDF_OBJ 3014 1 0 11 11 DB.DBA.RDF_QUAD RDF_QUAD_GS 1261 801 63 10 751 DB.DBA.RDF_PREFIX RDF_PREFIX 14 168 1120 1 153 DB.DBA.RDF_PREFIX DB_DBA_RDF_PREFIX_UNQC_RP_ID 1807 200 11 1 200 The most dirty pages are on the POGS index, which is reasonable; values are spread out at random. After this we have the PSOG index, likely because of random deletes. New IRIs tend to get consecutive numbers and do not make many dirty pages. Literals come next, with the index from leading string or hash of the literal to id leading, as one would expect, again because of values being distributed at random. After this come IRIs. The distribution of updates is generally as one would expect. * * * Going back to BSBM, at least the following aspects of the benchmark have to be further specified: Disclosure of ACID properties. If the benchmark required full ACID many would not run this at all. Besides full ACID is not necessarily an absolute requirement based on the hypothetical usage scenario of the benchmark. However, when publishing numbers the guarantees that go with the numbers must be made explicit. This includes logging, checkpoint frequency or equivalent etc. Steady state. 
The working set of the Update mix is different from that of the Explore mixes. This touches more indices than Explore. The Explore warm-up is in part good but does not represent steady state. Checkpoint and sustained throughput. Benchmarks involving update generally have rules for checkpointing the state and for sustained throughput. In specific, the throughput of an update benchmark cannot rely on never flushing to persistent storage. Even bulk load must be timed with a checkpoint guaranteeing durability at the end. A steady update stream should be timed with a test interval of sufficient length involving a few checkpoints; for example, a minimum duration of 30 minutes with no less than 3 completed checkpoints in the interval with at least 9 minutes between the end of one and the start of the next. Not all DBMSs work with logs and checkpoints, but if an alternate scheme is used then this needs to be described. Memory and warm-up issues.We have seen the test data generator run out of memory when trying to generate update streams of meaningful length. Also the test driver should allow running updates in timed and non-timed mode (warm-up). With an update benchmark, many more things need to be defined, and the set-up becomes more system specific, than with a read-only workload. We will address these shortcomings in the measurement rules proposal to come. Especially with update workloads, the vendors need to provide tuning expertise; however, this will not happen if the benchmark does not properly set the expectations. If benchmarks serve as a catalyst for clearly defining how things are to be set up, then they will have served the end user. 
Benchmarks, Redux Series Benchmarks, Redux (part 1): On RDF Benchmarks Benchmarks, Redux (part 2): A Benchmarking Story Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs Benchmarks, Redux (part 6): BSBM and I/O, continued Benchmarks, Redux (part 7): What Does BSBM Explore Measure? Benchmarks, Redux (part 8): BSBM Explore and Update (this post) Benchmarks, Redux (part 9): BSBM With Cluster Benchmarks, Redux (part 10): LOD2 and the Benchmark Process Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks Benchmarks, Redux (part 12): Our Own BSBM Results Report Benchmarks, Redux (part 13): BSBM BI Modifications Benchmarks, Redux (part 14): BSBM BI Mix Benchmarks, Redux (part 15): BSBM Test Driver Enhancements</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>We will here look at the <i>Explore and Update</i> scenario of <a class="auto-href" href="http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html" id="link-id0x1c064218">BSBM</a>. This presents us with a novel problem as the specification does not address any aspect of <a class="auto-href" href="http://dbpedia.org/resource/ACID" id="link-id0x1c1852b0">ACID</a>.</p>

<p>A transaction benchmark ought to have something to say about this. The <a class="auto-href" href="http://dbpedia.org/page/SPARUL" id="link-id0x1dbca228">SPARUL</a> (also known as <a class="auto-href" href="http://dbpedia.org/resource/SPARQL" id="link-id0x1eaa4fd0">SPARQL</a>/<a class="auto-href" href="http://dbpedia.org/page/SPARUL" id="link-id0x1dd12bb0">Update</a>) language does not say anything about transactionality, but I suppose it is in the spirit of the SPARUL protocol to promise atomicity and durability.</p>

<p>We begin by running <a class="auto-href" href="http://virtuoso.openlinksw.com" id="link-id0x1c5f4830">Virtuoso</a> 7 Single, with Single User and 16 User, each at scales of 100 Mt, 200 Mt, and 1000 Mt. The transactionality is default, meaning <code>SERIALIZABLE</code> isolation between <code>INSERTs</code> and <code>DELETEs</code>, and <code>READ COMMITTED</code> isolation between <code>READ</code> and any <code>UPDATE</code> transaction. (Figures for Virtuoso 6 are also presented below, as it is the currently shipping production version.)</p>


<table border="1" cellspacing="2" cellpadding="2" align="center" width="90%">
	<tr>
		<th colspan="3" align="center"> Virtuoso 7 Single, Full ACID <br /> (QMpH, query mixes per hour) </th>
	</tr>
	<tr>
		<th align="center">Scale</th>
		<th align="center"> Single User </th>
		<th align="center"> 16 User </th>
	</tr>
	<tr>
		<th align="center">100 Mt</th>
		<td align="center"> 9,969 </td>
		<td align="center"> 65,537 </td>
	</tr>
	<tr>
		<th align="center">200 Mt</th>
		<td align="center"> 8,646 </td>
		<td align="center"> 40,527 </td>
	</tr>
	<tr>
		<th align="center">1000 Mt</th>
		<td align="center"> 5,512 </td>
		<td align="center"> 17,293 </td>
	</tr>
</table>
<br />
<table border="1" cellspacing="2" cellpadding="2" align="center" width="90%">
	<tr>
		<th colspan="3" align="center"> Virtuoso 6 Cluster, Full ACID <br /> (QMpH, query mixes per hour) </th>
	</tr>
	<tr>
		<th align="center"> Scale </th>
		<th align="center"> Single User </th>
		<th align="center"> 16 User </th>
	</tr>
	<tr>
		<th align="center"> 100 Mt </th>
		<td align="center"> 5604.520 </td>
		<td align="center"> 34079.019 </td>
	</tr>
	<tr>
		<th align="center"> 1000 Mt </th>
		<td align="center"> 2866.616 </td>
		<td align="center"> 10028.325 </td>
	</tr>
</table>
<br />
<table border="1" cellspacing="2" cellpadding="2" align="center" width="90%">
	<tr>
		<th colspan="3" align="center"> Virtuoso 6 Single, Full ACID <br /> (QMpH, query mixes per hour) </th>
	</tr>
	<tr>
		<th align="center">Scale</th>
		<th align="center"> Single User </th>
		<th align="center"> 16 User </th>
	</tr>
	<tr>
		<th align="center">100 Mt</th>
		<td align="center"> 7,152 </td>
		<td align="center"> 21,065 </td>
	</tr>
	<tr>
		<th align="center">200 Mt</th>
		<td align="center"> 5,862 </td>
		<td align="center"> 16,895 </td>
	</tr>
	<tr>
		<th align="center">1000 Mt</th>
		<td align="center"> 1,542 </td>
		<td align="center"> 4,548 </td>
	</tr>
</table>



<p>Each run is preceded by a warm-up of 500 or 300 mixes (the exact number is not material), resulting in a warm <a class="auto-href" href="http://dbpedia.org/resource/Cache" id="link-id0x1d4f13d8">cache</a>; see <a href="http://www.openlinksw.com/weblog/oerling/?id=1669" id="link-id0x1f8ac510">previous post on read-ahead</a> for details. All runs do 1000 <i>Explore and Update</i> mixes. The initial database is in the state following the <i>Explore</i> only runs.</p>

<p>The results are in line with the <i>Explore</i> results. There is a fair amount of variability between consecutive runs; the 16 User run at 1000 Mt varies between 14K and 19K QMpH depending on the measurement. The smaller runs exhibit less variability.</p>

<p>In the following we will look at transactions and at how the definition of the workload and reporting could be made complete.</p>


<p>Full ACID means serializable semantics for concurrent inserts and deletes of the same quad. Non-transactional means that the result of concurrent inserts and deletes of overlapping sets of quads is undefined. Further, if one logged such &quot;transactions,&quot; the replay would give serialization although the initial execution did not, further confusing the issue. Considering the hypothetical use case of an e-commerce information portal, there is little chance of deletes and inserts actually needing serialization. An insert-only workload does not need serializability because an insert cannot fail: if the <a class="auto-href" href="http://dbpedia.org/resource/Data" id="link-id0x1ec05c10">data</a> already exists, the insert does nothing; if the quad does not yet exist, it is created. The same applies to deletes alone. If a delete and an insert overlap, serialization would be needed, but the semantics implicit in the use case make this improbable.</p>


<p>Read-only transactions (i.e., the <i>Explore</i> mix in the <i>Explore and Update</i> scenario) will be run as <code>READ COMMITTED</code>. These do not see uncommitted data and never block for lock wait. The reads may not be repeatable.</p>

<p>Our first port of call is to determine the cost of ACID. We run 1000 mixes of <i>Explore and Update</i> at 1000 Mt. The throughput is 19214 QMpH after a warm-up of 500 mixes. This is pretty good in comparison with the various read-only results at this scale.</p>

<p>We look at the pertinent statistics:</p>

<pre>
SELECT TOP 5 * FROM sys_l_stat ORDER BY waits DESC;
</pre>

<blockquote>
 <code><pre>
KEY_TABLE         INDEX_NAME       LOCKS   WAITS   WAIT_PCT   DEADLOCKS   LOCK_ESC   WAIT_MSECS
===============   =============   ======   =====   ========   =========   ========   ==========
DB.DBA.RDF_QUAD   RDF_QUAD_POGS   179205     934          0           0          0        35164
DB.DBA.RDF_IRI    RDF_IRI          20752     217          1           0          0        16445
DB.DBA.RDF_QUAD   RDF_QUAD_SP       9244       3          0           0          0          235
</pre>
 </code>
</blockquote>

<p>We see 934 waits with a total duration of 35 seconds on the index with the most contention. The run was 187 seconds, real time. The lock wait time is not real time since this is the total elapsed wait time summed over all threads. The lock wait frequency is a little over one per query mix, meaning a little over one per five locking transactions. </p>
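<p>The figures in this paragraph can be re-derived from the three rows shown. The sketch below does so; the thread count of 16 is an assumption on our part (the run is not labeled with its concurrency here), and it only affects the last ratio.</p>

```python
# Rows copied from the sys_l_stat output above: (index, waits, wait_msecs).
ROWS = [
    ("RDF_QUAD_POGS", 934, 35164),
    ("RDF_IRI", 217, 16445),
    ("RDF_QUAD_SP", 3, 235),
]
MIXES = 1000        # query mixes in the run
RUN_SECONDS = 187   # real time of the run
THREADS = 16        # assumption: the run used 16 concurrent clients

total_waits = sum(waits for _, waits, _ in ROWS)
total_wait_ms = sum(msecs for _, _, msecs in ROWS)

waits_per_mix = total_waits / MIXES
mean_wait_ms = total_wait_ms / total_waits
# Lock wait as a share of the combined elapsed time of all threads.
wait_share = total_wait_ms / (RUN_SECONDS * 1000 * THREADS)
# Throughput implied by the rounded run time.
qmph = MIXES * 3600 / RUN_SECONDS

print(waits_per_mix, round(mean_wait_ms, 1), round(wait_share, 4), round(qmph))
```

<p>This gives a little over one wait per mix and a mean wait of about 45 ms. The implied throughput is slightly above the reported 19214 QMpH, presumably because the 187 seconds is a rounded figure.</p>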

<p>We note that we do not get deadlocks, since all inserts and deletes are in ascending key order due to vectoring. This guarantees the absence of deadlocks for single-insert transactions, as long as the transaction stays within the vector size. This is always the case, since the inserts are a few hundred triples at the maximum. The waits concentrate on POGS, because this is a bitmap index where the locking resolution is less than a row, and the values do not correlate with insert order. The locking behavior could be better with the column store, where we would have row-level locking also for this index. This remains to be seen. The column store would otherwise tend to have higher cost per random insert.</p>
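<p>The reason ascending key order rules out deadlocks can be illustrated with a generic ordered-locking sketch. This is not Virtuoso's implementation; the per-key locks and the <code>run_transaction</code> helper are invented for the example. Two transactions with overlapping key sets both finish because neither can hold a higher key while waiting for a lower one.</p>

```python
import threading

# One lock per key; keys stand in for index entries.
LOCKS = {key: threading.Lock() for key in range(10)}


def run_transaction(keys, done, name):
    """Lock the keys in ascending order, then release them all."""
    ordered = sorted(keys)
    for key in ordered:
        LOCKS[key].acquire()
    done.append(name)  # the "work" happens while all locks are held
    for key in reversed(ordered):
        LOCKS[key].release()


done = []
t1 = threading.Thread(target=run_transaction, args=([7, 2, 5], done, "t1"))
t2 = threading.Thread(target=run_transaction, args=([5, 7, 2, 9], done, "t2"))
t1.start()
t2.start()
t1.join(timeout=5)
t2.join(timeout=5)

print(sorted(done))
```

<p>If the two transactions instead locked their keys in the order given (7 first in one, 5 first in the other), each could end up holding a key the other needs.</p>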

<p>Considering these results it does not seem crucial to &quot;drop ACID,&quot; though doing so would save <i>some</i> time. We will now run measurements for all scales with 16 Users and ACID. </p>

<p>Let us now see what the benchmark writes:</p>

<pre>
SELECT TOP 10 * FROM sys_d_stat ORDER BY n_dirty DESC;
</pre>

<blockquote>
 <code><pre>
KEY_TABLE                     INDEX_NAME                       TOUCHES     READS   READ_PCT   N_DIRTY   N_BUFFERS
===========================   ============================   =========   =======   ========   =======   =========
DB.DBA.RDF_QUAD               RDF_QUAD_POGS                  763846891    237436          0     58040      228606
DB.DBA.RDF_QUAD               RDF_QUAD                       213282706   1991836          0     30226     1940280
DB.DBA.RDF_OBJ                RO_VAL                             15474     17837        115     13438       17431
DB.DBA.RO_START               RO_START                           10573     11195        105     10228       11227
DB.DBA.RDF_IRI                RDF_IRI                            61902    125711        203      7705      121300
DB.DBA.RDF_OBJ                RDF_OBJ                         23809053   3205963         13       636     3072517
DB.DBA.RDF_IRI                DB_DBA_RDF_IRI_UNQC_RI_ID        3237687    504486         15       340      488797
DB.DBA.RDF_QUAD               RDF_QUAD_SP                        89995     70446         78        99       68340
DB.DBA.RDF_QUAD               RDF_QUAD_OP                        19440     47541        244        66       45583
DB.DBA.VTLOG_DB_DBA_RDF_OBJ   VTLOG_DB_DBA_RDF_OBJ                3014         1          0        11          11
DB.DBA.RDF_QUAD               RDF_QUAD_GS                         1261       801         63        10         751
DB.DBA.RDF_PREFIX             RDF_PREFIX                            14       168       1120         1         153
DB.DBA.RDF_PREFIX             DB_DBA_RDF_PREFIX_UNQC_RP_ID        1807       200         11         1         200
</pre>
 </code>
</blockquote>


<p>The most dirty pages are on the <code>POGS</code> index, which is reasonable; values are spread out at random. After this we have the <code>PSOG</code> index, likely because of random deletes. New IRIs tend to get consecutive numbers and do not make many dirty pages. Literals come next, led by the index from the leading string or hash of the literal to its id, as one would expect, again because the values are distributed at random. After this come IRIs. The distribution of updates is generally as one would expect.</p>
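<p>Absolute dirty page counts hide how much of each index's cached pages was touched. As a small illustration, the sketch below computes the dirty share of buffered pages for the first five rows of the output above; the <code>dirty_fraction</code> helper is our own name.</p>

```python
# (index, n_dirty, n_buffers) copied from the first five rows above.
ROWS = [
    ("RDF_QUAD_POGS", 58040, 228606),
    ("RDF_QUAD", 30226, 1940280),
    ("RO_VAL", 13438, 17431),
    ("RO_START", 10228, 11227),
    ("RDF_IRI", 7705, 121300),
]


def dirty_fraction(n_dirty, n_buffers):
    """Share of an index's buffered pages that are dirty."""
    return n_dirty / n_buffers


fractions = {name: dirty_fraction(dirty, buffers) for name, dirty, buffers in ROWS}
for name, value in fractions.items():
    print(f"{name:15s} {value:.1%}")
```

<p>By this measure the literal indices are almost entirely dirty, about a quarter of the buffered <code>POGS</code> pages are dirty, and only a small percentage of the primary quad index is.</p>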

<p align="center">* * *</p>

<p>Going back to BSBM, at least the following aspects of the benchmark have to be further specified:</p>

<ul>
<li>
  <p>
    <b>Disclosure of ACID properties.</b> If the benchmark required full ACID, many would not run this at all. Besides, full ACID is not necessarily an absolute requirement given the hypothetical usage scenario of the benchmark. However, when publishing numbers, the guarantees that go with the numbers must be made explicit. This includes logging, checkpoint frequency or equivalent, etc.</p>
</li>

<li>
  <p>
    <b>Steady state.</b> The working set of the <i>Update</i> mix is different from that of the <i>Explore</i> mixes. This touches more indices than <i>Explore</i>. The <i>Explore</i> warm-up is in part good but does not represent steady state.</p>
</li>

<li>
  <p>
    <b>Checkpoint and sustained throughput.</b> Benchmarks involving update generally have rules for checkpointing the state and for sustained throughput. Specifically, the throughput of an update benchmark cannot rely on never flushing to persistent storage. Even bulk load must be timed with a checkpoint guaranteeing durability at the end. A steady update stream should be timed with a test interval of sufficient length involving a few checkpoints; for example, a minimum duration of 30 minutes with no fewer than 3 completed checkpoints in the interval and at least 9 minutes between the end of one and the start of the next. Not all DBMSs work with logs and checkpoints, but if an alternate scheme is used then this needs to be described.</p>
</li>

<li>
  <p>
    <b>Memory and warm-up issues.</b> We have seen the test data generator run out of memory when trying to generate update streams of meaningful length. Also, the test driver should allow running updates in timed and non-timed (warm-up) mode.</p>
</li>
</ul>
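<p>The example checkpoint rule in the list above is easy to state as a check on a run log. The sketch below is only an illustration of that example rule, not part of BSBM or of any test driver; the function name, its parameters, and the representation of checkpoints as (start, end) minute offsets are our assumptions.</p>

```python
def meets_checkpoint_rule(run_minutes, checkpoints,
                          min_run=30, min_count=3, min_gap=9):
    """Check a timed run against the example rule stated above.

    checkpoints: (start, end) minute offsets from the start of the run.
    Only checkpoints that complete within the run are counted.
    """
    completed = sorted(cp for cp in checkpoints if cp[1] <= run_minutes)
    if run_minutes < min_run or len(completed) < min_count:
        return False
    # Gap between the end of one checkpoint and the start of the next.
    gaps = [later[0] - earlier[1]
            for earlier, later in zip(completed, completed[1:])]
    return all(gap >= min_gap for gap in gaps)


good_run = meets_checkpoint_rule(30, [(0, 1), (10, 11), (20, 21)])
tight_run = meets_checkpoint_rule(30, [(0, 1), (9, 10), (20, 21)])
print(good_run, tight_run)
```

<p>The second run fails because its first two checkpoints are only 8 minutes apart.</p>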


<p>With an update benchmark, many more things need to be defined, and the set-up becomes more system-specific than with a read-only workload. We will address these shortcomings in the measurement rules proposal to come. Especially with update workloads, the vendors need to provide tuning expertise; however, this will not happen if the benchmark does not properly set the expectations. If benchmarks serve as a catalyst for clearly defining how things are to be set up, then they will have served the end user.</p>


<h3>
<i>Benchmarks, Redux</i> Series</h3>
<ul>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1658" id="link-id0x1de61db8">Benchmarks, Redux (part 1): On RDF Benchmarks</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1660" id="link-id0x1f9f96f8">Benchmarks, Redux (part 2): A Benchmarking Story</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1663" id="link-id0x1f89eeb0">Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1665" id="link-id0x1ad83f30">Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1667" id="link-id0x1de62178">Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1669" id="link-id0x1b2ec018">Benchmarks, Redux (part 6): BSBM and I/O, continued</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1671" id="link-id0x1ae6f028">Benchmarks, Redux (part 7): What Does BSBM Explore Measure?</a>
</li>
<li>
Benchmarks, Redux (part 8): BSBM Explore and Update <i>(this post)</i>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=1675" id="link-id0x132605c0">Benchmarks, Redux (part 9): BSBM With Cluster</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1677" id="link-id0x1a9871b0">Benchmarks, Redux (part 10): LOD2 and the Benchmark Process</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1678" id="link-id0x1baa20f8">Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1e25a840">Benchmarks, Redux (part 12): Our Own BSBM Results Report</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1b53db20">Benchmarks, Redux (part 13): BSBM BI Modifications </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1e7ce520">Benchmarks, Redux (part 14): BSBM BI Mix </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1b18f400">Benchmarks, Redux (part 15): BSBM Test Driver Enhancements </a>
</li>

</ul>]]></content:encoded>
  <dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Virtuso Data Space Bot &lt;kidehen@openlinksw.com&gt;</dc:creator>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/vdb/blog/?id=1672">
  <rss:title>Benchmarks, Redux (part 7): What Does BSBM Explore Measure?</rss:title>
  <rss:link>http://www.openlinksw.com/blog/vdb/blog/?id=1672</rss:link>
  <wfw:comment xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/mt-tb/Http/comments?id=1672</wfw:comment>
  <wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/blog/vdb/blog/gems/rsscomment.xml?:id=1672</wfw:commentRss>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2011-03-07T23:39:22Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">We will here analyze what the BSBM Explore workload does. This is necessary in order to compare benchmark results at different scales. Historically, BSBM had a Query 6 whose share of the metric approached 100% as scale increased. The present mix does not have this query, but different queries still have different relative importance at different scales. We will here look at database-running statistics for BSBM at different scales. Finally, we look at CPU profiles. But first, let us see what BSBM reads in general. The system is in steady state after around 1500 query mixes; after this the working set does not shift much. After several thousand query mixes, we have: SELECT TOP 10 * FROM sys_d_stat ORDER BY reads DESC; KEY_TABLE INDEX_NAME TOUCHES READS READ_PCT N_DIRTY N_BUFFERS ================= ============================ ========== ======= ======== ======= ========= DB.DBA.RDF_OBJ RDF_OBJ 114105938 3302150 2 0 3171275 DB.DBA.RDF_QUAD RDF_QUAD 977426773 2041156 0 0 1970712 DB.DBA.RDF_IRI DB_DBA_RDF_IRI_UNQC_RI_ID 8250414 509239 6 15 491631 DB.DBA.RDF_QUAD RDF_QUAD_POGS 3677233812 183860 0 0 175386 DB.DBA.RDF_IRI RDF_IRI 32 99710 302151 5 95353 DB.DBA.RDF_QUAD RDF_QUAD_OP 30597 51593 168 0 48941 DB.DBA.RDF_QUAD RDF_QUAD_SP 265474 47210 17 0 46078 DB.DBA.RDF_PREFIX DB_DBA_RDF_PREFIX_UNQC_RP_ID 6020 212 3 0 212 DB.DBA.RDF_PREFIX RDF_PREFIX 0 167 16700 0 157 The first column is the table, then the index, then the number of times a row was found. The fourth number is the count of disk pages read. The last number is the count of 8K buffer pool pages in use for caching pages of the index in question. Note that the index is clustered, i.e., there is no table data structure separate from the index. Most of the reads are for strings or RDF literals. After this comes the PSOG index for getting a property value given the subject. After this, but much lower, we have lookups of IRI strings given the ID. 
The index from object value to subject is used the most but the number of pages is small; only a few properties seem to be concerned. The rest is minimal in comparison. Now let us reset the counts and see what the steady state I/O profile is. SELECT key_stat (key_table, name_part (key_name, 2), &#39;reset&#39;) FROM sys_keys WHERE key_migrate_to IS NULL; SELECT TOP 10 * FROM sys_d_stat ORDER BY reads DESC; KEY_TABLE INDEX_NAME TOUCHES READS READ_PCT N_DIRTY N_BUFFERS ================= ============================ ========== ======= ======== ======= ========= DB.DBA.RDF_OBJ RDF_OBJ 30155789 79659 0 0 3191391 DB.DBA.RDF_QUAD RDF_QUAD 259008064 8904 0 0 1948707 DB.DBA.RDF_QUAD RDF_QUAD_SP 68002 7730 11 0 53360 DB.DBA.RDF_IRI RDF_IRI 12 5415 41653 6 98804 DB.DBA.RDF_QUAD RDF_QUAD_POGS 975147136 1597 0 0 173459 DB.DBA.RDF_IRI DB_DBA_RDF_IRI_UNQC_RI_ID 2213525 1286 0 17 485093 DB.DBA.RDF_QUAD RDF_QUAD_OP 7999 904 11 0 48568 DB.DBA.RDF_PREFIX DB_DBA_RDF_PREFIX_UNQC_RP_ID 1494 1 0 0 213 Literal strings dominate. The SP index is used only for situations where the P is not specified, i.e., the DESCRIBE query. Based on this, I/O seems to be attributable mostly to this. The first RDF_IRI represents translations from string to IRI id; the second represents translations from IRI id to string. The touch count for the first RDF_IRI is not properly recorded, hence the miss % is out of line. We see SP missing the cache the most since its use is infrequent in the mix. We will next look at query processing statistics. For this we introduce a new meter. The db_activity SQL function provides a session-by-session cumulative statistic of activity. The fields are: rnd - Count of random index lookups. Each first row of a select or insert counts as one, regardless of whether something was found. seq - Count of sequential rows. Every move to next row on a cursor counts as 1, regardless of whether conditions match. 
same seg - For column store only; counts how many times the next row in a vectored join using an index falls in the same segment as the previous random access. A segment is the stretch of rows between entries in the sparse top level index on the column projection. same pg - Counts how many times a vectored index join finds the next match on the same page as the previous one. same par - Counts how many times the next lookup in a vectored index join falls on a different page than the previous but still under the same parent. disk - Counts how many disk reads were made, including any speculative reads initiated. spec disk - Counts speculative disk reads. messages - Counts cluster interconnect messages B (KB, MB, GB) - is the total length of the cluster interconnect messages. fork - Counts how many times a thread was forked (started) for query parallelization. The numbers are given with 4 significant digits and a scale suffix. G is 10^9 (1,000,000,000); M is 10^6 (1,000,000), K is 10^3 (1,000). We run 2000 query mixes with 16 Users. The special http account keeps a cumulative account of all activity on web server threads. SELECT db_activity (2, &#39;http&#39;); 1.674G rnd  3.223G seq      0 same seg  1.286G same pg  314.8M same par  6.186M disk  6.461M spec disk      0B /     0 messages  298.6K fork We see that random access dominates. The seq number is about twice the rnd number, meaning that the average random lookup gets two rows. Getting a row at random obviously takes more time than getting the next row. Since the index used is row-wise, the same seg is 0; the same pg indicates that 77% of the random accesses fall on the same page as the previous random access; most of the remaining random accesses fall under the same parent as the previous one. There are more speculative reads than disk reads which is an artifact of counting some concurrently speculated reads twice. This does indicate that speculative reads dominate. 
This is because a large part of the run was in the warm-up state with aggressive speculative reading. We reset the counts and run another 2000 mixes. Now let us look at the same reading after 2000 mixes, 16 user at 100Mt. 234.3M rnd  420.5M seq      0 same seg   188.8M same pg  29.09M same par  808.9K disk  919.9K spec disk      0B /      0 messages  76K fork We note that the ratios between the random and sequential and same page/parent counts are about the same. The sequential number looks to be even a bit smaller in proportion. The count of random accesses for the 100Mt run is 14% of the count for the 1000Mt run. The count of query parallelization threads is also much lower since it is worthwhile to schedule a new thread only if there are at least a few thousand operations to perform on it. The precise criterion for making a thread is that according to the cost model guess, the thread must have at least 5ms worth of work. We note that the 100 Mt throughput is a little over three-times that of the 1000 Mt throughput, as reported before. We might justifiably ask why the 100 Mt run is not seven-times faster instead, for this much less work. We note that for one-off random access, it makes no real difference whether the tree has 100 M or 1000 M rows; this translates to roughly 27 vs 30 comparisons, so the depth of the tree is not a factor per se. Besides, vectoring makes the tree often look only one or two levels deep, so the total row count matters even less there. To elucidate this last question, we look at the CPU profiles. We take an oprofile of 100 Single User mixes at both scales. 
For 100 Mt: 61161 10.1723 cmpf_iri64n_iri64n_anyn_gt_lt 31321 5.2093 box_equal 19027 3.1646 sqlo_parse_tree_has_node 15905 2.6453 dk_alloc 15647 2.6024 itc_next_set_neq 12702 2.1126 itc_vec_split_search 12487 2.0768 itc_dive_transit 11450 1.9044 itc_bm_vec_row_check 10646 1.7706 itc_page_rcf_search 9223 1.5340 id_hash_get 9215 1.5326 gen_qsort 8867 1.4748 sqlo_key_part_best 8807 1.4648 itc_param_cmp 8062 1.3409 cmpf_iri64n_iri64n 6820 1.1343 sqlo_in_list 6005 0.9987 dc_iri_id_cmp 5905 0.9821 dk_free_tree 5801 0.9648 box_hash 5509 0.9163 dks_esc_write 5444 0.9054 sql_tree_hash_1 For 1000 Mt 754331 31.4149 cmpf_iri64n_iri64n_anyn_gt_lt 146165 6.0872 itc_vec_split_search 144795 6.0301 itc_next_set_neq 131671 5.4836 itc_dive_transit 110870 4.6173 itc_page_rcf_search 66780 2.7811 gen_qsort 66434 2.7667 itc_param_cmp 58450 2.4342 itc_bm_vec_row_check 55213 2.2994 dk_alloc 47793 1.9904 cmpf_iri64n_iri64n 44277 1.8440 dc_iri_id_cmp 39489 1.6446 cmpf_int64n 36880 1.5359 dc_append_bytes 36601 1.5243 dv_compare 31286 1.3029 dc_any_value_prefetch 25457 1.0602 itc_next_set 20852 0.8684 box_equal 19895 0.8285 dk_free_tree 19698 0.8203 itc_page_insert_search 19367 0.8066 dc_copy The top function in both is the compare for an equality of two leading IRIs and a range for the trailing any. This corresponds to the range check in Q5. At the larger scale this is three times more important. At the smaller scale, the share of query optimization is about 6.5 times greater. The top function in this category is box_equal with 5.2% vs 0.87%. The remaining SQL compiler functions are all in proportion to this, totaling 14.3% of the 100 Mt top-20 profile. From this sample it appears ten times more scale is seven times more database operations. This is not taken into account in the metric. Query compilation is significant at the small end, and no longer significant at 1000 Mt. 
From these numbers, we could say that Virtuoso is about two times more efficient in terms of database operation throughput at 1000 Mt than at 100 Mt. We may conclude that different BSBM scales measure different things. The TPC workloads are relatively better in that they have a balance between metric components that stay relatively constant across a large range of scales. This is not necessarily something that should be fixed in the BSBM Explore mix. We must however take these factors better into account in developing the BI mix. Let us also remember that BSBM Explore is a relational workload. Future posts in this series will outline how we propose to make RDF-friendlier benchmarks. Benchmarks, Redux Series Benchmarks, Redux (part 1): On RDF Benchmarks Benchmarks, Redux (part 2): A Benchmarking Story Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs Benchmarks, Redux (part 6): BSBM and I/O, continued Benchmarks, Redux (part 7): What Does BSBM Explore Measure? (this post) Benchmarks, Redux (part 8): BSBM Explore and Update Benchmarks, Redux (part 9): BSBM With Cluster Benchmarks, Redux (part 10): LOD2 and the Benchmark Process Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks Benchmarks, Redux (part 12): Our Own BSBM Results Report Benchmarks, Redux (part 13): BSBM BI Modifications Benchmarks, Redux (part 14): BSBM BI Mix Benchmarks, Redux (part 15): BSBM Test Driver Enhancements</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>We will here analyze what the <a class="auto-href" href="http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html" id="link-id0x1db49f28">BSBM</a> Explore workload does. This is necessary in order to compare benchmark results at different scales. Historically, BSBM had a Query 6 whose share of the metric approached 100% as scale increased. The present mix does not have this query, but different queries still have different relative importance at different scales.</p>

<p>We will here look at database-running statistics for BSBM at different scales. Finally, we look at <a class="auto-href" href="http://dbpedia.org/resource/Central_processing_unit" id="link-id0x1f150460">CPU</a> profiles.</p>


<p>But first, let us see what BSBM reads in general. The system is in steady state after around 1500 query mixes; after this the working set does not shift much. After several thousand query mixes, we have:</p>

<p>
<code>SELECT TOP 10 * FROM sys_d_stat ORDER BY reads DESC;</code>
</p>

<blockquote>
 <code><pre>
KEY_TABLE          INDEX_NAME                       TOUCHES    READS  READ_PCT  N_DIRTY  N_BUFFERS
=================  ============================  ==========  =======  ========  =======  =========
DB.DBA.RDF_OBJ     RDF_OBJ                        114105938  3302150         2        0    3171275
DB.DBA.RDF_QUAD    RDF_QUAD                       977426773  2041156         0        0    1970712
DB.DBA.RDF_IRI     DB_DBA_RDF_IRI_UNQC_RI_ID        8250414   509239         6       15     491631
DB.DBA.RDF_QUAD    RDF_QUAD_POGS                 3677233812   183860         0        0     175386
DB.DBA.RDF_IRI     RDF_IRI                               32    99710    302151        5      95353
DB.DBA.RDF_QUAD    RDF_QUAD_OP                        30597    51593       168        0      48941
DB.DBA.RDF_QUAD    RDF_QUAD_SP                       265474    47210        17        0      46078
DB.DBA.RDF_PREFIX  DB_DBA_RDF_PREFIX_UNQC_RP_ID        6020      212         3        0        212
DB.DBA.RDF_PREFIX  RDF_PREFIX                             0      167     16700        0        157
</pre>
 </code>
</blockquote>


<p>The first column is the table, then the index, then the number of times a row was found. The fourth number is the count of disk pages read. The last number is the count of 8K buffer pool pages in use for caching pages of the index in question. Note that the index is clustered, i.e., there is no table <a class="auto-href" href="http://dbpedia.org/resource/Data" id="link-id0x1d4f9808">data</a> structure separate from the index. Most of the reads are for strings or RDF literals. After this comes the <code>PSOG</code> index for getting a property value given the subject. After this, but much lower, we have lookups of IRI strings given the ID. The index from object value to subject is used the most but the number of pages is small; only a few properties seem to be concerned. The rest is minimal in comparison.</p>

<p>Now let us reset the counts and see what the steady state I/O profile is.</p>

<p>
<code>SELECT key_stat (key_table, name_part (key_name, 2), &#39;reset&#39;) FROM sys_keys WHERE key_migrate_to IS NULL;</code>
</p>
<p>
<code>SELECT TOP 10 * FROM sys_d_stat ORDER BY reads DESC;</code>
</p>

<blockquote>
 <code><pre>
KEY_TABLE          INDEX_NAME                       TOUCHES    READS  READ_PCT  N_DIRTY  N_BUFFERS
=================  ============================  ==========  =======  ========  =======  =========
DB.DBA.RDF_OBJ     RDF_OBJ                         30155789    79659         0        0    3191391
DB.DBA.RDF_QUAD    RDF_QUAD                       259008064     8904         0        0    1948707
DB.DBA.RDF_QUAD    RDF_QUAD_SP                        68002     7730        11        0      53360
DB.DBA.RDF_IRI     RDF_IRI                               12     5415     41653        6      98804
DB.DBA.RDF_QUAD    RDF_QUAD_POGS                  975147136     1597         0        0     173459
DB.DBA.RDF_IRI     DB_DBA_RDF_IRI_UNQC_RI_ID        2213525     1286         0       17     485093
DB.DBA.RDF_QUAD    RDF_QUAD_OP                         7999      904        11        0      48568
DB.DBA.RDF_PREFIX  DB_DBA_RDF_PREFIX_UNQC_RP_ID        1494        1         0        0        213
</pre>
 </code>
</blockquote>



<p>Literal strings dominate. The <code>SP</code> index is used only for situations where the <code>P</code> is not specified, i.e., the <code>DESCRIBE</code> query. Based on this, most of the I/O seems to be attributable to the <code>DESCRIBE</code> query. The first <code>RDF_IRI</code> represents translations from string to IRI id; the second represents translations from IRI id to string. The touch count for the first <code>RDF_IRI</code> is not properly recorded, hence its miss % (<code>READ_PCT</code>) is out of line. We see <code>SP</code> missing the <a class="auto-href" href="http://dbpedia.org/resource/Cache" id="link-id0x17d2e670">cache</a> the most since its use is infrequent in the mix.</p>
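<p>To see why a low touch count throws the miss percentage off, it helps to recompute the column. The sketch below assumes that <code>READ_PCT</code> is <code>100 * reads / (touches + 1)</code>, truncated to an integer. This formula is inferred from the printed numbers only, not taken from the Virtuoso documentation; it does reproduce the rows of both tables, including the outsized values for <code>RDF_IRI</code> and <code>RDF_PREFIX</code>.</p>

```python
def read_pct(touches, reads):
    """Inferred formula (an assumption): 100 * reads / (touches + 1), truncated."""
    return (100 * reads) // (touches + 1)


# (index, touches, reads, READ_PCT as printed in the first sys_d_stat table)
FIRST_TABLE = (
    ("RDF_OBJ", 114105938, 3302150, 2),
    ("RDF_QUAD", 977426773, 2041156, 0),
    ("DB_DBA_RDF_IRI_UNQC_RI_ID", 8250414, 509239, 6),
    ("RDF_QUAD_POGS", 3677233812, 183860, 0),
    ("RDF_IRI", 32, 99710, 302151),
    ("RDF_QUAD_OP", 30597, 51593, 168),
    ("RDF_QUAD_SP", 265474, 47210, 17),
    ("DB_DBA_RDF_PREFIX_UNQC_RP_ID", 6020, 212, 3),
    ("RDF_PREFIX", 0, 167, 16700),
)

# Print the recomputed value next to the printed one for comparison.
for name, touches, reads, printed in FIRST_TABLE:
    print(name, read_pct(touches, reads), printed)
```

<p>With only 32 recorded touches against 99710 reads, the <code>RDF_IRI</code> row comes out far above 100%, which is what one would expect if touches are under-counted.</p>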


<p>We will next look at query processing statistics. For this we introduce a new meter.</p>

<p>The <code>db_activity</code> <a class="auto-href" href="http://dbpedia.org/resource/SQL" id="link-id0x1d4915b8">SQL</a> function provides a session-by-session cumulative statistic of activity. The fields are: </p>

<ul>
<li>
  <b><code>rnd</code>
  </b> - Count of <i>random index lookups</i>. Each first row of a select or insert counts as one, regardless of whether something was found.</li>
<li>
  <b><code>seq</code>
  </b> - Count of <i>sequential rows</i>. Every move to next row on a cursor counts as 1, regardless of whether conditions match.</li>
<li>
  <b><code>same seg</code>
  </b> - For column store only; counts how many times the next row in a vectored join using an index falls in the <i>same segment</i> as the previous random access. A segment is the stretch of rows between entries in the sparse top level index on the column projection.</li>
<li>
  <b><code>same pg</code>
  </b> - Counts how many times a vectored index join finds the next match on the <i>same page</i> as the previous one.</li>
<li>
  <b><code>same par</code>
  </b> - Counts how many times the next lookup in a vectored index join falls on a different page than the previous but still under the <i>same parent</i>.</li>
<li>
  <b><code>disk</code>
  </b> - Counts how many <i>disk reads</i> were made, including any speculative reads initiated.</li>
<li>
  <b><code>spec disk</code>
  </b> - Counts <i>speculative disk reads</i>.</li>
<li>
  <b><code>messages</code>
  </b> - Counts <i>cluster interconnect messages</i>.</li>
<li>
  <b><code>B (KB, MB, GB)</code>
  </b> - The <i>total length</i> of the cluster interconnect messages.</li>
<li>
  <b><code>fork</code>
  </b> - Counts how many times a <i>thread was forked (started)</i> for query parallelization.</li>
</ul>

<p>The numbers are given with 4 significant digits and a scale suffix: G is 10^9 (1,000,000,000); M is 10^6 (1,000,000); K is 10^3 (1,000).</p>
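<p>For post-processing, a <code>db_activity</code> line can be turned back into plain integers. The following is a minimal sketch, assuming the output always has the layout of the samples shown in this post; <code>parse_db_activity</code> and the <code>bytes</code> key used for the message length are names made up here, not part of Virtuoso.</p>

```python
import re

# Scale suffixes as defined in the text.
SCALE = {"": 1, "K": 10**3, "M": 10**6, "G": 10**9}

# A number, an optional scale suffix, an optional B (bytes), then a label.
# The label is one or two lowercase words, or "/" after the byte count.
FIELD = re.compile(r"(\d+(?:\.\d+)?)([KMG]?)B?\s+(/|[a-z]+(?: [a-z]+)?)")


def parse_db_activity(line):
    """Return a dict mapping each counter label to an integer value."""
    counters = {}
    for number, suffix, label in FIELD.findall(line):
        name = "bytes" if label == "/" else label
        counters[name] = round(float(number) * SCALE[suffix])
    return counters


sample = ("1.674G rnd  3.223G seq      0 same seg   1.286G same pg  "
          "314.8M same par  6.186M disk  6.461M spec disk      "
          "0B /     0 messages  298.6K fork")
print(parse_db_activity(sample))
```

<p>Since the printed values carry only 4 significant digits, the parsed integers are approximations of the underlying counters.</p>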

<p>We run 2000 query mixes with 16 Users. The special <code><a class="auto-href" href="http://dbpedia.org/resource/Hypertext_Transfer_Protocol" id="link-id0x1bf7f318">http</a></code> account keeps a cumulative account of all activity on web server threads.</p>

<blockquote>
<p>
  <code>SELECT db_activity (2, &#39;http&#39;);</code>
</p>
<p>
  <code>1.674G rnd  3.223G seq      0 same seg   1.286G same pg  314.8M same par  6.186M disk  6.461M spec disk      0B /     0 messages  298.6K fork</code>
</p>
</blockquote>

<p>We see that random access dominates. The <code>seq</code> number is about twice the <code>rnd</code> number, meaning that the average random lookup gets two rows. Getting a row at random obviously takes more time than getting the next row. Since the index used is row-wise, the <code>same seg</code> is 0; the <code>same pg</code> indicates that 77% of the random accesses fall on the same page as the previous random access; most of the remaining random accesses fall under the same parent as the previous one.</p>
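<p>The ratios quoted above can be recomputed directly from the counters. This is plain arithmetic on the printed values; the variable names are ours.</p>

```python
# Counters from the db_activity output of the first run, as printed above.
rnd = 1.674e9       # random index lookups
seq = 3.223e9       # sequential rows
same_pg = 1.286e9   # next match on the same page
same_par = 314.8e6  # next match under the same parent

rows_per_lookup = seq / rnd        # sequential rows per random lookup
same_page_share = same_pg / rnd    # share of lookups landing on the same page
same_parent_share = same_par / rnd # share landing under the same parent

print(f"seq per rnd:       {rows_per_lookup:.2f}")
print(f"same page share:   {same_page_share:.1%}")
print(f"same parent share: {same_parent_share:.1%}")
```

<p>Same page and same parent together account for roughly 96% of the random accesses, which leaves only a few percent that need a lookup starting higher up in the tree.</p>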

<p>There are more speculative reads than disk reads, which is an artifact of counting some concurrently speculated reads twice. It does indicate that speculative reads dominate, because a large part of the run was in the warm-up state with aggressive speculative reading. We reset the counts and run another 2000 mixes.</p>

<p>Now let us look at the same reading after 2000 mixes, 16 users, at 100 Mt.</p>

<blockquote>
<p>
  <code>234.3M rnd  420.5M seq      0 same seg   188.8M same pg  29.09M same par  808.9K disk  919.9K spec disk      0B /      0 messages     76K fork</code>
</p>
</blockquote>


<p>We note that the ratios between the random, sequential, and same page/parent counts are about the same as before. The sequential number looks to be even a bit smaller in proportion. The count of random accesses for the 100 Mt run is 14% of the count for the 1000 Mt run. The count of query parallelization threads is also much lower, since it is worthwhile to schedule a new thread only if there are at least a few thousand operations to perform on it. The precise criterion for making a thread is that, according to the cost model guess, the thread must have at least 5 ms worth of work.</p>

<p>We note that the 100 Mt throughput is a little over three times that of the 1000 Mt throughput, as reported before. We might justifiably ask why the 100 Mt run is not seven times faster instead, given that it does this much less work.</p>

<p>We note that for one-off random access, it makes no real difference whether the tree has 100 M or 1000 M rows; this translates to roughly 27 vs 30 comparisons, so the depth of the tree is not a factor <i>per se</i>. Besides, vectoring makes the tree often look only one or two levels deep, so the total row count matters even less there.</p>
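<p>The 27 vs 30 figure is simply the base-2 logarithm of the row count. The sketch below assumes an idealized binary search over all rows, which ignores the actual page and tree layout; <code>comparisons</code> is a name made up for the illustration.</p>

```python
import math


def comparisons(rows):
    """Approximate comparisons for one binary search over `rows` sorted keys."""
    return math.ceil(math.log2(rows))


for rows in (100_000_000, 1_000_000_000):
    print(rows, comparisons(rows))
```

<p>Ten times more rows adds only about 3.3 comparisons per lookup, since log2(10) is about 3.32.</p>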

<p>To elucidate this last question, we look at the CPU profiles. We take an <a href="http://oprofile.sourceforge.net/about/" id="link-id0x1efb3360">oprofile</a> of 100 Single User mixes at both scales.</p>

<p>For 100 Mt:</p>

<blockquote>
 <code><pre>
61161    10.1723  cmpf_iri64n_iri64n_anyn_gt_lt
31321     5.2093  box_equal
19027     3.1646  sqlo_parse_tree_has_node
15905     2.6453  dk_alloc
15647     2.6024  itc_next_set_neq
12702     2.1126  itc_vec_split_search
12487     2.0768  itc_dive_transit
11450     1.9044  itc_bm_vec_row_check
10646     1.7706  itc_page_rcf_search
 9223     1.5340  id_hash_get
 9215     1.5326  gen_qsort
 8867     1.4748  sqlo_key_part_best
 8807     1.4648  itc_param_cmp
 8062     1.3409  cmpf_iri64n_iri64n
 6820     1.1343  sqlo_in_list
 6005     0.9987  dc_iri_id_cmp
 5905     0.9821  dk_free_tree
 5801     0.9648  box_hash
 5509     0.9163  dks_esc_write
 5444     0.9054  sql_tree_hash_1
</pre>
 </code>
</blockquote>


<p>For 1000 Mt:</p>

<blockquote>
 <code><pre>
754331   31.4149  cmpf_iri64n_iri64n_anyn_gt_lt
146165    6.0872  itc_vec_split_search
144795    6.0301  itc_next_set_neq
131671    5.4836  itc_dive_transit
110870    4.6173  itc_page_rcf_search
 66780    2.7811  gen_qsort
 66434    2.7667  itc_param_cmp
 58450    2.4342  itc_bm_vec_row_check
 55213    2.2994  dk_alloc
 47793    1.9904  cmpf_iri64n_iri64n
 44277    1.8440  dc_iri_id_cmp
 39489    1.6446  cmpf_int64n
 36880    1.5359  dc_append_bytes
 36601    1.5243  dv_compare
 31286    1.3029  dc_any_value_prefetch
 25457    1.0602  itc_next_set
 20852    0.8684  box_equal
 19895    0.8285  dk_free_tree
 19698    0.8203  itc_page_insert_search
 19367    0.8066  dc_copy
</pre>
 </code>
</blockquote>


<p>The top function in both is the compare for an equality of two leading IRIs and a range for the trailing any. This corresponds to the range check in Q5. At the larger scale this is three times more important. At the smaller scale, the share of query <a class="auto-href" href="http://dbpedia.org/resource/Program_optimization" id="link-id0x1bf8ca38">optimization</a> is about 6.5 times greater. The top function in this category is <code>box_equal</code> with 5.2% vs 0.87%. The remaining SQL compiler functions are all in proportion to this, totaling 14.3% of the 100 Mt top-20 profile.</p>
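<p>The relative weights can be read off the two listings. The sketch below only divides percentages printed in the profiles above; it makes no attempt to classify which of the other functions belong to the SQL compiler.</p>

```python
# Percent of samples, as printed in the two oprofile listings.
PROFILE_100MT = {"cmpf_iri64n_iri64n_anyn_gt_lt": 10.1723, "box_equal": 5.2093}
PROFILE_1000MT = {"cmpf_iri64n_iri64n_anyn_gt_lt": 31.4149, "box_equal": 0.8684}


def share_ratio(name, numerator, denominator):
    """Ratio of a function's profile share between two runs."""
    return numerator[name] / denominator[name]


# Range compare: how much heavier it weighs at 1000 Mt than at 100 Mt.
range_compare_ratio = share_ratio(
    "cmpf_iri64n_iri64n_anyn_gt_lt", PROFILE_1000MT, PROFILE_100MT)
# box_equal: how much heavier it weighs at 100 Mt than at 1000 Mt.
box_equal_ratio = share_ratio("box_equal", PROFILE_100MT, PROFILE_1000MT)

print(f"range compare, 1000 Mt vs 100 Mt: {range_compare_ratio:.1f}x")
print(f"box_equal, 100 Mt vs 1000 Mt:     {box_equal_ratio:.1f}x")
```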

<p>From this sample, it appears that ten times the scale means seven times as many database operations. This is not taken into account in the metric. Query compilation is significant at the small end, and no longer significant at 1000 Mt. From these numbers, we could say that <a class="auto-href" href="http://virtuoso.openlinksw.com" id="link-id0x1be12350">Virtuoso</a> is about two times more efficient in terms of database operation throughput at 1000 Mt than at 100 Mt.</p>
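<p>The reasoning behind the factor of two can be written out as follows. The random lookup counts are the <code>rnd</code> values from the two <code>db_activity</code> readings, both taken over 2000 query mixes. The throughput ratio of 3.0 is an assumed round figure, since this post only says "a little over three times"; the exact number is in the earlier results.</p>

```python
def relative_efficiency(ops_ratio, throughput_ratio):
    """How many more operations per unit of time the larger scale sustains.

    ops_ratio: operations per mix at the large scale / at the small scale.
    throughput_ratio: mixes per hour at the small scale / at the large scale.
    """
    return ops_ratio / throughput_ratio


rnd_1000mt = 1.674e9  # random lookups, 2000 mixes at 1000 Mt
rnd_100mt = 234.3e6   # random lookups, 2000 mixes at 100 Mt

ops_ratio = rnd_1000mt / rnd_100mt
assumed_throughput_ratio = 3.0  # assumption: lower bound of "a little over three"

print(f"operations ratio:    {ops_ratio:.1f}")
print(f"relative efficiency: "
      f"{relative_efficiency(ops_ratio, assumed_throughput_ratio):.1f}")
```

<p>Any throughput ratio somewhat above three brings the result closer to two, which is where the "about two times" estimate comes from.</p>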



<p>We may conclude that different BSBM scales measure different things. The <a class="auto-href" href="http://www.tpc.org/" id="link-id0x17eb98a0">TPC</a> workloads are better in this respect, in that the balance between their metric components stays relatively constant across a large range of scales.</p>


<p>This is not necessarily something that should be fixed in the BSBM Explore mix. We must however take these factors better into account in developing the BI mix.</p>

<p>Let us also remember that BSBM Explore is a relational workload. Future posts in this series will outline how we propose to make RDF-friendlier benchmarks. </p>


<h3>
<i>Benchmarks, Redux</i> Series</h3>
<ul>
<li> <a href="http://www.openlinksw.com/weblog/oerling/?id=1658" id="link-id0x1a9bcff8">Benchmarks, Redux (part 1): On RDF Benchmarks</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1660" id="link-id0x1d3e5470">Benchmarks, Redux (part 2): A Benchmarking Story</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1663" id="link-id0x1de94770">Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1665" id="link-id0x1ea66470">Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1667" id="link-id0x1f1118d8">Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs </a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1669" id="link-id0x1d1c0cd8">Benchmarks, Redux (part 6): BSBM and I/O, continued</a>
</li>
<li>
 Benchmarks, Redux (part 7): What Does BSBM Explore Measure? <i>(this post)</i>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1673" id="link-id0x1aaf4180">Benchmarks, Redux (part 8): BSBM Explore and Update </a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1675" id="link-id0x1a957610">Benchmarks, Redux (part 9): BSBM With Cluster</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1677" id="link-id0x127e75c8">Benchmarks, Redux (part 10): LOD2 and the Benchmark Process</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1678" id="link-id0x1c9400f0">Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1d2c1d68">Benchmarks, Redux (part 12): Our Own BSBM Results Report</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1ea1fb40">Benchmarks, Redux (part 13): BSBM BI Modifications </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1c073a10">Benchmarks, Redux (part 14): BSBM BI Mix </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1c5541e8">Benchmarks, Redux (part 15): BSBM Test Driver Enhancements </a>
</li>
</ul>]]></content:encoded>
  <dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Virtuso Data Space Bot &lt;kidehen@openlinksw.com&gt;</dc:creator>
 </rss:item>
 <rss:item xmlns:rss="http://purl.org/rss/1.0/" rdf:about="http://www.openlinksw.com/blog/vdb/blog/?id=1670">
  <rss:title>Benchmarks, Redux (part 6): BSBM and I/O, continued</rss:title>
  <rss:link>http://www.openlinksw.com/blog/vdb/blog/?id=1670</rss:link>
  <wfw:comment xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/mt-tb/Http/comments?id=1670</wfw:comment>
  <wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://www.openlinksw.com/blog/vdb/blog/gems/rsscomment.xml?:id=1670</wfw:commentRss>
  <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2011-03-07T22:36:24Z</dc:date>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">In the words of Jim Gray, disks have become tapes. By this he means that a disk is really only good for sequential access. For this reason, the SSD extent read ahead was incomparably better. We note that in the experiment, every page in the general area of the database the experiment touched would in time be touched, and that the whole working set would end up in memory. Therefore no speculative read would be wasted. Therefore it stands to reason to read whole extents. So I changed the default behavior to use a very long window for triggering read-ahead as long as the buffer pool was not full. After the initial filling of the buffer pool, the read ahead would require more temporal locality before kicking in. Still, the scheme was not really good since the rest of the extent would go for background-read and the triggering read would be done right then, leading to extra seeks. Well, this is good for latency but bad for throughput. So I changed this too, going to an &quot;elevator only&quot; scheme where reads that triggered read-ahead would go with the read-ahead batch. Reads that did not trigger read-ahead would still be done right in place, thus favoring latency but breaking any sequentiality with its attendant 10+ ms penalty. We keep in mind that the test we target is BSBM warm-up time, which is purely a throughput business. One could have timeouts and could penalize queries that sacrificed too much latency to throughput. We note that even for this very simple metric, just reading the allocated database pages from start to end is not good since a large number of pages in fact never get read during a run. We further note that the vectored read-ahead without any speculation will be useful as-is for cases with few threads and striping, since at least one thread&#39;s random I/Os get to go to multiple threads. The benefit is less in multiuser situations where disks are randomly busy anyhow. 
In the previous I/O experiments, we saw that with vectored read ahead and no speculation, there were around 50 pages waiting for I/O at all times. With an easily-triggered extent read-ahead, there were around 4000 pages waiting. The more pages are waiting for I/O, the greater the benefit from the elevator algorithm of servicing I/O in order of file offset. In Virtuoso 5 we had a trick that would, if the buffer pool was not full, speculatively read every uncached sibling of every index tree node it visited. This filled the cache quite fast, but was useless after the cache was full. The extent read ahead first implemented in 6 was less aggressive, but would continue working with full cache and did in fact help with shifts in the working set. The next logical step is to combine the vector and extent read-ahead modes. We see what pages we will be getting, then take the distinct extents; if we have been to this extent within the time window, we just add all the uncached allocated pages of the extent to the batch. With this setting, especially at the start of the run, we get large read-ahead batches and maintain I/O queues of 5000 to 20000 pages. The SSD starting time drops to about 120 seconds from cold start to reach 1200% CPU. We see transfer rates of up to 150 MB/s per SSD. With HDDs, we see transfer rates around 14 MB/s per drive, mostly reading chunks of an average of seventy-one (71) 8K pages. The BSBM workload does not offer better possibilities for optimization, short of pre-reading the whole database, which is not practical at large scales. Some Details First we start from cold disk, with and without mandatory read of the whole extent on the touch. 
Without any speculation but with vectored read-ahead, here are the times for the first 11 query mixes: 0: 151560.82 ms, total: 151718 ms 1: 179589.08 ms, total: 179648 ms 2: 71974.49 ms, total: 72017 ms 3: 102701.73 ms, total: 102729 ms 4: 58834.41 ms, total: 58856 ms 5: 65926.34 ms, total: 65944 ms 6: 68244.69 ms, total: 68274 ms 7: 39197.15 ms, total: 39215 ms 8: 45654.93 ms, total: 45674 ms 9: 34850.30 ms, total: 34878 ms 10: 100061.30 ms, total: 100079 ms The average CPU during this time was 5%. The best read throughput was 2.5 MB/s; the average was 1.35 MB/s. The average disk read was 16 ms. With vectored read-ahead and full extents only, i.e., max speculation: 0: 178854.23 ms, total: 179034 ms 1: 110826.68 ms, total: 110887 ms 2: 19896.11 ms, total: 19941 ms 3: 36724.43 ms, total: 36753 ms 4: 21253.70 ms, total: 21285 ms 5: 18417.73 ms, total: 18439 ms 6: 21668.92 ms, total: 21690 ms 7: 12236.49 ms, total: 12267 ms 8: 14922.74 ms, total: 14945 ms 9: 11502.96 ms, total: 11523 ms 10: 15762.34 ms, total: 15792 ms ... 90: 1747.62 ms, total: 1761 ms 91: 1701.01 ms, total: 1714 ms 92: 1300.62 ms, total: 1318 ms 93: 1873.15 ms, total: 1886 ms 94: 1508.24 ms, total: 1524 ms 95: 1748.15 ms, total: 1761 ms 96: 2076.92 ms, total: 2090 ms 97: 2199.38 ms, total: 2212 ms 98: 2305.75 ms, total: 2319 ms 99: 1771.91 ms, total: 1784 ms Scale factor: 2848260 Number of warmup runs: 0 Seed: 808080 Number of query mix runs (without warmups): 100 times min/max Querymix runtime: 1.3006s / 178.8542s Elapsed runtime: 872.993 seconds QMpH: 412.374 query mixes per hour The peak throughput is 91 MB/s, with average around 50 MB/s; CPU average around 50%. We note that the latency of the first query mix is hardly greater than in the non-speculative run, but starting from mix 3 the speed is clearly better. Then the same with cold SSDs. 
First with no speculation: 0: 5177.68 ms, total: 5302 ms 1: 2570.16 ms, total: 2614 ms 2: 1353.06 ms, total: 1391 ms 3: 1957.63 ms, total: 1978 ms 4: 1371.13 ms, total: 1386 ms 5: 1765.55 ms, total: 1781 ms 6: 1658.23 ms, total: 1673 ms 7: 1273.87 ms, total: 1289 ms 8: 1355.19 ms, total: 1380 ms 9: 1152.78 ms, total: 1167 ms 10: 1787.91 ms, total: 1802 ms ... 90: 1116.25 ms, total: 1128 ms 91: 989.50 ms, total: 1001 ms 92: 833.24 ms, total: 844 ms 93: 1137.83 ms, total: 1150 ms 94: 969.47 ms, total: 982 ms 95: 1138.04 ms, total: 1149 ms 96: 1155.98 ms, total: 1168 ms 97: 1178.15 ms, total: 1193 ms 98: 1120.18 ms, total: 1132 ms 99: 1013.16 ms, total: 1025 ms Scale factor: 2848260 Number of warmup runs: 0 Seed: 808080 Number of query mix runs (without warmups): 100 times min/max Querymix runtime: 0.8201s / 5.1777s Elapsed runtime: 127.555 seconds QMpH: 2822.321 query mixes per hour The peak I/O is 45 MB/s, with average 28.3 MB/s; CPU average is 168%. Now, SSDs with max speculation. 0: 44670.34 ms, total: 44809 ms 1: 18490.44 ms, total: 18548 ms 2: 7306.12 ms, total: 7353 ms 3: 9452.66 ms, total: 9485 ms 4: 5648.56 ms, total: 5668 ms 5: 5493.21 ms, total: 5511 ms 6: 5951.48 ms, total: 5970 ms 7: 3815.59 ms, total: 3834 ms 8: 4560.71 ms, total: 4579 ms 9: 3523.74 ms, total: 3543 ms 10: 4724.04 ms, total: 4741 ms ... 90: 673.53 ms, total: 685 ms 91: 534.62 ms, total: 545 ms 92: 730.81 ms, total: 742 ms 93: 1358.14 ms, total: 1370 ms 94: 1098.64 ms, total: 1110 ms 95: 1232.20 ms, total: 1243 ms 96: 1259.57 ms, total: 1273 ms 97: 1298.95 ms, total: 1310 ms 98: 1156.01 ms, total: 1166 ms 99: 1025.45 ms, total: 1034 ms Scale factor: 2848260 Number of warmup runs: 0 Seed: 808080 Number of query mix runs (without warmups): 100 times min/max Querymix runtime: 0.4725s / 44.6703s Elapsed runtime: 269.323 seconds QMpH: 1336.683 query mixes per hour The peak I/O is 339 MB/s, with average 192 MB/s; average CPU is 121%. 
The above was measured with the read-ahead thread doing single-page reads. We repeated the test with merged reads; the differences were small. The max IO was 353 MB/s, and average 173 MB/s; average CPU 113%. We see that the start latency is quite a bit longer than without speculation and the CPU % is lower due to higher latency of individual I/O. The I/O rate is fair. We would expect more throughput, however. We find that a supposedly better use of the API, doing single requests of up to 100 pages instead of consecutive requests of 1 page, does not make a lot of difference. The peak I/O is a bit higher; overall throughput is a bit lower. We will have to retry these experiments with a better controller. We have at no point seen anything like the 50K 4KB random I/Os per second promised for the SSDs by the manufacturer. We know for a fact that the controller gives about 700 MB/s sequential read with cat file &gt; /dev/null and two drives busy. With 4 drives busy, this does not get better. The best 30-second stretch we saw in a multiuser BSBM warm-up was 590 MB/s, which is consistent with the cat to /dev/null figure. We will later test with 8 SSDs with better controllers. Note that the average I/O and CPU are averages over 30-second measurement windows; thus for short-running tests, there is some error from the window during which the activity ended. Let us now see if we can make a BSBM instance warm up from disk in a reasonable time. We run 16 users with max speculation. We note that after reading 7,500,000 buffers we are not entirely free of disk. The max speculation read-ahead filled the cache in 17 minutes, with an average of 58 MB/s. After the cache is filled, the system shifts to a more conservative policy on extent read-ahead; one which in fact never gets triggered with the BSBM Explore in steady state. The vectored read-ahead is kept on since this by itself does not read pages that are not needed. 
However, the vectored read-ahead does not run either, because the data that is accessed in larger batches is already in memory. Thus there remains a trickle of an average 0.49 MB/s from disk. This keeps CPU around 350%. With SSDs, the trickle is about 1.5 MB/s and CPU is around 1300% in steady state. Thus SSDs give approximately triple the throughput in a situation where there is a tiny amount of continuous random disk access. The disk access in question is 80% for retrieving RDF literal strings, presumably on behalf of the DESCRIBE query in the mix. This query touches things no other query touches and does so one subject at a time, in a way that can neither be anticipated nor optimized. The Virtuoso 7 column store will deal with this better because it is more space efficient overall. If we apply stream-compression to literals, these will go in under half the space, while quads will go in maybe one-quarter the space. Thus 3000 Mt all from memory should be possible with 72 GB RAM. 1000 Mt row-wise did fit in 72 GB RAM except for the random literals accessed by the DESCRIBE. This alone drops throughput to under a third of the memory-only throughput if using HDDs. SSDs, on the other hand, can largely neutralize this effect. Conclusions We have looked at basics of I/O. SSDs have been found to be a readily available solution to I/O bottlenecks without need for reconfiguration or complex I/O policies. We have been able to get a decent read rate under conditions of server warm-up or shift of working set even with HDDs. More advanced I/O matters will be covered with the column store. We note that the techniques discussed here apply identically to rows and columns. As concerns BSBM, it seems appropriate to include a warm-up time. In practice, this means that the store just must eagerly pre-read. This is not hard to do and can be quite useful. 
Benchmarks, Redux Series Benchmarks, Redux (part 1): On RDF Benchmarks Benchmarks, Redux (part 2): A Benchmarking Story Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs Benchmarks, Redux (part 6): BSBM and I/O, continued (this post) Benchmarks, Redux (part 7): What Does BSBM Explore Measure? Benchmarks, Redux (part 8): BSBM Explore and Update Benchmarks, Redux (part 9): BSBM With Cluster Benchmarks, Redux (part 10): LOD2 and the Benchmark Process Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks Benchmarks, Redux (part 12): Our Own BSBM Results Report Benchmarks, Redux (part 13): BSBM BI Modifications Benchmarks, Redux (part 14): BSBM BI Mix Benchmarks, Redux (part 15): BSBM Test Driver Enhancements</dc:description>
  <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>In the words of Jim Gray, disks have become tapes. By this he means that a disk is really only good for sequential access. For this reason, the SSD extent read-ahead was incomparably better. We note that in the experiment, every page in the general area of the database the experiment touched would in time be touched, and that the whole working set would end up in memory. Therefore no speculative read would be wasted, and it stands to reason to read whole extents.</p>

<p>So I changed the default behavior to use a very long window for triggering read-ahead as long as the buffer pool was not full. After the initial filling of the buffer pool, the read-ahead would require more temporal locality before kicking in.</p>

<p>Still, the scheme was not really good since the rest of the extent would go for background-read and the triggering read would be done right then, leading to extra seeks. Well, this is good for latency but bad for throughput. So I changed this too, going to an &quot;elevator only&quot; scheme where reads that triggered read-ahead would go with the read-ahead batch. Reads that did not trigger read-ahead would still be done right in place, thus favoring latency but breaking any sequentiality with its attendant 10+ ms penalty.</p>


<p>We keep in mind that the test we target is <a class="auto-href" href="http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/index.html" id="link-id0x17c88010">BSBM</a> warm-up time, which is purely a throughput business. One could have timeouts and could penalize queries that sacrificed too much latency to throughput.</p>

<p>We note that even for this very simple metric, just reading the allocated database pages from start to end is not good since a large number of pages in fact never get read during a run.</p>

<p>We further note that the vectored read-ahead without any speculation will be useful as-is for cases with few threads and striping, since at least one thread&#39;s random I/Os get to go to multiple threads. The benefit is less in multiuser situations where disks are randomly busy anyhow. </p>

<p>In the previous I/O experiments, we saw that with vectored read-ahead and no speculation, there were around 50 pages waiting for I/O at all times. With an easily-triggered extent read-ahead, there were around 4000 pages waiting. The more pages are waiting for I/O, the greater the benefit from the elevator algorithm of servicing I/O in order of file offset.</p>
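
<p>As a rough illustration of why a deeper queue helps, here is a small sketch. It is not Virtuoso code: the function names are made up for this example, and summed offset distance is only an assumed stand-in for seek cost. It services the same set of pending page offsets once in arrival order and once in file-offset order, for queue depths of 50 and 4000.</p>

<blockquote>
 <code><pre>
import random

def travel(offsets, start=0):
    """Total distance the head moves when servicing offsets in the given order."""
    total = 0
    position = start
    for off in offsets:
        total += abs(off - position)
        position = off
    return total

def elevator_order(offsets):
    """One ascending sweep: service pending requests in file-offset order."""
    return sorted(offsets)

def gain(queue_depth, file_pages=1000000, seed=1):
    """Ratio of arrival-order travel to elevator-order travel for one batch."""
    rng = random.Random(seed)
    pending = [rng.randrange(file_pages) for _ in range(queue_depth)]
    return travel(pending) / travel(elevator_order(pending))

# Queue depths taken from the two cases described above.
for depth in (50, 4000):
    print(depth, round(gain(depth), 1))
</pre>
 </code>
</blockquote>

<p>With uniformly random offsets, the sorted sweep crosses the file roughly once no matter how many requests are pending, while arrival order pays roughly a third of the file length per request. In this simplified model the advantage therefore grows about linearly with queue depth.</p>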

<p>In <a class="auto-href" href="http://virtuoso.openlinksw.com" id="link-id0x1c51fae0">Virtuoso</a> 5 we had a trick that would, if the buffer pool was not full, speculatively read every uncached sibling of every index tree node it visited. This filled the <a class="auto-href" href="http://dbpedia.org/resource/Cache" id="link-id0x1d6a0cf0">cache</a> quite fast, but was useless after the cache was full. The extent read-ahead first implemented in Virtuoso 6 was less aggressive, but would continue working with a full cache and did in fact help with shifts in the working set.</p>

<p>The next logical step is to combine the vector and extent read-ahead modes. We see what pages we will be getting, then take the distinct extents; if we have been to this extent within the time window, we just add all the uncached allocated pages of the extent to the batch.</p>
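
<p>The combined rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual implementation: <code>read_ahead_batch</code> and all of its parameters are hypothetical names, the set of extents visited within the time window is assumed to be tracked elsewhere and passed in, and the default of 256 pages per extent is an assumed value.</p>

<blockquote>
 <code><pre>
def read_ahead_batch(wanted_pages, recent_extents, allocated, cached, extent_pages=256):
    """Build one read-ahead batch from the pages a vectored operation will need.

    wanted_pages   -- page numbers the operation is about to touch
    recent_extents -- extent numbers visited within the time window
    allocated      -- set of page numbers that are allocated in the file
    cached         -- set of page numbers already in the buffer pool
    extent_pages   -- pages per extent (256 is an assumed value)
    """
    # Vectored part: the pages we know we will need and do not have yet.
    batch = {p for p in wanted_pages if p not in cached}
    # Extent part: for each distinct extent seen recently, take the rest of it.
    for extent in {p // extent_pages for p in wanted_pages}:
        if extent in recent_extents:
            first = extent * extent_pages
            whole = range(first, first + extent_pages)
            batch.update(p for p in whole if p in allocated and p not in cached)
    # Sorted by page number so the batch can be serviced in file-offset order.
    return sorted(batch)

# Tiny example with 4-page extents: extent 1 holds pages 4-7, extent 2 holds 8-11.
batch = read_ahead_batch(
    wanted_pages=[5, 9],
    recent_extents={1},
    allocated={4, 5, 6, 8, 9, 10},
    cached={6},
    extent_pages=4,
)
print(batch)  # prints [4, 5, 9]
</pre>
 </code>
</blockquote>

<p>In the example, page 5 falls in a recently visited extent, so its uncached allocated neighbor 4 joins the batch; page 6 is already cached and page 7 is not allocated. Page 9 falls in an extent that has not been visited recently, so only page 9 itself is read.</p>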

<p>With this setting, especially at the start of the run, we get large read-ahead batches and maintain I/O queues of 5000 to 20000 pages. The SSD starting time drops to about 120 seconds from cold start to reach 1200% <a class="auto-href" href="http://dbpedia.org/resource/Central_processing_unit" id="link-id0x1d295448">CPU</a>. We see transfer rates of up to 150 MB/s per SSD. With HDDs, we see transfer rates around 14 MB/s per drive, mostly reading chunks that average 71 pages of 8K each.</p>

<p>The BSBM workload does not offer better possibilities for <a class="auto-href" href="http://dbpedia.org/resource/Program_optimization" id="link-id0x1aca8b40">optimization</a>, short of pre-reading the whole database, which is not practical at large scales. </p>

<h2>Some Details</h2>

<p>First we start from cold disk, with and without a mandatory read of the whole extent on touch.</p>

<p>Without any speculation but with vectored read-ahead, here are the times for the first 11 query mixes:</p>

<blockquote>
 <code><pre>
 0: 151560.82 ms, total: 151718 ms
 1: 179589.08 ms, total: 179648 ms
 2:  71974.49 ms, total:  72017 ms
 3: 102701.73 ms, total: 102729 ms
 4:  58834.41 ms, total:  58856 ms
 5:  65926.34 ms, total:  65944 ms
 6:  68244.69 ms, total:  68274 ms
 7:  39197.15 ms, total:  39215 ms
 8:  45654.93 ms, total:  45674 ms
 9:  34850.30 ms, total:  34878 ms
10: 100061.30 ms, total: 100079 ms
</pre>
 </code>
</blockquote>

<p>The average CPU during this time was 5%. The best read throughput was 2.5 MB/s; the average was 1.35 MB/s. The average disk read was 16 ms. </p>

<p>With vectored read-ahead and full extents only, i.e., max speculation:</p>

<blockquote>
 <code><pre>
 0: 178854.23 ms, total: 179034 ms
 1: 110826.68 ms, total: 110887 ms
 2:  19896.11 ms, total:  19941 ms
 3:  36724.43 ms, total:  36753 ms
 4:  21253.70 ms, total:  21285 ms
 5:  18417.73 ms, total:  18439 ms
 6:  21668.92 ms, total:  21690 ms
 7:  12236.49 ms, total:  12267 ms
 8:  14922.74 ms, total:  14945 ms
 9:  11502.96 ms, total:  11523 ms
10:  15762.34 ms, total:  15792 ms
...

90:   1747.62 ms, total:   1761 ms
91:   1701.01 ms, total:   1714 ms
92:   1300.62 ms, total:   1318 ms
93:   1873.15 ms, total:   1886 ms
94:   1508.24 ms, total:   1524 ms
95:   1748.15 ms, total:   1761 ms
96:   2076.92 ms, total:   2090 ms
97:   2199.38 ms, total:   2212 ms
98:   2305.75 ms, total:   2319 ms
99:   1771.91 ms, total:   1784 ms

Scale factor:              2848260
Number of warmup runs:     0
Seed:                      808080
Number of query mix runs 
  (without warmups):       100 times
min/max Querymix runtime:  1.3006s / 178.8542s
Elapsed runtime:           872.993 seconds
QMpH:                      412.374 query mixes per hour
</pre>
 </code>
</blockquote>


<p>The peak throughput is 91 MB/s, with average around 50 MB/s; CPU average around 50%.</p>

<p>We note that the latency of the first query mix is hardly greater than in the non-speculative run, but starting from mix 3 the speed is clearly better. </p>



<p>Then the same with cold SSDs. First with no speculation:</p>

<blockquote>
 <code><pre>
 0:   5177.68 ms, total:   5302 ms
 1:   2570.16 ms, total:   2614 ms
 2:   1353.06 ms, total:   1391 ms
 3:   1957.63 ms, total:   1978 ms
 4:   1371.13 ms, total:   1386 ms
 5:   1765.55 ms, total:   1781 ms
 6:   1658.23 ms, total:   1673 ms
 7:   1273.87 ms, total:   1289 ms
 8:   1355.19 ms, total:   1380 ms
 9:   1152.78 ms, total:   1167 ms
10:   1787.91 ms, total:   1802 ms
...

90:   1116.25 ms, total:   1128 ms
91:    989.50 ms, total:   1001 ms
92:    833.24 ms, total:    844 ms
93:   1137.83 ms, total:   1150 ms
94:    969.47 ms, total:    982 ms
95:   1138.04 ms, total:   1149 ms
96:   1155.98 ms, total:   1168 ms
97:   1178.15 ms, total:   1193 ms
98:   1120.18 ms, total:   1132 ms
99:   1013.16 ms, total:   1025 ms

Scale factor:              2848260
Number of warmup runs:     0
Seed:                      808080
Number of query mix runs 
  (without warmups):       100 times
min/max Querymix runtime:  0.8201s / 5.1777s
Elapsed runtime:           127.555 seconds
QMpH:                      2822.321 query mixes per hour
</pre>
 </code>
</blockquote>


<p>The peak I/O is 45 MB/s, with average 28.3 MB/s; CPU average is 168%.</p>

<p>Now, SSDs with max speculation.</p>

<blockquote>
 <code><pre>
 0:  44670.34 ms, total:  44809 ms
 1:  18490.44 ms, total:  18548 ms
 2:   7306.12 ms, total:   7353 ms
 3:   9452.66 ms, total:   9485 ms
 4:   5648.56 ms, total:   5668 ms
 5:   5493.21 ms, total:   5511 ms
 6:   5951.48 ms, total:   5970 ms
 7:   3815.59 ms, total:   3834 ms
 8:   4560.71 ms, total:   4579 ms
 9:   3523.74 ms, total:   3543 ms
10:   4724.04 ms, total:   4741 ms
...

90:    673.53 ms, total:    685 ms
91:    534.62 ms, total:    545 ms
92:    730.81 ms, total:    742 ms
93:   1358.14 ms, total:   1370 ms
94:   1098.64 ms, total:   1110 ms
95:   1232.20 ms, total:   1243 ms
96:   1259.57 ms, total:   1273 ms
97:   1298.95 ms, total:   1310 ms
98:   1156.01 ms, total:   1166 ms
99:   1025.45 ms, total:   1034 ms

Scale factor:              2848260
Number of warmup runs:     0
Seed:                      808080
Number of query mix runs 
  (without warmups):       100 times
min/max Querymix runtime:  0.4725s / 44.6703s
Elapsed runtime:           269.323 seconds
QMpH:                      1336.683 query mixes per hour
</pre>
 </code>
</blockquote>


<p>The peak I/O is 339 MB/s, with average 192 MB/s; average CPU is 121%.</p>
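
<p>The QMpH figures in the three run summaries above appear to be simply the number of query mixes scaled to one hour of elapsed time. That formula is an assumption about the test driver, not something taken from its source. Recomputing it from the rounded elapsed times reproduces the reported values to within rounding:</p>

<blockquote>
 <code><pre>
def qmph(query_mixes, elapsed_seconds):
    """Query mixes per hour for a run of query_mixes mixes."""
    return query_mixes * 3600.0 / elapsed_seconds

# Elapsed runtimes in seconds, copied from the three summaries above.
runs = {
    "HDD, max speculation": 872.993,
    "SSD, no speculation": 127.555,
    "SSD, max speculation": 269.323,
}
for name, elapsed in runs.items():
    print(name, round(qmph(100, elapsed), 1))
</pre>
 </code>
</blockquote>

<p>Over a cold run of only 100 mixes, the SSD run without speculation scores the higher QMpH, since the speculative run spends its first mixes waiting on large read-ahead batches (44.7 s against 5.2 s for mix 0).</p>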

<p>The above was measured with the read-ahead thread doing single-page reads. We repeated the test with merged reads; the differences were small. The max IO was 353 MB/s, and average 173 MB/s; average CPU 113%.</p>

<p>We see that the start latency is quite a bit longer than without speculation and the CPU % is lower due to higher latency of individual I/O. The I/O rate is fair. We would expect more throughput, however.</p>

<p>We find that a supposedly better use of the API, doing single requests of up to 100 pages instead of consecutive requests of 1 page, does not make a lot of difference. The peak I/O is a bit higher; overall throughput is a bit lower.</p>



<p>We will have to retry these experiments with a better controller. We have at no point seen anything like the 50K 4KB random I/Os per second promised for the SSDs by the manufacturer. We know for a fact that the controller gives about 700 MB/s sequential read with <code>cat file &gt; /dev/null</code> and two drives busy. With 4 drives busy, this does not get better. The best 30-second stretch we saw in a multiuser BSBM warm-up was 590 MB/s, which is consistent with the <code>cat</code> to <code>/dev/null</code> figure. We will later test with 8 SSDs with better controllers.</p>

<p>Note that the average I/O and CPU are averages over 30-second measurement windows; thus for short-running tests, there is some error from the window during which the activity ended.</p>


<p>Let us now see if we can make a BSBM instance warm up from disk in a reasonable time. We run 16 users with max speculation. We note that after reading 7,500,000 buffers we are not entirely free of disk. The max speculation read-ahead filled the cache in 17 minutes, with an average of 58 MB/s. After the cache is filled, the system shifts to a more conservative policy on extent read-ahead; one which in fact never gets triggered with the BSBM <i>Explore</i> in steady state. The vectored read-ahead is kept on since this by itself does not read pages that are not needed. However, the vectored read-ahead does not run either, because the <a class="auto-href" href="http://dbpedia.org/resource/Data" id="link-id0x1c9bca60">data</a> that is accessed in larger batches is already in memory. Thus there remains a trickle of an average 0.49 MB/s from disk. This keeps CPU around 350%. With SSDs, the trickle is about 1.5 MB/s and CPU is around 1300% in steady state. Thus SSDs give approximately triple the throughput in a situation where there is a tiny amount of continuous random disk access. The disk access in question is 80% for retrieving <a class="auto-href" href="http://dbpedia.org/resource/Resource_Description_Framework" id="link-id0x1c05e280">RDF</a> literal strings, presumably on behalf of the <code>DESCRIBE</code> query in the mix. This query touches things no other query touches and does so one subject at a time, in a way that can neither be anticipated nor optimized.</p>
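
<p>As a quick consistency check on the warm-up figures above, the sketch below compares the size of 7,500,000 buffers with the volume read in 17 minutes at 58 MB/s. It assumes 8K pages as mentioned earlier in the post and takes 1 MB to mean 10^6 bytes; the variable names are chosen here for illustration.</p>

<blockquote>
 <code><pre>
PAGE_BYTES = 8 * 1024      # 8K pages, as mentioned earlier in the post
buffers = 7500000          # buffers read during warm-up
fill_seconds = 17 * 60     # the cache filled in 17 minutes
avg_mb_per_s = 58          # average read rate during the fill

# Size of the buffers if every one holds a full page.
pool_gb = buffers * PAGE_BYTES / 1e9
# Volume actually read from disk at the average rate.
read_gb = fill_seconds * avg_mb_per_s * 1e6 / 1e9

print(round(pool_gb, 1), round(read_gb, 1))
</pre>
 </code>
</blockquote>

<p>The two come out at about 61 GB and 59 GB, so the reported fill time and average rate are consistent with each other to within a few percent.</p>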

<p>The Virtuoso 7 column store will deal with this better because it is more space efficient overall. If we apply stream-compression to literals, these will go in under half the space, while quads will go in maybe one-quarter the space. Thus 3000 Mt all from memory should be possible with 72 GB RAM. 1000 Mt row-wise did fit in 72 GB RAM except for the random literals accessed by the <code>DESCRIBE</code>. This alone drops throughput to under a third of the memory-only throughput if using HDDs. SSDs, on the other hand, can largely neutralize this effect.</p>

 
<h2>Conclusions</h2>


<p>We have looked at the basics of I/O. SSDs have been found to be a readily available solution to I/O bottlenecks without need for reconfiguration or complex I/O policies. We have been able to get a decent read rate under conditions of server warm-up or shift of working set even with HDDs.</p>

<p>More advanced I/O matters will be covered with the column store. We note that the techniques discussed here apply identically to rows and columns.</p>

<p>As concerns BSBM, it seems appropriate to include a warm-up time. In practice, this means that the store simply must pre-read eagerly. This is not hard to do and can be quite useful.</p>


<h3>
<i>Benchmarks, Redux</i> Series</h3>
<ul>
<li> <a href="http://www.openlinksw.com/weblog/oerling/?id=1658" id="link-id0x1b4342b0">Benchmarks, Redux (part 1): On RDF Benchmarks</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1660" id="link-id0x1d3e7388">Benchmarks, Redux (part 2): A Benchmarking Story</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1663" id="link-id0x153c7ba8">Benchmarks, Redux (part 3): Virtuoso 7 vs 6 on BSBM Load and Explore</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1665" id="link-id0x1da11d98">Benchmarks, Redux (part 4): Benchmark Tuning Questionnaire</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1667" id="link-id0x1d25d630">Benchmarks, Redux (part 5): BSBM and I/O; HDDs and SSDs</a>
</li>
<li>
 Benchmarks, Redux (part 6): BSBM and I/O, continued <i>(this post)</i>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1671" id="link-id0x1f1f5ee8">Benchmarks, Redux (part 7): What Does BSBM Explore Measure?</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1673" id="link-id0x1cd44938">Benchmarks, Redux (part 8): BSBM Explore and Update </a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1675" id="link-id0x1d51f848">Benchmarks, Redux (part 9): BSBM With Cluster</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1677" id="link-id0x13d333c0">Benchmarks, Redux (part 10): LOD2 and the Benchmark Process</a>
</li>
<li>
 <a href="http://www.openlinksw.com/weblog/oerling/?id=1678" id="link-id0x1e77a5e8">Benchmarks, Redux (part 11): On the Substance of RDF Benchmarks</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1ea1fb40">Benchmarks, Redux (part 12): Our Own BSBM Results Report</a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1e7786c8">Benchmarks, Redux (part 13): BSBM BI Modifications </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1f8a37f8">Benchmarks, Redux (part 14): BSBM BI Mix </a>
</li>
<li>
  <a href="http://www.openlinksw.com/weblog/oerling/?id=" id="link-id0x1c69e018">Benchmarks, Redux (part 15): BSBM Test Driver Enhancements </a>
</li>
</ul>]]></content:encoded>
  <dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Virtuso Data Space Bot &lt;kidehen@openlinksw.com&gt;</dc:creator>
 </rss:item>
</rdf:RDF>
