<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
<channel>

<title>Comparison of Open Source Databases with TPC D Queries</title><link>http://www.openlinksw.com:443/blog/vdb/blog/?id=1132</link><description>
Comparison of Open Source Databases with TPC D Queries

Last time we talked about database engine and transactions. Now we have come to the realm of query processing in our revisiting of the DBMS side of Virtuoso.
Now the well established, respectable standard benchmark for the basics of query processing is TPC D with its derivatives H and R. So we have, for testing how different SQL optimizers manage the 22 queries, run a mini version of the D queries with a 1% scale database, some 30M of data, all in memory. This basically catches whether SQL implementations miss some of the expected tricks and how efficient in memory loop and hash joins and aggregation are.
When we get to our next stop, high volume I/O, we will run the same with D databases in the 10G ballpark.
The databases were tested on the same machine, with warm cache, taking the best run of 3. All had full statistics and were running with read committed isolation, where applicable. The data was generated using the procedures from the Virtuoso test suite. The Virtuoso version tested was 5.0, to be released shortly. The MySQL was 5.0.27, the PostgreSQL was 8.1.6. 
  
 Query Query Times in Milliseconds  
  Virtuoso   PostgreSQL   MySQL   MySQL with InnoDB   
  Q1   206   763   312   198   
  Q2   4   6   3   3   
  Q3   13   51   254   64   
  Q4   4   16   24   60   
  Q5   15   22   64   68   
  Q6   9   70   189   65   
  Q7   52   143   211   84   
  Q8   29   31   13   11   
  Q9   36   114   97   61   
  Q10   32   51   117   57   
  Q11   16   9   12   10   
  Q12   8   21   18   130   
  Q13   18   74   -   -   
  Q14   7   21   418   1425   
  Q15   14   43   389   122   
  Q16   16   22   18   25   
  Q17   1   54   26   10   
  Q18   82   120   -   -   
  Q19   19   8   2   17   
  Q20   7   15   66   52   
  Q21   34   86   524   278   
  Q22   4   323   3311  805   
 Total (msec) 626 2063 6068 3545   
We lead by a fair margin but MySQL is hampered by obviously getting some execution plans wrong and not doing Q13 and Q18 at all, at least not under several tens of seconds; so we left these out of the table in the interest of having comparable totals.
As usual, we also ran the workload on Oracle 10g R2. Since Oracle does not like their numbers being published without explicit approval, we will just say that we are even with them within the parameters described above. Oracle has a more efficient decimal type so it wins where that is central, as on Q1. Also it seems to notice that the GROUP BYs of Q18 are produced in order of grouping columns, so it needs no intermediate table for storing the aggregates. If we addressed these matters, we&#39;d lead by some 15% whereas now we are even. A faster decimal arithmetic implementation may be in the release after next. In the next posts, we will look at IO and disk allocation, and also return to RDF and LUBM.
</description><pubDate>Mon, 05 Feb 2007 11:44:34 GMT</pubDate><generator>Virtuoso Universal Server 08.03.3334</generator><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Virtuoso Data Space Bot</dc:creator><image><title>Comparison of Open Source Databases with TPC D Queries</title><url>http://www.openlinksw.com:443/weblog/public/images/vbloglogo.gif</url><link>http://www.openlinksw.com:443/blog/vdb/blog/?id=1132</link><description>A great place to track Virtuoso&#39;s rapid evolution.</description><width>88</width><height>31</height></image>
<item><title>anonymous</title><guid>http://www.openlinksw.com:443/blog/vdb/blog/?id=1132#4494</guid><link>http://www.openlinksw.com:443/blog/vdb/blog/?id=1132#4494</link><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/" /><pubDate>Tue, 13 Feb 2007 09:18:58 GMT</pubDate><description>Where Firebird? It&#39;s Open Source Databases too.</description></item>
</channel>
</rss>
