We had a look at Chris Bizer's initial results with the Berlin SPARQL Benchmark (BSBM) on Virtuoso. The first results were rather bad: nearly all of the run time went into optimizing the SPARQL statements, and under 10% into actually running them.

So I spent a couple of days on the SPARQL/SQL compiler, making it produce a better initial guess at the execution plan and streamlining some operations. In fact, many of the queries in BSBM are not particularly sensitive to the execution plan, since they access only a very small portion of the database. So to close the matter, I put in a flag that makes the SQL compiler stop devising new plans once the estimated run time of the best plan so far is less than the time already spent compiling.
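The cut-off rule can be sketched as a small simulation. This is only an illustration of the idea, not Virtuoso's compiler code: the `Candidate` type, the `pick_plan` function, and all the timings are made up for the example, and compile time is simulated by summing a per-candidate search cost rather than reading a clock.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Candidate:
    """A hypothetical plan produced by the optimizer (illustrative only)."""
    name: str
    est_run_time: float  # estimated execution time of the plan, in seconds
    search_cost: float   # simulated time spent devising this plan, in seconds


def pick_plan(candidates: Iterable[Candidate],
              cutoff: bool = True) -> Tuple[Optional[Candidate], float, int]:
    """Return (best plan, compile time spent, number of plans considered).

    With cutoff=True the search stops as soon as the best plan found so far
    is estimated to run in less time than has already been spent compiling.
    """
    best: Optional[Candidate] = None
    compile_time = 0.0
    considered = 0
    for cand in candidates:
        compile_time += cand.search_cost
        considered += 1
        if best is None or cand.est_run_time < best.est_run_time:
            best = cand
        # Further searching cannot pay for itself once compiling has
        # already cost more than simply running the best plan would.
        if cutoff and best.est_run_time < compile_time:
            break
    return best, compile_time, considered


# Made-up numbers for a short lookup query.
plans = [
    Candidate("initial guess", 0.004, 0.002),
    Candidate("reordered joins", 0.003, 0.002),
    Candidate("hash join variant", 0.0025, 0.002),
    Candidate("exhaustive search", 0.002, 0.010),
]

best_cut, compile_cut, n_cut = pick_plan(plans, cutoff=True)
best_full, compile_full, n_full = pick_plan(plans, cutoff=False)

total_cut = compile_cut + best_cut.est_run_time
total_full = compile_full + best_full.est_run_time
print(f"cut-off: {best_cut.name}, total {total_cut:.4f} s")
print(f"full:    {best_full.name}, total {total_full:.4f} s")
```

In this made-up case the full search finds a slightly faster plan, but compile plus run time comes out higher than with the cut-off, which is the situation described for short queries.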

With these changes, available now as a diff on top of 5.0.7, we run quite well, several times better than initially. With the compiler time cut-off in place (ini parameter StopCompilerWhenXOverRunTime = 1), we get the following times, output from the BSBM test driver:

Starting test...

0: 1031.22 ms, total: 1151 ms
1:  982.89 ms, total: 1040 ms
2:  923.27 ms, total:  968 ms
3:  898.37 ms, total:  932 ms
4:  855.70 ms, total:  865 ms

Scale factor:               10000
Number of query mix runs:   5 times
min/max Query mix runtime:  0.8557 s / 1.0312 s
Total runtime:              4.691 seconds
QMpH:                       3836.77 query mixes per hour
CQET:                       0.93829 seconds average runtime 
                                       of query mix
CQET (geom.):               0.93625 seconds geometric mean 
                                       runtime of query mix

Metrics for Query 1:
   Count:                 5 times executed in whole run
   AQET:                  0.012212 seconds (arithmetic mean)
   AQET(geom.):           0.009934 seconds (geometric mean)
   QPS:                   81.89 Queries per second
   minQET/maxQET:         0.00684000s / 0.03115700s
   Average result count:  7.0
   min/max result count:  3 / 10

Metrics for Query 2:
   Count:                 35 times executed in whole run
   AQET:                  0.030490 seconds (arithmetic mean)
   AQET(geom.):           0.029776 seconds (geometric mean)
   QPS:                   32.80 Queries per second
   minQET/maxQET:         0.02467300s / 0.06753000s
   Average result count:  22.5
   min/max result count:  15 / 30

Metrics for Query 3:
   Count:                 5 times executed in whole run
   AQET:                  0.006947 seconds (arithmetic mean)
   AQET(geom.):           0.006905 seconds (geometric mean)
   QPS:                   143.95 Queries per second
   minQET/maxQET:         0.00580000s / 0.00795100s
   Average result count:  4.0
   min/max result count:  0 / 10

Metrics for Query 4:
   Count:                 5 times executed in whole run
   AQET:                  0.008858 seconds (arithmetic mean)
   AQET(geom.):           0.008829 seconds (geometric mean)
   QPS:                   112.89 Queries per second
   minQET/maxQET:         0.00804400s / 0.01019500s
   Average result count:  3.4
   min/max result count:  0 / 10

Metrics for Query 5:
   Count:                 5 times executed in whole run
   AQET:                  0.087542 seconds (arithmetic mean)
   AQET(geom.):           0.087327 seconds (geometric mean)
   QPS:                   11.42 Queries per second
   minQET/maxQET:         0.08165600s / 0.09889200s
   Average result count:  5.0
   min/max result count:  5 / 5

Metrics for Query 6:
   Count:                 5 times executed in whole run
   AQET:                  0.131222 seconds (arithmetic mean)
   AQET(geom.):           0.131216 seconds (geometric mean)
   QPS:                   7.62 Queries per second
   minQET/maxQET:         0.12924200s / 0.13298200s
   Average result count:  3.6
   min/max result count:  3 / 5

Metrics for Query 7:
   Count:                 20 times executed in whole run
   AQET:                  0.043601 seconds (arithmetic mean)
   AQET(geom.):           0.040890 seconds (geometric mean)
   QPS:                   22.94 Queries per second
   minQET/maxQET:         0.01984400s / 0.06012600s
   Average result count:  26.4
   min/max result count:  5 / 96

Metrics for Query 8:
   Count:                 10 times executed in whole run
   AQET:                  0.018168 seconds (arithmetic mean)
   AQET(geom.):           0.016205 seconds (geometric mean)
   QPS:                   55.04 Queries per second
   minQET/maxQET:         0.01097600s / 0.05066900s
   Average result count:  12.8
   min/max result count:  6 / 20

Metrics for Query 9:
   Count:                 20 times executed in whole run
   AQET:                  0.043813 seconds (arithmetic mean)
   AQET(geom.):           0.043807 seconds (geometric mean)
   QPS:                   22.82 Queries per second
   minQET/maxQET:         0.04274900s / 0.04504100s
   Average result count:  0.0
   min/max result count:  0 / 0

Metrics for Query 10:
   Count:                 15 times executed in whole run
   AQET:                  0.030697 seconds (arithmetic mean)
   AQET(geom.):           0.029651 seconds (geometric mean)
   QPS:                   32.58 Queries per second
   minQET/maxQET:         0.02072000s / 0.03975700s
   Average result count:  1.1
   min/max result count:  0 / 4

   real  0 m 5.485 s
   user  0 m 2.233 s
   sys   0 m 0.170 s

Of the approximately 5.5 seconds spent running the five query mixes, the test driver accounts for 2.2 s. Server-side processing takes 3.1 s, of which SQL compilation is 1.35 s. The rest is miscellaneous system time. The measurements were made on 64-bit Linux, on a 2 GHz dual-Xeon 5130 machine (8 cores) with 8 GB RAM.
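As a cross-check, the summary figures in the driver output can be re-derived from the five per-mix timings. The sketch below assumes that QMpH is 3600 divided by the arithmetic mean mix runtime and that QPS is the reciprocal of AQET. Both are inferred from the printed numbers, not taken from the driver's source.

```python
import math

# Per-mix runtimes as printed by the test driver, in milliseconds.
mix_ms = [1031.22, 982.89, 923.27, 898.37, 855.70]
mix_s = [t / 1000.0 for t in mix_ms]


def arithmetic_mean(xs):
    return sum(xs) / len(xs)


def geometric_mean(xs):
    # exp of the mean of logs; all inputs are positive runtimes.
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


total_runtime = sum(mix_s)           # reported as 4.691 s
cqet = arithmetic_mean(mix_s)        # reported as 0.93829 s
cqet_geom = geometric_mean(mix_s)    # reported as 0.93625 s
qmph = 3600.0 / cqet                 # reported as 3836.77 (assumed formula)

# Query 1: QPS assumed to be 1 / AQET.
qps_q1 = 1.0 / 0.012212              # reported as 81.89

# Share of server-side time spent in SQL compilation (1.35 s of 3.1 s).
compile_share = 1.35 / 3.1

print(f"total {total_runtime:.3f} s, CQET {cqet:.5f} s, "
      f"CQET(geom.) {cqet_geom:.5f} s, QMpH {qmph:.2f}")
print(f"Query 1 QPS {qps_q1:.2f}, compile share {compile_share:.1%}")
```

Even with the cut-off in place, compilation is still somewhat over 40% of the server-side time.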

We note that in the SQL world this type of workload would be done with stored procedures or prepared, parameterized queries, so that the compilation cost is not paid again on every execution.
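To illustrate what a parameterized query looks like on the SQL side, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in. It is not Virtuoso and not part of BSBM; the `product` table and its rows are invented for the example. The point is only that the statement text stays constant while the parameter changes, so a compiled statement can be reused.

```python
import sqlite3

# In-memory database with a made-up table, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (id INTEGER PRIMARY KEY, label TEXT, price REAL)"
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [(1, "widget", 9.5), (2, "gadget", 19.0), (3, "gizmo", 4.25)],
)

# The SQL text never changes; only the bound parameter does. Python's
# sqlite3 keeps a per-connection cache of compiled statements keyed by
# the SQL text, so repeated executions can skip recompilation.
QUERY = "SELECT label FROM product WHERE price < ? ORDER BY price"


def labels_cheaper_than(limit):
    return [row[0] for row in conn.execute(QUERY, (limit,))]


cheap = labels_cheaper_than(10.0)
everything = labels_cheaper_than(100.0)
print(cheap)
print(everything)
```

A benchmark driver that instead sends a fresh query text for every execution forces the server to compile each one, which is why compile time weighs so heavily in the figures above.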

Some further tuning is still to come, but this addresses the bulk of the matter. A separate message will follow about the patch containing these improvements.