We have now taken a close look at the query side of the LUBM benchmark, as promised a couple of blog posts ago.

We load 8000 universities and run a query mix consisting of the 14 LUBM queries with different numbers of clients against different portions of the database.

When the working set is all in memory, we get 33 queries per second (qps) with 8 concurrent clients; when the run is so I/O bound that on average 7.7 of 8 threads wait for disk, we get 5 qps. This was run on a machine with 8 GB of RAM and two Xeon 5130 processors.

We adapted some of the queries so that they do not run over the whole database. In terms of triples retrieved per second, the rate of 33 qps corresponds to about 330,000, on 4 cores at 2 GHz. This is a combination of random access, linear scans, and bitmap merge intersections; lookups that find no triple are not counted. The rate of random lookups alone, based on known G, S, P, O and without any query logic, is about 250,000 random lookups per core per second.
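To put these figures side by side, here is a small back-of-the-envelope calculation in Python. It uses only the numbers quoted above; the per-query and per-core averages are derived for illustration, assume the load spreads evenly over the 4 cores, and are not separately measured results.

```python
# Figures quoted in the text above.
TRIPLES_PER_SECOND = 330_000              # triples retrieved per second at 33 qps
QUERIES_PER_SECOND = 33                   # in-memory rate with 8 clients
CORES = 4                                 # two dual-core Xeon 5130 at 2 GHz
RANDOM_LOOKUPS_PER_CORE = 250_000         # raw G, S, P, O lookups per core per second

# Derived averages (assumption: work is spread evenly over all cores).
triples_per_query = TRIPLES_PER_SECOND / QUERIES_PER_SECOND
triples_per_core = TRIPLES_PER_SECOND / CORES
share_of_raw_rate = triples_per_core / RANDOM_LOOKUPS_PER_CORE

print(f"average triples per query:     {triples_per_query:,.0f}")
print(f"triples per core per second:   {triples_per_core:,.0f}")
print(f"fraction of raw lookup rate:   {share_of_raw_rate:.2f}")
```

So an average query in the mix touches on the order of 10,000 triples, and the per-core triple rate with full query logic comes to roughly a third of the raw random lookup rate.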

The article LUBM and Virtuoso gives the details.

In the process of going through the workload, we made some cost model adjustments and optimized the bitmap intersection join. With this join we can quickly determine which subjects are, for example, professors holding a degree from a given university. So the benchmark served us well in that it provided an incentive to further optimize some things.
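The idea behind a bitmap intersection can be sketched in a few lines of Python. This is only a toy illustration, not Virtuoso's implementation: the subject IDs are made up, and the helper names `to_bitmap` and `from_bitmap` are hypothetical. Each bitmap has one bit per subject ID, and the subjects satisfying both conditions are found with a single bitwise AND.

```python
def to_bitmap(subject_ids):
    """Build a bitmap (a Python int) with one bit set per subject ID."""
    bitmap = 0
    for sid in subject_ids:
        bitmap |= 1 << sid
    return bitmap


def from_bitmap(bitmap):
    """Return the subject IDs whose bits are set, in ascending order."""
    ids = []
    position = 0
    while bitmap:
        if bitmap & 1:
            ids.append(position)
        bitmap >>= 1
        position += 1
    return ids


# Hypothetical subject IDs, for illustration only.
professors = to_bitmap([3, 7, 12, 20, 41])             # subjects typed as professor
degree_from_univ = to_bitmap([2, 7, 20, 33, 41, 50])   # subjects with a degree from one university

# The intersection is one bitwise AND over the two bitmaps.
matches = from_bitmap(professors & degree_from_univ)
print(matches)  # [7, 20, 41]
```

The appeal of the approach is that the AND works on whole machine words at a time, so no per-subject lookup is needed to combine the two conditions.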

Now, what has been said about RDF benchmarking previously still holds. What does it mean to do so many LUBM queries per second? What does this say about the capacity to run an online site off RDF data? Or about information integration? Not very much. But then this was not the aim of the authors either.

So we still need to make a benchmark for online queries and search, and another for E-science and business intelligence. But we are getting there.

In the immediate future, Virtuoso Open Source 5.0.5 will become generally available early next week. It comes with a LUBM test driver and a test suite running against the LUBM qualification database.

After this we will give some numbers for the cluster edition with LUBM and TPC-H.