Benchmarks, Redux (part 15): BSBM Test Driver Enhancements

Details

This article covers the changes we have made to the BSBM test driver during our series of experiments.

Drill-down mode - For queries that have a product type as parameter, the test driver will invoke the query multiple times with each time a random subtype of the product type of the previous invocation. The starting point of the drill-down is an a random type from a settable level in the hierarchy. The rationale for the drill-down mode is that depending on the parameter choice, there can be 1000x differences in query run time. Thus run times of consecutive query mixes will be incomparable unless we guarantee that each mix has a predictable number of queries with a product type from each level in the hierarchy.
Permutation of query mix - In the BI workload, the queries are run in a random order on each thread in multiuser mode. Doing exactly the same thing on many threads is not realistic for large queries. The data access patterns must be spread out in order to evaluate how bulk IO is organized with differing concurrent demands. The permutations are deterministic on consecutive runs and do not depend on the non-deterministic timing of concurrent activities. For queries with a drill-down, the individual executions that make up the drill-down are still consecutive.
New metrics - The BI Power is the geometric mean of query run times scaled to queries per hour and multiplied by the scale factor, where 100 Mt is considered the unit scale. The BI Throughput is the arithmetic mean of the run times scaled to QPH and adjusted to scale as with the Power metric. These are analogous to the TPC-H Power and Throughput metrics.

The Power is defined as

(scale_factor / 284826) * 3600 / ((t0 * t1 * ... * tn) ^(1 / n))

The Throughput is defined as

(scale_factor / 284826) * 3600 / ((t0 + t2 + ... + tn) / n)

The magic number 284826 is the scale that generates approximately 100 million triples (100 Mt). We consider this "scale one." The reason for the multiplication is that scores at different scales should get similar numbers, otherwise 10x larger scale would result roughly in 10x lower throughput with the BI queries.

We also show the percentage each query represents from the total time the test driver waits for responses.
Deadlock retry - When running update mixes, it is possible that a transaction gets aborted by a deadlock. We have made a retry logic for this.
Cluster mode - Cluster databases may have multiple interchangeable HTTP listeners. With this mode, one can specify multiple end-points so a multi-user workload can divide itself evenly over these.
Identifying matter - A version number was added to test driver output. Use of the new switches is also indicated in the test driver output.
SUT CPU - In comparing results it is crucial to differentiate between in memory runs and IO bound runs. To make this easier, we have added an option to report server CPU times over the timed portion (excluding warm-ups). A pluggable self-script determines the CPU times for the system; thus clusters can be handled, too. The time is given as a sum of the time the server processes have aged during the run and as a percentage over the wall-clock time.

These changes will soon be available as a diff and as a source tree. This version is labeled BSBM Test Driver 1.1-opl; the -opl signifies OpenLink additions.

We invite FU Berlin to include these enhancements into their Source Forge repository of the BSBM test driver. There is more precise documentation of these options in the README file in the above distribution.

The next planned upgrade of the test driver concerns adding support for "RDF-H", the RDF adaptation of the industry standard TPC-H decision support benchmark for RDBMS.