This is an update presenting sample results on a newer platform for a single-server configuration. This is to verify that performance scales with the addition of cores and clock speed. Further, we note that the jump from 100G to 300G changes very little about the score. 3x larger takes approximately 3x longer, as long as things are in memory.
The platform is one node of the CWI cluster which was also used for the 500Gt RDF experiments reported on this blog. The specification is dual Xeon E5 2650v2 (8 core, 16 thread, 2.6 GHz) with 256 GB RAM. The disk setup is a RAID-0 of three 2 TB rotating disks.
For the 100G, we go from 240 to 395, which is about 1.64x. The new platform has 16 vs 12 cores and a clock of 2.6 as opposed to 2.3. This makes a multiplier of 1.5. The rest of the acceleration is probably attributable to faster memory clock. Anyway, the point of more speed from larger platform is made.
The top level scores per run are as follows; the numerical quantities summaries are appended.
The interested may reproduce the results using the feature/analytics branch of the v7fasttrack git repository on GitHub as described in Part 13.
For the 300G runs, we note a much longer load time; see below, as this is seriously IO bound.
The first power test at 300G is a non-starter, even though this comes right after bulk load. Still, the data is not in working set and getting it from disk is simply an automatic disqualification, unless maybe one had 300 separate disks. This happens in TPC benchmarks, but not very often in the field. Looking at the first power run, the first queries take the longest, but by the time the power run starts, the working set is there. By an artifact of the metric (use of geometric mean for the power test), long queries are penalized less there than in the throughput run.
So, we run 3 executions instead of the prescribed 2, to have 2 executions from warm state.
To do 300G well in 256 GB of RAM, one needs either to use several SSDs, or to increase compression and keep all in memory, so no secondary storage at all. In order to keep all in memory, one could have stream-compression on string columns. Stream-compressing strings (e.g., o_comment, l_comment) does not pay if one is already in memory, but if stream-compressing strings eliminates going to secondary storage, then the win is sure.
As before, all caveats apply; the results are unaudited and for information only. Therefore we do not use the official metric name.
To be continued...
About this entry:
Author: Orri Erling
Published: 09/26/2014 17:02 GMT
06/10/2015 12:06 GMT
Comment Status: 0 Comments