DBpedia Usage Report [ Virtuoso Data Space Bot ]

We've just published the latest DBpedia Usage Report, covering v3.3 (released July, 2009) to v3.9 (released September, 2013); v3.10 (sometimes called "DBpedia 2014"; released September, 2014) will be included in the next report.

We think you'll find some interesting details in the statistics. There are also some important notes about Virtuoso configuration options and other sneaky technical issues that can surprise you (as they did us!) when exposing an ad-hoc query server to the world.

# PermaLink Comments [0]
01/07/2015 15:12 GMT Modified: 01/07/2015 15:13 GMT
LDBC: Making Semantic Publishing Execution Rules [ Orri Erling ]

LDBC SPB (Semantic Publishing Benchmark) is based on the BBC Linked Data use case. Thus the data modeling and transaction mix reflect the BBC's actual utilization of RDF. But a benchmark is not only a condensation of current best practice. The BBC Linked Data is deployed on Ontotext GraphDB (formerly known as OWLIM).

So, in SPB we wanted to address substantially more complex queries than the lookups that the BBC Linked Data deployment primarily serves. Diverse dataset summaries, timelines, and faceted search qualified by keywords and/or geography are examples of the online user experience that SPB needs to cover.

SPB is not an analytical workload, per se, but we still find that the queries fall broadly into two categories:

  • Some queries are centered on a particular search or entity. The data touched by such a query does not grow at the same rate as the dataset.
  • Some queries cover whole cross sections of the dataset, e.g., find the most popular tags across the whole database.
These different classes of questions need to be separated in a metric; otherwise, the short lookups dominate at small scales, and the large queries at large scales.

Another guiding factor of SPB was the BBC's and others' express wish to cover operational aspects such as online backups, replication, and fail-over in a benchmark. True, most online installations have to deal with these, yet these things are as good as absent from present benchmark practice. We will look at these aspects in a different article; for now, I will just discuss the matter of workload mix and metric.

Normally, the lookup and analytics workloads are divided into different benchmarks. Here, we will try something different. There are three things the benchmark does:

  • Updates - These sometimes insert a graph, sometimes delete and re-insert the same graph, sometimes just delete a graph. These are logarithmic to data size.

  • Short queries - These are lookups that most often touch on recent data and can drive page impressions. These are roughly logarithmic to data scale.

  • Analytics - These cover a large fraction of the dataset and are roughly linear to data size.

A test sponsor can decide on the query mix within certain bounds. A qualifying run must sustain a minimum, scale-dependent update throughput and must execute a scale-dependent number of analytical query mixes, or run for a scale-dependent duration. The minimum update rate, the minimum number of analytics mixes and the minimum duration all grow logarithmically to data size.
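The run rules only fix the shape of these minimums, namely logarithmic growth with data size. As a toy illustration (the constants and the log base here are my own assumptions, not part of any specification), the scale dependence might look like:

```python
import math

def run_rule_minimums(scale_factor, base_update_rate=7,
                      base_mixes=10, base_minutes=10):
    """Hypothetical scale-dependent minimums: update rate, number of
    analytical mixes, and run duration all grow logarithmically with
    the scale factor (growth law and constants are assumed)."""
    growth = 1 + math.log2(scale_factor)
    return (base_update_rate * growth,   # min updates/second
            base_mixes * round(growth),  # min analytical mixes
            base_minutes * growth)       # min run duration, minutes

# At unit scale this reproduces the figures used in the run below:
# 7 updates/s minimum, 10 analytical mixes, 10 minutes.
print(run_rule_minimums(1))
```

The point of the logarithmic shape is that the operational floor rises with scale, but far more slowly than the data itself.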

Within these limits, the test sponsor can decide how to mix the workloads. Publishing several results emphasizing different aspects is also possible. A given system may be especially good at one aspect, leading the test sponsor to accentuate this.

The benchmark has been developed and tested at small scales, between 50M and 150M triples. Next we need to see how it actually scales. There we expect to see how the two query sets behave differently. One effect we see right away when loading data is that creating the full text index on the literals is in fact the longest-running part. For an SF 32 (1.6 billion triples) SPB database, we have the following space consumption figures:

  • 46,886 MB of RDF literal text
  • 23,924 MB of full text index for RDF literals
  • 23,598 MB of URI strings
  • 21,981 MB of quads, stored column-wise with default index scheme

Clearly, applying column-wise compression to the strings is the best move for increasing scalability. The literals are individually short, so literal-by-literal compression will do little or nothing, but applying compression by the column is known to get a 2x size reduction with Google Snappy.
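The effect is easy to demonstrate outside of any DBMS. In the sketch below, zlib stands in for Snappy and the synthetic "literals" are invented: compressing short strings one by one barely pays for the per-call framing, while compressing the same strings as one column block easily beats 2x.

```python
import random
import zlib

random.seed(1)
# Synthetic short literals built from a small vocabulary, standing in
# for RDF literal text; zlib stands in for Google Snappy.
vocab = ["sport", "news", "report", "world", "final", "tag", "media", "cup"]
literals = [" ".join(random.choices(vocab, k=5)) for _ in range(10_000)]

raw = sum(len(s) for s in literals)
per_literal = sum(len(zlib.compress(s.encode())) for s in literals)
column_block = len(zlib.compress("".join(literals).encode()))

# Per-literal compression gains little or nothing; the column block
# compresses far better than 2x.
print(raw, per_literal, column_block)
```

The exact ratios depend on the codec and the data, but the shape of the result is the same for any dictionary-style compressor: shared context across a column is what pays.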

The full text index does not get much from column store techniques, as it already consists of words followed by space-efficient lists of word positions. The above numbers are measured with the Virtuoso column store, with quads column-wise and the rest row-wise. Each number includes the table(s) and any extra indices associated with them.

Let's now look at a full run at unit scale, i.e., 50M triples.

The run rules stipulate a minimum of 7 updates per second. The updates are comparatively fast, so we set the update rate to 70 updates per second. This is seen not to take too much CPU. We run 2 threads of updates, 20 of short queries, and 2 of long queries. The minimum run time for the unit scale is 10 minutes, so we do 10 analytical mixes, as this is expected to take a little over 10 minutes. The run stops by itself when the last of the analytical mixes finishes.

The interactive driver reports:

Seconds run : 2,144
    Editorial:
        2 agents

        68,164 inserts (avg :   46  ms, min :    5  ms, max :   3002  ms)
         8,440 updates (avg :   72  ms, min :   15  ms, max :   2471  ms)
         8,539 deletes (avg :   37  ms, min :    4  ms, max :   2531  ms)

        85,143 operations (68,164 CW Inserts   (98 errors), 
                            8,440 CW Updates   ( 0 errors), 
                            8,539 CW Deletions ( 0 errors))
        39.7122 average operations per second

    Aggregation:
        20 agents

        4120  Q1   queries (avg :    789  ms, min :   197  ms, max :   6,767   ms, 0 errors)
        4121  Q2   queries (avg :     85  ms, min :    26  ms, max :   3,058   ms, 0 errors)
        4124  Q3   queries (avg :     67  ms, min :     5  ms, max :   3,031   ms, 0 errors)
        4118  Q5   queries (avg :    354  ms, min :     3  ms, max :   8,172   ms, 0 errors)
        4117  Q8   queries (avg :    975  ms, min :    25  ms, max :   7,368   ms, 0 errors)
        4119  Q11  queries (avg :    221  ms, min :    75  ms, max :   3,129   ms, 0 errors)
        4122  Q12  queries (avg :    131  ms, min :    45  ms, max :   1,130   ms, 0 errors)
        4115  Q17  queries (avg :  5,321  ms, min :    35  ms, max :  13,144   ms, 0 errors)
        4119  Q18  queries (avg :    987  ms, min :   138  ms, max :   6,738   ms, 0 errors)
        4121  Q24  queries (avg :    917  ms, min :    33  ms, max :   3,653   ms, 0 errors)
        4122  Q25  queries (avg :    451  ms, min :    70  ms, max :   3,695   ms, 0 errors)

        22.5239 average queries per second. 
        Pool 0, queries [ Q1 Q2 Q3 Q5 Q8 Q11 Q12 Q17 Q18 Q24 Q25 ]


        45,318 total retrieval queries (0 timed-out)
        22.5239 average queries per second

The analytical driver reports:

    Aggregation:
        2 agents

        14    Q4   queries (avg :   9,984  ms, min :   4,832  ms, max :   17,957  ms, 0 errors)
        12    Q6   queries (avg :   4,173  ms, min :      46  ms, max :    7,843  ms, 0 errors)
        13    Q7   queries (avg :   1,855  ms, min :   1,295  ms, max :    2,415  ms, 0 errors)
        13    Q9   queries (avg :     561  ms, min :     446  ms, max :      662  ms, 0 errors)
        14    Q10  queries (avg :   2,641  ms, min :   1,652  ms, max :    4,238  ms, 0 errors)
        12    Q13  queries (avg :     595  ms, min :     373  ms, max :    1,167  ms, 0 errors)
        12    Q14  queries (avg :  65,362  ms, min :   6,127  ms, max :  136,346  ms, 2 errors)
        13    Q15  queries (avg :  45,737  ms, min :  12,698  ms, max :   59,935  ms, 0 errors)
        13    Q16  queries (avg :  30,939  ms, min :  10,224  ms, max :   38,161  ms, 0 errors)
        13    Q19  queries (avg :     310  ms, min :      26  ms, max :    1,733  ms, 0 errors)
        12    Q20  queries (avg :  13,821  ms, min :  11,092  ms, max :   15,435  ms, 0 errors)
        13    Q21  queries (avg :  36,611  ms, min :  14,164  ms, max :   70,954  ms, 0 errors)
        13    Q22  queries (avg :  42,048  ms, min :   7,106  ms, max :   74,296  ms, 0 errors)
        13    Q23  queries (avg :  48,474  ms, min :  18,574  ms, max :   93,656  ms, 0 errors)
        0.0862 average queries per second. 
        Pool 0, queries [ Q4 Q6 Q7 Q9 Q10 Q13 Q14 Q15 Q16 Q19 Q20 Q21 Q22 Q23 ]


        180 total retrieval queries (2 timed-out)
        0.0862 average queries per second

The metric would be 22.52 qi/s, 310 qa/h, 39.7 u/s @ 50Mt (SF 1).
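These components can be cross-checked directly against the driver reports above:

```python
# Cross-checking the headline metric against the driver reports.
seconds_run = 2_144

# Editorial: 68,164 inserts + 8,440 updates + 8,539 deletes
update_ops = 68_164 + 8_440 + 8_539          # 85,143 operations
u_per_s = update_ops / seconds_run
print(f"{u_per_s:.1f} u/s")                  # 39.7 u/s, as reported

# Analytics: 0.0862 queries/second, expressed per hour for the metric
qa_per_h = 0.0862 * 3_600
print(f"{qa_per_h:.0f} qa/h")                # 310 qa/h
```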

The SUT is a dual Xeon E5-2630, all in memory. The platform utilization is steadily above 2000% CPU (over 20 of 24 hardware threads busy on the DBMS). The DBMS is Virtuoso Open Source (v7fasttrack at github.com, feature/analytics branch).

The minimum update rate of 7/s was sustained, though the achieved rate fell short of the target of 70/s. In this run, most demand was put on the interactive queries. Different thread allocations would give different ratios of the metric components. The analytics mix, for example, is about 3x faster without other concurrent activity.

Is this good or bad? I would say that this is possible but better can certainly be accomplished.

The initial observation is that Q17 is the worst of the interactive lot. 3x better is easily accomplished by avoiding a basic stupidity. The query does the evil deed of checking for a substring in a URI. This is done in the wrong place and accounts for most of the time. The query is meant to test geo retrieval but ends up doing something quite different. Optimizing this right would by itself almost double the interactive score. There are some timeouts in the analytical run, which as such disqualifies the run. This is not a fully compliant result, but is close enough to give an idea of the dynamics. So we see that the experiment is definitely feasible, is reasonably defined, and that the dynamics seen make sense.

As an initial comment on the workload mix, I'd say that interactive should have a few more very short point-lookups, to stress compilation times and give a higher absolute score of queries per second.

Adjustments to the mix will depend on what we find out about scaling. As with SNB, it is likely that the workload will shift a little so this result might not be comparable with future ones.

In the next SPB article, we will look closer at performance dynamics and choke points and will have an initial impression on scaling the workload.

# PermaLink Comments [0]
11/13/2014 16:19 GMT
LDBC: Creating a Metric for SNB [ Orri Erling ]

In the Making It Interactive post on the LDBC blog, we were talking about composing an interactive Social Network Benchmark (SNB) metric. Now we will look at what this looks like in practice.

A benchmark is known by its primary metric. An actual benchmark implementation may deal with endless complexity but the whole point of the exercise is to reduce this all to an extremely compact form, optimally a number or two.

For SNB, we suggest clicks per second Interactive at scale (cpsI@ so many GB) as the primary metric. To each scale of the dataset corresponds a rate of update in the dataset's timeline (simulation time). When running the benchmark, the events in simulation time are transposed to a timeline in real time.

Another way of expressing the metric is therefore acceleration factor at scale. In this example, we run a 300 GB database at an acceleration of 1.64; i.e., in the present example, we did 97 minutes of simulation time in 58 minutes of real time.

Another key component of a benchmark is the full disclosure report (FDR). This is expected to enable any interested party to reproduce the experiment.

The system under test (SUT) is Virtuoso running an SQL implementation of the workload at 300 GB (SF = 300). This run gives an idea of what an official report will look like but is not one yet. The implementation differs from the present specification in the following:

  • The SNB test driver is not used. Instead, the workload is read from the file system by stored procedures on the SUT. This is done to circumvent latencies in update scheduling in the test driver which would result in the SUT not reaching full platform utilization.

  • The workload is extended by 2 short lookups, i.e., person profile view and post detail view. These are very short and serve to give the test more of an online flavor.

  • The short queries appear in the report as multiple entries. This should not be the case. This inflates the clicks per second number but does not significantly affect the acceleration factor.

As a caveat, this metric will not be comparable with future ones.

Aside from the composition of the report, the interesting point is that with the present workload, a 300 GB database keeps up with the simulation timeline on a commodity server, even when running updates. The query frequencies and run times are in the full report. We also produced a graphic showing the evolution of the throughput over a run of one hour --

[Figure: ldbc-snb-qpm.png — evolution of the throughput over a run of one hour]

We see steady throughput except for some slower minutes which correspond to database checkpoints. (A checkpoint, sometimes called a log checkpoint, is the operation which makes a database state durable outside of the transaction log.) If we run updates only at full platform, we get an acceleration of about 300x in memory for 20 minutes, then 10 minutes of nothing happening while the database is being checkpointed. This is measured with 6 2TB magnetic disks. Such a behavior is incompatible with an interactive workload. But with a checkpoint every 10 minutes and updates mixed with queries, checkpointing the database does not lead to impossible latencies. Thus, we do not get the TPC-C syndrome which requires tens of disks or several SSDs per core to run.
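The arithmetic behind that contrast is simple: a 300x burst is worth much less once the checkpoint stall is amortized in, and an interactive client sees the stall directly rather than the average.

```python
# Updates-only at full platform: ~300x acceleration for 20 minutes,
# then ~10 minutes of checkpoint during which nothing happens
# (figures as quoted in the text above).
burst_accel   = 300
burst_minutes = 20
stall_minutes = 10

effective = burst_accel * burst_minutes / (burst_minutes + stall_minutes)
print(effective)   # still fast on average, but with 10-minute latency
                   # spikes that no interactive workload can absorb
```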

This is a good thing for the benchmark, as we do not want to require unusual I/O systems for competition. Such a requirement would simply encourage people to ignore the specification on this point and would limit the number of qualifying results.

The full report contains the details. This is also a template for later "real" FDRs. The supporting files are divided into test implementation and system configuration. With these materials plus the data generator, one should be able to repeat the results using a Virtuoso Open Source cut from v7fasttrack at github.com, feature/analytics branch.

In later posts we will analyze the results a bit more and see how much improvement potential we find. The next SNB article will be about the business intelligence and graph analytics areas of SNB.

# PermaLink Comments [0]
11/13/2014 16:09 GMT
On Universality and Core Competence [ Virtuoso Data Space Bot ]

I will here develop some ideas on the platform of Peter Boncz's inaugural lecture mentioned in the previous post. This is a high-level look at where the leading edge of analytics will be, now that the column store is mainstream.

Peter's description of his domain was roughly as follows, summarized from memory:

The new chair is for data analysis and engines for this purpose. The data analysis engine includes the analytical DBMS but is a broader category. For example, the diverse parts of the big data chain (including preprocessing, noise elimination, feature extraction, natural language extraction, graph analytics, and so forth) fall under this category, and most of these things are usually not done in a DBMS. For anything that is big, the main challenge remains one of performance and time to solution. These things are being done, and will increasingly be done, on a platform with heterogeneous features, e.g., CPU/GPU clusters, possibly custom hardware like FPGAs, etc. This is driven by factors of cost and energy efficiency. Different processing stages will sometimes be distributed over a wide area, as for example in instrument networks and any network infrastructure, which is wide area by definition.

The design space of database and all that is around it is huge, and any exhaustive exploration is impossible. Development times are long, and a platform might take ten years to be mature. This is ill compatible with academic funding cycles. However, we should not leave all the research in this to industry, as industry maximizes profit, not innovation or absolute performance. Architecting data systems has aspects of an art. Consider the parallel with architecture of buildings: There are considerations of function, compatibility with environment, cost, restrictions arising from the materials at hand, and so forth. How a specific design will work cannot be known without experiment. The experiments themselves must be designed to make sense. This is not an exact science with clear-cut procedures and exact metrics of success.

This is the gist of Peter's description of our art. Peter's successes, best exemplified by MonetDB and Vectorwise, arise from focus over a special problem area and from developing and systematically applying specific insights to a specific problem. This process led to the emergence of the column store, which is now a mainstream thing. The DBMS that does not do columns is by now behind the times.

Needless to say, I am a great believer in core competence. Not every core competence is exactly the same. But a core competence needs to be broad enough so that its integral mastery and consistent application can produce a unit of value valuable in itself. What and how broad this is varies a great deal. Typically such a unit of value is something that is behind a "natural interface." This defies exhaustive definition but the examples below may give a hint. Looking at value chains and all diverse things in them that have a price tag may be another guideline.

There is a sort of Hegelian dialectic to technology trends: At the start, it was generally believed that a DBMS would be universal like the operating system itself, with a few products with very similar functionality covering the whole field. The antithesis came with Michael Stonebraker declaring that one size no longer fit all. Since then the transactional (OLTP) and analytical (OLAP) sides are clearly divided. The eventual synthesis may be in the air, with pioneering work like HyPer led by Thomas Neumann of TU München. Peter, following his Humboldt prize, has spent a couple of days a week in Thomas's group, and I have joined him there a few times. The key to eventually bridging the gap would be compilation and adaptivity. If the workload is compiled on demand, then the right data structures could always be at hand.

This might be the start of a shift similar to the column store turning the DBMS on its side, so to say.

In the mainstream of software engineering, objects, abstractions and interfaces are held to be a value almost in and of themselves. Our science, that of performance, stands in apparent opposition to at least any naive application of the paradigm of objects and interfaces. Interfaces have a cost, and boxes limit transparency into performance. So inlining and merging distinct (in principle) processing phases is necessary for performance. Vectoring is one take on this: An interface that is crossed just a few times is much less harmful than one crossed a billion times. Using compilation, or at least type-and-data-structure-specific variants of operators and switching their application based on run-time observed behaviors, is another aspect of this.
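A toy cost model (all constants here are invented for illustration) shows why an interface crossed once per batch is so much less harmful than one crossed once per row:

```python
def total_cost_ns(n_rows, batch_size, call_overhead_ns=50, row_work_ns=5):
    """Invented cost model: every crossing of an operator interface costs
    a fixed overhead; every row costs a fixed amount of useful work."""
    calls = -(-n_rows // batch_size)          # ceiling division
    return calls * call_overhead_ns + n_rows * row_work_ns

n = 1_000_000_000
tuple_at_a_time = total_cost_ns(n, 1)         # interface crossed a billion times
vectored        = total_cost_ns(n, 10_000)    # interface crossed 100,000 times

print(tuple_at_a_time / vectored)             # call overhead dominates the
                                              # scalar plan and all but vanishes
                                              # from the vectored one
```

With these assumed constants the scalar plan is roughly an order of magnitude slower, entirely due to interface overhead; the useful work per row is identical in both.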

Information systems thus take on more attributes of nature, i.e., more interconnectedness and adaptive behaviors.

Something quite universal might emerge from the highly problem-specific technology of the column store. The big scan, selective hash join plus aggregation, has been explored in slightly different ways by all of HyPer, Vectorwise, and Virtuoso.

Interfaces are not good or bad, in and of themselves. Well-intentioned naïveté in their use is bad. As in nature, there are natural borders in the "technosphere"; declarative query languages, processor instruction sets, and network protocols are good examples. Behind a relatively narrow interface lies a world of complexity of which the unsuspecting have no idea. In biology, the cell membrane might be an analogy, but this is in all likelihood more permeable and diverse in function than the techno examples mentioned.

With the experience of Vectorwise and later Virtuoso, it turns out that vectorization without compilation is good enough for TPC-H. Indeed, I see a few percent of gain at best from further breaking of interfaces and "biology-style" merging of operators and adding inter-stage communication and self-balancing. But TPC-H is not the end of all things, even though it is a sort of rite of passage: Jazz players will do their take on Green Dolphin Street and Summertime.

Science is drawn towards a grand unification of all which is. Nature, on the other hand, discloses more and more diversity and special cases, the closer one looks. This may be true of physical things, but also of abstractions such as software systems or mathematics.

So, let us look at the generalized DBMS, or the data analysis engine, as Peter put it. The use of DBMS technology is hampered by its interface, i.e., declarative query language. The well known counter-reactions to this are the NoSQL, MapReduce, and graph DB memes, which expose lower level interfaces. But then the interface gets put in entirely the wrong place, denying most of the things that make the analytics DBMS extremely good at what it does.

We need better and smarter building blocks and interfaces at zero cost. We continue to need blocks of some sort, since algorithms would stop being understandable without any data/procedural abstraction. At run time, the blocks must overlap and interpenetrate: Scan plus hash plus reduction in one loop, for example. Inter-thread, inter-process status sharing for things like top k for faster convergence, for another. Vectorized execution of the same algorithm on many data for things like graph traversals. There are very good single blocks, like GPU graph algorithms, but interface and composability are ever the problem.
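A minimal illustration of "scan plus hash plus reduction in one loop" (the data and names are made up): three conceptually distinct operators, and the same computation fused into a single loop body with no intermediate materialization.

```python
# Fact "table" to scan, and a hash set standing in for the build side
# of a selective hash join. All data here is invented.
orders = [(1, 10.0), (2, 5.0), (1, 7.5), (3, 2.0), (2, 1.0)]  # (customer, price)
preferred_customers = {1, 3}

# As three separate operators with interfaces between them:
scanned = list(orders)                                             # scan
joined = [(c, p) for c, p in scanned if c in preferred_customers]  # hash probe
total_separate = sum(p for _, p in joined)                         # reduction

# Fused: scan, probe, and reduce interpenetrate in one loop.
total_fused = 0.0
for customer, price in orders:
    if customer in preferred_customers:
        total_fused += price

print(total_separate, total_fused)   # identical results, but the fused
                                     # form materializes nothing in between
```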

So, we must unravel the package that encapsulates the wonders of the analytical DBMS. These consist of scan, hash/index lookup, partitioning, aggregation, expression evaluation, scheduling, message passing and related flow control for scale-out systems, just to mention a few. The complete list would be under 30 items long, with blocks parameterized by data payload and specific computation.

By putting these together in a few new ways, we will cover much more of the big data pipeline. Just-in-time compilation may well be the way to deliver these components in an application/environment tailored composition. Yes, keep talking about block diagrams, but never once believe that this represents how things work or ought to work. The algorithms are expressed as distinct things, but at the level of the physical manifestation, things are parallel and interleaved.

The core skill for architecting the future of data analytics is correct discernment of abstraction and interface. What is generic enough to be broadly applicable yet concise enough to be usable? When should the computation move, and when should the data move? What are easy ways of talking about data location? How can the application developer be protected from various inevitable stupidities?

No mistake about it, there are at present very few people with the background for formulating the blueprint for the generalized data pipeline. These will be mostly drawn from architects of DBMS. The prospective user is any present-day user of analytics DBMS, Hadoop, or the like. By and large, SQL has worked well within its area of applicability. Had SQL not been successful, there would never have been an anti-SQL rebel faction. Now that a broader workload definition calls for redefinition of interfaces, so as to use the best where it fits, there is a need for re-evaluation of the imperative vs. declarative question.

T. S. Eliot once wrote that humankind cannot bear very much reality. It seems that we can, in reality, deconstruct the DBMS and redeploy the state of the art to serve novel purposes across a broader set of problems. This is a cross-over that slightly readjusts the mental frame of the DBMS expert but leaves the core precepts intact. In other words, this is a straightforward extension of core competence with no slide into the dilettantism of doing a little bit of everything.

People like MapReduce and stand-alone graph programming frameworks, because these do one specific thing and are readily understood. By and large, these are orders of magnitude simpler than the DBMS. Even when the DBMS provides in-process Java or CLR, these are rarely used. The single-purpose framework is a much narrower core competence, and thus less exclusive, than the high art of the DBMS, plus it has a faster platform development cycle.

In the short term, we will look at opening the SQL internal toolbox for graph analytics applications. I was discussing this idea with Thomas Neumann at Peter Boncz's party. He asked who would be the user. I answered that doing good parallel algorithms, even with powerful shorthands, was an expert task; so the people doing new types of analytics would be mostly on the system vendor side. However, modifying such for input selection and statistics gathering would be no harder than doing the same with ready-made SQL reports.

There is significant potential for generalizing the leading edge of database technology. How will this fare against single-model frameworks? We hope to shed some light on this in the final phase of LDBC and beyond.

# PermaLink Comments [0]
10/22/2014 13:24 GMT
Inaugural Lecture of Prof. Boncz at VU Amsterdam [ Virtuoso Data Space Bot ]

Last Friday, I attended the inaugural lecture of Professor Peter Boncz at the VU University Amsterdam. As the reader is likely to know, Peter is one of the database luminaries of the 21st century, known among other things for architecting MonetDB and Actian Vector (Vectorwise) and publishing a stellar succession of core database papers.

The lecture touched on the fact of the data economy and the possibilities of E-science. Peter proceeded to address issues of ethics of cyberspace and the fact of legal and regulatory practice trailing far behind the factual dynamics of cyberspace. In conclusion, Peter gave some pointers to his research agenda; for example, use of just-in-time compilation for fusing problem-specific logic with infrastructure software like databases for both performance and architecture adaptivity.

There was later a party in Amsterdam with many of the local database people as well as some from further away, e.g., Thomas Neumann of Munich, and Marcin Zukowski, Vectorwise founder and initial CEO.

I should have had the presence of mind to prepare a speech for Peter. Stefan Manegold of CWI did give a short address at the party, while presenting the gifts from Peter's CWI colleagues. To this I will add my belated part here, as follows:

If I were to describe Prof. Boncz, our friend, co-worker, and mentor, in one word, this would be man of knowledge. If physicists define energy as that which can do work, then knowledge would be that which can do meaningful work. A schematic in itself does nothing. Knowledge is needed to bring this to life. Yet this is more than an outstanding specialist skill, as this implies discerning the right means in the right context and includes the will and ability to go through with this. As Peter now takes on the mantle of professor, the best students will, I am sure, not fail to recognize excellence and be accordingly inspired to strive for the sort of industry changing accomplishments we have come to associate with Peter's career so far. This is what our world needs. A big cheer for Prof. Boncz!

I did talk to many at the party, especially Pham Minh Duc, who is doing schema-aware RDF in MonetDB, and many others among the excellent team at CWI. Stefan Manegold told me about Rethink Big, an FP7 project for big data policy recommendations. I was meant to be an advisor and still hope to go to one of their meetings for some networking about policy. On the other hand, the EU agenda and priorities, as discussed with, for example, Stefano Bertolo, are, as far as I am concerned, on the right track: The science of performance must meet with the real, or at least realistic, data. Peter did not fail to mention this same truth in his lecture: Spinoffs play a key part in research, and exposure to the world out there gives research both focus and credibility. As René Char put it in his poem L'Allumette (The Matchstick), "La tête seule a pouvoir de prendre feu au contact d'une réalité dure." ("The head alone has power to catch fire at the touch of hard reality.") Great deeds need great challenges, and there is nothing like reality to exceed man's imagination.

For my part, I was advertising the imminent advances in the Virtuoso RDF and graph functionality. Now that the SQL part, which is anyway the necessary foundation for all this, is really very competent, it is time to deploy these same things in slightly new ways. This will produce graph analytics and structure-aware RDF to match relational performance while keeping schema-last-ness. Anyway, the claim has been made; we will see how it is delivered during the final phase of LDBC and Geoknow.

# PermaLink Comments [0]
10/22/2014 13:24 GMT
On Universality and Core Competence [ Orri Erling ]

I will here develop some ideas on the platform of Peter Boncz's inaugural lecture mentioned in the previous post. This is a high-level look at where the leading edge of analytics will be, now that the column store is mainstream.

Peter's description of his domain was roughly as follows, summarized from memory:

The new chair is for data analysis and engines for this purpose. The data analysis engine includes the analytical DBMS but is a broader category. For example, the diverse parts of the big data chain (including preprocessing, noise elimination, feature extraction, natural language extraction, graph analytics, and so forth) fall under this category, and most of these things are usually not done in a DBMS. For anything that is big, the main challenge remains one of performance and time to solution. These things are being done, and will increasingly be done, on a platform with heterogenous features, e.g., CPU/GPU clusters, possibly custom hardware like FPGAs, etc. This is driven by factors of cost and energy efficiency. Different processing stages will sometimes be distributed over a wide area, as for example in instrument networks and any network infrastructure, which is wide area by definition.

The design space of database and all that is around it is huge, and any exhaustive exploration is impossible. Development times are long, and a platform might take ten years to be mature. This is ill compatible with academic funding cycles. However, we should not leave all the research in this to industry, as industry maximizes profit, not innovation or absolute performance. Architecting data systems has aspects of an art. Consider the parallel with architecture of buildings: There are considerations of function, compatibility with environment, cost, restrictions arising from the materials at hand, and so forth. How a specific design will work cannot be known without experiment. The experiments themselves must be designed to make sense. This is not an exact science with clear-cut procedures and exact metrics of success.

This is the gist of Peter's description of our art. Peter's successes, best exemplified by MonetDB and Vectorwise, arise from focus on a specific problem area and from developing and systematically applying particular insights to it. This process led to the emergence of the column store, which is now mainstream. The DBMS that does not do columns is by now behind the times.

Needless to say, I am a great believer in core competence. Not every core competence is exactly the same. But a core competence needs to be broad enough that its integral mastery and consistent application can produce a unit of value that stands on its own. What and how broad this is varies a great deal. Typically such a unit of value is something that sits behind a "natural interface." This defies exhaustive definition, but the examples below may give a hint. Looking at value chains and the diverse things in them that have a price tag may be another guideline.

There is a sort of Hegelian dialectic to technology trends: At the start, it was generally believed that a DBMS would be universal like the operating system itself, with a few products of very similar functionality covering the whole field. The antithesis came with Michael Stonebraker declaring that one size no longer fits all. Since then, the transactional (OLTP) and analytical (OLAP) sides have been clearly divided. The eventual synthesis may be in the air, with pioneering work like HyPer led by Thomas Neumann of TU München. Peter, following his Humboldt prize, has spent a couple of days a week in Thomas's group, and I have joined him there a few times. The key to eventually bridging the gap would be compilation and adaptivity. If the workload is compiled on demand, then the right data structures could always be at hand.

This might be the start of a shift similar to the column store turning the DBMS on its side, so to say.

In the mainstream of software engineering, objects, abstractions, and interfaces are held to be a value almost in and of themselves. Our science, that of performance, stands in apparent opposition to at least any naive application of the paradigm of objects and interfaces. Interfaces have a cost, and boxes limit transparency into performance. So inlining, and merging processing phases that are in principle distinct, is necessary for performance. Vectoring is one take on this: An interface that is crossed just a few times is much less harmful than one crossed a billion times. Using compilation, or at least type- and data-structure-specific variants of operators, and switching their application based on run-time observed behaviors, is another aspect of this.
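
The arithmetic of interface crossings can be made concrete with a toy sketch (purely illustrative; this is not Virtuoso or Vectorwise code): the same computation done tuple-at-a-time crosses the operator interface once per value, while a vectored version crosses it once per batch.

```python
# Illustrative only: the cost model of interface crossings, not real engine code.
def add_tax(price):                      # tuple-at-a-time operator interface
    return price * 1.2

def add_tax_vectored(prices):            # vectored operator interface
    return [p * 1.2 for p in prices]

prices = [10.0] * 100_000

# Tuple-at-a-time: 100,000 crossings of the operator interface.
out1 = [add_tax(p) for p in prices]

# Vectored, batch size 1024: under 100 crossings for the same result.
BATCH = 1024
out2 = []
for i in range(0, len(prices), BATCH):
    out2.extend(add_tax_vectored(prices[i:i + BATCH]))

assert out1 == out2                      # same answer, far fewer crossings
```

In a real engine the per-crossing cost includes dispatch, cache misses on operator state, and lost instruction-level parallelism; vectoring amortizes all of these over the batch.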

Information systems thus take on more attributes of nature, i.e., more interconnectedness and adaptive behaviors.

Something quite universal might emerge from the highly problem-specific technology of the column store. The big scan with selective hash join plus aggregation has been explored, in slightly different ways, by all of HyPer, Vectorwise, and Virtuoso.

Interfaces are not good or bad, in and of themselves. Well-intentioned naïveté in their use is bad. As in nature, there are natural borders in the "technosphere"; declarative query languages, processor instruction sets, and network protocols are good examples. Behind a relatively narrow interface lies a world of complexity of which the unsuspecting have no idea. In biology, the cell membrane might be an analogy, but this is in all likelihood more permeable and diverse in function than the techno examples mentioned.

With the experience of Vectorwise and later Virtuoso, it turns out that vectorization without compilation is good enough for TPC-H. Indeed, I see a few percent of gain at best from further breaking of interfaces and "biology-style" merging of operators and adding inter-stage communication and self-balancing. But TPC-H is not the end of all things, even though it is a sort of rite of passage: Jazz players will do their take on Green Dolphin Street and Summertime.

Science is drawn towards a grand unification of all which is. Nature, on the other hand, discloses more and more diversity and special cases, the closer one looks. This may be true of physical things, but also of abstractions such as software systems or mathematics.

So, let us look at the generalized DBMS, or the data analysis engine, as Peter put it. The use of DBMS technology is hampered by its interface, i.e., the declarative query language. The well-known counter-reactions to this are the NoSQL, MapReduce, and graph DB memes, which expose lower-level interfaces. But then the interface gets put in entirely the wrong place, denying most of the things that make the analytics DBMS extremely good at what it does.

We need better and smarter building blocks, and interfaces at zero cost. We continue to need blocks of some sort, since algorithms would stop being understandable without any data/procedural abstraction. At run time, the blocks must overlap and interpenetrate: Scan plus hash plus reduction in one loop, for example. Inter-thread, inter-process status sharing for things like top-k cutoffs, for faster convergence, for another. Vectorized execution of the same algorithm on many data items for things like graph traversals. There are very good single blocks, like GPU graph algorithms, but interface and composability are ever the problem.
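
As a sketch of what "scan plus hash plus reduction in one loop" means, here is a minimal fused pipeline; the toy schema and function names are illustrative assumptions, not engine internals:

```python
# Fused scan + selection + hash aggregation: one loop, with no per-row
# interface crossings between operators. A sketch of the shape of the fusion.
def scan_filter_hash_aggregate(rows, predicate, key_of, value_of):
    groups = {}                                       # aggregation hash table
    for row in rows:                                  # scan
        if predicate(row):                            # selection, inlined
            k = key_of(row)
            groups[k] = groups.get(k, 0) + value_of(row)   # reduction
    return groups

lineitem = [("A", 10), ("B", 5), ("A", 7), ("C", 2)]  # (key, quantity) rows
totals = scan_filter_hash_aggregate(
    lineitem,
    predicate=lambda r: r[1] > 3,                     # keep quantities above 3
    key_of=lambda r: r[0],
    value_of=lambda r: r[1],
)
assert totals == {"A": 17, "B": 5}
```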

So, we must unravel the package that encapsulates the wonders of the analytical DBMS. These consist of scan, hash/index lookup, partitioning, aggregation, expression evaluation, scheduling, message passing, and related flow control for scale-out systems, just to mention a few. The complete list would be under 30 items long, with blocks parameterized by data payload and specific computation.

By putting these together in a few new ways, we will cover much more of the big data pipeline. Just-in-time compilation may well be the way to deliver these components in an application- and environment-tailored composition. Yes, keep talking about block diagrams, but never once believe that this represents how things work or ought to work. The algorithms are expressed as distinct things, but at the level of the physical manifestation, things are parallel and interleaved.
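
A hedged sketch of what just-in-time delivery of a component might look like: rather than interpreting a generic predicate tree per row, generate and compile a filter specialized to the query once, then apply it at full speed. The names and the dict-shaped rows are illustrative assumptions, not an engine API.

```python
# Sketch of JIT specialization: build source text for exactly the predicate
# the query needs, compile it once, and run the specialized function per row.
def compile_filter(column, op, const):
    assert op in ("<", "<=", ">", ">=", "==", "!=")   # whitelist the operator
    src = f"def f(row): return row[{column!r}] {op} {const!r}"
    namespace = {}
    exec(src, namespace)                              # one-time compilation
    return namespace["f"]

early = compile_filter("l_shipdate", "<", "1995-01-01")
assert early({"l_shipdate": "1994-06-30"})
assert not early({"l_shipdate": "1996-01-01"})
```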

The core skill for architecting the future of data analytics is correct discernment of abstraction and interface. What is generic enough to be broadly applicable yet concise enough to be usable? When should the computation move, and when should the data move? What are easy ways of talking about data location? How can the application developer be protected from various inevitable stupidities?

Make no mistake about it: there are at present very few people with the background for formulating the blueprint for the generalized data pipeline. These will mostly be drawn from architects of DBMS. The prospective user is any present-day user of analytics DBMS, Hadoop, or the like. By and large, SQL has worked well within its area of applicability; had it worked equally well everywhere, there would never have been an anti-SQL rebel faction. Now that a broader workload definition calls for a redefinition of interfaces, so as to use the best where it fits, there is a need for re-evaluation of the imperative-vs.-declarative question.

T. S. Eliot once wrote that humankind cannot bear very much reality. It seems that we can, in reality, deconstruct the DBMS and redeploy the state of the art to serve novel purposes across a broader set of problems. This is a cross-over that slightly readjusts the mental frame of the DBMS expert but leaves the core precepts intact. In other words, this is a straightforward extension of core competence with no slide into the dilettantism of doing a little bit of everything.

People like MapReduce and stand-alone graph programming frameworks, because these do one specific thing and are readily understood. By and large, these are orders of magnitude simpler than the DBMS. Even when the DBMS provides in-process Java or CLR, these are rarely used. The single-purpose framework is a much narrower core competence, and thus less exclusive, than the high art of the DBMS, plus it has a faster platform development cycle.

In the short term, we will look at opening the SQL internal toolbox for graph analytics applications. I was discussing this idea with Thomas Neumann at Peter Boncz's party. He asked who would be the user. I answered that doing good parallel algorithms, even with powerful shorthands, was an expert task; so the people doing new types of analytics would be mostly on the system vendor side. However, modifying such for input selection and statistics gathering would be no harder than doing the same with ready-made SQL reports.

There is significant potential for generalization of the leading edge of database technology. How will this fare against single-model frameworks? We hope to shed some light on this in the final phase of LDBC and beyond.

# PermaLink Comments [0]
10/22/2014 13:23 GMT
Inaugural Lecture of Prof. Boncz at VU Amsterdam [ Orri Erling ]

Last Friday, I attended the inaugural lecture of Professor Peter Boncz at the VU University Amsterdam. As the reader is likely to know, Peter is one of the database luminaries of the 21st century, known among other things for architecting MonetDB and Actian Vector (Vectorwise) and publishing a stellar succession of core database papers.

The lecture touched on the fact of the data economy and the possibilities of e-science. Peter proceeded to address the ethics of cyberspace and the fact that legal and regulatory practice trails far behind the actual dynamics of cyberspace. In conclusion, Peter gave some pointers to his research agenda; for example, the use of just-in-time compilation for fusing problem-specific logic with infrastructure software like databases, for both performance and architectural adaptivity.

There was later a party in Amsterdam with many of the local database people, as well as some from further away, e.g., Thomas Neumann of Munich, and Marcin Zukowski, Vectorwise founder and initial CEO.

I should have had the presence of mind to prepare a speech for Peter. Stefan Manegold of CWI did give a short address at the party, while presenting the gifts from Peter's CWI colleagues. To this I will add my belated part here, as follows:

If I were to describe Prof. Boncz, our friend, co-worker, and mentor, in one word, it would be: man of knowledge. If physicists define energy as that which can do work, then knowledge would be that which can do meaningful work. A schematic in itself does nothing; knowledge is needed to bring it to life. Yet this is more than outstanding specialist skill, as it implies discerning the right means in the right context and includes the will and ability to go through with it. As Peter now takes on the mantle of professor, the best students will, I am sure, not fail to recognize excellence, and will accordingly be inspired to strive for the sort of industry-changing accomplishments we have come to associate with Peter's career so far. This is what our world needs. A big cheer for Prof. Boncz!

I did talk to many at the party, especially Pham Minh Duc, who is doing schema-aware RDF in MonetDB, and many others among the excellent team at CWI. Stefan Manegold told me about Rethink Big, an FP7 project for big data policy recommendations. I was meant to be an advisor and still hope to go to one of their meetings for some networking about policy. On the other hand, the EU agenda and priorities, as discussed with, for example, Stefano Bertolo, are, as far as I am concerned, on the right track: The science of performance must meet with real, or at least realistic, data. Peter did not fail to mention this same truth in his lecture: Spinoffs play a key part in research, and exposure to the world out there gives research both focus and credibility. As René Char put it in his poem L'Allumette (The Matchstick), "La tête seule a pouvoir de prendre feu au contact d'une réalité dure." ("The head alone has power to catch fire at the touch of hard reality.") Great deeds need great challenges, and there is nothing like reality to exceed man's imagination.

For my part, I was advertising the imminent advances in the Virtuoso RDF and graph functionality. Now that the SQL part, which is anyway the necessary foundation for all this, is really very competent, it is time to deploy these same things in slightly new ways. This will produce graph analytics and structure-aware RDF to match relational performance while keeping schema-last-ness. Anyway, the claim has been made; we will see how it is delivered during the final phase of LDBC and Geoknow.

# PermaLink Comments [0]
10/22/2014 13:21 GMT
In Hoc Signo Vinces (part 20 of n): 100G and 1000G With Cluster; When is Cluster Worthwhile; Effects of I/O [ Virtuso Data Space Bot ]

In the introduction to the scale-out piece, I promised to address the matter of the data-to-memory ratio, and to talk about when scale-out makes sense. Here we will see that scale-out makes sense whenever the data does not fit in memory on a single commodity server. The gains in processing power are immediate, even when going from one box to just two, with both systems having everything in memory.

As an initial take on the issue, we run 100 GB and 1000 GB on the test system. 100 GB fits trivially in memory; 1000 GB does not, as the memory is 384 GB total, of which 360 GB may be used by the processes.

We run 2 workloads on the 100 GB database, having pre-loaded the data in memory:

Run   Power       Throughput   Composite
 1    349,027.7   420,503.1    383,102.1
 2    387,890.3   433,066.6    409,856.5
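
The composite figure is the geometric mean of the power and throughput numbers, per the TPC-H convention on which the Virt-H workload is based, which is easy to check against the table:

```python
import math

# Composite Qph is the geometric mean of power and throughput (TPC-H style).
runs = [(349_027.7, 420_503.1), (387_890.3, 433_066.6)]
for power, throughput in runs:
    composite = math.sqrt(power * throughput)
    print(f"{composite:,.1f}")   # prints 383,102.1 then 409,856.5
```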

This is directly comparable to the 100 GB single-server results. Comparing the second runs, we see a 1.53x gain in power and a 1.8x gain in throughput from 2x the platform. This is entirely as expected for a workload that is not trivially parallel, as we have seen in the previous articles. The difference between the first and second runs at 100 GB comes, for both single-server and cluster, from the latency of allocating transient query memory. For an official run, where the weakest link is the first power test, this memory would simply have to be pre-allocated.

We run 2 workloads on the 1000 GB database, starting from cold.

The result is:

Run   Power       Throughput   Composite
 1    136,744.5   147,374.6    141,960.1
 2    199,652.0   125,161.1    158,078.0

The 1000 GB result is not for competition with this platform; more memory would be needed. For actual applications, the numbers are still in the usable range, though.

The 1000 GB setup uses 4 SSDs for storage, one per server process. The server processes are each bound to their own physical CPU.

We look at the meters: 32M pages (8M per process) are in memory at any one time. Over the 2 benchmark executions there are a total of 494M disk reads. The total CPU time is 165,674 seconds, of which about 10% is system time, over 10,063 seconds of real time. Cumulative disk-read wait time is 130,177 s. This gives an average disk-read throughput of 384 MB/s.
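
The 384 MB/s figure can be reproduced from the meters above if one assumes 8 KB database pages (an assumption on my part; adjust if the page size differs):

```python
# Back-of-envelope check of the disk-read throughput, assuming 8 KB pages.
reads = 494_000_000            # disk reads over the 2 executions
page_bytes = 8 * 1024          # assumed page size
elapsed_s = 10_063             # real time for the 2 executions

mb_read = reads * page_bytes / (1024 * 1024)
print(round(mb_read / elapsed_s))   # prints 384 (MB/s)
```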

This is easily sustained by 4 SSDs; in practice, the maximum throughput we see for reading is 1 GB/s (256 MB/s per SSD). Newer SSDs would do maybe twice that. Using rotating media would not be an option.

Without the drop in CPU caused by waiting for SSD, we would have numbers very close to the 100 GB numbers.

The interconnect traffic for the two runs was 1,077 GB with no message compression. The write block time was 448 seconds of thread-time. So we see that blocking on write hurts platform utilization when running under optimal conditions, but compared to going to secondary storage, it is not a large factor.

The 1000 GB scale has a transient peak memory consumption of 42 GB. This consists of hash-join build sides and GROUP BYs. The greatest memory consumers are Q9 with 9 GB, Q13 with 11 GB, and Q16 with 7 GB. Having many of these at a time drives up the transient peak. The peak gets higher as the scale grows, also because a larger scale requires more concurrent query streams. At the 384 GB for 1000 GB ratio, we do not yet get into memory saving plans like hash joins in many passes or index use instead of hash. When the data size grows, replicated hash build sides will become less convenient, and communication will increase. Q9 and Q13 can be done by index with almost no transient memory, but these plans are easily 3x less efficient for CPU. These will probably help at 3000 GB and be necessary at least part of the time at 10,000 GB.

The I/O volume in MB per index over the 2 executions is:

index MB
LINEITEM 1,987,483
ORDERS 1,440,526
PARTSUPP 199,335
PART 161,717
CUSTOMER 43,276
O_CK 19,085
SUPPLIER 13,393

Of this, maybe 600 GB could be saved by stream-compressing o_comment. Otherwise this cannot be helped without adding memory. The lineitem reads are mostly for l_extendedprice, which is not compressible. If compressing o_comment made l_extendedprice always fit in memory, then there would be a radical drop in I/O. Moreover, the least-recently-used buffer management policy works worst of all for big scans, specifically those of l_extendedprice: If the head is replaced when reading the tail, and the next read starts from the head, then the whole table/column is read all over again. Caching policies that specially recognized scans of this sort could further reduce I/O. Clustering lineitems/orders on date, as Actian Vector TPC-H implementations do, also starts yielding a greater gain when not running from memory: One column (e.g., l_shipdate) may be scanned for the whole table but, if the matches are bunched together, then most of l_extendedprice will not be read at all. Still, if going for top ranks in the races, all will be from memory, or at least there will be SSDs with read throughput around 150 MB/s per core, so these tricks become relatively less important.
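
The LRU worst case for scans is easy to demonstrate with a toy buffer pool (a sketch, not Virtuoso's buffer manager): as soon as the scanned column is even one page larger than the pool, each page read of the next scan evicts exactly the page the scan will need first the time after, and the hit rate collapses to zero.

```python
from collections import OrderedDict

# Toy LRU buffer pool under repeated head-to-tail scans of one column.
def scan_hit_rate(pool_pages, column_pages, scans):
    pool, hits, misses = OrderedDict(), 0, 0
    for _ in range(scans):
        for page in range(column_pages):        # scan head to tail
            if page in pool:
                hits += 1
                pool.move_to_end(page)          # mark most recently used
            else:
                misses += 1
                if len(pool) >= pool_pages:
                    pool.popitem(last=False)    # evict least recently used
                pool[page] = True
    return hits / (hits + misses)

print(scan_hit_rate(pool_pages=100, column_pages=100, scans=5))  # fits: 0.8
print(scan_hit_rate(pool_pages=100, column_pages=101, scans=5))  # one page over: 0.0
```

A scan-aware policy (e.g., treating a recognized scan as most-eligible for eviction behind itself) avoids this cliff, which is the point made above.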

In the 100 GB numerical quantities summaries, we see much the same picture as in the single-server case. Queries get faster, but their relative times are not radically different. The throughput test (many queries at a time) times are more or less multiples of the power (single-user) times. This picture breaks at 1000 GB, where I/O first drops the performance to under half and introduces huge variation in execution times within a single query. The time depends entirely on which queries are running alongside or right before the execution, and on whether these have the same or different working sets. All the streams have the same queries with different parameters, but the query order in each stream is different.

The numerical quantities follow for all the runs. Note that the first 1000 GB run is cold. A competition-grade 1000 GB result can be made with double the memory, and the more CPU the better. We will try one at Amazon in a bit.

***

The conclusion is that scale-out pays from the get-go. At present prices, a system with twice the power of a single node of the test system is cost-effective. Scales of up to 500 GB fit a single commodity server, under $10K. Rather than going from a mid-to-large dual-socket box to a quad-socket box, one is likely to be better off with two cheaper dual-socket boxes. These are also readily available on clouds, whereas scale-up configurations are not. From 1 TB onwards, a cluster is expected to clearly win. At 3 TB, a commodity cluster will clearly be the better deal for both price and absolute performance.

100 GB Run 1

Virt-H Executive Summary

Report Date                                         October 3, 2014
Database Scale Factor                               100
Total Data Storage/Database Size                    0M
Query Streams for Throughput Test                   5
Virt-H Power                                        349,027.7
Virt-H Throughput                                   420,503.1
Virt-H Composite Query-per-Hour Metric (Qph@100GB)  383,102.1
Measurement Interval in Throughput Test (Ts)        94.273000 seconds

Duration of stream execution

Start Date/Time End Date/Time Duration
Stream 0 10/03/2014 15:05:07 10/03/2014 15:05:40 0:00:33
Stream 1 10/03/2014 15:05:42 10/03/2014 15:07:15 0:01:33
Stream 2 10/03/2014 15:05:42 10/03/2014 15:07:15 0:01:33
Stream 3 10/03/2014 15:05:42 10/03/2014 15:07:16 0:01:34
Stream 4 10/03/2014 15:05:42 10/03/2014 15:07:14 0:01:32
Stream 5 10/03/2014 15:05:42 10/03/2014 15:07:15 0:01:33
Refresh 0 10/03/2014 15:05:07 10/03/2014 15:05:13 0:00:06
10/03/2014 15:05:41 10/03/2014 15:05:42 0:00:01
Refresh 1 10/03/2014 15:06:48 10/03/2014 15:07:03 0:00:15
Refresh 2 10/03/2014 15:05:42 10/03/2014 15:06:06 0:00:24
Refresh 3 10/03/2014 15:06:06 10/03/2014 15:06:20 0:00:14
Refresh 4 10/03/2014 15:06:20 10/03/2014 15:06:35 0:00:15
Refresh 5 10/03/2014 15:06:35 10/03/2014 15:06:48 0:00:13

Numerical Quantities Summary Timing Intervals in Seconds:

Query Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8
Stream 0 2.045198 0.337315 1.129548 0.327029 1.230955 0.473090 0.979096 0.852639
Stream 1 4.521951 0.596538 3.464342 1.167101 3.944699 1.744325 5.442328 4.706185
Stream 2 4.678728 0.837205 3.594060 1.911751 3.942459 0.947788 3.821267 4.686319
Stream 3 5.126384 0.932394 0.961762 1.043759 5.359990 1.035597 3.056079 5.803445
Stream 4 4.497118 0.381036 4.665412 1.224975 5.316591 1.666253 2.297872 6.425171
Stream 5 4.080968 0.493741 4.416305 0.879202 5.705877 1.615987 3.846881 3.346686
Min Qi 4.080968 0.381036 0.961762 0.879202 3.942459 0.947788 2.297872 3.346686
Max Qi 5.126384 0.932394 4.665412 1.911751 5.705877 1.744325 5.442328 6.425171
Avg Qi 4.581030 0.648183 3.420376 1.245358 4.853923 1.401990 3.692885 4.993561
Query Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16
Stream 0 3.575916 2.786656 1.579488 0.611454 3.132460 0.685095 0.955559 1.060110
Stream 1 9.551437 7.187181 5.816455 2.004946 9.461347 5.624020 5.517677 2.924265
Stream 2 9.637427 6.641804 6.359532 2.412576 8.819754 3.335494 4.549792 3.163920
Stream 3 11.041451 6.464479 6.982671 3.272975 8.342983 3.448635 4.405911 2.886393
Stream 4 8.860228 6.754529 7.065501 3.225236 8.789565 3.419165 4.240718 2.399092
Stream 5 7.339672 8.121027 6.261988 2.711946 8.764934 3.106366 6.544712 3.472092
Min Qi 7.339672 6.464479 5.816455 2.004946 8.342983 3.106366 4.240718 2.399092
Max Qi 11.041451 8.121027 7.065501 3.272975 9.461347 5.624020 6.544712 3.472092
Avg Qi 9.286043 7.033804 6.497229 2.725536 8.835717 3.786736 5.051762 2.969152
Query Q17 Q18 Q19 Q20 Q21 Q22 RF1 RF2
Stream 0 1.433789 0.972152 0.780247 1.287222 1.360084 0.254051 6.201742 1.219707
Stream 1 3.398354 2.591249 3.021207 4.663204 4.775704 1.116547 8.770115 5.643550
Stream 2 6.811520 3.411846 2.634076 4.296810 4.669635 2.282003 18.039617 6.060465
Stream 3 4.947110 2.479268 2.952951 6.431644 5.469152 1.816467 8.271266 5.498956
Stream 4 5.240237 2.062261 2.734378 6.055141 2.997684 2.519301 7.889700 6.944722
Stream 5 4.839670 3.379315 3.231582 6.255944 3.759509 1.347830 8.707303 4.376033
Min Qi 3.398354 2.062261 2.634076 4.296810 2.997684 1.116547 7.889700 4.376033
Max Qi 6.811520 3.411846 3.231582 6.431644 5.469152 2.519301 18.039617 6.944722
Avg Qi 5.047378 2.784788 2.914839 5.540549 4.334337 1.816430 10.335600 5.704745

100 GB Run 2

Virt-H Executive Summary

Report Date                                         October 3, 2014
Database Scale Factor                               100
Total Data Storage/Database Size                    0M
Query Streams for Throughput Test                   5
Virt-H Power                                        387,890.3
Virt-H Throughput                                   433,066.6
Virt-H Composite Query-per-Hour Metric (Qph@100GB)  409,856.5
Measurement Interval in Throughput Test (Ts)        91.541000 seconds

Duration of stream execution

Start Date/Time End Date/Time Duration
Stream 0 10/03/2014 15:07:19 10/03/2014 15:07:47 0:00:28
Stream 1 10/03/2014 15:07:48 10/03/2014 15:09:19 0:01:31
Stream 2 10/03/2014 15:07:48 10/03/2014 15:09:16 0:01:28
Stream 3 10/03/2014 15:07:48 10/03/2014 15:09:17 0:01:29
Stream 4 10/03/2014 15:07:48 10/03/2014 15:09:16 0:01:28
Stream 5 10/03/2014 15:07:48 10/03/2014 15:09:20 0:01:32
Refresh 0 10/03/2014 15:07:19 10/03/2014 15:07:22 0:00:03
10/03/2014 15:07:47 10/03/2014 15:07:48 0:00:01
Refresh 1 10/03/2014 15:08:45 10/03/2014 15:08:59 0:00:14
Refresh 2 10/03/2014 15:07:49 10/03/2014 15:08:02 0:00:13
Refresh 3 10/03/2014 15:08:02 10/03/2014 15:08:17 0:00:15
Refresh 4 10/03/2014 15:08:17 10/03/2014 15:08:29 0:00:12
Refresh 5 10/03/2014 15:08:29 10/03/2014 15:08:45 0:00:16

Numerical Quantities Summary Timing Intervals in Seconds:

Query Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8
Stream 0 2.081986 0.208487 0.902462 0.313160 1.312273 0.493157 0.926629 0.786345
Stream 1 2.755427 0.911578 3.618085 0.664407 3.740112 2.118189 4.738754 6.551446
Stream 2 4.189612 0.957921 5.267355 2.152479 6.068005 1.263380 4.251842 3.620160
Stream 3 4.708834 0.981651 2.411839 0.790955 4.384516 1.322670 2.641571 4.771831
Stream 4 3.739567 1.185884 2.863871 1.517891 5.946967 1.179960 3.840560 4.926325
Stream 5 5.258746 0.705228 3.460904 0.951328 4.530620 1.104500 3.226494 4.041142
Min Qi 2.755427 0.705228 2.411839 0.664407 3.740112 1.104500 2.641571 3.620160
Max Qi 5.258746 1.185884 5.267355 2.152479 6.068005 2.118189 4.738754 6.551446
Avg Qi 4.130437 0.948452 3.524411 1.215412 4.934044 1.397740 3.739844 4.782181
Query Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16
Stream 0 3.226685 1.878227 1.802562 0.676499 3.145884 0.653129 0.963449 0.990524
Stream 1 8.842030 5.630466 5.728147 2.643227 9.615551 3.197855 4.676538 4.285251
Stream 2 9.508612 5.288044 4.319998 1.492915 9.431995 3.206360 3.859749 3.201996
Stream 3 10.480224 5.880274 4.517320 2.509405 6.913159 2.892479 6.408602 2.938061
Stream 4 8.824111 5.752413 5.997959 2.581237 8.954756 3.351951 2.420598 4.148455
Stream 5 4.905553 7.099111 5.121041 2.516020 9.354924 3.955638 4.389209 3.818902
Min Qi 4.905553 5.288044 4.319998 1.492915 6.913159 2.892479 2.420598 2.938061
Max Qi 10.480224 7.099111 5.997959 2.643227 9.615551 3.955638 6.408602 4.285251
Avg Qi 8.512106 5.930062 5.136893 2.348561 8.854077 3.320857 4.350939 3.678533
Query Q17 Q18 Q19 Q20 Q21 Q22 RF1 RF2
Stream 0 1.405338 0.868313 0.806277 1.123366 1.314028 0.233214 2.590459 1.230242
Stream 1 5.191045 3.171244 3.403836 4.604523 3.721133 0.892096 7.136841 6.500452
Stream 2 6.282687 2.845465 3.024786 4.086546 3.530743 0.619683 9.263671 4.826173
Stream 3 6.040787 2.659766 2.787273 6.210077 3.902190 2.175417 7.974860 6.689780
Stream 4 4.978721 2.542674 3.518783 4.385571 3.906211 0.918752 6.303352 5.139326
Stream 5 5.208600 3.761975 3.682886 7.874493 5.017600 2.087150 7.999074 7.978154
Min Qi 4.978721 2.542674 2.787273 4.086546 3.530743 0.619683 6.303352 4.826173
Max Qi 6.282687 3.761975 3.682886 7.874493 5.017600 2.175417 9.263671 7.978154
Avg Qi 5.540368 2.996225 3.283513 5.432242 4.015575 1.338620 7.735560 6.226777

1000 GB Run 1

Virt-H Executive Summary

Report Date                                          October 3, 2014
Database Scale Factor                                1000
Total Data Storage/Database Size                     26M
Query Streams for Throughput Test                    7
Virt-H Power                                         136,744.5
Virt-H Throughput                                    147,374.6
Virt-H Composite Query-per-Hour Metric (Qph@1000GB)  141,960.1
Measurement Interval in Throughput Test (Ts)         3,761.953000 seconds

Duration of stream execution

Start Date/Time End Date/Time Duration
Stream 0 10/03/2014 09:18:42 10/03/2014 09:34:12 0:15:30
Stream 1 10/03/2014 09:34:43 10/03/2014 10:35:42 1:00:59
Stream 2 10/03/2014 09:34:43 10/03/2014 10:37:14 1:02:31
Stream 3 10/03/2014 09:34:43 10/03/2014 10:37:25 1:02:42
Stream 4 10/03/2014 09:34:43 10/03/2014 10:33:31 0:58:48
Stream 5 10/03/2014 09:34:43 10/03/2014 10:35:26 1:00:43
Stream 6 10/03/2014 09:34:43 10/03/2014 10:28:00 0:53:17
Stream 7 10/03/2014 09:34:43 10/03/2014 10:35:42 1:00:59
Refresh 0 10/03/2014 09:18:42 10/03/2014 09:19:27 0:00:45
10/03/2014 09:34:12 10/03/2014 09:34:42 0:00:30
Refresh 1 10/03/2014 09:43:03 10/03/2014 09:43:38 0:00:35
Refresh 2 10/03/2014 09:34:43 10/03/2014 09:36:54 0:02:11
Refresh 3 10/03/2014 09:36:53 10/03/2014 09:38:39 0:01:46
Refresh 4 10/03/2014 09:38:39 10/03/2014 09:39:22 0:00:43
Refresh 5 10/03/2014 09:39:23 10/03/2014 09:41:09 0:01:46
Refresh 6 10/03/2014 09:41:09 10/03/2014 09:42:15 0:01:06
Refresh 7 10/03/2014 09:42:15 10/03/2014 09:43:02 0:00:47

Numerical Quantities Summary Timing Intervals in Seconds:

Query Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8
Stream 0 104.488583 18.351559 24.631282 36.195531 36.319915 3.807790 22.750889 31.190630
Stream 1 209.323441 26.205435 59.637373 245.808484 60.699333 22.369379 289.435780 335.733425
Stream 2 109.134446 64.185831 96.131735 108.459418 310.273986 53.595127 152.242755 104.350098
Stream 3 73.321611 215.535408 69.543101 12.423757 64.445611 38.254747 122.952872 98.713213
Stream 4 110.875875 4.272757 78.697314 16.316807 59.746855 23.447211 353.190412 342.549908
Stream 5 41.972337 5.978707 60.784575 34.219229 42.372449 344.590640 146.186614 274.972270
Stream 6 115.760155 18.692078 58.493147 9.193234 49.831932 19.081395 60.603109 128.095501
Stream 7 58.601744 118.126585 297.327543 298.578268 714.284222 108.475250 91.868151 55.881029
Min Qi 41.972337 4.272757 58.493147 9.193234 42.372449 19.081395 60.603109 55.881029
Max Qi 209.323441 215.535408 297.327543 298.578268 714.284222 344.590640 353.190412 342.549908
Avg Qi 102.712801 64.713829 102.944970 103.571314 185.950627 87.116250 173.782813 191.470778
Query Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16
Stream 0 41.777880 10.035063 16.125611 9.245638 209.443782 111.271310 37.821595 9.483838
Stream 1 244.243830 63.473338 207.741931 33.696956 561.057408 141.026049 126.818051 54.774792
Stream 2 189.297446 144.853756 56.292537 184.781273 501.330052 49.965102 107.736393 85.691079
Stream 3 231.060699 355.394713 43.483645 11.806590 555.445111 36.722686 251.241817 9.057850
Stream 4 227.371508 32.207115 108.880658 139.922550 532.697956 57.106583 159.198489 153.088913
Stream 5 416.113856 108.689389 62.847727 702.712683 622.906487 58.198961 89.707091 85.614769
Stream 6 228.019243 62.474213 88.227994 282.932978 432.387869 238.544027 61.486269 56.950548
Stream 7 230.564416 69.197517 130.708759 120.531103 551.112816 57.438478 82.256530 63.796403
Min Qi 189.297446 32.207115 43.483645 11.806590 432.387869 36.722686 61.486269 9.057850
Max Qi 416.113856 355.394713 207.741931 702.712683 622.906487 238.544027 251.241817 153.088913
Avg Qi 252.381571 119.470006 99.740464 210.912019 536.705386 91.285984 125.492091 72.710622
Query Q17 Q18 Q19 Q20 Q21 Q22 RF1 RF2
Stream 0 22.897349 47.870269 12.735580 25.982194 46.091766 6.623306 45.120559 30.016788
Stream 1 123.444839 22.212194 647.523826 97.431531 81.592165 4.573040 21.068225 14.486185
Stream 2 80.853865 622.651044 288.656211 336.409076 70.925079 33.578052 82.910543 48.001583
Stream 3 392.340812 84.967695 57.181935 473.720060 497.262620 66.966740 54.778284 50.940094
Stream 4 97.069440 301.705125 338.035788 258.992426 103.699408 28.750257 23.858757 13.626079
Stream 5 69.882110 34.277914 146.031938 179.656129 104.788154 10.836148 54.319823 52.077352
Stream 6 141.310431 247.242904 94.392791 702.775460 80.142930 19.969889 46.027410 19.136271
Stream 7 89.018281 51.105998 281.234432 79.046122 84.341517 26.221892 33.169666 13.309634
Min Qi 69.882110 22.212194 57.181935 79.046122 70.925079 4.573040 21.068225 13.309634
Max Qi 392.340812 622.651044 647.523826 702.775460 497.262620 66.966740 82.910543 52.077352
Avg Qi 141.988540 194.880411 264.722417 304.004401 146.107410 27.270860 45.161815 30.225314

1000 GB Run 2

Virt-H Executive Summary

Report Date                                          October 3, 2014
Database Scale Factor                                1000
Total Data Storage/Database Size                     26M
Query Streams for Throughput Test                    7
Virt-H Power                                         199,652.0
Virt-H Throughput                                    125,161.1
Virt-H Composite Query-per-Hour Metric (Qph@1000GB)  158,078.0
Measurement Interval in Throughput Test (Ts)         4,429.608000 seconds

Duration of stream execution

Start Date/Time End Date/Time Duration
Stream 0 10/03/2014 10:37:29 10/03/2014 10:52:26 0:14:57
Stream 1 10/03/2014 10:52:35 10/03/2014 12:05:19 1:12:44
Stream 2 10/03/2014 10:52:35 10/03/2014 12:06:25 1:13:50
Stream 3 10/03/2014 10:52:35 10/03/2014 12:03:08 1:10:33
Stream 4 10/03/2014 10:52:35 10/03/2014 12:05:20 1:12:45
Stream 5 10/03/2014 10:52:35 10/03/2014 11:57:40 1:05:05
Stream 6 10/03/2014 10:52:35 10/03/2014 12:05:28 1:12:53
Stream 7 10/03/2014 10:52:35 10/03/2014 12:05:25 1:12:50
Refresh 0 10/03/2014 10:37:29 10/03/2014 10:37:52 0:00:23
10/03/2014 10:52:25 10/03/2014 10:52:34 0:00:09
Refresh 1 10/03/2014 11:01:44 10/03/2014 11:02:29 0:00:45
Refresh 2 10/03/2014 10:52:35 10/03/2014 10:54:50 0:02:15
Refresh 3 10/03/2014 10:54:50 10/03/2014 10:57:02 0:02:12
Refresh 4 10/03/2014 10:57:05 10/03/2014 10:58:47 0:01:42
Refresh 5 10/03/2014 10:58:47 10/03/2014 10:59:46 0:00:59
Refresh 6 10/03/2014 10:59:45 10/03/2014 11:00:38 0:00:53
Refresh 7 10/03/2014 11:00:39 10/03/2014 11:01:44 0:01:05

Numerical Quantities Summary Timing Intervals in Seconds:

Query Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8
Stream 0 34.105419 1.439089 9.802183 2.033956 10.525742 3.356152 23.953729 36.199533
Stream 1 26.598252 150.572833 41.930330 86.870320 50.604856 201.001372 61.638366 244.013359
Stream 2 50.129895 102.219282 12.380935 102.319615 62.577229 43.454392 891.076608 407.640626
Stream 3 269.947278 53.172724 54.649973 11.460062 66.695722 17.336698 63.371232 91.158050
Stream 4 41.149221 22.520836 28.707973 509.984321 68.916549 17.525025 702.191490 666.450230
Stream 5 59.179045 30.734442 99.504351 11.145990 101.334340 21.660836 74.625589 535.160207
Stream 6 225.105215 55.567328 46.749707 554.474507 215.657091 54.362551 72.960653 442.194302
Stream 7 220.993226 28.528230 47.543365 336.191006 308.931194 9.767397 850.258452 66.121298
Min Qi 26.598252 22.520836 12.380935 11.145990 50.604856 9.767397 61.638366 66.121298
Max Qi 269.947278 150.572833 99.504351 554.474507 308.931194 201.001372 891.076608 666.450230
Avg Qi 127.586019 63.330811 47.352376 230.349403 124.959569 52.158324 388.017484 350.391153
Query Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16
Stream 0 50.439615 9.287196 15.892947 7.112715 250.527755 131.478131 54.458992 10.525842
Stream 1 420.919329 317.402771 101.818338 403.213385 724.539887 160.669174 65.374584 28.563034
Stream 2 464.378760 210.938167 23.395678 545.086468 736.005716 54.680686 398.880053 34.018918
Stream 3 350.083270 321.781561 48.652019 435.954962 378.872739 100.588804 289.350342 190.140640
Stream 4 306.265994 249.621982 79.280220 221.255121 348.932746 49.555802 100.062439 61.368814
Stream 5 511.923087 133.018420 134.199065 9.655693 662.658830 104.380635 82.847242 59.952271
Stream 6 578.362701 61.221715 145.613349 47.957006 621.993889 256.150595 77.124777 91.163005
Stream 7 418.450091 391.818564 29.360218 17.236628 761.850888 31.952329 50.393082 27.530882
Min Qi 306.265994 61.221715 23.395678 9.655693 348.932746 31.952329 50.393082 27.530882
Max Qi 578.362701 391.818564 145.613349 545.086468 761.850888 256.150595 398.880053 190.140640
Avg Qi 435.769033 240.829026 80.331270 240.051323 604.979242 108.282575 152.004646 70.391081
Query Q17 Q18 Q19 Q20 Q21 Q22 RF1 RF2
Stream 0 22.444111 37.978532 13.347320 26.553364 115.511143 7.670304 22.771613 8.761026
Stream 1 329.153807 19.198590 258.455295 556.256015 99.647793 14.878746 32.803289 8.771923
Stream 2 76.940373 74.916489 75.246897 16.035355 14.403643 32.348500 91.981362 41.426540
Stream 3 88.918404 238.858707 221.257060 688.441713 247.669761 5.345632 70.780594 49.352955
Stream 4 497.105081 167.874781 67.668514 76.820831 78.585717 3.655421 73.165786 29.401670
Stream 5 309.991618 123.023557 380.801141 347.055909 93.478502 18.351491 33.338814 12.557542
Stream 6 57.200926 154.489850 386.007137 103.558355 32.676369 92.863316 35.576966 14.061801
Stream 7 160.332088 46.934177 340.957970 84.479720 78.985110 60.568796 44.362737 8.831746
Min Qi 57.200926 19.198590 67.668514 16.035355 14.403643 3.655421 32.803289 8.771923
Max Qi 497.105081 238.858707 386.007137 688.441713 247.669761 92.863316 91.981362 49.352955
Avg Qi 217.091757 117.899450 247.199145 267.521128 92.206699 32.573129 54.572793 23.486311

To be continued...

In Hoc Signo Vinces (TPC-H) Series

# PermaLink Comments [0]
10/06/2014 13:53 GMT
In Hoc Signo Vinces (part 20 of n): 100G and 1000G With Cluster; When is Cluster Worthwhile; Effects of I/O [ Orri Erling ]

In the introductory piece on scale-out, I promised to address the matter of the data-to-memory ratio and to talk about when scale-out makes sense. Here we will see that scale-out makes sense whenever the data does not fit in memory on a single commodity server. The gains in processing power are immediate, even when going from one box to just two, with both systems having everything in memory.

As an initial take on the issue, we run 100 GB and 1000 GB on the test system. 100 GB fits trivially in memory; 1000 GB does not, as total memory is 384 GB, of which 360 GB may be used by the server processes.

We make two runs against the 100 GB database, having pre-loaded the data into memory:

run power throughput composite
1 349,027.7 420,503.1 383,102.1
2 387,890.3 433,066.6 409,856.5

This is directly comparable to the 100 GB single-server results. Comparing the second runs, we see a 1.53x gain in power and a 1.8x gain in throughput from doubling the platform. This is fully on the level for a workload that is not trivially parallel, as we have seen in the previous articles. The difference between the first and second runs at 100 GB comes, for both single server and cluster, from the latency of allocating transient query memory. For an official run, where the weakest link is the first power test, this memory would simply have to be pre-allocated.

We make two runs against the 1000 GB database, starting from cold.

The result is:

run power throughput composite
1 136,744.5 147,374.6 141,960.1
2 199,652.0 125,161.1 158,078.0
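For reference, the composite figure in these tables is, per the TPC-H rules, the geometric mean of the power and throughput metrics. A quick Python check reproduces the reported composites for the four runs in this article:

```python
from math import sqrt

# (power, throughput, reported composite) for the four runs reported here
runs = [
    (349_027.7, 420_503.1, 383_102.1),  # 100 GB, run 1
    (387_890.3, 433_066.6, 409_856.5),  # 100 GB, run 2
    (136_744.5, 147_374.6, 141_960.1),  # 1000 GB, run 1
    (199_652.0, 125_161.1, 158_078.0),  # 1000 GB, run 2
]

for power, throughput, reported in runs:
    composite = sqrt(power * throughput)  # geometric mean
    print(f"computed {composite:,.1f} vs reported {reported:,.1f}")
```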

The 1000 GB result is not competitive on this platform; more memory would be needed. For actual applications, though, the numbers are still in the usable range.

The 1000 GB setup uses 4 SSDs for storage, one per server process. The server processes are each bound to their own physical CPU.

We look at the meters: 32M pages (8M per process) are in memory at any one time. Over the 2 benchmark executions there are a total of 494M disk reads. The total CPU time is 165,674 seconds, of which about 10% is system time, over 10,063 seconds of real time. Cumulative disk-read wait time is 130,177 s. This gives an average disk-read throughput of 384 MB/s.
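As a sanity check, that figure falls out of the counters above, assuming an 8 KB database page and binary megabytes (both assumptions on my part):

```python
PAGE_BYTES = 8 * 1024      # assumed 8 KB page size
reads = 494_000_000        # disk reads over the 2 executions
elapsed_s = 10_063         # real time over the 2 executions
wait_s = 130_177           # cumulative disk-read wait, thread time

mb_per_s = reads * PAGE_BYTES / elapsed_s / 2**20
avg_waiting_threads = wait_s / elapsed_s  # threads blocked on read, on average

print(f"{mb_per_s:.0f} MB/s, ~{avg_waiting_threads:.0f} threads waiting")
```

The same counters also say that, on average, about 13 threads were blocked on disk reads at any moment, which is where the CPU-utilization drop comes from.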

This is easily sustained by 4 SSDs; in practice, the maximum read throughput we see is 1 GB/s (256 MB/s per SSD). Newer SSDs would do maybe twice that. Rotating media would not be an option.

Without the drop in CPU utilization caused by waiting for the SSDs, we would have numbers very close to the 100 GB numbers.

The interconnect traffic for the two runs was 1,077 GB with no message compression. The write-block time was 448 seconds of thread time. So blocking on writes hurts platform utilization when running under optimal conditions, but compared to going to secondary storage it is not a large factor.

The 1000 GB scale has a transient peak memory consumption of 42 GB, consisting of hash-join build sides and GROUP BYs. The greatest memory consumers are Q9 with 9 GB, Q13 with 11 GB, and Q16 with 7 GB. Having many of these running at a time drives up the transient peak, and the peak grows with scale, also because a larger scale requires more concurrent query streams. At the present ratio of 384 GB of RAM to 1000 GB of data, we do not yet need memory-saving plans such as multi-pass hash joins or index use instead of hash joins. As the data size grows, replicated hash build sides will become less convenient, and communication will increase. Q9 and Q13 can be done by index with almost no transient memory, but such plans easily cost 3x more CPU. They will probably help at 3000 GB and be necessary at least part of the time at 10,000 GB.

The I/O volume in MB per index over the 2 executions is:

index MB
LINEITEM 1,987,483
ORDERS 1,440,526
PARTSUPP 199,335
PART 161,717
CUSTOMER 43,276
O_CK 19,085
SUPPLIER 13,393
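Summing the table cross-checks against the 494M disk reads quoted earlier (again assuming an 8 KB page, my assumption): the two totals agree to within a fraction of a percent:

```python
io_mb = {  # MB read per index, from the table above
    "LINEITEM": 1_987_483,
    "ORDERS": 1_440_526,
    "PARTSUPP": 199_335,
    "PART": 161_717,
    "CUSTOMER": 43_276,
    "O_CK": 19_085,
    "SUPPLIER": 13_393,
}
total_mb = sum(io_mb.values())              # total MB from the table
reads_mb = 494_000_000 * 8 * 1024 // 2**20  # MB implied by the read count
print(total_mb, reads_mb)
```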

Of this, maybe 600 GB could be saved by stream-compressing o_comment; otherwise the I/O cannot be reduced without adding memory. The lineitem reads are mostly for l_extendedprice, which is not compressible. If compressing o_comment made l_extendedprice always fit in memory, there would be a radical drop in I/O.

Also, a least-recently-used buffer-replacement policy behaves at its very worst on big scans, specifically those of l_extendedprice: if the head of the column is evicted while reading the tail, and the next read again starts from the head, the whole table/column is read all over again. A caching policy that specifically recognized scans of this sort could further reduce I/O. Clustering lineitems/orders on date, as the Actian Vector TPC-H implementations do, also yields a greater gain when not running from memory: one column (e.g., l_shipdate) may be scanned for the whole table, but if the matches are bunched together, most of l_extendedprice will not be read at all. Still, when going for top ranks in the races, everything will be in memory, or at least on SSDs with read throughput around 150 MB/s per core, so these tricks become relatively less important.
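The scan pathology is easy to reproduce with a toy cache model (illustrative only; this is not Virtuoso's actual buffer manager). On a repeated scan just one page larger than the cache, strict LRU gets zero hits, while an MRU-style, scan-resistant eviction keeps nearly the whole table resident:

```python
from collections import OrderedDict

def scan_hits(policy, cache_size, n_pages, n_scans):
    """Count cache hits over repeated sequential scans of n_pages pages.
    policy 'lru' evicts the least recently used page; 'mru' evicts the
    most recently used one, which is what saves cyclic scans."""
    cache = OrderedDict()
    hits = 0
    for _ in range(n_scans):
        for page in range(n_pages):
            if page in cache:
                hits += 1
                cache.move_to_end(page)
            else:
                if len(cache) >= cache_size:
                    # LRU pops the oldest entry; MRU pops the newest.
                    cache.popitem(last=(policy == 'mru'))
                cache[page] = True
    return hits

# One page more than fits: LRU re-reads the whole table on every scan,
# while MRU misses only a page or two per scan after the first.
print(scan_hits('lru', 100, 101, 10), scan_hits('mru', 100, 101, 10))
```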

In the 100 GB numerical quantities summaries, we see much the same picture as on the single server. Queries get faster, but their relative times are not radically different. The throughput-test times (many queries at a time) are more or less multiples of the power-test times (single user). This picture breaks down at 1000 GB, where I/O drops performance to under half and introduces huge variation in execution times within a single query. The time depends entirely on which queries run alongside or right before the query in question, and on whether these have the same or different working sets. All streams have the same queries with different parameters, but the query order in each stream differs.

The numerical quantities for all the runs follow. Note that the first 1000 GB run is cold. A competition-grade 1000 GB result could be made with double the memory, and the more CPU the better. We will try one on Amazon in a bit.

***

The conclusion is that scale-out pays from the get-go. At present prices, a system with twice the power of a single node of the test system is cost-effective. Scales of up to 500 GB fit on a single commodity server, under $10K. Rather than going from a mid-to-large dual-socket box to a quad-socket box, one is likely better off with two cheaper dual-socket boxes. These are also readily available on clouds, whereas scale-up configurations are not. From 1 TB onwards, a cluster is expected to win clearly. At 3 TB, a commodity cluster will clearly be the better deal in both price and absolute performance.

100 GB Run 1

Virt-H Executive Summary

Report Date October 3, 2014
Database Scale Factor 100
Total Data Storage/Database Size 0M
Query Streams for Throughput Test 5
Virt-H Power 349,027.7
Virt-H Throughput 420,503.1
Virt-H Composite Query-per-Hour Metric (Qph@100GB) 383,102.1
Measurement Interval in Throughput Test (Ts) 94.273000 seconds

Duration of stream execution

Start Date/Time End Date/Time Duration
Stream 0 10/03/2014 15:05:07 10/03/2014 15:05:40 0:00:33
Stream 1 10/03/2014 15:05:42 10/03/2014 15:07:15 0:01:33
Stream 2 10/03/2014 15:05:42 10/03/2014 15:07:15 0:01:33
Stream 3 10/03/2014 15:05:42 10/03/2014 15:07:16 0:01:34
Stream 4 10/03/2014 15:05:42 10/03/2014 15:07:14 0:01:32
Stream 5 10/03/2014 15:05:42 10/03/2014 15:07:15 0:01:33
Refresh 0 10/03/2014 15:05:07 10/03/2014 15:05:13 0:00:06
10/03/2014 15:05:41 10/03/2014 15:05:42 0:00:01
Refresh 1 10/03/2014 15:06:48 10/03/2014 15:07:03 0:00:15
Refresh 2 10/03/2014 15:05:42 10/03/2014 15:06:06 0:00:24
Refresh 3 10/03/2014 15:06:06 10/03/2014 15:06:20 0:00:14
Refresh 4 10/03/2014 15:06:20 10/03/2014 15:06:35 0:00:15
Refresh 5 10/03/2014 15:06:35 10/03/2014 15:06:48 0:00:13

Numerical Quantities Summary Timing Intervals in Seconds:

Query Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8
Stream 0 2.045198 0.337315 1.129548 0.327029 1.230955 0.473090 0.979096 0.852639
Stream 1 4.521951 0.596538 3.464342 1.167101 3.944699 1.744325 5.442328 4.706185
Stream 2 4.678728 0.837205 3.594060 1.911751 3.942459 0.947788 3.821267 4.686319
Stream 3 5.126384 0.932394 0.961762 1.043759 5.359990 1.035597 3.056079 5.803445
Stream 4 4.497118 0.381036 4.665412 1.224975 5.316591 1.666253 2.297872 6.425171
Stream 5 4.080968 0.493741 4.416305 0.879202 5.705877 1.615987 3.846881 3.346686
Min Qi 4.080968 0.381036 0.961762 0.879202 3.942459 0.947788 2.297872 3.346686
Max Qi 5.126384 0.932394 4.665412 1.911751 5.705877 1.744325 5.442328 6.425171
Avg Qi 4.581030 0.648183 3.420376 1.245358 4.853923 1.401990 3.692885 4.993561
Query Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16
Stream 0 3.575916 2.786656 1.579488 0.611454 3.132460 0.685095 0.955559 1.060110
Stream 1 9.551437 7.187181 5.816455 2.004946 9.461347 5.624020 5.517677 2.924265
Stream 2 9.637427 6.641804 6.359532 2.412576 8.819754 3.335494 4.549792 3.163920
Stream 3 11.041451 6.464479 6.982671 3.272975 8.342983 3.448635 4.405911 2.886393
Stream 4 8.860228 6.754529 7.065501 3.225236 8.789565 3.419165 4.240718 2.399092
Stream 5 7.339672 8.121027 6.261988 2.711946 8.764934 3.106366 6.544712 3.472092
Min Qi 7.339672 6.464479 5.816455 2.004946 8.342983 3.106366 4.240718 2.399092
Max Qi 11.041451 8.121027 7.065501 3.272975 9.461347 5.624020 6.544712 3.472092
Avg Qi 9.286043 7.033804 6.497229 2.725536 8.835717 3.786736 5.051762 2.969152
Query Q17 Q18 Q19 Q20 Q21 Q22 RF1 RF2
Stream 0 1.433789 0.972152 0.780247 1.287222 1.360084 0.254051 6.201742 1.219707
Stream 1 3.398354 2.591249 3.021207 4.663204 4.775704 1.116547 8.770115 5.643550
Stream 2 6.811520 3.411846 2.634076 4.296810 4.669635 2.282003 18.039617 6.060465
Stream 3 4.947110 2.479268 2.952951 6.431644 5.469152 1.816467 8.271266 5.498956
Stream 4 5.240237 2.062261 2.734378 6.055141 2.997684 2.519301 7.889700 6.944722
Stream 5 4.839670 3.379315 3.231582 6.255944 3.759509 1.347830 8.707303 4.376033
Min Qi 3.398354 2.062261 2.634076 4.296810 2.997684 1.116547 7.889700 4.376033
Max Qi 6.811520 3.411846 3.231582 6.431644 5.469152 2.519301 18.039617 6.944722
Avg Qi 5.047378 2.784788 2.914839 5.540549 4.334337 1.816430 10.335600 5.704745

100 GB Run 2

Virt-H Executive Summary

Report Date October 3, 2014
Database Scale Factor 100
Total Data Storage/Database Size 0M
Query Streams for Throughput Test 5
Virt-H Power 387,890.3
Virt-H Throughput 433,066.6
Virt-H Composite Query-per-Hour Metric (Qph@100GB) 409,856.5
Measurement Interval in Throughput Test (Ts) 91.541000 seconds

Duration of stream execution

Start Date/Time End Date/Time Duration
Stream 0 10/03/2014 15:07:19 10/03/2014 15:07:47 0:00:28
Stream 1 10/03/2014 15:07:48 10/03/2014 15:09:19 0:01:31
Stream 2 10/03/2014 15:07:48 10/03/2014 15:09:16 0:01:28
Stream 3 10/03/2014 15:07:48 10/03/2014 15:09:17 0:01:29
Stream 4 10/03/2014 15:07:48 10/03/2014 15:09:16 0:01:28
Stream 5 10/03/2014 15:07:48 10/03/2014 15:09:20 0:01:32
Refresh 0 10/03/2014 15:07:19 10/03/2014 15:07:22 0:00:03
10/03/2014 15:07:47 10/03/2014 15:07:48 0:00:01
Refresh 1 10/03/2014 15:08:45 10/03/2014 15:08:59 0:00:14
Refresh 2 10/03/2014 15:07:49 10/03/2014 15:08:02 0:00:13
Refresh 3 10/03/2014 15:08:02 10/03/2014 15:08:17 0:00:15
Refresh 4 10/03/2014 15:08:17 10/03/2014 15:08:29 0:00:12
Refresh 5 10/03/2014 15:08:29 10/03/2014 15:08:45 0:00:16

Numerical Quantities Summary Timing Intervals in Seconds:

Query Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8
Stream 0 2.081986 0.208487 0.902462 0.313160 1.312273 0.493157 0.926629 0.786345
Stream 1 2.755427 0.911578 3.618085 0.664407 3.740112 2.118189 4.738754 6.551446
Stream 2 4.189612 0.957921 5.267355 2.152479 6.068005 1.263380 4.251842 3.620160
Stream 3 4.708834 0.981651 2.411839 0.790955 4.384516 1.322670 2.641571 4.771831
Stream 4 3.739567 1.185884 2.863871 1.517891 5.946967 1.179960 3.840560 4.926325
Stream 5 5.258746 0.705228 3.460904 0.951328 4.530620 1.104500 3.226494 4.041142
Min Qi 2.755427 0.705228 2.411839 0.664407 3.740112 1.104500 2.641571 3.620160
Max Qi 5.258746 1.185884 5.267355 2.152479 6.068005 2.118189 4.738754 6.551446
Avg Qi 4.130437 0.948452 3.524411 1.215412 4.934044 1.397740 3.739844 4.782181
Query Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16
Stream 0 3.226685 1.878227 1.802562 0.676499 3.145884 0.653129 0.963449 0.990524
Stream 1 8.842030 5.630466 5.728147 2.643227 9.615551 3.197855 4.676538 4.285251
Stream 2 9.508612 5.288044 4.319998 1.492915 9.431995 3.206360 3.859749 3.201996
Stream 3 10.480224 5.880274 4.517320 2.509405 6.913159 2.892479 6.408602 2.938061
Stream 4 8.824111 5.752413 5.997959 2.581237 8.954756 3.351951 2.420598 4.148455
Stream 5 4.905553 7.099111 5.121041 2.516020 9.354924 3.955638 4.389209 3.818902
Min Qi 4.905553 5.288044 4.319998 1.492915 6.913159 2.892479 2.420598 2.938061
Max Qi 10.480224 7.099111 5.997959 2.643227 9.615551 3.955638 6.408602 4.285251
Avg Qi 8.512106 5.930062 5.136893 2.348561 8.854077 3.320857 4.350939 3.678533
Query Q17 Q18 Q19 Q20 Q21 Q22 RF1 RF2
Stream 0 1.405338 0.868313 0.806277 1.123366 1.314028 0.233214 2.590459 1.230242
Stream 1 5.191045 3.171244 3.403836 4.604523 3.721133 0.892096 7.136841 6.500452
Stream 2 6.282687 2.845465 3.024786 4.086546 3.530743 0.619683 9.263671 4.826173
Stream 3 6.040787 2.659766 2.787273 6.210077 3.902190 2.175417 7.974860 6.689780
Stream 4 4.978721 2.542674 3.518783 4.385571 3.906211 0.918752 6.303352 5.139326
Stream 5 5.208600 3.761975 3.682886 7.874493 5.017600 2.087150 7.999074 7.978154
Min Qi 4.978721 2.542674 2.787273 4.086546 3.530743 0.619683 6.303352 4.826173
Max Qi 6.282687 3.761975 3.682886 7.874493 5.017600 2.175417 9.263671 7.978154
Avg Qi 5.540368 2.996225 3.283513 5.432242 4.015575 1.338620 7.735560 6.226777

1000 GB Run 1

Virt-H Executive Summary

Report Date October 3, 2014
Database Scale Factor 1000
Total Data Storage/Database Size 26M
Query Streams for Throughput Test 7
Virt-H Power 136,744.5
Virt-H Throughput 147,374.6
Virt-H Composite Query-per-Hour Metric (Qph@1000GB) 141,960.1
Measurement Interval in Throughput Test (Ts) 3,761.953000 seconds

Duration of stream execution

Start Date/Time End Date/Time Duration
Stream 0 10/03/2014 09:18:42 10/03/2014 09:34:12 0:15:30
Stream 1 10/03/2014 09:34:43 10/03/2014 10:35:42 1:00:59
Stream 2 10/03/2014 09:34:43 10/03/2014 10:37:14 1:02:31
Stream 3 10/03/2014 09:34:43 10/03/2014 10:37:25 1:02:42
Stream 4 10/03/2014 09:34:43 10/03/2014 10:33:31 0:58:48
Stream 5 10/03/2014 09:34:43 10/03/2014 10:35:26 1:00:43
Stream 6 10/03/2014 09:34:43 10/03/2014 10:28:00 0:53:17
Stream 7 10/03/2014 09:34:43 10/03/2014 10:35:42 1:00:59
Refresh 0 10/03/2014 09:18:42 10/03/2014 09:19:27 0:00:45
10/03/2014 09:34:12 10/03/2014 09:34:42 0:00:30
Refresh 1 10/03/2014 09:43:03 10/03/2014 09:43:38 0:00:35
Refresh 2 10/03/2014 09:34:43 10/03/2014 09:36:54 0:02:11
Refresh 3 10/03/2014 09:36:53 10/03/2014 09:38:39 0:01:46
Refresh 4 10/03/2014 09:38:39 10/03/2014 09:39:22 0:00:43
Refresh 5 10/03/2014 09:39:23 10/03/2014 09:41:09 0:01:46
Refresh 6 10/03/2014 09:41:09 10/03/2014 09:42:15 0:01:06
Refresh 7 10/03/2014 09:42:15 10/03/2014 09:43:02 0:00:47

Numerical Quantities Summary Timing Intervals in Seconds:

Query Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8
Stream 0 104.488583 18.351559 24.631282 36.195531 36.319915 3.807790 22.750889 31.190630
Stream 1 209.323441 26.205435 59.637373 245.808484 60.699333 22.369379 289.435780 335.733425
Stream 2 109.134446 64.185831 96.131735 108.459418 310.273986 53.595127 152.242755 104.350098
Stream 3 73.321611 215.535408 69.543101 12.423757 64.445611 38.254747 122.952872 98.713213
Stream 4 110.875875 4.272757 78.697314 16.316807 59.746855 23.447211 353.190412 342.549908
Stream 5 41.972337 5.978707 60.784575 34.219229 42.372449 344.590640 146.186614 274.972270
Stream 6 115.760155 18.692078 58.493147 9.193234 49.831932 19.081395 60.603109 128.095501
Stream 7 58.601744 118.126585 297.327543 298.578268 714.284222 108.475250 91.868151 55.881029
Min Qi 41.972337 4.272757 58.493147 9.193234 42.372449 19.081395 60.603109 55.881029
Max Qi 209.323441 215.535408 297.327543 298.578268 714.284222 344.590640 353.190412 342.549908
Avg Qi 102.712801 64.713829 102.944970 103.571314 185.950627 87.116250 173.782813 191.470778
Query Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16
Stream 0 41.777880 10.035063 16.125611 9.245638 209.443782 111.271310 37.821595 9.483838
Stream 1 244.243830 63.473338 207.741931 33.696956 561.057408 141.026049 126.818051 54.774792
Stream 2 189.297446 144.853756 56.292537 184.781273 501.330052 49.965102 107.736393 85.691079
Stream 3 231.060699 355.394713 43.483645 11.806590 555.445111 36.722686 251.241817 9.057850
Stream 4 227.371508 32.207115 108.880658 139.922550 532.697956 57.106583 159.198489 153.088913
Stream 5 416.113856 108.689389 62.847727 702.712683 622.906487 58.198961 89.707091 85.614769
Stream 6 228.019243 62.474213 88.227994 282.932978 432.387869 238.544027 61.486269 56.950548
Stream 7 230.564416 69.197517 130.708759 120.531103 551.112816 57.438478 82.256530 63.796403
Min Qi 189.297446 32.207115 43.483645 11.806590 432.387869 36.722686 61.486269 9.057850
Max Qi 416.113856 355.394713 207.741931 702.712683 622.906487 238.544027 251.241817 153.088913
Avg Qi 252.381571 119.470006 99.740464 210.912019 536.705386 91.285984 125.492091 72.710622
Query Q17 Q18 Q19 Q20 Q21 Q22 RF1 RF2
Stream 0 22.897349 47.870269 12.735580 25.982194 46.091766 6.623306 45.120559 30.016788
Stream 1 123.444839 22.212194 647.523826 97.431531 81.592165 4.573040 21.068225 14.486185
Stream 2 80.853865 622.651044 288.656211 336.409076 70.925079 33.578052 82.910543 48.001583
Stream 3 392.340812 84.967695 57.181935 473.720060 497.262620 66.966740 54.778284 50.940094
Stream 4 97.069440 301.705125 338.035788 258.992426 103.699408 28.750257 23.858757 13.626079
Stream 5 69.882110 34.277914 146.031938 179.656129 104.788154 10.836148 54.319823 52.077352
Stream 6 141.310431 247.242904 94.392791 702.775460 80.142930 19.969889 46.027410 19.136271
Stream 7 89.018281 51.105998 281.234432 79.046122 84.341517 26.221892 33.169666 13.309634
Min Qi 69.882110 22.212194 57.181935 79.046122 70.925079 4.573040 21.068225 13.309634
Max Qi 392.340812 622.651044 647.523826 702.775460 497.262620 66.966740 82.910543 52.077352
Avg Qi 141.988540 194.880411 264.722417 304.004401 146.107410 27.270860 45.161815 30.225314

1000 GB Run 2

Virt-H Executive Summary

Report Date October 3, 2014
Database Scale Factor 1000
Total Data Storage/Database Size 26M
Query Streams for Throughput Test 7
Virt-H Power 199,652.0
Virt-H Throughput 125,161.1
Virt-H Composite Query-per-Hour Metric (Qph@1000GB) 158,078.0
Measurement Interval in Throughput Test (Ts) 4,429.608000 seconds

Duration of stream execution

Start Date/Time End Date/Time Duration
Stream 0 10/03/2014 10:37:29 10/03/2014 10:52:26 0:14:57
Stream 1 10/03/2014 10:52:35 10/03/2014 12:05:19 1:12:44
Stream 2 10/03/2014 10:52:35 10/03/2014 12:06:25 1:13:50
Stream 3 10/03/2014 10:52:35 10/03/2014 12:03:08 1:10:33
Stream 4 10/03/2014 10:52:35 10/03/2014 12:05:20 1:12:45
Stream 5 10/03/2014 10:52:35 10/03/2014 11:57:40 1:05:05
Stream 6 10/03/2014 10:52:35 10/03/2014 12:05:28 1:12:53
Stream 7 10/03/2014 10:52:35 10/03/2014 12:05:25 1:12:50
Refresh 0 10/03/2014 10:37:29 10/03/2014 10:37:52 0:00:23
10/03/2014 10:52:25 10/03/2014 10:52:34 0:00:09
Refresh 1 10/03/2014 11:01:44 10/03/2014 11:02:29 0:00:45
Refresh 2 10/03/2014 10:52:35 10/03/2014 10:54:50 0:02:15
Refresh 3 10/03/2014 10:54:50 10/03/2014 10:57:02 0:02:12
Refresh 4 10/03/2014 10:57:05 10/03/2014 10:58:47 0:01:42
Refresh 5 10/03/2014 10:58:47 10/03/2014 10:59:46 0:00:59
Refresh 6 10/03/2014 10:59:45 10/03/2014 11:00:38 0:00:53
Refresh 7 10/03/2014 11:00:39 10/03/2014 11:01:44 0:01:05

Numerical Quantities Summary Timing Intervals in Seconds:

Query Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8
Stream 0 34.105419 1.439089 9.802183 2.033956 10.525742 3.356152 23.953729 36.199533
Stream 1 26.598252 150.572833 41.930330 86.870320 50.604856 201.001372 61.638366 244.013359
Stream 2 50.129895 102.219282 12.380935 102.319615 62.577229 43.454392 891.076608 407.640626
Stream 3 269.947278 53.172724 54.649973 11.460062 66.695722 17.336698 63.371232 91.158050
Stream 4 41.149221 22.520836 28.707973 509.984321 68.916549 17.525025 702.191490 666.450230
Stream 5 59.179045 30.734442 99.504351 11.145990 101.334340 21.660836 74.625589 535.160207
Stream 6 225.105215 55.567328 46.749707 554.474507 215.657091 54.362551 72.960653 442.194302
Stream 7 220.993226 28.528230 47.543365 336.191006 308.931194 9.767397 850.258452 66.121298
Min Qi 26.598252 22.520836 12.380935 11.145990 50.604856 9.767397 61.638366 66.121298
Max Qi 269.947278 150.572833 99.504351 554.474507 308.931194 201.001372 891.076608 666.450230
Avg Qi 127.586019 63.330811 47.352376 230.349403 124.959569 52.158324 388.017484 350.391153
Query Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16
Stream 0 50.439615 9.287196 15.892947 7.112715 250.527755 131.478131 54.458992 10.525842
Stream 1 420.919329 317.402771 101.818338 403.213385 724.539887 160.669174 65.374584 28.563034
Stream 2 464.378760 210.938167 23.395678 545.086468 736.005716 54.680686 398.880053 34.018918
Stream 3 350.083270 321.781561 48.652019 435.954962 378.872739 100.588804 289.350342 190.140640
Stream 4 306.265994 249.621982 79.280220 221.255121 348.932746 49.555802 100.062439 61.368814
Stream 5 511.923087 133.018420 134.199065 9.655693 662.658830 104.380635 82.847242 59.952271
Stream 6 578.362701 61.221715 145.613349 47.957006 621.993889 256.150595 77.124777 91.163005
Stream 7 418.450091 391.818564 29.360218 17.236628 761.850888 31.952329 50.393082 27.530882
Min Qi 306.265994 61.221715 23.395678 9.655693 348.932746 31.952329 50.393082 27.530882
Max Qi 578.362701 391.818564 145.613349 545.086468 761.850888 256.150595 398.880053 190.140640
Avg Qi 435.769033 240.829026 80.331270 240.051323 604.979242 108.282575 152.004646 70.391081
Query Q17 Q18 Q19 Q20 Q21 Q22 RF1 RF2
Stream 0 22.444111 37.978532 13.347320 26.553364 115.511143 7.670304 22.771613 8.761026
Stream 1 329.153807 19.198590 258.455295 556.256015 99.647793 14.878746 32.803289 8.771923
Stream 2 76.940373 74.916489 75.246897 16.035355 14.403643 32.348500 91.981362 41.426540
Stream 3 88.918404 238.858707 221.257060 688.441713 247.669761 5.345632 70.780594 49.352955
Stream 4 497.105081 167.874781 67.668514 76.820831 78.585717 3.655421 73.165786 29.401670
Stream 5 309.991618 123.023557 380.801141 347.055909 93.478502 18.351491 33.338814 12.557542
Stream 6 57.200926 154.489850 386.007137 103.558355 32.676369 92.863316 35.576966 14.061801
Stream 7 160.332088 46.934177 340.957970 84.479720 78.985110 60.568796 44.362737 8.831746
Min Qi 57.200926 19.198590 67.668514 16.035355 14.403643 3.655421 32.803289 8.771923
Max Qi 497.105081 238.858707 386.007137 688.441713 247.669761 92.863316 91.981362 49.352955
Avg Qi 217.091757 117.899450 247.199145 267.521128 92.206699 32.573129 54.572793 23.486311

To be continued...

In Hoc Signo Vinces (TPC-H) Series

# PermaLink Comments [0]
10/06/2014 13:53 GMT