Virtuoso Elastic Cluster Benchmarks AMI on Amazon EC2


Virtuoso Data Space Bot

Burlington, United States


We have another new Amazon machine image, this time for deploying your own Virtuoso Elastic Cluster in the cloud. The previous post gave a summary of running TPC-H on this image. This post covers what the AMI consists of and how to set it up.

Note: This AMI is running a pre-release build of Virtuoso 7.5, Commercial Edition. Features are subject to change, and this build is not licensed for any use other than the AMI-based benchmarking described herein.

There are two preconfigured cluster setups; one is for two (2) machines/instances and one is for four (4). Generation and loading of TPC-H data, as well as the benchmark run itself, is preconfigured, so you can do it by entering just a few commands. The whole sequence of doing a terabyte (1000G) scale TPC-H takes under two hours, with 30 minutes to generate the data, 35 minutes to load, and 35 minutes to do three benchmark runs. The 100G scale is several times faster still.
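As a quick sanity check on the timing claim, the three phase durations quoted above can be added up. This is only a sketch of the arithmetic; the dictionary and function names are made up for illustration and are not part of the AMI.

```python
# Phase durations in minutes for the 1000G TPC-H run, as quoted in the post.
# The names below are illustrative only; nothing here ships with the AMI.
PHASES_MIN = {
    "generate data": 30,
    "bulk load": 35,
    "three benchmark runs": 35,
}


def total_minutes(phases):
    """Sum the per-phase durations (in minutes)."""
    return sum(phases.values())


total = total_minutes(PHASES_MIN)
print(f"total: {total} min ({total // 60} h {total % 60} min)")
# prints "total: 100 min (1 h 40 min)", which is under two hours
```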

To experiment with this AMI, you will need a set of license files, one per machine/instance, which our Sales Team can provide.

Detailed instructions are on the AMI, in /home/ec2-user/cluster_instructions.txt, but the basic steps to get up and running are as follows:

1. Instantiate machine image ami-811becea (the AMI ID is subject to change; you should be able to find the latest by searching for "OpenLink Virtuoso Benchmarks" in "Community AMIs"; this one is short-named virtuoso-bench-cl) with two or four (2 or 4) R3.8xlarge instances within one virtual private cloud (VPC) and placement group. Make sure the VPC security is set to allow all connections.
2. Log in to the first instance, and fill in the configuration file with the internal IP addresses of all machines instantiated in step 1.
3. Distribute the license files to the instances, and start the OpenLink License Manager on each machine.
4. Run three shell commands to set up the file systems and the Virtuoso configuration files.
5. If you do not plan to run one of these benchmarks, you can simply start and work with the Virtuoso cluster now. It is ready for use with an empty database.
6. Before running one of these benchmarks, generate the appropriate dataset with the dbgen.sh command.
7. Bulk load the data with load.sh.
8. Run the benchmark with run.sh.
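The benchmark part of these steps could be wrapped in a small driver script. What follows is a minimal sketch under stated assumptions, not the procedure documented on the AMI: the function names, the `script_dir` parameter, and the idea of calling the scripts without arguments are all hypothetical. The only names taken from this post are dbgen.sh, load.sh, and run.sh. Check cluster_instructions.txt for the real location of the scripts and any arguments they need (for example, the scale).

```python
import ipaddress
import subprocess
import time
from pathlib import Path

# The AMI ships preconfigured setups for two or four instances.
ALLOWED_CLUSTER_SIZES = (2, 4)

# Script names as given in the post; their arguments (if any) are not shown there.
BENCHMARK_STEPS = ("dbgen.sh", "load.sh", "run.sh")


def check_cluster_ips(ips):
    """Validate a list of internal IP addresses before writing them to the config.

    Raises ValueError if the count is not 2 or 4, if an entry is not a valid
    IP address, if it is not a private address, or if the list has duplicates.
    """
    if len(ips) not in ALLOWED_CLUSTER_SIZES:
        raise ValueError(f"expected 2 or 4 addresses, got {len(ips)}")
    parsed = [ipaddress.ip_address(ip) for ip in ips]
    for addr in parsed:
        if not addr.is_private:
            raise ValueError(f"{addr} is not a private (internal) address")
    if len(set(parsed)) != len(parsed):
        raise ValueError("duplicate address in list")
    return parsed


def run_benchmark(script_dir, dry_run=True):
    """Run generate, load, and run in order, stopping at the first failure.

    script_dir is a hypothetical parameter: the directory holding the scripts.
    Returns a list of (script name, elapsed seconds) pairs.
    """
    timings = []
    for script in BENCHMARK_STEPS:
        cmd = [str(Path(script_dir) / script)]
        start = time.monotonic()
        if dry_run:
            print("would run:", " ".join(cmd))
        else:
            # check=True raises CalledProcessError on a non-zero exit status.
            subprocess.run(cmd, check=True)
        timings.append((script, time.monotonic() - start))
    return timings


if __name__ == "__main__":
    # Example addresses only; use the internal IPs of your own instances.
    check_cluster_ips(["10.0.0.11", "10.0.0.12"])
    run_benchmark(".", dry_run=True)
```

With `dry_run=True` the script only prints the commands it would run, which is a safe way to confirm the order before starting a run that takes over an hour at the terabyte scale.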

Right now, the cluster benchmarks are limited to TPC-H, but cluster versions of the LDBC Social Network and Semantic Publishing benchmarks will follow soon.
