In Hoc Signo Vinces (part 3 of 5) -- Benchmark Configuration Settings

    In this section, we cover the system configuration for running an analytics workload. If you are running a Virtuoso database with even a moderate data size, or are interested in reproducing the results presented here, the following will be relevant. If you are only interested in the science of the queries, you can skip to the next installment.

    The relevant sections of the virtuoso.ini file are below, with commentary inline. The actual ini file has many more settings but these do not influence the benchmark.

    The test file system layout has two SSD file systems, mounted on /1s1 and /1s2. The database is striped across the two file systems.

    [Database]
    DatabaseFile    = virtuoso.db
    TransactionFile = /1s2/dbs/virtuoso.trx
    Striping        = 1

    This sets the log to be on the second SSD, and the database to be striped; the files are declared in the [Striping] section further below.

    [TempDatabase]
    DatabaseFile    = virtuoso.tdb
    TransactionFile = virtuoso.ttr
    
    [Parameters]
    ServerPort                 = 1209
    ServerThreads              = 100
    CheckpointInterval         = 0
    NumberOfBuffers            = 8000000
    MaxDirtyBuffers            = 1000000

    The thread count is set to 100. This is not significant, since the test will only have a few concurrent connections, but this should be at least as high as the number of concurrent user connections expected.

    The 100 GB TPC-H working set is about 38 GB for the queries. The full database is about 80 GB. Eight million buffers at 8 KB each means that up to 64 GB of database pages will be resident in memory. This should be set higher than the expected working set if possible, but the database process size should also not exceed 80% of physical memory.
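
    As a quick sanity check on the arithmetic above, here is a minimal Python sketch; the physical-memory figure is an assumption for illustration, not a measured value from the test system.

    PAGE = 8 * 1024                          # Virtuoso database page size in bytes

    number_of_buffers = 8_000_000
    buffer_pool = number_of_buffers * PAGE   # about 65.5e9 bytes, i.e. roughly 64 GB of pages

    working_set = 38 * 10**9                 # query working set at 100 GB scale (from the text)
    phys_ram    = 96 * 10**9                 # assumed physical memory; adjust for the actual machine

    assert buffer_pool > working_set         # working set fits in the buffer pool
    assert buffer_pool < 0.8 * phys_ram      # keep the process under ~80% of physical memory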

    The max dirty buffers limit is set to a small fraction of the total buffers for faster bulk load. The bulk load is limited by writing to secondary storage, so we want the writing to start early and continue throughout the bulk load. Otherwise the checkpoint at the end of the bulk load would be oversized because of the large number of unflushed buffers.

    The checkpoint interval is set to 0, meaning no automatic checkpoints. There will be one at the end of the bulk load, as required by the rules, but the rules do not require checkpoints for the refresh functions.
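
    Since automatic checkpoints are disabled, the post-load checkpoint has to be issued explicitly. Below is a minimal Python sketch using pyodbc; the DSN and credentials are placeholders, and any SQL client (such as isql) that can send Virtuoso's CHECKPOINT statement works just as well.

    import pyodbc

    # Connection details are hypothetical; point the DSN at the test server.
    con = pyodbc.connect("DSN=Virtuoso;UID=dba;PWD=dba", autocommit=True)
    con.execute("checkpoint")   # forces the one checkpoint required after the bulk load
    con.close()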

    ColumnStore                = 1

    This sets all tables to be created column-wise. No special DDL directives are needed for column store operation.

    MaxCheckpointRemap         = 2500000
    DefaultIsolation           = 2

    The default isolation is set to READ COMMITTED. Running large queries with locking on reads would have a very high overhead.

    DirsAllowed                = /
    TransactionAfterImageLimit = 1500000000

    TransactionAfterImageLimit is set to an arbitrarily high number. The measure is the count of bytes to be written to the log at commit (1.5 GB, here). If the amount of data to be logged exceeds this, the transaction aborts. The RF1 transaction at 100 GB scale will log about 100 MB.

    FDsPerFile                 = 4
    MaxMemPoolSize             = 40000000

    MaxMemPoolSize is the maximum number of bytes of transient memory to be used for query optimization (40 MB, here). This is adequate for TPC-H, since the queries have only a few joins each. For RDF workloads, the number should be higher, since there are more joins.

    AdjustVectorSize           = 0

    The workload will run at the default vector size. Index operations can be accelerated by switching to a larger vector size, trading memory for locality. But since this workload is dominated by hash joins, there is no benefit in changing this.

    ThreadsPerQuery            = 24

    Each query is divided into up to 24 parallel fragments. 24 is the number of threads on the test system.

    AsyncQueueMaxThreads       = 48

    Queries are run by a pool of 48 worker threads. Each session has one thread of its own. If a query parallelizes, the first fragment runs on the session's thread and the remaining fragments run on threads from this pool. Thus the core threads are oversubscribed by a factor of slightly over 2 in the throughput run: 6 sessions plus 48 pool threads makes up to 54 runnable threads at any point in the throughput test.
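
    The oversubscription figure follows directly from these numbers; a quick sketch of the arithmetic (all values taken from the text):

    hw_threads   = 24   # hardware threads on the test system
    sessions     = 6    # concurrent query streams in the throughput test
    pool_threads = 48   # AsyncQueueMaxThreads

    runnable = sessions + pool_threads   # up to 54 runnable threads at peak
    print(runnable / hw_threads)         # 2.25, i.e. slightly over 2x oversubscription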

    MaxQueryMem                = 30G

    This is a cap on query execution memory. If memory would exceed this, optimizations that would increase space consumption are not used. The memory may still transiently exceed this limit.

    HashJoinSpace              = 30G

    This is the maximum memory to be used for hash tables during hash joins. If a hash join would cause this amount to be exceeded, it is run in multiple passes, so as to cap the hash table size. Not all hash joins can be partitioned, and we do not want the test to do multi-pass hash joins, hence the high value here. We will see actual space consumption figures when looking at the queries. This parameter may be increased for analytics performance, especially in multi-user situations.

    [Client]
    SQL_QUERY_TIMEOUT  = 0
    SQL_TXN_TIMEOUT    = 0
    SQL_ROWSET_SIZE    = 10
    SQL_PREFETCH_BYTES = 120000

    Up to 120 KB of results is sent to the client in a single window. This is enough for the relatively short result sets in this benchmark.

    [Striping]
    Segment1 = 1024, /1s1/dbs/tpch100cp-1.db = q1, /1s2/dbs/tpch100cp-2.db = q2

    The database is set to stripe across two files, each on a different SSD. Each file has its own background I/O thread; this is the meaning of the = q1 and = q2 declarations. All files on the same separately-seekable device should share the same q (I/O queue).
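
    For example, a hypothetical layout with two stripe files per SSD would keep one I/O queue per device (file names here are illustrative, not from the test system):

    [Striping]
    Segment1 = 1024, /1s1/dbs/stripe-1.db = q1, /1s1/dbs/stripe-2.db = q1, /1s2/dbs/stripe-3.db = q2, /1s2/dbs/stripe-4.db = q2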

    [Flags]
    enable_mt_txn      = 1
    enable_mt_transact = 1

    The first setting enables multithreaded execution of DML statements. The second enables multithreaded COMMIT and ROLLBACK operations. This is important for refresh function performance. A column store COMMIT of a DELETE benefits especially from multithreading, since it may involve re-compression.

    hash_join_enable   = 2

    This enables hash joins for both SQL and SPARQL (even though SPARQL is not used in this experiment).

    dbf_explain_level  = 0

    Specifies less verbose query plan formatting for logging of query execution.

    dbf_log_fsync = 1

    This specifies that fsync is called after each write to the transaction log. The ACID qualification procedure requires the system to be powered down in mid-run, hence this setting is required for the test.
