The server configuration is as given in virtuoso.ini, discussed in the previous post. The schema is created by loading the attached file schema.sql. All the tables are stored column-wise. The file contains declarations for hash partitioning in a cluster, but these have no effect in the single-server case. The file tables are declared in ldschema.sql and bound to files in ldfile.sql. The refresh functions are in rf.sql.

The source data is created with the dbgen utility. One file is generated per table.

Twelve refresh datasets are created for the prescribed two runs; each run consists of one power test and one five-stream throughput test. Five streams is the minimum at the 100 GB scale.

Bulk Load

The bulk load script, ld.sql, specifies the CSV files from which the data is loaded as file tables. The load command is simply --

log_enable (2); 
INSERT INTO lineitem 
  SELECT * 
  FROM lineitem_f 
;

The log_enable (2) turns off transaction logging and enables non-transactional inserts. The lineitem table is a column-wise stored database table; lineitem_f is a file-table view over the lineitem.tbl CSV file. The load script launches one statement like the above for each table, all in parallel, and then waits for their completion. It then issues an explicit CHECKPOINT to make the data durable. No foreign keys are declared; hence the loads do not have to occur in any particular order. Each file is loaded in 24 parallel chunks; the file table facility splits the scan automatically into as many chunks as specified by ThreadsPerQuery in the ini file.
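The parallel launch can be sketched as a small Virtuoso/PL procedure. This is an illustration only, not the actual ld.sql: the async-queue builtins used here (async_queue, aq_request, aq_wait_all) and the LD_TABLE helper are assumptions for the sketch.

CREATE PROCEDURE ld_table 
  ( IN  tb  VARCHAR
  )
  {
    log_enable (2);    -- unlogged, non-transactional inserts
    exec (sprintf ('INSERT INTO %s SELECT * FROM %s_f', tb, tb));
  }

CREATE PROCEDURE ld_all ()
  {
    DECLARE aq ANY
    ;
    aq := async_queue (8)      -- up to 8 loads run concurrently
    ;
    FOREACH (VARCHAR tb IN 
             vector ('region', 'nation', 'supplier', 'customer',
                     'part', 'partsupp', 'orders', 'lineitem')) DO
      aq_request (aq, 'DB.DBA.LD_TABLE', vector (tb))
    ;
    aq_wait_all (aq)           -- block until every load completes
    ;
    exec ('CHECKPOINT')        -- then make the loaded state durable
    ;
  }

Each per-table INSERT is itself split into ThreadsPerQuery chunks by the file table facility, so the queue here only governs how many tables load at once.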

The last of the load statements, that for lineitem, completes in 849s of real time. At this point, the data is loaded, and the database is ready for query. There are 3.4M dirty buffers yet to be flushed before the database state is durable. Thus, we must include the checkpoint time in the load result, which adds another 169s. The total load time is hence 16m58s.

By the TPC-H rules, the timed portion of the load must include any gathering of database statistics. We do not gather any; rather, the queries derive any needed statistics by sampling at run time.

The bulk load sustains a read rate of around 120 MB/s from the source files, against an average write rate of 60 MB/s. The writing continues long after the reading has finished, so the load is truly I/O bound. This can be improved by adding more SSDs. The CPU profile shows a possible gain of around 10%. Thus, with a better I/O system and some more optimization, a load time of about 11m should be possible with this CPU/memory configuration.

TPC-H specifies two data refresh operations: one inserting 1/1000th of the orders/lineitem combination, the other deleting the same. The rules leave the implementation largely open; they only require that an order and its lineitems be inserted or deleted within a single transaction.

Most implementations bulk-load a staging table, and then do an INSERT ... SELECT statement for the INSERT or a DELETE WHERE IN (SELECT ...) for the DELETE.
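For contrast, that conventional approach might look as follows. The staging_* table names are hypothetical, and the sketch assumes the update and delete files have already been bulk-loaded into them:

-- RF1: insert the new orders and their lineitems from staging
INSERT INTO orders   SELECT * FROM staging_orders;
INSERT INTO lineitem SELECT * FROM staging_lineitem;
COMMIT WORK;

-- RF2: delete by the orderkeys listed in the delete file
DELETE FROM lineitem 
  WHERE l_orderkey IN (SELECT d_orderkey FROM staging_delete);
DELETE FROM orders 
  WHERE o_orderkey IN (SELECT d_orderkey FROM staging_delete);
COMMIT WORK;

The staging step adds an extra copy of the data; reading the update files directly as file tables, as below, avoids it.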

In Virtuoso, the refreshes are implemented as SQL procedures that read file tables, as follows:

CREATE PROCEDURE rf1 
  ( IN  dir    VARCHAR     , 
    IN  nth    INT         ,
    IN  no_pk  INT := 0    , 
    IN  rb     INT := 0    , 
    IN  qp     INT := NULL 
  )
  {
    INSERT INTO orders
      SELECT * 
      FROM orders_f 
      TABLE OPTION 
        ( FROM sprintf ('%s/orders.tbl.u%d', dir, nth)
        )
    ;
    INSERT INTO lineitem 
      SELECT * 
      FROM lineitem_f 
      TABLE OPTION 
        ( FROM sprintf ('%s/lineitem.tbl.u%d', dir, nth)
        )
    ;
    COMMIT WORK
    ;
  }


CREATE PROCEDURE del_batch 
  ( IN  d_orderkey  INT
  )
  {
    VECTORED;
    DELETE 
      FROM  lineitem 
      WHERE l_orderkey = d_orderkey 
    ;
    DELETE 
      FROM  orders 
      WHERE o_orderkey = d_orderkey
    ;
  }

CREATE PROCEDURE rf2 
  ( IN  dir  VARCHAR , 
    IN  nth  INT
  )
  {
    DECLARE  cnt INT 
    ;
    cnt := 
      ( SELECT COUNT (del_batch (d_orderkey) ) 
          FROM delete_f 
          TABLE OPTION 
            ( FROM sprintf ( '%s/delete.%d', dir, nth ) )
      )
    ;
    COMMIT WORK
    ;
    RETURN cnt
    ;
  }

Things can hardly be simpler. The delete uses a vectored stored procedure to remove a batch of rows in one statement. Parallelization is automatic: the file table being read is divided into equal-size chunks.

The performance is as follows. These refreshes are run on the database right after the bulk load.

-- Line 33: rf1 ('/1s1/tpch100src', 1)
Done. -- 3909 msec.
-- Line 34: rf2 ('/1s1/tpch100src', 1)
Done. -- 863 msec.
-- Line 36: rf1 ('/1s1/tpch100src', 2)
Done. -- 2269 msec.
-- Line 37: rf2 ('/1s1/tpch100src', 2)
Done. -- 871 msec.
-- Line 40: rf1 ('/1s1/tpch100src', 3)
Done. -- 2315 msec.
-- Line 41: rf2 ('/1s1/tpch100src', 3)
Done. -- 906 msec.
-- Line 43: rf1 ('/1s1/tpch100src', 4)
Done. -- 2337 msec.
-- Line 44: rf2 ('/1s1/tpch100src', 4)
Done. -- 913 msec.
-- Line 46: rf1 ('/1s1/tpch100src', 5)
Done. -- 2429 msec.
-- Line 47: rf2 ('/1s1/tpch100src', 5)
Done. -- 1970 msec.
-- Line 49: rf1 ('/1s1/tpch100src', 6)
Done. -- 2467 msec.
-- Line 50: rf2 ('/1s1/tpch100src', 6)
Done. -- 888 msec.

The performance is very competitive: after the first (cold) run, each RF1 completes in about 2.3-2.5s and each RF2 in about 0.9-2s. Some improvement remains possible, but load and refresh are already strong.

In future installments we will look at some queries and explain how to interpret query plans and profiles.

In Hoc Signo Vinces Series