Step 8: (Optional) Generate Custom Data Files

The example database provided with Vertica includes a sample data generator program that produces output files whose names correspond to the tables in the logical schema. Each data generator has a similar set of input parameters that allow you to specify the number of rows of data to generate for any subset of the tables. To see a detailed list of the parameters for any example database, examine the README file in the example database directory.

Tip: You can repeat the tutorial using custom data files to test larger data sizes.

Syntax

 

./example_gen [ --files files ]
              [ --seed seed ]
              [ --time_file path ]
              [ --fact_table_name rows ]
              [ --dimension_table_name rows ] ...

Parameters

vmart_gen

The VMart file generator.

--files files

Splits the fact table data into the specified number of files. By default, the data generator produces a single, unnumbered fact table data file. If you specify a value of two (2) or more, the data generator numbers the files by appending an underscore character (_) and three digits to the file name, starting at _001. For example:

./vmart_gen --files 3

produces:

VMart_Fact_001.tbl
VMart_Fact_002.tbl
VMart_Fact_003.tbl

Default: 1
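The numbering rule described above can be sketched as a small shell function that lists the file names to expect for a given --files value. This is only an illustration: the function name expected_fact_files is hypothetical, and the base name VMart_Fact and the .tbl extension are assumptions taken from the example output, so the actual names depend on the fact table being generated.

```shell
# Print the fact table file names expected for a given --files value.
# Assumption: base name "VMart_Fact" and extension ".tbl" come from the
# example output above; real names depend on the fact table generated.
expected_fact_files() {
    count="${1:-1}"            # value passed to --files (default 1)
    base="${2:-VMart_Fact}"    # assumed base name of the fact table file

    if [ "$count" -lt 2 ]; then
        # Default case: a single, unnumbered file.
        echo "${base}.tbl"
        return 0
    fi

    i=1
    while [ "$i" -le "$count" ]; do
        # Append an underscore and three zero-padded digits, starting at _001.
        printf '%s_%03d.tbl\n' "$base" "$i"
        i=$((i + 1))
    done
}

expected_fact_files 3
```

A list like this can be compared against the generator's output directory to confirm that every expected file was written.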

--seed seed

The seed for the pseudo-random number generator. If you use the same seed each time you run the data generator, it produces the same data files, barring external factors. For example: --seed 9999

Default: 20177
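One way to confirm that a fixed seed gives repeatable output is to run the generator twice in separate scratch directories and compare the results. The sketch below is a rough illustration, not part of the example database: the function name check_reproducible is hypothetical, and it assumes the generator writes its data files to the current working directory, which may not hold for every setup.

```shell
# Run a command twice in two scratch directories and compare the files
# each run leaves behind.
# Returns 0 if both runs produce identical files, 1 if they differ,
# and 2 if the command itself fails.
# Assumption: the command writes its output to the current directory.
check_reproducible() {
    dir_a="$(mktemp -d)" || return 2
    dir_b="$(mktemp -d)" || { rm -rf "$dir_a"; return 2; }
    rc=0

    if ( cd "$dir_a" && "$@" ) >/dev/null 2>&1 &&
       ( cd "$dir_b" && "$@" ) >/dev/null 2>&1; then
        # Both runs succeeded: compare the two directories recursively.
        diff -r "$dir_a" "$dir_b" >/dev/null || rc=1
    else
        rc=2
    fi

    rm -rf "$dir_a" "$dir_b"
    return "$rc"
}

# Example call (paths are taken from the Examples section; use absolute
# paths because each run happens in a temporary directory):
#   check_reproducible /home/dbadmin/Vmart_Schema/examples/vmart_gen \
#       --seed 9999 \
#       --time_file /home/dbadmin/Vmart_Schema/examples/Time.txt
```

Because each run generates the full data set, keep the row counts small when using a check like this.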

--time_file path

The path name of the pre-computed time data input file used to generate the Date_Dimension table.

Default: ./Time.txt

A time file is provided with each example database. The date range it covers may vary; for example, 2012-2016.
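Because the default path is relative to the current directory, running the generator from another location requires passing --time_file explicitly. A minimal sketch of a pre-flight check follows; the function name resolve_time_file is hypothetical and is not part of the example database.

```shell
# Resolve the time file to an absolute path, or fail with a message if it
# is missing or unreadable. With no argument, the documented default
# ./Time.txt is checked.
resolve_time_file() {
    candidate="${1:-./Time.txt}"

    if [ -r "$candidate" ]; then
        # Convert the containing directory to an absolute path.
        dir="$(cd "$(dirname "$candidate")" && pwd)"
        echo "$dir/$(basename "$candidate")"
    else
        echo "time file not found or not readable: $candidate" >&2
        return 1
    fi
}

# Example call, assuming vmart_gen is in the current directory:
#   time_file="$(resolve_time_file)" && ./vmart_gen --time_file "$time_file"
```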

--fact_table_name rows

The name of the fact table in vmart_gen, followed by the number of rows of data to generate for that table.

Default: 5,000,000 (five million)

--dimension_table_name rows

The name of a dimension table in vmart_gen (other than the Date_Dimension table), followed by the number of rows of data to generate for that dimension table.

Examples

./vmart_gen
 
./vmart_gen --files 3
 
/home/dbadmin/Vmart_Schema/examples/vmart_gen \
--seed 9999 \
--time_file /home/dbadmin/Vmart_Schema/examples/Time.txt \
--inventory_fact 100000 \
--customer_dimension 500 \
--date_dimension 500 \
--employee_dimension 50 \
--product_dimension 500 \
--promotion_dimension 500 \
--shipping_dimension 500 \
--vendor_dimension 500 \
--warehouse_dimension 500 \
--promotion_dimension 100