Step 8: (Optional) Generate Custom Data Files
The example database provided with Vertica includes a sample data generator program that produces output files whose names correspond to the tables in the logical schema. Each data generator has a similar set of input parameters that allow you to specify the number of rows of data to generate for any subset of the tables. To see a detailed list of the parameters for any example database, examine the README file in the example database directory.
Tip: You can repeat the tutorial using custom data files to test larger data sizes.
Syntax
./example_gen [ --files files ] [ --seed seed ] [ --time_file path ] [ --fact_table_name rows ] [ --dimension_table_name rows ] ...
Parameters
| Parameter | Description |
| --- | --- |
| vmart_gen | The VMart file generator. |
| --files files | Splits the fact table data into the specified number of files. By default, the data generator produces a single, unnumbered fact table data file. If you specify a value of two (2) or more, the data generator numbers the files by appending an underscore character (_) and three digits to the file name, starting at _001. For example, ./vmart_gen --files 3 produces VMart_Fact_001.tbl, VMart_Fact_002.tbl, and VMart_Fact_003.tbl. Default: 1 |
| --seed seed | The seed for the pseudo-random number generator. If you use the same seed each time you run the data generator, you get the same data files (excluding external factors). Default: 20177 |
| --time_file path | The path name of the pre-computed time data input file. Default: the time data file provided with each example database; its date range may vary, for example 2012-2016. |
| --fact_table_name rows | The name of the fact table in vmart_gen, followed by the number of rows of data to generate for the fact table. Default: 5,000,000 (five million) |
| --dimension_table_name rows | The name of a dimension table in vmart_gen (other than the …), followed by the number of rows of data to generate for that dimension table. |
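The file numbering rule described for the files parameter can be sketched as a small shell function. This is only an illustration of the naming scheme: the helper name expected_fact_files is hypothetical, the base name VMart_Fact and the .tbl extension come from the example above, and the unnumbered single-file name (base name plus .tbl) is an assumption.

```shell
#!/bin/sh
# expected_fact_files BASE [COUNT]
# Hypothetical helper: prints the fact table file names implied by the
# numbering rule for --files. It does not call the data generator.
# Assumption: a single, unnumbered file is named BASE.tbl.
expected_fact_files() {
    base=$1
    count=${2:-1}

    # One file (the default): no numeric suffix.
    if [ "$count" -lt 2 ]; then
        printf '%s.tbl\n' "$base"
        return 0
    fi

    # Two or more files: append _001, _002, ... (three digits).
    i=1
    while [ "$i" -le "$count" ]; do
        printf '%s_%03d.tbl\n' "$base" "$i"
        i=$((i + 1))
    done
}
```

For example, expected_fact_files VMart_Fact 3 lists the three numbered names, which can then be compared against the files the generator actually wrote.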
Notes
- The number of rows in Date_Dimension tables is determined by the time data input file supplied with the example database.
- If you are using multiple fact table data files, make sure that your fact table load script(s) contain the correct file names as described in Using Load Scripts.
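One way to check the second note is to search a load script for each generated file name. The following is a minimal sketch, not part of the example database: the function name unreferenced_files is hypothetical, and it assumes only that the load script is a text file that mentions each data file by name.

```shell
#!/bin/sh
# unreferenced_files SCRIPT FILE...
# Hypothetical helper: prints every FILE name that does not appear
# anywhere in the text of SCRIPT. No output means every name was found.
unreferenced_files() {
    script=$1
    shift
    for name in "$@"; do
        # -F matches the name literally; -q suppresses grep's own output.
        if ! grep -qF -- "$name" "$script"; then
            printf '%s\n' "$name"
        fi
    done
}
```

A call such as unreferenced_files load_script.sql VMart_Fact_001.tbl VMart_Fact_002.tbl (the script name here is a placeholder) prints the names that still need to be added to the load script.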
Examples
./vmart_gen

./vmart_gen --files 3

/home/dbadmin/Vmart_Schema/examples/vmart_gen \
  --seed 9999 \
  --time_file /home/dbadmin/Vmart_Schema/examples/Time.txt \
  --inventory_fact 100000 \
  --customer_dimension 500 \
  --date_dimension 500 \
  --employee_dimension 50 \
  --product_dimension 500 \
  --promotion_dimension 500 \
  --shipping_dimension 500 \
  --vendor_dimension 500 \
  --warehouse_dimension 500 \
  --promotion_dimension 100
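To confirm that a fixed seed reproduces the same data files, you can run the generator twice and compare the results. The sketch below is an illustration only: check_seed_repeatable is a hypothetical helper, and it assumes that the generator writes its output files into the current working directory. Because each run happens in a temporary directory, pass the generator and any --time_file value as absolute paths.

```shell
#!/bin/sh
# check_seed_repeatable GEN SEED [ARGS...]
# Hypothetical helper: runs GEN twice with the same --seed in two empty
# temporary directories and compares everything written there.
# Returns 0 if the outputs are identical, 1 if they differ,
# and 2 if a run could not be completed.
# Assumption: GEN writes its data files into the current directory.
check_seed_repeatable() {
    gen=$1
    seed=$2
    shift 2

    run_a=$(mktemp -d) || return 2
    run_b=$(mktemp -d) || return 2

    (cd "$run_a" && "$gen" --seed "$seed" "$@") >/dev/null || return 2
    (cd "$run_b" && "$gen" --seed "$seed" "$@") >/dev/null || return 2

    # diff -r compares the directories recursively; -q reports only
    # whether files differ, not how.
    if diff -rq "$run_a" "$run_b" >/dev/null; then
        status=0
    else
        status=1
    fi

    rm -rf "$run_a" "$run_b"
    return "$status"
}
```

For example, check_seed_repeatable /home/dbadmin/Vmart_Schema/examples/vmart_gen 9999 --time_file /home/dbadmin/Vmart_Schema/examples/Time.txt reuses the paths from the example above. Generating the default row counts twice can take a while, so smaller row counts may be preferable for this check.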