
• Use the SKIP keyword with COPY.
• Use the head or tail Linux command.
Let’s see how this works.
The Data
Here’s a small example data set, a csv file named, students.csv, that contains the following information for 20 students:Smith, John, 1977, 1981, Bowers Hall
Brown, Tim, 1978, 1982, Blue Hall
Howe, Mary, 1976, 1980, White Hall
Kelly, Sarah, 1977, 1981, Blue Hall
Jones, William, 1977, 1981, Johnson Hall
Brady, Mark, 1976, 1980, Madison Hall
Black, Howard, 1975, 1979, Bowers Hall
King, Martha, 1976, 1980, White Hall
McCoy, Keith, 1978, 1982, Madison Hall
McDonald, Susan, 1977, 1981, Johnson Hall
Williams, Joe, 1975, 1979, Bowers Hall
Johnson, Julie, 1978, 1982, White Hall
Barry, Polly, 1976, 1980, Blue Hall
Wilson, Martin, 1975, 1979, Johnson Hall
Adams, Harry, 1976, 1980, White Hall
Ball, Terry, 1977, 1981, Bowers Hall
Chan, Ty, 1978, 1982, Johnson Hall
David, Donald, 1975, 1979, Madison Hall
Emory, Fletcher, 1976, 1980, White Hall
Flaherty, Finbarr, 1977 1981, Blue Hall
To load this data into Vertica, define a table as follows:
=> CREATE TABLE students (
Last VARCHAR(64),
First VARCHAR(64),
EnrollYear INT,
GradYear INT,
Dorm VARCHAR(64)
);
Now you can load the data from the .csv file into the students table:
=> COPY students FROM '/home/dbadmin/students.csv' DELIMITER ',';
Rows loaded
-------------
20
(1 row)
=> SELECT * FROM students;
Last | First | EnrollYear | GradYear | Dorm
----------+-----------+------------+----------+---------------
Howe | Mary | 1976 | 1980 | White Hall
King | Martha | 1976 | 1980 | White Hall
McCoy | Keith | 1978 | 1982 | Madison Hall
Williams | Joe | 1975 | 1979 | Bowers Hall
Barry | Polly | 1976 | 1980 | Blue Hall
Emory | Fletcher | 1976 | 1980 | White Hall
Flaherty | Finbarr | 1977 | 1981 | Blue Hall
Jones | William | 1977 | 1981 | Johnson Hall
Brady | Mark | 1976 | 1980 | Madison Hall
Black | Howard | 1975 | 1979 | Bowers Hall
Wilson | Martin | 1975 | 1979 | Johnson Hall
David | Donald | 1975 | 1979 | Madison Hall
Smith | John | 1977 | 1981 | Bowers Hall
Brown | Tim | 1978 | 1982 | Blue Hall
Kelly | Sarah | 1977 | 1981 | Blue Hall
McDonald | Susan | 1977 | 1981 | Johnson Hall
Johnson | Julie | 1978 | 1982 | White Hall
Adams | Harry | 1976 | 1980 | White Hall
Ball | Terry | 1977 | 1981 | Bowers Hall
Chan | Ty | 1978 | 1982 | Johnson Hall
(20 rows)
Using COPY to Filter Rows
Let’s create a new table and copy the same table except for the first 10 entries. To omit the first 10 entries in the CSV file, use the SKIP keyword. We’ll name the new table students_last_ten:=> CREATE TABLE students_last_ten (
Last VARCHAR(64),
First VARCHAR(64),
EnrollYear INT,
GradYear INT,
Dorm VARCHAR(64)
);
=> COPY students_last_ten FROM '/home/dbadmin/students.csv' DELIMITER ',' SKIP 10;
Rows Loaded
-------------
10
(1 row)
=> SELECT * FROM students_last_ten;
Last | First | EnrollYear | GradYear | Dorm
----------+-----------+------------+----------+-------------
Williams | Joe | 1975 | 1979 | Bowers Hall
Barry | Polly | 1976 | 1980 | Blue Hall
Emory | Fletcher | 1976 | 1980 | White Hall
Flaherty | Finbarr | 1977 | 1981 | Blue Hall
Johnson | Julie | 1978 | 1982 | White Hall
Adams | Harry | 1976 | 1980 | White Hall
Ball | Terry | 1977 | 1981 | Bowers Hall
Chan | Ty | 1978 | 1982 | Johnson Hall
Wilson | Martin | 1975 | 1979 | Johnson Hall
David | Donald | 1975 | 1979 | Madison Hall
(10 rows)
Using Linux to Filter Rows
You can also use the Linux head and tail commands to write certain rows from the top or the bottom of a file to a Vertica table. For this next example, we’ll load the first 10 rows from the student data.=> CREATE TABLE students_last_ten (
Last VARCHAR(64),
First VARCHAR(64),
EnrollYear INT,
GradYear INT,
Dorm VARCHAR(64)
);
CREATE TABLE
=> \q
Let’s run the tail command to verify that it gives us the last 10 rows. Note that running the head command would give us the first 10 rows.
$ tail -10 /home/dbadmin/students.csv
Williams, Joe, 1975, 1979, Bowers Hall
Johnson, Julie, 1978, 1982, White Hall
Barry, Polly, 1976, 1980, Blue Hall
Wilson, Martin, 1975, 1979, Johnson Hall
Adams, Harry, 1976, 1980, White Hall
Ball, Terry, 1977, 1981, Bowers Hall
Chan, Ty, 1978, 1982, Johnson Hall
David, Donald, 1975, 1979, Madison Hall
Emory, Fletcher, 1976, 1980, White Hall
Flaherty, Finbarr, 1977, 1981, Blue Hall
Now we’ll rerun the tail command and pipe that output into the Vertica table we created.
$ head -10 /home/dbadmin/students.csv | vsql -d vmart -U dbadmin -c "COPY students_last_ten FROM LOCAL STDIN DELIMITER ',' ;"
Rows Loaded
-------------
10
(1 row)
Let’s make sure the data was actually loaded into the student_last_ten table.
$ vsql
=> SELECT * FROM students_last_ten;
Last | First | EnrollYear | GradYear | Dorm
----------+----------+------------+----------+---------------
Black | Howard | 1975 | 1979 | Bowers Hall
Brady | Mark | 1976 | 1980 | Madison Hall
Jones | William | 1977 | 1981 | Johnson Hall
Brown | Tim | 1978 | 1982 | Blue Hall
Kelly | Sarah | 1977 | 1981 | Blue Hall
McDonald | Susan | 1977 | 1981 | Johnson Hall
Smith | John | 1977 | 1981 | Bowers Hall
Howe | Mary | 1976 | 1980 | White Hall
King | Martha | 1976 | 1980 | White Hall
McCoy | Keith | 1978 | 1982 | Madison Hall
(10 rows)
For More Information
• COPY Parameters in the Vertica Documentation• head Man Page
• tail Man Page