Vertica Blog

Best Practices

Self-Descriptive Constraint Names: Quick Tip

Jim Knicely authored this tip. Constraints set rules on what data is allowed in table columns and help maintain data integrity. PRIMARY KEY, REFERENCES (foreign key), CHECK, and UNIQUE constraints must be named. If you omit a name, Vertica automatically assigns one. Those Vertica-created constraint names aren’t very descriptive. It’s a good idea to...

Fast Data Loading with Vertica

Curtis Bennett authored this post. Vertica is well known for its blinding query performance at big data scale, but it can also insert data at very high rates. It can even load data non-stop while being queried, thus enabling real-time analysis of data. Basic Loading Methods: There are many ways of loading data...

Data Preparation Tools – Technical Brief

Curtis Bennett authored this blog. Vertica supports a number of industry-standard data preparation tools for use in the data science life-cycle. In addition to the functions described here, Vertica has a wide array of analytic capabilities that can be leveraged for additional data preparation, including time-series analytics (with missing value imputation), analytic windowing and...

Master Blog Series: Getting Started with Vertica

This post was authored by Soniya Shah. Are you a new Vertica user? If so, you're probably wondering where to start. We're here to help you on your big data analytics journey, from understanding Vertica terminology to making the most of your resources. If you find yourself asking questions like What does the Tuple Mover...

Load Balancing Options

This blog post was authored by Soniya Shah. Connection load balancing automatically spreads the overhead of client connections across the cluster by redirecting connections. Each client connection to a host in your Vertica cluster requires memory and processor time. If a lot of clients connect to a single host, this can affect database performance. The initiator...

Handling Duplicate Records in Input Data Streams

This blog post was authored by Ravi Gupta. We have often found that sources or operational systems that provide data for further analysis have duplicate records and these are sent to a downstream application or EDW for processing. This post shows a few scenarios of how to handle these duplicate records using various SQL options,...

Resource Management

This blog post was authored by Soniya Shah. A Vertica database runs on a cluster of hardware. All loads and queries running against the database take up system resources, such as CPU, memory, disk I/O, bandwidth, file handles, and more. Query performance depends on how many resources are allocated to it. In a single-user environment,...

Dynamic Row and Column Access Policies

This blog post was authored by Serge Bonte. Vertica’s row and column access policies can be used to provide extra security on data in your tables. These policies are well covered in Best Practices for Creating Access Policies in Vertica and Dynamic Row and Column Access Policies. In this blog, we will explore how dynamic...

Improving Performance and Memory Acquisitions for Vertica Queries

This blog post was authored by Shrirang Kamat. The following design considerations will help you improve the performance and memory usage of your Vertica queries. When creating table definitions, you should carefully choose the size of the lookup column based on your data. Properly sizing your column based on your data will help to improve performance....