Vertica Blog

Concurrency and Workload Management

This blog post was authored by Soniya Shah. Vertica workloads range from simple primary key lookups to analytical queries that include several large tables and joins. Different types of load jobs must keep the data updated. Vertica has a mixed-workload management capability that is easy to use. Vertica can process queries both concurrently and in...

How to Set Vertica in Read-Only

This blog post was authored by Soniya Shah. You probably know that you can create READ ONLY users in Vertica. These users can view everything within a schema, but don’t have the proper permissions to change anything within the database. This is useful for sets of users that don’t need as many permissions or for...

DataGals Hosts a Year Up Event

This blog post was authored by Soniya Shah. This week, the DataGals hosted an event to raise awareness for Year Up. Year Up provides young adults, aged 18-24, with the skills and experience they need to succeed in the professional workplace. Here at Vertica, we are proud supporters of the Year Up program. We have...

Understanding Users, Privileges, and Roles

This blog post was authored by Soniya Shah. Every Vertica database has one or more users. When users connect to the database, they log in with credentials that a superuser defines. Database users should only have access to the database resources they need to perform their tasks. To navigate these necessities, Vertica has designated users,...

In-Database Approximate Median and Percentile Functions

This blog post was authored by Ginger Ni. Median and percentile functions are commonly used statistical functions. They are also used in other sophisticated data analysis algorithms, such as the robust z-Score normalization function. Vertica has exact MEDIAN and PERCENTILE_CONT functions, but these functions do not scale well for extremely large data sets, because...

Introducing the Vertica Test Drive for Clickstream Analytics

This blog post was authored by Soniya Shah. Recently, Vertica engineers introduced the new Vertica for Clickstream Analytics test drive on AWS. If you’re a Vertica user, you might be familiar with our test drives on AWS – both run on SQL on Hadoop. One uses MapR and the other uses Hortonworks. With this test...

Getting Rid of Range Joins

This blog post was authored by Soniya Shah. You can use range joins to categorize data into buckets. Vertica provides performance optimizations for <, <=, >, >=, and BETWEEN predicates. These optimizations are particularly useful when a column from one table is restricted to be in a range specified by two columns of another table. Range joins can...

New Uses for Directed Queries

Directed queries were introduced in Vertica 7.2. They were originally designed to achieve two goals:
• Preserve current query plans before a scheduled upgrade.
• Enable you to create query plans that improve optimizer performance.
Since their introduction, users have found new and compelling ways to use directed queries, notably using them to substitute one...

What’s New in Vertica 8.1: Flex Tables Enhancements

This blog post was authored by Soniya Shah. As of Vertica 8.1, you can execute CTAS statements to create flex tables. CREATE TABLE AS (CTAS) statement: Previously, Vertica supported creating tables using the AS SELECT clause. Frequently called CTAS, this SQL statement lets you create a new table that contains the results from querying another...

Query Optimization Using Projections

In Vertica, tables are logical representations of the data. Vertica stores the actual data in projections. When data is loaded into a Vertica table, Vertica creates or updates a column-store projection. Vertica also compresses and/or encodes projection data, optimizing data access and storage. If you experience performance issues, your best first step is to run...

Machine Learning Mondays: How Vertica Implements Efficient and Scalable Machine Learning

This blog post was authored by Vincent Xu. As of Vertica 8.1, Vertica has introduced a set of popular machine learning algorithms, including Linear Regression, Logistic Regression, Kmeans, Naïve Bayes, and SVM. Based on our recent benchmarks, they run faster than MLlib on Apache Spark. The following chart shows the performance difference between Vertica 8.1.0...

Big Flat Fact Tables

This blog post was authored by Steve Sarsfield. For decades, it's been widely accepted that snowflake and star schemas facilitate getting optimal performance from your data warehouse. You normalize data by identifying the rows of data that you typically ingest, and creating a schema that is optimized for the types of queries you want to...