Vertica Blog

Soniya Shah

Information Developer

Currently a first-year law student with a background in science and technology. Experienced technical writer specializing in software documentation, big data, blog development, and website development. I build user-centered content that communicates complex technical information more easily.

I worked at Vertica full time for about 3 years, and I still work there part time while attending law school.

Update: Soniya is now doing her law internship, and no longer working at Vertica. Good luck, Soniya!

Concurrency and Workload Management

This blog post was authored by Soniya Shah. Vertica workloads range from simple primary key lookups to analytical queries that include several large tables and joins. Different types of load jobs must keep the data updated. Vertica has a mixed-workload management capability that is easy to use. Vertica can process queries both concurrently and in...

How to Set Vertica in Read-Only

This blog post was authored by Soniya Shah. You probably know that you can create READ ONLY users in Vertica. These users can view everything within a schema, but don’t have the proper permissions to change anything within the database. This is useful for sets of users that don’t need as many permissions or for...

DataGals Hosts a Year Up Event

This blog post was authored by Soniya Shah. This week, the DataGals hosted an event to raise awareness for Year Up. Year Up provides young adults, aged 18-24, with the skills and experience they need to succeed in the professional workplace. Here at Vertica, we are proud supporters of the Year Up program. We have...

Understanding Users, Privileges, and Roles

This blog post was authored by Soniya Shah. Every Vertica database has one or more users. When users connect to the database, they log in with credentials that a superuser defines. Database users should only have access to the database resources they need to perform their tasks. To navigate these necessities, Vertica has designated users,...

In-Database Approximate Median and Percentile Functions

This blog post was authored by Ginger Ni. Median and percentile functions are commonly used data statistic functions. They are also used in other sophisticated data analysis algorithms, such as the robust z-Score normalization function. Vertica has exact MEDIAN and PERCENTILE_CONT functions, but these functions do not scale well for extremely large data sets, because...

Introducing the Vertica Test Drive for Clickstream Analytics

This blog post was authored by Soniya Shah. Recently, Vertica engineers introduced the new Vertica for Clickstream Analytics test drive on AWS. If you’re a Vertica user, you might be familiar with our test drives on AWS – both run on SQL on Hadoop. One uses MapR and the other uses Hortonworks. With this test...

Getting Rid of Range Joins

This blog post was authored by Soniya Shah. You can use range joins to categorize data into buckets. Vertica provides performance optimizations for <, <=, >, >=, and BETWEEN predicates. These optimizations are particularly useful when a column from one table is restricted to be in a range specified by two columns of another table. Range joins can...

New Uses for Directed Queries

Directed queries were introduced in Vertica 7.2. They were originally designed to achieve two goals: to preserve current query plans before a scheduled upgrade, and to enable you to create query plans that improve optimizer performance. Since their introduction, users have found new and compelling ways to use directed queries, notably using them to substitute one...

What’s New in Vertica 8.1: Flex Tables Enhancements

This blog post was authored by Soniya Shah. As of Vertica 8.1, you can execute CTAS statements to create flex tables. CREATE TABLE AS (CTAS) statement: Previously, Vertica supported creating tables using the AS SELECT clause. Frequently called CTAS, this SQL statement lets you create a new table that contains the results from querying another...

Machine Learning Mondays: How Vertica Implements Efficient and Scalable Machine Learning

This blog post was authored by Vincent Xu. As of Vertica 8.1, Vertica has introduced a set of popular machine learning algorithms, including Linear Regression, Logistic Regression, Kmeans, Naïve Bayes, and SVM. Based on our recent benchmarks, they run faster than MLlib on Apache Spark. The following chart shows the performance difference between Vertica 8.1.0...

Big Flat Fact Tables

This blog post was authored by Steve Sarsfield. For decades, it's been widely accepted that snowflake and star schemas facilitate getting optimal performance from your data warehouse. You normalize data by identifying the rows of data that you typically ingest, and creating a schema that is optimized for the types of queries you want to...

Using Vertica and HyperLogLog

This is a guest blog post co-authored by Francois Jehl and Pawel Szostek. Francois is the lead of the Analytics Data Storage team at Criteo; Pawel is a software engineer in the Analytics Data Storage team at Criteo. Criteo is the global leader in digital performance advertising with 900B ads served in 2016. The R&D...