Vertica Blog

Simulate NULLS FIRST and NULLS LAST in the ORDER BY Clause

When your query contains the ORDER BY clause to sort the result set, alphanumeric NULL data will sort to the bottom if the sort is in ascending order (ASC) and to the top if the sort is in descending order (DESC), while integer NULL data does the opposite. Example: dbadmin=> \d test List of Fields […]
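One generic way to control where NULLs land, whatever the default ordering is, is to separate the NULLs from the other values and place them explicitly. The Python sketch below models that idea with `None` standing in for NULL. The function name `sort_with_nulls` and its parameters are illustrative assumptions, not Vertica syntax.

```python
def sort_with_nulls(values, descending=False, nulls_last=True):
    """Sort values, placing None (standing in for SQL NULL) first or last."""
    # Sort only the non-NULL values; NULLs are positioned explicitly below.
    non_null = sorted((v for v in values if v is not None), reverse=descending)
    nulls = [v for v in values if v is None]
    return non_null + nulls if nulls_last else nulls + non_null


rows = ["b", None, "a", None, "c"]
print(sort_with_nulls(rows))                                      # NULLs at the end
print(sort_with_nulls(rows, descending=True, nulls_last=False))   # NULLs at the start
```

Because the NULL placement is decided separately from the sort direction, the same call works for both alphanumeric and integer data.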

In Loving Memory of Phil Molea

Phil worked for Vertica as an Information Developer for close to five years. He was a very important part of our Vertica team and part of our Vertica family. Phil enjoyed working in the Vertica community, the base product documentation, as well as with our Technology Partners. He was very well liked and respected […]

Vertica Logos

Find the Version of Vertica that Created a Database: Quick Tip

Jim Knicely authored this tip. You can run the VERSION() function as one method of displaying the current version of Vertica. Example: dbadmin=> SELECT version(); version ———————————— Vertica Analytic Database v9.1.1-4 (1 row) But what if you want to know the version of Vertica running when you created the current database? For that info you […]

Rejected Data Table Row Number: Quick Tip

Jim Knicely authored this tip. When running a COPY command, using the REJECTED DATA parameter with the AS TABLE clause saves rejected data into a table. The rejected data table includes an informative column called ROW_NUMBER where its value indicates the rejected row number from the input file. Be aware that when a COPY encounters […]

Reload Data from a Rejected Data Table: Quick Tip

Jim Knicely authored this tip. When running a COPY command, using the REJECTED DATA parameter with the AS TABLE clause, will save rejected data into a table. If you realize there is a modification to the COPY command that will allow those rejected records to load successfully, you can re-run the updated COPY command against […]

Handling Cast Conversion Load Errors: Quick Tip

Jim Knicely authored this tip. The nifty cast ::! returns all cast failures as NULL instead of generating an error if the data type cannot be coerced. This cast feature, combined with the FILLER option of the COPY command, is very useful for loading data when data types aren’t playing nice. Example: dbadmin=> CREATE […]

Return All Cast Failures as NULL: Quick Tip

Jim Knicely authored this post. When you invoke data type coercion (casting) by an explicit cast and the cast fails, the result returns either an error or NULL. Cast failures commonly occur when you attempt to cast conflicting conversions, such as trying to convert a varchar expression that contains letters to an integer. However, using […]

Microsoft Power BI: Latest Release Enhances Connection to Vertica

Kathy Taylor authored this post. We are excited to announce the new Vertica connector introduced in the October 2018 release of Microsoft Power BI: • With PowerBI Desktop, Vertica is now fully supported using DirectQuery Mode (push-down optimization). • With Power BI Service (Cloud Offering – Saas), DirectQuery mode is now supported with Vertica using […]

Display Canceled Queries: Quick Tip

Jim Knicely authored this tip. We can cancel a long running query in vsql by typing CTRL+C. The data collector table DC_CANCELS tracks queries that were stopped in this manner. Example: dbadmin=> SELECT table_name, component, description dbadmin-> FROM data_collector dbadmin-> WHERE component = ‘Cancels’; table_name | component | description ———–+———–+—————— dc_cancels | Cancels | Canceled […]

Calculate Request Queue Length: Quick Tip

Jim Knicely authored this post. The RESOURCE_ACQUISITIONS system table retains information about resources (memory, open file handles, threads) acquired by each running request. Each request is uniquely identified by its transaction and statement IDs within a given session. From this system table, you can calculate how long a request was queued in a resource pool […]

Concatenate non-NULL Values from a Group: Quick Tip

Jim Knicely authored this post. Vertica 9.1.1-4 introduces an extremely useful aggregate function named LISTAGG, which returns a string with concatenated non-NULL values from a group. Example: dbadmin=> SELECT * FROM test ORDER BY group_id; group_id | name ———-+——— 1 | ANDRIUS 1 | DAVE 1 | JIM 1 | KRISTEN 2 | BRYAN 2 […]
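The behavior described above (concatenating the non-NULL values of each group into one string) can be modeled in a few lines of Python. This is only a sketch of the semantics, not how Vertica implements LISTAGG. The comma separator, the empty string for an all-NULL group, and the extra `None` row are assumptions added for illustration.

```python
from collections import defaultdict


def listagg(rows, separator=","):
    """Concatenate the non-None names of each group_id, keeping input order."""
    groups = defaultdict(list)
    for group_id, name in rows:
        names = groups[group_id]      # registers the group even if all values are NULL
        if name is not None:          # NULL values are skipped
            names.append(name)
    return {gid: separator.join(names) for gid, names in groups.items()}


# The None row is hypothetical; it shows that NULLs do not appear in the result.
rows = [(1, "ANDRIUS"), (1, "DAVE"), (1, None), (1, "JIM"), (2, "BRYAN")]
print(listagg(rows))
```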

Simplify String Literals with Dollar-Quoted String Literals: Quick Tip

Jim Knicely authored this post. The standard syntax for specifying string literals can be difficult to understand. To allow more readable queries in such situations, Vertica SQL provides dollar quoting. Dollar quoting is not part of the SQL standard, but it is often a more convenient way to write complicated string literals than the standard-compliant […]

Master Blog Series: Vertica Database Administrators

This blog post was authored by Soniya Shah. Are you a database administrator looking for ways to get the most from your Vertica database? If so, this post is for you. You’re already familiar with the technicalities of Vertica – the Tuple Mover, deletes, projections, and more. If you’re looking to get started, check out […]

Re-Compute a Table Column’s Default Value Immediately: Quick Tip

Jim Knicely authored this tip. Vertica evaluates the DEFAULT expression and sets the column on load operations, if the operation omits a value for the column. That DEFAULT expression can be derived from another column in the same table! When you update the value in a base column, you will need to re-compute the value […]

DbVisualizer Free for Vertica Distribution Updates

Stephen Crossman authored this post. Recently, there have been some changes in how DbVisualizer Free for Vertica is distributed. Previously, there were standard DbVisualizer Free and Pro Edition distributions available on the DbVisualizer web site, and there was a special DbVisualizer Free for Vertica distribution available on the Vertica Marketplace. Now, in an effort to […]

Exiting a DbVisualizer Script Following an Error: Quick Tip

Jim Knicely authored this tip. After reading yesterday’s Vertica Quick Tip “Exiting a vsql Script Following an Error”, a client asked if the ON_ERROR_STOP variable is available in the popular third party Vertica client tool DbVisualizer. The answer to that is no, as ON_ERROR_STOP is a Vertica vsql client specific setting. However, many clients, including […]

How do you use UDx’s?

We’ve posted a new Product Management feedback survey and we’re wondering what you think about our SDK and how you use the UDx’s. We appreciate all your feedback! You can find the survey here. https://in.hotjar.com/s?siteId=438341&surveyId=109476

Database and Node Uptime: Quick Tip

Jim Knicely authored this tip. You can query the DATABASES system table to find out the last time your Vertica database started and you can get the cluster node up times by querying the NODE_STATES system table. Example: dbadmin=> SELECT database_name, start_time dbadmin-> FROM databases; database_name | start_time —————+——————————- test_db | 2018-09-06 14:33:07.301363-04 (1 row) […]

Mimicking Enumerated Types: Quick Tip

Jim Knicely authored this tip. I used to work a lot with MySQL. It had a cool data type called “Enumerated Types”. Example in MySQL: (myadmin@localhost) [jimk]> CREATE TABLE e (ecol ENUM(‘Bill’, ‘Sam’, ‘Jack’)); Query OK, 0 rows affected (0.10 sec) (myadmin@localhost) [jimk]> INSERT INTO e VALUES(‘Bill’); Query OK, 1 row affected (0.00 sec) (dbadmin@localhost) […]

Changing the Field Separator in VSQL: Quick Tip

Jim Knicely authored this tip. vsql is a character-based, interactive, front-end utility that lets you type SQL statements and see the results. It’s very common to want to export data in CSV (Comma-Separated Values) format. To do that you can change the default | (vertical bar) field separator to a comma via the fieldsep option […]

Monitoring Resource Pool Cascade Events: Quick Tip

Jim Knicely authored this tip. You can define secondary resource pools to which running queries can cascade if they exceed the initial pool’s RUNTIMECAP. The RESOURCE_POOL_MOVE System Table displays the cascade event information on each node. There you can find helpful information like the source and target pools and why the cascading event occurred! Example: […]

What to Do When the AHM (Ancient History Mark) Does Not Advance

If the AHM is not advancing, use the following checklist to troubleshoot. Step Task Result 1 Check whether the Last Good Epoch (LGE) is advancing. => SELECT CURRENT_EPOCH, LAST_GOOD_EPOCH, AHM_EPOCH FROM SYSTEM; If the LGE is advancing, go to Step 2. If the LGE is not advancing, go to Step 5. 2 Check whether all nodes are UP. => SELECT * FROM NODES WHERE NODE_STATE = 'UP'; If all nodes are UP, go to Step 3. If one or more nodes are DOWN, use the following command to bring all nodes UP. $ admintools -t restart_node -d <database name> -s <node_name> After all nodes are UP, go to Step 4. 3 Check whether any projections have not been refreshed. => SELECT PROJECTION_NAME, NODE_NAME, IS_UP_TO_DATE FROM PROJECTIONS WHERE IS_UP_TO_DATE […]

Understanding Vertica Query Budgets

This blog post was authored by Shrirang Kamat. The purpose of this document is to explain how the query budget of a resource pool used by the query can influence the initial memory acquisition for a query and how it impacts query performance. For more details about how we compute the query budget, see the […]

Understanding the APPROXIMATE_COUNT_DISTINCT Functions

This blog post was authored by Curtis Bennett. The exact computation of the number of distinct values of an expression X on a multi-node architecture requires bringing all distinct values of X (within the specified group if a GROUP BY was specified) to the same node, and then counting the number of distinct values on […]

Vertica Quick Tip: What’s the last day of the month?

This blog post was authored by Jim Knicely. The Vertica built-in LAST_DAY function returns the last day of the month for a specified date. This function comes in handy for leap years. Example: dbadmin=> SELECT last_day('02/28/2018') NOT_A_LEAP_YEAR, last_day('02/28/2020') A_LEAP_YEAR; NOT_A_LEAP_YEAR | A_LEAP_YEAR -----------------+------------- 2018-02-28 | 2020-02-29 (1 row) In one of my previous positions, an […]
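For readers who want to cross-check the leap-year behavior outside the database, the same calculation can be sketched with Python's standard library. The helper name `last_day` simply mirrors the SQL function for readability; it is not a Vertica client API.

```python
import calendar
from datetime import date


def last_day(d):
    """Return the last calendar day of the month that contains d."""
    # monthrange returns (weekday of the 1st, number of days in the month).
    _, days_in_month = calendar.monthrange(d.year, d.month)
    return d.replace(day=days_in_month)


print(last_day(date(2018, 2, 28)))  # 2018-02-28 (not a leap year)
print(last_day(date(2020, 2, 28)))  # 2020-02-29 (leap year)
```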

Vertica Quick Tip: Checking User Role Membership

This blog post was authored by Jim Knicely. The HAS_ROLE function returns a Boolean value that indicates whether a role has been assigned to a user. Example: To create a read only user and role, do the following: dbadmin=> CREATE ROLE read_only_role; CREATE ROLE dbadmin=> CREATE USER read_only; CREATE USER dbadmin=> GRANT read_only_role TO read_only; […]

Aggregate Projections

This blog post was authored by Curtis Bennett. Vertica stores physical data for tables in objects known as projections. Unlike traditional RDBMS’s, Vertica does not rely on indexes for performance. Instead, Vertica stores the physical data (either all or some of the columns) in whatever sort order is required for optimal query processing. This can […]

Vertica Quick Tip: Increasing the Performance of a Rebalance

This blog post was authored by Jim Knicely. Before performing a rebalance, Vertica by default will query system tables to compute the size of all projections involved in the rebalance task. This query can add significant overhead to the rebalance operation! To disable this query, set the configuration parameter RebalanceQueryStorageContainers to 0. Example: dbadmin=> SELECT […]

Vertica Quick Tip: Expiring a User’s Password

This blog post was authored by Jim Knicely. You can expire a user’s password immediately using the ALTER USER statement’s PASSWORD EXPIRE parameter. By expiring a password, you can: • Force users to comply with a change to password policy. • Set a new password when a user forgets the old password. This feature also […]

Vertica Quick Tip: VSQL Shortcuts to Move Faster on the Command Line

This blog post was authored by Jim Knicely. vsql is Vertica’s character-based, interactive, front-end utility that lets you type SQL statements and see the results. If you’ve typed a particularly long query in vsql and then realize that you have a typo way back at the beginning of your code (e.g., you wrote SEELECT), instead of […]

Improve the Efficiency of Mergeout on Wide Tables

This blog post was co-authored by Xiao Ling and Jim Kelley. Introduction When resource pools were first introduced to Vertica, the average computer had a lot less memory than it does today. The default memory size for the Tuple Mover resource pool, 200 MB, reflects the more limited resources of that period. As hardware and […]

Vertica Quick Tip: Renaming a View

This blog post was authored by Jim Knicely. You are probably aware that you can rename a table using the ALTER TABLE … RENAME command. Example: dbadmin=> \dt test List of tables Schema | Name | Kind | Owner | Comment ——–+——+——-+———+——— public | test | table | dbadmin | (1 row) dbadmin=> ALTER TABLE […]

North East Database Day Conference

This blog post was authored by Eden Zik. On January 19th, Vertica engineers joined the North East database community for the North East Database Day conference organized annually at MIT, sponsored by Facebook and Microsoft and featuring Turing award winner Michael Stonebraker. The full conference program can be found here: http://mitdbg.github.io/nedbday/2018/ This year Styliani Pantela […]

Vertica Quick Tip: Viewing Query Error Information

This blog post was authored by Jim Knicely. The V_MONITOR.ERROR_MESSAGES system table tracks error and warning messages encountered while processing queries. Example: dbadmin=> CREATE TABLE 123 (c1 INT); ERROR 4856: Syntax error at or near “123” at character 14 LINE 1: CREATE TABLE 123 (c1 INT); ^ dbadmin=> SELECT event_timestamp, user_name, message FROM error_messages ORDER […]

Vertica Quick Tip: Setting a Client Connection Label

This blog post was authored by Jim Knicely. When you connect to a Vertica database you can set a client connection label to help you later identify the connection. Example: dbadmin=> SELECT set_client_label(‘Daily Load’); set_client_label ——————————– client_label set to Daily Load (1 row) dbadmin=> SELECT get_client_label(); get_client_label —————— Daily Load (1 row) dbadmin=> SELECT client_label […]

Vertica Customer Experience Survey

You are a valued Vertica customer and we are interested in your opinions. We want to hear about your experiences with our product, technical support, documentation, and community. Please take a few minutes to respond to our short survey and evaluate our database product here. We look forward to hearing from you!

Blog Post Series: Using Vertica to Track Commercial Aircraft in near Real-Time

This blog post was authored by Mark Whalley. The preceding blog post detailed the hardware requirements used in this project for tracking commercial aircraft in near real-time. In this blog post I will touch on installing the operating system on the Raspberry Pi (RPI) and the DUMP1090 software used for decoding the ADS-B signals being […]

Vertica Quick Tip: Analyzing Table Statistics by Column

This blog post was authored by Jim Knicely. The ANALYZE_STATISTICS function collects and aggregates data samples and storage information from all nodes that store projections associated with the specified table. On a very large wide table it will take a significant amount of time to gather those statistics. In many situations only a few columns […]

Vertica Tip: The System Table for System Tables

This blog post was authored by Sarah Lemaire. Most of you probably know that Vertica provides system tables that allow you to monitor • System resources • Background processes • Workload • Performance • Catalog size These tables help you to profile, diagnose, and view historical data equivalent to load streams, query profiles, Tuple Mover […]

Identifying Projection Skew

This blog post was authored by Curtis Bennett. In Vertica, projections can either be replicated (unsegmented), or segmented. A segmented projection divides the data up across all the nodes in your cluster. Segmentation works by hashing a key value, and then using some simple math, figuring out which node that piece of data will live […]

What Projections are not Being Used

This blog post was authored by Eugenia Moreno. It is common to create new projections to improve performance in Vertica. However, you might forget about the old projections. Vertica still loads data in projections that you might not be using. A projection that is loaded but not picked up by the Vertica optimizer consumes storage […]

Vertica Quick Tip: Sampling Data from the Entire Table

This blog post was authored by Jim Knicely. A development or quality assurance team often requests access to a subset of production data. One way to do that would be to make use of the LIMIT clause. Example: dbadmin=> SELECT COUNT(*) FROM big_number_table; COUNT ------------ 1000000000 (1 row) dbadmin=> SELECT 0.05*1000000000 "5_percent"; 5_percent ------------- […]

Query Execution in Eon Mode

This blog post was authored by Ben Vandiver. How Vertica distributes query processing across the cluster in Eon mode is a complex topic that is best illustrated through a concrete example. As part of this post, we’ll start with a simple data load and walk through metadata storage and query execution. To begin, we need […]

Vertica Quick Tip: Superfast Table Copy

This blog post was authored by Jim Knicely. Very often we need to make a copy of a very large table in order to do some development or quality assurance type of duties. Typically we’ll use a CREATE TABLE AS SELECT (CTAS) statement to make that copy. Example: dbadmin=> SELECT COUNT(*) FROM big_number_table; COUNT ———— […]

Vertica Quick Tip: Add a Time Zone

This blog post was authored by Jim Knicely. Vertica recognizes many time zones. However, there might come a time (zone) when you will need to reference one that is not available by default. Luckily it’s relatively easy to add a time zone to Vertica. Example: The “Hawaii Standard Time (HST)” is a default time zone […]

What’s New in Vertica 9.0.1: Security and Authentication

This blog post was authored by Soniya Shah. In this release, we introduce some security enhancements. We’ve added the ability to grant and revoke privileges on system tables, using the same syntax as you would for granting and revoking on tables. However, there are some limitations about the types of privileges you can use with […]

Vertica Quick Tip: Default Size of the NUMBER Data Type

This blog post was authored by Jim Knicely. When creating a table where you do not define a precision for a NUMBER column data type, Vertica will use a default precision of 38 digits. Often this is larger than necessary. By specifying NUMBER(37) you will potentially get better query performance and save on storage. Why? […]

Vertica Tip: Predicting the Resources a Statement Needs

This blog post was authored by Eugenia Moreno. You may find you want to set up resource pools before running queries to know how many resources a particular query needs. One way to do this is to create a small resource pool, profile the query, and note when the query is rejected. When the query […]

Beware of Segmentation Islands

This blog post was authored by Curtis Bennett. Many clients who are new to Vertica are also new to big data. While Vertica’s reliance on industry-standard SQL can make the transition very easy, often the introduction of multiple nodes used in support of a database platform can take some getting used to. It is the […]

Vertica Quick Tip: Dynamically Split Up a String

This blog post was authored by Jim Knicely. One of my favorite functions in Vertica is named SPLIT_PART. It splits up a string into parts by a given delimiter. Example: dbadmin=> SELECT split_part(my_text, ',', 1) the_first_part, dbadmin-> split_part(my_text, ',', 2) the_second_part, dbadmin-> split_part(my_text, ',', 3) the_third_part, dbadmin-> split_part(my_text, ',', 4) the_fourth_part dbadmin-> FROM (SELECT 'ONE,TWO,THREE,FOUR' […]
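The 1-based indexing used by SPLIT_PART can be easy to trip over when moving between SQL and application code. The Python sketch below models the lookup. Returning an empty string for an out-of-range index is an assumption made for this illustration, so check the Vertica documentation for the exact behavior.

```python
def split_part(text, delimiter, index):
    """Return the index-th (1-based) field of text split on delimiter."""
    if index < 1:
        raise ValueError("index must be 1 or greater")
    parts = text.split(delimiter)
    # Assumption: an index past the last field yields an empty string.
    return parts[index - 1] if index <= len(parts) else ""


my_text = "ONE,TWO,THREE,FOUR"
print([split_part(my_text, ",", i) for i in range(1, 5)])
```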

What’s New in Vertica 9.0.1: Ranger Integration

This blog post was co-authored by Mitchell Tracy and Monica Cellio. Hadoop clusters can use authorization services to determine which users can access what data in Hive and, by extension, HDFS. In Vertica 9.0 we added support for one of the most common such services, Apache Sentry, and in 9.0.1 we now support Apache Ranger […]

Vertica Quick Tip: Proper Ordering of IP Addresses

This blog post was authored by Jim Knicely. We often store IP addresses in a VARCHAR column in a Vertica table. When querying the data and sorting by the IP address, we see that IP addresses are sorted by their VARCHAR values instead of their numeric values. Fortunately Vertica has the INET_ATON function which […]
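The difference between string order and numeric order is easy to see in a small Python sketch. Here the standard `ipaddress` module plays the role of converting a dotted IPv4 string to an integer sort key. The sample addresses and the helper name `sort_ips` are made up for illustration.

```python
import ipaddress


def sort_ips(addresses):
    """Sort dotted IPv4 strings by their numeric value, not their text."""
    return sorted(addresses, key=lambda a: int(ipaddress.IPv4Address(a)))


ips = ["192.168.1.10", "10.0.0.2", "192.168.1.9", "9.255.255.255"]
print(sorted(ips))    # text order: '192.168.1.10' sorts before '192.168.1.9'
print(sort_ips(ips))  # numeric order: 9.x, 10.x, then 192.168.1.9, 192.168.1.10
```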

Vertica at the Aeronaut Brewery: Adventures in Data Architecture

This blog post was authored by Sarah Lemaire. Vertica hosted a Meetup at the Aeronaut Brewery in Somerville for customers and prospective customers, including data scientists from Nuance, J.Jill, and True Fit. After some cold beer to warm us up on a cold night, we were lucky enough to hear JB Huang, Head of Data […]

Vertica Quick Tip: The <=> operator

This blog post was authored by Jim Knicely. The <=> operator performs an equality comparison like the = operator, but it returns true, instead of NULL, if both operands are NULL, and false, instead of NULL, if one operand is NULL. Example: dbadmin=> SELECT 1 = 2 "Returns FALSE", dbadmin-> 1 <=> 2 "Returns FALSE", dbadmin-> 1 […]
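The rules described above for the `<=>` operator named in the title can be written out as a small truth-table sketch in Python, with `None` standing in for NULL. The function names `sql_equals` and `null_safe_equals` are hypothetical and exist only for this illustration.

```python
def sql_equals(a, b):
    """Model of the ordinary = operator: any NULL operand yields NULL (None)."""
    if a is None or b is None:
        return None
    return a == b


def null_safe_equals(a, b):
    """Model of the NULL-safe comparison: never returns NULL."""
    if a is None and b is None:
        return True           # both operands NULL
    if a is None or b is None:
        return False          # exactly one operand NULL
    return a == b


for pair in [(1, 2), (1, 1), (1, None), (None, None)]:
    print(pair, sql_equals(*pair), null_safe_equals(*pair))
```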

Vertica Quick Tip: A Truly Unique Constraint

This blog post was authored by Jim Knicely. According to the ANSI standards SQL:92, SQL:1999, and SQL:2003, a UNIQUE constraint should disallow duplicate non-NULL values, but allow multiple NULL values. A Unique Constraint in Vertica does just that! Example: dbadmin=> CREATE TABLE test (c1 INT); CREATE TABLE dbadmin=> ALTER TABLE test ADD CONSTRAINT test_uk UNIQUE […]

What’s New in Vertica 9.0.1: S3 Backup Encryption

This blog post was authored by James Kelley. Amazon S3 offers flexibility, efficiency, and scale. But does it offer security? With the release of Vertica 9.0.1, Vertica offers users the ability to encrypt their backups to S3 with server-side encryption. Vertica supports the following forms of S3 encryption: Server-Side Encryption with Amazon S3-Managed Keys (SSE-S3) […]

What’s New in Management Console 9.0.1

This blog post was authored by Lisa Donaghue. Vertica 9.0.1 introduces Management Console (MC) improvements to cloud monitoring. Tag AWS Instances Management Console with Provisioning, available on the AWS Marketplace, includes a Cluster Creation wizard to provision databases on AWS resources. With Vertica 9.0.1, you can tag instances as you create them through the Cluster […]

Vertica Quick Tip: Lightning Fast Text Search

This blog post was authored by Jim Knicely. Searching the contents of a sizeable CHAR, VARCHAR, LONG VARCHAR, VARBINARY, or LONG VARBINARY field within a table to locate a specific keyword can be quite time consuming. Especially when dealing in Big Data. Fortunately, Vertica includes a text indexing feature which allows you to query that […]

Vertica Quick Tip: Generating a Random String

This blog post was authored by Jim Knicely. We saw in a previous Vertica Quick Tip that we can create a SQL function that generates random dates. How about one that generates random strings? Example: dbadmin=> CREATE OR REPLACE FUNCTION randomstring (x INT) RETURN VARCHAR dbadmin-> AS dbadmin-> BEGIN dbadmin-> RETURN CASE x dbadmin-> WHEN […]

Loading in Eon Mode

This blog was co-authored by Yuanzhe Bei, Ryan Roelke, Amin Saeidi, Soniya Shah, and Natalia Stavisky. This blog was updated in July 2018. Overview As of Vertica 9.1.x, you can operate your database in Eon Mode. Eon Mode separates the computational processes from the storage layer of your database. Deployment of Eon Mode is limited […]

Vertica Quick Tip: Which Rows Will Commit?

This blog post was authored by Jim Knicely. Did you ever update a bunch of rows in a table, then forget which ones you changed? Fearing you might have updated an incorrect record, you might have to roll back and start again. Or, in Vertica you can first check which records have been modified prior […]

Vertica Quick Tip: Date Arithmetic with Intervals

This blog post was authored by Jim Knicely. In the last Vertica Quick Tip we saw how easy date arithmetic can be. Well, it can be even easier with Intervals! Example: What is today’s, yesterday’s and tomorrow’s date? dbadmin=> SELECT SYSDATE Today, dbadmin-> SYSDATE - INTERVAL '1 Day' Yesterday, dbadmin-> SYSDATE + INTERVAL '1 Day' […]

Virtual DataGals Kick Off

This blog post was authored by Crystal Farley (North). Last week, the virtual chapter of DataGals started with their kick off event: Come here, go anywhere! Joy King, VP of Product Management, Product Marketing, and Field Engagement for Vertica talked about her experience and journey in her career. Joy has been with the company for […]

Vertica AWS Eon Mode Beta Provisioning with Management Console

This blog post was co-authored by Michael Hua and Soniya Shah. As of Vertica 9.0.x, you can operate your database in Eon Mode Beta. Doing this separates the computational processes from the storage layer of your database, enabling rapid scaling of resources to accommodate variable workloads. This post describes how you can provision a Vertica […]

Vertica Quick Tip: Date Arithmetic

This blog post was authored by Jim Knicely. Date arithmetic in Vertica is extremely easy! Example: What is today’s, yesterday’s and tomorrow’s date? dbadmin=> SELECT SYSDATE Today, dbadmin-> SYSDATE - 1 Yesterday, dbadmin-> SYSDATE + 1 Tomorrow; Today | Yesterday | Tomorrow ----------------------------+----------------------------+---------------------------- 2018-01-18 11:36:43.132482 | 2018-01-17 11:36:43.132482 | 2018-01-19 11:36:43.132482 (1 row) But you’re […]
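As a point of comparison, the same one-day offsets can be sketched in application code with Python's `datetime` and `timedelta`. The helper name `around` is hypothetical, and the timestamp is the one from the example output above.

```python
from datetime import datetime, timedelta


def around(now):
    """Return (yesterday, today, tomorrow) relative to the given timestamp."""
    one_day = timedelta(days=1)
    return now - one_day, now, now + one_day


yesterday, today, tomorrow = around(datetime(2018, 1, 18, 11, 36, 43, 132482))
print(today)      # 2018-01-18 11:36:43.132482
print(yesterday)  # 2018-01-17 11:36:43.132482
print(tomorrow)   # 2018-01-19 11:36:43.132482
```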

Vertica Quick Tip: Avoid Using Functions on Very Large Data Sets

This blog post was authored by Jim Knicely. You can store billions and billions and billions (i.e. a lot) of records in your Vertica tables. When querying these large data sets, try to avoid using database functions like TO_DATE, TO_CHAR, NVL, etc. when unnecessary. Example: A table named BIG_DATE_TABLE has 1 billion rows and a […]

Eon Mode Beta Overview in 9.0.1

This blog post was authored by Soniya Shah. What is Eon Mode Beta? In Vertica 9.0.1, Eon Mode Beta, the separation of compute and storage, continues on Amazon Web Services S3. Eon Mode Beta was introduced in Vertica 9.0 to capitalize on cloud economics, while still enjoying the fast query processing for which Vertica is […]

Vertica Quick Tip: Generating a Random Date

This blog post was authored by Jim Knicely. I can easily generate a random integer value using the Vertica built-in RANDOMINT function. For example: dbadmin=> SELECT randomint(10) “Random 0-9”, dbadmin-> randomint(10) “Random 0-9”, dbadmin-> randomint(10) “Random 0-9”; Random 0-9 | Random 0-9 | Random 0-9 ————+————+———— 6 | 4 | 0 (1 row) But what […]

Vertica Test Results for Operating System Patches for Meltdown and Spectre Security Flaws

For the latest Vertica update on the Meltdown and Spectre security flaws, read this blog: UPDATED: Vertica Test Results with Microcode Patches for the Meltdown and Spectre Security Flaws, published May 21, 2018. Vertica engineers have run performance tests using the operating system patches for the Meltdown and Spectre security flaws. Based on the results, […]

Vertica Quick Tip: The LIMIT Analytic Function

This blog post was authored by Jim Knicely. Vertica contains an abundance of built-in SQL analytic functions. One of the lesser known but also one of the coolest is the LIMIT analytic function. Example Say I have the following table data: dbadmin => SELECT * FROM limit_test; the_date | test_num | test_desc ————+———-+———– 2018-01-10 | […]

Vertica Quick Tip: How to Query for NaN Values

This blog post was authored by Jim Knicely. We’re introducing a new series: Vertica Quick Tips! These tips are intended to give you concise information to help you get the most out of using Vertica. NaN (Not a Number) does not equal anything, not even another NaN. You can query for them using the predicate […]
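The claim that NaN does not equal anything, not even another NaN, comes from IEEE 754 floating-point rules, so it can be demonstrated in any language that follows them. The Python sketch below shows why an equality filter finds nothing and a dedicated NaN test is needed. It illustrates the general idea only and does not show the Vertica predicate. The helper name `count_nan` is hypothetical.

```python
import math


def count_nan(values):
    """Count NaN entries using an explicit NaN test instead of equality."""
    return sum(1 for v in values if isinstance(v, float) and math.isnan(v))


nan = float("nan")
values = [1.0, nan, 2.5, float("nan")]

# Equality never matches NaN, so this filter comes back empty.
print([v for v in values if v == nan])
# A dedicated NaN check finds both entries.
print(count_nan(values))
```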

IMPORTANT: What you need to know about hardware security flaws

Vertica is aware of potential chip-level security flaws in recently discovered hardware bugs. Operating system vendors have developed patches to address these problems. Installing the patches could impact application performance. Vertica engineers are actively investigating how these patches might affect Vertica performance and will keep our customers informed. There is much speculation about the impact […]

Authentication Methods for dbadmin

This blog post was authored by Sumeet Keswani. In Vertica, when you create a new database, there are no configured authentication methods. In this case, Vertica assumes that all users, including the dbadmin, have an implicit password authentication. Users can use this authentication method both for authenticating over a network interface and for over a […]

DataGals Year in Review

The DataGals had an amazing 2017! Everyone on our steering committee – Lisa Donaghue, Styliani Pantela, Soniya Shah, Diem Tran, and Sharada Vesta would like to wish you a very happy new year! Check out our year in review below. We look forward to a successful 2018. Happy holidays from the DataGals!

External data is easier to manage than the Hadoop Stack

This blog post was authored by Steve Sarsfield. It has been a thrilling ride. In a few short years, Hadoop has seen an astronomical rise, but recently, interest in Hadoop has peaked. Analysts like Gartner and Datanami are reporting that the hype in Hadoop is waning, with some hope that use cases for Hadoop will find […]

Vertica in the Clouds

This blog post was authored by Soniya Shah. The benefits of using cloud computing and storage are virtually endless. You can scale services up or down to fit your needs, customize applications, and access cloud services from everywhere. Using the cloud makes it easy to scale elastically and makes infrastructure both affordable and flexible. With […]

Blog Post Series: Using Vertica to Track Commercial Aircraft in near Real-Time Part 3

This blog post was authored by Mark Whalley. Picking apples, pears, blackberries or raspberries In a previous blog post, I provided a very high-level overview of ADS-B and explained that, with the appropriate pieces of hardware and some open-source software, it was possible to capture and decode the radio signals being broadcast from commercial aircraft, with […]

Flattened Tables

This blog post was authored by Soniya Shah. Before release 8.1, Vertica users could denormalize their data by combining all fact and dimension table columns in a single ‘fat’ table. These tables facilitated faster query execution. However, this approach required users to maintain redundant sets of normalized and denormalized data, which incurred its own overhead. […]

Phrase Search with Vertica Text Search

This blog post was authored by Serge Bonte. Vertica Text Search Vertica already provides Text Search. Text Search allows you to quickly search the contents of a single CHAR, VARCHAR, LONG VARCHAR, VARBINARY, or LONG VARBINARY field within a table to locate a specific token. Vertica implements that capability using a dedicated Text Index to […]

What’s New in Vertica 9.0: Hierarchical Partitioning

This blog post was authored by Michael Kronenberg. With Vertica 9.0, you can consolidate partitions into groups that minimize use of ROS storage. Reducing the number of ROS containers to store partitioned data helps facilitate DML operations such as DELETE and UPDATE, and avoid ROS pushback. For example, you can group date partitions by year. […]

What’s New in Vertica 9.0: The UUID Data Type

This blog post was authored by Gary Gray. Vertica version 9.0 adds Universal Unique Identifier (UUID) to its collection of data types. Accompanying this new data type are updates to the client libraries and a new function to help you use UUIDs in your database. As its name implies, computers use UUIDs to uniquely identify […]

Blog Post Series: Using Vertica to Track Commercial Aircraft in near Real Time

This blog post was authored by Mark Whalley. A Source of Streaming Data If you’re joining us as a new reader, be sure to read part one of this series to get up to speed! When looking for a topic to use in the first of The Lab Series’ mini projects for the Big Data and […]

What’s New in Vertica 9.0: Sentry Integration

This blog post was authored by Mitchell Tracy and Monica Cellio. Hadoop clusters can use authorization services to determine which users can access what data in Hive and, by extension, HDFS. In Vertica 9.0 we now support one of the most common such services, Apache Sentry. Apache Sentry is a project in the Hadoop ecosystem […]

Data Pipelines: Vertica and Kafka

This blog post was authored by Tom Wall and Soniya Shah. At Vertica, we want to make it as easy as possible for your Vertica environment to coexist with other tools and technologies. We know that one size does not fit all. Sometimes you need a customized, end-to-end view of your system. Imagine you’re on […]

Transitioning Vertica Support

As part of our merge with Micro Focus, we are moving our support platform from my.vertica.com to Software Support Online. We want to make this alignment as smooth as possible for our customers and encourage you to read our Alignment Step by Step guide for Vertica Customers moving to SSO.

Estimate the Price of Diamonds Using Vertica Machine Learning

This blog post was authored by Vincent Xu. In this blog post, I’ll take you through the exercise I did to estimate the price of a diamond based on its characteristics, using the linear regression algorithm in Vertica. Besides Vertica 9.0, I used Tableau for charting and DbVisualizer as the SQL editor. From this exercise, […]

Try out the new Vertica Community Edition Virtual Machine!

This blog post was authored by Kathy Taylor. New to Vertica? Wondering where to start? Why not start with our new 8.1.1 Community Edition VM? It’s free! Just download the VM and start it up in a VM player on your PC. Open the User Guide, start the exercises, and off you go! You’ll be […]

What’s New in Vertica 9.0: Security and Authentication

This blog post was authored by Phil Molea. Multi-realm Support Vertica 9.0 introduces multi-realm support for Kerberos authentication. This allows you to assign a different realm so that users from another realm can authenticate to Vertica. At times, customers may store users in a protected directory server (AD or Linux KDC) for their trusted realm. […]

What’s New in Vertica 9.0: Google Cloud Platform

This blog post was authored by Chris Daly. Announcing Vertica availability in Google Cloud Platform With the release of Vertica 9.0, the team at Micro Focus has brought you a ton of new updates and enhancements that are certainly worthy of getting excited about! If you haven’t had a chance, you should check out the […]

Machine Learning Mondays: Vertica 9.0 Cheat Sheet

This blog post was authored by Vincent Xu. Vertica 9.0 is out and here is the updated Vertica machine learning cheat sheet. Vertica 9.0 introduces a slew of new machine learning features including one-hot encoding, Lasso regression, cross validation, model import/export, and many more. See the cheat sheet for examples of how to use the […]

Blog Post Series: Using Vertica to Track Commercial Aircraft in near Real Time

This blog post was authored by Mark Whalley. Project Overview This post presents an overview of an ongoing series that focuses on using Vertica, Raspberry Pi, and Apache Kafka to track commercial aircraft in near-real time. Often, users try to comprehend the many advanced capabilities of Vertica. Doing so can present some difficulty, especially if […]

What’s New in Vertica 9.0: Machine Learning Enhancements

This blog post was authored by Soniya Shah. Vertica 9.0 introduces new functionality that continues to match our goals for fast-paced development of the existing machine learning functions. In this release, we introduce two new summary functions, support for cross validation, support for one hot encoding, and the ability to import and export your models […]

Vertica Deep Dive Comes to Boston: #OwnTheNew

This blog post was authored by Sarah Lemaire. Two weeks after the successful Deep Dive in Los Angeles, Boston-area Vertica users had the unique opportunity to meet with Sumeet Keswani and Shrirang Kamat from the Vertica Customer Experience Team. These two Vertica experts offered attendees tips for managing their Vertica environment for maximum performance. These […]

What’s New in Vertica 9.0: Reading Parquet and ORC from S3

This blog post was authored by Monica Cellio. Parquet and ORC are widely-used Hadoop columnar file formats. Because these formats are columnar, they perform extremely well when queried as external tables in Vertica. Vertica queries implement column selection, predicate pushdown, and partition pruning. Vertica has supported reading Parquet and ORC data from HDFS or from […]

Vertica 9

This blog post was authored by Steve Sarsfield. The Vertica development team has just released Version 9.0. Every major release gives me time not only to look back at what was developed this cycle, but also to look at the entire timeline. I joined the Vertica team about 4 years ago in the […]

What’s New in Vertica 9.0: Eon Mode Beta

This blog post was authored by Soniya Shah. What is Eon Mode Beta? With Vertica 9.0, you can run Vertica in Eon Mode Beta, using Amazon Web Services to capitalize on cloud economics while still enjoying the fast query processing for which Vertica is known. Running Vertica in Eon Mode Beta separates the computational processes […]

What’s New in Vertica 9.0?

This blog post was authored by Soniya Shah. In Vertica 9.0, we introduce new functionality including: • Eon Mode Beta • Supported Platform Updates • Machine Learning Enhancements • Apache Hadoop Integration Updates • Partition Grouping and Hierarchical Partitioning • Browsing S3 Data Using External Tables • Support for the UUID Data Type Eon Mode […]

CPU and Memory Starvation in SPREAD

This blog post was authored by Sumeet Keswani. What is Spread? Vertica uses an open source toolkit, Spread, to provide a high-performance control message service. Spread daemons start automatically when your database starts up for the first time. The spread daemons run on control nodes in your cluster. The control nodes manage message communication. On […]

Effective vsql in Vertica

This blog post was authored by Maurizio Felici. vsql is included in each Vertica installation and is lightweight, with a tight integration with Vertica. Vsql is installed on every Vertica server and can also be installed on non-server hosts using the client package. Executing SQL commands through vsql is often faster than navigating a GUI’s menus. […]

Adding Nodes to Fault Groups

This blog post was authored by Sarah Lemaire. Suppose you are adding new cluster nodes to your Vertica database. You want to add those nodes to particular fault groups without having to restart your Vertica database. The following steps use the example of a database with five racks and fault groups, with 9 Vertica nodes […]

Analytic Queries in Vertica

This blog post was authored by Soniya Shah. Analytic functions handle complex analysis and reporting tasks. Here are some example use cases for Vertica analytic functions: • Rank the longest standing customers in a particular state • Calculate the moving average of retail volume over a specific time • Find the highest score among all […]

Integrating with Apache Spark

This blog post was authored by Soniya Shah. The Vertica Connector for Apache Spark is a fast parallel connector that allows you to use Apache Spark for pre-processing data. Apache Spark is an open-source, general purpose, cluster-computing framework. The Spark framework is based on Resilient Distributed Datasets (RDDs), which are logical collections of data partitioned […]

Working with Joins

This blog post was authored by Soniya Shah. Vertica supports a variety of join types. This post discusses the following joins: • Inner joins • Left, right, and full outer joins • Natural joins • Cross joins In Vertica, we refer to the tables participating in the join as left or right. The left table […]

Time Series Analytics

This blog post was authored by Soniya Shah. Time series analytics is a powerful Vertica tool that evaluates the values of a given set of variables over time and groups those values into a window based on a time interval for analysis and aggregation. Time series analytics is useful when you want to analyze discrete […]

Building a Secure Vertica Environment

This blog post was authored by Soniya Shah. Vertica has a client-server architecture system, where applications that reside on the client access the Vertica cluster through drivers including ODBC, JDBC, OLEDB and ADO.NET. This post discusses secure client to server communications, authenticating access to Vertica, and administrator access. Method Vertica Options Authentication: Validate user credentials […]

Vertica Presentation at the db tech showcase Tokyo 2017

On September 5th, Kanako Obayashi from the Vertica Best Practices team presented at the db tech showcase Tokyo 2017, one of the largest database events in Japan. Kanako’s presentation was about Vertica advanced analytics, including machine learning and geospatial analysis. More than 50 people attended her session. Kanako began her session by noting that more […]

What’s New in Vertica 8.1.1: Flex Parser Updates

This blog post was authored by Soniya Shah. Vertica 8.1.1 introduces an optional parameter to the FCSVPARSER function. The FCSVPARSER specifies how to load data into Vertica from a CSV data source. The new parameter allows you to define or override column names in the target file for data loaded from a CSV data source. […]

MERGE Statement with Filters

This blog post was authored by Soniya Shah. Vertica 8.1 introduced new functionality for the MERGE statement. In this post, we discuss new functionality for MERGE that allows users to filter conditions on INSERT and UPDATE clauses in a MERGE statement. The MERGE operation allows users to join the target table on another table, a […]

Do you need a database or a query engine?

This blog post was authored by Steve Sarsfield. As we travel through life, we are constantly assessing our choices. Should you eat that salad, or opt for the burger? Should you marry your partner or seek greener pastures elsewhere? All of us do these assessments in both our personal and business lives. However, it may […]

What’s New in Vertica 8.1.1: Catalog Memory Improvements

This blog post was authored by Soniya Shah. In Vertica 8.1.1, we introduce a performance improvement that reduces catalog memory usage for users with a large number of NULL values in tables. The improvement affects all string data types, including BINARY, VARBINARY, LONG VARBINARY, CHAR, VARCHAR and LONG VARCHAR. The improvement scales with the data […]

Configuring tcp Idle Settings for Long Running Idle Sessions

This blog post was authored by Soniya Shah. Important: For all recommendations to change setting values, you must change the settings on all nodes in the cluster. It is not advisable to have different settings on different nodes. Have you ever encountered one of the following types of errors? ==> VSQL vsql => select sleep(3600); […]

How to Publish Data Collector Tables to Apache Kafka

This blog post was authored by Serge Bonte. You are probably familiar with the Vertica Data Collector (DC) and have used the granular information it collects to monitor and optimize Vertica deployments. A common challenge is that Data Collector keeps only a portion of that information, controlled by retention policies, in the internal DC tables before […]

View Privileges

This blog post was authored by Soniya Shah. This set of examples shows the privileges a user needs for various operations related to views, including creating and querying. A view is a virtual table based on the result set of a SQL statement, also called a SQL query. To select from a view, you need […]

Geospatial Analysis on Shapefile of Longitude and Latitude Data Using Vertica: Hurricane Bonnie

This blog post was authored by Ginger Ni. Like any natural disaster, hurricanes can leave behind extensive damage to life and property. The question asked by NGOs, government agencies, and insurance companies is, “How can we predict the locations where a storm will inflict the most damage?” Modern spatial analysis enables us to predict the […]

What’s New in Vertica 8.1.1: Backup to Amazon S3

As more businesses move their infrastructure to cloud-based and hybrid environments, Vertica has kept pace. In version 8.1.1, Vertica introduced support for backup and restore to and from Amazon’s Simple Storage Service, commonly known as S3. Backup to S3 works the same as any other Vertica backup. As always, backups are incremental. That is, Vertica […]

Configure JDBC clients to Work with Your Kerberos-Enabled Vertica Cluster: DbVisualizer, DBeaver, and Others

This blog post was authored by Satish Sathiyavageswaran. Many customers use JDBC-based tools like DbVisualizer and DBeaver to connect to Vertica for SQL development purposes. It is easy to use these tools on a non-Kerberos enabled Vertica cluster, but connecting to a Kerberos-enabled Vertica cluster is not straightforward because there is no native support for […]

Machine Learning Mondays: Vertica 8.1.1 Cheat Sheet

This blog post was authored by Vincent Xu. Vertica 8.1.1 provides SQL functions that support the complete machine learning workflow—from cleaning your data to training a model to evaluating model performance. Vertica machine learning is fast and scalable along the sizes of data samples, features, and computing cluster. Best of all, no data movement is […]

What’s New in Vertica 8.1.1: Introducing Export to Parquet Format

This blog post was authored by Deepak Majeti. Vertica customers often ask the following questions: 1. “We want to keep hot/warm data in Vertica and move warm/cold data to an open file format on cheap external storage. How do we do this?” 2. “How can we store the results from Vertica in an open file […]

What’s New in Vertica 8.1.1: Cloudera Manager Support

This blog post was authored by Mitchell Tracy. In Vertica 8.1.1, we introduce support for Cloudera Manager. Cloudera Manager is a platform that Hadoop administrators can use to manage their Hadoop cluster. It allows them to see the hosts associated with their cluster, and the different Hadoop services running on the cluster. Cloudera Manager also […]

What’s New in Management Console 8.1.1

Vertica 8.1.1 introduces the ability to run SQL queries on your database through a browser. Management Console (MC) now also highlights new features and Vertica resources when you upgrade, and improves the resources and look and feel of the contextual help. Query the Database Using MC You can now use the Query Runner tool in […]

Vertica In-Database Approximate Count Distinct Functions Using LogLogBeta

This blog post was authored by Ginger Ni. Counting Distinct Values Data cardinality is a commonly used statistic in data analysis. Vertica has the exact COUNT(DISTINCT) function to count distinct values in a data set, but the function does not scale well for extremely large data sets. When exploring large data sets, speed is critical. […]

What’s New in Vertica 8.1.1: Machine Learning

This blog post was authored by Soniya Shah. Vertica 8.1.1 continues with the fast-paced development for machine learning. In this release, we introduce the highly-requested random forest algorithm. We added support for SVM to include SVM for regression, in addition to the existing SVM for classification algorithm. L2 regularization was added to both the linear […]

What’s New in Vertica 8.1.1?

This blog post was authored by Soniya Shah. In Vertica 8.1.1, we introduce new functionality including: • Supported platform updates • Machine learning updates • Management Console enhancements • Apache Hadoop, Apache Kafka, and Apache Spark integration updates • Database management improvements • Workload management • Table data management updates • SQL functions and statements […]

Understanding Backup Space Utilization

This blog post was authored by Soniya Shah. Creating regular database backups is an important part of database maintenance. The vbr utility lets you back up, restore, and copy your database to another cluster. You can create full and incremental backups, and even back up objects, such as tables. Ideally, backups should match what is […]

Concurrency and Workload Management

This blog post was authored by Soniya Shah. Vertica workloads range from simple primary key lookups to analytical queries that include several large tables and joins. Different types of load jobs must keep the data updated. Vertica has a mixed-workload management capability that is easy to use. Vertica can process queries both concurrently and in […]

How to Set Vertica in Read-Only

This blog post was authored by Soniya Shah. You probably know that you can create READ ONLY users in Vertica. These users can view everything within a schema, but don’t have the proper permissions to change anything within the database. This is useful for sets of users that don’t need as many permissions or for […]

DataGals Hosts a Year Up Event

This blog post was authored by Soniya Shah. This week, the DataGals hosted an event to raise awareness for Year Up. Year Up provides young adults, aged 18-24, with the skills and experience they need to succeed in the professional workplace. Here at Vertica, we are proud supporters of the Year Up program. We have […]

Understanding Users, Privileges, and Roles

This blog post was authored by Soniya Shah. Every Vertica database has one or more users. When users connect to the database, they log in with credentials that a superuser defines. Database users should only have access to the database resources they need to perform their tasks. To navigate these necessities, Vertica has designated users, […]

In-Database Approximate Median and Percentile Functions

This blog post was authored by Ginger Ni. Median and percentile functions are commonly used data statistic functions. They are also used in other sophisticated data analysis algorithms, such as the robust z-Score normalization function. Vertica has exact MEDIAN and PERCENTILE_CONT functions, but these functions do not scale well for extremely large data sets, because […]

Introducing the Vertica Test Drive for Clickstream Analytics

This blog post was authored by Soniya Shah. Recently, Vertica engineers introduced the new Vertica for Clickstream Analytics test drive on AWS. If you’re a Vertica user, you might be familiar with our test drives on AWS – both run on SQL on Hadoop. One uses MapR and the other uses Hortonworks. With this test […]

Getting Rid of Range Joins

This blog post was authored by Soniya Shah. You can use range joins to categorize data into buckets. Vertica provides performance optimizations for <, <=, >, >=, and BETWEEN predicates. These optimizations are particularly useful when a column from one table is restricted to be in a range specified by two columns of another table. Range joins can […]

New Uses for Directed Queries

Directed queries were introduced in Vertica 7.2. Directed queries were originally designed to achieve two goals: • Preserve current query plans before a scheduled upgrade. • Enable you to create query plans that improve optimizer performance. Since their introduction, users have found new and compelling ways to use directed queries—notably, using them to substitute one […]

What’s New in Vertica 8.1: Flex Tables Enhancements

This blog post was authored by Soniya Shah. As of Vertica 8.1, you can execute CTAS statements to create flex tables. CREATE TABLE AS (CTAS) statement Previously, Vertica supported creating tables using the AS SELECT clause. Frequently called CTAS, this SQL statement lets you create a new table that contains the results from querying another […]

Use MERGE to Update 1 Million Rows

This blog post was co-authored by Yassine Faihe, Michael Flower, and Moshe Goldberg. Updating One Million Records in Two Seconds To illustrate the true power of MERGE, this article describes how we used MERGE to demonstrate Vertica’s performance at scale. SQL MERGE statements combine INSERT and UPDATE operations. They are a great way to update […]

Query Optimization Using Projections

In Vertica, tables are logical representations of the data. Vertica stores the actual data in projections. When data is loaded into a Vertica table, Vertica creates or updates a column-store projection. Vertica also compresses and/or encodes projection data, optimizing data access and storage. If you experience performance issues, your best first step is to run […]

Machine Learning Mondays: How Vertica Implements Efficient and Scalable Machine Learning

This blog post was authored by Vincent Xu. As of Vertica 8.1, Vertica has introduced a set of popular machine learning algorithms, including Linear Regression, Logistic Regression, Kmeans, Naïve Bayes, and SVM. Based on our recent benchmarks, they run faster than MLlib on Apache Spark. The following chart shows the performance difference between Vertica 8.1.0 […]

Big Flat Fact Tables

This blog post was authored by Steve Sarsfield. For decades, it’s been widely accepted that snowflake and star schemas facilitate getting optimal performance from your data warehouse. You normalize data by identifying the rows of data that you typically ingest, and creating a schema that is optimized for the types of queries you want to […]

Using Vertica and HyperLogLog

This is a guest blog post co-authored by Francois Jehl and Pawel Szostek. Francois is the lead of the Analytics Data Storage team at Criteo; Pawel is a software engineer in the Analytics Data Storage team at Criteo. Criteo is the global leader in digital performance advertising with 900B ads served in 2016. The R&D […]

Machine Learning Mondays: Data Preparation for Machine Learning in Vertica

This blog post was authored by Vincent Xu. This post is part of our Machine Learning Mondays series. Stay tuned for more! Introduction Machine learning (ML) is an iterative process. From understanding data, preparing data, building models, testing models to deploying models, every step of the way requires careful examination and manipulation of the data. […]

Using Hadoop Rack Locality to Boost Vertica Performance

This blog post was authored by Monica Cellio. When database nodes are co-located on Hadoop data nodes, Vertica can take advantage of the Hadoop rack configuration to execute queries against ORC and Parquet data. Moving query execution closer to the data reduces network latency and can improve performance. Vertica automatically uses database nodes that are […]

What’s New in Vertica 8.1: Connecting to Vertica Updates

Vertica 8.1 includes the following product enhancements to Connecting to Vertica. Functional Updates to \timing The \timing metafunction has been enhanced so you can use the following commands to toggle \timing on or off based on its current setting: •\timing – turns timing on or off depending on its current state. For example if timing […]

What’s New in Vertica 8.1: Security Updates

Vertica 8.1 includes the following enhancements to Vertica security. Function to Verify Kerberos Configuration The function KERBEROS_CONFIG_CHECK allows you to test your Kerberos configuration of the Vertica cluster. Running this function checks: • Whether or not Kerberos services are available. • If a keytab file exists • If the Kerberos configuration parameters are set in […]

What’s New in Vertica 8.1: Machine Learning

This blog post was authored by Soniya Shah. Overall, you will notice that Machine Learning for Predictive Analytics, introduced in Vertica 7.2.2, is more accessible to use in Vertica 8.1, with the addition of several important functions. There are improvements to model management with access control, the ability to save and re-apply normalization parameters, missing value […]

What’s New in Management Console 8.1

Vertica 8.1 introduced new monitoring and usability enhancements to Management Console (MC). MC now provides the ability to easily monitor catalog memory and configure Workload Analyzer. You’ll also find usability improvements to cluster creation and setup for Extended Monitoring. Watch our short video about What’s New In MC in Vertica 8.1: Read on to learn […]

What’s New in Vertica 8.1: Supported Platforms

With Vertica Release 8.1, we continue to enhance and broaden our platform support. New Operating Systems for Vertica Server Vertica continues to perform extensive testing as we qualify major Linux distributions for use with the Vertica Analytic Database. Our testing ensures both stability and performance when you use Vertica with a supported operating system. For […]

What’s New in Vertica 8.1?

Watch this video to learn what’s new in Vertica version 8.1. New features include: – flattened tables – supported platforms update – Management Console features – Kafka connectivity update – machine learning functions – rack locality – Geohash conversions – security upgrades – wide column data query improvement

What’s up with rejected data?

This blog post was authored by Kanti Mann. In a perfect world, any and all data you attempt to load into your database would seamlessly and accurately move from point A to point B. Unfortunately, this doesn’t always happen. Occasionally, data fails to load into its destination table, and you’ll probably want to know what […]

Why auto-scaling analytical databases aren’t so magical

This blog post was authored by Steve Sarsfield. There is a new feature in analytical databases that seems to be all the rage, particularly in cloud data warehouses: autoscaling. Autoscaling’s promise is that if you have a particularly hard analytical workload, autoscaling will spin up new storage and compute to get the job done. […]

Understanding AT TIME ZONE

TIMESTAMPTZ AT TIME ZONE and TIMESTAMP AT TIME ZONE return date input in another time zone. How Vertica executes AT TIME ZONE varies, depending on whether the input is a TIMESTAMPTZ or TIMESTAMP. At first glance, this might be confusing. More about that later. First, let’s review AT TIME ZONE syntax: { TIMESTAMPTZ | TIMESTAMP […]

Create and Assign Roles

A role is a collection of privileges that can be granted to one or more users or roles. Assigning roles prevents you from having to manually grant sets of privileges for each individual user. For the most part, creating and assigning roles is fairly straightforward. However, the user to which roles are assigned needs to […]

Filtering Data While Loading into Vertica

Suppose you have a CSV file and you want to copy some, but not all, of its contents into a Vertica table. There are two ways you can do this: • Use the SKIP keyword with COPY. • Use the head or tail Linux command. Let’s see how this works. The Data Here’s a […]

Vertica Machine Learning Series: Logistic Regression

This blog post is based on a white paper authored by Maurizio Felici. What is Logistic Regression? Logistic regression is a popular machine learning algorithm used for binary classification. Logistic regression labels a sample with one of two possible classes, given a set of predictors in the sample. Optionally, the output can be the probability […]

DataGals Hosts an International Women’s Day Event

This blog post was authored by Soniya Shah. This week, the DataGals hosted an event in celebration of International Women’s Day. This year’s campaign asked supporters around the world to #BeBoldForChange to encourage a more inclusive, gender equal world. You can read more about the campaign and influencers on the International Women’s Day site. International […]

Spark Summit East

This blog post was authored by Myles Collins. I recently went to the Spark Summit East to take the Spark training and get current on the technology that my group (Vertica Partner Engineering) is using more and more. Conveniently, it was held here in Boston. A few weeks after I registered, marketing decided to sponsor […]

Using Vertica on IoT Data: Gap Filling and Interpolation for Incomplete Sensor Data

This post was originally authored by Marco Gessner and appeared on LinkedIn. It has been reposted here with his permission. This article explains the basic gap filling and interpolation functionality in Vertica. Vertica was designed for the fast processing and analysis of huge volumes of data and is well suited to IoT applications. One of […]

Dynamic Row and Column Access Policies

The content of this blog post is based on an article authored by Maurizio Felici. The Vertica Analytic Database access policies act on columns and rows to provide extra security on data in your tables. You can create flexible access policies that limit which users can access certain data by applying the access policy to […]

Vertica Machine Learning Series: k-means

The content of this blog is based on a white paper that was authored by Maurizio Felici. What is k-means Clustering? K-means clustering is an unsupervised learning algorithm that clusters data into groups based on their similarity. Using k-means, you can find k clusters of data, represented by centroids. As the user, you select the […]

Using Vertica on Azure

A lot of customers are starting to explore the idea of reducing infrastructure-related costs of their enterprise solutions by migrating them to publicly hosted, cloud-based environments. With that in mind, I am very pleased to announce the official support of Vertica running in the Microsoft Azure cloud environment. This latest step in the […]

Using the Vertica on Azure Free Trial

In August of last year, we announced support for Vertica in the Microsoft Azure Cloud environment. This includes a fully automated cluster deployment from the Azure Marketplace (which can be found here) and also includes our free Community Edition license. Microsoft, like many other public Cloud providers, offers a free trial subscription for users that […]

Machine Learning Series: Linear Regression

The content of this blog is based on a white paper that was authored by Maurizio Felici. This blog post is just one in a series of blog posts about the machine learning algorithms in Vertica. Stay tuned for more! What is Linear Regression? Let’s start with the basics. Linear regression is one of the […]

Patented: A Look into Kahlil Oppenheimer

Kahlil Oppenheimer was a Vertica intern during the summer of 2014. This blog post was authored by him and reprinted with his permission. During the first week of my internship at Vertica, my mentor assigned a small bug for me to fix about a set of particular SQL queries. After writing a simple fix for […]

Updating UDx Projects: Syncing the Vertica Plug-in for Eclipse with New Vertica Versions

The Vertica SDK Plug-in for Eclipse version 7.1.2 creates UDxs that are compatible with Vertica version 7.1.x. By replacing two files (BuildInfo.java and VerticaSDK.jar) in projects created with this plug-in, you can update your project to work with newer versions of Vertica. You get the replacement files (/opt/vertica/sdk/BuildInfo.java and /opt/vertica/bin/VerticaSDK.jar) from your currently installed Vertica […]

Software Engineering Internships at Vertica: Make a Difference This Summer

Vertica is looking for summer interns in Cambridge, MA for 2017! Vertica is the leading Big Data analytics database, and our scale, performance, and simplicity are unparalleled in the industry. Vertica enables customers like Facebook, Twitter, Uber, and Zynga to solve Big Data problems at scale that they could not tackle otherwise. If you study […]

Crowd-sourced Reviews Compare Oracle, Vertica, and Others

This blog post was authored by Steve Sarsfield.  Crowd-sourced reviews are becoming more and more important in our lives. When you’re thinking about going to a new job, you check out Glassdoor. If you’re heading out to dinner, you check out Yelp. When buying online, the reviews on Amazon are not only informative, but sometimes hilarious. […]

LDAP and User Accounts

This blog post was authored by Soniya Shah. If you are a database administrator, you probably need to authenticate users in Vertica. There are many methods users can use to authenticate, including Ident, Kerberos, LDAP, and hash. This blog walks you through the steps to take if you want to authenticate some users using LDAP […]

Vertica Hears From UMass Computer Science Professor

On December 7, Vertica employees were lucky enough to hear a talk from UMass Boston Computer Science Professor Dr. Duc A. Tran. Dr. Tran spoke to us about a distributed storage project he is working on with his students.

The Life of a Query, According to Henry Ford

While Henry Ford did not in fact develop or even patent the modern assembly line (that credit goes to Ransom E. Olds), he relied heavily on the process for automobile production.

Batch Exporting Directed Queries

An earlier blog covered the first edition of directed queries, which appeared with the first release of Vertica 7.2. With each release since then, Vertica has offered various enhancements to directed queries functionality.

DataGals Hosts a Hot Chocolate Discussion

Last week, DataGals, a Vertica employee resource group that encourages women in STEM, hosted a hot chocolate and discussion event open to all Microfocus employees.

Grace Hopper Celebration

Every year, the Grace Hopper Celebration of Women in Computing spotlights women in STEM fields. We’re glad to say that Microfocus was a corporate sponsor at the 16th annual GHC this year in Houston, Texas. 15,000 people attended. 14,000 of those attendees were women!

Lady Problems Hackathon Series

The Lady Problems Hackathon series aims to address problems preventing female entrepreneurship.

Vertica Volunteers at East End House

Read this blog to learn more about how the Vertica team is giving back to the local community!

Disabling Numeric Overflow

Prior to version 8.0, using a column that has a numeric data type with the functions SUM, SUM_FLOAT, or AVG could result in numeric overflow. Now, you can turn off numeric overflow and add implicit precision to your numeric data types.

Vertica Goes to UMass Amherst

Read this blog about Vertica’s visit to the UMass Amherst Engineering and Technology Career Fair!

Redesigning Projections for Query Optimization

When you submit a query to Vertica, the Vertica query optimizer automatically assembles a query plan, which consists of a set of operations to compute the requested result. Depending on the properties of the projections defined in your database, the query optimizer can choose faster and more efficient operations. Thus, it’s important to recognize what you can do to optimize your projections to improve query performance.

Customize Your Security Authentication in Vertica

Learn more about authentication methods in Vertica.

Best Practices for Using LDAP Link with Vertica

There are a few best practices that you should follow to make sure that you don’t accidentally lose any users or data. This blog explains how to keep your LDAP Link service working smoothly.

Post-upgrade Tasks for Saving Catalog Space

When you upgrade to 7.2 or later, not only can you take advantage of the new features, you can also perform tasks to save substantial space in your Vertica catalog.

Investigating Public Data: Road Safety in the U.K.

When learning new database applications, a good place to start is with some compelling, real-world data. It’s not necessarily so easy to find.

Meet our 2016 Summer Interns!

Our interns had an exciting summer at Vertica. Watch the video to learn more!

Introducing the Connector for Apache Spark

In Vertica version 8.0.0, we added integration for Apache Spark through our Vertica Connector for Apache Spark. This is a fast parallel connector that allows you to transfer data between Apache Spark and Vertica.

Troubleshooting Vertica Query Performance with System Tables

Do you want to learn how to troubleshoot your query performance issues?  We’ve got you covered. Just attend the Query Performance Tuning and Troubleshooting Issues session at Vertica’s Big Data Conference.

How to Load New Data and Modify Existing Data Simultaneously

Originally posted 8/27/2012. Many Vertica customers tell us “we have an OLTP workload,” which is not Vertica’s architectural sweet spot. However, when we dig into what they are actually doing, it often turns out that they are simply bulk loading mostly new data with some small number of updates to existing rows. In Vertica 6, […]

The Right Tool for the Job: Using Apache Hadoop with Vertica for Big Data Analytics

I have an entrepreneur friend who used to carry a butter knife around. He claimed this “almighty” tool was the only one he ever needed! While the butter knife does serve a wide range of purposes (especially with a stretch of the imagination), in practice it doesn’t always yield optimal results. For example, as a screwdriver, it may work for common screws, but certainly not a Phillips (unless you push down very hard and hope not to strip the screw). As a hammer, you may be able to drive finishing nails, but your success and mileage may vary. As a pry bar, well, I think you get my point! Clearly one tool isn’t sufficient for all purposes – a good toolbox includes various tools each fulfilling a specific purpose.

Vertica Interns Develop Real-World Business Solutions

It is a rare opportunity for interns at Vertica to be assigned to the same project. This summer, three interns have been working closely with the Partner Engineering team to build a data generator.

Intern Lunch with Vertica GM

Last week, the interns had lunch with the General Manager, Colin Mahony. The lunch started with everyone introducing themselves, and we’ve come to know that Colin impressively remembers everyone’s names. We shared lunch over a casual round-table discussion. We even got to enjoy a slice of cake afterwards.

Do You Need to Put Your Query on a Budget?

Before we scare you away with the word “budget”, rest assured that after reading this blog, you won’t have to give up your favorite activities or sell your car. What you will be able to do is understand how Vertica resource pool parameters affect query budget.

Jump Start your BI Dashboard Development with Vertica

Do you develop BI dashboards for Vertica? Or would you like to give it a try?

To help you get started, the Vertica Partner Engineering team has created a set of QuickStart BI sample apps. You can download them for free from the Microfocus Big Data Marketplace.

Analyze Mismatched Series with Event Series Joins

Event series occur in tables with a time column, most typically a TIMESTAMP data type. In Vertica, you perform an event series join to analyze two series in different tables when their measurement intervals don’t align, such as with mismatched timestamps.
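To make the idea concrete, here is a minimal sketch of what an event series join does conceptually. It is written in plain Python rather than Vertica SQL, and the function names and the sample bid and ask data are hypothetical. It assumes that gaps are filled with the most recent earlier value from each series, which is one common interpolation scheme and not necessarily a full picture of how Vertica handles every case.

```python
from bisect import bisect_right


def previous_value(series, ts):
    """Return the latest value in a time-sorted series at or before ts, or None."""
    times = [t for t, _ in series]
    i = bisect_right(times, ts)
    return series[i - 1][1] if i else None


def event_series_join(left, right):
    """Full outer join of two time-sorted series on time.

    Where one series has no reading at a given time, the gap is filled
    with that series' most recent earlier value (None if there is none).
    """
    all_times = sorted({t for t, _ in left} | {t for t, _ in right})
    return [(t, previous_value(left, t), previous_value(right, t)) for t in all_times]


# Hypothetical sample data: (time, value) pairs with mismatched timestamps.
bids = [(1, 10.0), (3, 10.5), (5, 11.0)]
asks = [(2, 10.2), (3, 10.6), (6, 11.3)]

joined = event_series_join(bids, asks)
for row in joined:
    print(row)
```

Each output row carries a value from both series for every timestamp that appears in either one, so the two series can be compared row by row even though they were measured at different times.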

Jump Start your ETL Application Development with Vertica

Interested in exploring the Vertica Analytic Database in the context of data movement and transformation? To get a feel for it, try our new ETL QuickStart sample apps. You’ll find them on the Big Data Marketplace. Our Partner Engineering team develops QuickStart apps using tools from our technology partners. Currently we have ETL QuickStarts for […]

A Day in the Life of a Vertica Intern

When I set out to find an internship for the summer, I was afraid that I would inevitably end up in a position where my primary responsibility was to fetch coffee. Fortunately for me, my first day on the job at Vertica proved that this would not be the case.

Which One of These is Not Like the Others?

With the new guaranteed uniqueness optimization feature in Vertica 7.2.2, Vertica automatically recognizes when a query is accessing columns with unique values and optimizes the query operations that would otherwise be bogged down due to duplicate values.

Classified: FAQs on Access Policies

In Vertica 7.2.2 we’ve added more security features, including a row-level access policy option. Combined with our previously existing column access policy, Vertica verifies that your data is more secure than ever.

Watch Machine Learning for Predictive Analytics in Action

Watch this video to learn more about the Vertica Machine Learning for Predictive Analytics features new in 7.2.

Vertica Champions Diversity

On Thursday, April 14, the Microfocus DataGals hosted a free screening of the award-winning documentary Code: Debugging the Gender Gap in Cambridge, MA.

Identifying Patterns in Your Data with Event Series Pattern Matching – Part 1

Vertica’s event series pattern matching functionality lets you identify events that occur in specific patterns. In this blog, we’ll introduce you to the key features of pattern matching.

Learn More From Your Data with Machine Learning Algorithms

New in Vertica 7.2.2 is the Machine Learning for Predictive Analytics package. This analytics package allows you to use built-in machine learning algorithms on data in your Vertica database. Machine learning algorithms are extremely valuable in data analytics because, as their name suggests, they can learn from your data and provide information about deductive and […]

Sidestepping Catastrophes with Vertica Backup and Recovery

Accidents do happen! Data can become corrupted. It can be unintentionally deleted and, in some rare cases, you can lose all your data.

Do you have a plan B to recover data in a timely manner?