
Data Normalization

How to use MinMax, Z-score, and Robust Z-score data preparation functions within Vertica

Vertica’s native ingest, data preparation, and model management features cover the entire data mining lifecycle, eliminating the need to export and load data into another tool for analysis, and then export the results back into Vertica.

Users can prepare data with functions for normalization, outlier detection, sampling, imbalanced data processing, missing value imputation, and more. The primary purpose of normalization is to bring numeric data from different columns onto a comparable scale. For example, suppose you run the LINEAR_REG function on a data set with two feature columns, current_salary and years_worked, to predict a worker’s future salary. The values in current_salary are likely to span a far wider range, and to be much larger, than the values in years_worked. As a result, current_salary can overshadow years_worked and skew your model.


Vertica offers the following data preparation methods for normalization:

MinMax

Using the MinMax normalization method, you can rescale the values in each column (for example, current_salary and years_worked) to fall within the range 0 to 1. Doing so lets you compare values on very different scales by reducing the dominance of one column over the other.
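To make the arithmetic concrete, here is a minimal Python sketch of MinMax scaling. It illustrates the general formula (value minus column minimum, divided by the column range) and is not Vertica's implementation. The helper name `minmax_normalize` and the sample values are hypothetical, invented for illustration.

```python
def minmax_normalize(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: there is no spread to rescale, so map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical sample data, invented for illustration only.
current_salary = [40000, 52000, 61000, 75000, 120000]
years_worked = [1, 3, 5, 8, 20]

salary_scaled = minmax_normalize(current_salary)
years_scaled = minmax_normalize(years_worked)

# After scaling, both columns occupy the same 0-to-1 range.
for s, y in zip(salary_scaled, years_scaled):
    print(f"{s:.3f}  {y:.3f}")
```

After scaling, the two columns share the same range, so neither dominates purely because of its units.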

Z-score

Using the Z-score normalization method, you can express each value as the number of standard deviations it lies from the mean of its column. This allows you to compare your data to a normally distributed random variable.
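The calculation can be sketched in a few lines of Python. This shows the textbook Z-score formula, not Vertica's implementation. The helper name `zscore_normalize` and the sample values are hypothetical. Using the population standard deviation (`statistics.pstdev`) is an assumption, since the text does not say which variant is used.

```python
from statistics import mean, pstdev


def zscore_normalize(values):
    """Express each value as its distance from the mean in standard deviations."""
    mu = mean(values)
    # Assumption: population standard deviation; a sample standard deviation
    # (statistics.stdev) would give slightly different results.
    sigma = pstdev(values)
    if sigma == 0:
        # Constant column: every value equals the mean.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical sample data, invented for illustration only.
years_worked = [1, 3, 5, 8, 20]
years_z = zscore_normalize(years_worked)

for raw, z in zip(years_worked, years_z):
    print(f"{raw:>3}  {z:+.3f}")
```

The normalized column has a mean of 0 and a standard deviation of 1, so values from different columns can be read on the same scale.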

Robust Z-score

The Robust Z-score normalization method lessens the influence of outliers on Z-score calculations. It uses the median of each column in place of the mean used by Z-score, so extreme values have less effect on the result.
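A rough Python sketch of one common robust Z-score definition follows. The text above only states that the median replaces the mean. Dividing by the median absolute deviation (MAD) scaled by 1.4826 is an assumption taken from the usual statistical definition, and may not match Vertica's exact calculation. The helper name `robust_zscore_normalize` and the sample values are hypothetical.

```python
from statistics import median


def robust_zscore_normalize(values, scale=1.4826):
    """Center on the median and divide by the scaled median absolute deviation."""
    med = median(values)
    # Assumption: spread is measured by the MAD; 1.4826 makes the MAD
    # comparable to a standard deviation for normally distributed data.
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # More than half the values equal the median: no usable spread.
        return [0.0 for _ in values]
    return [(v - med) / (scale * mad) for v in values]


# Hypothetical sample data with one large outlier, invented for illustration.
years_worked = [1, 3, 5, 8, 20]
years_rz = robust_zscore_normalize(years_worked)

for raw, rz in zip(years_worked, years_rz):
    print(f"{raw:>3}  {rz:+.3f}")
```

Because both the median and the MAD ignore how extreme the largest value is, replacing 20 with a far larger number leaves the scores of the other four values unchanged.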

 

 

Normalizing Data Using the MinMax Function in Vertica

The following chart compares raw source data with data that has been normalized using the MinMax normalization function in Vertica.

 

 

Normalizing Data Using the Z-score Function in Vertica

The following chart compares raw source data with data that has been normalized using the Z-score normalization function in Vertica.

 

 

Normalizing Data Using the Robust Z-score Function in Vertica

The following chart compares raw source data with data that has been normalized using the Robust Z-score normalization function in Vertica.


Copyright © 2023 Open Text Corporation. All rights reserved.
