Vertica in a Virtualized Environment
Vertica supports running in any virtualized environment that conforms to the performance requirements for vioperf, vnetperf, and vcpuperf.
Vertica does not support VM Snapshot.
Vertica does not support suspending or migrating virtual machines while Vertica is running. A virtual machine that is suspended or migrated will in all likelihood be marked as DOWN by the Vertica cluster, which reduces the overall performance of the cluster or, in a worst-case scenario, causes the cluster to crash.
Vertica has tested VMware, and when the underlying hardware is configured correctly, VMware performs well. Customers have also deployed other virtualization configurations successfully. If you choose to run Vertica on a different virtualization configuration and you experience an issue, the Vertica Support team may ask you to reproduce the issue using a bare-metal environment to aid in troubleshooting. Depending on the details of the case, the Support team may also ask you to enter a support ticket with your virtualization vendor.
Guidelines for Hypervisor and Virtual Machine Configuration
There are many enterprise-grade hypervisors available on the market today, most of which can host the Linux-based virtual machines (VMs) that Vertica runs on. When selecting and configuring your virtual environment, refer to the following guidelines.
- Do not over-subscribe the physical resources (CPU, memory, and network) of the hosting hardware. Many hypervisors let you over-subscribe resources, for example, by deploying more virtual CPUs than are physically installed in the host hardware. However, this type of deployment has a negative performance effect on a Vertica cluster.
- Configure the hypervisor to run low-latency, high-performance applications. This means that you should disable power-saving features and CPU frequency scaling on the hypervisor hardware because these technologies contribute to latency in the applications.
- Choose an operating system for the Vertica VMs that is supported by Vertica and by the hypervisor you are using. For some hypervisors, different operating systems may perform better than others. Vertica recommends that you investigate the options with your hypervisor vendor.
- Configure attached storage for high I/O performance. A virtualized Vertica node requires the same amount of disk I/O performance as a non-virtualized one. Vertica recommends that customers use the vioperf utility to validate the actual performance throughput being achieved on each VM.
- If you are providing storage using a shared storage device, make sure to validate disk I/O performance on the cluster as a whole to ensure that the shared resource(s) do not create a bottleneck. To achieve this validation, run the vioperf utility on all the cluster nodes simultaneously to determine the maximum disk I/O performance that can be achieved on each VM during times of heavy I/O load.
- Memory recommendations for Vertica running in a virtualized environment are the same as for a non-virtualized environment. Vertica recommends that you allocate 8 GB of memory per virtual core. Again, do not over-subscribe the memory available in the hypervisor, because this creates contention for the physical resources, degrades performance, and can crash the VMs.
- Networking requirements for a virtualized Vertica cluster are the same as for a non-virtualized cluster. Each node in the cluster must be able to communicate with all the other nodes, and latency in those communications can have a negative effect on cluster performance. When you are running multiple virtual machines on a single host server, the network communication is very fast because the network traffic is virtualized in the memory space of the hypervisor and never leaves the physical server. However, if the cluster expands beyond a single host, the physical networking of that host can become a bottleneck for the cluster. If you are deploying in a virtual environment, make sure that the environment has a robust networking infrastructure that can provide the necessary connection speeds between physical hosts. In most cases, this means multiple 10 GbE networking connections. Use the vnetperf utility to validate actual network performance speeds between nodes in your Vertica cluster.
- When deploying multiple Vertica VMs per physical host, the fewer the better. The goal of virtualization is to consolidate workloads to reduce overall hardware footprints. However, running multiple Vertica VMs on the same host can place the Vertica cluster in a situation where a single hardware failure can take down multiple nodes in a cluster, and perhaps even the cluster itself. Vertica recommends that when you virtualize a Vertica cluster, you spread the VMs across as many physical hosts as possible, with an ideal goal of having one Vertica VM per physical host.
- While virtual networking can be very robust, Vertica has found that UDP broadcast traffic that is used in the spread daemon can be unreliable in most virtual environments, especially when those environments are spread across more than one physical host. In order for Vertica to function effectively in a virtualized environment, use the --point-to-point flag when you execute the /opt/vertica/sbin/install_vertica script. This flag configures the spread daemons to communicate directly with one another.
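The sizing guidelines above (no CPU or memory over-subscription, 8 GB of memory per virtual core, and as few Vertica VMs per physical host as possible) can be checked against a planned layout before any VM is created. The following is a minimal sketch and not a Vertica tool: the VM and Host classes, the check_host function, and the example host and VM names are all hypothetical. It also assumes that "physical cores" means the cores you are willing to dedicate to VMs, and it ignores the memory the hypervisor itself needs.

```python
from dataclasses import dataclass

# Guideline from the text above: 8 GB of memory per virtual core.
GB_PER_VCPU = 8


@dataclass
class VM:
    name: str
    vcpus: int
    memory_gb: int


@dataclass
class Host:
    name: str
    physical_cores: int  # assumption: cores available to VMs on this host
    memory_gb: int       # assumption: hypervisor overhead is not subtracted
    vms: list


def check_host(host):
    """Return a list of warnings for one physical host (empty if none)."""
    warnings = []
    total_vcpus = sum(vm.vcpus for vm in host.vms)
    total_mem = sum(vm.memory_gb for vm in host.vms)

    # CPU must not be over-subscribed.
    if total_vcpus > host.physical_cores:
        warnings.append(
            f"{host.name}: CPU over-subscribed "
            f"({total_vcpus} vCPUs on {host.physical_cores} cores)"
        )
    # Memory must not be over-subscribed.
    if total_mem > host.memory_gb:
        warnings.append(
            f"{host.name}: memory over-subscribed "
            f"({total_mem} GB allocated, {host.memory_gb} GB installed)"
        )
    # Each VM should have at least 8 GB per virtual core.
    for vm in host.vms:
        needed = vm.vcpus * GB_PER_VCPU
        if vm.memory_gb < needed:
            warnings.append(
                f"{vm.name}: {vm.memory_gb} GB is below the "
                f"{needed} GB suggested for {vm.vcpus} vCPUs"
            )
    # Fewer VMs per host is better; one per host is the ideal.
    if len(host.vms) > 1:
        warnings.append(
            f"{host.name}: {len(host.vms)} Vertica VMs share this host, "
            "so one hardware failure affects all of them"
        )
    return warnings


if __name__ == "__main__":
    plan = Host("host-b", 16, 128, [VM("v2", 12, 64), VM("v3", 12, 96)])
    for line in check_host(plan):
        print(line)
```

A check like this only catches planning mistakes. It does not replace running vioperf, vnetperf, and vcpuperf on the deployed VMs to measure what the environment actually delivers.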