Configure and Launch an Instance
After you configure your network settings on AWS, configure and launch the instances onto which you will install Vertica. An Elastic Compute Cloud (EC2) instance without a Vertica AMI is similar to a traditional host. Just like with an on-premises cluster, you must prepare and configure your cluster and network at the hardware level before you can install Vertica.
When you create an EC2 instance on AWS using a Vertica AMI, the instance includes the Vertica software and the recommended configuration. The Vertica AMI acts as a template, requiring fewer configuration steps. Vertica recommends that you use the Vertica AMI as is—without modification.
Configure EC2 Instances in AWS
- Select the Vertica AMI from the AWS marketplace.
- Select the desired fulfillment method.
- Configure the following:
- A supported instance type
- The number of instances you want to launch. A Vertica cluster usually uses identically configured instances of the same type.
- A VPC
- Placement Group
Add Storage to Your Instances
Consider the following issues when you add storage to your instances:
- Add a number of drives equal to the number of physical cores in your instance. For example, for a c3.8xlarge instance, 16 drives. For an r3.4xlarge, add 8 drives.
- Do not store your information on the root volume.
- Amazon EBS provides durable, block-level storage volumes that you can attach to running instances. For guidance on selecting and configuring an Amazon EBS volume type, see Amazon EBS Volume Types in the Amazon Web Services documentation.
Decide Whether to Configure EBS Volumes as a RAID Array
You can choose to configure your EBS volumes into a RAID 0 array to improve disk performance. Before doing so, use the vioperf utility to determine whether the performance of the EBS volumes is fast enough without using them in a RAID array. Pass vioperf the path to a mount point for an EBS volume. In this example, an EBS volume is mounted on a directory named /vertica/data:
[dbadmin@ip-10-11-12-13 ~]$ /opt/vertica/bin/vioperf /vertica/data The minimum required I/O is 20 MB/s read and write per physical processor core on each node, in full duplex i.e. reading and writing at this rate simultaneously, concurrently on all nodes of the cluster. The recommended I/O is 40 MB/s per physical core on each node. For example, the I/O rate for a server node with 2 hyper-threaded six-core CPUs is 240 MB/s required minimum, 480 MB/s recommended. Using direct io (buffer size=1048576, alignment=512) for directory "/vertica/data" test | directory | counter name | counter | counter | counter | counter | thread | %CPU | %IO Wait | elapsed | remaining | | | value | value (10 | value/core | value/core | count | | | time (s)| time (s) | | | | sec avg) | | (10 sec avg) | | | | | -------------------------------------------------------------------------------------------------------------------------------------------------------- Write | /vertica/data | MB/s | 259 | 259 | 32.375 | 32.375 | 8 | 4 | 11 | 10 | 65 Write | /vertica/data | MB/s | 248 | 232 | 31 | 29 | 8 | 4 | 11 | 20 | 55 Write | /vertica/data | MB/s | 240 | 234 | 30 | 29.25 | 8 | 4 | 11 | 30 | 45 Write | /vertica/data | MB/s | 240 | 233 | 30 | 29.125 | 8 | 4 | 13 | 40 | 35 Write | /vertica/data | MB/s | 240 | 233 | 30 | 29.125 | 8 | 4 | 13 | 50 | 25 Write | /vertica/data | MB/s | 240 | 232 | 30 | 29 | 8 | 4 | 12 | 60 | 15 Write | /vertica/data | MB/s | 240 | 238 | 30 | 29.75 | 8 | 4 | 12 | 70 | 5 Write | /vertica/data | MB/s | 240 | 235 | 30 | 29.375 | 8 | 4 | 12 | 75 | 0 ReWrite | /vertica/data | (MB-read+MB-write)/s| 237+237 | 237+237 | 29.625+29.625 | 29.625+29.625 | 8 | 4 | 22 | 10 | 65 ReWrite | /vertica/data | (MB-read+MB-write)/s| 235+235 | 234+234 | 29.375+29.375 | 29.25+29.25 | 8 | 4 | 20 | 20 | 55 ReWrite | /vertica/data | (MB-read+MB-write)/s| 234+234 | 235+235 | 29.25+29.25 | 29.375+29.375 | 8 | 4 | 20 | 30 | 45 ReWrite | /vertica/data | (MB-read+MB-write)/s| 233+233 | 234+234 | 29.125+29.125 | 29.25+29.25 | 8 | 4 | 18 | 40 | 35 ReWrite | /vertica/data | (MB-read+MB-write)/s| 233+233 | 234+234 | 29.125+29.125 | 29.25+29.25 | 8 | 4 | 20 | 50 | 25 ReWrite | /vertica/data | (MB-read+MB-write)/s| 234+234 | 235+235 | 29.25+29.25 | 29.375+29.375 | 8 | 3 | 19 | 60 | 15 ReWrite | /vertica/data | (MB-read+MB-write)/s| 233+233 | 236+236 | 29.125+29.125 | 29.5+29.5 | 8 | 4 | 21 | 70 | 5 ReWrite | /vertica/data | (MB-read+MB-write)/s| 232+232 | 236+236 | 29+29 | 29.5+29.5 | 8 | 4 | 21 | 75 | 0 Read | /vertica/data | MB/s | 248 | 248 | 31 | 31 | 8 | 4 | 12 | 10 | 65 Read | /vertica/data | MB/s | 241 | 236 | 30.125 | 29.5 | 8 | 4 | 15 | 20 | 55 Read | /vertica/data | MB/s | 240 | 232 | 30 | 29 | 8 | 4 | 10 | 30 | 45 Read | /vertica/data | MB/s | 240 | 232 | 30 | 29 | 8 | 4 | 12 | 40 | 35 Read | /vertica/data | MB/s | 240 | 234 | 30 | 29.25 | 8 | 4 | 12 | 50 | 25 Read | /vertica/data | MB/s | 238 | 235 | 29.75 | 29.375 | 8 | 4 | 15 | 60 | 15 Read | /vertica/data | MB/s | 238 | 232 | 29.75 | 29 | 8 | 4 | 13 | 70 | 5 Read | /vertica/data | MB/s | 238 | 238 | 29.75 | 29.75 | 8 | 3 | 9 | 75 | 0 SkipRead | /vertica/data | seeks/s | 22909 | 22909 | 2863.62 | 2863.62 | 8 | 0 | 6 | 10 | 65 SkipRead | /vertica/data | seeks/s | 21989 | 21068 | 2748.62 | 2633.5 | 8 | 0 | 6 | 20 | 55 SkipRead | /vertica/data | seeks/s | 21639 | 20936 | 2704.88 | 2617 | 8 | 0 | 7 | 30 | 45 SkipRead | /vertica/data | seeks/s | 21478 | 20999 | 2684.75 | 2624.88 | 8 | 0 | 6 | 40 | 35 SkipRead | /vertica/data | seeks/s | 21381 | 20995 | 2672.62 | 2624.38 | 8 | 0 | 5 | 50 | 25 SkipRead | /vertica/data | seeks/s | 21310 | 20953 | 2663.75 | 2619.12 | 8 | 0 | 5 | 60 | 15 SkipRead | /vertica/data | seeks/s | 21280 | 21103 | 2660 | 2637.88 | 8 | 0 | 8 | 70 | 5 SkipRead | /vertica/data | seeks/s | 21272 | 21142 | 2659 | 2642.75 | 8 | 0 | 6 | 75 | 0
If the EBS volume read and write performance (the entries with Read and Write in column 1 of the output) is greater than 20MB/s per physical processor core (columns 6 and 7), you do not need to configure the EBS volumes as a RAID array to meet the minimum requirements to run Vertica. You may still consider configuring your EBS volumes as a RAID array if the performance is less than the optimal 40MB/s per physical core (as is the case in this example).
If your EC2 instance has hyper-threading enabled, vioperf may incorrectly count the number of cores in your system. The 20MB/s throughput per core requirement only applies to physical cores, rather than virtual cores. If your EC2 instance has hyper-threading enabled, divide the counter value (column 4 in the output) by the number of physical cores. See CPU Cores and Threads Per CPU Core Per Instance Type section in the AWS documentation topic Optimizing CPU Options for a list of physical cores in each instance type.
If you determine you need to configure your EBS volumes as a RAID 0 array, see the AWS documentation topic RAID Configuration on Linux the steps you need to take.
Security Group and Access
- Choose between your previously configured security group or the default security group.
- Configure S3 access for your nodes by creating and assigning an IAM role to your EC2 instance. See AWS Authentication for more information.
Launch Instances
Verify that your instances are running.