
What is Spread?
Vertica uses Spread, an open source toolkit, to provide a high-performance control message service. Spread daemons start automatically when your database starts up for the first time. They run on the control nodes in your cluster, and the control nodes manage message communication.
On heavily loaded Vertica nodes, the Spread process might be starved of resources. If Spread is starved of memory, CPU, or network resources, the result can be:
• Nodes being dropped from the cluster
• Vertica being unable to open new sessions
• Existing sessions not being cleaned up
• Users being unable to run queries
This blog post explains some techniques you can use to dedicate CPU and memory resources to the Spread process and prevent CPU and memory starvation. Make any changes on all nodes. The settings described in this post apply only while the database instance they were applied to is up. They do not persist, so you must rerun the commands each time the Spread process restarts.
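Because the settings are lost when Spread restarts, one option is to keep the per-process commands in a small script that you rerun on every node. The following is a minimal sketch rather than a supported tool: the function name spread_tuning_commands is hypothetical, and the priority (-20) and CPU (0) are simply the example values used in the CPU Resources section of this post. It only prints the commands, so you can review them before running anything.

```shell
#!/bin/sh
# Hypothetical helper: print the commands that re-apply the example
# priority and CPU affinity settings to one Spread process ID.
# It prints only; nothing changes until you run the output yourself.
spread_tuning_commands() {
    pid=$1
    # -20 is the highest priority; CPU 0 is this post's example choice (assumption).
    printf 'renice -n -20 -p %s\n' "$pid"
    printf 'taskset -pc 0 %s\n' "$pid"
}
```

For example, spread_tuning_commands "$(pgrep spread$)" | sudo sh would apply both settings to the running Spread process on the local node.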
CPU Resources
If your machine shows a very high load average and your monitors show CPU usage above 80%, your machine might be CPU bound. In this case, you can raise the scheduling priority of the Spread process by lowering its nice value. A negative nice value means a higher priority, and -20 is the highest priority possible.
[@:/]$ uptime
11:32:17 up 1:42, 4 users, load average: 45.50, 42.50, 40.52
[@:/]$ ps -o pid,command,nice -p $(pgrep spread$)
PID COMMAND NI
19358 /opt/vertica/bin/spread 0
# give spread the highest possible priority (-20)
[@:/]$ sudo renice -20 -p $(pgrep spread$)
19358 (process ID) old priority 0, new priority -20
[@:/]$ ps -o pid,command,nice -p $(pgrep spread$)
PID COMMAND NI
19358 /opt/vertica/bin/spread -20
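One quick way to read the uptime output shown earlier is to compare the load average with the number of CPUs. The sketch below assumes a common rule of thumb (a 1-minute load average above the CPU count suggests that runnable tasks are queuing). That threshold and the function name is_cpu_bound are illustrative assumptions, not Vertica guidance.

```shell
#!/bin/sh
# Sketch: compare a load average with the number of CPUs.
# Usage: is_cpu_bound LOAD NCPUS   (prints "yes" or "no")
# Assumption: load above the CPU count is treated as a sign of CPU pressure.
is_cpu_bound() {
    awk -v load="$1" -v cpus="$2" \
        'BEGIN { if (load + 0 > cpus + 0) print "yes"; else print "no" }'
}

# On a live Linux node you could call it like this (not executed here):
#   is_cpu_bound "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

With the values from this post (a load average of 45.50 on a machine with CPUs 0-7), is_cpu_bound 45.50 8 prints yes.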
You can also dedicate a CPU exclusively to Spread and exclude Vertica from using it. If you do this, note that the Vertica process must perform the same work with one fewer CPU.
Option 1:
# assign CPU 0 to spread
[@:/]$ sudo taskset -pc 0 $(pgrep spread$)
pid 19358's current affinity list: 0-7
pid 19358's new affinity list: 0
# assign CPUs 1-7 (the remaining CPUs) to vertica
[@:/]$ sudo taskset -pc 1-7 $(pgrep vertica$)
pid 19360's current affinity list: 0-7
pid 19360's new affinity list: 1-7
[@:/]$ sudo taskset -p $(pgrep spread$)
pid 19358's current affinity mask: 1
[@:/]$ sudo taskset -p $(pgrep vertica$)
pid 19360's current affinity mask: fe
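The last two commands print the affinity as a hexadecimal bit mask rather than as a CPU list: bit N is set when CPU N is allowed, so CPU 0 alone is 1 and CPUs 1-7 are fe. The sketch below shows the arithmetic. The function name cpulist_to_mask is hypothetical, and it assumes CPU numbers small enough to fit in the shell's integer type (below 63).

```shell
#!/bin/sh
# Sketch: convert a CPU list such as "0", "1-7" or "0,2-3" into the
# hexadecimal affinity mask that `taskset -p` prints.
cpulist_to_mask() {
    mask=0
    # Split the list on commas, then expand each "first-last" range.
    for part in $(echo "$1" | tr ',' ' '); do
        first=${part%-*}   # text before the dash (or the whole item)
        last=${part#*-}    # text after the dash (or the whole item)
        cpu=$first
        while [ "$cpu" -le "$last" ]; do
            mask=$((mask | (1 << cpu)))   # set the bit for this CPU
            cpu=$((cpu + 1))
        done
    done
    printf '%x\n' "$mask"
}
```

Here cpulist_to_mask 0 prints 1 and cpulist_to_mask 1-7 prints fe, matching the taskset output above.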
Option 2:
You can create a control group (cgroup) that dedicates a CPU to Spread and also disables swapping for it. The following example assigns a single CPU exclusively to Spread and disables swapping:
# create a control group for spread
sudo cgcreate -t dbadmin:verticadba -g cpu,cpuset,cpuacct,memory:spread
# assign cpu 0 to spread
sudo cgset -r cpuset.cpus=0 spread
sudo cgset -r cpuset.cpu_exclusive=1 spread
# restrict all other processes, for any user, to CPUs 1-7
sudo cgset -r cpuset.cpus=1-7 /user
# prevent the spread process from swapping
sudo cgset -r memory.swappiness=0 spread
# allocate 100% of cpu 0 to spread
sudo cgset -r cpu.cfs_period_us=1000000 spread
sudo cgset -r cpu.cfs_quota_us=1000000 spread
# enable cpu accounting for spread
sudo cgset -r cpuacct.usage=0 spread
# attach spread process to cgroup
sudo cgclassify -g cpu,memory:spread $(pgrep spread$)
# verify which cgroups the spread process belongs to
cat /proc/$(pgrep spread$)/cgroup
# optionally, generate a persistent configuration and copy it to /etc/cgconfig.conf
sudo touch /etc/cgsnapshot_blacklist.conf && cgsnapshot -s
# clean up if you want to undo the changes
sudo cgdelete -g cpu,cpuset,memory:spread
The following is a sample control group configuration:
group spread {
    perm {
        admin {
            uid = root;
            gid = root;
        }
        task {
            uid = dbadmin;
            gid = verticadba;
        }
    }
    cpuset {
        cpuset.cpu_exclusive="1";
        cpuset.cpus="0";
    }
    cpu {
        cpu.cfs_period_us="1000000";
        cpu.cfs_quota_us="1000000";
        cpu.shares="1024";
    }
    cpuacct {
        cpuacct.usage="0";
    }
    memory {
        memory.swappiness="0";
    }
}
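The cpu.cfs_quota_us and cpu.cfs_period_us values work as a pair: the quota is the CPU time the group may use in each period, so quota divided by period gives the share of one CPU. The sketch below shows that arithmetic. The function name cfs_cpu_percent is hypothetical, and it treats a negative quota as "no limit", which is how the kernel interprets -1.

```shell
#!/bin/sh
# Sketch: express a CFS quota/period pair as a percentage of one CPU.
# Usage: cfs_cpu_percent QUOTA_US PERIOD_US
cfs_cpu_percent() {
    quota=$1
    period=$2
    if [ "$quota" -lt 0 ]; then
        # A quota of -1 means the group is not bandwidth-limited.
        echo "unlimited"
    else
        echo $(( quota * 100 / period ))
    fi
}
```

With the values from the example, cfs_cpu_percent 1000000 1000000 prints 100, that is, all of the single CPU assigned to the group.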
Option 3:
# alternatively, run all queries in a new pool set to use only CPU 1-7 (for example)
dbadmin=> CREATE RESOURCE POOL constraint_pool CPUAFFINITYSET '1-7' CPUAFFINITYMODE EXCLUSIVE EXECUTIONPARALLELISM 7;
CREATE RESOURCE POOL
dbadmin=> SELECT name, cpuaffinityset, cpuaffinitymode, executionparallelism FROM resource_pools WHERE name = 'constraint_pool';
      name       | cpuaffinityset | cpuaffinitymode | executionparallelism
-----------------+----------------+-----------------+----------------------
 constraint_pool | 1-7            | EXCLUSIVE       | 7
(1 row)
Memory Resources
Check whether an already running Spread process is swapping to disk using the following command:
grep VmSwap /proc/$(pgrep spread$)/status
VmSwap: 0 kB
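If you want to run this check from a monitoring script, the value can be extracted as a plain number. The following is a minimal sketch: the function name swap_kb_from_status is hypothetical, it reads the status text on standard input, and it prints 0 when no VmSwap line is present (an assumption made here so that callers always get a number).

```shell
#!/bin/sh
# Sketch: read /proc/<pid>/status text on stdin and print the VmSwap
# value in kB. Prints 0 if the input has no VmSwap line.
swap_kb_from_status() {
    awk '/^VmSwap:/ { kb = $2; found = 1 }
         END { print (found ? kb : 0) }'
}

# On a live node you could call it like this (not executed here):
#   swap_kb_from_status < /proc/$(pgrep spread$)/status
```

Any result other than 0 means that part of the Spread process has been swapped out.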
Then, prevent the kernel from swapping to disk unless it has to:
grep swappiness /etc/sysctl.conf
# The kernel will swap only to avoid an out of memory condition, otherwise swapping is disabled
vm.swappiness = 0
If Vertica is using most of the memory and this causes memory pressure, reduce the maximum memory of the Vertica general pool. Be aware that this can affect other pools that borrow from the general pool and queries that use it directly:
dbadmin=> SELECT name, maxmemorysize FROM resource_pools WHERE name='general';
  name   | maxmemorysize
---------+---------------
 general | Special: 95%
(1 row)
dbadmin=> ALTER RESOURCE POOL general MAXMEMORYSIZE '80%';
NOTICE 2585: Change takes effect upon restart. Recovering nodes will use the new value
ALTER RESOURCE POOL
dbadmin=> SELECT name, maxmemorysize FROM resource_pools WHERE name='general';
  name   | maxmemorysize
---------+---------------
 general | Special: 80%
(1 row)