Creating a Custom Resource

The custom resource definition (CRD) is a shared global object that extends the Kubernetes API beyond the standard resource types. The CRD serves as a blueprint for custom resource (CR) instances. You create CRs that specify the desired state of your environment, and the operator monitors the CR to maintain state for the objects within its namespace.

CRs use the YAML file format. For details about all available CR settings, see custom resource parameters.

Prerequisites

Creating Secrets

Use the kubectl command line tool to create Secrets that store sensitive information in your custom resource without exposing the values they represent.

  1. Create a secret named vertica-license for your Vertica license:

    $ kubectl create secret generic vertica-license --from-file=license.dat=/path/to/license.dat

    By default, the Helm chart uses the free Community Edition license. This license is limited to 3 nodes and 1 TB of data.

  2. Create a secret named su-passwd to store your superuser password. If you do not create this Secret, the database is created without a superuser password:

    $ kubectl create secret generic su-passwd --from-literal=password=secret-password
  3. The following command stores both your S3-compatible communal access and secret key credentials in a Secret named s3-creds:

    $ kubectl create secret generic s3-creds --from-literal=accesskey=accesskey --from-literal=secretkey=secretkey
  4. This tutorial configures a certificate authority (CA) bundle that authenticates the S3-compatible connections to your custom resource. Create a Secret named aws-cert:

    $ kubectl create secret generic aws-cert --from-file=root-cert.pem
  5. You can mount multiple certificates in the Vertica server filesystem. The following command creates a Secret for your mTLS certificate in a Secret named mtls:

    $ kubectl create secret generic mtls --from-file=mtls=/path/to/mtls-cert
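
To confirm that each Secret exists before you reference it in the custom resource, you can list them with kubectl. The names below match the Secrets created in the previous steps; adjust them if you used different names:

    $ kubectl get secrets vertica-license su-passwd s3-creds aws-cert mtls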

Required Fields

The VerticaDB definition begins with required fields that describe the version, resource type, and metadata:

apiVersion: vertica.com/v1beta1
kind: VerticaDB
metadata:
  name: verticadb-sample

The previous example defines the following:

  • apiVersion: The API group and Kubernetes API version in api-group/version format.
  • kind: The resource type. VerticaDB is the name of the Vertica custom resource type.
  • metadata: Optional data that identifies objects in the namespace.
    • name: The name of this CR object.

spec Definition

The spec field defines the desired state of the CR. During the control loop, the operator compares the spec values to the current state and reconciles any differences.

The following sections nest values under the spec field to define the desired state of your custom resource object.

Image Management

Each custom resource instance requires access to a Vertica server image and instructions for how often to pull a new image:

spec:
  image: vertica/vertica-k8s:latest
  imagePullPolicy: IfNotPresent

The previous example defines the following:

  • image: The image to run in the Vertica server container pod, defined here in docker-registry-hostname/image-name:tag format. For a full list of available Vertica images, see the Vertica Docker Hub registry.
  • imagePullPolicy: Controls when the operator pulls the image from the Docker Hub registry. IfNotPresent pulls the image only if it is not present in the local image repository.
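
After the pods are running, you can confirm which images the containers in each Vertica pod use. The label selector below assumes the pods carry the app.kubernetes.io/name=vertica label that also appears in the Node Affinity example later in this tutorial:

    $ kubectl get pods -l app.kubernetes.io/name=vertica \
        -o jsonpath='{.items[*].spec.containers[*].image}'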

Cluster Description Values

This section logically groups fields that configure the database and how it operates:

spec:
  ...
  initPolicy: Create
  kSafety: "1"
  licenseSecret: vertica-license
  superuserPasswordSecret: su-passwd

The previous example defines the following:

  • initPolicy: Specifies how to initialize the database. Create initializes a new database for the custom resource.
  • kSafety: Determines the fault tolerance for the subcluster. For a three-pod subcluster, set kSafety to 1.
  • licenseSecret: The Secret that contains your Vertica license key. The license is mounted in the /home/dbadmin/licensing/mnt directory.
  • superuserPasswordSecret: The Secret that contains the database superuser password.

Configuring Communal Storage

The following example configures communal storage for an S3 endpoint. For a list of supported communal storage locations, see Vertica in a Containerized Environment. For implementation details for each communal storage location, see Configuring Communal Storage.

Provide the location and credentials for the storage location in the communal section:

spec:
  ...
  communal:
    credentialSecret: s3-creds
    endpoint: https://path/to/s3-endpoint
    path: s3://bucket-name/key-name
    caFile: /certs/aws-cert/root-cert.pem
    region: aws-region

The previous example defines the following:

  • credentialSecret: The Secret that contains your communal access and secret key credentials.
  • endpoint: The S3 endpoint URL.
  • path: The location of the S3 storage bucket, in S3 bucket notation. This bucket must exist before you create the custom resource. After you create the custom resource, you cannot change this value.
  • caFile: The certificate file, mounted in the server container filesystem, that validates S3-compatible connections to your custom resource.
  • region: The geographic location of the communal storage resources.
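
Before you apply the CR, you can confirm that the s3-creds Secret contains the accesskey and secretkey entries that this example references. kubectl describe lists the keys and their sizes without printing the values:

    $ kubectl describe secret s3-creds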

Mounting Custom TLS Certificates

certSecrets is a list that contains each Secret that you created to encrypt internal and external communications for your CR. Use the name key to add each certificate:

spec:
  ...
  certSecrets:
    - name: mtls
    - name: aws-cert

certSecrets accepts an unlimited number of name values. If you update an existing certificate, the operator replaces the certificate and restarts the Vertica server container. If you add or delete a certificate, the operator reschedules the pod with the new configuration.
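
For example, to rotate the mtls certificate, you can regenerate the Secret in place. On current kubectl versions, piping --dry-run=client output to kubectl apply updates an existing Secret without deleting it first; the certificate path below is a placeholder:

    $ kubectl create secret generic mtls --from-file=mtls=/path/to/new-mtls-cert \
        --dry-run=client -o yaml | kubectl apply -f -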

Adding a Sidecar Container

A sidecar is a utility container that runs in the same pod as the Vertica server container and performs a task for the Vertica server process. For example, you can add a sidecar to send logs from vertica.log to the stdout on the host node for log aggregation.

sidecars accepts a list of sidecar definitions, where each element defines the following values:

spec:
  ...
  sidecars:
    - name: vlogger
      image: vertica/vertica-logger:1.0.0

The previous example defines the following:

  • name: The name of the sidecar. name indicates the beginning of a sidecar element.
  • image: The image for the sidecar container. Vertica provides the vlogger image that sends the contents of vertica.log to stdout on the host node.
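
Because the vlogger sidecar writes the contents of vertica.log to its own stdout, you can read the log through kubectl by naming the sidecar container; the pod name below is a placeholder:

    $ kubectl logs <pod-name> -c vlogger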

Because this sidecar performs a task that requires that it persist data between pod life cycles, the following section mounts a custom volume in the sidecar filesystem.

Mounting Custom Volumes

You might need to mount a custom volume to persist data between pod life cycles, such as for the sidecar container described in the previous section.

Use the volumeMounts.* parameters to mount one or more custom volumes. To mount a custom volume for the Vertica server container, add the volumeMounts.* values directly under spec. To mount a custom volume for a sidecar container, nest the volumeMounts.* values in the sidecars array as part of an individual sidecar element definition.

The following example mounts a custom volume named tenants-vol in the Vertica server container, and a custom volume named logger-vol in the sidecar container:

spec:
  ...
  volumeMounts:
  - name: tenants-vol
    mountPath: /path/to/tenants-vol
  ...
  sidecars:
    - name: vlogger
      image: vertica/vertica-logger:1.0.0
      volumeMounts:
        - name: logger-vol
          mountPath: /path/to/logger-vol

The previous example defines the following:

  • volumeMounts: Accepts a list of custom volumes and mount paths to persist data for the Vertica pod.
  • volumeMounts.name: The name of the custom volume that persists data.
  • volumeMounts.mountPath: The path to the custom volume mount point in the Vertica pod filesystem.

The volumes.* parameters make the custom volume available to the CR so that it can be mounted in the appropriate container filesystem. Indent each volumes entry to the same level as its corresponding volumeMounts entry. The following example mounts custom volumes for both the Vertica server container and the sidecar utility container:

spec:
  ...
  volumeMounts:
  - name: tenants-vol
    mountPath: /path/to/tenants-vol
  volumes:
    - name: tenants-vol
      persistentVolumeClaim:
        claimName: vertica-pvc
  ...
  sidecars:
    - name: vlogger
      image: vertica/vertica-logger:1.0.0
      volumeMounts:
        - name: logger-vol
          mountPath: /path/to/logger-vol
      volumes:
        - name: logger-vol
          emptyDir: {}

The previous example defines the following:

  • volumes: Accepts a list of custom volumes and volume types to persist data for a container.
  • volumes.name: The name of the custom volume that persists data. This value must match the corresponding volumeMounts.name value.
  • persistentVolumeClaim and emptyDir: The volume type and name. The Vertica custom resource accepts any Kubernetes volume type.
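
Because any Kubernetes volume type is accepted, tenants-vol could, for example, be backed by a hostPath volume instead of a PersistentVolumeClaim. The host path below is illustrative only:

spec:
  ...
  volumes:
    - name: tenants-vol
      hostPath:
        path: /mnt/tenants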

Local Container Information

Each container persists catalog, depot, configuration, and log data in a PersistentVolume (PV). You must provide information about the data and depot locations for operations such as pod rescheduling:

spec:
  ...
  local:
    dataPath: /data
    depotPath: /depot
    requestSize: 500Gi

The previous example defines the following:

  • dataPath: Where the /data directory is mounted in the container filesystem. The /data directory stores the local catalogs and temporary files.
  • depotPath: Where the depot is mounted in the container filesystem. Eon Mode databases cache data locally in a depot to reduce the time it takes to fetch data from communal storage to perform operations.
  • requestSize: The minimum size of the local data volume that must be available when binding a PV to the pod.

    To ensure you do not run out of disk space, verify that the underlying storage is large enough to accommodate the requestSize setting.

You must configure a StorageClass to bind the pods to a PersistentVolumeClaim (PVC). For details, see Containerized Vertica on Kubernetes.
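
To see which StorageClasses are available in your cluster, and which one is marked as the default, run:

    $ kubectl get storageclass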

Shard Count

The shardCount setting specifies the number of shards in the database:

spec:
  ...
  shardCount: 12

You cannot change this value after you instantiate the CR. When you change the number of pods in a subcluster, or add or remove a subcluster, the operator rebalances shards automatically.

For guidance on selecting the shard count, see Configuring Your Vertica Cluster for Eon Mode.

Subcluster Definition

The subclusters section is a list of elements, where each element represents a subcluster and its properties. Each CR requires a primary subcluster or it returns an error:

spec:
  ...
  subclusters:
  - isPrimary: true
    name: primary-subcluster
    size: 3

The previous example defines the following:

  • isPrimary: Designates a subcluster as primary or secondary. Each CR requires a primary subcluster or it returns an error. For details, see Subclusters.
  • name: The name of the subcluster.

    The default subcluster name that the Vertica server generates is default_subcluster. This name is invalid for Kubernetes resource types. You must provide a valid name that follows Kubernetes guidelines.

  • size: The number of pods in the subcluster.

Subcluster Service Object

Each subcluster communicates with external clients and internal pods through a service object:

spec:
  ...
  subclusters:
    ...
    serviceType: NodePort
    nodePort: 32001

In the previous example:

  • serviceType: Defines the subcluster service object.

    By default, a subcluster uses the ClusterIP serviceType, which sets a stable IP and port that is accessible from within Kubernetes only. In many circumstances, external client applications need to connect to a subcluster that is fine-tuned for that specific workload. For external client access, set the serviceType to NodePort or LoadBalancer.

  • nodePort: Assigns a unique port number in the 30000 - 32767 range for client access.

    Each subcluster that uses the NodePort service type requires a unique port for client access. If you do not provide a nodePort value, Kubernetes assigns a unique port in that range automatically. As a best practice, let Kubernetes assign the port number to avoid potential collisions.

    You must create a firewall rule that allows TCP connections on the external port assigned with nodePort.

    To verify the port number, use the kubectl get svc command and view the PORT(S) column:

    $ kubectl get svc
    NAME                  TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                         AGE
    ...
    vertica-dashboard     NodePort    10.96.136.13   <none>        5433:32000/TCP,5444:30288/TCP   21d

For details about Vertica and service objects, see Containerized Vertica on Kubernetes.

Pod Resource Limits and Requests

Set the amount of CPU and memory resources each host node allocates for the Vertica server pod, and the amount of resources each pod can request:

spec:
  ...
  subclusters:
    ...
    resources:
      limits:
        cpu: 32
        memory: 96Gi
      requests:
        cpu: 32
        memory: 96Gi

In the previous example:

  • resources: The amount of resources each pod requests from its host node. When you change resource settings, Kubernetes restarts each pod with the updated resource configuration.
  • limits: The maximum amount of CPU and memory that each server pod can consume.
  • requests: The amount of CPU and memory resources that each pod requests from its host node.

    The default configuration sets requests and limits based on Recommendations for Sizing Vertica Nodes and Clusters. These values are intended for testing environments and are not suitable for production workloads.

    As a best practice, set the resource requests and limits to equal values so that the pods are assigned to the Guaranteed QoS class. Equal settings also provide the best safeguard against the Out Of Memory (OOM) Killer in constrained environments.

    Select resource settings that your host nodes can accommodate. When a pod is started or rescheduled, Kubernetes searches for host nodes with enough resources available to start the pod. If no host node has sufficient resources, the pod STATUS remains Pending until enough resources become available.
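
    If pods do not start, you can check their status and the scheduler's events; kubectl describe reports reasons such as insufficient CPU or memory. The pod name below is a placeholder:

    $ kubectl get pods
    $ kubectl describe pod <pod-name>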

Node Affinity

Kubernetes provides affinity and anti-affinity settings to control which resources the operator uses to schedule pods. As a best practice, set affinity to ensure that a single node does not serve two Vertica pods:

spec:
  ...
  subclusters:
    ...
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
            - key: app.kubernetes.io/name
              operator: In
              values:
              - vertica
          topologyKey: "kubernetes.io/hostname"

In the previous example:

  • affinity: Provides control over pod and host scheduling using labels.
  • podAntiAffinity: Uses pod labels to prevent scheduling on certain resources.
  • requiredDuringSchedulingIgnoredDuringExecution: The rules defined under this statement must be met before a pod is scheduled on a host node.
  • labelSelector: Identifies the pods affected by this affinity rule.
  • matchExpressions: A list of pod selector requirements that consists of a key, operator, and values definition. This matchExpression rule checks if the host node is running another pod that uses a vertica label.
  • topologyKey: Defines the scope of the rule. Because this example uses the kubernetes.io/hostname topology label, the rule is applied per host node.
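
To confirm that the rule spreads the Vertica pods across host nodes, list the pods with their assigned nodes and check that the NODE column shows a different host for each pod:

    $ kubectl get pods -o wide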

Complete File Reference

As a reference, below is the complete CR YAML file created in this tutorial:

apiVersion: vertica.com/v1beta1
kind: VerticaDB
metadata:
  name: verticadb-sample
spec:
  image: vertica/vertica-k8s:11.0.1-0
  imagePullPolicy: IfNotPresent
  initPolicy: Create
  kSafety: "1"
  licenseSecret: vertica-license
  superuserPasswordSecret: su-passwd
  communal:
    credentialSecret: s3-creds
    endpoint: https://path/to/s3-endpoint
    path: s3://bucket-name/key-name
    caFile: /certs/aws-cert/root-cert.pem
    region: aws-region
  volumeMounts:
  - name: tenants-vol
    mountPath: /path/to/tenants-vol
  volumes:
    - name: tenants-vol
      persistentVolumeClaim:
        claimName: vertica-pvc
  sidecars:
    - name: vlogger
      image: vertica/vertica-logger:1.0.0
      volumeMounts:
        - name: logger-vol
          mountPath: /path/to/logger-vol
      volumes:
        - name: logger-vol
          emptyDir: {}
  certSecrets:
    - name: mtls
    - name: aws-cert
  local:
    dataPath: /data
    depotPath: /depot
    requestSize: 500Gi
  shardCount: 12
  subclusters:
  - isPrimary: true
    name: primary-subcluster
    size: 3
    serviceType: NodePort
    nodePort: 32001
    resources:
      limits:
        cpu: 32
        memory: 96Gi
      requests:
        cpu: 32
        memory: 96Gi
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
            - key: app.kubernetes.io/name
              operator: In
              values:
              - vertica
          topologyKey: "kubernetes.io/hostname"