Cluster Scaling & Other Operations

This document describes how to scale the Kubernetes management cluster that is part of your self-hosted Private Cloud Director deployment, along with related operations.

Scale Up Management Cluster

Let's assume the management cluster has three master nodes, with IP addresses 1.1.1.1, 2.2.2.2, and 3.3.3.3.

$ cat /opt/pf9/airctl/conf/nodelet-bootstrap-config.yaml
...
masterNodes:
- nodeName: 1.1.1.1
- nodeName: 2.2.2.2
- nodeName: 3.3.3.3

Let's say you want to scale the management cluster up to 5 nodes, and that 4.4.4.4 and 5.5.5.5 are the IP addresses of the two new nodes to be added.

To scale up the number of cluster nodes to 5:

  • Configure the prerequisites on the two new nodes.

  • Edit the cluster bootstrap configuration file /opt/pf9/airctl/conf/nodelet-bootstrap-config.yaml and add the two new IP addresses to the masterNodes section of the file.

  • Finally, run the airctl command shown below to scale up the cluster.

Note

airctl expects the node count in the cluster bootstrap configuration file /opt/pf9/airctl/conf/nodelet-bootstrap-config.yaml to always be odd.
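Before scaling, you can count the masterNodes entries and confirm the total is odd. A minimal sketch (the sample config is generated under /tmp purely for illustration; on a real deployment, point CONFIG at /opt/pf9/airctl/conf/nodelet-bootstrap-config.yaml instead):

```shell
# Sketch: verify the masterNodes count is odd before running scale-cluster.
# A sample config is written to /tmp for illustration only; set CONFIG to
# /opt/pf9/airctl/conf/nodelet-bootstrap-config.yaml on a real deployment.
CONFIG=${CONFIG:-/tmp/nodelet-bootstrap-config.yaml}
cat > "$CONFIG" <<'EOF'
masterNodes:
- nodeName: 1.1.1.1
- nodeName: 2.2.2.2
- nodeName: 3.3.3.3
- nodeName: 4.4.4.4
- nodeName: 5.5.5.5
EOF
# Each master entry contributes exactly one "nodeName:" line.
count=$(grep -c 'nodeName:' "$CONFIG")
if [ $((count % 2)) -eq 1 ]; then
  echo "ok: $count master nodes (odd)"
else
  echo "error: $count master nodes (even); airctl expects an odd count" >&2
fi
```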

$ cat /opt/pf9/airctl/conf/nodelet-bootstrap-config.yaml
...
masterNodes:
- nodeName: 1.1.1.1
- nodeName: 2.2.2.2
- nodeName: 3.3.3.3
- nodeName: 4.4.4.4
- nodeName: 5.5.5.5
airctl scale-cluster --config /opt/pf9/airctl/conf/airctl-config.yaml --verbose

Now verify that the management cluster has scaled up by querying the cluster nodes.

$ kubectl get nodes
NAME            STATUS   ROLES    AGE      VERSION
1.1.1.1         Ready    master   44m29s   v1.29.2
2.2.2.2         Ready    master   45m41s   v1.29.2
3.3.3.3         Ready    master   46m42s   v1.29.2
4.4.4.4         Ready    master   5m42s    v1.29.2
5.5.5.5         Ready    master   5m40s    v1.29.2
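If you script this verification, you can count the Ready nodes directly. A sketch that parses captured output (NODES_OUT here is a sample; in practice capture it with kubectl get nodes --no-headers):

```shell
# Sketch: count Ready nodes to confirm the scale-up reached 5 masters.
# NODES_OUT is a captured sample; in practice use:
#   NODES_OUT=$(kubectl get nodes --no-headers)
NODES_OUT='1.1.1.1   Ready   master   44m   v1.29.2
2.2.2.2   Ready   master   45m   v1.29.2
3.3.3.3   Ready   master   46m   v1.29.2
4.4.4.4   Ready   master   5m    v1.29.2
5.5.5.5   Ready   master   5m    v1.29.2'
# Column 2 is STATUS; count rows where it equals "Ready".
ready=$(echo "$NODES_OUT" | awk '$2 == "Ready" { c++ } END { print c }')
echo "$ready nodes Ready"
```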

Management Cluster Status

To check the status of your management cluster, run the following command:

airctl status --config /opt/pf9/airctl/conf/airctl-config.yaml --region <REGION_NAME>

Sample output:

airctl status --config /opt/pf9/airctl/conf/airctl-config.yaml --region foo-region1
------------- deployment details ---------------
fqdn:                foo-region1.bar.io
cluster:             foo-kplane.bar.io
region:              foo-region1
task state:          ready
version:             v-5.12.0-3479469
-------- region service status ----------
desired services:  45
ready services:    45
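For scripted health checks, you can compare the desired and ready service counts from the status output. A sketch against a captured sample (STATUS_OUT is illustrative; in practice capture the real command's output):

```shell
# Sketch: compare desired vs. ready services from `airctl status` output.
# STATUS_OUT is a captured sample; in practice use something like:
#   STATUS_OUT=$(airctl status --config /opt/pf9/airctl/conf/airctl-config.yaml --region foo-region1)
STATUS_OUT='desired services:  45
ready services:    45'
# Field 3 on each matching line is the count.
desired=$(echo "$STATUS_OUT" | awk '/desired services/ { print $3 }')
ready=$(echo "$STATUS_OUT" | awk '/ready services/ { print $3 }')
if [ "$desired" = "$ready" ]; then
  echo "all $ready services ready"
else
  echo "waiting: $ready of $desired services ready" >&2
fi
```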

Scale Down Management Cluster

Now let's assume that we want to remove nodes 2.2.2.2 and 3.3.3.3 from the management cluster. To scale down the cluster:

  • Edit the cluster bootstrap configuration file and remove the IP addresses of the two nodes.

  • Then run the airctl command as shown below to scale down the cluster.

$ cat /opt/pf9/airctl/conf/nodelet-bootstrap-config.yaml
...
masterNodes:
- nodeName: 1.1.1.1
- nodeName: 4.4.4.4
- nodeName: 5.5.5.5
$ airctl scale-cluster --config /opt/pf9/airctl/conf/airctl-config.yaml --verbose
2024-12-05T01:06:21.279Z        info    Removing node 2.2.2.2 from cluster airctl-mgmt
2024-12-05T01:06:21.279Z        info    Deleting nodelet
2024-12-05T01:06:21.279Z        info    Removing nodelet with cmd: apt remove -y nodelet
...
cannot remove '/run/containerd/io.containerd.grpc.v1.cri/sandboxes/f5cc808d52184fa092b1c9de2cef7a4ef9d606cdd1877be9efe5d4c91ecc4604/shm': Device or resource busy
Failed to update nodelet cluster: ScaleCluster failed to remove old masters: failed to delete node 2.2.2.2: failed: sudo rm -rf /run/containerd: command sudo sudo rm -rf /run/containerd failed: Process exited with status 1
Error: ScaleCluster failed to remove old masters: failed to delete node 2.2.2.2: failed: sudo rm -rf /run/containerd: command sudo sudo rm -rf /run/containerd failed: Process exited with status 1
  • Because the scale command errored out while removing the first node, run it again to remove the second node. Be sure to repeat the manual workaround steps for node 3.3.3.3 in this case as well.

  • Cluster state after the scale-down operation completes:

$ kubectl get nodes
NAME            STATUS   ROLES    AGE      VERSION
1.1.1.1         Ready    master   44m29s   v1.29.2
4.4.4.4         Ready    master   5m42s    v1.29.2
5.5.5.5         Ready    master   5m40s    v1.29.2
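Since each failed run can still remove one node, re-running scale-cluster until it succeeds is the pattern here. A sketch of that retry loop, where run_scale is a stub standing in for the real airctl invocation (it fails once to simulate the "Device or resource busy" error):

```shell
# Sketch of the retry pattern for scale-down failures. run_scale is a stub
# that fails on its first call, simulating the containerd busy error;
# replace its body with the real `airctl scale-cluster ...` command.
attempts=0
run_scale() {
  attempts=$((attempts + 1))
  [ "$attempts" -ge 2 ]   # stub: first call fails, later calls succeed
}
result="failed"
for i in 1 2 3; do
  if run_scale; then
    result="succeeded after $attempts attempt(s)"
    break
  fi
  echo "attempt $attempts failed; re-running scale-cluster" >&2
done
echo "scale-cluster $result"
```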

Stop Management Plane/Regions

To stop a specific region of your self-hosted deployment, run the following command. To stop all regions, omit the --region flag.

airctl stop --config /opt/pf9/airctl/conf/airctl-config.yaml --region <REGION_NAME>

Sample output:

airctl stop --config /opt/pf9/airctl/conf/airctl-config.yaml --region foo-region1
 SUCCESS  scaling down management plane foo-region1

Start Management Plane/Regions

To start a specific region of your self-hosted deployment, run the following command. To start all regions, omit the --region flag.

airctl start --config /opt/pf9/airctl/conf/airctl-config.yaml --region <REGION_NAME>

Sample output:

airctl start --config /opt/pf9/airctl/conf/airctl-config.yaml --region foo-region1
 SUCCESS  scaling up management plane foo-region1

Uninstall Self-Hosted Deployment

To uninstall a specific region of your self-hosted deployment, run the following command. To uninstall all regions, omit the --region flag.

airctl unconfigure-du --config /opt/pf9/airctl/conf/airctl-config.yaml --region <REGION_NAME> --force

This command uninstalls and removes the configured regions along with all infrastructure software, such as Consul, Vault, Percona, and k8sniff.

If you plan to reuse the same nodes to deploy a new self-hosted Private Cloud Director environment, make sure to also run the following command on all nodes first.

rm -rf airctl* install-pcd.sh nodelet* options.json pcd-chart.tgz /opt/pf9/airctl/ .airctl/
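To run that cleanup across several nodes, a dry-run sketch that prints the per-node commands. The NODES list and the assumption that the artifacts live in the login user's home directory are illustrative; replace `echo ssh` with `ssh` (assuming passwordless SSH) to actually execute:

```shell
# Sketch (dry run): print the cleanup command for each node. NODES and the
# home-directory assumption are illustrative; replace `echo ssh` with `ssh`
# to execute for real.
NODES="1.1.1.1 4.4.4.4 5.5.5.5"
for n in $NODES; do
  echo ssh "$n" "rm -rf airctl* install-pcd.sh nodelet* options.json pcd-chart.tgz /opt/pf9/airctl/ .airctl/"
done
```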
