Cluster Scaling & Other Operations
This document describes the steps to scale the Kubernetes management cluster that is part of your self-hosted Private Cloud Director deployment.
Scale Up Management Cluster
Assume that your management cluster has three master nodes, with IP addresses 1.1.1.1, 2.2.2.2, and 3.3.3.3.
```shell
$ cat /opt/pf9/airctl/conf/nodelet-bootstrap-config.yaml
...
masterNodes:
- nodeName: 1.1.1.1
- nodeName: 2.2.2.2
- nodeName: 3.3.3.3
```

Let's say that you want to scale up the management cluster to 5 nodes, and that 4.4.4.4 and 5.5.5.5 are the IP addresses of the two new nodes to be added.
To scale up the number of cluster nodes to 5:

1. Configure the prerequisites on the two new nodes.
2. Edit the cluster bootstrap configuration file `/opt/pf9/airctl/conf/nodelet-bootstrap-config.yaml` and add the two new IP addresses to the `masterNodes` section of the file.
3. Run the `airctl` command shown below to scale up the cluster.
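Editing the configuration file by hand works fine; the edit can also be scripted. Below is a minimal sketch: the `add_masters` helper is hypothetical (not part of airctl) and assumes the `masterNodes` list is the last section of the file. If other sections follow it, edit the file by hand instead.

```shell
# Hypothetical helper: append master node entries to a bootstrap config file.
# Assumes the masterNodes list is the last section of the file.
add_masters() {
  conf="$1"; shift
  for ip in "$@"; do
    printf -- '- nodeName: %s\n' "$ip" >> "$conf"
  done
}

# Example:
# add_masters /opt/pf9/airctl/conf/nodelet-bootstrap-config.yaml 4.4.4.4 5.5.5.5
```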
```shell
$ cat /opt/pf9/airctl/conf/nodelet-bootstrap-config.yaml
...
masterNodes:
- nodeName: 1.1.1.1
- nodeName: 2.2.2.2
- nodeName: 3.3.3.3
- nodeName: 4.4.4.4
- nodeName: 5.5.5.5
```

```shell
airctl scale-cluster --config /opt/pf9/airctl/conf/airctl-config.yaml --verbose
```

Now verify that the management cluster has scaled up by querying the cluster nodes.
```shell
$ kubectl get nodes
NAME      STATUS   ROLES    AGE      VERSION
1.1.1.1   Ready    master   44m29s   v1.29.2
2.2.2.2   Ready    master   45m41s   v1.29.2
3.3.3.3   Ready    master   46m42s   v1.29.2
4.4.4.4   Ready    master   5m42s    v1.29.2
5.5.5.5   Ready    master   5m40s    v1.29.2
```

Management Cluster Status
To check the status of your management cluster, run the following command:
```shell
airctl status --config /opt/pf9/airctl/conf/airctl-config.yaml --region <REGION_NAME>
```

Sample output:
```shell
airctl status --config /opt/pf9/airctl/conf/airctl-config.yaml --region foo-region1
# Sample output:
------------- deployment details ---------------
fqdn: foo-region1.bar.io
cluster: foo-kplane.bar.io
region: foo-region1
task state: ready
version: v-5.12.0-3479469
-------- region service status ----------
desired services: 45
ready services: 45
```

Scale Down Management Cluster
Now let's assume that we want to remove nodes 2.2.2.2 and 3.3.3.3 from the management cluster. To scale down the cluster:

1. Edit the cluster bootstrap configuration file and remove the IP addresses of the two nodes.
2. Run the `airctl` command shown below to scale down the cluster.
```shell
$ cat /opt/pf9/airctl/conf/nodelet-bootstrap-config.yaml
...
masterNodes:
- nodeName: 1.1.1.1
- nodeName: 4.4.4.4
- nodeName: 5.5.5.5
```

```shell
airctl scale-cluster --config /opt/pf9/airctl/conf/airctl-config.yaml --verbose
```

Warning
Due to a known issue, when you run the command above to scale down the cluster, removal of the first node (in this case 2.2.2.2) fails because of a containerd mount cleanup issue. See the workaround below.
```shell
$ airctl scale-cluster --config /opt/pf9/airctl/conf/airctl-config.yaml --verbose
2024-12-05T01:06:21.279Z info Removing node 2.2.2.2 from cluster airctl-mgmt
2024-12-05T01:06:21.279Z info Deleting nodelet
2024-12-05T01:06:21.279Z info Removing nodelet with cmd: apt remove -y nodelet
...
cannot remove '/run/containerd/io.containerd.grpc.v1.cri/sandboxes/f5cc808d52184fa092b1c9de2cef7a4ef9d606cdd1877be9efe5d4c91ecc4604/shm': Device or resource busy"}
Failed to update nodelet cluster: ScaleCluster failed to remove old masters: failed to delete node 2.2.2.2: failed: sudo rm -rf /run/containerd: command sudo sudo rm -rf /run/containerd failed: Process exited with status 1
Error: ScaleCluster failed to remove old masters: failed to delete node 2.2.2.2: failed: sudo rm -rf /run/containerd: command sudo sudo rm -rf /run/containerd failed: Process exited with status 1
```

Workaround
First, manually unmount the containerd partitions on the node to be removed.
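The exact mounts vary by deployment. One way to find the containerd mounts still held on the node is sketched below; run it on the node being removed, and treat `list_containerd_mounts` as a hypothetical helper rather than a supported tool.

```shell
# Hypothetical helper: list containerd-related mount points on this node.
# Parses `mount` output ("src on /path type fstype (opts)") and keeps
# paths that contain "containerd".
list_containerd_mounts() {
  mount | awk '$3 ~ /containerd/ { print $3 }'
}

# Example (run on the node being removed):
# list_containerd_mounts | xargs -r -n1 sudo umount
```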
Then run `kubectl delete node <IPAddress>` on the cluster; the terminating pods for that node are then rescheduled onto the other nodes in the cluster. It can take around 10-15 minutes for all of the pods from this node to move. Wait and make sure this has happened before proceeding to the next step.
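To confirm that the node has drained before continuing, you can count the pods still scheduled on it. The `pods_on_node` helper below is hypothetical (not part of airctl); it only assumes standard `kubectl` field selectors.

```shell
# Hypothetical helper: count pods still scheduled on a given node,
# across all namespaces.
pods_on_node() {
  kubectl get pods --all-namespaces --no-headers \
    --field-selector "spec.nodeName=$1" 2>/dev/null | wc -l
}

# Example: wait until the count reaches zero before continuing.
# while [ "$(pods_on_node 2.2.2.2)" -gt 0 ]; do sleep 30; done
```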
Because the scale command errored out on the first node to be removed, run the scale command again to remove the second node. Make sure to perform the manual workaround steps again for node 3.3.3.3.

Cluster state after the scale-down operation completes:
```shell
$ kubectl get nodes
NAME      STATUS   ROLES    AGE      VERSION
1.1.1.1   Ready    master   44m29s   v1.29.2
4.4.4.4   Ready    master   5m42s    v1.29.2
5.5.5.5   Ready    master   5m40s    v1.29.2
```

Stop Management Plane/Regions
To stop a specific region of your self-hosted deployment, run the following command. To stop all regions, omit the `--region` flag.
```shell
airctl stop --config /opt/pf9/airctl/conf/airctl-config.yaml --region <REGION_NAME>
```

Sample output:

```shell
airctl stop --config /opt/pf9/airctl/conf/airctl-config.yaml --region <REGION_NAME>
SUCCESS scaling down management plane foo-region1
```

Start Management Plane/Regions
To start a specific region of your self-hosted deployment, run the following command. To start all regions, omit the `--region` flag.
```shell
airctl start --config /opt/pf9/airctl/conf/airctl-config.yaml --region <REGION_NAME>
```

Sample output:

```shell
airctl start --config /opt/pf9/airctl/conf/airctl-config.yaml --region <REGION_NAME>
SUCCESS scaling up management plane foo-region1
```

Uninstall Self-Hosted Deployment
To uninstall a specific region of your self-hosted deployment, run the following command. To uninstall all regions, omit the `--region` flag.

```shell
airctl unconfigure-du --config /opt/pf9/airctl/conf/airctl-config.yaml --region <REGION_NAME> --force
```

This command uninstalls and removes the configured regions along with all infrastructure software such as consul, vault, percona, k8sniff, etc.
If you plan to reuse the same nodes to deploy a new self-hosted Private Cloud Director environment, make sure to also run the following command on all nodes first.
```shell
rm -rf airctl* install-pcd.sh nodelet* options.json pcd-chart.tgz /opt/pf9/airctl/ .airctl/
```
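If several nodes need this cleanup, it can be scripted over SSH. The `cleanup_node` helper below is hypothetical and assumes passwordless SSH to each node as a user with the needed permissions.

```shell
# Hypothetical helper: run the per-node cleanup over SSH.
cleanup_node() {
  ssh "$1" 'rm -rf airctl* install-pcd.sh nodelet* options.json pcd-chart.tgz /opt/pf9/airctl/ .airctl/'
}

# Example:
# for node in 1.1.1.1 4.4.4.4 5.5.5.5; do cleanup_node "$node"; done
```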