PMK Scale Guide
Recommended Management Plane practices
Each PMK customer is provided a Management Plane (also known as a Deployment Unit, DU, or KDU) at onboarding. This section outlines the recommendations and best practices for it.
Following values are listed per Management Plane Instance:

| Limit | Value |
| --- | --- |
| Maximum number of nodes | 2500 |
| Maximum number of clusters (single-node clusters) | 300 |
| Maximum number of clusters (small clusters, up to 8 nodes) | 30 |
| Maximum number of clusters (medium clusters, up to 200 nodes) | 8 |
| Maximum number of clusters (large clusters, up to 400 nodes) | 5 |
| Maximum number of clusters (combination of medium and large clusters). Test configuration: 2 x 400-node clusters, 2 x 250-node clusters, 4 x 200-node clusters | 8 |
| Maximum number of nodes onboarded in parallel | 30 |
| Maximum number of clusters created in parallel (single-node clusters) | 10 |
Note: The values above are based on the latest Platform9 standard tests and are listed as guidance. If your requirements differ from these standard results, Platform9 support can help you scale to different numbers. Higher scale, beyond the node and cluster limits listed above, can be achieved with multiple Management Plane Instances.
This guide applies to PMK BareOS clusters only.
Recommended Cluster configuration practices
Following values are listed per PMK cluster which runs on a Management Plane Instance:

| Limit | Value | Test configuration |
| --- | --- | --- |
| Maximum number of nodes | 400 | Master & worker count: 5 masters, 395 workers; Kubernetes version: 1.26-1.29 (PMK 5.9 and 5.10); master node size: 18 vCPUs, 30 GB memory; worker node size: 2 vCPUs, 6 GB memory; pod density: 23; cluster CPU usage max: 63%; CNI: Calico; Calico BGP: True, with route reflectors (3 nodes); MetalLB BGP: True |
| Maximum number of nodes | 300 | Master & worker count: 5 masters, 395 workers; Kubernetes version: 1.22-1.25 (PMK 5.6.8, 5.7.3 and 5.9.2); master node size: 18 vCPUs, 30 GB memory; worker node size: 2 vCPUs, 6 GB memory; pod density: 23; cluster CPU usage max: 63%; CNI: Calico; Calico BGP: False; MetalLB BGP: False |
| Maximum number of node upgrades in parallel in a cluster | 40 (10% of the total 400 nodes) | Master & worker count: 5 masters, 395 workers; Kubernetes version: 1.26-1.29; master node size: 18 vCPUs, 30 GB memory; worker node size: 2 vCPUs, 6 GB memory; pod density: 23; cluster CPU usage max: 65%; CNI: Calico; Calico with route reflectors (3 nodes); MetalLB BGP: True; upgrade paths tested: 1.26 -> 1.27, 1.27 -> 1.28, 1.28 -> 1.29 |
| Maximum number of nodes to be attached to a cluster in parallel | 15 | |
| Maximum number of nodes to be detached from a cluster in parallel | 30 | |
| Maximum number of pods per node | 110 (Kubernetes default) | |
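The per-node pod limit corresponds to the kubelet's maxPods setting. As a minimal sketch, assuming the upstream KubeletConfiguration format (how PMK surfaces this setting may differ), it looks like this:

```yaml
# Minimal sketch of the upstream kubelet configuration controlling pods per node.
# PMK may expose this setting differently; 110 is the Kubernetes default referenced above.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
maxPods: 110  # raising this also raises per-node CPU/memory and pod CIDR requirements
```

Any change to the per-node pod limit should be weighed against the per-node resource sizes listed in the test configurations above.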
Some Test Observations:

Test configuration:
- Master & worker count: 5 masters, 395 workers
- Kubernetes version: 1.26-1.29
- Master node size: 18 vCPUs, 30 GB memory
- Worker node size: 2 vCPUs, 6 GB memory
- Pod density: 23
- Cluster CPU usage max: 63%
- CNI: Calico
- Calico BGP: Calico with route reflectors (3 nodes)
- MetalLB BGP: True

Observations:
- Number of pods: 9230
- Number of pods per node: 23
- Number of namespaces: 3000
- Number of secrets: 15
- Number of config maps: 1046
- Number of services: 144
- Number of pods per namespace: 7600 (in a single namespace)
- Number of services per namespace: 100
- Number of deployments per namespace: 100
Component resource recommendations:

350 to 400 nodes (test configuration: pod density of 23 and CPU usage around 60%):
- Limits: cpu: 200m, memory: 400Mi
- Requests: cpu: 25m, memory: 100Mi

300 nodes:
- Prometheus observed usage: cpu: 2510m, memory: 12266Mi
- Requests and limits can be set based on this observation. Actual usage depends on multiple factors such as the number of nodes, the number of Prometheus exporters being queried, the amount of time-series data being stored, the number of calls to Prometheus, and so on.
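As a minimal sketch of how those observed numbers could be turned into requests and limits, assuming a generic Prometheus Deployment (the names, namespace, image, and exact values below are illustrative, not PMK-specific):

```yaml
# Illustrative only: requests sized near the observed usage at 300 nodes
# (~2510m CPU, ~12266Mi memory), with limits giving some headroom. Tune per environment.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus          # placeholder name
  namespace: monitoring     # placeholder namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus:v2.45.0   # placeholder image/version
          resources:
            requests:
              cpu: "2500m"
              memory: "12Gi"
            limits:
              cpu: "3000m"
              memory: "14Gi"
```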
Management Plane Instance resource recommendations

Default (up to 750 nodes):

| Component | Container | Limits | Requests |
| --- | --- | --- | --- |
| Qbert | qbert | cpu: 1500m, memory: 4000Mi | cpu: 40m, memory: 550Mi |
| Resmgr | resmgr | cpu: 1000m, memory: 1500Mi | cpu: 25m, memory: 190Mi |
| Keystone | keystone | cpu: 1000m, memory: 1000Mi | cpu: 250m, memory: 800Mi |
| Prometheus | prometheus | cpu: 1000m, memory: 4000Mi | cpu: 250m, memory: 200Mi |
| Vault | pf9-vault | cpu: 500m, memory: 500Mi | cpu: 25m, memory: 100Mi |
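To apply a row from the table, a strategic-merge patch file is one option. A minimal sketch for the qbert row follows; the Deployment and container names are assumptions to verify against your Management Plane installation:

```yaml
# qbert-resources.yaml: hedged sketch of a strategic-merge patch expressing the qbert
# defaults from the table above. Deployment/container names may differ in your installation.
spec:
  template:
    spec:
      containers:
        - name: qbert
          resources:
            limits:
              cpu: "1500m"
              memory: "4000Mi"
            requests:
              cpu: "40m"
              memory: "550Mi"
```

It could be applied with something like `kubectl -n <du-namespace> patch deployment qbert --patch-file qbert-resources.yaml`, where the namespace and deployment name are assumptions about your environment.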
Scaled configurations (750 to 2500 nodes):

| Component | Container | Limits (750-1500 nodes) | Requests (750-1500 nodes) | Limits (1500-2500 nodes) | Requests (1500-2500 nodes) | Additional changes |
| --- | --- | --- | --- | --- | --- | --- |
| Prometheus | socat19090 | cpu: 1000m, memory: 1500Mi | cpu: 250m, memory: 400Mi | No change | No change | maxchild: 2500 |
| Prometheus | prometheus | cpu: 1000m, memory: 4000Mi | cpu: 250m, memory: 200Mi | No change | No change | WEB_MAX_CONNECTIONS: 4000 |
| Rabbitmq | socat5673 | cpu: 400m, memory: 1000Mi | cpu: 50m, memory: 50Mi | cpu: 800m, memory: 1800Mi | cpu: 200m, memory: 200Mi | |
| Rabbitmq | rabbitmq | cpu: 1000m, memory: 1500Mi | cpu: 130m, memory: 750Mi | No change | No change | |
| Resmgr | socat18083 | cpu: 1000m, memory: 1500Mi | cpu: 250m, memory: 400Mi | | | |
| Ingress-nginx-controller | socat444 | cpu: 400m, memory: 1000Mi | cpu: 50m, memory: 50Mi | | | |
| Sidekickserver | socat13010 | cpu: 400m, memory: 1000Mi | cpu: 50m, memory: 50Mi | | | |
| Sidekickserver | sidekickserver | cpu: 500m, memory: 1000Mi | cpu: 50m, memory: 100Mi | | | |
| Sunpike conductor | socat19111 | cpu: 400m, memory: 1000Mi | cpu: 50m, memory: 50Mi | | | |
| Pf9-vault | vault | cpu: 1250m, memory: 800Mi | cpu: 250m, memory: 400Mi | | | |
| Sunpike-apiserver | sunpike-apiserver | cpu: 1000m, memory: 1000Mi | cpu: 500m, memory: 256Mi | | | |
| Sunpike-conductor | sunpike-conductor | cpu: 1000m, memory: 1000Mi | cpu: 200m, memory: 500Mi | | | |
| Sunpike-kine | sunpike-kine | cpu: 1000m, memory: 256Mi | cpu: 25m, memory: 256Mi | | | |
| Sunpike-kube-controllers | sunpike-kube-controllers | cpu: 500m, memory: 1000Mi | cpu: 25m, memory: 800Mi | | | |
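The WEB_MAX_CONNECTIONS entry in the table corresponds to Prometheus's --web.max-connections flag. A minimal sketch of how the scaled Prometheus row might translate into a container-spec fragment, assuming the flag is set directly on the container (the actual wiring inside a Management Plane deployment may differ):

```yaml
# Hedged container-spec fragment only: the scaled Prometheus row from the table,
# with WEB_MAX_CONNECTIONS mapped to the --web.max-connections flag. Actual wiring
# inside a Management Plane deployment may differ.
containers:
  - name: prometheus
    args:
      - --web.max-connections=4000
    resources:
      limits:
        cpu: "1000m"
        memory: "4000Mi"
      requests:
        cpu: "250m"
        memory: "200Mi"
```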
MySQL/RDS config changes:

| Parameter | Value (750-1500 nodes) | Value (1500-2500 nodes) |
| --- | --- | --- |
| max_connections | 2048 | No change |
| max_connect_errors | 1000 | No change |
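If the Management Plane database is an in-cluster MySQL rather than RDS, one way to carry these settings is a my.cnf fragment supplied from a ConfigMap; a minimal sketch follows, with the ConfigMap name as a placeholder. For RDS, the same parameters would be set in the DB parameter group instead.

```yaml
# Hedged sketch: the tuned MySQL parameters as a ConfigMap-supplied my.cnf fragment.
# For RDS, set max_connections and max_connect_errors in the DB parameter group instead.
apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql-scale-tuning   # placeholder name
data:
  scale-tuning.cnf: |
    [mysqld]
    max_connections = 2048
    max_connect_errors = 1000
```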