Platform9 Host Components

PF9 Host Side Services

This is a list of the services that run on the master and worker nodes.

pf9-hostagent

The pf9-hostagent service is required on every host managed by Platform9. It installs and configures Platform9 applications on the host using roles. The hostagent receives requests from the management plane to apply a node role on the host, performs the role apply, and reports its success or failure back to the management plane.

When it receives a request from the management plane to push a role, it downloads the role’s packages, installs them, and finally executes the configuration scripts using the role data that comes as part of the payload. The hostagent also sends a periodic heartbeat to the management plane via the pf9-comms service. The heartbeat includes CPU, memory, storage, and network information.
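On a host, the agent and its activity can be checked with standard tooling. A minimal sketch, assuming Platform9 host services log under /var/log/pf9 (a conventional location, not stated by this document):

# Confirm the agent is running
systemctl status pf9-hostagent

# Watch recent agent activity, including role applies and heartbeats (log path assumed)
tail -f /var/log/pf9/hostagent.log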

Additionally, pf9-hostagent monitors the services running for the applied role and checks that their configuration matches the configuration stored on the management plane in the host role mapping database. For Platform9 Managed Kubernetes (PMK), the host role is known as pf9-kube. The hostagent keeps the role configuration applied on the host consistent with the information stored in the management plane, for example:

  • The keepalived configuration for the VIP. If the VIP network interface is modified in the configuration file on the host and keepalived is restarted, the hostagent resets the change back to the original network interface during its next check.

  • It also monitors pf9-comms and restarts it if it is found to be dead.

The pf9-hostagent is also responsible for updating roles on the host. Host role apply, modify, and delete operations (for example, for the pf9-kube role) are directed from the management plane through the pf9-hostagent. During an upgrade from Platform9, the hostagent receives host role updates and uses them to update the pf9 components on the host.

pf9-comms

pf9-comms is the single point of communication with the management plane for Platform9-managed host services. For host-side Platform9 services such as pf9-hostagent to talk to the management plane, Platform9 creates a tunnel from the host to the management plane that channels all communication. The only network requirement for this to work is an outbound TCP 443 connection from the host to the management plane.
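Because the tunnel needs only outbound TCP 443, connectivity can be verified from the host before troubleshooting further up the stack. A minimal check, using the management plane FQDN from the example output below (substitute your own endpoint):

# Confirm the host can open an outbound TCP 443 connection to the management plane
nc -zvw5 airctl-1.pf9.localnet 443

# Alternatively, confirm the TLS endpoint answers
curl -sv --max-time 10 -o /dev/null https://airctl-1.pf9.localnet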

pf9-comms acts as a multiplexer: it accepts requests arriving at multiple localhost ports from host pf9 services and sends them over TCP 443 to the management plane. On a PMK cluster, pf9-hostagent and pf9-muster connect to a port on localhost that pf9-comms multiplexes. The requests from pf9-comms are received by an ingress service in the management plane, demultiplexed, and passed on to the intended receiver. pf9-comms uses TLS with trusted certificates to connect to the management plane ingress service.

Example lsof output showing pf9-comms connections
COMMAND     PID USER   FD   TYPE   DEVICE SIZE_OFF NODE NAME
pf9-comms 17951  pf9   37u  IPv4   46812429 0t0    TCP mav-3-1:41768->airctl-1.pf9.localnet:https (ESTABLISHED)
pf9-comms 17951  pf9   41u  IPv4   46812664 0t0    TCP mav-3-1:41784->airctl-1.pf9.localnet:https (ESTABLISHED)
pf9-comms 17951  pf9   43u  IPv4   46812674 0t0    TCP mav-3-1:41788->airctl-1.pf9.localnet:https (ESTABLISHED)
pf9-hostd 17814  pf9   10u  IPv6   47533401 0t0    TCP localhost:38486->localhost:amqp (ESTABLISHED)
pf9-muste 28887  pf9    3u  IPv4   47534314 0t0    TCP localhost:49962->localhost:amqp (ESTABLISHED)
pf9-muste 28887  pf9    5u  IPv4   47538143 0t0    TCP localhost:50316->localhost:amqp (ESTABLISHED)
pf9-comms 96362  pf9   14u  IPv4   47532817 0t0    TCP localhost:amqp (LISTEN)
pf9-comms 96362  pf9   15u  IPv6   47532818 0t0    TCP localhost:amqp (LISTEN)
pf9-comms 96362  pf9   34u  IPv6   47533404 0t0    TCP localhost:amqp->localhost:38486 (ESTABLISHED)
pf9-comms 96362  pf9   36u  IPv4   47533407 0t0    TCP localhost:amqp->localhost:49962 (ESTABLISHED)
pf9-comms 96362  pf9   38u  IPv4   47538961 0t0    TCP localhost:amqp->localhost:50316 (ESTABLISHED)

pf9-sidekick

pf9-sidekick runs in parallel with pf9-comms, but independently of it, and provides a backup channel for some of the hostagent's operations. When the management plane loses communication with pf9-hostagent, pf9-sidekick supports bundle upload operations and remote execution to allow diagnosis and recovery.

This service is typically used for debugging: it provides a secondary channel for executing commands on managed hosts. It connects to the management plane component named sidekickserver.

pf9-muster

Muster is a monitoring and troubleshooting tool. It sends host statistics, such as memory usage and load, back to the management plane. It also exposes a limited API that allows the Platform9 Support team to send whitelisted commands for troubleshooting. Communication is via pf9-comms.

pf9-nodelet

Nodelet comes into action after cluster creation, once the host has been discovered and authorized in the management plane. It reads YAML files under /etc/pf9/nodelet/ for the configuration options needed to configure the host as a Kubernetes master or worker node. Nodelet writes /etc/pf9/kube.env, which is consumed by the component phase scripts. It is responsible for starting, restarting, and stopping the Kubernetes components in a controlled manner.
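Both locations can be inspected directly on a managed host to see what nodelet is working from:

# Nodelet input configuration (YAML files, as described above)
ls -l /etc/pf9/nodelet/

# Environment file consumed by the component phase scripts
cat /etc/pf9/kube.env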

Nodelet can take corrective actions, such as performing a partial restart or partial rollback of the Kubernetes components if they fail during startup. For example, if Docker is not running, nodelet only attempts to restart the chain of components up to the Docker configuration phase.

Nodelet continues to monitor the Kubernetes stack once the node has been added to a cluster, invoking status checks every minute. It creates a cgroup named pf9-kube-status to limit the CPU used during these checks.
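That cgroup is visible through the normal cgroup filesystem. A quick look, assuming the hierarchy is mounted at the usual /sys/fs/cgroup path (the exact subtree depends on whether the host uses cgroup v1 or v2):

# Locate the pf9-kube-status cgroup in the hierarchy
find /sys/fs/cgroup -maxdepth 3 -type d -name 'pf9-kube-status*'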

Nodelet starts the pf9 components when the pf9-kube role is pushed to the host, working with the hostagent. It configures the pf9 Kubernetes components in phases and retries only the failed phases; if a component phase fails for 9 consecutive attempts, the 10th attempt performs a full cleanup and restarts all components from the beginning.

pf9-kubelet

The Kubernetes kubelet runs as the pf9-kubelet systemd service on Platform9 managed Kubernetes hosts. On master nodes, this service starts the three control plane containers (kube-apiserver, kube-scheduler, and kube-controller-manager) as static pods. On worker nodes, the kubelet communicates with the Kubernetes API server on the masters to manage the pods on that node.
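Both halves of that arrangement can be checked from a master node. A sketch, assuming the Docker runtime described elsewhere in this document (container names vary by deployment):

# Confirm the kubelet service is running
systemctl status pf9-kubelet

# List control plane containers started from the static pod manifests
docker ps --format '{{.Names}}' | grep -Ei 'apiserver|scheduler|controller'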

keepalived

This service is responsible for keeping the master VIP highly available. Keepalived is installed and started only when the VIP and VIP interface fields are specified during cluster bootstrapping.
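For orientation, a keepalived VRRP block for a VIP generally looks like the minimal sketch below. The interface name, router ID, priority, and address here are illustrative assumptions, not the exact values Platform9 renders:

vrrp_instance VI_1 {
    state BACKUP               # VRRP elects which master actually holds the VIP
    interface ens192           # the VIP interface chosen at bootstrap (assumed name)
    virtual_router_id 51       # must match on all masters sharing the VIP (assumed value)
    priority 100
    advert_int 1
    virtual_ipaddress {
        192.168.10.5           # the master VIP chosen at bootstrap (assumed address)
    }
}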

Standalone Docker Containers

etcd

etcd is the key-value data store backing all Kubernetes cluster data. In Platform9, etcd runs as a Docker container. The pf9 etcd containers run on the master nodes and can be deployed in one-, three-, or five-node configurations.
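Odd member counts are used because a cluster of 1, 3, or 5 members keeps quorum through 0, 1, or 2 member failures respectively. Membership and health can be checked from a master; a sketch, assuming the container is named etcd and that etcdctl inside it already points at the correct endpoints and certificates:

# List cluster members from inside the etcd container (container name assumed)
docker exec etcd etcdctl member list

# Check endpoint health
docker exec etcd etcdctl endpoint health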

kube-proxy

kube-proxy runs as a Docker container on every node and implements part of the Kubernetes Service concept: it maintains network rules on the nodes that permit network communication to Pods from inside or outside the cluster.
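When kube-proxy runs in its common iptables mode (an assumption; in ipvs mode the rules live in the IPVS tables instead), the Service rules it programs are visible on the node:

# Service rules programmed by kube-proxy (iptables mode assumed)
iptables -t nat -L KUBE-SERVICES -n | head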

pf9-bouncer

pf9-bouncer runs as a Docker container and receives authentication validation requests from the Kubernetes API server. It is the Kubernetes cluster-level authentication service configured by default on Platform9 clusters, using Keystone as the identity provider. The API server sends the bearer token from an incoming request to pf9-bouncer for validation; if the token is valid, the API server continues processing the request.
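This follows the standard Kubernetes webhook token authentication pattern, in which the API server POSTs a TokenReview object to the webhook and acts on its response. A minimal sketch of that exchange, using placeholder values for the bouncer endpoint and token (the actual port and path on a Platform9 cluster may differ):

# TokenReview request of the kind the API server sends to its authentication webhook
curl -sk https://localhost:<bouncer-port>/<auth-path> \
  -H 'Content-Type: application/json' \
  -d '{"apiVersion":"authentication.k8s.io/v1","kind":"TokenReview","spec":{"token":"<keystone-token>"}}'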

PF9 Namespaces and Pods

Namespace          Example Pod Name                       Purpose of Pod
pf9-olm            packageserver-844d4fb848-zmpxd         OLM internal pod for processing OLM package installation
pf9-olm            packageserver-844d4fb848-cpr4b         OLM internal pod for processing OLM package installation
pf9-olm            platform9-operators-df5bl              OLM repository pod; the OLM operator fetches packages from this repo to install operators
pf9-olm            olm-operator-fbd9c955c-j85zb           OLM operator: watches OLM subscription objects and installs operators in response
pf9-olm            catalog-operator-d59cf9dfb-5t7s8       OLM catalog operator: works with the OLM operator and validates OLM packages before installation
pf9-olm            prometheus-operator-54dd4d9b-kxv9r     Prometheus operator: installed through OLM by creating an OLM subscription object
pf9-operators      monhelper-5c8558c46-p5hw2              Platform9 helper pod created as part of the OLM package for the Prometheus operator; installs and configures monitoring objects (prometheus, alertmanager, grafana, etc.)
pf9-monitoring     prometheus-system-0                    The Prometheus instance installed through the Prometheus operator
pf9-monitoring     grafana-986c774cf-p8w58                Grafana for UI visualization of Prometheus metrics; pre-configured to talk to Prometheus and includes built-in dashboards
pf9-monitoring     alertmanager-sysalert-0                Alertmanager instance configured with Prometheus to receive alerts; the user must configure delivery targets
pf9-monitoring     node-exporter-llgkj                    Node Exporter daemonset pod; exports metrics for each kube node for Prometheus scraping
pf9-monitoring     node-exporter-nbw5k                    Node Exporter daemonset pod; exports metrics for each kube node for Prometheus scraping
pf9-monitoring     node-exporter-rjtdg                    Node Exporter daemonset pod; exports metrics for each kube node for Prometheus scraping
pf9-monitoring     kube-state-metrics-595cb5cc            kube-state-metrics exporter: exposes Kubernetes cluster metrics for Prometheus
platform9-system   pf9-sentry                             The pf9 UI queries the HTTP server running inside this pod to get a list of CSI drivers
pf9-addons         pf9-addon-operator-7f9784f867-ktn6z    Addon operator supporting addons such as coredns, metrics-server, dashboard, autoscaler, etc.

etcd-backup

If etcd-backup is enabled at bootstrap, the etcd-backup directory is configured on the cluster. The default backup frequency is every 24 hours, but this is configurable. The etcd backup file is saved in /etc/pf9/etcd-backup on the master node. This backup can be used to recover cluster state after events such as etcd corruption.
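Backups in that directory are etcd snapshots, so they can be inspected with standard etcd tooling. A sketch, assuming an etcdctl v3 binary is available on the master and using a placeholder for the snapshot file name:

# List the backups that have been taken
ls -l /etc/pf9/etcd-backup/

# Verify a snapshot and report its revision and key counts (file name is a placeholder)
ETCDCTL_API=3 etcdctl snapshot status /etc/pf9/etcd-backup/<snapshot-file> --write-out=table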
