Configuring Persistent Storage

Platform9 Monitoring deploys Prometheus, Alertmanager, and Grafana on any cluster in a single click; by default this deployment uses ephemeral storage. To configure Platform9 Monitoring to use persistent storage, add a storage class to the cluster and then update the monitoring deployment with kubectl so that it consumes the storage class.


Add a Storage Class to Prometheus

The first step is to set up a storage class. If your cluster is running without storage, follow the guide to set up the Portworx CSI driver.
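To confirm which storage classes are already available before editing the deployment, you can list them with kubectl; the class names returned will vary by cluster:

kubectl get storageclass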

Once you have a storage class configured, run the kubectl command below to edit the deployment:

kubectl -n pf9-monitoring edit prometheus system

Info

Editing the running configuration uses the Linux command-line text editor Vi. For help with Vi, see this guide.

The default configuration is shown below; it needs to be updated with a valid storage specification.

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  creationTimestamp: "2021-01-15T18:09:32Z"
  generation: 1
  managedFields:
  - apiVersion: monitoring.coreos.com/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:ownerReferences: {}
      f:spec:
        .: {}
        f:additionalScrapeConfigs:
          .: {}
          f:key: {}
          f:name: {}
        f:alerting:
          .: {}
          f:alertmanagers: {}
        f:replicas: {}
        f:resources:
          .: {}
          f:requests:
            .: {}
            f:cpu: {}
            f:memory: {}
        f:retention: {}
        f:ruleSelector:
          .: {}
          f:matchLabels:
            .: {}
            f:prometheus: {}
            f:role: {}
        f:rules:
          .: {}
          f:alert: {}
        f:scrapeInterval: {}
        f:serviceAccountName: {}
        f:serviceMonitorSelector:
          .: {}
          f:matchLabels:
            .: {}
            f:prometheus: {}
            f:role: {}
    manager: promplus
    operation: Update
    time: "2021-01-15T18:09:32Z"
  name: system
  namespace: pf9-monitoring
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: false
    controller: false
    kind: Deployment
    name: monhelper
    uid: cbc48a82-3c1f-4a2b-9b2a-ebbc32ae2e65
  resourceVersion: "2733"
  selfLink: /apis/monitoring.coreos.com/v1/namespaces/pf9-monitoring/prometheuses/system
  uid: c1722922-4973-4973-8e29-ba0269ad9a79
spec:
  additionalScrapeConfigs:
    key: additional-scrape-config.yaml
    name: scrapeconfig
  alerting:
    alertmanagers:
    - name: sys-alertmanager
      namespace: pf9-monitoring
      port: web
  replicas: 1
  resources:
    requests:
      cpu: 500m
      memory: 512Mi
  retention: 7d
  ruleSelector:
    matchLabels:
      prometheus: system
      role: alert-rules
  rules:
    alert: {}
  scrapeInterval: 2m
  serviceAccountName: system-prometheus
  serviceMonitorSelector:
    matchLabels:
      prometheus: system
      role: service-monitor

The deployment needs to have the following storage section added. The storage class name must be updated to match your cluster, and the amount of storage to request must also be specified.
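A minimal sketch of the storage section is shown below, using the Prometheus Operator volumeClaimTemplate format; the storage class name (px-db) and the 10Gi request are placeholder values for illustration, not defaults shipped with Platform9 Monitoring:

  # Added under spec:, alongside fields such as retention and replicas
  storage:
    volumeClaimTemplate:
      spec:
        accessModes:
        - ReadWriteOnce
        # Placeholder class name; replace with a class that exists on your cluster
        storageClassName: px-db
        resources:
          requests:
            # Placeholder size; adjust to your retention period and scrape volume
            storage: 10Gi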

The storage class in this example is backed by Portworx Storage; to add Portworx, see the Portworx CSI guide.

The final configuration should match the example below.
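As a reference, after the storage block is merged in, the spec section of the Prometheus object would look roughly like the following (metadata omitted for brevity; px-db and 10Gi remain placeholders to replace with your own values):

spec:
  additionalScrapeConfigs:
    key: additional-scrape-config.yaml
    name: scrapeconfig
  alerting:
    alertmanagers:
    - name: sys-alertmanager
      namespace: pf9-monitoring
      port: web
  replicas: 1
  resources:
    requests:
      cpu: 500m
      memory: 512Mi
  retention: 7d
  ruleSelector:
    matchLabels:
      prometheus: system
      role: alert-rules
  rules:
    alert: {}
  scrapeInterval: 2m
  serviceAccountName: system-prometheus
  serviceMonitorSelector:
    matchLabels:
      prometheus: system
      role: service-monitor
  storage:
    volumeClaimTemplate:
      spec:
        accessModes:
        - ReadWriteOnce
        storageClassName: px-db
        resources:
          requests:
            storage: 10Gi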

Troubleshooting

To see if the deployment is healthy, run kubectl -n pf9-monitoring get all. The resulting output should show all services in a running state. If any pods or services are still in a creating state, rerun the command.

If there is an issue, the prometheus-system-0 Pods will fail to start or will enter CrashLoopBackOff.


Get Monitoring Pod Status

Run kubectl -n pf9-monitoring describe pod prometheus-system-0 and review the events output. The output will show any errors impacting the Pod state, for example, Prometheus failing to start because its PVC cannot be found. To solve this issue, the PVC must be manually recreated by applying a PVC manifest with kubectl, as in the sketch below.
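A sketch of a manually created PVC is shown below. The claim name here follows the naming pattern the Prometheus Operator typically uses for volumeClaimTemplate claims (prometheus-system-db-prometheus-system-0); confirm the exact name from the Pod events or from kubectl -n pf9-monitoring get pvc, and make sure the storage class and size match the values in your Prometheus spec:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  # Assumed claim name; verify against the name referenced in the Pod events
  name: prometheus-system-db-prometheus-system-0
  namespace: pf9-monitoring
spec:
  accessModes:
  - ReadWriteOnce
  # Must match the storageClassName set in the Prometheus object
  storageClassName: px-db
  resources:
    requests:
      storage: 10Gi

Save the manifest to a file and apply it with kubectl apply -f <file>.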

View Prometheus Container Logs

If the Pod events do not indicate that the issue is within Kubernetes itself, it can be useful to look at the Prometheus container logs. To do this from the Platform9 SaaS Management Plane, navigate to the Workloads dashboard and select the Pods tab. Filter the table to your cluster and set the namespace to pf9-monitoring. Once the table updates, click the view logs link for the prometheus-system-0 container. This opens the container logs in a new browser tab.
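If you prefer the command line, the same logs can also be retrieved with kubectl; the container name prometheus is the name the Prometheus Operator conventionally gives the main container:

kubectl -n pf9-monitoring logs prometheus-system-0 -c prometheus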

Below is an example permissions error preventing the Pod from starting on each node.

Incorrect Storage Class Name

If you specify the storage class name incorrectly, you will first need to update the Prometheus configuration and then delete the persistent volume claim by running: kubectl delete pvc <pvc-name> -n pf9-monitoring

Once the PVC is deleted, the Pods will start up and claim a new PVC.
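To confirm that the replacement claim was created, list the PVCs in the monitoring namespace and check that the new claim shows a Bound status:

kubectl -n pf9-monitoring get pvc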
