# Kubernetes Monitoring

The Prometheus agent is configured to report numerous K8s metrics. Below is the YAML-based rule set that Catapult uses, including alert names, expression, timeframe, labels, and annotations which contain the description and summaries.

{% tabs %}
{% tab title="YAML" %}

```yaml
bash-5.0# cat k8s.yml
groups:
  - name: kube-apiserver
    rules:
#-----------------------------------   API server -------------------------------------

      - alert: KubeAPIServerDown
        expr: up{job="kube-apiserver"} offset 5m == 0
        for: 1m
        labels:
          severity: critical
          type: k8s
        annotations:
          description: Kubernetes API server down on cluster {{ $labels.cluster }}
          summary: "Cluster {{ $labels.cluster }}: Kube api server down on host {{ $labels.host }}"

      - alert: KubernetesApiServerErrors
        expr: sum(rate(apiserver_request_total{code=~"(4|5).."}[1m] offset 5m )) by (cluster, host) / sum(rate(apiserver_request_total[1m] offset 5m )) by (cluster, host) * 100 > 3
        for: 2m
        labels:
          severity: critical
          type: k8s
        annotations:
          summary: Kubernetes API server errors on cluster {{ $labels.cluster }}
          description: "Cluster {{ $labels.cluster }}: Kubernetes API server is experiencing high error rate on host {{ $labels.host }}"

      - alert: KubernetesApiClientErrors
        expr: (sum(rate(rest_client_requests_total{code=~"(4|5).."}[1m] offset 5m )) by (cluster, host) / sum(rate(rest_client_requests_total[1m] offset 5m )) by (cluster, host)) * 100 > 1
        for: 2m
        labels:
          severity: critical
          type: k8s
        annotations:
          summary: Kubernetes API client errors on cluster {{ $labels.cluster }}
          description: "Cluster {{ $labels.cluster }}: Kubernetes API client is experiencing high error rate on host {{ $labels.host }}"

#-----------------------------------   Scheduler -------------------------------------
      - alert: KubeSchedulerDown
        expr: up{job="kube-scheduler"} offset 5m == 0
        for: 1m
        labels:
          severity: critical
          type: k8s
        annotations:
          summary: Kubernetes Scheduler down on cluster {{ $labels.cluster }}
          description: "Cluster {{ $labels.cluster }}: Kube scheduler down on host {{ $labels.host }}"


#-----------------------------------   Controller -------------------------------------
      - alert: KubeControllerManagerDown
        expr: up{job="kube-controller"} offset 5m == 0
        for: 1m
        labels:
          severity: critical
          type: k8s
        annotations:
          summary: Kubernetes Controller down on cluster {{ $labels.cluster }}
          description: "Cluster {{ $labels.cluster }}: Kube controller down on host {{ $labels.host }}"
#-----------------------------------   KubeProxy -------------------------------------
      - alert: KubeProxyDown
        expr: up{job="kube-proxy"} offset 5m == 0
        for: 1m
        labels:
          severity: critical
          type: k8s
        annotations:
          summary: KubeProxy down on cluster {{ $labels.cluster }}
          description: "Cluster {{ $labels.cluster }}: kube proxy down on host {{ $labels.host }}"

      - alert: KubeProxyRuleSyncLatency
        expr: histogram_quantile(0.99, sum by(le, cluster, host) (rate(kubeproxy_sync_proxy_rules_duration_seconds_bucket[5m] offset 5m ))) > 60
        for: 2m
        labels:
         severity: warning
         type: k8s
        annotations:
          summary: Cluster {{ $labels.cluster }} is taking too long, on average, to apply kubernetes service rules to iptables.
          description: "Cluster {{ $labels.cluster }}: network rules synchronization slowing down, VALUE = {{ $value }} on host {{ $labels.host }}"bash-5.0#
```

{% endtab %}
{% endtabs %}


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.platform9.com/managed-kubernetes/5.7/catapult-rules-alarms/kubernetes-monitoring.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
