How to Set Up the EFK Stack for Kubernetes Application Log Monitoring

Kubernetes has become a cornerstone of cloud software development. Kubernetes deployments produce logs in many locations, and Site Reliability Engineers (SREs), DevOps, and IT Ops teams are finding that more and more of their time is spent setting up logging, troubleshooting logging issues, or chasing log data across different places. What can be done to solve this?

Fortunately, with advances in open-source tools and ready-made integrations from commercial providers, it’s now much simpler to set up and manage a logging solution. Check out Platform9 and JFrog’s on-demand webinar to see a step-by-step walkthrough of how to set up application log monitoring in Kubernetes. In the webinar, we use multiple open-source tools:

  • Elasticsearch, a distributed, open-source search and analytics engine for all types of data
  • Fluentd for log aggregation. Fluentd is an open-source data collector for building the unified logging layer
  • Kibana, an open-source data visualization dashboard for Elasticsearch
  • Kubernetes itself

Together Elasticsearch, Fluentd, and Kibana are commonly referred to as the EFK stack.

We’ll be using solutions from JFrog and Platform9 to rapidly implement a complete environment:

  • Platform9’s Managed Kubernetes, which provides built-in Fluentd (early access)
  • JFrog’s ChartCenter, which provides Helm charts for both of these solutions.

JFrog's ChartCenter

Deploying Helm Charts with JFrog ChartCenter

Applications and logging tools associated with K8s can be found on ChartCenter. ChartCenter is a central repository built to help developers find immutable, secure, and reliable Helm charts and have a single source of truth to proxy all the charts from one location. ChartCenter stores and caches all charts, meaning every Helm chart and version of that chart remains available even if the original source goes down.

With the need for fast software development and delivery, the DevOps community can find tools and deploy them easily using Helm charts on ChartCenter. For example, Helm charts are often created by organizations such as JFrog, Bitnami, and Elastic – and provided to the community so that anyone can launch these organizations’ software with a few command-line options. Helm charts also make it easy for developers to change the configuration options of applications. By editing the values.yaml file, an application can be set up in different ways – such as using a different database or using different configuration controls for production apps.
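As a minimal illustration (the chart path, release name, and keys below are hypothetical; every chart’s README documents the values it actually supports), you override defaults in a small YAML file and pass it to Helm:

# my-values.yaml (hypothetical overrides)
replicaCount: 2
service:
  type: NodePort

$ helm install my-release center/<org>/<chart> -f my-values.yaml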

Here is a quick example of how you can work with ChartCenter.

Go to chartcenter.io and search for the artifactory-jcr chart.

You’ll see the installation instructions under ‘Set Me Up.’ First, set ChartCenter as your repo:

$ helm repo add center https://repo.chartcenter.io
$ helm repo update

Next, to install the chart you can use:

$ helm install cert-manager center/jetstack/cert-manager \
      --namespace cert-management \
      --version 0.15.2

We cover installing cert-manager in more detail below.

For a production install, you’ll want to review the information in the README file for each chart. A good example README can be found here.

If you’re still in your terminal, you can also see a list of all available charts in ChartCenter by using the command:

$ helm search repo center/

Deploying the EFK Logging Stack for Kubernetes

Platform9 deploys Prometheus and Grafana with every cluster, which helps solve the monitoring piece, and we are actively developing a built-in Fluentd deployment that will simplify log aggregation and monitoring.

Part 1: Deploying Kubernetes + Fluentd using Platform9

Platform9’s free managed Kubernetes service deploys, upgrades, scales, patches and manages your Kubernetes clusters. The first step to deploying a Kubernetes cluster with log monitoring is to sign up for the free plan and then build a cluster.

Platform9 can run clusters in public clouds (AWS, Azure), private clouds, and edge locations, with the ability to manage everything from the bare metal up as a BareOS cluster. All clusters can be built using the Platform9 SaaS platform by connecting your public clouds or by onboarding physical or virtual servers.

The example below uses a four-node Kubernetes cluster running on Platform9 Managed OpenStack, but the same result can be achieved using any virtual infrastructure, public cloud, or physical servers. Once complete, you will have a Kubernetes cluster managed by Platform9 with built-in monitoring and early access to our Fluentd capabilities, connected to Elasticsearch and Kibana running on Rook CSI storage.

Here are the requirements:

Infrastructure:

Kubernetes Platform

  • Single Node Control Plane (2 CPU, 16 GB RAM, 1 NIC)
  • Three Worker Nodes (4 CPU, 16 GB RAM, 1 NIC)
  • OS: Ubuntu 18.04

Rook Storage

  • Three Volumes (1 per Worker node)

Software

  • Git installed and a GitHub account
  • kubectl
  • Helm v3 Client

NOTE: To install any charts and manipulate the cluster, ensure Helm 3 and kubectl are installed and that your kubeconfig has been set up so that you can access the cluster.

Visit here for help on kubeconfig files and here for help on Helm.
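As a quick sanity check before you start (assuming both clients are already installed on your workstation):

$ kubectl version --client
$ helm version --short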

Step 1: Sign up and Build a Cluster

Head to www.platform9.com/signup and create a free account.

Platform9 is able to build, upgrade, and manage clusters on AWS, Azure, and bare metal operating systems (BareOS), the latter being physical or virtual servers running CentOS or Ubuntu.

This blog covers deploying a BareOS cluster on virtual machines using Rook for persistent storage. Deploying onto Azure or AWS can be achieved by using the native AWS or Azure storage classes for the Elasticsearch data plane.

Once your account is active, create four virtual machines running either Ubuntu or CentOS on your platform of choice (physical nodes can also be used), attach an empty, unformatted volume to each VM (to support Rook), and then use the Platform9 CLI to connect each VM to the Platform9 SaaS management plane.

I built my environment on Platform9 Managed OpenStack; below you can see a single VM dedicated as the Kubernetes control plane node and three VMs as Kubernetes worker nodes.

Platform9 Managed OpenStack Virtual Machines


To attach a VM or physical server, install the CLI by running:

$ bash <(curl -sL http://pf9.io/get_cli)

The installation will ask for your account details; these can be found on the first step of the BareOS wizard or on the Add Node page.

Platform9 CLI Commands to Connect Nodes:

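As a rough sketch of this step (the pf9ctl binary name is an assumption here; the exact commands and authentication details for your account are displayed in the wizard):

# Hypothetical example: prepare this node and attach it to the Platform9 management plane
$ pf9ctl prep-node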

Once you have installed the CLI and run the prep-node command on each node, they will be attached to Platform9 and ready to host a cluster. Use the BareOS wizard to create the Kubernetes cluster; the following configuration is required to create a cluster with our Fluentd operator enabled:

Control Plane Setup: Single Node Control Plane with Privileged Containers Enabled

  1. Select the node that will run the Kubernetes Control Plane.
  2. Ensure the Privileged Containers option is enabled.

Workers Setup: Three Worker Nodes

  1. Select the three nodes you are using in this cluster.

Network Setup:

  1. Cluster Virtual IP: Leave All fields empty as we are creating a single node control plane.
  2. Cluster Networking Range & HTTP Proxy: Leave with Defaults
  3. CNI: Select Calico and use the default configuration
  4. MetalLB: Disabled

NOTE: If you want to deploy MetalLB, ensure the IP range is reserved within your environment and that port security will not block traffic at the virtual machine.

Enable Fluentd

  1. Ensure monitoring is enabled.
  2. Tags – Use the tags field to enable Fluentd.

    Enable Platform9 Fluentd (Early Access Feature)

    Platform9 has a built-in Fluentd operator that will be used to forward logs to Elasticsearch. To enable the Fluentd operator, edit the cluster from the Infrastructure dashboard and add the following tag to the cluster’s configuration:

    • key: pf9-system:logging
    • value: true

    Add BareOS Cluster

  3. Review and done

Your cluster will now be built and you will be redirected to the Cluster Details page where you can review the status of the cluster deployment on the Node Health Page.

Once the cluster has finished being built you can confirm Fluentd has been enabled in two places.

  1. Select the cluster and choose Edit on the Infrastructure dashboard. On the Edit screen, you should see the tag for logging added.

    Edit Basic Details

  2. Navigate to the Pods, Deployments and Services dashboard, and filter the Pods table to display the Logging Namespace. You should see Fluentd pods running (you can also check from the command line, as shown below).
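To run this check from the terminal once your kubeconfig is set up (Part 2 below), list the pods in the pf9-logging namespace referenced later in this post:

$ kubectl get pods -n pf9-logging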

Part 2: KubeConfig and Cert-Manager

KubeConfig is critical for connecting to and managing Kubernetes clusters. In addition, Cert-Manager is a great application to have installed in all clusters, as it can greatly simplify the management of certificates. As a way to validate your cluster and ensure you can connect to it, we will download the kubeconfig file and install Cert-Manager using JFrog ChartCenter and Helm 3.

Step 1: Obtain KubeConfig

Once the cluster has been built, you can download a kubeconfig file directly from Platform9. Choose either token or username-and-password authentication, place the file in your .kube directory, and name the file config. Visit here for help on kubeconfig files.

Download KubeConfig
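For example, on Linux or macOS (the downloaded filename below is just an illustration; adjust the path to match your download):

$ mkdir -p ~/.kube
$ mv ~/Downloads/my-cluster.yaml ~/.kube/config
$ kubectl get nodes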

Step 2: Create a namespace

For this example, I’m using a namespace called monitoring-demo; go ahead and create that in your cluster:

$ kubectl create namespace monitoring-demo

Step 3: Add Cert Manager

Using the JFrog ChartCenter we are going to add JetStack Cert-Manager to our cluster to handle self-signed certificates.

Chart Location: https://chartcenter.io/jetstack/cert-manager

cert-manager

Install Cert-Manager:

To ensure Cert-Manager installs and operates correctly, you first need to create a namespace for cert-manager and add its Custom Resource Definitions (CRDs).

  1. Create the cert-management namespace:

    $ kubectl create namespace cert-management
    
  2. Install the CRDs:

    $ kubectl apply --validate=false -f \
      https://github.com/jetstack/cert-manager/releases/download/v0.16.1/cert-manager.crds.yaml
    
  3. Install the Helm chart from ChartCenter:

    $ helm install cert-manager center/jetstack/cert-manager \
         --namespace cert-management \
         --version 0.15.2
    
  4. Once installed, add the following Certificate issuer for self-signed certificates:

    apiVersion: cert-manager.io/v1alpha2
    kind: ClusterIssuer
    metadata:
      name: selfsigned-issuer
    spec:
      selfSigned: {}
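    Save the manifest to a file and apply it with kubectl (the filename here is only an example):

    $ kubectl apply -f selfsigned-issuer.yaml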
    

Now we have a cluster with multiple nodes and we don’t need to worry about certificates. The next step to running Elasticsearch is setting up storage.

Part 3: Setting up storage with Rook

For this example, we have chosen to use Rook, an open-source, Ceph-based CSI storage solution. To run Rook you must have unformatted volumes larger than 5 GB attached to each node; I achieved this in our Managed OpenStack platform by creating and attaching a 10 GB volume to each worker node.

Storage with Rook

How to Add Rook CSI

I’m going to cheat here. Rook isn’t complicated to deploy; however, to keep this blog focused on the EFK stack, I’m going to refer to a great example in our KoolKubernetes GitHub repository that steps through building a three-worker-node Rook cluster.

If you’re looking for an overview of Rook, an installation guide, and tips on validating your new Rook cluster, read through this Rook blog on ITNEXT.

Quick Guide to Deploying Rook on Kubernetes:

Clone the KoolKubernetes repository on any machine from which kubectl can deploy the manifests to your Kubernetes cluster.

$ git clone https://github.com/KoolKubernetes/csi.git

Deploy the first yaml:

$ kubectl apply -f rook/internal-ceph/1-common.yaml

Deploy the second YAML manifest for the Rook operator:

$ kubectl apply -f rook/internal-ceph/2-operator.yaml
configmap/rook-ceph-operator-config created
deployment.apps/rook-ceph-operator created
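After you have worked through the remaining manifests in the KoolKubernetes guide, you can confirm the Ceph components are up and that the block storage class used later in this post exists (this assumes the default rook-ceph namespace used by the manifests):

$ kubectl get pods -n rook-ceph
$ kubectl get storageclass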

Once your Rook cluster is running you can continue.

Now the fun part: let’s use ChartCenter to get Elasticsearch and Kibana running, then direct our Fluentd output into Elasticsearch.

Part 4: Deploy Elasticsearch using ChartCenter

The catch with all Helm charts is ensuring that you configure them for your environment using the values.yaml file and by specifying the version, namespace, and release (the name of the deployment).

The chart, available versions, instructions from the vendor, and security scan results can all be found at ChartCenter: https://chartcenter.io/elastic/elasticsearch.

Elasticsearch

To deploy the chart, you will need to create a values.yaml file (I called mine elastic-values.yml). To ensure Helm can access the YAML file, either provide the absolute path or run Helm from the directory where the file is located.

Some notes on the Elasticsearch values.yaml:

To ensure your deployment runs, keep the following values in line with the defaults.

clusterName: "elasticsearch"
protocol: http
httpPort: 9200
transportPort: 9300

To make life a little easier (not advised for production), make the following additions to your values.yaml file:

antiAffinity: "soft"
resources:
  requests:
    cpu: "100m"
    memory: "1500M"
  limits:
    cpu: "1000m"
    memory: "1500M"
esJavaOpts: "-Xmx1024m -Xms1024m"
replicas: 1
minimumMasterNodes: 1

To use the Rook storage, add the following to the values.yaml file.

NOTE: Ensure the storage class name matches your implementation.

volumeClaimTemplate:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: "rook-ceph-block"
  resources:
    requests:
      storage: 1Gi

Once your file is set up, save it and we are ready to deploy the chart.

$ helm install elasticsearch center/elastic/elasticsearch \
      --namespace monitoring-demo \
      --version 7.7.1 \
      -f elastic-values.yml

The above command will install the 7.7.1 release of Elasticsearch from ChartCenter into the monitoring-demo namespace using the configuration parameters defined in the elastic-values.yml file.
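You can watch the Elasticsearch pods come up and confirm the persistent volume claims bind to the Rook storage class with:

$ kubectl get pods -n monitoring-demo -w
$ kubectl get pvc -n monitoring-demo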

Part 5: Deploy Kibana

Deploying Kibana is very similar to Elasticsearch: you will need a values.yaml file; I used one named kibana-values.yml. For this demo, I used a NodePort to expose the Kibana UI; to do this, I modified the default values.yaml with the following override.

service:
  type: NodePort
  port: 5601
  nodePort: 31000
  labels: {}
  annotations: {}

Do not change elasticsearchHosts unless you modified the elastic-values.yml file. By default, the values.yaml file contains:

elasticsearchHosts: http://elasticsearch-master:9200

Port 9200 is the default Elasticsearch port, and elasticsearch-master is the service name created by the default Elasticsearch deployment.

The chart, available versions, instructions from the vendor, and security scan results can also all be found at ChartCenter: https://chartcenter.io/elastic/kibana.

To deploy Kibana run the following command:

$ helm install kibana-ui center/elastic/kibana \
      --namespace monitoring-demo \
      --version 7.7.1 \
      -f kibana-values.yml

Once deployed, you can confirm both Kibana and Elasticsearch are running by navigating to the Kibana UI in your browser of choice. My cluster is running on 10.128.130.41 and the nodePort is 31000 as specified in the values.yaml file, i.e. http://10.128.130.41:31000/app/kibana#/.
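If you are unsure of your node IP or the NodePort that was assigned, you can look both up with:

$ kubectl get nodes -o wide
$ kubectl get svc -n monitoring-demo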

Now we are ready to connect Fluentd to Elasticsearch, then all that remains is a default Index Pattern.

Part 6: Configure Fluentd

The Platform9 Fluentd operator is already running; you can find its pods in the pf9-logging namespace. What we need to do now is connect the two platforms; this is done by setting up an Output configuration.

You will need to place the configuration below in a YAML file and apply it to your cluster. Please note that you will need to adjust the user, password, index_name and, most importantly, the URL.

The URL is an important piece; if it isn’t correct, the data cannot be forwarded into Elasticsearch. The syntax is as follows:

http://<elasticsearch-service>.<namespace>.svc.cluster.local:<port>

If you have followed this example using the same names you will not need to change anything.

apiVersion: logging.pf9.io/v1alpha1
kind: Output
metadata:
  name: es-objstore
spec:
  type: elasticsearch
  params:
    - name: url
      value: http://elasticsearch-master.monitoring-demo.svc.cluster.local:9200
    - name: user
      value: myelasticuser
    - name: password
      value: mygreatpassword
    - name: index_name
      value: k8s-prdsjcmon01-fluentd

Use kubectl to apply the yaml file.
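For example (the filename is whatever you saved the Output configuration as; fluentd-output.yaml is just an illustration):

$ kubectl apply -f fluentd-output.yaml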

Once the file has been applied, Fluentd will start to forward data to Elasticsearch. Wait a few minutes, then refresh the Kibana UI and you will be able to go through the process of setting up the first index pattern.

Setting up an index pattern is a two-step process. First, you need a pattern that matches the inbound indices from Fluentd; this needs to match the index_name value. The next step is to identify the field Elasticsearch should use for log timestamps.
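As a rough example using the index_name from the Output above, the two values might look like this (the @timestamp field name is an assumption; use whichever time field Kibana offers for your indices):

Index pattern: k8s-prdsjcmon01-*
Time field:    @timestamp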

  1. Setting up an Index Pattern Step 1

    Step-1

  2. Setting up an Index Pattern Step 2

    Step-2

  3. Once the index pattern has been configured, you can use the Discover dashboard to view the log files.

    Dashboard