
Elasticsearch and Fluent Bit

Discover a seamless way to set up Elasticsearch and Fluent Bit for your Kubernetes cluster with this in-depth, step-by-step guide. The tutorial is aimed at anyone who wants a centralized logging solution for managing, analyzing, and visualizing logs in real time. Whether you are a DevOps engineer looking to sharpen your log-management skills or an IT enthusiast exploring Kubernetes’ capabilities, this guide has you covered. Dive into the details of configuring Elasticsearch and Fluent Bit in a Kubernetes environment to optimize your applications’ performance, troubleshoot issues faster, and gain valuable insights from your log data. Master these powerful open-source tools and transform your approach to log analysis and data visualization.

Setting up Elasticsearch using Helm in Kubernetes

In our quest to provide the most robust logging solution, we move on to setting up Elasticsearch, a search engine known for its distributed, multitenant-capable full-text search and analysis capabilities with an HTTP web interface and schema-free JSON documents.

We initiate the process by creating a dedicated namespace, logging, to separate our Elasticsearch resources from other applications running in our Kubernetes cluster. This ensures a neat and organized environment that is easier to manage and troubleshoot.

Then, we utilize Helm, the package manager for Kubernetes, to install Elasticsearch. Helm simplifies deploying and managing applications on Kubernetes through charts, collections of files that describe a related set of Kubernetes resources. We will install the Elasticsearch chart from Elastic’s Helm repository.

As part of the Helm command, we configure several settings to adjust the Elasticsearch deployment according to our needs. We specify the number of Elasticsearch nodes to be 1 (replicas=1) as this guide assumes a single-node setup for simplicity and learning purposes. However, Elasticsearch is designed to be scalable and distributed, and thus you may increase the number of replicas according to your actual production needs.

Furthermore, we set the minimum number of master nodes to 1 (minimumMasterNodes=1) to match our single-node setup. In a multi-node production cluster, this value should be a quorum of the master-eligible nodes (N/2 + 1); that quorum is what prevents the “split-brain” situation, which could otherwise lead to data inconsistency.

Lastly, we specify the storage request of the Persistent Volume Claim (PVC) created for each Elasticsearch data node to be 10 Gigabytes (volumeClaimTemplate.resources.requests.storage=10Gi). This value ensures the Elasticsearch data node has enough disk space to function, but it can be adjusted based on the projected data influx.

With these commands, you’ll have an Elasticsearch setup tailored to your environment, providing a sturdy foundation for your Kubernetes logging needs. In the following sections, we will build upon this setup by adding more components and integrating them for a full-fledged logging solution.

kubectl create namespace logging
helm repo add elastic https://helm.elastic.co
helm repo update
helm install esarticle elastic/elasticsearch --namespace logging --set replicas=1 --set minimumMasterNodes=1 --set volumeClaimTemplate.resources.requests.storage=10Gi 
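If you prefer to keep these settings in version control rather than repeat --set flags, the same configuration can live in a values file. A sketch, assuming it is saved as values.yaml:

```yaml
# values.yaml: equivalent to the --set flags above
replicas: 1
minimumMasterNodes: 1
volumeClaimTemplate:
  resources:
    requests:
      storage: 10Gi
```

Passing it with helm install esarticle elastic/elasticsearch --namespace logging -f values.yaml produces the same release as the command above.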

Verifying the Successful Deployment of Elasticsearch

Upon successfully deploying Elasticsearch in our Kubernetes cluster, we will now undertake a few steps to ensure our installation is operating as expected. Verification is a critical step that ensures our system is ready to deliver reliable services.

Firstly, we need to ensure that all the Elasticsearch cluster members have come up and are running as expected. Kubernetes provides a handy command for us to observe the state of our pods, with the help of label selectors (-l app=elasticsearch-master). The -w or --watch flag keeps our query running, providing real-time updates of our pods’ status. We should be looking out for a status of ‘Running’ for all our Elasticsearch pods.

Next, Elasticsearch is secured by default, which means we’ll need the credentials of the built-in elastic user to interact with our cluster. Kubernetes stores these credentials as secrets, allowing us to retrieve the password securely. We access the elasticsearch-master-credentials secret and extract the encoded password field. We then decode the retrieved value using the base64 -d command to get the actual password.

Lastly, we can utilize Helm’s test functionality to execute pre-defined tests on our deployed release. By running the test command on our esarticle release, we can ensure that our Elasticsearch cluster is not only running but is also functioning as expected. A successful test will affirm our cluster’s health, confirming a successful deployment.

By following these steps, we can rest assured that our Elasticsearch setup is solid and ready for the next stages of our logging system setup.

1. Watch all cluster members come up.
  $ kubectl get pods --namespace=logging -l app=elasticsearch-master -w
2. Retrieve elastic user's password.
  $ kubectl get secrets --namespace=logging elasticsearch-master-credentials -ojsonpath='{.data.password}' | base64 -d
3. Test cluster health using Helm test.
  $ helm --namespace=logging test esarticle
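If you script against the cluster, the decoding step above is easy to reproduce. A minimal Python sketch of what base64 -d does, using a dummy secret value rather than a real one:

```python
import base64

def decode_secret(encoded: str) -> str:
    """Decode a base64-encoded Kubernetes secret value, as `base64 -d` does."""
    return base64.b64decode(encoded).decode("utf-8")

# Dummy value for illustration; in practice the encoded string comes from
# `kubectl get secrets ... -o jsonpath='{.data.password}'`
print(decode_secret("Y2hhbmdlbWU="))  # prints "changeme"
```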

Implementing Elasticsearch Index Alias for Efficient Log Management

As the volume of data we log grows over time, effective log management becomes an essential aspect of maintaining a performant and resource-efficient logging system. Elasticsearch offers several features to support this, one of which is the use of index aliases. An index alias in Elasticsearch is a secondary name used to refer to one or more existing indices. Using an alias can simplify many operations, including searching across several indices and switching between indices without modifying client code.

In our case, we’re creating an alias for the ’es-indeax-name’ index. This alias, ‘alias_es-indeax-name’, will provide us with a layer of abstraction, allowing us to perform operations on the index without directly referring to its primary name.

The real power of using an index alias comes to light when used in conjunction with Elasticsearch’s Index Lifecycle Management (ILM) feature. With ILM, we can define policies to automate routine tasks like index rotation, aging data out, and deleting indices that are no longer required. By applying an ILM policy to an alias, we can manage all associated indices according to the defined policy.

This Elasticsearch configuration will ensure that our logging system remains resource-efficient, performant, and scalable, even as the volume of log data grows. It allows us to maximize the value of our logs, using them for real-time troubleshooting as well as historical analysis, without overtaxing our resources or compromising system performance.

With a port-forward to the Elasticsearch service running (kubectl -n logging port-forward svc/elasticsearch-master 9200:9200), create the alias; the -k flag is needed because the chart secures the API with a self-signed certificate:

curl -k -X POST "https://localhost:9200/_aliases" -u elastic:password -H 'Content-Type: application/json' -d'
{
  "actions": [
    {
      "add": {
        "index": "es-indeax-name",
        "alias": "alias_es-indeax-name"
      }
    }
  ]
}'

Alternatively, in Kibana’s Dev Tools console, execute the following request to create the alias:

POST /_aliases
{
  "actions": [
    {
      "add": {
        "index": "es-indeax-name",
        "alias": "alias_es-indeax-name"
      }
    }
  ]
}
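To pair the alias with ILM as described above, you can define a lifecycle policy and let it manage the indices behind the alias. A hedged sketch in the same Dev Tools style; the policy name and the rollover/delete thresholds here are illustrative, not part of the original setup:

```
PUT /_ilm/policy/logs-cleanup-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "5gb",
            "max_age": "7d"
          }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
```

Note that rolling over through an alias additionally requires one index to be marked as the write index (is_write_index: true in the alias action).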

Deploying Kibana with Helm

In the next stage of setting up our logging system, we will deploy Kibana, a visualization tool that works hand-in-hand with Elasticsearch to provide actionable insights from our logged data. With Kibana, we can create dashboards, maps, and more to present our data visually and assist with quick and informed decision-making.

We will deploy Kibana using Helm, a package manager for Kubernetes that simplifies the deployment and management of applications. Deploying Kibana with Helm not only simplifies the process but also ensures a consistent configuration aligned with best practices.

Our Helm command installs Kibana into the ’logging’ namespace where our Elasticsearch instance resides, establishing a shared environment for our logging system. We use the --set flag to configure Kibana to connect to our Elasticsearch service at https://elasticsearch-master:9200, thus forming a link between our data storage (Elasticsearch) and our data visualization tool (Kibana).

Upon executing this command, Helm will deploy Kibana with our specified settings. Once it completes, Kibana will be set up and ready to visualize the log data stored in our Elasticsearch instance. This completes a significant part of our logging infrastructure, setting the stage for effective log data analysis and monitoring.

helm install kibana elastic/kibana --namespace=logging --set elasticsearchHosts=https://elasticsearch-master:9200

Exploring Kibana with Elasticsearch: An Essential Guide

Are you ready to take a deep dive into your data? If so, the first step in this journey is to retrieve the access credentials. Elasticsearch uses basic authentication, and our setup relies on the built-in ’elastic’ superuser.

In the Kubernetes environment, the password for this user is stored as a secret. Kubernetes secrets are secure objects which contain small amounts of sensitive data such as passwords, tokens or keys. To retrieve the password for the ’elastic’ user, we will query this secret using the kubectl command. The returned password will be encoded in base64, so we will need to decode it as well.

kubectl get secrets --namespace=logging elasticsearch-master-credentials -ojsonpath='{.data.password}' | base64 -d

Once you have the ’elastic’ user’s password, you are ready to access Kibana, a user-friendly interface for visualizing Elasticsearch data. To do this, we need to set up port-forwarding from our local machine to the Kubernetes service that is hosting Kibana. This is done by using another kubectl command.

kubectl -n logging port-forward svc/kibana-kibana 5601:5601

With port-forwarding established, Kibana becomes available on our local machine at ‘http://localhost:5601’. Simply open a web browser and navigate to this address. You will be prompted for the ’elastic’ user’s credentials - remember the password you retrieved earlier?

And there you have it - a secure and efficient route from your local machine to a world of data visualizations, all thanks to the power of Kubernetes, Elasticsearch, and Kibana. Get ready to explore and uncover the patterns hidden in your data!

Installing Fluent Bit using Helm

In our quest to build a robust logging infrastructure, our next component to deploy is Fluent Bit. Fluent Bit is an open-source log processor and forwarder which allows us to collect logs from various sources, process them in different ways, and send them to multiple destinations, one of them being Elasticsearch in our setup.

Fluent Bit is lightweight and has a pluggable architecture, which makes it an ideal choice for dynamically changing environments like Kubernetes. Fluent Bit’s Kubernetes filter automatically enriches logs with Kubernetes metadata, giving us better context for the logged data.

We will be deploying Fluent Bit into the same ’logging’ namespace where our Elasticsearch and Kibana instances reside, maintaining our centralized logging environment. We will use Helm for this deployment as well. Helm, being the package manager for Kubernetes, streamlines our application deployment by handling the necessary configurations and dependencies.

Upon executing this command, Helm will deploy Fluent Bit into our ’logging’ namespace. Post-deployment, Fluent Bit will start collecting log data from various sources, process it and forward it to our Elasticsearch instance, completing the logging pipeline. This setup not only collects and stores our log data but also empowers us with data visualization capabilities via Kibana.

helm repo add fluent https://fluent.github.io/helm-charts
helm repo update
helm install fluent-bit fluent/fluent-bit --namespace=logging

Get Fluent Bit build information by running these commands:

export POD_NAME=$(kubectl get pods --namespace logging -l "app.kubernetes.io/name=fluent-bit,app.kubernetes.io/instance=fluent-bit" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace logging port-forward $POD_NAME 2020:2020
curl http://127.0.0.1:2020

Customizing Fluent Bit Configuration for Enhanced Logging

While Fluent Bit offers default configuration options that can handle a wide variety of log data, sometimes we need to customize its settings to meet our specific logging requirements. In our setup, we are customizing the Fluent Bit ConfigMap to handle logs in a more tailored manner. ConfigMaps in Kubernetes allow us to manage our application’s configurations separately from the application code, making our applications more portable and scalable.

First, we are introducing a custom parser named ‘crio’ in our ‘custom_parsers.conf’ to handle the log data. This parser uses a regular expression to extract meaningful data from the logs. The logs are timestamped, and their format is recognized based on the defined regular expression. The ‘Decode_Field_As’ directive is used to interpret the extracted ’log’ field as JSON data.
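Before deploying, the parser’s regular expression can be sanity-checked locally. A small Python sketch using a fabricated CRI-O-style log line (Python uses (?P&lt;name&gt;...) for named groups, while Fluent Bit’s regex engine accepts (?&lt;name&gt;...)):

```python
import re

# Same pattern as the crio parser, rewritten with Python named-group syntax
CRIO_RE = re.compile(
    r"^(?P<time>[^ ]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) (?P<log>.*)$"
)

# Fabricated CRI-O-style log line for illustration
line = "2023-07-01T12:00:00.000000000+00:00 stdout F hello from nginx"
m = CRIO_RE.match(line)
print(m.group("time"))    # prints "2023-07-01T12:00:00.000000000+00:00"
print(m.group("stream"))  # prints "stdout"
print(m.group("log"))     # prints "hello from nginx"
```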

Next, in ‘fluent-bit.conf’, we are customizing various Fluent Bit service-level configurations, including the input plugin settings for collecting logs from the Nginx application and the output plugin settings for forwarding the processed logs to Elasticsearch.

In the input section, we have specified that Fluent Bit should tail Nginx logs located at ‘/var/log/containers/nginx-*.log’ and parse them using the previously defined ‘crio’ parser.

In the output section, we have instructed Fluent Bit to forward these processed logs to the Elasticsearch instance, specifying the necessary connection parameters, index name, and authentication details.

By providing this customized configuration, Fluent Bit can efficiently parse our application’s log data and forward it to Elasticsearch for storage and further analysis. With this setup, we can gain insights from our logs using Kibana’s visualization capabilities, thereby improving our application’s observability and troubleshooting effectiveness.

  custom_parsers.conf: |

    [PARSER]
        Name crio
        Format regex
        Regex ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<log>.*)$
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L%z
        Decode_Field_As json log
        
  fluent-bit.conf: |

    [SERVICE]
        Daemon Off
        Flush 1
        Log_Level debug
        Parsers_File /fluent-bit/etc/parsers.conf
        Parsers_File /fluent-bit/etc/conf/custom_parsers.conf
        HTTP_Server On
        HTTP_Listen 0.0.0.0
        HTTP_Port 2020
        Health_Check On

    [INPUT]
        Name tail
        Tag nginx.*
        Path /var/log/containers/nginx-*.log
        Parser crio
        Mem_Buf_Limit 5MB
        Skip_Long_Lines On

    [OUTPUT]
        Name es
        Match nginx.*
        Host elasticsearch-master
        Port  9200
        Index es-indeax-name
        Logstash_Format Off
        HTTP_User elastic
        HTTP_Passwd ********
        tls On
        tls.verify Off
        Type _doc
        Retry_Limit False
        Suppress_Type_Name On
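Once this ConfigMap is applied and the Fluent Bit pods have restarted, it is worth confirming that documents are actually reaching the index. Two quick checks from Kibana’s Dev Tools console (the index name matches the OUTPUT section above):

```
GET /es-indeax-name/_count

GET /es-indeax-name/_search
{
  "size": 1
}
```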

Setting up Kubernetes Ingress for Permanent Access to Kibana Web UI

As data engineers and analysts, we often need constant access to our Kibana dashboards to monitor real-time data, perform troubleshooting, or even run sporadic data exploration tasks. In a cloud-native environment where Kibana is deployed on a Kubernetes cluster, setting up an ingress resource is a practical solution to enable permanent access to Kibana’s Web UI.

Kubernetes Ingress is a powerful tool that manages external access to services within a cluster. It provides HTTP and HTTPS routes from outside the cluster to services within the cluster, thus enhancing the overall accessibility of your applications.

When setting up the ingress resource for Kibana, we need to create a dedicated YAML configuration file. This file will contain necessary details including the host name, path, and the associated service (Kibana in our case). It is crucial to remember to apply TLS for secure connections if your cluster is exposed to the internet. After the creation of this YAML file, it is applied to the cluster using a simple kubectl apply command, which will set up the ingress resource as per the configuration.

With a correctly configured ingress resource, you will be able to access your Kibana Web UI from any location via a dedicated URL. This adds a new level of convenience and enables seamless monitoring and data visualization capabilities of Elasticsearch via Kibana in your Kubernetes cluster.

TLS Secret and Ingress manifests:

apiVersion: v1
kind: Secret
metadata:
  name: kibana-ingress-tls
  namespace: logging
data:
  tls.crt: >-
    LS0tL..base64_cert
  tls.key: >-
    LS0tL..base64_key
type: kubernetes.io/tls
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: kibana-ingress
  namespace: logging
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - kibana.yourdomain.com
      secretName: kibana-ingress-tls
  rules:
    - host: kibana.yourdomain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: kibana-kibana
                port:
                  number: 5601
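Assuming both manifests above are saved in a single file, for example kibana-ingress.yaml, the kubectl apply step mentioned earlier is:

```
kubectl apply -f kibana-ingress.yaml
```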