Building a Monitoring System using Telegraf, InfluxDB and Kubernetes
Containerization is applied in nearly every project for many reasons, such as easy management, isolation of apps, and rapid deployment. To enable this technology, Docker is probably the most widely used container runtime so far. Although it is a well-known and heavily used containerization approach, its capabilities become limited as management complexity increases. At this point, to overcome these limits, a container management and orchestration system, Kubernetes (also known as K8s), comes onto the scene; it is by far the best-known orchestration tool at the moment. The structure of the nodes, the distribution of pods across the nodes, different container deployment and scheduling strategies, an open API that lets developers interact with Kubernetes' internal components, and many other valuable features make it a highly scalable platform.
In this article, our focus is to demonstrate how Telegraf and InfluxDB can be deployed in a Kubernetes cluster. The previous articles [1], [2] already showed how a similar system can be deployed via Docker containers using docker-compose. The use cases in those articles are slightly more complicated, and their focus was rather on the integration of sensors into Telegraf; here, the scenario is kept as simple as possible and the contribution is the Kubernetes configuration.
The scenario is based on collecting the internal memory and CPU usage of the Telegraf container, using the REST API of a Python application to retrieve random data, and then transferring the collected data into the InfluxDB database. To extend this example, e.g. by adding more sensors to Telegraf, you may consult the previously mentioned articles.
The development of the system components is accomplished through the following steps:
- Telegraf Configuration
- Implement Python Application & Build its Docker Image
- Kubernetes Configuration
- Apply Kubernetes Configuration & Evaluate Results
Note that the actual development environment is macOS Ventura, and Rancher is used for container management.
1. Telegraf Configuration
The configuration of the telegraf.conf file isn't complicated. It can start with an agent section that defines how Telegraf behaves, such as the data collection interval, collection jitter, etc. In this case, no agent section is defined, so Telegraf uses its default configuration.
The rest of the configuration is built on input and output processes; there are no processors or aggregators in this example. Three inputs are defined: CPU, memory, and HTTP. By default, the CPU and memory inputs deliver the CPU and memory usage of the container in which Telegraf runs, with no additional configuration required. The HTTP input differs from the others, since the intention is to establish a connection between a Python REST interface and Telegraf; all details for this input, such as the HTTP URL, the data collection interval, the metric name that will appear in InfluxDB, and the JSON data format, are provided. The final block belongs to InfluxDB. The URL of the database is the default one, however the token, organization, and bucket are first generated once the application is successfully booted. There is most likely a way to automate this step, but that is not the focus of this example.
[[inputs.cpu]]
[[inputs.mem]]

[[inputs.http]]
  urls = ["http://app:5000/random-data"]
  interval = "10s"
  name_override = "app_metrics"
  method = "GET"
  timeout = "5s"
  data_format = "json"

[[outputs.influxdb_v2]]
  urls = ["http://influxdb:8086"]
  token = "<replace-generated-token>"
  organization = "org"
  bucket = "bucket"
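To make the HTTP input concrete, the sketch below illustrates how Telegraf's `json` data format conceptually turns the app's response into an InfluxDB point: numeric JSON values become fields, and `name_override` supplies the measurement name. This is an illustrative Python helper, not Telegraf's actual code, and real line protocol also carries tags and a timestamp.

```python
import json

def to_line_protocol(payload: str, measurement: str) -> str:
    """Illustrative: map a flat JSON payload to a minimal line-protocol point."""
    fields = json.loads(payload)
    # Numeric JSON values become InfluxDB fields on the named measurement
    field_str = ",".join(f"{k}={v}" for k, v in fields.items())
    return f"{measurement} {field_str}"

print(to_line_protocol('{"random_metric": 42}', "app_metrics"))
# app_metrics random_metric=42
```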
At the end of this article, the missing configuration values, such as the token, organization, and bucket name, will be filled in.
2. Implement Python Application & Build Docker Image
Developing a Python application serving a REST interface doesn't require much effort; the application code below basically includes everything needed to return a random number when the /random-data GET endpoint is called.
from flask import Flask, jsonify
import random

app = Flask(__name__)

@app.route('/random-data', methods=['GET'])
def get_random_data():
    # Generate random data (replace with your actual data generation logic)
    random_value = random.randint(1, 100)
    return jsonify({"random_metric": random_value})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
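Before baking the app into an image, the endpoint can be smoke-tested in-process with Flask's built-in test client, so no server needs to be started. This is a quick sanity sketch that repeats the handler from above:

```python
from flask import Flask, jsonify
import random

app = Flask(__name__)

@app.route('/random-data', methods=['GET'])
def get_random_data():
    random_value = random.randint(1, 100)
    return jsonify({"random_metric": random_value})

# Exercise the endpoint in-process, without app.run()
response = app.test_client().get('/random-data')
print(response.status_code, response.get_json())
```

The call returns status 200 and a JSON body with a `random_metric` value between 1 and 100.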
The required libraries are listed in the requirements.txt file.
Flask==2.0.1
requests==2.26.0
The Dockerfile for building a Docker image from the Python code can be realized via the following lines: it selects a base image with Python 3.8, creates an /app folder as the working directory, copies the requirements.txt file into this folder, installs the libraries via the pip tool, copies app.py into the /app folder, and finally executes the app.py file. The order of these lines is quite important for reducing build time through layer caching. For more details on how Docker builds actually work, look at this Building a Dockerfile article.
FROM python:3.8-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY app.py .
CMD ["python", "app.py"]
All needed files are complete, and the final step in this section is to build the Docker image so it can be used in Kubernetes. The command below is sufficient to create the image; note that the tag matches the image name referenced later in the deployment:
docker build -t app:latest .
Once the image is built, you may check whether it is available among the Docker images via the docker images | grep app command on the terminal.

3. Kubernetes Configuration
Assume that your environment has at least one node, kubectl is installed, and a Docker image has been generated for the Python app. Now, our single purpose is to create a Kubernetes cluster in which all these pieces of software run in harmony: Telegraf collects the data and forwards it into InfluxDB, and we can display the incoming data in InfluxDB through a web interface.
To realize all these components, the following steps should be performed:
1. The Python app requires a deployment to be placed in a pod, and a service so that other components can access it.
2. Telegraf requires a deployment to run in a pod, a service to make it accessible to other components, and a ConfigMap for the telegraf.conf data.
3. InfluxDB requires a deployment to run it in a pod, a service to open it to the use of other pods, and an additional service, either NodePort or LoadBalancer, to enable access from a web browser.
3.1 Kubernetes Config for Python App
In the first part below, a service is defined that allows us to access the app pod through port 5000. In the second part, the deployment of the Python app, which is responsible for generating a random value when its REST API is called, is located. The important details in this config are that containerPort must match the port of the Python REST app, and, since the image is created locally, imagePullPolicy: Never should be added in the containers section. Unless you add this line, Kubernetes will try to fetch the image from Docker Hub, and a failure is unavoidable :).
---
apiVersion: v1
kind: Service
metadata:
  name: app
  namespace: telegraf-namespace
  labels:
    app: app
spec:
  selector:
    app: app
  ports:
    - protocol: TCP
      port: 5000        # The port your Python app listens on
      targetPort: 5000  # The port your Python app listens on within the container
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
  namespace: telegraf-namespace
  labels:
    app: app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
        - name: app
          image: app:latest
          imagePullPolicy: Never # or IfNotPresent
          ports:
            - containerPort: 5000
---
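This service is why the Telegraf config can use `http://app:5000/random-data`: within the same namespace, a Service name resolves directly as a DNS entry. The sketch below shows the standard Kubernetes DNS convention for the fully qualified form, which would be needed across namespaces; the helper function is a hypothetical illustration, not a Kubernetes API.

```python
def service_dns(name: str, namespace: str, port: int) -> str:
    """Build the fully qualified in-cluster URL for a Kubernetes Service."""
    # Standard convention: <service>.<namespace>.svc.cluster.local
    return f"http://{name}.{namespace}.svc.cluster.local:{port}"

print(service_dns("app", "telegraf-namespace", 5000))
# http://app.telegraf-namespace.svc.cluster.local:5000
```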
3.2. Kubernetes Config for Telegraf
In order to run Telegraf in a pod, a telegraf.conf file is required, therefore a ConfigMap is utilized; a deployment for the Telegraf pod, in which the ConfigMap is mounted as a volume; and finally a service to make Telegraf accessible from other services.
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: telegraf-config
  namespace: telegraf-namespace
data:
  telegraf.conf: |
    [global_tags]
      # Add any global tags here

    [agent]
      interval = "10s" # Adjust the interval as per your requirements
      round_interval = true
      metric_batch_size = 1000
      metric_buffer_limit = 10000
      collection_jitter = "0s"
      flush_interval = "10s"
      flush_jitter = "0s"
      precision = ""
      hostname = "" # Set the hostname for identifying metrics source

    [[inputs.cpu]]
    [[inputs.mem]]

    [[inputs.http]]
      urls = ["http://app:5000/random-data"]
      interval = "10s"
      name_override = "app_metrics"
      method = "GET"
      timeout = "5s"
      data_format = "json"

    [[outputs.influxdb_v2]]
      urls = ["http://influxdb:8086"]
      token = "<token-generated-in-website>"
      organization = "org"
      bucket = "bucket"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: telegraf
  namespace: telegraf-namespace
spec:
  replicas: 1 # Adjust the number of replicas as needed
  selector:
    matchLabels:
      app: telegraf
  template:
    metadata:
      labels:
        app: telegraf
    spec:
      containers:
        - name: telegraf
          image: telegraf:latest # Replace with the Telegraf image name and version
          volumeMounts:
            - name: config-volume
              mountPath: /etc/telegraf
      volumes:
        - name: config-volume
          configMap:
            name: telegraf-config
---
apiVersion: v1
kind: Service
metadata:
  name: telegraf-service
  namespace: telegraf-namespace
spec:
  selector:
    app: telegraf # it points to the telegraf app
  ports:
    - protocol: TCP
      port: 8181       # Telegraf listens on this port
      targetPort: 8181 # The port Telegraf listens on within the container
3.3. Kubernetes Config for InfluxDB
Compared to the aforementioned configurations, the InfluxDB configuration requires more effort. The needed elements are a persistent volume claim for the InfluxDB data, a deployment that runs the InfluxDB database in a pod, a service that enables other pods to access InfluxDB, and a NodePort service that allows us to reach InfluxDB from a web browser. Instead of a NodePort, it is also possible to use a LoadBalancer service, which is much more suitable for production environments.
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: influxdb-pvc
  namespace: telegraf-namespace
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi # Storage size
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: influxdb
  namespace: telegraf-namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: influxdb
  template:
    metadata:
      labels:
        app: influxdb
    spec:
      containers:
        - name: influxdb
          image: influxdb:latest
          ports:
            - containerPort: 8086
          volumeMounts:
            - name: influxdb-storage
              mountPath: /var/lib/influxdb
      volumes:
        - name: influxdb-storage
          persistentVolumeClaim:
            claimName: influxdb-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: influxdb
  namespace: telegraf-namespace
spec:
  selector:
    app: influxdb
  ports:
    - protocol: TCP
      port: 8086
      targetPort: 8086
---
apiVersion: v1
kind: Service
metadata:
  name: influxdb-nodeport
  namespace: telegraf-namespace
spec:
  selector:
    app: influxdb
  type: NodePort
  ports:
    - protocol: TCP
      port: 8086
      targetPort: 8086
      nodePort: 30001 # Choose any available port number in your range
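One detail worth noting about the NodePort value: by default, Kubernetes only allocates node ports from the range 30000-32767, so a value outside that range is rejected by the API server. A tiny sketch of that validation rule (the helper is illustrative only):

```python
def valid_node_port(port: int, lo: int = 30000, hi: int = 32767) -> bool:
    """Check a port against Kubernetes' default NodePort allocation range."""
    return lo <= port <= hi

print(valid_node_port(30001))  # the nodePort chosen above
print(valid_node_port(8086))   # the in-cluster port is outside the range
```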
4. Apply Kubernetes Configuration & Evaluate Results
Assume that you have the following folder structure, which includes the app-related files and a YAML file for the Kubernetes configuration.

The final step is to bring up the whole Kubernetes environment; simply execute the following commands on the terminal.
First, a new Kubernetes namespace is created, since the intention is to separate our services from other namespaces.
kubectl create namespace telegraf-namespace
Then, the YAML file can be applied:
kubectl apply -f telegraf-influxdb.yaml
To test the deployment, in case you use the config above, the InfluxDB web page should be accessible via the http://localhost:30001 URL.

Clicking the CONTINUE button will navigate you to the following web page:

Please check the Telegraf logs for any errors. In our case, we see that Telegraf cannot access the InfluxDB bucket, which makes sense, because the token is newly created and still has to be added into telegraf.conf.
kubectl logs telegraf-f5b76549d-s8pzc -n telegraf-namespace
2023-07-25T20:00:37Z E! [agent] Error writing to outputs.influxdb_v2: failed to send metrics to any configured server(s)
2023-07-25T20:00:47Z E! [outputs.influxdb_v2] When writing to [http://influxdb:8086]: failed to write metric to bucket (401 Unauthorized): unauthorized: unauthorized access
2023-07-25T20:00:47Z E! [agent] Error writing to outputs.influxdb_v2: failed to send metrics to any configured server(s)
2023-07-25T20:00:57Z E! [outputs.influxdb_v2] When writing to [http://influxdb:8086]: failed to write metric to bucket (401 Unauthorized): unauthorized: unauthorized access
After adding the token, you can simply set replicas: 0 in the Telegraf deployment, apply the deployment as above, then set it back to replicas: 1, and apply the Kubernetes configuration once again. You may reasonably ask why this manual approach is selected; it is for simplicity. (As an alternative, kubectl rollout restart deployment/telegraf -n telegraf-namespace restarts the pods in a single step.)

Useful Commands for Kubernetes
During the testing phase, many commands were used to inspect the created pods, services, and logs, and to delete them as well. Since a separate namespace is created, it has to be appended to each command, as seen below:
# Get pods, service
kubectl get pods -n telegraf-namespace
kubectl get services -n telegraf-namespace
# See the pod specific logs
kubectl logs influxdb-86677c6945-bxzpg -n telegraf-namespace -f
# Get, Delete service and deployment
kubectl get deployments -n telegraf-namespace
kubectl delete service influxdb -n telegraf-namespace
kubectl delete deployment telegraf -n telegraf-namespace
You may find more details on the official Kubernetes website.
Summary
In this article, the essential goal is to bring diverse components together to build a small monitoring platform that retrieves data from a Python application and the Telegraf container itself. The data retrieval is handled by Telegraf, whereas storage and visualization are handled by InfluxDB. To run all these pieces of software together within the same ecosystem, Kubernetes is selected as the container orchestrator. I hope you can use it in your projects, or at least that it serves as a starting point for different ideas. The source code of the project is available under the following link:
Source Code: https://github.com/cemakpolat/telegraf-influxdb-kubernetes-monitoring/tree/main