This document explains how Kubernetes monitors the health of containerized applications using probes and container states. The reader will learn about probes and how they help keep a service available.
Container's Health
After resources are allocated to a container, the application starts executing. First comes the boot stage, where the application configures itself, checks whether a database is online, and downloads and initializes some libraries. As soon as the boot process completes, the application is ready to receive requests and process them.
To determine the state of a container, one can configure Container Probes to be used by the kubelet. The kubelet executes the configured check mechanism, and the result of the check determines the proper action to perform on the container. For example, if a container running a REST application stops responding to requests and its Liveness Probe fails, the kubelet terminates the container and starts a new one (subject to the Pod's restartPolicy).
Container's Probes and Check mechanisms
Kubernetes (K8s) is only concerned with how the application behaves in terms of execution, so it needs to track whether the application is booting, is ready to receive requests, or is not responding at all. To keep track of this, K8s has three probes that run periodically and apply to different phases of the application's life (booting, ready for processing):
Liveness Probe
Startup Probe
Readiness Probe
The outcome of these probes influences what is called the Container State.
A container can have one of 3 states:
Waiting when a container is neither Running nor Terminated, for example while it is still pulling its image;
Running when a container is running without issues;
Terminated when a container started execution and either ran to completion or failed for some reason.
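The state of each container is reported in the Pod's status: every entry of containerStatuses has a state object holding exactly one of the keys waiting, running or terminated. As a minimal sketch, assuming the Pod has been exported with kubectl get pod <name> -o json and loaded into a Python dict, the helper below extracts the state of each container. The name container_states is hypothetical and the sample data is made up for illustration.

```python
def container_states(pod: dict) -> dict:
    """Map each container name to 'waiting', 'running' or 'terminated'."""
    states = {}
    for status in pod.get("status", {}).get("containerStatuses", []):
        # The state object carries a single key naming the current state.
        state = next(iter(status.get("state", {})), "unknown")
        states[status["name"]] = state
    return states


# Made-up sample shaped like the output of `kubectl get pod <name> -o json`.
sample_pod = {
    "status": {
        "containerStatuses": [
            {"name": "demo", "state": {"running": {"startedAt": "2023-12-08T10:00:00Z"}}},
            {"name": "sidecar", "state": {"waiting": {"reason": "ContainerCreating"}}},
        ]
    }
}
```

Calling container_states(sample_pod) on the sample reports demo as running and sidecar as waiting.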
Probe Mechanisms
A probe is a diagnostic performed periodically by the kubelet on a container. The probe is performed by executing code within a container or by making a network request.
There are four check mechanisms: exec, grpc, httpGet and tcpSocket.
exec
Executes a specified command inside the container. The diagnostic is considered successful if the command exits with status code 0.
apiVersion: v1
kind: Pod
metadata:
  name: example-exec
  namespace: default
spec:
  containers:
  - name: demo
    image: nginx
    livenessProbe:
      exec: # probe mechanism
        command:
        - cat # command to perform
        - /usr/share/nginx/html/index.html # 1st argument for cat command
Code Block 1 - perform cat command on index.html file for Liveness Probe
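The success rule of the exec mechanism can be sketched in a few lines of Python: run the command and treat exit status 0 as healthy and anything else, including a timeout, as a failure. This is only an illustration that runs the command on the local machine rather than inside a container; exec_probe is a hypothetical helper, not part of any Kubernetes client.

```python
import subprocess


def exec_probe(command, timeout_seconds=1.0):
    """Return True if the command exits with status 0 within the timeout."""
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
        )
    except (subprocess.TimeoutExpired, OSError):
        # A hung command or a missing executable counts as a failed probe.
        return False
    return result.returncode == 0
```

For example, exec_probe(["cat", "/usr/share/nginx/html/index.html"]) mirrors the check in Code Block 1 and succeeds only if the file exists and is readable.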
grpc
Performs a remote procedure call using gRPC. The target should implement gRPC health checks. The diagnostic is considered successful if the status of the response is SERVING. [2]
livenessProbe:
  grpc: # probe mechanism
    port: 8000 # port on which to perform the gRPC call
Code Block 2 - defining grpc call on container's port 8000 for Liveness Probe
httpGet
Performs an HTTP GET request against the Pod's IP address on a specified port and path. The diagnostic is considered successful if the response has a status code greater than or equal to 200 and less than 400.
livenessProbe:
  httpGet: # probe mechanism
    path: /health # endpoint where to perform the GET
    port: 8080 # port to use for the test
    httpHeaders: # optional headers
    - name: Custom-Header # header's name
      value: ItsAlive # header's value
Code Block 3 - defining HTTP GET call on container's port 8080 with optional header for Liveness Probe
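To make the success criterion concrete, the following sketch performs the same kind of check from Python's standard library: it sends a GET request and reports success only for a status code from 200 up to, but not including, 400. It is a simplified stand-in for what the kubelet does, not its implementation; http_get_probe and its parameters are hypothetical names chosen for this example.

```python
import http.client


def http_get_probe(host, port, path="/", headers=None, timeout_seconds=1.0):
    """Return True if GET http://host:port/path answers with 200 <= status < 400."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout_seconds)
    try:
        conn.request("GET", path, headers=headers or {})
        status = conn.getresponse().status
    except (OSError, http.client.HTTPException):
        # Refused connections, timeouts and malformed replies are failures.
        return False
    finally:
        conn.close()
    return 200 <= status < 400
```

With the values from the example above, the call would be http_get_probe(pod_ip, 8080, "/health", headers={"Custom-Header": "ItsAlive"}), where pod_ip stands for the Pod's IP address.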
tcpSocket
Performs a TCP check against the Pod's IP address on a specified port. The diagnostic is considered successful if the port is open. If the remote system (the container) closes the connection immediately after it opens, this counts as healthy.
livenessProbe:
  tcpSocket: # probe mechanism
    port: 8080 # port to use for the test
Code Block 4 - defining tcpSocket connection on container's port 8080 for Liveness Probe
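The TCP check amounts to attempting a connection and seeing whether it is accepted. A minimal Python sketch of that idea, with tcp_probe as a hypothetical helper name, could look like this:

```python
import socket


def tcp_probe(host, port, timeout_seconds=1.0):
    """Return True if a TCP connection to host:port can be opened in time."""
    try:
        # Opening the connection is the whole check; no data is exchanged.
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
```

Because nothing is sent or received, a server that accepts the connection and closes it right away still passes, which matches the behaviour described for tcpSocket.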
Container's Probes
Probes are used by the kubelet to determine the health of a container, and the kubelet then executes the appropriate action. For example, once the application has passed the boot process and is ready to do its task, a Liveness Probe is executed periodically. If this probe fails, the kubelet kills the container and starts a new one.
All probes share five parameters that are important to configure:
initialDelaySeconds: Seconds to wait after the container starts before the first probe (default: 0)
periodSeconds: Seconds between probe executions (default: 10)
timeoutSeconds: Seconds to wait for a reply before the probe counts as failed (default: 1)
successThreshold: Number of consecutive successful probe executions needed to mark the container healthy after a failure (default: 1; must be 1 for liveness and startup probes)
failureThreshold: Number of consecutive failed probe executions needed to mark the container unhealthy (default: 3)
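The two thresholds count consecutive results, so a single success resets a run of failures and vice versa. The sketch below is a simplified model of that bookkeeping, not the kubelet's actual code; apply_thresholds is a hypothetical name, and each element of results stands for the outcome of one probe execution.

```python
def apply_thresholds(results, success_threshold=1, failure_threshold=3, healthy=True):
    """Return the health verdict after each probe outcome in `results`."""
    verdicts = []
    streak = 0   # length of the current run of identical outcomes
    last = None  # outcome of the previous probe execution
    for outcome in results:
        streak = streak + 1 if outcome == last else 1
        last = outcome
        if outcome and streak >= success_threshold:
            healthy = True
        elif not outcome and streak >= failure_threshold:
            healthy = False
        # Below the threshold the previous verdict is kept.
        verdicts.append(healthy)
    return verdicts


# Two failures, a success, then three failures in a row (defaults: 1 and 3).
print(apply_thresholds([False, False, True, False, False, False]))
# [True, True, True, True, True, False]
```

Only the third consecutive failure flips the verdict to unhealthy; the earlier pair of failures is forgotten as soon as a probe succeeds.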
livenessProbe
Indicates whether the container is running.
When an app has been running for a long time, it may stop working normally, and the only way to recover is to restart it. To determine whether the app is running properly, the kubelet periodically executes a Liveness Probe.
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5
Code Block 5 - Liveness Probe with exec mechanism and time parameters
Code Block 5 shows a Liveness Probe that executes a cat command on the file /tmp/healthy. The probe runs for the first time 5 seconds after the container reaches the Running state and is repeated every 5 seconds after that.
startupProbe
Indicates whether the application within the container is started.
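A Startup Probe is used to determine whether an application has finished its startup phase. This is especially useful for legacy applications where startup is lengthy and it is hard to tell when this phase is over. While a Startup Probe is configured and has not yet succeeded, Liveness and Readiness Probes are not executed.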
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
    startupProbe:
      initialDelaySeconds: 1
      periodSeconds: 2
      timeoutSeconds: 1
      successThreshold: 1
      failureThreshold: 1
      exec:
        command:
        - cat
        - /etc/nginx/nginx.conf
Code Block 6 - Example of a Startup Probe
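A rule of thumb used in the Kubernetes documentation is that a Startup Probe gives the application at most failureThreshold × periodSeconds to finish starting. The small helper below (startup_budget_seconds is a hypothetical name) just evaluates that product; treat the result as an approximate upper bound, since it ignores initialDelaySeconds and the time each probe takes.

```python
def startup_budget_seconds(period_seconds: int = 10, failure_threshold: int = 3) -> int:
    """Approximate upper bound, in seconds, on the time allowed for startup."""
    return failure_threshold * period_seconds


# Values from Code Block 6: periodSeconds 2, failureThreshold 1.
print(startup_budget_seconds(period_seconds=2, failure_threshold=1))    # 2

# A slow-starting application probed every 10 seconds, up to 30 times.
print(startup_budget_seconds(period_seconds=10, failure_threshold=30))  # 300
```

With the values from Code Block 6 a single failed probe is enough to restart the container, so an application with a lengthy startup would likely need a larger failureThreshold.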
readinessProbe
Indicates whether the container is ready to respond to requests.
Sometimes an application becomes temporarily unable to receive traffic because it is running a very heavy process, such as loading a big file or executing a long SQL transaction. This does not warrant killing the container and starting a new one. Instead, we configure a Readiness Probe, which is executed throughout the container's lifetime, to check whether a Pod is able to receive traffic. While the probe fails, the Pod is removed from the endpoints of the Services that match it.
readinessProbe:
  initialDelaySeconds: 1
  periodSeconds: 2
  timeoutSeconds: 1
  successThreshold: 1
  failureThreshold: 1
  httpGet:
    host:
    scheme: HTTP
    path: /
    httpHeaders:
    - name: Host
      value: fake-app.com
    port: 80
Code Block 7 - Example of a Readiness Probe with a httpGet mechanism
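On the application side, the difference between liveness and readiness can be sketched as two endpoints that answer differently while the application is busy. The example below uses only Python's standard library; the paths /health and /ready, the busy flag and the helper start_probe_server are all assumptions made for illustration, not something Kubernetes prescribes.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Set while the application is doing heavy work and should not receive traffic.
busy = threading.Event()


class ProbeHandler(BaseHTTPRequestHandler):
    """Serves hypothetical /health (liveness) and /ready (readiness) endpoints."""

    def do_GET(self):
        if self.path == "/health":
            status = 200  # the process is alive even while it is busy
        elif self.path == "/ready":
            status = 503 if busy.is_set() else 200
        else:
            status = 404
        self.send_response(status)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # silence per-request logging in this sketch


def start_probe_server(host: str = "127.0.0.1", port: int = 0) -> ThreadingHTTPServer:
    """Start the endpoints on a background thread; port 0 picks a free port.

    Inside a container the server would need to listen on 0.0.0.0 so that
    the kubelet can reach it through the Pod's IP address.
    """
    server = ThreadingHTTPServer((host, port), ProbeHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling busy.set() before a long task and busy.clear() afterwards makes /ready answer 503 for the duration of the task while /health keeps answering 200, so a Readiness Probe on /ready would pause traffic without a Liveness Probe on /health triggering a restart.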
References
[1] kubernetes.io, “Pod Lifecycle,” Kubernetes, 2019. Available: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/. [Accessed: Dec. 08, 2023]
[2] kubernetes.io, “Health checking gRPC servers on Kubernetes,” Kubernetes, Oct. 01, 2018. Available: https://kubernetes.io/blog/2018/10/01/health-checking-grpc-servers-on-kubernetes/. [Accessed: Dec. 09, 2023]