⚓ Kubernetes: Container Probe — Liveness
>>> Ensuring Application Stability with Automatic Health Checks and Recovery
Hello World… A probe is a tool or mechanism used to inspect, monitor, or measure the status or condition of something. It is typically used to gather information without changing the subject being examined. In the world of Kubernetes, a probe is a diagnostic mechanism that checks the health and status of a container within a pod. Probes are performed periodically by the kubelet to determine the current state of a container. Based on the result of the probe, Kubernetes can decide whether to restart a container, keep it running, or route traffic to it.
In this blog, I’ll dive into one of the key types of probes in Kubernetes: the Liveness Probe. It plays a vital role in maintaining the health and stability of your applications. So, let’s get started…
Liveness Probe:
- A Liveness Probe in Kubernetes is used to check if a container is healthy and functioning properly.
- It helps ensure that the application running inside a container is not stuck or in an unrecoverable state.
- If a liveness probe fails, Kubernetes will assume that the container is unhealthy and restart it.
- This helps ensure high availability and self-healing of applications.
Why???
- Detecting Application Failures: Applications can crash, hang, or enter states where they are no longer functioning as expected. For instance, an application might encounter a deadlock, become unresponsive due to high memory usage, or simply freeze. A liveness probe continuously checks the health of a container. If the container stops responding (e.g., the application is stuck or in an unrecoverable state), the liveness probe will fail, triggering Kubernetes to restart the container automatically. This ensures that any failed or stuck applications can be revived without human intervention.
- Self-Healing: Kubernetes uses the liveness probe to implement self-healing for containers. When the liveness probe fails for a specified number of times, Kubernetes restarts the affected container. This automatic recovery helps maintain the health of the system and minimizes downtime. By enabling this self-healing capability, you reduce the need for manual intervention and improve the reliability of your services.
- Ensuring High Availability: In distributed systems, downtime of even a single instance of an application can lead to a degraded user experience. With liveness probes, Kubernetes can quickly detect unhealthy containers and restart them, thereby maintaining the availability of services. This is particularly important in systems that require high uptime and minimal disruptions.
- Restarting Legacy Applications: Some applications are not designed to recover gracefully from certain failures and might require a restart to function properly again. Liveness probes are particularly useful for legacy applications or those that do not have robust error-handling mechanisms built in. By configuring a simple liveness probe, Kubernetes can manage these applications effectively, ensuring that they remain operational without requiring extensive code changes.
- Graceful Recovery from Resource Exhaustion: Applications may sometimes use up all available resources, like CPU or memory, causing them to become unresponsive. A liveness probe can detect such resource exhaustion by monitoring endpoints or processes within the container. Once the issue is detected, Kubernetes can restart the container to free up resources and allow the application to start afresh.
When To Use?
Liveness probes are useful when:
- An application might enter an unresponsive state, and restarting the container can recover it.
- We want to ensure that an application remains healthy and responsive throughout its lifecycle.
Example: Manifest
- Below is a simple example of a liveness probe using an NGINX container in Kubernetes.
- The liveness probe is configured to check an HTTP endpoint served by NGINX; later, we will simulate a failure and observe how Kubernetes restarts the container.
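Here is what basic-example.yaml looks like. This is a minimal sketch reconstructed from the explanation below, so the file in the linked repository may differ slightly in formatting or values.
# basic-example.yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-pod
spec:
  containers:
    - name: nginx-container
      image: nginx
      ports:
        - containerPort: 80
      livenessProbe:
        httpGet:
          path: /               # HTTP GET on the root path served by NGINX
          port: 80
        initialDelaySeconds: 10 # wait 10 seconds after the container starts
        periodSeconds: 5        # probe every 5 seconds
        timeoutSeconds: 2       # each probe must respond within 2 seconds
        failureThreshold: 3     # restart after 3 consecutive failures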
Example: Explanation
- Explanation for the above manifest file.
- The above manifest defines a Pod named liveness-pod that runs a single container called nginx-container using the nginx image.
- The container exposes port 80 and has a liveness probe configured to perform an HTTP GET request on the root path (/) every 5 seconds.
- It waits 10 seconds after the container starts before performing the first check and considers the container unhealthy if it fails to respond 3 times in a row within 2 seconds each time.
- If the container is deemed unhealthy, Kubernetes will automatically restart it.
- A detailed line-by-line explanation of the code can be found here: GitHub
Example: Testing
Step 1: Create a Pod
- Create the pod using the command below (a confirmation message is shown after the commands):
kubectl apply -f basic-example.yaml
# Command to create a pod:
kubectl apply -f <manifest-file-name>
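kubectl should confirm the creation with output roughly like:
pod/liveness-pod created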
Step 2: Check Pod Status
- Check the status of the pod using the command below (sample output follows):
kubectl get pods
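If the container started cleanly, the output should look roughly like this (the AGE value will differ):
NAME           READY   STATUS    RESTARTS   AGE
liveness-pod   1/1     Running   0          20s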
Step 3: View Pod Details
- View the details of the pod using the command below:
kubectl describe pod liveness-pod
# Command to describe a specific pod:
kubectl describe pod <pod-name>
Step 4: Simulate A Failure
- To test the liveness probe’s functionality, we can simulate a failure by making the / endpoint unresponsive.
- Log in to the pod’s container:
kubectl exec -it liveness-pod -- sh
# Command to log in to the pod's container:
kubectl exec -it <pod-name> -- <shell>
- Rename or delete the default HTML directory:
mv /usr/share/nginx/html /usr/share/nginx/html-bak
- This will cause NGINX to return a 404 Not Found for requests to /, which will trigger liveness probe failures.
Step 5: Monitor the Liveness Probe
- Check the details of the pod and look at the Events section (sample events follow the command):
kubectl describe pod liveness-pod
# Command to describe a specific pod:
kubectl describe pod <pod-name>
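Once the probe starts failing, the Events section will contain entries roughly like the following (exact wording and timings vary by Kubernetes version):
Warning  Unhealthy  Liveness probe failed: HTTP probe failed with statuscode: 404
Normal   Killing    Container nginx-container failed liveness probe, will be restarted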
Step 6: Observe Pod Restarts
- Check the pod status and watch the RESTARTS count (sample output follows):
kubectl get pods
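Once the failure threshold is reached, the RESTARTS counter increases, roughly like:
NAME           READY   STATUS    RESTARTS   AGE
liveness-pod   1/1     Running   1          3m45s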
Step 7: Cleanup
- Command to delete the above-created pod:
kubectl delete pod liveness-pod
# Command to delete a specific pod:
kubectl delete pod <pod-name>
Flow:
- Container Running: The process begins when a container runs in the Kubernetes pod.
- Liveness Probe Check: Kubernetes periodically checks the container using the configured liveness probe (HTTP request, TCP socket, or command execution; sketches of the latter two follow this flow).
- Success: If the liveness probe check is successful (i.e., the container is healthy), the process returns to check the container’s health again after the defined interval.
- Failure: If the liveness probe fails (i.e., the container is unresponsive or unhealthy), Kubernetes moves to the next step.
- Failure Threshold Reached?: Kubernetes checks if the number of consecutive failures has reached the configured threshold.
- Wait: If the failure threshold has not been reached, the kubelet waits for the next probe interval and checks again.
- Yes: If the failure threshold has been reached, Kubernetes initiates the restart process for the container.
- Restart the Container: Kubernetes restarts the container to attempt to recover it.
- Container Restarted: Once the container is restarted, it starts fresh and returns to the running state.
- Kubernetes performs another liveness probe check on the restarted container.
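For reference, here is a rough sketch of how the other two probe mechanisms mentioned in the flow are declared. The values used here (port 80, /tmp/healthy) are illustrative and not taken from the example above.
# TCP socket probe: passes if the port accepts a TCP connection
livenessProbe:
  tcpSocket:
    port: 80
  initialDelaySeconds: 10
  periodSeconds: 5

# Command (exec) probe: passes if the command exits with status 0
livenessProbe:
  exec:
    command:
      - cat
      - /tmp/healthy
  initialDelaySeconds: 10
  periodSeconds: 5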
Repos:
Let’s Connect:
Feel free to get in touch, share your ideas or feedback, or ask any questions. I’m excited to engage with you and learn from each other as we navigate this exciting field!
LinkedIn: Sai Manasa
GitHub: Sai Manasa
Happy Learning 😊