Kubernetes Best Practices: Setting Up Health Checks with Readiness and Liveness Probes
Distributed systems can be hard to manage because they have many moving parts, all of which must work correctly for the system to function. If one part fails, the system must detect it, route around it, and repair it, and all of this should happen automatically. In this installment of the Kubernetes Best Practices series, we will learn how to set up Readiness and Liveness probes to check the health of the applications running in your Kubernetes cluster.
A health check is a simple way to let the system know whether an instance of your application is working. If an instance is not working, other services should not reach it or send it requests; instead, requests should go to another instance of the application that is already running or is about to start. In addition, the system should bring your application back to a healthy state.
By default, Kubernetes starts sending traffic to a pod when all the containers inside the pod are running, and restarts containers when they crash. This default behavior may be good enough to start with, but you can make your deployments considerably more reliable with custom health checks.
Fortunately, Kubernetes makes this quite simple, so there is no excuse for skipping these checks. Kubernetes offers two types of health probes, and it is important to understand the difference between them and when to use each.
The Readiness probe is designed to tell Kubernetes that your application is ready to serve traffic. Kubernetes makes sure the readiness probe passes before allowing a Service to send traffic to the pod. If the readiness probe starts failing, Kubernetes stops sending traffic to the pod until it passes again.
The Liveness probe tells Kubernetes whether your application is alive or dead. If it is alive, Kubernetes leaves it alone; if it is dead, Kubernetes removes the pod and starts a new one to replace it.
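To make the distinction concrete, here is a minimal sketch of a pod spec that defines both probes. The pod name, image, port, and the /ready and /healthz paths are all placeholders; substitute whatever endpoints your application actually exposes.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app            # hypothetical pod name
spec:
  containers:
  - name: my-app
    image: my-app:1.0     # placeholder image
    ports:
    - containerPort: 8080
    readinessProbe:       # gates traffic from Services
      httpGet:
        path: /ready      # assumed "ready to serve" endpoint
        port: 8080
    livenessProbe:        # restarts the container on failure
      httpGet:
        path: /healthz    # assumed "still alive" endpoint
        port: 8080
```

A failing readiness probe only removes the pod from the Service's endpoints; a failing liveness probe causes the container to be restarted.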
Let’s imagine a scenario in which your application needs one minute to warm up and start. Your service will not work until the application is fully up and running, even though the process has already started. You will also have problems scaling this deployment to multiple copies, because those copies should not receive traffic until they are completely ready. By default, however, Kubernetes starts sending traffic as soon as the processes inside the container start.
With a readiness probe, Kubernetes waits until the application is fully started, and only then allows the Service to send traffic to the new copy.
Let’s imagine another scenario in which an application hangs for a long time and stops serving requests. Because the process is still running, Kubernetes by default assumes everything is fine and keeps sending requests to the broken pod. With a liveness probe, Kubernetes detects that the application is no longer serving requests and, by default, restarts the broken pod.
Now let’s look at how readiness and liveness probes are performed. There are three probe types – HTTP, Command, and TCP – and you can use any of them. The most common probe type is the HTTP probe.
Even if your application is not an HTTP server, you can still run a lightweight HTTP server inside your application to respond to the liveness probe. Kubernetes will then ping a path on the pod, and if the HTTP response has a status code in the 200–399 range, the pod is marked “healthy”. Otherwise, the pod is marked “unhealthy”.
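An HTTP probe is defined with the httpGet field. In this sketch, the /healthz path and port 8080 are assumed values; use whatever health endpoint your server actually serves.

```yaml
# HTTP liveness probe: Kubernetes sends a GET request to the
# container; a 200-399 status code counts as healthy.
livenessProbe:
  httpGet:
    path: /healthz   # assumed health endpoint
    port: 8080       # assumed container port
  initialDelaySeconds: 5
  periodSeconds: 10
```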
For Command probes, Kubernetes runs a command inside your container. If the command exits with a zero status code, the container is marked healthy; otherwise (an exit status from 1 to 255), it is marked “unhealthy”. This probe type is useful when you cannot or do not want to run an HTTP server, but you can run a command that checks the health of your application.
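A command probe is defined with the exec field. A minimal sketch, assuming the application touches a /tmp/healthy marker file while it is working (a common convention in the Kubernetes docs, and purely illustrative here):

```yaml
# Command liveness probe: runs inside the container; exit code 0
# means healthy, any non-zero exit code means unhealthy.
livenessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy   # assumed marker file the app maintains
  initialDelaySeconds: 5
  periodSeconds: 5
```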
The final probe type is the TCP probe. Kubernetes tries to establish a TCP connection on the specified port: if it succeeds, the container is considered healthy; if not, unhealthy. This type comes in handy in scenarios where an HTTP request or a command does not work well, for example with gRPC or FTP services, where a TCP connection is the natural check.
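A TCP probe is defined with the tcpSocket field. Port 3306 below is just an illustrative value (a typical database port); specify whichever port your service listens on.

```yaml
# TCP readiness probe: Kubernetes attempts a TCP connection to the
# port; a successful connection marks the container ready.
readinessProbe:
  tcpSocket:
    port: 3306   # assumed listening port
  initialDelaySeconds: 10
  periodSeconds: 10
```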
Probes can be configured with various parameters: how often they run, what the success and failure thresholds are, and how long to wait for a response. See the Readiness and Liveness probe documentation for more information. However, there is one particularly important setting for the liveness probe: the initial probe delay, initialDelaySeconds. As I mentioned, a failed liveness probe restarts the pod. Therefore, you need to make sure probing does not start until the application is ready, otherwise the pod will enter a restart loop. I recommend using the P99 startup time, or the average application startup time plus a buffer. Remember to adjust this value as your application’s startup time gets faster or slower.
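The timing parameters described above can be sketched as follows. The numeric values are assumptions to adapt to your application’s measured startup and response times:

```yaml
livenessProbe:
  httpGet:
    path: /healthz          # assumed health endpoint
    port: 8080
  initialDelaySeconds: 60   # P99 startup time plus buffer (assumed)
  periodSeconds: 10         # how often to probe
  timeoutSeconds: 1         # how long to wait for a response
  failureThreshold: 3       # consecutive failures before restart
  successThreshold: 1       # must be 1 for liveness probes
```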
Most experts agree that health checks are mandatory for any distributed system, and Kubernetes is no exception. Using health checks for your services helps keep your Kubernetes workloads reliable and available, and keeps requests away from broken instances.
To be continued very soon …