Kubernetes best practices: terminating with grace
Kubernetes best practices: organizing with Namespaces
Kubernetes best practices: health checks with Readiness and Liveness probes
Kubernetes best practices: setting Resource Requests and Limits
An important part of operating distributed systems is handling failures. Kubernetes helps with this by using controllers that watch the state of your system and restart services that have stopped working. However, Kubernetes can also forcibly shut down your applications to keep the overall system healthy. In this series, we will look at how you can help Kubernetes do its job more efficiently and reduce application downtime.
Before containers, most applications ran on virtual or physical machines. If an application crashed or hung, it took a long time to clean up the running task and relaunch the program. In the worst case, someone had to fix the problem manually at night, at the most inconvenient time. If only one or two machines performed an important task, such a failure was completely unacceptable.
So instead of restarting applications by hand, people began using process-level monitoring to restart an application automatically when it terminated abnormally. If the program crashes, the monitoring process captures the exit code and restarts the application. With the advent of systems like Kubernetes, this kind of failure handling became part of the infrastructure itself.
Kubernetes uses an “observe, diff, act” event loop to keep resources healthy, from the containers all the way to the nodes themselves.
This means you no longer need to run process monitoring manually. If a resource fails a health check, Kubernetes automatically spins up a replacement. And Kubernetes does more than just watch for application crashes: it can create more copies of the application to run on multiple machines, update the application, or even run multiple versions of your application at the same time.
So there are many reasons why Kubernetes might terminate a perfectly healthy container. If you update your deployment, Kubernetes slowly terminates old pods while spinning up new ones. If you drain a node, Kubernetes terminates all pods on that node. And if a node runs out of resources, Kubernetes evicts pods to free those resources.
It is therefore very important that your application can handle termination with minimal impact on the end user and minimal recovery time. In practice, this means that before shutting down it must save any data that needs saving, close its network connections, finish the work in progress, and complete any other urgent tasks in time.
Concretely, your application should handle the SIGTERM signal, the process-termination signal that the kill utility sends by default on Unix-like systems. On receiving this signal, the application should begin shutting down.
Once Kubernetes has decided to terminate a pod, a whole series of events takes place. Let’s look at each step Kubernetes takes when a container or pod is terminated.
Suppose we want to terminate one of the pods. At that point it stops receiving new traffic: the containers running in the pod are not affected, but all new traffic is blocked.
Next comes the preStop hook: a special command or HTTP request sent to the containers in the pod. If your application does not shut down correctly when it receives SIGTERM, you can use preStop to shut it down gracefully.
Most programs shut down gracefully when they receive SIGTERM, but if you use third-party code or manage a system you don’t fully control, the preStop hook is a great way to trigger a graceful shutdown without modifying the application.
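For illustration, a preStop hook in a pod spec might look like this (the container name, image, and command are placeholders, not from this article; an httpGet hook could be used instead of exec):

```yaml
containers:
  - name: my-app        # placeholder container name
    image: nginx:1.25   # placeholder image
    lifecycle:
      preStop:
        exec:
          # Ask nginx to finish in-flight requests before exiting.
          command: ["/bin/sh", "-c", "nginx -s quit"]
```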
After this hook runs, Kubernetes sends the SIGTERM signal to the containers in the pod, letting them know they will be shut down soon. On receiving the signal, your code should begin its shutdown process. This may include closing long-lived connections, such as database connections or WebSocket streams, saving the current state, and so on.
Even if you use the preStop hook, it is still important to test exactly what happens to your application when you send it SIGTERM, so that its behavior during pod shutdown does not come as a surprise.
At this point, before taking further action, Kubernetes waits for a specified time, called terminationGracePeriodSeconds: the window the pod has to shut down cleanly after receiving SIGTERM.
By default, this period is 30 seconds. Importantly, it runs in parallel with the preStop hook and the SIGTERM signal: Kubernetes does not wait for the preStop hook or the SIGTERM handler to finish. If your application shuts down before terminationGracePeriodSeconds expires, Kubernetes immediately moves on to the next step. So make sure the value of this period is no shorter than the time your pod needs to shut down cleanly, and if it needs more than 30 seconds, increase the value in your YAML manifest.
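A sketch of how the grace period is raised to 60 seconds in a pod spec (the pod name and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app   # placeholder name
spec:
  # Give containers up to 60 seconds after SIGTERM before SIGKILL.
  terminationGracePeriodSeconds: 60
  containers:
    - name: my-app
      image: my-app:1.0   # placeholder image
```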
And finally, the last step: if the containers are still running after terminationGracePeriodSeconds expires, Kubernetes sends them a SIGKILL signal and removes them forcibly. At this point, Kubernetes also cleans up all the other pod objects.
Kubernetes terminates pods for many reasons, so make sure your application shuts down cleanly in every case to keep your service running stably.
To be continued very soon …