Kubernetes best practices. Organizing Kubernetes with namespaces
Kubernetes best practices. Health checks with readiness and liveness probes
Kubernetes best practices. Setting resource requests and limits
An important part of running distributed systems is handling failures. Kubernetes helps with this by using controllers that monitor the state of your system and restart services that have stopped working. However, Kubernetes can also forcibly shut down your applications to keep the overall system healthy. In this part of the series, we will look at how you can help Kubernetes do its job more efficiently and reduce application downtime.
Before containers, most applications ran on virtual or physical machines. If the application crashed or hung, it took a long time to kill the stuck process and restart the program. In the worst case, someone had to fix the problem by hand in the middle of the night, at the most inconvenient time. If only one or two machines were performing an important task, such a failure was completely unacceptable.
So, instead of restarting applications by hand, people began to use process-level monitoring to restart the application automatically when it terminated abnormally. If the program crashes, the monitoring process captures the exit code and restarts the application. With the arrival of systems such as Kubernetes, this kind of failure response has simply been built into the infrastructure.
Kubernetes uses an “observe, diff, act” event loop to make sure that resources, from containers all the way up to the nodes themselves, stay healthy.
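The loop can be illustrated with a toy model. The sketch below is not Kubernetes controller code: the names `Action`, `diff` and `reconcile_once` are hypothetical, and the "state" is reduced to a single replica count, purely to show the observe, diff, act cycle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str   # "start" or "stop"
    count: int  # how many replicas to start or stop


def diff(desired_replicas: int, running_replicas: int) -> list[Action]:
    """Compare desired and observed state; return the actions that close the gap."""
    gap = desired_replicas - running_replicas
    if gap > 0:
        return [Action("start", gap)]
    if gap < 0:
        return [Action("stop", -gap)]
    return []


def reconcile_once(desired_replicas: int, observe, act) -> list[Action]:
    """One pass of the loop: observe the current state, diff it, act on the result."""
    running = observe()                        # observe
    actions = diff(desired_replicas, running)  # diff
    for action in actions:                     # act
        act(action)
    return actions
```

A real controller repeats this pass indefinitely, so a replica that disappears between two passes is simply noticed and replaced on the next one.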
This means that you no longer need to run process monitoring yourself. If a resource fails its health check, Kubernetes automatically provides a replacement. Kubernetes does more than just watch for application crashes: it can create more copies of the application to run on multiple machines, update the application, or run several versions of your application at the same time.
As a result, there are many reasons why Kubernetes may terminate a perfectly healthy container. For example, if you update your deployment, Kubernetes gradually stops old pods while starting new ones. If you drain a node, Kubernetes terminates all pods on that node. Finally, if a node runs out of resources, Kubernetes terminates pods to free those resources.
It is therefore very important that your application shuts down with minimal impact on the end user and minimal recovery time. This means that before exiting it must save all the data that needs to be saved, close all network connections, finish any remaining work, and take care of any other urgent tasks.
In practice, this means that your application should handle SIGTERM, the process termination signal that the kill utility sends by default on Unix-like operating systems. After receiving this signal, the application should shut down.
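As a minimal sketch of SIGTERM handling, assuming a Python service running on a Unix-like system: the handler only records that shutdown was requested, and the main loop does the actual cleanup. The names `serve`, `do_work` and `cleanup` are hypothetical placeholders for your own code.

```python
import signal
import threading

# Set by the signal handler, polled by the main loop.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Only record the request; keep the handler itself as small as possible."""
    shutdown_requested.set()


# Must be called from the main thread of the process.
signal.signal(signal.SIGTERM, handle_sigterm)


def serve(do_work, cleanup, poll_seconds=0.1):
    """Call do_work repeatedly until SIGTERM arrives, then run cleanup and return."""
    while not shutdown_requested.is_set():
        do_work()
        # Sleep between iterations, but wake up early once shutdown is requested.
        shutdown_requested.wait(poll_seconds)
    cleanup()
```

Keeping the handler down to setting a flag avoids doing real work inside the signal context; everything that can fail or block happens in `cleanup`, on the normal control path.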
Once Kubernetes decides to terminate a pod, a whole series of events takes place. Let’s look at each step that Kubernetes takes when it shuts down a container or pod.
Suppose we want to terminate one of the pods. At this point, the pod stops receiving new traffic: the containers running in the pod are not affected, but new traffic is no longer routed to it.
Next comes the preStop hook: a special command or HTTP request that is sent to the containers in the pod. If your application does not shut down correctly when it receives SIGTERM, you can use preStop to trigger a correct shutdown.
Most programs exit cleanly when they receive SIGTERM, but if you run third-party code or a system you cannot fully control, the preStop hook is a great way to trigger a graceful shutdown without changing the application.
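To make this concrete, here is a hedged sketch that builds a container definition with a preStop hook as a Python dictionary and prints it as JSON, which kubectl accepts as well as YAML. The helper `container_with_pre_stop`, the container name, the image and the nginx command are illustrative assumptions; only the `lifecycle.preStop.exec.command` field path comes from the Kubernetes pod specification.

```python
import json


def container_with_pre_stop(name, image, command):
    """Return a container definition whose preStop hook runs `command` in the container."""
    return {
        "name": name,
        "image": image,
        "lifecycle": {
            "preStop": {
                # Executed inside the container before SIGTERM is sent.
                "exec": {"command": list(command)},
            }
        },
    }


# Illustrative values: ask nginx to finish serving current requests and quit.
container = container_with_pre_stop(
    "web", "nginx", ["/usr/sbin/nginx", "-s", "quit"]
)
print(json.dumps(container, indent=2))
```

The resulting fragment belongs in the `containers` list of a pod spec; the same structure can be written directly in YAML.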
After the hook has run, Kubernetes sends SIGTERM to the containers in the pod, which lets them know that they are about to be shut down. On receiving this signal, your code should start its shutdown procedure. This may include closing long-lived connections, such as a database connection or a WebSocket stream, saving the current state, and so on.
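The shutdown procedure itself might look like the sketch below. It assumes connections are objects with a `close()` method and that the state is JSON-serializable; `shut_down` and `FakeConnection` are hypothetical names used only for this illustration.

```python
import json
import tempfile
from pathlib import Path


def shut_down(connections, state, state_path):
    """Close every connection, then persist state; return names that failed to close."""
    failed = []
    for name, conn in connections.items():
        try:
            conn.close()
        except Exception:
            # One broken connection must not prevent the rest of the shutdown.
            failed.append(name)
    # Write to a temporary file first, then rename, so a kill in the middle
    # of the write never leaves a half-written state file behind.
    tmp_path = Path(str(state_path) + ".tmp")
    tmp_path.write_text(json.dumps(state))
    tmp_path.replace(state_path)
    return failed


class FakeConnection:
    """Stand-in for a database or WebSocket connection (hypothetical)."""

    def __init__(self, fail=False):
        self.fail = fail
        self.closed = False

    def close(self):
        if self.fail:
            raise OSError("close failed")
        self.closed = True


with tempfile.TemporaryDirectory() as workdir:
    db, ws = FakeConnection(), FakeConnection(fail=True)
    target = Path(workdir) / "state.json"
    failed = shut_down({"db": db, "ws": ws}, {"offset": 42}, target)
    saved = json.loads(target.read_text())
```

In the demo, the `ws` connection fails to close, yet the state is still saved; that ordering matters because the time available for shutdown is limited.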
Even if you use the preStop hook, it is very important to test what actually happens to your application when it receives SIGTERM and how it behaves, so that events or changes in the system caused by a pod shutdown do not come as a surprise.
At this point, before taking any further action, Kubernetes waits for a specified time called terminationGracePeriodSeconds: the period the pod is given to shut down cleanly.
By default, this period is 30 seconds. It is important to note that the countdown runs in parallel with the preStop hook and the SIGTERM signal; Kubernetes does not wait for the preStop hook to finish before starting it. If your application exits before terminationGracePeriodSeconds expires, Kubernetes proceeds immediately to the next step. Therefore, check that this period is not shorter than the time the pod needs to shut down correctly, and if shutdown takes more than 30 seconds, increase the value in your YAML, for example to 60 seconds.
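Because the preStop hook and the SIGTERM handling share the same countdown, both have to fit into it. The arithmetic can be sketched as follows; the function names, the example durations and the 5-second safety margin are assumptions for illustration, while `terminationGracePeriodSeconds` is the actual pod spec field.

```python
import json

DEFAULT_GRACE_PERIOD_SECONDS = 30  # Kubernetes default, as stated above


def required_grace_period(pre_stop_seconds, shutdown_seconds, margin_seconds=5):
    """Return a grace period that covers the preStop hook plus SIGTERM handling."""
    needed = pre_stop_seconds + shutdown_seconds + margin_seconds
    # Never go below the default; only raise the value when more time is needed.
    return max(DEFAULT_GRACE_PERIOD_SECONDS, needed)


def pod_spec_fragment(grace_period_seconds):
    """Return the part of a pod manifest that sets the grace period."""
    return {"spec": {"terminationGracePeriodSeconds": grace_period_seconds}}


# Assumed measurements: a 10 s preStop hook and 45 s of shutdown work.
grace = required_grace_period(pre_stop_seconds=10, shutdown_seconds=45)
print(json.dumps(pod_spec_fragment(grace)))
# prints {"spec": {"terminationGracePeriodSeconds": 60}}
```

The durations should come from measuring your own application under realistic load rather than from guesses.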
And finally, the last step: if the containers are still running after the grace period expires, they are sent a SIGKILL signal and forcibly removed. At this point, Kubernetes also cleans up all other objects belonging to the pod.
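The same "SIGTERM, wait, SIGKILL" sequence can be reproduced locally with an ordinary child process, which is a convenient way to see how an application behaves when it ignores SIGTERM. This is only an analogy on a Unix-like system, not what Kubernetes itself runs; `stop_process` and `spawn_stubborn_child` are hypothetical helpers.

```python
import subprocess
import sys

# A child program that ignores SIGTERM, to demonstrate the forced kill.
STUBBORN_CHILD = (
    "import signal, time\n"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(30)\n"
)


def spawn_stubborn_child():
    """Start the SIGTERM-ignoring child and wait until its handler is installed."""
    proc = subprocess.Popen(
        [sys.executable, "-c", STUBBORN_CHILD], stdout=subprocess.PIPE
    )
    proc.stdout.readline()  # blocks until the child prints 'ready'
    return proc


def stop_process(proc, grace_seconds):
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL."""
    proc.terminate()  # SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX; cannot be caught or ignored
        proc.wait()
        return "killed"
```

Calling `stop_process(spawn_stubborn_child(), grace_seconds=0.5)` takes the "killed" path, because the child never reacts to SIGTERM; a child that exits on SIGTERM within the grace period takes the "terminated" path instead.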
Kubernetes shuts down pods for many reasons, so make sure that your application always terminates gracefully in order to keep your service stable.
To be continued very soon …