Kubernetes best practices. Setting Resource Requests and Limits

Kubernetes best practices. Building small containers
Kubernetes best practices. Organizing Kubernetes with namespaces
Kubernetes best practices. Checking Kubernetes health with Readiness and Liveness probes

For each resource in Kubernetes, you can configure two types of requirements: Requests and Limits. The first describes the minimum free node resources needed to run a container or pod; the second strictly caps the resources available to the container.

When Kubernetes schedules a pod, it is very important that its containers have enough resources to run normally. If you deploy a large application on a node with limited resources, it may well fail because the node runs out of memory or CPU. In this article, we look at how resource requests and limits help you deal with a shortage of computing capacity.

Requests and Limits are the mechanisms Kubernetes uses to manage resources such as CPU and memory. A request is what the container is guaranteed to receive: if a container requests a resource, Kubernetes will only schedule it on a node that can provide it. A limit ensures that the resources used by the container never exceed a certain value.

A container can increase its resource use only up to its limit, after which it is restricted. Let’s see how this works. There are two types of resources: CPU and memory. The Kubernetes scheduler uses the figures for these resources to decide where to run your pods. A typical pod resource specification looks like this.

Each container in a pod can set its own requests and limits, and they are all additive. CPU resources are defined in millicores. If your container needs two full cores, you set the value to 2000m. If the container needs only a quarter of a core, the value is 250m. Keep in mind that if you set a CPU request larger than the core count of your largest node, your pod will never be scheduled. The same happens if you have a pod that needs four cores and the Kubernetes cluster consists only of dual-core virtual machines.
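As a rough sketch of how these numbers add up, here is a pod written as a Python dict that mirrors the shape of a manifest. The pod name, images and values are hypothetical and chosen only for illustration, and parse_cpu, total_cpu and fits_largest_node are helper names invented for this example, not part of Kubernetes.

```python
def parse_cpu(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity ("250m", "2", "0.5") to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


# Hypothetical pod with two containers; names, images and values are illustrative.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "example-pod"},
    "spec": {
        "containers": [
            {
                "name": "app",
                "image": "example/app:1.0",
                "resources": {
                    "requests": {"cpu": "250m", "memory": "64Mi"},
                    "limits": {"cpu": "500m", "memory": "128Mi"},
                },
            },
            {
                "name": "sidecar",
                "image": "example/sidecar:1.0",
                "resources": {
                    "requests": {"cpu": "100m", "memory": "32Mi"},
                    "limits": {"cpu": "200m", "memory": "64Mi"},
                },
            },
        ]
    },
}


def total_cpu(pod: dict, kind: str) -> int:
    """Sum the CPU requests or limits (kind) of all containers, in millicores."""
    return sum(
        parse_cpu(c["resources"][kind]["cpu"]) for c in pod["spec"]["containers"]
    )


def fits_largest_node(pod: dict, largest_node_cores: int) -> bool:
    """A pod whose CPU requests exceed the largest node can never be scheduled."""
    return total_cpu(pod, "requests") <= largest_node_cores * 1000


print(total_cpu(pod, "requests"))  # 350
print(total_cpu(pod, "limits"))    # 700
print(fits_largest_node(pod, 2))   # True
```

The scheduler sees this pod as needing 350m of CPU in total, so it fits on a dual-core node, while a pod whose containers together request 4000m would not.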

Unless your application is specifically designed to take advantage of multiple cores (complex scientific computing and database operations come to mind), it is best practice to set the CPU request to 1 or lower and then run more replicas to scale out. This gives the system greater flexibility and reliability.

CPU limits are more interesting, because CPU is considered a compressible resource. If your application starts to hit its CPU limit, Kubernetes begins to slow the container down using CPU throttling, that is, by restricting the CPU time it may use. The CPU is artificially restricted, which may degrade the application’s performance, but the process is not terminated or evicted.
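To get a feel for what throttling means in numbers, here is a small sketch. It assumes Linux nodes where the CPU limit is enforced as a time quota per scheduling period, with the usual default period of 100 ms; the function name cpu_quota_us is made up for this example.

```python
CFS_PERIOD_US = 100_000  # assumed default scheduling period: 100 ms


def cpu_quota_us(limit_millicores: int, period_us: int = CFS_PERIOD_US) -> int:
    """CPU time (in microseconds) a container may use in each period."""
    return limit_millicores * period_us // 1000


print(cpu_quota_us(250))   # 25000  -> 25 ms of CPU time per 100 ms
print(cpu_quota_us(2000))  # 200000 -> two full cores' worth of time
```

Once a container has used up its quota for the current period, it simply waits for the next one, which is why it gets slower rather than being killed.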

Memory resources are defined in bytes. The value is usually given in mebibytes (Mi), but you can specify anything from bytes to petabytes. The situation is the same as with CPU: if you request more memory than your nodes have, the pod will not be scheduled. Unlike CPU, however, memory is not compressible, because there is no way to throttle its use. The container is therefore terminated as soon as it exceeds the memory allocated to it.
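The units are easy to mix up, so here is a minimal sketch that converts memory quantities to bytes. It covers only the common binary (Ki, Mi, Gi, Ti, Pi) and decimal (k, M, G, T, P) suffixes; parse_memory and exceeds_memory_limit are illustrative helper names, not a Kubernetes API.

```python
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40, "Pi": 2**50,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12, "P": 10**15,
}


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity such as "100Mi" or "1G" to bytes."""
    # Check two-letter binary suffixes before one-letter decimal ones.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * _SUFFIXES[suffix])
    return int(quantity)  # plain number of bytes


def exceeds_memory_limit(usage: str, limit: str) -> bool:
    """True when the container has gone past its memory limit and will be stopped."""
    return parse_memory(usage) > parse_memory(limit)


print(parse_memory("100Mi"))                   # 104857600
print(parse_memory("100M"))                    # 100000000
print(exceeds_memory_limit("130Mi", "128Mi"))  # True
```

Note that 100Mi and 100M differ by almost 5%, which matters when a limit is tight.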

It is important to remember that you cannot set requests larger than the resources your nodes can provide. The resource specifications of GKE virtual machines can be found in the links under this video.

In an ideal world, the default container settings would be enough for everything to run smoothly. The real world is different: people can easily forget to configure resource usage, or rogue users may set requests and limits that exceed what the infrastructure can actually provide. To prevent these scenarios, you can configure ResourceQuota and LimitRange objects.

After you create namespaces, you can lock them down with quotas. For example, if you have prod and dev namespaces, a common pattern is to put no quota on production at all and very strict quotas on development. This lets prod take all available resources during a sharp spike in traffic, completely starving dev.

A resource quota may look like this. In this example there are 4 sections, which are the bottom 4 lines of the code.

Let’s look at each of them. requests.cpu is the maximum combined CPU requests of all containers in the namespace. In this example, you can have 50 containers with requests of 10m each, five containers with requests of 100m each, or just one container with a request of 500m. As long as the total CPU requested in the namespace does not exceed 500m, everything is fine.

requests.memory is the maximum combined memory requests of all containers in the namespace. As in the previous case, you can have 50 containers with 2 MiB each, five containers with 20 MiB each, or a single container with 100 MiB, as long as the total memory requested in the namespace does not exceed 100 MiB.

limits.cpu is the maximum combined CPU limits of all containers in the namespace. It works just like requests.cpu, but for limits.

Finally, limits.memory is the maximum combined memory limits of all containers in the namespace. It works just like requests.memory, but for limits.
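The four quota sections can be sketched as a simple sum check. The requests values (500m and 100Mi) come from the example discussed above; the limits values, the quota name and the helper functions are assumptions made for this sketch, and the memory parser only understands the Mi suffix.

```python
# ResourceQuota written as a dict; limits.* values and the name are hypothetical.
quota = {
    "apiVersion": "v1",
    "kind": "ResourceQuota",
    "metadata": {"name": "demo"},
    "spec": {
        "hard": {
            "requests.cpu": "500m",
            "requests.memory": "100Mi",
            "limits.cpu": "700m",
            "limits.memory": "500Mi",
        }
    },
}


def millicores(q: str) -> int:
    return int(q[:-1]) if q.endswith("m") else round(float(q) * 1000)


def mebibytes(q: str) -> int:
    # This sketch only handles the "Mi" suffix.
    if not q.endswith("Mi"):
        raise ValueError(f"unsupported memory quantity: {q}")
    return int(q[:-2])


PARSERS = {"cpu": millicores, "memory": mebibytes}


def quota_violations(hard: dict, containers: list) -> list:
    """Return the quota keys whose namespace-wide total exceeds the hard cap."""
    violations = []
    for key, cap in hard.items():
        kind, resource = key.split(".")  # e.g. "requests", "cpu"
        parse = PARSERS[resource]
        total = sum(parse(c[kind][resource]) for c in containers)
        if total > parse(cap):
            violations.append(key)
    return violations


hard = quota["spec"]["hard"]
small = {"requests": {"cpu": "10m", "memory": "2Mi"},
         "limits": {"cpu": "14m", "memory": "10Mi"}}
greedy = {"requests": {"cpu": "600m", "memory": "50Mi"},
          "limits": {"cpu": "700m", "memory": "100Mi"}}

print(quota_violations(hard, [small] * 50))  # []
print(quota_violations(hard, [greedy]))      # ['requests.cpu']
```

Fifty small containers land exactly on every cap and are accepted, while a single container asking for 600m breaks the requests.cpu quota on its own.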

So, by default, containers in a Kubernetes cluster run with unbounded compute resources. With resource quotas, cluster administrators can restrict resource consumption and object creation per namespace. Within a namespace, a pod or container can consume as much CPU and memory as the namespace’s resource quota allows. There is a concern, however, that a single pod or container could monopolize all available resources. To prevent this, a LimitRange is used: a policy that constrains resource allocations (for pods or containers) in the namespace.

A LimitRange provides constraints that can:

– enforce minimum and maximum compute resource usage per pod or container in the namespace;
– enforce minimum and maximum storage requests per PersistentVolumeClaim in the namespace;
– enforce a ratio between the request and the limit for a resource in the namespace;
– set default requests and limits for compute resources in the namespace and automatically inject them into containers at runtime.

This is how you can create a limit range in your namespace. Unlike a quota, which applies to the whole namespace, a LimitRange applies to individual containers. This can prevent users from creating very tiny or, conversely, giant containers inside the namespace. A LimitRange may look like this.

As in the previous case, 4 sections can be distinguished here. Let’s take a look at each.

The default section sets the default limits for a container in a pod. If you specify these values in the limit range, any container that does not explicitly set them will use the defaults.

The defaultRequest section sets the default requests for a container in a pod. Again, if you set these values in the limit range, any container that does not explicitly set them will use these values by default.

The max section sets the maximum limits that can be set for a container in a pod. Neither the values in the default section nor the limits of a container can be set above this value. It is important to note that if max is set and the default section is absent, the maximum value becomes the default limit.

The min section sets the minimum requests that can be set for a container in a pod. Neither the values in the defaultRequest section nor the requests of a container can be set below this value.

Again, it is important to note that if min is set and defaultRequest is not, the minimum value becomes the default request.
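Here is a simplified model of how the four sections interact, assuming all of them are present. All values are hypothetical and given as plain numbers (millicores for CPU, MiB for memory), whereas a real manifest would use quantities such as 600m and 100Mi. The function apply_limit_range is invented for this sketch and ignores corner cases, for example a container that sets only a limit.

```python
# One LimitRange item of type Container; every value here is illustrative.
limit_item = {
    "type": "Container",
    "default": {"cpu": 600, "memory": 100},         # default limits
    "defaultRequest": {"cpu": 100, "memory": 50},   # default requests
    "max": {"cpu": 1000, "memory": 200},            # highest allowed limit
    "min": {"cpu": 10, "memory": 10},               # lowest allowed request
}


def apply_limit_range(resources: dict, item: dict) -> dict:
    """Fill in missing requests/limits from the defaults, then check min and max."""
    limits = {**item["default"], **resources.get("limits", {})}
    requests = {**item["defaultRequest"], **resources.get("requests", {})}
    for name in limits:
        if limits[name] > item["max"][name]:
            raise ValueError(
                f"{name} limit {limits[name]} is above max {item['max'][name]}"
            )
        if requests[name] < item["min"][name]:
            raise ValueError(
                f"{name} request {requests[name]} is below min {item['min'][name]}"
            )
    return {"requests": requests, "limits": limits}


# A container that sets nothing receives both defaults.
print(apply_limit_range({}, limit_item))
# {'requests': {'cpu': 100, 'memory': 50}, 'limits': {'cpu': 600, 'memory': 100}}

# A container that sets its own values keeps them if they are within range.
print(apply_limit_range({"requests": {"cpu": 200}, "limits": {"cpu": 800}}, limit_item))
# {'requests': {'cpu': 200, 'memory': 50}, 'limits': {'cpu': 800, 'memory': 100}}
```

A container asking for a CPU limit of 2000 would be rejected by this check, since it is above the max of 1000.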

In the end, these resource requests are used by the Kubernetes scheduler to place your workloads. To configure your containers properly, it is very important to understand how this works. Suppose you want to run several pods in your cluster. Assuming the pod specifications are valid, the Kubernetes scheduler will use round-robin balancing to choose a node for the workload.

Kubernetes checks whether Node 1 has enough resources to satisfy the requests of the pod’s containers, and if not, it moves on to the next node. If no node in the system can satisfy the requests, the pods go into the Pending state. With Google Kubernetes Engine features such as node autoscaling, GKE can automatically detect the Pending state and create additional nodes.
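The node-by-node check described above can be sketched in a few lines. This is a deliberately simplified model: the real scheduler filters and scores nodes in more elaborate ways, and the node names, the free-capacity figures (millicores and MiB) and the schedule function are all made up for illustration.

```python
def schedule(pod_request: dict, nodes: list) -> str:
    """Return the name of the first node with enough free resources, or "Pending"."""
    for node in nodes:
        free = node["free"]
        if all(free[resource] >= amount for resource, amount in pod_request.items()):
            return node["name"]
    return "Pending"


# Hypothetical cluster: free CPU in millicores, free memory in MiB.
nodes = [
    {"name": "node-1", "free": {"cpu": 200, "memory": 512}},
    {"name": "node-2", "free": {"cpu": 1500, "memory": 2048}},
]

print(schedule({"cpu": 500, "memory": 256}, nodes))   # node-2
print(schedule({"cpu": 4000, "memory": 256}, nodes))  # Pending
```

Only requests take part in this check; limits play no role in deciding where a pod goes.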

If the nodes later have excess capacity, the autoscaler reduces their number to save you money. This is why Kubernetes schedules pods based on requests. However, limits can be higher than requests, and in some cases a node may actually run out of resources. We call this an overcommitted state.

As I said, when it comes to CPU, Kubernetes starts throttling pods. Each pod still gets as much as it requested, but it may be throttled before it reaches its limit.

As for memory, Kubernetes has to decide which pods to remove and which to keep until system resources are freed up, otherwise the whole system will crash.

Let’s imagine a scenario in which a machine has run out of memory. How does Kubernetes handle it?

Kubernetes looks for pods that use more resources than they requested. So if your containers have no requests at all, by default they are using more than they requested, simply because they did not ask for anything! Such containers become the prime candidates for termination. The next candidates are containers that have exceeded their requests but are still below their limits.

If Kubernetes finds several pods that have exceeded their requests, it sorts them by priority and removes the lowest-priority pods first. If all pods have the same priority, Kubernetes stops the pods that exceed their requests by the most.
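The ordering described above can be written down as a sort. This is only a sketch of the rule as stated here, not the actual kubelet logic; the pod names, priorities and memory figures (in MiB) are hypothetical, and eviction_order is a name invented for the example.

```python
def eviction_order(pods: list) -> list:
    """Pods over their memory request: lowest priority first, then largest overuse."""
    over = [p for p in pods if p["usage"] > p["request"]]
    ranked = sorted(
        over, key=lambda p: (p["priority"], -(p["usage"] - p["request"]))
    )
    return [p["name"] for p in ranked]


pods = [
    {"name": "no-requests", "priority": 0, "request": 0, "usage": 300},
    {"name": "burst-small", "priority": 0, "request": 200, "usage": 250},
    {"name": "important", "priority": 100, "request": 100, "usage": 400},
    {"name": "within-request", "priority": 0, "request": 500, "usage": 300},
]

print(eviction_order(pods))  # ['no-requests', 'burst-small', 'important']
```

The pod without any request is first in line because everything it uses counts as overuse, and the pod that stays within its request does not appear in the list at all.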

In very rare cases, Kubernetes may terminate pods that are still within their requests. This can happen when critical system components such as the kubelet or Docker start consuming more resources than were reserved for them.

So, while a company is small and just starting out, a Kubernetes cluster can work fine without resource requests and limits, but as your teams and projects grow, you risk running into problems in this area. Adding requests and limits to your pods and namespaces takes very little extra effort and can save you a lot of hassle.

To be continued very soon …
