Kubernetes best practices: create small containers

The first step in deploying to Kubernetes is packaging your application in a container. In this series, we will look at how you can create a small and secure container image.
Thanks to Docker, creating container images has never been easier: specify a base image, add your changes, and build the container.

Although this technique is great for getting started, using the default base images can leave you with large, insecure images full of vulnerabilities.

In addition, most Docker images use Debian or Ubuntu as the base image, and although this gives excellent compatibility and easy adoption (the Dockerfile takes only two lines of code), base images can add hundreds of megabytes of overhead to your container. For example, a trivial hello-world application built this way can weigh in at around 700 MB, while the application code itself is only a few megabytes.
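Such a two-line Dockerfile might look like the sketch below; node:onbuild is used purely to illustrate this style of convenience image (onbuild variants have since been deprecated), and the exposed port is an assumption.

```dockerfile
# Hypothetical two-line "convenience" Dockerfile: the onbuild base image
# automatically copies the application code and installs its dependencies.
FROM node:onbuild
EXPOSE 8080
```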

All of this extra weight is a waste of space and a fertile hiding place for vulnerabilities and security bugs. So let’s look at two ways to reduce the size of a container image.

The first is using small base images; the second is using the builder pattern. Using smaller base images is probably the easiest way to reduce the size of your container. Most likely, your language or stack provides an official image that is much smaller than the default one. Let’s take a look at our Node.js container.

By default, Docker’s node:8 base image is 670 MB, while node:8-alpine is only 65 MB, that is, ten times smaller. Using the smaller Alpine base image will significantly reduce the size of your container. Alpine is a small, lightweight Linux distribution that is very popular with Docker users because it is compatible with many applications while keeping containers small. Unlike the standard Docker node image, node:alpine removes many utilities and programs, leaving only what is needed to run your application.

To switch to the smaller base image, simply update the Dockerfile to start from the new base image:
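A minimal version of that updated Dockerfile might look like the sketch below; the server.js entry point and the npm-based dependency install are assumptions that match the description in the next paragraph.

```dockerfile
# Sketch of a node:alpine based Dockerfile (file names are assumptions)
FROM node:alpine

# Create a directory for the application code
WORKDIR /app

# Install dependencies first so this layer can be reused from cache
COPY package*.json ./
RUN npm install --production

# Copy the rest of the source code
COPY . .

# Start the application
CMD ["node", "server.js"]
```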

Now, unlike with the old onbuild image, you need to copy your code into the container and install its dependencies yourself. In the new Dockerfile, the container starts from the node:alpine image, creates a directory for the code, installs dependencies with the npm package manager, and finally starts server.js.

With this update, the container is ten times smaller. If your programming language or stack does not provide a smaller image, you can build on Alpine Linux directly, which also gives you full control over the contents of the container. Using small base images is a great way to quickly build small containers, but you can achieve an even greater reduction with the builder pattern.

In interpreted languages, the source code is passed to an interpreter and executed directly. In compiled languages, the source code is first turned into compiled code. Compilation, however, often relies on tools that are not actually needed to run the code, which means those tools can be removed from the final container entirely. That is what the builder pattern is for.

The code is built and compiled in a first container, and the compiled artifact is then packaged into a final container without the compilers and tools that were needed to build it. Let’s run a Go application through this process. First, we will move from the onbuild image to Alpine Linux.

In the new Dockerfile, the container starts from the golang:alpine image. It then creates a directory for the code, copies the source code in, builds it, and launches the application. This container is much smaller than the onbuild one, but it still contains the compiler and other Go tools that we don’t really need. So let’s extract just the compiled program and put it in a container of its own.
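A single-stage golang:alpine Dockerfile along those lines might look roughly like this sketch, assuming a simple Go-modules project; the binary name goapp is an assumption.

```dockerfile
# Sketch of a single-stage golang:alpine build (names are assumptions)
FROM golang:alpine

# Create a directory for the source code and copy it in
WORKDIR /app
COPY . .

# Compile the application inside the container
RUN go build -o /app/goapp .

# Launch the compiled binary
CMD ["/app/goapp"]
```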

You may notice something unusual in this Dockerfile: it contains two FROM lines. The first section looks almost exactly like the previous Dockerfile, except that it uses the AS keyword to give this build stage a name. The next section starts with a new FROM line, which begins a new image, and instead of golang:alpine we use plain alpine as the base image.

Plain Alpine Linux does not ship with any SSL certificates, which will cause most HTTPS API calls to fail, so let’s install some root CA certificates.

And now the most interesting part: to copy the compiled code from the first container into the second, you can simply use the COPY --from instruction in the second section. It copies only the single application file and does not drag in the Go toolchain. The new multi-stage Dockerfile produces a container image of only 12 MB, compared with the original 700 MB image, and that is a big difference!
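Putting the two stages together, a multi-stage Dockerfile of this kind might look like the following sketch; the stage name build-env and the binary name goapp are assumptions, and the ca-certificates step corresponds to the HTTPS fix mentioned above.

```dockerfile
# Sketch of the multi-stage build described above (names are assumptions)

# Stage 1: build the binary using the full Go toolchain
FROM golang:alpine AS build-env
WORKDIR /app
COPY . .
RUN go build -o /app/goapp .

# Stage 2: start from plain Alpine and keep only what is needed at runtime
FROM alpine
# Install root CA certificates so HTTPS calls work
RUN apk add --no-cache ca-certificates
WORKDIR /app
# Copy only the compiled binary from the build stage
COPY --from=build-env /app/goapp /app/goapp
CMD ["/app/goapp"]
```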
Thus, small base images and the builder pattern are great ways to create much smaller containers without a lot of work. Depending on your application stack, there may be further ways to shrink the image, but do small containers really have a measurable advantage? Let’s look at two areas where small containers are extremely effective: performance and security.

To evaluate the performance gains, consider how long it takes to build a container, push it to a registry, and then pull it back down. The smaller container has an undeniable advantage over the larger one at every step.

Docker caches layers, so subsequent builds are very fast. However, many CI systems used to build and test containers do not cache layers, so the time savings are significant. Building the large container takes from 34 to 54 seconds depending on the power of your machine, while building the container slimmed down with the builder pattern takes from 23 to 28 seconds. For operations of this kind the gain is 40-50%, and it adds up when you consider how many times you build and test your code.

Once the container is built, you need to push its image to a container registry so that it can be used in your Kubernetes cluster. I recommend the Google Container Registry.

With the Google Container Registry (GCR) you pay only for raw storage and network traffic; there is no additional container management fee. It is private, secure and very fast, and GCR uses many tricks to speed up pull operations. Pushing the container image built from golang:onbuild takes from 15 to 48 seconds depending on machine performance, while pushing the smaller container takes from 14 to 16 seconds; on the weaker machines the push is roughly three times faster. On powerful machines the times are about the same, because GCR keeps a global cache of common base images, so those layers do not need to be uploaded at all. On a low-power machine the CPU is the bottleneck, so the advantage of small containers is much more tangible.

If you use GCR, I highly recommend using Google Container Builder (GCB) as part of your build system.

Using it yields much shorter Build + Push times than even a powerful local machine; in this case, building and pushing the container is almost twice as fast. On top of that, you get 120 minutes of build time per day for free, which covers the build needs of most containers.

Next comes the most important performance metric: the speed of the pull operation, that is, how quickly you can download your containers. Even if you do not care much about the time spent on pushes, the duration of the pull seriously affects overall system performance. Suppose you have a three-node cluster and one of the nodes fails. If you use a managed system such as Google Kubernetes Engine, it will automatically replace the failed node with a new one. However, this new node will be completely empty, and all of your containers have to be pulled onto it before it can do any work. The longer the pull takes, the longer your cluster runs at reduced capacity.

There are many situations where this can happen: adding a new node to a cluster, upgrading nodes, or even switching to a new container image for a deployment. Minimizing pull time therefore becomes a key factor. A small container undeniably downloads much faster than a large one, and if you run many containers in a Kubernetes cluster, the time savings can be very significant.

Take a look at the comparison: pulling the small container takes 4-9 times less time, depending on the power of the machine, than pulling the golang:onbuild version. Using shared, small base images dramatically speeds up how quickly new Kubernetes nodes can deploy containers and come online.

Now let’s look at security. Smaller containers are considered much safer than large ones because they have a smaller attack surface. Is that really true? One of the most useful features of the Google Container Registry is the ability to automatically scan your containers for vulnerabilities. A few months ago I built both the onbuild and the multi-stage containers, so let’s see whether any vulnerabilities have shown up.

The result is striking: only 3 medium-severity vulnerabilities were found in the small container, versus 16 critical and 376 other vulnerabilities in the large one. If you look at the contents of the large container, most of the security problems have nothing to do with our application; they come from programs we don’t even use. So when people talk about a large attack surface, this is exactly what they mean.

The conclusion is obvious: build small containers, because they bring real performance and security benefits to your system.

To be continued very soon …

