Python in Docker – building the image ourselves
Hello!
While once again building the Docker image of my Telegram bot on top of the official python:3.12.2-alpine3.19 image, I noticed that docker scout reports a new vulnerability in pip. I wouldn’t say that it affects my application in any way, but the very fact that there is a potential vulnerability “on board” a container whose application runs as root with the Docker socket forwarded (NOT best practice!) made me wonder: how can this risk be minimized?
Before we begin, a little disclaimer:
I’m not a professional programmer and I haven’t completed the Yandex Workshop courses; I’m writing for people who, like me, are beginners in Docker and Python :)
There are actually quite a few ways to do this. For my specific application, I decided to look into building the image on the most minimal base image, Alpine Linux, in order to reduce the attack surface. Here I ran into a couple of issues:
First, Alpine does not include Python. Well, okay, we can install it, I thought. That is possible, but the resulting container will be quite large: just look at the official python:3.12.2-alpine3.19 image, which is itself built on Alpine Linux.
Second, the main branch of the Alpine Linux repository offers Python 3.11.8, which does not suit me.
Both issues can be painlessly circumvented with a multi-stage build, where the first stage compiles and installs all the dependencies:
# First stage
FROM python:3.12.2-alpine3.19 AS builder
COPY requirements.txt .
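# Install the build dependencies needed to compile psutil
RUN apk --no-cache add gcc python3-dev musl-dev linux-headers
# Create a virtual environment without pip (it ends up in /venv, since the working directory is /)
RUN python3 -m venv --without-pip venv
# Install the application dependencies directly into the venv's site-packages
RUN pip install --no-cache-dir --target="/venv/lib/python3.12/site-packages" -r requirements.txt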
Here we take the same official python:3.12.2-alpine3.19 image as the base, copy the file listing the application dependencies, install the packages required to build Python modules from source (in my case they are needed to build psutil) and, after creating the virtual environment, install all the application dependencies into it.
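Note that the --target path passed to pip install in the first stage is tied to the interpreter's minor version, so it has to be updated whenever the base image moves to another Python release. As a small illustration of how that path is put together on a POSIX system, here is a minimal sketch; the helper name venv_site_packages is made up for this article and is not part of any library:

```python
import posixpath
import sys


def venv_site_packages(venv_root, version_info=None):
    """Return the site-packages path a POSIX venv uses for a given Python version.

    venv_site_packages is a hypothetical helper written for illustration only.
    """
    if version_info is None:
        # Default to the interpreter that is running this script
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    return posixpath.join(venv_root, "lib", f"python{major}.{minor}", "site-packages")


# For the builder image used here (Python 3.12.2, venv created in /venv)
print(venv_site_packages("/venv", (3, 12, 2)))  # prints /venv/lib/python3.12/site-packages
```

Only the major and minor numbers end up in the path, which is why a patch update such as 3.12.2 to 3.12.3 does not require changing the Dockerfile, while a move to 3.13 does.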
What exactly needs to be copied from the first build stage for Python to work?
For me this was the hardest part: searching turned up nothing useful. I had to study the Python directory structure, and through trial and error I found that the following set of files and directories is sufficient for Python 3.12.2 to work correctly in the Alpine Linux image:
/usr/local/bin/python3
/usr/local/bin/python3.12
/usr/local/lib/python3.12
/usr/local/lib/libpython3.12.so.1.0
/usr/local/lib/libpython3.so
Now we can copy the parts of Python we need from the first build stage into the second:
# Second unnamed stage
FROM alpine:3.19.1
...
# Copy only the necessary Python files and directories from the first stage
COPY --from=builder /usr/local/bin/python3 /usr/local/bin/python3
COPY --from=builder /usr/local/bin/python3.12 /usr/local/bin/python3.12
COPY --from=builder /usr/local/lib/python3.12 /usr/local/lib/python3.12
COPY --from=builder /usr/local/lib/libpython3.12.so.1.0 /usr/local/lib/libpython3.12.so.1.0
COPY --from=builder /usr/local/lib/libpython3.so /usr/local/lib/libpython3.so
...
The result, in my case, is a working Python application built on a minimal Alpine Linux image which, according to docker scout, has no known vulnerabilities.
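Since only part of the Python installation is copied, it may be worth checking that the modules the application relies on still import inside the final image: extension modules such as ssl or sqlite3 depend on shared libraries that a bare Alpine image may not contain. Below is a minimal sketch of such a check. The function name check_imports and the module list are assumptions made for illustration, so substitute the modules your own application actually uses:

```python
import importlib
import sys


def check_imports(module_names):
    """Try to import each module and return {name: error text} for the failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # usually ImportError, e.g. a missing shared library
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures


# Hypothetical list of modules to verify; adjust it to your application
MODULES = ["json", "ssl", "sqlite3", "ctypes", "zlib"]

failed = check_imports(MODULES)
for name, error in failed.items():
    print(f"FAIL {name}: {error}")

python_version = sys.version.split()[0]
print(f"Python {python_version}: {len(MODULES) - len(failed)} of {len(MODULES)} modules imported")
```

Running a script like this with the interpreter inside the freshly built container shows right away whether something the bot needs was left behind in the first stage.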
The full Dockerfile of the bot is available on GitHub.
What practices do you use?