Dockerizing the project build at all levels
Hi everyone, I'm Vadim Makerov, a backend developer at iSpring. A reproducible project build is a critical factor in maintaining and evolving a project. With a large number of projects and technology stacks, it becomes harder to guarantee build reproducibility – "if it built once, it will always build".
I talked about how to make builds idempotent at a meetup at the iSpring office in 2023. This article is a text version of my talk.
Setting the scene
Let's assume that we have a system consisting of many projects. All projects use the same set of build tools: a compiler, a linter, a code generator, and so on.
These tools are enough to build any project. The number of projects in the system keeps growing, and because teams are isolated from each other, projects are developed in parallel.
This is where the first problem arises.
Problem #1 – Modifying old projects
At the request of the information security department, several projects that have not been touched for a long time need to be updated.
Algorithm for making changes:
Clone the project repository
Build the project locally (needed to pull in the project's libraries)
Make changes
Compile, run tests, run linter
Commit changes
After work on the first project began, the first troubles arose:
Missing specialized tools
Some projects, in addition to the common set of tools, require specialized ones that most developers do not have installed.
New linter – old project
After making the changes, running the linter produced the following output:
The linter found issues it had not flagged before. We are sure they were not there earlier, because the linter runs in CI/CD before any project changes are released.
This happens because the linter version used when the project was originally developed differs from the version the developer now has installed locally while updating the project.
The warnings are valid and should be fixed, but they should not appear unless we deliberately update the tooling – we want the build to always produce the same result, with no phantom linter warnings.
The result
We have trouble building the project, plus linter warnings that need to be fixed.
Problem #2 – Updating the tool
To be precise, it is a backwards-incompatible update of a tool used in the project.
Let me give you an example of a real situation with updating tools:
The code generator is updated from v1 to v2; the two versions are incompatible with each other.
Teams A and B both have v1 of the tool installed locally.
While working on the project, Team A decides to switch to v2 of the tool.
Team A installs v2 locally and modifies the project to use it.
Due to production needs, Team B has to work on the project instead of Team A.
Team B is unable to build the project without instructions from Team A on updating the tool.
The result
Team B cannot build the project without either contacting Team A or independently figuring out and learning the required version of the tool.
The situation becomes more complicated when there are a large number of teams working on different projects.
Problem #3 – The tool is not available locally
I touched on this in the section on modifying old projects; I want to examine this problem separately.
When a tool is not already prepared locally, the developer has to:
Look in the project documentation for the tool version and installation instructions (which are unlikely to be described there)
Interrupt other developers with questions
Search the internet for the required tool and pick a version
The result
It takes the developer longer to get up to speed on the project
There is no guarantee that the developer will install the correct version of the tool.
Problems
The above examples are the result of systemic shortcomings.
Tool versions are not pinned.
It is not clear which version a given project uses
When a tool is updated, other teams and departments somehow have to be notified
The build depends on the developer's local environment
Solution
One solution could be to containerize the build.
By containerization I mean using Docker and Docker images with the necessary tools.
Containerizing the build is not the only way to solve the problems described.
There are other approaches, Nix shell for example.
We wanted containerization and tool isolation, so we chose containers and Docker.
The main advantages that dockerizing the build brings:
Portability
Docker images are easily moved between developer machines and are distributed via Docker Hub or other registries.
In addition, images can easily be mirrored into the organization's own registry if CI/CD needs to be isolated from external factors.
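For example, mirroring a build image into an internal registry is just a pull, tag, and push; registry.internal.example here is a placeholder for the organization's registry:
# Mirror a tool image into the organization's own registry
# ("registry.internal.example" is a placeholder, not a real host)
docker pull golangci/golangci-lint:v1.56
docker tag golangci/golangci-lint:v1.56 registry.internal.example/build/golangci-lint:v1.56
docker push registry.internal.example/build/golangci-lint:v1.56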
Consistent environment
The tool runs in an environment configured for it, without requiring local environment settings (environment variables, paths to executable utilities) and without conflicting with the developer's local utilities.
Isolation
Using utilities isolated in a container provides additional security for the developer's local environment and CI/CD.
The tools that are run are isolated and cannot affect the developer's host machine.
(Side note: malicious tools have many ways to escape, but it is much harder for a tool running in a Docker container to harm the host machine than for one running directly on the host.)
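A rough sketch of how far standard Docker flags let you lock a tool down (assuming the project's dependencies are vendored or already cached, since the container gets no network access):
# Run the linter with the sources mounted read-only, no network and no extra capabilities
docker run --rm \
  --network=none \
  --cap-drop=ALL \
  -v "${PWD}:${PWD}:ro" \
  -w "${PWD}" \
  golangci/golangci-lint:v1.56 \
  golangci-lint run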
Versioning
Docker images version well: any string can be used as a tag.
You can use semver, the release date of the tool version, or simply the Git commit hash.
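A quick illustration with a hypothetical tools/codegen image (the name and date are placeholders): the same image can carry all three kinds of tags at once:
# Build once, then tag by semver, release date and commit hash
docker build -t tools/codegen:v2.0.0 .
docker tag tools/codegen:v2.0.0 tools/codegen:2024-03-01
docker tag tools/codegen:v2.0.0 tools/codegen:$(git rev-parse --short HEAD)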
Selecting a tool
In CI/CD we already build in containers, using one macro image with all the utilities baked in.
This solution is not suitable for local builds and makes it difficult to independently update tools.
Thus, we have the following requirement:
The project build should be done the same way – locally and in CI/CD.
Makefile + Docker
You can describe the build in a Makefile whose targets run each step in a container based on a specific image.
Example:
# Each target runs its tool inside a pinned image; the project directory is
# mounted into the container and the Go build cache lives in a named volume
# so it survives between runs.
all: build test check

.PHONY: build
build:
	@docker run --rm -it \
		-w ${PWD} \
		-v ${PWD}:${PWD} \
		-e GOCACHE=/app/cache/go-build \
		-v go-build-cache:/app/cache/go-build \
		golang:1.22 \
		go build -v -o ./bin/app ./cmd/app

.PHONY: test
test:
	@docker run --rm -it \
		-w ${PWD} \
		-v ${PWD}:${PWD} \
		-e GOCACHE=/app/cache/go-build \
		-v go-build-cache:/app/cache/go-build \
		golang:1.22 \
		go test ./...

.PHONY: check
check:
	@docker run --rm -it \
		-w ${PWD} \
		-v ${PWD}:${PWD} \
		-e GOCACHE=/app/cache/go-build \
		-v go-build-cache:/app/cache/go-build \
		golangci/golangci-lint:v1.56 \
		golangci-lint run
The advantages I would highlight are:
Simplicity – such a Makefile is quite easy to write
Intuitiveness – it is clear what each command in such a Makefile does
The downsides:
There is no way to reuse layers from other build stages, as in a classic Dockerfile
It's not very convenient to work with docker run when you need to write more complex commands (an example follows below)
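For instance, as soon as one target needs several steps, everything has to be squeezed into a single sh -c string (a hypothetical composite step, reusing the mounts from the Makefile above):
# Generate, vet and build in one container invocation
docker run --rm -it \
  -w "${PWD}" \
  -v "${PWD}:${PWD}" \
  -e GOCACHE=/app/cache/go-build \
  -v go-build-cache:/app/cache/go-build \
  golang:1.22 \
  sh -c "go generate ./... && go vet ./... && go build -o ./bin/app ./cmd/app"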
Dev containers
Dev Containers is an extension for VS Code (JetBrains also supports it in their products) that lets you run the IDE inside a container with a pre-prepared development environment.
The downsides of this approach are the monolithic image for the IDE and the inability to use such containers in CI/CD.
A dev container is more of an environment description than a build utility, which makes it inconvenient to handle cache paths, export the cache to CI/CD, and so on.
Earthly.dev
Earthly lets you describe the build in an Earthfile, a format that resembles a mix of Makefile and Dockerfile.
(example from the project's README.md)
# Earthfile
VERSION 0.8
FROM golang:1.15-alpine3.13
RUN apk --update --no-cache add git
WORKDIR /go-example

all:
    BUILD +lint
    BUILD +docker

build:
    COPY main.go .
    RUN go build -o build/go-example main.go
    SAVE ARTIFACT build/go-example AS LOCAL build/go-example

lint:
    RUN go get golang.org/x/lint/golint
    COPY main.go .
    RUN golint -set_exit_status ./...

docker:
    COPY +build/go-example .
    ENTRYPOINT ["/go-example/go-example"]
    SAVE IMAGE go-example:latest
The advantages of Earthly that I would highlight are:
The downsides turned out to be more significant for us.
Our own tool
BrewKit
We decided to make our own tool: BrewKit.
(Yes, yes, we reinvented the wheel.)
The distinguishing qualities of BrewKit are:
Builds run in containers
Sources are copied into the build stage, and the stage's results are explicitly exported or used by later stages
Misusing a utility cannot delete or damage local files
If the files a stage depends on have not changed, the stage is skipped
BuildKit is used as the build engine
The build is described in Jsonnet – a powerful extension of classic JSON
Architecture
BrewKit calls the Docker daemon and runs the specific build commands inside the specified images.
Demo of the tool
https://asciinema.org/a/q09d6OZyAiGNz1QEFyrPLxPTi
Conclusions
With build containerization in place:
It is easier to control which tools and versions are used
Updating tools is simpler
The developer experience has improved
An addendum to the conclusions, one year later
I spoke about this topic at a meetup a year ago and I want to share what has changed in the project during this time:
BrewKit -> open source
BrewKit is now open source under the MIT license, as promised at the meetup.
You can try BrewKit for yourself.
There is a quick-start example in README.md, and docs/ contains more details about the internal implementation.
Updates made easy
Go 1.22 was released recently, and updating to it was quick and simple.
Previously, updating a project to a new version of Go, the linter, and the code generators took us about 4 hours per project.
With build containerization in place, updating each project takes half an hour (in practice, even less).
Preparing a separate image per tool, instead of one macro image with all the tools, cut the time to introduce a new tool to a couple of minutes, compared to the hour of work plus the complicated macro-image maintenance it took before.