Optimizing the size of the Go binary

If you have ever written in Go, you have probably noticed the size of the resulting binaries. Of course, in the age of gigabit links and terabyte drives, this should not be a big problem. Still, there are situations when you want the binary to be as small as possible, while not giving up Go. The options for making a Go binary “lose weight” are discussed below.

Target aka “victim”

First, a little context. There is a daemon (a constantly running process) that does some fairly basic work. Close analogies are the DigitalOcean Agent or the Amazon CloudWatch Agent, which collect metrics from machines and send them to centralized storage. Our daemon performs a slightly different task, but that is not essential here.

A few more facts about the daemon:

  • It is written in Go (and there is no desire to rewrite it in another language);
  • It is deployed on many machines;
  • It requires periodic updates.

At the start of the study, the size of the Go binary was 11 MB.

Let me see you stripped

The compiled binary contains debug information. In my situation it is not needed: debugging on the target machines is impossible anyway due to lack of access. So it can safely be removed, either by compiling with the appropriate flags or with the strip utility. The process is called stripping and should be quite familiar to Linux lovers (the flags are described in the output of go tool link):

go build -ldflags "-s -w" ./...

After this procedure, the size of the binary was 8.5 MB. In other words, debug information added about 30% to the size in this particular case.

Compression

The next move is compression. You can simply distribute the binary as a tar.gz archive. Overall, this is a working solution: the target systems are able to unpack the archive.
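For example, assuming the binary is named agent, the round trip looks like this:

tar -czf agent.tar.gz agent    # pack on the build machine
tar -xzf agent.tar.gz          # unpack on the target machine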

Another option is to use a packer that unpacks and runs the binary on the fly. Probably the best known in this area is UPX. My first acquaintance with it happened more than 20 years ago, in the era of dial-up modems and crack/keygen crafts. Despite such a venerable age, UPX still finds users and continues to develop. I missed the evolutionary point at which UPX started working for Go out of the box, but no extra gymnastics are required today. Judging by the history, it happened about 4 years ago, so everything works very stably.

Let’s try to package our binary using UPX:

upx agent

Packing took 1.5 seconds, and the binary came out at 3.4 MB. An excellent result!

A little digging through the packer options turns up flags such as --brute and --ultra-brute. Let’s play with the first one:

upx --brute agent

The resulting binary was 2.6 MB, which is 4 times smaller than our original version. True, the packing procedure slowed down significantly and took a full 134 seconds.

Out of curiosity, let’s also try --ultra-brute:

upx --ultra-brute agent

The binary is still the same 2.6 MB (strictly speaking, it shrank, but only by 8 KB). Packing took another 11 seconds on top, for a total of 145 seconds.

The nagging thought that I wanted the speed of the first option and the size of the second and third led to the following command:

upx --best --lzma agent

The result is the same 2.6 MB, but it takes only 4 seconds.


Heavy dependencies

The ease of adding external dependencies can have negative consequences. For example, it is all too easy to pull in some “very necessary” module that inflates the binary for incommensurate benefit.

It is a very good (but often overlooked) practice to monitor the size of the distribution. When there is a graph of distribution size versus revision, it is very easy to figure out which of the changes led to the bloat.
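A minimal sketch of such tracking, assuming a Linux build host and a hypothetical metrics.csv file that a CI step appends to after every build:

go build -ldflags "-s -w" -o agent ./...
echo "$(git rev-parse --short HEAD),$(stat -c %s agent)" >> metrics.csv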

In my case, the culprit was the integration with Sentry. If you have never encountered Sentry: in a nutshell, it is a service that collects information about errors occurring in an application. Such things are bolted on primarily to improve quality and to hunt down problems that arise in production operation of a service or product. Returning to the “obesity” problem, let’s see what the variant without the Sentry integration gives us. We start the exercise again from the 11 MB binary. Without stripping, removing the integration brought the size down to 7.8 MB, and after stripping it became as little as 6.2 MB. Almost half the original size!

Of course, you might want to keep error tracking. But in this case it is cheaper for me to set up an intermediate service that accepts error messages over plain HTTP and forwards them to Sentry.
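The client side of such a relay can be tiny. A minimal sketch, assuming a hypothetical relay service that accepts plain-text error reports over HTTP (the URL and endpoint are made up):

package main

import (
	"log"
	"net/http"
	"strings"
)

// reportError posts an error message to the intermediate service,
// which is then responsible for forwarding it to Sentry.
func reportError(msg string) error {
	resp, err := http.Post("http://relay.internal/errors", "text/plain", strings.NewReader(msg))
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

func main() {
	if err := reportError("something went wrong"); err != nil {
		log.Println("failed to report error:", err)
	}
}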

Once again about compression

Having dealt with the dependencies, let’s apply upx again:

upx --best --lzma agent

Resulting binary size: 1.9 MB! Let me remind you that the journey began at 11 MB.


The price to pay for the compact size is startup time, since the binary must be unpacked first. A rough measurement with the time utility showed an increase of 170-180 milliseconds in my case. For a daemon, where the running time is disproportionately larger than the startup overhead, this is not a problem at all. But the aspect is worth keeping in mind.
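A rough way to see the overhead for yourself, assuming the daemon has some flag that makes it exit immediately (--version here is hypothetical): keep an unpacked copy and compare.

cp agent agent.packed && upx --best --lzma agent.packed
time ./agent --version
time ./agent.packed --version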

Other options

Where to go if you want more?

One of the options for delivering minimal updates is binary patches. For example, Google Chrome uses this concept for distributing updates. The bsdiff/bspatch utilities make it easy to organize this process. In my case, the bspatch utility is not available on the target machines, so for now I considered these exercises inappropriate, although preliminary experiments showed very good results in terms of small patch sizes.
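If bspatch is available on your targets, the workflow is straightforward (file names here are illustrative):

bsdiff agent-v1 agent-v2 agent.patch     # on the build machine: compute the delta
bspatch agent-v1 agent-v2 agent.patch    # on the target: rebuild v2 from v1 plus the patch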

Another option, mentioned in passing at the very beginning, is to rewrite everything in another language. If size is paramount, that road ends in C. But development time is precious to me, and I want to enjoy the process, so no.

Another interesting option is gccgo. If the target machines are more or less uniform, you can use this method and get a dynamically linked Go binary. The size of the binary will be a pleasant surprise.

This is not my case (the OSes are very different), but I ran the experiment anyway:

go build -compiler gccgo -gccgoflags "-s -w" ./...

The conditions are not quite equal (this is a different virtual machine and a different OS), but right out of the gate we get a 1.8 MB binary, albeit dynamically linked. Apply upx and we get… just 284 KB! The main thing is not to be surprised by errors like the following when moving to another environment:

./agent: error while loading shared libraries: libgo.so.16: cannot open shared object file: No such file or directory
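Before shipping, it is worth checking which shared libraries the (unpacked) binary expects, so the missing ones can be installed in advance:

ldd ./agent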

TinyGo can be added to the collection of exotic compilers. It did not manage to build this project: compilation fails with a number of errors. But in general I have already had successful attempts in the context of another project (small, but still not “Hello, World!”). The binary is dynamically linked, but with fewer dependencies than the gccgo version (which means slightly fewer portability problems).

If you have enough platform-specific code, build tags may come in handy. The rules can be trickier than simply naming files with a _windows.go or _linux.go suffix. How much you save depends heavily on the specific situation. In my case there are practically no savings, since the main target platform is Linux x86_64, and the Mac and ARM support is just experimentation.
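A minimal sketch of a build tag, using a hypothetical file holding Linux-only code (with the pre-Go-1.17 tag syntax, to match the toolchain used here):

// +build linux

package main

// collectLinuxStats is a hypothetical Linux-only helper. When building
// for other platforms, this whole file (and anything it imports) is
// skipped, which is where the size savings come from.
func collectLinuxStats() {}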

Docker

It is not uncommon to see a Go binary distributed as a Docker container, for example to completely isolate the daemon from the host system and mount in only the necessary files or directories. In my situation there is a neighboring daemon that is distributed this way and is used on two machines. In this case, what matters is optimizing not the size of the binary itself but the size of the Docker image. An optimally sized Docker image is built with the fairly standard multi-stage build trick:

FROM golang:1.15 as builder

ARG CGO_ENABLED=0

WORKDIR /app

RUN apt-get update && apt-get install -y upx

COPY go.mod go.sum ./
RUN go mod download

COPY . .
RUN make release

FROM scratch

COPY --from=builder /app/server /server

ENTRYPOINT ["/server"]

Each stage begins with a FROM directive. The first stage contains all the dependencies required for the build, and the binary is produced there. Then, starting from the FROM scratch directive, a new image is literally formed from scratch: the previously built binary is copied into it and the launch command is defined.

Behind make release lurk the calls to go build and upx. The resulting image size was only 1.5 MB (the size is slightly smaller because we are talking about a similar, but slightly different daemon). If you build everything in a single stage using the golang image as a base, the result is 902 MB.
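The Makefile itself is not shown here, but based on the steps above the release target presumably boils down to something like this sketch (flags assumed from earlier sections; CGO_ENABLED=0 is already supplied by the Dockerfile’s ARG):

release:
	go build -ldflags "-s -w" -o server .
	upx --best --lzma server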

Conclusions

So we went from 11 MB to 1.9 MB, reducing the size of the Go binary almost 6 times! Stripping the binary and then packing it with upx is a very effective measure. Do not neglect removing unnecessary dependencies either; in my case this led to a very noticeable reduction. And if there is no particular variability in the environments where the binary runs, the gccgo option is worth a closer look.
