Size matters! Part 1

In this article, we will look at how to automate the detection of size regressions in an Android application.

As developers, we want our application to be available to as many users as possible, so that they can download and install it without any problems. One of the metrics tied to this goal is the size of the application: a large app can cause problems when downloading and installing on devices with limited storage or a slow internet connection. The more we reduce the size of the application, the higher the install conversion becomes, as detailed in a separate article. Therefore, we want not only to optimize the size, but also to monitor how it changes and to detect an increase before it hits production.

In a large team, it is very difficult to keep track of the changes taking place in the project, as there are many feature teams adding cool features that can later affect the size. Therefore, it is best to automate this control, and a dedicated flow on CI will help us with this.

Regression detection in development

Let’s start with the simplest case: checking the app size in the development branch. It is the only source of truth for us, since we create our release branches from development.

Our first idea was to assemble a release build for each commit (pull request merge) in development and collect size metrics from it. But we quickly realized that this is not a good fit for a quick solution: a release build takes a decent amount of time, so if two commits are pushed to development at nearly the same time, it becomes difficult to compare the two resulting apks with each other. Therefore, we found an easier way.

1. Suppose we pick a specific commit at a specific time and run a release build from it using the familiar Gradle command:

./gradlew assembleRelease

From the resulting apk, we collect metrics (apk size, current commit, application version, date, etc., for example in JSON format) and upload them to any storage (for example, gcloud); the main thing is that we can upload our artifacts there and download them again when needed. We also upload the assembled release apk, since we will need it in the next size check. Both uploads use the gsutil command line tool:

gsutil cp "$build_size_info.json" "gs://custom_folder/build_size_info.json"
gsutil cp "$current-release.apk" "gs://custom_folder/current-release.apk"
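The metrics file itself can be produced with a few lines of shell before the upload. The following is only a minimal sketch: the function name write_build_size_info and the JSON field names are our assumptions for illustration, not a fixed format, and the commit hash and version are passed in by the caller.

```shell
#!/usr/bin/env bash

# Hypothetical helper: writes size metrics for one apk as a small JSON file.
# Arguments: <apk path> <commit hash> <app version> <output json path>
write_build_size_info() {
  local apk_path="$1" commit_hash="$2" app_version="$3" out_path="$4"
  local apk_size_bytes build_date

  # wc -c prints the file size in bytes; tr strips the padding some systems add.
  apk_size_bytes=$(wc -c < "$apk_path" | tr -d '[:space:]')
  build_date=$(date -u +%Y-%m-%dT%H:%M:%SZ)

  # Field names are illustrative; use whatever your dashboards expect.
  printf '{\n  "apk_size_bytes": %s,\n  "commit": "%s",\n  "app_version": "%s",\n  "date": "%s"\n}\n' \
    "$apk_size_bytes" "$commit_hash" "$app_version" "$build_date" > "$out_path"
}
```

On CI it could be called as write_build_size_info path/to/release.apk "$(git rev-parse HEAD)" "$app_version" build_size_info.json, where the apk path and the app_version variable depend on your project, and the resulting file is then uploaded with gsutil cp as shown above.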

2. After some time, for example 3 hours (your choice), we pick the next commit in development. We assemble a release build on this commit as well, and upload the apk and metrics in the same way. But now we also need to download the previous build from the storage, the one we assembled 3 hours earlier, and compare the previous and current apk. Just comparing apk sizes is not enough for us; we want a detailed comparison that shows which files have been added or removed. The diffuse tool by Jake Wharton helps here: it can compare apk, aab, jar, and aar files and show the details of the comparison.

java -jar diffuse-binary.jar diff $prev.apk $current.apk > $report.txt
gsutil cp "$report.txt" "gs://custom_folder/report.txt"

The result of the comparison is written to the file report.txt, which we upload to our storage so that it can be viewed later via a link. The contents of report.txt look like this:

The file also contains more details about the changed resources (added, removed), but this summary may be enough for us to understand what caused the regression. We are interested in the diff column, which shows how much the files inside the apk have changed. In this case, we see that the arsc file (resources.arsc), which stores all compiled resources (drawable/*.xml, layout/*, raw/*, values/*, etc.), has changed significantly. From this we understand that someone merged a pull request into development that added compiled resources, and we can find this commit in the git history.
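The diffuse report explains what changed, but CI still needs a simple yes/no answer on whether to raise an alert. A minimal sketch of such a check is below; the function names and the idea of a byte threshold are our assumptions, so pick a limit that matches how noisy your builds are.

```shell
#!/usr/bin/env bash

# Hypothetical helper: prints the size difference in bytes (current minus previous).
# Arguments: <previous apk> <current apk>
apk_size_delta() {
  local prev_size current_size
  prev_size=$(wc -c < "$1" | tr -d '[:space:]')
  current_size=$(wc -c < "$2" | tr -d '[:space:]')
  echo $(( current_size - prev_size ))
}

# Hypothetical helper: succeeds (exit code 0) when the apk grew by more
# than the allowed number of bytes, and fails otherwise.
# Arguments: <previous apk> <current apk> <threshold in bytes>
is_size_regression() {
  local delta
  delta=$(apk_size_delta "$1" "$2")
  [ "$delta" -gt "$3" ]
}
```

For example, if is_size_regression prev.apk current.apk 102400 succeeds, the job could send the notification; otherwise it finishes silently. The 100 KB limit here is only an illustration.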

3. To make it easier to find the commit that caused the regression, we can list all the commits made between the two commits on which the application size was checked. To do this, we call the git command:

git rev-list --ancestry-path "$prev_commit_hash".."$current_commit_hash"
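Bare hashes are not very convenient to read in a notification, so the same range can also be printed with authors and commit subjects. This is a small sketch that swaps in git log for git rev-list; the function name list_involved_commits and the output format are our own choices.

```shell
#!/usr/bin/env bash

# Hypothetical helper: prints one line per commit between two size checks,
# as "<short hash> <author>: <subject>", newest commit first.
# Arguments: <previous commit hash> <current commit hash>
list_involved_commits() {
  git log --ancestry-path --format='%h %an: %s' "$1".."$2"
}
```

The output can be appended to the report or to the message sent to the team, so the responsible developer sees the candidate commits right away.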


As a result, our CI workflow will look like this:

Scheme of a complete workflow on CI

To run the check every 3 hours, you can set up a scheduled job in your CI or trigger it on a timer (this depends on the capabilities of your CI).

If a size regression is found, you can notify the team with a link to the detailed comparison (report.txt in gcloud). The responsible developer can then identify the cause, find the specific commit in development that caused the regression, and fix or revert the changes.


This approach has several shortcomings:

1. Late regression detection: we learn about the regression only after the changes have already been merged into development.

2. Requires manual work: the process is not fully automated, since as soon as a regression occurs, a developer has to check the comparison result and look for the specific commit among all the involved commits.

3. We assemble a universal apk: it contains resources for all device configurations, which inflates the measured size of the application. Google Play solves this problem on its side: we upload an App Bundle, and the apk is generated for the configuration of a particular device.

We will address all these shortcomings in the next part of the article.
