Simple stories about speeding up the build of a large codebase
We often read instructions about setting up magic compiler flags to speed up project builds. These methods do work, but they do not always deliver the time savings you want. The real problem may simply be the sheer amount of code. Sometimes plainly removing old unused code, parallelizing the build, and splitting the codebase into components gives a more tangible result.
Removing unused code
Do you have a large codebase? If so, it very likely contains unused code. For a long time it did not occur to me that simply removing code could cut build time almost threefold. In early 2020 we took over support of a fairly old component that provided services, libraries, and applications for working with devices and boards. After setting up its build in our infrastructure, we noticed that the build took quite a long time: 35 minutes. By comparison, the Core functionality components we had already developed (which provide features for other components) took 30 minutes in total for all five. And here a single new component took 35 minutes. We were sure that building this functionality should not take that long, so we decided to take an inventory of the code.
The component contained applications whose names carried suffixes such as v1/v2 or NG (New Generation), which clearly pointed to duplicates. Apparently, when a new application was created, the old one was never deleted.
First, everything that was not in the installer went onto the list of removal candidates. These included both graphical and console applications. What remained was to find out whether anyone inside the company still used them. Anything not in the installer was certainly not delivered to end users, but it could still be used by testers, programmers, and systems engineers at the company's factory, since the applications were shipped as part of the build output.
We also compiled a list of all the libraries in the component and checked where each one was used across the entire codebase (this was not difficult, since OpenGrok lets us search all of the code).
It took a month to get an up-to-date list of the applications and libraries that were actually in use. After that, we cleaned up the repository with a clear conscience. And hooray: the build time dropped to just 12 minutes.
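The two inventory steps described above (diffing the built applications against the installer, then looking for references to each library) can be roughly sketched in a few lines of Python. This is only an illustration, not the process we actually ran: the function names, the directory layout, and the idea that a plain substring match counts as a "reference" are all assumptions. A text search like this only produces candidates; whether people still use something has to be confirmed by asking them.

```python
from pathlib import Path


def removal_candidates(built_apps, installer_apps):
    """Applications that are built but never packaged into the installer."""
    return sorted(set(built_apps) - set(installer_apps))


def find_unreferenced(libraries, files):
    """Return library names that no file outside the library itself mentions.

    libraries: mapping of library name -> directory prefix holding its own
               code (assumed to end with "/" so that prefixes do not collide)
    files:     mapping of file path -> file contents
    """
    unused = []
    for name, own_dir in libraries.items():
        # Crude heuristic: any occurrence of the name outside the library's
        # own directory counts as a use (an #include, a link line, and so on).
        referenced = any(
            name in text
            for path, text in files.items()
            if not path.startswith(own_dir)
        )
        if not referenced:
            unused.append(name)
    return sorted(unused)


def load_files(root, suffixes=(".cpp", ".h", ".cmake", ".txt")):
    """Read source and build files under root, keyed by path relative to root."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): p.read_text(errors="ignore")
        for p in root.rglob("*")
        if p.is_file() and p.suffix in suffixes
    }
```

On a real checkout, `find_unreferenced(libraries, load_files("path/to/repo"))` would produce the list of libraries to investigate further; the library names and prefixes passed in are up to you.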
Parallelizing builds
We sometimes hear about build parallelization tools, but unfortunately they are not used all that often in practice. Personally, I have seen such a tool in use only once: Incredibuild. It made a lasting impression. I had just moved to a new project, downloaded the repository, and started a build. The local build took 30 minutes. I was shocked, because I had heard a lot about this project: they used cool tools, generated entire layers of code with code generators, and had a mini-server for the main product that made it possible to run full-fledged smoke tests on a single component rather than on the final build of the product. I went to find out how everyone coped with this, and was told that they used Incredibuild. Until then I had never used build parallelization, so I was pleasantly surprised by the result: a complete build took 5 minutes instead of 30. The only difficulty with this approach is setting up the dependencies between all the projects correctly. On this project Incredibuild was used only for local builds. Integrating it into the build system did not work out, because project dependencies broke from time to time, and I did not want a red build just for that reason.
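To show why correct dependencies matter so much for a parallel build, here is a minimal Python sketch of a dependency-aware scheduler. It is not how Incredibuild works internally; it is a generic illustration built on the standard library's `graphlib` and `concurrent.futures`, and the `build` callable and project names are hypothetical. A project is started only once everything it depends on has finished, so a missing edge in `deps` means a project may start too early and fail intermittently, which is exactly the kind of flaky red build described above.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter  # Python 3.9+


def build_in_parallel(deps, build, workers=4):
    """Build every project, running independent projects concurrently.

    deps:  mapping of project -> set of projects it depends on
    build: callable invoked with a project name (hypothetical build step)
    Returns the project names in the order their builds finished.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    finished = []
    running = {}  # future -> project name
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while sorter.is_active():
            # Start every project whose dependencies are already built.
            for project in sorter.get_ready():
                running[pool.submit(build, project)] = project
            # Block until at least one running build completes.
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raises the exception if the build failed
                project = running.pop(future)
                finished.append(project)
                sorter.done(project)  # unlocks projects that depend on it
    return finished
```

With `deps = {"app": {"lib_a", "lib_b"}, "lib_a": {"core"}, "lib_b": {"core"}, "core": set()}`, `core` is built first, `lib_a` and `lib_b` can then run at the same time, and `app` is built last.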
Component architecture
Components are something like a construction set. The component build process works as follows. After each component is built, the parts needed to use it are uploaded to a server: the binaries and, in the case of C++ for example, the interfaces in .h files. Now we do not need to build this code every time; we simply download the required version of the component. Next, the builds of the components that depend on it are launched. I would also note that a component's tests run only when something in it has changed (the code of the component itself has changed, or the version of a lower-level component it uses has been updated), which is logical. This architecture can save a lot of time, especially on components that change rarely or only slightly.
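One possible way to implement the "build and test only when something changed" rule is to fingerprint each component from its own sources plus the versions of the lower-level components it uses. The Python sketch below is an assumption about how this could be done, not a description of our build system; the function names and the idea of keeping a set of already published fingerprints on the server are hypothetical.

```python
import hashlib


def component_fingerprint(sources, dep_versions):
    """Hash a component's own code together with its dependency versions.

    sources:      mapping of file path -> file contents (bytes)
    dep_versions: mapping of lower-level component name -> version string
    """
    digest = hashlib.sha256()
    # Sort keys so the result does not depend on dictionary ordering.
    for path in sorted(sources):
        digest.update(path.encode())
        digest.update(b"\0")
        digest.update(sources[path])
        digest.update(b"\0")
    for name in sorted(dep_versions):
        digest.update(f"{name}={dep_versions[name]}\n".encode())
    return digest.hexdigest()


def needs_build_and_tests(fingerprint, published):
    """True unless artifacts for this exact fingerprint are already uploaded.

    published: set of fingerprints whose binaries and headers are on the server
    """
    return fingerprint not in published
```

If neither the component's code nor the versions of its dependencies changed, the fingerprint is identical, so the existing binaries and headers can be downloaded and the tests skipped. Changing a single source file or bumping one dependency version yields a new fingerprint and triggers a build and a test run.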
These are the kinds of techniques that can speed up builds and make programmers happy.