Flux v2 monorepo experience

Hi all. My name is Alexey, I am a DevOps engineer, and today I want to tell you a little about an infrastructure solution built for one of my key customers.

A little about my work and the division of responsibility. I provide services for setting up and maintaining cloud infrastructure on GCP, as well as monitoring, alerting, logging, and so on. At Altenar I am part of the automation team, which also handles release pipelines, improves various processes in the company, and does other important things. CI, builds, networking, and access management are the responsibility of other teams, and I touch those parts very little or not at all.

For a long time my team used Flux v1 for clusters on VMs and for GKE in a “one cluster – one repository” format, with shared “templates” connected as Git submodules. This approach has its drawbacks, and one of the most significant is that the first version of Flux has been unsupported since the end of 2020.

So we had technical debt in the form of Flux v1 and, sooner or later, the need to either upgrade to version 2 or switch to other tools. We chose the first option, and at the same time I decided to improve our current structure by getting rid of a bunch of separate repositories and abandoning the “submodule templates”.

The peculiarity of the migration is that some clusters, such as my sandbox and dev clusters, could simply be recreated with a fresh Flux v2 installation, while others are in production, so I had to think through DR scenarios in case something went wrong and write a runbook for migrating with minimal downtime.

How I built the new monorepo and why.

Old scheme.

repo-1
└── flux-v1-cluster
    ├── submodule-base
    │   └── charts
    └── submodule-gke
        └── charts

repo-2
└── submodule-base
    └── charts

repo-3
└── submodule-gke
    └── charts

Let’s say I need to see what a chart is rolling out into the cluster. I see in the cluster that it is managed by Flux, so I go to the flux-v1-cluster repo and try to find what I am looking for there, but I don’t find it. Then I remember that there are submodules here, so I dig into submodule-base and look there, but it isn’t there either… I go to the neighboring submodule-gke repo and aha! There it is! 🙂

And the next time you remember that you found some chart in submodule-gke, you go straight there, but it is not there… because it is in the other repo, submodule-base… In general, if you have ever had to work with linked repositories once every month or two, you already understand what the problem is.

Submodule versioning adds another problem. At one point I could not understand for a long time what was wrong, because a chart was being rolled out to one of the clusters that was not in any submodule. As it turned out, the chart had been added to some custom branch of a submodule, and it would be quite hard to guess that without checking which revision of the submodule was actually pinned.

As the neural network put it when I asked about this structure:

This code organization has several disadvantages:

1. Complex dependency management: with this structure, managing dependencies between the different repositories and their submodules can be quite complicated and can lead to dependency conflicts.

2. More complicated deployments: this code organization can complicate the deployment process, since updates have to be coordinated across several repositories and submodules.

3. Limited flexibility: changing one component may require updating dependencies in several repositories and submodules, which limits flexibility and increases the time needed to change the system.

4. Longer development time: if several teams work on different repositories and submodules, this can increase development time and reduce the productivity of the team as a whole.

5. Scaling difficulties: adding a new component or changing an existing one can be hard, since the changes have to be coordinated across several repositories and submodules.

New scheme.

Even though Flux v2 also lets you split configuration across separate repositories (the closest analogue of our “submodules”), I implemented a monorepository following one of the examples in the documentation:

https://fluxcd.io/flux/guides/repository-structure

.
├── apps
│   ├── common
│   │   └── monitoring
│   │       ├── prometheus
│   │       └── prometheus-rules
│   ├── kafka
│   ├── rabbitmq
│   └── redis
├── clusters
│   ├── cluster-1
│   │   ├── custom-resources
│   │   └── sync-code
│   ├── cluster-2
│   │   ├── custom-resources
│   │   └── sync-code
│   ├── gke-template
│   └── self-hosted-template
└── infrastructure
    ├── controllers
    │   └── nginx-ingress
    └── lb
        └── metallb
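
To make the tree more concrete, here is a rough sketch of how a cluster directory can pull in the shared paths. This is my illustration rather than a copy of the actual repository contents: the names and intervals are assumptions, the flux-system source is the Flux default, and API versions may differ between Flux releases.

# clusters/cluster-1/sync-code/apps.yaml (illustrative)
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 10m
  prune: true
  path: ./apps        # the shared application packages from the tree above
  sourceRef:
    kind: GitRepository
    name: flux-system # the monorepo itself, created at bootstrap
# a similar Kustomization points at ./infrastructure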

Advantages:

  1. All the necessary “modules” and applications are, to my mind, in one place: a single repository.

  2. It is easy to add new applications both to a specific cluster and to the generalized cluster “templates”.

  3. When using a custom branch for a cluster, all applications are also at hand, and their versions will not accidentally “drift” because of colleagues working in the master branch (see the sketch after this list).

  4. When creating a PR against the monorepo’s master branch, unlike a PR in a submodule, you know it affects all resources, so you will be more careful 🙂
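
Regarding point 3, a minimal sketch of what that looks like, assuming the cluster syncs from a GitRepository object named flux-system; the branch name here is illustrative:

# the cluster’s flux-system GitRepository (fragment, illustrative)
spec:
  ref:
    branch: cluster-1-freeze   # this cluster tracks its own branch of the monorepo instead of master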

Disadvantages:

  1. If you have not thought the cluster structure through well and decide to change it significantly on the fly, when several clusters are already using the monorepo, it will be quite painful.

  2. The point about PRs affecting the entire infrastructure fairly belongs on this list as well.

The neural network had its own opinion on this matter too; naturally, I asked it to compare which approach is better:

Whether it is better or worse depends on the requirements of the project and how the development team is organized.

The single-repository structure has the following advantages:

1. Simpler dependency management: this structure makes it easier to manage dependencies between components, because they are all in the same repository.

2. Simpler deployment process: simpler dependency management also leads to a simpler deployment process, since updates can be coordinated within a single repository.

3. Better flexibility: if a component needs to be changed, the change can be made within a single repository, which improves flexibility and reduces the time needed to change the system.

4. Better team productivity: a single-repository structure can improve team productivity, because it makes it easier to coordinate work on different components.

However, the single-repository structure also has some drawbacks:

1. Repository growth: all components live in the same repository, which can increase its size and complicate versioning.

2. Scaling limitations: adding a new component or changing an existing one can make the system harder to scale, since the changes have to be made within a single repository.

3. Longer development time: a growing repository can increase development time and reduce the productivity of the team as a whole.

So, both structures have their advantages and disadvantages, and the choice between them depends on the specific requirements of the project and the development team.

A little about app packaging

Even after reading the documentation several times, for a long time I could not understand what exactly a “packaged application for Flux v2” actually is.

In the end, the simplest and clearest way I can describe it is this:

In essence, to package an application for the monorepo, you collect the set of resources the application needs, such as a namespace, policies, and other raw manifests. If the application is installed from a Helm chart, you add a HelmRelease. Then you simply put all of this in one directory and add a kustomization.yaml that lists the files used.
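
A minimal sketch of such a package, using the redis directory from the tree above; the chart, version, Helm repository, and namespaces are placeholders rather than the actual contents of my repo, and it assumes a HelmRepository named bitnami already exists in flux-system:

# apps/redis/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: redis
---
# apps/redis/release.yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: redis
  namespace: redis
spec:
  interval: 10m
  chart:
    spec:
      chart: redis
      version: "17.x"
      sourceRef:
        kind: HelmRepository
        name: bitnami          # assumed to be defined elsewhere in the repo
        namespace: flux-system
---
# apps/redis/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - namespace.yaml
  - release.yaml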

Best practice from my experience:

If the application has any configurable parameters (for example, the classic values for a Helm chart), include them right away as a ConfigMap and make its use mandatory. If you need to install with the default options, just add an empty ConfigMap.
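
Continuing the redis sketch above, the HelmRelease can consume that ConfigMap via valuesFrom; the names here are illustrative:

# apps/redis/values.yaml (illustrative)
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-values
  namespace: redis
data:
  values.yaml: |
    # chart values go here; leave this empty to install with defaults
---
# apps/redis/release.yaml (fragment)
spec:
  valuesFrom:
    - kind: ConfigMap
      name: redis-values
      valuesKey: values.yaml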

If you want to add components to the cluster that require CRDs, think in advance about how you will package them as a separate “application” in Flux v2.

An example is kafka-alerts in my repository.

Homework: try adding some Prometheus rules to a fresh cluster right away, and you will immediately understand why I split the cluster code into two directories: custom-resources and sync-code.
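
(Spoiler: custom resources such as PrometheusRule can only be applied once the CRDs they rely on exist in the cluster, so the two directories can be reconciled by separate Flux Kustomizations that are ordered relative to each other.) A minimal sketch of such ordering with dependsOn; the direction of the dependency is my assumption about the intent, not a copy of the real manifests:

# clusters/cluster-1/custom-resources.yaml (illustrative)
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: custom-resources
  namespace: flux-system
spec:
  interval: 10m
  prune: true
  path: ./clusters/cluster-1/custom-resources
  sourceRef:
    kind: GitRepository
    name: flux-system
  dependsOn:
    - name: sync-code   # assumption: sync-code installs the CRD-providing components first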

An example of packaging Prometheus can be found in the example repository, and you can even roll it out yourself; more details at the end of the article.

Flux v1 -> Flux v2 migration order

Everything is more or less simple, but there are a number of nuances.

We follow the practice of preparing a change plan “as if for a five-year-old”, so that any available engineer can walk through the checklist and not forget anything critical (although situations vary).

If you have a very outdated Prometheus, as I did, it will not reconcile, because its CRDs will not match the new version. You have to delete all the old Prometheus resources and custom resource definitions by hand before rolling out the new one.

In outline:

  1. Scale the Flux v1 deployment to 0 replicas

  2. Bootstrap Flux v2 into the cluster (there are several ways to do this; the simplest one is described in my repository, and a sketch of the resulting sync manifests follows this list)

  3. Apply the new cluster code

  4. Change or remove the old Flux v1 annotations by hand on the components it controlled (though I prefer to delete the old installation and reinstall the application, since I can afford that during a maintenance window)
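
For reference, this is roughly what the bootstrap step commits to the repo as the cluster sync manifests. Flux generates these itself; the URL and path here are placeholders, and API versions may differ between Flux releases:

# clusters/cluster-1/flux-system/gotk-sync.yaml (roughly)
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: flux-system
  namespace: flux-system
spec:
  interval: 1m
  url: ssh://git@example.com/infra/flux-monorepo.git   # placeholder
  ref:
    branch: master
  secretRef:
    name: flux-system
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: flux-system
  namespace: flux-system
spec:
  interval: 10m
  prune: true
  path: ./clusters/cluster-1   # the cluster directory from the tree above
  sourceRef:
    kind: GitRepository
    name: flux-system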

In general, everything should go smoothly. Still, if you are doing this for the first time, I recommend experimenting on a demo cluster first.

Example code for creating a sandbox in GKE via Terraform is here: https://github.com/ksemele/tf-gke-test
An example of the Flux v2 monorepo code for this cluster is here: https://github.com/ksemele/fluxv2-test
