Our path to GitOps

Vadim Gedz, Lead DevOps Engineer, Dynatech

Hi all! Today I would like to talk about how the company where I work is moving to GitOps. But first, some context: I’m a Lead DevOps Engineer at Dynatech, the IT hub of the Dyninno group of companies. We develop software and are responsible for the IT infrastructure of all companies in the group. Dyninno is an international holding that operates in three business segments: air travel, finance, and entertainment (casting). We have created and maintain our own tool for finding the best flight options for clients and a social network for casting actors and other talent, and we recently launched a new payment solution for a neobank. All of these projects depend on technology that the Dynatech team is responsible for developing.

The development team has more than 300 people, so configuration management and continuous integration are pressing issues. In this article, I will describe how switching to GitOps eliminated configuration drift between our git repositories and infrastructure, note the advantages and disadvantages of ArgoCD, and explain how we solved the secrets management problem when using ArgoCD. I will also touch on the lack of feedback between the moment a Docker image is pushed to the Docker registry and the moment ArgoCD begins deployment, and tell how my colleagues and I wrote the Argo Watcher service in our free time to close that gap. This article may be of interest to those who are thinking about introducing GitOps into their infrastructure.

Description of the problem

When I first started at Dynatech, the company had an unusual way of deploying applications to Kubernetes clusters. A specific version of a Helm chart was taken and manifests were generated from it; these were then managed through Kustomize, which was launched from a huge Makefile. This setup somehow coped with its tasks, but it could hardly be called optimal or even minimally acceptable. With this approach, many things were done locally, without committing to the git repository. As a result, there were situations where new changes overwrote earlier ones that had never been committed.

I came to Dynatech from a company where Weave Flux was used in all Kubernetes clusters. This operator was also used here, but only a small share of the configuration was managed through it. It became obvious that in order to improve our processes, we needed to move to GitOps.

With the GitOps approach, we would be able to capture the desired state of the infrastructure and manage it from then on through the git repository. In this model, the cluster controller uses the repository as the source of truth and applies changes to the cluster. But before you start implementing GitOps, you need to answer two questions:

  • Which GitOps solution best suits our needs?

  • How can developers control what happens to their code when the CI/CD pipeline ends?

Tool selection

One of the most important considerations when choosing a GitOps solution is being able to understand what happens during the synchronization process. While the logs provided by other products are usually sufficient to understand what is going on, it is sometimes difficult to find the messages related to a problem with a particular resource, especially when charts contain dozens, if not hundreds, of resources. ArgoCD, by contrast, has a very user-friendly interface that displays the status of each resource, which reduces the time needed to identify a potential problem. This alone is a big plus compared to Weave Flux. Additional arguments were the ability to connect third-party plugins to extend functionality and the integration with SSO (single sign-on) for the web interface.

This is what the ArgoCD interface looks like


What is missing

We are slowly moving towards standardization across all Kubernetes environments. But since different teams have their own views on which technologies to use, the system must be able to accommodate deviations from the standard approach. For us, one of the critical requirements was secrets management. Since we had decided to move to GitOps eventually, creating secrets manually became unacceptable. Some clusters already used the SealedSecrets controller, but this still required direct access to the cluster and at least a basic understanding of how to work with it, which not every developer needs to have. And, alas, ArgoCD had no out-of-the-box solution for secrets management.

But ArgoCD allows you to connect third-party plugins to extend its functionality. One of them is a great plugin that lets you use secrets from HashiCorp Vault (https://github.com/argoproj-labs/argocd-vault-plugin). Thanks to this, developers and even non-IT professionals can edit passwords and other sensitive data through a convenient and simple web interface.
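To illustrate the idea: the manifests in git contain placeholders rather than secret values, and the real values are substituted from Vault when the manifests are rendered. The Python sketch below imitates that substitution with an in-memory dictionary standing in for Vault. It is not the plugin's code. The `<path:...#key>` placeholder form follows the plugin's inline-path syntax as I understand it, while `FAKE_VAULT`, `render_manifest`, and the secret path are hypothetical names used only for illustration.

```python
import re

# Hypothetical stand-in for Vault: secret path -> key/value pairs.
FAKE_VAULT = {
    "secret/data/payments": {"db_password": "s3cr3t"},
}

# Matches placeholders of the form <path:some/path#key>.
PLACEHOLDER = re.compile(r"<path:([^#>]+)#([^>]+)>")


def render_manifest(manifest: str, vault: dict) -> str:
    """Replace every placeholder with the value stored under path/key."""

    def lookup(match: re.Match) -> str:
        path, key = match.group(1), match.group(2)
        try:
            return vault[path][key]
        except KeyError:
            # Fail loudly instead of deploying a manifest with a raw placeholder.
            raise KeyError(f"no secret for {path}#{key}") from None

    return PLACEHOLDER.sub(lookup, manifest)


# What would be committed to git: no secret value, only a reference to it.
MANIFEST = """\
apiVersion: v1
kind: Secret
metadata:
  name: payments-db
stringData:
  password: <path:secret/data/payments#db_password>
"""

rendered = render_manifest(MANIFEST, FAKE_VAULT)
```

The point of the design is that the repository stays safe to share: rotating a password means editing it in Vault, not committing anything new.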

While we try to follow the Principle of Least Privilege, sometimes developers need to know what happens to their code in production. Using the ArgoCD interface together with properly configured RBAC policies, we give them limited access to each application, which is enough to understand the big picture of what is happening.

How does this fit in with CI/CD? How do images get deployed in a real environment? The following sequence works:

  • ArgoCD Image Updater detects a new image that matches the update strategy and commits to the appropriate git repository.

  • ArgoCD synchronizes changes across the cluster.
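The first step relies on an update strategy that decides which image tag counts as "new". A minimal sketch of one such strategy, picking the highest semantic version among the tags found in a registry, might look like this. It is only an illustration of the idea and not ArgoCD Image Updater's code; `pick_latest_semver` and the sample tags are hypothetical.

```python
import re
from typing import Iterable, Optional

# Accepts tags such as "1.4.2" or "v1.4.2"; anything else is ignored.
SEMVER = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def pick_latest_semver(tags: Iterable[str]) -> Optional[str]:
    """Return the tag with the highest (major, minor, patch), or None."""
    best_tag, best_version = None, None
    for tag in tags:
        match = SEMVER.match(tag)
        if match is None:
            continue  # skip "latest", branch names, commit hashes, etc.
        version = tuple(int(part) for part in match.groups())
        if best_version is None or version > best_version:
            best_tag, best_version = tag, version
    return best_tag


# Versions are compared numerically, so 1.10.0 outranks 1.9.0.
candidate = pick_latest_semver(["1.9.0", "latest", "1.10.0", "v1.2.3", "dev-abc123"])
```

Comparing versions as integer tuples rather than strings matters here: a plain string sort would rank "1.9.0" above "1.10.0".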

But the issue of process transparency remains: once the image is pushed to the Docker registry, there is no feedback. We did not find anything suitable for our needs, so after several discussions within the team, we decided to spend some of our free time writing our own tool, Argo Watcher: https://github.com/shini4i/argo-watcher.

Argo Watcher


Argo Watcher is the link between pipelines and ArgoCD.

  • The pipeline sends a task to Argo Watcher and requests an update.

  • Argo Watcher makes requests to the ArgoCD API and checks whether a particular application is running the expected version.

  • Deployment is considered successful if the application runs the expected version and is healthy and synced.

  • If, after the timeout, the application is not running the expected version or is out of sync, the deployment is considered to have failed.

In addition, Argo Watcher acts as a centralized dashboard showing deployments across all projects. This is also useful since not everyone has access to the git repositories.
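The check described in the list above can be sketched as a polling loop. This is not Argo Watcher's actual code, only an illustration under stated assumptions: `AppStatus`, `is_deployed`, `wait_for_rollout`, and the `fetch_status` callback are hypothetical, and the callback stands in for a real request to the ArgoCD API. The "Healthy" and "Synced" strings mirror the statuses ArgoCD reports.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AppStatus:
    images: List[str]  # images the application currently runs
    health: str        # e.g. "Healthy", "Progressing", "Degraded"
    sync: str          # e.g. "Synced", "OutOfSync"


def is_deployed(status: AppStatus, expected_image: str) -> bool:
    """True when the app runs the expected image and is healthy and synced."""
    return (
        expected_image in status.images
        and status.health == "Healthy"
        and status.sync == "Synced"
    )


def wait_for_rollout(
    fetch_status: Callable[[], AppStatus],
    expected_image: str,
    timeout: float = 300.0,
    interval: float = 5.0,
) -> bool:
    """Poll until the rollout succeeds; return False once the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if is_deployed(fetch_status(), expected_image):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A pipeline step could call `wait_for_rollout` right after pushing the image and fail the job when it returns `False`, which gives developers the feedback that was missing.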

Simplified diagram of the deployment process


How to control the tool that controls everything else

The infrastructure for ArgoCD is prepared and deployed as follows:

  1. We use Terraform to deploy everything that is in any way related to the infrastructure.

  2. The namespace and secrets for ArgoCD are created using Terraform.

  3. We also deploy ArgoCD itself via Terraform, using the Helm provider.

And everything else is done through commits to the git repository.

Conclusions

First, it has become much easier to track changes. Now everything is monitored centrally, and there is no need to dig through dozens of repositories, which saves a lot of time.

Second, and most importantly, this approach virtually eliminates the possibility of configuration drift (when the configuration in git or scripts differs from what is actually running). Every manual change (one not made through a git commit) is now automatically reverted by ArgoCD itself. This enforces discipline and prevents unpleasant surprises.
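This self-healing behaviour can be pictured as a reconciliation step: compare the state declared in git with the live state and overwrite whatever differs. The sketch below models both states as plain dictionaries. It is a simplified illustration of the principle, not how ArgoCD is implemented, and `find_drift` and `reconcile` are hypothetical names.

```python
from typing import Any, Dict


def find_drift(desired: Dict[str, Any], live: Dict[str, Any]) -> Dict[str, Any]:
    """Return the desired values for keys whose live value differs or is missing."""
    return {
        key: value
        for key, value in desired.items()
        if key not in live or live[key] != value
    }


def reconcile(desired: Dict[str, Any], live: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the live state with drifted keys reset to what git declares."""
    healed = dict(live)
    healed.update(find_drift(desired, live))
    return healed


desired_state = {"replicas": 3, "image": "app:1.1"}  # what git declares
live_state = {"replicas": 5, "image": "app:1.1"}     # someone scaled by hand

drift = find_drift(desired_state, live_state)        # only "replicas" differs
healed_state = reconcile(desired_state, live_state)  # manual change is undone
```

In this toy model the manual scale-up to five replicas is detected as drift and reset to the three replicas declared in git.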

In the next stages, we continued to expand GitOps to other areas: for example, we began to manage Terraform through Atlantis.

In general, the transition to GitOps has had a positive effect on how developers interact with the operations team. We got a more transparent and controllable system and reduced the number of errors. We will continue to develop and improve our infrastructure based on the principles of GitOps.

Vadim Gedz, Lead DevOps Engineer, Dynatech
https://github.com/shini4i
