K8S Multicluster Journey

Hello, Habr!

We are the Exness platform team. Our colleagues have already written an article about production-ready images for k8s; today we want to share our experience of migrating services to Kubernetes.


To set the stage, here are some numbers to give you a sense of what we are dealing with:

  • Our development department counts 100+ people across more than 10 teams, each with its own QA, DevOps and Scrum processes. The development stack is Python, PHP, C++, Java and Golang.
  • The test and production environments hold about 2,000 containers each, running under Rancher v1.6 on our own virtualization and on VMware.

Motivation

As they say, nothing lasts forever, and Rancher announced the end of support for version 1.6 quite a while ago. Yes, over more than three years we learned how to cook it and work around the problems that came up, but we increasingly ran into issues that would never be fixed. Rancher 1.6 also has an ossified permissions system, where you can do either almost everything or almost nothing.

Our own virtualization, while giving us greater control over data storage and security, imposed operational costs that were hard to bear given the constant growth of the company, the number of projects and the requirements placed on them.

We wanted to follow IaC practices and, when needed, obtain capacity quickly, in any geographical location and without vendor lock-in, and be able to give it up just as quickly.

First steps

First of all, we wanted to rely on modern technologies and solutions that would give teams a faster development cycle and minimize the operational cost of interacting with the platform that provides the capacity.

Of course, the first thing that came to mind was Kubernetes, but we did not rush and did a little research to make sure it was the right choice. We evaluated only open-source solutions, and Kubernetes won this unfair battle unconditionally.

Next came the question of choosing a tool for creating clusters. We compared the most popular solutions: kops, kubespray, kubeadm.

At first glance, kubeadm seemed too complicated a way to start, more of an invitation to reinvent the wheel, while kops lacked flexibility.

And so kubespray came out the winner.

We started experimenting on our own virtualization and on AWS, trying to recreate a rough likeness of our previous resource-management pattern, where everyone shares the same “cluster”. Soon we had our first cluster of 10 small virtual machines, a couple of them in AWS. We began migrating teams there, everything seemed “fine”, and the story could have ended here, but …

First problems

Kubespray is built on Ansible, and Ansible is not a tool that really lets you follow IaC: when adding or removing nodes, something regularly went wrong and manual intervention was required, and on different OSes the playbook behaved differently. As the number of teams and nodes in the cluster grew, we noticed the playbook took longer and longer to run; in the end our record was 3.5 hours. What’s yours? 🙂

It seems like kubespray is just Ansible and everything should be clear at first glance, but that was not the case.

At the beginning of the journey, the task was to run capacity only in AWS and on our own virtualization, but then, as often happens, the requirements changed.

In light of this, it became clear that our old pattern of pooling resources into a single orchestration system no longer fit – not when clusters are geographically far apart and managed by different providers.

Furthermore, when all teams work within a single cluster, services with an incorrectly set nodeSelector could land on an “alien” node belonging to another team and eat its resources, and when we used taints there was a constant stream of requests about this or that service not running or not being scheduled correctly, all down to human error. Another problem was cost accounting, especially given how hard it was to control the distribution of services across nodes.

A separate story was granting permissions to employees: each team wanted to be “at the helm” of the cluster and manage it fully, which could lead to complete collapse, since the teams are mostly independent of each other.

What to do

Given the above and the teams’ wish to be more independent, we came to a simple conclusion: one team – one cluster.

So we got a second cluster, and then a third.

Then we started to think: what if, in a year, our teams have more than one cluster each? In different geographical regions, say, or managed by different providers? And some of them will want to be able to quickly spin up a temporary cluster for some tests.

We would be in for total Kubernetes! Some kind of MultiKubernetes, it turns out.

At the same time, we would somehow need to maintain all of these clusters, easily control access to them, and create new ones and decommission old ones without manual intervention.

Some time had passed since the beginning of our journey into the world of Kubernetes, and we decided to re-examine the available solutions. It turned out that what we needed already existed on the market – Rancher 2.2.

At the first stage of our research, Rancher Labs had already made the first release of version 2, and although it could be brought up very quickly by running a single container with a couple of parameters and no external dependencies, or by using the official Helm chart, it seemed raw to us, and we did not know whether we could rely on it – whether it would be developed further or quickly abandoned. The “cluster = clicks in the UI” paradigm also did not suit us, and we did not want to tie ourselves to RKE, since it is a rather narrowly focused tool.

Rancher 2.2 already looked far more workable and, on top of the earlier capabilities, offered a bunch of interesting features out of the box: integration with many external providers, a single point for distributing permissions and kubeconfig files, launching a kubectl image with your permissions right in the UI, and nested namespaces aka projects.

A community had already formed around Rancher 2, and a HashiCorp Terraform provider had been created to manage it, which helped us put everything together.
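
For illustration, here is a minimal sketch of what managing a cluster through that provider can look like; the variable names, cluster name and values are our placeholders, not taken from our actual setup:

    # Sketch: registering an "imported" cluster in Rancher via the rancher2
    # Terraform provider. All values are illustrative placeholders.
    provider "rancher2" {
      api_url   = var.rancher_api_url   # URL of the small cluster that runs Rancher
      token_key = var.rancher_api_token # API token issued in Rancher
    }

    # Creates the cluster object in Rancher; the registration command it
    # produces is then applied on the target cluster to attach it.
    resource "rancher2_cluster" "team_a_dev" {
      name        = "team-a-dev"
      description = "Team A development cluster"
    }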

What happened

As a result, we ended up with one small cluster that runs Rancher and is reachable from all the other clusters, plus many clusters attached to it. Access to any of them can be granted as simply as adding a user to an LDAP directory, regardless of where the cluster is located or which provider’s resources it uses.
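
Continuing the sketch above, granting a team access can also be expressed in Terraform, for example by binding an LDAP group to a cluster role; the resource arguments and role name here are illustrative:

    # Sketch: give everyone in a team's LDAP/AD group member access to their cluster.
    resource "rancher2_cluster_role_template_binding" "team_a_members" {
      name               = "team-a-members"
      cluster_id         = rancher2_cluster.team_a_dev.id
      role_template_id   = "cluster-member"           # built-in Rancher role
      group_principal_id = var.team_a_group_principal # LDAP/AD group principal
    }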

Using GitLab CI and Terraform, we built a system that lets us create a cluster of any configuration with a cloud provider or on our own infrastructure and connect it to Rancher. All of this is done in IaC style: each cluster is described by its own repository, and its state is versioned. Most modules are pulled in from external repositories, so all that remains is to pass variables or describe a custom configuration for the instances, which keeps code duplication down.
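
A per-cluster repository in such a setup can contain little more than module calls and variables; the sketch below is only a guess at the shape, with repository URLs, module paths and versions as placeholders:

    terraform {
      # e.g. state kept in a remote backend (such as GitLab's generic HTTP
      # backend) so that each cluster's state is versioned alongside its repo.
      backend "http" {}
    }

    # Shared modules are pulled from an external repository; only variables
    # and instance specifics are described per cluster.
    module "nodes" {
      source        = "git::https://gitlab.example.com/platform/tf-modules.git//aws-nodes?ref=v1.4.0"
      cluster_name  = "team-a-dev"
      instance_type = "t3.large"
      node_count    = 5
    }

    module "rancher_import" {
      source       = "git::https://gitlab.example.com/platform/tf-modules.git//rancher-import?ref=v1.4.0"
      cluster_name = "team-a-dev"
      rancher_url  = var.rancher_api_url
    }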

Of course, our journey is far from over, and there are still many interesting tasks ahead: a single entry point for the logs and metrics of all clusters, service mesh, GitOps for managing workloads across multiple clusters, and much more. We hope you find our experience interesting!

The article was written by A. Antipov and A. Ganush, Platform Engineers.
