Cloud infrastructure to help product teams – how we do it at MKB

Hi, I am Alexander Podmoskovny, head of the Competence Center (BPM, SRS and SAS systems) at Moscow Credit Bank (MKB). Since 2020, I have been actively working on DevOps, and it remains my main focus today.

In this article, I will discuss how cloud infrastructure and a development platform give software developers tools to automate deployment and testing, which speeds up the release of new product versions. I will also cover how companies can improve the reliability and security of their IT systems through centralized management, monitoring, and process automation.

The main focus

Let's imagine a product team that knows no troubles: it writes its application code, keeps the customer happy, develops the product, and is not distracted by anything else. That is the ideal world. In harsh reality, the product team does almost everything itself, and it is very painful. Let's figure out why. When we tried to understand what distracts a team from developing its product, we identified two main categories.

The first category is everything directly related to the servers and hardware on which the application runs. The second category is the ecosystem we use to build the application, deliver it to production, and operate it in an enterprise environment.

Let's look at these two problems one by one, starting with the company's infrastructure. Take the traditional approach as an example: a product team needs a new environment.

What does the team usually do? It files an incredible number of requests: first to order a server, then to get access to that server, then to set up the environment. All of this drags on seemingly forever. But why am I telling you this? I think most of you file such requests in a service desk (SD) too. Some have already moved past this stage, and some have not yet done so completely.

The traditional approach is to order a new environment

As a result, the product team overcomes all the difficulties, spends a lot of resources, configures access, assembles the environment, and launches the application. Then it realizes that something was forgotten and starts setting up, for example, backups and monitoring. By this time the product is already losing relevance: the market has moved far ahead, the customer is very unhappy, and in general we are in a rather unpleasant situation.

We would like to spend the minimum amount of time on creating the infrastructure that we need.

Traditional vs. Platform Approaches

That is, ideally, all automation and resources would be prepared in advance, so that the server we receive is already configured and all we have to do is roll out our application, which then simply runs. At first glance this seems like fantasy, but in fact it is quite achievable.

What is the application itself?

Ecosystem of application operation in the Enterprise environment

An application in an enterprise environment is not only the code and architecture of the application itself, but also the services around it: CI/CD with its pipeline, logging, and monitoring. There are also shared services such as billing, security, authentication, and authorization. In the traditional approach, the team tries to build most of these services and processes itself, perhaps reinventing the wheel once again, instead of doing its main job: product development.

And if there are, say, 50 such teams, each of them tries to create its own services and its own processes. The result is a kind of zoo that is nearly impossible to manage, scale, and develop, and that blocks any systematic initiatives across the IT landscape. I have already partly explained why this is bad.

The most important thing is that the team is distracted from their product, and the rest is business as usual.

This creates additional cognitive load and overlapping areas of responsibility, and it makes any standards difficult to introduce and evolve. If you look at this picture through the eyes of the infrastructure engineers who manage the entire fleet of virtual machines, it is no less sad. First of all, we end up with a lot of snowflake servers in their worst form. If you know what snowflake servers are, leave a comment about your experience or just sympathize.

Snowflake servers are a serious problem and a huge pain, because their state cannot simply be reproduced. Each of them is unique in its own way. They are hard to scale, fault-tolerant solutions are very difficult to design for them, and maintaining and controlling all of this is painful. This is the first problem, and it is quite serious and obvious.
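To make the snowflake problem more concrete, here is a minimal sketch of comparing a server's actual state with a declared baseline. The `ServerState` fields and the `find_drift` helper are hypothetical names invented for this illustration, not part of any real tool; a real inventory would track far more than three attributes.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ServerState:
    """Hypothetical snapshot of the settings we care about on one server."""
    os_version: str
    packages: frozenset[str] = field(default_factory=frozenset)
    open_ports: frozenset[int] = field(default_factory=frozenset)


def find_drift(baseline: ServerState, actual: ServerState) -> dict[str, str]:
    """Describe every way the actual server deviates from the baseline."""
    drift: dict[str, str] = {}
    if actual.os_version != baseline.os_version:
        drift["os_version"] = f"{baseline.os_version} -> {actual.os_version}"
    extra = actual.packages - baseline.packages
    missing = baseline.packages - actual.packages
    if extra:
        drift["extra_packages"] = ", ".join(sorted(extra))
    if missing:
        drift["missing_packages"] = ", ".join(sorted(missing))
    unexpected_ports = actual.open_ports - baseline.open_ports
    if unexpected_ports:
        drift["unexpected_ports"] = ", ".join(str(p) for p in sorted(unexpected_ports))
    return drift


# Illustrative data: a server that was "fixed by hand" at some point.
baseline = ServerState("22.04", frozenset({"nginx"}), frozenset({443}))
snowflake = ServerState("22.04", frozenset({"nginx", "htop"}), frozenset({443, 8080}))
print(find_drift(baseline, snowflake))
```

An empty result means the server can be rebuilt from the baseline without surprises. Every non-empty entry is a piece of state that lives only on that one machine.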

A bit about the cloud

Let's take a short, quick detour into the clouds and speak some “cloud language”. What do clouds offer that the traditional approach does not? Clouds essentially provide services under the IaaS, PaaS, and SaaS models.

Cloud services - IaaS, PaaS, SaaS

Quite a lot of services are available now, because this topic has been popular for a long time and is far from new. These services use a fundamentally different approach to interacting with infrastructure: Infrastructure as Code (IaC). Any request to create, change, or delete virtual machines, networks, file and object storage, managed k8s clusters, or databases is described in exactly the same way as our application code. We store this code in a version control system, be it git or something similar, and run it through the same processes: code review and merge requests.
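The core idea of IaC can be sketched in a few lines: the desired infrastructure is plain data kept in version control, and a tool compares it with what actually exists. The sketch below is illustrative only. The `VirtualMachine` type, the `DESIRED` list, and the `plan` function are hypothetical names, and real tools handle many more resource types and attributes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualMachine:
    """Hypothetical description of one VM; field names are illustrative."""
    name: str
    cpu: int
    ram_gb: int


# Desired state: this is the part that lives in git and goes through code review.
DESIRED = [
    VirtualMachine("app-1", cpu=4, ram_gb=8),
    VirtualMachine("app-2", cpu=4, ram_gb=8),
]


def plan(desired: list[VirtualMachine], actual: list[VirtualMachine]) -> dict[str, list[str]]:
    """Compute which VMs must be created, updated, or deleted to reach the desired state."""
    desired_by_name = {vm.name: vm for vm in desired}
    actual_by_name = {vm.name: vm for vm in actual}
    create = sorted(desired_by_name.keys() - actual_by_name.keys())
    delete = sorted(actual_by_name.keys() - desired_by_name.keys())
    update = sorted(
        name
        for name in desired_by_name.keys() & actual_by_name.keys()
        if desired_by_name[name] != actual_by_name[name]
    )
    return {"create": create, "update": update, "delete": delete}


# Illustrative "current" infrastructure: one undersized VM and one leftover.
actual = [VirtualMachine("app-1", cpu=2, ram_gb=4), VirtualMachine("legacy", cpu=1, ram_gb=1)]
print(plan(DESIRED, actual))
```

Because the desired state is just code, a change to the infrastructure is a diff in a merge request, and the computed plan can be reviewed before anything is touched.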

IaC approach to interaction with infrastructure

We build a pipeline around this code and deliver it directly to the cloud infrastructure via providers. This gives us an entirely new process, fundamentally different from the requests and servers I mentioned earlier. The advantages are clear: the pains described above, from snowflake servers to the lack of a systematic approach, are addressed by IaC.

That is, we increase speed and reduce costs. We get scalability, fault tolerance, repeatability, and the ability to recover from incidents several times faster, thanks to a unified infrastructure. But everything has its price, and the main limitation is that the entry threshold for this model is quite high. The workload of engineers, their level of responsibility, and the knowledge required all set a much higher bar. This is probably one of the main blockers when deciding how to start and how to rethink the approach to infrastructure.

And if we are talking about cloud providers, then to ensure quality they often offer services in versions that are not the very latest. For example, when a newer version of k8s is released, it appears on the platform only after some time.

What is the cloud? On the one hand, it seems like some kind of magical entity: you write code, hand it over, and get infrastructure elements as output.
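As a rough illustration of delivery "via providers", the sketch below shows a pipeline step that validates a desired state and applies it through a provider interface. Everything here is an assumption made for the example: the `Provider` protocol, the `InMemoryProvider` stand-in, and the `apply` function are hypothetical and do not correspond to any specific cloud SDK.

```python
from typing import Protocol


class Provider(Protocol):
    """Hypothetical interface that a cloud provider plugin would implement."""

    def list_resources(self) -> dict[str, dict]: ...
    def create(self, name: str, spec: dict) -> None: ...
    def update(self, name: str, spec: dict) -> None: ...
    def delete(self, name: str) -> None: ...


class InMemoryProvider:
    """Stand-in for a real cloud API so the sketch runs without credentials."""

    def __init__(self) -> None:
        self.resources: dict[str, dict] = {}

    def list_resources(self) -> dict[str, dict]:
        return dict(self.resources)

    def create(self, name: str, spec: dict) -> None:
        self.resources[name] = dict(spec)

    def update(self, name: str, spec: dict) -> None:
        self.resources[name] = dict(spec)

    def delete(self, name: str) -> None:
        del self.resources[name]


def validate(desired: dict[str, dict]) -> None:
    """Pipeline stage 1: reject obviously broken definitions before touching the cloud."""
    for name, spec in desired.items():
        if spec.get("cpu", 0) <= 0:
            raise ValueError(f"{name}: cpu must be positive")


def apply(desired: dict[str, dict], provider: Provider) -> list[str]:
    """Pipeline stage 2: bring the provider's resources in line with the desired state."""
    validate(desired)
    actual = provider.list_resources()
    log: list[str] = []
    for name, spec in desired.items():
        if name not in actual:
            provider.create(name, spec)
            log.append(f"create {name}")
        elif actual[name] != spec:
            provider.update(name, spec)
            log.append(f"update {name}")
    for name in actual:
        if name not in desired:
            provider.delete(name)
            log.append(f"delete {name}")
    return log


cloud = InMemoryProvider()
desired = {"app-1": {"cpu": 4, "ram_gb": 8}}
print(apply(desired, cloud))  # first run creates the resource
print(apply(desired, cloud))  # second run finds nothing to change
```

The second call returning an empty log is the repeatability mentioned above: running the same code again does not change an environment that already matches it.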

Take OpenStack, for example: it is one of the most common open-source platforms for turning almost any hardware into cloud infrastructure.

OpenStack – conceptual architecture

If we look at the architecture in a little more detail, we see groups of services. Some are responsible for compute, others for data storage and network resources. Above them sit the services that assemble these elements into PaaS offerings, such as ready-made managed Kubernetes clusters or databases. After looking at this picture long enough, we understand that there is no magic: everything is explainable and clear. With that understanding, we begin to think differently about how to apply the new tools that cloud infrastructure gives us. So we have more or less sorted out the first problem.

IaC and cloud tools are interesting and modern. Yes, they are complicated, but the advantages are obvious. So who should we give these tools to, and who will use them? If we give them to the product team, it will be distracted again, and we will not achieve our goal.

Hand over the tools

Still, the tools have to be handed to someone. Here we return to the ecosystem and arrive at the definition of an infrastructure platform. The definition is fairly complex, but its meaning is this: take all the tools and processes associated with the ecosystem, add cloud infrastructure tools, and build an infrastructure platform. Let's call it a DevOps platform, so as not to get confused by the word “infrastructure” later on. Take the DevOps platform and a team of DevOps engineers, and use them to create a bridge between the product teams and the infrastructure that provides computing power. Inside this bridge, build all the processes needed to deliver code to its final execution environment, using cloud infrastructure tools among other things. It may sound complicated.

An infrastructure platform is a set of services, systems, people, practices and agreements aimed at supporting the stages of development and operation of digital products.
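One way to picture the platform as a bridge is a self-service request: the product team states only what it needs, and the platform fills in everything else. The sketch below is a hedged illustration, not a description of our implementation. `EnvironmentRequest`, `SIZES`, `PLATFORM_SERVICES`, and `render_environment` are hypothetical names, and the sizing values are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvironmentRequest:
    """What a product team fills in. All field names are illustrative."""
    team: str
    app: str
    size: str = "small"


# Hypothetical sizing catalogue maintained by the platform team.
SIZES = {
    "small": {"cpu": 2, "ram_gb": 4},
    "medium": {"cpu": 4, "ram_gb": 8},
}

# Services every environment receives by default, so no team has to remember them.
PLATFORM_SERVICES = ("monitoring", "logging", "backups")


def render_environment(request: EnvironmentRequest) -> dict:
    """Expand a short team request into a full environment definition."""
    if request.size not in SIZES:
        raise ValueError(f"unknown size {request.size!r}, expected one of {sorted(SIZES)}")
    return {
        "name": f"{request.team}-{request.app}",
        "compute": dict(SIZES[request.size]),
        "services": list(PLATFORM_SERVICES),
    }


env = render_environment(EnvironmentRequest(team="payments", app="api"))
print(env)
```

The design choice here is that monitoring, logging, and backups are not options a team can forget to tick. They come with every environment, which keeps those concerns on the platform side of the bridge.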

Approximate composition of tools:

Instrumental composition of the platform

At MKB, everything is about the same, and I am not claiming that any of this is rocket science. All of it is very familiar to many people. But understanding is not enough: actually implementing it is much harder.

I want to emphasize once more the concept of IaC, that is, storing infrastructure as code. The first step is the most difficult, and many different issues need to be resolved, starting with tooling. You can dive into all aspects in detail by reading this excellent article.

We are currently piloting everything related to deploying infrastructure from code. But modern tools alone will not shield the product team from side concerns and let it focus on the product code. It is just as important to organize the work of all departments properly, divide responsibility, and rethink approaches.

And this is probably the most difficult part: organizing the infrastructure and its code, the operation of applications, and the work and interaction of teams.

To sum it up

Cloud infrastructure:

Advantages: automation, scalability, and a systematic approach.

Limitations: a high entry threshold and difficulty in mastering configuration and maintenance. But once you have mastered this, you can take full advantage of the benefits.

Infrastructure platform:

It is the basis, bridge, and foundation for building the future IT landscape. It also becomes our competitive advantage and our internal product, which should definitely be developed.

Yes, we will have to rethink traditional approaches to development and to building the landscape. But movement is life, I believe, and we must constantly strive to develop.

I hope you found my article useful. I'll be happy to answer your questions in the comments!
