Secure parallel development with Istio

At some point the idea came up in the office that we needed to think about how to parallelize work on a single microservice so that teams would not step on each other. Several teams work on some of the same APIs at once. Each team develops its own feature locally and writes tests, but deploying to the test stand turns into chaos, because the changes have to be merged into one common branch à la develop and sent to testing. Merge conflicts can arise, or properties that are incompatible between branches can change.

The mobile bank now comprises 450+ microservices, with more than 90 teams working on them. Since the project has no code ownership, each team makes changes to whatever microservices it needs. To avoid the kinds of difficulties that increase time to market, we had to separate the development of individual teams so that they did not affect each other and could work in parallel.


Problem

Our test stand is quite large, and we didn't want to stand up a second, pristine one next to it: that is another line item in the budget, and it needs ongoing support, which also takes a lot of effort. Due to constant changes, the test stand periodically degraded, which directly affects business processes.

At the same time, we live in K8s with Istio. In the article “The practical magic of Istio when building the architecture of large microservice systems,” colleagues have already written about Istio and why it was chosen. You can read it to understand what kind of beast it is.

The main thing is that Istio has a powerful request-routing mechanism that can help make parallel development and testing safer.

How it was done

We considered several options: a queue, a separate release cluster, or a cluster (namespace) per team. But with this many teams, that would require a huge amount of resources and would also increase the complexity of deployment and support.

We settled on one solution, which we called Feature Branches. We had already partially used it for the frontend teams developing the web version; it had to be refined and scaled for Android and iOS development, as well as for the backend.

Before we get into the technical details, a little terminology:

  • Feature instance: a service instance with routing rules configured for it.

  • Feature name: the attribute by which requests are routed to a feature instance.

  • Feature branch: a set of feature instances united by one feature name.

  • Master branch: the default branch; the set of release versions.

The implementation of the solution may vary, but we use Kubernetes and Istio. Istio has dedicated resources responsible for routing: VirtualService (VS) and DestinationRule (DR). Each feature has its own DR, and the VS routes traffic between them. For example, the DR for demo-api in the master branch looks like this:

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
...
spec:
  host: demo-api.default.svc.cluster.local
  subsets:
    - labels:
        app.kubernetes.io/instance: demo-api
      name: default

This is the default subset, which leads to the deployed master branch.

And here is the DR for a feature:

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
...
spec:
  host: demo-api.default.svc.cluster.local
  subsets:
    - labels:
        app.kubernetes.io/instance: demo-api.feature-31337
      name: feature-31337

The feature name is used as the subset name and is also appended to the instance label after a dot. VS and DR work in tandem. For our service, the VS looks like this:

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
...
spec:
  hosts:
    - demo-api.default.svc.cluster.local
  http:
    - match:
        - headers:
            X-FEATURE-NAME:
              exact: feature-31337
      route:
        - destination:
            host: demo-api.default.svc.cluster.local
            subset: feature-31337
    - route:
        - destination:
            host: demo-api.default.svc.cluster.local
            subset: default

The picture is as follows: to decide where to route a request, Istio first looks at the VS and then, based on the matching rules, selects the appropriate subset, which is defined in the DR and leads to the desired service.

And now we come to the basis on which HTTP requests are routed: the presence of the X-FEATURE-NAME HTTP header. Since the match section of the VS checks that this header is present in the request and that its value is an exact match for feature-31337, such a request will land on the pod with the label app.kubernetes.io/instance: demo-api.feature-31337.
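The match-then-fallback behavior above can be modeled as a small pure function (a simplified illustration of the VS semantics, not Istio's actual implementation; the header and subset names follow the example above):

```python
def choose_subset(headers: dict) -> str:
    """Simplified model of the VirtualService above: match rules are
    checked in order, the first one that fires selects the subset,
    otherwise the request falls through to the default route."""
    # match: exact comparison of the X-FEATURE-NAME header value
    if headers.get("X-FEATURE-NAME") == "feature-31337":
        return "feature-31337"
    # route with no match section: the default subset
    return "default"
```

A request carrying X-FEATURE-NAME: feature-31337 ends up on the feature instance; anything else lands on the master branch.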

Services can be combined into a chain if they are deployed with the same feature name. If a service in the call chain has no instance for the given feature, the request goes to its default subset, because none of the match sections in its VS will fire.
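Note that, as with tracing headers, keeping a whole chain on one feature branch assumes each service forwards the incoming X-FEATURE-NAME header with its outbound calls. A minimal sketch of such propagation (a hypothetical helper, not our platform's actual code):

```python
FEATURE_HEADER = "X-FEATURE-NAME"

def outgoing_headers(incoming: dict) -> dict:
    """Build headers for a downstream call, copying the feature header
    (if present) so Istio keeps routing the chain to the same branch."""
    headers = {"Accept": "application/json"}
    if FEATURE_HEADER in incoming:
        headers[FEATURE_HEADER] = incoming[FEATURE_HEADER]
    return headers
```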


This Istio configuration exists not only for every backend service but also for every micro-frontend. The fronts do not use the X-FEATURE-NAME header directly; instead, they can deploy features under a dedicated URL, for example feature-31337.demo.net.

Such a URL is convenient to share with testers or designers to show your current work. In this case the request arrives at a service running Spring Cloud Gateway. The GW parses the URL, extracts the feature name, puts it into the X-FEATURE-NAME header, and forwards the request to the desired micro-frontend:

spring:
  cloud:
    gateway:
      routes:
        - id: ignored
          uri: http://demo-ui
          order: 1
          predicates:
            - Host={branch}.demo.net
            - Path=/demo-ui/**
          filters:
            - SetRequestHeader=X-FEATURE-NAME, {branch}
            - RewritePath=/demo-ui/(?<segment>.*), /$\{segment}

This is how the branch feature is structured and works.

The branch deployment process is also automated. The bank uses an in-house CI/CD platform that, among other things, deploys artifacts to the test environment. The platform is integrated with Bitbucket via hooks, so it knows when a push to a repository has occurred.

After receiving a push event, the platform scans the commit message for keywords, and if it finds (deploy_feature), it starts the pipeline that builds the artifact and deploys it to the test environment in K8s. The developer does not need to monitor the build and deployment status: a specially trained bot notifies him in the messenger about a successful or failed build.
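The trigger can be sketched roughly as follows (a hypothetical simplification of the platform's hook handler; the branch naming convention here is an assumption for illustration):

```python
import re
from typing import Optional

DEPLOY_KEYWORD = "(deploy_feature)"

def should_deploy_feature(commit_message: str) -> bool:
    """The platform scans the commit message and triggers the
    build-and-deploy flow when the keyword is present."""
    return DEPLOY_KEYWORD in commit_message

def feature_name_from_branch(branch: str) -> Optional[str]:
    """Derive the feature name from a branch like 'feature/31337-demo'
    (the naming scheme is assumed, not the platform's real one)."""
    m = re.match(r"feature/(\d+)", branch)
    return f"feature-{m.group(1)}" if m else None
```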

Let's briefly summarize what we got:

No. 1. The feature-branch implementation is based on the routing rules provided by Istio.

No. 2. For a feature instance to work, you need to:

  • Create a DestinationRule that contains a subset for selecting a service by label.

  • Create a VirtualService that, based on the HTTP header, will redirect the request to the desired subset.

No. 3. To access a feature instance, you need:

  • On the backend, pass the X-FEATURE-NAME header in requests.

  • On the front, use a dedicated URL like feature-31337.demo.net, provided you configure Spring Cloud Gateway in advance as shown above.

Conclusion

After introducing feature branches, there is no longer any need to merge everything into a common Git branch and deploy it to the master branch, which significantly improved both the convenience of development and the stability of the test environment.

Direct deployment to the master branch is prohibited; it is updated only when an artifact is rolled out to production, since deployment and testing in the test environment is a mandatory step.

The teams quickly adapted to the new workflow and began producing hundreds of feature instances. To avoid overwhelming the cluster, separate nodes were allocated for them, and a job was written that cleans up stale feature branches. It is also deployed to the Kubernetes cluster and runs via cron.
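The core of such a cleanup job can be sketched as a filter over deployed instances (hypothetical: the age threshold and data shape are assumptions, and the real job's criteria may differ):

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(days=14)  # assumed staleness threshold

def stale_feature_instances(deployments: list, now: datetime) -> list:
    """Select feature instances (label contains '.feature-') not updated
    within MAX_AGE; master-branch instances, whose labels carry no
    feature suffix, are never touched."""
    return [
        d for d in deployments
        if ".feature-" in d["instance"] and now - d["updated"] > MAX_AGE
    ]

deployments = [
    {"instance": "demo-api", "updated": datetime(2024, 1, 1)},
    {"instance": "demo-api.feature-31337", "updated": datetime(2024, 1, 1)},
    {"instance": "demo-api.feature-42", "updated": datetime(2024, 3, 1)},
]
now = datetime(2024, 3, 10)
# only demo-api.feature-31337 qualifies for cleanup
```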

A few numbers. As I said above, 90+ teams are now working on the project, and the total number of feature branches exceeds 300. We also ran a survey among developers about the convenience of working with feature branches. One of the questions asked for a subjective rating on a five-point scale: the average was 4.3, which is very good. More than half of the engineers took part in the survey.

After feature branches were introduced, the test environment became much more stable. We always have release versions of artifacts that are tested and work reliably. A team can now safely deploy a snapshot version of an artifact without fear of breaking related services: if an error occurs, it occurs only in that snapshot version, which makes it easy to localize.

Feature branches also allow you to debug a service remotely, again because adjacent APIs are not affected.

There is a fly in the ointment. With the introduction of feature branches, the load on Jenkins increased, since it now builds more snapshot artifacts. This approach also does not fully solve the problem of external systems, and development complexity has grown slightly.

At the moment, the feature-branch mechanism is implemented only for APIs; databases, queues, and caches are not yet covered, and there are open problems there. We will think about how to do this elegantly while keeping it convenient to use.
