Deploying a Kubernetes cluster using Kubernetes

As part of the DevOps Practices and Tools course, we have prepared a translation of a useful article for you.

We also invite you to an open webinar on the topic “Prometheus: quick start”. Together with an expert, participants will examine the Prometheus architecture and how it works with metrics, and will learn how to generate alerts and events in the system.


Wait… what? Yes, I’ve heard reactions like that to my suggestion of using Kubernetes to build Kubernetes clusters.

But when it comes to automating cloud infrastructure, nothing better comes to mind than Kubernetes itself. Using one central K8s cluster, we create and manage hundreds of other K8s clusters. In this article, I’ll show you how.

Note: SAP Concur uses AWS EKS, but the concepts discussed here apply to Google GKE, Azure AKS, and any other cloud provider that offers Kubernetes.

Production readiness

Deploying a Kubernetes cluster to any of the major cloud providers is very easy. You can create and run a cluster on AWS EKS with one command:

$ eksctl create cluster

However, creating a production-ready Kubernetes cluster requires more. Although everyone understands “production readiness” differently, SAP Concur uses the following four steps to create and deliver Kubernetes clusters.

The four build stages

  • Preflight tests. A set of baseline tests against the target AWS environment, verifying that all requirements are met before the cluster is actually created. For example: available IP addresses in each subnet, AWS exports, SSM parameters, and other variables.

  • EKS control plane and nodegroup. The actual build of the AWS EKS cluster with worker nodes attached.

  • Installing add-ons. This is what makes your cluster nicer 🙂 Add-ons such as Istio, logging integration, autoscaler, etc. The list is not exhaustive, and every add-on is optional.

  • Cluster validation. At this stage, we validate the cluster (EKS core components and add-ons) from a functional point of view before putting it into production. The more tests you write, the better you sleep. (Especially if you are the tech support person on duty!)

Glue the stages together

Each of these four stages uses different tools and techniques, which I will cover later. We were looking for a single tool to glue all the stages together: one that supports sequential and parallel execution, is event-driven, and, preferably, offers build visualization.

And we found Argo, specifically Argo Events and Argo Workflows. Both run on Kubernetes as CRDs and use declarative YAML, just like other Kubernetes deployments.

We found the ideal combination: imperative orchestration with declarative automation.

Production-ready K8s cluster built with Argo Workflows

Argo Workflows

Argo Workflows is an open-source, container-native workflow engine for orchestrating parallel jobs in Kubernetes, implemented as a Kubernetes CRD.

Note: If you are familiar with K8s YAML, I promise this will be easy.

Let’s see what the build stages above might look like in Argo Workflows.

1. Preflight tests

Preflight tests run in parallel, with retries on failure

To write the tests, we use the BATS framework. Writing a BATS test is very simple:

#!/usr/bin/env bats

@test "More than 100 available IP addresses in subnet MySubnet" {
  AvailableIpAddressCount=$(aws ec2 describe-subnets --subnet-ids MySubnet | jq -r '.Subnets[0].AvailableIpAddressCount')

  [ "${AvailableIpAddressCount}" -gt 100 ]
}

Running the BATS tests in parallel (the avail-ip-addresses.bats test above plus three more) in an Argo Workflow might look like this:

- name: preflight-tests
  templateRef:
    name: argo-templates
    template: generic-template
  arguments:
    parameters:
    - name: command
      value: "{{item}}"
  withItems:
  - "bats /tests/preflight/accnt-name-export.bats"
  - "bats /tests/preflight/avail-ip-addresses.bats"
  - "bats /tests/preflight/dhcp.bats"
  - "bats /tests/preflight/subnet-export.bats"
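For context, the argo-templates/generic-template referenced above is not shown in the article. A minimal sketch of what such a reusable command-runner template could look like (the image name and retry limit are assumptions):

```yaml
# Hypothetical sketch of a generic command-runner WorkflowTemplate.
# The worker image is assumed to contain bats, aws, jq, helm and kubectl.
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: argo-templates
spec:
  templates:
  - name: generic-template
    retryStrategy:
      limit: "2"                   # retry failed commands automatically
    inputs:
      parameters:
      - name: command
    container:
      image: worker-tools:latest   # assumed tooling image
      command: [sh, -c]
      args: ["{{inputs.parameters.command}}"]
```

With a template like this, every stage only needs to pass a different command parameter, which is exactly how the snippets in this article use it.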

2. EKS control plane and nodegroup

EKS control plane and nodegroup with dependencies

You can create an EKS cluster with various tools: eksctl, CloudFormation, or Terraform. A two-step EKS build with dependencies, using CloudFormation templates (eks-controlplane.yaml and eks-nodegroup.yaml), might look like this in an Argo Workflow:

- name: eks-controlplane
  dependencies: ["preflight-tests"]
  templateRef:
    name: argo-templates
    template: generic-template
  arguments:
    parameters:
    - name: command
      value: |
        aws cloudformation deploy \
        --stack-name {{workflow.parameters.CLUSTER_NAME}} \
        --template-file /eks-core/eks-controlplane.yaml \
        --capabilities CAPABILITY_IAM
- name: eks-nodegroup
  dependencies: ["eks-controlplane"]
  templateRef:
    name: argo-templates
    template: generic-template
  arguments:
    parameters:
    - name: command
      value: |
        aws cloudformation deploy \
        --stack-name {{workflow.parameters.CLUSTER_NAME}}-nodegroup \
        --template-file /eks-core/eks-nodegroup.yaml \
        --capabilities CAPABILITY_IAM
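For comparison, the same two steps could be described declaratively for eksctl in a single ClusterConfig file (the name, region, and sizes here are illustrative):

```yaml
# Illustrative eksctl equivalent: control plane plus one nodegroup,
# created with `eksctl create cluster -f cluster.yaml`.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-eks-cluster    # example name
  region: us-east-1       # example region
nodeGroups:
- name: workers
  instanceType: m5.large
  desiredCapacity: 3
```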

3. Installing add-ons

Installing add-ons with dependencies in parallel

You can install add-ons using kubectl, helm, kustomize, or a combination of them. For example, installing metrics-server with helm template and kubectl, when a metrics-server installation is requested, might look like this in Argo Workflows:

- name: metrics-server
  dependencies: ["eks-nodegroup"]
  templateRef:
    name: argo-templates
    template: generic-template
  when: "'{{workflow.parameters.METRICS-SERVER}}' != none"
  arguments:
    parameters:
    - name: command
      value: |
        helm template /addons/{{workflow.parameters.METRICS-SERVER}}/ \
        --name "metrics-server" \
        --namespace "kube-system" \
        --set global.registry={{workflow.parameters.CONTAINER_HUB}} | \
        kubectl apply -f -

4. Cluster validation

Concurrent cluster validation with error retries.

To test the functionality of add-ons, we use the excellent BATS library DETIK, which makes writing K8s tests easier:

#!/usr/bin/env bats

load "lib/utils"
load "lib/detik"

DETIK_CLIENT_NAME="kubectl"
DETIK_CLIENT_NAMESPACE="kube-system"

@test "verify the deployment metrics-server" {

  run verify "there are 2 pods named 'metrics-server'"
  [ "$status" -eq 0 ]

  run verify "there is 1 service named 'metrics-server'"
  [ "$status" -eq 0 ]

  run try "at most 5 times every 30s to find 2 pods named 'metrics-server' with 'status' being 'running'"
  [ "$status" -eq 0 ]

  run try "at most 5 times every 30s to get pods named 'metrics-server' and verify that 'status' is 'running'"
  [ "$status" -eq 0 ]
}

Running the BATS DETIK test file above (metrics-server.bats), when metrics-server is installed, might look like this in Argo Workflows:

- name: test-metrics-server
  dependencies: ["metrics-server"]
  templateRef:
    name: worker-containers
    template: addons-tests-template
  when: "'{{workflow.parameters.METRICS-SERVER}}' != none"
  arguments:
    parameters:
    - name: command
      value: |
        bats /addons/test/metrics-server.bats

Imagine how many tests you could plug in here: Sonobuoy conformance tests, Popeye (a Kubernetes cluster sanitizer), Fairwinds’ Polaris, and more. Wire them in with Argo Workflows!

If you’ve made it this far, you have a fully working, production-ready AWS EKS cluster with metrics-server installed, tested, and ready. Well done!

But don’t go just yet; I’ve saved the most interesting part for the end.

WorkflowTemplate

Argo Workflows supports templates (WorkflowTemplate), which let you create reusable workflows. Each of the four build stages is such a template. Essentially, we’ve created building blocks that can be combined as needed. Using a single “main” workflow, you can run all the build stages in order (as in the example above), or run each one independently. This flexibility is achieved with Argo Events.
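As a sketch of that flexibility (the template name below is an assumption, not from the article), a single stage can be run on its own by submitting a Workflow that points at the stage’s WorkflowTemplate:

```yaml
# Hypothetical example: run only the add-ons stage, reusing its
# WorkflowTemplate via workflowTemplateRef.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: addons-only-
spec:
  arguments:
    parameters:
    - name: CLUSTER_NAME
      value: my-eks-cluster
  workflowTemplateRef:
    name: addons-stage    # assumed WorkflowTemplate name
```

The “main” workflow works the same way, except it chains all four stage templates with dependencies instead of invoking just one.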

Argo Events

Argo Events is an event-driven workflow automation framework for Kubernetes that lets you launch K8s objects, Argo Workflows, serverless workloads, and more on events from a variety of sources: webhooks, S3, schedules, message queues, GCP Pub/Sub, SNS, SQS, etc.

The cluster build is triggered by an API call (via Argo Events) with a JSON payload. In addition, each of the four build stages (WorkflowTemplates) has its own API endpoint. Kubernetes maintainers can benefit greatly from this:

  • Not sure about the state of the cloud environment? Use the preflight-tests API

  • Looking to build a bare EKS cluster? Use the eks-core (control-plane and nodegroup) API

  • Want to install or reinstall add-ons on an existing EKS cluster? Use the add-ons API

  • Is something strange happening to the cluster and you need to run the tests quickly? Call the tests API
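A hedged sketch of that wiring (all names and the payload field are assumptions, not from the article): a webhook EventSource exposes an HTTP endpoint, and a Sensor submits the build workflow, mapping a field from the JSON body onto a workflow parameter.

```yaml
# Hypothetical Argo Events wiring for the build API endpoint.
apiVersion: argoproj.io/v1alpha1
kind: EventSource
metadata:
  name: eks-build
spec:
  webhook:
    build:
      port: "12000"
      endpoint: /build
      method: POST
---
apiVersion: argoproj.io/v1alpha1
kind: Sensor
metadata:
  name: eks-build
spec:
  dependencies:
  - name: build-request
    eventSourceName: eks-build
    eventName: build
  triggers:
  - template:
      name: submit-build
      argoWorkflow:
        operation: submit
        source:
          resource:
            apiVersion: argoproj.io/v1alpha1
            kind: Workflow
            metadata:
              generateName: eks-build-
            spec:
              arguments:
                parameters:
                - name: CLUSTER_NAME
                  value: placeholder    # overwritten from the payload
              workflowTemplateRef:
                name: eks-build-main    # assumed "main" WorkflowTemplate
        parameters:
        - src:
            dependencyName: build-request
            dataKey: body.cluster_name  # assumed JSON field
          dest: spec.arguments.parameters.0.value
```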

Argo features

Both Argo Events and Argo Workflows come with a large set of features that you don’t need to implement yourself.

Here are seven of them that are most important to us:

  • Parallelism

  • Dependencies

  • Retries – note the red (failed) preflight and validation tests in the screenshots above: Argo retried them automatically until they completed successfully.

  • Conditions

  • S3 support

  • Workflow Templates (WorkflowTemplate)

  • Events Sensor parameters
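Most of these are plain YAML fields. For example, retries and conditions might look like this inside a workflow (a hypothetical fragment, not taken from the article; the image name is an assumption):

```yaml
# Hypothetical fragment: automatic retries on a template, plus a
# conditional DAG task (parameter names match the earlier examples).
templates:
- name: validation-test
  retryStrategy:
    limit: "3"                   # Retries: re-run a failed test up to 3 times
    retryPolicy: OnFailure
  container:
    image: worker-tools:latest   # assumed test image
    command: [bats, /addons/test/metrics-server.bats]
- name: main
  dag:
    tasks:
    - name: test-metrics-server
      template: validation-test
      # Conditions: skipped when the add-on was not requested
      when: "'{{workflow.parameters.METRICS-SERVER}}' != none"
```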

Conclusion

We used a variety of tools that work together to define the desired state of the infrastructure, which gave the project flexibility and speed. We plan to apply Argo Events and Argo Workflows to other automation tasks as well. The possibilities are endless.


Learn more about the DevOps Practices and Tools course

Watch the open webinar “Prometheus: quick start”
