A Guide to Kubernetes for Kubernetes Haters

Among a certain faction of programmers, Kubernetes has a bad reputation as an overly complex time sink and a technology that startups should avoid. In those circles, using Kubernetes on a small team is taken as a clear sign of engineering overcomplication.

I have contributed to the snark on this topic myself.

Of course, I can sometimes grumble about Kubernetes, but in fairness, it is a technological masterpiece. I highly recommend that all my competitors use it.
— Paul Butler (@paulgb)
September 9, 2022

Despite the mockery, I sincerely do consider Kubernetes a “technological masterpiece.” Back in September 2022, I also wrote about the extent to which the complexity of Kubernetes is justified by the problems it solves.

At Jamsocket, we've been running Kubernetes in production for several years now, and I've become quite comfortable working with it. Within the company, we've managed to cultivate a calm attitude toward Kubernetes. The most important thing we did was to isolate a small subset of Kubernetes features and learn to ignore the rest.

This post is based on our in-house guide to using Kubernetes, so don't take it as gospel for every startup. Still, I believe it makes a good starting point for navigators who want to run aground less often in the vast seas of Kubernetes.

Why do we need Kubernetes at all?

In my opinion, Kubernetes is the way to go if you want to achieve all three of the following things at once:

  1. Run multiple processes/servers/scheduled tasks simultaneously.
  2. Run them redundantly, balancing the load between them.
  3. Configure them in code and express the relationships between them in code.

At its core, Kubernetes is simply an abstraction layer that makes it convenient to treat an entire pool of computers as one big (headless) computer. If that's what you need in practice, and you can avoid the other components, Kubernetes will take you far.

I've heard it argued that #2 above is overkill, and that startups shouldn't aim for zero-downtime or highly available deployments. But we often need to deploy code several times a day, and when the product breaks, it's our users who suffer the most; even a minute of downtime will be noticed by someone. With rolling deployments, we're confident we can deploy without ceremony and as often as we need.
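As a sketch of what this means concretely (the names, image, and probe here are hypothetical, not our actual configuration), a zero-downtime rolling update can be expressed with the Pulumi Kubernetes SDK in TypeScript:

```typescript
import * as k8s from "@pulumi/kubernetes";

const labels = { app: "api" };

// maxUnavailable: 0 means an old pod is only torn down once its
// replacement passes the readiness probe, so serving capacity
// never drops below the declared replica count during a deploy.
new k8s.apps.v1.Deployment("api", {
    spec: {
        replicas: 3,
        selector: { matchLabels: labels },
        strategy: {
            type: "RollingUpdate",
            rollingUpdate: { maxUnavailable: 0, maxSurge: 1 },
        },
        template: {
            metadata: { labels },
            spec: {
                containers: [{
                    name: "api",
                    image: "registry.example.com/api:v1.2.3", // hypothetical image
                    readinessProbe: {
                        httpGet: { path: "/healthz", port: 8080 },
                    },
                }],
            },
        },
    },
});
```

The readiness probe is what makes the strategy meaningful: without it, Kubernetes considers a pod "ready" as soon as its containers start, defeating the point of `maxUnavailable: 0`.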

How we use Kubernetes

For context: Jamsocket is a service for dynamically spinning up processes that a web application can “talk to.” It's somewhat similar to AWS Lambda, but the process lifetime is tied to a WebSocket connection rather than to an individual request/response.

With Kubernetes, we run the long-lived processes needed to support this: the API server, container registry, controller, log collector, some DNS services, metrics collection, and so on.

Here are some types of operations we handle without Kubernetes:

  • The ephemeral processes themselves. Early on we did run them on Kubernetes, but we soon found that it constrained us (more on this below).
  • Static/marketing sites. We use Vercel for these. It's more expensive, but an hour of developer time at a small startup is expensive too, and in our case Vercel pays for itself.
  • Anything that directly stores data we would truly regret losing. We do use persistent volumes for caches and derived data, but in general we prefer a managed Postgres database and blob storage outside the cluster.

It's important to note that we don't administer Kubernetes ourselves; its main advantage is precisely the ability to outsource its operation at the infrastructure level! We're quite happy with Google Kubernetes Engine, and even though the Google Domains fiasco has shaken my faith in Google Cloud, I can at least sleep soundly knowing that, if it came to it, migrating to Amazon EKS would not be difficult.

Things we actively use

There are several kinds of k8s resources we use without hesitation. I'll list only the resources we create explicitly; most of them implicitly create other resources (such as Pods) that I won't mention here but that we, of course, use indirectly.

  • Deployments: most of our pods are created through Deployments. Every Deployment critical to the operation of our service runs several replicas and receives rolling updates.
  • Services: specifically, ClusterIP for internal services and LoadBalancer for external ones. We avoid NodePort and ExternalName services, preferring to keep our DNS configuration outside of Kubernetes.
  • CronJobs: for cleanup scripts and the like.
  • ConfigMaps and Secrets: for passing data to the resources above.
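To make the short list above concrete, here is a hedged sketch (hypothetical names and images, Pulumi's TypeScript SDK) of what the remaining resource kinds look like next to a Deployment:

```typescript
import * as k8s from "@pulumi/kubernetes";

// Internal service: a stable ClusterIP in front of the "api" pods.
new k8s.core.v1.Service("api-internal", {
    spec: {
        type: "ClusterIP",
        selector: { app: "api" },
        ports: [{ port: 80, targetPort: 8080 }],
    },
});

// Nightly cleanup script, run as a CronJob.
new k8s.batch.v1.CronJob("cleanup", {
    spec: {
        schedule: "0 4 * * *",
        jobTemplate: {
            spec: {
                template: {
                    spec: {
                        restartPolicy: "Never",
                        containers: [{
                            name: "cleanup",
                            image: "registry.example.com/cleanup:latest", // hypothetical
                        }],
                    },
                },
            },
        },
    },
});

// Plain configuration handed to the resources above; anything
// sensitive would go in a Secret instead of a ConfigMap.
new k8s.core.v1.ConfigMap("api-config", {
    data: { LOG_LEVEL: "info" },
});
```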

Things we use with caution

  • StatefulSets and PersistentVolumeClaims: yes, we've used StatefulSets from time to time. They are more complicated to configure than Deployments, but they can carry a persistent volume across restarts. We prefer to keep important long-term data in managed services outside of k8s, but we have no taboo on volumes; it's sometimes convenient to keep, say, a cache across restarts. Still, I prefer to avoid volumes where I can, because under rolling deployments they can interact pathologically (deadlocks).
  • RBAC (Role-Based Access Control): we've used it a couple of times, for example to give a service the right to update a Secret. But it adds too much complexity for our small cluster, so I mostly avoid it.
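For the rare case where we do want a cache to survive restarts, the shape of it is a StatefulSet with a volume claim template. This is again a hypothetical sketch in Pulumi TypeScript, not our actual configuration:

```typescript
import * as k8s from "@pulumi/kubernetes";

// Each replica of a StatefulSet gets a stable identity and its own
// volume, so the cache directory survives pod restarts. Only cache
// or derived data lives here; anything we'd truly regret losing
// goes to managed Postgres or blob storage outside the cluster.
new k8s.apps.v1.StatefulSet("cache", {
    spec: {
        serviceName: "cache",
        replicas: 1,
        selector: { matchLabels: { app: "cache" } },
        template: {
            metadata: { labels: { app: "cache" } },
            spec: {
                containers: [{
                    name: "cache",
                    image: "registry.example.com/cache:latest", // hypothetical
                    volumeMounts: [{ name: "data", mountPath: "/var/cache" }],
                }],
            },
        },
        volumeClaimTemplates: [{
            metadata: { name: "data" },
            spec: {
                accessModes: ["ReadWriteOnce"],
                resources: { requests: { storage: "10Gi" } },
            },
        }],
    },
});
```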

Things we actively avoid

  • Writing YAML by hand. YAML has enough footguns that I prefer not to deal with it at all. Instead, we generate our Kubernetes resource definitions from TypeScript using Pulumi.
  • Non-built-in resources and operators. I've written before about why the control-loop pattern is a double-edged sword: it's a key factor in the reliability of k8s, but it's also a source of unnecessary indirection and complexity. The operator pattern and custom resources let third-party software reuse Kubernetes' robust control-loop infrastructure, which is a great idea in theory but turns out to be quite inconvenient in practice. We don't use cert-manager; we automate certificate handling with Caddy instead.
  • Helm. Helm is a non-starter for us simply because it means dealing with operators and hand-written YAML, but I also believe that templating unstructured strings to generate machine-parsed data adds fragility to the system without gaining anything. Sorry, but to me nindent is nails on a chalkboard.
  • Anything with “mesh” in the name. I suppose these are useful to someone, but definitely not to me, and I'm not the only one who feels that way.
  • Ingress resources. Personally I have no scars from them, and I even know people who use them productively, but our successful experience with Kubernetes has come from avoiding unnecessary levels of indirection. Configuring Caddy directly is enough for us, so that's what we do.
  • Trying to replicate the entire k8s stack on a local machine. We don't use k3s or the like to faithfully reproduce the production environment; instead, we make do with Docker Compose or our own scripts to run whatever subset of the system we're interested in at the moment.
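The "no hand-written YAML" point is mostly about types: when resource definitions are ordinary TypeScript values, typos become compile-time errors instead of runtime surprises. A dependency-free sketch of the idea (the function and names are illustrative, not our actual code):

```typescript
// A typed builder instead of a YAML template: misspell "replicas",
// or pass a string where a number belongs, and tsc rejects it
// before anything ever reaches the cluster.
interface AppSpec {
  replicas: number;
  image: string;
}

function deploymentManifest(name: string, spec: AppSpec) {
  return {
    apiVersion: "apps/v1",
    kind: "Deployment",
    metadata: { name },
    spec: {
      replicas: spec.replicas,
      selector: { matchLabels: { app: name } },
      template: {
        metadata: { labels: { app: name } },
        spec: { containers: [{ name, image: spec.image }] },
      },
    },
  };
}

const manifest = deploymentManifest("api", { replicas: 3, image: "api:v1" });
console.log(manifest.kind, manifest.spec.replicas); // Deployment 3
```

Pulumi takes this idea much further (real resource types, diffs, deploys), but even this much already beats string templating for us.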

Humans shouldn't have to wait

I mentioned above that for a while we used Kubernetes for ephemeral, interactive processes whose lifetime is tied to a session. We quickly concluded that Kubernetes deliberately trades container startup time for reliability and modularity. It's good in scenarios where you need long-running processes executed redundantly. But if a human ever has to wait for a pod to start, Kubernetes is probably not right for you.

I admit I'm talking my own book here, but at least it's open source: Plane, an MIT-licensed orchestrator written in Rust, is designed to quickly schedule and run processes that serve interactive workloads, in particular ones a human is actively waiting on.

Higher-level abstractions

To round out the picture, I should mention that the alternatives that have appeared by now are quite good, particularly if requirement #3 from the list above (expressing your infrastructure in code) doesn't matter to you. In one of our products we chose Railway instead of a k8s cluster, mainly to serve preview environments. Some colleagues whom I greatly respect swear by Render (I've dabbled with it too, but Railway's environment model seems cleaner to me). I'm also a fan of the “bring your own cloud” approach adopted by Flight Control.

With many SaaS applications, these approaches will get you a long way. But if you do need to solve all three of the tasks above at once, approach them in a disciplined manner, and don't let anyone convince you that you haven't grown into Kubernetes yet.

