Why switch to Structured Authentication Config

We have already written on our blog about Structured Authentication Config, the biggest change to the Kubernetes authentication system in years, which arrived in version 1.29. In this article I will go into more detail about what led to KEP-3331 and about the scenarios in which the new authenticator is useful.

My name is Maxim Nabokikh. I lead the development of Deckhouse Kubernetes Platform, and I am also a member of Kubernetes SIG Auth. I contribute to the orchestrator as a whole and am the main maintainer of Dex. I am one of the people who worked on Structured Authentication Config, and I want the community to migrate to it from the older authenticators.

How authentication works in Kubernetes

If you are already familiar with how authentication works, you can jump straight to the section about its problems, which uses OpenID Connect as an example. For everyone else, here is a quick refresher on how authentication works in Kubernetes.

Authentication lets you determine who wants to perform an action and make sure that this “someone” really is who they claim to be. There are no users in the Kubernetes world. Instead, the orchestrator relies on the concept of subjects. A subject is a minimal standard interface that exposes the attributes needed to grant access rights. It is the entity that performs actions in the cluster.

Subjects do not physically exist in Kubernetes. Their attributes are computed on the fly. So it does not matter whether you are a human user or a machine: when you authenticate to Kubernetes, you are simply a subject to the orchestrator.

There are various ways (flows) to authenticate to Kubernetes, but the structure of a subject's attributes is always the same:

uid: "..."
name: "..."
groups: [...]
extra: {...}

Authenticators stand between you as a user and your subject in the Kubernetes world. They are built into the kube-apiserver code and are responsible for extracting subject attributes from the authentication data.

There are two groups of authenticators: internal and external. Internal ones rely only on Kubernetes data (keys and certificates) for authentication. External ones rely on third-party applications such as an authenticating proxy or an OpenID Connect provider. Examples of internal authenticators are Service Account, Bootstrap Token, and x509 certificates. External ones are Token Webhook, OpenID Connect, and AuthProxy.

Authenticators extract subject attributes from authentication data in different ways. I will give three examples.

The internal x509 authenticator checks whether the client certificate was issued by the Kubernetes certificate authority (CA). If so, the certificate's Common Name (CN) becomes the username, and each Organization (ORG) attribute becomes one of the subject's groups.

CN = username
ORG = group-1
ORG = group-2

Second example – Service Account. When a request arrives with a service account token, this authenticator verifies that the token is signed by Kubernetes and that the specified service account exists in the cluster. If everything is in order, the subject's attributes will be as follows:

system:serviceaccount:<ns>:<name> = username
[system:serviceaccounts, system:serviceaccounts:<ns>] = groups

And the third example is the OpenID Connect authenticator. You can connect an external OpenID provider to your cluster. In that case, the authenticator checks that the token is signed by this provider. It then makes sure the token is valid: it checks the expiration time, the client, and the audiences. If everything is in order, Kubernetes extracts the subject's attributes from the token's claims. Claims are configurable, so they can differ from one setup to another.

name claim = username
group claim = groups
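
The three mappings above can be sketched as plain functions. This is a simplified illustration and not the kube-apiserver code: the function names are hypothetical, the inputs are assumed to have passed cryptographic verification already, and the claim names given to the OpenID Connect helper stand for whatever the cluster is configured with.

```python
def subject_from_x509(cn, orgs):
    """Client certificate: CN becomes the username, each ORG becomes a group."""
    return {"name": cn, "groups": list(orgs)}


def subject_from_service_account(namespace, name):
    """Service account token: attributes are derived from namespace and name."""
    return {
        "name": f"system:serviceaccount:{namespace}:{name}",
        "groups": ["system:serviceaccounts", f"system:serviceaccounts:{namespace}"],
    }


def subject_from_oidc(claims, username_claim, groups_claim):
    """OpenID Connect token: attributes are read from the configured claims."""
    return {
        "name": claims[username_claim],
        "groups": list(claims.get(groups_claim, [])),
    }


# Example inputs are made up for illustration.
print(subject_from_x509("username", ["group-1", "group-2"]))
print(subject_from_service_account("default", "builder"))
print(subject_from_oidc(
    {"email": "john.doe@example.com", "groups": ["developers"]},
    username_claim="email",
    groups_claim="groups",
))
```

Whatever the credential type, the result has the same shape, which is why the rest of Kubernetes only ever deals with subjects.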

Kubernetes uses kubeconfig configuration files for authentication. These files have three sections: users, clusters, and contexts.

users: [...]
clusters: [...]
contexts: [{user, cluster}]

The users section stores authentication data. This is your passport to the world of Kubernetes. As such a passport you can use:

  • x509 client certificates;

  • basic authentication credentials;

  • Bearer token;

  • a third-party exec plugin that will return either a client certificate or a Bearer token.

The clusters section stores the address of the cluster to which requests are sent.

The contexts section shows which user accesses which cluster.
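
How the three sections fit together can be sketched with a plain dictionary that mirrors the layout of a kubeconfig file. The names, the server address, and the token placeholder below are made up for illustration, and the helper functions are hypothetical, not part of any Kubernetes client library.

```python
# A kubeconfig reduced to the fields that matter for this illustration.
kubeconfig = {
    "current-context": "dev",
    "users": [{"name": "jane", "user": {"token": "<bearer-token>"}}],
    "clusters": [
        {"name": "dev-cluster", "cluster": {"server": "https://k8s.example.com:6443"}}
    ],
    "contexts": [
        {"name": "dev", "context": {"user": "jane", "cluster": "dev-cluster"}}
    ],
}


def by_name(entries, name):
    """Return the entry with the given name from a kubeconfig list section."""
    return next(entry for entry in entries if entry["name"] == name)


def resolve(config, context_name=None):
    """Resolve a context into (server address, user credentials)."""
    context_name = context_name or config["current-context"]
    context = by_name(config["contexts"], context_name)["context"]
    server = by_name(config["clusters"], context["cluster"])["cluster"]["server"]
    credentials = by_name(config["users"], context["user"])["user"]
    return server, credentials


server, credentials = resolve(kubeconfig)
print(server, sorted(credentials))
```

The context is only a pair of references: the credentials and the cluster address live in their own sections and can be combined freely.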

So we have a kubeconfig file. How do you find out which subject it represents? Previously this was not possible, and debugging Kubernetes authentication was difficult. But in 2023 we added this capability to kubectl, so now you can easily find out who you are. Just run kubectl auth whoami in the terminal to get your subject's attributes and their values:

ATTRIBUTE    VALUE
Username     john.doe@example.com
Groups       [system:masters developers:maintainers system:authenticated]

Authentication problems, using OpenID Connect as an example

Let's dive a little deeper and look at authentication problems using OpenID Connect as an example. It is a secure authenticator and the most popular external one. Wikipedia says that in 2016 more than 1.1 million sites supported the OpenID standard, and I believe there are many more now. Many companies use an OpenID Connect provider. For example, if you store your code on GitHub or GitLab, that is what you are dealing with.

But popularity doesn't mean there are no problems with the OpenID Connect authenticator.

The first is that you can only configure the authenticator using CLI arguments. Here is an example of the arguments that are typically used for configuration:

--oidc-issuer-url=...
--oidc-groups-claim=...
--oidc-ca-file=...

This approach is inflexible: complex settings cannot be expressed this way. Changing the values is also inconvenient, because reconfiguring authentication means restarting kube-apiserver. That is bad because, among other things, all clients get disconnected, which can cause disruptions.

Another complication is that you can connect only one provider to Kubernetes. I don't know why this unnecessary limitation appeared, but it is the reason solutions like Dex exist. Dex can be used as an OpenID Connect hub that connects many providers to Kubernetes.

The third problem is that the OpenID Connect authenticator was developed back in 2015 and has poor test coverage. Adding new features to this authenticator is simply scary.

The final complication is that token verification in this authenticator is quite rigid. It relies heavily on the OpenID token format, so you cannot verify tokens in other formats, e.g. SPIFFE. You can develop your own webhook for such tokens, but that makes the infrastructure more complex, because you have to deploy the webhook and keep it highly available.

Taken together, these problems lead to:

  • dissatisfied users;

  • a low rate of new features being accepted into Kubernetes authentication;

  • workarounds to implement the features that users wanted.

SIG Auth's motivation for authentication changes

SIG Auth is the group of people who work on authentication features in Kubernetes. I took part in the discussion of the existing problems, and I want to tell you about the two main reasons for change that guided us.

First, we wanted something extensible that would allow more authentication options. Second, we wanted to close open issues, and there were quite a lot of them on GitHub. Different users asked for their own authentication use cases, and we wanted to cover them all.

The result of the discussion was a new authenticator: Structured Authentication Configuration.

New authenticator: Structured Authentication Config

The main difference of the new authenticator is its structured configuration. It needs a single CLI argument that specifies the path to a configuration file with a specific structure:

--authentication-config=<path-to-config-file>

Now, when the settings in the configuration file change, the API server reloads the configuration on its own. You no longer need to restart the API server, so one of the problems described above is already solved.

The structure of the configuration file itself looks like this:

apiVersion: apiserver.config.k8s.io/v1alpha1
kind: AuthenticationConfiguration
jwt:
- issuer:
    url: https://example.com
    audiences:
    - my-app
  claimValidationRules: [...]
  claimMappings: {...}
  userValidationRules: [...]

The only top-level parameter, jwt (JSON Web Token), is an array of authenticators, so you can specify several of them in one configuration. Each entry has keys for claim validation rules, for mapping token claims to subject attributes, and for validating the resulting user information. I will talk about them later in the article.

Structured Authentication Config replaces the OpenID Connect authenticator. So you can use either only the new authenticator or only the old one, but not both at the same time.

supersedes OIDC:  either --oidc-*
                  or     --authentication-config

Authentication Steps

The authentication steps with Structured Authentication Config are as follows:

First, the authenticator receives a request with a token. Several basic checks are then performed on the token: signature, client, and expiration time. If the basic checks pass, we move on to claim validation. This is where you can add your own checks, written in Common Expression Language (CEL).

If everything is fine with the token, we move on to the claim mapping stage. Here the token's claims are mapped to the subject's attributes, again with CEL. At the last stage, when the subject's attributes have already been extracted, we can apply additional checks to them, for example to enforce authentication policies.
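
The order of these stages can be sketched in a few lines of Python. This is a simplified model and not the kube-apiserver implementation: signature verification is left out, rules are plain Python callables instead of CEL expressions, and all names (authenticate, AuthenticationError, the hd claim, the oidc: prefix) are hypothetical.

```python
import time


class AuthenticationError(Exception):
    """Raised when a token fails any stage of the pipeline."""


def authenticate(claims, audiences, claim_rules, mappings, user_rules, now=None):
    """Run already-decoded token claims through the stages described above.

    `claims` is assumed to come from a token whose signature was already
    verified; that step is deliberately not modelled here.
    """
    now = time.time() if now is None else now

    # 1. Basic checks: expiration time and audience (client).
    if claims["exp"] <= now:
        raise AuthenticationError("token has expired")
    aud = claims["aud"]
    token_audiences = aud if isinstance(aud, list) else [aud]
    if not set(token_audiences) & set(audiences):
        raise AuthenticationError("token audience is not accepted")

    # 2. Claim validation: every rule must return True.
    for rule, message in claim_rules:
        if not rule(claims):
            raise AuthenticationError(message)

    # 3. Claim mapping: build the subject's attributes from the claims.
    user = {attribute: mapper(claims) for attribute, mapper in mappings.items()}

    # 4. User validation: final checks on the mapped attributes.
    for rule, message in user_rules:
        if not rule(user):
            raise AuthenticationError(message)
    return user


user = authenticate(
    claims={"exp": 2_000, "aud": "my-app", "sub": "jane", "hd": "example.com"},
    audiences=["my-app"],
    claim_rules=[(lambda c: c.get("hd") == "example.com", "wrong domain")],
    mappings={"username": lambda c: "oidc:" + c["sub"], "groups": lambda c: []},
    user_rules=[(lambda u: not u["username"].startswith("system:"), "reserved prefix")],
    now=1_000,
)
print(user)
```

Each stage can reject the request on its own, and the later stages only ever see data that the earlier ones accepted.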

Common Expression Language

A few words about why we chose CEL and what its advantages are. It is an Apache-licensed expression language developed by Google. It is very fast, in many cases faster than compilation. It is also easy to extend: you can write your own library and plug it into the CEL runtime.

CEL is a query language, so you can use it to extract values. It also lets you check the type of the result: whether an expression returns a boolean, an array, or a string. That makes it a good fit for validation, and it is already used for this purpose in many places in Kubernetes: for example, in CRD validation rules, or as a standalone mechanism in validating admission policies.

What problems does Structured Authentication Config solve?

So, Kubernetes now has a flexible and extensible authenticator. But which problems exactly does it solve? Let's look at five examples.

Using different URLs to discover and verify the issuer

All OpenID Connect providers have an address, e.g. https://oidc.example.com. It is used for two purposes:

  1. To access the endpoint /.well-known/openid-configuration to get the provider keys for validation.

  2. To confirm that the token was issued by this OpenID Connect provider.

When the OpenID Connect provider is outside the Kubernetes cluster, there is no problem. We reach it through its external address, and this works even with the previous authenticator.

But what happens if the OpenID Connect provider is in the same cluster as kube-apiserver?

In that case I don't want requests to the external address to leave for the external network and then come back into the cluster through ingress. This kind of round trip is bad for both performance and security, because the traffic goes outside your cluster.

Structured Authentication Config lets you use separate URLs for discovering and for verifying the issuer. The configuration is simple: the first URL is the base issuer. The second one, if present, is used for discovery calls. No more requests outside the cluster:

jwt:
 - issuer:
    url: https://oidc.example.com
    discoveryURL: https://oidc.ns.svc.cluster.local/.well-known/openid-configuration
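
The split between the two URLs can be sketched as follows. This is an illustration under assumptions, not the actual kube-apiserver code: the function names are hypothetical, and the sketch assumes that a discovery override is used exactly as written while the token's iss claim is still compared with the base issuer URL.

```python
def discovery_endpoint(issuer):
    """Return the URL from which the provider metadata is fetched."""
    if issuer.get("discoveryURL"):
        # Assumption: the override is taken as-is, so it has to include
        # the /.well-known/openid-configuration path itself.
        return issuer["discoveryURL"]
    return issuer["url"].rstrip("/") + "/.well-known/openid-configuration"


def issuer_matches(issuer, token_claims):
    """The iss claim is compared with the base URL, never with discoveryURL."""
    return token_claims.get("iss") == issuer["url"]


external_only = {"url": "https://oidc.example.com"}
in_cluster = {
    "url": "https://oidc.example.com",
    "discoveryURL": "https://oidc.ns.svc.cluster.local/.well-known/openid-configuration",
}

print(discovery_endpoint(external_only))
print(discovery_endpoint(in_cluster))
print(issuer_matches(in_cluster, {"iss": "https://oidc.example.com"}))
```

Tokens keep carrying the public issuer address, so nothing changes for clients; only the API server's own metadata requests stay inside the cluster.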

Validation based on custom claims

Structured Authentication Config allows you to validate tokens based on custom claims. Previously this was impossible.

Let's imagine that we have the following user data:

{
  "phone_number": "+7 929 _",
  "phone_number_verified": true,
  "picture": "https://gravatar.com/path/to/pic"
}

Now, based on this data, we can perform additional checks during authentication. For example, add phone number validation using a simple CEL expression:

- expression: 'claims.phone_number.startsWith("+7")'
  message: Only Russian phone numbers are allowed

Here claims is an object that holds all of the token's claims. We take the phone number from it and check that it starts with the dialing code “+7”. If it does not, we return a message saying that only owners of a Russian number can authenticate in this cluster.

However, you do not have to use CEL. Yes, it is fast, but evaluating expressions still has a cost. So the new authenticator keeps simple validation for cases where you only need to check that a certain claim has a specific value:

- claim: phone_number_verified
  requiredValue: "true"

I will also give a more complex example of using CEL. It's a little strange, but still:

- expression: isURL(claims.picture)
              && url(claims.picture).getHostname() == "gravatar.com"
  message: Only gravatar images can be used as profile pictures

Kubernetes ships a CEL library for working with URLs. In the example above, we check that the profile picture is a valid URL. If it is, we extract the hostname and check that it equals gravatar.com. If that also holds, the user is allowed into the cluster; if not, we tell them that the profile picture needs to be changed.
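
For readers who are new to CEL, the three rules above behave roughly like the Python below. It is only an analogy: is_url is a rough stand-in for the CEL isURL() function, the helper names are hypothetical, and the message for the phone_number_verified rule is made up because the simple claim/requiredValue form has no custom message.

```python
from urllib.parse import urlparse


def is_url(value):
    """Rough stand-in for CEL's isURL(): require a scheme and a host."""
    parsed = urlparse(value)
    return bool(parsed.scheme and parsed.netloc)


# Each rule is (predicate, message); the predicate must be True to pass.
RULES = [
    (lambda c: c["phone_number"].startswith("+7"),
     "Only Russian phone numbers are allowed"),
    (lambda c: c["phone_number_verified"] is True,
     "phone_number_verified must be true"),
    (lambda c: is_url(c["picture"]) and urlparse(c["picture"]).hostname == "gravatar.com",
     "Only gravatar images can be used as profile pictures"),
]


def first_failure(claims):
    """Return the message of the first failing rule, or None if all pass."""
    return next((message for rule, message in RULES if not rule(claims)), None)


claims = {
    "phone_number": "+7 929 _",
    "phone_number_verified": True,
    "picture": "https://gravatar.com/path/to/pic",
}
print(first_failure(claims))
print(first_failure({**claims, "picture": "https://example.com/pic.png"}))
```

A rule describes what a valid token looks like: when the expression is true the token passes, and the message is returned only on failure.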

The good thing about Structured Authentication Config here is that it is truly flexible. You can also write custom errors, which improves the user experience.

Converting claims to subject attributes using transformations

Let's imagine that we have the following data:

{
  "name": "jane.doe",
  "sub": "550e8400-e29b-41d4",
  "roles": "admin,user"
}

With CEL we can take the "name" claim and add a prefix showing that the user comes from GitLab:

username:
  expression: '"gitlab:" + claims.name'

The result will be like this:

Username: "gitlab:jane.doe"

Again, CEL is not required. You can take the value of a top-level claim directly as a user attribute. For example, use the token's sub claim as the UID:

uid:
  claim: sub

Execution result:

UID: "550e8400-e29b-41d4"

And a more complex example with CEL. In the data above, the roles claim is not an array but a comma-separated string: "roles": "admin,user". Previously, you could not use such a claim in Kubernetes; you first had to convert it into an array of strings somehow. Now, thanks to CEL, that extra conversion step is unnecessary.

We take the roles claim, split it on commas with claims.roles.split, and add the gitlab prefix to each resulting group, as in the first example:

groups:
  expression: claims.roles.split(",")
              .map(g, "gitlab:"+g)

As a result, we get the following user attributes:

Groups: ["gitlab:admin", "gitlab:user"]

Structured Authentication Config lets us dig deeper into the token, so we no longer have to rely only on top-level claims. We can change claim values as we see fit and also convert data types.
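
Taken together, the three mappings in this section do the equivalent of the following Python. It is an analogy for readers unfamiliar with CEL, not how Kubernetes evaluates the expressions; map_claims is a hypothetical name.

```python
def map_claims(claims):
    """Turn token claims into subject attributes, mirroring the CEL mappings."""
    return {
        # "gitlab:" + claims.name
        "username": "gitlab:" + claims["name"],
        # claim: sub
        "uid": claims["sub"],
        # claims.roles.split(",").map(g, "gitlab:" + g)
        "groups": ["gitlab:" + role for role in claims["roles"].split(",")],
    }


claims = {"name": "jane.doe", "sub": "550e8400-e29b-41d4", "roles": "admin,user"}
print(map_claims(claims))
```

The only real transformation is on roles: a single string goes in and a list of prefixed group names comes out.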

Connecting multiple providers to a cluster

I have already mentioned that jwt: [...] in the configuration file is now an array, so we can define more than one authenticator. This is useful:

  • for migrations;

  • for outsourcing work;

  • upon acquisition or merger with other companies;

  • to use one provider for different checks.

Migration. You developed your code on GitLab and used it as a provider. But your company has grown, so you want to move to a self-hosted installation of Keycloak. Now during migration you can use both providers to access the Kubernetes cluster.

Outsourcing. You want to outsource some of your work to another company. You have Keycloak and they have Keycloak. You can connect both providers to your cluster, but do not forget to protect them with userValidationRules.

Purchase of a company. You work for a large corporation that uses Keycloak, and your company decides to acquire a small startup that uses GitLab as its authentication provider. In this case, you can connect both Keycloak and GitLab to the cluster.

One provider for different checks. You have Keycloak and want to connect two different applications to the cluster, each with its own authentication policy or its own claim checks. This is also possible; one of our users asked for exactly this.
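
A rough mental model of several authenticators living side by side: a token is handled by the entry whose issuer it names. The sketch below assumes selection by the iss claim and uses made-up issuer URLs and prefixes; the function names are hypothetical, and the real selection logic inside kube-apiserver may differ in detail.

```python
# Two hypothetical entries, as they might appear under jwt: [...].
AUTHENTICATORS = [
    {"issuer": "https://gitlab.example.com", "prefix": "gitlab:"},
    {"issuer": "https://keycloak.example.com/realms/main", "prefix": "keycloak:"},
]


def pick_authenticator(claims, authenticators=AUTHENTICATORS):
    """Return the entry whose issuer matches the token's iss claim, or None."""
    for authenticator in authenticators:
        if claims.get("iss") == authenticator["issuer"]:
            return authenticator
    return None


def username_for(claims):
    """Map a token to a username, keeping providers apart with a prefix."""
    authenticator = pick_authenticator(claims)
    if authenticator is None:
        raise ValueError("token issuer is not trusted by this cluster")
    return authenticator["prefix"] + claims["sub"]


print(username_for({"iss": "https://gitlab.example.com", "sub": "jane"}))
print(username_for({"iss": "https://keycloak.example.com/realms/main", "sub": "jane"}))
```

Giving every provider its own prefix is a design choice worth copying: two people called jane in different identity systems never end up as the same subject.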

User Authentication Policy

The new Structured Authentication Config features in this example will be most useful to managed service providers.

Let's imagine that we already have the user attributes and need to check that they do not contain the system: prefix, because using it can accidentally give the user extra rights:

userValidationRules:

  - expression: '!user.username.startsWith("system:")'
    message: '"system:" prefix in command'

  - expression: 'user.groups.all(g, !g.startsWith("system:"))'
    message: '"system:" prefix in command'

With this check, a user with the following attributes will not be let into the cluster, because their groups contain system:masters:

{
  "name": "jane.doe",
  "groups": ["system:masters", "users"]
}
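
The effect of these two rules on the attributes above can be checked with a short Python equivalent. It is an analogy only: validate_user is a hypothetical helper, and its messages are illustrative rather than what Kubernetes returns.

```python
def validate_user(user):
    """Return an error message if the user must be rejected, else None."""
    if user["name"].startswith("system:"):
        return 'username uses the reserved "system:" prefix'
    if not all(not group.startswith("system:") for group in user["groups"]):
        return 'a group uses the reserved "system:" prefix'
    return None


print(validate_user({"name": "jane.doe", "groups": ["system:masters", "users"]}))
print(validate_user({"name": "jane.doe", "groups": ["users"]}))
```

The first call is rejected because of system:masters; the second passes, since neither the name nor any group starts with the reserved prefix.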

These innovations help protect the cluster from leaks.

Summary

  • Structured Authentication Config is the biggest authentication update in a long time. It is designed keeping in mind the needs of the engineering community.

  • Common Expression Language is a fast, reliable and flexible authentication engine. It allows you to do a lot of great things.

  • Structured Authentication Config may be expanded in the future. Perhaps we will add settings for other authenticators to this configuration, not just OpenID Connect.

You can already try Structured Authentication Config. It is a beta feature in Kubernetes 1.30 and will no doubt become generally available (GA) in the near future. You can try writing your own CEL policies in the CEL Playground.
