Understanding the nuances of writing a Kubernetes operator in Go

Operators are software extensions to Kubernetes that make use of custom resources to manage applications and their components. Operators follow Kubernetes principles, notably the control loop. — from kubernetes.io

This article outlines what to watch for when writing an operator in Go, including nuances that the official tutorial and similar articles mention only in passing or not at all.

In this article, I will briefly show:

  • How to prepare the environment for creating an operator

  • How to write the program and what we can do inside the main event-handling function (the reconciler)

  • When the reconciler is called and how to control that

  • How to exit the reconciler

  • How to consistently create and delete cluster objects

As an example, we will build a secret-operator that will:

  • Create the necessary secrets in all cluster namespaces

  • Generate secrets when creating a new namespace

  • Restore a secret if someone deletes it

  • Delete all children if our root object is deleted

What this operator does NOT do, to simplify the code:

A bit of theory

The operator pattern as implemented by controller-runtime (kubebuilder, operator-sdk) is very similar to the Observer pattern. We “subscribe” to k8s events for the creation, modification, and deletion of the objects we must react to. When these resources change, the reconcile function is called and receives the name of the “parent” object that the event relates to. The reconcile function checks the state of the parent, child, and other objects and reacts accordingly. How the event subscription is set up and how the reconcile loop works is described in more detail below.
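The subscription model can be pictured with a small standard-library sketch. This is not controller-runtime code: the request, event, and queue types, the mapToParent function, and the "owner-name" / "owner-namespace" annotation keys are hypothetical stand-ins, and a real work queue also handles rate limiting and concurrency. The sketch only shows that events about children are mapped to the key of their parent, and that repeated events for the same parent collapse into one pending reconcile.

```go
package main

// request identifies the "parent" object to reconcile
// (a hypothetical stand-in for a reconcile request).
type request struct {
	Namespace string
	Name      string
}

// event is a simplified notification about a watched object.
type event struct {
	Kind        string // for example "MultiSecret" or "Secret"
	Namespace   string
	Name        string
	Annotations map[string]string // used to find the parent of a child
}

// queue keeps pending requests in arrival order, without duplicates.
type queue struct {
	pending map[request]bool
	order   []request
}

func newQueue() *queue {
	return &queue{pending: make(map[request]bool)}
}

// add enqueues a request unless an identical one is already pending.
func (q *queue) add(r request) {
	if q.pending[r] {
		return
	}
	q.pending[r] = true
	q.order = append(q.order, r)
}

// pop returns the oldest pending request, or false if the queue is empty.
func (q *queue) pop() (request, bool) {
	if len(q.order) == 0 {
		return request{}, false
	}
	r := q.order[0]
	q.order = q.order[1:]
	delete(q.pending, r)
	return r, true
}

// mapToParent translates an event into a request for the parent object.
// Events on the parent map to itself; events on children are resolved
// through annotations. Unrelated objects produce no request.
func mapToParent(e event) (request, bool) {
	if e.Kind == "MultiSecret" {
		return request{Namespace: e.Namespace, Name: e.Name}, true
	}
	name, okName := e.Annotations["owner-name"]
	ns, okNs := e.Annotations["owner-namespace"]
	if okName && okNs {
		return request{Namespace: ns, Name: name}, true
	}
	return request{}, false
}
```

Feeding one event for the parent and one for its child secret into this queue leaves a single pending request, which is why the reconcile function cannot assume which event caused the call.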

Preparing the Development Environment

Installing golang

Download the Go archive for your OS.

Unpack the archive, for example, into the /opt/go-1.19.4 directory.

Create a working directory for Go and set the environment variables:

mkdir ~/go-1.19
export GOROOT=/opt/go-1.19.4
export GOPATH=~/go-1.19
export PATH=$GOROOT/bin:$GOPATH/bin:$PATH

Installing the operator SDK

Download the binary and verify it:

export ARCH=$(case $(uname -m) in x86_64) echo -n amd64 ;; aarch64) echo -n arm64 ;; *) echo -n $(uname -m) ;; esac)
export OS=$(uname | awk '{print tolower($0)}')
export OPERATOR_SDK_DL_URL=https://github.com/operator-framework/operator-sdk/releases/download/v1.26.0
curl -LO ${OPERATOR_SDK_DL_URL}/operator-sdk_${OS}_${ARCH}

gpg --keyserver keyserver.ubuntu.com --recv-keys 052996E2A20B5C7E
curl -LO ${OPERATOR_SDK_DL_URL}/checksums.txt
curl -LO ${OPERATOR_SDK_DL_URL}/checksums.txt.asc
gpg -u "Operator SDK (release) <cncf-operator-sdk@cncf.io>" --verify checksums.txt.asc
grep operator-sdk_${OS}_${ARCH} checksums.txt | sha256sum -c -

chmod +x operator-sdk_${OS}_${ARCH} && sudo mv operator-sdk_${OS}_${ARCH} /usr/local/bin/operator-sdk

Installing the IDE

If you don’t have a preferred IDE, you can use GoLand. It offers a 30-day trial when you register by e-mail.

After opening the first project, all that remains is to set GOROOT and GOPATH in the settings (File -> Settings -> Go).

Preparing the operator SDK project

This step is described on the official website.

The project source code is stored on GitHub.

Let’s create a new project:

mkdir -p ~/go-1.19/src/github.com/ddnw/secret-operator
cd ~/go-1.19/src/github.com/ddnw/secret-operator
operator-sdk init --domain ddnw.ml --repo github.com/ddnw/secret-operator

Create a new API and controller:

operator-sdk create api --group multi --version v1alpha1 --kind MultiSecret --resource --controller

In the Makefile, change the Docker image name to the one you need:

IMAGE_TAG_BASE ?= ddnw/secret-operator
IMG ?= $(IMAGE_TAG_BASE):$(VERSION)

SDK command cheat sheet

# Run code generation
make generate
# Generate manifests
make manifests
# Build and push the container image
make docker-build docker-push
# Install the CRDs
make install
# Deploy the controller to the cluster
make deploy
# Remove the CRDs and the controller from the cluster
make undeploy
# Remove the CRDs
make uninstall
# Create a test object
kubectl apply -f config/samples/multi_v1alpha1_multisecret.yaml

Operator code structure

The operator code is divided into 2 main parts:

  • api – declaring our new “parent” entity for k8s

  • controller – code that reads the desired state of k8s objects and tries to apply it to k8s

API

Running operator-sdk create api generated files under api/v1alpha1. The multisecret_types.go file describes our new “parent” entity, for which we will write most of the subsequent code.

Let’s add to the spec section the fields that we will later put into our secrets. Don’t forget the json tags, so that k8s can serialize this data.

After each change in this part of the code, run make generate to regenerate the code in the zz_generated.deepcopy.go file.

// MultiSecretSpec defines the desired state of MultiSecret
type MultiSecretSpec struct {
	Data       map[string][]byte `json:"data,omitempty"`
	StringData map[string]string `json:"stringData,omitempty"`
	Type       SecretType        `json:"type,omitempty"`
}

We also add a description of the status structure of our “parent” object:

// MultiSecretStatus defines the observed state of MultiSecret
type MultiSecretStatus struct {
	Wanted     int    `json:"wanted"`
	Created    int    `json:"created"`
	ChangeTime string `json:"change_time,omitempty"`
}

Controller

The main logic of the multiSecret controller

Our controller executes the following logic:

  1. When the reconciler is called, the multiv1alpha1.MultiSecret{} object is requested

  2. If it doesn’t exist, exit

  3. If the “parent” object is being deleted, delete all “child” secrets and exit (finalizer)

  4. Check which “child” secrets exist across all namespaces and create or delete them according to the specification of the “parent” object

Reconcile-loop

Reconcile is the main function, in which we check the state of objects and bring them to the desired state. The function must be idempotent: you don’t know at what point in an object’s lifetime it will be called.
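Idempotency is easier to reason about when the decision step is a pure comparison of desired and observed state. The following is a minimal sketch using only the standard library; the plan type and the computePlan and apply functions are hypothetical and do not come from the operator’s source. It assumes the observed state is simply a set of namespaces that already contain the secret.

```go
package main

import "sort"

// plan lists the namespaces where a child secret must be created or deleted.
type plan struct {
	Create []string
	Delete []string
}

// computePlan compares the namespaces that should hold the secret with the
// namespaces that currently hold it. It only reads its inputs, so calling it
// any number of times on the same state gives the same answer.
func computePlan(wanted []string, existing map[string]bool) plan {
	var p plan
	want := make(map[string]bool, len(wanted))
	for _, ns := range wanted {
		if want[ns] {
			continue // ignore duplicate entries
		}
		want[ns] = true
		if !existing[ns] {
			p.Create = append(p.Create, ns)
		}
	}
	for ns, present := range existing {
		if present && !want[ns] {
			p.Delete = append(p.Delete, ns)
		}
	}
	// Sort for a deterministic result; map iteration order is random.
	sort.Strings(p.Create)
	sort.Strings(p.Delete)
	return p
}

// apply carries out a plan against the in-memory "cluster" state.
func apply(p plan, existing map[string]bool) {
	for _, ns := range p.Create {
		existing[ns] = true
	}
	for _, ns := range p.Delete {
		delete(existing, ns)
	}
}
```

After apply has run, a second computePlan with the same inputs returns an empty plan. That is the property we want from Reconcile: a repeated call on an already correct state changes nothing.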

Exiting a function

There are several ways to exit the reconcile function, depending on whether the loop needs to run again.

ctrl.Result{}, err: an error occurred that prevents us from continuing; we return it so that the request is requeued and retried later.

ctrl.Result{Requeue: true}, nil: there are no runtime errors, but we hand control back to the controller so that it can process other objects and return to the current one later.

ctrl.Result{}, nil: no requeue is needed; the desired state matches the existing one.

ctrl.Result{RequeueAfter: 60 * time.Second}, nil: requeue after the given interval. This can be used to make sure that the state of the objects is checked and applied at a regular interval.
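The four exits can be summarized as one decision function. This is a simplified sketch that uses only the standard library: the result type is a local stand-in that copies just the Requeue and RequeueAfter fields of ctrl.Result, and the outcome type, the exitWith function, and errAPIUnavailable are hypothetical names introduced for illustration.

```go
package main

import (
	"errors"
	"time"
)

// result is a local stand-in for ctrl.Result with the two fields used above.
type result struct {
	Requeue      bool
	RequeueAfter time.Duration
}

// outcome describes what a single reconcile pass observed (hypothetical).
type outcome struct {
	Err        error         // a failure that prevents further progress
	Unfinished bool          // work remains, but nothing failed
	Resync     time.Duration // periodic re-check interval; 0 disables it
}

// errAPIUnavailable is an example error for callers of exitWith.
var errAPIUnavailable = errors.New("api unavailable")

// exitWith picks the return values for the reconcile function.
// The cases are checked in order of priority: errors first, then
// unfinished work, then periodic resync, then "nothing to do".
func exitWith(o outcome) (result, error) {
	switch {
	case o.Err != nil:
		return result{}, o.Err
	case o.Unfinished:
		return result{Requeue: true}, nil
	case o.Resync > 0:
		return result{RequeueAfter: o.Resync}, nil
	default:
		return result{}, nil
	}
}
```

For example, exitWith(outcome{Resync: 60 * time.Second}) corresponds to the last exit in the list above, while exitWith(outcome{Err: errAPIUnavailable}) corresponds to the first.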

Object lifecycle and cache

An object in k8s can be newly created, updated, or in the process of being deleted.

In addition, the controller has its own cache of object state. On the one hand, this means we need not worry about how many times we request an object’s state. On the other hand, because the cache can lag behind the cluster, we may end up trying to delete the same object twice, or the reconciler may be called for an object that has already been deleted. To handle such situations, check whether the returned error means the object is absent, using the errors.IsNotFound(err) function (from k8s.io/apimachinery/pkg/api/errors):

func (r *MultiSecretReconciler) deleteSecret(ctx context.Context, secret *corev1.Secret) error {
	log := ctrllog.FromContext(ctx)
	err := r.Delete(ctx, secret)
	if errors.IsNotFound(err) {
		log.Info("corev1.Secret resource not found. Ignoring since object must be deleted",
			"NameSpace", secret.Namespace, "Name", secret.Name)
		return nil
	}
	if err != nil {
		log.Error(err, "Failed to delete corev1.Secret",
			"NameSpace", secret.Namespace, "Name", secret.Name)
		return err
	}
	return nil
}

After every Get or Delete call, you need to determine what kind of error was returned: is it a problem reaching the API, or does the object simply not exist? Decide what to do next based on that. In the example below, if our MultiSecret object does not exist, we do nothing; if it is an API access error, we return the error so that the reconcile loop is queued for execution again.

	// Get MultiSecret object
	mSecret := &multiv1alpha1.MultiSecret{}
	err := r.Get(ctx, req.NamespacedName, mSecret)
	if err != nil {
		if errors.IsNotFound(err) {
			log.Info("MultiSecret resource not found. Ignoring since object must be deleted")
			return reconcile.Result{}, nil
		}
		log.Error(err, "Failed to get MultiSecret")
		return reconcile.Result{}, err
	}

Watching Resources

The SetupWithManager function defines the events that trigger the reconciler and maps each event to the object whose reconciler should be called.

The first method is For, which registers the “parent” object. Nothing needs to be invented here: the reconciler is called on its create/update/delete events.

If your controller only handles events on objects it created itself, and those objects live in the same namespace as the parent, you can use the Owns method.

An example from the official tutorial (https://sdk.operatorframework.io/docs/building-operators/golang/tutorial/):

func (r *MemcachedReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&cachev1alpha1.Memcached{}).
		Owns(&appsv1.Deployment{}).
		Complete(r)
}

The main thing is not to forget to set a reference to the “parent” object on the “children”:

// deploymentForMemcached returns a memcached Deployment object
func (r *MemcachedReconciler) deploymentForMemcached(m *cachev1alpha1.Memcached) *appsv1.Deployment {
	ls := labelsForMemcached(m.Name)
	replicas := m.Spec.Size

	dep := &appsv1.Deployment{
    ...
	}
	// Set Memcached instance as the owner and controller
	ctrl.SetControllerReference(m, dep, r.Scheme)
	return dep
}

Based on the requirements stated earlier, we need to handle:

  • Changes to our “parent” object

  • The objects that we create (secrets) in different namespaces

  • Objects that do not belong to us: newly created namespaces

// SetupWithManager sets up the controller with the Manager.
func (r *MultiSecretReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&multiv1alpha1.MultiSecret{}).
		Watches(
			&source.Kind{Type: &corev1.Secret{}},
			handler.EnqueueRequestsFromMapFunc(r.secretHandlerFunc),
			builder.WithPredicates(predicate.ResourceVersionChangedPredicate{}),
		).
		Watches(
			&source.Kind{Type: &corev1.Namespace{}},
			handler.Funcs{CreateFunc: r.nsHandlerFunc},
		).
		Complete(r)
}

Here we watch corev1.Secret objects and call the secretHandlerFunc function, which determines which “parent” object a secret belongs to.

func (r *MultiSecretReconciler) secretHandlerFunc(a client.Object) []reconcile.Request {
	anno := a.GetAnnotations()
	name, ok := anno[annotationOwnerName]
	namespace, ok2 := anno[annotationOwnerNamespace]
	if ok && ok2 {
		return []reconcile.Request{
			{
				NamespacedName: types.NamespacedName{
					Name:      name,
					Namespace: namespace,
				},
			},
		}
	}
	return []reconcile.Request{}
}

The function itself is simple: we look for the required annotations on the object and return a reconcile request for the corresponding “parent” object. The predicate specifies when to fire. This one fires on any change of the object’s resource version; creation and deletion events pass through as well.

For corev1.Namespace the behavior is similar, but we only handle namespace creation, and in nsHandlerFunc we enqueue reconcile requests for all of our “parent” objects.

func (r *MultiSecretReconciler) nsHandlerFunc(e event.CreateEvent, q workqueue.RateLimitingInterface) {
	multiSecretList := &multiv1alpha1.MultiSecretList{}
	err := r.List(context.TODO(), multiSecretList)
	if err != nil {
		return
	}
	for _, ms := range multiSecretList.Items {
		q.Add(reconcile.Request{NamespacedName: types.NamespacedName{
			Name:      ms.Name,
			Namespace: ms.Namespace,
		}})
	}
}

You can read more about watching resources in the official documentation.

Finalizers

Finalizers are placed on an object so that the necessary actions can be performed when it is deleted; in our case, deleting all “child” secrets.

If we create “child” objects in the same namespace as the “parent” object, we don’t need finalizers to remove the “children”.

It is enough to set an owner reference to the “parent” object, and k8s will delete them itself.

In our case, when the “parent” object is deleted we also want to delete all “child” objects in all namespaces, so we will use a finalizer.

The idea is simple: when an object is deleted, k8s sets its deletion timestamp and checks whether it has finalizers. While any remain, the object is not removed, which gives controllers time to complete the necessary actions.

In our code we implement the following logic:

  • If the parent object is not being deleted and does not have our finalizer, we add it

  • If the parent object is being deleted, we delete all “child” objects and, on success, remove the finalizer entry

	inFinalizeStage := false
	// Check Finalizer
	if mSecret.ObjectMeta.DeletionTimestamp.IsZero() {
		if !ctrlutil.ContainsFinalizer(mSecret, FinalizerName) {
			ctrlutil.AddFinalizer(mSecret, FinalizerName)
			if err := r.Update(ctx, mSecret); err != nil {
				return ctrl.Result{}, err
			}
			changed = true
		}
	} else {
		// The object is being deleted
		inFinalizeStage = true
		if ctrlutil.ContainsFinalizer(mSecret, FinalizerName) {
			// our finalizer is present, so lets handle any external dependency
			if err := r.deleteAllSecrets(ctx, genGlobalName(mSecret.Name, mSecret.Namespace, multiSecName), nameSpaces); err != nil {
				// if fail to delete the external dependency here, return with error
				// so that it can be retried
				return ctrl.Result{}, err
			}
			changed = true

			// remove our finalizer from the list and update it.
			ctrlutil.RemoveFinalizer(mSecret, FinalizerName)
			if err := r.Update(ctx, mSecret); err != nil {
				return ctrl.Result{}, err
			}
		}
	}
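The same finalizer flow can be modeled without a cluster, which makes the three states (finalizer missing, finalizer present, object being deleted) easy to test. This is a standard-library sketch under stated assumptions: the object type stands in for Kubernetes metadata, and finalizerName, contains, remove, and handleFinalizer are hypothetical names, not the operator’s own helpers.

```go
package main

// object is a minimal stand-in for the metadata of a Kubernetes object.
type object struct {
	Deleting   bool     // true once a deletion timestamp has been set
	Finalizers []string // names of finalizers still blocking removal
}

// finalizerName is a made-up finalizer identifier for this sketch.
const finalizerName = "example.com/cleanup"

// contains reports whether list includes s.
func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// remove returns a copy of list without any occurrence of s.
func remove(list []string, s string) []string {
	out := make([]string, 0, len(list))
	for _, v := range list {
		if v != s {
			out = append(out, v)
		}
	}
	return out
}

// handleFinalizer adds the finalizer to a live object, or runs cleanup and
// removes the finalizer from an object that is being deleted. The boolean
// result is true when the object is being deleted, meaning the caller
// should stop reconciling it.
func handleFinalizer(obj *object, cleanup func() error) (bool, error) {
	if !obj.Deleting {
		if !contains(obj.Finalizers, finalizerName) {
			obj.Finalizers = append(obj.Finalizers, finalizerName)
		}
		return false, nil
	}
	if contains(obj.Finalizers, finalizerName) {
		if err := cleanup(); err != nil {
			// Keep the finalizer so that the cleanup is retried later.
			return true, err
		}
		obj.Finalizers = remove(obj.Finalizers, finalizerName)
	}
	return true, nil
}
```

Note that the finalizer is removed only after cleanup succeeds; if cleanup fails, the object stays in the deleting state and the next reconcile tries again.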

Status

Let’s add status fields to our object so that they are displayed nicely in the output of get:

$ k get multisecrets.multi.ddnw.ml
NAME                 WANTED   CREATED   CHANGETIME
multisecret-sample   9        9         2022-05-31T12:14:35+03:00

Add the counters in the code:

	// Calculate Wanted Status
	sWantedStatus := 0
	existedSecrets := 0
	changed := false
	for _, ns := range nameSpaces {
		if nsInList(mSecret, ns) {
			sWantedStatus++
		}
	}

Add a status update on function exit. The update runs only when we exit without errors; to make this possible, we gave names to the return values:

func (r *MultiSecretReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrlRes ctrl.Result, ctrlErr error) 

The deferred function also shows how the object status is patched:

	// Update Status on reconcile exit
	defer func() {
		if ctrlErr == nil {
			if changed || sWantedStatus != mSecret.Status.Wanted || existedSecrets != mSecret.Status.Created {
				patch := client.MergeFrom(mSecret.DeepCopy())
				mSecret.Status.Wanted = sWantedStatus
				mSecret.Status.Created = existedSecrets
				mSecret.Status.ChangeTime = time.Now().Format(time.RFC3339)
				ctrlErr = r.Status().Patch(ctx, mSecret, patch)
			}
			if ctrlErr != nil {
				log.Error(ctrlErr, "Failed to update multiSecret Status",
					"Namespace", mSecret.Namespace, "Name", mSecret.Name)
			}
		}
	}()
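The interaction between named return values and defer is the Go detail that makes this work, so here it is in isolation. This is a standard-library sketch, not the operator’s code: the status type, the reconcile function with its work and save callbacks, and errStepFailed are hypothetical names used only for illustration.

```go
package main

import "errors"

// status is a stand-in for the status section of the parent object.
type status struct {
	Wanted  int
	Created int
}

// errStepFailed is an example error for callers of reconcile.
var errStepFailed = errors.New("reconcile step failed")

// reconcile runs work and, only if it succeeded, saves the status on exit.
// Because the result is named, the deferred function can both read the
// error that is about to be returned and replace it.
func reconcile(work func() (status, error), save func(status) error) (err error) {
	var st status
	defer func() {
		if err == nil {
			// A failed save becomes the error returned by reconcile.
			err = save(st)
		}
	}()

	st, err = work()
	if err != nil {
		return err // the deferred function sees err != nil and skips save
	}
	return nil
}
```

If work fails, save is never called and the original error is returned. If work succeeds but save fails, the caller still receives an error, so the request is retried.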

In addition, for the status columns to appear in the get output, you must add markers for CRD generation:

//+kubebuilder:object:root=true
//+kubebuilder:subresource:status
//+kubebuilder:printcolumn:name="Wanted",type=integer,JSONPath=`.status.wanted`
//+kubebuilder:printcolumn:name="Created",type=integer,JSONPath=`.status.created`
//+kubebuilder:printcolumn:name="ChangeTime",type=string,JSONPath=`.status.change_time`

Events

To make it easier to understand later what our controller is doing, let’s generate events (Events), which are visible, among other places, in the describe output for the multisecret object.

To do this, add a recorder to the reconciler structure and the required permissions in an RBAC marker:

// MultiSecretReconciler reconciles a MultiSecret object
// +kubebuilder:rbac:groups="",resources=events,verbs=create;patch
type MultiSecretReconciler struct {
	client.Client
	Scheme   *runtime.Scheme
	Recorder record.EventRecorder
}

In main.go add recorder initialization:

	if err = (&controllers.MultiSecretReconciler{
		Client:   mgr.GetClient(),
		Scheme:   mgr.GetScheme(),
		Recorder: mgr.GetEventRecorderFor("multisecret-controller"),
	}).SetupWithManager(mgr); err != nil {
		setupLog.Error(err, "unable to create controller", "controller", "MultiSecret")
		os.Exit(1)
	}

Subsequently, when creating and deleting secrets, we write events:

msg := fmt.Sprintf("Created corev1.Secret, NameSpace: %s, Name: %s", newSecret.Namespace, newSecret.Name)
r.Recorder.Event(mSecret, "Normal", "Created", msg)

$ k describe multisecrets.multi.itsumma.ru multisecret-sample
Name:     	multisecret-sample
....
Status:
  change_time:  2022-05-31T14:02:18+03:00
  Created:  	9
  Wanted:   	9
Events:
  Type	Reason   Age            	From                	Message
  ----	------   ----           	----                	-------
  Normal  Created  2s (x3 over 160m)  multisecret-controller  Created corev1.Secret, NameSpace: secret-operator-system, Name: multisecret-sample.secret-operator-system.multisec

In this article, I tried to show how to extend the standard example operator into a working one. We also looked at how to watch resources, update a resource’s status, control the reconciler, record events, and write your own finalizer.

PS: Many thanks to colleagues from IT-Summa for their contribution to the creation of this article.
