Understanding the nuances of writing an operator in Go
Operators are software extensions to Kubernetes that make use of custom resources to manage applications and their components. Operators follow Kubernetes principles, notably the control loop. — from kubernetes.io
In this article I outline what to look out for when writing an operator in Go, including the nuances that the official tutorial and similar articles mention only in passing or not at all.
I will briefly show:
How to prepare the environment for creating an operator
How to write the program and what we can do inside the main event-handling function (the reconciler)
When the reconciler is called and how to control that
How to exit the reconciler
How to consistently create and delete cluster objects
As an example, we will create a secret-operator that will:
Create the necessary secrets in all cluster namespaces
Generate secrets when creating a new namespace
Restore a secret if someone deletes it
Delete all children if our root object is deleted
What this operator does NOT do, to simplify the code:
A bit of theory
The operator pattern as implemented by controller-runtime (kubebuilder, operator-sdk) is very similar to the Observer pattern (2). We “subscribe” to k8s events about the creation, modification, and deletion of the objects we must respond to. When these resources change, the reconcile function is called and receives the name of the “parent” object that the event relates to. The reconcile function checks the state of parent, child, and other objects and reacts accordingly. How the event subscription works and how the reconcile loop runs is described in more detail below.
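The point that the reconcile function receives only the name of the “parent” object, and not the event itself, can be illustrated with a small standard-library model. This is only a sketch: Key, Queue, and Drain are hypothetical names invented for the illustration and are not part of controller-runtime. The model assumes that several events about the same parent that arrive before the worker gets to them collapse into a single reconcile call.

```go
package main

import "fmt"

// Key identifies a "parent" object, like a namespaced name.
type Key struct{ Namespace, Name string }

// Queue is a toy work queue that collapses duplicate keys while they wait.
type Queue struct {
	pending map[Key]bool
	order   []Key
}

func NewQueue() *Queue { return &Queue{pending: map[Key]bool{}} }

// Add enqueues a key unless it is already waiting.
func (q *Queue) Add(k Key) {
	if q.pending[k] {
		return
	}
	q.pending[k] = true
	q.order = append(q.order, k)
}

// Drain calls reconcile once per waiting key and returns the number of calls.
func (q *Queue) Drain(reconcile func(Key)) int {
	n := 0
	for len(q.order) > 0 {
		k := q.order[0]
		q.order = q.order[1:]
		delete(q.pending, k)
		reconcile(k)
		n++
	}
	return n
}

func main() {
	q := NewQueue()
	parent := Key{Namespace: "default", Name: "multisecret-sample"}
	// Three events about the same parent arrive before the worker runs.
	q.Add(parent)
	q.Add(parent)
	q.Add(parent)
	calls := q.Drain(func(k Key) {
		// The reconciler only knows which parent to look at, not what happened.
		fmt.Println("reconcile", k.Namespace+"/"+k.Name)
	})
	fmt.Println("calls:", calls)
}
```

Because the reconciler cannot tell which event triggered it, it has to re-read the current state every time, which is why idempotency matters so much later in the article.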
Preparing the Development Environment
Installing golang
Download the archive for your OS from the link.
Unzip the archive, for example, to the /opt/go-1.19.4 directory
Create a working directory for go and set environment variables
mkdir ~/go-1.19
export GOROOT=/opt/go-1.19.4
export GOPATH=~/go-1.19
export PATH=$GOROOT/bin:$GOPATH/bin:$PATH
Installing the operator SDK
Download the binary executable and verify it (link)
export ARCH=$(case $(uname -m) in x86_64) echo -n amd64 ;; aarch64)
echo -n arm64 ;; *) echo -n $(uname -m) ;; esac)
export OS=$(uname | awk '{print tolower($0)}')
export OPERATOR_SDK_DL_URL=https://github.com/operator-framework/operator-sdk/releases/download/v1.26.0
curl -LO ${OPERATOR_SDK_DL_URL}/operator-sdk_${OS}_${ARCH}
gpg --keyserver keyserver.ubuntu.com --recv-keys 052996E2A20B5C7E
curl -LO ${OPERATOR_SDK_DL_URL}/checksums.txt
curl -LO ${OPERATOR_SDK_DL_URL}/checksums.txt.asc
gpg -u "Operator SDK (release) <cncf-operator-sdk@cncf.io>" --verify checksums.txt.asc
grep operator-sdk_${OS}_${ARCH} checksums.txt | sha256sum -c -
chmod +x operator-sdk_${OS}_${ARCH} && sudo mv operator-sdk_${OS}_${ARCH} /usr/local/bin/operator-sdk
Installing the IDE
If you don’t have a preferred IDE, use GoLand; you can download it here. A 30-day trial is available when you register by e-mail.
After opening the first project, all that remains is to set GOROOT and GOPATH in the settings (File -> Settings -> Go)
Preparing the operator SDK project
The description on the official website is here
The project source code is stored on GitHub
Let’s create a new project:
mkdir -p ~/go-1.19/src/github.com/ddnw/secret-operator
cd ~/go-1.19/src/github.com/ddnw/secret-operator
operator-sdk init --domain ddnw.ml --repo github.com/ddnw/secret-operator
Create a new API and controller:
operator-sdk create api --group multi --version v1alpha1 --kind MultiSecret --resource --controller
In the Makefile, change the name of the Docker image to the one you need
IMAGE_TAG_BASE ?= ddnw/secret-operator
IMG ?= $(IMAGE_TAG_BASE):$(VERSION)
SDK command hint
# Run code generation
make generate
# Generate manifests
make manifests
# Build and push the container image
make docker-build docker-push
# Install the CRDs
make install
# Deploy the controller to the cluster
make deploy
# Remove the CRDs and the controller from the cluster
make undeploy
# Remove the CRDs
make uninstall
# Create a sample object
kubectl apply -f config/samples/multi_v1alpha1_multisecret.yaml
Operator code structure
The operator code is divided into 2 main parts:
api – declaring our new “parent” entity for k8s
controller – code that reads the desired state of k8s objects and tries to apply it to k8s
API
After running operator-sdk create api, files were generated under api/v1alpha1. The multisecret_types.go file describes our new “parent” entity, for which we will write most of the subsequent code.
Let’s add to the spec section the fields that we will later copy into our secrets. Do not forget the json tags, so that k8s can serialize this data later.
After each change in this part of the code, run make generate to regenerate the corresponding code in the zz_generated.deepcopy.go file.
// MultiSecretSpec defines the desired state of MultiSecret
type MultiSecretSpec struct {
Data map[string][]byte `json:"data,omitempty"`
StringData map[string]string `json:"stringData,omitempty"`
Type SecretType `json:"type,omitempty"`
}
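To see what the json tags do, here is a small standard-library sketch. It is not the operator’s code: specSketch is a hypothetical copy of the struct in which Type is a plain string, an assumption made only to keep the example free of Kubernetes imports. It shows that omitempty drops unset fields and that []byte values are encoded as base64 strings, the same form in which a Secret’s data field appears in manifests.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// specSketch mirrors the field tags of MultiSecretSpec. Type is a plain
// string here (an assumption made to avoid Kubernetes imports).
type specSketch struct {
	Data       map[string][]byte `json:"data,omitempty"`
	StringData map[string]string `json:"stringData,omitempty"`
	Type       string            `json:"type,omitempty"`
}

// encode returns the JSON form of the spec.
func encode(s specSketch) (string, error) {
	b, err := json.Marshal(s)
	return string(b), err
}

func main() {
	out, err := encode(specSketch{
		Data: map[string][]byte{"password": []byte("secret")},
	})
	if err != nil {
		panic(err)
	}
	// StringData and Type are empty, so omitempty leaves them out.
	fmt.Println(out) // {"data":{"password":"c2VjcmV0"}}
}
```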
We also add a description of the status structure of our “parent” object
// MultiSecretStatus defines the observed state of MultiSecret
type MultiSecretStatus struct {
Wanted int `json:"wanted"`
Created int `json:"created"`
ChangeTime string `json:"change_time,omitempty"`
}
Controller
The main logic of the multiSecret controller
Our controller executes the following logic:
When the reconciler is called, the multiv1alpha1.MultiSecret{} object is requested
If it doesn’t exist, exit
If the “parent” object is being deleted, delete all “child” secrets and exit (finalizer)
Check which “child” secrets exist in each namespace and, according to the specification of the “parent” object, create or delete them
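The last step of the list above, comparing what should exist with what does exist, can be sketched with the standard library alone. The plan function and its arguments are hypothetical names for this illustration; the real controller works with Kubernetes objects rather than with plain strings.

```go
package main

import (
	"fmt"
	"sort"
)

// plan compares the namespaces where a secret is wanted with the namespaces
// where it already exists and returns what to create and what to delete.
func plan(wanted, existing []string) (create, remove []string) {
	want := make(map[string]bool, len(wanted))
	for _, ns := range wanted {
		want[ns] = true
	}
	have := make(map[string]bool, len(existing))
	for _, ns := range existing {
		have[ns] = true
	}
	for ns := range want {
		if !have[ns] {
			create = append(create, ns)
		}
	}
	for ns := range have {
		if !want[ns] {
			remove = append(remove, ns)
		}
	}
	// Map iteration order is random, so sort for a stable result.
	sort.Strings(create)
	sort.Strings(remove)
	return create, remove
}

func main() {
	create, remove := plan(
		[]string{"default", "dev", "prod"},
		[]string{"default", "old"},
	)
	fmt.Println("create:", create) // create: [dev prod]
	fmt.Println("delete:", remove) // delete: [old]
}
```

Once the plan has been applied, running it again yields two empty lists, which is the property a reconciler needs.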
Reconcile-loop
Reconcile is the main function, inside which we check the state of objects and bring them to the desired state. The function must be idempotent: you don’t know at what point in the objects’ lifetime it will be called.
Exiting a function
There are several ways to exit the reconcile function, depending on whether the loop needs to be restarted:
ctrl.Result{}, err – an error occurred that prevents us from continuing; we return it so that the loop is restarted later.
ctrl.Result{Requeue: true}, nil – there are no runtime errors, but we hand control back to the controller so that it can process other objects and return to the current one later.
ctrl.Result{}, nil – no restart is required, the desired state equals the existing one.
ctrl.Result{RequeueAfter: 60 * time.Second}, nil – restart the loop after the given time. This can be used to make sure that the state of objects is checked and applied at a given interval.
Object lifecycle and cache
An object in k8s can be created, updated, or in the process of being deleted.
In addition, the controller has its own cache of object state. On the one hand, this means we need not care how many times we request the state of an object; on the other hand, we may end up deleting the same object twice, or the reconciler may be called for an object that has already been deleted. To handle such situations, check whether the returned error means the object is absent, using the errors.IsNotFound(err) function:
func (r *MultiSecretReconciler) deleteSecret(ctx context.Context, secret *corev1.Secret) error {
log := ctrllog.FromContext(ctx)
err := r.Delete(ctx, secret)
if errors.IsNotFound(err) {
log.Info("corev1.Secret resource not found. Ignoring since object must be deleted",
"NameSpace", secret.Namespace, "Name", secret.Name)
return nil
}
if err != nil {
log.Error(err, "Failed to delete corev1.Secret",
"NameSpace", secret.Namespace, "Name", secret.Name)
return err
}
return nil
}
After each get or delete request, you need to find out what kind of error was returned: is it a problem accessing the API, or does the object simply not exist? Decide what to do next based on that. In the example below, if our MultiSecret object does not exist, we do nothing; if it is an API access error, we return the error so that the reconcile loop is queued for execution again.
// Get MultiSecret object
mSecret := &multiv1alpha1.MultiSecret{}
err := r.Get(ctx, req.NamespacedName, mSecret)
if err != nil {
if errors.IsNotFound(err) {
log.Info("MultiSecret resource not found. Ignoring since object must be deleted")
return reconcile.Result{}, nil
}
log.Error(err, "Failed to get MultiSecret")
return reconcile.Result{}, err
}
Watching Resources
The SetupWithManager function defines the events on which the reconciler should run, and also maps each event to the object whose reconciler should be called.
The first method is For, for the “parent” object. There is nothing to invent here: the reconciler will be called on its creation, change, and deletion events.
If your controller only handles events on objects that it created itself, and those objects live in the same namespace as the parent, you can use the Owns method. An example from the official tutorial:
https://sdk.operatorframework.io/docs/building-operators/golang/tutorial/
func (r *MemcachedReconciler) SetupWithManager(mgr ctrl.Manager) error {
return ctrl.NewControllerManagedBy(mgr).
For(&cachev1alpha1.Memcached{}).
Owns(&appsv1.Deployment{}).
Complete(r)
}
The main thing is not to forget to set a reference to the “parent” object on the “children”:
// deploymentForMemcached returns a memcached Deployment object
func (r *MemcachedReconciler) deploymentForMemcached(m *cachev1alpha1.Memcached) *appsv1.Deployment {
ls := labelsForMemcached(m.Name)
replicas := m.Spec.Size
dep := &appsv1.Deployment{
...
}
// Set Memcached instance as the owner and controller
ctrl.SetControllerReference(m, dep, r.Scheme)
return dep
}
Based on the requirements stated earlier, we need to handle:
Change events of our “parent” object
Events on objects that we create (secrets) in different namespaces
Events on objects that do not belong to us: the creation of new namespaces
// SetupWithManager sets up the controller with the Manager.
func (r *MultiSecretReconciler) SetupWithManager(mgr ctrl.Manager) error {
return ctrl.NewControllerManagedBy(mgr).
For(&multiv1alpha1.MultiSecret{}).
Watches(
&source.Kind{Type: &corev1.Secret{}},
handler.EnqueueRequestsFromMapFunc(r.secretHandlerFunc),
builder.WithPredicates(predicate.ResourceVersionChangedPredicate{}),
).
Watches(
&source.Kind{Type: &corev1.Namespace{}},
handler.Funcs{CreateFunc: r.nsHandlerFunc},
).
Complete(r)
}
Here we watch corev1.Secret objects and call secretHandlerFunc, which determines which “parent” object a secret belongs to.
func (r *MultiSecretReconciler) secretHandlerFunc(a client.Object) []reconcile.Request {
anno := a.GetAnnotations()
name, ok := anno[annotationOwnerName]
namespace, ok2 := anno[annotationOwnerNamespace]
if ok && ok2 {
return []reconcile.Request{
{
NamespacedName: types.NamespacedName{
Name: name,
Namespace: namespace,
},
},
}
}
return []reconcile.Request{}
}
The function itself is simple: we look for the required annotations on the object and return a request for the reconciler of the corresponding “parent” object. The predicate specifies when the handler fires. This predicate is triggered by any change in the object’s resource version; creation and deletion are included as well.
For corev1.Namespace the behavior is similar, but we handle only namespace creation, and in nsHandlerFunc we enqueue reconcile requests for all of our “parent” objects.
func (r *MultiSecretReconciler) nsHandlerFunc(e event.CreateEvent, q workqueue.RateLimitingInterface) {
multiSecretList := &multiv1alpha1.MultiSecretList{}
err := r.List(context.TODO(), multiSecretList)
if err != nil {
return
}
for _, ms := range multiSecretList.Items {
q.Add(reconcile.Request{NamespacedName: types.NamespacedName{
Name: ms.Name,
Namespace: ms.Namespace,
}})
}
}
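The behavior of the predicate used for secrets can be modeled in a few lines. The sketch below is not the controller-runtime implementation: eventKind and passes are hypothetical names, and the model assumes, as stated in the text, that creation and deletion always pass while an update passes only if the resource version changed. One practical consequence, as far as I understand it, is that updates which do not change the version (for example periodic cache resyncs) do not trigger the reconciler.

```go
package main

import "fmt"

// eventKind is a stand-in for the event types discussed in the text.
type eventKind int

const (
	created eventKind = iota
	updated
	deleted
)

// passes models the filter: create and delete events always pass, update
// events pass only when the resource version actually changed.
func passes(kind eventKind, oldVersion, newVersion string) bool {
	if kind == updated {
		return oldVersion != newVersion
	}
	return true
}

func main() {
	fmt.Println(passes(created, "", "1"))  // true
	fmt.Println(passes(updated, "1", "1")) // false
	fmt.Println(passes(updated, "1", "2")) // true
	fmt.Println(passes(deleted, "2", ""))  // true
}
```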
You can read more about watching resources here.
Finalizers
Finalizers are placed on an object so that the necessary actions can be performed when it is deleted; in our case, deleting all “child” secrets.
If we create “child” objects in the same namespace as the “parent” object, we don’t need finalizers to remove the “children”. It is enough to point them at the “parent” object, and k8s will delete them itself (more).
In our case, when the “parent” object is deleted, we also want to delete all “child” objects in all namespaces, so we will use a finalizer.
The idea is simple: when an object is deleted, k8s sets its deletion timestamp and checks whether it has finalizers. While any remain, the object is not removed, which gives controllers time to finish the necessary actions.
In our code we implement the following logic:
If the parent object is not being deleted and does not have our finalizer, we add it
If the parent object is being deleted, we delete all “child” objects and, on success, remove the finalizer entry
inFinalizeStage := false
// Check Finalizer
if mSecret.ObjectMeta.DeletionTimestamp.IsZero() {
if !ctrlutil.ContainsFinalizer(mSecret, FinalizerName) {
ctrlutil.AddFinalizer(mSecret, FinalizerName)
if err := r.Update(ctx, mSecret); err != nil {
return ctrl.Result{}, err
}
changed = true
}
} else {
// The object is being deleted
inFinalizeStage = true
if ctrlutil.ContainsFinalizer(mSecret, FinalizerName) {
// our finalizer is present, so lets handle any external dependency
if err := r.deleteAllSecrets(ctx, genGlobalName(mSecret.Name, mSecret.Namespace, multiSecName), nameSpaces); err != nil {
// if fail to delete the external dependency here, return with error
// so that it can be retried
return ctrl.Result{}, err
}
changed = true
// remove our finalizer from the list and update it.
ctrlutil.RemoveFinalizer(mSecret, FinalizerName)
if err := r.Update(ctx, mSecret); err != nil {
return ctrl.Result{}, err
}
}
}
Status
Let’s add status fields to our object so that they are displayed nicely in the output of get:
$ k get multisecrets.multi.ddnw.ml
NAME WANTED CREATED CHANGETIME
multisecret-sample 9 9 2022-05-31T12:14:35+03:00
Add counters in the code:
// Calculate Wanted Status
sWantedStatus := 0
existedSecrets := 0
changed := false
for _, ns := range nameSpaces {
if nsInList(mSecret, ns) {
sWantedStatus++
}
}
Add a status update that runs when the function exits. The update is performed only when exiting without errors; for this we gave names to the return values:
func (r *MultiSecretReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrlRes ctrl.Result, ctrlErr error)
Here you can also see how the object status is patched:
// Update Status on reconcile exit
defer func() {
if ctrlErr == nil {
if changed || sWantedStatus != mSecret.Status.Wanted || existedSecrets != mSecret.Status.Created {
patch := client.MergeFrom(mSecret.DeepCopy())
mSecret.Status.Wanted = sWantedStatus
mSecret.Status.Created = existedSecrets
mSecret.Status.ChangeTime = time.Now().Format(time.RFC3339)
ctrlErr = r.Status().Patch(ctx, mSecret, patch)
}
if ctrlErr != nil {
log.Error(ctrlErr, "Failed to update multiSecret Status",
"Namespace", mSecret.Namespace, "Name", mSecret.Name)
}
}
}()
In addition, for the status columns to appear in the output of get, you must set markers for CRD generation (link for more information).
//+kubebuilder:object:root=true
//+kubebuilder:subresource:status
//+kubebuilder:printcolumn:name="Wanted",type=integer,JSONPath=`.status.wanted`
//+kubebuilder:printcolumn:name="Created",type=integer,JSONPath=`.status.created`
//+kubebuilder:printcolumn:name="ChangeTime",type=string,JSONPath=`.status.change_time`
Events
To understand later what our controller has been doing, let’s add event generation (Events); the events can be seen, among other places, in the describe output for the multisecret object.
To do this, add a recorder to the reconciler structure, and the required permissions in a marker:
// MultiSecretReconciler reconciles a MultiSecret object
// +kubebuilder:rbac:groups="",resources=events,verbs=create;patch
type MultiSecretReconciler struct {
client.Client
Scheme *runtime.Scheme
Recorder record.EventRecorder
}
In main.go add recorder initialization:
if err = (&controllers.MultiSecretReconciler{
Client: mgr.GetClient(),
Scheme: mgr.GetScheme(),
Recorder: mgr.GetEventRecorderFor("multisecret-controller"),
}).SetupWithManager(mgr); err != nil {
setupLog.Error(err, "unable to create controller", "controller", "MultiSecret")
os.Exit(1)
}
Subsequently, when creating and deleting secrets, we write events:
msg := fmt.Sprintf("Created corev1.Secret, NameSpace: %s, Name: %s", newSecret.Namespace, newSecret.Name)
r.Recorder.Event(mSecret, "Normal", "Created", msg)
k describe multisecrets.multi.ddnw.ml multisecret-sample
Name: multisecret-sample
....
Status:
change_time: 2022-05-31T14:02:18+03:00
Created: 9
Wanted: 9
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Created 2s (x3 over 160m) multisecret-controller Created corev1.Secret, NameSpace: secret-operator-system, Name: multisecret-sample.secret-operator-system.multisec
In this article I tried to show how to extend the standard operator from the tutorial example to a working state. We also looked at how to watch resources, change the status of a resource, control the reconciler, write events, and create your own finalizer.
PS: Many thanks to colleagues from IT-Summa for their contribution to the creation of this article.