Restoring MySQL: Solving the Challenge

Hello! I'm Sasha Khrennikov, head of the DevOps unit at KTS.

We recently ran a DevOps challenge in which participants had to recover a broken MySQL instance. It was not easy. The seven strongest DevOps masters, to whom we are already sending prize merch, completed it the fastest. Here they are:

@ovsjke (Completion time: 31m 27s)
@mr_hightlook (Completion time: 34m 6s)
@angapov (Completion time: 51m 49s)
@iTem86 (Completion time: 55m 14s)
@Yepcock (Completion time: 56m 0s)
@Benosa19 (Completion time: 58m 51s)
@ovss_s (Completion time: 59m 38s)

In this article I will analyze the problem and show two ways to solve it.

How to understand that the cluster is broken

First, let's figure out what happened to our cluster.

The first thing we see is the pod mysql-0, which is stuck in the Terminating state. To understand what happened, let's look at the output of kubectl describe. Among other errors, we get the following line:

Handler 'on_pod_delete' failed temporarily: Cluster cannot be restored because there are unreachable pods

It looks like the pod has a stuck finalizer.
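To confirm the suspicion rather than guess, the pod's metadata can be queried directly. Below is a minimal sketch, assuming the pod lives in the current namespace; the helper name show_deletion_state is ours and not part of kubectl.

```shell
# Print when deletion was requested and which finalizers still block it.
# show_deletion_state is a hypothetical helper name, not a kubectl command.
show_deletion_state() {
  pod="$1"
  kubectl get pod "$pod" \
    -o jsonpath='{.metadata.deletionTimestamp}{"\n"}{.metadata.finalizers}{"\n"}'
}

# Usage (against the broken cluster):
#   show_deletion_state mysql-0
```

A non-empty deletion timestamp together with a non-empty finalizer list means the API server is waiting for some controller to remove those finalizers.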

To clear it, you can safely use kubectl edit or kubectl patch, whichever you are more used to. Just remove these lines:

finalizers: 
- mysql.oracle.com/membership
- kopf.zalando.org/KopfFinalizerMarker
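If editing by hand is inconvenient, the same removal can be done non-interactively. Here is a sketch using a JSON merge patch, where setting metadata.finalizers to null drops the whole list. It bypasses the operator's own cleanup, so it is only reasonable here because the operator handler is already failing. clear_finalizers is a hypothetical helper name.

```shell
# Remove every finalizer from one object so that a pending delete can finish.
# clear_finalizers is a hypothetical helper name, not a kubectl command.
clear_finalizers() {
  kind="$1"
  name="$2"
  kubectl patch "$kind" "$name" --type=merge \
    -p '{"metadata":{"finalizers":null}}'
}

# Usage:
#   clear_finalizers pod mysql-0
```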

After this, the pod starts, but the service still does not respond. Let's continue the investigation and look at the MySQL log:

[Note] [MY-010926] [Server] Access denied for user 'mysqladmin'@'10-1-128-45.mysql-operator.mysql-operator.svc.cluster.local' (using password: YES)

Googling the username “mysqladmin”, the first result tells us that this is a service user created by the operator. Its password is stored in the mysql-privsecrets secret.
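Reading the secret can be sketched as follows. To avoid guessing key names, this version prints every key of the secret with its decoded value, using the base64decode function that kubectl provides in go-template output. dump_secret is a hypothetical helper name; keep in mind that the output contains plaintext credentials.

```shell
# Print all key=value pairs of a secret with values base64-decoded.
# dump_secret is a hypothetical helper name.
dump_secret() {
  kubectl get secret "$1" \
    -o go-template='{{range $k, $v := .data}}{{$k}}={{$v | base64decode}}{{"\n"}}{{end}}'
}

# Usage:
#   dump_secret mysql-privsecrets
```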

We extract the password from the secret, open a shell in the mysql container, and confirm that the password is rejected. It looks like some villain changed it…

kubectl exec -it mysql-0 -c mysql -- /bin/bash
bash-5.1$ mysql -u mysqladmin -p
Enter password:
ERROR 1045 (28000): Access denied for user 'mysqladmin'@'localhost' (using password: YES)

How to restore the cluster? There are two ways.

Option one

We have the root user credentials, and they still work. So a simple and clear fix is to copy the root account from the mycluster-cluster-secret secret into the mysql-privsecrets secret: both the username and the password.
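The copy can be sketched like this. The key names (rootUser and rootPassword in the source secret, clusterAdminUsername and clusterAdminPassword in the target) are assumptions about how the operator lays out its secrets, so check them with kubectl get secret -o yaml before running anything. Values under .data are already base64-encoded, which lets us copy them without decoding.

```shell
# ASSUMPTION: the key names below match the operator's secret layout;
# verify them with `kubectl get secret <name> -o yaml` first.
SRC_SECRET=mycluster-cluster-secret
DST_SECRET=mysql-privsecrets

# Read one base64-encoded value from a secret's .data map.
secret_field() {
  kubectl get secret "$1" -o jsonpath="{.data.$2}"
}

# Build a merge patch from two already-encoded values.
build_patch() {
  printf '{"data":{"clusterAdminUsername":"%s","clusterAdminPassword":"%s"}}' "$1" "$2"
}

# Copy the root username and password into the operator's service secret.
copy_root_credentials() {
  user="$(secret_field "$SRC_SECRET" rootUser)"
  pass="$(secret_field "$SRC_SECRET" rootPassword)"
  kubectl patch secret "$DST_SECRET" --type=merge -p "$(build_patch "$user" "$pass")"
}

# Usage:
#   copy_root_credentials
```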

After that, we see the long-awaited lines in the mysql-router logs:

metadata_cache WARNING [7f20d83d6640] Failed fetching metadata from metadata server on mysql-0.mysql-instances.default.svc.cluster.local:3306 - No result returned for metadata query
metadata_cache WARNING [7f20d83d6640] Metadata server mysql-0.mysql-instances.default.svc.cluster.local:3306 is not an online GR member - skipping.
metadata_cache INFO [7f20d83d6640] Potential changes detected in cluster after metadata refresh (view_id=0)
metadata_cache INFO [7f20d83d6640] Metadata for cluster 'mysql' has 1 member(s), single-primary:  
metadata_cache INFO [7f20d83d6640]     mysql-0.mysql-instances.default.svc.cluster.local:3306 / 33060 - mode=RW

Option two

We can see that only one database holds data, so we take a dump of it.

kubectl exec -i mysql-0 -c mysql -- mysqldump -u root -p**** KTS > kts.dump

Save the cluster config and clean out the unnecessary entries.

kubectl get InnoDBCluster mysql -o yaml > cluster.yaml
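Which entries are unnecessary is a judgment call. One reasonable reading, and it is our assumption, is the server-managed part: the status block plus metadata fields such as uid, resourceVersion, creationTimestamp, generation, finalizers and managedFields. Here is a rough awk sketch that relies on the two-space indentation of kubectl's YAML output; it is not a YAML parser, so review the result before applying it. clean_manifest is a hypothetical helper name.

```shell
# Strip server-managed fields from a manifest read on stdin.
# clean_manifest is a hypothetical helper; it assumes kubectl-style YAML
# with two-space indentation and is not a general YAML parser.
clean_manifest() {
  awk '
    # Every top-level key starts a new section and ends any skipping.
    /^[^ ]/ { section = $0; skip = 0 }
    # Skip the whole status block.
    /^status:/ { skip = 1 }
    # Inside metadata, each direct child key ends a skipped block ...
    section ~ /^metadata:/ && /^  [^ -]/ { skip = 0 }
    # ... unless that key is itself a block we want to drop.
    section ~ /^metadata:/ && /^  (finalizers|managedFields):/ { skip = 1 }
    # Drop single-line server-managed metadata fields.
    section ~ /^metadata:/ && /^  (uid|resourceVersion|creationTimestamp|generation):/ { next }
    !skip { print }
  '
}

# Usage:
#   kubectl get InnoDBCluster mysql -o yaml | clean_manifest > cluster.yaml
```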

Delete the cluster and its PVC. You may need to clean up finalizers here as well.

kubectl delete InnoDBCluster mysql
kubectl delete pvc datadir-mysql-0

Recreate the cluster. Since the mycluster-cluster-secret is created separately from the cluster, the root user password will not change.

kubectl apply -f cluster.yaml
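The new pod takes a while to appear and become ready, and kubectl wait fails immediately if the pod does not exist yet. A small retry wrapper is one way to bridge that gap. This is a generic sketch; the attempt count and delays in the usage line are arbitrary values, and retry is a hypothetical helper name.

```shell
# Run a command until it succeeds, at most $1 times, sleeping $2 seconds
# between attempts. Returns non-zero if every attempt fails.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while ! "$@"; do
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Usage (numbers are arbitrary):
#   retry 20 15 kubectl wait --for=condition=Ready pod/mysql-0 --timeout=30s
```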

Finally, the last step is to create the database and load the dump back.

kubectl exec -i mysql-0 -c mysql -- mysql -u root -p**** -e "create database KTS;"
kubectl exec -i mysql-0 -c mysql -- mysql -u root -p**** -D KTS < kts.dump
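A quick sanity check after the restore is to compare the number of tables in the dump file with the number of tables in the restored database. The sketch below assumes the root password is available in a ROOT_PASSWORD environment variable (our name, not something the operator sets) and counts only base tables, so views are ignored on both sides.

```shell
# Count CREATE TABLE statements in a mysqldump file.
dump_table_count() {
  grep -c '^CREATE TABLE' "$1"
}

# Count base tables in a restored schema.
# ASSUMPTION: ROOT_PASSWORD holds the root password.
db_table_count() {
  kubectl exec -i mysql-0 -c mysql -- mysql -u root -p"$ROOT_PASSWORD" -N -B \
    -e "SELECT COUNT(*) FROM information_schema.tables WHERE table_schema='$1' AND table_type='BASE TABLE';"
}

# Usage:
#   [ "$(dump_table_count kts.dump)" = "$(db_table_count KTS)" ] && echo "table counts match"
```

Matching counts do not prove that every row came back, but a mismatch is a cheap early warning.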

How MySQL operator secrets work

A peculiarity of the operator is that, when creating a cluster, it generates service secrets (including privsecrets) and, while initializing the database, creates users based on those secrets.

After that, these secrets are used only to connect to the running database. For us, this means that if such a password is lost, you will not be able to connect to the database or reset the password without an incident. So if you would rather avoid this kind of trouble when running a cluster, consider backing up these secrets.
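A backup of this kind can be as simple as exporting the secrets to a file. Below is a minimal sketch; the file name in the usage line is arbitrary and backup_secrets is a hypothetical helper name. The file holds credentials that are only base64-encoded, so it is created with owner-only permissions and should be stored accordingly.

```shell
# Export the given secrets to one YAML file readable only by its owner.
# backup_secrets is a hypothetical helper name.
backup_secrets() {
  out="$1"
  shift
  # The subshell keeps the restrictive umask from leaking into the caller.
  ( umask 077; kubectl get secret "$@" -o yaml > "$out" )
}

# Usage:
#   backup_secrets mysql-secrets-backup.yaml mycluster-cluster-secret mysql-privsecrets
```

Before re-applying such a file, server-managed fields like resourceVersion and uid may need to be removed from it.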

We have already held three DevOps challenges and will announce another one soon! Don't miss it: open our bot and send /start, and you'll be among the first to receive an invitation to the challenge.

We've sorted out MySQL. If you want more useful content on DevOps, take a look at our other articles.
