MySQL encryption: keystore

In anticipation of the start of a new enrollment in the "Database" course, we have prepared a translation of a useful article for you.


Transparent Data Encryption (TDE) has been available in Percona Server for MySQL and MySQL for quite some time. But have you ever wondered how it works under the hood and what impact TDE can have on your server? In this series of articles we will look at how TDE works internally. Let's start with key storage, since it is required for any encryption to work. Then we will take a closer look at how encryption works in Percona Server for MySQL / MySQL and what additional features Percona Server for MySQL offers.

MySQL Keyring

Keyrings are plugins that allow the server to query, create, and delete keys in a local file (keyring_file) or on a remote server (such as HashiCorp Vault). Keys are always cached locally to speed up retrieval.
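
As a quick check of which keyring plugin a given server has loaded, you can query INFORMATION_SCHEMA.PLUGINS (a minimal sketch; the exact plugin name depends on your configuration):

-- List loaded keyring plugins and their status.
SELECT PLUGIN_NAME, PLUGIN_STATUS
FROM INFORMATION_SCHEMA.PLUGINS
WHERE PLUGIN_NAME LIKE 'keyring%';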

Plugins can be divided into two categories:

  • Local storage. For example, a local file (we call this a file-based keyring).
  • Remote storage. For example, a Vault server (we call this a server-based keyring).

This separation is important because different types of storage behave slightly differently, not only when storing and retrieving keys, but also when starting up.

When using file-based storage, the entire contents of the storage are loaded into the cache at startup: key id, key user, key type, and the key itself.
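
On MySQL 8.0.16 and later you can see what ended up in this cache: the performance_schema.keyring_keys table lists the cached key ids and owners (the key material itself is never exposed there). A minimal sketch:

-- Inspect the key ids and owners currently known to the keyring.
SELECT KEY_ID, KEY_OWNER FROM performance_schema.keyring_keys;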

In the case of server-based storage (such as a Vault server), only the key id and key user are loaded at startup, so retrieving all the keys does not slow down startup. Keys are loaded lazily; that is, a key itself is fetched from Vault only when it is actually needed. Once loaded, the key is cached in memory, so future accesses do not require TLS connections to the Vault server. Next, let's look at what information is present in the key store.

The key information contains the following:

  • key id – key identifier, for example:
    INNODBKey-764d382a-7324-11e9-ad8f-9cb6d0d5dc99-1
  • key type – key type based on the used encryption algorithm, possible values: “AES”, “RSA” or “DSA”.
  • key length – the key length in bytes; AES: 16, 24, or 32; RSA: 128, 256, or 512; DSA: 128, 256, or 384.
  • user – the owner of the key. If the key is a system key, for example, Master Key, then this field is empty. If a key is generated using keyring_udf, this field identifies the owner of the key.
  • the key itself

The key is uniquely identified by the pair: key_id, user.
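
To make these fields concrete, here is a minimal sketch using the keyring_udf plugin mentioned above (assuming the plugin library is available; the key name my_app_key is just an example). The invoking MySQL account becomes the key's user, and the type and length can be read back:

-- One-time setup: load keyring_udf and create its functions.
INSTALL PLUGIN keyring_udf SONAME 'keyring_udf.so';
CREATE FUNCTION keyring_key_generate RETURNS INTEGER SONAME 'keyring_udf.so';
CREATE FUNCTION keyring_key_type_fetch RETURNS STRING SONAME 'keyring_udf.so';
CREATE FUNCTION keyring_key_length_fetch RETURNS INTEGER SONAME 'keyring_udf.so';

-- Generate a random 32-byte AES key; the current user becomes its owner.
SELECT keyring_key_generate('my_app_key', 'AES', 32);

-- Read back the key type and length.
SELECT keyring_key_type_fetch('my_app_key');   -- 'AES'
SELECT keyring_key_length_fetch('my_app_key'); -- 32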

There are also differences in how keys are stored and removed.

File-based storage is faster. One might assume that storing a key is a simple one-time write of the key to a file, but no – more operations are happening here. Any modification of the file storage first creates a backup of its entire contents. Let's say the file is called my_biggest_secrets; then the backup will be my_biggest_secrets.backup. Next, the cache is changed (keys are added or removed), and if everything succeeds, the cache is flushed to the file. In rare cases, such as a server crash, you may see this backup file. The backup file is deleted the next time the keys are loaded (usually after a server restart).

With server-based storage, storing or deleting a key requires the MySQL server to connect to the storage server and send a "store the key" / "delete the key" request.

Let's get back to server startup speed. Besides the storage type itself affecting startup speed, there is also the question of how many keys need to be fetched from storage at startup. This is, of course, especially important for server-based storage. At startup, the server checks which keys are required for encrypted tables/tablespaces and requests those keys from storage. On a "clean" server with Master Key encryption, there is a single Master Key that must be retrieved from storage. However, more keys may be required, for example, when restoring a backup from the primary server onto a backup server. In such cases, Master Key rotation should be planned for. This will be discussed in more detail in future articles, although here I would like to point out that a server using multiple Master Keys may take a little longer to start, especially when using a server-based keystore.
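
To make this concrete, on MySQL 8.0 the encrypted tablespaces can be listed from INFORMATION_SCHEMA, and a Master Key rotation is triggered with a single statement (a sketch; the table name differs on 5.7):

-- Which tablespaces are encrypted and will need their keys at startup?
SELECT NAME, ENCRYPTION
FROM INFORMATION_SCHEMA.INNODB_TABLESPACES
WHERE ENCRYPTION = 'Y';

-- Generate a new Master Key and re-encrypt the tablespace keys with it.
ALTER INSTANCE ROTATE INNODB MASTER KEY;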

Now let’s talk a little more about keyring_file. When I was developing keyring_file, I was also concerned about how to check for keyring_file changes while the server was running. In 5.7, the check was performed based on file statistics, which was not an ideal solution, and in 8.0 it was replaced with a SHA256 checksum.

When keyring_file runs for the first time, the file statistics and checksum are computed and remembered by the server, and subsequent changes are applied only if they match. The checksum is updated whenever the server modifies the file.

We have already covered many questions about keystores. However, there is another important topic that is often forgotten or misunderstood – the sharing of keys across servers.

What do I mean? Each server (for example, each Percona Server) in the cluster must have a separate location on the Vault server where it stores its keys. Each Master Key stored in the repository contains the GUID of its Percona Server as part of the identifier. Why is this important? Imagine that you have only one Vault server and all Percona Servers in the cluster use that single Vault server. The problem seems obvious: if all Percona Servers used Master Keys without unique identifiers, such as id = 1, id = 2, and so on, then all servers in the cluster would use the same Master Key. This is what the GUID provides – the distinction between servers. Why, then, talk about sharing keys between servers when a unique GUID already exists? There is one more plugin – keyring_udf. With this plugin, a user of your server can store their keys on the Vault server. The problem occurs when a user creates a key, for example, on server1, and then tries to create a key with the same ID on server2:

--server1:
select keyring_key_store('ROB_1','AES',"123456789012345");
1
-- 1 means success
--server2:
select keyring_key_store('ROB_1','AES',"543210987654321");
1

Wait. Both servers are using the same Vault server, so shouldn't the keyring_key_store function fail on server2? Interestingly, if you try to do the same on a single server, you will get an error:

--server1:
select keyring_key_store('ROB_1','AES',"123456789012345");
1
select keyring_key_store('ROB_1','AES',"543210987654321");
0

That’s right, ROB_1 already exists.

Let's discuss the second example first. As we said earlier, keyring_vault, like any other keyring plugin, caches all key ids in memory. So after the new key ROB_1 is created on server1, besides being sent to Vault, it is also added to server1's cache. Now, when we try to add the same key a second time, keyring_vault sees that the key already exists in the cache and returns an error.

In the first case, the situation is different. Server1 and server2 have separate caches. After ROB_1 is added to server1's cache and to Vault, server2's cache is out of sync: it contains no ROB_1 key. So the keyring_key_store call succeeds and the key is written to the Vault server, which actually overwrites (!) the previous value. Now the key ROB_1 on the Vault server is 543210987654321. Interestingly, the Vault server does not block such actions and happily overwrites the old value.
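
Continuing this hypothetical two-server example, the desynchronization is easy to observe (assuming keyring_key_fetch from keyring_udf is installed): server1 still answers from its local cache, while Vault already holds the value written by server2:

-- server1: answered from the local cache, not from Vault.
SELECT keyring_key_fetch('ROB_1');  -- returns 123456789012345
-- server2: sees the value it wrote, which is what Vault now stores.
SELECT keyring_key_fetch('ROB_1');  -- returns 543210987654321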

We can now see why per-server separation inside Vault can be important: when you are using keyring_udf and want to store keys in Vault. How do you provide this separation on the Vault server?

There are two ways to partition Vault: you can create a different mount point for each server, or use different paths within the same mount point. This is best illustrated with examples, so let's look at individual mount points first:

--server1:
vault_url = http://127.0.0.1:8200
secret_mount_point = server1_mount
token = (...)
vault_ca = (...)

--server2:
vault_url = http://127.0.0.1:8200
secret_mount_point = server2_mount
token = (...)
vault_ca = (...)

Here you can see that server1 and server2 are using different mount points. When splitting paths, the configuration will look like this:

--server1:
vault_url = http://127.0.0.1:8200
secret_mount_point = mount_point/server1
token = (...)
vault_ca = (...)

--server2:
vault_url = http://127.0.0.1:8200
secret_mount_point = mount_point/server2
token = (...)
vault_ca = (...)

In this case, both servers use the same mount point but different paths. When the first secret is created on server1 under this path, Vault automatically creates the "server1" directory; the same happens for server2. When you remove the last secret in mount_point/server1 or mount_point/server2, the Vault server also removes those directories. If you use path splitting, you only have to create one mount point and change the configuration files so that the servers use separate paths. A mount point can be created with an HTTP request; with curl, it can be done like this:

curl -L -H "X-Vault-Token: TOKEN" --cacert VAULT_CA \
  --data '{"type":"generic"}' --request POST VAULT_URL/v1/sys/mounts/SECRET_MOUNT_POINT
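
Alternatively, roughly the same thing can be done with the Vault CLI (a sketch: the mount point name is an example, and on recent Vault versions the "generic" backend is provided by the kv secrets engine):

vault secrets enable -path=server1_mount kv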

All fields in the curl call (TOKEN, VAULT_CA, VAULT_URL, SECRET_MOUNT_POINT) correspond to the parameters in the configuration file. Using the HTTP API rather than the CLI makes it easier to automate mount point creation. I hope you find this information useful, and see you in the next articles of this series.
