Why may semi-synchronous replication be needed?

Hello. In touch Vladislav Rodin. I am currently teaching courses on software architecture and high load software architecture on the OTUS portal. In anticipation of the start of a new course flow “High Load Architect” I decided to write a little copyright material that I want to share with you.


Introduction

Due to the fact that only about 400-700 operations per second can be performed on the HDD (which is incomparable with typical rps falling on a heavily loaded system), the classical disk database is a narrow neck of architecture. Therefore, special attention should be paid to the scaling patterns of this repository.

At the moment there are 2 patterns of base scaling: replication and sharding. Sharding allows you to scale the write operation, and, as a result, reduce the rps for recording per one server in your cluster. Replication allows you to do the same, but with read operations. This article is devoted to this very pattern.

Replication

If you look at replication at a very top-level level, this is a simple thing: you had one server, the data was on it, and then this server ceased to cope with the load of reading this data. You add a couple more servers, synchronize data on all servers, and the user can read from any server in your cluster.

Despite the apparent simplicity, there are several options for classifying various implementations of this scheme:

  • By roles in the cluster (master-master or master-slave)
  • By forwarded objects (row-based, statement-based or mixed)
  • According to the node synchronization mechanism

Today we will deal with the 3rd point.

How transaction is committed

This topic does not directly relate to replication, a separate article can be written on it, however, since without understanding the mechanism of a transaction commit, further reading is useless, let me remind you of the most basic things. A transaction is committed in 3 stages:

  1. Writing a transaction to the database log.
  2. Application of a transaction in a database engine.
  3. Return confirmation to the client about the successful application of the transaction.

Nuances may arise in various databases in this algorithm: for example, in the InnoDB engine of the MySQL database there are 2 logs: one for replication (binary log) and the other for maintaining ACID (undo / redo log), while in PostgreSQL there is one log executing both functions (write ahead log = WAL). But above it is precisely the general concept that allows such nuances to be ignored.

Synchronous (sync) replication

Let’s add the logic for replicating the received changes to the transaction commit algorithm:

  1. Writing a transaction to the database log.
  2. Application of a transaction in a database engine.
  3. Sending data to all replicas.
  4. Receive confirmation from all replicas about the transaction on them.
  5. Return confirmation to the client about the successful application of the transaction.

With this approach, we get a number of disadvantages:

  • the client is waiting for changes to be applied to all replicas.
  • as the number of nodes in the cluster increases, we reduce the likelihood that the write operation will succeed.

If everything is more or less clear with the 1st paragraph, then the reasons for the 2nd paragraph should be clarified. If during synchronous replication we don’t get a response from at least one node, we roll back the transaction. Thus, increasing the number of nodes in the cluster, you increase the likelihood that the write operation will fail.

Can we wait for confirmation only from a certain fraction of nodes, for example, from 51% (quorum)? Yes, we can, but in the classical version, confirmation from all nodes is required, because this is how we can ensure the complete consistency of data in the cluster, which is an undoubted advantage of this type of replication.

Async replication

Let’s modify the previous algorithm. We will send data to the replicas “sometime later”, and “sometime later” the changes will be applied on the replicas:

  1. Writing a transaction to the database log.
  2. Application of a transaction in a database engine.
  3. Return confirmation to the client about the successful application of the transaction.
  4. Sending data to replicas and applying changes to them.

This approach leads to the fact that the cluster is fast, because we do not keep the client waiting for the data to reach the replicas and even be communicated.

But the condition of dropping data to replicas “sometime later” can lead to the loss of the transaction, and to the loss of the transaction confirmed to the user, because if the data did not have time to replicate, a confirmation was sent to the client about the success of the operation, and the node to which the changes came flew HDD, we are losing the transaction, which can lead to very unpleasant consequences.

Semisync replication

Finally, we got to semi-synchronous replication. This type of replication is not very known and not very common, but it is of considerable interest, since it can combine the advantages of both synchronous and asynchronous replication.

Let’s try to combine the 2 previous approaches. We will not keep the client for long, but we require that the data be replicated:

  1. Writing a transaction to the database log.
  2. Application of a transaction in a database engine.
  3. Sending data to replicas.
  4. Receive confirmation from the replica about the receipt of changes (they will be applied “sometime later”).
  5. Return confirmation to the client about the successful application of the transaction.

Please note that with this algorithm, transaction loss occurs only in the event of a fall of the node accepting the changes and the replica nodes. The probability of such a malfunction is considered small, and these risks are accepted.

But with this approach, the risk of phantom readings is possible. Imagine the following scenario: in step 4, we did not receive confirmation from any replica. We must roll back this transaction, and not return the confirmation to the client. Since the data was applied in step 2, there is a time gap between the end of step 2 and the transaction rollback, during which parallel transactions can see those changes that should not be in the database.

Lose-less semisync replication

If you think a little, you can only by changing the steps of the algorithm in places, fix the problem of phantom readings in this scenario:

  1. Writing a transaction to the database log.
  2. Sending replica data.
  3. Receive confirmation from the replica about the receipt of changes (they will be applied “sometime later”).
  4. Application of a transaction in a database engine.
  5. Return confirmation to the client about the successful application of the transaction.

Now we commit changes only if they are replicated.

Conclusion

As always, there are no perfect solutions, there is a set of solutions, each of which has its own advantages and disadvantages and is suitable for solving various classes of problems. This is also true for choosing a mechanism for synchronizing data in a replicated database. The set of benefits that semi-synchronous replication has is solid and interesting enough to be considered deserving of attention, despite its low prevalence.

That’s all. See you at course!

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *