What can result from weakening the level of transaction isolation in databases

Hello everyone. Vladislav Rodin here. I am currently the head of the High Load Architect course at OTUS, and I also teach courses on software architecture.

In addition to teaching, as you may have noticed, I also write original material for the OTUS blog on Habr, and I would like to time today's article to the launch of the "PostgreSQL" course, enrollment for which is open right now.


Introduction

Last time we talked about how transactions in databases solve two problems: ensuring fault tolerance and managing access to data in a concurrent environment. To handle these tasks, a transaction must have the ACID properties. Today we will talk in detail about the letter I (Isolation) in this acronym.

Isolation

Isolation solves the problem of accessing data in a concurrent environment, effectively providing protection against race conditions. Ideally, isolation means serializability, that is, the property that the result of executing transactions in parallel is the same as if they had been executed sequentially. The main problem with this property is that it is technically very hard to guarantee and, as a consequence, it has a significant impact on system performance. That is why isolation is often weakened, accepting the risk of certain anomalies, which are discussed below. It is precisely the set of anomalies that may occur that characterizes a transaction isolation level.

The best-known anomalies are dirty read, non-repeatable read, and phantom read, but in fact there are five more: dirty write, cursor lost update, lost update, read skew, and write skew.

Dirty write

The essence of this anomaly is that transactions can overwrite each other's uncommitted data.

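A minimal sketch of such a schedule, assuming no write locks; the listings and invoices tables are hypothetical, and the T1/T2 prefixes only mark which transaction issues each statement:

    T1: BEGIN;
    T1: UPDATE listings SET buyer = 'Alice' WHERE car_id = 1;
    T2: BEGIN;
    T2: UPDATE listings SET buyer = 'Bob' WHERE car_id = 1;       -- overwrites T1's uncommitted write
    T2: UPDATE invoices SET recipient = 'Bob' WHERE car_id = 1;
    T1: UPDATE invoices SET recipient = 'Alice' WHERE car_id = 1; -- overwrites T2's uncommitted write
    T1: COMMIT;
    T2: COMMIT;  -- the car is now listed to Bob but invoiced to Alice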

This anomaly is dangerous not only because the data of the two transactions may conflict after both commit (as in the sketch above), but also because atomicity is violated: since we allow uncommitted data to be overwritten, it is unclear how to roll back one transaction without affecting the other.

This anomaly is fixed quite simply: we acquire a write lock before the write begins, forbidding other transactions from changing the record until the lock is released.

Dirty read

Dirty read means reading uncommitted data.

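A minimal sketch, with a hypothetical accounts table and the same schedule notation:

    T1: BEGIN;
    T1: UPDATE accounts SET balance = 0 WHERE id = 1;  -- not committed yet
    T2: BEGIN;
    T2: SELECT balance FROM accounts WHERE id = 1;     -- dirty read: sees 0
    T1: ROLLBACK;                                      -- the 0 never officially existed
    T2: COMMIT;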

Problems arise when some action or decision must be taken based on the data that was read.

To fix this anomaly, you could acquire a read lock, but that would hurt performance badly. It is much simpler to note that, in order to be able to roll back a transaction, the system must keep the initial (pre-write) state of the data anyway. Why not read from there? Doing so is fairly cheap, which is why most databases eliminate dirty reads by default.

Lost update

The name of the lost update anomaly accurately reflects the essence of the problem: an update simply disappears.

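A minimal sketch with the hypothetical accounts table (the starting balance of 100 is assumed):

    T1: BEGIN;
    T1: SELECT balance FROM accounts WHERE id = 1;       -- reads 100
    T2: BEGIN;
    T2: SELECT balance FROM accounts WHERE id = 1;       -- also reads 100
    T2: UPDATE accounts SET balance = 110 WHERE id = 1;  -- 100 + 10
    T2: COMMIT;
    T1: UPDATE accounts SET balance = 150 WHERE id = 1;  -- 100 + 50, computed from the stale value
    T1: COMMIT;                                          -- final balance is 150: T2's +10 is lost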

In effect, the result of transaction T2 has been discarded. This situation is fixed with explicit or implicit write locks: either we simply update the record, which takes an implicit lock, or we perform a SELECT ... FOR UPDATE, which acquires a lock for both reading and writing. Note that such an operation is rather dangerous: our "innocent" read now blocks other reads. Some databases offer the safer SELECT ... FOR SHARE, which allows others to read the data but not to change it.
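A sketch of the locking fix under the same assumptions; exactly which readers are blocked varies from database to database:

    T1: BEGIN;
    T1: SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;  -- locks the row
    T2: BEGIN;
    T2: SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;  -- blocks until T1 finishes
    T1: UPDATE accounts SET balance = 150 WHERE id = 1;
    T1: COMMIT;                                                -- T2 now unblocks and reads 150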

Cursor lost update

For finer-grained control, databases can offer other tools, such as a cursor. A cursor is a structure that holds a set of rows and lets you iterate over them: DECLARE cursor_name CURSOR FOR select_statement. The contents of the cursor are described by the SELECT.
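For example, in PostgreSQL-flavoured SQL (a sketch; the products table is hypothetical, and such a cursor must live inside a transaction):

    BEGIN;
    DECLARE product_cur CURSOR FOR
        SELECT id, price FROM products ORDER BY price;
    FETCH NEXT FROM product_cur;  -- returns the first row
    FETCH NEXT FROM product_cur;  -- returns the second row
    CLOSE product_cur;
    COMMIT;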

Why do we need a cursor? Some databases offer either locking of all records selected by the SELECT (read stability) or locking of only the record the cursor is currently positioned on (cursor stability). With cursor stability, a short-lived lock is used, which reduces the number of locks held when iterating over a large result set. That is why the lost update anomaly is listed separately for cursors.

Non-repeatable read

A non-repeatable read means that, within one transaction, two consecutive reads of the same record return different results, because another transaction intervened between the two reads, changed the data, and committed.

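A minimal sketch with the hypothetical products table:

    T2: BEGIN;
    T2: SELECT price FROM products WHERE id = 1;       -- reads 100
    T1: BEGIN;
    T1: UPDATE products SET price = 200 WHERE id = 1;
    T1: COMMIT;
    T2: SELECT price FROM products WHERE id = 1;       -- now reads 200
    T2: COMMIT;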

Why is this a problem at all? Imagine that the purpose of transaction T2 in the sketch above is to select all products whose price is less than $150. Someone else updated the price to $200. Thus, the intended filter did not work.

These anomalies cease to occur once two-phase locking is added or the MVCC mechanism is used, which I would like to discuss separately.

Phantom read

A phantom read is a read of data that was added by another transaction.

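A minimal sketch, again with the hypothetical products table:

    T1: BEGIN;
    T1: SELECT id, price FROM products ORDER BY price LIMIT 1;  -- cheapest product: (1, 100)
    T2: BEGIN;
    T2: INSERT INTO products (id, price) VALUES (2, 50);
    T2: COMMIT;
    T1: SELECT id, price FROM products ORDER BY price LIMIT 1;  -- now (2, 50): a phantom row
    T1: COMMIT;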

As an example, this anomaly can lead to selecting the wrong product as the cheapest one, as in the sketch above.

Getting rid of phantom reads is considerably harder. Ordinary locking is not enough, because we cannot lock what does not exist yet. 2PL systems use predicate locking, whereas MVCC systems use a transaction scheduler that cancels transactions that would be broken by an insert. Both mechanisms are quite heavyweight.

Read skew

Read skew arises when we work with several tables whose contents must change consistently.

Suppose there are tables representing posts and their meta-information:

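For illustration, assume a schema along these lines (the table and column names, apart from title and updated_by, are guesses):

    CREATE TABLE posts (id int PRIMARY KEY, title text);
    CREATE TABLE posts_meta (post_id int REFERENCES posts(id), updated_by text);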

One transaction reads from the tables while another changes them:

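A sketch of such a schedule under the assumed schema:

    T1: BEGIN;
    T1: SELECT title FROM posts WHERE id = 1;                 -- reads 'Good'
    T2: BEGIN;
    T2: UPDATE posts SET title = 'Bad' WHERE id = 1;
    T2: UPDATE posts_meta SET updated_by = 'T2' WHERE post_id = 1;
    T2: COMMIT;
    T1: SELECT updated_by FROM posts_meta WHERE post_id = 1;  -- reads 'T2'
    T1: COMMIT;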

As a result, transaction T1 sees the post with title = Good and updated_by = T2, which is inconsistent.

In essence, this is a non-repeatable read, only spread across several tables.

To fix it, T1 can acquire locks on all the rows that it is going to read, which will prevent T2 from changing the information. With MVCC, transaction T2 will be cancelled. Protection against this anomaly can become important if we use cursors.

Write skew

This anomaly is also easier to explain with an example. Suppose our system requires at least one doctor to be on duty, but both doctors on duty decide to cancel their shift:

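A sketch with a hypothetical doctors table that has an on_call flag:

    T1: BEGIN;                                      -- doctor Alice
    T1: SELECT count(*) FROM doctors WHERE on_call; -- sees 2, so leaving seems safe
    T2: BEGIN;                                      -- doctor Bob
    T2: SELECT count(*) FROM doctors WHERE on_call; -- also sees 2
    T1: UPDATE doctors SET on_call = false WHERE name = 'Alice';
    T2: UPDATE doctors SET on_call = false WHERE name = 'Bob';
    T1: COMMIT;
    T2: COMMIT;                                     -- no one is on duty any more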

The anomaly results in neither doctor being on duty. Why did this happen? Because each transaction checked a condition that could be violated by the other transaction, and because of isolation neither saw the other's change.

Again, this is essentially a non-repeatable read; one remedy is for the SELECTs to acquire locks on the records they read.

Write skew and read skew are combinations of the previous anomalies. There is also a variant of write skew that is essentially a phantom read. Consider a table containing employee names, their salaries, and the projects they work on:

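A sketch with a hypothetical employees table and an assumed total budget of 200; each manager checks the total before hiring:

    T1: BEGIN;                              -- manager of project 'A'
    T1: SELECT sum(salary) FROM employees;  -- total 100: under the budget of 200
    T2: BEGIN;                              -- manager of project 'B'
    T2: SELECT sum(salary) FROM employees;  -- also sees 100
    T1: INSERT INTO employees (name, salary, project) VALUES ('Carol', 90, 'A');
    T2: INSERT INTO employees (name, salary, project) VALUES ('Dave', 90, 'B');
    T1: COMMIT;
    T2: COMMIT;                             -- the total is now 280: the budget is blown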

As a result, we get the following picture: each manager thought that his change alone would not break the budget, so both made personnel changes, which together led to a budget overrun.

The cause of the problem is exactly the same as with phantom reads.

Conclusion

Weakening the transaction isolation level in a database is a trade-off between safety and performance. The choice of level should be based on the potential risks to the business if a particular anomaly occurs.


Learn more about the course.

