Isolation level “Repeatable Read”

Based on an article by Craig Freedman: Repeatable Read Isolation Level

The previous two articles (1, 2) showed how queries running at the “read committed” isolation level can produce unexpected results when rows affected by the query are changed concurrently. To avoid some of these surprises (though not all of them), you can run your queries at the “repeatable read” isolation level. In this article, we will look at how concurrent changes behave under “repeatable read”.
Unlike a “read committed” scan, a “repeatable read” scan holds a lock on every row it touches until the end of the transaction. Even rows that ultimately do not satisfy the query’s predicate may remain locked for the duration of the transaction. These locks guarantee that the rows touched by the query cannot be modified or deleted by a concurrent session until the current transaction completes (whether it commits or rolls back). They do not, however, protect rows that have not yet been scanned from being modified or deleted, nor do they prevent new rows from being inserted between rows that are already locked.
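
As a rough illustration (a sketch that is not part of the original article; it assumes SQL Server 2005 or later, the sys.dm_tran_locks DMV, and the table t created in the first example below), you can watch the shared key locks that a “repeatable read” scan keeps until the transaction ends:

set transaction isolation level repeatable read
begin tran
select * from t -- shared (S) key locks are acquired and kept
-- the locks taken by the scan above are still held at this point
select resource_type, request_mode, request_status
from sys.dm_tran_locks
where request_session_id = @@spid
commit tran -- only now are the shared locks released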

Note that the ability to insert new “phantom” rows between locked rows that have already been scanned is the fundamental difference between the “repeatable read” and “serializable” isolation levels. A serializable scan acquires key-range locks that prevent new rows from being inserted anywhere within the scanned range (and also prevent existing rows in that range from being modified or deleted). Later in the article we will see examples of how unexpected results can occur even when queries run at the “repeatable read” isolation level. These examples are similar to those given in the previous two articles.
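
As a minimal sketch of that difference (again not from the original article; it reuses the table t created below), a serializable scan blocks phantom inserts, while a repeatable read scan does not:

-- session 2: hold a serializable scan open
set transaction isolation level serializable
begin tran
select * from t -- key-range locks cover the gaps between existing keys

-- session 1: this insert falls inside a locked range and blocks until session 2
-- commits; under "repeatable read" it would succeed immediately
insert t values (4, 4)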

Moving rows

First, let’s see how a row can be moved in such a way that a scan running at the “repeatable read” isolation level misses it. As in the examples from the previous articles, we will need two sessions. Let’s start by creating a simple table:

create table t (a int primary key, b int)
insert t values (1, 1)
insert t values (2, 2)
insert t values (3, 3)

Next, in session 1, we lock the second row:

begin tran
update t set b = 2 where a = 2

After that, in the second session, we will scan the table with “repeatable read”:

select * from t with (repeatableread)

This scan reads the first row and then blocks, waiting for session 1 to release its lock on the second row. While the scan is blocked, let’s move the third row to the beginning of the table; we do this in session 1 before committing the transaction and releasing the exclusive lock:

update t set a = 0 where a = 3
commit tran

As expected, session 2 misses the third row and returns only two rows:

a    b
———– ———–
1    1
2    2

Note that if we change the example so that session 1 also tries to update the first row of the table, we get a deadlock: session 2 already holds a shared lock on that row, while session 1 holds the lock that session 2 is waiting for.
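
For example (a hypothetical variation that is not shown in the article), if session 1 issued the following statement while its transaction still held the exclusive lock on the row with a = 2:

-- session 1, transaction from the example still open
update t set b = 11 where a = 1
-- this update blocks: session 2 already holds a shared lock on the row with a = 1,
-- while session 2 itself is waiting for session 1's lock on the row with a = 2;
-- SQL Server detects the deadlock and rolls back one of the sessions as the victim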

Phantom rows

Now let’s look at how phantom rows are created and how they can lead to unexpected results. This example is similar to the nested loops experiment from the previous article. Start by creating two tables:

create table t1 (a1 int primary key, b1 int)
insert t1 values (1, 9)
insert t1 values (2, 9)
create table t2 (a2 int primary key, b2 int)

Now in session 1 we lock the second row of table t1:

begin tran
update t1 set a1 = 2 where a1 = 2

Then, in session 2, run a query with an outer join at the “repeatable read” isolation level:

set transaction isolation level repeatable read
select * from t1 left outer join t2 on b1 = a2

The query plan for this statement uses a nested loops join:

|--Nested Loops(Left Outer Join, WHERE:([t1].[b1]=[t2].[a2]))
     |--Clustered Index Scan(OBJECT:([t1].[PK__t1]))
     |--Clustered Index Scan(OBJECT:([t2].[PK__t2]))
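
If you want to reproduce this textual plan yourself (a side note; it assumes the optimizer chooses the same plan shape on your build), turn on the text showplan before running the query:

set showplan_text on
go
select * from t1 left outer join t2 on b1 = a2
go
set showplan_text off
go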

This plan reads the first row from t1, tries to join it to t2, finds no matching rows, and outputs NULL values instead. It then blocks, waiting for session 1 to release its lock on the second row of t1. While session 2 is waiting, in session 1 we insert a new row into t2 and release the lock:

insert t2 values (9, 0)
commit tran

Here is the result of the outer join:

a1   b1   a2   b2
———– ———– ———– ———–
1    9    NULL NULL
2    9    9    0

Note that the result contains both a NULL row and a row with values for the same join key!

Summary

This was already noted in the summary of the previous article, but it is worth re-emphasizing: the results above are not incorrect, they are a side effect of running at a lower isolation level. SQL Server does guarantee the consistency of the stored data.
CLARIFICATION 8/26/2008: The examples above work as described in this article when run in the context of the tempdb database. However, the SELECT statements in session 2 may not block as described if the examples are run in the context of other databases. This is due to an optimization in newer versions of SQL Server that avoids acquiring “read committed” locks when it knows that the data on the page has not changed. If you run into this, either run the examples in the context of tempdb, or change the UPDATE statement in session 1 so that it actually modifies data in the affected row. For example, for the first example, use “update t set b = 12 where a = 2”.
