Check out part 1 and part 2 of the series on SQL Transaction isolation levels.
After covering the theory and repeatable read, it's time for the last part of the introduction to SQL transaction isolation levels. This time we will take a closer look at READ UNCOMMITTED
and READ COMMITTED
, as they seem to be similar (by name), but are oh so different. One is the weirdest, while the other is the most reasonable, at least for me.
Levels description
Just a quick reminder in regard with the SQL standard and the phenomena that can occur here (see part 1 for more information what are those), in READ UNCOMMITTED
we can have all the side effects we can think of (dirty read, nonrepeatable read, phantom read and serialization anomaly), while for READ COMMITTED
dirty reads should not be possible.
In PostgreSQL READ COMMITTED
is the default level (for MySQL it's REPEATABLE READ
).
PostgreSQL interpretation
Read committed
We start with the more reasonable one, as this story is much shorter and won't leave you with the nightmares as the other one. When a transaction is started in READ COMMITTED
level, we accept the fact that the same SELECT
statement returns a different result, as other transactions can alter the data. Take a look at the simplest example:
A: BEGIN;
A: SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
A: SELECT * FROM tbl WHERE <'condition'> -> 3 rows
B: BEGIN;
B: INSERT INTO tbl VALUES <maching 'condition'>;
A: SELECT * FROM tbl WHERE <condition> -> 3 rows
B: COMMIT;
A: SELECT * FROM tbl WHERE <condition> -> 4 rows
This makes our operations a bit less atomic (as we can select n
rows, then assume something that is not true a second later), but at the same time, our operations don't leave much mess behind. For instance, if you want to change all rows matching a certain condition, with REPEATABLE READ
you would only change the rows that were in place at the moment you started the transaction. Here, it's not the case, so your goal of having all rows changed does not run away from you.
Read uncommitted
Basic example
Now, this is where the story gets really interesting, as in my mind something like READ UNCOMMITTED
should not exist. Imagine the following scenario, where the session A
accepts all the changes made to the data, regardless if they've been committed or not:
> MySQL
A: SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
A: BEGIN;
A: SELECT * FROM tbl WHERE <'condition'> -> 3 rows
B: BEGIN;
B: INSERT INTO tbl VALUES <maching 'condition'>;
A: SELECT * FROM tbl WHERE <condition> -> 4 rows
B: ROLLBACK;
A: SELECT * FROM tbl WHERE <'condition'> -> 3 rows
This is madness! And we just selected the data from the table - imagine that we took the data, calculated something based on this and stored in another place. The moment the second transaction is rolled back, the database is in an inconsistent state, even though we used transactions in both database sessions!
PostgreSQL denial
Again, I don't think that READ UNCOMMITTED
belongs in our world. Sounds too dramatic? Well, PostgreSQL agrees with me on this one, as even though you can set the transaction level there to UNCOMMITTED
, it will work the same way as COMMITTED
. How's that possible? The standard requires PostgreSQL to support a level with this name, but only tells us what phenomena can occur. As the list for COMMITTED
is included in the possible ones for UNCOMMITTED
, the standard is matched.
> PostgreSQL
A: BEGIN;
A: SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
A: SELECT * FROM tbl WHERE <'condition'> -> 3 rows
B: BEGIN;
B: INSERT INTO tbl VALUES <maching 'condition'>;
A: SELECT * FROM tbl WHERE <condition> -> 3 rows
B: COMMIT;
A: SELECT * FROM tbl WHERE <'condition'> -> 4 rows
The documentation od PostgreSQL even states that they did this as it was the only sensible way.
Conflicting updates
Let's start roasting UNCOMMITTED
by making conflicting updates in two sessions. We will select data for updates with the same condition WHERE
clause and will update it so that it no longer matches the condition. We would expect that the first UPDATE
changes the data, and the second one does nothing. This is indeed what happens, but in a slightly weird way:
A: SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; BEGIN;
B: BEGIN;
A: SELECT * FROM tbl WHERE <condition1>; -> 1 row
B: UPDATE tbl SET <condition1->condition2>; -> 1 changed
A: SELECT * FROM tbl WHERE <condition1>; -> 0 rows
A: UPDATE * FROM tbl WHERE <condition1>; -> WAIT...(?)
B: COMMIT;
A: WAIT finished -> 0 changed rows
As you can see, the session B
is first to change the row that matches condition1
. Since session A
already sees that change before being committed by B
(proven with SELECT
returning no rows) we should not be changing anything. The strange thing is that even though SELECT
returns zero rows, the UPDATE
hangs and waits for session B
to finish. It's easy to understand it in an isolated example in the blog post, but imagine your app freezing when trying to change nothing. The wait is released once transaction in session B
is completed.
MySQL inconsistencies
To makes matters worse, the UNCOMMITTED
part of the isolation level name is indeed for READ
part only. If you start two transactions, select rows that match a certain condition, and then insert another matching entry in the second session, the first one happily notices a new match. The problem is that when you try to UPDATE
all those rows, the change is being made only on the original selection!
A: SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; BEGIN;
B: BEGIN;
A: SELECT * FROM tbl WHERE <condition1>; -> 3 rows
B: INSERT INTO tbl VALUES <maching 'condition1'>;
A: SELECT * FROM tbl WHERE <condition1>; -> 4 rows
A: UPDATE tbl SET <condition1->condition2>; -> 3 changed ?!
B: COMMIT;
A: SELECT * FROM tbl WHERE <condition1>; -> 1 row;
A: SELECT * FROM tbl WHERE <condition2>; -> 3 row;
This is so mind-boggling. Imagine going one step further, and trying to UPDATE
the rows again. It won't change anything, as the original selection is already altered. Then you try to see if any rows actually match the condition (remember, you can't change them anyway!) and there is one. But when you commit the transaction it's no longer available as the transaction in session B
is not committed yet! You are completely lost, where is that row? You try to select yet another time and there it is! WTF?
A: SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; BEGIN;
B: BEGIN;
A: SELECT * FROM tbl WHERE <condition1>; -> 3 rows
B: INSERT INTO tbl VALUES <maching 'condition1'>;
A: SELECT * FROM tbl WHERE <condition1>; -> 4 rows
A: UPDATE tbl SET <condition1->condition2>; -> 3 changed ?!
A: SELECT * FROM tbl WHERE <condition1>; -> 1 row
A: UPDATE tbl SET <condition1->condition2>; -> 0 changed ?!
A: COMMIT;
A: SELECT * FROM tbl WHERE <condition1>; -> 0 rows
B: COMMIT;
A: SELECT * FROM tbl WHERE <condition1>; -> 1 row;
The difference is that the session in B
has been committed and now you can see the data that you saw that you shouldn't have seen and you couldn't have changed. Wow.
Summary
I hope this short introduction to SQL transaction isolation levels have been at least a bit fun for you, as it was definitely for me. I hope you are not going to use READ UNCOMMITTED
unless you want to troll somebody because it should definitely be illegal. Preparing the examples have also been a great experience for me, especially since I've heard so many people talking about how PostgreSQL is better in their opinion than MySQL, but I've never found any evidence myself, at least not any simple examples where one behaves differently. I encourage you to take a Docker image of either of these and play around so that you gain more understanding of how things work, and when they don't. Enjoy!
Versions
- PostgreSQL 11 (
postgres:11
from https://docs.docker.com/samples/library/postgres/) - MySQL 8 (
mysql:8
from https://docs.docker.com/samples/library/mysql/)
Read more & sources
- part 1, part 2 of this introduction
- Articles on sqlserverlogexplorer.com, thesqlgirl.com
- Documentation of PostgreSQL 11, MySQL 8