Last modified: November 26, 2024

This article is written in: 🇺🇸

Serializable and Repeatable Read in Database Systems

Transaction isolation levels are essential for maintaining data integrity and managing concurrency in database systems. Two of the highest isolation levels are Serializable and Repeatable Read, each offering different guarantees to prevent anomalies that can occur when multiple transactions interact with the same data concurrently.

The Need for Isolation Levels

In environments where multiple transactions execute at the same time, issues like dirty reads, non-repeatable reads, and phantom reads can arise. Without proper isolation, one transaction might read data that another transaction is modifying, leading to inconsistent or incorrect results. Isolation levels define how transactions are isolated from one another to prevent these problems.

Serializable Isolation Level

The Serializable isolation level is the strictest level, ensuring that transactions are completely isolated from each other. It guarantees that the outcome of executing transactions concurrently is the same as if they were executed sequentially in some order.

Illustration of Serializable Isolation:

Time | Transaction T1                | Transaction T2
--------------------------------------------------------------
T1   | BEGIN TRANSACTION             |
     | SELECT SUM(balance) FROM accounts; (Total = $10,000) |
     |                                |
T2   |                                | BEGIN TRANSACTION
     |                                | INSERT INTO accounts (id, balance) VALUES (101, $1,000);
     |                                |
T3   |                                | (Blocked until T1 completes)
     |                                |
T4   | COMMIT                         |
     |                                |
T5   |                                | INSERT completes
     |                                | COMMIT

In this scenario:

Repeatable Read Isolation Level

The Repeatable Read isolation level ensures that if a transaction reads a row, it will see the same data throughout the transaction, even if other transactions modify it. However, it doesn't prevent new rows (phantoms) from being inserted by other transactions.

Illustration of Repeatable Read Isolation:

Time | Transaction T1                  | Transaction T2
-----------------------------------------------------------------
T1   | BEGIN TRANSACTION               |
     | SELECT * FROM orders WHERE customer_id = 1; (Returns 5 rows) |
     |                                 |
T2   |                                 | BEGIN TRANSACTION
     |                                 | INSERT INTO orders (order_id, customer_id) VALUES (101, 1);
     |                                 | COMMIT
     |                                 |
T3   | SELECT * FROM orders WHERE customer_id = 1; (Returns 5 rows) |
     |                                 |
T4   | COMMIT                          |

In this example:

Comparing Serializable and Repeatable Read

Both isolation levels aim to maintain data consistency but differ in their handling of concurrent transactions and the types of anomalies they prevent.

Serializable Isolation Level:

Repeatable Read Isolation Level:

Practical Examples and Commands

Setting Isolation Levels in SQL:

To set the isolation level to Serializable:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
-- Transaction operations here
COMMIT;

To set the isolation level to Repeatable Read:

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
-- Transaction operations here
COMMIT;

Interpretation:

When to Use Each Isolation Level

Choosing between Serializable and Repeatable Read depends on the specific needs of your application.

Balancing Performance and Consistency

Higher isolation levels like Serializable provide greater data integrity but can impact performance due to increased locking and decreased concurrency. Lower isolation levels improve performance but may expose the application to data anomalies.

Tips for Balancing:

Understanding Phantom Reads

Phantom reads occur when a transaction reads a set of rows that satisfy a condition and, upon re-reading, finds additional rows due to inserts by other transactions.

Example of Phantom Read under Repeatable Read:

Time | Transaction T1                  | Transaction T2
-----------------------------------------------------------------
T1   | BEGIN TRANSACTION               |
     | SELECT COUNT(*) FROM products WHERE category = 'Electronics'; (Returns 10) |
     |                                 |
T2   |                                 | BEGIN TRANSACTION
     |                                 | INSERT INTO products (product_id, category) VALUES (201, 'Electronics');
     |                                 | COMMIT
     |                                 |
T3   | SELECT COUNT(*) FROM products WHERE category = 'Electronics'; (Returns 11) |
     |                                 |
T4   | COMMIT                          |

Even under Repeatable Read, T1 sees the new product inserted by T2 when it re-executes the query, leading to a phantom read.

Strategies to Prevent Phantom Reads

If phantom reads pose a problem, consider using the Serializable isolation level or implementing additional locking mechanisms.

Using Serializable Isolation Level:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
-- Transaction operations
COMMIT;

Under Serializable, T1 would not see the new product inserted by T2 during its transaction.

Key Takeaways

Table of Contents

    Serializable and Repeatable Read in Database Systems
    1. The Need for Isolation Levels
    2. Serializable Isolation Level
    3. Repeatable Read Isolation Level
    4. Comparing Serializable and Repeatable Read
    5. Practical Examples and Commands
    6. When to Use Each Isolation Level
    7. Balancing Performance and Consistency
    8. Understanding Phantom Reads
    9. Strategies to Prevent Phantom Reads
    10. Key Takeaways