Last modified: November 28, 2024
This article is written in: 🇺🇸
What Is a Transaction?
A database transaction is a sequence of operations performed as a single, indivisible unit of work. These operations—such as inserting, updating, or deleting records—are executed together to ensure data integrity and consistency, especially when multiple users or processes access the database at the same time.
1. Initial State:
+------------------+ +------------------+
| Account A: $100 | | Account B: $50 |
+------------------+ +------------------+
2. Transaction Begins:
- Transfer $20 from Account A to Account B
3. After Transaction:
+------------------+ +------------------+
| Account A: $80 | | Account B: $70 |
+------------------+ +------------------+
In the example above, the transaction involves transferring $20 from Account A to Account B. If any part of this transaction fails—say, if the system crashes after debiting Account A but before crediting Account B—the transaction management system ensures that all changes are rolled back, returning the database to its initial state.
ACID Properties
Transactions are defined by their ACID properties: Atomicity, Consistency, Isolation, and Durability. These principles guarantee that database transactions are processed reliably, maintaining data integrity even in the face of errors, power failures, or other unexpected issues.
Atomicity
Atomicity ensures that a transaction is all-or-nothing. This means that either all operations within the transaction are completed successfully, or none are applied at all. If any operation fails, the entire transaction is aborted, and the database remains unchanged.
Transaction Steps:
1. Debit $20 from Account A
2. Credit $20 to Account B
If Step 2 fails, Step 1 is undone.
In this way, atomicity prevents partial updates that could lead to data inconsistencies.
Consistency
Consistency guarantees that a transaction brings the database from one valid state to another, adhering to all predefined rules such as integrity constraints and triggers. This means that any data written to the database must be valid according to all defined rules.
Before Transaction:
- Total Balance: $150
After Transaction:
- Total Balance: $150
The total balance remains consistent throughout the transaction.
This property ensures that the integrity of the database is maintained.
Isolation
Isolation means that concurrent transactions occur independently without interference. Each transaction operates as if it is the only one using the database, preventing transactions from seeing intermediate states of other concurrent transactions.
Transaction T1: Reads and updates Account A
Transaction T2: Reads and updates Account B
Even if T1 and T2 run simultaneously, they don't affect each other's operations.
Isolation prevents conflicts and ensures that transactions do not compromise each other's integrity.
Durability
Durability assures that once a transaction has been committed, its changes are permanent, even in the event of a system failure. The database system ensures that committed transactions are saved to non-volatile storage.
Transaction Committed:
- Changes are written to disk.
System Crash Occurs:
- After restart, the committed changes are still present.
This property guarantees that the results of a transaction won't be lost.
Analogy of a post office
Once upon a time in a quaint little village, there was a diligent postman named Tom. Tom had the responsibility of ensuring that all letters sent from the village's post office reached their rightful recipients.
One day, Tom received a special request. A villager, Alice, wanted to send two letters: one to her friend Bob and another to her cousin Charlie. Tom knew that both letters were part of a surprise birthday plan and thus, both had to be delivered together or not at all.
Tom carefully prepared both letters, ensuring that if one letter couldn't be delivered (perhaps due to an incorrect address), neither would be sent out. This way, Alice's surprise would remain a secret until both friends could be told at the same time, perfectly demonstrating the principle of ATOMICITY.
The post office had strict guidelines for sending letters. Each letter had to be stamped, sealed, and the address had to be correctly formatted. Tom meticulously checked Alice's letters to ensure they met all the criteria. When he found that one of the letters didn't have a stamp, he knew he couldn't send them until they were both properly prepared. This careful adherence to standards ensured the integrity and reliability of the postal service, much like how CONSISTENCY in databases ensures data integrity.
Meanwhile, other villagers were also sending and receiving letters. Tom was juggling multiple deliveries and pickups, but he was careful to treat each task independently. For instance, while preparing Alice's letters, he was also packaging a parcel for another villager, Dave. Despite switching between tasks, Dave's parcel and Alice's letters were handled as if they were the only tasks in the world, unaffected by each other. This highlighted the principle of ISOLATION, ensuring that simultaneous transactions don’t negatively impact each other.
Finally, after ensuring that both of Alice's letters were stamped, sealed, and addressed correctly, Tom sent them out for delivery. Once the letters were on their way, Tom knew that the task was irreversible and permanent. He recorded the dispatch in the post office's logbook, providing a tangible record of the transaction. Even if a storm came or his bicycle broke down, the post office guaranteed that the letters would reach Bob and Charlie. This commitment to delivering the letters despite potential obstacles demonstrated the principle of DURABILITY, akin to how once a transaction is committed in a database, it is permanently recorded and unaffected by subsequent failures.
Transaction Management
Managing transactions involves coordinating their execution to uphold the ACID properties. This ensures that the database remains reliable and consistent, even when multiple transactions occur concurrently.
- The Begin Transaction statement initializes a transaction, signaling that subsequent operations should be treated as part of a single unit of work.
- A Commit operation finalizes the transaction, ensuring that all changes made during the transaction are saved permanently to the database.
- The Rollback process cancels a transaction if an error occurs or if it is explicitly invoked, reverting the database to the state it was in before the transaction started.
- Mechanisms for Concurrency Control are implemented to allow multiple transactions to execute simultaneously without conflict, maintaining data consistency and isolation.
- Isolation Levels define the extent to which a transaction is isolated from others, with levels like Read Uncommitted, Read Committed, Repeatable Read, and Serializable offering varying degrees of isolation.
- The Atomicity property ensures that a transaction is executed entirely or not at all, maintaining data integrity in case of a failure.
- Consistency guarantees that a database transitions from one valid state to another, adhering to defined rules and constraints throughout the transaction process.
- The Durability principle ensures that once a transaction is committed, its changes are permanently stored, even in the event of a system failure.
- Locking Mechanisms such as shared locks and exclusive locks are employed to control access to resources and prevent conflicts between concurrent transactions.
- Deadlocks may occur when transactions wait indefinitely for resources locked by each other, requiring the database to detect and resolve such situations.
- Optimistic Concurrency Control assumes minimal conflict and checks for data integrity at the commit stage, avoiding locks during transaction execution.
- Pessimistic Concurrency Control uses locks to prevent conflicts upfront, ensuring that no other transaction modifies the data until the current one is complete.
- A Savepoint is a point within a transaction that allows partial rollbacks, enabling recovery to a specific state without undoing the entire transaction.
- Two-Phase Commit (2PC) is a protocol used in distributed transactions to ensure all participating nodes agree on the commit, enhancing reliability.
- Transaction Logs maintain a record of all changes made during a transaction, supporting recovery processes in case of failures.
- Read Phenomena like dirty reads, non-repeatable reads, and phantom reads are controlled by varying isolation levels to balance performance and consistency.
- Database Management Systems (DBMS) enforce ACID properties (Atomicity, Consistency, Isolation, Durability) to ensure reliable transaction processing.