Transactions, ACID Properties, Concurrency Control, and Recovery
Imagine transferring ₹500 from savings to checking. The bank must: (1) deduct ₹500 from savings, (2) add ₹500 to checking. What if the system crashes between steps? You'd lose ₹500! Transactions ensure either both steps happen or neither does—no in-between states. Understanding transactions, concurrency control, and recovery is essential for building reliable database applications.
The Basics
Transaction Concept
A transaction is a single logical unit of work that accesses and possibly modifies database contents. It consists of a sequence of read and write operations followed by commit or abort.
Properties:
- Atomicity: All or nothing execution
- Consistency: Database constraints maintained
- Isolation: Concurrent transactions don't interfere
- Durability: Committed changes persist
ACID Properties
1. Atomicity (All-or-Nothing)
- Transaction treated as single unit
- Either all operations execute or none
- Rollback undoes partial execution
- Enforced by transaction management subsystem
2. Consistency (Correctness)
- Database moves from one consistent state to another
- Integrity constraints satisfied before and after
- Application programmer's responsibility to write correct transactions
- DBMS enforces constraints
3. Isolation (Concurrent Execution)
- Concurrent transactions isolated from each other
- Intermediate state of one transaction invisible to others
- Result same as serial execution
- Enforced by concurrency control subsystem
4. Durability (Persistence)
- Once committed, changes permanent
- Survive system failures
- Updates written to non-volatile storage
- Enforced by recovery subsystem
Transaction States
- Active: Initial state, executing operations
- Partially Committed: After final operation, before commit
- Committed: Successfully completed, changes permanent
- Failed: Cannot proceed normally (error, abort, deadlock)
- Aborted: Rolled back, database restored to pre-transaction state
- After abort: Either killed or restarted
Schedules
A schedule defines execution order of operations from concurrent transactions.
Types:
Serial Schedule: Transactions execute one after another
- T1 completes before T2 starts
- Always consistent
- Poor concurrency (low throughput)
Concurrent Schedule: Operations interleaved
- Better CPU and disk utilization
- Improved throughput and response time
- May cause inconsistencies without proper control
Goal: Maximize concurrency while ensuring correctness
Serializability
A concurrent schedule is serializable if equivalent to some serial schedule.
Why Important?
- Serializable schedules guarantee consistency
- Allow concurrent execution without anomalies
- Easier to reason about correctness
Types:
- Conflict Serializability
- View Serializability
Technical Details
CONFLICT SERIALIZABILITY
Conflicting Operations:
Two operations conflict if:
- Belong to different transactions
- Access same data item
- At least one is a write
Conflict Types:
- Read-Write (RW): T1 reads X, T2 writes X
- Write-Read (WR): T1 writes X, T2 reads X (dirty read)
- Write-Write (WW): T1 writes X, T2 writes X (lost update)
Conflict Equivalence:
Two schedules are conflict equivalent if:
- Contain same operations
- Every pair of conflicting operations ordered the same way
Testing Conflict Serializability: Precedence Graph
Algorithm:
- Create node for each transaction
- For each conflict Ti → Tj (Ti's operation before Tj's):
- Add directed edge Ti → Tj
- Schedule is conflict serializable ⟺ Graph is acyclic
Example 1: Serializable Schedule
Schedule S:
T1 T2
Read(A)
Read(A)
Write(A)
Write(A)
Read(B)
Read(B)
Write(B)
Write(B)
Precedence Graph:
- T1 Write(A) before T2 Read(A) → T1 → T2
- T1 Write(A) before T2 Write(A) → T1 → T2
- T1 Write(B) before T2 Read(B) → T1 → T2
- T1 Write(B) before T2 Write(B) → T1 → T2
Graph: T1 → T2 (no cycle)
Result: Conflict serializable (equivalent to T1, T2)
Example 2: Non-Serializable Schedule
Schedule S:
T1 T2
Read(A)
Write(A)
Write(A)
Read(B)
Read(B)
Write(B)
Write(B)
Precedence Graph:
- T1 Read(A) before T2 Write(A) → T1 → T2
- T2 Write(A) before T1 Write(A) → T2 → T1
Graph: T1 ⟷ T2 (cycle!)
Result: NOT conflict serializable
VIEW SERIALIZABILITY
View Equivalence:
Two schedules S1 and S2 are view equivalent if:
- Initial Read: If Ti reads initial value of X in S1, Ti reads initial value of X in S2
- Updated Read: If Ti reads X written by Tj in S1, Ti reads X written by Tj in S2
- Final Write: If Ti writes final value of X in S1, Ti writes final value of X in S2
View Serializability:
Schedule S is view serializable if view equivalent to some serial schedule.
Relationship:
- Conflict Serializability ⊆ View Serializability
- Every conflict serializable schedule is view serializable
- Some view serializable schedules are NOT conflict serializable
Example: View Serializable but NOT Conflict Serializable
Schedule S:
T1 T2 T3
Read(A)
Write(A)
Write(A)
Write(A)
Check View Equivalence with Serial T1→T2→T3:
- Initial Read: T1 reads A initially in both ✓
- Updated Read: No updated reads in S ✓
- Final Write: T3 writes final A in both ✓
Result: View serializable
Check Conflict Serializability:
- T1 Read(A) before T2 Write(A) → T1 → T2
- T2 Write(A) before T1 Write(A) → T2 → T1 (cycle!)
Result: NOT conflict serializable
CONCURRENCY CONTROL PROTOCOLS
1. LOCK-BASED PROTOCOLS
Lock Types:
Shared Lock (S-lock): Read lock
- Multiple transactions can hold S-lock simultaneously
- Allows reading, prevents writing
Exclusive Lock (X-lock): Write lock
- Only one transaction can hold X-lock
- Allows reading and writing
- Prevents all other locks
Lock Compatibility Matrix:
| S | X | |
|---|---|---|
| S | ✓ | ✗ |
| X | ✗ | ✗ |
Locking Protocol:
Read(X):
Lock-S(X)
Read X
Unlock(X)
Write(X):
Lock-X(X)
Write X
Unlock(X)
Problems with Simple Locking:
- Non-serializable schedules possible
- Need more restrictive protocol
TWO-PHASE LOCKING (2PL)
Rules:
Phase 1 (Growing Phase):
- Acquire locks
- Cannot release any locks
Phase 2 (Shrinking Phase):
- Release locks
- Cannot acquire any locks
Theorem: If all transactions follow 2PL, resulting schedule is conflict serializable.
Example:
T1:
Lock-X(A) // Growing phase starts
Read(A)
A := A - 50
Write(A)
Lock-X(B) // Still growing
Read(B)
B := B + 50
Write(B)
Unlock(A) // Shrinking phase starts
Unlock(B) // Still shrinking
Variants:
1. Strict 2PL (S2PL)
- Hold all exclusive locks until commit/abort
- Prevents cascading rollbacks
- Most commonly used
2. Rigorous 2PL
- Hold ALL locks (shared and exclusive) until commit/abort
- Simplifies recovery
- Serialization order = commit order
Problems with 2PL:
1. Cascading Rollback
T1 T2
Lock-X(A)
Read(A)
A := A - 50
Write(A)
Unlock(A) // Released early!
Lock-X(A)
Read(A) // Dirty read
A := A * 2
Write(A)
Abort // Must rollback T2 too!
Solution: Strict 2PL
2. Deadlock
T1 T2
Lock-X(A)
Lock-X(B)
Wait for B
Wait for A // Deadlock!
DEADLOCK HANDLING
1. Deadlock Prevention
Wait-Die Scheme (non-preemptive):
- Older transaction waits for younger
- Younger aborted if requests lock held by older
- Timestamp determines age
Wound-Wait Scheme (preemptive):
- Older transaction "wounds" (aborts) younger
- Younger waits for older
- Less rollbacks than Wait-Die
2. Deadlock Detection
Wait-For Graph:
- Node for each transaction
- Edge Ti → Tj if Ti waiting for lock held by Tj
- Deadlock ⟺ Cycle in graph
Detection Algorithm:
- Periodically check for cycles
- If cycle found, abort victim transaction
- Victim selection: Minimize cost (age, work done, locks held)
3. Deadlock Recovery
Victim Selection Factors:
- Transaction age (prefer younger)
- Progress made (prefer less work done)
- Number of locks held
- Number of transactions to cascade rollback
2. TIMESTAMP-BASED PROTOCOLS
Idea: Use timestamps to order transactions without locks
Timestamp Assignment:
- TS(Ti): Timestamp when Ti enters system
- TS(Ti) < TS(Tj) means Ti entered before Tj
- Serialization order = timestamp order
Timestamp Ordering (TO) Protocol:
For each data item X, maintain:
- R-timestamp(X): Largest timestamp of transaction that read X
- W-timestamp(X): Largest timestamp of transaction that wrote X
Rules:
Read(X) by Ti:
if TS(Ti) < W-timestamp(X):
Reject and rollback Ti // Trying to read future write
else:
Execute read
R-timestamp(X) = max(R-timestamp(X), TS(Ti))
Write(X) by Ti:
if TS(Ti) < R-timestamp(X):
Reject and rollback Ti // Trying to overwrite future read
else if TS(Ti) < W-timestamp(X):
Reject and rollback Ti // Trying to overwrite future write
else:
Execute write
W-timestamp(X) = TS(Ti)
Properties:
- Conflict serializable (follows timestamp order)
- Deadlock-free (no waiting)
- May cause more rollbacks than 2PL
Thomas' Write Rule (Optimization):
Modified Write(X):
if TS(Ti) < R-timestamp(X):
Reject and rollback Ti
else if TS(Ti) < W-timestamp(X):
Ignore write (obsolete) // Don't rollback!
else:
Execute write
W-timestamp(X) = TS(Ti)
Allows view serializable (not just conflict serializable)
3. OPTIMISTIC CONCURRENCY CONTROL (VALIDATION-BASED)
Assumption: Conflicts are rare
Three Phases:
1. Read Phase:
- Read from database
- Write to local copies (not database)
- No locks acquired
2. Validation Phase:
- Check if commit would cause serializability violation
- Test if operations conflict with other transactions
3. Write Phase:
- If validation succeeds: Write local copies to database
- If validation fails: Abort and restart
Validation Test:
For transaction Ti validating:
Check against each Tj that committed during Ti's execution:
- Tj completed before Ti started (OK)
- Ti's read phase after Tj's write phase (OK)
- Ti doesn't read items written by Tj (OK)
Advantages:
- No locks (high concurrency when conflicts rare)
- No deadlocks
- Read-only transactions never abort
Disadvantages:
- May abort due to conflicts
- Validation overhead
- Best when conflicts actually rare
MULTI-VERSION CONCURRENCY CONTROL (MVCC)
Idea: Keep multiple versions of data items
Benefits:
- Readers don't block writers
- Writers don't block readers
- Improved concurrency
Implementation:
Each write creates new version:
- X1, X2, X3, ... (versions of X)
- Each version has timestamp
Transaction reads appropriate version:
- Read-timestamp determines which version
Example (PostgreSQL, Oracle):
Time T1 T2
1 BEGIN
2 Read(X) → X1
3 BEGIN
4 Write(X) → X2
5 Read(X) → X1 // Still reads old version!
6 COMMIT
7 Read(X) → X1 // Consistent read
8 COMMIT
Snapshot Isolation:
- Transaction sees snapshot of committed data
- Reads always succeed (no blocking)
- Writes checked for conflicts at commit
MVCC in Practice:
- PostgreSQL: MVCC by default
- MySQL InnoDB: MVCC for SELECT
- Oracle: Read Consistency
RECOVERY SYSTEM
Failure Types:
1. Transaction Failure:
- Logical errors (constraint violation, bad input)
- System errors (deadlock)
- Recovery: Abort transaction
2. System Crash:
- Power failure, software bug
- Main memory lost, disk intact
- Recovery: Use logs to restore consistency
3. Disk Failure:
- Head crash, disk corruption
- Recovery: Use backup + log
LOG-BASED RECOVERY
Write-Ahead Logging (WAL):
Rule: Log record must be written to stable storage before database modified
Log Records:
<Ti start> // Transaction begins
<Ti, X, V_old, V_new> // Update record
<Ti commit> // Transaction commits
<Ti abort> // Transaction aborts
Example:
<T1 start>
<T1, A, 1000, 950> // A: 1000 → 950
<T1, B, 2000, 2050> // B: 2000 → 2050
<T1 commit>
RECOVERY ALGORITHMS
1. DEFERRED DATABASE MODIFICATION
- Don't write to database until commit
- Log records written during transaction
- At commit: Write changes to database
Recovery:
- Redo: For committed transactions, replay updates
- No Undo: Uncommitted transactions didn't modify database
2. IMMEDIATE DATABASE MODIFICATION
- Write to database during transaction execution
- Log records written before database updates (WAL)
Recovery:
- Undo: For uncommitted transactions, restore old values
- Redo: For committed transactions, replay updates
ARIES RECOVERY ALGORITHM
(Analysis, Redo, Undo)
Used by DB2, SQL Server, PostgreSQL (variants)
Three Phases:
1. Analysis Phase:
- Scan log from last checkpoint
- Identify active transactions at crash
- Determine which pages dirty
2. Redo Phase:
- Scan log forward from checkpoint
- Redo all updates (even uncommitted)
- Restores database to state at crash
3. Undo Phase:
- Scan log backward
- Undo updates of transactions active at crash
- Restore database to consistent state
Checkpointing:
Periodic checkpoint reduces recovery time:
- Write all dirty pages to disk
- Write checkpoint record to log
- Recovery starts from last checkpoint
RECOVERABILITY
Recoverable Schedule:
If Ti reads item written by Tj, Tj must commit before Ti commits.
Example:
T1 T2
Write(A)
Read(A) // Dirty read
Commit
Commit // OK (T1 committed first)
Cascadeless Schedule:
If Ti reads item written by Tj, Tj must commit before Ti reads.
Example:
T1 T2
Write(A)
Commit
Read(A) // Clean read
Commit
Strict Schedule:
No transaction reads or writes item until last transaction that wrote it commits/aborts.
Relationship:
Strict ⊂ Cascadeless ⊂ Recoverable
Best: Use Strict 2PL for strict schedules
Examples
COMPREHENSIVE EXAMPLES
EXAMPLE 1: ACID Properties in Banking
-- Transfer $100 from Account A to Account B
BEGIN TRANSACTION;
-- Read balances
DECLARE @balanceA DECIMAL(10,2);
DECLARE @balanceB DECIMAL(10,2);
SELECT @balanceA = Balance FROM Accounts WHERE AccountID = 'A';
SELECT @balanceB = Balance FROM Accounts WHERE AccountID = 'B';
-- Check sufficient funds (Consistency)
IF @balanceA >= 100
BEGIN
-- Deduct from A (Atomicity)
UPDATE Accounts SET Balance = Balance - 100 WHERE AccountID = 'A';
-- Add to B (Atomicity)
UPDATE Accounts SET Balance = Balance + 100 WHERE AccountID = 'B';
-- Both succeed or both fail
COMMIT; -- (Durability)
END
ELSE
BEGIN
-- Insufficient funds
ROLLBACK;
PRINT 'Insufficient funds';
END
-- (Isolation): Other transactions don't see intermediate state
EXAMPLE 2: Conflict Serializability Analysis
Schedule S1:
Time T1 T2
1 Read(X)
2 Read(X)
3 X := X - 100
4 Write(X)
5 X := X + 200
6 Write(X)
7 Read(Y)
8 Read(Y)
9 Y := Y + 100
10 Write(Y)
11 Y := Y + 200
12 Write(Y)
Conflicts:
- T1 Write(X) [4] before T2 Write(X) [6] → T1 → T2
- T1 Write(Y) [10] before T2 Write(Y) [12] → T1 → T2
Precedence Graph: T1 → T2 (no cycle)
Conclusion: Conflict serializable, equivalent to serial schedule T1→T2
Final Result:
- If X=100, Y=100 initially
- After T1→T2: X=200, Y=400
Schedule S2 (Non-Serializable):
Time T1 T2
1 Read(X)
2 Read(Y)
3 X := X - 100
4 Write(X)
5 Y := Y - 100
6 Write(Y)
7 Read(Y)
8 Read(X)
9 Y := Y + 100
10 Write(Y)
11 X := X + 100
12 Write(X)
Conflicts:
- T1 Write(X) [4] before T2 Read(X) [8] → T1 → T2
- T2 Write(Y) [6] before T1 Read(Y) [7] → T2 → T1
Precedence Graph: T1 ⟷ T2 (cycle!)
Conclusion: NOT conflict serializable
EXAMPLE 3: Two-Phase Locking
Transaction T1 (Withdraw $100):
BEGIN TRANSACTION;
-- Growing Phase
Lock-S(Accounts) -- Check existence
Lock-X(Account_A) -- Will modify
IF Balance_A >= 100:
Balance_A = Balance_A - 100
UPDATE...
-- Shrinking Phase (Strict 2PL: at commit)
COMMIT
Unlock-X(Account_A)
Unlock-S(Accounts)
ELSE:
ROLLBACK
Unlock-X(Account_A)
Unlock-S(Accounts)
EXAMPLE 4: Deadlock Detection
Time T1 T2 T3
1 Lock-X(A)
2 Lock-X(B)
3 Lock-X(C)
4 Request Lock-X(B)
5 WAIT Request Lock-X(C)
6 WAIT Request Lock-X(A)
7 WAIT
Wait-For Graph:
T1 → T2 → T3 → T1 (CYCLE - DEADLOCK!)
Victim Selection: Abort T3 (youngest)
8 ABORT
9 Lock-X(C) granted
10 COMMIT
11 Lock-X(B) granted
12 COMMIT
EXAMPLE 5: Timestamp Ordering
Transactions:
- T1: TS = 10
- T2: TS = 20
- T3: TS = 30
Data Item X:
- Initial R-TS(X) = 0, W-TS(X) = 0
Time Operation R-TS(X) W-TS(X) Result
1 T1: Read(X) 10 0 Success
2 T2: Write(X) 10 20 Success
3 T1: Write(X) 10 20 ABORT
// TS(T1)=10 < W-TS(X)=20
// T1 trying to overwrite T2's future write
4 T3: Read(X) 30 20 Success
5 T2: Write(X) 30 20 ABORT
// TS(T2)=20 < R-TS(X)=30
// T2 trying to overwrite T3's future read
EXAMPLE 6: Optimistic Concurrency Control
T1: T2:
Read Phase: Read Phase:
Read(X) [local]
X = X + 100
Read(X) [local]
X = X * 2
Validation Phase:
Check conflicts
No conflicts found
PASS
Write Phase:
Write(X) to DB
Validation Phase:
Check conflicts
T1 wrote X after T2 read
FAIL - ABORT T2
EXAMPLE 7: MVCC (Snapshot Isolation)
Timeline:
T1 T2
t1 BEGIN (Snapshot at t1)
t2 Read(X) → 100
t3 BEGIN (Snapshot at t3)
t4 Write(X=200)
t5 Read(X) → 100 (still!)
t6 COMMIT
t7 Read(X) → 100 (consistent)
t8 COMMIT
Versions of X:
X@t0 = 100 (initial)
X@t6 = 200 (T2's write)
T1 reads X@t1 = 100 (snapshot isolation)
T2 reads X@t3 = 100, writes X@t6 = 200
EXAMPLE 8: Recovery Using Log
Log:
<T1 start>
<T1, A, 1000, 950>
<T1, B, 2000, 2050>
<T1 commit>
<T2 start>
<T2, C, 500, 450>
<T2, D, 300, 450>
*** CRASH ***
Recovery:
1. Identify committed: T1 (has <T1 commit>)
2. Identify active: T2 (no commit/abort)
3. REDO T1:
- A = 950
- B = 2050
4. UNDO T2:
- D = 300 (restore old value)
- C = 500 (restore old value)
Final State:
- A = 950 (T1 committed)
- B = 2050 (T1 committed)
- C = 500 (T2 undone)
- D = 300 (T2 undone)
EXAMPLE 9: Cascading Rollback
T1 T2 T3
Write(A)
Read(A) // Dirty read from T1
Read(A) // Dirty read from T1
Abort // T1 aborts
Result:
- T1 aborts (application logic)
- T2 must abort (read uncommitted data from T1)
- T3 must abort (read uncommitted data from T1)
Cascading rollback!
Prevention: Use Strict 2PL
- Hold locks until commit
- No dirty reads possible
EXAMPLE 10: Isolation Levels
-- READ UNCOMMITTED: Allows dirty reads
Session 1:
BEGIN TRANSACTION;
UPDATE Accounts SET Balance = 500 WHERE ID = 1;
-- Not committed yet
Session 2:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT Balance FROM Accounts WHERE ID = 1;
-- Returns: 500 (DIRTY READ!)
Session 1:
ROLLBACK; -- Oops, T2 saw uncommitted data!
-- READ COMMITTED: Prevents dirty reads
Session 2:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
SELECT Balance FROM Accounts WHERE ID = 1;
-- Waits for Session 1 to commit/rollback
-- Or reads last committed value
-- REPEATABLE READ: Prevents dirty & non-repeatable reads
Session 1:
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
SELECT Balance FROM Accounts WHERE ID = 1; -- 1000
-- Another session updates to 500 and commits
SELECT Balance FROM Accounts WHERE ID = 1; -- Still 1000!
COMMIT;
-- SERIALIZABLE: Prevents all anomalies
Session 1:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
SELECT COUNT(*) FROM Accounts; -- 10
-- Another session inserts new account and commits
SELECT COUNT(*) FROM Accounts; -- Still 10! (No phantom)
COMMIT;
Real-World Use
PRACTICAL GUIDELINES
1. Choosing Concurrency Control:
Two-Phase Locking (2PL):
- Use when: Traditional OLTP systems
- Pros: Well-understood, widely implemented
- Cons: Deadlocks possible, lower concurrency
- Examples: Banking, e-commerce orders
Timestamp Ordering:
- Use when: Read-heavy workloads
- Pros: Deadlock-free, good for distributed systems
- Cons: More rollbacks
- Examples: Analytics, reporting
Optimistic:
- Use when: Conflicts rare
- Pros: High concurrency, no locks
- Cons: Validation overhead, may abort
- Examples: Document editing, wikis
MVCC:
- Use when: High read/write concurrency needed
- Pros: Readers don't block writers
- Cons: Storage overhead (multiple versions)
- Examples: PostgreSQL (default), Oracle
2. Transaction Design Best Practices:
Keep Transactions Short:
-- BAD: Long transaction
BEGIN TRANSACTION;
SELECT ... -- Heavy query
-- User interaction here! (NEVER!)
INSERT ...
COMMIT;
-- GOOD: Short transaction
SELECT ... -- Outside transaction
-- Process data
BEGIN TRANSACTION;
INSERT ...
COMMIT;
Acquire Locks in Consistent Order:
-- BAD: Deadlock possible
T1: Lock A, then Lock B
T2: Lock B, then Lock A
-- GOOD: Same order
T1: Lock A, then Lock B
T2: Lock A, then Lock B
Handle Deadlocks:
max_retries = 3
for attempt in range(max_retries):
try:
# Begin transaction
conn.execute("BEGIN")
# ... operations ...
conn.execute("COMMIT")
break # Success
except DeadlockException:
conn.execute("ROLLBACK")
if attempt == max_retries - 1:
raise # Give up
time.sleep(random.uniform(0.1, 0.5)) # Random backoff
3. Choosing Isolation Level:
| Level | Dirty Read | Non-Repeatable | Phantom | Use Case |
|---|---|---|---|---|
| READ UNCOMMITTED | Yes | Yes | Yes | Never use |
| READ COMMITTED | No | Yes | Yes | Default for most apps |
| REPEATABLE READ | No | No | Yes | Financial reports |
| SERIALIZABLE | No | No | No | Critical operations |
MySQL/InnoDB:
-- Default: REPEATABLE READ
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN;
-- ... operations ...
COMMIT;
PostgreSQL:
-- Default: READ COMMITTED
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN;
-- ... operations ...
COMMIT;
4. Monitoring Transactions:
PostgreSQL:
-- Active transactions
SELECT * FROM pg_stat_activity
WHERE state = 'active';
-- Long-running transactions
SELECT pid, now() - query_start AS duration, query
FROM pg_stat_activity
WHERE state = 'active'
ORDER BY duration DESC;
-- Blocking queries
SELECT blocked_locks.pid, blocked_activity.query,
blocking_locks.pid AS blocking_pid, blocking_activity.query AS blocking_query
FROM pg_locks blocked_locks
JOIN pg_stat_activity blocked_activity ON blocked_locks.pid = blocked_activity.pid
JOIN pg_locks blocking_locks ON blocked_locks.locktype = blocking_locks.locktype
JOIN pg_stat_activity blocking_activity ON blocking_locks.pid = blocking_activity.pid
WHERE NOT blocked_locks.granted;
MySQL:
-- Active transactions
SHOW PROCESSLIST;
-- InnoDB status (includes locks, deadlocks)
SHOW ENGINE INNODB STATUS;
-- Lock waits
SELECT * FROM information_schema.INNODB_LOCKS;
SELECT * FROM information_schema.INNODB_LOCK_WAITS;
5. Deadlock Prevention:
Timeout-Based:
-- MySQL
SET innodb_lock_wait_timeout = 50; -- seconds
-- PostgreSQL
SET lock_timeout = '30s';
SET statement_timeout = '60s';
Lock Ordering:
- Always acquire locks on tables in same order
- Document locking policy
- Code reviews to enforce
6. Recovery Configuration:
Write-Ahead Logging:
-- PostgreSQL
wal_level = replica
fsync = on
synchronous_commit = on
wal_sync_method = fsync
-- MySQL InnoDB
innodb_flush_log_at_trx_commit = 1
innodb_doublewrite = 1
Checkpoint Configuration:
-- PostgreSQL
checkpoint_timeout = 5min
max_wal_size = 1GB
checkpoint_completion_target = 0.9
-- MySQL
innodb_log_file_size = 256M
innodb_log_files_in_group = 2
7. Real-World Scenarios:
Banking (High Consistency):
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN;
-- Transfer logic
COMMIT;
E-commerce (Balance):
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN;
-- Order processing
COMMIT;
Analytics (Read Performance):
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; -- Report can tolerate
BEGIN;
-- Long-running analytics query
COMMIT;
Social Media (MVCC):
- Use database with MVCC (PostgreSQL)
- Readers never block writers
- High concurrent read/write throughput
For exams
IMPORTANT EXAM QUESTIONS
Concepts:
-
Define transaction and explain ACID properties with examples.
- Transaction: Logical unit of work
- Atomicity: All or nothing
- Consistency: Maintains invariants
- Isolation: Concurrent execution doesn't interfere
- Durability: Committed changes persist
-
Draw and explain transaction state diagram.
States: Active → Partially Committed → Committed
Active → Failed → Aborted -
What is serializability? Why is it important?
- Concurrent schedule equivalent to serial schedule
- Ensures correctness of concurrent execution
- Maximizes concurrency while maintaining consistency
-
Explain conflict serializability. How do you test it?
- Conflicting operations ordered same as some serial schedule
- Test: Build precedence graph, check for cycles
- No cycle → conflict serializable
-
What is view serializability? How does it differ from conflict serializability?
- View: Same initial reads, updated reads, final writes
- View serializability ⊇ Conflict serializability
- Some view serializable schedules not conflict serializable
Two-Phase Locking:
-
Explain Two-Phase Locking protocol. Why does it ensure serializability?
- Growing phase: Acquire locks
- Shrinking phase: Release locks
- Prevents conflicts, ensures serializable execution
-
What are the problems with basic 2PL? How are they solved?
- Cascading rollback → Strict 2PL
- Deadlock → Detection/prevention schemes
-
Compare Strict 2PL and Rigorous 2PL.
- Strict: Hold X-locks until commit
- Rigorous: Hold ALL locks until commit
Deadlocks:
-
Explain deadlock detection using wait-for graph.
- Node per transaction
- Edge Ti→Tj if Ti waits for Tj
- Cycle = deadlock
-
Compare Wait-Die and Wound-Wait schemes.
- Wait-Die: Older waits, younger aborts
- Wound-Wait: Older aborts younger, younger waits
Timestamp Ordering:
-
Explain timestamp-based concurrency control.
- Each transaction gets timestamp
- Read/Write rules based on R-TS and W-TS
- Ensures timestamp order serialization
-
What is Thomas' Write Rule? What does it allow?
- Ignore obsolete writes instead of aborting
- Allows view serializability
Other Protocols:
-
Explain optimistic concurrency control. When is it preferred?
- Read, Validate, Write phases
- Best when conflicts rare
- High concurrency, no locks
-
What is MVCC? How does it improve concurrency?
- Multiple versions of data
- Readers don't block writers
- Used in PostgreSQL, Oracle
Recovery:
-
Explain Write-Ahead Logging (WAL) and its importance.
- Log written before data
- Ensures recovery possible
- Foundation of ARIES
-
What is the difference between undo and redo?
- Undo: Restore old values (uncommitted transactions)
- Redo: Replay updates (committed transactions)
-
Explain ARIES recovery algorithm.
- Analysis: Identify transactions to undo/redo
- Redo: Replay all updates
- Undo: Reverse uncommitted updates
-
What is checkpointing? Why is it important?
- Periodic snapshot to stable storage
- Reduces recovery time
- Recovery starts from checkpoint
Schedules:
-
Given a schedule, determine if it's conflict serializable using precedence graph.
- Build graph with conflicts
- Check for cycles
- No cycle → serializable
-
What are recoverable, cascadeless, and strict schedules?
- Recoverable: Read order ensures recoverability
- Cascadeless: No dirty reads
- Strict: No read/write until commit
Practical:
-
Compare SQL isolation levels. Which anomalies does each prevent?
Level Dirty Non-repeatable Phantom READ UNCOMMITTED Allow Allow Allow READ COMMITTED Prevent Allow Allow REPEATABLE READ Prevent Prevent Allow SERIALIZABLE Prevent Prevent Prevent -
How do you handle deadlocks in application code?
- Retry with exponential backoff
- Timeout configuration
- Error handling and rollback
QUICK REVISION
ACID:
- Atomicity: All or nothing
- Consistency: Maintain invariants
- Isolation: Concurrent → serial
- Durability: Committed → permanent
Serializability:
- Conflict: Build precedence graph, no cycle
- View: Initial read, updated read, final write same
2PL:
- Growing: Acquire locks
- Shrinking: Release locks
- Strict: Hold X-locks until commit
Deadlock:
- Detection: Wait-for graph, cycle
- Prevention: Wait-Die, Wound-Wait
- Recovery: Abort victim, retry
Timestamp:
- TS order = serialization order
- Check R-TS, W-TS before read/write
- Thomas Write Rule: Ignore obsolete writes
Recovery:
- WAL: Log before data
- Undo: Uncommitted transactions
- Redo: Committed transactions
- ARIES: Analysis, Redo, Undo
Isolation Levels:
- READ UNCOMMITTED < READ COMMITTED < REPEATABLE READ < SERIALIZABLE
- Higher level = More consistency, Less concurrency
Key points
KEY TAKEAWAYS
✓ ACID properties guarantee reliable transactions: Atomicity, Consistency, Isolation, Durability
✓ Serializability ensures correctness: Concurrent execution equivalent to some serial execution
✓ Conflict serializability tested with precedence graph: No cycle → serializable
✓ View serializability is broader: Includes some non-conflict serializable schedules
✓ Two-Phase Locking (2PL) ensures serializability: Growing phase, shrinking phase
✓ Strict 2PL prevents cascading rollbacks: Hold X-locks until commit
✓ Deadlocks require handling: Detection (wait-for graph) or prevention (Wait-Die, Wound-Wait)
✓ Timestamp ordering is deadlock-free: But may cause more rollbacks
✓ Optimistic concurrency good for rare conflicts: Read, validate, write phases
✓ MVCC enables high concurrency: Readers don't block writers, multiple versions
✓ Write-Ahead Logging (WAL) enables recovery: Log before data modification
✓ Recovery involves undo and redo: Undo uncommitted, redo committed
✓ ARIES is industry-standard recovery: Analysis, Redo, Undo phases
✓ Isolation levels trade consistency for performance: SERIALIZABLE strongest, READ UNCOMMITTED weakest
✓ Transaction design matters: Keep short, acquire locks in order, handle deadlocks
✓ Choose protocol based on workload: 2PL for OLTP, MVCC for high concurrency, Optimistic for rare conflicts
REMEMBER: Transactions are the foundation of reliable database systems. Understanding concurrency control and recovery is essential for building robust applications that maintain data integrity even under concurrent access and system failures!