Transactions, ACID Properties, Concurrency Control, and Recovery

Unit 4CLO05

Imagine transferring ₹500 from savings to checking. The bank must: (1) deduct ₹500 from savings, (2) add ₹500 to checking. What if the system crashes between steps? You'd lose ₹500! Transactions ensure either both steps happen or neither does—no in-between states. Understanding transactions, concurrency control, and recovery is essential for building reliable database applications.

The Basics

Transaction Concept

A transaction is a single logical unit of work that accesses and possibly modifies database contents. It consists of a sequence of read and write operations followed by commit or abort.

Properties:

  • Atomicity: All or nothing execution
  • Consistency: Database constraints maintained
  • Isolation: Concurrent transactions don't interfere
  • Durability: Committed changes persist

ACID Properties

1. Atomicity (All-or-Nothing)

  • Transaction treated as single unit
  • Either all operations execute or none
  • Rollback undoes partial execution
  • Enforced by transaction management subsystem

2. Consistency (Correctness)

  • Database moves from one consistent state to another
  • Integrity constraints satisfied before and after
  • Application programmer's responsibility to write correct transactions
  • DBMS enforces constraints

3. Isolation (Concurrent Execution)

  • Concurrent transactions isolated from each other
  • Intermediate state of one transaction invisible to others
  • Result same as serial execution
  • Enforced by concurrency control subsystem

4. Durability (Persistence)

  • Once committed, changes permanent
  • Survive system failures
  • Updates written to non-volatile storage
  • Enforced by recovery subsystem

Transaction States

Transaction State Diagram

  1. Active: Initial state, executing operations
  2. Partially Committed: After final operation, before commit
  3. Committed: Successfully completed, changes permanent
  4. Failed: Cannot proceed normally (error, abort, deadlock)
  5. Aborted: Rolled back, database restored to pre-transaction state
    • After abort: Either killed or restarted

Schedules

A schedule defines execution order of operations from concurrent transactions.

Types:

Serial Schedule: Transactions execute one after another

  • T1 completes before T2 starts
  • Always consistent
  • Poor concurrency (low throughput)

Concurrent Schedule: Operations interleaved

  • Better CPU and disk utilization
  • Improved throughput and response time
  • May cause inconsistencies without proper control

Goal: Maximize concurrency while ensuring correctness

Serializability

A concurrent schedule is serializable if equivalent to some serial schedule.

Why Important?

  • Serializable schedules guarantee consistency
  • Allow concurrent execution without anomalies
  • Easier to reason about correctness

Types:

  1. Conflict Serializability
  2. View Serializability

Technical Details

CONFLICT SERIALIZABILITY

Conflicting Operations:

Two operations conflict if:

  1. Belong to different transactions
  2. Access same data item
  3. At least one is a write

Conflict Types:

  1. Read-Write (RW): T1 reads X, T2 writes X
  2. Write-Read (WR): T1 writes X, T2 reads X (dirty read)
  3. Write-Write (WW): T1 writes X, T2 writes X (lost update)

Conflict Equivalence:

Two schedules are conflict equivalent if:

  • Contain same operations
  • Every pair of conflicting operations ordered the same way

Testing Conflict Serializability: Precedence Graph

Algorithm:

  1. Create node for each transaction
  2. For each conflict Ti → Tj (Ti's operation before Tj's):
    • Add directed edge Ti → Tj
  3. Schedule is conflict serializable ⟺ Graph is acyclic

Example 1: Serializable Schedule

Schedule S:
T1          T2
Read(A)
            Read(A)
Write(A)
            Write(A)
Read(B)
            Read(B)
Write(B)
            Write(B)

Precedence Graph:

  • T1 Write(A) before T2 Read(A) → T1 → T2
  • T1 Write(A) before T2 Write(A) → T1 → T2
  • T1 Write(B) before T2 Read(B) → T1 → T2
  • T1 Write(B) before T2 Write(B) → T1 → T2

Graph: T1 → T2 (no cycle)

Result: Conflict serializable (equivalent to T1, T2)

Example 2: Non-Serializable Schedule

Schedule S:
T1          T2
Read(A)
            Write(A)
Write(A)
Read(B)
            Read(B)
            Write(B)
Write(B)

Precedence Graph:

  • T1 Read(A) before T2 Write(A) → T1 → T2
  • T2 Write(A) before T1 Write(A) → T2 → T1

Graph: T1 ⟷ T2 (cycle!)

Result: NOT conflict serializable

VIEW SERIALIZABILITY

View Equivalence:

Two schedules S1 and S2 are view equivalent if:

  1. Initial Read: If Ti reads initial value of X in S1, Ti reads initial value of X in S2
  2. Updated Read: If Ti reads X written by Tj in S1, Ti reads X written by Tj in S2
  3. Final Write: If Ti writes final value of X in S1, Ti writes final value of X in S2

View Serializability:

Schedule S is view serializable if view equivalent to some serial schedule.

Relationship:

  • Conflict Serializability ⊆ View Serializability
  • Every conflict serializable schedule is view serializable
  • Some view serializable schedules are NOT conflict serializable

Example: View Serializable but NOT Conflict Serializable

Schedule S:
T1          T2          T3
Read(A)
            Write(A)
Write(A)
                        Write(A)

Check View Equivalence with Serial T1→T2→T3:

  1. Initial Read: T1 reads A initially in both ✓
  2. Updated Read: No updated reads in S ✓
  3. Final Write: T3 writes final A in both ✓

Result: View serializable

Check Conflict Serializability:

  • T1 Read(A) before T2 Write(A) → T1 → T2
  • T2 Write(A) before T1 Write(A) → T2 → T1 (cycle!)

Result: NOT conflict serializable

CONCURRENCY CONTROL PROTOCOLS

1. LOCK-BASED PROTOCOLS

Lock Types:

Shared Lock (S-lock): Read lock

  • Multiple transactions can hold S-lock simultaneously
  • Allows reading, prevents writing

Exclusive Lock (X-lock): Write lock

  • Only one transaction can hold X-lock
  • Allows reading and writing
  • Prevents all other locks

Lock Compatibility Matrix:

SX
S
X

Locking Protocol:

Read(X):
  Lock-S(X)
  Read X
  Unlock(X)

Write(X):
  Lock-X(X)
  Write X
  Unlock(X)

Problems with Simple Locking:

  • Non-serializable schedules possible
  • Need more restrictive protocol

TWO-PHASE LOCKING (2PL)

Rules:

Phase 1 (Growing Phase):

  • Acquire locks
  • Cannot release any locks

Phase 2 (Shrinking Phase):

  • Release locks
  • Cannot acquire any locks

Theorem: If all transactions follow 2PL, resulting schedule is conflict serializable.

Example:

T1:
Lock-X(A)     // Growing phase starts
Read(A)
A := A - 50
Write(A)
Lock-X(B)     // Still growing
Read(B)
B := B + 50
Write(B)
Unlock(A)     // Shrinking phase starts
Unlock(B)     // Still shrinking

Variants:

1. Strict 2PL (S2PL)

  • Hold all exclusive locks until commit/abort
  • Prevents cascading rollbacks
  • Most commonly used

2. Rigorous 2PL

  • Hold ALL locks (shared and exclusive) until commit/abort
  • Simplifies recovery
  • Serialization order = commit order

Problems with 2PL:

1. Cascading Rollback

T1          T2
Lock-X(A)
Read(A)
A := A - 50
Write(A)
Unlock(A)   // Released early!
            Lock-X(A)
            Read(A)  // Dirty read
            A := A * 2
            Write(A)
Abort       // Must rollback T2 too!

Solution: Strict 2PL

2. Deadlock

T1          T2
Lock-X(A)
            Lock-X(B)
Wait for B
            Wait for A  // Deadlock!

DEADLOCK HANDLING

1. Deadlock Prevention

Wait-Die Scheme (non-preemptive):

  • Older transaction waits for younger
  • Younger aborted if requests lock held by older
  • Timestamp determines age

Wound-Wait Scheme (preemptive):

  • Older transaction "wounds" (aborts) younger
  • Younger waits for older
  • Less rollbacks than Wait-Die

2. Deadlock Detection

Wait-For Graph:

  • Node for each transaction
  • Edge Ti → Tj if Ti waiting for lock held by Tj
  • Deadlock ⟺ Cycle in graph

Detection Algorithm:

  • Periodically check for cycles
  • If cycle found, abort victim transaction
  • Victim selection: Minimize cost (age, work done, locks held)

3. Deadlock Recovery

Victim Selection Factors:

  • Transaction age (prefer younger)
  • Progress made (prefer less work done)
  • Number of locks held
  • Number of transactions to cascade rollback

2. TIMESTAMP-BASED PROTOCOLS

Idea: Use timestamps to order transactions without locks

Timestamp Assignment:

  • TS(Ti): Timestamp when Ti enters system
  • TS(Ti) < TS(Tj) means Ti entered before Tj
  • Serialization order = timestamp order

Timestamp Ordering (TO) Protocol:

For each data item X, maintain:

  • R-timestamp(X): Largest timestamp of transaction that read X
  • W-timestamp(X): Largest timestamp of transaction that wrote X

Rules:

Read(X) by Ti:

if TS(Ti) < W-timestamp(X):
    Reject and rollback Ti  // Trying to read future write
else:
    Execute read
    R-timestamp(X) = max(R-timestamp(X), TS(Ti))

Write(X) by Ti:

if TS(Ti) < R-timestamp(X):
    Reject and rollback Ti  // Trying to overwrite future read
else if TS(Ti) < W-timestamp(X):
    Reject and rollback Ti  // Trying to overwrite future write
else:
    Execute write
    W-timestamp(X) = TS(Ti)

Properties:

  • Conflict serializable (follows timestamp order)
  • Deadlock-free (no waiting)
  • May cause more rollbacks than 2PL

Thomas' Write Rule (Optimization):

Modified Write(X):

if TS(Ti) < R-timestamp(X):
    Reject and rollback Ti
else if TS(Ti) < W-timestamp(X):
    Ignore write (obsolete)  // Don't rollback!
else:
    Execute write
    W-timestamp(X) = TS(Ti)

Allows view serializable (not just conflict serializable)

3. OPTIMISTIC CONCURRENCY CONTROL (VALIDATION-BASED)

Assumption: Conflicts are rare

Three Phases:

1. Read Phase:

  • Read from database
  • Write to local copies (not database)
  • No locks acquired

2. Validation Phase:

  • Check if commit would cause serializability violation
  • Test if operations conflict with other transactions

3. Write Phase:

  • If validation succeeds: Write local copies to database
  • If validation fails: Abort and restart

Validation Test:

For transaction Ti validating:

Check against each Tj that committed during Ti's execution:

  1. Tj completed before Ti started (OK)
  2. Ti's read phase after Tj's write phase (OK)
  3. Ti doesn't read items written by Tj (OK)

Advantages:

  • No locks (high concurrency when conflicts rare)
  • No deadlocks
  • Read-only transactions never abort

Disadvantages:

  • May abort due to conflicts
  • Validation overhead
  • Best when conflicts actually rare

MULTI-VERSION CONCURRENCY CONTROL (MVCC)

Idea: Keep multiple versions of data items

Benefits:

  • Readers don't block writers
  • Writers don't block readers
  • Improved concurrency

Implementation:

Each write creates new version:

  • X1, X2, X3, ... (versions of X)
  • Each version has timestamp

Transaction reads appropriate version:

  • Read-timestamp determines which version

Example (PostgreSQL, Oracle):

Time    T1                  T2
1       BEGIN
2       Read(X) → X1
3                           BEGIN
4                           Write(X) → X2
5       Read(X) → X1        // Still reads old version!
6                           COMMIT
7       Read(X) → X1        // Consistent read
8       COMMIT

Snapshot Isolation:

  • Transaction sees snapshot of committed data
  • Reads always succeed (no blocking)
  • Writes checked for conflicts at commit

MVCC in Practice:

  • PostgreSQL: MVCC by default
  • MySQL InnoDB: MVCC for SELECT
  • Oracle: Read Consistency

RECOVERY SYSTEM

Failure Types:

1. Transaction Failure:

  • Logical errors (constraint violation, bad input)
  • System errors (deadlock)
  • Recovery: Abort transaction

2. System Crash:

  • Power failure, software bug
  • Main memory lost, disk intact
  • Recovery: Use logs to restore consistency

3. Disk Failure:

  • Head crash, disk corruption
  • Recovery: Use backup + log

LOG-BASED RECOVERY

Write-Ahead Logging (WAL):

Rule: Log record must be written to stable storage before database modified

Log Records:

<Ti start>               // Transaction begins
<Ti, X, V_old, V_new>   // Update record
<Ti commit>              // Transaction commits
<Ti abort>               // Transaction aborts

Example:

<T1 start>
<T1, A, 1000, 950>  // A: 1000 → 950
<T1, B, 2000, 2050> // B: 2000 → 2050
<T1 commit>

RECOVERY ALGORITHMS

1. DEFERRED DATABASE MODIFICATION

  • Don't write to database until commit
  • Log records written during transaction
  • At commit: Write changes to database

Recovery:

  • Redo: For committed transactions, replay updates
  • No Undo: Uncommitted transactions didn't modify database

2. IMMEDIATE DATABASE MODIFICATION

  • Write to database during transaction execution
  • Log records written before database updates (WAL)

Recovery:

  • Undo: For uncommitted transactions, restore old values
  • Redo: For committed transactions, replay updates

ARIES RECOVERY ALGORITHM

(Analysis, Redo, Undo)

Used by DB2, SQL Server, PostgreSQL (variants)

Three Phases:

1. Analysis Phase:

  • Scan log from last checkpoint
  • Identify active transactions at crash
  • Determine which pages dirty

2. Redo Phase:

  • Scan log forward from checkpoint
  • Redo all updates (even uncommitted)
  • Restores database to state at crash

3. Undo Phase:

  • Scan log backward
  • Undo updates of transactions active at crash
  • Restore database to consistent state

Checkpointing:

Periodic checkpoint reduces recovery time:

  1. Write all dirty pages to disk
  2. Write checkpoint record to log
  3. Recovery starts from last checkpoint

RECOVERABILITY

Recoverable Schedule:

If Ti reads item written by Tj, Tj must commit before Ti commits.

Example:

T1          T2
Write(A)
            Read(A)  // Dirty read
Commit
            Commit   // OK (T1 committed first)

Cascadeless Schedule:

If Ti reads item written by Tj, Tj must commit before Ti reads.

Example:

T1          T2
Write(A)
Commit
            Read(A)  // Clean read
            Commit

Strict Schedule:

No transaction reads or writes item until last transaction that wrote it commits/aborts.

Relationship:

Strict ⊂ Cascadeless ⊂ Recoverable

Best: Use Strict 2PL for strict schedules

Examples

COMPREHENSIVE EXAMPLES

EXAMPLE 1: ACID Properties in Banking

-- Transfer $100 from Account A to Account B
BEGIN TRANSACTION;

-- Read balances
DECLARE @balanceA DECIMAL(10,2);
DECLARE @balanceB DECIMAL(10,2);

SELECT @balanceA = Balance FROM Accounts WHERE AccountID = 'A';
SELECT @balanceB = Balance FROM Accounts WHERE AccountID = 'B';

-- Check sufficient funds (Consistency)
IF @balanceA >= 100
BEGIN
    -- Deduct from A (Atomicity)
    UPDATE Accounts SET Balance = Balance - 100 WHERE AccountID = 'A';
    
    -- Add to B (Atomicity)
    UPDATE Accounts SET Balance = Balance + 100 WHERE AccountID = 'B';
    
    -- Both succeed or both fail
    COMMIT; -- (Durability)
END
ELSE
BEGIN
    -- Insufficient funds
    ROLLBACK;
    PRINT 'Insufficient funds';
END

-- (Isolation): Other transactions don't see intermediate state

EXAMPLE 2: Conflict Serializability Analysis

Schedule S1:

Time    T1              T2
1       Read(X)
2                       Read(X)
3       X := X - 100
4       Write(X)
5                       X := X + 200
6                       Write(X)
7       Read(Y)
8                       Read(Y)
9       Y := Y + 100
10      Write(Y)
11                      Y := Y + 200
12                      Write(Y)

Conflicts:

  1. T1 Write(X) [4] before T2 Write(X) [6] → T1 → T2
  2. T1 Write(Y) [10] before T2 Write(Y) [12] → T1 → T2

Precedence Graph: T1 → T2 (no cycle)

Conclusion: Conflict serializable, equivalent to serial schedule T1→T2

Final Result:

  • If X=100, Y=100 initially
  • After T1→T2: X=200, Y=400

Schedule S2 (Non-Serializable):

Time    T1              T2
1       Read(X)
2                       Read(Y)
3       X := X - 100
4       Write(X)
5                       Y := Y - 100
6                       Write(Y)
7       Read(Y)
8                       Read(X)
9       Y := Y + 100
10      Write(Y)
11                      X := X + 100
12                      Write(X)

Conflicts:

  1. T1 Write(X) [4] before T2 Read(X) [8] → T1 → T2
  2. T2 Write(Y) [6] before T1 Read(Y) [7] → T2 → T1

Precedence Graph: T1 ⟷ T2 (cycle!)

Conclusion: NOT conflict serializable

EXAMPLE 3: Two-Phase Locking

Transaction T1 (Withdraw $100):

BEGIN TRANSACTION;

-- Growing Phase
Lock-S(Accounts)        -- Check existence
Lock-X(Account_A)       -- Will modify

IF Balance_A >= 100:
    Balance_A = Balance_A - 100
    UPDATE...
    
    -- Shrinking Phase (Strict 2PL: at commit)
    COMMIT
    Unlock-X(Account_A)
    Unlock-S(Accounts)
ELSE:
    ROLLBACK
    Unlock-X(Account_A)
    Unlock-S(Accounts)

EXAMPLE 4: Deadlock Detection

Time    T1                  T2                  T3
1       Lock-X(A)
2                           Lock-X(B)
3                                               Lock-X(C)
4       Request Lock-X(B)   
5       WAIT                Request Lock-X(C)
6                           WAIT                Request Lock-X(A)
7                                               WAIT

Wait-For Graph:
T1 → T2 → T3 → T1 (CYCLE - DEADLOCK!)

Victim Selection: Abort T3 (youngest)

8                                               ABORT
9                           Lock-X(C) granted
10                          COMMIT
11      Lock-X(B) granted
12      COMMIT

EXAMPLE 5: Timestamp Ordering

Transactions:

  • T1: TS = 10
  • T2: TS = 20
  • T3: TS = 30

Data Item X:

  • Initial R-TS(X) = 0, W-TS(X) = 0
Time    Operation           R-TS(X)     W-TS(X)     Result
1       T1: Read(X)         10          0           Success
2       T2: Write(X)        10          20          Success
3       T1: Write(X)        10          20          ABORT
        // TS(T1)=10 < W-TS(X)=20
        // T1 trying to overwrite T2's future write
        
4       T3: Read(X)         30          20          Success
5       T2: Write(X)        30          20          ABORT
        // TS(T2)=20 < R-TS(X)=30
        // T2 trying to overwrite T3's future read

EXAMPLE 6: Optimistic Concurrency Control

T1:                         T2:
Read Phase:                 Read Phase:
Read(X) [local]             
X = X + 100
                            Read(X) [local]
                            X = X * 2

Validation Phase:           
Check conflicts             
No conflicts found          
PASS                        

Write Phase:                
Write(X) to DB              

                            Validation Phase:
                            Check conflicts
                            T1 wrote X after T2 read
                            FAIL - ABORT T2

EXAMPLE 7: MVCC (Snapshot Isolation)

Timeline:
         T1                           T2
t1      BEGIN (Snapshot at t1)
t2      Read(X) → 100
t3                                   BEGIN (Snapshot at t3)
t4                                   Write(X=200)
t5      Read(X) → 100 (still!)       
t6                                   COMMIT
t7      Read(X) → 100 (consistent)
t8      COMMIT

Versions of X:
X@t0 = 100 (initial)
X@t6 = 200 (T2's write)

T1 reads X@t1 = 100 (snapshot isolation)
T2 reads X@t3 = 100, writes X@t6 = 200

EXAMPLE 8: Recovery Using Log

Log:
<T1 start>
<T1, A, 1000, 950>
<T1, B, 2000, 2050>
<T1 commit>
<T2 start>
<T2, C, 500, 450>
<T2, D, 300, 450>
*** CRASH ***

Recovery:
1. Identify committed: T1 (has <T1 commit>)
2. Identify active: T2 (no commit/abort)

3. REDO T1:
   - A = 950
   - B = 2050

4. UNDO T2:
   - D = 300 (restore old value)
   - C = 500 (restore old value)

Final State:
- A = 950 (T1 committed)
- B = 2050 (T1 committed)
- C = 500 (T2 undone)
- D = 300 (T2 undone)

EXAMPLE 9: Cascading Rollback

T1          T2          T3
Write(A)
            Read(A)     // Dirty read from T1
                        Read(A)  // Dirty read from T1
Abort       // T1 aborts

Result:
- T1 aborts (application logic)
- T2 must abort (read uncommitted data from T1)
- T3 must abort (read uncommitted data from T1)

Cascading rollback!

Prevention: Use Strict 2PL
- Hold locks until commit
- No dirty reads possible

EXAMPLE 10: Isolation Levels

-- READ UNCOMMITTED: Allows dirty reads
Session 1:
BEGIN TRANSACTION;
UPDATE Accounts SET Balance = 500 WHERE ID = 1;
-- Not committed yet

Session 2:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT Balance FROM Accounts WHERE ID = 1;
-- Returns: 500 (DIRTY READ!)

Session 1:
ROLLBACK; -- Oops, T2 saw uncommitted data!

-- READ COMMITTED: Prevents dirty reads
Session 2:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
SELECT Balance FROM Accounts WHERE ID = 1;
-- Waits for Session 1 to commit/rollback
-- Or reads last committed value

-- REPEATABLE READ: Prevents dirty & non-repeatable reads
Session 1:
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
SELECT Balance FROM Accounts WHERE ID = 1; -- 1000
-- Another session updates to 500 and commits
SELECT Balance FROM Accounts WHERE ID = 1; -- Still 1000!
COMMIT;

-- SERIALIZABLE: Prevents all anomalies
Session 1:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
SELECT COUNT(*) FROM Accounts; -- 10
-- Another session inserts new account and commits
SELECT COUNT(*) FROM Accounts; -- Still 10! (No phantom)
COMMIT;

Real-World Use

PRACTICAL GUIDELINES

1. Choosing Concurrency Control:

Two-Phase Locking (2PL):

  • Use when: Traditional OLTP systems
  • Pros: Well-understood, widely implemented
  • Cons: Deadlocks possible, lower concurrency
  • Examples: Banking, e-commerce orders

Timestamp Ordering:

  • Use when: Read-heavy workloads
  • Pros: Deadlock-free, good for distributed systems
  • Cons: More rollbacks
  • Examples: Analytics, reporting

Optimistic:

  • Use when: Conflicts rare
  • Pros: High concurrency, no locks
  • Cons: Validation overhead, may abort
  • Examples: Document editing, wikis

MVCC:

  • Use when: High read/write concurrency needed
  • Pros: Readers don't block writers
  • Cons: Storage overhead (multiple versions)
  • Examples: PostgreSQL (default), Oracle

2. Transaction Design Best Practices:

Keep Transactions Short:

-- BAD: Long transaction
BEGIN TRANSACTION;
SELECT ... -- Heavy query
-- User interaction here! (NEVER!)
INSERT ...
COMMIT;

-- GOOD: Short transaction
SELECT ... -- Outside transaction
-- Process data
BEGIN TRANSACTION;
INSERT ...
COMMIT;

Acquire Locks in Consistent Order:

-- BAD: Deadlock possible
T1: Lock A, then Lock B
T2: Lock B, then Lock A

-- GOOD: Same order
T1: Lock A, then Lock B
T2: Lock A, then Lock B

Handle Deadlocks:

max_retries = 3
for attempt in range(max_retries):
    try:
        # Begin transaction
        conn.execute("BEGIN")
        # ... operations ...
        conn.execute("COMMIT")
        break  # Success
    except DeadlockException:
        conn.execute("ROLLBACK")
        if attempt == max_retries - 1:
            raise  # Give up
        time.sleep(random.uniform(0.1, 0.5))  # Random backoff

3. Choosing Isolation Level:

LevelDirty ReadNon-RepeatablePhantomUse Case
READ UNCOMMITTEDYesYesYesNever use
READ COMMITTEDNoYesYesDefault for most apps
REPEATABLE READNoNoYesFinancial reports
SERIALIZABLENoNoNoCritical operations

MySQL/InnoDB:

-- Default: REPEATABLE READ
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN;
-- ... operations ...
COMMIT;

PostgreSQL:

-- Default: READ COMMITTED
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN;
-- ... operations ...
COMMIT;

4. Monitoring Transactions:

PostgreSQL:

-- Active transactions
SELECT * FROM pg_stat_activity 
WHERE state = 'active';

-- Long-running transactions
SELECT pid, now() - query_start AS duration, query
FROM pg_stat_activity
WHERE state = 'active'
ORDER BY duration DESC;

-- Blocking queries
SELECT blocked_locks.pid, blocked_activity.query,
       blocking_locks.pid AS blocking_pid, blocking_activity.query AS blocking_query
FROM pg_locks blocked_locks
JOIN pg_stat_activity blocked_activity ON blocked_locks.pid = blocked_activity.pid
JOIN pg_locks blocking_locks ON blocked_locks.locktype = blocking_locks.locktype
JOIN pg_stat_activity blocking_activity ON blocking_locks.pid = blocking_activity.pid
WHERE NOT blocked_locks.granted;

MySQL:

-- Active transactions
SHOW PROCESSLIST;

-- InnoDB status (includes locks, deadlocks)
SHOW ENGINE INNODB STATUS;

-- Lock waits
SELECT * FROM information_schema.INNODB_LOCKS;
SELECT * FROM information_schema.INNODB_LOCK_WAITS;

5. Deadlock Prevention:

Timeout-Based:

-- MySQL
SET innodb_lock_wait_timeout = 50; -- seconds

-- PostgreSQL
SET lock_timeout = '30s';
SET statement_timeout = '60s';

Lock Ordering:

  • Always acquire locks on tables in same order
  • Document locking policy
  • Code reviews to enforce

6. Recovery Configuration:

Write-Ahead Logging:

-- PostgreSQL
wal_level = replica
fsync = on
synchronous_commit = on
wal_sync_method = fsync

-- MySQL InnoDB
innodb_flush_log_at_trx_commit = 1
innodb_doublewrite = 1

Checkpoint Configuration:

-- PostgreSQL
checkpoint_timeout = 5min
max_wal_size = 1GB
checkpoint_completion_target = 0.9

-- MySQL
innodb_log_file_size = 256M
innodb_log_files_in_group = 2

7. Real-World Scenarios:

Banking (High Consistency):

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN;
-- Transfer logic
COMMIT;

E-commerce (Balance):

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN;
-- Order processing
COMMIT;

Analytics (Read Performance):

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; -- Report can tolerate
BEGIN;
-- Long-running analytics query
COMMIT;

Social Media (MVCC):

  • Use database with MVCC (PostgreSQL)
  • Readers never block writers
  • High concurrent read/write throughput

For exams

IMPORTANT EXAM QUESTIONS

Concepts:

  1. Define transaction and explain ACID properties with examples.

    • Transaction: Logical unit of work
    • Atomicity: All or nothing
    • Consistency: Maintains invariants
    • Isolation: Concurrent execution doesn't interfere
    • Durability: Committed changes persist
  2. Draw and explain transaction state diagram.
    States: Active → Partially Committed → Committed
    Active → Failed → Aborted

  3. What is serializability? Why is it important?

    • Concurrent schedule equivalent to serial schedule
    • Ensures correctness of concurrent execution
    • Maximizes concurrency while maintaining consistency
  4. Explain conflict serializability. How do you test it?

    • Conflicting operations ordered same as some serial schedule
    • Test: Build precedence graph, check for cycles
    • No cycle → conflict serializable
  5. What is view serializability? How does it differ from conflict serializability?

    • View: Same initial reads, updated reads, final writes
    • View serializability ⊇ Conflict serializability
    • Some view serializable schedules not conflict serializable

Two-Phase Locking:

  1. Explain Two-Phase Locking protocol. Why does it ensure serializability?

    • Growing phase: Acquire locks
    • Shrinking phase: Release locks
    • Prevents conflicts, ensures serializable execution
  2. What are the problems with basic 2PL? How are they solved?

    • Cascading rollback → Strict 2PL
    • Deadlock → Detection/prevention schemes
  3. Compare Strict 2PL and Rigorous 2PL.

    • Strict: Hold X-locks until commit
    • Rigorous: Hold ALL locks until commit

Deadlocks:

  1. Explain deadlock detection using wait-for graph.

    • Node per transaction
    • Edge Ti→Tj if Ti waits for Tj
    • Cycle = deadlock
  2. Compare Wait-Die and Wound-Wait schemes.

    • Wait-Die: Older waits, younger aborts
    • Wound-Wait: Older aborts younger, younger waits

Timestamp Ordering:

  1. Explain timestamp-based concurrency control.

    • Each transaction gets timestamp
    • Read/Write rules based on R-TS and W-TS
    • Ensures timestamp order serialization
  2. What is Thomas' Write Rule? What does it allow?

    • Ignore obsolete writes instead of aborting
    • Allows view serializability

Other Protocols:

  1. Explain optimistic concurrency control. When is it preferred?

    • Read, Validate, Write phases
    • Best when conflicts rare
    • High concurrency, no locks
  2. What is MVCC? How does it improve concurrency?

    • Multiple versions of data
    • Readers don't block writers
    • Used in PostgreSQL, Oracle

Recovery:

  1. Explain Write-Ahead Logging (WAL) and its importance.

    • Log written before data
    • Ensures recovery possible
    • Foundation of ARIES
  2. What is the difference between undo and redo?

    • Undo: Restore old values (uncommitted transactions)
    • Redo: Replay updates (committed transactions)
  3. Explain ARIES recovery algorithm.

    • Analysis: Identify transactions to undo/redo
    • Redo: Replay all updates
    • Undo: Reverse uncommitted updates
  4. What is checkpointing? Why is it important?

    • Periodic snapshot to stable storage
    • Reduces recovery time
    • Recovery starts from checkpoint

Schedules:

  1. Given a schedule, determine if it's conflict serializable using precedence graph.

    • Build graph with conflicts
    • Check for cycles
    • No cycle → serializable
  2. What are recoverable, cascadeless, and strict schedules?

    • Recoverable: Read order ensures recoverability
    • Cascadeless: No dirty reads
    • Strict: No read/write until commit

Practical:

  1. Compare SQL isolation levels. Which anomalies does each prevent?

    LevelDirtyNon-repeatablePhantom
    READ UNCOMMITTEDAllowAllowAllow
    READ COMMITTEDPreventAllowAllow
    REPEATABLE READPreventPreventAllow
    SERIALIZABLEPreventPreventPrevent
  2. How do you handle deadlocks in application code?

    • Retry with exponential backoff
    • Timeout configuration
    • Error handling and rollback

QUICK REVISION

ACID:

  • Atomicity: All or nothing
  • Consistency: Maintain invariants
  • Isolation: Concurrent → serial
  • Durability: Committed → permanent

Serializability:

  • Conflict: Build precedence graph, no cycle
  • View: Initial read, updated read, final write same

2PL:

  • Growing: Acquire locks
  • Shrinking: Release locks
  • Strict: Hold X-locks until commit

Deadlock:

  • Detection: Wait-for graph, cycle
  • Prevention: Wait-Die, Wound-Wait
  • Recovery: Abort victim, retry

Timestamp:

  • TS order = serialization order
  • Check R-TS, W-TS before read/write
  • Thomas Write Rule: Ignore obsolete writes

Recovery:

  • WAL: Log before data
  • Undo: Uncommitted transactions
  • Redo: Committed transactions
  • ARIES: Analysis, Redo, Undo

Isolation Levels:

  • READ UNCOMMITTED < READ COMMITTED < REPEATABLE READ < SERIALIZABLE
  • Higher level = More consistency, Less concurrency

Key points

KEY TAKEAWAYS

ACID properties guarantee reliable transactions: Atomicity, Consistency, Isolation, Durability

Serializability ensures correctness: Concurrent execution equivalent to some serial execution

Conflict serializability tested with precedence graph: No cycle → serializable

View serializability is broader: Includes some non-conflict serializable schedules

Two-Phase Locking (2PL) ensures serializability: Growing phase, shrinking phase

Strict 2PL prevents cascading rollbacks: Hold X-locks until commit

Deadlocks require handling: Detection (wait-for graph) or prevention (Wait-Die, Wound-Wait)

Timestamp ordering is deadlock-free: But may cause more rollbacks

Optimistic concurrency good for rare conflicts: Read, validate, write phases

MVCC enables high concurrency: Readers don't block writers, multiple versions

Write-Ahead Logging (WAL) enables recovery: Log before data modification

Recovery involves undo and redo: Undo uncommitted, redo committed

ARIES is industry-standard recovery: Analysis, Redo, Undo phases

Isolation levels trade consistency for performance: SERIALIZABLE strongest, READ UNCOMMITTED weakest

Transaction design matters: Keep short, acquire locks in order, handle deadlocks

Choose protocol based on workload: 2PL for OLTP, MVCC for high concurrency, Optimistic for rare conflicts

REMEMBER: Transactions are the foundation of reliable database systems. Understanding concurrency control and recovery is essential for building robust applications that maintain data integrity even under concurrent access and system failures!