Transactions, ACID Properties, Concurrency Control, and Recovery

Unit 4•CLO05

Imagine transferring ₹500 from savings to checking. The bank must: (1) deduct ₹500 from savings, (2) add ₹500 to checking. What if the system crashes between steps? You'd lose ₹500! Transactions ensure either both steps happen or neither does—no in-between states. Understanding transactions, concurrency control, and recovery is essential for building reliable database applications.

The Basics

Transaction Concept

A transaction is a single logical unit of work that accesses and possibly modifies database contents. It consists of a sequence of read and write operations followed by commit or abort.

Properties:

Atomicity: All or nothing execution
Consistency: Database constraints maintained
Isolation: Concurrent transactions don't interfere
Durability: Committed changes persist

ACID Properties

1. Atomicity (All-or-Nothing)

Transaction treated as single unit
Either all operations execute or none
Rollback undoes partial execution
Enforced by transaction management subsystem

2. Consistency (Correctness)

Database moves from one consistent state to another
Integrity constraints satisfied before and after
Application programmer's responsibility to write correct transactions
DBMS enforces constraints

3. Isolation (Concurrent Execution)

Concurrent transactions isolated from each other
Intermediate state of one transaction invisible to others
Result same as serial execution
Enforced by concurrency control subsystem

4. Durability (Persistence)

Once committed, changes permanent
Survive system failures
Updates written to non-volatile storage
Enforced by recovery subsystem

Transaction States

Transaction State Diagram

Active: Initial state, executing operations
Partially Committed: After final operation, before commit
Committed: Successfully completed, changes permanent
Failed: Cannot proceed normally (error, abort, deadlock)
Aborted: Rolled back, database restored to pre-transaction state
- After abort: Either killed or restarted

Schedules

A schedule defines execution order of operations from concurrent transactions.

Types:

Serial Schedule: Transactions execute one after another

T1 completes before T2 starts
Always consistent
Poor concurrency (low throughput)

Concurrent Schedule: Operations interleaved

Better CPU and disk utilization
Improved throughput and response time
May cause inconsistencies without proper control

Goal: Maximize concurrency while ensuring correctness

Serializability

A concurrent schedule is serializable if equivalent to some serial schedule.

Why Important?

Serializable schedules guarantee consistency
Allow concurrent execution without anomalies
Easier to reason about correctness

Types:

Conflict Serializability
View Serializability

Technical Details

CONFLICT SERIALIZABILITY

Conflicting Operations:

Two operations conflict if:

Belong to different transactions
Access same data item
At least one is a write

Conflict Types:

Read-Write (RW): T1 reads X, T2 writes X
Write-Read (WR): T1 writes X, T2 reads X (dirty read)
Write-Write (WW): T1 writes X, T2 writes X (lost update)

Conflict Equivalence:

Two schedules are conflict equivalent if:

Contain same operations
Every pair of conflicting operations ordered the same way

Testing Conflict Serializability: Precedence Graph

Algorithm:

Create node for each transaction
For each conflict Ti → Tj (Ti's operation before Tj's):
- Add directed edge Ti → Tj
Schedule is conflict serializable ⟺ Graph is acyclic

Example 1: Serializable Schedule

Schedule S:
T1          T2
Read(A)
            Read(A)
Write(A)
            Write(A)
Read(B)
            Read(B)
Write(B)
            Write(B)

Precedence Graph:

T1 Write(A) before T2 Read(A) → T1 → T2
T1 Write(A) before T2 Write(A) → T1 → T2
T1 Write(B) before T2 Read(B) → T1 → T2
T1 Write(B) before T2 Write(B) → T1 → T2

Graph: T1 → T2 (no cycle)

Result: Conflict serializable (equivalent to T1, T2)

Example 2: Non-Serializable Schedule

Schedule S:
T1          T2
Read(A)
            Write(A)
Write(A)
Read(B)
            Read(B)
            Write(B)
Write(B)

Precedence Graph:

T1 Read(A) before T2 Write(A) → T1 → T2
T2 Write(A) before T1 Write(A) → T2 → T1

Graph: T1 ⟷ T2 (cycle!)

Result: NOT conflict serializable

VIEW SERIALIZABILITY

View Equivalence:

Two schedules S1 and S2 are view equivalent if:

Initial Read: If Ti reads initial value of X in S1, Ti reads initial value of X in S2
Updated Read: If Ti reads X written by Tj in S1, Ti reads X written by Tj in S2
Final Write: If Ti writes final value of X in S1, Ti writes final value of X in S2

View Serializability:

Schedule S is view serializable if view equivalent to some serial schedule.

Relationship:

Conflict Serializability ⊆ View Serializability
Every conflict serializable schedule is view serializable
Some view serializable schedules are NOT conflict serializable

Example: View Serializable but NOT Conflict Serializable

Schedule S:
T1          T2          T3
Read(A)
            Write(A)
Write(A)
                        Write(A)

Check View Equivalence with Serial T1→T2→T3:

Initial Read: T1 reads A initially in both ✓
Updated Read: No updated reads in S ✓
Final Write: T3 writes final A in both ✓

Result: View serializable

Check Conflict Serializability:

T1 Read(A) before T2 Write(A) → T1 → T2
T2 Write(A) before T1 Write(A) → T2 → T1 (cycle!)

Result: NOT conflict serializable

CONCURRENCY CONTROL PROTOCOLS

1. LOCK-BASED PROTOCOLS

Lock Types:

Shared Lock (S-lock): Read lock

Multiple transactions can hold S-lock simultaneously
Allows reading, prevents writing

Exclusive Lock (X-lock): Write lock

Only one transaction can hold X-lock
Allows reading and writing
Prevents all other locks

Lock Compatibility Matrix:

	S	X
S	✓	✗
X	✗	✗

Locking Protocol:

Read(X):
  Lock-S(X)
  Read X
  Unlock(X)

Write(X):
  Lock-X(X)
  Write X
  Unlock(X)

Problems with Simple Locking:

Non-serializable schedules possible
Need more restrictive protocol

TWO-PHASE LOCKING (2PL)

Rules:

Phase 1 (Growing Phase):

Acquire locks
Cannot release any locks

Phase 2 (Shrinking Phase):

Release locks
Cannot acquire any locks

Theorem: If all transactions follow 2PL, resulting schedule is conflict serializable.

Example:

T1:
Lock-X(A)     // Growing phase starts
Read(A)
A := A - 50
Write(A)
Lock-X(B)     // Still growing
Read(B)
B := B + 50
Write(B)
Unlock(A)     // Shrinking phase starts
Unlock(B)     // Still shrinking

Variants:

1. Strict 2PL (S2PL)

Hold all exclusive locks until commit/abort
Prevents cascading rollbacks
Most commonly used

2. Rigorous 2PL

Hold ALL locks (shared and exclusive) until commit/abort
Simplifies recovery
Serialization order = commit order

Problems with 2PL:

1. Cascading Rollback

T1          T2
Lock-X(A)
Read(A)
A := A - 50
Write(A)
Unlock(A)   // Released early!
            Lock-X(A)
            Read(A)  // Dirty read
            A := A * 2
            Write(A)
Abort       // Must rollback T2 too!

Solution: Strict 2PL

2. Deadlock

T1          T2
Lock-X(A)
            Lock-X(B)
Wait for B
            Wait for A  // Deadlock!

DEADLOCK HANDLING

1. Deadlock Prevention

Wait-Die Scheme (non-preemptive):

Older transaction waits for younger
Younger aborted if requests lock held by older
Timestamp determines age

Wound-Wait Scheme (preemptive):

Older transaction "wounds" (aborts) younger
Younger waits for older
Less rollbacks than Wait-Die

2. Deadlock Detection

Wait-For Graph:

Node for each transaction
Edge Ti → Tj if Ti waiting for lock held by Tj
Deadlock ⟺ Cycle in graph

Detection Algorithm:

Periodically check for cycles
If cycle found, abort victim transaction
Victim selection: Minimize cost (age, work done, locks held)

3. Deadlock Recovery

Victim Selection Factors:

Transaction age (prefer younger)
Progress made (prefer less work done)
Number of locks held
Number of transactions to cascade rollback

2. TIMESTAMP-BASED PROTOCOLS

Idea: Use timestamps to order transactions without locks

Timestamp Assignment:

TS(Ti): Timestamp when Ti enters system
TS(Ti) < TS(Tj) means Ti entered before Tj
Serialization order = timestamp order

Timestamp Ordering (TO) Protocol:

For each data item X, maintain:

R-timestamp(X): Largest timestamp of transaction that read X
W-timestamp(X): Largest timestamp of transaction that wrote X

Rules:

Read(X) by Ti:

if TS(Ti) < W-timestamp(X):
    Reject and rollback Ti  // Trying to read future write
else:
    Execute read
    R-timestamp(X) = max(R-timestamp(X), TS(Ti))

Write(X) by Ti:

if TS(Ti) < R-timestamp(X):
    Reject and rollback Ti  // Trying to overwrite future read
else if TS(Ti) < W-timestamp(X):
    Reject and rollback Ti  // Trying to overwrite future write
else:
    Execute write
    W-timestamp(X) = TS(Ti)

Properties:

Conflict serializable (follows timestamp order)
Deadlock-free (no waiting)
May cause more rollbacks than 2PL

Thomas' Write Rule (Optimization):

Modified Write(X):

if TS(Ti) < R-timestamp(X):
    Reject and rollback Ti
else if TS(Ti) < W-timestamp(X):
    Ignore write (obsolete)  // Don't rollback!
else:
    Execute write
    W-timestamp(X) = TS(Ti)

Allows view serializable (not just conflict serializable)

3. OPTIMISTIC CONCURRENCY CONTROL (VALIDATION-BASED)

Assumption: Conflicts are rare

Three Phases:

1. Read Phase:

Read from database
Write to local copies (not database)
No locks acquired

2. Validation Phase:

Check if commit would cause serializability violation
Test if operations conflict with other transactions

3. Write Phase:

If validation succeeds: Write local copies to database
If validation fails: Abort and restart

Validation Test:

For transaction Ti validating:

Check against each Tj that committed during Ti's execution:

Tj completed before Ti started (OK)
Ti's read phase after Tj's write phase (OK)
Ti doesn't read items written by Tj (OK)

Advantages:

No locks (high concurrency when conflicts rare)
No deadlocks
Read-only transactions never abort

Disadvantages:

May abort due to conflicts
Validation overhead
Best when conflicts actually rare

MULTI-VERSION CONCURRENCY CONTROL (MVCC)

Idea: Keep multiple versions of data items

Benefits:

Readers don't block writers
Writers don't block readers
Improved concurrency

Implementation:

Each write creates new version:

X1, X2, X3, ... (versions of X)
Each version has timestamp

Transaction reads appropriate version:

Read-timestamp determines which version

Example (PostgreSQL, Oracle):

Time    T1                  T2
1       BEGIN
2       Read(X) → X1
3                           BEGIN
4                           Write(X) → X2
5       Read(X) → X1        // Still reads old version!
6                           COMMIT
7       Read(X) → X1        // Consistent read
8       COMMIT

Snapshot Isolation:

Transaction sees snapshot of committed data
Reads always succeed (no blocking)
Writes checked for conflicts at commit

MVCC in Practice:

PostgreSQL: MVCC by default
MySQL InnoDB: MVCC for SELECT
Oracle: Read Consistency

RECOVERY SYSTEM

Failure Types:

1. Transaction Failure:

Logical errors (constraint violation, bad input)
System errors (deadlock)
Recovery: Abort transaction

2. System Crash:

Power failure, software bug
Main memory lost, disk intact
Recovery: Use logs to restore consistency

3. Disk Failure:

Head crash, disk corruption
Recovery: Use backup + log

LOG-BASED RECOVERY

Write-Ahead Logging (WAL):

Rule: Log record must be written to stable storage before database modified

Log Records:

<Ti start>               // Transaction begins
<Ti, X, V_old, V_new>   // Update record
<Ti commit>              // Transaction commits
<Ti abort>               // Transaction aborts

Example:

<T1 start>
<T1, A, 1000, 950>  // A: 1000 → 950
<T1, B, 2000, 2050> // B: 2000 → 2050
<T1 commit>

RECOVERY ALGORITHMS

1. DEFERRED DATABASE MODIFICATION

Don't write to database until commit
Log records written during transaction
At commit: Write changes to database

Recovery:

Redo: For committed transactions, replay updates
No Undo: Uncommitted transactions didn't modify database

2. IMMEDIATE DATABASE MODIFICATION

Write to database during transaction execution
Log records written before database updates (WAL)

Recovery:

Undo: For uncommitted transactions, restore old values
Redo: For committed transactions, replay updates

ARIES RECOVERY ALGORITHM

(Analysis, Redo, Undo)

Used by DB2, SQL Server, PostgreSQL (variants)

Three Phases:

1. Analysis Phase:

Scan log from last checkpoint
Identify active transactions at crash
Determine which pages dirty

2. Redo Phase:

Scan log forward from checkpoint
Redo all updates (even uncommitted)
Restores database to state at crash

3. Undo Phase:

Scan log backward
Undo updates of transactions active at crash
Restore database to consistent state

Checkpointing:

Periodic checkpoint reduces recovery time:

Write all dirty pages to disk
Write checkpoint record to log
Recovery starts from last checkpoint

RECOVERABILITY

Recoverable Schedule:

If Ti reads item written by Tj, Tj must commit before Ti commits.

Example:

T1          T2
Write(A)
            Read(A)  // Dirty read
Commit
            Commit   // OK (T1 committed first)

Cascadeless Schedule:

If Ti reads item written by Tj, Tj must commit before Ti reads.

Example:

T1          T2
Write(A)
Commit
            Read(A)  // Clean read
            Commit

Strict Schedule:

No transaction reads or writes item until last transaction that wrote it commits/aborts.

Relationship:

Strict ⊂ Cascadeless ⊂ Recoverable

Best: Use Strict 2PL for strict schedules

Examples

COMPREHENSIVE EXAMPLES

EXAMPLE 1: ACID Properties in Banking

-- Transfer $100 from Account A to Account B
BEGIN TRANSACTION;

-- Read balances
DECLARE @balanceA DECIMAL(10,2);
DECLARE @balanceB DECIMAL(10,2);

SELECT @balanceA = Balance FROM Accounts WHERE AccountID = 'A';
SELECT @balanceB = Balance FROM Accounts WHERE AccountID = 'B';

-- Check sufficient funds (Consistency)
IF @balanceA >= 100
BEGIN
    -- Deduct from A (Atomicity)
    UPDATE Accounts SET Balance = Balance - 100 WHERE AccountID = 'A';
    
    -- Add to B (Atomicity)
    UPDATE Accounts SET Balance = Balance + 100 WHERE AccountID = 'B';
    
    -- Both succeed or both fail
    COMMIT; -- (Durability)
END
ELSE
BEGIN
    -- Insufficient funds
    ROLLBACK;
    PRINT 'Insufficient funds';
END

-- (Isolation): Other transactions don't see intermediate state

EXAMPLE 2: Conflict Serializability Analysis

Schedule S1:

Time    T1              T2
1       Read(X)
2                       Read(X)
3       X := X - 100
4       Write(X)
5                       X := X + 200
6                       Write(X)
7       Read(Y)
8                       Read(Y)
9       Y := Y + 100
10      Write(Y)
11                      Y := Y + 200
12                      Write(Y)

Conflicts:

T1 Write(X) [4] before T2 Write(X) [6] → T1 → T2
T1 Write(Y) [10] before T2 Write(Y) [12] → T1 → T2

Precedence Graph: T1 → T2 (no cycle)

Conclusion: Conflict serializable, equivalent to serial schedule T1→T2

Final Result:

If X=100, Y=100 initially
After T1→T2: X=200, Y=400

Schedule S2 (Non-Serializable):

Time    T1              T2
1       Read(X)
2                       Read(Y)
3       X := X - 100
4       Write(X)
5                       Y := Y - 100
6                       Write(Y)
7       Read(Y)
8                       Read(X)
9       Y := Y + 100
10      Write(Y)
11                      X := X + 100
12                      Write(X)

Conflicts:

T1 Write(X) [4] before T2 Read(X) [8] → T1 → T2
T2 Write(Y) [6] before T1 Read(Y) [7] → T2 → T1

Precedence Graph: T1 ⟷ T2 (cycle!)

Conclusion: NOT conflict serializable

EXAMPLE 3: Two-Phase Locking

Transaction T1 (Withdraw $100):

BEGIN TRANSACTION;

-- Growing Phase
Lock-S(Accounts)        -- Check existence
Lock-X(Account_A)       -- Will modify

IF Balance_A >= 100:
    Balance_A = Balance_A - 100
    UPDATE...
    
    -- Shrinking Phase (Strict 2PL: at commit)
    COMMIT
    Unlock-X(Account_A)
    Unlock-S(Accounts)
ELSE:
    ROLLBACK
    Unlock-X(Account_A)
    Unlock-S(Accounts)

EXAMPLE 4: Deadlock Detection

Time    T1                  T2                  T3
1       Lock-X(A)
2                           Lock-X(B)
3                                               Lock-X(C)
4       Request Lock-X(B)   
5       WAIT                Request Lock-X(C)
6                           WAIT                Request Lock-X(A)
7                                               WAIT

Wait-For Graph:
T1 → T2 → T3 → T1 (CYCLE - DEADLOCK!)

Victim Selection: Abort T3 (youngest)

8                                               ABORT
9                           Lock-X(C) granted
10                          COMMIT
11      Lock-X(B) granted
12      COMMIT

EXAMPLE 5: Timestamp Ordering

Transactions:

T1: TS = 10
T2: TS = 20
T3: TS = 30

Data Item X:

Initial R-TS(X) = 0, W-TS(X) = 0

Time    Operation           R-TS(X)     W-TS(X)     Result
1       T1: Read(X)         10          0           Success
2       T2: Write(X)        10          20          Success
3       T1: Write(X)        10          20          ABORT
        // TS(T1)=10 < W-TS(X)=20
        // T1 trying to overwrite T2's future write
        
4       T3: Read(X)         30          20          Success
5       T2: Write(X)        30          20          ABORT
        // TS(T2)=20 < R-TS(X)=30
        // T2 trying to overwrite T3's future read

EXAMPLE 6: Optimistic Concurrency Control

T1:                         T2:
Read Phase:                 Read Phase:
Read(X) [local]             
X = X + 100
                            Read(X) [local]
                            X = X * 2

Validation Phase:           
Check conflicts             
No conflicts found          
PASS                        

Write Phase:                
Write(X) to DB              

                            Validation Phase:
                            Check conflicts
                            T1 wrote X after T2 read
                            FAIL - ABORT T2

EXAMPLE 7: MVCC (Snapshot Isolation)

Timeline:
         T1                           T2
t1      BEGIN (Snapshot at t1)
t2      Read(X) → 100
t3                                   BEGIN (Snapshot at t3)
t4                                   Write(X=200)
t5      Read(X) → 100 (still!)       
t6                                   COMMIT
t7      Read(X) → 100 (consistent)
t8      COMMIT

Versions of X:
X@t0 = 100 (initial)
X@t6 = 200 (T2's write)

T1 reads X@t1 = 100 (snapshot isolation)
T2 reads X@t3 = 100, writes X@t6 = 200

EXAMPLE 8: Recovery Using Log

Log:
<T1 start>
<T1, A, 1000, 950>
<T1, B, 2000, 2050>
<T1 commit>
<T2 start>
<T2, C, 500, 450>
<T2, D, 300, 450>
*** CRASH ***

Recovery:
1. Identify committed: T1 (has <T1 commit>)
2. Identify active: T2 (no commit/abort)

3. REDO T1:
   - A = 950
   - B = 2050

4. UNDO T2:
   - D = 300 (restore old value)
   - C = 500 (restore old value)

Final State:
- A = 950 (T1 committed)
- B = 2050 (T1 committed)
- C = 500 (T2 undone)
- D = 300 (T2 undone)

EXAMPLE 9: Cascading Rollback

T1          T2          T3
Write(A)
            Read(A)     // Dirty read from T1
                        Read(A)  // Dirty read from T1
Abort       // T1 aborts

Result:
- T1 aborts (application logic)
- T2 must abort (read uncommitted data from T1)
- T3 must abort (read uncommitted data from T1)

Cascading rollback!

Prevention: Use Strict 2PL
- Hold locks until commit
- No dirty reads possible

EXAMPLE 10: Isolation Levels

-- READ UNCOMMITTED: Allows dirty reads
Session 1:
BEGIN TRANSACTION;
UPDATE Accounts SET Balance = 500 WHERE ID = 1;
-- Not committed yet

Session 2:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT Balance FROM Accounts WHERE ID = 1;
-- Returns: 500 (DIRTY READ!)

Session 1:
ROLLBACK; -- Oops, T2 saw uncommitted data!

-- READ COMMITTED: Prevents dirty reads
Session 2:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
SELECT Balance FROM Accounts WHERE ID = 1;
-- Waits for Session 1 to commit/rollback
-- Or reads last committed value

-- REPEATABLE READ: Prevents dirty & non-repeatable reads
Session 1:
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
SELECT Balance FROM Accounts WHERE ID = 1; -- 1000
-- Another session updates to 500 and commits
SELECT Balance FROM Accounts WHERE ID = 1; -- Still 1000!
COMMIT;

-- SERIALIZABLE: Prevents all anomalies
Session 1:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
SELECT COUNT(*) FROM Accounts; -- 10
-- Another session inserts new account and commits
SELECT COUNT(*) FROM Accounts; -- Still 10! (No phantom)
COMMIT;

Real-World Use

PRACTICAL GUIDELINES

1. Choosing Concurrency Control:

Two-Phase Locking (2PL):

Use when: Traditional OLTP systems
Pros: Well-understood, widely implemented
Cons: Deadlocks possible, lower concurrency
Examples: Banking, e-commerce orders

Timestamp Ordering:

Use when: Read-heavy workloads
Pros: Deadlock-free, good for distributed systems
Cons: More rollbacks
Examples: Analytics, reporting

Optimistic:

Use when: Conflicts rare
Pros: High concurrency, no locks
Cons: Validation overhead, may abort
Examples: Document editing, wikis

MVCC:

Use when: High read/write concurrency needed
Pros: Readers don't block writers
Cons: Storage overhead (multiple versions)
Examples: PostgreSQL (default), Oracle

2. Transaction Design Best Practices:

Keep Transactions Short:

-- BAD: Long transaction
BEGIN TRANSACTION;
SELECT ... -- Heavy query
-- User interaction here! (NEVER!)
INSERT ...
COMMIT;

-- GOOD: Short transaction
SELECT ... -- Outside transaction
-- Process data
BEGIN TRANSACTION;
INSERT ...
COMMIT;

Acquire Locks in Consistent Order:

-- BAD: Deadlock possible
T1: Lock A, then Lock B
T2: Lock B, then Lock A

-- GOOD: Same order
T1: Lock A, then Lock B
T2: Lock A, then Lock B

Handle Deadlocks:

max_retries = 3
for attempt in range(max_retries):
    try:
        # Begin transaction
        conn.execute("BEGIN")
        # ... operations ...
        conn.execute("COMMIT")
        break  # Success
    except DeadlockException:
        conn.execute("ROLLBACK")
        if attempt == max_retries - 1:
            raise  # Give up
        time.sleep(random.uniform(0.1, 0.5))  # Random backoff

3. Choosing Isolation Level:

Level	Dirty Read	Non-Repeatable	Phantom	Use Case
READ UNCOMMITTED	Yes	Yes	Yes	Never use
READ COMMITTED	No	Yes	Yes	Default for most apps
REPEATABLE READ	No	No	Yes	Financial reports
SERIALIZABLE	No	No	No	Critical operations

MySQL/InnoDB:

-- Default: REPEATABLE READ
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN;
-- ... operations ...
COMMIT;

PostgreSQL:

-- Default: READ COMMITTED
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN;
-- ... operations ...
COMMIT;

4. Monitoring Transactions:

PostgreSQL:

-- Active transactions
SELECT * FROM pg_stat_activity 
WHERE state = 'active';

-- Long-running transactions
SELECT pid, now() - query_start AS duration, query
FROM pg_stat_activity
WHERE state = 'active'
ORDER BY duration DESC;

-- Blocking queries
SELECT blocked_locks.pid, blocked_activity.query,
       blocking_locks.pid AS blocking_pid, blocking_activity.query AS blocking_query
FROM pg_locks blocked_locks
JOIN pg_stat_activity blocked_activity ON blocked_locks.pid = blocked_activity.pid
JOIN pg_locks blocking_locks ON blocked_locks.locktype = blocking_locks.locktype
JOIN pg_stat_activity blocking_activity ON blocking_locks.pid = blocking_activity.pid
WHERE NOT blocked_locks.granted;

MySQL:

-- Active transactions
SHOW PROCESSLIST;

-- InnoDB status (includes locks, deadlocks)
SHOW ENGINE INNODB STATUS;

-- Lock waits
SELECT * FROM information_schema.INNODB_LOCKS;
SELECT * FROM information_schema.INNODB_LOCK_WAITS;

5. Deadlock Prevention:

Timeout-Based:

-- MySQL
SET innodb_lock_wait_timeout = 50; -- seconds

-- PostgreSQL
SET lock_timeout = '30s';
SET statement_timeout = '60s';

Lock Ordering:

Always acquire locks on tables in same order
Document locking policy
Code reviews to enforce

6. Recovery Configuration:

Write-Ahead Logging:

-- PostgreSQL
wal_level = replica
fsync = on
synchronous_commit = on
wal_sync_method = fsync

-- MySQL InnoDB
innodb_flush_log_at_trx_commit = 1
innodb_doublewrite = 1

Checkpoint Configuration:

-- PostgreSQL
checkpoint_timeout = 5min
max_wal_size = 1GB
checkpoint_completion_target = 0.9

-- MySQL
innodb_log_file_size = 256M
innodb_log_files_in_group = 2

7. Real-World Scenarios:

Banking (High Consistency):

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN;
-- Transfer logic
COMMIT;

E-commerce (Balance):

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN;
-- Order processing
COMMIT;

Analytics (Read Performance):

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; -- Report can tolerate
BEGIN;
-- Long-running analytics query
COMMIT;

Social Media (MVCC):

Use database with MVCC (PostgreSQL)
Readers never block writers
High concurrent read/write throughput

For exams

IMPORTANT EXAM QUESTIONS

Concepts:

Define transaction and explain ACID properties with examples.
- Transaction: Logical unit of work
- Atomicity: All or nothing
- Consistency: Maintains invariants
- Isolation: Concurrent execution doesn't interfere
- Durability: Committed changes persist
Draw and explain transaction state diagram.
States: Active → Partially Committed → Committed
Active → Failed → Aborted
What is serializability? Why is it important?
- Concurrent schedule equivalent to serial schedule
- Ensures correctness of concurrent execution
- Maximizes concurrency while maintaining consistency
Explain conflict serializability. How do you test it?
- Conflicting operations ordered same as some serial schedule
- Test: Build precedence graph, check for cycles
- No cycle → conflict serializable
What is view serializability? How does it differ from conflict serializability?
- View: Same initial reads, updated reads, final writes
- View serializability ⊇ Conflict serializability
- Some view serializable schedules not conflict serializable

Two-Phase Locking:

Explain Two-Phase Locking protocol. Why does it ensure serializability?
- Growing phase: Acquire locks
- Shrinking phase: Release locks
- Prevents conflicts, ensures serializable execution
What are the problems with basic 2PL? How are they solved?
- Cascading rollback → Strict 2PL
- Deadlock → Detection/prevention schemes
Compare Strict 2PL and Rigorous 2PL.
- Strict: Hold X-locks until commit
- Rigorous: Hold ALL locks until commit

Deadlocks:

Explain deadlock detection using wait-for graph.
- Node per transaction
- Edge Ti→Tj if Ti waits for Tj
- Cycle = deadlock
Compare Wait-Die and Wound-Wait schemes.
- Wait-Die: Older waits, younger aborts
- Wound-Wait: Older aborts younger, younger waits

Timestamp Ordering:

Explain timestamp-based concurrency control.
- Each transaction gets timestamp
- Read/Write rules based on R-TS and W-TS
- Ensures timestamp order serialization
What is Thomas' Write Rule? What does it allow?
- Ignore obsolete writes instead of aborting
- Allows view serializability

Other Protocols:

Explain optimistic concurrency control. When is it preferred?
- Read, Validate, Write phases
- Best when conflicts rare
- High concurrency, no locks
What is MVCC? How does it improve concurrency?
- Multiple versions of data
- Readers don't block writers
- Used in PostgreSQL, Oracle

Recovery:

Explain Write-Ahead Logging (WAL) and its importance.
- Log written before data
- Ensures recovery possible
- Foundation of ARIES
What is the difference between undo and redo?
- Undo: Restore old values (uncommitted transactions)
- Redo: Replay updates (committed transactions)
Explain ARIES recovery algorithm.
- Analysis: Identify transactions to undo/redo
- Redo: Replay all updates
- Undo: Reverse uncommitted updates
What is checkpointing? Why is it important?
- Periodic snapshot to stable storage
- Reduces recovery time
- Recovery starts from checkpoint

Schedules:

Given a schedule, determine if it's conflict serializable using precedence graph.
- Build graph with conflicts
- Check for cycles
- No cycle → serializable
What are recoverable, cascadeless, and strict schedules?
- Recoverable: Read order ensures recoverability
- Cascadeless: No dirty reads
- Strict: No read/write until commit

Practical:

Compare SQL isolation levels. Which anomalies does each prevent?

Level Dirty Non-repeatable Phantom
READ UNCOMMITTED Allow Allow Allow
READ COMMITTED Prevent Allow Allow
REPEATABLE READ Prevent Prevent Allow
SERIALIZABLE Prevent Prevent Prevent
How do you handle deadlocks in application code?
- Retry with exponential backoff
- Timeout configuration
- Error handling and rollback

Level	Dirty	Non-repeatable	Phantom
READ UNCOMMITTED	Allow	Allow	Allow
READ COMMITTED	Prevent	Allow	Allow
REPEATABLE READ	Prevent	Prevent	Allow
SERIALIZABLE	Prevent	Prevent	Prevent

QUICK REVISION

ACID:

Atomicity: All or nothing
Consistency: Maintain invariants
Isolation: Concurrent → serial
Durability: Committed → permanent

Serializability:

Conflict: Build precedence graph, no cycle
View: Initial read, updated read, final write same

2PL:

Growing: Acquire locks
Shrinking: Release locks
Strict: Hold X-locks until commit

Deadlock:

Detection: Wait-for graph, cycle
Prevention: Wait-Die, Wound-Wait
Recovery: Abort victim, retry

Timestamp:

TS order = serialization order
Check R-TS, W-TS before read/write
Thomas Write Rule: Ignore obsolete writes

Recovery:

WAL: Log before data
Undo: Uncommitted transactions
Redo: Committed transactions
ARIES: Analysis, Redo, Undo

Isolation Levels:

READ UNCOMMITTED < READ COMMITTED < REPEATABLE READ < SERIALIZABLE
Higher level = More consistency, Less concurrency

Key points

KEY TAKEAWAYS

✓ ACID properties guarantee reliable transactions: Atomicity, Consistency, Isolation, Durability

✓ Serializability ensures correctness: Concurrent execution equivalent to some serial execution

✓ Conflict serializability tested with precedence graph: No cycle → serializable

✓ View serializability is broader: Includes some non-conflict serializable schedules

✓ Two-Phase Locking (2PL) ensures serializability: Growing phase, shrinking phase

✓ Strict 2PL prevents cascading rollbacks: Hold X-locks until commit

✓ Deadlocks require handling: Detection (wait-for graph) or prevention (Wait-Die, Wound-Wait)

✓ Timestamp ordering is deadlock-free: But may cause more rollbacks

✓ Optimistic concurrency good for rare conflicts: Read, validate, write phases

✓ MVCC enables high concurrency: Readers don't block writers, multiple versions

✓ Write-Ahead Logging (WAL) enables recovery: Log before data modification

✓ Recovery involves undo and redo: Undo uncommitted, redo committed

✓ ARIES is industry-standard recovery: Analysis, Redo, Undo phases

✓ Isolation levels trade consistency for performance: SERIALIZABLE strongest, READ UNCOMMITTED weakest

✓ Transaction design matters: Keep short, acquire locks in order, handle deadlocks

✓ Choose protocol based on workload: 2PL for OLTP, MVCC for high concurrency, Optimistic for rare conflicts

REMEMBER: Transactions are the foundation of reliable database systems. Understanding concurrency control and recovery is essential for building robust applications that maintain data integrity even under concurrent access and system failures!

←

Keys, Constraints, Functional Dependencies, and Normalization

Query Processing, Indexing (B-tree, Hashing), and Advanced Database Systems

→