Storage Strategies
The gap between memory speed and disk speed is vast—memory access takes nanoseconds, disk access takes milliseconds. That's a million-fold difference. Storage strategies bridge this gap through clever data structures, caching, and organization techniques that minimize disk I/O, the primary bottleneck in database performance.
The Basics
Storage strategies deal with how data is organized on disk and cached in memory so the DBMS can minimize disk I/O.
Large databases rely on disk storage and maintain:
- Data files (the database itself)
- Data dictionary (metadata / schema)
- Indices (fast access paths)
An index provides pointers to data items that hold a particular value. Good storage strategies combine buffering + file organization + indexes/hashing to avoid repeated full scans.
Technical Details
FILE ORGANIZATION:

- Heap File (Unordered):
  Records stored in insertion order
  Insert: O(1) - append at end
  Search: O(n) - scan entire file
  Delete: O(n) - find, then mark deleted
  Best for: Bulk loading, full table scans

- Sorted File:
  Records sorted by search key
  Insert: O(n) - shift records to maintain order
  Search: O(log n) - binary search
  Range queries: Efficient
  Best for: Static data with range queries

- Hash File:
  Records distributed across buckets by a hash function
  Search: O(1) average - compute hash, go to bucket
  Range queries: Inefficient (requires full scan)
  Best for: Exact-match lookups
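These trade-offs can be sketched in a few lines of Python, using toy in-memory stand-ins for the three organizations (a list for the heap file, a sorted list for the sorted file, a dict for the hash file):

```python
import bisect

# Toy stand-ins for the three file organizations (hypothetical data).
records = [42, 7, 99, 13, 58]                 # heap file: insertion order
sorted_file = sorted(records)                 # sorted file: kept in key order
hash_file = {k: f"rec{k}" for k in records}   # hash file: key -> record

# Heap file: search is a full scan, O(n)
def heap_search(key):
    for r in records:
        if r == key:
            return r
    return None

# Sorted file: binary search, O(log n)
def sorted_search(key):
    i = bisect.bisect_left(sorted_file, key)
    if i < len(sorted_file) and sorted_file[i] == key:
        return sorted_file[i]
    return None

# Hash file: O(1) average, but no efficient range scan
def hash_search(key):
    return hash_file.get(key)

print(heap_search(13), sorted_search(13), hash_search(13))  # 13 13 rec13
```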
INDEXING STRUCTURES:
- B-TREE:
Properties:
- Balanced tree (all leaves at same level)
- Order d: Each node has [d, 2d] keys (except root)
- Internal nodes: [d+1, 2d+1] children
- Keeps data sorted
Structure:
Node: [P1, K1, P2, K2, ..., Pn, Kn, Pn+1]
- Ki = keys (sorted)
- Pi = pointers to children or data
- All keys in the subtree of Pi are < Ki; all keys in the subtree of Pi+1 are ≥ Ki
Search Algorithm:
- Start at root
- Binary search keys in node
- Follow appropriate pointer
- Repeat until leaf
Insertion:
- Search to find leaf
- Insert key in sorted order
- If leaf overflows (> 2d keys):
- Split into two nodes
- Promote middle key to parent
- Recursively split parent if needed
Deletion:
- Search and remove key
- If underflow (< d keys):
- Borrow from sibling, OR
- Merge with sibling
- Recursively fix parent
Cost Analysis (n records, order d):
- Tree height: log_d(n)
- Search: O(log n) disk I/Os
- Insert/Delete: O(log n) disk I/Os
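The height bound is easy to check numerically. A small sketch, assuming fanout of up to 2d+1 children per node (illustrative numbers, not a real on-disk tree):

```python
import math

# Height estimate for a B-tree of order d: each node holds up to 2d keys,
# so fanout is up to 2d+1 children. Height ≈ log_fanout(n) disk I/Os per search.
def btree_height(n_records, d):
    fanout = 2 * d + 1
    return math.ceil(math.log(n_records, fanout))

# 1,000,000 records, d = 50 (fanout 101): about 3 levels -> ~3 I/Os per search
print(btree_height(1_000_000, 50))  # 3
```

This is why large orders (hundreds of keys per node, sized to fill a disk page) keep real trees only 3-4 levels deep even for millions of records.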
- B+ TREE (Most Common in Databases):
Differences from B-Tree:
- All data in leaf nodes
- Internal nodes only store keys + pointers
- Leaf nodes linked (sequential access)
Advantages:
- Higher fanout (more keys per internal node)
- Shorter tree (fewer I/Os)
- Efficient range scans (follow leaf pointers)
- Sequential access without tree traversal
Example B+ Tree (order d=2):

                  [30]
                /      \
         [10,20]        [40,50]
        /   |   \      /   |   \
   [5,7][12,15][25,28] [35,38][45,48][55,60]

Leaves linked left to right: [5,7] → [12,15] → ... → [55,60], allowing sequential scans
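A minimal sketch of the linked leaf level above, showing how a range scan walks the leaf chain instead of re-traversing the tree (the loop that finds the first leaf stands in for a root-to-leaf descent):

```python
# The leaf level of the B+ tree above; adjacency in the list stands in for
# the next-leaf pointers.
leaves = [[5, 7], [12, 15], [25, 28], [35, 38], [45, 48], [55, 60]]

def range_scan(lo, hi):
    result = []
    # Find the first leaf that could contain lo (stand-in for a tree descent).
    for i, leaf in enumerate(leaves):
        if leaf[-1] >= lo:
            start = i
            break
    else:
        return result
    # Follow the leaf chain, collecting keys until we pass hi.
    for leaf in leaves[start:]:
        for key in leaf:
            if key > hi:
                return result
            if key >= lo:
                result.append(key)
    return result

print(range_scan(12, 40))  # [12, 15, 25, 28, 35, 38]
```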
- HASH INDEX:
Static Hashing:
hash(key) % M = bucket number
Issue: Fixed M, poor with growing data
Dynamic Hashing (Extendible Hashing):
- Bucket directory grows/shrinks
- Only affected buckets split
- Efficient for insertions
Linear Hashing:
- No directory
- Gradual bucket splitting
- Handles overflows with chaining
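A toy static-hashing sketch illustrating the fixed-M problem (keys chosen to collide; `insert`/`lookup` are hypothetical helpers, not a real DBMS API):

```python
# Static hashing: M fixed buckets, overflow handled by chaining within a bucket.
M = 4
buckets = [[] for _ in range(M)]

def insert(key):
    buckets[hash(key) % M].append(key)     # hash(key) % M picks the bucket

def lookup(key):
    return key in buckets[hash(key) % M]   # scan only one bucket (and its chain)

for k in [3, 7, 11, 15, 19]:   # all map to bucket 3 when M = 4
    insert(k)

print(buckets[3])  # [3, 7, 11, 15, 19] -- the long chain dynamic hashing avoids
```

With fixed M, a growing table degrades into long overflow chains; extendible and linear hashing exist precisely to grow the bucket count incrementally instead.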
BUFFER MANAGEMENT:
Buffer Pool: Area of memory for caching disk pages
Replacement Policies:

- LRU (Least Recently Used):
  Replace the page unused for the longest time
  Good general-purpose policy

- MRU (Most Recently Used):
  Replace the most recently used page
  Good for repeated sequential scans

- Clock Algorithm:
  Approximates LRU with less overhead
  Each page has a reference bit

- LRU-K:
  Track the K most recent accesses per page
  Better for database workloads
Pin Count: Number of active users of a page
- Pinned pages cannot be replaced
- Must unpin after use
Dirty Bit: Indicates modified page
- Must write back before replacement
- WAL rule: Log before data
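These pieces fit together as a small in-memory sketch of an LRU buffer pool with pin counts and dirty bits (class and field names are hypothetical, not any real DBMS's API):

```python
from collections import OrderedDict

# Minimal LRU buffer pool sketch: pinned pages are never evicted,
# dirty pages are "written back" (recorded) before eviction.
class BufferPool:
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()   # page_id -> {"pins": int, "dirty": bool}
        self.flushed = []            # page_ids written back on eviction

    def fetch(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)       # mark as most recently used
        else:
            if len(self.pages) >= self.capacity:
                self._evict()
            self.pages[page_id] = {"pins": 0, "dirty": False}
        self.pages[page_id]["pins"] += 1          # caller pins the page
        return self.pages[page_id]

    def unpin(self, page_id, dirty=False):
        frame = self.pages[page_id]
        frame["pins"] -= 1
        frame["dirty"] |= dirty

    def _evict(self):
        # Evict the least recently used *unpinned* page; write back if dirty.
        for page_id, frame in self.pages.items():
            if frame["pins"] == 0:
                if frame["dirty"]:
                    self.flushed.append(page_id)  # stand-in for a disk write
                del self.pages[page_id]
                return
        raise RuntimeError("all pages pinned")

pool = BufferPool(capacity=2)
pool.fetch("A"); pool.unpin("A", dirty=True)
pool.fetch("B"); pool.unpin("B")
pool.fetch("C")   # pool full: evicts A (least recent, unpinned, dirty -> flushed)
print(pool.flushed, list(pool.pages))  # ['A'] ['B', 'C']
```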
CLUSTERING:
Store related records physically close
Types:
- Intra-file clustering: Related records in same file
- Inter-file clustering: Records from multiple tables together
Clustered Index:
- Data stored in index order
- Only one per table
- Improves range queries
- Slows insertions (maintain order)
Example: Customer table clustered by CustomerID
- CustomerID 1-1000 in block 1
- CustomerID 1001-2000 in block 2
- Range query efficient
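With a fixed number of records per block, the block address is simple arithmetic; a toy sketch of the layout above:

```python
# Clustered layout: CustomerIDs 1-1000 in block 1, 1001-2000 in block 2, etc.
def block_of(customer_id, per_block=1000):
    return (customer_id - 1) // per_block + 1

# A range query for IDs 500-1500 touches only two contiguous blocks
blocks = {block_of(cid) for cid in (500, 1000, 1001, 1500)}
print(sorted(blocks))  # [1, 2]
```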
Examples
Example 1: B+ Tree Operations
Initial B+ Tree (each leaf holds at most 5 keys):

                  [30]
                /      \
         [10,20]        [40,50]
        /   |   \      /   |   \
   [5,7,9][12,15,18][25,27,29] [35,38][45,48][55,60]
Insert 13:
- Find correct leaf: [12,15,18]
- Insert: [12,13,15,18]
- No overflow, done
Insert 14:
- Find leaf: [12,13,15,18]
- Insert: [12,13,14,15,18]
- No overflow, done
Insert 16:
- Find leaf: [12,13,14,15,18]
- Insert: [12,13,14,15,16,18]
- Overflow! Split at the midpoint:
  Left: [12,13,14]
  Right: [15,16,18]
  Copy 15 up to the parent (in a B+ tree the key also stays in the leaf): parent becomes [10,15,20]
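The split step can be written directly; note that the separator key is copied up while remaining in the right-hand leaf, which is the B+ tree convention:

```python
# Split an overflowing B+ tree leaf: left half, right half, and the key
# copied up to the parent (it stays in the right leaf).
def split_leaf(keys):
    mid = len(keys) // 2
    left, right = keys[:mid], keys[mid:]
    return left, right, right[0]

left, right, copied_up = split_leaf([12, 13, 14, 15, 16, 18])
print(left, right, copied_up)  # [12, 13, 14] [15, 16, 18] 15
```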
Example 2: Index Selection
Query: SELECT * FROM Orders WHERE OrderDate BETWEEN '2024-01-01' AND '2024-01-31';
Option A: No index
- Full table scan: 1M orders × 100 bytes = 100MB
- Assume 4KB pages (~40 records each): 25,000 page reads
Option B: B+ Tree index on OrderDate (data clustered by OrderDate)
- Assume 5% of orders in range = 50K orders
- Tree height: 3 levels
- Cost: 3 (tree traversal) + 50K/40 (data pages) = 1,253 page reads
- ~20x faster!
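These page-count estimates are a few lines of arithmetic (assuming 4 KB pages, 100-byte records, so about 40 records per page, and data clustered by OrderDate; all figures illustrative):

```python
# Page-read estimate: full scan vs. clustered B+ tree index lookup.
PAGE = 4096
RECORD = 100
records_per_page = PAGE // RECORD                 # 40

total_records = 1_000_000
scan_cost = total_records // records_per_page     # full scan: 25,000 pages

matching = 50_000                                 # 5% of rows fall in the range
index_cost = 3 + matching // records_per_page     # traversal + 1,250 data pages

print(scan_cost, index_cost, round(scan_cost / index_cost))  # 25000 1253 20
```

If the index were unclustered, each matching row could cost a separate page read, and the advantage would shrink or vanish for large ranges.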
Example 3: Hash vs B+ Tree
Exact Match Query:
SELECT * FROM Employee WHERE EmployeeID = 12345;
Hash Index:
- hash(12345) % 100 = 45
- Go directly to bucket 45
- Cost: 1-2 I/Os (bucket + maybe overflow)
B+ Tree Index:
- Traverse tree: log_d(n) = 3 levels
- Cost: 3 I/Os
Winner: Hash (for exact match)
Range Query:
SELECT * FROM Employee WHERE EmployeeID BETWEEN 10000 AND 20000;
Hash Index:
- No ordering information
- Must scan all buckets
- Cost: All buckets
B+ Tree Index:
- Find 10000, then sequential scan
- Cost: log_d(n) + result size
Winner: B+ Tree (for range)
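The contrast shows up even with in-memory stand-ins: a set for the hash index, a sorted list plus `bisect` for the B+ tree's leaf level:

```python
import bisect

# Toy EmployeeIDs; set = hash index, sorted list = B+ tree leaf level.
ids = [10500, 9000, 15000, 25000, 12000, 19999]
hash_idx = set(ids)
ordered = sorted(ids)

# Exact match: both answer it; the hash index needs no ordered traversal
assert 12000 in hash_idx

# Range 10000-20000: only the ordered structure answers without a full scan
lo = bisect.bisect_left(ordered, 10000)
hi = bisect.bisect_right(ordered, 20000)
print(ordered[lo:hi])  # [10500, 12000, 15000, 19999]
```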
SQL Index Creation:
CREATE INDEX idx_employee_dept ON Employee(DepartmentID);
CREATE UNIQUE INDEX idx_employee_email ON Employee(Email);
CREATE INDEX idx_order_date ON Orders(OrderDate);
-- Composite index
CREATE INDEX idx_employee_name ON Employee(LastName, FirstName);
-- Covering index (includes non-key columns)
CREATE INDEX idx_employee_salary ON Employee(DepartmentID) INCLUDE (Salary);
-- Partial index (PostgreSQL)
CREATE INDEX idx_active_orders ON Orders(OrderDate) WHERE Status='Active';
Real-World Use
Best Practices:

- Index Strategy:
  - Index foreign keys (join performance)
  - Index WHERE clause columns
  - Index ORDER BY columns
  - Don't over-index (slows writes)
  - Monitor index usage (drop unused indexes)

- When to Use Each Index Type:
  - B+ Tree: Default choice, handles all query types
  - Hash: Exact match on high-cardinality columns
  - Bitmap: Low-cardinality columns (gender, status)
  - Full-text: Text search
  - Spatial: Geographic data

- Clustering Decisions:
  - Cluster on the most frequent range-query column
  - Consider the write vs read trade-off
  - Only one clustered index per table

- Buffer Pool Tuning:
  - PostgreSQL: shared_buffers (commonly ~25% of RAM)
  - MySQL: innodb_buffer_pool_size (commonly 70-80% of RAM)
  - Monitor cache hit ratio (aim for >99%)

- Storage Hardware Considerations:
  - SSD: Lower latency; random I/O far less painful
  - HDD: Optimize for sequential access
  - Consider NVMe for extremely high throughput
Real-World Scenarios:
- E-commerce: Index product searches, cluster orders by date
- Social media: Index user queries, hash indexes for user IDs
- Analytics: Columnar storage, heavy compression
- Banking: Cluster transactions by account, index timestamps
For exams
Important Questions:
- Compare heap, sorted, and hash file organizations. When to use each?
- Explain B+ Tree structure. Why preferred over B-Tree?
- Show step-by-step insertion/deletion in B+ Tree
- Calculate cost of query with and without index
- Explain buffer replacement policies: LRU, MRU, Clock
- What is a clustered index? Pros and cons?
- Compare hash index vs B+ Tree index
- Explain extendible hashing. How does it handle growth?
- What factors determine index selectivity?
- Design indexing strategy for given schema and query workload
Key points
Key Principles:
• Disk I/O is the primary bottleneck; minimize it with indexes and caching
• B+ Tree is the default index: handles all query types, keeps data sorted
• Hash indexes excel at exact matches but can't do ranges
• Clustered indexes physically order data, improving range queries
• Buffer pool caches hot pages, dramatically improving performance
• Over-indexing hurts write performance; balance reads vs writes
• Modern SSDs change trade-offs but don't eliminate I/O concerns