Keys, Constraints, Functional Dependencies, and Normalization

Unit 3•CLO03, CLO04

Poor database design leads to redundancy, inconsistency, and anomalies. A student's address stored in 10 different tables? Deleting a course accidentally deletes all student records? These nightmares are preventable through proper database design using keys, constraints, and normalization. Understanding these concepts is fundamental to creating efficient, maintainable databases.

The Basics

Keys

Keys uniquely identify tuples and establish relationships between tables.

1. Super Key

Any set of attributes that uniquely identifies a tuple
Can have redundant attributes
Example: {StudentID}, {StudentID, Name}, {StudentID, Email} are all super keys

2. Candidate Key

Minimal super key (no redundant attributes)
Multiple candidate keys possible
Example: {StudentID}, {Email} both uniquely identify students

3. Primary Key

Selected candidate key
One per table
Cannot be NULL
Example: StudentID chosen as primary key

4. Foreign Key

References primary key in another table
Establishes relationships
Can be NULL (unless specified NOT NULL)
Enforces referential integrity

5. Composite Key

Primary key with multiple attributes
Used when single attribute insufficient
Example: {StudentID, CourseID} in Enrollment table

6. Alternate Key

Candidate keys not chosen as primary key
Example: If StudentID is primary, Email is alternate key

Integrity Constraints

1. Domain Constraints

Define valid values for attributes
Data types, CHECK constraints
Example: Age BETWEEN 18 AND 100

2. Entity Integrity

Primary key cannot be NULL
Ensures each entity uniquely identifiable
Enforced automatically by PRIMARY KEY constraint

3. Referential Integrity

Foreign key must reference existing primary key OR be NULL
Maintains consistency across relationships
CASCADE options handle dependent operations

Functional Dependencies (FDs)

Definition: X → Y means if two tuples agree on X, they must agree on Y

Example: StudentID → Name (StudentID determines Name)

Properties:

Reflexive: If Y ⊆ X, then X → Y
Transitive: If X → Y and Y → Z, then X → Z
Does NOT hold: If X → Y, then Y → X (not symmetric)

Armstrong's Axioms

Three fundamental rules for deriving FDs:

1. Reflexivity: If Y ⊆ X, then X → Y
Example: {StudentID, Name} → StudentID

2. Augmentation: If X → Y, then XZ → YZ
Example: If StudentID → Name, then {StudentID, CourseID} → {Name, CourseID}

3. Transitivity: If X → Y and Y → Z, then X → Z
Example: StudentID → DeptID and DeptID → DeptName, so StudentID → DeptName

Derived Rules:

4. Union: If X → Y and X → Z, then X → YZ
5. Decomposition: If X → YZ, then X → Y and X → Z
6. Pseudotransitivity: If X → Y and WY → Z, then WX → Z

Normalization

Process of organizing data to minimize redundancy and avoid anomalies.

Anomalies:

Update Anomaly: Same data repeated, changing one instance leaves inconsistency
Insertion Anomaly: Cannot insert data without unrelated data
Deletion Anomaly: Deleting data unintentionally deletes other data

Normal Forms:

1NF (First Normal Form):

All attributes contain atomic (indivisible) values
No repeating groups or arrays
Each cell contains single value

2NF (Second Normal Form):

In 1NF
No partial dependencies (non-prime attribute dependent on part of candidate key)
Only applies to tables with composite keys

3NF (Third Normal Form):

In 2NF
No transitive dependencies (non-prime → non-prime)
For every X → Y: Either X is superkey OR Y is prime attribute

BCNF (Boyce-Codd Normal Form):

Stricter than 3NF
For every X → Y: X must be superkey
Eliminates all anomalies due to FDs

4NF (Fourth Normal Form):

In BCNF
No multivalued dependencies
MVD: X →→ Y means X determines set of Y values independently

5NF (Fifth Normal Form / PJNF):

In 4NF
No join dependencies
Cannot be decomposed further without loss

Technical Details

DETAILED KEY CONCEPTS

Attribute Closure (Finding Keys):

Algorithm to find X+:

Result := X
For each FD Y → Z in F:
If Y ⊆ Result, then Result := Result ∪ Z
Repeat until Result doesn't change

Example:
Given R(A, B, C, D) with FDs: {A → B, B → C, C → D}
Find A+:

Start: A+ = {A}
Apply A → B: A+ = {A, B}
Apply B → C: A+ = {A, B, C}
Apply C → D: A+ = {A, B, C, D}
Conclusion: A is a superkey (contains all attributes)

Minimal Cover (Canonical Cover):

Minimal set of FDs equivalent to original set.

Algorithm:

Decompose: Convert right sides to single attributes
Remove redundant: Test each FD for redundancy
Remove extraneous: Test each attribute in left side

Example:
F = {A → BC, B → C, A → B, AB → C}

Step 1: Decompose
F' = {A → B, A → C, B → C, A → B, AB → C}

Step 2: Remove redundant A → B (duplicate)
F'' = {A → B, A → C, B → C, AB → C}

Step 3: Remove AB → C (derivable from A → B and B → C)
Minimal Cover: {A → B, B → C, A → C}

Actually, A → C derivable from A → B → C
Final Minimal Cover: {A → B, B → C}

NORMAL FORMS IN DEPTH

1NF (First Normal Form)

Violations:

Multivalued attributes: Phone = {123-4567, 234-5678}
Composite attributes not broken down: Name = "John Doe"
Repeating groups: Course1, Course2, Course3 columns

Example Violation:

StudentID	Name	Phones
101	Alice	123-4567, 234-5678

1NF Solution:

StudentID	Name	Phone
101	Alice	123-4567
101	Alice	234-5678

2NF (Second Normal Form)

Only applies to tables with composite primary keys.

Violation: Non-prime attribute partially dependent on primary key

Example:
R(StudentID, CourseID, StudentName, CourseName, Grade)
Primary Key: {StudentID, CourseID}
FDs:

StudentID → StudentName (partial dependency!)
CourseID → CourseName (partial dependency!)
{StudentID, CourseID} → Grade

Problems:

Update: Change student name in multiple rows
Insert: Can't add student without enrolling in course
Delete: Delete last enrollment, lose student info

2NF Solution:
Decompose:

Student(StudentID, StudentName)
Course(CourseID, CourseName)
Enrollment(StudentID, CourseID, Grade)

3NF (Third Normal Form)

Violation: Transitive dependency (non-prime → non-prime)

Example:
R(StudentID, DeptID, DeptName, DeptHead)
Primary Key: StudentID
FDs:

StudentID → DeptID
DeptID → DeptName (transitive!)
DeptID → DeptHead (transitive!)

Problems:

Update: Change dept name in multiple student records
Insert: Can't add department without students
Delete: Delete last student in dept, lose dept info

3NF Solution:

Student(StudentID, DeptID)
Department(DeptID, DeptName, DeptHead)

BCNF (Boyce-Codd Normal Form)

Stricter than 3NF: Every determinant must be a superkey

Example where 3NF ≠ BCNF:

R(Professor, Course, Student)
FDs:

{Professor, Course} → Student
{Student, Course} → Professor

Candidate Keys: {Professor, Course}, {Student, Course}
All attributes are prime, so it's in 3NF.

But: Student, Course → Professor where {Student, Course} is not a superkey in context.
Actually, {Student, Course} IS a candidate key, so this IS in BCNF.

Better Example:

R(Court, StartTime, EndTime)
FD: Court → EndTime (one court can only end at one time, based on rules)
Candidate Key: {Court, StartTime}

Court → EndTime violates BCNF (Court is not a superkey)

BCNF Solution:

R1(Court, EndTime)
R2(Court, StartTime)

4NF (Fourth Normal Form)

Multivalued Dependency (MVD): X →→ Y

Meaning: For each X value, there's a set of Y values independent of other attributes.

Example:

R(Professor, Course, Phone)

Professor	Course	Phone
Dr. Smith	DB	111-1111
Dr. Smith	DB	222-2222
Dr. Smith	AI	111-1111
Dr. Smith	AI	222-2222

Professor →→ Course (Professor determines set of courses)
Professor →→ Phone (Professor determines set of phones)
These are independent!

Problems:

Redundancy: Phone numbers repeated for each course
Update: Change phone, must update multiple rows
Insert: Can't add phone without adding for all courses

4NF Solution:

ProfessorCourse(Professor, Course)
ProfessorPhone(Professor, Phone)

5NF (Fifth Normal Form / Project-Join Normal Form)

Join Dependency: R can be losslessly decomposed into R1, R2, ..., Rn

Example:

R(Supplier, Part, Project)
Meaning: Supplier supplies Part to Project

Business Rules:

If supplier S supplies part P
And project J uses part P
And supplier S works on project J
Then supplier S supplies part P to project J

This creates a join dependency!

5NF Solution:

SupplierPart(Supplier, Part)
ProjectPart(Project, Part)
SupplierProject(Supplier, Project)

Verification: JOIN these three recovers original without spurious tuples

DECOMPOSITION PROPERTIES

1. Lossless Join Decomposition

Decomposition R → R1, R2 is lossless if:
R = R1 ⋈ R2

Test: R1 ∩ R2 → R1 or R1 ∩ R2 → R2

Example:
R(A, B, C) with A → B
Decompose: R1(A, B), R2(A, C)
R1 ∩ R2 = {A}
A → B (attributes of R1), so lossless!

2. Dependency Preservation

All FDs can be checked without joining.

Example:
R(A, B, C) with FDs: A → B, B → C
Decompose: R1(A, B), R2(B, C)

Check FDs:

A → B: Check in R1 ✓
B → C: Check in R2 ✓
Dependency preserved!

Trade-off: BCNF guarantees lossless but may not preserve dependencies
3NF guarantees both lossless and dependency preservation

NORMALIZATION ALGORITHM

3NF Synthesis Algorithm:

Input: Relation R, FDs F
Output: 3NF decomposition

Find minimal cover Fc
For each FD X → Y in Fc, create relation(X, Y)
If no relation contains a candidate key of R, add one
Eliminate redundant relations

BCNF Decomposition Algorithm:

Input: Relation R, FDs F
Output: BCNF decomposition

If R in BCNF, done
Find FD X → Y that violates BCNF
Decompose: R1 = XY, R2 = R - (Y - X)
Recursively decompose R1 and R2

Example:

R(A, B, C, D)
F = {A → B, B → C, C → D}

Minimal Cover: Same (already minimal)

3NF Synthesis:

R1(A, B) from A → B
R2(B, C) from B → C
R3(C, D) from C → D
Candidate key: A (check: A+ = ABCD)
R1 contains A, so done

Result: R1(A, B), R2(B, C), R3(C, D)

Examples

COMPREHENSIVE EXAMPLES

EXAMPLE 1: Finding Keys

R(A, B, C, D, E)
FDs: {A → B, BC → E, ED → A}

Find all candidate keys:

Method: Attribute Classification

Left side only: C, D (must be in every key)
Right side only: B, E (never in key)
Both sides: A
Neither: None

Start with {C, D}:
CD+ = {C, D}
Not a key (doesn't include all attributes)

Try adding A: ACD+

A → B: ACD+ = {A, C, D, B}
BC → E: ACD+ = {A, C, D, B, E}
All attributes! ACD is a candidate key.

Try adding others systematically...
Result: Candidate keys = {ACD}

EXAMPLE 2: 1NF Conversion

Unnormalized:

OrderID	CustomerName	Items
1	John	Book, Pen, Notebook
2	Jane	Laptop

Problems:

Items column contains multiple values (not atomic)
Can't easily search for specific item
Can't efficiently count items per order

1NF Solution:

OrderID	CustomerName	Item
1	John	Book
1	John	Pen
1	John	Notebook
2	Jane	Laptop

EXAMPLE 3: 2NF Conversion

Table in 1NF:

StudentCourse(StudentID, CourseID, StudentName, CourseName, Instructor, Grade)

Primary Key: {StudentID, CourseID}

FDs:

StudentID → StudentName (partial dependency!)
CourseID → CourseName (partial dependency!)
CourseID → Instructor (partial dependency!)
{StudentID, CourseID} → Grade

Anomalies:

Update: Change student name requires updating all enrollment records
Insert: Can't add student without enrolling in a course
Delete: Drop last course, lose student information

2NF Solution:

Student(StudentID, StudentName)
Course(CourseID, CourseName, Instructor)
Enrollment(StudentID, CourseID, Grade)

Verification:

No partial dependencies
All non-prime attributes fully dependent on entire primary key

EXAMPLE 4: 3NF Conversion

Table in 2NF:

Employee(EmpID, DeptID, DeptName, DeptLocation, Salary)

Primary Key: EmpID

FDs:

EmpID → DeptID
EmpID → Salary
DeptID → DeptName (transitive!)
DeptID → DeptLocation (transitive!)

Transitive Path: EmpID → DeptID → DeptName

Anomalies:

Update: Change department name in all employee records
Insert: Can't add department without employees
Delete: Remove last employee, lose department info

3NF Solution:

Employee(EmpID, DeptID, Salary)
Department(DeptID, DeptName, DeptLocation)

EXAMPLE 5: BCNF Conversion

Table in 3NF:

ClassSchedule(Classroom, Period, Course, Professor)

FDs:

{Classroom, Period} → Course
{Classroom, Period} → Professor
{Professor, Course} → Classroom (assuming one prof teaches one course in one room)

Candidate Keys:

{Classroom, Period}
{Professor, Course, Period} (need Period to uniquely identify)

Actually, let's simplify:

Better Example:

R(StudentID, Major, Advisor)

FDs:

StudentID → Major
Major → Advisor (each major has one advisor)

Candidate Key: StudentID

Check BCNF:

StudentID → Major: StudentID is superkey ✓
Major → Advisor: Major is NOT superkey ✗

Violates BCNF!

BCNF Solution:

R1(Major, Advisor)
R2(StudentID, Major)

Verification:

R1: Major → Advisor, Major is candidate key ✓
R2: StudentID → Major, StudentID is candidate key ✓

Note: This might not preserve the FD StudentID → Advisor directly
To check: Join R1 and R2, derive StudentID → Advisor through Major

EXAMPLE 6: 4NF Conversion

Table with MVD:

ProfessorCourseTextbook(Professor, Course, Textbook)

Professor	Course	Textbook
Smith	DB	Book A
Smith	DB	Book B
Smith	AI	Book A
Smith	AI	Book B

MVDs:

Professor →→ Course (Smith teaches {DB, AI})
Professor →→ Textbook (Smith uses {Book A, Book B})

Problem: Every course-textbook combination created (cross product)

4NF Solution:

ProfessorCourse(Professor, Course)
ProfessorTextbook(Professor, Textbook)

Professor	Course
Smith	DB
Smith	AI

Professor	Textbook
Smith	Book A
Smith	Book B

EXAMPLE 7: 5NF Conversion

Relation:

SupplyPartProject(Supplier, Part, Project)

Business Rules:

Supplier supplies specific parts
Project uses specific parts
Supplier works on specific projects
If conditions 1, 2, 3 met → Supplier supplies Part to Project

Data:

Supplier	Part	Project
S1	P1	J1
S1	P2	J1
S2	P1	J2

Join Dependency: Can losslessly decompose into three binary relations

5NF Solution:

SupplierPart(Supplier, Part)
PartProject(Part, Project)
SupplierProject(Supplier, Project)

Verification:

JOIN these three:
SupplierPart ⋈ PartProject ⋈ SupplierProject = Original relation

EXAMPLE 8: Lossless vs Lossy Decomposition

Lossless Example:

R(A, B, C) with FD: A → B

Decompose: R1(A, B), R2(A, C)

Test:

R1 ∩ R2 = {A}
A → AB (attributes of R1) ✓
Lossless!

Lossy Example:

R(A, B, C) with FD: A → B

Decompose: R1(A, B), R2(B, C)

Original Data:

A	B	C
a1	b1	c1
a2	b1	c2

R1:

A	B
a1	b1
a2	b1

R2:

B	C
b1	c1
b1	c2

Join R1 ⋈ R2:

A	B	C
a1	b1	c1
a1	b1	c2
a2	b1	c1
a2	b1	c2

Spurious tuples created! Lossy decomposition!

EXAMPLE 9: Dependency Preservation

R(A, B, C) with FDs: {A → B, B → C, A → C}

Decomposition 1:
R1(A, B), R2(B, C)

Check:

A → B: In R1 ✓
B → C: In R2 ✓
A → C: Derivable (A → B in R1, B → C in R2, transitivity)
Dependency preserved!

Decomposition 2:
R1(A, B), R2(A, C)

Check:

A → B: In R1 ✓
B → C: Cannot check without JOIN ✗
NOT dependency preserved

EXAMPLE 10: Complete Normalization Workflow

Original Relation:

R(OrderID, OrderDate, CustomerID, CustomerName, CustomerCity, ProductID, ProductName, Quantity, UnitPrice)

Sample Data:

OrderID	OrderDate	CustomerID	CustomerName	CustomerCity	ProductID	ProductName	Qty	Price
1	2024-01-01	C1	Alice	NYC	P1	Laptop	2	1000
1	2024-01-01	C1	Alice	NYC	P2	Mouse	1	25
2	2024-01-02	C2	Bob	LA	P1	Laptop	1	1000

FDs:

OrderID → OrderDate, CustomerID
CustomerID → CustomerName, CustomerCity
ProductID → ProductName, UnitPrice
{OrderID, ProductID} → Quantity

Primary Key: {OrderID, ProductID}

Check 1NF: All atomic ✓

Check 2NF:

CustomerID → CustomerName (partial dependency on OrderID part)
CustomerID → CustomerCity (partial dependency)
ProductID → ProductName (partial dependency)
ProductID → UnitPrice (partial dependency)

Violates 2NF!

Convert to 2NF:

Order(OrderID, OrderDate, CustomerID)
Customer(CustomerID, CustomerName, CustomerCity)
Product(ProductID, ProductName, UnitPrice)
OrderDetail(OrderID, ProductID, Quantity)

Check 3NF:

In Order: OrderID → CustomerID → CustomerName (transitive!)

Violates 3NF! Customer info transitively dependent

Already fixed: Customer separated in 2NF conversion

Final Schema (3NF):

Order(OrderID, OrderDate, CustomerID)
Customer(CustomerID, CustomerName, CustomerCity)
Product(ProductID, ProductName, UnitPrice)
OrderDetail(OrderID, ProductID, Quantity)

Check BCNF:
All determinants are candidate keys ✓
In BCNF!

Real-World Use

PRACTICAL DATABASE DESIGN

Design Workflow:

Requirements Analysis
- Identify entities
- Identify attributes
- Identify relationships
- Document business rules
Conceptual Design (ER Model)
- Create ER diagram
- Define entities and relationships
- Identify keys
Logical Design (Relational Model)
- Convert ER to tables
- Define FDs
- Normalize to 3NF/BCNF
Physical Design
- Choose data types
- Create indexes
- Consider denormalization for performance
Implementation
- CREATE TABLE statements
- Add constraints
- Create relationships

Real-World Considerations:

When to Denormalize:

Read-heavy applications (more SELECTs than UPDATEs)
Performance critical queries
Data warehouse / reporting
Calculated columns (storing aggregates)

Example:
Instead of joining Order and OrderDetail for total:
Add TotalAmount column to Order (updated by trigger)

When to Normalize:

Write-heavy applications
Data integrity critical (banking, medical)
Frequent updates
OLTP systems

Design Patterns:

1. Lookup Tables:

-- Instead of storing country names repeatedly
Country(CountryID, CountryName)
Customer(CustomerID, Name, CountryID)

2. Type Tables:

-- For categorization
ProductType(TypeID, TypeName)
Product(ProductID, ProductName, TypeID)

3. Audit Tables:

-- Track changes
EmployeeHistory(HistoryID, EmpID, OldSalary, NewSalary, ChangeDate)

4. Bridge Tables (Many-to-Many):

Student(StudentID, Name)
Course(CourseID, CourseName)
Enrollment(StudentID, CourseID, Grade)

Common Mistakes:

Mistake 1: Over-normalization
Bad: Separate table for every attribute
Good: Balance normalization with performance

Mistake 2: Under-normalization
Bad: All data in one table
Good: At least 3NF for OLTP

Mistake 3: Ignoring Business Rules
Bad: Generic schema ignoring domain
Good: Model specific business logic

Mistake 4: Poor Naming
Bad: Table1, Col1, FK1
Good: Employee, FirstName, DepartmentID

Mistake 5: Missing Constraints
Bad: No foreign keys, no checks
Good: Enforce referential integrity

Tools:

ER Diagram Tools:
- MySQL Workbench
- draw.io
- ERDPlus
- Lucidchart
Normalization Tools:
- Paper and pencil (best for learning!)
- Spreadsheet for FD analysis
SQL Tools:
- MySQL Workbench
- pgAdmin (PostgreSQL)
- SQL Server Management Studio
- DBeaver (multi-platform)

Performance Tips:

Index foreign keys for join performance
Index WHERE clause columns frequently queried
Composite indexes for multi-column queries
Avoid over-indexing (slows INSERT/UPDATE)
Analyze query plans to find bottlenecks

Documentation:

Always document:

ER diagrams
Table descriptions
FDs and business rules
Normalization decisions
Denormalization rationale
Index strategy

Industry Standards:

OLTP (Banking, E-commerce): 3NF minimum, BCNF preferred
OLAP (Data Warehouse): Star schema (denormalized for queries)
Hybrid: 3NF for transaction tables, materialized views for reporting

For exams

IMPORTANT EXAM QUESTIONS

Keys:

Define and distinguish between super key, candidate key, and primary key with examples.
- Super key: Any set uniquely identifying tuple
- Candidate key: Minimal super key
- Primary key: Selected candidate key
Find all candidate keys for R(A,B,C,D) with FDs: {A→B, B→C, D→A}
Solution: Find attribute closure for combinations starting with attributes only on left/neither side
Explain composite key and foreign key with examples.
- Composite: Multiple attributes form key
- Foreign: References primary key in another table

Functional Dependencies:

Define functional dependency. Give examples.
X → Y: If two tuples agree on X, they agree on Y
State Armstrong's Axioms and derive union rule.
Reflexivity, Augmentation, Transitivity
Union: From X→Y and X→Z, derive X→YZ
Find attribute closure of AB for R(A,B,C,D) with FDs: {A→C, B→D, C→B}
AB+ = {A,B,C,D}
What is minimal cover? Find minimal cover for F = {A→B, B→C, A→C, AB→D}
Remove redundant FDs: {A→B, B→C, AB→D}
(A→C derivable from A→B→C)

Normalization:

Explain 1NF, 2NF, 3NF, and BCNF with examples.
1NF: Atomic values
2NF: No partial dependency
3NF: No transitive dependency
BCNF: Every determinant is superkey
Given relation and FDs, determine normal form.
Check each normal form definition systematically
Convert given relation to 2NF/3NF/BCNF. Show steps.
Identify violations, decompose, verify
Explain update, insertion, and deletion anomalies with examples.
- Update: Redundant data, inconsistent updates
- Insertion: Can't insert without unrelated data
- Deletion: Lose important data unintentionally
What is the difference between 3NF and BCNF?
3NF: X→Y where X superkey OR Y prime
BCNF: X→Y where X must be superkey
BCNF stricter
Explain 4NF and multivalued dependencies.
MVD: X→→Y independent set of values
4NF: No non-trivial MVDs
What is 5NF? Give an example.
No join dependencies
Cannot decompose further without loss

Decomposition:

What is lossless join decomposition? How to test?
R = R1 ⋈ R2
Test: R1∩R2 → R1 or R1∩R2 → R2
Explain dependency preservation. Why important?
Check FDs without joining
Important for constraint enforcement
Show lossy decomposition example with spurious tuples.
Provide data showing join creates extra tuples
3NF vs BCNF trade-off: dependency preservation.
3NF preserves dependencies, BCNF may not
3NF often preferred in practice

Algorithms:

Describe 3NF synthesis algorithm.
Minimal cover → relation per FD → add key → remove redundant
Describe BCNF decomposition algorithm.
Find violation → split → recurse

Constraints:

Explain entity integrity and referential integrity.
Entity: PK not NULL
Referential: FK references existing PK or NULL
What are CASCADE options in foreign keys?
CASCADE: Delete/update related rows
SET NULL: Set FK to NULL
RESTRICT: Reject operation

QUICK REVISION

Keys Quick Reference:

Super Key ⊇ Candidate Key ⊇ Primary Key
Composite Key: Multiple attributes
Foreign Key: References PK in another table

Armstrong's Axioms:

Reflexivity: Y⊆X ⇒ X→Y
Augmentation: X→Y ⇒ XZ→YZ
Transitivity: X→Y, Y→Z ⇒ X→Z

Normal Forms:

1NF: Atomic values
2NF: 1NF + No partial dependency
3NF: 2NF + No transitive dependency
BCNF: Every determinant is superkey
4NF: BCNF + No MVDs
5NF: 4NF + No join dependencies

Anomalies:

Update: Redundancy causes inconsistency
Insert: Can't insert without unrelated data
Delete: Lose unrelated data

Decomposition:

Lossless: R = R1 ⋈ R2
Dependency Preserving: Check FDs without join

Test Shortcuts:

Attribute closure: Find keys
Minimal cover: Simplify FDs
Check BCNF: Every left side a superkey?
Lossless: Common attributes determine one side?

Key points

KEY TAKEAWAYS

✓ Keys uniquely identify tuples: Super key ⊇ Candidate key ⊇ Primary key

✓ Foreign keys establish relationships and enforce referential integrity

✓ Functional dependency X → Y: X determines Y (X and Y uniquely paired)

✓ Armstrong's Axioms are foundation for deriving FDs: Reflexivity, Augmentation, Transitivity

✓ Attribute closure (X+) finds all attributes determined by X (used to find keys)

✓ Normalization eliminates anomalies through structured decomposition

✓ 1NF requires atomic values (no arrays, no repeating groups)

✓ 2NF eliminates partial dependencies (applies to composite keys)

✓ 3NF eliminates transitive dependencies (non-prime → non-prime)

✓ BCNF is stricter than 3NF: Every determinant must be superkey

✓ 4NF eliminates multivalued dependencies (independent sets of values)

✓ 5NF eliminates join dependencies (cannot decompose further)

✓ Lossless decomposition: Join recovers original relation without spurious tuples

✓ Dependency preservation: Check constraints without joining tables

✓ Trade-off: BCNF may not preserve dependencies; 3NF guarantees both lossless and preservation

✓ Practical design: Normalize to 3NF/BCNF for OLTP, denormalize for OLAP/reporting

✓ Anomalies prevented: Update anomaly, insertion anomaly, deletion anomaly

REMEMBER: Normalization is not about perfection—it's about eliminating redundancy and anomalies while maintaining data integrity. Most real-world OLTP systems use 3NF as a good balance between theory and practice!

←

SQL DDL, DML, and PL/SQL

Transactions, ACID Properties, Concurrency Control, and Recovery

→