Keys, Constraints, Functional Dependencies, and Normalization

Unit 3CLO03, CLO04

Poor database design leads to redundancy, inconsistency, and anomalies. A student's address stored in 10 different tables? Deleting a course accidentally deletes all student records? These nightmares are preventable through proper database design using keys, constraints, and normalization. Understanding these concepts is fundamental to creating efficient, maintainable databases.

The Basics

Keys

Keys uniquely identify tuples and establish relationships between tables.

1. Super Key

  • Any set of attributes that uniquely identifies a tuple
  • Can have redundant attributes
  • Example: {StudentID}, {StudentID, Name}, {StudentID, Email} are all super keys

2. Candidate Key

  • Minimal super key (no redundant attributes)
  • Multiple candidate keys possible
  • Example: {StudentID}, {Email} both uniquely identify students

3. Primary Key

  • Selected candidate key
  • One per table
  • Cannot be NULL
  • Example: StudentID chosen as primary key

4. Foreign Key

  • References primary key in another table
  • Establishes relationships
  • Can be NULL (unless specified NOT NULL)
  • Enforces referential integrity

5. Composite Key

  • Primary key with multiple attributes
  • Used when single attribute insufficient
  • Example: {StudentID, CourseID} in Enrollment table

6. Alternate Key

  • Candidate keys not chosen as primary key
  • Example: If StudentID is primary, Email is alternate key

Integrity Constraints

1. Domain Constraints

  • Define valid values for attributes
  • Data types, CHECK constraints
  • Example: Age BETWEEN 18 AND 100

2. Entity Integrity

  • Primary key cannot be NULL
  • Ensures each entity uniquely identifiable
  • Enforced automatically by PRIMARY KEY constraint

3. Referential Integrity

  • Foreign key must reference existing primary key OR be NULL
  • Maintains consistency across relationships
  • CASCADE options handle dependent operations

Functional Dependencies (FDs)

Definition: X → Y means if two tuples agree on X, they must agree on Y

Example: StudentID → Name (StudentID determines Name)

Properties:

  • Reflexive: If Y ⊆ X, then X → Y
  • Transitive: If X → Y and Y → Z, then X → Z
  • Does NOT hold: If X → Y, then Y → X (not symmetric)

Armstrong's Axioms

Three fundamental rules for deriving FDs:

1. Reflexivity: If Y ⊆ X, then X → Y
Example: {StudentID, Name} → StudentID

2. Augmentation: If X → Y, then XZ → YZ
Example: If StudentID → Name, then {StudentID, CourseID} → {Name, CourseID}

3. Transitivity: If X → Y and Y → Z, then X → Z
Example: StudentID → DeptID and DeptID → DeptName, so StudentID → DeptName

Derived Rules:

4. Union: If X → Y and X → Z, then X → YZ
5. Decomposition: If X → YZ, then X → Y and X → Z
6. Pseudotransitivity: If X → Y and WY → Z, then WX → Z

Normalization

Process of organizing data to minimize redundancy and avoid anomalies.

Anomalies:

Update Anomaly: Same data repeated, changing one instance leaves inconsistency
Insertion Anomaly: Cannot insert data without unrelated data
Deletion Anomaly: Deleting data unintentionally deletes other data

Normal Forms:

1NF (First Normal Form):

  • All attributes contain atomic (indivisible) values
  • No repeating groups or arrays
  • Each cell contains single value

2NF (Second Normal Form):

  • In 1NF
  • No partial dependencies (non-prime attribute dependent on part of candidate key)
  • Only applies to tables with composite keys

3NF (Third Normal Form):

  • In 2NF
  • No transitive dependencies (non-prime → non-prime)
  • For every X → Y: Either X is superkey OR Y is prime attribute

BCNF (Boyce-Codd Normal Form):

  • Stricter than 3NF
  • For every X → Y: X must be superkey
  • Eliminates all anomalies due to FDs

4NF (Fourth Normal Form):

  • In BCNF
  • No multivalued dependencies
  • MVD: X →→ Y means X determines set of Y values independently

5NF (Fifth Normal Form / PJNF):

  • In 4NF
  • No join dependencies
  • Cannot be decomposed further without loss

Technical Details

DETAILED KEY CONCEPTS

Attribute Closure (Finding Keys):

Algorithm to find X+:

  1. Result := X
  2. For each FD Y → Z in F:
    If Y ⊆ Result, then Result := Result ∪ Z
  3. Repeat until Result doesn't change

Example:
Given R(A, B, C, D) with FDs: {A → B, B → C, C → D}
Find A+:

  • Start: A+ = {A}
  • Apply A → B: A+ = {A, B}
  • Apply B → C: A+ = {A, B, C}
  • Apply C → D: A+ = {A, B, C, D}
  • Conclusion: A is a superkey (contains all attributes)

Minimal Cover (Canonical Cover):

Minimal set of FDs equivalent to original set.

Algorithm:

  1. Decompose: Convert right sides to single attributes
  2. Remove redundant: Test each FD for redundancy
  3. Remove extraneous: Test each attribute in left side

Example:
F = {A → BC, B → C, A → B, AB → C}

Step 1: Decompose
F' = {A → B, A → C, B → C, A → B, AB → C}

Step 2: Remove redundant A → B (duplicate)
F'' = {A → B, A → C, B → C, AB → C}

Step 3: Remove AB → C (derivable from A → B and B → C)
Minimal Cover: {A → B, B → C, A → C}

Actually, A → C derivable from A → B → C
Final Minimal Cover: {A → B, B → C}

NORMAL FORMS IN DEPTH

1NF (First Normal Form)

Violations:

  • Multivalued attributes: Phone = {123-4567, 234-5678}
  • Composite attributes not broken down: Name = "John Doe"
  • Repeating groups: Course1, Course2, Course3 columns

Example Violation:

StudentIDNamePhones
101Alice123-4567, 234-5678

1NF Solution:

StudentIDNamePhone
101Alice123-4567
101Alice234-5678

2NF (Second Normal Form)

Only applies to tables with composite primary keys.

Violation: Non-prime attribute partially dependent on primary key

Example:
R(StudentID, CourseID, StudentName, CourseName, Grade)
Primary Key: {StudentID, CourseID}
FDs:

  • StudentID → StudentName (partial dependency!)
  • CourseID → CourseName (partial dependency!)
  • {StudentID, CourseID} → Grade

Problems:

  • Update: Change student name in multiple rows
  • Insert: Can't add student without enrolling in course
  • Delete: Delete last enrollment, lose student info

2NF Solution:
Decompose:

  • Student(StudentID, StudentName)
  • Course(CourseID, CourseName)
  • Enrollment(StudentID, CourseID, Grade)

3NF (Third Normal Form)

Violation: Transitive dependency (non-prime → non-prime)

Example:
R(StudentID, DeptID, DeptName, DeptHead)
Primary Key: StudentID
FDs:

  • StudentID → DeptID
  • DeptID → DeptName (transitive!)
  • DeptID → DeptHead (transitive!)

Problems:

  • Update: Change dept name in multiple student records
  • Insert: Can't add department without students
  • Delete: Delete last student in dept, lose dept info

3NF Solution:

  • Student(StudentID, DeptID)
  • Department(DeptID, DeptName, DeptHead)

BCNF (Boyce-Codd Normal Form)

Stricter than 3NF: Every determinant must be a superkey

Example where 3NF ≠ BCNF:

R(Professor, Course, Student)
FDs:

  • {Professor, Course} → Student
  • {Student, Course} → Professor

Candidate Keys: {Professor, Course}, {Student, Course}
All attributes are prime, so it's in 3NF.

But: Student, Course → Professor where {Student, Course} is not a superkey in context.
Actually, {Student, Course} IS a candidate key, so this IS in BCNF.

Better Example:

R(Court, StartTime, EndTime)
FD: Court → EndTime (one court can only end at one time, based on rules)
Candidate Key: {Court, StartTime}

Court → EndTime violates BCNF (Court is not a superkey)

BCNF Solution:

  • R1(Court, EndTime)
  • R2(Court, StartTime)

4NF (Fourth Normal Form)

Multivalued Dependency (MVD): X →→ Y

Meaning: For each X value, there's a set of Y values independent of other attributes.

Example:

R(Professor, Course, Phone)

ProfessorCoursePhone
Dr. SmithDB111-1111
Dr. SmithDB222-2222
Dr. SmithAI111-1111
Dr. SmithAI222-2222

Professor →→ Course (Professor determines set of courses)
Professor →→ Phone (Professor determines set of phones)
These are independent!

Problems:

  • Redundancy: Phone numbers repeated for each course
  • Update: Change phone, must update multiple rows
  • Insert: Can't add phone without adding for all courses

4NF Solution:

  • ProfessorCourse(Professor, Course)
  • ProfessorPhone(Professor, Phone)

5NF (Fifth Normal Form / Project-Join Normal Form)

Join Dependency: R can be losslessly decomposed into R1, R2, ..., Rn

Example:

R(Supplier, Part, Project)
Meaning: Supplier supplies Part to Project

Business Rules:

  1. If supplier S supplies part P
  2. And project J uses part P
  3. And supplier S works on project J
  4. Then supplier S supplies part P to project J

This creates a join dependency!

5NF Solution:

  • SupplierPart(Supplier, Part)
  • ProjectPart(Project, Part)
  • SupplierProject(Supplier, Project)

Verification: JOIN these three recovers original without spurious tuples

DECOMPOSITION PROPERTIES

1. Lossless Join Decomposition

Decomposition R → R1, R2 is lossless if:
R = R1 ⋈ R2

Test: R1 ∩ R2 → R1 or R1 ∩ R2 → R2

Example:
R(A, B, C) with A → B
Decompose: R1(A, B), R2(A, C)
R1 ∩ R2 = {A}
A → B (attributes of R1), so lossless!

2. Dependency Preservation

All FDs can be checked without joining.

Example:
R(A, B, C) with FDs: A → B, B → C
Decompose: R1(A, B), R2(B, C)

Check FDs:

  • A → B: Check in R1 ✓
  • B → C: Check in R2 ✓
  • Dependency preserved!

Trade-off: BCNF guarantees lossless but may not preserve dependencies
3NF guarantees both lossless and dependency preservation

NORMALIZATION ALGORITHM

3NF Synthesis Algorithm:

Input: Relation R, FDs F
Output: 3NF decomposition

  1. Find minimal cover Fc
  2. For each FD X → Y in Fc, create relation(X, Y)
  3. If no relation contains a candidate key of R, add one
  4. Eliminate redundant relations

BCNF Decomposition Algorithm:

Input: Relation R, FDs F
Output: BCNF decomposition

  1. If R in BCNF, done
  2. Find FD X → Y that violates BCNF
  3. Decompose: R1 = XY, R2 = R - (Y - X)
  4. Recursively decompose R1 and R2

Example:

R(A, B, C, D)
F = {A → B, B → C, C → D}

Minimal Cover: Same (already minimal)

3NF Synthesis:

  • R1(A, B) from A → B
  • R2(B, C) from B → C
  • R3(C, D) from C → D
  • Candidate key: A (check: A+ = ABCD)
  • R1 contains A, so done

Result: R1(A, B), R2(B, C), R3(C, D)

Examples

COMPREHENSIVE EXAMPLES

EXAMPLE 1: Finding Keys

R(A, B, C, D, E)
FDs: {A → B, BC → E, ED → A}

Find all candidate keys:

Method: Attribute Classification

  • Left side only: C, D (must be in every key)
  • Right side only: B, E (never in key)
  • Both sides: A
  • Neither: None

Start with {C, D}:
CD+ = {C, D}
Not a key (doesn't include all attributes)

Try adding A: ACD+

  • A → B: ACD+ = {A, C, D, B}
  • BC → E: ACD+ = {A, C, D, B, E}
  • All attributes! ACD is a candidate key.

Try adding others systematically...
Result: Candidate keys = {ACD}

EXAMPLE 2: 1NF Conversion

Unnormalized:

OrderIDCustomerNameItems
1JohnBook, Pen, Notebook
2JaneLaptop

Problems:

  • Items column contains multiple values (not atomic)
  • Can't easily search for specific item
  • Can't efficiently count items per order

1NF Solution:

OrderIDCustomerNameItem
1JohnBook
1JohnPen
1JohnNotebook
2JaneLaptop

EXAMPLE 3: 2NF Conversion

Table in 1NF:

StudentCourse(StudentID, CourseID, StudentName, CourseName, Instructor, Grade)

Primary Key: {StudentID, CourseID}

FDs:

  • StudentID → StudentName (partial dependency!)
  • CourseID → CourseName (partial dependency!)
  • CourseID → Instructor (partial dependency!)
  • {StudentID, CourseID} → Grade

Anomalies:

  • Update: Change student name requires updating all enrollment records
  • Insert: Can't add student without enrolling in a course
  • Delete: Drop last course, lose student information

2NF Solution:

Student(StudentID, StudentName)
Course(CourseID, CourseName, Instructor)
Enrollment(StudentID, CourseID, Grade)

Verification:

  • No partial dependencies
  • All non-prime attributes fully dependent on entire primary key

EXAMPLE 4: 3NF Conversion

Table in 2NF:

Employee(EmpID, DeptID, DeptName, DeptLocation, Salary)

Primary Key: EmpID

FDs:

  • EmpID → DeptID
  • EmpID → Salary
  • DeptID → DeptName (transitive!)
  • DeptID → DeptLocation (transitive!)

Transitive Path: EmpID → DeptID → DeptName

Anomalies:

  • Update: Change department name in all employee records
  • Insert: Can't add department without employees
  • Delete: Remove last employee, lose department info

3NF Solution:

Employee(EmpID, DeptID, Salary)
Department(DeptID, DeptName, DeptLocation)

EXAMPLE 5: BCNF Conversion

Table in 3NF:

ClassSchedule(Classroom, Period, Course, Professor)

FDs:

  • {Classroom, Period} → Course
  • {Classroom, Period} → Professor
  • {Professor, Course} → Classroom (assuming one prof teaches one course in one room)

Candidate Keys:

  • {Classroom, Period}
  • {Professor, Course, Period} (need Period to uniquely identify)

Actually, let's simplify:

Better Example:

R(StudentID, Major, Advisor)

FDs:

  • StudentID → Major
  • Major → Advisor (each major has one advisor)

Candidate Key: StudentID

Check BCNF:

  • StudentID → Major: StudentID is superkey ✓
  • Major → Advisor: Major is NOT superkey ✗

Violates BCNF!

BCNF Solution:

R1(Major, Advisor)
R2(StudentID, Major)

Verification:

  • R1: Major → Advisor, Major is candidate key ✓
  • R2: StudentID → Major, StudentID is candidate key ✓

Note: This might not preserve the FD StudentID → Advisor directly
To check: Join R1 and R2, derive StudentID → Advisor through Major

EXAMPLE 6: 4NF Conversion

Table with MVD:

ProfessorCourseTextbook(Professor, Course, Textbook)

ProfessorCourseTextbook
SmithDBBook A
SmithDBBook B
SmithAIBook A
SmithAIBook B

MVDs:

  • Professor →→ Course (Smith teaches {DB, AI})
  • Professor →→ Textbook (Smith uses {Book A, Book B})

Problem: Every course-textbook combination created (cross product)

4NF Solution:

ProfessorCourse(Professor, Course)
ProfessorTextbook(Professor, Textbook)

ProfessorCourse
SmithDB
SmithAI
ProfessorTextbook
SmithBook A
SmithBook B

EXAMPLE 7: 5NF Conversion

Relation:

SupplyPartProject(Supplier, Part, Project)

Business Rules:

  1. Supplier supplies specific parts
  2. Project uses specific parts
  3. Supplier works on specific projects
  4. If conditions 1, 2, 3 met → Supplier supplies Part to Project

Data:

SupplierPartProject
S1P1J1
S1P2J1
S2P1J2

Join Dependency: Can losslessly decompose into three binary relations

5NF Solution:

SupplierPart(Supplier, Part)
PartProject(Part, Project)
SupplierProject(Supplier, Project)

Verification:

JOIN these three:
SupplierPart ⋈ PartProject ⋈ SupplierProject = Original relation

EXAMPLE 8: Lossless vs Lossy Decomposition

Lossless Example:

R(A, B, C) with FD: A → B

Decompose: R1(A, B), R2(A, C)

Test:

  • R1 ∩ R2 = {A}
  • A → AB (attributes of R1) ✓
  • Lossless!

Lossy Example:

R(A, B, C) with FD: A → B

Decompose: R1(A, B), R2(B, C)

Original Data:

ABC
a1b1c1
a2b1c2

R1:

AB
a1b1
a2b1

R2:

BC
b1c1
b1c2

Join R1 ⋈ R2:

ABC
a1b1c1
a1b1c2
a2b1c1
a2b1c2

Spurious tuples created! Lossy decomposition!

EXAMPLE 9: Dependency Preservation

R(A, B, C) with FDs: {A → B, B → C, A → C}

Decomposition 1:
R1(A, B), R2(B, C)

Check:

  • A → B: In R1 ✓
  • B → C: In R2 ✓
  • A → C: Derivable (A → B in R1, B → C in R2, transitivity)
  • Dependency preserved!

Decomposition 2:
R1(A, B), R2(A, C)

Check:

  • A → B: In R1 ✓
  • B → C: Cannot check without JOIN ✗
  • NOT dependency preserved

EXAMPLE 10: Complete Normalization Workflow

Original Relation:

R(OrderID, OrderDate, CustomerID, CustomerName, CustomerCity, ProductID, ProductName, Quantity, UnitPrice)

Sample Data:

OrderIDOrderDateCustomerIDCustomerNameCustomerCityProductIDProductNameQtyPrice
12024-01-01C1AliceNYCP1Laptop21000
12024-01-01C1AliceNYCP2Mouse125
22024-01-02C2BobLAP1Laptop11000

FDs:

  • OrderID → OrderDate, CustomerID
  • CustomerID → CustomerName, CustomerCity
  • ProductID → ProductName, UnitPrice
  • {OrderID, ProductID} → Quantity

Primary Key: {OrderID, ProductID}

Check 1NF: All atomic ✓

Check 2NF:

  • CustomerID → CustomerName (partial dependency on OrderID part)
  • CustomerID → CustomerCity (partial dependency)
  • ProductID → ProductName (partial dependency)
  • ProductID → UnitPrice (partial dependency)

Violates 2NF!

Convert to 2NF:

Order(OrderID, OrderDate, CustomerID)
Customer(CustomerID, CustomerName, CustomerCity)
Product(ProductID, ProductName, UnitPrice)
OrderDetail(OrderID, ProductID, Quantity)

Check 3NF:

In Order: OrderID → CustomerID → CustomerName (transitive!)

Violates 3NF! Customer info transitively dependent

Already fixed: Customer separated in 2NF conversion

Final Schema (3NF):

  • Order(OrderID, OrderDate, CustomerID)
  • Customer(CustomerID, CustomerName, CustomerCity)
  • Product(ProductID, ProductName, UnitPrice)
  • OrderDetail(OrderID, ProductID, Quantity)

Check BCNF:
All determinants are candidate keys ✓
In BCNF!

Real-World Use

PRACTICAL DATABASE DESIGN

Design Workflow:

  1. Requirements Analysis

    • Identify entities
    • Identify attributes
    • Identify relationships
    • Document business rules
  2. Conceptual Design (ER Model)

    • Create ER diagram
    • Define entities and relationships
    • Identify keys
  3. Logical Design (Relational Model)

    • Convert ER to tables
    • Define FDs
    • Normalize to 3NF/BCNF
  4. Physical Design

    • Choose data types
    • Create indexes
    • Consider denormalization for performance
  5. Implementation

    • CREATE TABLE statements
    • Add constraints
    • Create relationships

Real-World Considerations:

When to Denormalize:

  • Read-heavy applications (more SELECTs than UPDATEs)
  • Performance critical queries
  • Data warehouse / reporting
  • Calculated columns (storing aggregates)

Example:
Instead of joining Order and OrderDetail for total:
Add TotalAmount column to Order (updated by trigger)

When to Normalize:

  • Write-heavy applications
  • Data integrity critical (banking, medical)
  • Frequent updates
  • OLTP systems

Design Patterns:

1. Lookup Tables:

-- Instead of storing country names repeatedly
Country(CountryID, CountryName)
Customer(CustomerID, Name, CountryID)

2. Type Tables:

-- For categorization
ProductType(TypeID, TypeName)
Product(ProductID, ProductName, TypeID)

3. Audit Tables:

-- Track changes
EmployeeHistory(HistoryID, EmpID, OldSalary, NewSalary, ChangeDate)

4. Bridge Tables (Many-to-Many):

Student(StudentID, Name)
Course(CourseID, CourseName)
Enrollment(StudentID, CourseID, Grade)

Common Mistakes:

Mistake 1: Over-normalization
Bad: Separate table for every attribute
Good: Balance normalization with performance

Mistake 2: Under-normalization
Bad: All data in one table
Good: At least 3NF for OLTP

Mistake 3: Ignoring Business Rules
Bad: Generic schema ignoring domain
Good: Model specific business logic

Mistake 4: Poor Naming
Bad: Table1, Col1, FK1
Good: Employee, FirstName, DepartmentID

Mistake 5: Missing Constraints
Bad: No foreign keys, no checks
Good: Enforce referential integrity

Tools:

  1. ER Diagram Tools:

    • MySQL Workbench
    • draw.io
    • ERDPlus
    • Lucidchart
  2. Normalization Tools:

    • Paper and pencil (best for learning!)
    • Spreadsheet for FD analysis
  3. SQL Tools:

    • MySQL Workbench
    • pgAdmin (PostgreSQL)
    • SQL Server Management Studio
    • DBeaver (multi-platform)

Performance Tips:

  1. Index foreign keys for join performance
  2. Index WHERE clause columns frequently queried
  3. Composite indexes for multi-column queries
  4. Avoid over-indexing (slows INSERT/UPDATE)
  5. Analyze query plans to find bottlenecks

Documentation:

Always document:

  • ER diagrams
  • Table descriptions
  • FDs and business rules
  • Normalization decisions
  • Denormalization rationale
  • Index strategy

Industry Standards:

  • OLTP (Banking, E-commerce): 3NF minimum, BCNF preferred
  • OLAP (Data Warehouse): Star schema (denormalized for queries)
  • Hybrid: 3NF for transaction tables, materialized views for reporting

For exams

IMPORTANT EXAM QUESTIONS

Keys:

  1. Define and distinguish between super key, candidate key, and primary key with examples.

    • Super key: Any set uniquely identifying tuple
    • Candidate key: Minimal super key
    • Primary key: Selected candidate key
  2. Find all candidate keys for R(A,B,C,D) with FDs: {A→B, B→C, D→A}
    Solution: Find attribute closure for combinations starting with attributes only on left/neither side

  3. Explain composite key and foreign key with examples.

    • Composite: Multiple attributes form key
    • Foreign: References primary key in another table

Functional Dependencies:

  1. Define functional dependency. Give examples.
    X → Y: If two tuples agree on X, they agree on Y

  2. State Armstrong's Axioms and derive union rule.
    Reflexivity, Augmentation, Transitivity
    Union: From X→Y and X→Z, derive X→YZ

  3. Find attribute closure of AB for R(A,B,C,D) with FDs: {A→C, B→D, C→B}
    AB+ = {A,B,C,D}

  4. What is minimal cover? Find minimal cover for F = {A→B, B→C, A→C, AB→D}
    Remove redundant FDs: {A→B, B→C, AB→D}
    (A→C derivable from A→B→C)

Normalization:

  1. Explain 1NF, 2NF, 3NF, and BCNF with examples.
    1NF: Atomic values
    2NF: No partial dependency
    3NF: No transitive dependency
    BCNF: Every determinant is superkey

  2. Given relation and FDs, determine normal form.
    Check each normal form definition systematically

  3. Convert given relation to 2NF/3NF/BCNF. Show steps.
    Identify violations, decompose, verify

  4. Explain update, insertion, and deletion anomalies with examples.

    • Update: Redundant data, inconsistent updates
    • Insertion: Can't insert without unrelated data
    • Deletion: Lose important data unintentionally
  5. What is the difference between 3NF and BCNF?
    3NF: X→Y where X superkey OR Y prime
    BCNF: X→Y where X must be superkey
    BCNF stricter

  6. Explain 4NF and multivalued dependencies.
    MVD: X→→Y independent set of values
    4NF: No non-trivial MVDs

  7. What is 5NF? Give an example.
    No join dependencies
    Cannot decompose further without loss

Decomposition:

  1. What is lossless join decomposition? How to test?
    R = R1 ⋈ R2
    Test: R1∩R2 → R1 or R1∩R2 → R2

  2. Explain dependency preservation. Why important?
    Check FDs without joining
    Important for constraint enforcement

  3. Show lossy decomposition example with spurious tuples.
    Provide data showing join creates extra tuples

  4. 3NF vs BCNF trade-off: dependency preservation.
    3NF preserves dependencies, BCNF may not
    3NF often preferred in practice

Algorithms:

  1. Describe 3NF synthesis algorithm.
    Minimal cover → relation per FD → add key → remove redundant

  2. Describe BCNF decomposition algorithm.
    Find violation → split → recurse

Constraints:

  1. Explain entity integrity and referential integrity.
    Entity: PK not NULL
    Referential: FK references existing PK or NULL

  2. What are CASCADE options in foreign keys?
    CASCADE: Delete/update related rows
    SET NULL: Set FK to NULL
    RESTRICT: Reject operation

QUICK REVISION

Keys Quick Reference:

  • Super Key ⊇ Candidate Key ⊇ Primary Key
  • Composite Key: Multiple attributes
  • Foreign Key: References PK in another table

Armstrong's Axioms:

  • Reflexivity: Y⊆X ⇒ X→Y
  • Augmentation: X→Y ⇒ XZ→YZ
  • Transitivity: X→Y, Y→Z ⇒ X→Z

Normal Forms:

  • 1NF: Atomic values
  • 2NF: 1NF + No partial dependency
  • 3NF: 2NF + No transitive dependency
  • BCNF: Every determinant is superkey
  • 4NF: BCNF + No MVDs
  • 5NF: 4NF + No join dependencies

Anomalies:

  • Update: Redundancy causes inconsistency
  • Insert: Can't insert without unrelated data
  • Delete: Lose unrelated data

Decomposition:

  • Lossless: R = R1 ⋈ R2
  • Dependency Preserving: Check FDs without join

Test Shortcuts:

  • Attribute closure: Find keys
  • Minimal cover: Simplify FDs
  • Check BCNF: Every left side a superkey?
  • Lossless: Common attributes determine one side?

Key points

KEY TAKEAWAYS

Keys uniquely identify tuples: Super key ⊇ Candidate key ⊇ Primary key

Foreign keys establish relationships and enforce referential integrity

Functional dependency X → Y: X determines Y (X and Y uniquely paired)

Armstrong's Axioms are foundation for deriving FDs: Reflexivity, Augmentation, Transitivity

Attribute closure (X+) finds all attributes determined by X (used to find keys)

Normalization eliminates anomalies through structured decomposition

1NF requires atomic values (no arrays, no repeating groups)

2NF eliminates partial dependencies (applies to composite keys)

3NF eliminates transitive dependencies (non-prime → non-prime)

BCNF is stricter than 3NF: Every determinant must be superkey

4NF eliminates multivalued dependencies (independent sets of values)

5NF eliminates join dependencies (cannot decompose further)

Lossless decomposition: Join recovers original relation without spurious tuples

Dependency preservation: Check constraints without joining tables

Trade-off: BCNF may not preserve dependencies; 3NF guarantees both lossless and preservation

Practical design: Normalize to 3NF/BCNF for OLTP, denormalize for OLAP/reporting

Anomalies prevented: Update anomaly, insertion anomaly, deletion anomaly

REMEMBER: Normalization is not about perfection—it's about eliminating redundancy and anomalies while maintaining data integrity. Most real-world OLTP systems use 3NF as a good balance between theory and practice!