
CSC271 Database Systems

Relational Database Design


Normalization Process

Dr. Khalid Latif


Relation Decomposition

Normalization is used to validate and improve a logical design so that it satisfies
certain constraints that avoid unnecessary duplication of data.
It is a process of decomposing relations with anomalies to produce smaller,
well-structured relations.
But not all decompositions are good.
Relation Decomposition Example

Consider a student relation that we want to decompose:
Student = (reg, name, course, grade)
Suppose we decompose Student into:
Student1 = (reg, name)
Student2 = (name, course, grade)

Can we reconstruct the original relation by joining the decomposition?
Original relation Student:
Reg   Name     Course   Grade
101   Fatima   EE101    B
102   Fatima   CS273    A
...   ...      ...      ...

Decomposition into Student1 and Student2:
Reg   Name          Name     Course   Grade
101   Fatima        Fatima   EE101    B
102   Fatima        Fatima   CS273    A

Join of Student1 and Student2 on Name:
Reg   Name     Course   Grade
101   Fatima   EE101    B
101   Fatima   CS273    A
102   Fatima   EE101    B
102   Fatima   CS273    A

We can NOT reconstruct the original relation from the decomposition!
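The lossy join above can be reproduced in a few lines. This is a minimal sketch that models each relation as a Python set of tuples and uses only the two rows shown on the slide; the variable names are illustrative.

```python
# Original relation from the slide: (reg, name, course, grade)
student = {
    (101, "Fatima", "EE101", "B"),
    (102, "Fatima", "CS273", "A"),
}

# Project onto the two decomposed schemas.
student1 = {(reg, name) for reg, name, _, _ in student}
student2 = {(name, course, grade) for _, name, course, grade in student}

# Natural join on the shared attribute `name`.
joined = {
    (reg, name, course, grade)
    for reg, name in student1
    for name2, course, grade in student2
    if name == name2
}

# Tuples that the join produces but the original never contained.
spurious = joined - student
print(sorted(joined))
print("spurious tuples:", sorted(spurious))
```

The join returns four tuples instead of two because `name` does not identify a student, so every Fatima row in one projection pairs with every Fatima row in the other.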
Normalization Theory

Decide whether a particular relation R is in good form.
If relation R is not in good form, decompose it into a set of relations
{R1, R2, ..., Rk} such that
- each new relation is in good form, and
- the decomposition is a lossless-join decomposition.
First Normal Form (1NF)

A domain is atomic if its elements are considered to be indivisible units.
Non-atomic values complicate storage and may involve redundant storage of data.
A relational schema R is in first normal form if the domains of all attributes of R are atomic.
No multi-valued attributes!
Atomic Values
Strings are normally considered indivisible and hence
atomic.
Identification numbers like CIIT/2016/ISB/CS/271 that
can be broken up into parts are non-atomic.
Atomicity is actually a property of how the elements of
the domain are used.
Suppose that students are given registration numbers
which are strings of the form CS0012 or EE1127.
If the first two characters are extracted to find their
program, the domain of registration numbers is not
atomic.
Achieving 1NF: Composite Strings

Split into multiple attributes


ID    Name     Address
100   Umar     H#13, Street 13, G13/3 Islamabad
110   Usman    H#11, Street 11, G11/1 Islamabad
150   Fatima   H#3, Street 3, Chaklala Scheme III, Rawalpindi

ID    Name     Address                               City
100   Umar     H#13, Street 13, G13/3                Islamabad
110   Usman    H#11, Street 11, G11/1                Islamabad
150   Fatima   H#3, Street 3, Chaklala Scheme III    Rawalpindi
Achieving 1NF: Multivalued Attributes

Split into multiple relations and pivot the values to rows.

Before:
ID    Name   Course
100   Umar   CS101, SE103, CS201

After:
ID    Name        ID    Course
100   Umar        100   CS101
                  100   SE103
                  100   CS201
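The pivot can be sketched as a small transformation. This assumes the multivalued attribute arrives as one comma-separated string, as in the table above; the relation names `student` and `enrollment` are hypothetical.

```python
# Non-1NF input: one row per student, all courses packed into one string.
rows = [(100, "Umar", "CS101, SE103, CS201")]

student = set()     # (ID, Name)
enrollment = []     # (ID, Course), one row per course

for sid, name, courses in rows:
    student.add((sid, name))
    for course in courses.split(","):
        # strip() removes the space that follows each comma
        enrollment.append((sid, course.strip()))

print(student)
print(enrollment)
```

Each course now occupies its own row, and ID links the two relations.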
Functional Dependencies
Value of a certain set of attributes determines uniquely the value for another set of attributes.

ID    Name
100   Umar
101   Ali
102   Umar

Let R be a relation schema and α and β be two sets of attributes in R, that is, α ⊆ R and β ⊆ R.
The functional dependency α -> β holds on R if and only if
for any legal relation r(R), whenever any two tuples t1 and t2 of r agree on the attributes α, they also agree on the attributes β:
t1[α] = t2[α]  =>  t1[β] = t2[β]
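The definition can be checked mechanically on a relation instance. The sketch below uses a hypothetical helper `fd_holds` and the ID/Name rows from this slide. Note that an instance can only refute a dependency; it cannot prove that the dependency holds for every legal relation.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds on this instance (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Two tuples agreeing on lhs must also agree on rhs.
        if seen.setdefault(key, val) != val:
            return False
    return True


rows = [
    {"ID": 100, "Name": "Umar"},
    {"ID": 101, "Name": "Ali"},
    {"ID": 102, "Name": "Umar"},
]

print(fd_holds(rows, ["ID"], ["Name"]))   # ID -> Name
print(fd_holds(rows, ["Name"], ["ID"]))   # Name -> ID
```

ID -> Name holds on these rows, while Name -> ID fails because two different IDs share the name Umar.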
FD Example

On this instance, Customer_ID -> Customer_Address holds, but not the other way round.
What else holds?

Order_ID  Order_Date  Customer_ID  Customer_Name      Customer_Address  Product_ID  Product_Name  Product_Finish  Unit_Price  Ordered_Quantity
106       10.02.2016  2            Value Furniture    F10 Markaz        7           Dining Table  Ash             800         2
106       10.02.2016  2            Value Furniture    F10 Markaz        5           Bed           Oak             900         2
106       10.02.2016  2            Value Furniture    F10 Markaz        4           Writers Desk  Maple           600         1
107       15.02.2016  6            Furniture Gallery  Golra             11          4-Dr Dresser  Oak             500         3
107       15.02.2016  6            Furniture Gallery  Golra             4           Writers Desk  Maple           600         4
108       15.02.2016  2            Value Furniture    F10 Markaz        5           Bed           Maple           900         1
109       16.02.2016  7            Office Valley      Golra             8           Office Desk   Ash             500         4
Closure of a Set of FDs

Given a set F of functional dependencies, there are certain other functional
dependencies that are logically implied by F.
The set of all functional dependencies logically implied by F is the closure of F,
denoted by F+.
F+ is a superset of F.
Finding Closure of a Set of FDs
We can find F+ by applying Armstrong's Axioms:
if β ⊆ α, then α -> β (reflexivity)
if α -> β, then γα -> γβ (augmentation)
if α -> β and β -> γ, then α -> γ (transitivity)
These rules are
sound (generate only functional dependencies that actually hold) and
complete (generate all functional dependencies that hold).
Closure Example

R = (A, B, C, G, H, I)
F = { A -> B
      A -> C
      CG -> H
      CG -> I
      B -> H }
Some members of F+:
A -> H, by transitivity from A -> B and B -> H
AG -> I, by augmenting A -> C with G to get AG -> CG,
and then transitivity with CG -> I
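Enumerating all of F+ by hand grows quickly. One common shortcut, not shown on the slide, is to compute the closure of an attribute set and test whether the right-hand side falls inside it. The sketch below applies that idea to the F given above; the function names are illustrative.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Once the whole left side is known, the right side follows.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def implied(lhs, rhs, fds):
    """True if lhs -> rhs is in F+, i.e. rhs lies inside the closure of lhs."""
    return set(rhs) <= closure(lhs, fds)


# F from the slide; each FD is a (left side, right side) pair of sets.
F = [
    (set("A"), set("B")),
    (set("A"), set("C")),
    (set("CG"), set("H")),
    (set("CG"), set("I")),
    (set("B"), set("H")),
]

print(implied("A", "H", F))    # A -> H
print(implied("AG", "I", F))   # AG -> I
print(implied("A", "I", F))    # A -> I, with G missing on the left
```

The first two checks succeed, matching the derivations above. The third fails because A alone never yields G, so neither CG dependency can fire.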
More on FDs

A functional dependency α -> β is trivial if β ⊆ α
{Reg_Num, Name, Program} -> {Name, Program}
We are interested in analyzing non-trivial FDs
K is a superkey for relation schema R iff K -> R
K is a candidate key for R iff
K -> R, and
for no α ⊂ K, α -> R
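The two conditions for a candidate key translate directly into a brute-force search: try attribute subsets from smallest to largest, keep those whose closure is all of R, and skip any subset that already contains a smaller key. This is a sketch for small schemas only, since it examines every subset. It reuses R and F from the Closure Example, and the helper names are assumptions.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(R, fds):
    """Return all minimal attribute sets K with K -> R."""
    R = set(R)
    keys = []
    for size in range(1, len(R) + 1):
        for combo in combinations(sorted(R), size):
            k = set(combo)
            if any(key <= k for key in keys):
                continue          # contains a smaller key, so not minimal
            if closure(k, fds) == R:
                keys.append(k)    # superkey with no smaller key inside it
    return keys


R = "ABCGHI"
F = [
    (set("A"), set("B")),
    (set("A"), set("C")),
    (set("CG"), set("H")),
    (set("CG"), set("I")),
    (set("B"), set("H")),
]

keys = candidate_keys(R, F)
print(keys)
```

For this schema the search finds a single candidate key, AG. Neither A nor G appears on any right-hand side, so both must be part of every key.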
Checkpoint

Find the functional dependencies for the following relation.
Compute F+ and also find the candidate key(s).

Reg_Num  Name    Program  Grade  Course_Title  Completed
100      Umar    MS-CS    B      SemWeb        24-01-2009
100      Umar    MS-CS    B      Inf Ret       06-05-2009
140      Ali     MS-EE    B+     Net Sec       06-05-2009
110      Usman   MS-IT    A      Adv DB        15-02-2010
110      Usman   MS-IT    B      Data Mining   24-01-2009
190      Ali     MS-IT    ABS    Data Mining
150      Fatima  MS-EE    B      Net Sec       06-01-2009
150      Fatima  MS-EE    C+     Ad. Net Com   06-01-2009
Checkpoint 2

Compute F+ for the following relation

Order_ID  Order_Date  Customer_ID  Customer_Name      Customer_Address  Product_ID  Product_Name  Product_Finish  Unit_Price  Ordered_Quantity
106       10.02.2016  2            Value Furniture    F10 Markaz        7           Dining Table  Ash             800         2
106       10.02.2016  2            Value Furniture    F10 Markaz        5           Bed           Oak             900         2
106       10.02.2016  2            Value Furniture    F10 Markaz        4           Writers Desk  Maple           600         1
107       15.02.2016  6            Furniture Gallery  Golra             11          4-Dr Dresser  Oak             500         3
107       15.02.2016  6            Furniture Gallery  Golra             4           Writers Desk  Maple           600         4
108       15.02.2016  2            Value Furniture    F10 Markaz        6           Bed           Maple           900         1
109       16.02.2016  7            Office Valley      Golra             8           Office Desk   Ash             500         4
Checkpoint 2 - Solution

Order_ID -> Order_Date, Customer_ID, Customer_Name, Customer_Address
Customer_ID -> Customer_Name, Customer_Address
Product_ID -> Product_Description, Product_Finish, Unit_Price
Order_ID, Product_ID -> Order_Quantity
Keys and Functional Dependency

Full dependency: A non-key attribute is functionally dependent on the entire primary key.
Partial dependency: A non-key attribute is functionally dependent on only part of the primary key.
Transitive dependency: A non-key attribute is functionally dependent on another non-key attribute.
Second Normal Form

A relation is in 2NF if it is in 1NF and every non-key attribute is fully
functionally dependent on the ENTIRE primary key.
No partial functional dependencies: Every non-key attribute must be defined by
the entire set of PK attributes, NOT by part of the key.
Every relation whose primary key consists of just one attribute is automatically
in Second Normal Form.
Achieving 2NF

Remove partial dependencies by splitting the relation into multiple relations
where full dependency is achieved.
There may still be transitive dependencies, requiring some more work to
achieve third normal form.
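Before splitting, it helps to list which non-key attributes depend on only part of the key. The sketch below does this for the order relation, using the dependencies from the Checkpoint 2 solution. The primary key (Order_ID, Product_ID) is an assumption made for the illustration, and the helper names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(primary_key, attributes, fds):
    """Map each proper subset of the key to the non-key attributes it determines."""
    non_key = set(attributes) - set(primary_key)
    found = {}
    for size in range(1, len(primary_key)):
        for part in combinations(sorted(primary_key), size):
            determined = closure(part, fds) & non_key
            if determined:
                found[part] = determined
    return found


FDS = [
    ({"Order_ID"}, {"Order_Date", "Customer_ID", "Customer_Name", "Customer_Address"}),
    ({"Customer_ID"}, {"Customer_Name", "Customer_Address"}),
    ({"Product_ID"}, {"Product_Description", "Product_Finish", "Unit_Price"}),
    ({"Order_ID", "Product_ID"}, {"Order_Quantity"}),
]
ATTRS = set().union(*(lhs | rhs for lhs, rhs in FDS))
PK = ("Order_ID", "Product_ID")   # assumed primary key

partials = partial_dependencies(PK, ATTRS, FDS)
for part, attrs in partials.items():
    print(part, "->", sorted(attrs))
```

Each entry suggests one new relation keyed by that part of the primary key. Only Order_Quantity depends on the whole key, so it stays in the relation keyed by (Order_ID, Product_ID).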
Third Normal Form

The relation is in 2NF and has no transitive dependencies.
A non-key determinant and the attributes that depend on it go into
a new table; the non-key determinant becomes the primary key
in the new table and stays as a foreign key in the old table.
Third Normal Form

A relation schema R is in 3NF iff for all non-trivial
functional dependencies α -> β in F+
- either α is a superkey for R,
- or each attribute A in β - α is contained in a candidate key for R.
Example schema not in 3NF:
sales = ( order_id, order_date, product_id, quantity )
- F+ = { order_id -> order_date, ... }
- order_id is not a superkey, and order_date is not part of a candidate key
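The 3NF condition can be tested per dependency. The sketch below checks only the dependencies supplied in F rather than all of F+, which is a standard simplification. It uses only the one dependency visible on the slide, so the candidate key it computes is larger than it would be with the full, elided set. All helper names are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(R, fds):
    """Brute-force search for all minimal attribute sets that determine R."""
    R = set(R)
    keys = []
    for size in range(1, len(R) + 1):
        for combo in combinations(sorted(R), size):
            k = set(combo)
            if not any(key <= k for key in keys) and closure(k, fds) == R:
                keys.append(k)
    return keys


def violations_3nf(R, fds):
    """Return the FDs in fds that break the 3NF condition."""
    R = set(R)
    prime = set().union(*candidate_keys(R, fds))   # attributes in some key
    bad = []
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue                  # trivial dependency
        if closure(lhs, fds) >= R:
            continue                  # left side is a superkey
        if (rhs - lhs) <= prime:
            continue                  # every dependent attribute is prime
        bad.append((lhs, rhs))
    return bad


SALES = {"order_id", "order_date", "product_id", "quantity"}
F = [({"order_id"}, {"order_date"})]   # only the FD shown on the slide

bad = violations_3nf(SALES, F)
print(bad)
```

The dependency order_id -> order_date is reported because order_id does not determine the whole relation and order_date appears in no candidate key.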
Boyce-Codd Normal Form

A relation schema R is in BCNF iff for all non-trivial
functional dependencies α -> β in F+,
α is a superkey for R.
If a relation is in BCNF, it is in 3NF as well.

Whenever a set of attributes of R determines another
attribute, it should determine all the attributes of R.
Example schema not in BCNF:
sales = ( order_id, order_date, product_id, quantity )
- F+ = { order_id -> order_date, ... }
- order_id is not a superkey
Decomposing a Schema into BCNF

Suppose we have a schema R and a non-trivial
dependency α -> β causes a violation of BCNF.
We decompose R into:
(α U β)
(R - (β - α))
In the previous example of order_id -> order_date:
α = order_id
β = order_date
So sales is replaced by
(α U β) = (order_id, order_date), and
(R - (β - α)) = (order_id, product_id, quantity)
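One decomposition step can be sketched as follows: find a dependency whose left side is not a superkey, then return one relation holding both sides of that dependency and one holding everything except the dependent attributes. The sketch also applies a standard lossless-join test, which is not stated on the slide: the shared attributes must determine all of one of the two parts. The function `bcnf_split` is a hypothetical name.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_split(R, fds):
    """Split R on the first FD in fds that violates BCNF; None if there is none."""
    R = set(R)
    for alpha, beta in fds:
        if beta <= alpha:
            continue                          # trivial dependency
        if closure(alpha, fds) >= R:
            continue                          # alpha is a superkey
        return alpha | beta, R - (beta - alpha)
    return None


SALES = {"order_id", "order_date", "product_id", "quantity"}
F = [({"order_id"}, {"order_date"})]

r1, r2 = bcnf_split(SALES, F)
print(sorted(r1))
print(sorted(r2))

# Lossless-join check: the common attributes determine one whole side.
common = r1 & r2
lossless = closure(common, F) >= r1 or closure(common, F) >= r2
print("lossless:", lossless)
```

The split keeps order_id in both relations, and because order_id determines everything in the first relation, joining the two parts gives back exactly the original tuples. A full BCNF decomposition would repeat this step on each part until no violation remains.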
3NF Relation Not Meeting BCNF

Only in rare cases does a 3NF relation not meet the
requirements of BCNF.

Department         Campus      Head
Computer Science   Lahore      Umar
Bio Informatics    Islamabad   Usman
Computer Science   Islamabad   Usman
Mathematics        Islamabad   Fatima
Statistics         Islamabad   Fatima

Functional Dependencies
Head -> Campus
Department, Campus -> Head
Note: multiple overlapping candidate keys
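The two dependencies above are enough to see why this relation is in 3NF but not in BCNF. The sketch below finds the candidate keys by brute force and then evaluates both conditions for each dependency; the helper names are assumptions, and only the dependencies in F are examined.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(R, fds):
    """Brute-force search for all minimal attribute sets that determine R."""
    R = set(R)
    keys = []
    for size in range(1, len(R) + 1):
        for combo in combinations(sorted(R), size):
            k = set(combo)
            if not any(key <= k for key in keys) and closure(k, fds) == R:
                keys.append(k)
    return keys


def classify(R, fds):
    """For each FD return (lhs, rhs, satisfies BCNF, satisfies 3NF)."""
    R = set(R)
    prime = set().union(*candidate_keys(R, fds))
    report = []
    for lhs, rhs in fds:
        bcnf_ok = rhs <= lhs or closure(lhs, fds) >= R
        nf3_ok = bcnf_ok or (rhs - lhs) <= prime
        report.append((sorted(lhs), sorted(rhs), bcnf_ok, nf3_ok))
    return report


R = {"Department", "Campus", "Head"}
F = [
    ({"Head"}, {"Campus"}),
    ({"Department", "Campus"}, {"Head"}),
]

keys = candidate_keys(R, F)
report = classify(R, F)
print(keys)
for line in report:
    print(line)
```

The search returns two candidate keys that overlap on Department: (Department, Campus) and (Department, Head). Head -> Campus fails the BCNF test because Head alone is not a superkey, yet it passes the 3NF test because Campus belongs to a candidate key.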
Checkpoint

Decompose the following relation to BCNF:

Grade_Report (RegNum, StudentName, Program,
Semester, CourseCode, CourseTitle, CreditHours,
InstructorID, InstructorName, Grade)
Functional dependencies
RegNum -> StudentName, Program
CourseCode -> CourseTitle, CreditHours
InstructorID -> InstructorName
Semester, CourseCode -> InstructorID
RegNum, Semester, CourseCode -> Grade
Checkpoint

CNIC      Name   License_Num   Reg_Num   Ticket_Num   Fine_Date                Violation_Code   Violation_Summary   Fine
3302609   Ali    R9087         RLA-456   5634         17/10/2008               1                Parking             300
3302609   Ali    R9087         RLA-456   6017         13/11/2008               2                Speed               500
3029417   Umar   L4720         LHR-567   4983         05/10/2008, 13/11/2008   3                Signal              900
3029417   Umar   L4720         LHR-567   6293         18/10/2008               2                Speed               500
3029417   Umar   L4720         LHR-567   7892         13/12/2008               4                Lane                400
3039399   Ali    L4441         LHR-123   8080         13/12/2008               2                Speed               500
Normalization Steps

1NF: Remove multi-valued attributes
2NF: Remove partial dependencies
3NF: Remove transitive dependencies
BCNF: Remove anomalies of multiple candidate keys
