Normalization

• Normalization is the process of organizing data so as to reduce unnecessary data redundancy while preserving information.

• A normal form is a measure of the design quality of a relation schema, and hence of a relational database.

Normal forms & Normal tests
• The normal form of a relation schema is the highest normal form satisfied by the schema.
• There are various normal forms, namely first normal form, second normal form, third normal form, and Boyce-Codd normal form, together with normal tests to verify whether a relation schema is in a desired normal form.

First normal form (1NF)
A relation schema is said to be in first
normal form if all its attributes are atomic.

A schema which is not in first normal form

Dlocations is a multi-valued attribute.

Conversion into first normal form

The above relation schema is decomposed into two relation schemas.

Now DEPARTMENT & DEPT_LOCATIONS are in 1NF.
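
As a rough illustration of this decomposition, the Python sketch below splits a DEPARTMENT relation whose Dlocations attribute is multi-valued into two 1NF relations. The sample rows reuse the department values that appear in the table later in these slides; the function name `to_1nf` and the dictionary/tuple representation are assumptions made only for this example.

```python
# Non-1NF rows: Dlocations holds a list of values (a multi-valued attribute).
department_non_1nf = [
    {"Dname": "Research", "Dnumber": 5, "Dmgr_ssn": "333445555",
     "Dlocations": ["Bellaire", "Sugarland", "Houston"]},
    {"Dname": "Administration", "Dnumber": 4, "Dmgr_ssn": "987654321",
     "Dlocations": ["Stafford"]},
    {"Dname": "Headquarters", "Dnumber": 1, "Dmgr_ssn": "888665555",
     "Dlocations": ["Houston"]},
]


def to_1nf(rows):
    """Split rows into DEPARTMENT (no Dlocations) and DEPT_LOCATIONS."""
    # DEPARTMENT keeps only the atomic attributes; Dnumber stays the key.
    department = [
        {name: value for name, value in row.items() if name != "Dlocations"}
        for row in rows
    ]
    # DEPT_LOCATIONS has one (Dnumber, Dlocation) tuple per location.
    dept_locations = [
        (row["Dnumber"], location)
        for row in rows
        for location in row["Dlocations"]
    ]
    return department, dept_locations


department, dept_locations = to_1nf(department_non_1nf)
print(dept_locations)
```

Every value in both results is atomic, and the two relations can be rejoined on Dnumber.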

Conversion into first normal form
An alternative to decomposition also exists: you may expand the primary key by incorporating the multi-valued attribute into it.

This solution has the disadvantage of introducing redundancy in the relation.
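
A minimal sketch of this alternative, assuming the same sample departments that appear in the table later in these slides: each location gets its own row in a single relation keyed by (Dnumber, Dlocation), so Dname and Dmgr_ssn are repeated once per location. The variable names are illustrative only.

```python
# Each source row: (Dname, Dnumber, Dmgr_ssn, list of locations).
departments = [
    ("Research", 5, "333445555", ["Bellaire", "Sugarland", "Houston"]),
    ("Administration", 4, "987654321", ["Stafford"]),
    ("Headquarters", 1, "888665555", ["Houston"]),
]

# One 1NF row per (department, location) pair.
flat = [
    (dname, dnumber, mgr_ssn, location)
    for dname, dnumber, mgr_ssn, locations in departments
    for location in locations
]

# The expanded primary key (Dnumber, Dlocation) is unique across rows.
keys = [(dnumber, location) for _, dnumber, _, location in flat]
assert len(keys) == len(set(keys))

# Redundancy: the Research name and manager are stored once per location.
research_rows = [row for row in flat if row[1] == 5]
print(len(research_rows))
```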

Conversion into first normal form
If the maximum number of values of the multi-valued attribute is known, you may replace the multi-valued attribute with that many single-valued attributes.
In the example, instead of using Dlocations, you may use three attributes, namely
Dlocation1, Dlocation2, Dlocation3,
assuming that Dlocations can have at most three values.
This solution has the disadvantage of introducing NULL values if most departments have fewer than three locations. It also introduces spurious semantics about the ordering among the location values that was not originally intended, and querying on this attribute becomes more difficult.

Multi-valued attribute replaced

DEPARTMENT
Dname           Dnumber  Dmgr_ssn   Dlocation1  Dlocation2  Dlocation3
Research        5        333445555  Bellaire    Sugarland   Houston
Administration  4        987654321  Stafford    NULL        NULL
Headquarters    1        888665555  Houston     NULL        NULL
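
To make the drawbacks of this design concrete, the sketch below stores the table as Python tuples, with `None` standing in for NULL. The helper `departments_in` is a hypothetical name; it shows that a simple question ("which departments are located in a given city?") has to inspect all three location columns.

```python
# (Dname, Dnumber, Dmgr_ssn, Dlocation1, Dlocation2, Dlocation3)
# None stands in for a NULL value.
DEPARTMENT = [
    ("Research", 5, "333445555", "Bellaire", "Sugarland", "Houston"),
    ("Administration", 4, "987654321", "Stafford", None, None),
    ("Headquarters", 1, "888665555", "Houston", None, None),
]


def departments_in(city):
    """Names of departments that list city in any of the three columns."""
    return [row[0] for row in DEPARTMENT if city in row[3:6]]


# Count the NULLs introduced by departments with fewer than three locations.
null_count = sum(value is None for row in DEPARTMENT for value in row[3:6])

print(departments_in("Houston"))
print(null_count)
```

Houston appears in the third column for one department and in the first column for another, which is the spurious ordering the slide warns about.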

Conversion into first normal form
First normal form does not allow complex attributes either.

Conversion into first normal form
Decompose

into two relations

Multiple multi-valued attributes

This relation is NOT in 1NF, so decompose it into two relations, namely

The presence of multiple complex attributes can be dealt with in a similar fashion.

First, the composite part of each complex attribute is replaced by its components; then the multi-valued aspect of each component is dealt with in the above manner.

Second normal form (2NF)

A relation schema R is said to be in second normal form if

(i) it is in first normal form and

(ii) there is no partial dependency on the primary key of R, that is, no non-key attribute is functionally dependent on a proper subset of the key.

Example

FD1: {Ssn, Pnumber} → {Hours} (It is a full functional dependency.)

FD2: {Ssn} → {Ename} (It is a partial functional dependency.)

FD3: {Pnumber} → {Pname, Plocation} (It is a partial functional dependency.)

EMP_PROJ is already in 1NF because all its attributes are atomic.

But it is NOT in 2NF because of the partial dependencies FD2 and FD3.
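
The 2NF test on EMP_PROJ can be sketched in Python as follows. The code computes the attribute closure of every proper subset of the primary key and reports any non-key attribute that such a subset determines. The function names `closure` and `partial_dependencies`, and the representation of dependencies as pairs of sets, are assumptions for this example; it also assumes the relation has a single key, as in the slide.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(attributes, key, fds):
    """Pairs (key subset, non-key attributes it determines)."""
    non_key = attributes - key
    found = []
    # Only proper, non-empty subsets of the key can cause a partial dependency.
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = closure(subset, fds) & non_key
            if determined:
                found.append((set(subset), determined))
    return found


EMP_PROJ = {"Ssn", "Pnumber", "Hours", "Ename", "Pname", "Plocation"}
KEY = {"Ssn", "Pnumber"}
FDS = [
    ({"Ssn", "Pnumber"}, {"Hours"}),        # FD1
    ({"Ssn"}, {"Ename"}),                   # FD2
    ({"Pnumber"}, {"Pname", "Plocation"}),  # FD3
]

violations = partial_dependencies(EMP_PROJ, KEY, FDS)
print(violations)
```

An empty result means the relation passes the 2NF test under these assumptions; here FD2 and FD3 are both reported.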

Decomposition into 2NF

• To bring the schema EMP_PROJ into 2NF, we decompose it with respect to each partial functional dependency.
• Decomposition with respect to {Ssn} → {Ename} results in
R1(Ssn, Ename) &
R2(Ssn, Pnumber, Hours, Pname, Plocation).
• R1 is in 2NF, but R2 is not because of the partial dependency {Pnumber} → {Pname, Plocation}.
• So decompose R2 with respect to {Pnumber} → {Pname, Plocation}.

Decomposition into 2NF

• The decomposition of R2 with respect to {Pnumber} → {Pname, Plocation} results in
R3(Pnumber, Pname, Plocation) & R4(Ssn, Pnumber, Hours).
• So decomposing EMP_PROJ with respect to its partial dependencies results in three relation schemas, namely
R1(Ssn, Ename), R3(Pnumber, Pname, Plocation) & R4(Ssn, Pnumber, Hours).
• All these relations are in 2NF. (In fact, they are in higher normal forms than 2NF.)

Decomposed relation schemas of EMP_PROJ

R1(Ssn, Ename)    R3(Pnumber, Pname, Plocation)

R4(Ssn, Pnumber, Hours)

Decomposition rule
• If you want to decompose a relation R with respect to a functional dependency X → Y, then one of the relations will be R1 = X ∪ Y and the other will be R2 = R − Y.

• The primary key of R1 is X, and the primary key of R2 is the primary key of R.
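
The rule translates almost directly into set operations. The sketch below is one possible rendering, with a hypothetical function name `decompose`; it assumes X and Y are disjoint, as they are in the EMP_PROJ dependencies, and replays the two decomposition steps from the earlier slides.

```python
def decompose(relation, x, y):
    """Decompose relation w.r.t. X -> Y into R1 = X | Y and R2 = R - Y."""
    relation, x, y = set(relation), set(x), set(y)
    r1 = x | y          # primary key of R1 is X
    r2 = relation - y   # R2 keeps the primary key of the original relation
    return r1, r2


emp_proj = {"Ssn", "Pnumber", "Hours", "Ename", "Pname", "Plocation"}

# Step 1: decompose with respect to {Ssn} -> {Ename}.
r1, r2 = decompose(emp_proj, {"Ssn"}, {"Ename"})

# Step 2: decompose R2 with respect to {Pnumber} -> {Pname, Plocation}.
r3, r4 = decompose(r2, {"Pnumber"}, {"Pname", "Plocation"})

print(sorted(r1), sorted(r3), sorted(r4))
```

Because X stays in R2, the two parts share X and can be rejoined on it.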

Third normal form (3NF)
A relation schema R is said to be in third normal
form if

(i) it is in second normal form and

(ii) there is no transitive dependency in R.

Example
Consider the relation schema

This schema is in 2NF because all its attributes are atomic and there is no partial dependency.

But it is not in 3NF because of the transitive dependency {Dnumber} → {Dname, Dmgr_ssn}.

Decomposition
• So decompose the schema with respect to the transitive dependency.

• After decomposition, one relation is R1(Dnumber, Dname, Dmgr_ssn) and the other is R2(Ename, Ssn, Address, Dnumber).
• R1 and R2 are in 3NF.
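
The check behind this example can be sketched in Python. Two things are assumed here because the slides do not state them in text: the original schema is taken to be the union of the attributes of R1 and R2, and Ssn is taken to be its only key. The function name `third_nf_violations` is hypothetical, and the test is simplified to a single key (a full 3NF test would consider every candidate key).

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def third_nf_violations(attributes, key, fds):
    """FDs X -> A where X is not a superkey and A is a non-key attribute."""
    violations = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) >= attributes
        non_key_rhs = rhs - key - lhs
        if not is_superkey and non_key_rhs:
            violations.append((lhs, non_key_rhs))
    return violations


# Assumed schema: union of the attributes of R1 and R2.
EMP_DEPT = {"Ename", "Ssn", "Address", "Dnumber", "Dname", "Dmgr_ssn"}
FDS = [
    ({"Ssn"}, {"Ename", "Address", "Dnumber"}),  # assumed: Ssn is the key
    ({"Dnumber"}, {"Dname", "Dmgr_ssn"}),        # transitive part
]

before = third_nf_violations(EMP_DEPT, {"Ssn"}, FDS)

# After decomposition, each relation is checked with its own dependency.
after_r1 = third_nf_violations(
    {"Dnumber", "Dname", "Dmgr_ssn"}, {"Dnumber"}, [FDS[1]])
after_r2 = third_nf_violations(
    {"Ename", "Ssn", "Address", "Dnumber"}, {"Ssn"}, [FDS[0]])

print(before, after_r1, after_r2)
```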

Boyce-Codd normal form (BCNF)
A relation schema is said to be in BCNF if
(i) it is in third normal form and
(ii) no key attribute depends on a non-key attribute.
More generally, a relation schema is in BCNF if the left-hand side of every nontrivial functional dependency is a superkey.

FD1: AB → C
FD2: C → B

This relation schema is NOT in BCNF because of the dependency C → B.

Decompose R into R1(C, B) & R2(A, C)
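
A minimal sketch of the BCNF test for this example, using the superkey formulation: a nontrivial dependency violates BCNF when the closure of its left-hand side does not cover the whole relation. The function name `bcnf_violations` is hypothetical, and the code only inspects the dependencies it is given rather than every implied dependency.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Nontrivial FDs whose left-hand side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not closure(lhs, fds) >= attributes
    ]


R = {"A", "B", "C"}
FDS = [
    ({"A", "B"}, {"C"}),  # FD1
    ({"C"}, {"B"}),       # FD2
]

violations = bcnf_violations(R, FDS)

# After decomposing with respect to C -> B:
r1_violations = bcnf_violations({"C", "B"}, [({"C"}, {"B"})])
r2_violations = bcnf_violations({"A", "C"}, [])

print(violations, r1_violations, r2_violations)
```

Only FD2 is reported for R, and neither R1 nor R2 has a violation. Note that after this decomposition FD1 (AB → C) can no longer be checked within a single relation.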


Another example
• Consider another relation schema TEACH(Student#, Course#, Instructor#), and suppose that Instructor# → Course#.
• This functional dependency means that an instructor can teach at most one course.
• Since Course# is a key attribute that depends on the non-key attribute Instructor#, the schema TEACH is NOT in BCNF.
• Decomposition of TEACH with respect to Instructor# → Course# results in
R1(Instructor#, Course#) & R2(Student#, Instructor#)

Exercise problem 1
Consider the following relation:
CAR_SALE(Car#, Date_sold, Salesperson#, Commission%,
Discount_amt).
Assume that a car may be sold by multiple
salespeople, and hence {Car#, Salesperson#} is
the primary key.
Additional dependencies are Date_sold → Discount_amt,
Salesperson# → Commission%.
Based on the given primary key, is this relation in 1NF, 2NF,
or 3NF? Why or why not? How would you successively
normalize it completely?
Ans. R1(Salesperson#, Commission%), R2(Date_sold, Discount_amt),
R3(Car#, Date_sold, Salesperson#)

Exercise problem 2
Consider the following relation for published books:
BOOK (Book_title, Author_name, Book_type, List_price,
Author_affil, Publisher).
Author_affil refers to the affiliation of author.
Suppose that the following dependencies exist:
Book_title → Publisher, Book_type
Book_type → List_price
Author_name → Author_affil.

What normal form is the relation in? Explain your answer.

Apply normalization until you cannot decompose the relations further. State the reasons behind each decomposition.
Ans. R1(Book_title, Book_type, Publisher), R2(Book_type, List_price), R3(Author_name, Author_affil), R4(Book_title, Author_name)

Exercise problem 3
Consider the following relation:
CAR_SALE (Car_id, Option_type, Option_listprice, Sale_date,
Option_discountedprice).
This relation refers to options installed in cars (e.g., cruise control)
that were sold at a dealership, and the list and discounted prices of
the options.

Let the following additional functional dependencies hold on the relation: Car_id → Sale_date, Option_type → Option_listprice.

What normal form is the relation in? Explain your answer.

Apply normalization until you cannot decompose the relations further. State the reasons behind each decomposition.
Ans. R1(Car_id, Sale_date), R2(Option_type, Option_listprice),
R3(Car_id, Option_type, Option_discountedprice)

Exercise problem 4
Consider the universal relation
R = {A, B, C, D, E}
and the set of functional dependencies
F = {AB → C, A → DE}.
What is the key for R? Decompose R into 2NF and
then 3NF relations.

Ans. Key = AB
D = {R1(A, D, E), R2(A, B, C)}

Exercise problem 4a
Consider the universal relation
R = {A, B, C, D, E, G, H, I, J, K}
and the set of functional dependencies
F = {AB → C, A → DE, B → K, K → GH, D → IJ}.
What is the key for R? Decompose R into the
highest possible normal form.

Ans. Key: AB
D = {R1(A, B, C), R2(A, D, E), R3(D, I, J), R4(B, K), R5(K, G, H)}

Exercise problem 5
Consider the universal relation
R = {A, B, C, D, E, F, G, H, I, J, K}
and the set of functional dependencies
F = {AB → C, BD → EF, AD → GH, A → I, H → J}.
What is the key for R? Decompose R into the
highest possible normal form.

Ans. Key: ABDK (K appears in no functional dependency, so it must belong to the key)

D = {R1(A, B, C), R2(B, D, E, F), R3(A, D, G, H), R4(A, I), R5(H, J), R6(A, B, D, K)}.

Exercise problem 6
Consider a relation R with five attributes ABCDE. You are
given the following dependencies:
A → B, BC → E, and ED → A.
i) List all keys for R.
ii) Is R in 3NF?
iii) Is R in BCNF?
Decompose the schema into the highest possible normal
form, if required.

Ans. The keys are K1 = CED, K2 = BCD & K3 = ACD.

D = {R1(A, B), R2(D, E, A), R3(C, D, E)}

Partial Solution 6
A+ = AB ≠ R
(BC)+ = BCE ≠ R
(ED)+ = EDAB ≠ R
(ABC)+ = ABCE ≠ R
(AED)+ = AEDB ≠ R
(BCED)+ = R, so BCED is a superkey of R
(CED)+ = CEDAB = R and (CE)+ ≠ R, (CD)+ ≠ R, (ED)+ ≠ R.
So CED is a key of R.
Similarly, show that BCD and ACD are also keys of R.
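
The closure computations above can be automated. The sketch below is a brute-force search, workable only for small schemas, that tries attribute sets in order of increasing size and keeps those whose closure is all of R and that contain no smaller key. The function names and the string encoding of dependencies (for example `("BC", "E")` for BC → E) are assumptions for this example.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """All minimal attribute sets whose closure is the whole relation."""
    attributes = set(attributes)
    keys = []
    # Smaller sets are tried first, so any superkey that does not contain
    # an already-found key is itself minimal.
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return ["".join(sorted(key)) for key in keys]


FDS = [("A", "B"), ("BC", "E"), ("ED", "A")]

print("".join(sorted(closure("ED", FDS))))
print(candidate_keys("ABCDE", FDS))
```

Enumerating every subset this way lists all minimal keys, including any that a manual search might overlook.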

Exercise problem 7
Consider the attribute set R = ABCDEGH and the FD set
F = {AB → C, AC → B, AD → E, B → D, BC → A, E → G}.
i) List all keys for R.
ii) Is R in 3NF?
iii) Is R in BCNF?
Decompose the schema into the highest possible normal
form, if required.

Ans. The keys are: ABH, ACH, BCH.

D = {R1(A, B, C), R2(A, D, E), R3(B, D), R4(E, G), R5(A, B, H)}
