Database Normalization 3

DATABASE NORMALIZATION
What You'll Learn

This section of notes covers the process of database normalization, in which relations
(tables) created from the conversion of the E-R model are analyzed for potential flaws
(anomalies) and these flaws are corrected. The following specific topics are covered:
The Relational Model
Functional Dependencies
Keys and Uniqueness
Modification Anomalies
Normalization Process
    o First Normal Form
    o Second Normal Form
    o Third Normal Form
    o Boyce-Codd Normal Form
    o Fourth Normal Form
    o Fifth Normal Form
    o Domain/Key Normal Form
De-Normalization
All-In-One Example of normalization
More Normalization Exercises to try
Textbook Resources
Connolly, Begg, Holowczak: Ch. 8
Database Systems (Connolly & Begg): 5th Ed: Ch. 13 and 14; 6th Ed: Ch. 14 and 15
Rob/Coronel (5th ed.): Ch. 4
Elmasri/Navathe (3rd ed.): Ch. 14 and 15
Kroenke (7th ed.): Ch. 5
Hoffer, Pre..., McFadden: Ch. 5 and A B
The Relational Model

As a reminder, the database development process we are following has these steps:
1. Gather user/business requirements.
2. Develop the conceptual E-R Model (shown as an E-R Diagram) based on the
   user/business requirements.
3. Convert the E-R Model to a set of relations in the (logical) relational model.
4. Normalize the relations to remove any anomalies.
5. Implement the database by creating a table for each normalized relation in a
   relational database management system.
What is Normalization?
Normalization is a process in which we systematically examine relations
for anomalies and, when any are detected, remove them by splitting the relation
into two new, related relations.
Normalization is an important part of the database development process: Often
during normalization, the database designers get their first real look into how the data
are going to interact in the database.
Finding problems with the database structure at this stage is strongly preferred to
finding problems further along in the development process because at this point it is
fairly easy to cycle back to the conceptual model (Entity Relationship model) and make
changes.
Normalization can also be thought of as a trade-off between data redundancy and
performance. Normalizing a relation reduces data redundancy but introduces the need
for joins when all of the data is required by an application such as a report query.
Recall that the Relational Model consists of relations, which are made up of
attributes.
A relation is a set of attributes with values for each attribute such that:
1. Each attribute (column) value must be a single value only.
2. All values for a given attribute (column) must be of the same data type.
3. Each attribute (column) name must be unique.
4. The order of attributes (columns) is insignificant.
5. No two tuples (rows) in a relation can be identical.
6. The order of the tuples (rows) is insignificant.
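As a rough illustration of these properties, the following Python sketch checks a small set of rows for three of them: single values, one data type per attribute, and no identical tuples. The function name relation_problems and the sample data (including the Phone attribute) are hypothetical and not part of these notes. Rows are modeled as dictionaries, so attribute names are unique and unordered by construction.

```python
def relation_problems(rows):
    """Return a list of ways the given rows break the relation properties."""
    problems = []
    if not rows:
        return problems
    attributes = list(rows[0])
    seen = set()
    for i, row in enumerate(rows):
        if set(row) != set(attributes):
            problems.append(f"row {i}: attribute names differ from row 0")
        for name, value in row.items():
            # Property 1: each attribute value must be a single value
            if isinstance(value, (list, tuple, set, dict)):
                problems.append(f"row {i}: {name} is not a single value")
        # Property 5: no two tuples may be identical
        fingerprint = tuple(sorted((name, repr(value)) for name, value in row.items()))
        if fingerprint in seen:
            problems.append(f"row {i}: duplicate tuple")
        seen.add(fingerprint)
    for name in attributes:
        # Property 2: all values of an attribute share one data type
        types = {type(row[name]).__name__ for row in rows if name in row}
        if len(types) > 1:
            problems.append(f"{name}: mixed data types {sorted(types)}")
    return problems


# Hypothetical sample data, invented for this illustration
bad_rows = [
    {"CustomerID": "C101", "Phone": "555-0100"},
    {"CustomerID": "C102", "Phone": ["555-0101", "555-0102"]},  # two values in one column
    {"CustomerID": "C101", "Phone": "555-0100"},                # repeats row 0
]
good_rows = bad_rows[:1]

problems = relation_problems(bad_rows)
for problem in problems:
    print(problem)
```

A check like this can only flag violations in the data at hand. Whether a table is a proper relation is ultimately a design decision.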
From our discussion of E-R Modeling, we know that an Entity typically
corresponds to a relation and that the Entity's attributes become attributes of the relation.
We also discussed how, depending on the relationships between entities, copies
of attributes (the identifiers) were placed in related relations as foreign keys.
The next step is to identify functional dependencies within each relation.
Functional Dependencies
A Functional Dependency describes a relationship between attributes within a
single relation.
An attribute is functionally dependent on another if we can use the value of one
attribute to determine the value of another.
Example: Employee_Name is functionally dependent on Social_Security_Number
because Social_Security_Number can be used to uniquely determine the value of
Employee_Name.
We use the arrow symbol → to indicate a functional dependency.
X → Y is read "X functionally determines Y".
Here are a few more examples:
Student_ID → Student_Major
Student_ID, CourseNumber, Semester → Grade
Course_Number, Section → Professor, Classroom, NumberOfStudents
SKU → Compact_Disk_Title, Artist
CarModel, Options, TaxRate → Car_Price
The attributes listed on the left-hand side of the → are called determinants.
One can read A → B as "A determines B". Or, more specifically: given a value for A, we
can uniquely determine one value for B.
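The definition above can be tested against sample data. The following Python sketch checks whether one set of attributes determines another in a given list of rows. The function name fd_holds and the enrollment rows are assumptions made up for this illustration. Note that sample data can only show that a dependency is violated; it cannot prove that the dependency holds for all future data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if, in these rows, each value of lhs maps to one value of rhs."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # setdefault records the first right-hand value seen for this left-hand value
        if seen.setdefault(left, right) != right:
            return False
    return True


# Hypothetical sample data, invented for this illustration
enrollment = [
    {"Student_ID": "S1", "Student_Major": "CIS", "CourseNumber": "DB101",
     "Semester": "F01", "Grade": "A"},
    {"Student_ID": "S1", "Student_Major": "CIS", "CourseNumber": "DB102",
     "Semester": "S02", "Grade": "B"},
    {"Student_ID": "S2", "Student_Major": "ACC", "CourseNumber": "DB101",
     "Semester": "F01", "Grade": "B"},
]

# Student_ID determines Student_Major in this sample
print(fd_holds(enrollment, ["Student_ID"], ["Student_Major"]))  # True
# Student_ID alone does not determine Grade: S1 has both an A and a B
print(fd_holds(enrollment, ["Student_ID"], ["Grade"]))  # False
# The full determinant does determine Grade
print(fd_holds(enrollment, ["Student_ID", "CourseNumber", "Semester"], ["Grade"]))  # True
```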
Keys and Uniqueness
Key: One or more attributes that uniquely identify a tuple (row) in a relation.
The selection of keys will depend on the particular application being considered.
In most cases the key for a relation will already be specified during the conversion
from the E-R model to a set of relations.
Users can also offer some guidance as to what would make an appropriate key.
Recall that no two tuples (rows) in a relation should have exactly the same values; thus,
at a minimum, a candidate key could consist of all of the attributes in a relation.
A key functionally determines a tuple (row). So one functional dependency that
can always be written is:
The Key → All other attributes
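To make the idea of uniqueness concrete, here is a minimal Python sketch that searches sample rows for the smallest attribute combinations whose values never repeat. The function names and the course-section data are hypothetical. A combination that is unique in a sample is only a candidate: as noted above, the final choice of key depends on the application.

```python
from itertools import combinations


def is_unique(rows, attrs):
    """True if no two rows share the same values for attrs."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)


def minimal_keys(rows, attributes):
    """Smallest attribute combinations that are unique in this sample."""
    found = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            # Skip combinations that merely extend a smaller key already found
            if any(set(key) <= set(combo) for key in found):
                continue
            if is_unique(rows, combo):
                found.append(combo)
    return found


# Hypothetical sample data, invented for this illustration
sections = [
    {"Course_Number": "CIS9340", "Section": "A", "Professor": "Smith"},
    {"Course_Number": "CIS9340", "Section": "B", "Professor": "Jones"},
    {"Course_Number": "CIS9350", "Section": "A", "Professor": "Smith"},
    {"Course_Number": "CIS9340", "Section": "C", "Professor": "Smith"},
]

keys = minimal_keys(sections, ["Course_Number", "Section", "Professor"])
print(keys)  # [('Course_Number', 'Section')]
```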
Modification Anomalies
Once our E-R model has been converted into relations, we may find that some
relations are not properly specified. There can be a number of problems:
o Deletion Anomaly: Deleting one fact or data point from a relation results
  in other information being lost.
o Insertion Anomaly: Inserting a new fact or tuple into a relation requires
  that we have information from two or more entities; this situation might not be feasible.
o Update Anomaly: Updating one fact in a relation requires us to update
  multiple tuples.
Anomaly Example 1
Here is an example to illustrate these anomalies. Consider a very common
CUSTOMER relation:
CUSTOMER(CustomerID, CustomerName, Street, City, State, PostalCode)
In the United States, the PostalCode (or ZipCode) references a specific City and
State so one might have data such as:
CustomerID  Name          Street          City           State  PostalCode
C101        Bill Smith    123 First St.   New Brunswick  NJ     07101
C102        Mary Green    11 Birch St.    Old Bridge     NJ     07066
C103        Ted Jones     3 Academy St.   Old Bridge     NJ     07066
C104        Sally Taylor  446 First Ave.  New Brunswick  NJ     07101
C105        Mary Miller   44 Toga Ct.     Farmingdale    NY     11735
Insertion Anomaly: What happens if we go to add a new Customer: C106, Joe
Feldman, 99 Ninth St., Springfield, NJ?
What we know about Joe is that he lives in Springfield, NJ (one fact), but we may not
know his PostalCode.
We will need to get that additional fact (the fact that the PostalCode for Springfield, NJ is
07081).
Deletion Anomaly: What happens if we delete customer C105? Then we not only
remove the customer information but we also remove (lose) the fact that Farmingdale,
NY has postal code 11735.
Update (Modification) Anomaly: It is possible that when a town grows in population, its
zip code will be split into two (or more) new zip codes.
For example, if Old Bridge, NJ splits its zip code, then we will have to update many
different tuples even though we are only changing one fact about Old Bridge's zip code.
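The three anomalies can be observed directly on the sample CUSTOMER data. The Python sketch below uses the five rows from the table above; the helper name known_postal_facts is an assumption introduced for this illustration.

```python
# Sample CUSTOMER rows from the table above:
# (CustomerID, Name, Street, City, State, PostalCode)
customers = [
    ("C101", "Bill Smith", "123 First St.", "New Brunswick", "NJ", "07101"),
    ("C102", "Mary Green", "11 Birch St.", "Old Bridge", "NJ", "07066"),
    ("C103", "Ted Jones", "3 Academy St.", "Old Bridge", "NJ", "07066"),
    ("C104", "Sally Taylor", "446 First Ave.", "New Brunswick", "NJ", "07101"),
    ("C105", "Mary Miller", "44 Toga Ct.", "Farmingdale", "NY", "11735"),
]


def known_postal_facts(rows):
    """Set of (PostalCode, City, State) facts recorded in the relation."""
    return {(r[5], r[3], r[4]) for r in rows}


# Insertion anomaly: the new tuple is incomplete until we look up a second fact
new_customer = ("C106", "Joe Feldman", "99 Ninth St.", "Springfield", "NJ", None)
print(new_customer[5] is None)  # True

# Deletion anomaly: removing C105 also removes the only record of 11735
after_delete = [r for r in customers if r[0] != "C105"]
lost = known_postal_facts(customers) - known_postal_facts(after_delete)
print(lost)  # {('11735', 'Farmingdale', 'NY')}

# Update anomaly: one change to Old Bridge's code touches several tuples
rows_to_update = [r for r in customers if r[5] == "07066"]
print(len(rows_to_update))  # 2
```

With only five customers the update touches two tuples; in a realistic customer table the same single fact could be repeated in thousands of rows.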
Anomaly Example 2
Here is another example to illustrate anomalies: A company has a Purchase
Order form.
Our dutiful consultant creates the E-R Model directly matching the purchase
order.
When we follow the steps to convert to a set of relations, this results in two
relations (the key of PO_HEADER is PO_Number; the key of LINE_ITEMS is PO_Number
together with ItemNum):
PO_HEADER (PO_Number, PODate, Vendor, Ship_To, ...)
LINE_ITEMS (PO_Number, ItemNum, PartNum, Description, Price, Qty)
Consider some sample data for the LINE_ITEMS relation:

PO_Number  ItemNum  PartNum  Description  Price
O101       I01      P99      Plate        $3.00
O101       I02      P98      Cup          $1.00
O101       I03      P77      Bowl         $2.00
O102       I01      P99      Plate        $3.00
O102       I02      P77      Bowl         $2.00
O103       I01      P33      Fork         $2.50
What are some of the problems with this relation?
1. What happens if we want to add the fact that Order O103 has quantity 5 of
   part P99?
2. What happens when we delete item I02 from Order O101?
3. What happens if we want to change the price of the Plate (P99)?
These problems occur because the relation in question contains data about two or
more themes.
The typical way to solve these anomalies is to split the relation into two or more
relations. This is part of the process called Normalization, discussed next.
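As a preview of what such a split might look like, the Python sketch below uses the standard sqlite3 module to separate the LINE_ITEMS sample data into two tables. The parts table, the choice of (PO_Number, ItemNum) as the line-item key, and the new price of 3.25 are assumptions made for this illustration, not the decomposition given in these notes. Qty is left out because the sample data does not show it.

```python
import sqlite3

# Sample LINE_ITEMS rows: (PO_Number, ItemNum, PartNum, Description, Price)
line_items = [
    ("O101", "I01", "P99", "Plate", 3.00),
    ("O101", "I02", "P98", "Cup", 1.00),
    ("O101", "I03", "P77", "Bowl", 2.00),
    ("O102", "I01", "P99", "Plate", 3.00),
    ("O102", "I02", "P77", "Bowl", 2.00),
    ("O103", "I01", "P33", "Fork", 2.50),
]

con = sqlite3.connect(":memory:")
# Hypothetical split: one table per theme (parts, and items on an order)
con.execute("CREATE TABLE parts (PartNum TEXT PRIMARY KEY, Description TEXT, Price REAL)")
con.execute("""CREATE TABLE line_items (
                   PO_Number TEXT,
                   ItemNum   TEXT,
                   PartNum   TEXT REFERENCES parts(PartNum),
                   PRIMARY KEY (PO_Number, ItemNum))""")

# Each part fact is stored once; INSERT OR IGNORE skips the repeated parts
con.executemany("INSERT OR IGNORE INTO parts VALUES (?, ?, ?)",
                [(part, desc, price) for _, _, part, desc, price in line_items])
con.executemany("INSERT INTO line_items VALUES (?, ?, ?)",
                [(po, item, part) for po, item, part, _, _ in line_items])

# Question 3: changing the price of the Plate now touches exactly one row
updated = con.execute("UPDATE parts SET Price = 3.25 WHERE PartNum = 'P99'").rowcount
print(updated)  # 1

# Question 2: deleting item I02 from order O101 no longer loses the Cup's price
con.execute("DELETE FROM line_items WHERE PO_Number = 'O101' AND ItemNum = 'I02'")
cup = con.execute("SELECT Description, Price FROM parts WHERE PartNum = 'P98'").fetchone()
print(cup)  # ('Cup', 1.0)

# The cost of the split: a join is needed to see items with their descriptions
report = con.execute("""SELECT li.PO_Number, li.ItemNum, p.Description, p.Price
                        FROM line_items li JOIN parts p ON p.PartNum = li.PartNum
                        ORDER BY li.PO_Number, li.ItemNum""").fetchall()
print(report[0])  # ('O101', 'I01', 'Plate', 3.25)
```

The final query shows the trade-off described earlier: redundancy is gone, but a report that needs descriptions and prices alongside order items now requires a join.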
