Normalization Paper

Database Normalization
What is Normalization?
Normalization is the process of efficiently organizing data in a database. There are two goals of the
normalization process:
eliminating redundant data (for example, storing the same data in more than one table) and
ensuring data dependencies make sense (only storing related data in a table).
Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that
data is logically stored.
The Normal Forms
The database community has developed a series of guidelines for ensuring that databases are
normalized. These are referred to as normal forms and are numbered from one (the lowest form of
normalization, referred to as first normal form or 1NF) through five (fifth normal form or 5NF). In practical
applications, we'll often see 1NF, 2NF, and 3NF along with the occasional 4NF.
First Normal Form (1NF)

First normal form (1NF) sets the very basic rules for an organized database:
Eliminate duplicative columns from the same table.
Create separate tables for each group of related data and identify each row with a unique column
or set of columns (the primary key).
Second Normal Form (2NF)
Second normal form (2NF) further addresses the concept of removing duplicative data:
Meet all the requirements of the first normal form.
Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
Create relationships between these new tables and their predecessors through the use of foreign
keys.
Third Normal Form (3NF)
Third normal form (3NF) goes one large step further:
Meet all the requirements of the second normal form.
Remove columns that are not dependent upon the primary key.
Fourth Normal Form (4NF)
Finally, fourth normal form (4NF) has one additional requirement:
Meet all the requirements of the third normal form.
A relation is in 4NF if it has no multi-valued dependencies.
Converting a database to the first normal form is rather simple. This first rule calls for the elimination of
repeating groups of data through the creation of separate tables of related data.
For example, lets think of a table. the original table contains several sets of repeating groups of data,
studentID studentName Major college collegeLocation classID ClassName professorID professorName
Each attribute can be repeated no of times times, allowing for each student to take multiple classes.
However, what if the student takes more than three classes? This, and other restrictions on this table should
be obvious.
First Normal Form

Therefore, lets break this mammoth table down into several smaller tables. The first table contains solely
student information (Student):
studentID
studentName
Major
college
collegeLocation
The second table contains solely class information (Class):
studentID
classID
className
The third table contains solely professor information (Professor):
professorID
professorName
Second Normal Form

Once we have separated the data into their respective tables, we can begin concentrating upon the rule of
Second Normal Form; that is, the elimination of redundant data. Referring back to the Class table, typical
data stored within might look like:
studentID
classID className
134-56-7890 M148
Math 148
123-45-7894 P113
Physics 113
534-98-9009 H151
History 151
134-56-7890 H151
History 151
While this table structure is certainly improved over the original, we can notice that there is still room for
improvement. In this case, the className attribute is being repeated. With 60,000 students stored in this
table, performing an update to reflect a recent change in a course name could be somewhat of a problem.
Therefore, we can create a separate table that contains classID to className mappings (ClassIdentity):
classID
className
M148
Math 148
P113
Physics 113
H151
History 151
The updated Class table would then be simply:
studentID
classID
134-56-7890 M148
123-45-7894 P113
534-98-9009 H151
134-56-7890 H151
Revisiting the need to update a recently changed course name, all that it would take is the simple update of
one row in the ClassIdentity table! Of course, substantial savings in disk space would also result, due to this
elimination of redundancy.
Third Normal Form
Continuing on the quest for complete normalization of the school system database, the next step in the
process would be to satisfy the rule of the Third Normal Form. This rule seeks to eliminate all attributes from
a table that are not directly dependent upon the primary key. In the case of the Student table, the college
and collegeLocation attributes are less dependent upon the studentID than they are on the major attribute.
Therefore, we can create a new table that relates the major, college and collegeLocation information:
major
College
collegeLocation
The revised Student table would then look like:
studentID
studentName
Major
Although for most cases these three Normal Forms sufficiently satisfy the requirements set for proper
database normalization, there are still other Forms that go beyond what rules have been set thus far.

Normalization Paper

Transféré par

Informations du document

Copyright

Formats disponibles

Partager ce document

Partager ou intégrer le document

Options de partage

Avez-vous trouvé ce document utile ?

Ce contenu est-il inapproprié ?

Droits d'auteur :

Formats disponibles

Normalization Paper

Transféré par

Droits d'auteur :

Formats disponibles

Database Normalization

First Normal Form (1NF)

studentID studentName Major college collegeLocation classID ClassName professorID professorName

First Normal Form

The second table contains solely class information (Class):

The third table contains solely professor information (Professor):

Second Normal Form

The updated Class table would then be simply:

The revised Student table would then look like:

Vous aimerez peut-être aussi