Normalisation

Normalisation is the process by which we make sure that the data
model we are designing contains all the information we want, that
this information is accessible to us, and that data is stored with
as little redundancy as possible. Doing this entails returning to
steps 1 to 5 until a solid design is achieved.
Step 1
The idea behind normalisation is quite simple. All occurrences of an
entity must contain the same number of fields (attributes); this is
called the first normal form. It excludes variable repeating groups.
In our E-R diagram example, students may study a number of subjects,
but this might imply a relation with a variable number of fields.
For example, assume students could attend as many subjects as they
wished. Here is a sample relation:
STUDENT(Student Number#, Name, Address, Subject 1, Subject 2, Subject 3, ...)
This design is imprecise because the number of fields varies: some
students would have four fields, some five, and so on. The solution
is to turn this into two relations:
STUDENT(Student Number#, Name, Address)
COURSE(Student Number#, Subject Number#, ...)
Each entity in the second relation, COURSE, represents an individual
student taking a particular subject, and could have further
attributes such as attendance or mark achieved. In this way the
many-to-many relationship of students to subjects is made into two
one-to-many relationships: STUDENT to COURSE and SUBJECT to COURSE.
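
As a minimal sketch of this step, the Python below takes student records that carry a variable-length list of subjects and splits them into fixed-width STUDENT and COURSE rows. The sample data, the field names and the helper `to_first_normal_form` are illustrative assumptions, not part of the original example.

```python
# Hypothetical un-normalised records: the number of subjects varies per student.
raw_students = [
    {"student_number": 1, "name": "Ann", "address": "1 High St", "subjects": [101, 102, 103]},
    {"student_number": 2, "name": "Bob", "address": "2 Low Rd", "subjects": [101]},
]


def to_first_normal_form(records):
    """Split records with a repeating group into STUDENT and COURSE rows."""
    student_rows = []
    course_rows = []
    for record in records:
        # STUDENT keeps only the fixed attributes.
        student_rows.append((record["student_number"], record["name"], record["address"]))
        # COURSE gets one row per (student, subject) pair.
        for subject_number in record["subjects"]:
            course_rows.append((record["student_number"], subject_number))
    return student_rows, course_rows


student_rows, course_rows = to_first_normal_form(raw_students)
print(student_rows)
print(course_rows)
```

Every STUDENT row now has exactly three fields and every COURSE row exactly two, however many subjects a student takes.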
Step 2
Any non-key field must provide a fact about the key; this summarises
the second and third normal forms. We can break this down a bit
further. For SECOND NORMAL FORM, the question to resolve is whether a
non-key field is a fact about only a subset of the key. Consider:
COURSE(Student Number#, Subject Number#, Exam Mark, Lecture Hours)
In this example, Lecture Hours is a fact about the subject, which is
only one part of the composite key. It really belongs in the SUBJECT
relation. If the design above were adhered to, the lecture hours
would be repeated many times, and any change would require many
updates to the database. Also, if no students were taking a
particular subject, perhaps temporarily during the vacation, there
would be no possibility of storing the lecture hours data.
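
The two problems described above can be sketched in a few lines of Python. The rows, marks and hours below are invented for illustration, and plain tuples and a dict stand in for the COURSE and SUBJECT relations.

```python
# Hypothetical COURSE rows that violate second normal form:
# (student_number, subject_number, exam_mark, lecture_hours)
course_unnormalised = [
    (1, 101, 62, 30),
    (2, 101, 71, 30),
    (3, 101, 55, 30),
    (1, 102, 48, 20),
]

# Lecture hours depend on subject_number alone, so move them to SUBJECT.
subject = {}   # subject_number -> lecture_hours
course = []    # (student_number, subject_number, exam_mark)
for student_number, subject_number, exam_mark, lecture_hours in course_unnormalised:
    subject[subject_number] = lecture_hours  # stored once per subject
    course.append((student_number, subject_number, exam_mark))

# Changing the hours for subject 101 is now a single update ...
subject[101] = 36
# ... and a subject with no students can still record its lecture hours.
subject[103] = 24

print(subject)
print(course)
```

In the un-normalised list, changing the hours for subject 101 would have meant touching three rows, and subject 103 could not have been recorded at all.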
Step 3
The THIRD NORMAL FORM is violated if a non-key field is a fact about
another non-key field, as in:
TEACHER(Staff Number#, Department, Building)
Building may be a fact about the teacher (where their lab is: OK) or
about the department (where the department office is: not OK). A
better design is:
TEACHER(Staff Number#, Department)
DEPARTMENT(Department#, Office Location)
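
A small Python sketch of this decomposition follows, assuming that Building records the department office (the "not OK" case). The staff numbers, department names and buildings are made up, and `office_location` is a hypothetical helper.

```python
# Hypothetical TEACHER rows that violate third normal form:
# (staff_number, department, building)
teacher_unnormalised = [
    (10, "Physics", "Newton Building"),
    (11, "Physics", "Newton Building"),
    (12, "History", "Old Hall"),
]

teacher = {}      # staff_number -> department
department = {}   # department -> office location
for staff_number, dept, building in teacher_unnormalised:
    teacher[staff_number] = dept
    department[dept] = building  # one fact per department, stored once


def office_location(staff_number):
    """Find a teacher's department office by following the department key."""
    return department[teacher[staff_number]]


print(office_location(11))
```

If the Physics office moves, only one entry in `department` changes, rather than one row per Physics teacher.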
Overall this process of normalisation
helps to ensure that information is
stored once only, and that
inconsistencies do not occur in a
database when data is added or
deleted.
EXAMPLE
The resulting set of relations after normalisation might be as
follows:
STUDENT(Student Number#, Name, Address, Staff Number of Tutor)
COURSE(Student Number#, Subject Number#, Exam Mark)
SUBJECT(Subject Number#, Lecture Hours)
TEACHER(Staff Number#, Name, Subject, Department)
DEPARTMENT(Department#, Office Location, Location)
Note how the relationships are supported by using the key or keys of
one table as non-key attributes of another; for example, TEACHER
contains the attribute Department, which links it to the DEPARTMENT
relation. Even so, this example may still have problems. Consider the
relation TEACHER: is it in third normal form? We are also still in
the situation that each teacher teaches only one subject, and that
may not be realistic.
Having worked through from an E-R model to a set of normalised
relations, it should be quite easy to implement the design using a
database package. The only step remaining is to determine the exact
form in which each field will be stored (e.g. as integers, real
numbers, character strings, dates, etc.). This is information we
should find in the data dictionary.
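
As one possible illustration, the sketch below implements the example relations with Python's built-in `sqlite3` module. The column names, the storage types (INTEGER, TEXT, REAL) and the sample rows are assumptions chosen for the example, and the Subject attribute of TEACHER and the second Location attribute of DEPARTMENT are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked

conn.executescript("""
CREATE TABLE department (
    department_id   TEXT PRIMARY KEY,
    office_location TEXT
);
CREATE TABLE teacher (
    staff_number  INTEGER PRIMARY KEY,
    name          TEXT,
    department_id TEXT REFERENCES department(department_id)
);
CREATE TABLE subject (
    subject_number INTEGER PRIMARY KEY,
    lecture_hours  REAL
);
CREATE TABLE student (
    student_number     INTEGER PRIMARY KEY,
    name               TEXT,
    address            TEXT,
    tutor_staff_number INTEGER REFERENCES teacher(staff_number)
);
CREATE TABLE course (
    student_number INTEGER REFERENCES student(student_number),
    subject_number INTEGER REFERENCES subject(subject_number),
    exam_mark      INTEGER,
    PRIMARY KEY (student_number, subject_number)
);
""")

conn.execute("INSERT INTO department VALUES ('PHY', 'Newton Building')")
conn.execute("INSERT INTO teacher VALUES (10, 'Dr Smith', 'PHY')")
conn.execute("INSERT INTO student VALUES (1, 'Ann', '1 High St', 10)")

# Follow the keys: student -> tutor -> department.
row = conn.execute("""
    SELECT student.name, teacher.name, department.office_location
    FROM student
    JOIN teacher ON teacher.staff_number = student.tutor_staff_number
    JOIN department ON department.department_id = teacher.department_id
    WHERE student.student_number = 1
""").fetchone()
print(row)
```

The composite primary key on `course` mirrors the two key fields of the COURSE relation, and each `REFERENCES` clause is a key of one table stored as an attribute of another.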
As an exercise, redesign the E-R model to solve all the problems
identified, and then redraw the appropriate E-R diagram.
Question
Provide an entity-relationship model and a set of normalised
relations for the following example.
A database is to be designed to hold information on the activities of
a hospital. The database will hold information on patients in the
hospital. For each patient it will indicate in which ward they are to
be placed and their medical problems.
A patient may have more than one medical problem, in which case each
problem will be the responsibility of one doctor. For each doctor the
database will contain information on their home address, office
address, and home and office phone numbers. For each ward the
database must record the name of the nurse in charge and the
telephone number of the ward office.
Process specifications
Once we have developed a set of data flow diagrams, there will be a
number of bubbles at the lowest level that constitute the basic
processes of the system under analysis. We need to be able to specify
in detail the logic of how these information-processing tasks are to
be carried out. This can be done using the structured English
technique.
If the logic is expressed in this way, the designer or programmer
will find the task straightforward to implement. Other techniques may
also be used for process specifications, such as decision tables or
decision trees.
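
To give a feel for how a decision table translates into code, here is a minimal Python sketch. The process ("determine course result"), its two conditions, the thresholds and the four actions are all hypothetical and are not taken from the system described in these notes.

```python
# Hypothetical decision table for a "determine course result" process.
# Conditions: (mark_is_pass, attendance_ok) -> action
DECISION_TABLE = {
    (True, True): "PASS",
    (True, False): "REFER TO TUTOR",
    (False, True): "RESIT EXAM",
    (False, False): "REPEAT COURSE",
}

PASS_MARK = 40          # assumed pass threshold
MIN_ATTENDANCE = 0.75   # assumed fraction of lectures attended


def course_result(exam_mark, attendance):
    """Evaluate the conditions, then look the action up in the table."""
    conditions = (exam_mark >= PASS_MARK, attendance >= MIN_ATTENDANCE)
    return DECISION_TABLE[conditions]


print(course_result(55, 0.9))
print(course_result(35, 0.5))
```

Because every combination of conditions has an entry, the table makes it easy to check that no case has been overlooked, which is the main appeal of this technique over free-form prose.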
Data dictionary
The data flow diagrams, the process specifications and the data model
each contain a number of distinct types of information collected and
analysed by the systems analyst. In order to keep track of these, it
is generally recommended that a data dictionary is maintained as a
central reference on all the entities, attributes, processes, data
stores, etc. that are contained in a specification.
By establishing a data dictionary, the analyst is able to keep track
of changes that are made and to help ensure the consistency of the
system description.
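
One simple way to picture such a consistency check is sketched below in Python. The dictionary layout, the entries and the helper `undefined_attributes` are assumptions made for illustration; real data dictionary tools hold far more detail.

```python
# Hypothetical data dictionary: one entry per named item in the specification.
data_dictionary = {
    "STUDENT": {"kind": "entity", "attributes": ["Student Number", "Name", "Address"]},
    "COURSE": {"kind": "entity", "attributes": ["Student Number", "Subject Number", "Exam Mark"]},
    "Student Number": {"kind": "attribute", "type": "integer"},
    "Name": {"kind": "attribute", "type": "character string"},
    "Address": {"kind": "attribute", "type": "character string"},
    "Subject Number": {"kind": "attribute", "type": "integer"},
}


def undefined_attributes(dictionary):
    """Return attributes used by an entity but never defined in the dictionary."""
    missing = set()
    for entry in dictionary.values():
        if entry["kind"] == "entity":
            for attribute in entry["attributes"]:
                if attribute not in dictionary:
                    missing.add(attribute)
    return sorted(missing)


print(undefined_attributes(data_dictionary))
```

Here the check reports that Exam Mark is used by COURSE but has no definition of its own, the kind of gap a central reference is meant to expose.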
A set of data flow diagrams, data models, process specifications and
a data dictionary taken together constitute a specification for a new
system. Once this specification has been developed, then the task can
begin.
