
KEYS

Chapter Objectives
The purpose of normalization
Data redundancy and update anomalies
Functional dependencies
The process of normalization
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)

Chapter Objectives (2)


General definition of Second and Third Normal Form
Boyce-Codd Normal Form (BCNF)
Fourth Normal Form (4NF)
Fifth Normal Form (5NF)

What is Normalization?
A database designed from the E-R model may still contain some amount of:
• Inconsistency
• Uncertainty
• Redundancy
To eliminate these drawbacks, the design has to be refined. This refinement process is called normalization.
Normalization is a step-by-step process of decomposing a complex relation into simple and stable data structures.
It is a formal process that can be followed to achieve a good database design, and it is also used to check that an existing design is of good quality.
The different stages of normalization are known as normal forms.
To carry out normalization we need to understand the concept of functional dependencies.

The Purpose of Normalization


Normalization is a technique for producing a set of relations with desirable properties, given the data requirements of an enterprise. The process of normalization is a formal method that identifies relations based on their primary or candidate keys and the functional dependencies among their attributes.

Update Anomalies
Relations that contain redundant data may suffer from problems called update anomalies, which are classified as:
Insertion anomalies
Deletion anomalies
Modification anomalies

Example of Update Anomalies

Consider the following operations on the StaffBranch relation:
To insert a new member of staff with branchNo B007: the branch address must be re-entered and must agree with the existing B007 tuple.
To delete the tuple that represents the last member of staff located at branch B007: the address of B007 is lost as well.
To change the address of branch B003: every tuple for staff at B003 has to be updated.

StaffBranch
staffNo  sName        position    salary  branchNo  bAddress
SL21     John White   Manager     30000   B005      22 Deer Rd, London
SG37     Ann Beech    Assistant   12000   B003      163 Main St, Glasgow
SG14     David Ford   Supervisor  18000   B003      163 Main St, Glasgow
SA9      Mary Howe    Assistant   9000    B007      16 Argyll St, Aberdeen
SG5      Susan Brand  Manager     24000   B003      163 Main St, Glasgow
SL41     Julie Lee    Assistant   9000    B005      22 Deer Rd, London

Figure 1 StaffBranch relation

Example of Update Anomalies (2)


Staff
staffNo  sName        position    salary  branchNo
SL21     John White   Manager     30000   B005
SG37     Ann Beech    Assistant   12000   B003
SG14     David Ford   Supervisor  18000   B003
SA9      Mary Howe    Assistant   9000    B007
SG5      Susan Brand  Manager     24000   B003
SL41     Julie Lee    Assistant   9000    B005

Branch
branchNo  bAddress
B005      22 Deer Rd, London
B007      16 Argyll St, Aberdeen
B003      163 Main St, Glasgow

Figure 2 Staff and Branch relations

Functional Dependencies
Functional dependency describes the relationship between attributes in a relation. For example, if A and B are attributes of relation R, B is functionally dependent on A (denoted A → B) if each value of A is associated with exactly one value of B. (A and B may each consist of one or more attributes.)

[Diagram: A → B, read as "B is functionally dependent on A"]

Determinant
Refers to the attribute or group of attributes on the left-hand side of the arrow of a functional dependency.
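The definition can be tried out against sample data. The following is a minimal sketch, not part of the original slides: the helper name `holds` is an assumption, and the rows are a subset of the columns of Figure 1. Note that a sample can only refute a dependency; it cannot prove that the dependency holds for all time.

```python
# Rows taken from the StaffBranch relation (Figure 1); some columns omitted.
STAFF_BRANCH = [
    {"staffNo": "SL21", "position": "Manager", "branchNo": "B005", "bAddress": "22 Deer Rd, London"},
    {"staffNo": "SG37", "position": "Assistant", "branchNo": "B003", "bAddress": "163 Main St, Glasgow"},
    {"staffNo": "SG14", "position": "Supervisor", "branchNo": "B003", "bAddress": "163 Main St, Glasgow"},
    {"staffNo": "SA9", "position": "Assistant", "branchNo": "B007", "bAddress": "16 Argyll St, Aberdeen"},
    {"staffNo": "SG5", "position": "Manager", "branchNo": "B003", "bAddress": "163 Main St, Glasgow"},
    {"staffNo": "SL41", "position": "Assistant", "branchNo": "B005", "bAddress": "22 Deer Rd, London"},
]


def holds(rows, lhs, rhs):
    """Return True if, in these rows, each value of lhs maps to exactly one value of rhs."""
    seen = {}
    for row in rows:
        determinant = tuple(row[a] for a in lhs)
        dependent = tuple(row[a] for a in rhs)
        # setdefault stores the first dependent value seen for this determinant.
        if seen.setdefault(determinant, dependent) != dependent:
            return False
    return True


print(holds(STAFF_BRANCH, ["branchNo"], ["bAddress"]))  # True
print(holds(STAFF_BRANCH, ["branchNo"], ["staffNo"]))   # False: B005 has two staff
```

Here branchNo is the determinant of bAddress, while branchNo → staffNo is contradicted by the two members of staff at branch B005.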

Functional Dependencies (2)
A trivial functional dependency is one whose right-hand side is a subset (not necessarily a proper subset) of the left-hand side.
For example (see Figure 1):
staffNo, sName → sName
staffNo, sName → staffNo
Such dependencies do not provide any additional information about possible integrity constraints on the values held by these attributes. We are normally more interested in nontrivial dependencies because they represent integrity constraints for the relation.

Functional Dependencies (3)
Main characteristics of the functional dependencies used in normalization:
They have a one-to-one relationship between the attribute(s) on the left- and right-hand sides of a dependency;
They hold for all time;
They are nontrivial.

Functional Dependencies (4)

Identifying the primary key


Functional dependency is a property of the meaning or semantics of the attributes in a relation. When a functional dependency is present, the dependency is specified as a constraint between the attributes. An important integrity constraint to consider first is the identification of candidate keys, one of which is selected to be the primary key for the relation using functional dependency.

Functional Dependencies (5)

Inference Rules
The set of all functional dependencies that are implied by a given set of functional dependencies X is called the closure of X, written X+. A set of inference rules is needed to compute X+ from X.
Armstrong's axioms (rules 1 to 3) and the rules derived from them (rules 4 to 7):
1. Reflexivity: If B is a subset of A, then A → B
2. Augmentation: If A → B, then A,C → B,C
3. Transitivity: If A → B and B → C, then A → C
4. Self-determination: A → A
5. Decomposition: If A → B,C then A → B and A → C
6. Union: If A → B and A → C, then A → B,C
7. Composition: If A → B and C → D, then A,C → B,D
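Enumerating the whole closure X+ is expensive. In practice the rules are usually applied through the closure of a set of attributes, that is, everything those attributes determine under X. The sketch below is an illustration only: the function name `attribute_closure` and the encoding of dependencies as pairs of tuples are assumptions, and the dependencies are the StaffBranch ones from Figure 1.

```python
def attribute_closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies in fds.

    fds is a list of (lhs, rhs) pairs, each side being a tuple of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the determinant is already covered, its right-hand side is
            # determined too (this combines augmentation and transitivity).
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


STAFF_BRANCH_FDS = [
    (("staffNo",), ("sName", "position", "salary", "branchNo")),
    (("branchNo",), ("bAddress",)),
]

print(sorted(attribute_closure({"branchNo"}, STAFF_BRANCH_FDS)))
print(sorted(attribute_closure({"staffNo"}, STAFF_BRANCH_FDS)))
```

The closure of staffNo contains all six attributes of StaffBranch, which is what makes staffNo a key, while branchNo determines only itself and bAddress.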

Functional Dependencies (6)

Minimal Sets of Functional Dependencies

A set of functional dependencies X is minimal if it satisfies the following conditions:
Every dependency in X has a single attribute on its right-hand side.
We cannot replace any dependency A → B in X with a dependency C → B, where C is a proper subset of A, and still have a set of dependencies that is equivalent to X.
We cannot remove any dependency from X and still have a set of dependencies that is equivalent to X.

Functional Dependencies (7)

Example of a Minimal Set of Functional Dependencies

A set of functional dependencies for the StaffBranch relation that satisfies the three conditions for producing a minimal set:
staffNo → sName
staffNo → position
staffNo → salary
staffNo → branchNo
staffNo → bAddress
branchNo → bAddress
branchNo, position → salary
bAddress, position → salary

Functional dependency
In a given relation R, X and Y are attributes. Attribute Y is functionally dependent on attribute X if each value of X determines EXACTLY ONE value of Y, which is represented as X → Y (X can be composite in nature).
We say that X determines Y, or that Y is functionally dependent on X.
X → Y does not imply Y → X.
If the value of the attribute Marks is known, then the value of the attribute Grade is determined, since Marks → Grade.
Types of functional dependencies:
• Full functional dependency
• Partial functional dependency
• Transitive dependency

Functional Dependencies
Consider the following relation:
REPORT (STUDENT#, COURSE#, CourseName, IName, Room#, Marks, Grade)
STUDENT# - Student number
COURSE# - Course number
CourseName - Course name
IName - Name of the instructor who delivered the course
Room# - Room number assigned to the respective instructor
Marks - Scored in course COURSE# by student STUDENT#
Grade - Obtained by student STUDENT# in course COURSE#

Functional Dependencies - From the previous example

STUDENT#, COURSE# → Marks
COURSE# → CourseName
COURSE# → IName (assuming one course is taught by one and only one instructor)
IName → Room# (assuming each instructor has his/her own, non-shared room)
Marks → Grade

Dependency diagram
Report (S#, C#, SName, CTitle, LName, Room#, Marks, Grade)
S# → SName
C# → CTitle
C# → LName
LName → Room#
C# → Room#
S#, C# → Marks
Marks → Grade
S#, C# → Grade

[Dependency diagram over the attributes S#, C#, SName, CTitle, LName, Room#, Marks and Grade]

Assumptions:
Each course has only one lecturer and each lecturer has a room.
Grade is determined from Marks.

Full dependencies
X and Y are attributes (X may be composite). Y is fully functionally dependent on X if X functionally determines Y and no proper subset of X functionally determines Y.

Partial dependencies
X and Y are attributes. Y is partially dependent on X if Y is functionally dependent on a proper subset of X.

Transitive dependencies
X, Y and Z are three attributes. If X → Y and Y → Z, then X → Z, and Z is transitively dependent on X via Y.

Need for Normalization


Student_Course_Result Table
Student_Details                   Course_Details                                              Result_Details
Student#  StudentName  DateOfBirth  Course#  CourseName           PreRequisite       DurationInDays  DateOfExam  Marks  Grade
101       Davis        11/4/1986    M4       Applied Mathematics  Basic Mathematics  7               11/11/2004  82     A
102       Daniel       11/6/1987    M4       Applied Mathematics  Basic Mathematics  7               11/11/2004  62     C
101       Davis        11/4/1986    H6       American History     -                  4               11/22/2004  79     B
103       Sandra       10/2/1988    C3       Bio Chemistry        Basic Chemistry    11              11/16/2004  65     B
104       Evelyn       2/22/1986    B3       Botany               -                  8               11/26/2004  77     B
102       Daniel       11/6/1987    P3       Nuclear Physics      Basic Physics      13              11/12/2004  68     B
105       Susan        8/31/1985    P3       Nuclear Physics      Basic Physics      13              11/12/2004  89     A
103       Sandra       10/2/1988    B4       Zoology              -                  5               11/27/2004  54     D
105       Susan        8/31/1985    H6       American History     -                  4               11/22/2004  87     A
104       Evelyn       2/22/1986    M4       Applied Mathematics  Basic Mathematics  7               11/11/2004  65     B

The table suffers from insert, delete and update anomalies, and from data duplication.

The Process of Normalization


Normalization is often executed as a series of steps. Each step corresponds to a specific normal form that has known properties. As normalization proceeds, the relations become progressively more restricted in format, and also less vulnerable to update anomalies. For the relational data model, it is important to recognize that it is only first normal form (1NF) that is critical in creating relations. All the subsequent normal forms are optional.

First Normal Form (1NF)

Unnormalized form (UNF)
A table that contains one or more repeating groups.
Repeating group = (propertyNo, pAddress, rentStart, rentFinish, rent, ownerNo, oName)

clientNo  cName          propertyNo  pAddress                rentStart  rentFinish  rent  ownerNo  oName
CR76      John Kay       PG4         6 Lawrence St, Glasgow  1-Jul-00   31-Aug-01   350   CO40     Tina Murphy
                         PG16        5 Novar Dr, Glasgow     1-Sep-02   1-Sep-02    450   CO93     Tony Shaw
CR56      Aline Stewart  PG4         6 Lawrence St, Glasgow  1-Sep-99   10-Jun-00   350   CO40     Tina Murphy
                         PG36        2 Manor Rd, Glasgow     10-Oct-00  1-Dec-01    370   CO93     Tony Shaw
                         PG16        5 Novar Dr, Glasgow     1-Nov-02   1-Aug-03    450   CO93     Tony Shaw

Figure 3 ClientRental unnormalized table

Definition of 1NF
First Normal Form is a relation in which the intersection of each row and column contains one and only one value.
There are two approaches to removing repeating groups from unnormalized tables:
1. Remove the repeating groups by entering appropriate data in the empty columns of rows containing the repeating data.
2. Remove the repeating group by placing the repeating data, along with a copy of the original key attribute(s), in a separate relation. A primary key is identified for the new relation.
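Both approaches can be sketched in a few lines. This is an illustration under stated assumptions rather than a prescribed procedure: the unnormalized table is modelled as nested Python data, the key name `rentals` for the repeating group and the function names `flatten` and `split` are invented for the example, and only two of the repeating columns are kept.

```python
# Unnormalized ClientRental data: each client carries a repeating group of rentals.
UNF_CLIENT_RENTAL = [
    {"clientNo": "CR76", "cName": "John Kay", "rentals": [
        {"propertyNo": "PG4", "rent": 350},
        {"propertyNo": "PG16", "rent": 450},
    ]},
    {"clientNo": "CR56", "cName": "Aline Stewart", "rentals": [
        {"propertyNo": "PG4", "rent": 350},
        {"propertyNo": "PG36", "rent": 370},
        {"propertyNo": "PG16", "rent": 450},
    ]},
]


def flatten(rows, group):
    """Approach 1: copy the non-repeating data into every row of the repeating group."""
    flat = []
    for row in rows:
        parent = {k: v for k, v in row.items() if k != group}
        for item in row[group]:
            flat.append({**parent, **item})
    return flat


def split(rows, key, group):
    """Approach 2: move the repeating group to its own relation with a copy of the key."""
    parents = [{k: v for k, v in row.items() if k != group} for row in rows]
    children = [{key: row[key], **item} for row in rows for item in row[group]]
    return parents, children


first_approach = flatten(UNF_CLIENT_RENTAL, "rentals")
clients, property_rentals = split(UNF_CLIENT_RENTAL, "clientNo", "rentals")
print(len(first_approach), len(clients), len(property_rentals))  # 5 2 5
```

The first approach yields one wide relation with redundant client data in every row; the second yields a Client relation and a separate relation keyed on clientNo plus propertyNo.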

First Normal Form: 1NF


A relation schema R is in 1NF if and only if all the attributes of R are atomic in nature.
Atomic: the smallest level to which data may be broken down and remain meaningful.

1NF ClientRental relation with the first approach


With the first approach, we remove the repeating group (property rented details) by entering the appropriate client data into each row. The ClientRental relation is defined as follows:
ClientRental (clientNo, propertyNo, cName, pAddress, rentStart, rentFinish, rent, ownerNo, oName)

clientNo  propertyNo  cName          pAddress                rentStart  rentFinish  rent  ownerNo  oName
CR76      PG4         John Kay       6 Lawrence St, Glasgow  1-Jul-00   31-Aug-01   350   CO40     Tina Murphy
CR76      PG16        John Kay       5 Novar Dr, Glasgow     1-Sep-02   1-Sep-02    450   CO93     Tony Shaw
CR56      PG4         Aline Stewart  6 Lawrence St, Glasgow  1-Sep-99   10-Jun-00   350   CO40     Tina Murphy
CR56      PG36        Aline Stewart  2 Manor Rd, Glasgow     10-Oct-00  1-Dec-01    370   CO93     Tony Shaw
CR56      PG16        Aline Stewart  5 Novar Dr, Glasgow     1-Nov-02   1-Aug-03    450   CO93     Tony Shaw

Figure 4 1NF ClientRental relation with the first approach

1NF ClientRental relation with the second approach


With the second approach, we remove the repeating group (property rented details) by placing the repeating data, along with a copy of the original key attribute (clientNo), in a separate relation.
Client (clientNo, cName)
PropertyRentalOwner (clientNo, propertyNo, pAddress, rentStart, rentFinish, rent, ownerNo, oName)

Client
clientNo  cName
CR76      John Kay
CR56      Aline Stewart

PropertyRentalOwner
clientNo  propertyNo  pAddress                rentStart  rentFinish  rent  ownerNo  oName
CR76      PG4         6 Lawrence St, Glasgow  1-Jul-00   31-Aug-01   350   CO40     Tina Murphy
CR76      PG16        5 Novar Dr, Glasgow     1-Sep-02   1-Sep-02    450   CO93     Tony Shaw
CR56      PG4         6 Lawrence St, Glasgow  1-Sep-99   10-Jun-00   350   CO40     Tina Murphy
CR56      PG36        2 Manor Rd, Glasgow     10-Oct-00  1-Dec-01    370   CO93     Tony Shaw
CR56      PG16        5 Novar Dr, Glasgow     1-Nov-02   1-Aug-03    450   CO93     Tony Shaw

Figure 5 1NF ClientRental relation with the second approach

Full functional dependency


Full functional dependency indicates that if A and B are attributes of a relation, B is fully functionally dependent on A if B is functionally dependent on A, but not on any proper subset of A.
A functional dependency A → B is a partial dependency if there is some attribute that can be removed from A and the dependency still holds.

Student_Course_Result Table
Student_Details                   Course_Details                                              Results
Student#  StudentName  DateOfBirth  Course#  CourseName           PreRequisite       DurationInDays  DateOfExam  Marks  Grade
101       Davis        11/4/1986    M4       Applied Mathematics  Basic Mathematics  7               11/11/2004  82     A
102       Daniel       11/6/1987    M4       Applied Mathematics  Basic Mathematics  7               11/11/2004  62     C
101       Davis        11/4/1986    H6       American History     -                  4               11/22/2004  79     B
103       Sandra       10/2/1988    C3       Bio Chemistry        Basic Chemistry    11              11/16/2004  65     B
104       Evelyn       2/22/1986    B3       Botany               -                  8               11/26/2004  77     B
102       Daniel       11/6/1987    P3       Nuclear Physics      Basic Physics      13              11/12/2004  68     B
105       Susan        8/31/1985    P3       Nuclear Physics      Basic Physics      13              11/12/2004  89     A
103       Sandra       10/2/1988    B4       Zoology              -                  5               11/27/2004  54     D
105       Susan        8/31/1985    H6       American History     -                  4               11/22/2004  87     A
104       Evelyn       2/22/1986    M4       Applied Mathematics  Basic Mathematics  7               11/11/2004  65     B

Student_Course_Result Table in First Normal Form

Second Normal Form (2NF)


Second normal form (2NF) is a relation that is in first normal form and in which every non-primary-key attribute is fully functionally dependent on the primary key.
The normalization of 1NF relations to 2NF involves the removal of partial dependencies. If a partial dependency exists, we remove the functionally dependent attributes from the relation by placing them in a new relation along with a copy of their determinant.

2NF ClientRental relation


The ClientRental relation has the following functional dependencies:
fd1  clientNo, propertyNo → rentStart, rentFinish  (Primary key)
fd2  clientNo → cName  (Partial dependency)
fd3  propertyNo → pAddress, rent, ownerNo, oName  (Partial dependency)
fd4  ownerNo → oName  (Transitive dependency)
fd5  clientNo, rentStart → propertyNo, pAddress, rentFinish, rent, ownerNo, oName  (Candidate key)
fd6  propertyNo, rentStart → clientNo, cName, rentFinish  (Candidate key)

2NF ClientRental relation


Removing the partial dependencies results in the creation of three new relations called Client, Rental, and PropertyOwner:
Client (clientNo, cName)
Rental (clientNo, propertyNo, rentStart, rentFinish)
PropertyOwner (propertyNo, pAddress, rent, ownerNo, oName)

Client
clientNo  cName
CR76      John Kay
CR56      Aline Stewart

Rental
clientNo  propertyNo  rentStart  rentFinish
CR76      PG4         1-Jul-00   31-Aug-01
CR76      PG16        1-Sep-02   1-Sep-02
CR56      PG4         1-Sep-99   10-Jun-00
CR56      PG36        10-Oct-00  1-Dec-01
CR56      PG16        1-Nov-02   1-Aug-03

PropertyOwner
propertyNo  pAddress                rent  ownerNo  oName
PG4         6 Lawrence St, Glasgow  350   CO40     Tina Murphy
PG16        5 Novar Dr, Glasgow     450   CO93     Tony Shaw
PG36        2 Manor Rd, Glasgow     370   CO93     Tony Shaw

Figure 6 2NF ClientRental relation
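The decomposition amounts to projecting the 1NF relation onto each new set of attributes and dropping duplicate rows. The following sketch assumes the helper names `project` and `natural_join` and uses only a subset of the ClientRental columns; it also checks that joining the projections back together reproduces the original rows.

```python
# Subset of the 1NF ClientRental rows; pAddress, rentFinish and owner columns omitted.
CLIENT_RENTAL = [
    {"clientNo": "CR76", "cName": "John Kay", "propertyNo": "PG4", "rentStart": "1-Jul-00", "rent": 350},
    {"clientNo": "CR76", "cName": "John Kay", "propertyNo": "PG16", "rentStart": "1-Sep-02", "rent": 450},
    {"clientNo": "CR56", "cName": "Aline Stewart", "propertyNo": "PG4", "rentStart": "1-Sep-99", "rent": 350},
    {"clientNo": "CR56", "cName": "Aline Stewart", "propertyNo": "PG36", "rentStart": "10-Oct-00", "rent": 370},
    {"clientNo": "CR56", "cName": "Aline Stewart", "propertyNo": "PG16", "rentStart": "1-Nov-02", "rent": 450},
]


def project(rows, attrs):
    """Keep only attrs from each row and drop duplicate rows."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


def natural_join(left, right):
    """Combine rows that agree on every attribute the two relations share."""
    joined = []
    for l_row in left:
        for r_row in right:
            shared = set(l_row) & set(r_row)
            if all(l_row[a] == r_row[a] for a in shared):
                joined.append({**l_row, **r_row})
    return joined


client = project(CLIENT_RENTAL, ["clientNo", "cName"])
rental = project(CLIENT_RENTAL, ["clientNo", "propertyNo", "rentStart"])
property_owner = project(CLIENT_RENTAL, ["propertyNo", "rent"])

rejoined = natural_join(natural_join(rental, client), property_owner)
lossless = len(rejoined) == len(CLIENT_RENTAL) and all(r in rejoined for r in CLIENT_RENTAL)
print(len(client), len(rental), len(property_owner), lossless)  # 2 5 3 True
```

Each client name and each property's rent is now stored once, and no information is lost by the split.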

Second Normal Form: 2NF


A relation is said to be in Second Normal Form if and only if:
• It is in First Normal Form, and
• No partial dependency exists between non-key attributes and key attributes.
An attribute of a relation R that belongs to a candidate key of R is said to be a key attribute; one that does not is a non-key attribute.
To make a table 2NF compliant, we have to remove all the partial dependencies.

Note: All partial dependencies are eliminated.

Key and Non-Key Attributes
Student_Course_Result (Student#, Course#, StudentName, DateOfBirth, CourseName, PreRequisite, DurationInDays, DateOfExam, Marks, Grade)

Key attributes: Student#, Course#
Non-key attributes: StudentName, DateOfBirth, CourseName, PreRequisite, DurationInDays, DateOfExam, Marks, Grade

Second Normal Form


STUDENT# is the key attribute for Student.
COURSE# is the key attribute for Course.
STUDENT# and COURSE# together form the composite key of the Student_Course_Result relation.
Other attributes such as StudentName, DateOfBirth, CourseName, PreRequisite, DurationInDays, DateOfExam, Marks and Grade are non-key attributes.
To make this table 2NF compliant, we have to remove all the partial dependencies.
Student#, Course# → Marks, Grade
Student# → StudentName, DateOfBirth
Course# → CourseName, PreRequisite, DurationInDays, DateOfExam

Second Normal Form


Fully functionally dependent on the composite primary key:
S#, C# → Marks
S#, C# → Grade

Partial dependency with respect to the primary key:
S# → StudentName
S# → DOB

Partial dependency with respect to the primary key:
C# → CourseName
C# → Prerequisite
C# → Duration
C# → DateOfExam

Second Normal Form - Tables in 2 NF


COURSE TABLE
Course#  CourseName           PreRequisite  Duration  DateOfExam
M1       Basic Mathematics    -             11        11-Nov-04
M4       Applied Mathematics  M1            7         11-Nov-04
H6       American History     -             4         22-Nov-04
C1       Basic Chemistry      -             5         16-Nov-04
C3       Bio Chemistry        C1            11        16-Nov-04
B3       Botany               -             8         26-Nov-04
P1       Basic Physics        -             8         12-Nov-04
P3       Nuclear Physics      P1            13        12-Nov-04
B4       Zoology              -             5         27-Nov-04

STUDENT TABLE
Student#  StudentName  DateofBirth
101       Davis        04-Nov-1986
102       Daniel       06-Nov-1987
103       Sandra       02-Oct-1988
104       Evelyn       22-Feb-1986
105       Susan        31-Aug-1985
106       Mike         04-Feb-1987
107       Juliet       09-Nov-1986
108       Tom          07-Oct-1986
109       Catherine    06-Jun-1984

Second Normal form Tables in 2 NF


Report
Student#  Course#  Marks  Grade
101       M4       82     A
102       M4       62     C
101       H6       79     B
103       C3       65     B
104       B3       77     B
102       P3       68     B
105       P3       89     A
103       B4       54     D
105       H6       87     A
104       M4       65     B

Second Normal Form - Tables in 2 NF

COURSE TABLE
Course#  CourseName           PreRequisite  DurationInDays
M1       Basic Mathematics    -             11
M4       Applied Mathematics  M1            7
H6       American History     -             4
C1       Basic Chemistry      -             5
C3       Bio Chemistry        C1            11
B3       Botany               -             8
P1       Basic Physics        -             8
P3       Nuclear Physics      P1            13
B4       Zoology              -             5

STUDENT TABLE
Student#  StudentName  DateofBirth
101       Davis        04-Nov-1986
102       Daniel       06-Nov-1987
103       Sandra       02-Oct-1988
104       Evelyn       22-Feb-1986
105       Susan        31-Aug-1985
106       Mike         04-Feb-1987
107       Juliet       09-Nov-1986
108       Tom          07-Oct-1986
109       Catherine    06-Jun-1984


Second Normal form Tables in 2 NF


Exam_Date Table
Course#  DateOfExam
M4       11-Nov-04
H6       22-Nov-04
C3       16-Nov-04
B3       26-Nov-04
P3       12-Nov-04
B4       27-Nov-04

Third Normal Form (3NF)


Transitive dependency
A condition where A, B, and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C).
Third normal form (3NF)
A relation that is in first and second normal form, and in which no non-primary-key attribute is transitively dependent on the primary key.
The normalization of 2NF relations to 3NF involves the removal of transitive dependencies by placing the attribute(s) in a new relation along with a copy of the determinant.

3NF ClientRental relation


The functional dependencies for the Client, Rental and PropertyOwner relations are as follows:

Client
fd2  clientNo → cName  (Primary key)

Rental
fd1  clientNo, propertyNo → rentStart, rentFinish  (Primary key)
fd5  clientNo, rentStart → propertyNo, rentFinish  (Candidate key)
fd6  propertyNo, rentStart → clientNo, rentFinish  (Candidate key)

PropertyOwner
fd3  propertyNo → pAddress, rent, ownerNo, oName  (Primary key)
fd4  ownerNo → oName  (Transitive dependency)

3NF ClientRental relation


The resulting 3NF relations have the forms:
Client (clientNo, cName)
Rental (clientNo, propertyNo, rentStart, rentFinish)
PropertyOwner (propertyNo, pAddress, rent, ownerNo)
Owner (ownerNo, oName)

3NF ClientRental relation


Client
clientNo  cName
CR76      John Kay
CR56      Aline Stewart

Rental
clientNo  propertyNo  rentStart  rentFinish
CR76      PG4         1-Jul-00   31-Aug-01
CR76      PG16        1-Sep-02   1-Sep-02
CR56      PG4         1-Sep-99   10-Jun-00
CR56      PG36        10-Oct-00  1-Dec-01
CR56      PG16        1-Nov-02   1-Aug-03

PropertyOwner
propertyNo  pAddress                rent  ownerNo
PG4         6 Lawrence St, Glasgow  350   CO40
PG16        5 Novar Dr, Glasgow     450   CO93
PG36        2 Manor Rd, Glasgow     370   CO93

Owner
ownerNo  oName
CO40     Tina Murphy
CO93     Tony Shaw

Figure 7 3NF ClientRental relation

Third Normal Form: 3 NF


A relation R is said to be in Third Normal Form (3NF) if and only if:
• It is in 2NF, and
• No transitive dependency exists between non-key attributes and key attributes.
In the Report table, STUDENT# and COURSE# are the key attributes. All other attributes, except Grade, are non-partially and non-transitively dependent on the key attributes.

Student#, Course# → Marks
Marks → Grade

Grade is therefore transitively dependent on the key via Marks.

Note: All transitive dependencies are eliminated.

3NF Tables
Student#  Course#  Marks
101       M4       82
102       M4       62
101       H6       79
103       C3       65
104       B3       77
102       P3       68
105       P3       89
103       B4       54
105       H6       87

Third Normal Form Tables in 3 NF


MarksGrade
Marks  Grade
82     A
62     C
79     B
65     B
77     B
68     B
89     A
54     D
87     A

Third Normal Form - Tables in 3 NF
GRADE TABLE
UpperBound  LowerBound  Grade
100         95          A+
94          85          A
84          70          B
69          65          B-
64          55          C
54          45          D
44          0           E

Boyce-Codd Normal Form (BCNF)


Boyce-Codd normal form (BCNF)
A relation is in BCNF if, and only if, every determinant is a candidate key.
The difference between 3NF and BCNF is that for a functional dependency A → B, 3NF allows this dependency in a relation if B is a primary-key attribute and A is not a candidate key, whereas BCNF insists that for this dependency to remain in a relation, A must be a candidate key.

Example of BCNF
The ClientInterview relation has the following functional dependencies:
fd1  clientNo, interviewDate → interviewTime, staffNo, roomNo  (Primary key)
fd2  staffNo, interviewDate, interviewTime → clientNo  (Candidate key)
fd3  roomNo, interviewDate, interviewTime → clientNo, staffNo  (Candidate key)
fd4  staffNo, interviewDate → roomNo  (not a candidate key)

The determinant of fd4 is not a candidate key, so the relation is not in BCNF. As a consequence, the ClientInterview relation may suffer from update anomalies. For example, two tuples have to be updated if the roomNo needs to be changed for staffNo SG5 on 13-May-02.

ClientInterview
clientNo  interviewDate  interviewTime  staffNo  roomNo
CR76      13-May-02      10.30          SG5      G101
CR76      13-May-02      12.00          SG5      G101
CR74      13-May-02      12.00          SG37     G102
CR56      1-Jul-02       10.30          SG5      G102

Figure 8 ClientInterview relation
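The BCNF condition can be checked mechanically. The sketch below is an assumption-laden illustration: the function names are invented, and it tests whether each determinant is a superkey (its attribute closure covers the whole relation), which is the usual way of stating the test for nontrivial dependencies.

```python
ATTRIBUTES = {"clientNo", "interviewDate", "interviewTime", "staffNo", "roomNo"}

# The four dependencies of ClientInterview, keyed by their labels.
FDS = {
    "fd1": (("clientNo", "interviewDate"), ("interviewTime", "staffNo", "roomNo")),
    "fd2": (("staffNo", "interviewDate", "interviewTime"), ("clientNo",)),
    "fd3": (("roomNo", "interviewDate", "interviewTime"), ("clientNo", "staffNo")),
    "fd4": (("staffNo", "interviewDate"), ("roomNo",)),
}


def attribute_closure(attrs, fds):
    """Return every attribute determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the labels of dependencies whose determinant does not determine every attribute."""
    pairs = list(fds.values())
    return [
        label
        for label, (lhs, _rhs) in fds.items()
        if attribute_closure(lhs, pairs) != attributes
    ]


print(bcnf_violations(ATTRIBUTES, FDS))  # ['fd4']
```

Only fd4 is reported, because staffNo and interviewDate together determine roomNo but neither clientNo nor interviewTime.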

Example of BCNF(2)
To transform the ClientInterview relation to BCNF, we must remove the violating functional dependency by creating two new relations called Interview and StaffRoom, as shown below:
Interview (clientNo, interviewDate, interviewTime, staffNo)
StaffRoom (staffNo, interviewDate, roomNo)

Interview
clientNo  interviewDate  interviewTime  staffNo
CR76      13-May-02      10.30          SG5
CR76      13-May-02      12.00          SG5
CR74      13-May-02      12.00          SG37
CR56      1-Jul-02       10.30          SG5

StaffRoom
staffNo  interviewDate  roomNo
SG5      13-May-02      G101
SG37     13-May-02      G102
SG5      1-Jul-02       G102

Figure 9 BCNF Interview and StaffRoom relations

Fourth Normal Form (4NF)


Multi-valued dependency (MVD)
Represents a dependency between attributes (for example, A, B and C) in a relation, such that for each value of A there is a set of values for B and a set of values for C. However, the sets of values for B and C are independent of each other.
A multi-valued dependency can be further defined as being trivial or nontrivial.
An MVD A →→ B in relation R is defined as being trivial if B is a subset of A, or A ∪ B = R.
An MVD is defined as being nontrivial if neither of the above two conditions is satisfied.

Fourth Normal Form (4NF)


Fourth normal form (4NF) A relation that is in Boyce-Codd normal form and contains no nontrivial multi-valued dependencies.

Fifth Normal Form (5NF)


Lossless-join dependency
A property of decomposition, which ensures that no spurious tuples are generated when relations are reunited through a natural join operation.
Fifth normal form (5NF)
A relation that has no join dependency.
Join dependency
Describes a type of dependency. For example, for a relation R with subsets of the attributes of R denoted as A, B, ..., Z, a relation R satisfies a join dependency if, and only if, every legal value of R is equal to the join of its projections on A, B, ..., Z.

Merits of Normalization
Normalization is based on a mathematical foundation. Removes the redundancy to a greater extent. After 3NF, data redundancy is

Demerits of Normalization
Data retrieval (SELECT) performance can suffer, because queries have to join more tables.
Normalization might not always represent real-world scenarios.

Summary of Normal Forms


Input               Operation                                                                     Output
Unnormalized table  Create separate rows or columns for every combination of multivalued columns  Table in 1 NF
Table in 1 NF       Eliminate partial dependencies                                                Tables in 2 NF
Tables in 2 NF      Eliminate transitive dependencies                                             Tables in 3 NF

Keys
Candidate key
Primary key
Alternate key
Super key
Foreign key

Keys
Candidate key
A candidate key is a minimal set of one or more attributes that can uniquely identify a row in a given table.


Keys
Primary key


During the creation of the table, the database designer chooses one of the available candidate keys to uniquely identify rows in the given table.

Alternate Key
The candidate key that is chosen to perform the identification task is called the primary key, and the remaining candidate keys are known as alternate keys.
Number of alternate keys = Number of candidate keys - 1

Keys
Super key
Any superset of a candidate key is a super key.
Example: Custid, CName can uniquely distinguish each tuple of the relation from the others, so it satisfies the property of uniqueness. Custid alone can also distinguish each tuple of the relation from the others, so it too satisfies the property of uniqueness. Therefore, Custid is the candidate key and Custid, CName (a superset of the candidate key) is a super key.
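The distinction can be illustrated on sample data. This is only a sketch: the customer rows are hypothetical, the function names are assumptions, and uniqueness in a sample merely suggests a key, since real keys follow from the meaning of the attributes.

```python
from itertools import combinations

# Hypothetical customer rows; two customers share a name.
CUSTOMERS = [
    {"Custid": "C1", "CName": "Ann"},
    {"Custid": "C2", "CName": "Bob"},
    {"Custid": "C3", "CName": "Ann"},
]


def super_keys(rows, attrs):
    """Return every attribute combination whose values are unique across rows."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            values = [tuple(row[a] for a in combo) for row in rows]
            if len(set(values)) == len(values):
                keys.append(set(combo))
    return keys


def candidate_keys(keys):
    """Keep only the minimal super keys: those with no proper subset that is also a super key."""
    return [k for k in keys if not any(other < k for other in keys)]


all_super_keys = super_keys(CUSTOMERS, ["Custid", "CName"])
print(all_super_keys)
print(candidate_keys(all_super_keys))
```

For these rows, both {Custid} and {Custid, CName} are super keys, but only {Custid} is minimal and therefore a candidate key.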

Keys
Foreign key

A foreign key is a set of attribute(s) whose values are required to match values of a candidate key in the same or another table.

EMP (Child / Referencing table)
EmpNo  EName  EDeptNo
1001   Elsa   D1
1002   John   D2
1003   Maria  Null
1004   Maida  D1

DEPT (Parent / Master / Referenced table)
DeptNo  DName
D1      IVS
D2      ENR

Points to remember:
• Foreign key values do not (usually) have to be unique.
• Foreign keys can also be null.
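The matching rule can be expressed as a small check. The sketch below assumes the function name `fk_violations`, represents Null as Python `None`, and adds one hypothetical employee row (EmpNo 1005, department D9) purely to show a violation.

```python
EMP = [
    {"EmpNo": 1001, "EName": "Elsa", "EDeptNo": "D1"},
    {"EmpNo": 1002, "EName": "John", "EDeptNo": "D2"},
    {"EmpNo": 1003, "EName": "Maria", "EDeptNo": None},
    {"EmpNo": 1004, "EName": "Maida", "EDeptNo": "D1"},
]
DEPT = [
    {"DeptNo": "D1", "DName": "IVS"},
    {"DeptNo": "D2", "DName": "ENR"},
]


def fk_violations(child, fk, parent, key):
    """Return child rows whose non-null foreign key value has no match in the parent."""
    valid = {row[key] for row in parent}
    return [row for row in child if row[fk] is not None and row[fk] not in valid]


# The data from the slide is consistent: duplicates and nulls are both allowed.
print(fk_violations(EMP, "EDeptNo", DEPT, "DeptNo"))  # []

# Hypothetical row referencing a department that does not exist.
with_bad_row = EMP + [{"EmpNo": 1005, "EName": "Hypothetical", "EDeptNo": "D9"}]
print([row["EmpNo"] for row in fk_violations(with_bad_row, "EDeptNo", DEPT, "DeptNo")])  # [1005]
```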


Keys

Foreign key - Points to remember
• A foreign key is a set of attributes of a table whose values are required to match values of some candidate key in the same or another table.
• The constraint that values of a given foreign key must match the values of the corresponding candidate key is known as the referential constraint.
• A table which has a foreign key referring to its own candidate key is known as a self-referencing table.

Keys
Non-Key Attributes
The attributes other than the candidate key attributes in a table/relation are called non-key attributes.
OR
The attributes which do not participate in any candidate key.


Exercise on Key attributes


Given a relation R1(X, Y, Z, L), the following attribute(s) can uniquely identify the records of relation R1:
1) X
2) X, L
3) Z, L
Identify the following in relation R1?

Entity Relationship modeling

Database Design Techniques


Top-down approach: E-R modeling
Bottom-up approach: Normalization

ER modeling
ER modeling: A graphical technique for understanding and organizing the data independent of the actual database implementation.
Entity: Anything that may have an independent existence and about which we intend to collect data. Also known as an entity type. E.g.: Trainee
Entity instance: A particular member of the entity type. E.g.: a particular trainee
Attributes: Properties/characteristics that describe entities. E.g.: Trainee name, Batch name, DOB, Address, etc.
Relationships: Associations between entities. E.g.: Trainee belongs to a Batch

Attributes
The set of possible values for an attribute is called the domain of the attribute.
Example:
• The domain of the attribute marital status has four values: single, married, divorced or widowed.
• The domain of the attribute month has twelve values, ranging from January to December.

Key attribute: The attribute (or combination of attributes) that is unique for every entity instance. E.g.: the account number of an account, the employee id of an employee, etc.
If the key consists of two or more attributes in combination, it is called a composite key.

Simple Vs composite attribute


Simple attribute: cannot be divided into simpler components.
E.g.: age of an employee

Composite attribute: can be split into components.
E.g.: date of joining of the employee, which can be split into day, month and year

Single Vs Multi-valued Attributes


Single-valued: can take on only a single value for each entity instance.
E.g.: age of employee. There can be only one value for this.

Multi-valued: can take on many values.
E.g.: skill set of employee

Stored Vs Derived attribute


Stored attribute: an attribute that needs to be stored permanently.
E.g.: name of an employee

Derived attribute: an attribute that can be calculated from other attributes.
E.g.: years of service of an employee can be calculated from the date of joining and the current date
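A derived attribute is typically computed on demand rather than stored. A minimal sketch, assuming a hypothetical `Employee` class and counting only completed years:

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Employee:
    name: str              # stored attribute
    date_of_joining: date  # stored attribute

    def years_of_service(self, today: date) -> int:
        """Derived attribute: completed years between date_of_joining and today."""
        years = today.year - self.date_of_joining.year
        # Subtract one if this year's joining anniversary has not been reached yet.
        if (today.month, today.day) < (self.date_of_joining.month, self.date_of_joining.day):
            years -= 1
        return years


employee = Employee("Elsa", date(2015, 6, 15))
print(employee.years_of_service(date(2024, 6, 14)))  # 8
print(employee.years_of_service(date(2024, 6, 15)))  # 9
```

Because the value depends on the current date, storing it would require constant updates; deriving it avoids that redundancy.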

Regular Vs. Weak entity type


Regular entity: an entity that has its own key attribute(s).
E.g.: employee, student, customer, policy holder, etc.
Weak entity: an entity that depends on another entity for its existence and does not have key attribute(s) of its own.
E.g.: spouse of employee

Relationships
A relationship type between two entity types defines the set of all associations between these entity types.
Each instance of the relationship between members of these entity types is called a relationship instance.
E.g.: if Works-for is the relationship between the Employee entity and the Department entity, then "Rohan works for the IVS department" and "Riya works for the ENR department" are relationship instances of the relationship Works-for.

Degree of a Relationship
Degree: the number of entity types involved
• One: Unary
• Two: Binary
• Three: Ternary
E.g.: employee manager-of employee is unary
employee works-for department is binary
customer purchases item from shopkeeper is a ternary relationship

Cardinality
Relationships can have different connectivity:
• one-to-one (1:1)
• one-to-many (1:N)
• many-to-one (M:1)
• many-to-many (M:N)
E.g.: Employee head-of department (1:1)
Lecturer offers course (1:N), assuming a course is taught by a single lecturer
Student enrolls in course (M:N)

Cardinality One - To - One


[Diagram: instances P1 to P4 of Person mapped to instances C1 to C4 of Chair]

One instance of entity type Person is related to one instance of the entity type Chair.

Cardinality One -to- Many


[Diagram: instances O1 to O3 of Organization mapped to instances E1 to E5 of Employee]

One instance of entity type Organization is related to multiple instances of entity type Employee.

Cardinality Many-to-One
[Diagram: instances E1 to E5 of Employee mapped to instances D1 to D3 of Department]

Reverse of the one-to-many relationship.

Cardinality Many-to-Many
[Diagram: instances S1 to S4 of Student mapped to instances C1 to C4 of Course]

Multiple instances of one entity are related to multiple instances of another entity.

Relationship Participation
Total: every entity instance must be connected through the relationship to an instance of the other participating entity types.

Partial: not all instances need to participate.
E.g.: Employee Head-of Department
Employee: partial
Department: total

ER Modeling - Notations

ER Modeling -Notations
An entity is an object or concept about which the business user wants to store information.
A weak entity is dependent on another entity to exist. Example: Order Item depends upon Order Number for its existence. Without Order Number it is impossible to identify Order Item uniquely.
Attributes are the properties or characteristics of an entity.
A key attribute is the unique, distinguishing characteristic of the entity.
A multi-valued attribute can have more than one value. For example, an employee entity can have multiple skill values.

A derived attribute is based on another attribute. For example, an employee's monthly salary is based on the employee's basic salary and house rent allowance.
Relationships illustrate how two entities share information in the database structure.

To connect a weak Entity with others, you should use a weak relationship notation.

Cardinality specifies how many instances of an Entity relate to one instance of another Entity. M and N both represent MANY, and 1 represents ONE.

In some cases, entities can be self-linked. For example, employees can supervise other employees.

Attributes

[Figure: Employee entity with attributes E#, Name, DOB, Address, Designation]

Key attribute

[Figure: Employee entity with attributes E#, Name, DOB, Address, Designation]

The key attribute is underlined.

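Multivalued Attribute

[Figure: Employee entity with attributes E#, Name, DOB, Address, Designation and the multivalued attribute skill set]

Composite attribute

[Figure: Employee entity with attributes E#, Name, DOB, Address, Designation; Address is composed of floor and building]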

Relationship

[Figure: Student enrols in Course]

Unary Relationship

[Figure: Employee Manages Employee]

Role names
Role names may be added to make the meaning more explicit.

[Figure: Employee Manages Employee, with role names Manager and subordinate]

Binary Relationship

[Figure: Employee Works for Department]

Ternary Relationship

[Figure: Prescription relates Doctor, Patient and Medicine]

Relationship participation

[Figure: Employee (1) head of (1) Department]

Attributes of a Relationship

[Figure: Prescription relates Doctor, Patient and Medicine, and carries the relationship attributes Number of days and dosage]

Weak entity

[Figure: Employee (E#) has dependant (id, name)]

The dependant entity is represented by a double-lined rectangle and the identifying relationship by a double-lined diamond.

Case Study: ER Model for a University DB

Assumptions:
A college contains many departments.
Each department can offer any number of courses.
Many instructors can work in a department.
An instructor can work in only one department.
For each department there is a Head.
An instructor can be head of only one department.
Each instructor can take any number of courses.
A course can be taken by only one instructor.
A student can enroll for any number of courses.
Each course can have any number of students.

Steps in ER Modeling
Step 1: Identify the Entities
Step 2: Find relationships
Step 3: Identify the key attributes for every Entity
Step 4: Identify other relevant attributes
Step 5: Draw complete E-R diagram with all attributes including Primary Key
Step 6: Review your results with your Business users

Step 1: Identify the Entities
DEPARTMENT
STUDENT
COURSE
INSTRUCTOR

Step 2: Find the relationships
One course is taken by multiple students and one student enrolls for multiple courses, hence the cardinality between course and student is Many to Many.
The department offers many courses and each course belongs to only one department, hence the cardinality between department and course is One to Many.
One department has multiple instructors and one instructor belongs to one and only one department, hence the cardinality between department and instructor is One to Many.
One instructor can take many courses and each course is taken by only one instructor, hence the cardinality between instructor and course is One to Many.
Each department has a Head of department and one instructor can be Head of only one department, hence the cardinality is One to One.

Step 3: Identify the key attributes
Deptname is the key attribute for the Entity Department, as it identifies the Department uniquely.
Course# (CourseId) is the key attribute for Course Entity.
Student# (Student Number) is the key attribute for Student Entity.
Instructor Name is the key attribute for Instructor Entity.

Step 4: Identify other relevant attributes
For the department entity, the relevant attribute is location.
For the course entity: course name, duration, prerequisite.
For the instructor entity: room#, telephone#.
For the student entity: student name, date of birth.

Step 5: Draw complete E-R diagram with all attributes including Primary Key

[Figure: ER diagram for the University]

Case Study: Banking Business Scenario

Assumptions:
There are multiple banks and each bank has many branches.
Each branch has multiple customers.
Customers have various types of accounts.
Some customers have also taken different types of loans from these bank branches.
One customer can have multiple accounts and loans.


Step 1: Identify the Entities
BANK
BRANCH
LOAN
ACCOUNT
CUSTOMER

Step 2: Find the relationships
One Bank has many branches and each branch belongs to only one bank, hence the cardinality between Bank and Branch is One to Many.
One Branch offers many loans and each loan is associated with one branch, hence the cardinality between Branch and Loan is One to Many.
One Branch maintains multiple accounts and each account is associated with one and only one Branch, hence the cardinality between Branch and Account is One to Many.
One Loan can be taken by multiple customers, and each Customer can take multiple loans, hence the cardinality between Loan and Customer is Many to Many.
One Customer can hold multiple accounts, and each Account can be held by multiple Customers, hence the cardinality between Customer and Account is Many to Many.

Step 3: Identify the key attributes
BankCode (Bank Code) is the key attribute for the Entity Bank, as it identifies the bank uniquely.
Branch# (Branch Number) is the key attribute for Branch Entity.
Customer# (Customer Number) is the key attribute for Customer Entity.
Loan# (Loan Number) is the key attribute for Loan Entity.
Account No (Account Number) is the key attribute for Account Entity.

Step 4: Identify other relevant attributes
For the Bank Entity, the relevant attributes other than BankCode would be Name and Address.
For the Branch Entity, the relevant attributes other than Branch# would be Name and Address.
For the Loan Entity, the relevant attribute other than Loan# would be Loan Type.
For the Account Entity, the relevant attribute other than Account No would be Account Type.
For the Customer Entity, the relevant attributes other than Customer# would be Name, Telephone# and Address.

Step 5: Draw complete E-R diagram with all attributes including Primary Key

[Figure: ER diagram for the Bank]

Merits and Demerits of ER Modeling

Merits
Easy to understand.
Represented in the business user's language.
Can be understood by non-technical specialists.
Intuitive and helps in physical database creation.
Can be generalized and specialized based on needs.
Can help in database design.
Gives a higher-level description of the system.

Demerits
Physical design derived from the E-R model may have some amount of ambiguity or inconsistency.
Sometimes diagrams may lead to misinterpretations.

Logical Database Design

Converting Strong entity types
Each entity type becomes a table.
Each single-valued attribute becomes a column.
Derived attributes are ignored.
Composite attributes are represented by their equivalent parts.
Multi-valued attributes are represented by a separate table.
The key attribute of the entity type becomes the primary key of the table.

Entity example
Here Address is a composite attribute.
Years of service is a derived attribute (can be calculated from date of joining and current date).
Skill set is a multi-valued attribute.

Employee (E#, Name, Door_No, Street, City, Pincode, Date_Of_Joining)
and
Emp_Skillset (E#, Skillset)

Entity example (Contd.)

Employee Table: E# (PK), Name, Door_No, Street, City, Pincode, Date_Of_Joining

SkillSet Table: E# (PK/FK), Skillset (PK)
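
The conversion rules for a strong entity can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch under stated assumptions: `E_no` stands in for `E#` (the `#` character would need quoting in SQL), the column types and sample row are made up, and `years_of_service` is a hypothetical helper that counts whole years.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Employee (
    E_no            TEXT PRIMARY KEY,  -- E# in the slides
    Name            TEXT NOT NULL,
    Door_No         TEXT,              -- Door_No to Pincode: parts of the composite Address
    Street          TEXT,
    City            TEXT,
    Pincode         TEXT,
    Date_Of_Joining TEXT NOT NULL      -- ISO date; Years of service is derived, so not stored
);
CREATE TABLE Emp_Skillset (            -- one row per value of the multi-valued attribute
    E_no     TEXT NOT NULL REFERENCES Employee(E_no),
    Skillset TEXT NOT NULL,
    PRIMARY KEY (E_no, Skillset)
);
""")

# Hypothetical sample data.
conn.execute(
    "INSERT INTO Employee VALUES ('E1', 'Asha', '12', 'Main St', 'Pune', '411001', '2015-06-01')"
)
conn.executemany("INSERT INTO Emp_Skillset VALUES (?, ?)", [("E1", "SQL"), ("E1", "Python")])

def years_of_service(date_of_joining, today):
    """Derived attribute: whole years between the joining date and `today`."""
    anniversary_pending = (today.month, today.day) < (date_of_joining.month, date_of_joining.day)
    return today.year - date_of_joining.year - int(anniversary_pending)

joined = date.fromisoformat(
    conn.execute("SELECT Date_Of_Joining FROM Employee WHERE E_no = 'E1'").fetchone()[0]
)
skills = [row[0] for row in conn.execute(
    "SELECT Skillset FROM Emp_Skillset WHERE E_no = 'E1' ORDER BY Skillset")]
print(skills, years_of_service(joined, date.today()))
```

Keeping the derived value out of the table means it cannot go stale; it is recomputed from Date_Of_Joining whenever it is needed.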

Converting weak entity types

Weak entity types are converted into a table of their own, with the primary key of the strong entity acting as a foreign key in the table. This foreign key, along with the key of the weak entity, forms the composite primary key of the weak entity's table.

The Relational Schema
Employee (E#, EmpName, DateOfJoining, SkillSet)
Dependant (Employee, Dependant_ID, Name, Address)

Converting weak entity types (Contd.)

Dependant Table: Employee (PK/FK), Dependant_ID (PK), Name, Address

Employee Table: E# (PK), EmpName, DateofJoining, SkillSet
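
The composite key of a weak entity can be illustrated with `sqlite3`. This is a sketch, not the slides' own implementation: `E_no` replaces `E#`, the Employee table is cut down to two columns, the sample names are invented, and the `ON DELETE CASCADE` rule is an assumption chosen to reflect that a dependant cannot exist without its employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Employee (
    E_no    TEXT PRIMARY KEY,   -- E# in the slides
    EmpName TEXT NOT NULL
);
CREATE TABLE Dependant (
    Employee     TEXT NOT NULL REFERENCES Employee(E_no) ON DELETE CASCADE,  -- cascade is assumed
    Dependant_ID INTEGER NOT NULL,   -- unique only within one employee
    Name         TEXT,
    Address      TEXT,
    PRIMARY KEY (Employee, Dependant_ID)
);
""")
conn.executemany("INSERT INTO Employee VALUES (?, ?)", [("E1", "Asha"), ("E2", "Ravi")])
conn.executemany(
    "INSERT INTO Dependant VALUES (?, ?, ?, ?)",
    [("E1", 1, "Kiran", None), ("E2", 1, "Meera", None)],  # same Dependant_ID, different owners
)

def rejected(sql):
    """Return True if SQLite refuses the statement with an integrity error."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

# A dependant of a non-existent employee violates the foreign key.
orphan_rejected = rejected("INSERT INTO Dependant VALUES ('E9', 1, 'Nobody', NULL)")
# Reusing Dependant_ID 1 for the same employee violates the composite primary key.
duplicate_rejected = rejected("INSERT INTO Dependant VALUES ('E1', 1, 'Again', NULL)")
print(orphan_rejected, duplicate_rejected)
```

Dependant_ID 1 appears twice in the table, once per employee, which is why it cannot serve as the primary key on its own.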

Converting relationships
The way relationships are represented depends on the cardinality and the degree of the relationship The possible cardinalities are: 1:1, 1:M, N:M The degrees are:  Unary  Binary  Ternary

Binary 1:1
Employee 1 head of 1 Department

Case 1: Combination of participation types
The primary key of the partial participant will become the foreign key in the total participant. Here Employee is the partial participant and Department is the total participant, so Head refers to E#.
Employee (E#, EName, DateOfJoining, SkillSet)
Department (Dept#, DName, Location, Head)

Department Table: Dept# (PK), DName, Location, Head (FK)

Employee Table: E# (PK), EName, DateofJoining, SkillSet
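
One way to express this 1:1 mapping in SQL is sketched below with `sqlite3`. The constraint choices are assumptions layered on the slides' schema: `NOT NULL` on Head models the total participation of Department, and `UNIQUE` keeps one employee from heading two departments. `E_no` and `Dept_no` stand in for `E#` and `Dept#`, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Employee (
    E_no  TEXT PRIMARY KEY,
    EName TEXT NOT NULL
);
CREATE TABLE Department (
    Dept_no  TEXT PRIMARY KEY,
    DName    TEXT NOT NULL,
    Location TEXT,
    Head     TEXT NOT NULL UNIQUE REFERENCES Employee(E_no)  -- total + one-to-one
);
""")
conn.executemany("INSERT INTO Employee VALUES (?, ?)", [("E1", "Asha"), ("E2", "Ravi")])

def accepted(sql):
    """Return True if the statement runs, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

first_department = accepted("INSERT INTO Department VALUES ('D1', 'Sales', 'Pune', 'E1')")
same_head_again = accepted("INSERT INTO Department VALUES ('D2', 'HR', 'Pune', 'E1')")
no_head = accepted("INSERT INTO Department VALUES ('D3', 'IT', 'Pune', NULL)")
print(first_department, same_head_again, no_head)
```

Employee E2 heads no department and needs no placeholder value anywhere, which is the benefit of putting the foreign key on the total side.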

Binary 1:1
Employee Sits_on Chair

Case 2: Uniform participation types
The primary key of either of the participants can become a foreign key in the other.
Employee (EmpCode, EmpName, DateOfJoining)
Chair (Item#, Model, Location, Used_by)
(OR)
Employee (EmpCode, EmpName, DateOfJoining, Sits_on)
Chair (Item#, Model, Location)

Employee Table: EmpCode (PK), EmpName, DateofJoining
Chair Table: Item# (PK), Model, Location, Used_By (FK)

OR

Employee Table: EmpCode (PK), EmpName, DateofJoining, Sits_On (FK)
Chair Table: Item# (PK), Model, Location

Binary 1:N
Teacher 1 Teaches N Subject

The primary key of the relation on the 1 side of the relationship becomes a foreign key in the relation on the N side.
Teacher (TeacherID, Name, Telephone, Cabin)
Subject (SubCode, SubName, Duration, TeacherID)

Teacher Table: TeacherID (PK), Name, Telephone, Cabin

Subject Table: SubCode (PK), SubName, Duration, TeacherID (FK)
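
The 1:N rule can be checked with a small `sqlite3` sketch. Table and column names follow the slides; the column types and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Teacher (
    TeacherID TEXT PRIMARY KEY,
    Name      TEXT NOT NULL,
    Telephone TEXT,
    Cabin     TEXT
);
CREATE TABLE Subject (
    SubCode   TEXT PRIMARY KEY,
    SubName   TEXT NOT NULL,
    Duration  INTEGER,
    TeacherID TEXT REFERENCES Teacher(TeacherID)  -- foreign key sits on the N side
);
""")
conn.executemany("INSERT INTO Teacher VALUES (?, ?, ?, ?)",
                 [("T1", "Asha", None, "A1"), ("T2", "Ravi", None, "B4")])
conn.executemany("INSERT INTO Subject VALUES (?, ?, ?, ?)",
                 [("S1", "Databases", 40, "T1"),
                  ("S2", "Networks", 30, "T1"),
                  ("S3", "Compilers", 35, "T2")])

# Each teacher can appear in many Subject rows; each subject row names one teacher.
subjects_per_teacher = dict(conn.execute(
    "SELECT TeacherID, COUNT(*) FROM Subject GROUP BY TeacherID"))

try:
    conn.execute("INSERT INTO Subject VALUES ('S4', 'Ethics', 10, 'T9')")
    unknown_teacher_rejected = False
except sqlite3.IntegrityError:
    unknown_teacher_rejected = True
print(subjects_per_teacher, unknown_teacher_rejected)
```

Because SubCode is the primary key and there is a single TeacherID column, a subject can never be attached to two teachers, which is exactly the "1" side of the relationship.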

Binary M:N
M Student Enrolls N Course

A new table is created to represent the relationship, which contains two foreign keys, one from each of the participants in the relationship. The primary key of the new table is the combination of the two foreign keys.
Student (StudentID, SName, DOB, Address)
Course (CourseID, CName)
Enrolls (SID, CID)

Course Table: CourseID (PK), Coursename

Student Table: StudentID (PK), SName, DOB, Address

Enrolls Table: SID (PK/FK), CID (PK/FK), DOIssue, Status
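
A junction table for the M:N case might look like the following `sqlite3` sketch. Names follow the slides (with `CName` for the course name); the DOIssue and Status columns are left out for brevity, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Student (StudentID TEXT PRIMARY KEY, SName TEXT NOT NULL, DOB TEXT, Address TEXT);
CREATE TABLE Course  (CourseID  TEXT PRIMARY KEY, CName TEXT NOT NULL);
CREATE TABLE Enrolls (
    SID TEXT NOT NULL REFERENCES Student(StudentID),
    CID TEXT NOT NULL REFERENCES Course(CourseID),
    PRIMARY KEY (SID, CID)   -- combination of the two foreign keys
);
""")
conn.executemany("INSERT INTO Student VALUES (?, ?, ?, ?)",
                 [("S1", "Asha", None, None), ("S2", "Ravi", None, None)])
conn.executemany("INSERT INTO Course VALUES (?, ?)",
                 [("C1", "Databases"), ("C2", "Networks")])
conn.executemany("INSERT INTO Enrolls VALUES (?, ?)",
                 [("S1", "C1"), ("S1", "C2"), ("S2", "C1")])

# One student, many courses.
courses_of_s1 = [row[0] for row in conn.execute(
    "SELECT c.CName FROM Enrolls AS e JOIN Course AS c ON c.CourseID = e.CID "
    "WHERE e.SID = 'S1' ORDER BY c.CourseID")]
# One course, many students.
students_in_c1 = conn.execute(
    "SELECT COUNT(*) FROM Enrolls WHERE CID = 'C1'").fetchone()[0]

try:
    conn.execute("INSERT INTO Enrolls VALUES ('S1', 'C1')")  # same pair twice
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(courses_of_s1, students_in_c1, duplicate_rejected)
```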

Unary 1:1
Consider employees who are also a couple.
The primary key field itself will become a foreign key in the same table.

Employee (E#, EName, Spouse)

Employee Table: E# (PK), EName, DateofJoining, SkillSet, Spouse (FK)

Unary 1:N
The primary key field itself will become a foreign key in the same table. Same as unary 1:1.

Employee (E#, EName, DateOfJoining, SkillSet, Manager)

Employee Table: E# (PK), EName, DateofJoining, SkillSet, Manager (FK)
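
A self-referencing foreign key is read back with a self-join. The `sqlite3` sketch below assumes `E_no` for `E#`, omits DateOfJoining and SkillSet for brevity, and uses invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Employee (
    E_no    TEXT PRIMARY KEY,
    EName   TEXT NOT NULL,
    Manager TEXT REFERENCES Employee(E_no)   -- NULL for an employee without a manager
);
""")
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?)",
                 [("E1", "Asha", None), ("E2", "Ravi", "E1"), ("E3", "Meera", "E1")])

# Join the table to itself: alias e plays the subordinate role, m the manager role.
reports = conn.execute("""
    SELECT e.EName, m.EName
    FROM Employee AS e
    JOIN Employee AS m ON e.Manager = m.E_no
    ORDER BY e.E_no
""").fetchall()
print(reports)
```

The two aliases correspond to the role names on the unary relationship. For the unary 1:1 case (Spouse), the same pattern applies with a `UNIQUE` constraint added to the foreign key column.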

Unary M:N
M Guarantor_of N Employee

There will be two resulting tables: one to represent the entity and another to represent the M:N relationship, as follows.
Employee (E#, EName, DateOfJoining, SkillSet)
Guaranty (Guarantor, Beneficiary)

Employee Table: EmpCode (PK), EName, DateofJoining, SkillSet

Guaranty Table: Guarantor (PK/FK), Beneficiary (PK/FK)
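
In the unary M:N case both foreign keys of the relationship table point at the same entity table. A minimal `sqlite3` sketch, assuming `E_no` as the employee key and made-up rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Employee (E_no TEXT PRIMARY KEY, EName TEXT NOT NULL);
CREATE TABLE Guaranty (
    Guarantor   TEXT NOT NULL REFERENCES Employee(E_no),
    Beneficiary TEXT NOT NULL REFERENCES Employee(E_no),
    PRIMARY KEY (Guarantor, Beneficiary)
);
""")
conn.executemany("INSERT INTO Employee VALUES (?, ?)",
                 [("E1", "Asha"), ("E2", "Ravi"), ("E3", "Meera")])
conn.executemany("INSERT INTO Guaranty VALUES (?, ?)",
                 [("E1", "E2"), ("E1", "E3"), ("E2", "E3")])

# One employee can guarantee many others ...
beneficiaries_of_e1 = [row[0] for row in conn.execute(
    "SELECT Beneficiary FROM Guaranty WHERE Guarantor = 'E1' ORDER BY Beneficiary")]
# ... and one employee can have many guarantors.
guarantors_of_e3 = [row[0] for row in conn.execute(
    "SELECT Guarantor FROM Guaranty WHERE Beneficiary = 'E3' ORDER BY Guarantor")]
print(beneficiaries_of_e1, guarantors_of_e3)
```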

Ternary relationship
Represented by a new table. The new table contains three foreign keys, one from each of the participating Entities. The primary key of the new table is the combination of all three foreign keys.
Prescription (DocID, PatCode, MedName)

Doctor Table: DocID (PK), Title

Patient Table: PatCode (PK), PatName, DOB, Address

Medicine Table: MedName (PK), ExpDate

Prescription Table: DocID (PK/FK), PatCode (PK/FK), MedName (PK/FK), NextVisit
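
The three-part key of a ternary relationship table can be sketched with `sqlite3` as follows. Table and column names follow the slides; the entity tables are trimmed to their keys plus one column, and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Doctor   (DocID   TEXT PRIMARY KEY, Title   TEXT);
CREATE TABLE Patient  (PatCode TEXT PRIMARY KEY, PatName TEXT);
CREATE TABLE Medicine (MedName TEXT PRIMARY KEY, ExpDate TEXT);
CREATE TABLE Prescription (
    DocID     TEXT NOT NULL REFERENCES Doctor(DocID),
    PatCode   TEXT NOT NULL REFERENCES Patient(PatCode),
    MedName   TEXT NOT NULL REFERENCES Medicine(MedName),
    NextVisit TEXT,
    PRIMARY KEY (DocID, PatCode, MedName)   -- all three foreign keys together
);
""")
conn.execute("INSERT INTO Doctor VALUES ('D1', 'Dr')")
conn.execute("INSERT INTO Patient VALUES ('P1', 'Kiran')")
conn.executemany("INSERT INTO Medicine VALUES (?, ?)",
                 [("Aspirin", "2026-01-01"), ("Ibuprofen", "2026-06-01")])

# Same doctor and patient, two different medicines: both rows are allowed.
conn.executemany("INSERT INTO Prescription VALUES (?, ?, ?, ?)",
                 [("D1", "P1", "Aspirin", None), ("D1", "P1", "Ibuprofen", None)])
prescription_count = conn.execute("SELECT COUNT(*) FROM Prescription").fetchone()[0]

try:
    conn.execute("INSERT INTO Prescription VALUES ('D1', 'P1', 'Aspirin', NULL)")
    exact_repeat_rejected = False
except sqlite3.IntegrityError:
    exact_repeat_rejected = True
print(prescription_count, exact_repeat_rejected)
```

A key made of only two of the three columns would wrongly forbid the second row, which is why all three foreign keys take part in the primary key.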

Deriving Logical Schema for Banking Application

Each Entity represented in the E-R model can be defined as a table in the relational schema. All the attributes of the Entity will become columns of the table.
Example: Let us consider the CUSTOMER Entity of the banking database scenario. We can translate this Entity to a CUSTOMER table with the following columns.
CUSTOMER (Customer#, Name, Telephone#, Address)

Weak Entity types are converted into a table of their own, with the primary key of the strong Entity acting as a foreign key in the table. This foreign key along with the partial key of the Weak Entity forms the composite primary key of this table.
Example: As per this guideline, a Branch table can be created with the following structure.
BRANCH (BankCode, Branch#, Name, Address)

Each relationship can be defined as a separate table in the relational schema. Key attributes of the participating entities (the entities which are joined by the relationship) will become the key attributes of the relationship table.
Example: We can define a LOAN_DETAILS table with Loan# and Customer# together as primary key, with other relevant attributes like DateOfSanction, InterestRate, LoanAmount, Duration etc.
LOAN_DETAILS (Loan#, Customer#, DateOfSanction, InterestRate, LoanAmount, Duration)

In a Many to Many relationship, it is necessary to create separate tables for the participating entities and the relationship. In the banking application, the Customer and Loan Entities have a Many to Many relationship. Hence one should create separate tables for CUSTOMER, LOANS and LOAN_DETAILS. Here LOAN_DETAILS is the relationship table.
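
The three banking tables can be put together in a short `sqlite3` sketch. Column names are taken from the case study, with `Loan_no`, `Customer_no` and `Telephone_no` standing in for `Loan#`, `Customer#` and `Telephone#`; the types and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE CUSTOMER (
    Customer_no  TEXT PRIMARY KEY,
    Name         TEXT NOT NULL,
    Telephone_no TEXT,
    Address      TEXT
);
CREATE TABLE LOANS (
    Loan_no  TEXT PRIMARY KEY,
    LoanType TEXT
);
CREATE TABLE LOAN_DETAILS (             -- relationship table for Customer M:N Loan
    Loan_no        TEXT NOT NULL REFERENCES LOANS(Loan_no),
    Customer_no    TEXT NOT NULL REFERENCES CUSTOMER(Customer_no),
    DateOfSanction TEXT,
    InterestRate   REAL,
    LoanAmount     REAL,
    Duration       INTEGER,
    PRIMARY KEY (Loan_no, Customer_no)
);
""")
conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?, ?, ?)",
                 [("C1", "Asha", None, None), ("C2", "Ravi", None, None)])
conn.executemany("INSERT INTO LOANS VALUES (?, ?)", [("L1", "Home"), ("L2", "Vehicle")])
conn.executemany("INSERT INTO LOAN_DETAILS VALUES (?, ?, ?, ?, ?, ?)",
                 [("L1", "C1", "2023-01-10", 8.5, 500000, 60),
                  ("L1", "C2", "2023-01-10", 8.5, 500000, 60),   # loan L1 shared by two customers
                  ("L2", "C1", "2024-03-05", 9.0, 200000, 36)])  # customer C1 holds two loans

loans_of = {}
for customer, loan in conn.execute(
        "SELECT Customer_no, Loan_no FROM LOAN_DETAILS ORDER BY Customer_no, Loan_no"):
    loans_of.setdefault(customer, []).append(loan)
print(loans_of)
```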

Summary
Most application errors are due to miscommunication between the application user and the designer, and between the designer and the developer.
It is always better to represent business findings as pictures to avoid miscommunication.
It is practically impossible for business users to review the complete requirement document.
An E-R diagram is one of the many ways to represent business findings in pictorial format.
E-R modeling will also help the database design.
E-R modeling has some amount of inconsistency and anomalies associated with it.
