
Relational Database Management System







Education & Research Department
Infosys Ltd.






COMPANY CONFIDENTIAL
COPYRIGHT NOTICE

© 2009-2011 Infosys Limited, Bangalore, India. All Rights Reserved. Infosys believes the
information in this document is accurate as of its publication date; such information is
subject to change without notice. Infosys acknowledges the proprietary rights of other
companies to the trademarks, product names and such other intellectual property rights
mentioned in this document. Except as expressly permitted, neither this documentation nor
any part of it may be reproduced, stored in a retrieval system, or transmitted in any form or
by any means, electronic, mechanical, printing, photocopying, recording or otherwise,
without the prior permission of Infosys Limited and/or any named intellectual property
rights holders under this document.

Education and Research Department
Infosys Limited
Electronic City
Hosur Road
Bangalore - 561 229, India.

Tel: 91 80 852 0261-270
Fax: 91 80 852 0362
www.infosys.com
mailto:E&R@infosys.com
Course Description and References





References

1. Database System Concepts, Henry F Korth, Abraham Silberschatz, Fifth Edition,
McGraw-Hill International Edition, Computer Science Series

2. An Introduction to Database Systems, C.J.Date, Eighth Edition, Pearson Education

3. SQL: The Complete Reference, James R. Groff and Paul N. Weinberg, Second Edition,
Tata McGraw Hill Edition 2003

4. Oracle Database 10g PL/SQL Programming, Scott Urman, Ron Hardman, Michael McLaughlin,
Oracle Press

5. Oracle 9i: The Complete Reference, Kevin Loney, George Koch, Oracle Press



Table of Contents



COPYRIGHT NOTICE ...................................................................................................... II
PURPOSE .................................................................................................................... 1
1. INTRODUCTION TO DBMS ......................................................................................... 2
1.1. WHAT IS A DATABASE? ........................................................................................... 2
1.2. WHAT IS A DATABASE MANAGEMENT SYSTEM? .................................................................... 5
1.3. FILE SYSTEM INTERFACE VERSUS DBMS INTERFACE ............................................................... 6
1.4. MASTER AND TRANSACTION FILES ................................................................................ 8
1.5. TRADITIONAL APPROACH TO INFORMATION PROCESSING ........................................................ 10
1.5.1. Disadvantages of the Traditional Approach to Information Processing ................... 11
1.6. WHY DBMS? ................................................................................................... 14
1.7. TYPES OF DATABASES .......................................................................................... 15
1.8. THREE LEVEL ARCHITECTURE FOR A DBMS ..................................................................... 19
1.9. DBMS USERS................................................................................................... 23
1.10. DATA MODELS .................................................................................................. 24
1.10.1. Object Based Logical Model ....................................................................... 25
1.10.2. Record Based Logical Model ....................................................................... 25
1.11. RDBMS ........................................................................................................ 28
1.12. SOME POPULAR RDBMS PACKAGES ............................................................................. 29
1.13. APPLICATION AREAS OF RDBMS ............................................................................... 29
1.14. KEYS ........................................................................................................... 29
1.15. SUMMARY ...................................................................................................... 36
2. ENTITY-RELATIONSHIP (E-R) MODELING..................................................................... 38
2.1. INTRODUCTION ................................................................................................. 38
2.2. ENTITY AND RELATIONSHIP ..................................................................................... 39
2.3. CARDINALITY OF A RELATIONSHIP ............................................................................... 39
2.3.1. One to One Relationship ........................................................................... 40
2.3.2. One to Many Relationship ......................................................................... 40
2.3.3. Many to One Relationship ......................................................................... 41
2.3.4. Many to Many Relationship ........................................................................ 41
2.4. E-R DIAGRAM NOTATIONS ...................................................................................... 42
2.5. MODELING USING E-R DIAGRAMS ............................................................................... 45
2.5.1. Steps in E-R Modeling ............................................................................... 45
2.6. CASE STUDY 1: PROBLEM STATEMENT .......................................................................... 45
2.6.1. Case Study 1: Solution .............................................................................. 46
2.7. CASE STUDY 2: PROBLEM STATEMENT .......................................................................... 48
2.7.1. Case Study 2: Solution .............................................................................. 49
2.8. CASE STUDY 3: PROBLEM STATEMENT .......................................................................... 50
2.8.1. Case Study 3: Solution .............................................................................. 51
2.9. TRANSFORMING AN E-R MODEL INTO PHYSICAL DATABASE DESIGN ............................................. 54
2.10. MERITS AND DEMERITS OF E-R MODELING ...................................................................... 56
2.10.1. Merits of E-R Modeling ............................................................................. 56
2.10.2. Demerits of E-R Modeling .......................................................................... 56
2.11. SUMMARY .................................................................................................... 57
3. NORMALIZATION .................................................................................................. 58
3.1. INTRODUCTION ................................................................................................. 58
3.2. THE NEED FOR NORMALIZATION ................................................................................ 58
3.3. PROCESS OF NORMALIZATION ................................................................................... 59
3.3.1. Determinant .......................................................................................... 60
3.3.2. Functional Dependency ............................................................................. 60
3.3.3. Full Functional Dependency ....................................................................... 61
3.3.4. Partial Dependency ................................................................................. 61
3.3.5. Transitive Dependency ............................................................................. 62
3.3.6. Key attributes ........................................................................................ 62
3.3.7. Non key attributes .................................................................................. 63
3.4. TYPES OF NORMAL FORMS ...................................................................................... 63
3.4.1. First Normal Form (1 NF) .......................................................................... 63
3.4.2. Second Normal Form (2 NF) ....................................................................... 64
3.4.3. Third Normal Form (3 NF) ......................................................................... 67
3.5. MERITS AND DEMERITS OF NORMALIZATION ..................................................................... 68
3.5.1. Merits .................................................................................................. 68
3.5.2. Demerits .............................................................................................. 68
3.6. SUMMARY ...................................................................................................... 70
3.7. CASE STUDY .................................................................................................... 71
4. STRUCTURED QUERY LANGUAGE (SQL) ..................................................................... 74
4.1. THE PURPOSE OF SQL ......................................................................................... 74
4.2. A BRIEF HISTORY OF SQL ...................................................................................... 75
4.3. DATA TYPES.................................................................................................... 76
4.4. STATEMENT TYPES .............................................................................................. 78
4.5. DATA DEFINITION LANGUAGE (DDL) STATEMENTS ............................................................. 78
4.5.1. CREATE TABLE Statement ......................................................................... 79
4.5.2. ALTER TABLE statement ........................................................................... 85
4.5.3. DROP TABLE statement ............................................................................ 87
4.5.4. TRUNCATE TABLE statement ...................................................................... 88
4.5.5. CREATE INDEX statement .......................................................................... 88
4.6. DATA MANIPULATION LANGUAGE (DML) STATEMENTS ......................................................... 91
4.6.1. INSERT Statement ................................................................................... 91
4.6.2. DELETE Statement .................................................................................. 95
4.6.3. UPDATE Statement .................................................................................. 96
4.6.4. SELECT Statement ................................................................................... 97
4.6.5. Sub-Queries.......................................................................................... 117
4.6.6. JOINS ................................................................................................. 123
4.6.7. Queries using EXISTS / NOT EXISTS ............................................................. 129
4.6.8. The Order of Execution of a SELECT statement .............................................. 130
4.7. VIEWS ......................................................................................................... 131
4.7.1. Horizontal View .................................................................................... 131
4.7.2. Vertical View ........................................................................................ 131
4.7.3. DROP VIEW Statement ............................................................................. 131
4.7.4. Joined Views ........................................................................................ 132
4.7.5. VIEW Updates ....................................................................................... 132
4.7.6. Checking View Updates (CHECK OPTION) ...................................................... 132
4.7.7. Advantages of Views ............................................................................... 134
4.7.8. Disadvantages of Views ........................................................................... 134
4.8. DATA CONTROL LANGUAGE (DCL) ............................................................................ 134
4.8.1. Granting Privileges ................................................................................. 135
4.8.2. Revoking Privileges (REVOKE) .................................................................... 136
4.9. BEST PRACTICES .............................................................................................. 137
4.10. SUMMARY ..................................................................................................... 140
5. ON-LINE TRANSACTION PROCESSING (OLTP) .............................................................. 141
5.1. PURPOSE ...................................................................................................... 141
5.2. TRANSACTION ................................................................................................. 141
5.3. TRANSACTION SYSTEMS ........................................................................................ 143
5.3.1. Batch Transaction Processing System ........................................................... 143
5.3.2. On-line Transaction Processing System (OLTP) ............................................... 143
5.3.3. Real time Transaction Processing System...................................................... 144
5.4. TRANSACTION PROPERTIES .................................................................................... 144
5.5. REQUIREMENTS FOR AN OLTP SYSTEM ........................................................................ 145
5.5.1. Integrity .............................................................................................. 145
5.5.2. Concurrency ......................................................................................... 147
5.6. LOCKS ......................................................................................................... 151
5.6.1. Shared Lock (S) ..................................................................................... 152
5.6.2. Exclusive Lock (X) .................................................................................. 152
5.7. GRANULARITY OF LOCKING .................................................................................... 153
5.8. INTENT LOCKING .............................................................................................. 155
5.8.1. Intent Share (IS) .................................................................................... 156
5.8.2. Intent Exclusive (IX) ............................................................................... 156
5.8.3. Shared Intent Exclusive (SIX) ..................................................................... 156
5.8.4. Case study for Intent Locks ....................................................................... 158
5.9. DEADLOCK .................................................................................................... 160
5.10. SECURITY ..................................................................................................... 161
5.11. RECOVERY .................................................................................................... 161
5.12. TRANSACTION LOG ............................................................................................ 163
5.12.1. Deferred update .................................................................................... 163
5.12.2. Immediate Update ................................................................................. 164
5.12.3. Check-Points ........................................................................................ 166
5.13. SUMMARY ..................................................................................................... 170
6. INTRODUCTION TO PL/SQL ....................................................................................172
6.1. NEED FOR PL/SQL ........................................................................................... 172
6.2. PL/SQL ARCHITECTURE ...................................................................................... 173
6.3. PL/SQL BLOCK STRUCTURE ................................................................................... 173
6.4. COMMENTS IN PL/SQL ....................................................................................... 173
6.5. ANONYMOUS PL/SQL BLOCKS ................................................................................ 174
6.5.1. Declaration section ................................................................................ 174
6.5.2. Executable section ................................................................................. 175
6.5.3. Exception section ................................................................................... 176
6.6. PL/SQL BLOCK EXECUTION ................................................................................... 176
6.6.1. How a PL/SQL block can be executed? ......................................................... 176
6.6.2. Another way of executing the PL/SQL block .................................................. 177
6.7. NAMED PL/SQL BLOCKS ...................................................................................... 178
6.8. VARIABLES AND DATATYPES .................................................................................... 178
6.8.1. Scalar datatype - Character ...................................................................... 180
6.8.2. Scalar datatype - PLS_INTEGER ................................................................ 182
6.8.3. Scalar datatype - NUMBER ........................................................................ 182
6.8.4. Scalar datatype - Boolean ........................................................................ 182
6.8.5. Scalar Datatype - Date ............................................................................ 183
6.8.6. Scalar Datatype - Timestamp .................................................................... 183
6.9. DBMS_OUTPUT PACKAGE ................................................................................... 184
6.9.1. DBMS_OUTPUT procedures ....................................................................... 185
6.9.2. DBMS_OUTPUT procedures usages .............................................................. 185
7. PL/SQL BASICS AND CONSTRUCTS ...........................................................................186
7.1. %TYPE ANCHORED DECLARATIONS ............................................................................ 186
7.2. BIND VARIABLES ............................................................................................... 187
7.3. SUBSTITUTION VARIABLES ..................................................................................... 188
7.4. ACCEPTING INPUT IN PL/SQL ................................................................................. 189
7.5. SET VERIFY ON/OFF ....................................................................................... 190
7.6. OPERATORS AND EXPRESSIONS................................................................................. 191
7.6.1. Concatenation operator ........................................................................... 191
7.6.2. Arithmetic operator - Addition .................................................................. 192
7.6.3. Arithmetic operator - Exponentiation .......................................................... 192
7.6.4. Usage of Arithmetic operators with DATE variables ......................................... 193
7.7. NESTED PL/SQL BLOCKS ..................................................................................... 193
7.7.1. Scope of variables .................................................................................. 196
7.7.2. Qualifying identifiers .............................................................................. 197
7.8. PL/SQL CONDITIONAL CONSTRUCTS .......................................................................... 198
7.8.1. IF THEN END IF syntax ........................................................................... 198
7.8.2. IF THEN ELSE END IF syntax ..................................................................... 199
7.8.3. Usage of inequality operator ( != or <> ) ...................................................... 199
7.8.4. IF THEN ELSIF END IF syntax .................................................................... 200
7.8.5. LOOP.. END LOOP................................................................................... 202
7.8.6. Numeric FOR Loop .................................................................................. 203
7.8.7. Numeric FOR Loop with REVERSE option ..................................................... 205
7.8.8. WHILE Loop .......................................................................................... 205
7.9. USING SQL STATEMENTS IN PL/SQL .......................................................................... 206
7.9.1. Using SELECT statements in PL/SQL ............................................................ 206
7.10. COMPOSITE DATATYPE ......................................................................................... 208
7.10.1. %ROWTYPE ........................................................................................... 208
7.10.2. Using INSERT statements in PL/SQL............................................................. 209
7.10.3. Using UPDATE statements in PL/SQL ........................................................... 210
7.10.4. Using DELETE statements in PL/SQL ............................................................ 211
8. PL/SQL EXCEPTIONS ............................................................................................211
8.1. INTRODUCTION ................................................................................................ 212
8.2. HOW TO HANDLE EXCEPTION? ................................................................................. 212
8.3. EXCEPTION SYNTAX ............................................................................................ 212
8.4. EXCEPTION TYPES ............................................................................................. 213
8.4.1. Raising exceptions.................................................................................. 213
8.5. PREDEFINED ORACLE SERVER EXCEPTION ....................................................................... 213
8.5.1. NO_DATA_FOUND predefined exception ....................................................... 214
8.5.2. TOO_MANY_ROWS predefined exception ...................................................... 215
8.5.3. DUP_VAL_ON_INDEX predefined exception .................................................... 216
8.5.4. VALUE_ERROR predefined exception ........................................................... 216
8.5.5. INVALID_NUMBER predefined exception ....................................................... 217
8.6. NON-PREDEFINED ORACLE SERVER EXCEPTION ................................................................. 217
8.7. USER-DEFINED EXCEPTION ..................................................................................... 218
8.8. WHEN OTHERS EXCEPTION HANDLER ........................................................................ 219
8.9. USING SQLCODE AND SQLERRM ............................................................................ 220
8.10. RAISE_APPLICATION_ERROR BUILT IN PROCEDURE ....................................................... 221
8.11. EXCEPTION PROPAGATION ..................................................................................... 221
8.11.1. Exception raised in the declaration section ................................................... 222
8.11.2. Exception raised in the executable section ................................................... 223
8.11.3. Exception raised in the exception section ..................................................... 223
9. PL/SQL CURSORS ................................................................................................225
9.1. CURSORS ...................................................................................................... 225
9.2. IMPLICIT CURSORS ............................................................................................. 225
9.3. IMPLICIT CURSORS ATTRIBUTES ................................................................................ 226
9.4. IMPLICIT CURSOR EXAMPLE .................................................................................... 228
9.5. EXPLICIT CURSORS ............................................................................................ 228
9.6. OPERATIONS ON EXPLICIT CURSOR ............................................................................. 228
9.6.1. Declaring the cursor ............................................................................... 228
9.6.2. Opening the cursor ................................................................................. 229
9.6.3. Fetching records from the cursor ............................................................... 230
9.6.4. Closing the cursor .................................................................................. 232
9.7. EXPLICIT CURSOR SIMPLE LOOP .............................................................................. 232
9.8. EXPLICIT CURSOR WITH GROUP BY CLAUSE .................................................................. 233
9.9. EXPLICIT CURSOR ATTRIBUTES ................................................................................. 234
9.10. USING RECORD VARIABLES WITH EXPLICIT CURSORS ............................................................ 234
9.11. NAVIGATING CURSORS WITH WHILE LOOP ................................................................... 235
9.12. CURSOR FOR LOOP .......................................................................................... 236
9.13. IMPLICIT CURSOR FOR LOOP ................................................................................. 237
9.14. CURSOR RELATED PREDEFINED ORACLE SERVER EXCEPTIONS .................................................... 238
9.14.1. INVALID_CURSOR exception ...................................................................... 238
9.14.2. CURSOR_ALREADY_OPEN exception ............................................................. 238
9.15. PARAMETERIZED CURSORS ..................................................................................... 239
9.16. EXPLICIT CURSOR FOR UPDATE ............................................................................ 240
9.17. FOR UPDATE CURSOR DECLARATION ......................................................................... 240
9.18. WHERE CURRENT OF CLAUSE .............................................................................. 242
10. TRANSACTION PROCESSING IN PL/SQL .....................................................................242
10.1. USING COMMIT STATEMENT IN PL/SQL ...................................................................... 242
10.2. USING ROLLBACK STATEMENT IN PL/SQL ................................................................... 244
10.3. USING SAVEPOINT IN PL/SQL .............................................................................. 244
10.4. CONCURRENCY CONTROL ...................................................................................... 246
11. ON LINE ANALYTICAL PROCESSING (OLAP) ................................................................248
11.1. DIFFERENCE BETWEEN OLTP AND OLAP ...................................................................... 249
11.2. DATA WAREHOUSE ............................................................................................ 250
11.2.1. Why data warehouse is needed? ................................................................. 250
11.2.2. Characteristics of Data Warehouse: ............................................................ 250
11.2.3. Data Warehousing Terminology .................................................................. 251
11.2.4. Data Collection for Data Warehouse Applications ........................................... 253
11.2.5. Storing of data in Data warehouse .............................................................. 253
11.2.6. Reporting of a Data warehouse application ................................................... 256
11.2.7. Difference between Data Warehouse and Data Mart ........................................ 258
11.2.8. Popular tools available for data warehousing ................................................ 258
11.3. SUMMARY ..................................................................................................... 259
APPENDIX-A .............................................................................................................260
BOYCE CODD NORMAL FORM (BCNF) ................................................................................... 260
EMBEDDED SQL ......................................................................................................... 262
Purpose ............................................................................................................ 262
Why Embedded SQL? ............................................................................................ 262
TIMESTAMPING .......................................................................................................... 265
GLOSSARY ...............................................................................................................268
INDEX .....................................................................................................................272
Infosys Foundation Program

PURPOSE
All business activities deal with a lot of data.

Examples:
• Schools, colleges and universities store data about students, courses, trainers, etc.
• Banks store data about their customers, transactions[1] (deposits, withdrawals), loans, etc.

A Database Management System (DBMS) provides an efficient storage and data management
mechanism.

Most real-life software projects use databases to store large volumes of data. It is therefore
important for a software professional to understand the concepts of DBMS. Knowledge of
DBMS enables a software engineer to:

• Store data
• Access data
• Modify data
• Delete data
• Share data among different users
• Ensure the security of the data

In short, DBMS concepts and techniques help in the efficient management of data.


[1] Transaction: One or more processing steps that are treated as a single activity to achieve
a desired result. Such a collection of operations, which forms a single, atomic logical unit of
work, is called a transaction. A database system ensures proper execution of transactions despite
failures: either the whole transaction executes, or none of it does.
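
The all-or-nothing behaviour described in this note can be sketched with a small script. This is only an illustration, not part of the course material: it uses Python's built-in sqlite3 module for convenience, and the Account table, the balances and the transfer function are hypothetical names and values invented for the example. A transfer is treated as one transaction made up of two steps (credit, then debit); if the debit step fails, the credit step is undone as well.

```python
import sqlite3

# Hypothetical table: a CHECK constraint forbids negative balances.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Account ("
    "Account_No INTEGER PRIMARY KEY, "
    "Balance REAL CHECK (Balance >= 0))"
)
conn.executemany("INSERT INTO Account VALUES (?, ?)",
                 [(1020, 100.0), (2389, 50.0)])
conn.commit()


def transfer(src, dst, amount):
    """Move money as ONE transaction: both updates happen, or neither."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute(
                "UPDATE Account SET Balance = Balance + ? WHERE Account_No = ?",
                (amount, dst))
            conn.execute(
                "UPDATE Account SET Balance = Balance - ? WHERE Account_No = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The debit would make the balance negative, so the credit is undone too.
        return False


def balances():
    """Return {Account_No: Balance} for all accounts."""
    return dict(conn.execute("SELECT Account_No, Balance FROM Account"))


# An oversized transfer fails and leaves both balances untouched.
print(transfer(1020, 2389, 500.0), balances())
```

In this sketch the credit is deliberately applied before the debit, so that a failure in the second step shows the first step being rolled back.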
1. Introduction to DBMS

1.1. What is a Database?
A database can be defined as an organized collection of interrelated data.

Example: Consider a bank database. The bank stores data about its customers in a file
known as Customer_Details. The Customer_Details file has the following fields:

• Cust_ID: The customer's identification number
• Cust_Last_Name: The customer's last name
• Cust_Mid_Name: The customer's middle name or initials
• Cust_First_Name: The customer's first name
• Account_No: The customer's account number
• Account_Type: The type of account that the customer has in the bank (Savings,
Checking, etc.)
• Bank_Branch: The name of the bank branch
• Cust_Email: The customer's email ID

The bank also stores data about the loan(s) taken by its customers. The loan details are
stored in the file Customer_Loan. The Customer_Loan file has the following fields:

x Cust_ID: The customers identification number
x Loan_No: The loan number to identify the loan
x Amount_in_Dollars: The amount loaned by the bank to the customer

One customer can avail multiple loans from the bank.

The data stored in the two files, Customer_Details and Customer_Loan constitutes
interrelated data.
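
The relationship between the two files can be sketched as a small relational schema. The following is a minimal illustration in Python using the standard sqlite3 module; the in-memory SQLite database, the column types, the trimmed column list and the helper loans_of are assumptions made for this sketch and are not part of the bank example itself.

```python
import sqlite3

# In-memory SQLite database standing in for the bank database (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer_Details (
    Cust_ID         INTEGER PRIMARY KEY,
    Cust_Last_Name  TEXT,
    Cust_First_Name TEXT,
    Account_No      INTEGER
);
CREATE TABLE Customer_Loan (
    Cust_ID           INTEGER REFERENCES Customer_Details(Cust_ID),
    Loan_No           INTEGER PRIMARY KEY,
    Amount_in_Dollars REAL
);
""")

# A few of the records from Figure 1-1 (only a subset of the fields is kept).
conn.executemany(
    "INSERT INTO Customer_Details VALUES (?, ?, ?, ?)",
    [(101, "Smith", "Mike", 1020),
     (103, "Langer", "Justin", 3421),
     (104, "Quails", "Jack", 2367)],
)
conn.executemany(
    "INSERT INTO Customer_Loan VALUES (?, ?, ?)",
    [(101, 1011, 8755.00), (103, 2010, 2555.00),
     (104, 2056, 3050.00), (103, 2015, 2000.00)],
)

def loans_of(cust_id):
    """Return the loan numbers recorded for one customer (hypothetical helper)."""
    rows = conn.execute(
        "SELECT Loan_No FROM Customer_Loan WHERE Cust_ID = ? ORDER BY Loan_No",
        (cust_id,),
    ).fetchall()
    return [row[0] for row in rows]

print(loans_of(103))  # customer 103 has taken two loans
```

Because Cust_ID appears in both tables, one customer record can be related to any number of loan records.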

Refer to Figure 1-1.



Figure 1-1: Example of a Bank Database

The data in the database is integrated, which means that the database is a collection of
several distinct[2] files. These distinct files may have some duplicate data, but the duplication
of data is kept to a minimum.

Example: Figure 1-1 shows two files, Customer_Details and Customer_Loan. The two files
are distinct in the sense that the Customer_Details file contains details about the bank's
customers, whereas the Customer_Loan file contains details about all the loans taken by those
customers. Both files have the Cust_ID field.

In order to sanction a loan to a customer, the bank requires the account number
(Account_No) of the customer. The account number does not need to be stored again in the
Customer_Loan file, because it can always be found by looking up the customer's Cust_ID in
the Customer_Details file.
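
The lookup described above can be sketched as a join on the shared Cust_ID field. This is only an illustration in Python with the standard sqlite3 module; the in-memory database, the reduced column list and the helper account_no_for_loan are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer_Details (Cust_ID INTEGER PRIMARY KEY, Account_No INTEGER);
CREATE TABLE Customer_Loan (
    Cust_ID           INTEGER,
    Loan_No           INTEGER PRIMARY KEY,
    Amount_in_Dollars REAL
);
INSERT INTO Customer_Details VALUES (101, 1020), (103, 3421), (104, 2367);
INSERT INTO Customer_Loan VALUES
    (101, 1011, 8755.00), (103, 2010, 2555.00),
    (104, 2056, 3050.00), (103, 2015, 2000.00);
""")

def account_no_for_loan(loan_no):
    """Find the account number for a loan without storing it in Customer_Loan."""
    row = conn.execute(
        """SELECT d.Account_No
           FROM Customer_Loan AS l
           JOIN Customer_Details AS d ON d.Cust_ID = l.Cust_ID
           WHERE l.Loan_No = ?""",
        (loan_no,),
    ).fetchone()
    return row[0] if row else None  # None when the loan number is unknown

print(account_no_for_loan(2015))
```

The account number is stored once, in Customer_Details, and is still reachable from every loan record.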

The data present in the database can be shared. Sharing means that individual pieces of data in
the database can be made available to different users. Each of those users can access the
same portion of data and use it for a different purpose.

Refer to Figure 1-2.

[2] Distinct: Not identical.

Bank Database (contents of Figure 1-1)

Customer_Details file (Customer_Detail records):

Cust_ID  Cust_Last_Name  Cust_Mid_Name  Cust_First_Name  Account_No  Account_Type  Bank_Branch  Cust_Email
101      Smith           A.             Mike             1020        Savings       Downtown     Smith_Mike@yahoo.com
102      Smith           S.             Graham           2348        Checking      Bridgewater  Smith_Graham@rediffmail.com
103      Langer          G.             Justin           3421        Savings       Plainsboro   Langer_Justin@yahoo.com
104      Quails          D.             Jack             2367        Checking      Downtown     Quails_Jack@yahoo.com
105      Jones           E.             Simon            2389        Checking      Brighton     Jones_Simon@rediffmail.com

Customer_Loan file (Customer_Loan records):

Cust_ID  Loan_No  Amount_in_Dollars
101      1011     8755.00
103      2010     2555.00
104      2056     3050.00
103      2015     2000.00

Figure 1-2: Sharing the Account_No from the Customer_Details file

As depicted in Figure 1-2, the Account_No from the Customer_Details file is accessed
by both the bank's Fixed Deposit Department and the bank's Loan Department. The two classes
of users would typically use the information for different purposes.

Users can even access the database concurrently. Concurrent access means that
different users can access the same piece of data at the same time.


Points to Remember:

• A database is defined as an organized collection of interrelated data
• Data in the database:
o Is integrated
o Can be shared
o Can be concurrently accessed






Figure 1-2 (contents): a user in the bank's Loan Department and a user in the bank's Fixed
Deposit Department both access the Customer_Detail records of the Customer_Details file
(the same records as in Figure 1-1).

1.2. What is a Database Management System?
A Database Management System (DBMS) is a collection of interrelated files and a set of
programs that allow users to access and modify these files. The primary goal of a DBMS is to
provide a convenient and efficient way to store, retrieve and modify information.

Figure 1-3 shows an end user[3] working with data from the Customer_Details file, maintained
in the bank database.


Figure 1-3: Basic picture of a database system

Database systems are designed to:
• Define structures for the storage of data
• Provide mechanisms for the manipulation[4] of data
• Ensure the safety and security of the stored data, even in the case of system crashes
or attempts at unauthorized access
• Share data among different users

In short, database systems are designed to manage large volumes of data.
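
The first two design goals in the list above (defining a storage structure and manipulating the data in it) can be sketched as follows. This is a minimal illustration in Python with the standard sqlite3 module; the in-memory database and the updated loan amount of 2600.00 are hypothetical values chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Define a structure for the storage of data.
conn.execute("""
CREATE TABLE Customer_Loan (
    Cust_ID           INTEGER,
    Loan_No           INTEGER PRIMARY KEY,
    Amount_in_Dollars REAL
)""")

# Manipulate the data: add new records ...
conn.execute("INSERT INTO Customer_Loan VALUES (101, 1011, 8755.00)")
conn.execute("INSERT INTO Customer_Loan VALUES (103, 2010, 2555.00)")
after_insert = conn.execute("SELECT COUNT(*) FROM Customer_Loan").fetchone()[0]

# ... modify an existing record (2600.00 is a made-up new amount) ...
conn.execute(
    "UPDATE Customer_Loan SET Amount_in_Dollars = 2600.00 WHERE Loan_No = 2010"
)

# ... and delete a record.
conn.execute("DELETE FROM Customer_Loan WHERE Loan_No = 1011")

remaining = conn.execute(
    "SELECT Cust_ID, Loan_No, Amount_in_Dollars FROM Customer_Loan"
).fetchall()
print(remaining)
```

The application states what should happen to the data; how the records are laid out and found on disk is left to the database system.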


[3] End User: The person who will use the system or for whom a system is developed. Example: a bank
teller is an end user of a bank system.
[4] Manipulation: Data manipulation is the addition of new data to the database or the modification
of existing data in the database.

Figure 1-3 (contents): an end user works with data (Cust_ID, Cust_Last_Name, Account_No,
Account_Type, Bank_Branch) from the Bank Database.

1.3. File System Interface versus DBMS Interface
In the traditional file approach, data is stored in flat files[5] that are maintained by the file
system, under the operating system's control.

Refer to Figure 1-4.

These flat files are accessed through application programs, which the end users use to
perform specific tasks. For example, personnel in the bank's Loan Department use the
Loan_Processing system to process the loan(s) of customer(s).


Figure 1-4: Conventional method of Data Storage

In the DBMS approach, all requests to use the data stored in the database are managed by the
DBMS. The end user can use either application programs or standard SQL[6] to access the
data.

Refer to Figure 1-5.

[5] Flat files: A flat file is a file containing records that have no structured interrelationship.
Files used in structured programming (SP) projects were essentially flat files.
[6] SQL: SQL stands for Structured Query Language. It is a language used by relational databases to
fetch, update and manage data. Relational databases are explained in Section 1.10.2.3.

Figure 1-4 (contents): the application programs Loan_Processing, Fixed_Deposit_Processing and
Transaction_Processing access the flat files Customer_Details.dat, Customer_Loan.dat,
Customer_Fixed_Deposit.dat and Customer_Transaction.dat through the File System.

The application programs are written in some programming language (COBOL, PL/I, C++, etc.)
or in some higher-level fourth generation language[7]. The standard SQL interface is provided
as an integral part of the database system software to access the database.



Figure 1-5: DBMS handles all requests for access to the database

The DBMS acts as a layer of abstraction[8] over the file system.


[7] Fourth Generation Language (4GL): A 4GL is typically non-procedural and designed so that end users
can specify what they want without having to know how the computer will process their requirement.
[8] Abstraction: A simplified representation of something complex. It is not always necessary to
know everything in detail; often it is enough to know only the necessary things.

Figure 1-5 (contents): the application programs Loan_Processing, Fixed_Deposit_Processing and
Transaction_Processing send their requests to the DBMS, which manages the Bank Database
(Customer_Details, Customer_Loan, Customer_Fixed_Deposit, Customer_Transaction) on top of
the File System.

Example: As depicted in Figure 1-6, in the file system interface, the end user uses an
application program written in a high-level language such as COBOL to access the data in
the Customer_Details file. The files are maintained by the file system under the operating
system's control. In the DBMS interface, the end user uses an SQL interface to place a request
with the DBMS to retrieve data from the Customer_Details table[9].
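
The contrast between the two interfaces can be sketched in Python. This is an illustration only: the comma-delimited record layout, the use of io.StringIO in place of a real file and the in-memory SQLite database are assumptions made for the example, not details taken from Figure 1-6.

```python
import io
import sqlite3

# File system interface: the program itself must know the record layout.
# Assumed layout: Cust_ID,Cust_Last_Name,Account_No (comma-delimited).
flat_file = io.StringIO("101,Smith,1020\n105,Jones,2389\n")

def read_from_flat_file(f):
    """Read every record and pick out fields by their position."""
    result = []
    for line in f:
        fields = line.rstrip("\n").split(",")
        result.append((int(fields[0]), int(fields[2])))
    return result

# DBMS interface: the program names the columns it wants and the DBMS
# takes care of locating and decoding the stored records.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer_Details "
    "(Cust_ID INTEGER, Cust_Last_Name TEXT, Account_No INTEGER)"
)
conn.executemany(
    "INSERT INTO Customer_Details VALUES (?, ?, ?)",
    [(101, "Smith", 1020), (105, "Jones", 2389)],
)

def read_from_dbms(connection):
    return connection.execute(
        "SELECT Cust_ID, Account_No FROM Customer_Details ORDER BY Cust_ID"
    ).fetchall()

from_file = read_from_flat_file(flat_file)
from_dbms = read_from_dbms(conn)
print(from_file == from_dbms)
```

Both routes return the same data, but only the first one contains code that depends on how the records are physically written.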


Figure 1-6: File System Interface versus DBMS Interface
1.4. Master and Transaction Files
A master file is used to store relatively static data about some entity[10]. A transaction file
contains relatively transient data about a particular data processing task.

Example: Consider a banking system consisting of two files, the Customer_Details file and the
Customer_Transaction file.

[9] Table: A table is a two-dimensional structure made up of rows and columns. Rows stored in a
table are equivalent to records in flat files.
[10] Entity: An entity is a thing or object in the real world that can be distinguished from
other objects. Example: each person is an entity, and bank accounts can also be considered
entities.

Figure 1-6 (contents):

File System Interface (layers between the end user and the stored data):
• End User
• Application Programs, with an interface through a high-level language
  Ex: READ CUSTOMER_DETAILS-FILE AT END STOP RUN
• Operating System (Disk Manager, File Manager)
• File System (Disk Storage): Customer_Details file, Customer_Loan file

DBMS Interface (layers between the end user and the stored data):
• End User
• Application Programs, with an interface through a query (SQL)
  Ex: SELECT Cust_ID, Account_No FROM Customer_Details;
• DBMS
• Operating System (Disk Manager, File Manager)
• Database (Disk Storage): Customer_Details table, Customer_Loan table

In Figure 1-7, Customer_Details is the master file containing all the information about the
bank's customers. In Figure 1-8, Customer_Transaction is the transaction file containing
information about all the transactions that customers make with the bank.

The Customer_Details file is modified rarely, for example when a new account is created or
when the existing details of a customer change. The Customer_Transaction file, however, is
updated for every deposit or withdrawal made by the customer(s).
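
The different update patterns of the two files can be sketched in plain Python. This is a simplified illustration: the dictionary and list stand in for the master and transaction files, record_transaction is a hypothetical helper, and the opening balance of 5000.00 is an assumption chosen so that the resulting balances agree with the records for account 1020 in Figure 1-8.

```python
# Master data: one relatively static record per account
# (a subset of the Customer_Details fields).
master = {1020: {"Cust_ID": 101, "Bank_Branch": "Downtown"}}

# Transaction data: one new record for every deposit or withdrawal.
transactions = []

def record_transaction(account_no, date, kind, amount, balance_before):
    """Append a transaction record and return the new available balance."""
    if account_no not in master:
        raise KeyError(f"unknown account {account_no}")
    if kind == "Deposit":
        balance_after = balance_before + amount
    elif kind == "Withdrawal":
        balance_after = balance_before - amount
    else:
        raise ValueError(f"unknown transaction type {kind}")
    transactions.append((account_no, date, kind, amount, balance_after))
    return balance_after

opening_balance = 5000.00  # assumed; not stated in the text
b1 = record_transaction(1020, "12-Jan-2005", "Deposit", 5000.00, opening_balance)
b2 = record_transaction(1020, "17-Jan-2005", "Withdrawal", 1500.00, b1)

print(len(master), len(transactions), b2)
```

The master record is never touched by the two transactions, while the transaction data grows by one record each time.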


Figure 1-7: Example of master file - Customer_Details


Figure 1-8: Example of transaction file - Customer_Transaction


Points to Remember:

A master file
• Stores relatively static data about an entity
• Changes rarely

A transaction file
• Stores relatively transient data about a particular data processing task
• Changes frequently, because transactions happen often and in large numbers

Figure 1-7 (contents): Customer_Detail records from the Customer_Details file

Cust_ID  Cust_Last_Name  Cust_Mid_Name  Cust_First_Name  Account_No  Account_Type  Bank_Branch  Cust_Email
101      Smith           A.             Mike             1020        Savings       Downtown     Smith_Mike@yahoo.com
102      Smith           S.             Graham           2348        Checking      Bridgewater  Smith_Graham@rediffmail.com
103      Langer          G.             Justin           3421        Savings       Plainsboro   Langer_Justin@yahoo.com
104      Quails          D.             Jack             2367        Checking      Downtown     Quails_Jack@yahoo.com
105      Jones           E.             Simon            2389        Checking      Brighton     Jones_Simon@rediffmail.com

Figure 1-8 (contents): Customer_Transaction records from the Customer_Transaction file

Account_No  Transaction_Date  Transaction_Type  Transaction_Amount_in_Dollars  Total_Available_Balance_in_Dollars
1020        12-Jan-2005       Deposit           5000.00                        10000.00
2348        14-Jan-2005       Withdrawal        2500.00                        13500.00
3421        14-Jan-2005       Deposit           2000.00                        27234.00
2367        16-Jan-2005       Withdrawal        1200.00                        12456.00
1020        17-Jan-2005       Withdrawal        1500.00                        8500.00

1.5. Traditional Approach to Information Processing
In the traditional file approach each application maintains its own master file and generally
has its own set of transaction files. Files are custom-designed for each application and there
is little sharing of data among the various applications.

Application programs are data-dependent. It is impossible to change the physical
representation (how the data is physically represented in storage) or the access technique
(how it is physically accessed) without affecting the application.

Refer to Figure 1-9.

Although the traditional, file-oriented approach is still widely used, it has some serious
disadvantages. The next section deals with the drawbacks of the traditional approach to
information processing.


Figure 1-9: The traditional approach to information processing

Figure 1-9 (contents): the end user uses the application programs Loan_Processing,
Fixed_Deposit_Processing and Transaction_Processing. The application programs in turn use
master file(s) and transaction file(s): Customer_Details, Customer_Transaction, Customer_Loan
and Customer_Fixed_Deposit.

1.5.1. Disadvantages of the Traditional Approach to Information Processing
The disadvantages of the traditional approach to information processing are discussed below:
• Data Security: The data maintained in the flat file(s) is easily accessible and therefore
not secure.

Example: Consider the banking system. The Customer_Transaction file has details about the
total available balance of all customers. A customer wants information about his or her
account balance, but in a file system it is difficult to give the customer access to only his or
her data in the file. This illustrates that it is difficult to enforce security constraints[11] for
only certain data items in a file.

• Data Redundancy: Often the same information is duplicated in two or more files.
Refer to Figure 1-10.


Figure 1-10: Data Redundancy in files
This duplication of data (redundancy) leads to higher storage and access costs. In addition, it
may lead to data inconsistency[12]. Example: Assume that the same data is repeated in two or
more files. If a change is made to the data in one file, the same change must be made to
the data in the other file(s) as well. If this is not done, the files no longer agree, and a
program that reads the data may obtain an outdated value.


[11] Constraints: Restrictions, limitations.
[12] Inconsistency: Lacking uniformity or agreement.

Figure 1-10 (contents): the customer name and email columns are redundant data, because they
are stored in both files.

Customer_Fixed_Deposit records from the Customer_Fixed_Deposit file:

Cust_ID  Cust_Last_Name  Cust_Mid_Name  Cust_First_Name  Cust_Email               Fixed_Deposit_No  Amount_in_Dollars  Rate_of_Interest_in_Percent
101      Smith           A.             Mike             Smith_Mike@yahoo.com     2011              8055.00            6.5
103      Langer          G.             Justin           Langer_Justin@yahoo.com  2015              2060.00            6.5
104      Quails          D.             Jack             Quails_Jack@yahoo.com    3010              3050.00            6.5

Customer_Detail records from the Customer_Details file: the same five records as in Figure 1-1,
which also hold Cust_Last_Name, Cust_Mid_Name, Cust_First_Name and Cust_Email.

Example: As depicted in Figure 1-10, customer details such as Cust_Last_Name,
Cust_Mid_Name, Cust_First_Name and Cust_Email are stored both in the Customer_Details file
and in the Customer_Fixed_Deposit file. If the email address of one customer, for example
Justin G. Langer, changes from Langer_Justin@yahoo.com to Langer_Justin@rediffmail.com,
the Cust_Email must be updated in both files; otherwise the data becomes inconsistent.
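
The update anomaly in this example can be sketched in plain Python. The two dictionaries stand in for the two files, and inconsistent_customers is a hypothetical helper written for this illustration.

```python
# The same email address is stored in two places (redundant data).
customer_details = {103: {"Cust_Email": "Langer_Justin@yahoo.com"}}
customer_fixed_deposit = {
    103: {"Fixed_Deposit_No": 2015, "Cust_Email": "Langer_Justin@yahoo.com"}
}

def inconsistent_customers(details, deposits):
    """Return the Cust_IDs whose email differs between the two files."""
    return sorted(
        cust_id
        for cust_id, deposit in deposits.items()
        if cust_id in details
        and details[cust_id]["Cust_Email"] != deposit["Cust_Email"]
    )

before = inconsistent_customers(customer_details, customer_fixed_deposit)

# The change is applied to one file only ...
customer_details[103]["Cust_Email"] = "Langer_Justin@rediffmail.com"
after_partial_update = inconsistent_customers(customer_details, customer_fixed_deposit)

# ... and consistency returns only when the second copy is updated as well.
customer_fixed_deposit[103]["Cust_Email"] = "Langer_Justin@rediffmail.com"
after_full_update = inconsistent_customers(customer_details, customer_fixed_deposit)

print(before, after_partial_update, after_full_update)
```

Every program that changes an email address has to remember both copies; nothing in the file system enforces it.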

Although one can design file systems with minimal redundancy, data redundancy is
sometimes preferred. Example: Assume that customer details such as Cust_Last_Name,
Cust_Mid_Name, Cust_First_Name and Cust_Email are not stored in the
Customer_Fixed_Deposit file. If this customer information is required along with the fixed
deposit details, two different files would need to be accessed, and this
would increase the overhead. It is thus preferred to store the information in the
Customer_Fixed_Deposit file itself.

• Data Isolation: Data isolation means that all the related data is not available in one file.
Generally, the data is scattered across various files, and the files may be in different formats.
Writing new application programs to retrieve the appropriate data is therefore difficult.

• Program/Data Dependence: Under the traditional file approach, application programs are
data-dependent. It is impossible to change the physical representation (how the data is
physically represented in storage) or the access technique (how it is physically accessed)
without affecting the application. A change in the physical format of the master file(s),
such as the addition of a data field, requires a corresponding change in all the application
programs that access the master file(s). Consequently, for each application
program that a programmer writes or maintains, the programmer must also focus on data
management issues. There can be no centralized[13] execution of the data management
functions. Data management is scattered among all the application programs.

Example: Consider the banking system. The master file, Customer_Fixed_Deposit, contains
details about the customers' fixed deposit accounts. Refer to Figure 1-11. A customer's fixed
deposit record is described by:
• Cust_ID
• Cust_Last_Name
• Cust_Mid_Name
• Cust_First_Name
• Cust_Email
• Fixed_Deposit_No
• Amount_in_Dollars
• Rate_of_Interest_in_Percent

[13] Centralized: Systems where decision making, flow of data, or the beginning of activities are
initiated at the same central point and disseminated to remote points in the organization.

An application program is available to display all the details about the fixed deposit
accounts of all the customers. Assume that a new data field, the
Fixed_Deposit_Maturity_Date is added to the master file. The application program must also
be altered because it depends on the master file.

Similarly, if the physical format of the master or transaction file changes, for example the
field delimiter or the record delimiter, every application program that depends on that file
must also be altered.
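
The dependence of a program on the physical file format can be sketched in Python. This is an illustration under stated assumptions: the comma-delimited layout, the change to a '|' delimiter and the helper parse_deposit are invented for the example, and the second half uses an in-memory SQLite table only as a contrast to show a query that names its columns.

```python
import sqlite3

def parse_deposit(line):
    """Application code hard-wired to the layout
    Cust_ID,Fixed_Deposit_No,Amount_in_Dollars (comma-delimited)."""
    fields = line.split(",")
    return int(fields[0]), int(fields[1]), float(fields[2])

old_record = "101,2011,8055.00"
new_record = "101|2011|8055.00"  # same data, field delimiter changed to '|'

parsed_old = parse_deposit(old_record)
try:
    parse_deposit(new_record)
    parser_survived = True
except ValueError:
    # The unchanged program can no longer make sense of the file.
    parser_survived = False

# Contrast: a query that names its columns keeps working when a new
# field (Fixed_Deposit_Maturity_Date) is added to the stored structure.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer_Fixed_Deposit "
    "(Cust_ID INTEGER, Fixed_Deposit_No INTEGER, Amount_in_Dollars REAL)"
)
conn.execute("INSERT INTO Customer_Fixed_Deposit VALUES (101, 2011, 8055.00)")
query = (
    "SELECT Cust_ID, Fixed_Deposit_No, Amount_in_Dollars "
    "FROM Customer_Fixed_Deposit"
)
rows_before = conn.execute(query).fetchall()
conn.execute(
    "ALTER TABLE Customer_Fixed_Deposit ADD COLUMN Fixed_Deposit_Maturity_Date TEXT"
)
rows_after = conn.execute(query).fetchall()

print(parser_survived, rows_before == rows_after)
```

In the file-based half, the format knowledge lives inside the application; in the second half it lives in the database system, so the unchanged query still returns the same rows.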


Figure 1-11: Master file - Customer_Fixed_Deposit

• Lack of Flexibility: Traditional systems can retrieve information only for
predetermined requests for data. If the management needs unanticipated data, the
information can perhaps be provided if it is in the files of the system, but extensive
programming is required, which may result in a delay. By the time the
information is made available, it may no longer be required or useful.

Example: Consider the banking system. An application program is available to generate a list
of customer names in a particular area of the city. However the bank manager requires a
list of those customers who have an account balance greater than $10,000.00 and reside in a
particular area of the city. An application program for this purpose does not exist. The bank
manager has two choices:

o Print the list of customer names in a particular area of the city and then manually
find those with an account balance greater than $10,000.00
o Hire an application programmer to write an application program

Both the solutions are cumbersome.
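
For contrast, the following sketch shows how such an unanticipated request could be phrased as a single ad hoc query once the data is held in a database. It is only an illustration in Python with the standard sqlite3 module: the Account table is hypothetical, Bank_Branch is used as a stand-in for the "area of the city" because the example files contain no address field, and the balances are the latest ones shown in Figure 1-8.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Account (
    Account_No         INTEGER PRIMARY KEY,
    Cust_Last_Name     TEXT,
    Bank_Branch        TEXT,   -- stand-in for the area of the city (assumption)
    Balance_in_Dollars REAL
)""")
conn.executemany(
    "INSERT INTO Account VALUES (?, ?, ?, ?)",
    [(1020, "Smith", "Downtown", 8500.00),
     (2348, "Smith", "Bridgewater", 13500.00),
     (3421, "Langer", "Plainsboro", 27234.00),
     (2367, "Quails", "Downtown", 12456.00)],
)

def customers_with_balance_above(branch, minimum):
    """Ad hoc request: customers of one branch with a balance above a limit."""
    return conn.execute(
        """SELECT Cust_Last_Name, Balance_in_Dollars
           FROM Account
           WHERE Bank_Branch = ? AND Balance_in_Dollars > ?
           ORDER BY Cust_Last_Name""",
        (branch, minimum),
    ).fetchall()

print(customers_with_balance_above("Downtown", 10000.00))
```

No new application program has to be written for the new condition; only the query changes.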

• Concurrent Access Anomalies: Many traditional systems allow multiple users to access
and update the same piece of data simultaneously, but the interaction of concurrent
updates may result in inconsistent data. To guard against this possibility, the system must
maintain some form of supervision. Supervision is difficult, however, because the data may be
accessed by many different application programs that have not been coordinated with
one another.

Figure 1-11 (contents): Customer_Fixed_Deposit records from the Customer_Fixed_Deposit file

Cust_ID  Cust_Last_Name  Cust_Mid_Name  Cust_First_Name  Cust_Email               Fixed_Deposit_No  Amount_in_Dollars  Rate_of_Interest_in_Percent
101      Smith           A.             Mike             Smith_Mike@yahoo.com     2011              8055.00            6.5
103      Langer          G.             Justin           Langer_Justin@yahoo.com  2015              2060.00            6.5
104      Quails          D.             Jack             Quails_Jack@yahoo.com    3010              3050.00            6.5

Example: Consider the banking system. Assume that the bank manager is analyzing all the
transactions made by the customers. At about the same time, a customer accesses his or her
account to make a withdrawal. The account is read by the bank manager and updated
by the customer at the same time. This is called concurrent access. Because the customer's
account is being updated while it is being read, the bank manager may read
an incorrect balance.
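
One well-known form of this problem, the lost update, can be sketched with a deterministic simulation in plain Python. The interleaving is written out by hand rather than produced by real threads, and the starting balance, the deposit of 500.00 and the withdrawal of 1500.00 are hypothetical values chosen for the illustration.

```python
# Shared data: the balance of one account, as two programs would see it.
balance = {1020: 8500.00}

def read(account_no):
    return balance[account_no]

def write(account_no, value):
    balance[account_no] = value

# Unsupervised schedule: both programs read before either one writes.
seen_by_deposit = read(1020)
seen_by_withdrawal = read(1020)
write(1020, seen_by_deposit + 500.00)       # deposit program writes 9000.00
write(1020, seen_by_withdrawal - 1500.00)   # withdrawal overwrites it
interleaved_result = balance[1020]

# Supervised (serial) schedule: the second program starts after the first ends.
balance[1020] = 8500.00
write(1020, read(1020) + 500.00)
write(1020, read(1020) - 1500.00)
serial_result = balance[1020]

print(interleaved_result, serial_result)
```

In the unsupervised schedule the deposit disappears, because the withdrawal was computed from a balance that was already out of date.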

These difficulties prompted the development of database systems.


Points to Remember:

Disadvantages of the traditional file approach:
• Data Security: Data is easily accessible by all and therefore not secure

• Data Redundancy: The same data is duplicated in two or more files, which
may lead to update anomalies

• Data Isolation: All the related data is not available in one file. Thus
writing a new application program is difficult

• Program / Data Dependence: Application programs are data-dependent.
It is impossible to change the physical representation (how
the data is physically represented in storage) or the access technique
(how it is physically accessed) without affecting the application

• Lack of Flexibility: Only predetermined requests for information can
be met. It is not flexible enough to satisfy unanticipated queries

• Concurrent Access Anomalies: The same piece of data is allowed to be
updated simultaneously, which leads to inconsistencies
1.6. Why DBMS?
A DBMS:
• Makes application programs and queries[14] data-independent: they do not depend on any
one particular physical representation of data in secondary storage or on any one access
technique

• Allows data to be shared among different users. Users are also able to access the
database concurrently without facing the issue of inconsistent data

• Controls redundancy and inconsistency

• Provides secure access to the database

• Enforces integrity constraints[15] (also known as business rules) by preventing the entry
of invalid information into the database

• Enables backup and recovery from system crashes

[14] Queries: A query is a request that a user makes to the database.
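
The enforcement of integrity constraints mentioned in the list above can be sketched as follows. This is a minimal illustration in Python with the standard sqlite3 module; the specific rules (a loan amount must be positive, a loan number must be unique, Cust_ID must be present) and the helper try_insert are assumptions made for the example and are not business rules stated in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Customer_Loan (
    Cust_ID           INTEGER NOT NULL,
    Loan_No           INTEGER PRIMARY KEY,
    Amount_in_Dollars REAL CHECK (Amount_in_Dollars > 0)
)""")

def try_insert(cust_id, loan_no, amount):
    """Return True if the DBMS accepted the record, False if it rejected it."""
    try:
        conn.execute(
            "INSERT INTO Customer_Loan VALUES (?, ?, ?)",
            (cust_id, loan_no, amount),
        )
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert(101, 1011, 8755.00)         # valid record
negative_amount = try_insert(103, 2010, -50.00)   # violates the CHECK rule
duplicate_loan = try_insert(104, 1011, 3050.00)   # Loan_No 1011 already exists
stored = conn.execute("SELECT COUNT(*) FROM Customer_Loan").fetchone()[0]

print(accepted, negative_amount, duplicate_loan, stored)
```

The rules are declared once, with the table, so every program that inserts data is subject to them.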

1.7. Types of Databases
There are two generic database architectures: centralized and distributed. The fundamental
differences between the two architectures are:

Centralized (refer to Figure 1-12):
• All the data is located at a single site
• Allows for greater control over accessing and updating data
• Vulnerable to failure, because it depends on the availability of resources at the
central site only

Example: Consider a banking system in which a customer withdraws money from an ATM. The
bank holds account information for every customer, and this information must be available
at every ATM. Instead of keeping a copy at every ATM, the bank can choose to keep all this
information at one central place and share it with the ATMs through a network.

Distributed (refer to Figure 1-13):
• The database is stored on several computers (personal computers or mainframe systems)
• The computers in a distributed system can communicate with each other via
various communication media, e.g. high-speed networks or telephone lines
• Distributed databases are geographically separated and managed
• Distributed databases are separately administered
• Distributed databases have a slower interconnection

Example: Consider a multinational banking system. The head office of the bank is located in
Chicago, and the branches are in Melbourne and Tokyo. The bank database is distributed
across these branches. The branch offices are connected through a network.

[15] Integrity Constraints: A set of restrictions for the correctness and accuracy of data.

Figure 1-12: Centralized Database
Figure 1-12 (contents): several ATMs are connected to a single Database Server over a telecom
line, LAN or direct line.

Figure 1-13: Distributed Databases
Figure 1-13 (contents): workstations and database servers in Tokyo, Chicago and Melbourne are
connected through a network.

Distributed databases can be classified as homogeneous[16] or heterogeneous[17].

[16] Homogeneous: All the same, uniform, harmonized.
[17] Heterogeneous: Varied, mixed, diverse.

In a distributed system, it is easy to differentiate between local and global transactions. A
transaction is said to be local if it accesses data only at the single site at which the
transaction was initiated. A global transaction, on the other hand, accesses data at a site
other than the one at which the transaction was initiated, or at several different sites.

Example: Consider the multinational banking system where the bank's head office is located
in Chicago and the branch offices are located in Melbourne and Tokyo. The branch offices
are connected through a network. Each branch office has its own computer and database
consisting of all the accounts maintained at that branch.
Refer to Figure 1-14.
The head office maintains information about all the branches of the bank. Consider a
transaction to add $50 to account number 1020 located at the Downtown bank branch in
Tokyo. If the transaction was initiated at the Downtown bank branch in Tokyo, then it is
considered a local transaction; otherwise, it is considered a global transaction. A
transaction to transfer $50 from account 1020 to account 2389, which is located at the
Brighton bank branch in Melbourne, is a global transaction, since accounts at two
different sites are accessed as a result of its execution.
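
The rule used in this example can be sketched as a small Python function. The mapping of accounts to sites follows the example above; the function name classify and the dictionary ACCOUNT_SITE are hypothetical names introduced for the illustration.

```python
# Site at which each account is held, as in the example above.
ACCOUNT_SITE = {1020: "Tokyo", 2389: "Melbourne"}

def classify(initiating_site, accounts):
    """Return 'local' if every accessed account is held at the site where
    the transaction was initiated, otherwise 'global'."""
    sites_accessed = {ACCOUNT_SITE[account_no] for account_no in accounts}
    return "local" if sites_accessed == {initiating_site} else "global"

# Adding $50 to account 1020, initiated in Tokyo.
print(classify("Tokyo", [1020]))
# The same deposit, initiated at the head office in Chicago.
print(classify("Chicago", [1020]))
# Transferring $50 from account 1020 to account 2389.
print(classify("Tokyo", [1020, 2389]))
```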

Thus in a distributed database system:

• The various sites are aware of one another
• Each site provides an environment to execute both local and global transactions
• If every site runs the same distributed database management software, the system is called
a homogeneous distributed database system
• If different sites run different database management software, it is difficult to
manage global transactions. Such systems are called multi-database systems or
heterogeneous distributed database systems


Figure 1-14: Customer_Details file
1.8. Three level architecture for a DBMS
Most commercial databases are based on a three-level architecture model called the
ANSI/SPARC model (American National Standards Institute/Standards Planning and
Requirements Committee).

Refer to Figure 1-15.

Figure 1-14 (contents): Customer_Detail records from the Customer_Details file (the same five
records as in Figure 1-1). For example, account 1020 is held at the Downtown branch and
account 2389 at the Brighton branch.

Figure 1-15: The three levels of the architecture

The overall design of the database is called the database schema. Schemas are not changed
frequently. In general, database systems support one internal schema, one conceptual
schema and several external schemas.

External / View level: Many users of the database system are not concerned with all the
information in the database. Instead, they need to access only a portion of the database. The
external level of abstraction simplifies the end user's interaction with the system. The system
may provide many views of the same database.

Conceptual / Logical level: The conceptual level describes the data stored in the
database and the relationships among that data. This level is used by the Database
Administrator(s)[18] (DBA), who decide what information must be kept in the database.

Internal / Physical level: The internal level is the lowest level of abstraction and describes
the data storage and access methods. Database Administrator(s) may be aware of certain
details of the physical organization of the data.

[18] Database Administrator: The DBA administers the DBMS and is in charge of creating, maintaining
and modifying all three levels of the DBMS. The DBA also controls the allocation of system
resources, grants/revokes privileges to/from users and ensures the consistency of the database.

Figure 1-15 (contents): the External / View Level (individual user views) holds External
Schema A, External Schema B, ... External Schema Z; the Conceptual level (common user view)
holds the Conceptual Schema; the Internal level (storage view) holds the Internal Schema.

Example: Consider a banking system. It uses a bank database with the following tables:
• Customer_Details table
• Customer_Transaction table

At the internal level, a Customer_Details record or a Customer_Transaction record is stored
as a block of consecutive storage locations (for example, words or bytes). Just as a language
compiler hides this level of detail from programmers, the database system hides
the lowest-level storage details from database programmers.

At the conceptual level, the table definition (the attribute[19] data types and widths)
and the relationships among the data are described.

Finally, at the external level, several views[20] of the database are defined, and database end
users are able to see these views. The views hide the conceptual-level details. They also
prevent users from accessing other parts of the database and hence provide a security
mechanism. For example, tellers in the bank are able to work with only that part of the
database that has data on customer accounts; they cannot see information such as the salaries
of bank employees.
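
An external view of this kind can be sketched as follows, using Python with the standard sqlite3 module. The view name Teller_View and the choice of columns it exposes are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer_Details (
    Cust_ID        INTEGER PRIMARY KEY,
    Cust_Last_Name TEXT,
    Account_No     INTEGER,
    Bank_Branch    TEXT,
    Cust_Email     TEXT
);
INSERT INTO Customer_Details VALUES
    (101, 'Smith', 1020, 'Downtown', 'Smith_Mike@yahoo.com'),
    (105, 'Jones', 2389, 'Brighton', 'Jones_Simon@rediffmail.com');

-- External level: a hypothetical view for tellers that leaves out Cust_Email.
CREATE VIEW Teller_View AS
    SELECT Cust_ID, Cust_Last_Name, Account_No, Bank_Branch
    FROM Customer_Details;
""")

cursor = conn.execute("SELECT * FROM Teller_View ORDER BY Cust_ID")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()

print(view_columns)
print(view_rows)
```

SQLite has no user accounts or privileges, so this sketch only shows how a view narrows what is visible. In a multi-user DBMS, the teller would in addition be granted access to the view and not to the underlying tables.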

Refer to Figure 1-16 for the detailed system architecture.

The database management system (DBMS) is the software that handles all access to the
database. Conceptually, what happens is the following:
• A user issues an access request for data (typically using SQL)
• The DBMS receives that request and analyzes it
• The DBMS checks the external schema, the external/conceptual mapping, the conceptual
schema, the conceptual/internal mapping and the storage structure definition
• The DBMS executes the required operations on the stored database



[19] Attribute: The literal meaning is quality, characteristic, trait or feature. Entities are
described in a database by a set of attributes. For example, in the bank system, Cust_ID,
Cust_Email, etc. describe the Customer_Detail entity set.
[20] View: A view is a virtual table in the database defined by a query. For more details on views,
see Chapter 4.

Figure 1-16: Detailed System Architecture

Figure 1-16 (contents): the users (Mike, Graham, Jack, Justin) work with External View A and
External View B, which are described by the external schemas. The external/conceptual mapping
relates these views to the conceptual view (conceptual schema), and the conceptual/internal
mapping relates the conceptual view to the stored database (internal view), which is described
by the storage structure definition (internal schema). The schemas and mappings are built and
maintained by the Database Administrator (DBA), and all access passes through the DBMS.

Figure 1-17: An example of the three levels

External:
    Customer_Loan
    Cust_ID : 101
    Loan_No : 1011
    Amount_in_Dollars : 8755.00

Conceptual:
    CREATE TABLE Customer_Loan (
        Cust_ID            NUMBER(4),
        Loan_No            NUMBER(4),
        Amount_in_Dollars  NUMBER(7,2))

Internal:
    Cust_ID            TYPE = BYTE (4), OFFSET = 0
    Loan_No            TYPE = BYTE (4), OFFSET = 4
    Amount_in_Dollars  TYPE = BYTE (7), OFFSET = 8

The figure above depicts the three levels of the DBMS architecture. The external view is how the
customer, Mike A. Smith, views the data. The conceptual view is how the DBA views it. The internal
view is how the data is actually stored.
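
The relationship between the conceptual and external levels can be sketched with a table and a view. This is a minimal illustration, not the course's own code: it assumes SQLite (through Python's standard `sqlite3` module) as a stand-in for a full RDBMS, uses `REAL` where the figure uses Oracle's `NUMBER`, and the view name `Loans_Of_101` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory stand-in for a real DBMS

# Conceptual level: the table definition maintained by the DBA.
conn.execute("""
    CREATE TABLE Customer_Loan (
        Cust_ID           INTEGER,
        Loan_No           INTEGER,
        Amount_in_Dollars REAL
    )
""")
conn.executemany(
    "INSERT INTO Customer_Loan VALUES (?, ?, ?)",
    [(101, 1011, 8755.00), (103, 2010, 2555.00)],
)

# External level: a hypothetical view that exposes only customer 101's loans.
conn.execute("""
    CREATE VIEW Loans_Of_101 AS
    SELECT Loan_No, Amount_in_Dollars
    FROM Customer_Loan
    WHERE Cust_ID = 101
""")

# The user of the view never sees other customers' rows or the Cust_ID column.
external_rows = conn.execute("SELECT * FROM Loans_Of_101").fetchall()
print(external_rows)
```

The internal level (byte layout and offsets) does not appear in the code at all: it is handled by the storage engine, which is the point of separating the three levels.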

1.9. DBMS Users
The DBMS users, depending on their level of interaction with the system, fall into one of
three categories.
• End User: End users deal only with the highest level of abstraction. End users may not
be concerned with or even aware of the details of the DBMS. Typically, the end user is
involved in updates to the database or queries on the database.
• Application Programmer: The application programmer is responsible for writing database
application programs in a programming language such as COBOL, PL/I or C++, or in
a higher-level fourth generation language. The application programs access the
database by issuing the appropriate requests to the DBMS.
• Database Administrator (DBA): The DBA can be a single person or a team.
The functions of the DBA include the following:
    o Definition of the Conceptual Schema: It is the DBA's job to decide exactly
    what information is to be held in the database. The DBA identifies the entities
    and the information to be recorded about those entities. This process is usually
    referred to as logical database design. Once the DBA has decided the content
    of the database at an abstract level, the DBA creates the corresponding conceptual
    schema.
    o Definition of the Internal Schema: The DBA must also decide how the data is
    to be represented in the database. This process is usually referred to as
    physical database design. Having done the physical design, the DBA must then
    create the corresponding storage structure definition. In addition, the DBA
    must also define the associated conceptual/internal mapping.
    o Liaising with users: The DBA liaises with users to ensure that the data they
    need is available and to write the necessary external schemas. In addition, the
    DBA must also define the associated external/conceptual mapping.
    o Granting of authorization for data access: The granting of different types of
    authorization (read, write, etc.) allows the DBA to regulate which parts of the
    database various users can access.
    o Defining integrity constraints: The data values stored in the database must
    satisfy certain consistency constraints. The DBA specifies these constraints so
    that the DBMS can enforce them.
Refer to Figure 1-18.

Figure 1-18: Users of DBMS


1.10. Data Models
A data model is a conceptual tool to describe data, relationships among data, semantics of
data and consistency constraints of the data.

Two of the widely used data models are discussed in the next sections.

1.10.1. Object Based Logical Model
The Entity-Relationship Model (E-R Model) is a widely known object based logical model. Object
based logical models are used to describe data at the conceptual and view levels.
The E-R Model is based on a perception of the real world as a collection of
basic objects, called entities, and of relationships among these entities.
The E-R Model is covered in detail in Chapter 2.



Figure 1-19: Entity Relationship Model

[Figure: an E-R diagram with two entity sets. Customer_Details has the attributes Cust_ID, Cust_Last_Name, Cust_Mid_Name, Cust_First_Name, Account_No, Account_Type, Bank_Branch and Cust_Email. Customer_Loan has the attributes Cust_ID, Loan_No and Amount_in_Dollars. The legend identifies the symbols for an entity set, an attribute, and the lines that connect attributes to entity sets and entity sets to one another.]

1.10.2. Record Based Logical Model
Record based logical models are used to describe data at the conceptual and view levels. They
are used both to specify the overall logical structure of the database and to provide a
higher-level description of the implementation. There are three widely accepted record based
logical models.

1.10.2.1. Hierarchical Data Model
The hierarchical data model organizes data in a tree-like structure. This hierarchy is also
called a parent-child hierarchy. This structure implies that a record can have repeating
information (generally in the child data segments).

Data is represented by a collection of records (record types). A record type corresponds
to a table in the relational model, and the individual records correspond to the rows of
the table. Parent-child relationships are used to create links between these record types.

In a hierarchical database the parent-child relationship is one-to-many: a parent segment
can have many child segments, but a child segment can have only one parent segment.

IBM's Information Management System (IMS) is a popular example of a DBMS based on the
hierarchical model. Databases based on the hierarchical model were popular from the 1960s
to the 1970s.

Example: Consider the banking system. Figure 1-20 shows the hierarchical representation of
Customer_Details and Customer_Loan records from Customer_Details and Customer_Loan
files respectively.

Note: Loan (Loan_No: 1011) is shown as taken jointly by Mike A. Smith and Graham S. Smith
to explain the difference between hierarchical and network Model.


Figure 1-20: Hierarchical Model

ROOT
    101 Smith Mike A. 1020 Savings Downtown Smith_Mike@yahoo.com
        1011 8755.00
    102 Smith Graham S. 2348 Checking Bridgewater Smith_Graham@rediffmail.com
        1011 8755.00
    103 Langer Justin G. 3421 Savings Plainsboro Langer_Justin@yahoo.com
        2010 2555.00
        2015 2000.00

1.10.2.2. Network Data Model
The network model permits many-to-many relationships in data. The Conference on Data
Systems Languages (CODASYL) formally defined the network model in 1971.

Data in the network model is represented by a collection of records, and the relationships
among data are represented by links (pointers). The records in the database are organized as
collections of graphs. Example: IDMS.

Example: Refer to Figure 1-21. Assume that the loan (Loan_No: 1011) is taken jointly by two
customers (Mike A. Smith and Graham S. Smith).

In the hierarchical model (Figure 1-20), the loan information has to be repeated for each
customer individually because the model does not permit many-to-many relationships. The
parent-child relationship is one-to-many.

However, in the network model, which allows many-to-many relationships, the loan
information is stored only once and both customers can refer to it.
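
The practical difference between repeating a record and sharing it can be sketched with plain Python dictionaries. This is only an analogy under stated assumptions: the dictionaries stand in for parent and child segments, and the repayment of 755.00 is a hypothetical update that is not part of the course example.

```python
# Hierarchical: each customer (parent) owns a private copy of its loans (children).
hierarchical = {
    101: [{"Loan_No": 1011, "Amount_in_Dollars": 8755.00}],
    102: [{"Loan_No": 1011, "Amount_in_Dollars": 8755.00}],  # repeated copy
}

# Network: the loan record is stored once and both customers link to it.
joint_loan = {"Loan_No": 1011, "Amount_in_Dollars": 8755.00}
network = {101: [joint_loan], 102: [joint_loan]}

# Hypothetical update: record a repayment of 755.00 through customer 101 only.
hierarchical[101][0]["Amount_in_Dollars"] -= 755.00
network[101][0]["Amount_in_Dollars"] -= 755.00

hier_amounts = [hierarchical[c][0]["Amount_in_Dollars"] for c in (101, 102)]
net_amounts = [network[c][0]["Amount_in_Dollars"] for c in (101, 102)]

print(hier_amounts)  # the two copies now disagree
print(net_amounts)   # both customers see the same updated record
```

With repeated copies, every copy must be updated to keep the data consistent. With a shared record, a single update is visible through every link.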


Figure 1-21: Network Model

Customer records:
    101 Smith Mike A. 1020 Savings Downtown Smith_Mike@yahoo.com
    102 Smith Graham S. 2348 Checking Bridgewater Smith_Graham@rediffmail.com
    103 Langer Justin G. 3421 Savings Plainsboro Langer_Justin@yahoo.com
    104 Quails Jack D. 2367 Checking Downtown Quails_Jack@yahoo.com
    105 Jones Simon E. 2389 Checking Brighton Jones_Simon@rediffmail.com
Loan records:
    1011 8755.00
    2010 2555.00
    2056 3050.00
    2015 2000.00
Links (following the Customer_Loan data and the joint loan assumption): customers 101 and 102
both link to loan 1011, customer 103 links to loans 2010 and 2015, and customer 104 links to
loan 2056.

1.10.2.3. Relational Data Model
The relational model uses a set of tables (relations), each of which is assigned a unique name,
to represent both data and the relationships among those data.

A table has a specified number of columns but can have any number of rows. Rows stored in a
table resemble records from flat files. A row in a table represents a relationship among a set
of values. For example, in Figure 1-22, a row in the Customer_Loan table gives the details of a
loan taken by a customer: the customer (Cust_ID: 101) has taken a loan (Loan_No: 1011) of
amount (Amount_in_Dollars: 8755.00).

Since a table is a collection of such relationships, there is a close correspondence between
the concept of table and the mathematical concept of relation, from which the relational
data model takes its name.


Figure 1-22: Relations - Customer_Loan and Customer_Details

Customer_Loan records from Customer_Loan table

Cust_ID  Loan_No  Amount_in_Dollars
101      1011     8755.00
103      2010     2555.00
104      2056     3050.00
103      2015     2000.00

Customer_Detail records from Customer_Details table

Cust_ID  Cust_Last_Name  Cust_Mid_Name  Cust_First_Name  Account_No  Account_Type  Bank_Branch  Cust_Email
101      Smith           A.             Mike             1020        Savings       Downtown     Smith_Mike@yahoo.com
102      Smith           S.             Graham           2348        Checking      Bridgewater  Smith_Graham@rediffmail.com
103      Langer          G.             Justin           3421        Savings       Plainsboro   Langer_Justin@yahoo.com
104      Quails          D.             Jack             2367        Checking      Downtown     Quails_Jack@yahoo.com
105      Jones           E.             Simon            2389        Checking      Brighton     Jones_Simon@rediffmail.com

In both tables, the column headings are the attributes (columns or fields) and each line of
values is a row (record or tuple). The number of records (rows or tuples) is the cardinality
of the relation. The number of attributes (columns or fields) is the degree of the relation.

1.10.2.4. Structural Terminology

Formal Relational Term      Informal Equivalence
Relation                    Table
Tuple                       Row or Record
Cardinality of a Relation   Number of Rows
Attribute                   Column or Field
Degree of a Relation        Number of Columns
Primary Key                 Unique Identifier
Domain                      A pool of values from which the values of specific
                            attributes of specific relations are taken

1.11. RDBMS
Relational Database: Any database in which the data is logically organized based on the
relational model.

RDBMS: A DBMS that manages a relational database. An RDBMS is a category of DBMS that
stores data in related tables.
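
As a small illustration of an RDBMS managing a relation, and of the terms degree and cardinality from Section 1.10.2.4, the Customer_Loan relation can be loaded and measured. The sketch below assumes SQLite (through Python's standard `sqlite3` module) as a convenient RDBMS; the column types are SQLite's, not Oracle's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer_Loan ("
    "Cust_ID INTEGER, Loan_No INTEGER, Amount_in_Dollars REAL)"
)
conn.executemany(
    "INSERT INTO Customer_Loan VALUES (?, ?, ?)",
    [
        (101, 1011, 8755.00),
        (103, 2010, 2555.00),
        (104, 2056, 3050.00),
        (103, 2015, 2000.00),
    ],
)

cursor = conn.execute("SELECT * FROM Customer_Loan")
degree = len(cursor.description)      # number of attributes (columns)
cardinality = len(cursor.fetchall())  # number of tuples (rows)

print("degree:", degree, "cardinality:", cardinality)
```

The degree changes only when the table definition changes, whereas the cardinality changes every time a row is inserted or deleted.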
1.12. Some Popular RDBMS packages

RDBMS Package   Company / Corporation
Oracle          Oracle Corp.
Sybase          Sybase Inc.
Informix        Informix Software Inc.
MySQL           Open source
DB2             IBM
Ingres II       Computer Associates International Inc.
SQL Server      Microsoft

1.13. Application Areas of RDBMS

Databases are widely used in real life applications such as:

1. Airlines: For reservations and schedule information.

2. Banking: For customer information, accounts, loans and banking transactions.

3. Universities: For student information, course registrations and grades.

4. Telecommunications: For keeping track of calls made by customers, generating
monthly bills of the customer and storing information about the communication
networks.

5. Sales: For customer, product and purchase information in any industry.


1.14. Keys
Candidate Key: A candidate key of a table is defined as a set of one or more attributes of
the table that can uniquely identify a row in a given table.

Example: Consider the Customer_Details table shown in Figure 1-23.


Figure 1-23: Customer_Details Table

Customer_Detail records from Customer_Details table

Cust_ID  Cust_Last_Name  Cust_Mid_Name  Cust_First_Name  Account_No  Account_Type  Bank_Branch  Cust_Email
101      Smith           A.             Mike             1020        Savings       Downtown     Smith_Mike@yahoo.com
102      Smith           S.             Graham           2348        Checking      Bridgewater  Smith_Graham@rediffmail.com
103      Langer          G.             Justin           3421        Savings       Plainsboro   Langer_Justin@yahoo.com
104      Quails          D.             Jack             2367        Checking      Downtown     Quails_Jack@yahoo.com
105      Jones           E.             Simon            2389        Checking      Brighton     Jones_Simon@rediffmail.com

Assumptions:
• One customer can have only one account.
Example: As depicted in Figure 1-23, customer Mike A. Smith has a Savings account with
Account_No: 1020. Similarly, customer Justin G. Langer has a Savings account with
Account_No: 3421.

• An account can belong to only one customer.
Example: Account_No: 2367 belongs to Jack Quails.

• No two rows can have the same combination of values in the Cust_Last_Name and
Cust_First_Name attributes. If two rows have the same value for Cust_Last_Name,
they differ in their values for Cust_First_Name.


In the Customer_Details table, there are four candidate keys. Three of them are
simple candidate keys and one is a composite candidate key.

Simple Candidate Key: A candidate key comprising one attribute only.
Example:
• Account_No
• Cust_ID
• Cust_Email

Composite Candidate Key: A candidate key comprising two or more attributes.
Example: {Cust_Last_Name, Cust_First_Name}.
Cust_Last_Name alone is not sufficient to distinguish between rows in the table, but together
with Cust_First_Name it can distinguish between rows in the table. Their combination
constitutes a candidate key.

Invalid Candidate Key: A candidate key should comprise a set of attributes that can
uniquely identify a row, and no proper subset of those attributes should possess the unique
identification property. Example: the combination {Account_No, Account_Type} is an
invalid candidate key. Although the attributes Account_No and Account_Type together can
distinguish rows, their combination does not form a candidate key, since the attribute
Account_No alone is a candidate key.

Candidate keys are identified during the design of the database.
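
The uniqueness and minimality checks behind the examples above can be sketched in a few lines of Python. The helper names `is_unique` and `is_minimal` are hypothetical, and only a subset of the Customer_Details columns is used. Note the limits of this kind of check: sample rows can show that a set of attributes is not a key, but they can never prove that it is one, because a key must hold for every row the business rules allow.

```python
from itertools import combinations

COLUMNS = ("Cust_ID", "Cust_Last_Name", "Cust_First_Name",
           "Account_No", "Account_Type")
ROWS = [
    (101, "Smith", "Mike", 1020, "Savings"),
    (102, "Smith", "Graham", 2348, "Checking"),
    (103, "Langer", "Justin", 3421, "Savings"),
    (104, "Quails", "Jack", 2367, "Checking"),
    (105, "Jones", "Simon", 2389, "Checking"),
]


def is_unique(attrs):
    """True if no two sample rows agree on every attribute in attrs."""
    idx = [COLUMNS.index(a) for a in attrs]
    projected = [tuple(row[i] for i in idx) for row in ROWS]
    return len(set(projected)) == len(projected)


def is_minimal(attrs):
    """True if attrs is unique and no proper non-empty subset is unique."""
    if not is_unique(attrs):
        return False
    return not any(
        is_unique(subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )


print(is_unique(("Cust_Last_Name",)))                    # two rows share "Smith"
print(is_unique(("Cust_Last_Name", "Cust_First_Name")))  # unique in combination
print(is_minimal(("Account_No",)))                       # simple candidate key
print(is_minimal(("Account_No", "Account_Type")))        # Account_Type is redundant
```

In this five-row sample Cust_First_Name happens to be unique on its own, so `is_minimal` would wrongly reject {Cust_Last_Name, Cust_First_Name}. That first names may repeat is known from the stated assumptions, not from the data, which is why candidate keys are identified at design time.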

Primary Key: During the creation of the table (the implementation phase), the database
designer chooses one of the candidate keys (from the several available) to uniquely
identify rows in the Customer_Details table. The candidate key so chosen is called the
primary key.

Example: The database designer chooses Account_No to differentiate between rows in the
Customer_Details table. Account_No is the primary key for the Customer_Details table.
Refer to Figure 1-24.

Entity integrity constraint: The primary key of a table must always be unique and not null. The
attributes which constitute the primary key cannot have duplicate values across the rows of the
table, and it is mandatory to provide a value for every primary key attribute. This constraint is
referred to as the entity integrity constraint. A null value represents an unknown
value; it is not a blank character or a zero value.
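
How a DBMS enforces the entity integrity constraint can be sketched as follows. This is an illustration under stated assumptions: SQLite (through Python's standard `sqlite3` module) stands in for the RDBMS, the table is cut down to two columns, and the helper `try_insert` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQLite needs NOT NULL spelled out here; many RDBMSs imply it for a primary key.
conn.execute("""
    CREATE TABLE Customer_Details (
        Account_No     INT NOT NULL PRIMARY KEY,
        Cust_Last_Name TEXT
    )
""")
conn.execute("INSERT INTO Customer_Details VALUES (1020, 'Smith')")


def try_insert(account_no, last_name):
    """Return 'ok', or the constraint error raised by the insert."""
    try:
        conn.execute(
            "INSERT INTO Customer_Details VALUES (?, ?)",
            (account_no, last_name),
        )
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


results = [
    try_insert(2348, "Smith"),   # new key value: accepted
    try_insert(1020, "Jones"),   # duplicate key value: rejected
    try_insert(None, "Langer"),  # NULL key value: rejected
]
print(results)
```

The second insert shows that a repeated non-key value (the last name Smith) is acceptable, while a repeated or missing primary key value is not.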

The primary key is chosen from among the candidate keys. Preference is
given to a candidate key that consists of the fewest attributes.

Example: It is preferable to select the candidate key {Account_No} as the primary key rather
than the candidate key {Cust_Last_Name, Cust_First_Name}.


Points to Remember:
• A candidate key is used to uniquely identify a row in a given table. A
candidate key can be a set of one or more attributes
• A table can have more than one candidate key
• Candidate keys are identified during the design phase
• One of the candidate keys is chosen as the primary key by the database
designer while creating the table
• It is preferred to select a candidate key with a minimal
number of attributes to function as the primary key


Guidelines to select a primary key:
• Give preference to numeric column(s). Searches generally perform
better when the primary key is numeric
• Give preference to a single attribute. Searches generally perform
better with a single attribute primary key than with a
composite attribute primary key
• Give preference to the minimal composite key. A composite key is a
collection of two or more attributes. Example: if the candidate keys
are {x1,x2,x3} and {y1,y2}, the composite key {y1,y2} is the minimal
composite key and is therefore preferred as the primary key
• Primary keys are chosen according to business convenience


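Figure 1-24: Customer_Details Table with Account_No as Primary Key

[Figure: the Customer_Details table of Figure 1-23, with the Account_No column marked as the primary key of the table.]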

Foreign Key: A foreign key is a set of one or more attributes in a table whose
values must match the values of a candidate key in the same or another table.
The foreign key attribute(s) can have duplicate or null values.

Example: Consider the banking system where the details of the customers of the bank are
stored in the Customer_Details table. Whenever a customer makes a transaction, for
example a deposit or a withdrawal of funds, the transaction is recorded in the
Customer_Transaction table.

A transaction is allowed only if the customer has an account in the bank. The account
number information is stored in the Customer_Details table and is referred to
for every transaction. If the account number does not exist, the transaction is not
allowed.


Figure 1-25: Demonstration of Referential Integrity

Customer_Transaction records from Customer_Transaction table

Account_No  Transaction_Date  Transaction_Type  Transaction_Amount_in_Dollars  Total_Available_Balance_in_Dollars
1020        12-Jan-2005       Deposit           5000.00                        10000.00
2348        14-Jan-2005       Withdrawal        2500.00                        13500.00
3421        14-Jan-2005       Deposit           2000.00                        27234.00
2367        16-Jan-2005       Withdrawal        1200.00                        12456.00
1020        17-Jan-2005       Withdrawal        1500.00                        8500.00

Account_No in the Customer_Transaction table is the foreign key referring to Account_No, a
candidate key of the Customer_Details table (see Figure 1-23 for its records).

In the example above, the Account_No attribute of the Customer_Transaction table is the foreign
key referring to Account_No of the Customer_Details table. The foreign key attributes can have
duplicate values: Account_No 1020 occurs in two rows of the table.
The foreign key attributes can also have NULL values.

As per the referential integrity constraint, every non-null value of a foreign key attribute
must match a value of the corresponding candidate key.

The relation that contains the foreign key is the referencing relation (child table) and the
relation that contains the corresponding candidate key is the referenced relation (parent
table).

Invalid foreign key value: A value of 1050 in the Account_No attribute of the
Customer_Transaction table is invalid because a value of 1050 is not present in the
Account_No attribute of the Customer_Details table.

Self Referencing: A table might include a foreign key, the values of which are required to
match the values of a candidate key in the same table. This is known as self referencing.


Figure 1-26: An example of Self Referencing

Records from Employee_Manager table

Employee_ID  Employee_Last_Name  Employee_Mid_Name  Employee_First_Name  Department  Grade  Manager_ID  Employee_Email
2345         Atherton            S.                 Cindy                HR          1      NULL        Atherton_Cindy@yahoo.com
3556         George              A.                 Henry                Finance     1      NULL        George_Henry@rediffmail.com
3620         Jackson             G.                 Matt                 Design      1      NULL        Jackson_Matt@samsonite.co.in
22789        Stevenson           S.                 Crystal              HR          2      2345        Stevenson_Crystal@mag.com
23456        Smith               A.                 Luther               Finance     2      3556        Smith_Luther@yahoo.com
30456        Langer              C.                 Christiana           HR          3      2345        Langer_Christiana@rediffmail.com
31234        Frost               J.                 Robert               Finance     3      3556        Frost_Robert@training.com
32345        Austen              L.                 Jane                 Design      2      3620        Austen_Jane@yahoo.com

Employee_ID is the candidate key for the table. Manager_ID is the foreign key referencing
Employee_ID.

As can be seen from Figure 1-26, each employee belongs to a department and each department
has a manager. For example, Cindy S. Atherton, Henry A. George and Matt G. Jackson are the
managers of the HR, Finance and Design departments respectively, and their Manager_ID is NULL.
A NULL value is an unknown or inapplicable value; it does not mean a blank value or zero.

All employees, including the managers, have a unique Employee_ID. The Manager_ID attribute
of the Employee_Manager table can only hold a value that already exists in the Employee_ID
attribute (or NULL). Manager_ID is therefore the foreign key referencing the candidate key,
Employee_ID.

Example: Assume that the employees present in the organization have to undertake a
course. Details of the courses are available in the table Course_Description as shown in
Figure 1-27. Some courses have a prerequisite; for example, an employee must go through the
Computer Hardware and System Software Concepts course before taking up the Programming
Fundamentals course. Here, the attribute Prerequisite is the foreign key referencing the
candidate key, Course_ID.


Figure 1-27: Another Example of Self Referencing Table

Course_ID  Course_Title                                  Duration_in_Days  Prerequisite
121        Computer Hardware & System Software Concepts  4                 NULL
122        Programming Fundamentals                      7                 121
123        Relational Database Management System         7                 122
124        User Interface Design                         1                 122
125        Object Oriented Concepts                      1                 122

Course_ID is the candidate key for the table. Prerequisite is the foreign key referencing
Course_ID.
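
The foreign key rules described above (duplicates and NULLs are allowed, unmatched values are rejected, and a table may reference itself) can be sketched as follows. This is an illustration under stated assumptions: SQLite (through Python's standard `sqlite3` module) stands in for the RDBMS, the tables are reduced to the columns that matter, and the helper `accepted` and the course number 999 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
    CREATE TABLE Customer_Details (
        Account_No INT NOT NULL PRIMARY KEY
    );
    CREATE TABLE Customer_Transaction (
        Account_No       INT REFERENCES Customer_Details (Account_No),
        Transaction_Type TEXT
    );
    -- Self-referencing table: Prerequisite points back at Course_ID.
    CREATE TABLE Course_Description (
        Course_ID    INT NOT NULL PRIMARY KEY,
        Prerequisite INT REFERENCES Course_Description (Course_ID)
    );
    INSERT INTO Customer_Details VALUES (1020);
    INSERT INTO Course_Description VALUES (121, NULL);
""")


def accepted(sql, params):
    """Run one insert; True if it is accepted, False if a constraint fails."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


txn = "INSERT INTO Customer_Transaction VALUES (?, ?)"
course = "INSERT INTO Course_Description VALUES (?, ?)"

checks = {
    "existing account 1020": accepted(txn, (1020, "Deposit")),
    "account 1020 again": accepted(txn, (1020, "Withdrawal")),
    "NULL account": accepted(txn, (None, "Deposit")),
    "unknown account 1050": accepted(txn, (1050, "Deposit")),
    "course 122 requires 121": accepted(course, (122, 121)),
    "course 123 requires 999": accepted(course, (123, 999)),
}
for label, ok in checks.items():
    print(label, "->", "accepted" if ok else "rejected")
```

Only the two inserts whose foreign key value has no matching candidate key value (account 1050 and prerequisite 999) are rejected. Constraint syntax and defaults differ between products, so the exact DDL should be checked against the RDBMS in use.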

Super Key: Any superset of a candidate key is a super key.
Example: Consider the following sets of attributes from the Customer_Details
table.
• {Account_No}
• {Account_No, Account_Type}
• {Account_No, Account_Type, Bank_Branch}

{Account_No} is a candidate key for the Customer_Details table. {Account_No, Account_Type}
is a superset of {Account_No} and therefore it is a super key for the Customer_Details table.
The same is true of the set {Account_No, Account_Type, Bank_Branch}.

A super key may have unnecessary attribute(s).

Although {Account_No, Account_Type} is a super key, Account_Type
is an unnecessary attribute, as {Account_No} alone is sufficient to uniquely distinguish
between rows in the Customer_Details table.

Non-Key Attributes: The attributes of a table/relation that are not part of any candidate
key are called non-key attributes.
Example: Cust_Mid_Name, Account_Type and Bank_Branch are non-key attributes in the
Customer_Details table.


Points to Remember:
• A foreign key is defined as a set of attributes of a table, the values of
which are required to match values of some candidate key in the same
or another table
• A referential constraint is defined as a restriction that values of a
given foreign key must match the values of the corresponding
candidate key
• A table which has a foreign key that refers to its own candidate key is
known as a self-referencing table
• The foreign key attribute(s) can have duplicate or null values
• Any superset of a candidate key is a super key
• The attributes other than the candidate key attributes in a table or
relation are called non-key attributes








1.15. Summary
• A database is an organized collection of interrelated data
• Data in the database:
    o Is integrated
    o Can be shared
    o Can be concurrently accessed
• Database systems are designed to:
    o Define structures for the storage of data
    o Provide mechanisms for the manipulation of data
    o Ensure the security of the stored data even in case of a system crash or attempts
    at unauthorized access
    o Share data among the different users
• A master file
    o Stores relatively static data about an entity
    o Changes rarely
• A transaction file
    o Stores relatively temporary data about a particular data processing task
    o Changes frequently as transactions happen periodically and in large numbers
• The traditional file approach has disadvantages in the following areas:
    o Data security
    o Data redundancy
    o Data isolation
    o Program / data dependence
    o Lack of flexibility
    o Concurrent access anomalies
• A DBMS:
    o Provides data independence
    o Allows for sharing of data among the different users
    o Allows concurrent access to the database
    o Controls redundancy and inconsistency
    o Provides secure access to the database
    o Enforces integrity constraints by preventing the entry of invalid information
    into the database
    o Enables backup and recovery from system crashes
• Centralized Database: All data is located at a single site
• Distributed Database: The database is stored on several computers
• Three level architecture for a DBMS
    o External/View level: Enables users to view/access only a part of the database
    o Conceptual/Logical level: Describes what data is stored in the database and
    what relationships exist among those data
    o Internal/Physical level: Describes the data storage and access methods
• DBMS Users:
    o End Users: Work at the external level and generally make updates to the
    database or execute queries on the database
    o Application Programmer: Writes application programs
    o Database Administrator: Defines the conceptual, internal and external
    schemas, controls the access privileges of users and ensures the consistency of
    the database
• Data Model: A conceptual tool which can be used to describe data, data
relationships, data semantics and consistency constraints
    o Object Based Logical Model: E-R Model
    o Record Based Logical Model:
        ▪ Hierarchical Data Model. Example: IMS
        ▪ Network Data Model. Example: IDMS
        ▪ Relational Data Model: Uses a collection of tables
        (relations) to represent data and the relationships among those data.
        Example: Oracle, Sybase
• Relational Database: Any database in which the organization is based on the relational
data model
• RDBMS: A DBMS that manages a relational database
• Keys:
    o A candidate key is defined as a set of one or more attributes of the table that
    can uniquely identify a row in a given table
    o A table can have more than one candidate key
    o Candidate keys are identified during the design phase
    o While creating tables the database designer chooses one candidate key from
    the several available to serve as the primary key
    o Any one of the candidate keys can be selected as the primary key. Preference
    should be given to the candidate key with the minimal number of attributes
    o A foreign key is a set of attribute(s) of a table, the values of which are required
    to match the values of a candidate key in the same table or in a different table
    o The constraint that the values of a given foreign key must match the values of
    the corresponding candidate key is known as the referential integrity constraint
    o A table which has a foreign key that refers to its own
    candidate key is called a self-referencing table
    o Any superset of a candidate key is a super key
    o The attributes other than the candidate key attributes in a table/relation are
    called non-key attributes

2. Entity-Relationship (E-R) Modeling

2.1. Introduction
Business scenarios are generally complex in nature. A software application designer[21]
who is not an expert in a particular business domain may fail to capture the exact business
requirements needed to build a software application. This is one of the prominent causes of
software project failure.

Reviewing the requirement specification[22] document with the business users[23] (who are
experts in their respective business domains) often does not yield the expected results, for
the following reasons:

• The application designer usually writes the requirement specification documents using
software technology jargon[24], which is difficult for the business users to understand

• These documents are usually quite lengthy, and the business users may not be able to
devote enough time to read and review the complete document

It is better to represent the business rules[25] in a pictorial format so that the business
users can understand and review them easily and correctly.

One such technique, which is commonly used for designing of the databases, is Entity-
Relationship Modeling (E-R Modeling). The diagram used in this technique is called Entity
Relationship Diagram (ERD).

In Infosys, 60% to 70% of projects use this technique to capture the requirement specification
for the database application design and development.

The Entity Relationship Diagram (ERD) was first defined in 1976 by Peter Chen. Since then James
Martin and Charles Bachman have added some small refinements to the basic ERD principles.
Due to its simplicity and ease of use, this technique attracted considerable attention during
the early 1990s in both industry and the research community.

[21] Software application designer: The person who designs software applications.
[22] Requirement specification: A document which contains the requirements for a specific application.
[23] Business users: The users who own the application.
[24] Jargon: The specialized or technical language of a trade, profession, or similar group.
[25] Business rules: The rules/policies which govern the functioning of the application.
2.2. Entity and Relationship
Before learning the E-R diagram technique in detail, let us understand entities and relationships.

Entity
An entity is anything, real or abstract[26], about which we want to store data.
Entity types fall into five categories: roles, events, locations, tangible[27] things or concepts.

Some examples of entities are employee, hockey match, campus,
book and department. Department is an entity, and Education &
Research, HR, Finance, etc., are instances[28] of the department
entity.
Similarly, Henry, Luther, Crystal, Jane, etc., are instances of the employee entity.

Attribute
An attribute describes the property of an entity. An entity could have multiple attributes.

Example: For an entity car, the attributes would be the color, model number, number of
doors etc.

Relationship
Relationship is a natural association which exists among one or more entities.

Example: Employee borrows books from the library.
2.3. Cardinality of a relationship
The cardinality of a relationship defines the type of relationship between two participating
entities.

Example: One employee can take many books from the library. One book can be taken by only
one employee. The cardinality of the relationship between employee and book is one to many.

One person can sit on only one chair at any point of time. One chair can accommodate only
one person at a given point of time. This relationship has one to one cardinality.


[26] Abstract: Conceptual/theoretical object.
[27] Tangible: Physical object.
[28] Instance: Occurrence.


Points to Remember:
• The cardinality of a relationship is different from the cardinality of a relation. (The
cardinality of a relation, discussed in Chapter 1, refers to the number of rows in a
given relation.)

There are four types of relationship cardinality.
2.3.1. One to One Relationship
In this relationship, one instance of an entity is related to at most one instance of another
entity, and vice versa.

Example 1: One person (P1, P2, P3, P4) can sit on only one chair at any point of time, and
one chair (C1, C2, C3, C4) can accommodate a maximum of one person at any given time. In
this relationship both the participating entities have a one-to-one relationship.


Figure 2-1 : One to One Relationship
Example2: One country can have only one citizen as its president and one citizen can
become president of only one country.
2.3.2. One to Many Relationship
One instance of an entity is related to multiple instances of another entity.
Example1: One organization can have many employees but one employee can work only for
one organization.

Figure 2-2 : One to Many Relationship

Example2: One warehouse can be used to store many parts but one part can be stored only
in one warehouse. In this example one instance of warehouse accommodates many parts.
Hence the relationship between warehouse and part is one to many.

Figure 2-3 : One to Many Relationship

2.3.3. Many to One Relationship
This is the reverse of the one to many relationship: multiple instances of one entity are
related to one instance of another entity.
Example: Each employee can work for only one department, but one department can have
many employees. The relationship between employee and department is many to one.

Figure 2-4 : Many to One Relationship
2.3.4. Many to Many Relationship
In a many to many relationship, multiple instances of one entity are related to multiple
instances of another entity.
Example1: One student can enroll for many courses and one course can have many students
enrolled.

Figure 2-5: Many to Many Relationship
Example2: One student is trained by many instructors and one instructor trains many
students.

Figure 2-6 : Example of Many to Many Relationship
The many to many relationship is a superset of all the above mentioned relationships. All
other relationships are special cases of the many to many relationship.
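
The four cardinalities described above can be illustrated with a small Python sketch that
classifies a relationship from a list of related instance pairs. The instance pairings below
are hypothetical (the labels follow the figures, but the exact arrows are assumed), and
observed data can only suggest a cardinality; the real cardinality is a business rule.

```python
from collections import defaultdict

def classify_cardinality(pairs):
    """Classify a relationship from observed (left, right) instance pairs."""
    rights_per_left = defaultdict(set)
    lefts_per_right = defaultdict(set)
    for left, right in pairs:
        rights_per_left[left].add(right)
        lefts_per_right[right].add(left)
    # Does any left instance relate to more than one right instance?
    left_has_many = any(len(r) > 1 for r in rights_per_left.values())
    # Does any right instance relate to more than one left instance?
    right_has_many = any(len(l) > 1 for l in lefts_per_right.values())
    if left_has_many and right_has_many:
        return "many to many"
    if left_has_many:
        return "one to many"
    if right_has_many:
        return "many to one"
    return "one to one"

# Hypothetical pairings, assumed for illustration only
sits_on = [("P1", "C1"), ("P2", "C2"), ("P3", "C3"), ("P4", "C4")]
employs = [("O1", "E1"), ("O1", "E2"), ("O2", "E3"), ("O3", "E4"), ("O3", "E5")]
works_in = [(employee, org) for org, employee in employs]
enrolls = [("S1", "C1"), ("S1", "C2"), ("S2", "C1"), ("S3", "C4")]

for name, pairs in [("sits_on", sits_on), ("employs", employs),
                    ("works_in", works_in), ("enrolls", enrolls)]:
    print(name, "->", classify_cardinality(pairs))
```

Because one to one, one to many and many to one never set both flags, they fall out as
restricted cases of the many to many test, which matches the superset remark above.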

2.4. E-R Diagram Notations


Entity
An entity is an object or concept about which business
users store information.


Weak entity
A weak entity is an entity which depends on another
entity for its existence. For example, a bank branch
entity depends upon the bank entity for its existence.
Without the bank entity it is impossible to identify a
bank branch uniquely.

Attributes
Attributes are the properties or characteristics of an
entity.


Key attribute
A key attribute of an entity is the unique, distinguishing
characteristic of the entity. For example, employee
number of an employee entity might be the employee's
key attribute.

Multivalued attribute
A multivalued attribute can take more than one value.
For example skill set attribute of an employee entity can
have multiple values.

Derived attribute
A derived attribute is an attribute whose value is based
on the value of another attribute. For example, the
value of the monthly salary attribute is based on the
values of the employee's basic salary attribute and
house rent allowance attribute.

Relationships
Relationships demonstrate how two entities share
information in the database structure.





Cardinality
Cardinality of a relationship is used to specify how many
instances of an entity are related to one instance of
another entity. M and N both represent MANY, and 1
represents ONE.



Recursive relationship
In some cases, an entity can have a relationship with itself.
For example, employees can supervise other employees.
Figure 2-7 : E-R Diagram Notations
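
As a rough illustration of the attribute kinds listed above, the following Python sketch
models an employee entity. The class and field names are hypothetical, and computing the
monthly salary as basic salary plus house rent allowance is an assumption; the text only
says that the derived value is based on those two attributes.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Employee:
    employee_number: int                 # key attribute: identifies the employee
    basic_salary: float                  # ordinary attribute
    house_rent_allowance: float          # ordinary attribute
    skills: List[str] = field(default_factory=list)   # multivalued attribute
    supervisor: Optional["Employee"] = None            # recursive relationship

    @property
    def monthly_salary(self) -> float:
        # Derived attribute: computed from other attributes, not stored.
        # The formula is an assumption made for this sketch.
        return self.basic_salary + self.house_rent_allowance

manager = Employee(1, 5000.0, 1500.0, ["SQL"])
engineer = Employee(2, 3000.0, 1000.0, ["Java", "SQL"], supervisor=manager)
print(engineer.monthly_salary, engineer.supervisor.employee_number)
```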

2.5. Modeling using E-R Diagrams
A model[29] is an abstract form of any system or process that hides the unnecessary details
while highlighting the details important to the application. Scale models of huge campuses
or buildings help people visualize the structure before it is built. On similar lines, we
can also model our software applications before they are developed. This helps the business
users visualize the application before it is developed and propose changes if it does not
meet their requirements.

Modeling databases using E-R diagrams is called E-R Modeling. This technique is also known
as the Top-Down approach, because one need not identify all the attributes to build a model
of the system using this technique.
2.5.1. Steps in E-R Modeling
Usually the following six steps are followed to generate E-R models.

a) Find the entities: Look for general nouns in the requirement specification
document which are of business interest to business users.
b) Find the relationships: Identify the natural relationships between the entities
and their cardinalities.
c) Find the key attributes for every entity: Identify the attribute or set of attributes
which can identify an instance of the entity uniquely.
d) Identify other relevant attributes: Identify other attributes which are of interest to
business users and whose values need to be stored in the database.
e) Complete the E-R diagram: Draw the E-R diagram along with all attributes, including
key attributes.
f) Review your results with your business users: Look at the list of attributes you
associated with each entity to see if anything is missing.

Note that this is an iterative[30] approach; one cannot arrive at a final E-R model in a
single step. It requires a great deal of patience and numerous revisions before the model is
created.
2.6. Case Study 1: Problem Statement
Let us apply the above methodology to model a university database application.

• A university has many departments
• Each department has multiple instructors; one among them is the head of the
department
• An instructor belongs to only one department
• Each department offers multiple courses, each of which is taught by a single instructor
• A student may enroll for many courses offered by different departments

[29] Model: A representation or a scaled down structure of an object.
[30] Iterative: Process of repeating the same task.
2.6.1. Case Study 1: Solution
Step 1: Identify the Entities
Generally the entities will have multiple instances in a given business scenario. As per this
guideline, we can identify the following entities:
1. DEPARTMENT
2. COURSE
3. INSTRUCTOR
4. STUDENT
Head of the department is NOT an Entity. It is a relationship between the Instructor and
department entities.

Note: One may be tempted to identify university as an entity, but
it is not an entity because it has only one instance.


Step 2: Find relationships
We can derive the following relationships:
1. The department offers multiple courses and each course belongs to only one
department. So the cardinality between department and course is one to many.

Department 1 ---- Offers ---- N Course

2. One course is enrolled by multiple students and also one student enrolls for multiple
courses. So the relationship is many to many.

Course N ---- Enrolled by ---- M Student

3. One department has multiple instructors and also one instructor belongs to one and
only one department. So the relationship is one to many.

Department 1 ---- Has ---- N Instructor

4. Each department has one Head of Department and one instructor is Head of
Department for only one department, hence the relationship is one to one.

Department 1 ---- Headed by ---- 1 Instructor

5. One course is taught by only one instructor but one instructor teaches many courses,
hence the relationship between course and instructor is many to one.

Course N ---- Is taught by ---- 1 Instructor

The relationship between instructor and student need NOT be defined in the diagram. The
reasons are as follows:
1. This relationship has no business significance.
2. We can always derive this relationship indirectly through course and instructor, and
course and student.

Step 3: Identify the key attributes
1. DName (Department Name), which identifies the department uniquely, is the key
attribute for the DEPARTMENT entity.
2. STUDENT# (Student Number), which identifies the student uniquely, is the key
attribute for the STUDENT entity.
3. IName (Instructor Name) is the key attribute for the INSTRUCTOR entity.
4. COURSE# (Course Number) is the key attribute for the COURSE entity.

Step 4: Identify other relevant attributes
1. For the Department entity, the relevant attribute other than Department Name is
Location.
2. For the Course entity, the relevant attributes other than Course Number are
Course Name, Duration and Pre Requisite.
3. For the Instructor entity, the relevant attributes other than Instructor Name are
Room Number and Telephone Number.
4. For the Student entity, the relevant attributes other than Student Number are
Student Name and Date of Birth.

Step 5: Complete E-R diagram
After considering all the above mentioned guidelines one can generate the E-R Model for the
university database as shown in Figure 2-8.


Figure 2-8 : E-R Diagram for University
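
The solution above leaves out a direct instructor-student relationship because it can be
derived through course. A minimal Python sketch of that derivation follows; the instance
identifiers (S1, C1, I1 and so on) and the function names are hypothetical and exist only
for this illustration.

```python
# course -> instructor: each course is taught by exactly one instructor (many to one)
taught_by = {"C1": "I1", "C2": "I1", "C3": "I2"}

# (student, course) pairs: enrollment is many to many
enrolled = [("S1", "C1"), ("S1", "C3"), ("S2", "C2")]

def instructors_of(student):
    """Instructors who teach at least one course the student is enrolled in."""
    return {taught_by[course] for s, course in enrolled if s == student}

def students_of(instructor):
    """Students enrolled in at least one course taught by the instructor."""
    return {s for s, course in enrolled if taught_by[course] == instructor}

print(sorted(instructors_of("S1")))
print(sorted(students_of("I1")))
```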
2.7. Case Study 2: Problem Statement
Let us consider a university library scenario for developing the E-R model.

Assume in a university
• There are multiple libraries and each library has multiple student members
• Students can become members of multiple libraries by paying the appropriate membership
fee
• Each library has its own set of books. Within the library these books are identified by a
unique number
• Students can borrow multiple books from a subscribed library
• Students can order books using inter-library loan. This can be useful if a student
wishes to borrow books from a library where s/he is not a member. The student orders
the books through a library where s/he is a member
2.7.1. Case Study 2: Solution
Step 1: Identify the entities
Generally the entities will have multiple instances in a given business scenario. As per this
guideline, we can identify the following entities:
1. LIBRARY
2. BOOK
3. STUDENT

In this business scenario BOOK is a weak entity because a book cannot be identified without
knowing the details of its library. A book is always associated with its library.

Step 2: Find Relationships
We can derive the following relationships:
1. One library has many member students and each student can become member of many
libraries, hence the cardinality between library and student is many to many.
2. One book belongs to only one library and one library can have multiple books, hence
cardinality between library and book is one to many.
3. One library can loan multiple books and each book can be loaned to only one library,
hence the cardinality between library and book is one to many.
4. One student can borrow multiple books and one book can be borrowed by only one
student, hence the cardinality between student and book is one to many.

Step 3: Identify the key attributes
1. Library# (library ID) is the key attribute for the entity Library, as it identifies the
library uniquely.
2. Book# (book ID) and Library# together form the key for the Book entity.
3. Student# (student number) is the key attribute for Student entity.

Step 4: Identify other relevant attributes
1. For the Library entity, the relevant attributes other than Library# would be
Library Name and Location.
2. For the Book entity, the relevant attributes other than Book# would be Title
and ISBN.
3. For the Student entity, the relevant attributes other than Student# would be
Student Name and Date of Birth.

Step 5: Complete E-R diagram
After considering all the above mentioned guidelines, one can generate the E-R Model for
above mentioned university library business scenario as shown in Figure 2-9.


Figure 2-9: E-R Diagram for University Library

2.8. Case Study 3: Problem Statement
Let us consider a banking business scenario for developing the E-R model.

Assume
• There are many banks in the city and each bank has many branches. Each branch of the
bank has multiple customers
• Customers have opened various types of accounts
• Some customers have also taken different types of loans from these branches of the
bank
• One customer can have many accounts and loans
2.8.1. Case Study 3: Solution
Step 1: Identify the entities
Generally the entities will have multiple instances in a given business scenario. As per this
guideline, we can identify the following entities:
1. BANK
2. BRANCH
3. CUSTOMER
4. ACCOUNT
5. LOAN


BRANCH is considered a weak entity because without knowing the BANK, we cannot identify the
BRANCH independently. A BRANCH is always associated with the name of its BANK.
Example: Citibank branch, ICICI Bank branch or State Bank of India branch.

Note: One may be tempted to identify City as an entity, but it is
not an entity because it has only one instance.

Step 2: Find Relationships
We can derive the following relationships:
1. One bank has multiple branches and also the branches belong to only one bank, so the
cardinality of the relationship between bank and branch is one to many.
2. One branch gives many loans and also each loan is associated with one branch, so the
cardinality of the relationship between branch and loan is one to many.
3. One branch maintains multiple accounts and each account is associated with one
and only one branch, so the cardinality of the relationship between branch and
account is one to many.
4. One loan can be availed by multiple customers, and each customer can avail
multiple loans, so the cardinality of the relationship between loan and customer is
many to many.
5. One customer can hold multiple accounts, and each account can be held by
multiple customers, so the cardinality of the relationship between customer and
account is many to many.

Step 3: Identify the key attributes
1. BankCode (Bank Code) is the key attribute for the entity Bank, as it identifies the
bank uniquely.
2. Branch# (Branch Number) and BankCode (Bank Code) together form the key for the
Branch entity.
3. Customer# (Customer Number) is the key attribute for Customer entity.
4. Loan# (Loan Number) is the key attribute for Loan entity.
5. Account# (Account Number) is the key attribute for Account entity.

Step 4: Identify other relevant attributes
1. For the Bank entity, the relevant attributes other than BankCode would be
Name and Address.
2. For the Branch entity, the relevant attributes other than Branch# would be
Name and Address.
3. For the Loan entity, the relevant attribute other than Loan# would be Loan
Type.
4. For the Account entity, the relevant attribute other than Account# would be
Account Type.
5. For the Customer entity, the relevant attributes other than Customer# would be
Name, Phone and Address.

Step 5: Complete E-R diagram
After considering all the above mentioned guidelines, one can generate the E-R Model for
above mentioned banking business scenario as shown in Figure 2-10.


Figure 2-10 : E-R Diagram for Bank

2.9. Transforming an E-R Model into Physical Database Design
An E-R model helps mainly in capturing and analyzing the requirements. It can also be used
during the design of the physical database. The following is a set of guidelines for
converting an E-R model into a physical database design.

1. Each entity represented in the E-R model can be defined as a table in the relational
schema. All attributes of the entity become columns of the table. As per this
guideline we can translate the BANK, BRANCH, LOAN, ACCOUNT and CUSTOMER entities into
the following tables. Additional columns can be added to these tables at a later stage
as per the business requirements.


BANK:      BankCode   Name   Address
BRANCH:    BankCode   Branch#   Name   Address
LOAN:      Loan#   LoanType
ACCOUNT:   Account#   AccountType
CUSTOMER:  Customer#   Name   Telephone#   Address

Figure 2-11 : Entity based tables
Weak entity types are converted into a table of their own, with the primary key of the
strong entity acting as a foreign key in the table. This foreign key along with the key
of the weak entity form the composite primary key of this table. As per this guideline,
a Branch table is created with the above mentioned structure, with BankCode and
Branch# together as composite primary key.
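
The guideline for weak entities can be sketched with Python's built-in sqlite3 module. This
is only an illustration under assumptions: the column names are adapted (bank_code,
branch_no) because # is not convenient in SQL identifiers, the data types are guesses, and
the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE bank (
    bank_code TEXT PRIMARY KEY,
    name      TEXT,
    address   TEXT
);
CREATE TABLE branch (
    bank_code TEXT    NOT NULL REFERENCES bank(bank_code),  -- key of the strong entity
    branch_no INTEGER NOT NULL,                             -- key of the weak entity
    name      TEXT,
    address   TEXT,
    PRIMARY KEY (bank_code, branch_no)                      -- composite primary key
);
""")
conn.execute("INSERT INTO bank VALUES ('B1', 'Bank One', 'City Centre')")
conn.execute("INSERT INTO bank VALUES ('B2', 'Bank Two', 'Old Town')")
# The same branch number may be reused under different banks.
conn.execute("INSERT INTO branch VALUES ('B1', 1, 'Main', 'High Street')")
conn.execute("INSERT INTO branch VALUES ('B2', 1, 'Main', 'Mill Road')")

def can_insert_branch(bank_code, branch_no):
    """Try to add a branch; return False if a key constraint rejects it."""
    try:
        conn.execute("INSERT INTO branch VALUES (?, ?, NULL, NULL)",
                     (bank_code, branch_no))
        return True
    except sqlite3.IntegrityError:
        return False
```

A branch number on its own is not unique here; only the pair (bank_code, branch_no) is,
which is the sense in which BRANCH depends on BANK for its identity.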

2. Each relationship can be defined as a separate table in the relational schema. The key
attributes of the participating entities[31] become the key attributes of the
relationship table. As per this guideline we can define the LOAN_OFFERING_DETAILS table
between the BRANCH and LOAN entities, BRANCH_ACCOUNT_DETAILS between the BRANCH and
ACCOUNT entities, LOAN_DETAILS between the LOAN and CUSTOMER entities, and
ACCOUNT_DETAILS between the ACCOUNT and CUSTOMER entities. A table for the BANK and
BRANCH relationship is not defined because this information is already captured in the
BRANCH table. Usually relationship based tables have their own attributes in addition
to the prime attributes of the participating entities. For example, the LOAN_DETAILS
table contains the prime attributes of the LOAN and CUSTOMER tables (which together act
as its composite primary key) in addition to other attributes such as DateofSanction,
IntRate, LoanAmount and Duration.


LOAN_OFFERING_DETAILS:   BankCode   Branch#   Loan#
BRANCH_ACCOUNT_DETAILS:  BankCode   Branch#   Account#
LOAN_DETAILS:            Loan#   Customer#   DateofSanction   IntRate   LoanAmount   Duration
ACCOUNT_DETAILS:         Account#   Customer#   DateofOpen

Figure 2-12 : Relationship based tables
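
A relationship based table such as LOAN_DETAILS can be sketched with Python's sqlite3
module as follows. The column names are adapted from the figure, while the data types,
the sample loans and customers, and the helper function names are assumptions made for
this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE loan     (loan_no INTEGER PRIMARY KEY, loan_type TEXT);
CREATE TABLE customer (customer_no INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE loan_details (
    loan_no          INTEGER REFERENCES loan(loan_no),
    customer_no      INTEGER REFERENCES customer(customer_no),
    date_of_sanction TEXT,
    int_rate         REAL,
    loan_amount      REAL,
    duration         INTEGER,
    PRIMARY KEY (loan_no, customer_no)   -- keys of both participating entities
);
INSERT INTO loan VALUES (1, 'Home'), (2, 'Car');
INSERT INTO customer VALUES (10, 'Asha'), (11, 'Ravi');
INSERT INTO loan_details VALUES
    (1, 10, '2010-01-15', 9.5, 500000, 120),
    (1, 11, '2010-01-15', 9.5, 500000, 120),
    (2, 10, '2011-03-01', 11.0, 80000, 36);
""")

def customers_of(loan_no):
    """Names of all customers who availed the given loan."""
    rows = conn.execute(
        "SELECT c.name FROM loan_details d JOIN customer c USING (customer_no) "
        "WHERE d.loan_no = ? ORDER BY c.name", (loan_no,))
    return [name for (name,) in rows]

def loans_of(customer_no):
    """Types of all loans availed by the given customer."""
    rows = conn.execute(
        "SELECT l.loan_type FROM loan_details d JOIN loan l USING (loan_no) "
        "WHERE d.customer_no = ? ORDER BY l.loan_type", (customer_no,))
    return [loan_type for (loan_type,) in rows]
```

The relationship table can be read in both directions, which is what makes it suitable
for a many to many relationship.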

[31] Participating entities: The entities which are joined by the relation.


Note: At this stage only the entities and relationships have been converted to
tables; the tables do not yet contain any data.

Actual database table designs are driven by business requirements.
The principles of database design from an ERD might be subject to
small changes depending on the requirements. We will be exposed to
these details once we gain real-life project experience. Some of the
aspects we might have to keep in mind are:

• In one to one and one to many cases, we may not always have
separate tables for the participating entities and their
relationship. One combined table for both the participating
entities and the related attributes of the relationship may be
sufficient
• In a many to many relationship, it is mandatory to create
separate tables for the entities participating in the
relationship as well as for the relationship itself. For example,
among the entities and relationships shown in Figures 2-11 and
2-12, the CUSTOMER and LOAN entities have a many to many
relationship. Hence one should create three separate tables: two
for the CUSTOMER and LOAN entities and one, LOAN_DETAILS, for
the relationship.

2.10. Merits and Demerits of E-R Modeling
The following sections discuss the merit and demerits of E-R modeling.
2.10.1. Merits of E-R Modeling
1. Easy to understand. It is represented in the business users' language and can be
understood by non-technical specialists.
2. Intuitive[32] and helps in physical database creation.
3. Can be generalized and specialized depending on needs.
4. Can help in database design.
5. Gives a higher level abstraction of the system.
2.10.2. Demerits of E-R Modeling
1. Physical design that is derived from an E-R model may contain some amount of
ambiguity or inconsistency.
2. Sometimes diagrams may lead to misinterpretation.

Example:

Student 1 ---- Borrows ---- M Books

In a real situation, there could be several types of borrowing, for example long term, normal
and short term. It is not immediately clear whether the above diagram represents all of these
or only some of them. If this aspect is not clarified, people could come to a wrong
conclusion. Giving a proper description of the relationship is extremely important for
ensuring better understanding.

The following set of figures describes the relationship clearly to overcome misinterpretation.

Student 1 ---- Borrows (Long Term) ---- M Books
Student 1 ---- Borrows (Short Term) ---- M Books

[32] Intuitive: Natural.

2.11. SUMMARY
• Most application errors arise from miscommunication between the user of the
application and the designer of the application, and between the designer of the
application and the developer of the application
• This miscommunication can be handled by pictorially representing the business
findings
• An E-R diagram is one of the many ways in which the business findings are pictorially
represented
• Four types of cardinality of relationships are
a. one to one
b. one to many
c. many to one
d. many to many
• ER modeling helps in database design

3. Normalization

3.1. Introduction
Usually in the software industry, E-R modeling is used by designers as a requirement analysis
tool. Database design using E-R diagrams is a by-product.

A database designed from the E-R model may contain some amount of inconsistency,
ambiguity[33] and redundancy. To resolve these issues some amount of refinement is required.
This refinement process of database design is referred to as Normalization.

As normalization involves building structures (such as tables), starting from the stage of
identifying the columns (attributes) of each table, it is also called the Bottom-Up
approach. The normalization technique is based on a strong mathematical foundation.

Normalization eliminates duplicate data and makes insert, update and delete operations
more efficient, both in performance and in the space required to store the data.

In Infosys, almost all the database designs are initially based on E-R modeling and later
refined using normalization techniques before they are physically created.
3.2. The need for Normalization
Consider a university scenario, wherein the data associated with the students, courses and
their results is maintained in a table called Student_Course_Result.
Student_Course_Result Table
Student_Details      | Course_Details                             | Result_Details
101 Davis 11/4/1986  | M4 Applied Mathematics Basic Mathematics 7 | 11/11/2004 82 A
102 Daniel 11/6/1987 | M4 Applied Mathematics Basic Mathematics 7 | 11/11/2004 62 C
101 Davis 11/4/1986  | H6 American History 4                      | 11/22/2004 79 B
103 Sandra 10/2/1988 | C3 Bio Chemistry Basic Chemistry 11        | 11/16/2004 65 B
104 Evelyn 2/22/1986 | B3 Botany 8                                | 11/26/2004 77 B
102 Daniel 11/6/1987 | P3 Nuclear Physics Basic Physics 13        | 11/12/2004 68 B
105 Susan 8/31/1985  | P3 Nuclear Physics Basic Physics 13        | 11/12/2004 89 A
103 Sandra 10/2/1988 | B4 Zoology 5                               | 11/27/2004 54 D
105 Susan 8/31/1985  | H6 American History 4                      | 11/22/2004 87 A
104 Evelyn 2/22/1986 | M4 Applied Mathematics Basic Mathematics 7 | 11/11/2004 65 B

Figure 3-1: Data file in table format

[33] Ambiguity: Uncertainty.
If we observe the table shown in Figure 3-1 closely, we would find that the table has many
anomalies[34]. They are:

Insert Anomaly
In some cases, insertion of new data is difficult.
Example: We cannot insert a prospective course which does not have any registered student,
and we cannot insert the details of a student who is yet to register for any course.
Update Anomaly
In some cases, updating existing data is difficult.
Example: If we want to update the name of course M4, we need to do this operation three
times. Similarly, we may have to update the name of student 103 twice if it changes.
Delete Anomaly
In some cases, deletion of existing data is not possible without losing other data.
Example: If we want to delete course M4, then in addition to the M4 course details, other
critical details of students will also be deleted. This kind of deletion is harmful to
business. Moreover, M4 appears thrice in the above table and needs to be deleted thrice.
Duplicate Data
The table has a lot of duplicate data.
Example: The data of course M4 is stored thrice and the data of student 102 is stored
twice. This redundancy will increase as the number of course offerings and students
increases.

Hence we need to refine our design so that the database is efficient in terms of storage
space and of insert, update and delete operations. This refining technique is called
normalization.
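
The anomalies described above can be checked against the data of Figure 3-1 with a short
Python sketch. Only four columns of the table are reproduced, and the function names are
hypothetical.

```python
# (Student#, StudentName, Course#, CourseName) taken from the rows of Figure 3-1
rows = [
    (101, "Davis",  "M4", "Applied Mathematics"),
    (102, "Daniel", "M4", "Applied Mathematics"),
    (101, "Davis",  "H6", "American History"),
    (103, "Sandra", "C3", "Bio Chemistry"),
    (104, "Evelyn", "B3", "Botany"),
    (102, "Daniel", "P3", "Nuclear Physics"),
    (105, "Susan",  "P3", "Nuclear Physics"),
    (103, "Sandra", "B4", "Zoology"),
    (105, "Susan",  "H6", "American History"),
    (104, "Evelyn", "M4", "Applied Mathematics"),
]

def copies_of_course(course_no):
    """Rows that must all be changed to rename one course (update anomaly)."""
    return sum(1 for row in rows if row[2] == course_no)

def courses_lost_if_student_deleted(student_no):
    """Courses that vanish from the table when a student's rows are deleted."""
    kept = {row[2] for row in rows if row[0] != student_no}
    removed = {row[2] for row in rows if row[0] == student_no}
    return removed - kept

print(copies_of_course("M4"))
print(sorted(courses_lost_if_student_deleted(103)))
```

Deleting student 103 would also remove every trace of courses C3 and B4, because the table
has no other place to record that those courses exist.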

3.3. Process of Normalization
As mentioned previously, the normalization technique is based on a strong mathematical
foundation.

In the software industry, four normal forms are commonly used to design databases.

Before getting to know the normalization techniques in detail, let us define a few building
blocks which are used to define normal forms.

[34] Anomalies: Irregularities.

3.3.1. Determinant
Attribute X is called a determinant if it uniquely determines the value of attribute Y in a
given relation or entity. To qualify as a determinant, an attribute need NOT be a key
attribute. Usually the dependency of an attribute is represented as X → Y, which means
attribute X decides attribute Y.

Example: In the RESULT relation, the Marks attribute may decide the Grade attribute. This is
represented as Marks → Grade and read as Marks decides Grade.


Figure 3-2: Determinant
In the RESULT relation, Marks attribute is not a key attribute. Hence it can be concluded that
key attributes are determinants but not all the determinants are key attributes.
3.3.2. Functional Dependency
Consider the following REPORT relation:
REPORT (Student#, Course#, CourseName, IName, Room#, Marks, Grade)
Where:
• Student# - A unique number associated with each student, called Student Number
• Course# - A unique number associated with each course, called Course Number
• CourseName - Name of the course
• IName - Name of the instructor who delivered the course
• Room# - Room number assigned to the respective instructor
• Marks - Marks obtained in a particular course by a particular student
• Grade - Grade obtained by a particular student in a particular course

Student# and Course# together (called a composite attribute) determine EXACTLY ONE value of
Marks. This can be symbolically represented as

Student# Course# → Marks

This type of dependency is called a functional dependency. In the above example Marks is
functionally dependent on Student# Course#.

Other functional dependencies in the above example are:
• Course# → CourseName
• Course# → IName
(If we assume that one course is offered by one and only one instructor)
• IName → Room#
(If we assume that each instructor has his/her own, non-shared room)
• Marks → Grade

Formally we can define functional dependency as: In a given relation R, X and Y are attribute
sets. Attribute set Y is functionally dependent on attribute set X if each value of X
determines EXACTLY ONE value of Y. It is represented as:
X → Y
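
The formal definition can be turned into a small check in Python: X → Y holds in a set of
rows if no value of X is paired with two different values of Y. The marks and grades below
come from Figure 3-1, while the instructor names I1 and I2 are hypothetical. Sample rows
can only disprove a dependency; whether it truly holds is a business rule.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, then returns the stored value
        if seen.setdefault(x, y) != y:
            return False   # the same X value maps to two different Y values
    return True

report = [
    {"Student#": 101, "Course#": "M4", "IName": "I1", "Marks": 82, "Grade": "A"},
    {"Student#": 102, "Course#": "M4", "IName": "I1", "Marks": 62, "Grade": "C"},
    {"Student#": 101, "Course#": "H6", "IName": "I2", "Marks": 79, "Grade": "B"},
    {"Student#": 105, "Course#": "H6", "IName": "I2", "Marks": 87, "Grade": "A"},
]

print(holds(report, ["Student#", "Course#"], ["Marks"]))
print(holds(report, ["Course#"], ["Marks"]))
```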
3.3.3. Full Functional Dependency
In the above example Marks is fully functionally dependent on Student# Course# and not on a
subset of Student# Course#. This means that you cannot determine the Marks obtained by a
student in a course if you know only the Student# OR only the Course#. Marks can be
determined only using Student# AND Course# together.

CourseName is not fully functionally dependent on Student# Course#, because the subset
Course# alone is enough to determine CourseName; Student# is not required.


Figure 3-3: Full Functional Dependency
Formal definition of full functional dependency: In a given relation R, X and Y are
attributes. Y is fully functionally dependent on X only if it is functionally dependent on X
and not on any proper subset of X. X may be composite in nature.
3.3.4. Partial Dependency
In the above relation CourseName, IName and Room# are partially dependent on the attributes
Student# Course#, because Course# alone is enough to determine CourseName, IName and
Room#.


Figure 3-4: Partial Dependency
Formal definition of partial dependency: In a given relation R, X and Y are attribute sets.
Attribute set Y is partially dependent on the attribute set X if it is dependent on a proper
subset of attribute set X.
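
The difference between full and partial dependency can be sketched in Python by testing
every proper subset of X. The rows reuse values from Figure 3-1, the function names are
hypothetical, and the result reflects only these sample rows.

```python
from itertools import combinations

def holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in rows (no X value with two Y values)."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True

def classify(rows, lhs, attr):
    """Classify lhs -> attr as 'full', 'partial' or 'none' on the sample rows."""
    if not holds(rows, lhs, [attr]):
        return "none"
    # Try every non-empty proper subset of the determinant.
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if holds(rows, subset, [attr]):
                return "partial"
    return "full"

report = [
    {"Student#": 101, "Course#": "M4", "CourseName": "Applied Mathematics", "Marks": 82},
    {"Student#": 102, "Course#": "M4", "CourseName": "Applied Mathematics", "Marks": 62},
    {"Student#": 101, "Course#": "H6", "CourseName": "American History", "Marks": 79},
    {"Student#": 105, "Course#": "H6", "CourseName": "American History", "Marks": 87},
]
key = ("Student#", "Course#")
print(classify(report, key, "Marks"), classify(report, key, "CourseName"))
```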
3.3.5. Transitive Dependency
In the above example, Room# depends on IName and IName in turn depends on Course#. Hence
Room# transitively depends on Course#.


Figure 3-5: Transitive Dependency
Similarly, Grade depends on Marks and Marks in turn depends on Student# Course#; hence
Grade transitively[35] depends on Student# Course#.
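
Transitive dependencies can be found mechanically by computing an attribute closure, that
is, everything a set of attributes determines directly or indirectly. The sketch below uses
the functional dependencies listed for the REPORT relation; the function names are
hypothetical, and the closure loop is the standard textbook procedure rather than anything
specific to this course.

```python
def closure(attrs, fds):
    """All attributes determined, directly or indirectly, by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already known, the right side follows.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [
    (("Student#", "Course#"), ("Marks",)),
    (("Course#",), ("CourseName",)),
    (("Course#",), ("IName",)),
    (("IName",), ("Room#",)),
    (("Marks",), ("Grade",)),
]

def is_transitive(x, z, fds):
    """True if x determines z only through intermediate attributes."""
    direct = any(set(lhs) <= set(x) and z in rhs for lhs, rhs in fds)
    return z not in x and z in closure(x, fds) and not direct

print(sorted(closure(("Course#",), fds)))
print(is_transitive(("Course#",), "Room#", fds))
```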
3.3.6. Key attributes
In a given relation R, if the attribute X uniquely defines all other attributes, then the
attribute X is a key attribute, which is nothing but the candidate key defined in
Chapter One.

Example1: Student# Course# together form a composite key which determines all attributes
in the relation REPORT (Student#, Course#, CourseName, IName, Room#, Marks, Grade)
uniquely. Hence Student# and Course# are key attributes.

Example2: Student# and EMailID can each be considered a candidate key for the entity
STUDENT(Student#, StudentName, DateofBirth, EMailID). Either Student# or EMailID alone
uniquely defines all other attributes of the student entity.
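
As a rough illustration, candidate keys can be searched for in sample data by looking for
minimal attribute sets whose values never repeat. The e-mail addresses and student 106
below are invented for this sketch, and a key found in sample rows is only a suggestion;
the real key is decided by business rules.

```python
from itertools import combinations

students = [
    {"Student#": 101, "StudentName": "Davis", "DateofBirth": "11/4/1986",
     "EMailID": "davis@example.com"},
    {"Student#": 102, "StudentName": "Daniel", "DateofBirth": "11/6/1987",
     "EMailID": "daniel@example.com"},
    # Hypothetical student sharing a name and date of birth with student 101
    {"Student#": 106, "StudentName": "Davis", "DateofBirth": "11/4/1986",
     "EMailID": "davis2@example.com"},
]

def is_superkey(rows, attrs):
    """True if no two rows agree on all of attrs."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)

def candidate_keys(rows):
    """Minimal superkeys of the sample rows, smallest attribute sets first."""
    attrs = list(rows[0])
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            contains_smaller_key = any(set(k) <= set(combo) for k in keys)
            if is_superkey(rows, combo) and not contains_smaller_key:
                keys.append(combo)
    return keys

print(candidate_keys(students))
```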

[35] Transitive: Indirect.

3.3.7. Non key attributes

The attributes other than the candidate key attributes in a table/relation are called
non-key attributes. Equivalently, they are the attributes which do not participate in any
candidate key.

Example: Student# and EMailID are the candidate keys of the entity STUDENT(Student#,
StudentName, DateofBirth, EMailID), so StudentName and DateofBirth are the non-key
attributes.

3.4. Types of Normal Forms

3.4.1. First Normal Form (1 NF)
A relation R is said to be in the first normal form (1NF) if and only if all the attributes
of the relation R are atomic[36] in nature.

Consider the Student_Course_Result table, reproduced from Section 3.2, The need for
Normalization.
Student_Course_Result Table
Student_Details      | Course_Details                             | Results
101 Davis 11/4/1986  | M4 Applied Mathematics Basic Mathematics 7 | 11/11/2004 82 A
102 Daniel 11/6/1987 | M4 Applied Mathematics Basic Mathematics 7 | 11/11/2004 62 C
101 Davis 11/4/1986  | H6 American History 4                      | 11/22/2004 79 B
103 Sandra 10/2/1988 | C3 Bio Chemistry Basic Chemistry 11        | 11/16/2004 65 B
104 Evelyn 2/22/1986 | B3 Botany 8                                | 11/26/2004 77 B
102 Daniel 11/6/1987 | P3 Nuclear Physics Basic Physics 13        | 11/12/2004 68 B
105 Susan 8/31/1985  | P3 Nuclear Physics Basic Physics 13        | 11/12/2004 89 A
103 Sandra 10/2/1988 | B4 Zoology 5                               | 11/27/2004 54 D
105 Susan 8/31/1985  | H6 American History 4                      | 11/22/2004 87 A
104 Evelyn 2/22/1986 | M4 Applied Mathematics Basic Mathematics 7 | 11/11/2004 65 B

Figure 3-6: Data file in table format
In the table shown in Figure 3-6, the Student_Details, Course_Details and Results attributes
can be further divided. The Student_Details attribute is divided into Student# (Student
Number), StudentName (Student Name) and DateofBirth (Date of Birth). The Course_Details
attribute is divided into Course# (Course Number), CourseName, Prerequisites and Duration.
Similarly, the Results attribute is divided into DateofExam, Marks and Grade.

[36] Atomic: The smallest level to which data may be broken down and remain meaningful.

To make the above table 1NF compliant, it is re-designed as shown in Figure 3-7.

In the new form, all the attributes are atomic, meaning they are not further decomposable
[37]. You cannot divide Student#, StudentName etc. further into smaller attributes. Hence
this table is in 1NF.

Let us re-visit the issues we had with the un-normalized table. Even at this stage, it is
difficult to add prospective course or student information, and it is still difficult to
update or delete either Course or Student information. Hence the anomalies in inserts,
updates and deletes are still to be resolved.

Unfortunately, the first normal form has all the problems which we faced in the
un-normalized table.
3.4.2. Second Normal Form (2 NF)
A relation is said to be in Second Normal Form (2NF) if and only if:
• It is in the First Normal Form, and
• No partial dependency exists between non-key attributes and key attributes.

Let us re-visit the 1NF table structure.

Student_Course_Result Table
Student#  StudentName  DateofBirth  Course#  CourseName           PreRequisite       DurationInDays  DateOfExam  Marks  Grade
101       Davis        4-Nov-86     M4       Applied Mathematics  Basic Mathematics  7               11-Nov-04   82     A
102       Daniel       6-Nov-87     M4       Applied Mathematics  Basic Mathematics  7               11-Nov-04   62     C
101       Davis        4-Nov-86     H6       American History                        4               22-Nov-04   79     B
103       Sandra       2-Oct-88     C3       Bio Chemistry        Basic Chemistry    11              16-Nov-04   65     B
104       Evelyn       22-Feb-86    B3       Botany                                  8               26-Nov-04   77     B
102       Daniel       6-Nov-87     P3       Nuclear Physics      Basic Physics      13              12-Nov-04   68     B
105       Susan        31-Aug-85    P3       Nuclear Physics      Basic Physics      13              12-Nov-04   89     A
103       Sandra       2-Oct-88     B4       Zoology                                 5               27-Nov-04   54     D
105       Susan        31-Aug-85    H6       American History                        4               22-Nov-04   87     A
104       Evelyn       22-Feb-86    M4       Applied Mathematics  Basic Mathematics  7               11-Nov-04   65     B
Figure 3-7: First Normal Form

[37] Decomposable: Can be further split or reduced.

• Student# is the key attribute for the Student relation or table.
• Course# is the key attribute for the Course relation or table.
• Student# Course# together form the composite key attributes for the Result relation or
table.
• Other attributes like StudentName, DateofBirth, CourseName, DurationInDays,
PreRequisite, DateofExam, Marks and Grade are non-key attributes.

To make this table 2NF compliant, we will have to remove all the partial dependencies.
• StudentName and DateofBirth depend on Student# only
• CourseName, PreRequisite and DurationInDays depend on Course# only
• DateofExam depends on Course# only

To remove these partial dependencies we need to split the Student_Course_Result table into
four separate tables, STUDENT, COURSE, RESULT and EXAM_DATE, as shown in Figure 3-8.


STUDENT TABLE
Student#  StudentName  DateofBirth
101       Davis        4-Nov-86
102       Daniel       6-Nov-87
103       Sandra       2-Oct-88
104       Evelyn       22-Feb-86
105       Susan        31-Aug-85
106       Mike         4-Feb-87
107       Juliet       9-Nov-86
108       Tom          7-Oct-86
109       Catherine    6-Jun-84

COURSE TABLE
Course#  CourseName           PreRequisite  DurationInDays
M1       Basic Mathematics                  11
M4       Applied Mathematics  M1            7
H6       American History                   4
C1       Basic Chemistry                    5
C3       Bio Chemistry        C1            11
B3       Botany                             8
P1       Basic Physics                      8
P3       Nuclear Physics      P1            13
B4       Zoology                            5

RESULT TABLE
Student#  Course#  Marks  Grade
101       M4       82     A
102       M4       62     C
101       H6       79     B
103       C3       65     B
104       B3       77     B
102       P3       68     B
105       P3       89     A
103       B4       54     D
105       H6       87     A
104       M4       65     B

EXAM_DATE TABLE
Course#  DateOfExam
M4       11-Nov-04
H6       22-Nov-04
C3       16-Nov-04
B3       26-Nov-04
P3       12-Nov-04
B4       27-Nov-04

Figure 3-8: Second Normal Form
• In the first table (STUDENT), Student# is the key attribute and all other non-key
attributes, StudentName and DateofBirth, are fully functionally dependent on the key
attribute.
• In the second table (COURSE), Course# is the key attribute and all other non-key
attributes, CourseName, PreRequisite and DurationInDays, are fully functionally
dependent on the key attribute.
• In the third table (RESULT), Student# Course# together are the key attributes and all
other non-key attributes, Marks and Grade, are fully functionally dependent on the key
attributes.
• In the fourth table (EXAM_DATE), Course# is the key attribute and the non-key
attribute, DateOfExam, is fully functionally dependent on the key attribute.
• These four tables are also compliant with the First Normal Form definition.
• So the above four tables are said to be in Second Normal Form (2NF).
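
The search for partial dependencies described above can be sketched as a small Python check. It is a minimal sketch, assuming the dependencies are written down by hand as (determinant, dependents) pairs; the names KEY, FDS and partial_dependencies are hypothetical.

```python
# Composite key of the Student_Course_Result table.
KEY = {"Student#", "Course#"}

# Dependencies listed in the text for Student_Course_Result.
FDS = [
    ({"Student#"}, {"StudentName", "DateofBirth"}),
    ({"Course#"}, {"CourseName", "PreRequisite", "DurationInDays", "DateofExam"}),
    ({"Student#", "Course#"}, {"Marks", "Grade"}),
]


def partial_dependencies(key, fds):
    """Return the FDs whose determinant is only a part of the key."""
    found = []
    for lhs, rhs in fds:
        non_key = rhs - key
        # `lhs < key` is a proper-subset test: part of the key, not all of it.
        if lhs < key and non_key:
            found.append((lhs, non_key))
    return found


for lhs, rhs in partial_dependencies(KEY, FDS):
    print(sorted(lhs), "->", sorted(rhs))
```

Each dependency reported this way is moved, together with its determinant, into its own table; the dependency on the whole key (Marks and Grade) stays in RESULT.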
At first look it appears that all our anomalies are taken away! Now we store the details of
student 103 and of course M4 only once. We can insert prospective students and courses at
will. We need to update only once if we change any data in the STUDENT or COURSE tables. We
can remove any course or student details by deleting just one row.
Let us analyze the following table.

Student# Course# Marks Grade
101 M4 82 A
102 M4 62 C
101 H6 79 B
103 C3 65 B
104 B3 77 B
102 P3 68 B
105 P3 89 A
103 B4 54 D
105 H6 87 A
104 M4 65 B
Figure 3-9: RESULT Table
We already concluded that:
• All the attributes are atomic in nature
• No partial dependency exists between the key attributes and non-key attributes
• The RESULT table is in Second Normal Form (2NF)
Assume that, at present, as per the university evaluation policy,
• Students who score 80 marks or more are awarded an A grade
• Students who score from 65 to 79 marks get a B grade
• Students who score from 50 to 64 marks get a C grade
• Students who score less than 50 marks get a D grade
The university management, which is committed to improving the quality of education, wants
to change the existing grading system to a new grading system as given below.
• A+ grade for 95 and above
• A grade for 85 to 94
• B grade for 70 to 84
• B- grade for 65 to 69
• C grade for 55 to 64
• D grade for 45 to 54
• E grade for less than 45

In the present RESULT table structure,
• We do not have an option to introduce new grades like A+, B- and E.
• We need to do multiple updates on the existing records to bring them to the new
grading definition.
• We will not be able to take away the D grade if we want to.
Hence 2NF does not take care of all the anomalies and inconsistencies.
3.4.3. Third Normal Form (3 NF)
A relation R is said to be in the Third Normal Form (3NF) if and only if
• It is in 2NF, and
• No transitive dependency exists between non-key attributes and key attributes
through another non-key attribute.

In the above RESULT table, Student# and Course# are the key attributes. Marks depends
directly and fully on the key attributes, but Grade depends on Marks, and Marks in turn
depends on Student# Course#. To bring this table to the third normal form we need to remove
this transitive dependency.

After removing this transitive dependency we arrive at the following table structures, which
are in 3NF.

RESULT TABLE
Student#  Course#  Marks
101       M4       82
102       M4       62
101       H6       79
103       C3       65
104       B3       77
102       P3       68
105       P3       89
103       B4       54
105       H6       87
104       M4       65

MARKSGRADE TABLE
UpperBound  LowerBound  Grade
100         95          A+
94          85          A
84          70          B
69          65          B-
64          55          C
54          45          D
44          0           E

Figure 3-10: Third Normal Form
After normalizing the tables to Third Normal Form (3NF), we have got rid of the anomalies
and inconsistencies discussed above. Now we can add new grading systems, update the existing
one and delete the unwanted ones.
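
To illustrate why the grading rules can now change freely, the following Python sketch derives each grade from Marks by looking it up in the MARKSGRADE rows. It is a sketch only: the function name grade_for and the tuple layout are hypothetical, and the data is copied from the two 3NF tables above.

```python
# (LowerBound, UpperBound, Grade) rows copied from the MARKSGRADE table.
MARKSGRADE = [
    (95, 100, "A+"), (85, 94, "A"), (70, 84, "B"), (65, 69, "B-"),
    (55, 64, "C"), (45, 54, "D"), (0, 44, "E"),
]

# (Student#, Course#, Marks) rows copied from the 3NF RESULT table.
RESULT = [
    (101, "M4", 82), (102, "M4", 62), (101, "H6", 79), (103, "C3", 65),
    (104, "B3", 77), (102, "P3", 68), (105, "P3", 89), (103, "B4", 54),
    (105, "H6", 87), (104, "M4", 65),
]


def grade_for(marks, bands):
    """Return the grade whose band contains `marks`."""
    for low, high, grade in bands:
        if low <= marks <= high:
            return grade
    raise ValueError(f"no grade band covers {marks} marks")


for student, course, marks in RESULT:
    print(student, course, marks, grade_for(marks, MARKSGRADE))
```

Changing the grading policy means editing only the rows of MARKSGRADE; no RESULT row has to be touched.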
Hence the Third Normal Form is the normal form in which most databases that require
efficiency in INSERT, UPDATE and DELETE operations are designed.
3.5. Merits and Demerits of Normalization
The following sections discuss merits and demerits of normalization.
3.5.1. Merits
1) Normalization is based on a mathematical foundation.
2) It removes redundancy to a great extent. After 3NF, data redundancy is reduced
to the extent of foreign keys.
3) It removes the anomalies present in inserts, updates and deletes.
3.5.2. Demerits
1) The performance of data retrieval (Select) operations may be significantly affected,
because the data has to be combined from several tables.
Example: Let us assume that the university management wants a report of
students' performance in the following format.

UNIVERSITY REPORT
Student Name  Course Name          Date Of Exam  Grade
Daniel        Applied Mathematics  11-Nov-04     C
Daniel        Nuclear Physics      12-Nov-04     B
Davis         Applied Mathematics  11-Nov-04     A
Davis         American History     22-Nov-04     B
Evelyn        Botany               26-Nov-04     B
Evelyn        Applied Mathematics  11-Nov-04     B
Sandra        Bio Chemistry        16-Nov-04     B
Sandra        Zoology              27-Nov-04     D
Susan         Nuclear Physics      12-Nov-04     A
Susan         American History     22-Nov-04     A
Figure 3-11: Proposed University Report
After applying the 3NF normalization technique to the database design, no single table
contains all the information desired by the university management.

We need to select Student Name from the STUDENT table, Course Name from the COURSE
table, Date of Examination from the EXAM_DATE table, Marks from the RESULT table and
Grade from the MARKSGRADE table. In an un-normalized format we would have retrieved all
these columns from just one table.

Hence normalization tends to slow down the Select operations. It may be better to
restrict the normalization process to 2NF if the application has more data retrieval
operations than insert, update or delete operations.
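
The cost described above can be seen in a small sketch that builds the report with one query joining all five 3NF tables. The sketch uses Python's built-in sqlite3 module as a stand-in database, loads only a few of the rows, and uses assumed column names such as student_no and course_no in place of Student# and Course#. The grades come from the new MARKSGRADE bands, so they differ from the grades printed in Figure 3-11.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student    (student_no INTEGER PRIMARY KEY, student_name TEXT);
CREATE TABLE course     (course_no TEXT PRIMARY KEY, course_name TEXT);
CREATE TABLE exam_date  (course_no TEXT PRIMARY KEY, date_of_exam TEXT);
CREATE TABLE result     (student_no INTEGER, course_no TEXT, marks INTEGER,
                         PRIMARY KEY (student_no, course_no));
CREATE TABLE marksgrade (upper_bound INTEGER, lower_bound INTEGER, grade TEXT);

INSERT INTO student VALUES (101, 'Davis'), (102, 'Daniel');
INSERT INTO course VALUES ('M4', 'Applied Mathematics'),
                          ('H6', 'American History'),
                          ('P3', 'Nuclear Physics');
INSERT INTO exam_date VALUES ('M4', '11-Nov-04'), ('H6', '22-Nov-04'),
                             ('P3', '12-Nov-04');
INSERT INTO result VALUES (101, 'M4', 82), (102, 'M4', 62),
                          (101, 'H6', 79), (102, 'P3', 68);
INSERT INTO marksgrade VALUES (100, 95, 'A+'), (94, 85, 'A'), (84, 70, 'B'),
                              (69, 65, 'B-'), (64, 55, 'C'), (54, 45, 'D'),
                              (44, 0, 'E');
""")

# One row of the report needs data from every table, hence four joins.
report = conn.execute("""
    SELECT s.student_name, c.course_name, e.date_of_exam, g.grade
    FROM result r
    JOIN student    s ON s.student_no = r.student_no
    JOIN course     c ON c.course_no  = r.course_no
    JOIN exam_date  e ON e.course_no  = r.course_no
    JOIN marksgrade g ON r.marks BETWEEN g.lower_bound AND g.upper_bound
    ORDER BY s.student_name, c.course_name
""").fetchall()

for row in report:
    print(row)
```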

If the application is used only for querying a database, it is called a Reporting System.
Let us take the example of a Railway enquiry system. This enquiry system is used to
enquire about reservation availability and is not used to book tickets. On the other hand,
a Railway reservation system is called an On-line application because it is used for
booking tickets (inserts), changing travel plans (updates) and canceling tickets
(deletes).

Hence one may normalize only up to 2NF for a Reporting System and up to 3NF for
On-line applications.

2) Normalization may not always correspond to real world scenarios. Full normalization
may not always be desirable, and the database designer may take advantage of his/her
intimate knowledge of the real world and choose not to normalize in some particular
instance.

Example: Consider the following relation: CUSTOMER (Name, Street, City, Postcode).
Strictly speaking, the attribute Postcode uniquely identifies City, hence a transitive
dependency exists in the above relation.

Postcode -> City

Thus the CUSTOMER table is not in 3NF. However, in practice the attributes City and
Postcode are always used together as a unit, and decomposing the relation would not
be advisable in this case.

Note: Sometimes, to increase the performance of select operations for a
reporting application, the database design is taken back from a higher
normal form to a lower normal form (e.g. 3NF to 2NF). This process is
called de-normalization or Second Level Design (SLD).

3.6. Summary
• Normalization is a refinement process that helps in removing anomalies in
insert, update and delete operations.
• Normalization is also called a Bottom-up approach, because this technique requires
full knowledge of every participating attribute and its dependencies on the key
attributes. If you try to add new attributes after normalization is done, it may change
the normal form of the database design itself.
• Three normal forms are commonly used.
• 1NF makes sure that all the attributes of the relation are atomic in nature.
• 2NF removes the partial dependency.
• 3NF removes the transitive dependency.
• Excessive normalization adversely affects select or retrieval operations on the
database.
• It is generally better to normalize up to 3NF for insert, update and delete intensive
(online transaction) systems.
• It is generally better to restrict normalization to 2NF for select intensive (reporting)
systems.
While normalizing a database, use common sense and don't use only the normal forms as
absolute measures.

Points to Remember:
Normal Form / Test / Remedy (Normalization)

1NF
    Test: Attributes of every relation should be atomic. An attribute is atomic if domain
    of the attribute includes only atomic (simple, indivisible) values.
    Remedy: Form new relations for each non-atomic attribute.

2NF
    Test: For relations where candidate key contains multiple attributes (composite
    candidate key), non-key attribute should not be functionally dependent on a part of
    the candidate key.
    Remedy: Decompose to form a new relation for each partial key with its dependent
    attribute(s). Also retain the relation with the original candidate key and any
    attributes that are fully functionally dependent on it.

3NF
    Test: Relation should not have any non-key attribute functionally determined by any
    other non-key attribute. In other words there should be no transitive dependency of a
    non-key attribute on the candidate key through another non key attribute.
    Remedy: Decompose to form a relation that includes the non-key attribute(s) that
    functionally determine(s) other non-key attribute(s).

3.7. Case study
Given below is the data in an un-normalized table. Normalize it to 1NF. Identify the problems
encountered when the table is in 1NF but not in 2NF. Subsequently normalize to 2NF and 3NF,
explaining the problems faced and the solution to it.

Proj_No  Proj_Name              Emp_No  Emp_Name   Rate_Category  Hourly_Rate_in_dollars
2023     Amsterdam travel site  101     Vincent R  A              60
                                102     Pauline J  B              50
                                103     Charles C  C              40
2056     Real Estate Agency     101     Vincent R  A              60
                                107     David R    B              50

Solution:

Table (1NF)
Proj_No  Proj_Name              Emp_No  Emp_Name   Rate_Category  Hourly_Rate_in_dollars
2023     Amsterdam travel site  101     Vincent R  A              60
2023     Amsterdam travel site  102     Pauline J  B              50
2023     Amsterdam travel site  103     Charles C  C              40
2056     Real Estate Agency     101     Vincent R  A              60
2056     Real Estate Agency     107     David R    B              50


Problems encountered when the table is in 1NF but not in 2NF:
i. Wastage of space: The information that project number 2023 refers to the Amsterdam
travel site appears three (3) times.
ii. Update Anomaly: If the project name has to be changed, it has to be changed in all the
rows in which the project name appears. If it is left unchanged in even one row, this
may lead to inconsistency problems.
iii. Insert Anomaly: The information about a new employee cannot be inserted into the
table unless the employee is assigned to a project.
iv. Delete Anomaly: If there is only one employee working on a project, it is not possible
to delete information about the employee without losing information about the
project. In other words it is not possible to delete a subset of a record.

Solution: Normalize to 2NF
i. Take out the duplication
ii. Look for partial dependencies i.e. fields that are dependent on a part of a key and not
on the entire key.

In the above table, the key is (Proj_No, Emp_No).

The functional dependencies are as follows:
Proj_No -> Proj_Name
Emp_No -> Emp_Name, Rate_Category, Hourly_Rate_in_Dollars
Rate_Category -> Hourly_Rate_in_Dollars

The above table should be decomposed as follows:



Employee_Project Table
Proj_No  Emp_No
2023     101
2023     102
2023     103
2056     101
2056     107

Employee Table
Emp_No  Emp_Name   Rate_Category  Hourly_Rate_in_Dollars
101     Vincent R  A              60
102     Pauline J  B              50
103     Charles C  C              40
107     David R    B              50

Project Table
Proj_No  Proj_Name
2023     Amsterdam Travel site
2056     Real Estate Agency
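
The functional dependencies used for this decomposition can be checked against the five 1NF rows with a short Python sketch. The function name holds and the tuple layout are hypothetical. A dependency that holds in sample data is only consistent with the data; whether it is a real rule still has to come from the business requirements.

```python
COLUMNS = ["Proj_No", "Proj_Name", "Emp_No", "Emp_Name",
           "Rate_Category", "Hourly_Rate_in_Dollars"]

# The five rows of the 1NF table from the case study.
ROWS = [
    (2023, "Amsterdam travel site", 101, "Vincent R", "A", 60),
    (2023, "Amsterdam travel site", 102, "Pauline J", "B", 50),
    (2023, "Amsterdam travel site", 103, "Charles C", "C", 40),
    (2056, "Real Estate Agency", 101, "Vincent R", "A", 60),
    (2056, "Real Estate Agency", 107, "David R", "B", 50),
]


def holds(lhs, rhs, rows=ROWS, columns=COLUMNS):
    """True if rows with equal `lhs` values always have equal `rhs` values."""
    def pick(row, names):
        return tuple(row[columns.index(name)] for name in names)

    seen = {}
    for row in rows:
        # Remember the first rhs seen for each lhs; a later mismatch breaks the FD.
        if seen.setdefault(pick(row, lhs), pick(row, rhs)) != pick(row, rhs):
            return False
    return True


print(holds(["Proj_No"], ["Proj_Name"]))
print(holds(["Emp_No"], ["Emp_Name", "Rate_Category", "Hourly_Rate_in_Dollars"]))
print(holds(["Rate_Category"], ["Hourly_Rate_in_Dollars"]))
print(holds(["Emp_No"], ["Proj_No"]))  # employee 101 works on two projects
```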

Problems faced with the table in 2NF
i. Stores data redundantly: The Hourly_Rate_in_Dollars of a Rate_Category is stored
again for every employee in that rate category.
ii. Update Anomaly: If the hourly rate in dollars has to be changed for a particular rate
category, it has to be changed in all the rows in which the rate category appears. If it
is left unchanged in even one row, this may lead to inconsistency problems.
iii. Insert Anomaly: It is not possible to insert information about a new rate category and
the corresponding hourly rate in dollars unless there is an employee in that rate
category.
iv. Delete Anomaly: If there is only one employee in a particular rate category, it is not
possible to delete information about the employee without losing information about
that rate category and the corresponding hourly rate in dollars.

Solution: Normalize to 3NF
i. Remove this excess data into its own table.
ii. Look for transitive relationships or relationships where a non-key attribute is
dependent on another non-key attribute.

In the above table (Employee table), Hourly_Rate_in_Dollars is actually dependent on
Rate_Category, according to the functional dependency
Rate_Category -> Hourly_Rate_in_Dollars

The above table (Employee) should be decomposed as follows:

Employee Table
Emp_No Emp_Name Rate_Category
101 Vincent R A
102 Pauline J B
103 Charles C C
107 David R B

Rate Table
Rate_Category  Hourly_Rate_in_Dollars
A              60
B              50
C              40

4. Structured Query Language (SQL)
SQL is used to interact with a database to manage and retrieve data.
4.1. The Purpose of SQL
SQL is used to retrieve data from the database. The DBMS processes the SQL request,
retrieves the requested data from the database, and returns it. This process of requesting
data from the database and receiving back the results is called a database query and hence
the name Structured Query Language.
Refer to Figure 4-1.

Figure 4-1: Using SQL for database access
(Figure labels: SQL Request, DBMS, Database, Data, Computer System)

SQL is used to control all the functions that a DBMS provides for its users, including:

• Data Definition: SQL allows a user to define the structure and the organization of the
data to be stored and the relationships among the stored data items
• Data Retrieval: SQL allows a user or an application program to retrieve the stored
data from the database
• Data Manipulation: SQL lets a user or an application program update the database by
adding new data, deleting existing data, and modifying existing data
• Access Control: SQL can be used to restrict a user's ability to retrieve, add, and
modify data, thus protecting the stored data against unauthorized access










4.2. A Brief History of SQL

Date  Event
1970  The relational model devised by Codd was explored during the 1970s, and
      commercial relational database products began to emerge in the 1980s,
      originally for mainframe systems and later for microcomputers. Edgar Codd
      first wrote about the concept of relational databases in his paper "A
      relational model of data for large shared data banks" in 1970.
1979  Oracle Corporation introduced the first commercial RDBMS
1982  ANSI (American National Standards Institute) formed SQL Standards Committee
1983  IBM (International Business Machines) announced DB2 (a database)
1986  ANSI SQL1 standard is approved
1987  ISO (International Organization for Standardization) SQL1 standard is approved
1992  ANSI SQL2 standard is approved
2000  Microsoft Corporation introduces SQL Server 2000, aimed at enterprise
      applications
2002  Research firm Gartner ranked IBM as #1 database vendor over Oracle
2004  SQL:2003 standard is published

4.3. Data Types
The data types are used to specify the type of data that will be stored in each column of the
table. The following list gives the typical data types [38] used in Oracle 8i and Oracle 9i,
with the data type syntax, the limit in each version and an explanation where applicable:

NUMBER(P, S)
    Oracle 8i:   The maximum precision is 38 digits.
    Oracle 9i:   The maximum precision is 38 digits.
    Explanation: Where p is the precision and s is the scale. Example: NUMBER(7, 2) is a
                 number that has 5 digits before the decimal and 2 digits after the decimal.

CHAR(SIZE)
    Oracle 8i:   Up to 2000 bytes.
    Oracle 9i:   Up to 2000 bytes.
    Explanation: Where size is the number of characters to store. Fixed-length strings.
                 Space padded. Example: if the width of a character variable is 10 and the
                 string stored in it is 'RDBMS', it will be stored as 'RDBMS     '
                 (padded with five trailing spaces).

VARCHAR2(SIZE)
    Oracle 8i:   Up to 4000 bytes.
    Oracle 9i:   Up to 4000 bytes.
    Explanation: Where size is the number of characters to store. Variable-length strings.
                 Example: if the width of a character variable is 10 and the string stored
                 in it is 'RDBMS', it will be stored as 'RDBMS'.

LONG
    Oracle 8i:   Up to 2 gigabytes.
    Oracle 9i:   Up to 2 gigabytes.
    Explanation: Variable-length strings. (backward compatible [39])

DATE
    Oracle 8i:   A date between Jan 1, 4712 BC and Dec 31, 9999 AD.
    Oracle 9i:   A date between Jan 1, 4712 BC and Dec 31, 9999 AD.
    Explanation: Example: 25-JAN-2005

[38] Data Types: The description of the kinds of data stored, passed and used.
[39] Backward Compatible: A design that continues to work with earlier versions of a
language, program, etc.

4.4. Statement types
The following table lists the three types of SQL statements:

Type of SQL statement              SQL keywords                              Function
Data Definition Language (DDL)     CREATE, ALTER, DROP                       Used to define, change and drop the structure of a table
                                   TRUNCATE                                  Used to remove all rows from a table
Data Manipulation Language (DML)   SELECT, INSERT INTO, UPDATE, DELETE FROM  Used to enter, modify, delete and retrieve data from a table
Data Control Language (DCL)        GRANT, REVOKE                             Used to provide control over the data in a database
                                   COMMIT, ROLLBACK                          Used to define the end of a transaction


Note: All keywords must be entered as described otherwise users get
syntax errors.
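
How COMMIT and ROLLBACK define the end of a transaction can be tried out with a small sketch. It uses Python's built-in sqlite3 module as a stand-in database rather than Oracle, and the table name account and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_no INTEGER PRIMARY KEY, balance INTEGER)")

conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()      # COMMIT: the insert becomes permanent

conn.execute("UPDATE account SET balance = balance - 40 WHERE account_no = 1")
conn.rollback()    # ROLLBACK: the uncommitted update is undone

balance = conn.execute(
    "SELECT balance FROM account WHERE account_no = 1"
).fetchone()[0]
print(balance)
```

The balance read at the end is the committed value, because the update was rolled back before the transaction ended.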
4.5. Data Definition Language (DDL) Statements
DDL statements help us to define the table structure. They are used to:
• Define and create a new table
• Remove a table that is no longer needed
• Change the definition of an existing table
• Define a virtual table (view) of data (Covered in section 4.7)
• Build an index [40] to access a table faster (Covered in section 4.5.5)

[40] Index: Indices are created in an existing table to locate rows more quickly and
efficiently. It is possible to create an index on one or more columns of a table, and each
index is given a name. The users cannot see the indexes; they are just used to speed up
queries. More on indexes is covered in Section 4.5.5.

CONSTRAINTS

Data types help us to specify the nature or the kind of data that can be stored in a table, but
a data type specification alone is not enough. For example, a column to store a product price
should accept only positive values, yet there is no data type which accepts only positive
numbers.

Another requirement could be uniqueness of column data. For example, the product number
column in the product table should contain unique values for identifying product
information.

SQL allows the definition of constraints on columns and tables. A user cannot store data in a
column violating the constraint specified on that column. This scenario would throw an error.
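
The two requirements above (a positive price and a unique product number) can be sketched as follows. The sketch uses Python's built-in sqlite3 module as a stand-in database, so the error raised is sqlite3.IntegrityError rather than an Oracle error, and the table and constraint names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE product (
        product_no INTEGER CONSTRAINT uq_product_no UNIQUE,
        price      NUMERIC CONSTRAINT cc_product_price CHECK (price > 0)
    )
""")
conn.execute("INSERT INTO product VALUES (1, 9.99)")   # satisfies both constraints

errors = []
for row in [(2, -5), (1, 4.50)]:    # negative price, then duplicate product number
    try:
        conn.execute("INSERT INTO product VALUES (?, ?)", row)
    except sqlite3.IntegrityError as exc:
        errors.append(str(exc))     # the DBMS rejects the row and reports an error

count = conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]
print(errors)
print(count)
```

Both offending rows are rejected, so only the first row remains in the table.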

Types of Constraints:
• Column Constraint: A constraint specified at the column level, as part of the column
definition. It applies only to that specific column.
• Table Constraint: A constraint specified at the table level, after all the column
definitions. It is used when a constraint involves more than one column of the table.
4.5.1. CREATE TABLE Statement
The CREATE TABLE statement can:
• Create a table
• Define column constraints
• Define table constraints
Refer to Figure 4-2.

CREATE TABLE table-name
    ( column-definitions
      [ table-constraint-definitions ] )

Column-Definition:
    column-name data-type [ DEFAULT value ]
Table-Constraint-Definition:
    CONSTRAINT constraint-name   primary-key-constraint
                               | foreign-key-constraint
                               | uniqueness-constraint
                               | check-constraint
Primary-Key-Constraint:
    PRIMARY KEY ( column-name )
Foreign-Key-Constraint:
    FOREIGN KEY ( column-name ) REFERENCES table-name [ ( column-name ) ]
Uniqueness-Constraint:
    UNIQUE ( column-name )
Check-Constraint:
    CHECK ( search-condition )

Figure 4-2: CREATE TABLE syntax

Note: Anything enclosed between [ ] is optional. A vertical bar (|) separates alternatives.



Example:

1. Create a table Customer_Details with the following specifications

Column Name      Data Type and Width  Constraint
Cust_ID          NUMBER(5)            NOT NULL
Cust_Last_Name   VARCHAR2(20)         NOT NULL
Cust_Mid_Name    VARCHAR2(4)
Cust_First_Name  VARCHAR2(20)
Account_No       NUMBER(5)            PRIMARY KEY
Account_Type     VARCHAR2(10)         NOT NULL
Bank_Branch      VARCHAR2(25)         NOT NULL
Cust_Email       VARCHAR2(30)

Syntax:

CREATE TABLE Customer_Details(

Cust_ID Number(5) CONSTRAINT nn_cust_custid NOT NULL,
Cust_Last_Name VarChar2(20) CONSTRAINT nn_cust_lastname NOT NULL,
Cust_Mid_Name VarChar2(4),
Cust_First_Name VarChar2(20),
Account_No Number(5) CONSTRAINT pk_cust PRIMARY KEY,
Account_Type VarChar2(10) CONSTRAINT nn_cust_accounttype NOT NULL,
Bank_Branch VarChar2(25) CONSTRAINT nn_cust_bankbranch NOT NULL,
Cust_Email VarChar2(30));

2. Create a table Employee_Manager with the following specifications

Column Name      Data Type and Width  Constraint
Emp_ID           NUMBER(6)            PRIMARY KEY
Emp_Last_Name    VARCHAR2(25)
Emp_Middle_Name  VARCHAR2(5)
Emp_First_Name   VARCHAR2(25)
Emp_Email        VARCHAR2(45)
Department       VARCHAR2(10)
Grade            NUMBER(2)
Manager_ID       NUMBER(6)            Foreign Key Referencing Emp_ID

Syntax:

CREATE TABLE Employee_Manager(

Emp_ID NUMBER(6) CONSTRAINT pk_emp PRIMARY KEY,
Emp_Last_Name VARCHAR2(25),
Emp_Middle_Name VARCHAR2(5),
Emp_First_Name VARCHAR2(25),
Emp_Email VARCHAR2(45),
Department VARCHAR2(10),
Grade NUMBER(2),
Manager_ID Number(6) CONSTRAINT fk_emp_managerid
REFERENCES Employee_Manager(Emp_ID));

A column level constraint follows a column definition, whereas a table level constraint
follows all the column definitions of the table. A table level constraint generally involves
two or more columns.

3. Applying primary key as a column constraint

Syntax:

CREATE TABLE Customer_Details(
Cust_ID NUMBER(5) CONSTRAINT nn_cust_custid NOT NULL,
Cust_Last_Name VARCHAR2(20) CONSTRAINT nn_cust_lastname NOT NULL,
Cust_Mid_Name VARCHAR2(4),
Cust_First_Name VARCHAR2(20),
Account_No NUMBER(5) CONSTRAINT pk_cust_accountno PRIMARY KEY,
Account_Type VARCHAR2(10) CONSTRAINT nn_cust_accounttype NOT NULL,
Bank_Branch VARCHAR2(25) CONSTRAINT nn_cust_branch NOT NULL,
Cust_Email VARCHAR2(30));

The primary key definition in the above example follows the column (Account_No) definition.
A column definition includes the name of the column, data type and length or size of the
column.

4. A primary key as a table constraint

Syntax:

CREATE TABLE Customer_Details(
Cust_ID NUMBER(5) CONSTRAINT nn_cust_custid NOT NULL,
Cust_Last_Name VARCHAR2(20) CONSTRAINT nn_cust_lastname NOT NULL,
Cust_Mid_Name VARCHAR2(4),
Cust_First_Name VARCHAR2(20),
Account_No NUMBER(5),
Account_Type VARCHAR2(10) CONSTRAINT nn_cust_accounttype NOT NULL,
Bank_Branch VARCHAR2(25) CONSTRAINT nn_cust_bankbranch NOT NULL,
Cust_Email VARCHAR2(30),
CONSTRAINT pk_cust_email PRIMARY KEY(Cust_ID, Account_No));

The primary key definition in the above example follows the column definitions, i.e. it
occurs after all the columns of the table have been defined with their data type and
width.

5. How to create a new table from another existing table ?

Syntax:

CREATE TABLE Cust_Details AS
SELECT Cust_ID, Account_No, Account_Type, Bank_Branch, Cust_Email
FROM Customer_Details;

In the above example, Cust_Details table is created from Customer_Details table.
Cust_details table is created with attributes Cust_ID, Account_No, Account_Type,
Bank_Branch and Cust_Email. If the new Cust_Details table created should be of the same
structure as that of the existing Customer_Details table the syntax would be as follows:

CREATE TABLE Cust_Details as
SELECT *
FROM Customer_Details;

In the above example, not only the structure but also the data is copied.

To copy only the structure and not the data, use a condition that is never true, so that
no rows are selected:

CREATE TABLE Cust_Details as
SELECT * FROM Customer_Details WHERE 1=2;

Note: When a table is created from another table, only the NOT NULL constraints are copied.
All the other constraints are not copied.
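
The difference between copying structure with data and copying structure only can be observed with a small sketch. It uses Python's built-in sqlite3 module as a stand-in database, a shortened Customer_Details table, and two hypothetical target names, Cust_Copy and Cust_Empty. One difference from the note above: SQLite copies no constraints at all with CREATE TABLE ... AS, not even NOT NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Customer_Details (
        Cust_ID    INTEGER NOT NULL,
        Account_No INTEGER PRIMARY KEY,
        Cust_Email TEXT
    )
""")
conn.execute("INSERT INTO Customer_Details VALUES (101, 1001, 'a@example.com')")

# Structure and data.
conn.execute("CREATE TABLE Cust_Copy AS SELECT * FROM Customer_Details")
# Structure only: 1=2 is never true, so no rows qualify.
conn.execute("CREATE TABLE Cust_Empty AS SELECT * FROM Customer_Details WHERE 1=2")


def columns(table):
    """Column names of `table`, read from SQLite's table_info pragma."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def row_count(table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


print(columns("Cust_Empty"), row_count("Cust_Copy"), row_count("Cust_Empty"))
```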

6. Domain integrity constraint (check constraint as a column constraint)

CREATE TABLE Customer_Details(
Cust_ID NUMBER(5) CONSTRAINT nn_cust_custid NOT NULL
CONSTRAINT cc_cust_custid CHECK( Cust_ID BETWEEN 101 AND 105),
Cust_Last_Name VARCHAR2(20) CONSTRAINT nn_cust_lastname NOT NULL,
Cust_Mid_Name VARCHAR2(4),
Cust_First_Name VARCHAR2(20),
Account_No NUMBER(5) CONSTRAINT pk_cust_accountno PRIMARY KEY,
Account_Type VARCHAR2(10) CONSTRAINT nn_cust_accounttype NOT NULL,
Bank_Branch VARCHAR2(25) CONSTRAINT nn_cust_branch NOT NULL,
Cust_Email VARCHAR2(30));

7. Domain integrity constraint (check constraint as a table constraint)

CREATE TABLE Customer_Details(
Cust_ID NUMBER(5) CONSTRAINT nn_cust_custid NOT NULL,
Cust_Last_Name VARCHAR2(20) CONSTRAINT nn_cust_lastname NOT NULL,
Cust_Mid_Name VARCHAR2(4),
Cust_First_Name VARCHAR2(20),
Account_No NUMBER(5) CONSTRAINT pk_cust_accountno PRIMARY KEY,
Account_Type VARCHAR2(10) CONSTRAINT nn_cust_accounttype NOT NULL,
Bank_Branch VARCHAR2(25) CONSTRAINT nn_cust_bankbranch NOT NULL,
Cust_Email VARCHAR2(30),
CONSTRAINT cc_cust_email CHECK(Cust_ID BETWEEN 101 AND 105 AND
Account_Type IN ('Savings', 'Checkings'))
);


Note: Although giving a name to a constraint is optional, it is good
programming practice to give every constraint a meaningful name which is
unique and is not used for any other constraint of any table. The name of
the constraint is required when the constraint has to be dropped.

A NOT NULL constraint on a column(s) implies that value has to be provided for that
column(s) compulsorily.

A UNIQUE constraint on a column(s) implies that the values in the column(s) should be
distinct. A column with a UNIQUE constraint can have NULL values.

Mostly in DBMS, a PRIMARY KEY constraint implicitly imposes a NOT NULL and UNIQUE
constraint. If the table has a composite primary key, each of the attribute constituting the
primary key is NOT NULL. In other words, column involved in the composite primary key
cannot have NULL value. However the combination of attributes constituting the primary key
should offer a unique value.

A FOREIGN KEY constraint on a set of attribute(s) does not prevent them from having
duplicate or NULL values.
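
The rules above can be tried out with a small script. This is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in for the RDBMS used in this course; the Branch and Demo tables and the function name constraint_demo are hypothetical and exist only for this illustration. Other products may differ in detail, for example in how many NULL values a UNIQUE column may hold.

```python
import sqlite3


def constraint_demo():
    """Try several inserts and record whether the engine accepts each one."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE Branch (Branch_ID INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE Demo ("
        " Id INTEGER NOT NULL,"
        " Email TEXT UNIQUE,"
        " Branch_ID INTEGER REFERENCES Branch (Branch_ID))"
    )
    conn.execute("INSERT INTO Branch VALUES (1)")

    attempts = {
        # UNIQUE column left NULL
        "unique_allows_null": "INSERT INTO Demo VALUES (1, NULL, 1)",
        # second NULL in the UNIQUE column, and a repeated foreign key value
        "unique_allows_second_null": "INSERT INTO Demo VALUES (2, NULL, 1)",
        # foreign key column left NULL
        "fk_allows_null": "INSERT INTO Demo VALUES (3, 'a@x.com', NULL)",
        # NOT NULL column left NULL
        "not_null_rejects_null": "INSERT INTO Demo VALUES (NULL, 'b@x.com', 1)",
        # repeated non-NULL value in the UNIQUE column
        "unique_rejects_duplicate": "INSERT INTO Demo VALUES (4, 'a@x.com', 1)",
        # foreign key value with no matching parent row
        "fk_rejects_unknown_parent": "INSERT INTO Demo VALUES (5, 'c@x.com', 99)",
    }
    outcome = {}
    for label, sql in attempts.items():
        try:
            conn.execute(sql)
            outcome[label] = "accepted"
        except sqlite3.IntegrityError:
            outcome[label] = "rejected"
    conn.close()
    return outcome


for label, result in constraint_demo().items():
    print(label, "->", result)
```

The first three inserts are accepted and the last three are rejected, which matches the statements about NOT NULL, UNIQUE and FOREIGN KEY made above.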


Note: Users can use the DESCRIBE <tablename> or DESC <tablename>
statement to see the structure of the table.

Example:
DESCRIBE Customer_Details;

Or

DESC Customer_Details;
4.5.2. ALTER TABLE statement
The ALTER TABLE statement can be used for the following purposes:
• Add a new column definition to an existing table
• Drop a column from an existing table
• Add or drop a primary key to / from an existing table
• Add or drop a foreign key to / from an existing table
• Add or drop a unique constraint to / from an existing table
• Add or drop a check constraint to / from an existing table

ALTER TABLE table-name
    ADD column-definition
    | DROP column-name
    | ADD primary-key-definition
    | ADD foreign-key-definition
    | ADD unique-constraint
    | ADD check-constraint
    | DROP CONSTRAINT constraint-name

Figure 4-3: ALTER TABLE statement syntax


Note: The check constraint enforces the domain integrity constraint. It
permits only values allowed by the constraint into the column(s). The domain
integrity constraint will be covered in detail in chapter 5.


Example:

1. Adding a new column
Add a phone number to the Customer_Details table

ALTER TABLE Customer_Details
ADD Contact_Phone CHAR(10);

2. Modifying an existing column definition
Modify the size of the Contact_Phone column

ALTER TABLE Customer_Details
MODIFY Contact_Phone CHAR(12);


3. Adding a NOT NULL Constraint
Add the NOT NULL constraint on the Contact_Phone column

ALTER TABLE Customer_Details
MODIFY Contact_Phone CHAR(12) CONSTRAINT nn_cust_phone NOT NULL;

4. Adding a UNIQUE Constraint
Add the UNIQUE constraint on the Contact_Phone column

ALTER TABLE Customer_Details
ADD CONSTRAINT uq_cust_phone UNIQUE (Contact_Phone);

5. Dropping a constraint
Drop the NOT NULL constraint on Contact_Phone column

ALTER TABLE Customer_Details
DROP CONSTRAINT nn_cust_phone;

6. Dropping a column
Drop the Contact_Phone column from the Customer_Details table

ALTER TABLE Customer_Details
DROP (Contact_Phone);

7. Adding a simple PRIMARY KEY
Make the Account_No column as the primary key

ALTER TABLE Customer_Details
ADD CONSTRAINT pk_cust_accountno PRIMARY KEY (Account_No);

8. Table level constraint - Adding a composite PRIMARY KEY to a table
Make the Account_No and Cust_ID columns as the primary key

ALTER TABLE Customer_Details
ADD CONSTRAINT pk_cust_accountno_custid PRIMARY KEY (Account_No,
Cust_ID);

9. Adding FOREIGN KEY
Make Account_No column in Customer_Transaction table as the foreign key referencing
Account_No column of Customer_Details

ALTER TABLE Customer_Transaction
ADD CONSTRAINT fk_cust_trans_accountno FOREIGN KEY (Account_No)
REFERENCES Customer_Details (Account_No);

10. Adding a CHECK constraint
ALTER TABLE Customer_Details
ADD CONSTRAINT cc_cust_custid CHECK (Cust_ID BETWEEN 101 AND
105);

11. Dropping a simple or composite PRIMARY KEY constraint
Drop the primary key constraint

ALTER TABLE Customer_Details
DROP PRIMARY KEY;

Or

ALTER TABLE Customer_Details
DROP CONSTRAINT pk_cust_accountno;


Note: The syntax for dropping a simple primary key constraint and a
composite primary key constraint is the same.

A table can have only one primary key but any number of foreign keys. If a
table already has a primary key, adding another primary key to the same
table using the ALTER TABLE statement results in an error. The RDBMS also does not allow a
PRIMARY KEY constraint to be added on column(s) that already contain NULL or duplicate values.


Note: The forms of the ALTER TABLE command shown above do not change the name
of a column or of a table. However, we can change the datatype or length of
a column using the same command.

If the table has only one column, the ALTER TABLE statement cannot be used to drop that
column because that would render the table definition invalid.
4.5.3. DROP TABLE statement
The DROP TABLE statement is used to drop or remove a table permanently from the database.


DROP TABLE table-name

Figure 4-4: DROP TABLE statement syntax

Both the schema/structure of the table and all of its contents are lost when the DROP TABLE
command is used. There is no way to recover the data.



Note: Most RDBMS will restrict the dropping of a table if it has attribute(s)
being referred to by attribute(s) of another table. This is called the
referential integrity constraint.


Example:
Discard Customer_Details table

DROP TABLE Customer_Details;

4.5.4. TRUNCATE TABLE statement
The TRUNCATE TABLE statement is used to remove/delete all rows from a table.

TRUNCATE TABLE table-name

Figure 4-5: TRUNCATE TABLE command syntax
When the TRUNCATE TABLE statement is used, all the contents of the specified table are lost
but its definition remains intact. There is no way to recover the data. It releases the
secondary memory occupied by the contents of the specified table.
Example:
Delete all rows from the Customer_Details table

TRUNCATE TABLE Customer_Details;
4.5.5. CREATE INDEX statement
An index is a structure which provides quick access to the rows of a table, based on the values
of one or more columns. The index stores the data values and pointers (physical address
information) to the rows where those data values occur. In the index, the data values are
arranged in either ascending or descending order, so that the RDBMS can quickly look up the
index to find a particular value. It then follows the pointer to locate the row containing the
value.

In Figure 4-6, the index is created on the Account_No column, and each index entry in turn
points to the corresponding row in the table.

Note: The presence of an index or its absence is unknown to the SQL user,
who accesses the table.




Customer_Details records from the Customer_Details file
(columns Cust_Mid_Name, Cust_First_Name and Account_Type not shown):

Cust_ID  Cust_Last_Name  Account_No  Bank_Branch  Cust_Email
101      Smith           1020        Downtown     Smith_Mike@yahoo.com
105      Jones           2389        Brighton     Jones_Simon@rediffmail.com
104      Quails          2367        Downtown     Quails_Jack@yahoo.com
103      Langer          3421        Plainsboro   Langer_Justin@yahoo.com
102      Smith           2348        Bridgewater  Smith_Graham@rediffmail.com

INDEX on Account_No (sorted values, each pointing to the row with that Account_No):
1020, 2348, 2367, 2389, 3421

Figure 4-6: An index on Account_No column of Customer_Details table

Advantages of having an INDEX:
• Referring to indexed column(s) in search conditions speeds up the execution of SQL
statements.
• It is most appropriate when retrieval of data from tables is more frequent than inserts
and updates

Disadvantages of having an INDEX:
• It consumes additional disk space
• The INDEX must be updated whenever a row is added to the table or whenever an
indexed column is updated in an existing row. This imposes additional
overhead on INSERT and UPDATE statements for the table


Note:
• Most RDBMS products automatically create an index for the primary
key column of a table because they anticipate these columns to be
most frequently accessed

• Most RDBMS products also automatically create an index on any column
(or column combination) defined with a unique constraint. The RDBMS
must check the value of such a column every time a new row is
inserted, or an existing row is updated, to make certain that the value
does not duplicate a value already contained in the table. Without the
index on the column(s), the RDBMS would have to sequentially search
through every row of the table to check the constraint. With an index,
the RDBMS can simply use the index to find a row (if it exists) with the
value in question, which is a much faster operation than a sequential
search

• When the primary key of the table or the unique constraint on
column(s) is dropped, the index which was built on them is also
dropped automatically



CREATE [UNIQUE] INDEX index-name ON table-name (column-name [, column-name] ...)

Figure 4-7: CREATE INDEX statement syntax

DROP INDEX index-name

Figure 4-8: DROP INDEX statement syntax
Example:
1. Create a simple index for the Customer_Details table on Cust_ID

CREATE UNIQUE INDEX Cust_Idx
ON Customer_Details (Cust_ID);

2. Create a composite index for the Customer_Details table on Cust_ID and Account_No

CREATE UNIQUE INDEX ID_AccountNo_Idx
ON Customer_Details (Cust_ID, Account_No);

3. Drop the index created earlier

DROP INDEX ID_AccountNo_Idx;

Note: The keyword UNIQUE in the CREATE INDEX statement is optional. If the keyword
UNIQUE is omitted, the index may contain duplicate entries.
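
The effect of the UNIQUE keyword can be checked with a short script. This is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in for the RDBMS used in this course; the function name index_demo and the two-column version of Customer_Details are assumptions made only for this illustration.

```python
import sqlite3


def index_demo(unique):
    """Create Cust_Idx with or without UNIQUE, then insert a duplicate Cust_ID."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Customer_Details (Cust_ID INTEGER, Account_No INTEGER)"
    )
    kind = "UNIQUE INDEX" if unique else "INDEX"
    conn.execute(f"CREATE {kind} Cust_Idx ON Customer_Details (Cust_ID)")
    conn.execute("INSERT INTO Customer_Details VALUES (101, 1020)")
    try:
        # Same Cust_ID again: only a unique index objects to this row.
        conn.execute("INSERT INTO Customer_Details VALUES (101, 2348)")
        result = "duplicate accepted"
    except sqlite3.IntegrityError:
        result = "duplicate rejected"
    conn.close()
    return result


print("UNIQUE index:", index_demo(unique=True))
print("plain index: ", index_demo(unique=False))
```

With the unique index the second row is rejected; with the plain index both rows are stored and the index simply holds two entries for the same value.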

Points to Remember:
• The CREATE TABLE statement creates a table with column definitions,
PRIMARY KEY, FOREIGN KEY(s) and other constraints like UNIQUE and
NOT NULL

• The DROP TABLE statement removes an existing table from the
database

• The ALTER TABLE statement can be used to add a new column to an
existing table, modify an existing column definition, add/drop a
PRIMARY KEY, FOREIGN KEY and other constraints like UNIQUE and NOT
NULL

• The CREATE INDEX statement can be used to define indexes, which
speed up database queries but add overhead to database updates
4.6. Data Manipulation Language (DML) Statements
The DML statements are used to:
• Insert data into the table
• Delete data from the table
• Retrieve/Fetch data from the table
• Modify/update data in the table
4.6.1. INSERT Statement
Single-row insert: A single-row INSERT statement adds a single record (new row) of data to
the table. Refer to Figure 4-10.

The Single-Row INSERT statement

INSERT INTO table-name [ ( column-name(s) ) ] VALUES ( constant(s) | NULL )

Figure 4-9: Single-row insert statement syntax

INSERT INTO Customer_Details
(Cust_ID, Cust_Last_Name, Cust_Mid_Name, Cust_First_Name, Account_No, Account_Type, Bank_Branch, Cust_Email)
VALUES (106, 'Costner', 'A.', 'Kevin', 3350, 'Savings', 'Brighton', 'Costner_Kevin@times.com');

New row added to the Customer_Details table:

Cust_ID  Cust_Last_Name  Cust_Mid_Name  Cust_First_Name  Account_No  Account_Type  Bank_Branch  Cust_Email
106      Costner         A.             Kevin            3350        Savings       Brighton     Costner_Kevin@times.com

Figure 4-10: Inserting a single row


Note: The column list specified in the INSERT statement helps us to match the
data values in the VALUES clause. The number of columns mentioned in the
column list and their data types must exactly match the data values
specified in the VALUES clause, or else an error will occur.

Data of type CHAR, VARCHAR2 and DATE are always enclosed within single quotes.
Example: 'Costner', '12-Jan-2005'.

Users can use SELECT * FROM <tablename> to view the records inserted into the specified
table. The SELECT statement is covered in detail in Section 4.6.4.

Example of Invalid INSERT statements:

INSERT INTO Customer_Details
(Cust_ID, Cust_Last_Name, Cust_Mid_Name, Cust_First_Name,
Account_No, Account_Type, Bank_Branch)
VALUES (106, 'Costner', 'A.', 'Kevin', 3350, 'Savings', 'Brighton',
'Costner_Kevin@times.com');

The above INSERT statement is invalid because the number of values in the VALUES clause
exceeds the number of columns that are to receive them.

INSERT INTO Customer_Details
(Cust_ID, Cust_Last_Name, Cust_Mid_Name, Cust_First_Name,
Account_No, Account_Type, Bank_Branch, Cust_Email)
VALUES (106, 'Costner', 'A.', 'Kevin', 3350, 'Savings', 'Brighton');

The above INSERT statement is invalid because the number of values in the VALUES clause is
less than the number of columns that are to receive them.

Assume Account_No is the Primary Key for the Customer_Details table

INSERT INTO Customer_Details
(Cust_ID, Cust_Last_Name, Cust_Mid_Name, Cust_First_Name,
Account_Type, Bank_Branch)
VALUES (106, 'Costner', 'A.', 'Kevin', 'Savings', 'Brighton');

The above INSERT statement is invalid because the column list does not include the
attribute Account_No, which is the primary key. Because Account_No has not been included in
the column list, SQL automatically assigns a NULL value to it, and a primary key attribute
cannot have a NULL value.




Insertion of NULL values
SQL supports missing, unknown or inapplicable data by means of a NULL value. A NULL
value stored in a table implies that the value for that row-column intersection is missing,
unknown or inapplicable. A NULL value is not an actual data value such as 0, 473.83 or
'John Clark'. The NULL value occupies space.

Refer to Figure 4-11.

Customer_Details table
(columns Cust_Mid_Name, Cust_First_Name and Account_Type not shown):

Cust_ID  Cust_Last_Name  Account_No  Bank_Branch  Cust_Email
101      Smith           1020        Downtown     NULL (value unknown)
105      Jones           2389        Brighton     Jones_Simon@rediffmail.com
104      Quails          2367        Downtown     Quails_Jack@yahoo.com
103      Langer          3421        Plainsboro   Langer_Justin@yahoo.com
102      Smith           2348        Bridgewater  Smith_Graham@rediffmail.com

Figure 4-11: Storing NULL Values in the Customer_Details Table

When a new row is inserted into a table, SQL automatically assigns a NULL value to any column
whose name is omitted from the column list in the INSERT statement.

Example:

INSERT INTO Customer_Details
(Cust_ID, Cust_Last_Name, Cust_Mid_Name, Cust_First_Name, Account_No, Account_Type, Bank_Branch)
VALUES (106, 'Costner', 'A.', 'Kevin', 3350, 'Savings', 'Brighton');

Resulting row: 106, Costner, A., Kevin, 3350, Savings, Brighton, NULL (Cust_Email)

Explicit assignment of a NULL value can be made by including the column in the column list
and specifying NULL at the corresponding position in the values list.

Example:

INSERT INTO Customer_Details
(Cust_ID, Cust_Last_Name, Cust_Mid_Name, Cust_First_Name, Account_No, Account_Type, Bank_Branch, Cust_Email)
VALUES (106, 'Costner', 'A.', 'Kevin', 3350, 'Savings', 'Brighton', NULL);

Resulting row: 106, Costner, A., Kevin, 3350, Savings, Brighton, NULL (Cust_Email)

Inserting all columns:

SQL permits omitting the column list from the INSERT statement. When the column list is
not mentioned in the INSERT statement, all the columns of the table
are included by default, in sequence from left to right.

Example:

INSERT INTO Customer_Details
VALUES (106, 'Costner', 'A.', 'Kevin', 3350, 'Savings', 'Brighton', NULL);

Resulting row: 106, Costner, A., Kevin, 3350, Savings, Brighton, NULL (Cust_Email)


Note: When the column list is omitted, the NULL keyword has to be used in
the values list to explicitly assign NULL values to columns. In addition, the
order of the data values must exactly match the order of the columns in
the table.
Example:
To insert a row into the Customer_Transaction table

INSERT INTO Customer_Transaction
VALUES (2367, '17-JAN-2005', 'Deposit', 2000.00, 14456);


Note: The Date value should be input in the format dd-mmm-yyyy or
dd-mmm-yy.



Example of Invalid INSERT into the Customer_Details Table

INSERT INTO Customer_Details VALUES (106, 'Costner');

In the above INSERT statement, the column list is omitted, so values for all columns
must be provided; however, values are provided only for Cust_ID and Cust_Last_Name.
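
The NULL-handling rules for INSERT can be observed with a short script. This is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in for the RDBMS used in this course; the three-column version of Customer_Details, the rows for customers 107 and 108, and the function name insert_demo are hypothetical and exist only for this illustration.

```python
import sqlite3


def insert_demo():
    """Show implicit NULL, explicit NULL, and a rejected short VALUES list."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Customer_Details ("
        " Cust_ID INTEGER, Cust_Last_Name TEXT, Cust_Email TEXT)"
    )
    # Cust_Email is omitted from the column list, so it is stored as NULL.
    conn.execute(
        "INSERT INTO Customer_Details (Cust_ID, Cust_Last_Name) "
        "VALUES (106, 'Costner')"
    )
    # No column list: every column needs a value, NULL is given explicitly.
    conn.execute("INSERT INTO Customer_Details VALUES (107, 'Clark', NULL)")
    try:
        # No column list and too few values: the statement is invalid.
        conn.execute("INSERT INTO Customer_Details VALUES (108, 'Jones')")
        too_few_values = "accepted"
    except sqlite3.OperationalError:
        too_few_values = "rejected"
    rows = conn.execute(
        "SELECT Cust_ID, Cust_Email FROM Customer_Details ORDER BY Cust_ID"
    ).fetchall()
    conn.close()
    return rows, too_few_values


rows, too_few_values = insert_demo()
print(rows)            # Python shows SQL NULL as None
print(too_few_values)
```

Both stored rows carry NULL in Cust_Email, and the statement with only two values for three columns is rejected, as described above.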
4.6.2. DELETE Statement
The DELETE statement can delete one or more rows from a table.
Refer to Figure 4-12.

Note: Even if all the rows are deleted from the table, the table definition and
its column details are still stored in the database. Thus the table still exists.
To erase the table definition also from the database, the DROP TABLE
statement must be used.

The DELETE statement cannot delete column(s) from a table. It deletes only
row(s). To delete a given column from a table, the ALTER TABLE statement
must be used.


DELETE FROM table-name [ WHERE search-condition ]

Figure 4-12: The DELETE statement syntax
Example:
1. Deleting all rows of a table - Delete all current customers

DELETE
FROM Customer_Details;

2. Deleting some rows of a table - Delete the customer with Cust_ID = 102 from the list of
customers

DELETE
FROM Customer_Details
WHERE Cust_ID = 102;

3. Examples of invalid DELETE Statements

DELETE *
FROM Customer_Details;

OR

DELETE Cust_ID
FROM Customer_Details;


Difference between TRUNCATE and DELETE statement
• TRUNCATE is a Data Definition Language (DDL) statement whereas DELETE is a Data
Manipulation Language (DML) statement
• TRUNCATE deletes all records from the table whereas DELETE can be used to
selectively delete records from a table by using the WHERE clause
• TRUNCATE releases the secondary storage occupied by the records of the table
whereas DELETE does not do so
• Data removed using TRUNCATE cannot be recovered whereas data removed using
DELETE can be recovered as long as the change has not been committed (a DCL
statement called ROLLBACK can be used, which is covered in chapter 5)
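
The recoverability of DELETE can be illustrated with a short script. This is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in for the RDBMS used in this course. SQLite has no TRUNCATE TABLE statement, so only the DELETE half of the comparison is shown; the one-column table and the function name delete_then_rollback are assumptions made for this illustration.

```python
import sqlite3


def delete_then_rollback():
    """Delete one row, count the rows, roll back, and count again."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Customer_Details (Cust_ID INTEGER)")
    conn.executemany(
        "INSERT INTO Customer_Details VALUES (?)", [(101,), (102,), (103,)]
    )
    conn.commit()  # the three rows are now permanent

    count_sql = "SELECT COUNT(*) FROM Customer_Details"
    # The sqlite3 module opens a transaction implicitly before this DELETE.
    conn.execute("DELETE FROM Customer_Details WHERE Cust_ID = 102")
    after_delete = conn.execute(count_sql).fetchone()[0]

    conn.rollback()  # undo the uncommitted DELETE
    after_rollback = conn.execute(count_sql).fetchone()[0]
    conn.close()
    return after_delete, after_rollback


print(delete_then_rollback())
```

The row count drops to 2 after the DELETE and returns to 3 after the rollback, because the deletion was never committed.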
4.6.3. UPDATE Statement
One or more column values can be modified in the selected rows of a table using UPDATE
statement.

The table to be modified is mentioned immediately after the UPDATE keyword. The WHERE
clause identifies the rows of the table to be modified. The SET clause specifies which
columns have to be updated and assigns the new values for them.


UPDATE table-name SET column-name1 = expression1, column-name2 = expression2, ...
[ WHERE search-condition ]

Figure 4-13: UPDATE statement syntax

Example:
1. Changing all rows

Until fresh instructions come in, clear the Rate_of_Interest values for all customers.

UPDATE Customer_Fixed_Deposit
SET Rate_of_Interest_in_Percent = NULL;

2. Changing some rows

For customers with a fixed deposit > 3000, increase Rate_of_Interest to 7.3%.

UPDATE Customer_Fixed_Deposit
SET Rate_of_Interest_in_Percent = 7.3
WHERE Amount_in_Dollars > 3000;

3. Changing more than one column value.

Change the Cust_Email and Rate_of_Interest of the customer with Cust_ID = 105

UPDATE Customer_Fixed_Deposit
SET Cust_Email = 'Jack@rediffmail.com',
Rate_of_Interest_in_Percent = 7.4
WHERE Cust_ID = 105;
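
The way the WHERE clause limits an UPDATE can be checked with a short script. This is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in for the RDBMS used in this course; the three sample rows and the function name update_demo are hypothetical and are not taken from the course data.

```python
import sqlite3


def update_demo():
    """Raise the interest rate only for deposits above 3000."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Customer_Fixed_Deposit ("
        " Cust_ID INTEGER,"
        " Amount_in_Dollars REAL,"
        " Rate_of_Interest_in_Percent REAL)"
    )
    # Hypothetical sample rows: only customer 102 has a deposit above 3000.
    conn.executemany(
        "INSERT INTO Customer_Fixed_Deposit VALUES (?, ?, ?)",
        [(101, 2500.0, 6.5), (102, 5000.0, 6.5), (103, 3000.0, 6.5)],
    )
    cur = conn.execute(
        "UPDATE Customer_Fixed_Deposit "
        "SET Rate_of_Interest_in_Percent = 7.3 "
        "WHERE Amount_in_Dollars > 3000"
    )
    changed = cur.rowcount  # number of rows the UPDATE modified
    rows = conn.execute(
        "SELECT Cust_ID, Rate_of_Interest_in_Percent "
        "FROM Customer_Fixed_Deposit ORDER BY Cust_ID"
    ).fetchall()
    conn.close()
    return changed, rows


print(update_demo())
```

Only the row that satisfies the search condition is changed; the deposit of exactly 3000 is left alone because the condition uses > rather than >=.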

4.6.4. SELECT Statement
The SELECT statement helps us in retrieving data from the database and returns the resultant
set of record(s) in the form of query results. Refer to Figure 4-14.


SELECT [ ALL | DISTINCT ] column-name1, column-name2, ...
FROM table-specification
[ WHERE search-condition ]
[ GROUP BY grouping-column ]
[ HAVING search-condition ]
[ ORDER BY sort-specification ]

Figure 4-14: SELECT statement syntax
The result of a SQL query is a table of data, having one or more rows and columns.
Refer to Figure 4-15.

Figure 4-15: The tabular picture of SQL query results

4.6.4.1. Simple SELECT Statement
The SELECT statement is used to select either some or all the columns from a table.

The asterisk (*) is a wildcard character that is used to denote all columns. Avoiding use of
SELECT * is a good programming practice. It is better to list the column names explicitly.

Example:
1. Selecting all columns

List all information about all the customers

SELECT Cust_ID, Cust_Last_Name, Cust_Mid_Name, Cust_First_Name,
Account_No, Account_Type, Bank_Branch, Cust_Email
FROM Customer_Details;
Or

SELECT *
FROM Customer_Details;

2. Selecting some columns

List Cust_ID, Account_No of all customers

SELECT Cust_ID, Account_No
FROM Customer_Details;

Query results (the tabular result depicted in Figure 4-15):

Cust_ID  Account_No
101      1020
105      2389
104      2367
103      3421
102      2348
4.6.4.2. Avoiding duplicates (DISTINCT)
By default the SELECT statement retrieves all rows that satisfy the query.
The result may therefore contain duplicate rows. Specifying the DISTINCT keyword
before the column list eliminates duplicate rows from the result set returned by the
SELECT statement. The default keyword is ALL.

Example:
1. List the last names of all customers

SELECT ALL Cust_Last_Name
FROM Customer_Details;

This is equivalent to:

SELECT Cust_Last_Name
FROM Customer_Details;

2. This is likely to return duplicate rows. To avoid this:

SELECT DISTINCT Cust_Last_Name
FROM Customer_Details;
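
The difference between ALL and DISTINCT can be seen with the last names used in this chapter. This is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in for the RDBMS used in this course; the two-column version of Customer_Details and the function name last_names are assumptions made for this illustration, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3


def last_names(keyword):
    """Run SELECT <keyword> Cust_Last_Name, where keyword is ALL or DISTINCT."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Customer_Details (Cust_ID INTEGER, Cust_Last_Name TEXT)"
    )
    conn.executemany(
        "INSERT INTO Customer_Details VALUES (?, ?)",
        [(101, "Smith"), (105, "Jones"), (104, "Quails"),
         (103, "Langer"), (102, "Smith")],
    )
    rows = conn.execute(
        f"SELECT {keyword} Cust_Last_Name FROM Customer_Details "
        "ORDER BY Cust_Last_Name"
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(last_names("ALL"))       # 'Smith' appears twice
print(last_names("DISTINCT"))  # 'Smith' appears once
```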
4.6.4.3. Row Selection (WHERE clause)
The WHERE clause specifies a selection criteria or condition that limits the number of rows
retrieved. It is a row wise operation.

Refer to Figure 4-16.

For each row in the table, the search condition can produce one of three results:
• If the condition is TRUE, the row is included in the query results
• If the condition is FALSE, the row is discarded from the query results
• If the condition is unknown (NULL), for example because the column being searched
has a NULL value, the row is excluded from the query results

Problem Statement: To select rows which have 102 in the Manager column.

Name          Manager
Gautam Kumar  101
D.K. Singh    102
Tapas A.P.    103
Vikas S.      102
S.P. singh    NULL

Sample comparisons: 102 = 102 is TRUE, 103 = 102 is FALSE, NULL = 102 is Unknown.

Query Results:

Name        Commission
D.K. Singh  1200
Vikas S.    1350

Figure 4-16: Selection of rows with the WHERE clause
Example:
1. List all customers with an available account balance in dollars greater than $20000

SELECT Account_No, Total_Available_Balance_in_Dollars
FROM Customer_Transaction
WHERE Total_Available_Balance_in_Dollars > 20000.00;

2. List the Cust_ID, Account_No of James.

SELECT Account_No, Cust_ID
FROM Customer_Details
WHERE Cust_First_Name = 'James';

Note: The comparison is case sensitive. The column names are not case sensitive; the values
of the column(s) are case sensitive.

For example: 'JAMES' is not the same as 'james' or 'James'.
The WHERE clause can be used with any comparison operators such as =, >, <, >=, <=, <> or
the logical operators (AND, OR, NOT).

Expression1 { = | <> | < | <= | > | >= } Expression2

Figure 4-17: Comparison test syntax
When SQL evaluates the values of the two expressions in the comparison test, three results
can occur:
1. The test may yield a TRUE result
2. The test may yield a FALSE result
3. If either of the two expressions produces a NULL value, the comparison yields a NULL
result.
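
The three possible outcomes can be observed with the Manager data of Figure 4-16. This is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in for the RDBMS used in this course; the table name Staff and the function name manager_filter_demo are hypothetical and exist only for this illustration.

```python
import sqlite3


def manager_filter_demo():
    """Filter on Manager = 102 and on its negation, with one NULL manager."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Staff (Name TEXT, Manager INTEGER)")
    conn.executemany(
        "INSERT INTO Staff VALUES (?, ?)",
        [("Gautam Kumar", 101), ("D.K. Singh", 102), ("Tapas A.P.", 103),
         ("Vikas S.", 102), ("S.P. singh", None)],
    )

    def names(condition):
        sql = f"SELECT Name FROM Staff WHERE {condition} ORDER BY Name"
        return [row[0] for row in conn.execute(sql)]

    equal = names("Manager = 102")          # rows where the test is TRUE
    not_equal = names("NOT Manager = 102")  # rows where the negated test is TRUE
    # Comparing NULL with a value gives NULL, which Python shows as None.
    null_test = conn.execute("SELECT NULL = 102").fetchone()[0]
    conn.close()
    return equal, not_equal, null_test


equal, not_equal, null_test = manager_filter_demo()
print(equal)
print(not_equal)
print(null_test)
```

The row with the NULL manager appears in neither list: the comparison is unknown for that row, and negating an unknown result is still unknown. Section 4.6.4.5 covers the IS NULL test that is needed to find such rows.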

Example:
1. List all Account_No where total available balance in dollars is at least $20000.00

SELECT Account_No
FROM Customer_Transaction
WHERE Total_Available_Balance_in_Dollars >= 20000.00;

2. List all Cust_ID, Cust_Last_Name from Customer_Details table where Account_Type is
'Savings' and Bank_Branch is 'Downtown'.

SELECT Cust_ID, Cust_Last_Name
FROM Customer_Details
WHERE Account_Type = 'Savings'
AND Bank_Branch = 'Downtown';

3. List all Cust_ID, Cust_Last_Name from Customer_Details table where Account_Type is
not 'Savings' and Bank_Branch is not 'Downtown'.

SELECT Cust_ID, Cust_Last_Name
FROM Customer_Details
WHERE NOT Account_Type = 'Savings'
AND NOT Bank_Branch = 'Downtown';


4. List all Cust_ID, Cust_Last_Name where either Account_Type is 'Savings' or Bank_Branch is
'Downtown'.

SELECT Cust_ID, Cust_Last_Name
FROM Customer_Details
WHERE Account_Type = 'Savings'
OR Bank_Branch = 'Downtown';

The Multi-Row INSERT statement

A multi-row INSERT statement extracts rows of data from one table and inserts them into
another table. Refer to Figure 4-18.

INSERT INTO table-name [ ( column-name(s) ) ] query

Figure 4-18: Multi-row INSERT statement syntax
In this form of the INSERT statement, the data values for the new rows are not explicitly
specified within the statement text. Instead, the source of new rows is a database query,
as shown in Figure 4-19.
Example:
INSERT INTO OldCust_details
(Account_No, Transaction_Date, Total_Available_Balance_in_Dollars)
SELECT Account_No, Transaction_Date, Total_Available_Balance_in_Dollars
FROM Customer_Transaction
WHERE Total_Available_Balance_in_Dollars > 10000.00;

Step 1: The query uses data from the Customer_Transaction table.

Account_No  Transaction_Date  Transaction_Type  Transaction_Amount_in_Dollars  Total_Available_Balance_in_Dollars
1020        12-Jan-2005       Deposit           5000.00                        10000.00
2348        14-Jan-2005       Withdrawal        2500.00                        13500.00
3421        14-Jan-2005       Deposit           2000.00                        27234.00
2367        16-Jan-2005       Withdrawal        1200.00                        12456.00
1020        17-Jan-2005       Withdrawal        1500.00                        8500.00

Step 2: The query results are inserted into the OldCust_details table.

Account_No  Transaction_Date  Total_Available_Balance_in_Dollars
2348        14-Jan-2005       13500.00
3421        14-Jan-2005       27234.00
2367        16-Jan-2005       12456.00

Figure 4-19: Inserting Multiple Rows
The logical restriction on the query that appears within the multi-row INSERT statement:

The query results must contain the same number of columns as the column list in the INSERT
statement, and the data types must be compatible, column by column.

4.6.4.4. BETWEEN, IN, LIKE
The BETWEEN operator includes both the end values specified.

The IN operator is used to check if a value belongs to a set of values.

Note that BETWEEN and IN can be fully substituted with a combination of AND, OR, NOT.

The LIKE operator is used to check for similarity of strings.

When used with LIKE, _ stands for exactly one unknown character; % stands for any
number (including zero) of unknown characters.

test-expression [NOT] BETWEEN low-expression AND high-expression

Figure 4-20: Range test (Between) syntax

test-expression [NOT] IN (constant1, constant2, ...)

Figure 4-21: Set membership test (IN) syntax

column-name [NOT] LIKE pattern [ ESCAPE escape-character ]

Figure 4-22: Pattern matching test (LIKE) syntax
Example:
1. List all Account_Nos with an account balance in the range $20000.00 to $30000.00.

SELECT Account_No
FROM Customer_Transaction
WHERE Total_Available_Balance_in_Dollars
BETWEEN 20000.00 AND 30000.00;
Or

SELECT Account_No
FROM Customer_Transaction
WHERE Total_Available_Balance_in_Dollars >= 20000.00
AND Total_Available_Balance_in_Dollars <= 30000.00;

2. List all customers who have an account in Downtown or Brighton.

SELECT Cust_ID
FROM Customer_Details
WHERE Bank_Branch IN ('Downtown', 'Brighton');

Or


SELECT Cust_ID
FROM Customer_Details
WHERE Bank_Branch = 'Downtown'
OR Bank_Branch = 'Brighton';

3. List all Accounts where the Bank_Branch name begins with a 'D' and has 'o' as the second
character.

SELECT Account_No, Cust_ID, Cust_Last_Name
FROM Customer_Details
WHERE Bank_Branch LIKE 'Do%';

4. List all Accounts where the Bank_Branch column has 'o' as the second character.

SELECT Account_No, Cust_ID, Cust_Last_Name
FROM Customer_Details
WHERE Bank_Branch LIKE '_o%';

5. List all Account_Nos with balance not in the range $20000.00 to $30000.00.

SELECT Account_No
FROM Customer_Transaction
WHERE Total_Available_Balance_in_Dollars
NOT BETWEEN 20000.00 AND 30000.00;
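
The equivalences described in this section can be checked with the sample data of this chapter. This is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in for the RDBMS used in this course; the reduced two-column tables and the function name where_operator_demo are assumptions made for this illustration. Note that the case sensitivity of LIKE differs between products, so the sketch uses a lower-case pattern character that matches in either case.

```python
import sqlite3


def where_operator_demo():
    """Compare BETWEEN and IN with their AND / OR equivalents, and try LIKE."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Customer_Details (Cust_ID INTEGER, Bank_Branch TEXT)"
    )
    conn.executemany(
        "INSERT INTO Customer_Details VALUES (?, ?)",
        [(101, "Downtown"), (105, "Brighton"), (104, "Downtown"),
         (103, "Plainsboro"), (102, "Bridgewater")],
    )
    conn.execute("CREATE TABLE Customer_Transaction (Account_No INTEGER, Balance REAL)")
    conn.executemany(
        "INSERT INTO Customer_Transaction VALUES (?, ?)",
        [(1020, 10000.00), (2348, 13500.00), (3421, 27234.00),
         (2367, 12456.00), (1020, 8500.00)],
    )

    def ids(condition):
        sql = f"SELECT Cust_ID FROM Customer_Details WHERE {condition} ORDER BY Cust_ID"
        return [row[0] for row in conn.execute(sql)]

    def balances(condition):
        sql = f"SELECT Balance FROM Customer_Transaction WHERE {condition} ORDER BY Balance"
        return [row[0] for row in conn.execute(sql)]

    result = {
        "in": ids("Bank_Branch IN ('Downtown', 'Brighton')"),
        "or": ids("Bank_Branch = 'Downtown' OR Bank_Branch = 'Brighton'"),
        # _ matches exactly one character, % matches the rest of the string
        "like": ids("Bank_Branch LIKE '_o%'"),
        # both end values (10000.00 and 13500.00) are included
        "between": balances("Balance BETWEEN 10000.00 AND 13500.00"),
        "range": balances("Balance >= 10000.00 AND Balance <= 13500.00"),
    }
    conn.close()
    return result


for name, value in where_operator_demo().items():
    print(name, value)
```

The IN and OR forms return the same customers, and the BETWEEN form returns the same balances as the >= / <= form, including both end values.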
4.6.4.5. IS NULL, IS NOT NULL
The NULL value is used to indicate the value is not present. It is not a zero or blank
character. NULL cannot be compared to any other value. If compared, since the result of the
comparison cannot be determined, the result of the comparison is also a NULL.

column-name IS [ NOT ] NULL

Figure 4-23: NULL value test (IS NULL) syntax

Note: A NULL value is not equal to another NULL value. The result of
comparing two NULL values is NULL. It is neither TRUE nor FALSE.


Example:
1. List employees who have not been assigned a Manager yet.

SELECT Employee_ID
FROM Employee_Manager
WHERE Manager_ID IS NULL;

2. List employees who have been assigned to some Manager.
SELECT Employee_ID
FROM Employee_Manager
WHERE Manager_ID IS NOT NULL;
4.6.4.6. Column titles using AS
When the SELECT statement returns a column, the title of the result column set is the name
of the column. If the statement includes an evaluated expression, the column title is a
default name that the RDBMS gives the expression.

To give meaningful column titles use the keyword AS.

Example:
List those customer accounts whose account balance is greater than $10000.00.

SELECT Account_No AS "Customer Account No.",
Total_Available_Balance_in_Dollars AS "Total Balance"
FROM Customer_Transaction
WHERE Total_Available_Balance_in_Dollars > 10000.00;
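
The column titles produced by AS can be inspected from a program. This is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in for the RDBMS used in this course; the reduced two-column Customer_Transaction table and the function name column_titles are assumptions made for this illustration. Titles that contain spaces or punctuation are written in double quotes.

```python
import sqlite3


def column_titles():
    """Run the aliased query and return its column titles and rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Customer_Transaction ("
        " Account_No INTEGER,"
        " Total_Available_Balance_in_Dollars REAL)"
    )
    conn.executemany(
        "INSERT INTO Customer_Transaction VALUES (?, ?)",
        [(2348, 13500.00), (1020, 8500.00)],
    )
    cur = conn.execute(
        'SELECT Account_No AS "Customer Account No.", '
        'Total_Available_Balance_in_Dollars AS "Total Balance" '
        "FROM Customer_Transaction "
        "WHERE Total_Available_Balance_in_Dollars > 10000.00"
    )
    # cursor.description holds one entry per result column; item 0 is the title.
    titles = [column[0] for column in cur.description]
    rows = cur.fetchall()
    conn.close()
    return titles, rows


titles, rows = column_titles()
print(titles)
print(rows)
```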
4.6.4.7. Sorting Query Results (ORDER BY clause)
The rows returned by an SQL query are not arranged in any particular order. If
needed, we can arrange the rows returned by an SQL query using the ORDER BY clause in the
SELECT statement. The ORDER BY is a row-wise operation. By default the ORDER BY clause
arranges the rows of the query result in ascending order. To arrange the rows returned by the
query in descending order, use the keyword DESC.

ORDER BY { column-name | column-number } [ ASC | DESC ] [, ...]

Figure 4-24: The ORDER BY clause syntax
Example:
1. List the account numbers and their account balances of all customers in ascending order
of the account balance.

SELECT Account_No, Total_Available_Balance_in_Dollars
FROM Customer_Transaction
ORDER BY Total_Available_Balance_in_Dollars;

2. List the customers and their account numbers in the descending order of the account
numbers.


SELECT Cust_Last_Name, Cust_First_Name, Account_No
FROM Customer_Details
ORDER BY 3 DESC;

Note: The ORDER BY clause can be followed by the column name or by the position of the
column as it appears in the SELECT list.

3. List the customers and their account numbers in descending order of the Customer Last
Name and ascending order of account numbers.

SELECT Cust_Last_Name, Cust_First_Name, Account_No
FROM Customer_Details
ORDER BY Cust_Last_Name DESC, Account_No;
Or

SELECT Cust_Last_Name, Cust_First_Name, Account_No
FROM Customer_Details
ORDER BY 1 DESC, 3;
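
The equivalence of column names and column positions in ORDER BY can be checked with the customer data of this chapter. This is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in for the RDBMS used in this course; the two-column version of Customer_Details and the function name sorted_customers are assumptions made for this illustration, so the account number is at position 2 here rather than 3.

```python
import sqlite3


def sorted_customers(order_by):
    """Return (last name, account number) rows sorted by the given ORDER BY list."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Customer_Details (Cust_Last_Name TEXT, Account_No INTEGER)"
    )
    conn.executemany(
        "INSERT INTO Customer_Details VALUES (?, ?)",
        [("Smith", 1020), ("Jones", 2389), ("Quails", 2367),
         ("Langer", 3421), ("Smith", 2348)],
    )
    rows = conn.execute(
        "SELECT Cust_Last_Name, Account_No FROM Customer_Details "
        f"ORDER BY {order_by}"
    ).fetchall()
    conn.close()
    return rows


by_name = sorted_customers("Cust_Last_Name DESC, Account_No")
by_position = sorted_customers("1 DESC, 2")
print(by_name)
print(by_name == by_position)
```

Last names are sorted in descending order, and the two Smith rows are then sorted by ascending account number, which is the default direction for the second sort key.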
4.6.4.8. Aggregate Functions / Column Functions
SQL allows summarizing data from the database through a set of column (aggregate)
functions. An aggregate function takes a complete column of data as its argument and
produces a single value that summarizes the column.

Commonly used aggregate functions:

• SUM() : computes the total of a given column

• AVG() : computes the average value in a given column

• MIN() : finds the smallest value in a given column

• MAX() : finds the largest value in a given column

• COUNT() : counts the number of non-NULL values in a given column

• COUNT(*) : counts the rows of the query results, including rows that contain NULL values. If
there are no rows, this function returns zero.


Note: Rows that have a NULL value in the relevant column are ignored by all
the above aggregate functions except COUNT(*).




SUM ( [ DISTINCT ] column-name / expression )
AVG ( [ DISTINCT ] column-name / expression )
MIN ( expression )
MAX ( expression )
COUNT ( [ DISTINCT ] column-name )
COUNT ( * )

Figure 4-25: Column functions syntax
Example:
1. List the minimum account balance from Customer_Transaction table.

SELECT MIN (Total_Available_Balance_in_Dollars)
FROM Customer_Transaction;

2. List the maximum account balance from Customer_Transaction table.

SELECT MAX (Total_Available_Balance_in_Dollars)
FROM Customer_Transaction;

3. List the average account balance of customers from Customer_Transaction table.

SELECT AVG (Total_Available_Balance_in_Dollars)
FROM Customer_Transaction;

4. List total number of account holders in the Downtown Branch.

SELECT COUNT (*)
FROM Customer_Details
WHERE Bank_Branch = 'Downtown';

5. List total number of Customers.

SELECT COUNT (*)
FROM Customer_Details;

6. List number of Customers having Savings Account.

SELECT COUNT (*)
FROM Customer_Details
WHERE Account_Type = 'Savings';

7. List the minimum and sum of all account balances from Customer_Transaction table.

SELECT MIN (Total_Available_Balance_in_Dollars),
SUM (Total_Available_Balance_in_Dollars)
FROM Customer_Transaction;

8. List total number of unique Customer Last Names from Customer_Details table.

SELECT COUNT (DISTINCT Cust_Last_Name)
FROM Customer_Details;

Difference between COUNT(*) and COUNT(Column-name):

9. List total number of Employees.

SELECT COUNT (*)
FROM Employee_Manager;

10. List total number of Employees who have been assigned a Manager from
Employee_Manager table.

SELECT COUNT (Manager_ID)
FROM Employee_Manager;

Note: COUNT (Column-Name) counts the number of non-NULL values in a column, whereas
COUNT (*) counts all rows of the query results, including rows that have NULL in that column.
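
The difference can be checked directly. The sketch below assumes a cut-down Employee_Manager table with only three columns. The eight rows use the Employee_ID, Department and Manager_ID values of the Employee_Manager records shown in the figures of this section. It is meant to be run in an empty scratch schema.

```sql
-- Cut-down version of Employee_Manager (an assumption for this sketch).
CREATE TABLE Employee_Manager (
    Employee_ID INTEGER,
    Department  VARCHAR(10),
    Manager_ID  INTEGER        -- NULL when no manager is assigned
);

INSERT INTO Employee_Manager VALUES (2345,  'HR',      NULL);
INSERT INTO Employee_Manager VALUES (3556,  'Finance', NULL);
INSERT INTO Employee_Manager VALUES (22789, 'HR',      2345);
INSERT INTO Employee_Manager VALUES (23456, 'Finance', 3556);
INSERT INTO Employee_Manager VALUES (30456, 'HR',      2345);
INSERT INTO Employee_Manager VALUES (31234, 'Finance', 3556);
INSERT INTO Employee_Manager VALUES (32345, 'Design',  3620);
INSERT INTO Employee_Manager VALUES (3620,  'Design',  NULL);

SELECT COUNT(*)                   AS Total_Employees,    -- all rows
       COUNT(Manager_ID)          AS With_Manager,       -- non-NULL values only
       COUNT(DISTINCT Manager_ID) AS Distinct_Managers   -- distinct non-NULL values
FROM Employee_Manager;
-- Expected single row: 8, 5, 3
```

Three of the eight rows have NULL in Manager_ID, so COUNT(Manager_ID) returns 5 while COUNT(*) returns 8.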


4.6.4.9. GROUP BY
The GROUP BY clause is used in a SELECT statement to collect data across multiple records
and group the results by one or more columns.

Sometimes information is required not about each row, but about each group of rows.

Example: Consider the Customer_Loan table that has data about all the loans taken by all
the customers of the bank. Assume that we want to retrieve the total loan-amount of all
loans taken by each customer.

Related rows can be grouped together by the GROUP BY clause by specifying a column as a
grouping column.

In the above example, the Cust_ID will be the grouping column.

In the output table all the rows with an identical value in the grouping column will be
grouped together. Hence, the number of rows in the output is equal to the number of distinct
values of the grouping column.

Records from the Customer_Loan table:

Cust_ID  Loan_No  Amount_in_Dollars
101      1011     8755.00
103      2010     2555.00
104      2056     3050.00
103      2015     2000.00

SELECT Cust_ID, SUM(Amount_in_Dollars) FROM Customer_Loan GROUP BY Cust_ID;

Query results:

Cust_ID  SUM(Amount_in_Dollars)
101      8755.00
103      4555.00
104      3050.00

Figure 4-26: Example of GROUP BY clause


If the GROUP BY clause is used in a SELECT statement, all the rows with an identical value
in the grouping column are grouped together.

Records from the Employee_Manager table:

Employee_ID  Employee_Last_Name  Employee_Mid_Name  Employee_First_Name  Department  Grade  Manager_ID  Employee_Email
2345         Atherton            S.                 Cindy                HR          1      NULL        Atherton_Cindy@yahoo.com
3556         George              A.                 Henry                Finance     1      NULL        George_Henry@rediffmail.com
22789        Stevenson           S.                 Crystal              HR          2      2345        Stevenson_Crystal@mag.com
23456        Smith               A.                 Luther               Finance     2      3556        Smith_Luther@yahoo.com
30456        Langer              C.                 Christiana           HR          3      2345        Langer_Christiana@rediffmail.com
31234        Frost               J.                 Robert               Finance     3      3556        Frost_Robert@training.com
32345        Austen              L.                 Jane                 Design      2      3620        Austen_Jane@yahoo.com
3620         Jackson             G.                 Matt                 Design      1      NULL        Jackson_Matt@samsonite.co.in

SELECT Department, COUNT (Employee_ID) FROM Employee_Manager GROUP BY Department;

Query results:

Department  COUNT(Employee_ID)
HR          3
Finance     3
Design      2

Figure 4-27: Example of GROUP BY clause

SELECT Manager_ID, COUNT (Employee_ID) FROM Employee_Manager GROUP BY Manager_ID;

Query results (for the same Employee_Manager records):

Manager_ID  COUNT(Employee_ID)
2345        2
3556        2
3620        1
NULL        3

Figure 4-28: Example of GROUP BY clause

Once the GROUP BY clause is used, the aggregate functions in the SELECT statement are
calculated after grouping i.e., there is one value of the aggregate column for each value of
the grouping column.

Example: Refer to Figure 4-28. In the example, the grouping is based on the Manager_ID
column. There are three records with NULL values in the Manager_ID column. All the three
records are placed in the same group. It is the group with indeterminate values. This does
not imply that NULL values are equal.
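
The treatment of NULL as a single group can be reproduced with the sketch below. It assumes a two-column version of Employee_Manager filled with the Employee_ID and Manager_ID values from Figure 4-28, and is meant to be run in an empty scratch schema.

```sql
-- Two-column version of Employee_Manager (an assumption for this sketch).
CREATE TABLE Employee_Manager (
    Employee_ID INTEGER,
    Manager_ID  INTEGER        -- NULL when no manager is assigned
);

INSERT INTO Employee_Manager VALUES (2345,  NULL);
INSERT INTO Employee_Manager VALUES (3556,  NULL);
INSERT INTO Employee_Manager VALUES (22789, 2345);
INSERT INTO Employee_Manager VALUES (23456, 3556);
INSERT INTO Employee_Manager VALUES (30456, 2345);
INSERT INTO Employee_Manager VALUES (31234, 3556);
INSERT INTO Employee_Manager VALUES (32345, 3620);
INSERT INTO Employee_Manager VALUES (3620,  NULL);

-- One output row per distinct Manager_ID; all NULLs fall into one group.
SELECT Manager_ID,
       COUNT(*)          AS Rows_In_Group,     -- counts every row of the group
       COUNT(Manager_ID) AS Non_Null_Values    -- ignores NULL values
FROM Employee_Manager
GROUP BY Manager_ID;
-- Expected rows (the order of the groups is not guaranteed):
--   2345  2  2
--   3556  2  2
--   3620  1  1
--   NULL  3  0
```

The NULL group contains three rows, yet COUNT(Manager_ID) for that group is 0, because aggregate functions other than COUNT(*) ignore NULL values.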


Note: If the GROUP BY clause has been used in a SELECT statement, only the
grouping columns (columns on which grouping has been done) or aggregate
functions can appear in the column list specified in the SELECT statement.


Example:

Invalid SQL statement (Department is neither a grouping column nor the argument of an
aggregate function):

SELECT Department, Manager_ID, COUNT(Employee_ID)
FROM Employee_Manager
GROUP BY Manager_ID;


The above SQL statement should be written as:

SELECT Department, Manager_ID, COUNT(Employee_ID)
FROM Employee_Manager
GROUP BY Manager_ID, Department;

Refer to Figure 4-29.

SELECT Department, Manager_ID, COUNT (Employee_ID) FROM Employee_Manager
GROUP BY Manager_ID, Department;

Query results (for the Employee_Manager records shown in Figure 4-27):

Department  Manager_ID  COUNT(Employee_ID)
HR          2345        2
Finance     3556        2
Design      3620        1
HR          NULL        1
Design      NULL        1
Finance     NULL        1

Figure 4-29: An example of GROUP BY clause
4.6.4.10. HAVING
The HAVING clause is used along with the GROUP BY clause. The HAVING clause can be used
to select and reject row groups. The format of the HAVING clause is similar to the WHERE
clause, consisting of the keyword HAVING followed by a search condition. The HAVING clause
thus specifies a search condition for groups.

Records from the Customer_Loan table:

Cust_ID  Loan_No  Amount_in_Dollars
101      1011     8755.00
103      2010     2555.00
104      2056     3050.00
103      2015     2000.00

SELECT Cust_ID, SUM(Amount_in_Dollars) FROM Customer_Loan GROUP BY Cust_ID
HAVING SUM(Amount_in_Dollars) > 4000.00;

Query results:

Cust_ID  SUM(Amount_in_Dollars)
101      8755.00
103      4555.00

Figure 4-30: An example of HAVING Clause

SELECT Department, COUNT (Employee_ID) FROM Employee_Manager GROUP BY Department
HAVING COUNT(Employee_ID) > 2;

Query results (for the Employee_Manager records shown in Figure 4-27):

Department  COUNT(Employee_ID)
HR          3
Finance     3

Figure 4-31: An example of HAVING Clause

Note: The WHERE clause can be used to select and reject the individual rows
that participate in a query. The HAVING clause can be used to select and
reject row groups.
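
The two clauses can be combined in one statement. The sketch below uses the four Customer_Loan records shown in the figures of this section. The threshold values 2500.00 and 3000.00 are chosen only for illustration. It is meant to be run in an empty scratch schema.

```sql
CREATE TABLE Customer_Loan (
    Cust_ID           INTEGER,
    Loan_No           INTEGER,
    Amount_in_Dollars DECIMAL(10,2)
);

INSERT INTO Customer_Loan VALUES (101, 1011, 8755.00);
INSERT INTO Customer_Loan VALUES (103, 2010, 2555.00);
INSERT INTO Customer_Loan VALUES (104, 2056, 3050.00);
INSERT INTO Customer_Loan VALUES (103, 2015, 2000.00);

-- WHERE rejects individual rows before grouping;
-- HAVING rejects whole groups after the aggregate has been computed.
SELECT Cust_ID, SUM(Amount_in_Dollars) AS Total_Loan
FROM Customer_Loan
WHERE Amount_in_Dollars > 2500.00          -- drops the 2000.00 loan of customer 103
GROUP BY Cust_ID
HAVING SUM(Amount_in_Dollars) > 3000.00;   -- drops the group of customer 103 (2555.00)
-- Expected rows (order not guaranteed):
--   101  8755.00
--   104  3050.00
```

Without the WHERE clause, customer 103 would have a total of 4555.00 and would pass the HAVING condition, so the order in which the two clauses are applied changes the result.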

4.6.4.11. Retrieval using UNION
The UNION operation combines the rows from two sets of query results. By default, the UNION
operation eliminates duplicate rows as part of its processing.

Example:
SELECT Cust_ID
FROM Customer_Fixed_Deposit

UNION

SELECT Cust_ID
FROM Customer_Loan;

Refer to Figure 4-32.

To retain duplicate rows in a UNION operation, specify the ALL keyword immediately
following the word UNION.

Example:
SELECT Cust_ID
FROM Customer_Fixed_Deposit

UNION ALL

SELECT Cust_ID
FROM Customer_Loan;


Refer to Figure 4-33.


Records from the Customer_Loan table:

Cust_ID  Loan_No  Amount_in_Dollars
101      1011     8755.00
103      2010     2555.00
104      2056     3050.00
103      2015     2000.00

Records from the Customer_Fixed_Deposit table:

Cust_ID  Fixed_Deposit_No  Amount_in_Dollars  Cust_Last_Name  Cust_Mid_Name  Cust_First_Name  Rate_of_Interest_in_Percent  Cust_Email
101      2011              8055.00            Smith           A.             Mike             6.5                          Smith_Mike@yahoo.com
103      2015              2060.00            Langer          G.             Justin           6.5                          Langer_Justin@yahoo.com
104      3010              3050.00            Quails          D.             Jack             6.5                          Quails_Jack@yahoo.com

Cust_ID values from Customer_Fixed_Deposit: 101, 103, 104
Cust_ID values from Customer_Loan: 101, 103, 104, 103

Query results (UNION):

Cust_ID
101
103
104

Figure 4-32: Using UNION to combine query results

Query results (UNION ALL) for the same two tables:

Cust_ID
101
104
103
104
103
101
103

Figure 4-33: Using UNION ALL to combine query results
There are some restrictions on the query results that can be combined by a UNION operation:

• The SELECT statements combined using UNION or UNION ALL must contain the same
number of columns
• The data type of each column in the first query result must be the same as the data type of
the corresponding column in the second query result. The data width and column name can
differ
• Neither of the two SELECT statements can be sorted with its own ORDER BY clause. However,
the combined query results can be sorted


Note: Eliminating duplicate rows from query results is a time-consuming
process, especially if the query results contain a large number of rows. If one
is sure that the UNION operation cannot produce duplicate rows, one should
specifically use the UNION ALL operation because the query will execute much
more quickly.
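
The effect of the ALL keyword can be seen with the sketch below. It assumes cut-down versions of the two tables (three columns each) filled with the values from Figures 4-32 and 4-33, and is meant to be run in an empty scratch schema. The ORDER BY at the end sorts the combined result, which is the only place where sorting is allowed.

```sql
-- Cut-down versions of the two tables (an assumption for this sketch).
CREATE TABLE Customer_Fixed_Deposit (
    Cust_ID           INTEGER,
    Fixed_Deposit_No  INTEGER,
    Amount_in_Dollars DECIMAL(10,2)
);
CREATE TABLE Customer_Loan (
    Cust_ID           INTEGER,
    Loan_No           INTEGER,
    Amount_in_Dollars DECIMAL(10,2)
);

INSERT INTO Customer_Fixed_Deposit VALUES (101, 2011, 8055.00);
INSERT INTO Customer_Fixed_Deposit VALUES (103, 2015, 2060.00);
INSERT INTO Customer_Fixed_Deposit VALUES (104, 3010, 3050.00);

INSERT INTO Customer_Loan VALUES (101, 1011, 8755.00);
INSERT INTO Customer_Loan VALUES (103, 2010, 2555.00);
INSERT INTO Customer_Loan VALUES (104, 2056, 3050.00);
INSERT INTO Customer_Loan VALUES (103, 2015, 2000.00);

-- UNION: duplicate rows are eliminated.
SELECT Cust_ID FROM Customer_Fixed_Deposit
UNION
SELECT Cust_ID FROM Customer_Loan
ORDER BY Cust_ID;
-- Expected: 101, 103, 104 (3 rows)

-- UNION ALL: every row of both results is retained.
SELECT Cust_ID FROM Customer_Fixed_Deposit
UNION ALL
SELECT Cust_ID FROM Customer_Loan
ORDER BY Cust_ID;
-- Expected: 101, 101, 103, 103, 103, 104, 104 (7 rows)
```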


4.6.4.12. Retrieval using INTERSECT
The INTERSECT operation selects the rows that are common to two sets of query results.
Refer to Figure 4-34.

Example:
SELECT Cust_ID
FROM Customer_Fixed_Deposit

INTERSECT

SELECT Cust_ID
FROM Customer_Loan;
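
With the course data every customer has both a fixed deposit and a loan, so the intersection equals each input. The sketch below therefore adds one hypothetical customer (Cust_ID 105, Fixed_Deposit_No 4020) who has a fixed deposit but no loan. This row is an assumption made for illustration and is not part of the course tables. The tables are cut down to two columns and the statements are meant to be run in an empty scratch schema.

```sql
-- Cut-down versions of the two tables (an assumption for this sketch).
CREATE TABLE Customer_Fixed_Deposit (Cust_ID INTEGER, Fixed_Deposit_No INTEGER);
CREATE TABLE Customer_Loan          (Cust_ID INTEGER, Loan_No INTEGER);

INSERT INTO Customer_Fixed_Deposit VALUES (101, 2011);
INSERT INTO Customer_Fixed_Deposit VALUES (103, 2015);
INSERT INTO Customer_Fixed_Deposit VALUES (104, 3010);
INSERT INTO Customer_Fixed_Deposit VALUES (105, 4020);   -- hypothetical row

INSERT INTO Customer_Loan VALUES (101, 1011);
INSERT INTO Customer_Loan VALUES (103, 2010);
INSERT INTO Customer_Loan VALUES (104, 2056);
INSERT INTO Customer_Loan VALUES (103, 2015);

-- Only Cust_ID values present in BOTH results are returned, each one once.
SELECT Cust_ID FROM Customer_Fixed_Deposit
INTERSECT
SELECT Cust_ID FROM Customer_Loan
ORDER BY Cust_ID;
-- Expected: 101, 103, 104
-- 105 is dropped (no loan); 103 appears once although it has two loans.
```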

Cust_ID values from Customer_Fixed_Deposit: 101, 103, 104
Cust_ID values from Customer_Loan: 101, 103, 104, 103
(for the Customer_Fixed_Deposit and Customer_Loan records shown in Figure 4-32)

Query results (INTERSECT):

Cust_ID
101
103
104

Figure 4-34: Using INTERSECT to combine query results

4.6.5. Sub-Queries
A sub-query is a query within a query. The results of the sub-query are used by the DBMS to
determine the results of the higher-level query that contains the sub-query. Usually, the
sub-query appears within the WHERE or HAVING clause of another SQL statement.

SELECT [ ALL / DISTINCT ] column-name1, column-name2, ... FROM table-specification
[ WHERE search-condition ]
[ GROUP BY grouping-column ]
[ HAVING search-condition ]
[ ORDER BY sort-specification ]

Figure 4-35: Basic sub-query syntax
The sub-query is enclosed in parentheses, but otherwise it has a form similar to that of a
SELECT statement, with a FROM clause and optional WHERE, GROUP BY, and HAVING clauses.
The form of these clauses in a sub-query is identical to that in a SELECT statement, and they
perform their normal functions when used within a sub-query.
4.6.5.1. Independent Sub-Queries
• Inner Query is independent of Outer Query
• Inner Query is executed first and the results are stored
• Outer Query then runs on the stored results

Example: To list the Cust_ID and Loan_No for all Customers who have taken a loan of
amount greater than the loan amount of Customer (Cust_ID = 104).

SELECT Cust_ID, Loan_No
FROM Customer_Loan
WHERE Amount_in_Dollars >
(SELECT Amount_in_Dollars
FROM Customer_Loan
WHERE Cust_ID = 104);

Step 1: The sub-query is executed on the Customer_Loan table.

SELECT Amount_in_Dollars
FROM Customer_Loan
WHERE Cust_ID = 104;

Sub-query data: 3050.00

Step 2: 3050.00 is compared with the values in the Amount_in_Dollars column of the
Customer_Loan table.

Cust_ID  Loan_No  Amount_in_Dollars
101      1011     8755.00
103      2010     2555.00
104      2056     3050.00
103      2015     2000.00

Query result:

Cust_ID  Loan_No
101      1011

Figure 4-36: How an independent sub-query executes
The inner query, which retrieves the Amount_in_Dollars of Cust_ID 104, can be executed
independently of the outer query. Hence the name independent sub-query.

The inner query needs to be executed only once, since it returns one constant value
irrespective of the outer query. Because the sub-query is compared using the > operator, it
must return a single value. Here Cust_ID 104 has exactly one loan.
In the above example, the inner query is executed first and its result is stored. The
condition of the outer query is then evaluated against the stored result once for each row of
the Customer_Loan table.

Example:
1. List customer names of all customers who have taken a loan > $3000.00.

SELECT Cust_Last_Name, Cust_Mid_Name, Cust_First_Name
FROM Customer_Details
WHERE Cust_ID IN
( SELECT Cust_ID
FROM Customer_Loan
WHERE Amount_in_Dollars > 3000.00);

2. List customer names of all customers who have the same Account_type as Customer
Jones Simon.

SELECT Cust_Last_Name, Cust_Mid_Name, Cust_First_Name
FROM Customer_Details
WHERE Account_Type =
( SELECT Account_Type
FROM Customer_Details
WHERE Cust_Last_Name = 'Jones'
AND Cust_First_Name = 'Simon');

3. List customer names of all customers who do not have a Fixed Deposit.

SELECT Cust_Last_Name, Cust_Mid_Name, Cust_First_Name
FROM Customer_Details
WHERE Cust_ID NOT IN
( SELECT Cust_ID
FROM Customer_Fixed_Deposit);

4. List customer names of all customers who do not have both a Fixed Deposit and a loan at
any of the Bank Branches. The result includes customers who have only a Fixed Deposit, only
a loan, or neither.

SELECT Cust_Last_Name, Cust_Mid_Name, Cust_First_Name
FROM Customer_Details
WHERE Cust_ID NOT IN
( SELECT Cust_ID
FROM Customer_Loan
WHERE Cust_ID IN
(SELECT Cust_ID
FROM Customer_Fixed_Deposit));
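
The two-step execution of an independent sub-query can be followed with the sketch below. It uses the four Customer_Loan records from the figures of this section and is meant to be run in an empty scratch schema.

```sql
CREATE TABLE Customer_Loan (
    Cust_ID           INTEGER,
    Loan_No           INTEGER,
    Amount_in_Dollars DECIMAL(10,2)
);

INSERT INTO Customer_Loan VALUES (101, 1011, 8755.00);
INSERT INTO Customer_Loan VALUES (103, 2010, 2555.00);
INSERT INTO Customer_Loan VALUES (104, 2056, 3050.00);
INSERT INTO Customer_Loan VALUES (103, 2015, 2000.00);

-- Step 1: the inner query can be run on its own, because it does not
-- refer to any column of the outer query.
SELECT Amount_in_Dollars
FROM Customer_Loan
WHERE Cust_ID = 104;
-- Expected: 3050.00

-- Step 2: the outer query compares every row with that single value.
SELECT Cust_ID, Loan_No
FROM Customer_Loan
WHERE Amount_in_Dollars >
      (SELECT Amount_in_Dollars
       FROM Customer_Loan
       WHERE Cust_ID = 104);
-- Expected: 101  1011
```

If customer 104 had more than one loan, the sub-query would return several values and the comparison with > would fail. In that case an operator such as IN, or an aggregate such as MAX in the sub-query, would be needed.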
4.6.5.2. Co-Related Sub-Queries
In co-related sub-queries, SQL performs the sub-query once for each row of the main query.
The inner query always refers to one or more columns from the table of the outer query.
Refer to Figure 4-37.

Example: To list all Customers who have a fixed deposit of amount less than the sum of all
their loans.


Query:

SELECT Cust_ID, Cust_Last_Name, Cust_Mid_Name, Cust_First_Name
FROM Customer_Fixed_Deposit
WHERE Amount_in_Dollars <
(SELECT SUM (Amount_in_Dollars)
FROM Customer_Loan
WHERE Customer_Loan.Cust_ID = Customer_Fixed_Deposit.Cust_ID);

Figure 4-37: A Correlated Query
Explanation of the query:
The inner query is repeated once for every record of the outer query. The outer query uses
the Customer_Fixed_Deposit table. Refer to Figure 4-38.

Cust_ID  Fixed_Deposit_No  Amount_in_Dollars  Cust_Last_Name  Cust_Mid_Name  Cust_First_Name  Rate_of_Interest_in_Percent  Cust_Email
101      2011              8055.00            Smith           A.             Mike             6.5                          Smith_Mike@yahoo.com
103      2015              2060.00            Langer          G.             Justin           6.5                          Langer_Justin@yahoo.com
104      3010              3050.00            Quails          D.             Jack             6.5                          Quails_Jack@yahoo.com

Figure 4-38: Customer_Fixed_Deposit table
The inner query uses the Customer_Loan table. Refer to Figure 4-39.

Cust_ID  Loan_No  Amount_in_Dollars
101      1011     8755.00
103      2010     2555.00
104      2056     3050.00
103      2015     2000.00

Figure 4-39: Customer_Loan table
The Customer_Fixed_Deposit table has three records, so the inner query is repeated three
times. This is similar to the nested FOR loop that has been covered in the Programming
Fundamentals course.

For the first record in the Customer_Fixed_Deposit table:
1. The record with the value of 101 in the Cust_ID column of the Customer_Fixed_Deposit
table is read.

2. All records with a value of 101 in the Cust_ID column of the Customer_Loan table are
retrieved and their Amount_in_Dollars values are summed up. In the example, there is
only one record with a value of 101 in the Cust_ID column and its Amount_in_Dollars
value is $8755.00.

3. This value is compared with the fixed deposit amount of $8055.00. Since $8055.00 is less
than $8755.00, the record with the Cust_ID value of 101 is part of the query result.

8055.00 < 8755.00 (True)

For the second record in the Customer_Fixed_Deposit table:
1. The record with the value of 103 in the Cust_ID column of the Customer_Fixed_Deposit
table is read.

2. All records with a value of 103 in the Cust_ID column of the Customer_Loan table are
retrieved and their Amount_in_Dollars values are summed up. In the example, there are
two records with a value of 103 in the Cust_ID column and the sum of their
Amount_in_Dollars values is $4555.00.

3. This value is compared with the fixed deposit amount of $2060.00. Since $2060.00 is less
than the SUM(Amount_in_Dollars) value of $4555.00, the record with the Cust_ID value of
103 is part of the query result.

2060.00 < 4555.00 (True)

For the third record in the Customer_Fixed_Deposit table:
1. The record with the value of 104 in the Cust_ID column of the Customer_Fixed_Deposit
table is read.

2. All records with a value of 104 in the Cust_ID column of the Customer_Loan table are
retrieved and their Amount_in_Dollars values are summed up. In the example, there is
only one record with a value of 104 in the Cust_ID column and its Amount_in_Dollars
value is $3050.00.

3. This value is compared with the fixed deposit amount of $3050.00. Since $3050.00 is equal
to, and not less than, $3050.00, the record with the Cust_ID value of 104 is not part of
the query result.

3050.00 < 3050.00 (False)
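
The walk-through of the three records can be reproduced with the sketch below. It assumes a cut-down Customer_Fixed_Deposit table (three columns) together with the Customer_Loan records, both filled with the values used in the walk-through, and is meant to be run in an empty scratch schema.

```sql
-- Cut-down version of Customer_Fixed_Deposit (an assumption for this sketch).
CREATE TABLE Customer_Fixed_Deposit (
    Cust_ID           INTEGER,
    Cust_Last_Name    VARCHAR(20),
    Amount_in_Dollars DECIMAL(10,2)
);
CREATE TABLE Customer_Loan (
    Cust_ID           INTEGER,
    Loan_No           INTEGER,
    Amount_in_Dollars DECIMAL(10,2)
);

INSERT INTO Customer_Fixed_Deposit VALUES (101, 'Smith',  8055.00);
INSERT INTO Customer_Fixed_Deposit VALUES (103, 'Langer', 2060.00);
INSERT INTO Customer_Fixed_Deposit VALUES (104, 'Quails', 3050.00);

INSERT INTO Customer_Loan VALUES (101, 1011, 8755.00);
INSERT INTO Customer_Loan VALUES (103, 2010, 2555.00);
INSERT INTO Customer_Loan VALUES (104, 2056, 3050.00);
INSERT INTO Customer_Loan VALUES (103, 2015, 2000.00);

-- The inner query refers to Customer_Fixed_Deposit.Cust_ID (an outer
-- reference), so it is evaluated once per fixed-deposit row.
SELECT Cust_ID, Cust_Last_Name
FROM Customer_Fixed_Deposit
WHERE Amount_in_Dollars <
      (SELECT SUM(Amount_in_Dollars)
       FROM Customer_Loan
       WHERE Customer_Loan.Cust_ID = Customer_Fixed_Deposit.Cust_ID);
-- Per-row checks: 8055.00 < 8755.00 (true), 2060.00 < 4555.00 (true),
--                 3050.00 < 3050.00 (false)
-- Expected rows (order not guaranteed):
--   101  Smith
--   103  Langer
```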

Cust_ID  Cust_Last_Name  Cust_Mid_Name  Cust_First_Name
101      Smith           A.             Mike
103      Langer          G.             Justin

Figure 4-40: Output of co-related query


Note: For each row of the Customer_Fixed_Deposit table to be tested by the
WHERE clause of the main query, the Cust_ID column (which appears in the
sub-query as an outer reference) has a different value. Thus SQL carries out
this sub-query once for each row in the Customer_Fixed_Deposit table.
A sub-query containing an outer reference is called a correlated sub-query because its results
are correlated with each individual row of the main query. For the same reason, an outer
reference is sometimes called a correlated reference.

Example:
List customer IDs of all customers who have both a Fixed Deposit and a loan at any of the
Bank Branches.

SELECT Cust_ID
FROM Customer_Details
WHERE Cust_ID IN
(SELECT Cust_ID
FROM Customer_Loan
WHERE Customer_Loan.Cust_ID = Customer_Details.Cust_ID)

AND Cust_ID IN

(SELECT Cust_ID
FROM Customer_Fixed_Deposit
WHERE Customer_Fixed_Deposit.Cust_ID = Customer_Details.Cust_ID);
4.6.6. JOINS
Join operations take two tables and return another table as a result.

Cartesian Product / Cross Join

Cross joins return all rows from the first table. Each row from the first table is combined with
all rows from the second table. Cross joins are also known as the Cartesian product [41] (or just
the product) of two tables. The columns of the product table are all the columns of the first
table, followed by all the columns of the second table.

[41] Cartesian product: A mathematical term that, when applied to relational databases, refers to the
result obtained by joining all the rows of one table with all the rows of another table in every possible
combination.
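
A cross join can be tried with the sketch below. The table names Table_A and Table_X are hypothetical. The column names and values follow the two small tables of Figure 4-41. The statements are meant to be run in an empty scratch schema.

```sql
-- Hypothetical table names; columns and values as in Figure 4-41.
CREATE TABLE Table_A (A VARCHAR(5), B VARCHAR(5), C VARCHAR(5));
CREATE TABLE Table_X (X VARCHAR(5), Y VARCHAR(5));

INSERT INTO Table_A VALUES ('a1', 'b1', 'c1');
INSERT INTO Table_A VALUES ('a2', 'b2', 'c2');

INSERT INTO Table_X VALUES ('x1', 'y1');
INSERT INTO Table_X VALUES ('x2', 'y2');

-- Listing both tables without a join condition gives the Cartesian product.
SELECT A, B, C, X, Y
FROM Table_A, Table_X;

-- The same product written with the explicit CROSS JOIN keyword.
SELECT A, B, C, X, Y
FROM Table_A CROSS JOIN Table_X;
-- Expected for both statements: 2 * 2 = 4 rows (order not guaranteed):
--   a1 b1 c1 x1 y1
--   a1 b1 c1 x2 y2
--   a2 b2 c2 x1 y1
--   a2 b2 c2 x2 y2
```

Because the number of rows is the product of the two table sizes, a cross join of large tables grows very quickly, which is why a join condition is normally supplied.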

Refer to Figure 4-41.



Figure 4-41: The Cartesian product of two tables


4.6.6.1. SELF JOIN
Joining a table with itself is a self-join.

Example:
Problem Statement: To list all the Employees (Employee_ID, Employee_Last_Name,
Employee_First_Name) along with their Managers (Manager_ID, Manager_Last_Name,
Manager_First_Name).

Query:
SELECT Emp.Employee_ID as Employee ID,
Emp.Employee_Last_Name as Employee Last Name,
Emp.Employee_First_Name as Employee First Name,
Emp.Manager_ID as Manager ID,
Manager.Employee_Last_Name as Manager Last Name,
Manager.Employee_First_Name as Manager First Name
FROM Employee_Manager Emp, Employee_Manager Manager
WHERE Emp.Manager_ID = Manager.Employee_ID;

Processing of the Query:

Step 1: The table Employee_Manager has two aliases (another name), Emp and Manager.

Step 2: Manager_ID attribute of Emp (alias for Employee_Manager) is matched with
Employee_ID attribute of Manager (alias for Employee_Manager). The Figure below shows the
matching of only two records. The other records are matched similarly.
A B C
a1 b1 c1
a2 b2 c2
X Y
x1 y1
x2 y2
Cartesian Product
( m * n ) rows
A B C X
a1 b1 c1 x1
a1 b1 c1 x2
a2 b2 c2 x1
Y
y1
y2
y1
a2 b2 c2 x2 y2
Table 1
Table 2
Product of Table1 and Table2
Relational Database Management System

125 | P a g e I n f o s y s F o u n d a t i o n P r o g r a m


Step 3: The columns that appear in the output table are specified in the column list used
with the SELECT statement.

Figure 4-42: Output of SELF JOIN
4.6.6.2. INNER JOINS
An inner join between two (or more) tables is the Cartesian product that satisfies the join
condition in the WHERE clause.

Inner joins use a comparison operator like = or <> to match rows from two tables based on the
values in common columns from each table.

Inner Joins include Equi-Joins A join in which the joining condition is based on equality
between values in the common columns.

Example:
SELECT Table1.Emp_ID, Table1.City, Table2.Cust_ID, Table2.City
FROM Table1, Table2
Employee
_ID
Employee_
Last_Name
Employee_
Mid_Name
Employee_
First_Name
Department
2345Atherton S. Cindy HR
3556George A. Henry Finance
22789Stevenson S. Crystal HR
23456Smith A. Luther Finance
Grade
1
1
2
2
Manager_ID
NULL
NULL
2345
3556
30456Langer C. Christiana HR 3 2345
31234Frost J. Robert Finance 3 3556
32345Austen L. Jane Design 2 3620
3620Jackson G. Matt Design 1 NULL
Employee_Email
Atherton_Cindy@yahoo.com
George_Henry@rediffmail.com
Jackson_Matt@samsonite.co.in
Stevenson_Crystal@mag.com
Smith_Luther@yahoo.com
Langer_Christiana@rediffmail.com
Frost_Robert@training.com
Austen_Jane@yahoo.com
Employee
_ID
Employee_
Last_Name
Employee_
Mid_Name
Employee_
First_Name
Department
2345Atherton S. Cindy HR
3556George A. Henry Finance
22789Stevenson S. Crystal HR
23456Smith A. Luther Finance
Grade
1
1
2
2
Manager_ID
NULL
NULL
2345
3556
30456Langer C. Christiana HR 3 2345
31234Frost J. Robert Finance 3 3556
32345Austen L. Jane Design 2 3620
3620Jackson G. Matt Design 1 NULL
Employee_Email
Atherton_Cindy@yahoo.com
George_Henry@rediffmail.com
Jackson_Matt@samsonite.co.in
Stevenson_Crystal@mag.com
Smith_Luther@yahoo.com
Langer_Christiana@rediffmail.com
Frost_Robert@training.com
Austen_Jane@yahoo.com
Emp (Alias for Employee_Manager)
Manager (Alias for Employee_Manager)
Query Results
Employee
ID
Employee
Last Name
Employee
First Name
22789Stevenson Crystal
23456Smith Luther
Manager
ID
2345
3556
30456Langer Christiana 2345
31234Frost Robert 3556
32345Austen Jane 3620
Manager
Last Name
Atherton
George
Atherton
George
Jackson
Manager
First Name
Cindy
Henry
Cindy
Henry
Matt
Relational Database Management System

126 | P a g e I n f o s y s F o u n d a t i o n P r o g r a m
WHERE Table1.City = Table2.City;


Figure 4-43: An Example of inner join
4.6.6.3. OUTER JOINS
An inner join provides only those values that satisfy the WHERE condition. However, it may be
worthwhile sometimes, to retrieve all rows that match the WHERE clause and those that have
unmatched rows in the column being compared. An outer join is then used to retrieve the
rows with an unmatched value in the relevant column.

Refer to Figure 4-44.

Constructing a FULL OUTER JOIN:

x Begin with the INNER JOIN of the two tables, using matching columns
x For each row of the left table that is not matched by any row in the right table, add
one row to the query results, using the values of the columns in the left table, and
assuming a NULL value for all columns of the right table
x For each row of the right table that is not matched by any row in the left table, add
one row to the query results, using the values of the columns in the right table, and
assuming a NULL value for all columns of the left table

Emp_ID CITY
A1 New YorK
A2 NULL
A3 Chicago
A4 Chicago
A5 Paris
Cust_ID CITY
B1 New York
B2 New York
B3 NULL
B4 Chicago
B5 Moscow
Table1
Table2
Table1.Emp_ID Table1.City Table2.Cust_ID Table2.City
A1 New York B1 New York
A1 New York B2 New York
A3 Chicago B4 Chicago
A4 Chicago B4 Chicago
INNER
JOIN
Output Table
Relational Database Management System

127 | P a g e I n f o s y s F o u n d a t i o n P r o g r a m

Figure 4-44: An example of OUTER JOIN

Note: Full Outer Join is supported by Oracle 9i and later versions.
4.6.6.4. LEFT OUTER JOIN
Constructing a LEFT OUTER JOIN:

x Begin with the INNER JOIN of the two tables, using matching columns
x For each row of the left table that is not matched by any row in the right table, add
one row to the query results, using the values of the columns in the left table, and
assuming a NULL value for all columns of the right table

Refer to Figure 4-45.


Note: The LEFT OUTER JOIN thus includes NULL-extended copies of the
unmatched rows from the first (left) table but does not include any
unmatched rows from the second (right) table.


Example: The syntax given is Oracle specific.
SELECT Table1.Emp_ID, Table1.City, Table2.Cust_ID, Table2.City
FROM Table1, Table2
WHERE Table1.City = Table2.City (+);

Table1                    Table2
Emp_ID   CITY             Cust_ID   CITY
A1       New York         B1        New York
A2       NULL             B2        New York
A3       Chicago          B3        NULL
A4       Chicago          B4        Chicago
A5       Paris            B5        Moscow

Left_Outer_Join Table
Table1.Emp_ID   Table1.City   Table2.Cust_ID   Table2.City
A1              New York      B1               New York      (INNER JOIN)
A1              New York      B2               New York      (INNER JOIN)
A3              Chicago       B4               Chicago       (INNER JOIN)
A4              Chicago       B4               Chicago       (INNER JOIN)
A5              Paris         NULL             NULL          (unmatched row of Table1)
A2              NULL          NULL             NULL          (unmatched row of Table1)

Figure 4-45: An example of LEFT OUTER JOIN

4.6.6.5. RIGHT OUTER JOIN
Constructing a RIGHT OUTER JOIN:

x Begin with the INNER JOIN of the two tables, using matching columns
x For each row of the right table that is not matched by any row in the left table, add
one row to the query results, using the values of the columns in the right table, and
assuming a NULL value for all columns of the left table

Refer to Figure 4-46.


Note: The RIGHT OUTER JOIN thus includes NULL-extended copies of the
unmatched rows from the second (right) table but does not include any
unmatched rows from the first (left) table.

Example: The syntax given is Oracle specific.

SELECT Table1.Emp_ID, Table1.City, Table2.Cust_ID, Table2.City
FROM Table1, Table2
WHERE Table1.City (+) = Table2.City;

Table1                    Table2
Emp_ID   CITY             Cust_ID   CITY
A1       New York         B1        New York
A2       NULL             B2        New York
A3       Chicago          B3        NULL
A4       Chicago          B4        Chicago
A5       Paris            B5        Moscow

Right_Outer_Join Table
Table1.Emp_ID   Table1.City   Table2.Cust_ID   Table2.City
A1              New York      B1               New York      (INNER JOIN)
A1              New York      B2               New York      (INNER JOIN)
A3              Chicago       B4               Chicago       (INNER JOIN)
A4              Chicago       B4               Chicago       (INNER JOIN)
NULL            NULL          B5               Moscow        (unmatched row of Table2)
NULL            NULL          B3               NULL          (unmatched row of Table2)

Figure 4-46: An example of RIGHT OUTER JOIN
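
The outer joins described above can be tried out on the sample tables with a short script.
The sketch below is only an illustration and is not part of the course material. It assumes
Python's standard sqlite3 module with an in-memory SQLite database rather than Oracle, and it
uses the ANSI LEFT OUTER JOIN ... ON syntax rather than the Oracle-specific (+) operator.
Because older SQLite versions do not offer RIGHT or FULL OUTER JOIN, the right outer join is
written here as a left outer join with the tables swapped, and the full outer join is emulated
as the UNION of the two.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Emp_ID TEXT, City TEXT);
    CREATE TABLE Table2 (Cust_ID TEXT, City TEXT);
    INSERT INTO Table1 VALUES ('A1', 'New York'), ('A2', NULL), ('A3', 'Chicago'),
                              ('A4', 'Chicago'), ('A5', 'Paris');
    INSERT INTO Table2 VALUES ('B1', 'New York'), ('B2', 'New York'), ('B3', NULL),
                              ('B4', 'Chicago'), ('B5', 'Moscow');
""")

COLUMNS = "Table1.Emp_ID, Table1.City, Table2.Cust_ID, Table2.City"

# LEFT OUTER JOIN: every Table1 row survives; unmatched ones are NULL-extended.
left_sql = f"SELECT {COLUMNS} FROM Table1 LEFT OUTER JOIN Table2 ON Table1.City = Table2.City"
# RIGHT OUTER JOIN, written as a LEFT OUTER JOIN with the two tables swapped.
right_sql = f"SELECT {COLUMNS} FROM Table2 LEFT OUTER JOIN Table1 ON Table1.City = Table2.City"
# FULL OUTER JOIN, emulated as the UNION of the two queries above.
full_sql = f"{left_sql} UNION {right_sql}"

left_rows = conn.execute(left_sql).fetchall()
right_rows = conn.execute(right_sql).fetchall()
full_rows = conn.execute(full_sql).fetchall()

for name, rows in [("LEFT", left_rows), ("RIGHT", right_rows), ("FULL", full_rows)]:
    print(name, "OUTER JOIN:", len(rows), "rows")
    for row in rows:
        print("   ", row)
```

NULL appears as None in the Python output. Note that the row with a NULL city in Table1 is not
matched with the row with a NULL city in Table2, because NULL = NULL is not TRUE. The UNION
emulation would also collapse genuinely duplicate result rows, so it is adequate for this
sample data only.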
4.6.7. Queries using EXISTS / NOT EXISTS
4.6.7.1. EXISTS
The EXISTS predicate checks whether a sub-query produces any row(s) of results.

Consider a nested query. If the sub-query following EXISTS returns at least one row, EXISTS
returns TRUE and further execution of the inner SELECT statement can stop. When the sub-query
refers to columns of the outer query, the test is repeated for each row of the outer query,
and a row is included in the result only if EXISTS returns TRUE for it.

If the inner query produces no rows, EXISTS returns FALSE and the corresponding row of the
outer query is not returned. The EXISTS test cannot produce a NULL value.

Example:
1. List all Customers who have at least one Fixed Deposit of more than $3000.00.

SELECT Cust_ID, Cust_Last_Name, Cust_Mid_Name, Cust_First_Name
FROM Customer_Details CD
WHERE EXISTS
(SELECT *
FROM Customer_Fixed_Deposit CFD
WHERE CFD.Amount_in_Dollars > 3000.00 AND CFD.Cust_ID = CD.Cust_ID);

Note: CD is the alias for Customer_Details. CFD is the alias for Customer_Fixed_Deposit.


2. List all Customers who have both a Fixed Deposit and a Loan at the Bank.

SELECT Cust_ID
FROM Customer_Fixed_Deposit
WHERE EXISTS
(SELECT *
FROM Customer_Loan
WHERE Customer_Loan.Cust_ID = Customer_Fixed_Deposit.Cust_ID);


4.6.7.2. NOT EXISTS
The logic of the EXISTS test can be reversed by using the NOT EXISTS form. In this case, the
test is TRUE if the sub-query produces no rows, and FALSE otherwise.

Example:
List all Customers who do not have a single Fixed Deposit over $3000.00.

SELECT Cust_ID, Cust_Last_Name, Cust_Mid_Name, Cust_First_Name
FROM Customer_Details CD
WHERE NOT EXISTS
(SELECT *
FROM Customer_Fixed_Deposit CFD
WHERE CFD.Amount_in_Dollars > 3000.00
AND CFD.Cust_ID = CD.Cust_ID);
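
The two examples above can be checked against a handful of rows. The following sketch assumes
Python's standard sqlite3 module and an in-memory SQLite database in place of Oracle; the
customer names and deposit amounts are invented for illustration, and only the columns needed
by the queries are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer_Details (Cust_ID INTEGER PRIMARY KEY, Cust_Last_Name TEXT);
    CREATE TABLE Customer_Fixed_Deposit (
        Fixed_Deposit_No  INTEGER PRIMARY KEY,
        Cust_ID           INTEGER,
        Amount_in_Dollars REAL
    );
    -- Hypothetical sample data: 101 has one large deposit, 102 only a small one, 103 none.
    INSERT INTO Customer_Details VALUES (101, 'Smith'), (102, 'Jones'), (103, 'Brown');
    INSERT INTO Customer_Fixed_Deposit VALUES (1, 101, 5000.00), (2, 101, 1000.00),
                                              (3, 102, 2500.00);
""")

QUERY = """
    SELECT Cust_ID
    FROM Customer_Details CD
    WHERE {test}
        (SELECT *
         FROM Customer_Fixed_Deposit CFD
         WHERE CFD.Amount_in_Dollars > 3000.00 AND CFD.Cust_ID = CD.Cust_ID)
    ORDER BY Cust_ID
"""

# The sub-query is evaluated once per customer because it refers to CD.Cust_ID.
with_large_fd = [row[0] for row in conn.execute(QUERY.format(test="EXISTS"))]
without_large_fd = [row[0] for row in conn.execute(QUERY.format(test="NOT EXISTS"))]

print("EXISTS:    ", with_large_fd)
print("NOT EXISTS:", without_large_fd)
```

Every customer lands in exactly one of the two lists, which reflects the statement that the
EXISTS test is either TRUE or FALSE and never NULL.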

4.6.8. The Order of Execution of a SELECT statement
If a SELECT Statement contains a WHERE, GROUP BY, HAVING and ORDER BY CLAUSE, the
order of execution is as follows:

1. The WHERE clause is applied first, and the rows for which the search condition in the
WHERE clause returns a TRUE are retained.

2. Next a GROUP BY clause is applied. It will group the rows selected by the WHERE clause
such that all the rows in each group have the same value for the column in the GROUP BY
clause.

3. Next the HAVING clause is applied. It will retain row groups for which the search condition
in the HAVING clause returns a TRUE value.

4. Lastly the query result is sorted in the order specified in the ORDER BY clause.
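
The four steps can be observed on a small table. The sketch below is an illustration that
assumes Python's standard sqlite3 module with an in-memory SQLite database; the emp table
and its rows are hypothetical sample data. The clerk group is the one to watch: it ends up
with a single row because the WHERE clause removed the other clerk before the groups were
formed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (ename TEXT, job TEXT, sal REAL);
    INSERT INTO emp VALUES
        ('King',  'president', 5000),
        ('Blake', 'manager',   2850),
        ('Clark', 'manager',   2450),
        ('Smith', 'clerk',      800),
        ('Adams', 'clerk',     1100),
        ('Ford',  'analyst',   3000);
""")

rows = conn.execute("""
    SELECT job, COUNT(*), AVG(sal)
    FROM emp
    WHERE sal > 1000          -- step 1: Smith (800) is discarded row by row
    GROUP BY job              -- step 2: the remaining five rows are grouped by job
    HAVING AVG(sal) < 4000    -- step 3: the president group (average 5000) is discarded
    ORDER BY AVG(sal) DESC    -- step 4: the surviving groups are sorted
""").fetchall()

for row in rows:
    print(row)
```

If the grouping had been done before the WHERE filter, the clerk group would have contained
two rows with an average of 950 instead of one row with an average of 1100.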
4.7. Views
A view is a virtual table in the database defined by a query. A view does not exist in the
database as a stored set of data values. The rows and columns of data visible through the
view are produced by the query that defines the view.
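
That a view stores no data of its own can be seen by changing the underlying table and
querying the view again. The sketch below assumes Python's standard sqlite3 module and an
in-memory SQLite database; the view name view_downtown and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer_Details (
        Cust_ID        INTEGER PRIMARY KEY,
        Cust_Last_Name TEXT,
        Bank_Branch    TEXT
    );
    INSERT INTO Customer_Details VALUES (101, 'Smith', 'Downtown'),
                                        (102, 'Jones', 'Bridgewater');

    -- The view is only a stored query; no rows are copied when it is created.
    CREATE VIEW view_downtown AS
        SELECT Cust_ID, Cust_Last_Name
        FROM Customer_Details
        WHERE Bank_Branch = 'Downtown';
""")

VIEW_QUERY = "SELECT Cust_ID FROM view_downtown ORDER BY Cust_ID"

before = conn.execute(VIEW_QUERY).fetchall()

# Change the source table, not the view.
conn.execute("INSERT INTO Customer_Details VALUES (103, 'Brown', 'Downtown')")

after = conn.execute(VIEW_QUERY).fetchall()

print("before:", before)
print("after: ", after)
```

The new row becomes visible through the view without the view being touched, because the
defining query is run against Customer_Details each time the view is referenced.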


CREATE VIEW view-name [(column-name1, column-name2, ...)] AS query

Figure 4-47: The CREATE VIEW statement syntax
4.7.1. Horizontal View
A horizontal view restricts a user's access to selected rows of a table.

CREATE VIEW view_cust AS
SELECT *
FROM Customer_Details
WHERE Cust_ID IN (101, 102, 103);

Figure 4-48: Horizontal View
4.7.2. Vertical View
A vertical view restricts a user's access to selected columns of a table.

CREATE VIEW view_cust AS
SELECT Cust_ID, Account_No, Account_Type
FROM Customer_Details;

Figure 4-49: Vertical View
4.7.3. DROP VIEW Statement
The DROP VIEW statement is used to drop a view.

DROP VIEW view-name

Figure 4-50: DROP VIEW statement syntax
4.7.4. Joined Views
Joined Views are used to simplify multi-table queries. A joined view draws its data from two
or three different tables and presents the query results as a single virtual table. Once the
view is defined, one can use a single-table query against the view for requests that would
otherwise each require a two-table or three-table join.

CREATE VIEW Cust_View AS
SELECT Customer_Details.Cust_Last_Name, Customer_Details.Cust_First_Name,
Fixed_Deposit_No, Amount_in_Dollars
FROM Customer_Details, Customer_Fixed_Deposit
WHERE Customer_Details.Cust_ID = Customer_Fixed_Deposit.Cust_ID;

Figure 4-51: Joined Views
A view can be referenced like a real table in a SELECT, INSERT, DELETE, or UPDATE
statement. However, more complex views cannot be updated; they are read-only views.
4.7.5. VIEW Updates
A view can be updated if the query that defines the view meets all of the following
restrictions:

x DISTINCT must not be specified; that is, duplicate rows must not be eliminated from
the query results
x The FROM clause must specify only one updateable table; the view must have a single
underlying source table
x The SELECT list cannot contain expressions, calculated columns, or column functions
x The WHERE clause must not include a sub-query; only simple row-by-row search
conditions may be used
x The SELECT list must include all the columns specified with the NOT NULL constraint
4.7.6. Checking View Updates (CHECK OPTION)
If a view is defined by a query that includes the WHERE clause, only rows that meet the
search criteria are visible in the view. Other rows may be present in the source table(s) from
which the view is derived, but they are not visible through the view.





Example:

CREATE VIEW view_customer AS
SELECT Cust_ID, Cust_Last_Name, Account_No, Account_Type, Bank_Branch
FROM Customer_Details
WHERE Bank_Branch = 'Downtown';

Figure 4-52: Creation of a simple view

INSERT INTO view_customer
VALUES (115, 'Costner', 107, 'Savings', 'Bridgewater');

Figure 4-53: Insertion in a simple view

Note: This is a perfectly valid SQL statement, and the RDBMS inserts a new
row with the specified column values into the Customer_Details table.
However, the newly inserted row does not meet the search condition for the
view. As a result, if one runs the query below immediately after the INSERT
statement, the newly added row does not show up in the view.

SELECT Cust_ID, Cust_Last_Name, Bank_Branch
FROM view_customer;

SQL allows the DBMS to detect and prevent this type of INSERT or UPDATE from taking place
through the view by creating the view with the CHECK OPTION. The CHECK OPTION is
specified in the CREATE VIEW statement, as shown below:

CREATE VIEW view_customer AS
SELECT Cust_ID, Cust_Last_Name, Account_No, Account_Type, Bank_Branch
FROM Customer_Details
WHERE Bank_Branch = 'Downtown'
WITH CHECK OPTION;

Figure 4-54: Create view with CHECK OPTION
4.7.7. Advantages of Views
x Security: A user can be permitted to access the database only through a small set of
views that contain the specific data the user is authorized to see
x Query simplicity: A view can draw data from several different tables and present it as
a single table, thus effectively turning multi-table queries into single-table queries.
Internally, the RDBMS still uses multi-table queries
x Structural simplicity: Views can give a user a personalized view of the database
structure, presenting the database as a set of virtual tables that make sense to the
user
x Data integrity: If data is accessed and entered through a view, the DBMS can
automatically check the data to ensure that it meets specified integrity constraints
4.7.8. Disadvantages of Views
x Performance: The DBMS translates the queries against the view into queries against
the underlying source tables. If a view is defined by a multi-table query, then even a
simple query against the view becomes a complicated join, and it may take a long time
to complete. The same applies to insert, delete and update operations made through
the view

x Update restrictions: When a user tries to update rows of a view, the DBMS must
translate the request into an update on rows of the underlying source tables. This is
possible for simple views, but more complicated views cannot be updated
4.8. Data Control Language (DCL)
DCL statements are used to control access to the database and the data in it. They are used
to enforce data security.

4.8.1. Granting Privileges
The GRANT statement is used to grant security privileges on database objects to specific
users. Normally, the GRANT statement is used by the owner of the table or view to give other
users access to the data.

GRANT SELECT / INSERT / DELETE / UPDATE / ALL PRIVILEGES ON table-name
TO user-name / PUBLIC
[ WITH GRANT OPTION ]

Figure 4-55: The GRANT statement syntax
Example:
GRANT SELECT, INSERT
ON Customer_Details
TO Edwin;

GRANT ALL PRIVILEGES
ON Customer_Loan
TO JACK;

GRANT ALL
ON Customer_Loan
TO PUBLIC;

4.8.1.1. Passing Privileges (GRANT OPTION)
A GRANT statement with the WITH GRANT OPTION clause conveys, along with the specified
privileges, the right to grant those privileges to other users.

[Diagram: 1. EDWIN grants a privilege to JACK WITH GRANT OPTION; 2. JACK GRANTs that
privilege to BORIS]

Figure 4-56: Using the GRANT OPTION


4.8.2. Revoking Privileges (REVOKE)
The REVOKE statement is used to REVOKE privileges previously granted with the GRANT
statement.

REVOKE SELECT / INSERT / DELETE / UPDATE / ALL PRIVILEGES ON table-name
FROM user-name / PUBLIC

Figure 4-57: The REVOKE statement syntax
Example:
REVOKE SELECT, INSERT
ON Customer_Details
FROM Edwin;

REVOKE ALL PRIVILEGES
ON Customer_Loan
FROM JACK;

REVOKE ALL
ON Customer_Loan
FROM PUBLIC;

[Diagram: 1. EDWIN grants a privilege to JACK WITH GRANT OPTION; 2. JACK GRANTs that
privilege to BORIS; 3. REVOKE (with CASCADE)]

Figure 4-58: REVOKE with CASCADE
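
The interplay of WITH GRANT OPTION and a cascading REVOKE can be modelled in a few lines. The
sketch below is a toy in-memory model written in Python, not a description of how any RDBMS
stores or checks privileges. The class PrivilegeCatalog and its methods are hypothetical, and
it is an assumption of the sketch that EDWIN owns the table and that he is the one who
revokes the privilege from JACK.

```python
class PrivilegeCatalog:
    """Toy model of GRANT / REVOKE for one table (illustration only)."""

    def __init__(self, owner):
        self.owner = owner
        # Each grant is a tuple: (grantor, grantee, privilege, with_grant_option)
        self.grants = []

    def has_privilege(self, user, privilege):
        return user == self.owner or any(
            g[1] == user and g[2] == privilege for g in self.grants)

    def can_grant(self, user, privilege):
        # The owner may always grant; others need the privilege WITH GRANT OPTION.
        return user == self.owner or any(
            g[1] == user and g[2] == privilege and g[3] for g in self.grants)

    def grant(self, grantor, grantee, privilege, with_grant_option=False):
        if not self.can_grant(grantor, privilege):
            raise PermissionError(f"{grantor} may not grant {privilege}")
        self.grants.append((grantor, grantee, privilege, with_grant_option))

    def revoke_cascade(self, grantor, grantee, privilege):
        """Remove one grant and, recursively, the grants that depended on it."""
        self.grants = [g for g in self.grants
                       if g[:3] != (grantor, grantee, privilege)]
        if not self.can_grant(grantee, privilege):
            passed_on = [g for g in self.grants
                         if g[0] == grantee and g[2] == privilege]
            for g in passed_on:
                self.revoke_cascade(g[0], g[1], privilege)


catalog = PrivilegeCatalog(owner="EDWIN")
catalog.grant("EDWIN", "JACK", "SELECT", with_grant_option=True)  # step 1
catalog.grant("JACK", "BORIS", "SELECT")                          # step 2
boris_before = catalog.has_privilege("BORIS", "SELECT")

catalog.revoke_cascade("EDWIN", "JACK", "SELECT")                 # step 3
jack_after = catalog.has_privilege("JACK", "SELECT")
boris_after = catalog.has_privilege("BORIS", "SELECT")

print(boris_before, jack_after, boris_after)
```

In this model BORIS holds the privilege only through JACK, so it disappears together with
JACK's privilege. BORIS was granted SELECT without the grant option, so an attempt by BORIS
to grant it onward raises PermissionError.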
4.9. Best Practices
1. Do not use SELECT *. This is time-consuming and reduces performance. Instead, list
out each field that is required.

2. It is potentially dangerous to use SELECT * in embedded SQL i.e. SQL embedded in an
application program because the meaning of the asterisk (*) might change. Example: if
a column is added to or dropped from some table.

3. When testing for NULL in the WHERE clause of a query, use IS NULL as opposed to
= NULL. A comparison with = NULL is never TRUE, so it selects no rows.

4. If one is sure that the UNION operation cannot produce duplicate rows, use the UNION
ALL as opposed to UNION because the query will execute much more quickly.

5. If the GROUP BY clause has been used in a SELECT statement, then use only the
grouping columns (columns on which grouping has been done) or aggregate functions in
the column list of the SELECT statement.

6. Rows that have a NULL value in the relevant column are ignored by all the aggregate
functions except COUNT(*).

7. An index is most appropriate when queries against a table are more frequent than INSERT
and UPDATE operations.

8. EXISTS is beneficial when the most selective filter is in the parent query. This allows
the selective predicates in the parent query to be applied before filtering the rows
against the EXISTS criteria.

9. IN is most beneficial when the most selective filter appears in the sub-query and there
are indexes on the join columns
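
Practices 3 and 6 are easy to verify on a small table. The sketch below assumes Python's
standard sqlite3 module and an in-memory SQLite database; it reuses three rows of the
Table1 sample data from the join examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Emp_ID TEXT, City TEXT);
    INSERT INTO Table1 VALUES ('A1', 'New York'), ('A2', NULL), ('A3', 'Chicago');
""")

# "= NULL" evaluates to unknown for every row, so nothing is selected.
equals_null = conn.execute("SELECT Emp_ID FROM Table1 WHERE City = NULL").fetchall()

# "IS NULL" is the test that actually finds the row with the missing city.
is_null = conn.execute("SELECT Emp_ID FROM Table1 WHERE City IS NULL").fetchall()

# COUNT(*) counts rows; COUNT(City) skips the row whose City is NULL.
count_star, count_city = conn.execute(
    "SELECT COUNT(*), COUNT(City) FROM Table1").fetchone()

print("= NULL  :", equals_null)
print("IS NULL :", is_null)
print("COUNT(*) =", count_star, " COUNT(City) =", count_city)
```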

Tips to write a good query:

Tip 1:
SELECT account_no, trans_date, amount
FROM transaction
WHERE amount + 3000 < 5000;

Replace the above query with the following

SELECT account_no, trans_date, amount
FROM transaction
WHERE amount < 2000;

Reason: Avoid unnecessary computational overhead in queries. Keeping the column alone on one
side of the comparison spares an addition for every row and can also allow an index on the
amount column to be used.

Tip 2:
SELECT quantity, AVG(actual_price)
FROM item
GROUP BY quantity
HAVING quantity > 40;

Replace the above query with the following:

SELECT quantity, AVG(actual_price)
FROM item
WHERE quantity > 40
GROUP BY quantity;

Reason: The WHERE clause filters the rows from the table according to the search condition,
and the GROUP BY clause is then applied only to the filtered rows, which saves time. If the
rows are instead grouped first and the row groups are then filtered using the HAVING clause,
groups that will be discarded anyway still have to be built, which increases the time required
to execute the query.


Tip 3:
Problem Statement: To retrieve the average salary for presidents and managers.
SELECT job, avg(sal)
FROM emp GROUP BY job
HAVING job = 'president' OR job = 'manager';

Replace the above query with the following:
SELECT job, avg(sal)
FROM emp
WHERE job = 'president' OR job = 'manager'
GROUP BY job;

Reason: As in Tip 2, the WHERE clause discards the unwanted rows before grouping, so the
GROUP BY clause processes fewer rows than when the groups are formed first and filtered
afterwards using the HAVING clause.

Tip 4:
Problem Statement: To select records from the debit_transactions and credit_transactions
tables where tran_date is 31-DEC-99
SELECT acct_num, balance_amt
FROM debit_transactions
WHERE tran_date = '31-DEC-99'

UNION

SELECT acct_num, balance_amt
FROM credit_transactions
WHERE tran_date = '31-DEC-99';

Replace the above query with the following:

SELECT acct_num, balance_amt
FROM debit_transactions
WHERE tran_date = '31-DEC-99'

UNION ALL

SELECT acct_num, balance_amt
FROM credit_transactions
WHERE tran_date = '31-DEC-99';

Reason: Eliminating duplicate rows from query results is a very time-consuming process,
especially if the query results contain a large number of rows. If one is sure that the UNION
operation cannot produce duplicate rows, one should specifically use the UNION ALL
operation because the query will execute much more quickly.
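
The condition "if one is sure that the UNION operation cannot produce duplicate rows" matters,
because the two operators return different results when duplicates do exist. The sketch below
assumes Python's standard sqlite3 module and an in-memory SQLite database; the sample rows
are hypothetical and deliberately contain one row that occurs in both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE debit_transactions  (acct_num INTEGER, balance_amt REAL, tran_date TEXT);
    CREATE TABLE credit_transactions (acct_num INTEGER, balance_amt REAL, tran_date TEXT);
    INSERT INTO debit_transactions  VALUES (1, 100.0, '31-DEC-99'), (2, 250.0, '31-DEC-99');
    -- (1, 100.0) also occurs in the debit table, so it is a duplicate in the combined result.
    INSERT INTO credit_transactions VALUES (1, 100.0, '31-DEC-99'), (3, 75.0, '31-DEC-99');
""")

TEMPLATE = """
    SELECT acct_num, balance_amt FROM debit_transactions  WHERE tran_date = '31-DEC-99'
    {operator}
    SELECT acct_num, balance_amt FROM credit_transactions WHERE tran_date = '31-DEC-99'
"""

union_rows = conn.execute(TEMPLATE.format(operator="UNION")).fetchall()
union_all_rows = conn.execute(TEMPLATE.format(operator="UNION ALL")).fetchall()

print("UNION     returns", len(union_rows), "rows")      # duplicates removed
print("UNION ALL returns", len(union_all_rows), "rows")  # duplicates kept
```

UNION ALL skips the duplicate-elimination step, which is where the saving comes from; it is a
safe replacement only when the extra rows cannot occur or are acceptable.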

Tip 5:
Problem statement: To determine whether any transaction was made on 25-JAN-2005

SELECT COUNT(*)
FROM Customer_Transaction
WHERE Transaction_Date = '25-JAN-2005';

Replace the above query with the following:

SELECT Cust_Last_Name, Cust_Mid_Name, Cust_First_Name
FROM Customer_Details
WHERE EXISTS (SELECT Cust_ID
FROM Customer_Transaction
WHERE Transaction_Date = '25-JAN-2005');

Reason: When COUNT(*) is used, every qualifying row has to be counted, which is a time-consuming
operation. If EXISTS is used, it only checks whether the sub-query produces any rows of query
results. If the sub-query following the EXISTS returns at least one row, the EXISTS test returns
TRUE and stops further execution of the inner SELECT statement. It thus minimizes overhead.








4.10. Summary

x The CREATE TABLE statement creates a table and defines its columns, PRIMARY KEY,
FOREIGN KEY(s) and other constraints like UNIQUE and NOT NULL

x The DROP TABLE statement removes a previously created table from the database

x The ALTER TABLE statement can be used to add a column to an existing table, modify
a column definition, add/drop a PRIMARY KEY, FOREIGN KEY and other constraints like
UNIQUE and NOT NULL

x The CREATE INDEX statement can be used to define indexes, which speed up
database queries but add overhead to database updates

x If a SELECT Statement contains a WHERE, GROUP BY, HAVING and ORDER BY CLAUSE,
the order of execution is as follows:

o The WHERE clause is applied first, and the rows for which the search condition
in the WHERE clause returns a TRUE are retained.

o Next a GROUP BY clause is applied. It will group the rows selected by the
WHERE clause such that all the rows in each group have the same value for the
column in the GROUP BY clause.

o Next the HAVING clause is applied. It will retain row groups for which the
search condition in the HAVING clause returns a TRUE value.

o Lastly the query result is sorted in the order specified in the ORDER BY clause.

x DCL statements are used to control access to the database and the data in it. They are
used to enforce data security

5. On-Line Transaction Processing (OLTP)

5.1. Purpose
The main responsibilities of a modern information system are:
x To simulate [42] the manual system
x To record every transaction that the organization undertakes
x To capture the day-to-day activities in the life cycle of an enterprise
x To help the organization make quick and correct decisions based on the data
x To protect the data from unauthorized access
x To recover the data in case of failures

Every organization requires some on-line application system or server to manage its daily
activities. These systems help in recording the transactions that the organization carries out
with its employees, customers and vendors. It is hard to imagine an enterprise
without an on-line transaction system.

To build an efficient on-line transaction system, it is necessary to know how these systems
are built, the difficulties that may be encountered and how to overcome them.

In this cyber-age, one also needs to know how to protect data from unauthorized usage and how
to recover the data in case of failures.
5.2. Transaction
A transaction is an interaction between different users, between different systems, or
between a user and a system.

A transaction is a logical unit of work which takes the database from one consistent state to
another consistent state. While moving from one consistent state to another consistent state,
the database may pass through multiple discrete steps.

The database may go back to its original state at the end of the transaction (this happens in
the case of failure) or to the next logical step (this happens in the case of success).
Consider the following examples:

Example 1: Drawing money from a bank account is one transaction. This transaction has
multiple steps.
x Insert the ATM card into the ATM machine
x Enter the PIN number
x Machine validates the PIN number
x Choose the appropriate menu for money withdrawal
x The ATM machine checks the account balance to ensure that all banking business
rules are strictly followed
x After doing all the checks, the ATM machine dispenses the exact amount
and updates the records accordingly

[42] Simulate: To make a model.

For any reason, if any of the above steps fail, then the transaction itself fails and the records
are not modified. This means that the system goes back to its original state and no change is
made to the system.

If ALL the steps have been successfully carried out, then the records are updated accordingly
and the system goes to a state where it is equipped for the next transaction.

Example 2: A person wants to transfer money from account Acc1 to account Acc2.
This transaction has the following steps:

x Insert the ATM card into the ATM machine
x Enter the PIN number
x Machine validates the PIN number
x Choose the appropriate menu for money transfer
x Enter information of the beneficiary account
x The ATM machine checks for the account balance to ensure that all banking business
rules are strictly followed
x After verifying the balance, amount will be debited from account Acc1 and the records
are updated accordingly
x The deducted amount will be deposited to Account Acc2 and the records are updated
accordingly.

From the above steps, it is evident that a transaction may have any number of physical
steps. A transaction is successful only if ALL the steps are carried out successfully. A
transaction cannot be divided into smaller tasks.

The successful completion of the transaction is called the COMMIT state. After this state,
changes are permanent and irreversible.

If ANY ONE step fails, the complete transaction fails and the system is taken back to the
original state which was present before the beginning of the transaction. This process of going
back to the original state is called ROLLBACK.
If the transaction rolls back, then the transaction reaches the ABORT state.
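
The all-or-nothing behaviour of Example 2 can be sketched as follows. The code assumes
Python's standard sqlite3 module with an in-memory SQLite database; the Account table, the
opening balances and the transfer function are hypothetical. The credit is deliberately issued
before the debit so that a failing debit shows the earlier step being undone by the rollback.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Account (Acc_No TEXT PRIMARY KEY, Balance REAL CHECK (Balance >= 0))")
conn.executemany("INSERT INTO Account VALUES (?, ?)", [("Acc1", 1000.0), ("Acc2", 500.0)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move amount from source to target as one transaction; return True on COMMIT."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute("UPDATE Account SET Balance = Balance + ? WHERE Acc_No = ?",
                         (amount, target))
            # Fails with IntegrityError if the balance would become negative.
            conn.execute("UPDATE Account SET Balance = Balance - ? WHERE Acc_No = ?",
                         (amount, source))
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT Acc_No, Balance FROM Account"))


first_ok = transfer(conn, "Acc1", "Acc2", 300)    # both steps succeed: COMMIT
after_first = balances(conn)

second_ok = transfer(conn, "Acc1", "Acc2", 5000)  # debit fails: ROLLBACK
after_second = balances(conn)

print(first_ok, after_first)
print(second_ok, after_second)
```

After the failed transfer the balances are exactly what they were before it started, even
though the credit to Acc2 had already been applied inside the transaction.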


[Diagram: a transaction BEGINs in the Active state]
State                 When the transaction is in this state
Active                While executing
Partially completed   After the final statement has been executed
Failed                When normal execution can't proceed
Aborted               After rolling back and restoration to previous state
Committed             After successful completion

Figure 5-1: Transaction state transition diagram
5.3. Transaction Systems
Transaction processing (TP) systems, which mimic real-life systems such as salary
processing, library, banking, airline and defence missile systems, are divided into three
categories.
5.3.1. Batch Transaction Processing System
In a batch transaction processing system, a set of application programs works on a set of
input data to produce the desired output. This process involves no human interaction.

The best example of batch processing is a salary slip generation application. The salary
slip generation program may read data such as employee name, grade, basic salary, date of
joining, overtime for the week, loss of pay, loans, recoveries, etc., from the database and
generate the salary slips. Mail is sent to employees automatically, without any
intervention by the users.
5.3.2. On-line Transaction Processing System (OLTP)
In an OLTP system, the user interacts continuously with the system through a computer or a
terminal. Some examples of on-line systems are the airline reservation system, the railway
reservation system, the banking ATM machine, the library application, etc.

In these kinds of systems, the user needs to enter pre-defined inputs like flight number, train
number, date of journey, amount to withdraw, book access number, return date, etc.
Based on these pre-defined inputs, the system produces pre-defined outputs like
confirmed tickets, non-availability of tickets, the issue of a library book for a certain
period, etc.
This chapter studies the OLTP system in depth.

5.3.3. Real time Transaction Processing System
This system is the most complicated among all the transaction systems.
It is capable of handling unexpected inputs and producing outputs that are not pre-defined.
Examples: Air traffic control system or Missile defense system.
These systems are capable of handling a sudden change in the air
pressure, the temperature, the wind direction, the target speed and the
direction, and can change their output based on these inputs.

These real time applications are similar to on-line systems except that the
inputs and the outputs of the system are not always pre-defined.
5.4. Transaction Properties
Every transaction system must possess the following characteristics:

Atomicity: Transactions should either completely succeed or completely fail. If, for any reason,
the system crashes before the completion of the transaction, the database state should not
change. The data involved in the transaction should be restored to the
previous consistent state in the database. The transaction is indivisible, which
means it cannot be divided further into sub-tasks.

Consistency: Transactions must preserve database consistency or stability. A transaction
transforms the database from one consistent state to another consistent state.

Isolation: A transaction's operations like INSERT, SELECT, UPDATE and DELETE should not
interfere with other transactions, or in other words with the transactions
of other users of the database. The database system should reveal the individual changes
made by a transaction only after the transaction has completed successfully.

Durability: Once a transaction completes (commits), the changes made to database are
permanent and available to all the transactions that follow it.


These properties are called the ACID properties (the name is derived from the first letters
of the above characteristics).
5.5. Requirements for an OLTP System
In addition to ACID properties, OLTP systems have additional requirements to meet. In the
following sections, these requirements are discussed.
5.5.1. Integrity
All the data entering the system must be validated for its correctness and adherence to
the organization's business rules. This is implemented in an RDBMS through three types of
integrity checks.

Domain Integrity involves the implementation of business domain specific rules.

Example: Suppose an organization decides not to hire anyone who is older than 58 years or who
is 20 years old or younger. This can be implemented using the CHECK constraint. Subtracting
one DATE from another in Oracle yields a number of days, so the age at joining is computed in
years using MONTHS_BETWEEN.

CREATE TABLE Employee (
Emp_number NUMBER(6) CONSTRAINT pk_employee PRIMARY KEY,
Emp_Name VARCHAR2(25) NOT NULL,
Dept_number NUMBER(5) REFERENCES DEPARTMENT(Dept_number),
Date_of_Birth DATE NOT NULL,
Date_of_joining DATE DEFAULT sysdate,
CHECK (MONTHS_BETWEEN(Date_of_joining, Date_of_Birth) / 12 > 20 AND
MONTHS_BETWEEN(Date_of_joining, Date_of_Birth) / 12 <= 58));
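
The effect of such an age rule can be tried out with a few inserts. The sketch below assumes
Python's standard sqlite3 module and an in-memory SQLite database instead of Oracle. SQLite
has no DATE type, so the dates are stored as ISO text and the age is approximated as the
difference in days divided by 365.25; the employee names and dates are hypothetical, and the
Dept_number column is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Employee (
        Emp_number      INTEGER PRIMARY KEY,
        Emp_Name        TEXT NOT NULL,
        Date_of_Birth   TEXT NOT NULL,   -- ISO date text, e.g. '1980-01-01'
        Date_of_joining TEXT NOT NULL,
        -- Age at joining, in years (approximation): must be above 20 and at most 58.
        CHECK ((julianday(Date_of_joining) - julianday(Date_of_Birth)) / 365.25 > 20
           AND (julianday(Date_of_joining) - julianday(Date_of_Birth)) / 365.25 <= 58)
    )
""")


def try_insert(row):
    """Return True if the row satisfies the constraints, False if it is rejected."""
    try:
        conn.execute("INSERT INTO Employee VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert((1, "Asha", "1980-01-01", "2010-06-01"))   # about 30 years old
too_young = try_insert((2, "Ravi", "2000-01-01", "2010-01-01"))  # about 10 years old
too_old = try_insert((3, "Mohan", "1940-01-01", "2010-01-01"))   # about 70 years old

print(accepted, too_young, too_old)
```

The rejected rows never reach the table; the DBMS enforces the business rule no matter which
application issues the INSERT.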

Entity integrity is implemented using the primary key constraint. Entity integrity
refers to the fact that a particular attribute uniquely identifies the physical entity.

Example: For each employee, the employee number uniquely identifies the employee. This
means employee number 0007 always represents the details of employee Gopalakrishnan S. Hence
entity integrity requires that primary keys have neither null values nor duplicate
values.

Referential Integrity is implemented using the relationships between primary keys and
foreign keys of tables within a database. This ensures consistency of data. Referential
integrity demands that the value of every foreign key present in every table is matched by
the value of a primary key in another table. This relationship is called a parent-child
relationship.
For example, every employee of the organization must belong to a valid department. Hence the
department number column of the employee table refers to the department number column of the
department table.

One cannot insert into the department number column of employee table unless the value is
present in the department number column of department table (except for NULL values). If
NULL is a value in the foreign key column, it represents the unknown state and is not a
violation of referential integrity. In above example, department table is the parent table and
employee table is child table.

After enforcing referential integrity, the parent table primary key value might be deleted.
This will violate the referential integrity of the child table. This is because the child table
might still contain records containing the original parent table primary key value.

For example, the employee table contains department number as the foreign key column, which
refers to the primary key of the department table. If any department number is deleted from
the department table, and if the employee table contains the corresponding department
number value, it leads to a violation of referential integrity. To avoid such situations, the
following restrictions can be placed on the foreign key columns of the child table at the
time of creation.
x ON DELETE RESTRICT: Do not allow the parent table data to be deleted if it is referred to
in the child table. For example, if department number 10 is referred to in the employee
table, then do not allow department number 10 to be deleted from the department table.
This is the default behaviour in Oracle
x ON DELETE SET NULL: On deletion of the parent data, set the NULL value in the child table
wherever the deleted data is referred to. For example, if department number 10 is
referred to in the employee table, and it is deleted from the department table, set NULL
values in the corresponding department number columns of the employee table (wherever
department number 10 was referred to)
x ON DELETE SET DEFAULT: Set the default values in the child records on deletion of
parent records. For example, if department number 10 is referred to in the employee table,
and it is deleted from the department table, set default values (say 00) in the employee
table wherever department number 10 was referred to
x ON DELETE CASCADE: Delete all the matching records from the child table on
deletion of the parent record in the parent table. For example, if department number 10 is
referred to in the employee table, and it is deleted from the department table, delete all
the records in the employee table wherever department number 10 was referred to
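
Three of these options can be compared side by side by deleting department 10 under each of
them. The sketch below assumes Python's standard sqlite3 module and an in-memory SQLite
database, which accepts the same ON DELETE clauses once foreign key enforcement is switched
on; the helper function and the sample rows are hypothetical. SET DEFAULT is left out because
it additionally needs a default department that exists in the parent table.

```python
import sqlite3


def employees_after_deleting_dept_10(on_delete_action):
    """Build parent and child tables, delete department 10, report what happens."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
    conn.execute("CREATE TABLE Department (Dept_number INTEGER PRIMARY KEY)")
    conn.execute(f"""
        CREATE TABLE Employee (
            Emp_number  INTEGER PRIMARY KEY,
            Dept_number INTEGER REFERENCES Department(Dept_number)
                        ON DELETE {on_delete_action}
        )
    """)
    conn.executemany("INSERT INTO Department VALUES (?)", [(10,), (20,)])
    conn.executemany("INSERT INTO Employee VALUES (?, ?)", [(1, 10), (2, 10), (3, 20)])
    conn.commit()
    try:
        conn.execute("DELETE FROM Department WHERE Dept_number = 10")
    except sqlite3.IntegrityError:
        return "delete rejected"
    return conn.execute(
        "SELECT Emp_number, Dept_number FROM Employee ORDER BY Emp_number").fetchall()


results = {action: employees_after_deleting_dept_10(action)
           for action in ("RESTRICT", "SET NULL", "CASCADE")}

for action, outcome in results.items():
    print(f"ON DELETE {action}: {outcome}")
```

With RESTRICT the parent row cannot be removed while employees still refer to it, with SET
NULL the two employees of department 10 keep their rows but lose the reference, and with
CASCADE they are deleted together with the department.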

5.5.2. Concurrency
Concurrency means allowing different transactions to execute simultaneously. The biggest
challenge of having a concurrent system is maintenance of consistency in the system, in spite
of the multiple transactions executing simultaneously.

Consider a simple example to know how concurrency affects consistency.

Assume there are only three seats available between Bangalore and Singapore on a particular
day and around ten people are trying to book a ticket on the same flight on that day.

If the system allows these transactions to proceed simultaneously without any control, all
ten people can book those three seats from different locations. Ten people get a ticket each
when there were just three seats available! This is a serious violation of the consistency or
integrity of the system.

The following sections list the consistency problems that can occur when transactions
execute simultaneously.

5.5.2.1. Lost Update
Let us understand the lost update problem using a banking application. Assume a person named
Hilary holds an account in the Capital Bank, San Jose branch, with an account balance of
USD 1500.

One fine day she deposits USD 2000 in cash into her account.

While the branch clerk is updating her account at San Jose, her husband Kevin deposits USD
1800 into her account at almost the same time in San Francisco.

The clerk in San Francisco is not aware of the other transaction and adds this amount to the
balance that was read earlier (USD 1500). Her balance is now updated to USD 3300, thereby
ignoring her deposit of USD 2000 at San Jose. One update to her account has been lost!

This problem has occurred because two transactions are working on the same resource
without knowing each other's activity.

Figure 5-2 gives a snapshot of main memory [43] while these two transactions operate
on Hilary's account.

Note that:
x This diagram is not an RDBMS table. It is a snapshot of main memory at that point of time
x Balance column indicates the current available balance of Hilary when the two
transactions are concurrently running

[43] Main Memory: Please recall CHSSC concepts - All the read and write operations happen in main
memory before they are written into hard disks.

Time  | Hilary's Deposit         | Balance | Kevin's Deposit
10:22 | Read Balance (1500)      | 1500    |
10:23 | Balance=1500+2000        |         |
10:24 |                          |         | Read Balance (1500)
10:25 | Write new Balance (3500) | 3500    |
10:26 | Commit                   |         |
10:27 |                          |         | Balance=1500+1800
10:28 |                          | 3300    | Write new Balance (3300)
10:29 |                          |         | Commit
Figure 5-2: Lost Update
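
The schedule of Figure 5-2 can be replayed in a few lines of code. The sketch below is only an illustration: the function names are hypothetical, and the balance is modelled as a plain variable rather than a real database row. It compares the interleaved schedule with a serial one in which the second deposit reads the balance only after the first has written it.

```python
def interleaved_deposits(opening_balance, first_deposit, second_deposit):
    """Replay Figure 5-2: both transactions read before either one writes."""
    balance = opening_balance
    first_read = balance                      # 10:22 first transaction reads
    second_read = balance                     # 10:24 second transaction reads
    balance = first_read + first_deposit      # 10:25 first transaction writes
    balance = second_read + second_deposit    # 10:28 second write overwrites it
    return balance


def serial_deposits(opening_balance, first_deposit, second_deposit):
    """Run the two deposits one after the other."""
    balance = opening_balance
    balance = balance + first_deposit
    balance = balance + second_deposit
    return balance


lost = interleaved_deposits(1500, 2000, 1800)
correct = serial_deposits(1500, 2000, 1800)
print(lost, correct)
```

The interleaved schedule ends with 3300, while the serial schedule ends with 5300. The difference of 2000 is the lost update.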
5.5.2.2. Dirty Read
Let us revisit the example discussed for the lost update case, with a slightly different
scenario as shown in Figure 5-3.

This time, Hilary again tries to deposit USD 2000 to her account, but due to some technical
reasons the transaction is not successful. We know that a transaction leaves the data either in
the prior state or in a new state after the completion of the transaction. So, her deposit
transaction is aborted and her balance is rolled back to the original value of USD 1500.

But unfortunately Kevin's transaction read the balance value as USD 3500 in main memory
before the rollback of the previous transaction. Kevin's transaction occurring in San Francisco
therefore read dirty (uncommitted) data. This kind of problem is known as dirty read.
Time  | Hilary's Deposit         | Balance | Kevin's Deposit
11:22 | Read Balance (1500)      | 1500    |
11:23 | Balance=1500+2000        |         |
11:24 | Write new Balance (3500) | 3500    |
11:25 |                          |         | Read Balance (3500)
11:26 | Rollback                 |         |
11:27 |                          |         | Balance=3500+1800
11:28 |                          | 5300    | Write new Balance (5300)
11:29 |                          |         | Commit
Figure 5-3: Dirty Read
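
The dirty read of Figure 5-3 can be sketched in the same way. The Account class below is a hypothetical illustration, not part of any RDBMS product: it keeps the last committed value separately from the value currently visible in main memory, so that a rollback can be shown.

```python
class Account:
    """Toy account with a committed value and an in-memory working value."""

    def __init__(self, balance):
        self.committed = balance   # last committed balance
        self.in_memory = balance   # value other transactions can see in memory

    def write(self, value):
        self.in_memory = value

    def rollback(self):
        self.in_memory = self.committed

    def commit(self):
        self.committed = self.in_memory


account = Account(1500)
account.write(account.in_memory + 2000)   # 11:24 uncommitted write of 3500
dirty_value = account.in_memory           # 11:25 second transaction reads 3500
account.rollback()                        # 11:26 first transaction rolls back
account.write(dirty_value + 1800)         # 11:28 write based on the dirty value
account.commit()                          # 11:29 commit

expected_balance = 1500 + 1800            # what the balance should have been
print(account.committed, expected_balance)
```

The committed balance becomes 5300 although only USD 1800 was successfully deposited, so the correct balance would be 3300.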
5.5.2.3. Incorrect Summary
Let us consider another scenario where Hilary wants to transfer USD 500 to her
sister Evelyn's account in the same branch. After deducting USD 500, Hilary's balance will be
USD 1000.

Evelyn's account balance was USD 1500 before and will now become USD 2000 with the
addition of USD 500.

Almost at the same time, the bank branch manager starts another transaction to calculate the
total sum available in the bank through customer deposits.

This program calculates the sum by reading Hilary's balance as USD 1500 (before
deduction of the transfer amount) and Evelyn's balance as USD 2000 (after addition of the transfer
amount). This program concludes that the sum is USD 3500 (sum of Hilary's balance
and Evelyn's balance) but actually it is only USD 3000. This problem is known as
incorrect summary. A snapshot of main memory for these transactions is shown in Figure 5-4.

Time  | Hilary's Transfer            | Balance | Summary Transaction
12:22 | Read Hilary's Balance (1500) | 1500    | Sum = 0
12:23 | Balance=1500-500             |         | Read Hilary's Balance (1500)
12:24 | Write new Balance (1000)     | 1000    |
12:25 |                              |         | Sum=Sum + Balance (1500)
12:26 | Read Evelyn's Balance (1500) | 1500    |
12:27 | Balance=1500 + 500           |         |
12:28 | Write new Balance (2000)     | 2000    |
12:29 | Commit                       | 2000    |
12:30 |                              |         | Read Evelyn's Balance (2000)
12:31 |                              |         | Sum=Sum + Balance (3500)
12:32 |                              | 3500    | Write Sum (3500)
12:33 |                              |         | Commit
Figure 5-4: Incorrect Summary
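
The arithmetic behind Figure 5-4 can be checked with a short sketch. The dictionary of balances is an assumption made only for illustration. The summary reads one account before the transfer and the other account after it.

```python
balances = {"Hilary": 1500, "Evelyn": 1500}

total = 0
total += balances["Hilary"]     # summary reads 1500, before the deduction
balances["Hilary"] -= 500       # transfer deducts 500 from Hilary
balances["Evelyn"] += 500       # transfer adds 500 to Evelyn
total += balances["Evelyn"]     # summary reads 2000, after the addition

actual_total = sum(balances.values())
print(total, actual_total)
```

The summary transaction reports 3500, while the money actually held in the two accounts is 3000 both before and after the transfer.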
5.5.2.4. Phantom Record
Let us consider the snapshot of two different transactions which are running almost at the
same time, as shown in Figure 5-5.

One transaction is counting the number of accounts held by the bank. Another transaction is
creating new accounts. Though two accounts are created and committed before completion
of the Total Accounts transaction, these accounts (Simon's account and Mike's account) are
missed by the Total Accounts transaction while counting.

These newly inserted rows appear as phantoms to the Total Accounts transaction,
inconsistently appearing and disappearing. This is called a phantom record because a phantom is
an invisible ghost, as the newly inserted rows are to the counting transaction.

Time  | Create account                                 | Total Accounts
13:22 |                                                |
13:23 |                                                |
13:24 |                                                | Read the total number of accounts in the bank as total
13:25 | Create account for Simon with a deposit of 500 |
13:26 | Create account for Mike with a deposit of 1000 |
13:27 |                                                |
13:28 | Commit                                         |
13:29 |                                                | Write Total
13:30 |                                                | Commit
Figure 5-5: Phantom Record
If we observe these problems closely, all of them arise from the interleaving of the
transactions. The solution to overcome these problems is to make the transactions behave as if
they followed one another. This is called serialization. Serialization of transactions can be
achieved by applying the following rules to transactions.

1. If any row is being modified, then do not allow any other transaction either to read or
update/delete that row until the first transaction completes.
2. If a transaction is reading a particular row, prevent other transactions from making any
changes to that row until the first transaction completes.
3. If a transaction is reading some data, do not allow any other transaction to insert new
rows into the same table until the first transaction completes. This will avoid problems
like Phantom records.

Serialization can be achieved using Locking or Time Stamping techniques.
5.6. Locks
Locking is a technique to have a controlled access to the resources like a database,
tablespace [44], table, rows and columns. While these resources are put under lock by some
transaction, other transactions have very restricted or no access to these resources,
depending on the locking mode.

Locking is one of the most widely used techniques in commercial RDBMS products to achieve
consistency while supporting concurrency of transactions.

[44] Tablespace: The logical part of the database which represents collection of the structures like
tables, etc created by various users.

Basically resources can be locked either in Shared (S) mode for Read purpose or in Exclusive
(X) mode for Update, Delete or Insert purpose.
5.6.1. Shared Lock (S)
This locking technique allows a higher transaction concurrency.
When a particular resource like a table or a row is locked in the
shared mode by one transaction, all other transactions can
perform the read operation on the locked resource, but no
updates or modifications are possible by other transactions.
Usually SELECT operation takes the S lock on resources.
5.6.2. Exclusive Lock (X)
This is the most restrictive lock. Once a transaction puts the X
lock on a particular resource, no other transaction can put any
kind of lock on this resource. This resource is exclusively
reserved for the first transaction and no other transaction can
use it for read or write operation. Hence this X lock allows the
least concurrency.

Usually INSERT/UPDATE/DELETE operations put the X lock on
resources before writing/modifying/deleting operations.

The Figure 5-6 explains the compatibility of these locks. This figure can be interpreted as:
If transaction T1 locks the resource A (database/tablespace/table/row/field) in shared(S)
mode, another transaction T2 can also lock the same resource A, in shared(S) mode.

If transaction T1 locks the resource A (database/tablespace/table/row/field) in shared (S)
mode, another transaction T2 cannot lock the same resource A, in exclusive (X) mode until T1
releases its S lock on the resource A.

If transaction T1 locks resource A in X mode no other transaction can lock resource A in any
other (S or X) mode until T1 releases its X lock on the resource A.

Resource A        | Transaction T1: X | Transaction T1: S
Transaction T2: X | ✗                 | ✗
Transaction T2: S | ✗                 | ✓
Figure 5-6: Share - Exclusive Lock Matrix
In Figure 5-6, the symbol ✗ represents incompatibility of the lock and the symbol ✓
represents compatibility of the lock.
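
The share/exclusive rules can be sketched as a tiny lock table. This is a minimal illustration under simplifying assumptions, not how any particular RDBMS product implements locking: the class name LockTable and its methods are hypothetical, a denied request simply returns False instead of waiting, and only S and X modes are handled.

```python
class LockTable:
    """Grants S and X locks on named resources according to the S/X matrix."""

    def __init__(self):
        self._held = {}  # resource -> {transaction: mode}

    def acquire(self, transaction, resource, mode):
        if mode not in ("S", "X"):
            raise ValueError("mode must be 'S' or 'X'")
        holders = self._held.setdefault(resource, {})
        # Locks held by OTHER transactions decide whether the request is granted.
        others = [m for t, m in holders.items() if t != transaction]
        if mode == "S":
            granted = all(m == "S" for m in others)   # S is compatible only with S
        else:
            granted = not others                      # X is compatible with nothing
        if granted and holders.get(transaction) != "X":
            holders[transaction] = mode
        return granted

    def release(self, transaction, resource):
        self._held.get(resource, {}).pop(transaction, None)


table = LockTable()
results = [
    table.acquire("T1", "A", "S"),   # first S lock is granted
    table.acquire("T2", "A", "S"),   # a second S lock is also granted
    table.acquire("T2", "A", "X"),   # X is refused while T1 holds S
]
table.release("T1", "A")
results.append(table.acquire("T2", "A", "X"))   # granted after T1 releases
results.append(table.acquire("T1", "A", "S"))   # refused while T2 holds X
print(results)
```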
5.7. Granularity of Locking
Granularity of locking refers to the granular level at which a resource can be locked. Take for
example a database. A database is made up of multiple tablespaces. Each tablespace hosts
multiple tables. Within a table there are multiple rows and fields as shown in Figure 5-7. It is
possible to lock a

x Database
x Tablespace
x Table
x Row
x Field

If an RDBMS product is capable of locking a field of a table explicitly, then the granularity of
locking is at field level. If it can lock only up to the row level, the locking granularity of that
RDBMS product is row level. Thus, the finer the granularity of locking, the higher will be the
concurrency.

In above case database, tablespace, table and row are all the ancestors of the field. Similarly
database, tablespace are ancestors of the table.

Tablespace, table, row and fields are descendants of database. In the same way rows and
fields are descendants of table.

S and X locks alone cannot achieve the best possible concurrency. This is illustrated below.
Let us consider the following scenario and analyze the concurrency that is possible.

Assume in a banking application a transaction called BalanceUpdate has locked row R7 of table
ACC_DETAILS in X mode for updating the account balance. Because of this X mode lock, no
other transaction can acquire either an S or an X lock on row R7 or any of its fields.


Figure 5-7: Granularity of Locking
Let us assume following scenarios:
x A transaction called BalanceEnquiry requires a lock on first row of table ACC_DETAILS
in S mode
x A transaction called SummaryReport requires a lock on complete table ACC_DETAILS in
S mode

In ideal condition:
x System should allow to lock first row of ACC_DETAILS table for transaction
BalanceEnquiry
x System should prevent transaction SummaryReport from acquiring a lock on table
ACC_DETAILS

Transaction SummaryReport should be prevented from acquiring a lock on table ACC_DETAILS
because row R7 of table ACC_DETAILS is already locked by transaction BalanceUpdate in X
mode.

[Figure 5-7 content: database DB_BANK_DETAILS contains the tablespaces TS_CUST_DETAILS,
TS_LOAN_DETAILS and TS_BRANCH_DETAILS. The tables shown are CUST_MAST, ACC_DETAILS,
BRANCH_MAST, LOAN_MAST, INTEREST_MAST and LOAN_DETAILS. Row R7 of ACC_DETAILS is locked
in exclusive mode by transaction BalanceUpdate.]

If transaction SummaryReport is allowed to acquire S lock on table ACC_DETAILS, we may
encounter the dirty read problem. On same lines no S or X locks are allowed by any other
transaction on the tablespace TS_CUST_DETAILS or database DB_BANK_DETAILS because row
R7 is part of TS_CUST_DETAILS and database DB_BANK_DETAILS.

In other words, although row R7 of table ACC_DETAILS was locked explicitly in X mode, the
table ACC_DETAILS, the tablespace TS_CUST_DETAILS and the entire database
DB_BANK_DETAILS, was locked implicitly in the X mode to avoid any parents of R7 being
locked by some other transactions.

This implicit locking of the complete database now prevents transaction BalanceEnquiry from
locking row R1 of table ACC_DETAILS. This is a serious threat to concurrency of the transactions.

Let us look at the solution for this problem in next section.
5.8. Intent Locking
In Intent locking only the intention of locking is expressed at the ancestor node of the
required resource and the resource at the lower level is locked explicitly only when required.

Consider the example discussed in Section 5.7. In the earlier case, it was required to lock
row R7 of table ACC_DETAILS in X mode explicitly but all its ancestors were implicitly locked
in the same mode (Refer Figure 5-7). This has reduced the concurrency considerably.

To overcome this concurrency issue it is necessary for a transaction to express only the
intention of locking the database DB_BANK_DETAILS, the tablespace TS_CUST_DETAILS and
the table ACC_DETAILS in the X mode and in turn lock the row R7 explicitly in X mode. This
concept is called intent locking.

Other transactions can still express their intention of exclusive or shared locking on
database DB_BANK_DETAILS or tablespace TS_CUST_DETAILS or table ACC_DETAILS and
explicitly lock any row other than row R7, either in X or S mode.

This intent locking mechanism not only increases concurrency but also stops the implicit
locking of ancestral resources.

Hence intent locking is also called Parent-Child locking. You express your intention of locking at
the parent level and lock the child resource in explicit mode.

Intent locking is classified as Intent Shared (IS) locking and Intent Exclusive (IX) locking.

5.8.1. Intent Share (IS)
This lock has the intention to share the requested node. This also allows the requester to
explicitly lock the descendants of this node in S or IS mode.

Example: Transaction SummaryReport explained in section 5.7 can lock the entire database
DB_BANK_DETAILS and tablespace TS_CUST_DETAILS in IS mode and the ACC_DETAILS table
explicitly in S mode.
5.8.2. Intent Exclusive (IX)
This lock has the intention to have exclusive access to the requested node and allows the
requester to explicitly lock the descendants in IX or X modes.

Example: Transaction BalanceUpdate explained in section 5.7 can lock database
DB_BANK_DETAILS, the tablespace TS_CUST_DETAILS and the table ACC_DETAILS in IX mode
and lock the row R7 explicitly in X mode.
5.8.3. Shared Intent Exclusive (SIX)
The Combination of Shared and Intent exclusive lock is referred to as Shared Intent Exclusive
Lock or SIX Lock.
A share and intent exclusive lock (or SIX lock, pronounced as the separate letters S I X rather
than like the number six) indicates an S lock at the current level plus an intention to insert,
delete, update data at a lower level of granularity. Think of a SIX lock as an S lock plus an IX
lock as shown in Figure 5-8. Only one transaction can be granted a SIX lock on a table at a
time.
[Figure content: within database DB_BANK_DETAILS, table ACC_DETAILS carries an S lock on
ACC_DETAILS together with an IX lock on its rows.]
Figure 5-8: SIX Lock
A SIX lock on a table indicates an intention to read all of the rows in the table and to
delete/update/insert a few. The S lock in the SIX lock at the table level covers all of the
rows. Rows that are updated will obtain X locks, but only after IX intention locks have been
obtained on the pages [45] that contain them.

[45] Page: It is part of a table. Usually in one page multiple rows are stored.

A SIX lock is stronger than a S lock or an IX lock. When a transaction obtains a SIX lock on a
table, only that transaction will be able to modify data in the table. In this respect, a SIX lock
slightly resembles an X lock. With a SIX lock, however, other transactions that want to read
some of the data (read data at the row or page level and obtain an IS lock on the table) are
allowed to proceed, so concurrency is better than with an X lock. Lock mode compatibility
will be described in greater detail later in this module.

If other transactions hold an S lock on a row or on a page of the table that the SIX
transaction wants to modify, the SIX transaction must wait until those S locks are released
before it can modify the data.

Other transactions that want to read all of the data (obtain an S lock on the table) or that
want to write to any portion of the data are not allowed to proceed until the SIX lock is
released.

A SIX lock is also called a share sub-exclusive lock.

A complete compatibility matrix of these locks is shown in Figure 5-9. Note that in Figure 5-9
the symbol ✗ represents incompatibility of the lock and the symbol ✓ represents compatibility of
the lock.

Resource A          | T1: X | T1: S | T1: IS | T1: IX | T1: SIX
Transaction T2: X   | ✗     | ✗     | ✗      | ✗      | ✗
Transaction T2: S   | ✗     | ✓     | ✓      | ✗      | ✗
Transaction T2: IS  | ✗     | ✓     | ✓      | ✓      | ✓
Transaction T2: IX  | ✗     | ✗     | ✓      | ✓      | ✗
Transaction T2: SIX | ✗     | ✗     | ✓      | ✗      | ✗
Figure 5-9: Complete Lock Matrix

This lock matrix can be interpreted as:

If resource A (tablespace/table/row) is locked in X mode by transaction T1, no other
transactions can lock resource A in any mode.

If resource A is locked in S mode by transaction T1, another transaction say T2 can lock the
same resource A in S or IS mode but it can not lock in IX or SIX or X mode.

If resource A is locked in IS mode by transaction T1, another transaction say T2 can lock the
same resource A in S or IS or IX or SIX mode but it can not lock in X mode.

If resource A is locked in IX mode by transaction T1, another transaction say T2 can lock the
same resource A in IS or IX mode but it can not lock in S or X or SIX mode.

If resource A is locked in SIX mode by transaction T1, another transaction say T2 can lock the
same resource A in IS mode but it can not lock in S or X or IX or SIX mode. SIX lock is
combination of S and IX. Hence SIX lock is compatible with that lock which has the common
compatibility with S and IX locks. Since S and IX are together compatible with IS lock, SIX lock
is compatible with IS lock only.
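
The interpretation above can be encoded as a simple lookup. The sketch below is an illustration only: the names MODES, compatible and can_grant are hypothetical. It stores each compatible pair of modes once and relies on the matrix being symmetric.

```python
MODES = ("X", "S", "IS", "IX", "SIX")

# Every compatible pair from the complete lock matrix, stored once.
# All pairs not listed here (including anything paired with X) are incompatible.
_COMPATIBLE_PAIRS = {
    ("S", "S"), ("S", "IS"),
    ("IS", "IS"), ("IS", "IX"), ("IS", "SIX"),
    ("IX", "IX"),
}


def compatible(held, requested):
    """Return True if a lock in mode `requested` can coexist with mode `held`."""
    if held not in MODES or requested not in MODES:
        raise ValueError("unknown lock mode")
    return (held, requested) in _COMPATIBLE_PAIRS or \
           (requested, held) in _COMPATIBLE_PAIRS


def can_grant(requested, held_modes):
    """A request is granted only if it is compatible with every held lock."""
    return all(compatible(held, requested) for held in held_modes)


print(can_grant("IS", ["SIX"]))        # IS is the only mode compatible with SIX
print(can_grant("S", ["IS", "IX"]))    # blocked by the IX lock
```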

The biggest problem with locking technique is that it may lead to Deadlock.

5.8.4. Case study for Intent Locks
Objective: To study about the compatibility of locks.

Assumptions:
A database db has two tables (files) f1 and f2.

File f1 has pages p11, p12, and p13.
File f2 has pages p21, p22 and p23.

Page p11 has 2 records, r111 and r112.
Page p12 has 2 records, r121 and r122.
Page p13 has 2 records, r131 and r132.

Page p21 has 2 records, r211 and r212.
Page p22 has 2 records, r221 and r222.
Page p23 has 2 records, r231 and r232.

Consider the following situation:

Transaction T1 wants to update record r111 and record r211.
Transaction T2 wants to update all records on page p12.
Transaction T3 wants to read record r112 and the entire file f2.
Assume that transactions T4 and T5 start only after all the other transactions have
committed.
Transaction T4 wants to modify r111 and transaction T5 wants to read record r112.

Problem statement: Specify
The locks which will be acquired by the transactions
The order in which the locks will be acquired by the transactions
The order in which the locks will be released by the transactions

Solution:
T1              | T2           | T3
IX(db)          |              |
IX(f1)          |              |
                | IX(db)       |
                |              | IS(db)
                |              | IS(f1)
                |              | IS(p11)
IX(p11)         |              |
X(r111)         |              |
                | IX(f1)       |
                | X(p12)       |
                |              | S(r112)
IX(f2)          |              |
IX(p21)         |              |
X(r211)         |              |
Do the update   |              |
Unlock(r211)    |              |
Unlock(p21)     |              |
Unlock(f2)      |              |
                |              | S(f2)
                | Unlock(p12)  |
                | Unlock(f1)   |
                | Unlock(db)   |
Unlock(r111)    |              |
Unlock(p11)     |              |
Unlock(f1)      |              |
Unlock(db)      |              |
                |              | Unlock(r112)
                |              | Unlock(p11)
                |              | Unlock(f1)
                |              | Unlock(f2)
                |              | Unlock(db)
Assumption: Transaction T4 and transaction T5 start after the transactions T1, T2 and T3
have committed and released the locks.

T4 T5
SIX(db) IS(db)
SIX(f1) IS(f1)
IX(p11) IS(p11)
X(r111) S(r112)
Unlock(r111) Unlock(r112)
Unlock(p11) Unlock(p11)
Unlock(f1) Unlock(f1)
Unlock(db) Unlock(db)

Learnings from the above case study:
x Locks are acquired from the root to the place (node) one wants to lock. (top to
bottom)
x S or X mode locks are applied only at very fine granularity (only on the specific node
that the user wishes to read or update)
x Locks are released in bottom to top fashion
x Check for the compatibility of the locks in cases where a transaction already holds a
lock on the node and another transaction wants to acquire a lock on the same node
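
The first and third learnings can be sketched as a small helper that, given the path from the root to the node to be locked, produces the order in which locks are acquired and released. The function name lock_plan and its arguments are hypothetical. The sketch assumes IS is used on ancestors for a read and IX for an update.

```python
def lock_plan(path, mode):
    """Return (acquire, release) sequences for locking the last node of `path`.

    path -- node names from the root down to the target, e.g. ["db", "f1", ...]
    mode -- "S" to read the target or "X" to update it
    """
    if mode not in ("S", "X"):
        raise ValueError("mode must be 'S' or 'X'")
    if not path:
        raise ValueError("path must contain at least one node")
    intent = "IS" if mode == "S" else "IX"
    # Intention locks on every ancestor, top to bottom, then the real lock.
    acquire = [(intent, node) for node in path[:-1]] + [(mode, path[-1])]
    # Locks are released bottom to top.
    release = [node for _, node in reversed(acquire)]
    return acquire, release


acquire, release = lock_plan(["db", "f1", "p11", "r111"], "X")
print(acquire)
print(release)
```

For the update of record r111 this gives IX on db, f1 and p11 followed by X on r111, and the release order r111, p11, f1, db, matching the T1 column of the solution for that record.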
5.9. Deadlock
Deadlock is a situation where one transaction is waiting for
another transaction to release the resource it needs, and vice
versa. Each transaction will be waiting forever for the other to
release the resource. This is shown in the Figure 5-10:

Time  | Transaction BalanceUpdate    | Transaction LoanUpdate
10:22 | Lock ACC_DETAILS             | Lock LOAN_DETAILS
10:23 | Update ACC_DETAILS           | Update LOAN_DETAILS
10:24 | Try for lock on LOAN_DETAILS | Try for lock on ACC_DETAILS
10:25 | Wait for lock                | Wait for lock
10:26 | Wait for lock                | Wait for lock
10:27 | Wait for lock                | Wait for lock
10:28 | Wait for lock                | Wait for lock
Figure 5-10: Deadlock
In the above diagram transaction BalanceUpdate locked the table ACC_DETAILS in X mode at time
10:22 and is waiting to acquire a lock on table LOAN_DETAILS for an update. But transaction
LoanUpdate already has the X lock on table LOAN_DETAILS and is waiting for table ACC_DETAILS,
which is locked by transaction BalanceUpdate. These two transactions will wait
indefinitely for each other to release the locked resources. This is known as deadlock.

If a deadlock occurs, one of the participating transactions must be rolled back to allow the
other to proceed. There are various methods to choose which transaction to roll back when a
deadlock is detected. Usually rollback action is decided on:
x How long the transactions have been running
x Data already updated by the transaction
x Data that remains to be updated by the transaction

There are schemes available for preventing deadlock. Most of the RDBMS products allow
deadlocks to occur and resolve them, when they are detected.
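
One common way to detect a deadlock is to record which transaction is waiting for which other transaction and look for a cycle. The text above does not say how a particular product detects deadlocks, so the following is only a sketch of that general idea. The function name find_cycle and the dictionary layout are assumptions.

```python
def find_cycle(waits_for):
    """Return a list of transactions forming a cycle, or None if there is none.

    waits_for -- dict mapping a transaction to the transactions it waits for
    """
    visited = set()

    def visit(node, path):
        if node in path:                 # came back to a node on the current path
            return path[path.index(node):]
        if node in visited:              # already explored, no cycle through it
            return None
        visited.add(node)
        for nxt in waits_for.get(node, ()):
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in waits_for:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


# The situation of Figure 5-10: each transaction waits for the other.
waits = {"BalanceUpdate": ["LoanUpdate"], "LoanUpdate": ["BalanceUpdate"]}
deadlock = find_cycle(waits)
print(deadlock)
```

Once a cycle is found, one transaction in it would be chosen for rollback using criteria such as those listed above.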

5.10. Security
Security is one of the best implemented strategies in RDBMS.

Security is implemented in RDBMS packages using:

1. USERID and PASSWORD to restrict the users from acquiring an un-authorized access
2. Grant and Revoke statements (Data Control Language) to provide restricted access
control to resources like Tables
3. Database views to restrict access to sensitive data
4. Encryption [46] of data to avoid un-authorized access
5.11. Recovery
A database might be left in an inconsistent state by:
x An Application error
x Power failure
x O/S or Database failure
x Network failure
x Hardware or Media failures

[46] Encryption: The process of manipulation of data to prevent accurate interpretation by all but those
for whom the data is intended.

If the database is in an inconsistent state, it is necessary to restore it to a consistent state.
Recovery process can be achieved either using log files or backups of the database.

The simplest backup technique is Dumping. The entire content of the database is backed
up on to secondary devices like tapes on a regular basis. This backing up operation must be
performed when the state of the database is consistent. Therefore no transactions which
modify the database can be running during this backup process.
This dumping process can take a long time to perform and one may not be able to stop
transactions for that long in the production environment. Hence it cannot be performed
as often as one would like. This type of backup is called a cold backup and is usually done
on a periodic basis, like once a week or once a month, at night when the transaction load on
the system is at its lowest. These tapes can then be used in case of complete hard disk
failures. This is shown in Figure 5-11.

Figure 5-11: Database Backup
If the database backup is done while transactions are running, it is called a hot backup.
Usually hot backups are incremental in nature. This means only the data modified since the last
backup is captured. Usually it takes less time and is done on a daily basis.
The hot and cold backups are useful only in the case of media or hard disk failure. These
backups cannot be used for

x Un-planned power shutdown
x Sudden breakdown in O/S or database
x Memory failure

These kinds of failures are called as instance failures. Instance failures can be handled by
making use of transactional log or redo log files. These are further explained in the following
sections.
[Figure 5-11 content: main memory, the database hard disk holding Data File 1 and Data File 2,
the log file hard disk holding Log File 1 and Log File 2, and a back-up tape device. The labelled
flows are Data Transfer, Log Information and Applying Log. The log file contains the
transaction ID, timestamp, old value and new value.]
5.12. Transaction Log
The transaction log (also called the journal log or redo log) is a physical file. Usually the
transaction ID, the time stamp of the transaction, and the old and new values of the data are
stored in the transaction log file. Therefore the RDBMS is aware of the state of the database,
i.e. the before and after image of the data, for each transaction. Using the log, the database can
be returned to a consistent state, and the log may be truncated to remove committed transactions.

Normally there are two techniques used to maintain the log files.
5.12.1. Deferred update
Deferred update, or NO UNDO/REDO, is an algorithm to support transaction failures owing to
O/S, application, power, memory and machine failures.

While a transaction runs, no updates/alterations made by that transaction are recorded in
the database. They are captured only in the log files.

On commit, data changes are applied to the database using the log files. This process is called
as Re-doing.

On rollback, data changes which are captured in the log files are discarded and no changes
are made to the database.

On system restart after a failure due to any of the above mentioned reasons, if the transaction
had not committed, the contents of the log files are discarded and the transaction is restarted.
If it had committed before the crash, then after restart the log file contents are applied to the
database.
Sequences of deferred update are explained in Figure 5-12 and Figure 5-13.

Time  | Transaction     | Disk Before | Disk After   | Log
10:22 | Start           | 6           | 6            | Start Timestamp
10:23 | Read field F1   | 6           | 6            |
10:24 | Update F1 to 23 | 6           | 6            | (6,23)
10:25 | Read field F2   | 12          | 12           |
10:26 | Update F2 to 45 | 12          | 12           | (12,45)
10:27 | Commit          |             | F1=23, F2=45 | Commit
Figure 5-12: Deferred Update
From Figure 5-12 it is evident that when the transaction updates the field F1 to 23 and field F2 to
45 in main memory, the log file holds the old value and new value of each field. The database disk
file still holds the old values. The contents of the database are modified using the log file only
after the transaction commits. The process of re-doing the transaction from the log is sometimes
referred to as Rollforward. The disadvantage of the deferred update technique is the increased time
of recovery in case of system failure.
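
The deferred update behaviour can be sketched with a small class. This is an illustration only: the class DeferredUpdateStore is hypothetical, the "disk" is a dictionary, and a single transaction is assumed.

```python
class DeferredUpdateStore:
    """Deferred update: changes go to the log and reach the disk only on commit."""

    def __init__(self, initial):
        self.disk = dict(initial)   # stands in for the database disk file
        self.log = []               # entries of (field, old value, new value)

    def read(self, field):
        # A transaction sees its own pending change, if any.
        for logged_field, _, new_value in reversed(self.log):
            if logged_field == field:
                return new_value
        return self.disk[field]

    def update(self, field, new_value):
        self.log.append((field, self.read(field), new_value))   # disk untouched

    def commit(self):
        for field, _, new_value in self.log:   # re-do: apply the log to the disk
            self.disk[field] = new_value
        self.log.clear()

    def rollback(self):
        self.log.clear()                        # nothing to undo on disk


store = DeferredUpdateStore({"F1": 6, "F2": 12})
store.update("F1", 23)
store.update("F2", 45)
disk_before_commit = dict(store.disk)   # still the old values
log_before_commit = list(store.log)
store.commit()
print(disk_before_commit, log_before_commit, store.disk)
```

Before the commit the disk still holds 6 and 12 while the log holds (6,23) and (12,45). After the commit the disk holds 23 and 45, as in Figure 5-12.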

[Flowchart nodes: START; Update Record in Memory; Update in Logs; Has System crashed? (YES/NO);
Is transaction committed? (YES/NO); Restart System; Do you find commit in log? (YES/NO);
Make changes permanent in database using log; Discard Log data; STOP]
Figure 5-13: Sequences of Deferred Update
5.12.2. Immediate Update
Immediate update, or UNDO/REDO, is another algorithm to support transaction failure owing
to O/S, application, power, memory or machine failure.

While a transaction runs, updates/alterations made by that transaction can be written to the
database directly. However, the original and the new data being written must both be stored
in the log BEFORE writing it to the database.
On commit, all the changes to the database are made permanent and log contents are
discarded.
On rollback, using the log entries, old values are restored. All the changes which that
transaction has made to the database disk are discarded. This process is called as Un-doing.
On system restart after a crash, the changes of committed transactions are made permanent in
the database, and the original values are restored using the log files for uncommitted
transactions.

Transaction snapshot is shown in Figure 5-14 and sequences of immediate update process are
shown in Figure 5-15.
Time  | Transaction     | Disk Before | Disk After   | Log
10:22 | Start           | 6           | 6            | Start Timestamp
10:23 | Read column F1  | 6           | 6            |
10:24 | Update F1 to 23 | 6           | 23           | (6,23)
10:25 | Read column F2  | 12          | 12           |
10:26 | Update F2 to 45 | 12          | 45           | (12,45)
10:27 | Commit          |             | F1=23, F2=45 | Commit
Figure 5-14: Immediate Update
From Figure 5-14 it is evident that when the transaction updates the field F1 to 23 and field F2 to
45 in main memory, the log file holds the old value and new value of each field. At the same time
the database disk file is also modified to reflect the new values, even before the transaction
commits. If for any reason the transaction fails to commit, the values in the database disk files
are restored to the old values using the log file. The process of undoing changes using the log
files is frequently referred to as rollback. The disadvantage of the immediate update technique is
frequent I/O operations while the transaction is active.
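
For comparison, the immediate update behaviour can be sketched as follows. Again this is only an illustration with a hypothetical class name, a dictionary standing in for the disk, and a single transaction. The important point is that the log entry is written before the disk is changed.

```python
class ImmediateUpdateStore:
    """Immediate update: log the old and new value first, then write to disk."""

    def __init__(self, initial):
        self.disk = dict(initial)   # stands in for the database disk file
        self.log = []               # entries of (field, old value, new value)

    def update(self, field, new_value):
        self.log.append((field, self.disk[field], new_value))   # log BEFORE write
        self.disk[field] = new_value                             # then write disk

    def commit(self):
        self.log.clear()            # changes are already on disk

    def rollback(self):
        # Un-do: restore old values, most recent change first.
        for field, old_value, _ in reversed(self.log):
            self.disk[field] = old_value
        self.log.clear()


store = ImmediateUpdateStore({"F1": 6, "F2": 12})
store.update("F1", 23)
store.update("F2", 45)
disk_before_rollback = dict(store.disk)   # new values are already on disk
store.rollback()
print(disk_before_rollback, store.disk)
```

The disk shows 23 and 45 before the transaction ends, and the rollback uses the logged old values to bring it back to 6 and 12.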

[Flowchart nodes: START; Update Record in Memory; Update in Logs; Update Database on disk;
Has System crashed? (YES/NO); Is transaction committed? (YES/NO); Restart System;
Do you find commit in log? (YES/NO); Make changes permanent; Undo changes in database using log;
Discard Log data; STOP]
Figure 5-15: Sequences in Immediate Update
5.12.3. Check-Points
Usually in commercial RDBMS applications neither the deferred updates nor the immediate
updates are used because of their disadvantages. In these commercial RDBMS applications,
databases are updated at fixed intervals of time, say every 2 minutes, irrespective of the
transaction commit/uncommit state. Updating the database at fixed intervals of time is
called check-pointing.
At the check point time, the contents of the log files are applied to the database.
Transactions may be committed or non-committed at the check point. Later if the transaction
rolls back, the database is restored to the original state using the log files. As already
explained, this process is called Un-doing. If the transaction commits, changes are made
permanent, again using the log files. This process is called Re-doing. Hence check point
based updates use both the Rollforward and the Rollback mechanism. To some extent this
technique speeds up the recovery mechanism during instance failures.

For example consider the snapshot of the database shown in Figure 5-16.


Figure 5-16: Checkpoint Scenario
Let us analyze the situation on a system restart, after an unfortunate crash

1. Transaction T1 committed before check point and also wrote to the database hence no
changes are required in the database.
2. Transaction T2 committed before system failure but partially wrote to the database at
the check point. After restart, other parts of T2 should be written to the database
using the log files.
3. Transaction T3 began after the checkpoint, hence its contents were not written to the
database, but it completed successfully before the crash. The complete transaction needs
to be written to the database using the log file.
4. Transaction T4 began before the checkpoint, hence part of T4 was written to the
database. The crash happened before T4 committed. Hence the changes made to the database
must be undone using the log files, restoring the old values.
5. The contents of transaction T5 in the log files need to be discarded and the
transaction needs to be restarted, because this transaction started after the checkpoint
and hence no traces of it exist in the database.

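The recovery scenario is shown in Figure 5-17:


Figure 5-17: Recovery from Crash
(The figure annotates the same timeline with the recovery action for each transaction.
T1: no changes; T2: redo C; T3: redo ABC; T4: undo AB; T5: discard AB.)

Note: The system cannot be restored from the log files after a hard disk failure.
Only backups of the data files and log files can protect databases against media failures.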



Examples:
1. Recovery using deferred update in a single-user environment

Consider the read and write operations of two transactions T1 and T2 given below:

T1               T2
read_item(A)     read_item(B)
read_item(D)     write_item(B)
write_item(D)    read_item(D)
                 write_item(D)

The system log at the time of crash is as given below:

<start T1>
<write_item, T1, D, 20>
<commit T1>
<start T2>
<write_item, T2, B, 10>
<write_item, T2, D, 25> ----------------System crash

Solution: Transaction T1 commits before the system crash. The operations of transaction
T1 are therefore redone (that is, the contents of the log files are applied to the data file).
The log entries of transaction T2 are ignored by the system because T2 has not
committed.

2. Recovery using check points (concurrent transactions considered)

Consider the read and write operations of transactions T1, T2, T3 and T4 given below:

T1               T2               T3               T4
read_item(A)     read_item(C)     read_item(A)     read_item(C)
read_item(B)     write_item(C)    write_item(A)    write_item(C)
write_item(B)    read_item(B)     read_item(E)     read_item(A)
                 write_item(B)    write_item(E)    write_item(A)

The system log at the time of crash is as given below:

<start T1>
<write_item, T1, B, 20>
<commit T1>
<checkpoint>
<start T4>
<write_item, T4, C, 15>
<write_item, T4, A, 20>
<commit T4>
<start T2>
<write_item, T2, C, 12>
<start T3>
<write_item, T3, A, 30>
<write_item, T3, E, 25> ----------------------- System crash


Solution:

Transaction T1 committed before the checkpoint. Therefore no operation is performed on
account of it.

Transaction T4 is redone because its commit point is after the last system checkpoint.

Transactions T2 and T3 are ignored because they did not reach their commit points.
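
The rule applied in the solution above can be sketched in PL/SQL, the language introduced in
the next chapter. This is only an illustrative sketch under stated assumptions: the
collections v_log and v_txns are hypothetical in-memory copies of the log of Example 2
(write entries left out), not something a real recovery manager exposes.

```sql
SET SERVEROUTPUT ON
DECLARE
    -- Hypothetical, simplified copy of the Example 2 log (start, commit, checkpoint only).
    TYPE t_log IS TABLE OF VARCHAR2(30);
    v_log t_log := t_log('start T1', 'commit T1', 'checkpoint',
                         'start T4', 'commit T4', 'start T2', 'start T3');
    TYPE t_txns IS TABLE OF VARCHAR2(5);
    v_txns t_txns := t_txns('T1', 'T2', 'T3', 'T4');
    v_checkpoint_pos PLS_INTEGER := 0;   -- position of the last checkpoint in the log
    v_commit_pos     PLS_INTEGER;        -- position of the commit entry, 0 if none
BEGIN
    -- Locate the last checkpoint entry.
    FOR i IN 1 .. v_log.COUNT LOOP
        IF v_log(i) = 'checkpoint' THEN
            v_checkpoint_pos := i;
        END IF;
    END LOOP;

    -- Classify every transaction by where (and whether) it committed.
    FOR t IN 1 .. v_txns.COUNT LOOP
        v_commit_pos := 0;
        FOR i IN 1 .. v_log.COUNT LOOP
            IF v_log(i) = 'commit ' || v_txns(t) THEN
                v_commit_pos := i;
            END IF;
        END LOOP;

        IF v_commit_pos = 0 THEN
            DBMS_OUTPUT.PUT_LINE(v_txns(t) || ': ignore (not committed)');
        ELSIF v_commit_pos < v_checkpoint_pos THEN
            DBMS_OUTPUT.PUT_LINE(v_txns(t) || ': no action (committed before checkpoint)');
        ELSE
            DBMS_OUTPUT.PUT_LINE(v_txns(t) || ': redo (committed after checkpoint)');
        END IF;
    END LOOP;
END;
/
```

For the log of Example 2 the sketch reports no action for T1, ignore for T2 and T3, and redo
for T4, which matches the solution above.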
5.13. Summary
• All transactions should be:
    o Atomic
    o Consistent
    o Isolated
    o Durable

• OLTP applications should ensure:
    o Integrity
    o Concurrency
    o Security
    o Recovery

• Integrity of the RDBMS application can be maintained using:
    o Entity Integrity
    o Referential Integrity
    o Domain Integrity

• While allowing concurrency, one may face problems in implementing consistency.
  The following are the four major problems encountered:
    o Lost Updates
    o Dirty Read
    o Incorrect Summary
    o Phantom records

• Consistency can be implemented using serialization techniques like:
    o Locking
    o Time-stamping

• The locking technique can lead to the deadlock problem

• The time-stamping technique can lead to many rollbacks

• Security is implemented in RDBMS using:
    o User ID / Password
    o Grant and Revoke commands
    o Views

• Two types of recovery mechanism can be implemented in the RDBMS application:
    o Recovery from media failures using a backup strategy
    o Instance recovery using transaction log files

• Two types of backups are possible:
    o Cold backup
    o Hot backup

• Three types of updates to the database are possible, using the transaction log files:
    o Immediate update
    o Deferred update
    o Checkpoint-based updates

6. Introduction to PL/SQL

6.1. Need for PL/SQL
SQL is a flexible, efficient fourth generation language[47]. It has features designed to
create, manipulate and control the relational database. However, SQL lacks the procedural
capabilities of a programming language.

PL/SQL is a technology built into Oracle that provides all the features available in SQL
together with the procedural capabilities expected of any programming language. Hence a
programmer can loop through a set of records in the underlying table and manipulate one
record at a time, applying the intended business logic to it.

PL/SQL is an extension to SQL developed by Oracle. As an Oracle database can be hosted on
heterogeneous platforms such as UNIX and Windows, code written in PL/SQL can also be hosted
on all of these platforms. This introduces a kind of platform independence.

In PL/SQL, a group of DML statements can be combined and executed as a transaction that
performs a logical unit of work, thereby transforming the database from one consistent
state to another consistent state. Changes introduced by the set of statements can be
permanently saved to the database (committing the transaction) or can be rolled back
(undoing the transaction).

If the set of statements submitted for execution fails in the middle of the transaction
for reasons such as a system crash, a memory crash or an unplanned power shutdown, the
database is automatically restored to its earlier consistent state.

PL/SQL also allows a programmer to write DDL, DML and DCL statements. We will discuss how
to write DML and DCL statements in PL/SQL, but not DDL statements (which cannot be written
directly in a PL/SQL block and need dynamic SQL), as that is beyond the scope of this
courseware.


Moreover, the equivalent counterpart of PL/SQL in SQL Server is T-SQL. Similarly, other
RDBMS products may have their own procedural extension to SQL. We have selected PL/SQL in
Oracle for our discussion.

[47] A 4GL is typically non-procedural and designed so that end users can specify what they
want without having to know how the computer will process their requirement.
6.2. PL/SQL Architecture
PL/SQL is available in the Oracle server environment as well as in some of the Oracle
application development tools, such as Oracle Forms and Oracle Reports. Both environments
expect a valid PL/SQL block to be submitted. All the procedural statements are executed by
the procedural statement executor module present in the PL/SQL engine, and the SQL
statements are executed by the SQL statement executor present in the Oracle server.
6.3. PL/SQL block structure
Every PL/SQL block has the following structure. DECLARE and EXCEPTION are optional
keywords, whereas BEGIN and END are mandatory keywords. Optional keywords are enclosed
within square brackets [ ].


[ DECLARE ]

BEGIN

[ EXCEPTION ]

END;
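
As a minimal sketch of this structure, the block below uses all three sections. The variable
name v_message and the messages are illustrative only, and the output appears only when
server output is enabled in SQL*Plus.

```sql
SET SERVEROUTPUT ON
DECLARE
    -- Declaration section (optional)
    v_message VARCHAR2(30) := 'Block structure demo';
BEGIN
    -- Executable section (mandatory)
    DBMS_OUTPUT.PUT_LINE(v_message);
EXCEPTION
    -- Exception section (optional): runs only if the executable section raises an error
    WHEN OTHERS THEN
        DBMS_OUTPUT.PUT_LINE('Unexpected error: ' || SQLERRM);
END;
/
```

Removing the DECLARE part or the EXCEPTION part still leaves a valid block, whereas BEGIN and
END cannot be removed.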

6.4. Comments in PL/SQL
There are two different ways of placing comments in PL/SQL:
1. Single line comment (--)
2. Multi line comment (/* */)

A single line comment starts with a double hyphen (--) symbol and extends to the end
of the line.

Multiline comments are also possible in PL/SQL, and follow the C style.
A multiline comment begins with /* and ends with */. Multiline comments should not
be nested.


DECLARE
    --Declaration section
BEGIN
    /* Executable
       section */
    NULL;  -- at least one executable statement is needed for the block to be valid
END;
6.5. Anonymous PL/SQL blocks
As these PL/SQL blocks do not have a name of their own, they are called anonymous PL/SQL
blocks. Hence an anonymous PL/SQL block cannot be invoked from anywhere else in our PL/SQL
code, nor can it invoke itself.

Anonymous PL/SQL blocks are not stored permanently in the database. But we can type the
code in a text file and save it as part of the file system. There are other forms of PL/SQL
blocks that are stored permanently in the Oracle database, such as PROCEDURES and FUNCTIONS.
These are also called NAMED PL/SQL blocks.

Every anonymous PL/SQL block can have three sections.
1. Declaration section
2. Executable section
3. Exception section

All statements that fall between the DECLARE and BEGIN keywords constitute the declaration
section. All PL/SQL variables are declared in the declaration section.

All statements that fall between the BEGIN keyword and the EXCEPTION keyword (or the END
keyword, when there is no exception section) constitute the executable section. Any valid
SQL and PL/SQL statements can be present in the executable section.

Statements written between the EXCEPTION and END keywords form the exception section.
Runtime errors can be handled with the help of this section.

6.5.1. Declaration section
The declaration section is an optional section within an anonymous PL/SQL block. This
section is used for declaring PL/SQL variables along with their datatypes. All the datatypes
available in SQL are supported in PL/SQL too.

For example, to declare a variable to store the current system date and time, we declare it
in the declaration section as shown below.

DECLARE
    v_datetime TIMESTAMP;
BEGIN
    NULL;  -- placeholder statement so that the block is valid
END;

Any number of variables can be declared in the declaration section. There is no upper limit on
the number of variables that can be declared within the PL/SQL declaration section.

Two variables cannot be declared in a single declaration as a comma-separated list; each
variable needs its own declaration. For example,


DECLARE
    v_customername, v_suppliername VARCHAR2(30);
BEGIN
    NULL;
END;

The above declaration is INVALID. The maximum length of a variable name is 30 characters
(Oracle Database 12.2 and later allow longer names). Variables can also be initialized with
some initial value when they are declared. If a variable is not initialized, its default
value is NULL. Declared variables can be referred to in the executable section and in the
exception section; that is, the scope of a declared variable covers both the executable
section and the exception section of the block. Variables cannot be declared in the
executable section or the exception section themselves (other than inside a nested block
that has its own declaration section).
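
The valid form of the declaration above can be sketched as follows. The value 'Acme' is an
illustrative assumption; the block also shows that an uninitialized variable holds NULL.

```sql
SET SERVEROUTPUT ON
DECLARE
    v_customername VARCHAR2(30);             -- not initialized, so it holds NULL
    v_suppliername VARCHAR2(30) := 'Acme';   -- initialized in the declaration
BEGIN
    IF v_customername IS NULL THEN
        DBMS_OUTPUT.PUT_LINE('v_customername is NULL');
    END IF;
    DBMS_OUTPUT.PUT_LINE('v_suppliername = ' || v_suppliername);
END;
/
```

Each variable has its own declaration terminated by a semicolon, which is what the invalid
comma-separated form lacks.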
6.5.2. Executable section
The executable section is a mandatory section. Any valid SQL and PL/SQL statements can be
present in the executable section. If no executable statement is present in this section,
the PL/SQL block is invalid. Hence at least a NULL statement should be present in this
section to make the PL/SQL block a valid one. NULL is a valid executable statement in
PL/SQL.


SQL>
1 BEGIN
2 NULL;
3 END;
SQL>

The above PL/SQL block is valid, although it does nothing.

SQL>
1 BEGIN
2 END;
SQL>

The above PL/SQL block is invalid and would throw an error.
6.5.3. Exception section
The exception section is an optional section. This section is used to trap any runtime
errors generated during the execution of a PL/SQL program. To understand what an
exception is, let us assume that we try to divide a numeric value by zero. During compilation,
this statement does not throw a compilation error, but during execution the PL/SQL runtime
engine identifies the error. This runtime error is called an exception.

The exception section can also be used to define an alternative routine or a recovery
mechanism which needs to be executed when a runtime error occurs.

Any valid SQL and PL/SQL statements can be present in the exception section.
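
The divide-by-zero case described above can be sketched as follows. The variable names and
the message text are illustrative; ZERO_DIVIDE is the predefined PL/SQL exception raised for
a division by zero.

```sql
SET SERVEROUTPUT ON
DECLARE
    v_total   NUMBER := 100;
    v_count   NUMBER := 0;
    v_average NUMBER;
BEGIN
    v_average := v_total / v_count;   -- compiles, but raises ZERO_DIVIDE at run time
    DBMS_OUTPUT.PUT_LINE('Average: ' || v_average);   -- skipped when the error occurs
EXCEPTION
    WHEN ZERO_DIVIDE THEN
        -- Alternative routine executed instead of terminating with an error
        DBMS_OUTPUT.PUT_LINE('Cannot divide by zero; average not computed');
END;
/
```

Control moves to the exception section as soon as the division fails, so only the message
from the handler is displayed.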
6.6. PL/SQL block execution

6.6.1. How can a PL/SQL block be executed?
Type the PL/SQL block as shown below at the SQL prompt. After typing BEGIN, press enter.
A new line number is generated for every new line we type. To stop the generation of line
numbers, place a full stop on a line of its own after typing the last line.


SQL> BEGIN
2 DBMS_OUTPUT.PUT_LINE('Hello World');
3 END;
4 .
SQL> /

PL/SQL procedure successfully completed.

Now to execute the PL/SQL block, type (/) at the SQL prompt. The most recently typed PL/SQL
block is executed. Oracle supplies a number of built-in packages[48]. DBMS_OUTPUT is one
such package, and PUT_LINE is a procedure within that package which helps us to echo
information on the screen. But in this case nothing is displayed on the screen, because
the display of server output has not yet been enabled.

To enable the display of output information on the screen, type the following command. This
command has to be typed once for every new session; that is, every time you connect to the
Oracle server using SQL*Plus, type the command below once.


SQL> SET SERVEROUTPUT ON
SQL> /
Hello World

PL/SQL procedure successfully completed.

SQL>
Typing the (/) symbol again executes the PL/SQL block present in the editor buffer and
displays the output Hello World.
6.6.2. Another way of executing the PL/SQL block
Save the PL/SQL block in a file named First.sql as shown below. Once the file is saved, then
we can execute the content of the file using @filename or @<complete path where the file
is saved>.


SQL> SAVE First.sql
Created file First.sql
SQL> @First.sql
Hello World

PL/SQL procedure successfully completed.

[48] PACKAGE: A collection of procedures and functions bundled together under a name.
6.7. Named PL/SQL blocks
A PL/SQL block which has a name is called a named PL/SQL block. A named PL/SQL block has a
special header section which specifies whether it is a PROCEDURE or a FUNCTION. The syntax
for implementing a PROCEDURE is as follows:

CREATE [ OR REPLACE ] PROCEDURE block_name(p_param DATATYPE)
IS/AS
--declaration of variables
BEGIN
--SQL and PL/SQL statements
EXCEPTION
--error handling
END;

A PROCEDURE is used for implementing an action and a FUNCTION is used for computing a
value, such as the result of a mathematical computation. Only the header of a FUNCTION has a
RETURN clause, which specifies the datatype of the value returned by the function.


CREATE [ OR REPLACE ] FUNCTION block_name(p_param DATATYPE)
RETURN datatype
IS/AS
--declaration of variables
BEGIN
--SQL and PL/SQL statements
EXCEPTION
--error handling
END;
Both procedures and functions can accept one or more parameters as input. A procedure
may return zero, one or more than one output, whereas a function must return at least one
value as output. When a named PL/SQL block is submitted to the Oracle server, it is not
executed immediately; rather, it is compiled and stored permanently in the database for later
execution. Procedures and functions can be invoked from within an anonymous PL/SQL block.
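
A minimal sketch of both kinds of named block, and of invoking them from an anonymous block,
is given below. The names print_greeting and discounted_price and their parameters are
hypothetical and chosen only for illustration.

```sql
SET SERVEROUTPUT ON

-- Hypothetical procedure: performs an action (prints a greeting).
CREATE OR REPLACE PROCEDURE print_greeting(p_name VARCHAR2)
IS
BEGIN
    DBMS_OUTPUT.PUT_LINE('Hello ' || p_name);
END;
/

-- Hypothetical function: computes a value and hands it back through RETURN.
CREATE OR REPLACE FUNCTION discounted_price(p_price NUMBER, p_discount NUMBER)
RETURN NUMBER
IS
BEGIN
    RETURN p_price - (p_price * p_discount / 100);
END;
/

-- Anonymous block invoking the stored procedure and the stored function.
BEGIN
    print_greeting('Joe');
    DBMS_OUTPUT.PUT_LINE(discounted_price(200, 10));
END;
/
```

The two CREATE statements only compile and store the blocks. Nothing is printed until the
anonymous block at the end calls them, which displays Hello Joe followed by 180.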
6.8. Variables and datatypes
In the declaration section of an anonymous PL/SQL block we can declare variables. The
syntax for declaring variables in PL/SQL is as shown below.

variable_name [CONSTANT] datatype [ NOT NULL ] [ := value ]

The coding convention followed for naming variables is to start the name with v_. For
example, to declare a variable to store the employee name we declare v_empname.

DECLARE
v_empname VARCHAR2(30);


Immediately after the variable name, we need to specify the datatype of the variable,
which decides the nature of the data stored in that variable at runtime. Uninitialized
variables are initialized with NULL by default, irrespective of the datatype of the
variable.

If we want a variable that does not contain the initial value NULL, we can initialize
the variable with some value, as shown below.


DECLARE
v_empname VARCHAR2(30) := 'Joe';


Using the assignment operator (:=) we can initialize a PL/SQL variable with a value. An
equivalent of the assignment operator in a declaration is the DEFAULT keyword, which can
also be used to initialize a variable with a value, as shown below.


DECLARE
v_empname VARCHAR2(30) DEFAULT 'Joe';


How do we declare a variable which should not hold a NULL value at any point of time
during execution of a PL/SQL block?


DECLARE
v_empname VARCHAR2(30) NOT NULL := 'Joe';


Using the NOT NULL constraint we can implement the above requirement, as shown above.
Whenever we declare a PL/SQL variable with a NOT NULL constraint, we have to initialize the
variable with some value. Whenever we refer to this variable after its declaration, it is
guaranteed to hold a non-NULL value.

We can even declare constants, as shown below. CONSTANTs have to be initialized with some
value in the declaration. The coding convention followed for naming constants is to start
the name with c_. A constant can be initialized with a NULL value, whereas NOT NULL
variables cannot be assigned NULL values. The value of a constant cannot be changed in the
program, but the value of a variable which has a NOT NULL constraint can be changed to any
value except NULL.



DECLARE
c_discount CONSTANT NUMBER := 10;
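
The difference between a constant and a NOT NULL variable can be sketched as below. The
variable v_price and the values used are illustrative assumptions. The two statements that
the compiler would reject are left as comments so that the block still runs.

```sql
SET SERVEROUTPUT ON
DECLARE
    c_discount CONSTANT NUMBER := 10;
    v_empname  VARCHAR2(30) NOT NULL := 'Joe';
    v_price    NUMBER DEFAULT 200;
BEGIN
    v_empname := 'Bob';                                  -- allowed: a non-NULL value
    v_price   := v_price - (v_price * c_discount / 100); -- constants can be read freely
    DBMS_OUTPUT.PUT_LINE(v_empname || ' pays ' || v_price);

    -- c_discount := 20;    -- not allowed: a constant cannot be an assignment target
    -- v_empname  := NULL;  -- not allowed: the variable is declared NOT NULL
END;
/
```

With the values above the block displays Bob pays 180.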


6.8.1. Scalar datatype - Character
Scalar variables are variables which hold a single value at runtime. Character variables
can be declared in the following forms.

1. CHAR (n) Fixed length character datatype. n is an integer which stands for the number of
bytes to be allocated. In other words, for single-byte characters, n stands for the number
of alphanumeric characters that can be stored. If no size is mentioned, one character can
be stored.

For example, assume a variable declared as shown below.

DECLARE
v_empname CHAR(10):= 'Joe';


Irrespective of the actual number of characters stored (3 characters), 10 memory
spaces are allocated when we go with the above declaration. The 7 memory spaces not
occupied by the value are padded with blanks.

2. VARCHAR2 (n) Variable length character datatype. n is an integer which stands for the
maximum number of bytes that can be allocated. In other words, for single-byte characters,
n stands for the maximum number of alphanumeric characters that can be stored.

For example, assume a variable declared as shown below.


DECLARE
v_empname VARCHAR2 (10):= 'Joe';


Only 3 memory spaces are allocated and 3 characters are stored in the variable
v_empname. The maximum number of alphanumeric characters which can be stored in
the variable is 10.

3. CHAR (n CHAR) This declaration supports internationalization and globalized databases,
in which more than one byte may be needed to store a character. For example, let us say
that storing a Chinese character needs 4 bytes.


DECLARE
v_empname CHAR(10 CHAR);


In the above declaration, regardless of the number of bytes needed for each
character, we can always store 10 characters, including multi-byte characters. As
we have seen earlier, it exhibits the fixed length nature, similar to CHAR (n).

4. VARCHAR2 (n CHAR) This is similar to CHAR (n CHAR) but it exhibits the variable length
nature, similar to VARCHAR2 (n).


DECLARE
v_empname VARCHAR2(10 CHAR);
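
The fixed length and variable length behaviour described above can be observed with the
LENGTH function. The sketch below uses illustrative variable names.

```sql
SET SERVEROUTPUT ON
DECLARE
    v_fixed    CHAR(10)     := 'Joe';   -- blank-padded to 10 characters
    v_variable VARCHAR2(10) := 'Joe';   -- stored as 3 characters
BEGIN
    DBMS_OUTPUT.PUT_LINE('CHAR length     : ' || LENGTH(v_fixed));
    DBMS_OUTPUT.PUT_LINE('VARCHAR2 length : ' || LENGTH(v_variable));
END;
/
```

The block reports a length of 10 for the CHAR variable and 3 for the VARCHAR2 variable.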

A few valid variable declarations are shown below:


SQL> DECLARE
2 v_departmentname VARCHAR2(20) :='Civil';
3 v_instructorname CHAR(20) NOT NULL:= ' Bob Hockins';
4 v_applicantname VARCHAR2(20 CHAR) DEFAULT 'Joel';
5 c_coursename CONSTANT CHAR(20 CHAR) := 'AutoCAD';
..
6.8.2. Scalar datatype - PLS_INTEGER

A variable declared with the PLS_INTEGER datatype can store positive or negative whole
numbers or 0. Arithmetic operations involving PLS_INTEGER provide the best performance in
PL/SQL. Assigning a real value to a variable declared as PLS_INTEGER does not throw an
error; the value stored is rounded to the nearest integer.
6.8.3. Scalar datatype - NUMBER

A variable declared with the NUMBER (P, S) datatype can store integer and floating point
values, where P is the total number of digits allowed and S is the number of digits to the
right of the decimal point. Note that the decimal point itself is not counted in the width.
A maximum of 38 digits can be stored in NUMBER.


SQL> DECLARE
2 v_semester PLS_INTEGER;
3 v_durationinhours NUMBER(7,2);
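
The rounding behaviour of the two declarations above can be sketched as follows. The
assigned values are illustrative assumptions.

```sql
SET SERVEROUTPUT ON
DECLARE
    v_semester        PLS_INTEGER;
    v_durationinhours NUMBER(7,2);
BEGIN
    v_semester        := 3.6;         -- rounded to the nearest integer
    v_durationinhours := 12345.678;   -- rounded to 2 digits after the decimal point
    DBMS_OUTPUT.PUT_LINE(v_semester);
    DBMS_OUTPUT.PUT_LINE(v_durationinhours);

    -- v_durationinhours := 123456.7;  -- would fail at run time: NUMBER(7,2) leaves
    --                                 -- room for only 5 digits before the decimal point
END;
/
```

The block displays 4 and 12345.68.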

6.8.4. Scalar datatype - Boolean

A variable declared with the BOOLEAN datatype can store the boolean values TRUE and FALSE,
as well as NULL. These variables can be used for decision making while writing conditional
statements.


SQL> DECLARE
2 v_test BOOLEAN;
3 BEGIN
4 -- v_test:='TRUE'; --Wrong: a quoted value is a character string, not a boolean
5 v_test:= TRUE; --Correct
6 END;

While explicitly assigning a boolean value to a boolean variable, care should be taken that
it is not enclosed within single quotes (' '). The above code snippet demonstrates the same.


The value stored in a boolean variable cannot be printed directly, because
DBMS_OUTPUT.PUT_LINE () does not accept a BOOLEAN argument. Test the variable in a
conditional statement and print a suitable message instead.
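
One common way around this restriction is to test the boolean variable and print a
character string that describes it. The sketch below is one possible approach, not the only
one.

```sql
SET SERVEROUTPUT ON
DECLARE
    v_test BOOLEAN := TRUE;
BEGIN
    -- A BOOLEAN cannot be passed to PUT_LINE, so map it to text first.
    IF v_test THEN
        DBMS_OUTPUT.PUT_LINE('v_test is TRUE');
    ELSIF NOT v_test THEN
        DBMS_OUTPUT.PUT_LINE('v_test is FALSE');
    ELSE
        DBMS_OUTPUT.PUT_LINE('v_test is NULL');
    END IF;
END;
/
```

The final ELSE branch is reached only when the variable holds NULL, because both v_test and
NOT v_test are then neither TRUE nor FALSE.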

6.8.5. Scalar Datatype - Date

The DATE datatype stores the century, year, month, day, hour, minute and second. Fractional
seconds are not available in the date datatype.


SQL>
1 DECLARE
2 v_billingdate DATE := SYSDATE;
3 BEGIN
4 DBMS_OUTPUT.PUT_LINE( TO_CHAR (v_billingdate, 'DD-MON-YYYY HH:MI:SS'));
5* END;
SQL> /
24-JAN-2010 10:01:39

PL/SQL procedure successfully completed.

SQL>

Certain predefined formats can be applied to DATE variables using TO_CHAR () function, so as
to print the date according to format mentioned, as shown in the above example.

6.8.6. Scalar Datatype - Timestamp

The TIMESTAMP datatype stores the date and time details much like the DATE datatype, and in
addition it provides fractional seconds of up to nine digits (the default is six).

SQL>
1 DECLARE
2 v_billingdate TIMESTAMP := SYSTIMESTAMP;
3 BEGIN
4 DBMS_OUTPUT.PUT_LINE(v_billingdate);
5* END;
SQL> /
24-JAN-10 10.34.25.929000 PM

PL/SQL procedure successfully completed.

SQL>

By default, whenever we print a TIMESTAMP variable, both the date and the time are
printed, without the need for the TO_CHAR () function.

SQL>
1 DECLARE
2 v_billingdate TIMESTAMP(9) := SYSTIMESTAMP;
3 BEGIN
4 DBMS_OUTPUT.PUT_LINE(v_billingdate);
5* END;
SQL> /
24-JAN-10 10.36.54.413000000 PM

PL/SQL procedure successfully completed.

SQL>
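
Both DATE and TIMESTAMP values can also be formatted explicitly with TO_CHAR (). The sketch
below uses the HH24 format element for a 24-hour clock and FF3 for three digits of
fractional seconds. The variable names are illustrative, and the output depends on the time
of execution.

```sql
SET SERVEROUTPUT ON
DECLARE
    v_billingdate DATE      := SYSDATE;
    v_registered  TIMESTAMP := SYSTIMESTAMP;
BEGIN
    -- DATE: no fractional seconds are available
    DBMS_OUTPUT.PUT_LINE(TO_CHAR(v_billingdate, 'DD-MON-YYYY HH24:MI:SS'));
    -- TIMESTAMP: FF3 prints the fractional seconds with three digits
    DBMS_OUTPUT.PUT_LINE(TO_CHAR(v_registered, 'DD-MON-YYYY HH24:MI:SS.FF3'));
END;
/
```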
6.9. DBMS_OUTPUT package

SQL>
1 DECLARE
2 v_departmentname VARCHAR2(20) :='Civil';
3 v_dateofjoining DATE:= SYSDATE;
4 v_registrationdate TIMESTAMP := SYSTIMESTAMP;
5 BEGIN
6 DBMS_OUTPUT.PUT_LINE(v_departmentname);
7 DBMS_OUTPUT.PUT_LINE(v_dateofjoining);
8 DBMS_OUTPUT.PUT_LINE(v_registrationdate);
9* END;
SQL> /

DBMS_OUTPUT is an Oracle-supplied package that has a set of procedures defined within it.
One of these procedures is PUT_LINE (), which echoes on the screen the string passed as its
argument within parentheses.
Thus we use this procedure to display messages from an anonymous PL/SQL block. This
procedure is mainly used for debugging. When a PL/SQL block is executed, any
DBMS_OUTPUT.PUT_LINE () messages are placed in an output buffer, whose content is displayed
on the screen when the program completes its execution.

The above PL/SQL block produces output similar to the following:

Civil
24-JAN-10
24-JAN-10 10.47.03.368000 PM
6.9.1. DBMS_OUTPUT procedures

DBMS_OUTPUT.ENABLE and DBMS_OUTPUT.DISABLE are procedures which respectively enable and
disable the transfer and display of information through the output buffer.

DBMS_OUTPUT.PUT () procedure merely places the information in the output buffer, without an
end-of-line marker. The information is not displayed on the screen until the line is
completed.

DBMS_OUTPUT.PUT_LINE () procedure places the information in the output buffer and also
places an end-of-line marker on every invocation. Messages placed earlier on the same line
using PUT () are therefore displayed together with it.

DBMS_OUTPUT.NEW_LINE () procedure places an end-of-line marker in the output buffer, so
that messages placed using PUT () which are yet to be echoed on the screen can be displayed.
For consecutive repeated invocations, the end-of-line marker is placed only once. Using this
procedure, we cannot transfer any message text to the output buffer.
6.9.2. DBMS_OUTPUT procedures usages

SQL>
1 BEGIN
2 DBMS_OUTPUT.PUT('An intelligent programmer ');
3 DBMS_OUTPUT.PUT('finds programming in ');
4 DBMS_OUTPUT.PUT_LINE('PL/SQL ');
5 DBMS_OUTPUT.PUT('interesting ');
6* END;
SQL> /

An intelligent programmer finds programming in PL/SQL

PL/SQL procedure successfully completed.

Note that the text 'interesting ' placed by the final PUT () call is not displayed, because
no end-of-line marker follows it.





SQL>
1 BEGIN
2 DBMS_OUTPUT.PUT('An intelligent programmer ');
3 DBMS_OUTPUT.PUT('finds programming in ');
4 DBMS_OUTPUT.NEW_LINE;
5 DBMS_OUTPUT.PUT_LINE('PL/SQL ');
6 DBMS_OUTPUT.PUT('interesting ');
7 DBMS_OUTPUT.NEW_LINE;
8* END;
SQL> /

An intelligent programmer finds programming in
PL/SQL
interesting
PL/SQL procedure successfully completed.

7. PL/SQL basics and constructs

7.1. %TYPE anchored declarations
Usage1: Anchored declaration is a way of associating a database column definition to a
PL/SQL variable.

SQL> DECLARE
2 -- variablename tablename.columnname%TYPE; --Syntax
3 v_courseid course.courseid%TYPE;
. . . .

The primary advantage of an anchored declaration is that changes to the precision or the
datatype of the column in the database do not affect the PL/SQL block which deals with
those values.

For example, instead of declaring v_courseid as VARCHAR2(6), we have declared it in the
above PL/SQL block as COURSE.COURSEID%TYPE, where COURSE is the name of the base table
and COURSEID is a column present in the COURSE table. The two names are separated by the
dot (.) symbol and followed by %TYPE, which stands for copying the datatype definition alone.

The advantage here is that a change of COURSEID from VARCHAR2 (6) to VARCHAR2 (8) at a
later point of time, during maintenance, does not affect the PL/SQL block in any way. The
next time we compile and execute the PL/SQL block, the change in the datatype definition is
reflected automatically, which eases maintenance from the programmer's point of view.

At the same time, if there is any NOT NULL constraint or CHECK constraint associated with
that database column, those constraints are NOT applied to the PL/SQL variable defined
using the anchored declaration.

Usage2: Another usage of %TYPE is whenever we want to reuse the datatype of an earlier
declared PL/SQL variable, we can use %TYPE.

SQL> DECLARE
2 v_projectscore NUMBER(3) NOT NULL:= 67;
3 v_assignmentscore v_projectscore%TYPE :=28;
. . . .

For example, as shown in the above PL/SQL block, v_projectscore is a variable declared with
a NUMBER datatype and a NOT NULL constraint, and initialized with the value 67. If another
variable needs to be declared with the same datatype and constraints, we can use %TYPE.
Thus we have declared v_assignmentscore as v_projectscore%TYPE, so that v_assignmentscore is
also treated as a NUMBER datatype and the NOT NULL constraint is applied to it as well.
Hence v_assignmentscore also has to be initialized. Note that the value of v_projectscore
(67) is not copied to v_assignmentscore.
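
The second usage can be sketched as a complete block. The out-of-range value 1000 is an
illustrative assumption used to show that the precision NUMBER(3) is inherited through
%TYPE; VALUE_ERROR is the predefined exception raised when a value does not fit the
datatype.

```sql
SET SERVEROUTPUT ON
DECLARE
    v_projectscore    NUMBER(3) NOT NULL := 67;
    v_assignmentscore v_projectscore%TYPE := 28;  -- same datatype and NOT NULL constraint
BEGIN
    DBMS_OUTPUT.PUT_LINE('Project score    : ' || v_projectscore);
    DBMS_OUTPUT.PUT_LINE('Assignment score : ' || v_assignmentscore);

    v_assignmentscore := 1000;   -- does not fit in NUMBER(3), raises VALUE_ERROR
EXCEPTION
    WHEN VALUE_ERROR THEN
        DBMS_OUTPUT.PUT_LINE('1000 does not fit in NUMBER(3)');
END;
/
```

The block prints the two scores (67 and 28) and then the message from the exception
handler.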

7.2. Bind variables

Bind variables are declared in the SQL*Plus host environment. These variables are used to
pass runtime values out of one or more PL/SQL programs to the host environment. A bind
variable is declared using the VARIABLE keyword followed by the name of the bind variable
and its datatype. This declaration is done outside any PL/SQL block.

For example, a bind variable named g_courseid is declared as shown in the SQL*Plus session
below. The convention followed for naming bind variables is to start the name with g_. These
variables help us to transfer information from PL/SQL to the SQL*Plus environment. They are
alive only for the current session in which they are declared.

Soon after it is declared, a bind variable holds a NULL value, and we cannot initialize bind
variables with an initial value in the declaration. Within the PL/SQL block g_courseid is
assigned the value C001. Note the difference in addressing a bind variable within PL/SQL: it
has to be prefixed with a colon (:) symbol.


SQL> SET SERVEROUTPUT ON
SQL> VARIABLE g_courseid VARCHAR2(4);
SQL> BEGIN
2 :g_courseid :='C001';
3 END;
4 /
PL/SQL procedure successfully completed.

Changes made to the bind variable are visible within the PL/SQL block as well as
outside it, once the PL/SQL block has completed execution. Thus we can use
DBMS_OUTPUT.PUT_LINE (:g_courseid) to print the value present in a bind variable inside a
PL/SQL block. To display the value present in the bind variable in the SQL*Plus environment,
use the PRINT command as shown below.


SQL> PRINT g_courseid
G_COURSEID
----------------
C001

Using the PRINT command we can view the content of the bind variables named in the command.
To view the list of all bind variables declared in a session, type VARIABLE at the SQL prompt
and press enter. We cannot declare bind variables with a DATE datatype, and we cannot
specify a precision for a NUMBER bind variable.
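
The sketch below shows a bind variable surviving across two PL/SQL blocks in the same
SQL*Plus session. The name g_total and the values are illustrative assumptions.

```sql
SET SERVEROUTPUT ON
VARIABLE g_total NUMBER

BEGIN
    :g_total := 100;                 -- first block sets the bind variable
END;
/

BEGIN
    :g_total := :g_total + 50;       -- second block sees the value set by the first
    DBMS_OUTPUT.PUT_LINE(:g_total);
END;
/

PRINT g_total
```

Both the PUT_LINE () call and the PRINT command display 150, because the value lives in the
host environment rather than inside either block.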
7.3. Substitution variables
Substitution variables are declared in the SQL*Plus environment. These variables are used to
pass runtime values into one or more PL/SQL programs. Using the DEFINE command we can
define values for these variables. Irrespective of the type of data assigned to these
variables, all values are treated as the CHAR datatype.

For example, to define a substitution variable named g_courseid and to initialize the variable
with the value C001, we proceed as shown below.


SQL> SET SERVEROUTPUT ON
SQL> DEFINE g_courseid = 'C001';
SQL> DECLARE
2 v_courseid VARCHAR2(4);
3 BEGIN
4 v_courseid :='&g_courseid';
5 DBMS_OUTPUT.PUT_LINE(v_courseid);
6 END;
7 /

A substitution variable has to be given a value when it is declared in the SQL*Plus
environment using the DEFINE command. While referring to the substitution variable in
PL/SQL, prefix it with an (&) ampersand symbol.

Thus the value defined in SQL*Plus is substituted into the PL/SQL block, and v_courseid is
thereby assigned the value C001. PL/SQL does not ask us to enter any value for g_courseid.
We cannot change the value of g_courseid within PL/SQL.

Thus for the mere transfer of data values from SQL*Plus to PL/SQL we can use substitution
variables. These variables are also alive only for the current session in which they are
declared. The output of the above PL/SQL block is shown below.


old 4: v_courseid:= '&g_courseid';
new 4: v_courseid:= 'C001';

PL/SQL procedure successfully completed.
7.4. Accepting input in PL/SQL

To accept input in a PL/SQL block, refer to a substitution variable that has not been defined,
that is, a name prefixed with an ampersand (&). When we execute the PL/SQL block, the system
prompts us to enter a value for it before the block runs.

The below PL/SQL block demonstrates the same.

SQL> SET SERVEROUTPUT ON
SQL> DECLARE
2 v_courseid VARCHAR2(4);
3 BEGIN
4 v_courseid:='&v';
5 DBMS_OUTPUT.PUT_LINE(v_courseid);
6 END;
7 /
When the above PL/SQL block is executed, it would ask the user to enter the value for v.
Value entered would be assigned to v_courseid, which is displayed by the subsequent
DBMS_OUTPUT.PUT_LINE () statement.


Enter value for v: C001
old 4: v_courseid:='&v';
new 4: v_courseid:='C001';
PL/SQL procedure successfully completed.
PL/SQL is not interactive: all the input values are accepted before the block starts
executing. The following code snippet demonstrates this.


SQL> DECLARE
2 v_customername VARCHAR2(20);
3 v_qtyrequired NUMBER;
4 BEGIN
5 v_customername := '&v_customername';
6 DBMS_OUTPUT.PUT_LINE('Customer Name : '||v_customername);
7 v_qtyrequired := &v_qtyrequired;
8 DBMS_OUTPUT.PUT_LINE('Required Qty : '||v_qtyrequired);
9 END;
10 /


Enter value for v_customername: JAMES
old 5: v_customername := '&v_customername';
new 5: v_customername := 'JAMES';
Enter value for v_qtyrequired: 20
old 7: v_qtyrequired := &v_qtyrequired;
new 7: v_qtyrequired := 20;
Customer Name : JAMES
Required Qty : 20
PL/SQL procedure successfully completed.
While executing the above PL/SQL block, the system asks us to enter both the customer name
and the required quantity, and only then displays the entered details.

Even though a DBMS_OUTPUT.PUT_LINE () statement immediately follows the line that accepts the
customer name, the customer name is not displayed as soon as it is entered. Instead, values
for all input variables (i.e., those preceded by an ampersand (&)) are accepted first, and
then both the customer name and the required quantity are displayed. A PL/SQL programmer
should be aware of this behavior.

Note:
- Even if an & appears in a commented line, it is still treated as a substitution variable
  and prompts for a value.
- For character input, enclose the substitution variable in single quotes (' ').
7.5. SET VERIFY ON/OFF
By default, for every substitution that happens within a PL/SQL block, two lines are
displayed: the line before substitution (old) and the line after substitution (new). This
helps us identify how each substitution has happened and how many substitutions have taken
place. We can suppress this display by typing the command SET VERIFY OFF in the SQL PLUS
environment.

The below PL/SQL block and its output demonstrate this. To enable the display of
substitutions again, use the SET VERIFY ON command in the SQL PLUS environment.


SQL> SET VERIFY OFF
SQL> ed
Wrote file afiedt.buf
1 DECLARE
2 v_courseid VARCHAR2(4);
3 BEGIN
4 v_courseid:='&v';
5 DBMS_OUTPUT.PUT_LINE(v_courseid);
6* END;
SQL> /

Enter value for v: C002
C002
PL/SQL procedure successfully completed.
7.6. Operators and Expressions
This section discusses the following operators:
1. Concatenation operator ( || )
2. Arithmetic operators ( +,-,*,/, **)
3. Relational operators (=, !=, <, >, <=, >=)
4. Logical Operators (AND, OR and NOT)
Using these operators, expressions can be framed.
7.6.1. Concatenation operator
The concatenation operator joins two or more strings together. For example, in the below
PL/SQL block, v_applicantname is declared and initialized with the value John. The value 10
is then appended to it, and the concatenated value is assigned back to the same variable
v_applicantname and displayed on the screen.


DECLARE
v_applicantname VARCHAR2(10) := 'John';
BEGIN
v_applicantname := v_applicantname || '10';
DBMS_OUTPUT.PUT_LINE('value of v_applicantname : '|| v_applicantname);
END;


value of v_applicantname : John10
This operator is especially useful for producing formatted output.
7.6.2. Arithmetic operator - Addition
The below PL/SQL block demonstrates the usage of the plus (+) operator, used for addition.
The variable v_hostelfee is declared but not initialized, so it holds the value NULL. When the
plus (+) operator is applied to this variable and the operand 500, the result is NULL,
because NULL added to any NUMBER value yields NULL.


DECLARE
v_hostelfee NUMBER;
BEGIN
v_hostelfee:= v_hostelfee+500; -- NULL +500
DBMS_OUTPUT.PUT_LINE('value of v_hostelfee : '|| v_hostelfee);
END;

value of v_hostelfee :

Nothing is printed after the colon because v_hostelfee is NULL.

Use only numeric and date datatypes with arithmetic operators.
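
One common way to avoid an unintended NULL result is to substitute a default value before the
arithmetic is performed. The sketch below is a variation of the block above, assuming that
treating a missing hostel fee as 0 is acceptable. It uses the built-in NVL function, which
returns its second argument when the first argument is NULL.

```sql
SET SERVEROUTPUT ON
DECLARE
  v_hostelfee NUMBER;  -- declared but not initialized, so it holds NULL
BEGIN
  -- NVL(v_hostelfee, 0) yields 0 here, so the addition becomes 0 + 500
  v_hostelfee := NVL(v_hostelfee, 0) + 500;
  DBMS_OUTPUT.PUT_LINE('value of v_hostelfee : ' || v_hostelfee);
END;
/
```

With this change the block should display 500 after the colon instead of an empty value.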
7.6.3. Arithmetic operator - Exponentiation
The below example demonstrates the usage of the exponentiation operator (**). To compute 3
raised to the power 5, we use this operator. The PL/SQL variable v_number is declared and
initialized with the value 3. The statement v_number := v_number ** 5 in the executable part
of the PL/SQL block evaluates to 3 ** 5 at runtime (3 raised to the power 5 equals 243), and
the resultant value is displayed as shown below.


DECLARE
v_number NUMBER:=3;
BEGIN
v_number:=v_number ** 5;
DBMS_OUTPUT.PUT_LINE( 'value of v_number : '|| v_number);
END;

value of v_number : 243

7.6.4. Usage of Arithmetic operators with DATE variables

The below PL/SQL block demonstrates the usage of the plus (+) operator with the DATE datatype.

A variable named v_today is assigned a date value. An executable statement in the PL/SQL
block adds one day to that date and assigns the result to another variable named v_tomorrow.


SQL> SET SERVEROUTPUT ON
SQL> DECLARE
2 v_today DATE := '31-MAR-2009';
3 v_tomorrow DATE;
4 BEGIN
5 v_tomorrow := v_today + 1;
6 DBMS_OUTPUT.PUT_LINE('Tomorrow''s date is '||v_tomorrow);
7 END;
8 /

Tomorrow's date is 01-APR-09

PL/SQL procedure successfully completed.

7.7. Nested PL/SQL blocks
We can define a PL/SQL block within another PL/SQL block. The block defined inside is called
a nested PL/SQL block. A nested PL/SQL block can also have a declaration section, an
executable section and an exception section. One or more PL/SQL blocks can be present within
an anonymous PL/SQL block. Nested PL/SQL blocks can be placed in the executable section or in
the exception handling section. The below example demonstrates the way in which a nested
PL/SQL block can be defined.

DECLARE
v_departmentid NUMBER:=10;
BEGIN
DECLARE
v_seatsavailable NUMBER:=20;
BEGIN
DBMS_OUTPUT.PUT_LINE('The value of v_departmentid: '||v_departmentid);
DBMS_OUTPUT.PUT_LINE('The value of v_seatsavailable: '||v_seatsavailable);
END;
DBMS_OUTPUT.PUT_LINE('The value of v_departmentid: '|| v_departmentid);
DBMS_OUTPUT.PUT_LINE('The value of v_seatsavailable: '||v_seatsavailable);
END;
/

ERROR at line 11:
ORA-06550: line 11, column 57:
PLS-00201: identifier 'V_SEATSAVAILABLE' must be declared
ORA-06550: line 11, column 1:
PL/SQL: Statement ignored

The outer block declares a variable named v_departmentid, initialized with the value 10. The
nested block declares a variable named v_seatsavailable, initialized with the value 20. The
scope of v_departmentid covers the block in which it is declared and the nested PL/SQL block,
whereas v_seatsavailable is visible only within the nested block in which it is declared.

Hence executing the above PL/SQL block leads to the compilation error shown above, because
the last statement of the outer block refers to v_seatsavailable.

The different ways in which PL/SQL blocks can be nested are shown in the below PL/SQL blocks.


DECLARE
  --declaration of variables in the enclosed block
BEGIN
  --SQL and PL/SQL statement(s)
  DECLARE
    -- declaration of variables in the nested block
  BEGIN
    -- SQL & PL/SQL statement(s) in nested block
  END;
  DECLARE
    -- declaration of variables in the nested block
  BEGIN
    -- SQL & PL/SQL statement(s) in nested block
  END;
  --SQL and PL/SQL statement(s)
END;

DECLARE
  --declaration of variables in the enclosed block
BEGIN
  --SQL and PL/SQL statement(s)
  DECLARE
    -- declaration of variables in the nested block
  BEGIN
    -- SQL & PL/SQL statement(s) in nested block
    DECLARE
      -- declaration of variables
    BEGIN
      -- SQL and PL/SQL statement(s)
    END;
  END;
  --SQL and PL/SQL statement(s)
END;

Overlapping of nested blocks is not allowed. This is shown in the below PL/SQL block.


DECLARE
  --declaration of variables in the enclosed block
BEGIN
  --SQL and PL/SQL statement(s)
  DECLARE
    -- declaration of variables in the nested block
  BEGIN
    -- SQL & PL/SQL statement(s) in nested block
    DECLARE
      -- declaration of variables
  END;
    BEGIN
      -- SQL and PL/SQL statement(s)
    END;
  --SQL and PL/SQL statement(s)
END;

7.7.1. Scope of variables
PL/SQL variables declared in the DECLARE section are visible in the EXECUTABLE section and the
EXCEPTION section of the same block. The lifetime of variables declared in a nested block is
limited to that nested block. Variables declared in the outermost block are visible in all
the nested blocks.

The below code snippet demonstrates the concept of scope of variables. v_courseid is a
variable declared in the outer block and assigned the value C001. Another variable with the
same name v_courseid is declared in the first inner block and assigned the value C002. Now
when we display the value of v_courseid in the inner block, what value would be displayed?


DECLARE
  v_courseid VARCHAR2(4) := 'C001';
BEGIN
  DECLARE
    v_courseid VARCHAR2(4) := 'C002';
  BEGIN
    DBMS_OUTPUT.PUT_LINE(v_courseid);
  END;
  DECLARE
    v_durationinhours NUMBER(4):=3;
  BEGIN
    DBMS_OUTPUT.PUT_LINE(v_courseid);
  END;
  DBMS_OUTPUT.PUT_LINE(v_courseid);
END;

C002
C001
C001

The first inner block displays the value C002, because a variable declared in a nested block
always takes precedence over an outer block variable with the same name. Hence we cannot
access the v_courseid holding the value C001 within that nested PL/SQL block, as far as the
above code snippet is concerned. The second inner block and the outer block display C001,
since only the outer v_courseid is visible there.

A simple analogy helps to remember this scenario: just as local variables take precedence
over global variables in the C language, variables declared within a nested block take
precedence over those of the enclosing block.

7.7.2. Qualifying identifiers
Anonymous PL/SQL blocks can be qualified with identifiers (or names). Enclose the identifier
in << and >> angle brackets. For example, <<branch>> is a PL/SQL block within which there is
a nested <<course>> PL/SQL block. These qualifying identifiers are useful when one or more
variables with the same name are present.

v_seatsavailable is a variable declared within the branch block as well as in the nested
course block. If we specify the variable without any qualifier, it refers to the variable
declared in the current block; using the qualifier declared earlier is not mandatory.

<<branch>>
DECLARE
     v_seatsavailable NUMBER:=10;
BEGIN
     <<course>>
     DECLARE
          v_seatsavailable NUMBER:=20;
     BEGIN
          DBMS_OUTPUT.PUT_LINE(' branch seats available: '||branch.v_seatsavailable);
          DBMS_OUTPUT.PUT_LINE(' course seats available: '||course.v_seatsavailable);
     END;
     DBMS_OUTPUT.PUT_LINE(' branch seats available: '||branch.v_seatsavailable);
     DBMS_OUTPUT.PUT_LINE(' course seats available: '||course.v_seatsavailable);
END;

With these qualifiers, we are able to access, from within the inner block, both the variable
declared in the outer block <<branch>> and the one declared in the inner block <<course>>.
The last statement of the outer block, however, refers to course.v_seatsavailable outside the
<<course>> block, where that label is out of scope. The outcome of the above code snippet is
therefore the compilation error shown below.

ERROR at line 13:
ORA-06550: line 13, column 63:
PLS-00219: label 'COURSE' reference is out of scope
ORA-06550: line 13, column 6:
PL/SQL: Statement ignored
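
A variant of the labelled blocks that compiles can be sketched by keeping every reference to
the course label inside the <<course>> block. This is only an illustrative rewrite of the
same example, reusing its labels and variable names; the outer block refers to its own
variable without a qualifier.

```sql
<<branch>>
DECLARE
  v_seatsavailable NUMBER := 10;
BEGIN
  <<course>>
  DECLARE
    v_seatsavailable NUMBER := 20;
  BEGIN
    -- Both labels are in scope inside the nested block
    DBMS_OUTPUT.PUT_LINE('branch seats available: ' || branch.v_seatsavailable);
    DBMS_OUTPUT.PUT_LINE('course seats available: ' || course.v_seatsavailable);
  END;
  -- Only the branch variable is visible here; the course label is out of scope
  DBMS_OUTPUT.PUT_LINE('branch seats available: ' || v_seatsavailable);
END;
/
```

With SERVEROUTPUT enabled, this version should display the values 10, 20 and 10 in that order.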

7.8. PL/SQL conditional constructs
PL/SQL supports a number of conditional and looping constructs, which are discussed in this
section.
7.8.1. IF THEN END IF syntax

IF condition THEN
action;
END IF;

The above PL/SQL conditional construct checks whether a condition is TRUE; if it is, the set
of statements enclosed between IF THEN and END IF is executed once. The condition can
evaluate to TRUE, FALSE or NULL; when it evaluates to FALSE or NULL, the enclosed statements
are skipped.


DECLARE
v_maxscore NUMBER :=25;
v_projectscore NUMBER :=&v;
BEGIN
IF v_projectscore>v_maxscore THEN
DBMS_OUTPUT.PUT_LINE('Invalid project score');
END IF;
END;

SQL> /
Enter value for v: 26
old 3: v_projectscore NUMBER :=&v;
new 3: v_projectscore NUMBER :=26;
Invalid project score

PL/SQL procedure successfully completed.
7.8.2. IF THEN ELSE END IF syntax


IF condition THEN
action-true;
ELSE
action-false;
END IF;

The above PL/SQL conditional construct is used to execute a set of statements (action-true
part), if the condition is evaluated to TRUE. If the condition evaluates to FALSE or NULL then
the set of statements associated with the false part (action-false) is executed.

When can the condition evaluate to NULL? If any variable referred to in the condition holds
the value NULL, the result of the condition is NULL, and the false part is still executed.


SQL> ed
Wrote file afiedt.buf
1 DECLARE
2 v_num NUMBER;
3 BEGIN
4 IF v_num > 10 THEN
5 DBMS_OUTPUT.PUT_LINE('TRUE');
6 ELSE
7 DBMS_OUTPUT.PUT_LINE('FALSE OR NULL');
8 END IF;
9* END;

SQL> /
FALSE OR NULL
PL/SQL procedure successfully completed.
As we can see in the above example, the variable v_num declared in the declaration part is
not initialized. The condition v_num > 10 therefore reduces to NULL > 10, and the result of
the condition is NULL.
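
When the NULL case needs to be told apart from the FALSE case, the variable can be tested
with IS NULL before the comparison is made. The following is a minimal sketch based on the
block above; the messages are chosen only for illustration.

```sql
SET SERVEROUTPUT ON
DECLARE
  v_num NUMBER;  -- not initialized, so it holds NULL
BEGIN
  IF v_num IS NULL THEN
    -- IS NULL evaluates to TRUE or FALSE, never to NULL
    DBMS_OUTPUT.PUT_LINE('v_num is NULL');
  ELSE
    IF v_num > 10 THEN
      DBMS_OUTPUT.PUT_LINE('TRUE');
    ELSE
      DBMS_OUTPUT.PUT_LINE('FALSE');
    END IF;
  END IF;
END;
/
```

As written, the first branch is taken, because v_num is never assigned a value.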
7.8.3. Usage of inequality operator ( != or <> )

The below example demonstrates the usage of the inequality operator to check whether two
strings are different.

SQL> ed
Wrote file afiedt.buf
1 DECLARE --Comparing VARCHAR2 datatypes
2 v_string1 VARCHAR2(20) := 'Foundation program';
3 v_string2 VARCHAR2(20) := 'Foundation program';
4 BEGIN
5 IF v_string1 <> v_string2 THEN
6 DBMS_OUTPUT.PUT_LINE('Both are unequal');
7 ELSE
8 DBMS_OUTPUT.PUT_LINE('Both are equal');
9 END IF;
10* END;

SQL> /
Both are equal
PL/SQL procedure successfully completed.

Predict the output of the above code snippet after making the following changes to it.

- Change the declaration of v_string1 alone to CHAR(20) and check the output
- Change the declarations of both v_string1 and v_string2 to CHAR(20) and check the output
7.8.4. IF THEN ELSIF END IF syntax

IF condition THEN
action;
ELSIF condition THEN
action;
[ELSE
action;]
END IF;
The above conditional construct is also called the IF ELSIF ladder. More than one condition
can be checked, one after the other, and a set of executable statements can be associated
with each condition. As soon as one condition evaluates to TRUE, the block of statements
associated with it is executed and no further condition is checked.

Make a note of the spelling of ELSIF, which has no E before IF. This is the correct syntax.
If none of the conditions specified in the IF ELSIF ladder is TRUE, control moves to the ELSE
part placed at the end, and the set of statements present in that ELSE block is executed.

The ELSE block is optional in the above syntax and is hence enclosed in square brackets. We
need only one END IF at the end of the construct.

Another variation, the IF ELSE IF ladder, serves the same purpose and functionality but
differs in syntax, as shown below.


IF condition THEN
action;
ELSE IF condition THEN
action;
[ELSE
action;]
END IF;
END IF;

Here every IF has its own END IF. Examples of both forms are given below.


DECLARE
v_maxscore NUMBER :=25;
v_projectscore NUMBER :=&v;
BEGIN
IF v_projectscore>v_maxscore THEN
DBMS_OUTPUT.PUT_LINE('Project score cannot be greater than 25');
ELSIF v_projectscore < 0 THEN
DBMS_OUTPUT.PUT_LINE('Project score cannot be -ve');
ELSE
DBMS_OUTPUT.PUT_LINE('Valid project score');
END IF;
END;

Enter value for v: 20
Valid project score
PL/SQL procedure successfully completed.



SQL>
1 DECLARE
2 v_maxscore NUMBER :=25;
3 v_projectscore NUMBER :=&v;
4 BEGIN
5 IF v_projectscore>v_maxscore THEN
6 DBMS_OUTPUT.PUT_LINE('Project score cannot be greater than 25');
7 ELSE IF v_projectscore < 0 THEN
8 DBMS_OUTPUT.PUT_LINE('Project score cannot be -ve');
9 ELSE
10 DBMS_OUTPUT.PUT_LINE('Valid project score');
11 END IF;
12 END IF;
13* END;

SQL> /
Enter value for v: 22
Valid project score
PL/SQL procedure successfully completed.


7.8.5. LOOP.. END LOOP
Whenever we want a set of statements to be executed repeatedly, we can use the LOOP ... END
LOOP construct. The syntax of this construct is as shown below.

LOOP
action;
END LOOP;


To transfer the control outside the LOOP ... END LOOP construct, we have an EXIT WHEN
clause. Otherwise, the control would stay indefinitely within the LOOP ... END LOOP
construct.

LOOP
action;
EXIT WHEN condition;
END LOOP;
Once the condition specified in the EXIT WHEN clause is TRUE, the control would be
transferred outside the LOOP ... END LOOP construct. If the condition evaluates to FALSE or
NULL the control is retained within the LOOP ... END LOOP construct. An example, given
below demonstrates the usage of LOOP ... END LOOP construct.


DECLARE
v_seatsallocated NUMBER:=1;
BEGIN
LOOP
DBMS_OUTPUT.PUT_LINE('Seats allocated '|| v_seatsallocated);
v_seatsallocated:= v_seatsallocated +1;
EXIT WHEN v_seatsallocated >5;
END LOOP;
END;

Seats allocated 1
Seats allocated 2
Seats allocated 3
Seats allocated 4
Seats allocated 5

PL/SQL procedure successfully completed.

7.8.6. Numeric FOR Loop


FOR countervariable IN low_number .. high_number
LOOP
action;
END LOOP;

PL/SQL supports a numeric FOR loop in which the counter variable is declared implicitly and
initialized with low_number. After this initialization, the block of statements in the body
of the FOR loop is executed once. Subsequently the counter variable is incremented by one and
compared with high_number. If the counter has not gone past high_number, the block of
statements is executed once again; otherwise control is transferred outside the loop.

We cannot increment the counter variable by 2 or 3 or any other value. The counter variable
is always incremented by one.
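
Although the counter itself always advances by one, a different step can be derived from it
inside the loop body. The following is a minimal sketch, not a built-in feature of the FOR
loop; the multiplier 2 and the message text are chosen only for illustration.

```sql
SET SERVEROUTPUT ON
BEGIN
  -- v_num still takes the values 1, 2, 3 and 4; the value printed is v_num * 2
  FOR v_num IN 1 .. 4
  LOOP
    DBMS_OUTPUT.PUT_LINE('Seats allocated : ' || (v_num * 2));
  END LOOP;
END;
/
```

This should print the values 2, 4, 6 and 8, one per line.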


SQL> BEGIN
2 FOR v_num IN 1 .. 4
3 LOOP
4 DBMS_OUTPUT.PUT_LINE('Seats allocated : '|| v_num);
5 END LOOP;
6 END;

Seats allocated : 1
Seats allocated : 2
Seats allocated : 3
Seats allocated : 4

PL/SQL procedure successfully completed.
The lifetime of the counter variable is limited to the FOR loop, and a reference to the
counter variable outside the FOR loop results in a compilation error, as shown below.


SQL> ed
Wrote file afiedt.buf
1 BEGIN
2 FOR v_num IN 1 .. 4
3 LOOP
4 DBMS_OUTPUT.PUT_LINE('Seats allocated : '|| v_num);
5 END LOOP;
6 DBMS_OUTPUT.PUT_LINE('Seats allocated : '|| v_num);
7* END;

SQL> /
DBMS_OUTPUT.PUT_LINE('Seats allocated : '|| v_num);
*
ERROR at line 6:
ORA-06550: line 6, column 45:
PLS-00201: identifier 'V_NUM' must be declared
ORA-06550: line 6, column 1:
PL/SQL: Statement ignored

When the low_number and high_number value is one and the same, still the body of the FOR
loop will be executed once which is demonstrated in the below example.




SQL> ed
Wrote file afiedt.buf
1 BEGIN
2 FOR v_num IN 1 .. 1
3 LOOP
4 DBMS_OUTPUT.PUT_LINE('Seats allocated : '|| v_num);
5 END LOOP;
6* END;

SQL> /
Seats allocated : 1
PL/SQL procedure successfully completed.
Try to identify what happens, if you declare a variable named v_num in the above PL/SQL
block?
7.8.7. Numeric FOR Loop with REVERSE option
Another variation of numeric FOR loop available in PL/SQL is with the REVERSE keyword
option, wherein we can start from the high_number value and end up with the low_number
value. The syntax is as shown below.


FOR countervariable IN REVERSE low_number .. high_number
LOOP
action;
END LOOP;

The below example demonstrates the usage of REVERSE keyword in FOR loop.

SQL> BEGIN
2 FOR v_num IN REVERSE 1 .. 4
3 LOOP
4 DBMS_OUTPUT.PUT_LINE('Seats allocated : '|| v_num);
5 END LOOP;
6 END;

Seats allocated : 4
Seats allocated : 3
Seats allocated : 2
Seats allocated : 1

PL/SQL procedure successfully completed.

Note: Even with REVERSE, the lower number must always be mentioned first.

7.8.8. WHILE Loop
PL/SQL also provides the traditional WHILE loop, which is entry controlled: the condition is
checked before each execution of the loop body.

WHILE condition
LOOP
action;
END LOOP;

The block of statements enclosed within WHILE LOOP ... END LOOP is executed only if the
condition is TRUE. As long as the condition is TRUE, control is retained inside the loop;
once the condition evaluates to FALSE or NULL, control is transferred outside.

The counter variable used in the condition part needs to be declared explicitly. The below
example demonstrates the usage of the WHILE loop.


SQL> ed
Wrote file afiedt.buf
1 DECLARE
2 v_num NUMBER:= 1;
3 BEGIN
4 WHILE v_num <= 4
5 LOOP
6 DBMS_OUTPUT.PUT_LINE('v_num: '|| v_num);
7 v_num := v_num + 1;
8 END LOOP;
9* END;

SQL> /
v_num: 1
v_num: 2
v_num: 3
v_num: 4
PL/SQL procedure successfully completed.

7.9. Using SQL statements in PL/SQL
SQL data manipulation statements (SELECT, INSERT, UPDATE and DELETE) can be used directly in
PL/SQL. This section illustrates the usage of these SQL statements, which we have learnt
earlier, and highlights the differences in syntax, if any.

7.9.1. Using SELECT statements in PL/SQL


SELECT select_list INTO variable_list FROM table_list [WHERE where_clause];


We can use SELECT statements in PL/SQL, but with an additional clause called the INTO clause,
which is required here. The INTO clause specifies the list of PL/SQL variables into which the
selected values are to be moved. The query must return exactly one row into the variable
list.


SQL>
1 DECLARE
2 v_ename emp.ename%TYPE;
3 v_sal emp.sal%TYPE;
4 v_job emp.job%TYPE;
5 BEGIN
6 --select statement fetches the employee name, salary and
7 --job of an employee where employee no equal to 7934
8 SELECT ename,sal,job INTO v_ename,v_sal,v_job FROM emp WHERE empno=7934;
9 DBMS_OUTPUT.PUT_LINE('The employee name is '||v_ename);
10 DBMS_OUTPUT.PUT_LINE('Salary '||v_sal);
11 DBMS_OUTPUT.PUT_LINE('Job '||v_job);
12* END;

SQL> /
The employee name is MILLER
Salary 1300
Job CLERK
PL/SQL procedure successfully completed.

In the above example we have made use of anchored declarations and declared three
variables v_ename, v_sal, v_job and the SELECT statement fetches the employee name,
salary and job value of an employee with employee number 7934 into the above declared
PL/SQL variables and displays the same.

One important point worth mentioning here is that, in the SELECT statement, the number of
columns in the select list must match the number of PL/SQL variables in the INTO clause, and
the datatype of every column must be compatible with the datatype of the corresponding PL/SQL
variable.

If the SELECT statement is unable to identify a row in the underlying table matching the
query condition specified in the WHERE clause, a NO_DATA_FOUND exception is thrown.


SQL>
1 DECLARE
2 v_ename emp.ename%TYPE;
3 v_sal emp.sal%TYPE;
4 v_job emp.job%TYPE;
5 BEGIN
6 --select statement fetches the employee name, salary and
7 --job of an employee where employee no equal to 1234
8 SELECT ename,sal,job INTO v_ename,v_sal,v_job FROM emp WHERE empno=1234;
9 DBMS_OUTPUT.PUT_LINE('The employee name is '||v_ename);
10 DBMS_OUTPUT.PUT_LINE('Salary '||v_sal);
11 DBMS_OUTPUT.PUT_LINE('Job '||v_job);
12* END;

SQL> /
DECLARE
*
ERROR at line 1:
ORA-01403: no data found
ORA-06512: at line 8
If the SELECT statement fetches more than one row then TOO_MANY_ROWS exception would
be thrown.


SQL>
1 DECLARE
2 v_ename emp.ename%TYPE;
3 v_sal emp.sal%TYPE;
4 v_job emp.job%TYPE;
5 BEGIN
6 SELECT ename,sal,job INTO v_ename,v_sal,v_job FROM emp;
7* END;

SQL> /
DECLARE
*
ERROR at line 1:
ORA-01422: exact fetch returns more than requested number of rows
ORA-06512: at line 6

Exceptions are discussed in the subsequent chapters. Refer to those chapters for more
details.

7.10. Composite datatype
A composite datatype helps us to store more than one value in a single variable. This is
similar to the concept of a structure in the C language, wherein a programmer can store
either homogeneous or heterogeneous values in contiguous memory locations within a variable.

PL/SQL allows us to create record variables which can store multiple column values. Using the
%ROWTYPE anchored declaration, we can declare a record variable based on a database table
definition, i.e. we need to prefix the %ROWTYPE keyword with a database table name.
7.10.1. %ROWTYPE


--recordvariablename tablename%ROWTYPE;

v_branchrec branch%ROWTYPE;

The names of the individual fields within the record variable are the same as the column
names of the database table. None of the constraints specified during table creation are
applied to the individual fields of the record variable. Only the column names and their
datatypes are copied and retained.

To refer to an individual field or column within a record, after creation of the record
variable, we have to use the following syntax.



recordvariable.columnname

Referring to the record variable name alone does not print the entire record; each field has
to be referred to individually. If any underlying column definition is modified, the change
is reflected in the structure of the record variable the next time the PL/SQL block is
compiled and run.


SQL> DECLARE
2 v_branchrec branch%ROWTYPE;
3 BEGIN
4 SELECT * INTO v_branchrec FROM branch WHERE branchid='B1';
5 DBMS_OUTPUT.PUT_LINE(v_branchrec.branchid);
6 DBMS_OUTPUT.PUT_LINE(v_branchrec.branchname);
7 END;
8 /

SQL> /
B1
Computer Science
PL/SQL procedure successfully completed.

7.10.2. Using INSERT statements in PL/SQL


INSERT INTO table_name[(column_list)] {VALUES (value_list) | select_statement};

The syntax of the INSERT statement in PL/SQL is shown above. We can write INSERT statements
in PL/SQL and hard code the values directly. We can also accept the branchid, branchname,
seatsavailable and departmentid from the end user and insert all the details into the branch
table.

--Inserting values to branch table directly by providing values
BEGIN
INSERT INTO branch VALUES ('B4','Microbiology',10,40);
END;
/


DECLARE
v_branchid branch.branchid%TYPE:= '&bid';
v_branchname branch.branchname%TYPE := '&bname';
v_seatsavailable branch.seatsavailable%TYPE := &seats;
BEGIN
INSERT INTO branch(branchid, branchname, seatsavailable) VALUES
(v_branchid,v_branchname, v_seatsavailable);
END;
/

7.10.3. Using UPDATE statements in PL/SQL

UPDATE table_name SET column_name = {value | (select_statement)}
[, column_name = value] [WHERE where_clause];


The syntax for updating a record or a set of records in PL/SQL is shown above. We can write
an UPDATE statement and hard code the value for any straightforward update, or we can accept
the new value from the end user and substitute the PL/SQL variable in the appropriate place
in the statement.

The below PL/SQL block demonstrates the usage of UPDATE statement in PL/SQL.

BEGIN
UPDATE branch SET seatsavailable=12 WHERE branchid='B4';
END;
/
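
The second approach, accepting the new value from the end user, can be sketched as follows.
This is an illustrative variant of the block above; the substitution variable names bid and
seats are assumptions chosen for the example, and the column names are those of the branch
table used earlier.

```sql
DECLARE
  -- Anchored declarations keep the variables in step with the table columns
  v_branchid       branch.branchid%TYPE       := '&bid';
  v_seatsavailable branch.seatsavailable%TYPE := &seats;
BEGIN
  UPDATE branch
     SET seatsavailable = v_seatsavailable
   WHERE branchid = v_branchid;
END;
/
```

When the block is run, SQL PLUS prompts for bid and seats before the UPDATE is executed.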

7.10.4. Using DELETE statements in PL/SQL


DELETE FROM table_name [WHERE where_clause];


The syntax for deleting a record or a set of records in PL/SQL is shown above. We can write a
DELETE statement and hard code the value for any straightforward deletion, or we can accept
the value identifying the record to be deleted from the end user and substitute the PL/SQL
variable in the appropriate place in the statement.


BEGIN
DELETE FROM branch WHERE branchid='B4';
END;
/
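
A deletion driven by user input can be sketched in the same way. The substitution variable
name bid is an assumption made for this example; the table and column names are those used in
the block above.

```sql
DECLARE
  v_branchid branch.branchid%TYPE := '&bid';
BEGIN
  -- Deletes only the row whose branchid matches the value entered at the prompt
  DELETE FROM branch WHERE branchid = v_branchid;
END;
/
```

If no row matches the value entered, the DELETE simply removes nothing and the block still
completes successfully.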





















8. PL/SQL EXCEPTIONS

8.1. Introduction

In this chapter, we discuss how exceptions are handled in PL/SQL. When a PL/SQL block is
compiled, any compilation errors, such as incorrect syntax usage, are reported by the PL/SQL
compiler. A PL/SQL block which does not have any compilation error might still throw runtime
errors, or exceptions, while the code is executed by the PL/SQL runtime engine.

For example, when a programmer writes an arithmetic expression which leads to a division by
zero, the ZERO_DIVIDE runtime exception is thrown by the PL/SQL block. These exceptions can
be trapped by writing appropriate exception handlers in the exception block of PL/SQL.

Thus an exception is an identifier in PL/SQL that is raised during the execution of a PL/SQL
block. Whenever an exception arises, control leaves the main body of action and is
transferred to the EXCEPTION section of the anonymous PL/SQL block. For example, if the
exception is thrown in the nth line of a PL/SQL block, control will not return to the
(n+1)th line. The execution of the program continues in the exception handler, and then in
any outer block, if the block is nested. Program execution never returns to the statement
following the one that raised the exception.
8.2. How to handle exception?
Using the exception part of a PL/SQL block we can handle exceptions. If the exceptions are
not trapped in the exception part of a PL/SQL, these exceptions would be propagated to the
calling environment.

Also note that exceptions can be raised in the declaration part, executable part as well as in
the exception part.

8.3. Exception syntax


EXCEPTION
WHEN exception1 [OR exception2 . . .] THEN
statement1;
statement2;
. . .
[WHEN exception3 [OR exception4 . . .] THEN
statement1;
statement2;
. . .]
[WHEN OTHERS THEN
statement1;
statement2;
. . .]
END;

The above code snippet shows the syntax of writing exception handlers. The EXCEPTION keyword
starts the exception handling section. A PL/SQL programmer can define several exception
handlers, each with its own set of actions. The identifier of the runtime exception to be
handled is placed between the WHEN and THEN keywords. If the same handler should deal with
more than one exception, the exception identifier names are separated by the optional OR
keyword. The OR keyword cannot be replaced by the AND keyword.

When an exception occurs, only one among the several exception handlers is executed before
leaving the EXCEPTION section, after which control is transferred to the enclosing block or
to the calling environment. As programmers we anticipate many of the runtime error
situations which a PL/SQL block might come across, but an unanticipated runtime error can
still occur. The solution is to write a WHEN OTHERS exception handler, which takes care of
all runtime errors that are not handled by name.
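
As a small sketch of the syntax above, the block below groups two predefined exceptions under one handler with OR and adds WHEN OTHERS as the last handler. It assumes only the standard DUAL table; the CONNECT BY clause is used here merely to make the query return two rows.

```plsql
SET SERVEROUTPUT ON
DECLARE
    v_dummy VARCHAR2(1);
BEGIN
    -- the query returns two rows, so SELECT ... INTO raises TOO_MANY_ROWS
    SELECT dummy INTO v_dummy FROM dual CONNECT BY LEVEL <= 2;
    DBMS_OUTPUT.PUT_LINE('Exactly one row found');
EXCEPTION
    WHEN NO_DATA_FOUND OR TOO_MANY_ROWS THEN
        -- one handler shared by two exceptions
        DBMS_OUTPUT.PUT_LINE('Query did not return exactly one row');
    WHEN OTHERS THEN
        -- any other, unanticipated runtime error
        DBMS_OUTPUT.PUT_LINE('Unexpected error');
END;
/
```

Only the first handler runs here; WHEN OTHERS is reached only for errors that no named handler covers.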
8.4. Exception Types
Exceptions can be classified into the following types.
1. Predefined Oracle server exceptions
2. Non-predefined Oracle server exceptions
3. User-defined exceptions
8.4.1. Raising exceptions
In general, exceptions can be raised in the executable section or in the exception section of
a PL/SQL block. Predefined exceptions are raised implicitly whenever the corresponding error
situation arises, and the PL/SQL runtime engine executes the statements associated with the
trapped predefined exception. We can also raise our own exceptions explicitly in the
executable section or in the exception section.
8.5. Predefined oracle server exception
Predefined exceptions are raised implicitly whenever an anticipated error situation arises
while executing the statements in the PL/SQL block. The PL/SQL runtime engine executes the
statements associated with the trapped predefined exception.

For example, division by zero is a predefined error situation. Whenever this error situation
arises, control moves immediately to the exception section, which is searched for the
ZERO_DIVIDE exception identifier. If the programmer has written a block of statements under
the ZERO_DIVIDE exception identifier, we say that the programmer has defined an exception
handler. The PL/SQL runtime engine executes the statements associated with this exception
handler, after which control is transferred to the enclosing outer block, if any, or to the
calling SQL environment.

Oracle Error   Predefined Exception   Description
ORA-01403      NO_DATA_FOUND          SELECT statement matches no rows
ORA-01422      TOO_MANY_ROWS          SELECT statement matches more than one row
ORA-00001      DUP_VAL_ON_INDEX       Unique constraint violated
ORA-01476      ZERO_DIVIDE            Division by zero
ORA-06502      VALUE_ERROR            Truncation, arithmetic error
ORA-01722      INVALID_NUMBER         Conversion to a number failed (for example, 2A is not valid)

To trap a predefined Oracle server exception we need to know its standard name. The Oracle
documentation lists the identifier names of the predefined Oracle server exceptions.
8.5.1. NO_DATA_FOUND predefined exception
NO_DATA_FOUND is a predefined Oracle server exception that is raised implicitly whenever a
SELECT ... INTO statement in a PL/SQL block fails to find a matching record in the
underlying table. Note that if an INSERT, UPDATE or DELETE statement affects no rows, this
exception is NOT raised. Only when a SELECT statement in a PL/SQL block finds no rows is the
above predefined exception raised.


DECLARE
    v_branchid branch.branchid%TYPE;
    v_seats    branch.seatsavailable%TYPE;
BEGIN
    v_branchid := '&branchid';
    -- SELECT ... INTO raises NO_DATA_FOUND when no row matches
    SELECT seatsavailable INTO v_seats
    FROM branch
    WHERE branchid LIKE v_branchid;
    DBMS_OUTPUT.PUT_LINE('Seats Available: ' || v_seats);
EXCEPTION
    WHEN NO_DATA_FOUND THEN
        DBMS_OUTPUT.PUT_LINE('Invalid Branch ID');
END;
/

In the example, if the given branchid is present in the branch table, the above PL/SQL block
would display the number of seats available; else it would display Invalid Branch ID.
8.5.2. TOO_MANY_ROWS predefined exception
TOO_MANY_ROWS is a predefined Oracle server exception that is raised implicitly whenever a
SELECT ... INTO statement returns more than one row.


--Given a valid supplierid identify whether the supplier supplies one item
--or more than one
SET SERVEROUTPUT ON
DECLARE
    v_supplierid  itemsupplier.supplierid%TYPE;
    v_supplierrec itemsupplier%ROWTYPE;
BEGIN
    -- store the input once so that the query and the messages use the same value
    v_supplierid := '&v_supplierid';
    SELECT * INTO v_supplierrec
    FROM itemsupplier
    WHERE supplierid = v_supplierid;
    DBMS_OUTPUT.PUT_LINE('Supplier ' || v_supplierid || ' supplies only one item');
EXCEPTION
    WHEN TOO_MANY_ROWS THEN
        DBMS_OUTPUT.PUT_LINE('Supplier ' || v_supplierid || ' supplies more than one item');
END;
/

The above PL/SQL block deals with the itemsupplier table, which lists the suppliers and the
items they supply. Since the same supplier can supply more than one item, a query on the
itemsupplier table based on supplierid might return more than one record.

This leads to the TOO_MANY_ROWS predefined exception, and the block displays the message
Supplier <<given supplierid>> supplies more than one item.
8.5.3. DUP_VAL_ON_INDEX predefined exception
The DUP_VAL_ON_INDEX predefined exception is raised whenever we try to store a duplicate
value in a column that is constrained by a unique index, such as a primary key column.


DECLARE
    v_student student%ROWTYPE;
BEGIN
    -- PL/SQL assignment uses := and not =
    v_student.studentid         := &studentid;
    v_student.applicationid     := &applicationid;
    v_student.currentsemester   := &currentsemester;
    v_student.branchid          := '&branchid';
    v_student.userid            := '&userid';
    v_student.password          := '&password';
    v_student.residentialstatus := '&resstatus';
    INSERT INTO student VALUES (v_student.studentid,
        v_student.applicationid, v_student.currentsemester,
        v_student.branchid, v_student.userid, v_student.password,
        v_student.residentialstatus);
EXCEPTION
    WHEN DUP_VAL_ON_INDEX THEN
        DBMS_OUTPUT.PUT_LINE('Duplicate Student ID');
    WHEN OTHERS THEN
        DBMS_OUTPUT.PUT_LINE('Transaction Failed');
END;
/
In the above PL/SQL block, while inserting a student record, if the student id is duplicated,
then we would receive a message Duplicate Student ID.
8.5.4. VALUE_ERROR predefined exception
The VALUE_ERROR predefined exception is raised in two different scenarios.

1. When a value is too large for the variable that receives it. For example, v_studentid
is declared as VARCHAR2(6). If we enter more than 6 characters as input, the value does
not fit in the variable. PL/SQL does not truncate it silently but raises the
VALUE_ERROR predefined exception.


DECLARE
    v_studentid  VARCHAR2(6);
    v_studentrec student%ROWTYPE;
BEGIN
    -- raises VALUE_ERROR if the input has more than 6 characters
    v_studentid := '&v_studentid';
    SELECT * INTO v_studentrec
    FROM student
    WHERE studentid = v_studentid;
    DBMS_OUTPUT.PUT_LINE('Student Name is ' || v_studentrec.studentname);
EXCEPTION
    WHEN VALUE_ERROR THEN
        DBMS_OUTPUT.PUT_LINE('Entered input is very large');
END;
/

2. When a procedural statement expects a numeric value but the input entered by the user
contains characters that cannot be converted to a number, the VALUE_ERROR predefined
exception is raised.
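
The second scenario can be sketched as below. The variable names and the sample value '12A' are assumptions made only for this illustration; in practice the value would come from user input.

```plsql
SET SERVEROUTPUT ON
DECLARE
    v_input VARCHAR2(10) := '12A';   -- illustrative non-numeric input
    v_seats NUMBER(3);
BEGIN
    -- implicit character-to-number conversion fails in this assignment
    v_seats := v_input;
    DBMS_OUTPUT.PUT_LINE('Seats: ' || v_seats);
EXCEPTION
    WHEN VALUE_ERROR THEN
        DBMS_OUTPUT.PUT_LINE('Input is not a valid number');
END;
/
```

Because the conversion happens in a procedural assignment and not inside an SQL statement, the error is reported as VALUE_ERROR.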
8.5.5. INVALID_NUMBER predefined exception
When an SQL statement, such as an INSERT into a table, expects a numeric value for a
specific column but is given a character value (by mistake) that cannot be converted to a
number, the INVALID_NUMBER predefined exception is raised.


BEGIN
    --Inserting departmentid, departmentname, headofdepartment
    --into department table
    INSERT INTO department VALUES ('X', 'BioMedical', 'I101');
EXCEPTION
    WHEN INVALID_NUMBER THEN
        DBMS_OUTPUT.PUT_LINE('Not a valid number');
END;
/

The above example demonstrates the same, wherein the department table expects the
department id (a numeric value) to be entered as input but a character value (X) is entered
while inserting a record into the same table. This PL/SQL block when compiled and executed
throws INVALID_NUMBER predefined exception.
8.6. Non-predefined oracle server exception
Every Oracle error has an error code and an error message, but predefined exception names
are not available for all runtime error situations. Some runtime error situations are
unnamed. These unnamed error situations can be trapped either with the WHEN OTHERS exception
handler, or by associating an exception identifier with the error code using the PRAGMA
EXCEPTION_INIT compiler directive and then handling the exception by that name.


DECLARE
    e_Missing_Null EXCEPTION;
    -- associate the identifier with Oracle error ORA-01400
    PRAGMA EXCEPTION_INIT(e_Missing_Null, -1400);
BEGIN
    INSERT INTO department VALUES (40, NULL, 'I101');
EXCEPTION
    WHEN e_Missing_Null THEN
        DBMS_OUTPUT.PUT_LINE('Missing value for a NOT NULL column');
END;
/


The PRAGMA EXCEPTION_INIT compiler directive associates, at compile time, an Oracle error
number with the specified exception identifier. Once the association is made, the error
situation with that Oracle error number can be handled by writing an exception handler for
the identifier.

As shown in the above example, whenever a NOT NULL constraint is violated while inserting or
updating a record, error code -1400 is raised at runtime with an Oracle-defined error
message. There is no predefined exception identifier for this Oracle error situation.
e_Missing_Null is an exception identifier declared in the declaration section, and PRAGMA
EXCEPTION_INIT associates it with the Oracle error number -1400.

While inserting a department record, a value has to be provided for the department name. A
NULL value cannot be inserted for the department name, as this violates the NOT NULL
constraint. Since the above PL/SQL block inserts NULL as the department name, the NOT NULL
constraint violation is raised implicitly, handled in the exception section, and the block
displays Missing value for a NOT NULL column.

8.7. User-defined exception
Exceptions which are specific to the business requirements can be implemented with the help
of user-defined exceptions. User-defined exception identifiers are declared in the
declaration section. We raise a user-defined exception with the statement
RAISE exceptionidentifier; and write an exception handler to handle it.

In the example below, e_Invalid_Departmentid is a user-defined exception identifier declared
in the declaration section. The SELECT statement in the executable section counts the
records in the department table with the given departmentid. If no record exists with the
given department id, v_count is set to 0.


DECLARE
    v_departmentid         department.departmentid%TYPE;
    v_count                NUMBER;
    e_Invalid_Departmentid EXCEPTION;
BEGIN
    v_departmentid := '&v_departmentid';
    SELECT COUNT(*) INTO v_count
    FROM department
    WHERE departmentid = v_departmentid;
    IF v_count = 0 THEN
        RAISE e_Invalid_Departmentid;   -- raised explicitly
    END IF;
    DBMS_OUTPUT.PUT_LINE('Valid Department id');
EXCEPTION
    WHEN e_Invalid_Departmentid THEN
        DBMS_OUTPUT.PUT_LINE('Invalid Department id');
END;
/
If v_count is zero, e_Invalid_Departmentid is raised in the executable part and trapped in the
exception handling part which prints Invalid Department id.
8.8. WHEN OTHERS exception handler
Exceptions which are not handled by any other exception handler are caught by the WHEN
OTHERS exception handler. WHEN OTHERS can be used to handle all kinds of exceptions,
irrespective of whether they are predefined, non-predefined or user-defined.


DECLARE
    v_departmentid         department.departmentid%TYPE;
    v_count                NUMBER;
    e_Invalid_Departmentid EXCEPTION;
BEGIN
    v_departmentid := '&v_departmentid';
    SELECT COUNT(*) INTO v_count
    FROM department
    WHERE departmentid = v_departmentid;
    IF v_count = 0 THEN
        RAISE e_Invalid_Departmentid;
    END IF;
    DBMS_OUTPUT.PUT_LINE('Valid Department id');
EXCEPTION
    WHEN OTHERS THEN
        DBMS_OUTPUT.PUT_LINE('Invalid Department id');
END;
/

For those familiar with Java, a simple analogy is a catch block for the generic Exception
class. We can have only one WHEN OTHERS exception handler within the exception section of a
PL/SQL block.

WHEN OTHERS should be the last among the exception handlers, as it covers all the errors not
handled by name in the exception section of the PL/SQL block in which it is defined. When
nested blocks are present, always place a WHEN OTHERS exception handler in the outermost
PL/SQL block.

8.9. Using SQLCODE and SQLERRM
WHEN OTHERS only lets us handle unknown or unexpected runtime errors. To find out the Oracle
error code and the Oracle error message because of which the PL/SQL block failed, we use
SQLCODE and SQLERRM. They help us identify the reason behind the exception raised.

A programmer might want to insert the reasons behind the failure of a PL/SQL block into an
audit_log table, which keeps log records of the errors that happened over a period of time
(not dealt with in the code snippet).


DECLARE
    e_Missing_Null EXCEPTION;
    PRAGMA EXCEPTION_INIT(e_Missing_Null, -1400);
    v_sqlcode   NUMBER;
    v_sqlerrmsg VARCHAR2(255);
BEGIN
    INSERT INTO department VALUES (40, NULL, 'I101');
EXCEPTION
    WHEN OTHERS THEN
        -- capture the error details in local variables before using them
        v_sqlcode   := SQLCODE;
        v_sqlerrmsg := SUBSTR(SQLERRM, 1, 255);
        DBMS_OUTPUT.PUT_LINE('SQLCODE ' || v_sqlcode);
        DBMS_OUTPUT.PUT_LINE('SQLERRM ' || v_sqlerrmsg);
END;
/

SQLCODE and SQLERRM can be used both in the executable part and in the exception part of a
PL/SQL block. SQLCODE gives the numeric value of the Oracle error code, and SQLERRM gives
the Oracle error code together with the message associated with the Oracle error. The
maximum length of SQLERRM is 512 characters.

The above example depicts the best practice of assigning SQLCODE and SQLERRM to local
variables in the PL/SQL block and then using those variables. As these functions are
procedural, we cannot use them directly inside an SQL statement.
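
The logging idea mentioned earlier could look roughly like the sketch below. The audit_log table and its columns (error_code, error_message, logged_on) are hypothetical and are created here only so that the example is self-contained; the division by zero merely forces a runtime error.

```plsql
-- hypothetical log table; the column names are assumptions for illustration
CREATE TABLE audit_log (
    error_code    NUMBER,
    error_message VARCHAR2(512),
    logged_on     DATE
);

DECLARE
    v_sqlcode   NUMBER;
    v_sqlerrmsg VARCHAR2(512);
    v_divisor   NUMBER := 0;
    v_result    NUMBER;
BEGIN
    v_result := 100 / v_divisor;   -- forces a runtime error
EXCEPTION
    WHEN OTHERS THEN
        -- copy into local variables first: SQLCODE and SQLERRM
        -- cannot be used directly inside the INSERT statement
        v_sqlcode   := SQLCODE;
        v_sqlerrmsg := SUBSTR(SQLERRM, 1, 512);
        INSERT INTO audit_log (error_code, error_message, logged_on)
        VALUES (v_sqlcode, v_sqlerrmsg, SYSDATE);
        COMMIT;
END;
/
```

Note that the COMMIT in the handler commits the whole current transaction, not only the log record, so this placement is a design choice that suits a standalone script.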
8.10. RAISE_APPLICATION_ERROR built in procedure

RAISE_APPLICATION_ERROR is a built-in procedure used to raise application-specific errors
with customized messages, in a manner consistent with other Oracle errors. If we want to
define a customized error message and raise it in one step, without declaring an exception
identifier and writing a separate exception handler for it, we can use this built-in
procedure.

The two mandatory parameters are the error number and the error message. The error number
has to be in the range of -20000 to -20999. The error message can be up to 2048 bytes long.


DECLARE
    v_departmentid department.departmentid%TYPE;
    v_count        NUMBER;
BEGIN
    v_departmentid := '&v_departmentid';
    SELECT COUNT(*) INTO v_count
    FROM department
    WHERE departmentid = v_departmentid;
    IF v_count = 0 THEN
        RAISE_APPLICATION_ERROR(-20000, 'Invalid Department id');
    END IF;
    DBMS_OUTPUT.PUT_LINE('Valid Department id');
END;
/

In the above example, given a non-existent department id, the block raises the error Invalid
Department id using the RAISE_APPLICATION_ERROR built-in. When the error is left unhandled
and reaches the calling environment, the changes made by the failing block are rolled back.
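
An error raised with RAISE_APPLICATION_ERROR can also be trapped by name if a calling block needs to react to it. The following is a minimal sketch of one way to do that, combining it with PRAGMA EXCEPTION_INIT; the identifier e_invalid_department is an assumed name used only here.

```plsql
SET SERVEROUTPUT ON
DECLARE
    e_invalid_department EXCEPTION;   -- illustrative identifier
    PRAGMA EXCEPTION_INIT(e_invalid_department, -20000);
BEGIN
    RAISE_APPLICATION_ERROR(-20000, 'Invalid Department id');
EXCEPTION
    WHEN e_invalid_department THEN
        -- SQLCODE holds the user-defined error number, SQLERRM the message
        DBMS_OUTPUT.PUT_LINE('Trapped ' || SQLCODE || ': ' || SQLERRM);
END;
/
```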
8.11. Exception Propagation
Exceptions can be raised in the declaration section, the executable section and the
exception section. Only an exception raised in the executable section can be handled in the
same PL/SQL block. An exception raised in the declaration section or in the exception section
of a PL/SQL block can be handled only in the outer block in which that block is enclosed.

If the exception is not handled in the outer block, the block enclosing the outer block is
checked, and so on. This is called propagation of the exception. If none of the blocks
handles the exception, it is propagated to the calling environment.
8.11.1. Exception raised in the declaration section
In the below PL/SQL block an exception is raised in the declaration section. As the
programmer assigns a character value to a numeric variable, the VALUE_ERROR predefined
exception is raised. Even though a WHEN OTHERS handler is present in the same block, an
exception raised in the declaration part can be handled only in an enclosing block. Since
there is none here, the error reaches the calling environment, which displays the error
message numeric or value error.


DECLARE
    v_seatsavailable NUMBER(3) := 'ABC';
BEGIN
    DBMS_OUTPUT.PUT_LINE(v_seatsavailable);
EXCEPTION
    WHEN OTHERS THEN
        DBMS_OUTPUT.PUT_LINE('Value error occurred');
END;
/

DECLARE
*
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error: character to number conversion error
ORA-06512: at line 2

In the below code snippet a PL/SQL block is enclosed in yet another block. WHEN OTHERS
handler present in the outer block, handles the exception raised in the declaration part of the
inner block, producing Other error as output.


BEGIN

    DECLARE
        v_seatsavailable NUMBER(3) := 'ABC';
    BEGIN
        DBMS_OUTPUT.PUT_LINE(v_seatsavailable);
    EXCEPTION
        WHEN OTHERS THEN
            DBMS_OUTPUT.PUT_LINE('Value error occurred');
    END;

    DBMS_OUTPUT.PUT_LINE('Completed');
EXCEPTION
    WHEN OTHERS THEN
        DBMS_OUTPUT.PUT_LINE('Other error');
END;
/

8.11.2. Exception raised in the executable section

e_Invalid_Departmentid is an exception raised in the executable part of the inner PL/SQL
block using the RAISE statement and handled in the same block. Hence the following code
prints Invalid Departmentid and then Successful completion, as execution of the outer block
continues normally.

DECLARE
    e_Invalid_Departmentid EXCEPTION;
BEGIN
    BEGIN
        RAISE e_Invalid_Departmentid;
    EXCEPTION
        WHEN e_Invalid_Departmentid THEN
            DBMS_OUTPUT.PUT_LINE('Invalid Departmentid');
    END;
    DBMS_OUTPUT.PUT_LINE('Successful completion');
END;
/

8.11.3. Exception raised in the exception section
In the below code snippet, e_Invalid_Itemid is an exception raised in the inner block and
handled in the same block. The e_Invalid_Itemid exception handler in turn raises another
exception named e_Invalid_Customerid. As this exception is raised in the exception section
of a PL/SQL block, it has to be handled in the exception section of the outer block.


DECLARE
    e_Invalid_Itemid     EXCEPTION;
    e_Invalid_Customerid EXCEPTION;
BEGIN
    BEGIN
        RAISE e_Invalid_Itemid;
    EXCEPTION
        WHEN e_Invalid_Itemid THEN
            RAISE e_Invalid_Customerid;
        -- not reached: a sibling handler cannot trap an exception
        -- raised inside the same exception section
        WHEN e_Invalid_Customerid THEN
            DBMS_OUTPUT.PUT_LINE('Invalid Customerid');
    END;
END;
/

As the outer block has no exception section, the PL/SQL runtime engine treats
e_Invalid_Customerid as an unhandled user-defined exception. Because that exception was
raised while e_Invalid_Itemid was being handled, the engine reports e_Invalid_Itemid as an
unhandled user-defined exception as well. Hence, if we execute the above PL/SQL block, the
error message unhandled user-defined exception appears twice in the output.


DECLARE
*
ERROR at line 1:
ORA-06510: PL/SQL: unhandled user-defined exception
ORA-06512: at line 9
ORA-06510: PL/SQL: unhandled user-defined exception

The above PL/SQL block is slightly modified and shown below, with e_Invalid_Customerid
handled in the outer block.


DECLARE
    e_Invalid_Itemid     EXCEPTION;
    e_Invalid_Customerid EXCEPTION;
BEGIN
    BEGIN
        RAISE e_Invalid_Itemid;
    EXCEPTION
        WHEN e_Invalid_Itemid THEN
            RAISE e_Invalid_Customerid;
        WHEN e_Invalid_Customerid THEN
            DBMS_OUTPUT.PUT_LINE('Invalid Customerid in the nested block');
    END;
EXCEPTION
    -- the exception raised in the inner exception section is trapped here
    WHEN e_Invalid_Customerid THEN
        DBMS_OUTPUT.PUT_LINE('Invalid Customerid in the outer block');
END;
/


9. PL/SQL cursors

9.1. Cursors
Every SQL statement submitted to the Oracle server affects zero or more rows. The set of
rows affected by the submitted SQL statement is kept temporarily in a special place in the
system memory of the Oracle server. This temporary area is called the private SQL work area.
It holds the rows affected by the statement, the count of the records affected and a pointer
to the parsed statement. Thus a cursor is a private SQL work area. Every SQL statement
executed by the Oracle server has a separate private SQL work area associated with it.

More than one row can be kept in the private SQL work area, but only one row can be
processed at a time. This area is also called the context area by some authors. The set of
rows currently held by the cursor is called the active set. When Oracle manages the cursor
operations by itself, as it does for SELECT and DML statements, the cursor is called an
implicit cursor. When the programmer manages the cursor operations, it is called an explicit
cursor. Managing the cursor involves allocating memory for the work area, opening the work
area, fetching the records from the work area and closing or releasing the work area after
the processing is done.
9.2. Implicit cursors
Whenever INSERT, UPDATE or DELETE statements are executed, PL/SQL creates implicit cursors
by default to process the rows. When we write a SELECT statement which deals with only one
row, an implicit cursor is likewise created and managed by the Oracle server.
9.3. Implicit cursors attributes

Implicit Cursor Attribute   Meaning
SQL%ROWCOUNT                Number of records affected by the most recent SQL statement
SQL%FOUND                   Evaluates to TRUE if the most recent SQL statement affects one or more rows
SQL%NOTFOUND                Evaluates to TRUE if the most recent SQL statement does not affect any rows
SQL%ISOPEN                  Always evaluates to FALSE because PL/SQL closes implicit cursors immediately after they are executed

After a successful insert, the values of the implicit cursor attributes are as shown below.

SQL%ISOPEN      FALSE
SQL%FOUND       TRUE
SQL%NOTFOUND    FALSE
SQL%ROWCOUNT    1

After a successful update, the values of the implicit cursor attributes are as shown below.

SQL%ISOPEN      FALSE
SQL%FOUND       TRUE
SQL%NOTFOUND    FALSE
SQL%ROWCOUNT    Depends on number of rows updated

After a successful delete, the values of the implicit cursor attributes are as shown below.

SQL%ISOPEN      FALSE
SQL%FOUND       TRUE
SQL%NOTFOUND    FALSE
SQL%ROWCOUNT    Depends on number of rows deleted

Do not use SQL%NOTFOUND immediately after a SELECT statement to test whether the SELECT
failed. When a SELECT statement fails, the NO_DATA_FOUND predefined exception is raised and
the statements following the SELECT are skipped. As soon as control moves to the
NO_DATA_FOUND exception handler, the values of the implicit cursor attributes are as shown
below.

SQL%ISOPEN      FALSE
SQL%FOUND       FALSE
SQL%NOTFOUND    TRUE
SQL%ROWCOUNT    0
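
The point about SELECT statements can be illustrated with a small sketch that uses only the standard DUAL table. The condition 1 = 0 is there simply to make the query match no rows.

```plsql
SET SERVEROUTPUT ON
DECLARE
    v_dummy VARCHAR2(1);
BEGIN
    SELECT dummy INTO v_dummy FROM dual WHERE 1 = 0;   -- matches no rows
    -- never reached: the exception has already transferred control
    IF SQL%NOTFOUND THEN
        DBMS_OUTPUT.PUT_LINE('This check never runs');
    END IF;
EXCEPTION
    WHEN NO_DATA_FOUND THEN
        -- the right place to react to a SELECT that finds nothing
        DBMS_OUTPUT.PUT_LINE('No matching row');
END;
/
```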
9.4. Implicit cursor example

The below example shows the usage of implicit cursor attributes.


BEGIN
    UPDATE instructor
    SET remaininghours = NULL
    -- explicit conversion avoids depending on the session date format
    WHERE dateofjoining > TO_DATE('10-JAN-2007', 'DD-MON-YYYY');
    DBMS_OUTPUT.PUT_LINE(SQL%ROWCOUNT || ' rows updated');
    IF SQL%NOTFOUND THEN
        DBMS_OUTPUT.PUT_LINE('Nobody joined after 10-JAN-2007');
    END IF;
    COMMIT;
END;
/
9.5. Explicit Cursors
When we want to write SELECT statements in PL/SQL that deal with more than one row, we have
to use explicit cursors. An explicit cursor is declared by associating a name with a SELECT
statement in the declaration part of the block.

As mentioned earlier, explicit cursors have to be managed by the programmer. The subsequent
sections discuss the operations that the developer performs to manage an explicit cursor and
gain complete control over it.
9.6. Operations on explicit cursor
The operations on explicit cursor are as follows:
1. Declaring the cursor
2. Opening the cursor
3. Fetching the cursor
4. Closing the cursor
Let's have a closer look at all these activities in the sections below.
9.6.1. Declaring the cursor
Use the CURSOR keyword to start the cursor declaration, followed by the cursor identifier
name and then the IS keyword. Note that this cursor identifier need not be declared anywhere
else in the PL/SQL block. The IS keyword is followed by an SQL query. All SQL queries
supported in the SQL environment are supported in the cursor declaration too.

CURSOR c1 IS
    SELECT branchid FROM branch
    WHERE departmentid IN (SELECT departmentid FROM department);

CURSOR c2 IS
    SELECT departmentid, departmentname, headofdepartment FROM department
    WHERE departmentid > 20;

CURSOR c3 IS
    SELECT departmentid, COUNT(*) FROM branch
    WHERE departmentid > 20 GROUP BY departmentid;
It is not necessary to write an INTO clause in the cursor declaration, as it serves no
purpose there. Even if by chance an INTO clause is used in the cursor declaration, the
PL/SQL compiler would not throw any error. Placeholders for the fetched values are needed
only when we actually fetch the records from the active set one after the other, and they
are supplied in the FETCH statement (discussed later).

Mere declaration of CURSOR alone will not immediately identify the active set.
9.6.2. Opening the cursor

OPEN cursorname;

Example:
OPEN c1;

OPEN c2;

OPEN c3;

The above code snippet shows the syntax for opening a cursor. The cursor identifier used in
the cursor declaration needs to be specified while opening the cursor. We can open cursors
both in the executable part as well as in the exception part of a PL/SQL block. If the cursor
is already open, then the PL/SQL runtime engine would throw CURSOR_ALREADY_OPEN
predefined runtime exception.

The SELECT associated with the cursor declaration is executed only when we open the cursor.
Thus the OPEN command prepares the cursor for use, identifies the active set associated with
the given SQL query, and positions the cursor before the first row. If the SQL query fetches
no rows from the database, it would not throw any exception.
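
A minimal sketch of the last point is given below. The cursor name c_empty is an assumption made for this illustration, and the condition 1 = 0 is used only to produce an empty active set.

```plsql
SET SERVEROUTPUT ON
DECLARE
    CURSOR c_empty IS SELECT dummy FROM dual WHERE 1 = 0;
    v_dummy VARCHAR2(1);
BEGIN
    OPEN c_empty;                 -- succeeds although the active set is empty
    FETCH c_empty INTO v_dummy;   -- no exception; only the attributes change
    IF c_empty%NOTFOUND THEN
        DBMS_OUTPUT.PUT_LINE('Active set is empty');
    END IF;
    CLOSE c_empty;
END;
/
```

Unlike a SELECT ... INTO statement, neither the OPEN nor the FETCH raises NO_DATA_FOUND here.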

We need to use explicit cursor attributes to test the outcome of a fetch. The cursor
attributes discussed earlier for implicit cursors can also be used with explicit cursors by
replacing the SQL prefix of each attribute with the cursor identifier name, for example
c1%FOUND instead of SQL%FOUND.

Within a PL/SQL block a cursor can be opened any number of times. Each time the cursor is
opened a different active set may be identified, depending on the current state of the
records in the database. Do not try to reopen a cursor without closing it, as this raises
the CURSOR_ALREADY_OPEN exception discussed earlier.
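
The effect of reopening a cursor can be sketched as follows. The cursor name c_demo is illustrative, and the block uses only the standard DUAL table.

```plsql
SET SERVEROUTPUT ON
DECLARE
    CURSOR c_demo IS SELECT dummy FROM dual;
BEGIN
    OPEN c_demo;
    OPEN c_demo;   -- second OPEN without a CLOSE raises CURSOR_ALREADY_OPEN
EXCEPTION
    WHEN CURSOR_ALREADY_OPEN THEN
        DBMS_OUTPUT.PUT_LINE('Cursor is already open');
        CLOSE c_demo;   -- release the cursor that is still open
END;
/
```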

The safest way of opening a cursor is shown below: we check whether the cursor is already
open and open it only if it is not.

IF NOT c1%ISOPEN THEN
    OPEN c1;
END IF;

Before we open the cursor, assuming that C1 is the explicit cursor which we are dealing with,
the values of the various explicit cursor attributes will be as shown in the below table:

C1%ISOPEN       FALSE
C1%FOUND        INVALID_CURSOR exception
C1%NOTFOUND     INVALID_CURSOR exception
C1%ROWCOUNT     INVALID_CURSOR exception

After we open the cursor and before fetching any record, the values of the various explicit
cursor attributes will be as shown in the below table:

C1%ISOPEN       TRUE
C1%FOUND        NULL
C1%NOTFOUND     NULL
C1%ROWCOUNT     0

Do not use implicit cursor attributes like SQL%FOUND to test the outcome of explicit
cursors. They reflect the outcome of the most recently executed SQL statement, if any, and
not the outcome of the explicit cursor. If no SQL statement has been executed, SQL%FOUND,
SQL%NOTFOUND and SQL%ROWCOUNT evaluate to NULL.
9.6.3. Fetching records from the cursor
The syntax for fetching records from the cursor is as shown below.

FETCH cursorname INTO listofvariables | PL/SQL record variable;

Example:
FETCH c1 INTO v_branchid;

FETCH c2 INTO v_departmentid, v_departmentname, v_headofdepartment;

FETCH c3 INTO v_departmentid, v_count;

FETCH c3 INTO v_branchrec;
Immediately after opening the cursor, we can start fetching records from the identified
active set. If we try to fetch records from an unopened cursor, an INVALID_CURSOR exception
is raised, so make sure to specify the name of a cursor which is already open. After the
INTO keyword we specify the list of variables into which the fetched values are to be
populated.
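
Fetching from a cursor that was never opened can be sketched as below. The cursor and variable names are illustrative only.

```plsql
SET SERVEROUTPUT ON
DECLARE
    CURSOR c_demo IS SELECT dummy FROM dual;
    v_dummy VARCHAR2(1);
BEGIN
    FETCH c_demo INTO v_dummy;   -- c_demo has not been opened
EXCEPTION
    WHEN INVALID_CURSOR THEN
        DBMS_OUTPUT.PUT_LINE('Cursor must be opened before fetching');
END;
/
```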

Care should be taken that the number of variables in the FETCH statement matches the number
of columns in the SELECT statement, and that the datatype of each variable is compatible
with the datatype of the corresponding column. Usually we place the FETCH statement within a
LOOP .. END LOOP construct, as the same statement has to be executed repeatedly to fetch the
subsequent records until we reach the last record. Here we assume that the active set has
more than one record, which is the reason why we have opted for explicit cursors.

Whenever a FETCH is successful, %FOUND evaluates to TRUE, and when it is unsuccessful,
%NOTFOUND evaluates to TRUE. To transfer control out of the LOOP .. END LOOP construct we
therefore place an EXIT WHEN statement immediately after the FETCH, so that the loop ends as
soon as %NOTFOUND evaluates to TRUE.
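
The fetch loop described above might be sketched as follows. The cursor c_rows and its query are assumptions chosen so that the block needs no application tables; CONNECT BY LEVEL <= 3 simply produces three rows from DUAL.

```plsql
SET SERVEROUTPUT ON
DECLARE
    CURSOR c_rows IS SELECT LEVEL AS n FROM dual CONNECT BY LEVEL <= 3;
    v_n NUMBER;
BEGIN
    OPEN c_rows;
    LOOP
        FETCH c_rows INTO v_n;
        EXIT WHEN c_rows%NOTFOUND;   -- leave the loop after the last row
        DBMS_OUTPUT.PUT_LINE('Fetched ' || v_n ||
                             ', ROWCOUNT = ' || c_rows%ROWCOUNT);
    END LOOP;
    -- %ROWCOUNT keeps the number of rows fetched successfully
    DBMS_OUTPUT.PUT_LINE('Total rows fetched: ' || c_rows%ROWCOUNT);
    CLOSE c_rows;
END;
/
```

Placing EXIT WHEN directly after the FETCH ensures that the statements in the loop body never process a row that was not actually fetched.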

Before we fetch the first record from the cursor, the values of the various explicit cursor
attributes will be as shown in the below table:

C1%ISOPEN       TRUE
C1%FOUND        NULL
C1%NOTFOUND     NULL
C1%ROWCOUNT     0

After we successfully fetch the first record from an explicit cursor, the values of the
various explicit cursor attributes will be as shown in the below table:

C1%ISOPEN       TRUE
C1%FOUND        TRUE
C1%NOTFOUND     FALSE
C1%ROWCOUNT     1

For every subsequent successful fetch, the other explicit cursor attributes remain as they
are, and only C1%ROWCOUNT is incremented by 1.

After the first unsuccessful fetch from the explicit cursor, the values of the various
explicit cursor attributes will be as shown in the below table:

C1%ISOPEN       TRUE
C1%FOUND        FALSE
C1%NOTFOUND     TRUE
C1%ROWCOUNT     n

where n is the total number of records present in the active set of the explicit cursor.
9.6.4. Closing the cursor
The below code snippet shows the syntax of the close cursor statement.

CLOSE cursorname;

Example:
CLOSE c1;

CLOSE c2;

CLOSE c3;

The cursor name specified in the CLOSE statement is the name of the cursor to be closed. If
we try to close a cursor which is already closed, an INVALID_CURSOR exception is raised.
Memory allocated to an explicit cursor is released only when we close the cursor. Usually a
programmer closes the cursor after processing the set of records present in the active set.
The cursor can be reopened, if required. Do not attempt to fetch records from a closed
cursor, as this also leads to an INVALID_CURSOR exception. A cursor can be closed safely by
first checking whether it is open:

IF c1%ISOPEN THEN
    CLOSE c1;
END IF;

9.7. Explicit cursor Simple loop

SQL> DECLARE
    -- Active set: every branch that belongs to a department
    CURSOR c1 IS
        SELECT branchid, seatsavailable
        FROM branch
        WHERE departmentid IN (SELECT departmentid FROM department);
    v_branchid       branch.branchid%TYPE;
    v_seatsavailable branch.seatsavailable%TYPE;
BEGIN
    OPEN c1;
    LOOP
        FETCH c1 INTO v_branchid, v_seatsavailable;
        EXIT WHEN c1%NOTFOUND;   -- leave the loop after the last record
        UPDATE branch SET seatsavailable = v_seatsavailable + 1
        WHERE branchid = v_branchid;
        DBMS_OUTPUT.PUT_LINE(v_branchid);
    END LOOP;
    CLOSE c1;
    COMMIT;
END;
/

The above code snippet implements an explicit cursor using the LOOP ... END LOOP construct.
The requirement is to increment the number of seats available by one in every branch
associated with a department of the university. The cursor is declared in the declaration
part, where all the branches associated with a department are identified.

The cursor is opened, and every record in the active set is fetched into the appropriate
PL/SQL variables. For each branch id fetched, an UPDATE statement increments the seats
available by one. Finally we close the cursor, which releases the private SQL work area
allocated to it, and commit the changes.
9.8. Explicit cursor With Group by clause


SQL> DECLARE
    CURSOR cur_branch IS
        SELECT branchname, COUNT(*) AS no_of_applicant_opted
        FROM applicant c, branch b
        WHERE c.optedbranch = b.branchid
        GROUP BY branchname;
    v_branchname     branch.branchname%TYPE;
    v_noofapplicants NUMBER;
BEGIN
    OPEN cur_branch;
    DBMS_OUTPUT.PUT_LINE('Branch Name No. of Application Opted');
    LOOP
        FETCH cur_branch INTO v_branchname, v_noofapplicants;
        EXIT WHEN cur_branch%NOTFOUND;
        -- PUT buffers the text; NEW_LINE ends the current output line
        DBMS_OUTPUT.PUT(v_branchname || ' ');
        DBMS_OUTPUT.PUT(v_noofapplicants);
        DBMS_OUTPUT.NEW_LINE;
    END LOOP;
    CLOSE cur_branch;
END;
/
The above code snippet demonstrates the use of a GROUP BY clause in a cursor declaration.
The example displays each branch name together with the total number of applicants who opted
for that branch. Because the cursor query contains the aggregate function COUNT(*), the
column is given an alias. The alias is what lets us refer to that column by name when the
row is accessed through a record variable, as in the later sections; here, where the values
are fetched into separate variables, it simply documents the column.
9.9. Explicit cursor attributes

Attribute               Question it answers
cursorname%ISOPEN       Is the cursor open?
cursorname%ROWCOUNT     How many rows have been fetched so far?
cursorname%NOTFOUND     Has the most recent fetch failed?
cursorname%FOUND        Has the most recent fetch returned a row?

The values that these attributes take at each stage are discussed along with the cursor
operations.
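
The attribute values can be observed directly with a short block such as the sketch below. It
assumes the branch table used in the earlier examples and that SET SERVEROUTPUT ON has been
issued; bool_text and show_attributes are hypothetical helper names introduced here, because
DBMS_OUTPUT cannot print BOOLEAN values directly.

```sql
DECLARE
    CURSOR c1 IS SELECT branchid FROM branch;
    v_branchid branch.branchid%TYPE;

    -- Hypothetical helper: converts a BOOLEAN (including NULL) to printable text
    FUNCTION bool_text(p_flag BOOLEAN) RETURN VARCHAR2 IS
    BEGIN
        RETURN CASE
                   WHEN p_flag IS NULL THEN 'NULL'
                   WHEN p_flag THEN 'TRUE'
                   ELSE 'FALSE'
               END;
    END bool_text;

    -- Hypothetical helper: prints the attributes of the open cursor c1
    PROCEDURE show_attributes(p_stage VARCHAR2) IS
    BEGIN
        DBMS_OUTPUT.PUT_LINE(p_stage
            || ': ISOPEN='   || bool_text(c1%ISOPEN)
            || ' FOUND='     || bool_text(c1%FOUND)
            || ' NOTFOUND='  || bool_text(c1%NOTFOUND)
            || ' ROWCOUNT='  || c1%ROWCOUNT);
    END show_attributes;
BEGIN
    OPEN c1;
    show_attributes('After OPEN');
    LOOP
        FETCH c1 INTO v_branchid;
        EXIT WHEN c1%NOTFOUND;
        show_attributes('After successful FETCH');
    END LOOP;
    show_attributes('After unsuccessful FETCH');
    CLOSE c1;
    -- Only %ISOPEN may be referenced once the cursor is closed
    DBMS_OUTPUT.PUT_LINE('After CLOSE: ISOPEN=' || bool_text(c1%ISOPEN));
END;
/
```

Referencing %FOUND, %NOTFOUND or %ROWCOUNT after the CLOSE would raise INVALID_CURSOR, which
is why the last line prints %ISOPEN only.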
9.10. Using record variables with explicit cursors
The below example demonstrates how to make use of record variable for accessing the active
set record details and display of the same. v_curvar is a record variable declared as
%ROWTYPE to hold both the branchname and the count of number of applicants opted in
every branch.


CURSOR cur_branch IS
    SELECT branchname, COUNT(*) AS no_of_applicant_opted
    FROM applicant c, branch b
    WHERE c.optedbranch = b.branchid
    GROUP BY branchname;
v_curvar cur_branch%ROWTYPE;   -- record with one field per cursor column
v_newrec v_curvar%TYPE;        -- second record with the same structure

This record variable is used in the FETCH statement after the INTO clause.

When a cursor returns many columns, record variables make the code simpler, as no separate
variable needs to be declared for every column. To declare another record variable with the
same structure, write the name of the new record variable followed by the name of the
existing record variable and %TYPE.

Thus, as shown in the above cursor declaration, v_curvar and v_newrec have the same record
structure.

SQL> DECLARE
    CURSOR cur_branch IS
        SELECT branchname, COUNT(*) AS no_of_applicant_opted
        FROM applicant c, branch b
        WHERE c.optedbranch = b.branchid
        GROUP BY branchname;
    v_curvar cur_branch%ROWTYPE;
BEGIN
    OPEN cur_branch;
    DBMS_OUTPUT.PUT_LINE('Branch Name No of Application Opted');
    LOOP
        FETCH cur_branch INTO v_curvar;
        EXIT WHEN cur_branch%NOTFOUND;
        DBMS_OUTPUT.PUT(v_curvar.branchname || ' ');
        DBMS_OUTPUT.PUT(v_curvar.no_of_applicant_opted);
        DBMS_OUTPUT.NEW_LINE;
    END LOOP;
    CLOSE cur_branch;
END;
/

9.11. Navigating cursors with WHILE LOOP
The example below demonstrates how to process an explicit cursor with the WHILE construct.
This construct executes a set of statements repeatedly for as long as the specified
condition evaluates to TRUE. Hence we need an explicit cursor attribute that stays TRUE as
long as records can be fetched from the active set. As discussed earlier, cursorname%FOUND is
such an attribute. However, %FOUND is NULL immediately after the cursor is opened and becomes
TRUE only after the first successful fetch.


SQL> DECLARE
    CURSOR cur_branch IS
        SELECT branchname, COUNT(*) AS no_of_applicant_opted
        FROM applicant c, branch b
        WHERE c.optedbranch = b.branchid
        GROUP BY branchname;
    v_curvar cur_branch%ROWTYPE;
BEGIN
    OPEN cur_branch;
    -- First fetch sets %FOUND before the WHILE condition is tested
    FETCH cur_branch INTO v_curvar;
    DBMS_OUTPUT.PUT_LINE('Branch Name No of Application Opted');
    WHILE cur_branch%FOUND
    LOOP
        DBMS_OUTPUT.PUT(v_curvar.branchname || ' ');
        DBMS_OUTPUT.PUT(v_curvar.no_of_applicant_opted);
        DBMS_OUTPUT.NEW_LINE;
        -- Fetch the next record for the next test of the condition
        FETCH cur_branch INTO v_curvar;
    END LOOP;
    CLOSE cur_branch;
END;
/
Hence the FETCH statement has to be written twice: once before the WHILE construct, so that
%FOUND has a value when the condition is first tested, and once inside the WHILE construct,
to move on to the next record.
9.12. Cursor FOR LOOP
The cursor FOR construct helps us to process an explicit cursor and, at the same time,
relieves the PL/SQL programmer from the burden of dealing with cursor operations such as
opening, fetching records, checking the exit condition and closing. These cursor operations
are taken care of implicitly by the cursor FOR construct, allowing the programmer to
concentrate on the implementation of business logic.

Below is the syntax of the cursor FOR loop. The set of statements to be executed repeatedly
is enclosed within LOOP .. END LOOP;. recname is the name of the record variable, which is
implicitly declared for us by the cursor FOR construct. It need not be declared in the
declaration section of the PL/SQL block.


FOR recname IN cursorname
LOOP
..
..
END LOOP;

Thus the operations implicitly taken care of by the cursor FOR loop are:

• Implicit open, fetch, exit condition check and close
• Implicit record variable declaration


DECLARE
    CURSOR cur_branch IS
        SELECT branchname, COUNT(*) AS no_of_applicant_opted
        FROM applicant c, branch b
        WHERE c.optedbranch = b.branchid
        GROUP BY branchname;
BEGIN
    DBMS_OUTPUT.PUT_LINE('Branch Name No of Applicants Opted');
    FOR v_curvar IN cur_branch
    LOOP
        DBMS_OUTPUT.PUT(v_curvar.branchname || ' ');
        DBMS_OUTPUT.PUT(v_curvar.no_of_applicant_opted);
        DBMS_OUTPUT.NEW_LINE;
    END LOOP;
END;
/
As we could see in the above code snippet, v_curvar is a record variable implicitly declared
and cursor operations are implicitly taken care of by the cursor FOR loop construct.
9.13. Implicit cursor FOR LOOP
Not only the record variable but even the cursor itself can be declared implicitly, by
placing the query of the cursor definition in the FOR loop itself. The code snippet below
demonstrates this. Because the cursor is never named, we do not know the name of the private
SQL work area that is set aside for the cursor operation below.


SQL> BEGIN
    DBMS_OUTPUT.PUT_LINE('Branch Name No of Applicants Opted');
    FOR v_curvar IN (SELECT branchname, COUNT(*) AS no_of_applicant_opted
                     FROM applicant c, branch b
                     WHERE c.optedbranch = b.branchid
                     GROUP BY branchname)
    LOOP
        DBMS_OUTPUT.PUT(v_curvar.branchname || ' ');
        DBMS_OUTPUT.PUT(v_curvar.no_of_applicant_opted);
        DBMS_OUTPUT.NEW_LINE;
    END LOOP;
END;
/
Thus the query is placed within parentheses in the FOR loop itself. Apart from this, all the
other implicit cursor operations are also performed when we use the above construct.
9.14. Cursor related predefined oracle server exceptions
INVALID_CURSOR and CURSOR_ALREADY_OPEN are the two cursor-related predefined exceptions
discussed in this section.
9.14.1. INVALID_CURSOR exception
The INVALID_CURSOR predefined exception is thrown in two situations:
• When we try to fetch records from an unopened cursor
• When we try to close a cursor which is already closed

The example below demonstrates the first situation. Cursor C1 identifies the departmentid
values that have corresponding branches, but we try to fetch records from it without opening
it.

SQL> DECLARE
    CURSOR c1 IS
        SELECT departmentid FROM department
        WHERE departmentid IN (SELECT departmentid FROM branch);
    v_departmentid department.departmentid%TYPE;
BEGIN
    -- c1 has not been opened, so this FETCH raises INVALID_CURSOR
    FETCH c1 INTO v_departmentid;
    WHILE c1%FOUND
    LOOP
        DBMS_OUTPUT.PUT_LINE(v_departmentid);
        FETCH c1 INTO v_departmentid;
    END LOOP;
    CLOSE c1;
    COMMIT;
EXCEPTION
    WHEN INVALID_CURSOR THEN
        DBMS_OUTPUT.PUT_LINE('Invalid cursor exception thrown');
END;
/
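
The second situation, closing a cursor that is already closed, can be sketched in the same
way. This is a minimal illustration that assumes the same department table; the second CLOSE
is the statement expected to raise the exception.

```sql
DECLARE
    CURSOR c1 IS SELECT departmentid FROM department;
BEGIN
    OPEN c1;
    CLOSE c1;
    -- c1 is already closed, so this second CLOSE raises INVALID_CURSOR
    CLOSE c1;
EXCEPTION
    WHEN INVALID_CURSOR THEN
        DBMS_OUTPUT.PUT_LINE('Invalid cursor exception thrown');
END;
/
```

Guarding the CLOSE with IF c1%ISOPEN THEN ... END IF; avoids the exception altogether.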

9.14.2. CURSOR_ALREADY_OPEN exception
The example below demonstrates when the CURSOR_ALREADY_OPEN exception is thrown. As we have
learnt, we can open and close a cursor any number of times within a PL/SQL block. But before
reopening the cursor, we have to make sure that it is closed. Opening a cursor that is
already open throws the CURSOR_ALREADY_OPEN exception.

SQL> DECLARE
    CURSOR c1 IS
        SELECT departmentid FROM department
        WHERE departmentid IN (SELECT departmentid FROM branch);
    v_departmentid department.departmentid%TYPE;
BEGIN
    OPEN c1;
    FETCH c1 INTO v_departmentid;
    WHILE c1%FOUND
    LOOP
        -- c1 is still open, so this OPEN raises CURSOR_ALREADY_OPEN
        OPEN c1;
        DBMS_OUTPUT.PUT_LINE(v_departmentid);
        FETCH c1 INTO v_departmentid;
    END LOOP;
    CLOSE c1;
    COMMIT;
EXCEPTION
    WHEN CURSOR_ALREADY_OPEN THEN
        DBMS_OUTPUT.PUT_LINE('Cursor already open exception thrown');
END;
/
9.15. Parameterized cursors
We can pass one or more parameters (arguments) to a parameterized cursor. This lets us
identify different active sets at runtime by passing different input values. The parameters
are passed when the cursor is opened; the values can either be hardcoded, as shown in the
example below, or accepted as input from the user. By coding convention, parameter names
start with p_.

CURSOR cursorname (parameter datatype) IS query;

Every formal parameter mentioned in the cursor declaration should have a corresponding
actual parameter in the OPEN statement, and the datatypes of the formal and actual
parameters should match. Do not specify the size of a formal parameter in the cursor
declaration. For example, even though p_branchid holds a NUMBER of size 3, we declare it as
NUMBER and not as NUMBER(3).

Merely passing the parameters is not enough: they must be used in the WHERE clause of the
SQL query in the cursor declaration, as this is what identifies different active sets for
different inputs.


SQL> DECLARE
    CURSOR c1(p_branchid NUMBER) IS
        SELECT branchid, seatsavailable
        FROM branch
        WHERE branchid = p_branchid;
    v_branchid       branch.branchid%TYPE;
    v_seatsavailable NUMBER(3);
BEGIN
    OPEN c1(1001);   -- actual parameter for p_branchid
    LOOP
        FETCH c1 INTO v_branchid, v_seatsavailable;
        EXIT WHEN c1%NOTFOUND;
        DBMS_OUTPUT.PUT_LINE(v_branchid || ' ' || v_seatsavailable);
    END LOOP;
    CLOSE c1;
END;
/
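
To illustrate how different input values identify different active sets, the sketch below
opens the same parameterized cursor twice through cursor FOR loops. The branch id 1002 is a
hypothetical value chosen for illustration; the branch table is the one used above.

```sql
DECLARE
    CURSOR c1 (p_branchid NUMBER) IS
        SELECT branchid, seatsavailable
        FROM branch
        WHERE branchid = p_branchid;
BEGIN
    -- First active set: rows for branch 1001
    FOR rec IN c1(1001)
    LOOP
        DBMS_OUTPUT.PUT_LINE(rec.branchid || ' ' || rec.seatsavailable);
    END LOOP;

    -- Second active set: rows for branch 1002 (hypothetical id)
    FOR rec IN c1(1002)
    LOOP
        DBMS_OUTPUT.PUT_LINE(rec.branchid || ' ' || rec.seatsavailable);
    END LOOP;
END;
/
```

Each cursor FOR loop opens and closes the cursor implicitly, so the cursor is guaranteed to
be closed before it is reopened with the next argument.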
9.16. Explicit cursor FOR UPDATE

CURSOR cursorname IS SELECT .. FROM .. FOR UPDATE [OF column_reference] [NOWAIT];

The syntax of the SELECT statement in the cursor declaration is the same as we have seen
earlier, with an additional FOR UPDATE clause, which should be the last clause, placed even
after ORDER BY (if any).

When we plan to update the records in the active set, we can use the FOR UPDATE clause in
the SELECT statement of the cursor declaration. It acquires an exclusive row lock on every
record in the active set when the cursor is opened. The rows therefore cannot be modified by
other users who intend to operate on the same set of records. The user who holds the locks
releases them by executing either a COMMIT or a ROLLBACK statement, after which other users
can modify the records.

If some other session has already acquired an exclusive row lock on one or more of the
records, the current session has to wait for the other session to release those locks. This
might lead to an indefinite wait. To avoid this, we can include the NOWAIT clause in the
cursor declaration.

With NOWAIT, Oracle checks whether any of the records is already locked. If so, it does not
wait: it immediately raises an Oracle error (ORA-00054) and, unless the error is handled,
the execution of the PL/SQL block terminates.
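
A minimal sketch of handling the NOWAIT error is shown below. It assumes an emp table with
empno and sal columns; e_rows_locked is a hypothetical exception name, tied to ORA-00054 with
PRAGMA EXCEPTION_INIT so that the block can report the problem instead of terminating.

```sql
DECLARE
    CURSOR c1 IS
        SELECT empno, sal FROM emp FOR UPDATE OF sal NOWAIT;
    e_rows_locked EXCEPTION;                     -- hypothetical name
    PRAGMA EXCEPTION_INIT(e_rows_locked, -54);   -- ORA-00054: resource busy
BEGIN
    -- The row locks are requested when the cursor FOR loop opens c1
    FOR rec IN c1
    LOOP
        UPDATE emp SET sal = sal + 100 WHERE empno = rec.empno;
    END LOOP;
    COMMIT;   -- releases the locks
EXCEPTION
    WHEN e_rows_locked THEN
        DBMS_OUTPUT.PUT_LINE('Rows are locked by another session; try again later');
END;
/
```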

9.17. FOR UPDATE cursor declaration

CURSOR cursorname IS SELECT ... FROM ... FOR UPDATE [OF column_reference] [WAIT n];

We can even instruct Oracle to wait for up to n seconds, as shown in the above syntax. If
the rows are not unlocked within n seconds, an Oracle error is returned.


SQL> DECLARE
    -- FOR UPDATE locks the selected rows when the cursor is opened
    CURSOR c1 IS SELECT empno, sal FROM emp FOR UPDATE OF sal;
BEGIN
    FOR rec IN c1
    LOOP
        UPDATE emp SET sal = sal + 100 WHERE empno = rec.empno;
    END LOOP;
    COMMIT;   -- makes the changes permanent and releases the row locks
END;
/

The above example increments the salary of every employee in the EMP table by 100. The list
of columns after the FOR UPDATE clause specifies the column(s) to be updated.

Do not try to update derived columns or columns based on aggregate functions in the SELECT
query, as this is not possible: Oracle does not allow FOR UPDATE on a query that uses
aggregate functions or GROUP BY. For example, in the code snippet below,
MAX(hoursremaining) is an aggregate function based column, meaning there is no column named
MAX(hoursremaining) in the instructor table.


SQL> DECLARE
  2      CURSOR c1 IS SELECT instructorid, MAX(hoursremaining)
  3      AS maximumhours FROM instructor
  4      GROUP BY instructorid FOR UPDATE OF maximumhours;
  5  BEGIN
  6      FOR rec IN c1
  7      LOOP
  8          DBMS_OUTPUT.PUT_LINE(rec.instructorid ||' '||rec.maximumhours);
  9      END LOOP;
 10  END;
ERROR at line 6:
ORA-06550: line 2, column 76:
PL/SQL: ORA-01786: FOR UPDATE of this query expression is not allowed
ORA-06550: line 2, column 15:
PL/SQL: SQL Statement ignored
9.18. WHERE CURRENT OF clause

WHERE CURRENT OF cursorname;

Whenever we use FOR UPDATE in the cursor declaration, we are allowed to use the WHERE
CURRENT OF clause in the UPDATE statement. This clause states that the update has to be
applied to the current row pointed to by the explicit cursor, in other words, only to the
row that was fetched most recently.

The code snippet below demonstrates the usage of the WHERE CURRENT OF clause.

SQL> DECLARE
    CURSOR c1 IS SELECT empno, sal FROM emp FOR UPDATE OF sal;
BEGIN
    FOR rec IN c1
    LOOP
        -- Updates only the row most recently fetched from c1
        UPDATE emp SET sal = sal + 100 WHERE CURRENT OF c1;
    END LOOP;
    COMMIT;
END;
/
Because the empno column distinguishes one row from another in the active set, we can also
identify the row by empno in the WHERE condition of the UPDATE statement instead of using
WHERE CURRENT OF cursorname, as in our earlier implementation.

Thus SELECT with FOR UPDATE can be implemented even without the WHERE CURRENT OF clause.
Updates are also allowed on columns that are not mentioned in the FOR UPDATE clause, but
this is not a good programming practice.
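
WHERE CURRENT OF works equally with explicit OPEN, FETCH and CLOSE operations. The sketch
below updates only some of the fetched rows; the salary threshold of 1000 is a hypothetical
value used purely for illustration, and the emp table is the one from the example above.

```sql
DECLARE
    CURSOR c1 IS SELECT empno, sal FROM emp FOR UPDATE OF sal;
    v_empno emp.empno%TYPE;
    v_sal   emp.sal%TYPE;
BEGIN
    OPEN c1;                      -- row locks are acquired here
    LOOP
        FETCH c1 INTO v_empno, v_sal;
        EXIT WHEN c1%NOTFOUND;
        IF v_sal < 1000 THEN      -- hypothetical threshold
            -- Applies to the row just fetched into v_empno, v_sal
            UPDATE emp SET sal = sal + 100 WHERE CURRENT OF c1;
        END IF;
    END LOOP;
    CLOSE c1;
    COMMIT;                       -- releases the row locks
END;
/
```

The COMMIT is deliberately placed after the loop: committing releases the row locks, so
fetching again from a FOR UPDATE cursor after a COMMIT is not expected to work.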
10. Transaction processing in PL/SQL
Transaction processing in Oracle allows multiple users to work on the database concurrently.
At the same time, it ensures that each user sees a consistent version of the data and that
all changes are applied in the right order. There is no need to write extra code to prevent
problems with multiple users accessing data concurrently. Oracle uses locks to control
concurrent access to data and locks only the minimal amount of data necessary, for the least
possible time.
10.1. Using COMMIT statement in PL/SQL

The COMMIT statement in PL/SQL marks the end of the current transaction. It can be used in
both the executable section and the exception section. It makes the changes made during the
transaction permanent and visible to all users. Transactions are not tied to PL/SQL
BEGIN ... END blocks: there can be more than one transaction in the same PL/SQL block, and
there can also be a block in which not even one transaction is completed. In short, a block
can contain multiple transactions and a transaction can span multiple blocks.


SQL> BEGIN
    UPDATE emp SET sal = sal + 100 WHERE empno = 7935;
END;

With reference to the above PL/SQL block, other DML statements might have been executed
before this UPDATE statement. Hence there is no assurance that this is the first DML
statement that modifies the database, so it may or may not be the beginning of a
transaction. Moreover, there is no COMMIT statement in the PL/SQL block, so the block does
not end the transaction either.


SQL> DECLARE
    --assume declaration of appropriate variables and exceptions
BEGIN
    COMMIT;   -- ends any transaction already in progress

    --Generation of bill and insertion of record to billing table
    INSERT INTO billing VALUES(1002, 2345610001, 'C2', 62, '21-Mar-09', 'creditcard');

    COMMIT;   -- ends the bill generation transaction

    --updation of stock in the item table
    UPDATE item SET qtyonhand = qtyonhand - 1 WHERE itemid = 'STN001';
    UPDATE item SET qtyonhand = qtyonhand - 1 WHERE itemid = 'BAK001';

    COMMIT;   -- ends the stock update transaction
EXCEPTION
    --assume appropriate exceptions are handled; placeholder handler shown
    WHEN OTHERS THEN
        ROLLBACK;
        RAISE;
END;

With reference to the above PL/SQL block, the first COMMIT statement ends the earlier
transaction. The DML statement that generates the bill starts a new transaction, and the
COMMIT statement beneath it ends the bill generation transaction. The update of stock in the
item table is another transaction, which also ends with a COMMIT statement. Thus a PL/SQL
block can contain multiple transactions.
10.2. Using ROLLBACK statement in PL/SQL
The ROLLBACK statement in a PL/SQL block ends the current transaction and undoes any changes
made during that transaction. Thus the effect of a mistake, such as deleting the wrong row,
can be reversed with this statement, provided the change has not yet been committed. The
statement can be used in both the executable section and the exception section.


SQL> DECLARE
    --assume that the itemid is unique
    v_itemid   item.itemid%TYPE := 'STN003';
    --assume that an itemname called Pen already exists in the ITEM table
    v_itemname item.itemname%TYPE := 'Pen';
    v_itemrec  item%ROWTYPE;
BEGIN
    UPDATE item SET qtyonhand = qtyonhand + 50 WHERE itemid = 'STN001';
    INSERT INTO item(itemid, itemname) VALUES (v_itemid, v_itemname);
    -- Raises TOO_MANY_ROWS if more than one item has this name
    SELECT * INTO v_itemrec FROM item WHERE itemname = v_itemname;
    DBMS_OUTPUT.PUT_LINE('Item name is unique');
EXCEPTION
    WHEN TOO_MANY_ROWS THEN
        DBMS_OUTPUT.PUT_LINE('Item name duplicated');
        ROLLBACK;   -- undoes both the INSERT and the UPDATE
END;

As shown in the above PL/SQL block, if we insert a new item with an existing item name, the
SELECT ... INTO statement finds more than one row and the TOO_MANY_ROWS exception is thrown.
The message Item name duplicated is printed on the screen, after which the transaction is
rolled back, undoing both the insertion and the update.
10.3. Using SAVEPOINT in PL/SQL
The SAVEPOINT statement lets us roll back part of a transaction instead of the whole
transaction. Savepoints are similar to the bookmarks that we place while reading a book: at
any point of time we can return to a particular bookmarked location.

SAVEPOINT names and marks the current point in the processing of a transaction. When we want
to roll back to a specific point, we use the savepoint name with the ROLLBACK statement.


SQL> INSERT INTO emp VALUES( 1004, 6000);

1 row created.

SQL> SAVEPOINT S1;

Savepoint created.

SQL> UPDATE emp SET sal=1000 WHERE empno=1002;

1 row updated.

SQL> SAVEPOINT S2;

Savepoint created.

SQL> DELETE FROM emp WHERE empno=1003;

1 row deleted.

SQL> SAVEPOINT S3;

Savepoint created.

For example, we have performed an insertion, an update and a deletion on the EMP table and
created a savepoint after every DML operation: S1 after inserting an employee record, S2
after updating the sal of an employee record and S3 after deleting an employee record. If we
now simply issue ROLLBACK at the SQL prompt, all the changes made to the EMP table are
undone.

If instead we want to retain the insertion and the update and discard the deletion alone, we
issue ROLLBACK TO S2; at the SQL prompt. Because we have rolled back to S2, any savepoint
created after S2 is erased. Thus S3 is cleared.

If we want to retain the insertion alone, discarding the update and the deletion, we issue
ROLLBACK TO S1;. Because we have rolled back to S1, any savepoint created after S1 is erased.
Thus both S2 and S3 are cleared. The same behavior is exhibited within a PL/SQL block.

Another important point is that savepoint names are undeclared identifiers in a PL/SQL
block: no separate declaration is needed for the names used as savepoints. The number of
savepoints for each session is also unlimited. A savepoint exists only within the
transaction in which it is created; it is erased when that transaction ends with a COMMIT or
a full ROLLBACK.
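
The same sequence can be written inside a PL/SQL block. The sketch below is a minimal
illustration that assumes the emp table has the columns empno and sal and that employees 1002
and 1003 exist; note that the savepoint names s1 and s2 are used without any declaration.

```sql
BEGIN
    INSERT INTO emp (empno, sal) VALUES (1004, 6000);
    SAVEPOINT s1;

    UPDATE emp SET sal = 1000 WHERE empno = 1002;
    SAVEPOINT s2;

    DELETE FROM emp WHERE empno = 1003;

    -- Undo only the DELETE; the INSERT and the UPDATE are retained
    ROLLBACK TO s2;

    -- Make the retained changes permanent and end the transaction
    COMMIT;
END;
/
```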
10.4. Concurrency control

A simple way to think of Oracle read consistency is through the following three rules.

Readers do not wait for writers (or other readers of the same data)

John, the writer, performs an update on a record while Jack, who is reading the same record
at the same time, sees a consistent version of the data. Even though John has updated the
supplier name to XYZ, Jack cannot see this update, as John has not committed. At the same
time, Jack need not wait until John completes the update. This shows that readers do not
wait for writers or other readers of the same data.

Writers do not wait for readers (of the same data)

John, the writer, does not wait until the reader Jack has finished reading a record. While
John is writing a record, Jack can still read the same record, but only the consistent,
committed version of the data is given to him. This shows that writers do not wait for
readers of the same data.

Writers only wait for other writers if they attempt to update identical rows in concurrent
transactions

Two writers cannot update the same record at the same time. Only the user who first gains
exclusive access to the record is allowed to modify it; the others have to wait. As shown in
the above screenshot, John gains exclusive access to the record first and updates the
supplier name. Jack tries to update the same record later, but as John has gained exclusive
access, Jack has to wait until John releases the lock on the record. This shows that writers
only wait for other writers of the same data.


11. On Line Analytical Processing (OLAP)
Data is one of the most valuable assets of any organization or enterprise. The operational
activities of an organization include the day-to-day business processes necessary to run it.
Systems that support such processes are called On Line Transaction Processing (OLTP)
systems. Operational data is highly structured data that is continuously generated and
stored in what are typically called operational, transactional or OLTP databases.

An organization's success also depends on its ability to analyze data and to make
intelligent decisions that would potentially affect its future. Systems that facilitate such
analysis are called On Line Analytical Processing (OLAP) systems.

An OLTP application rarely requires historical data. An OLAP application requires historical
data, because an analysis is generally based on a substantial amount of historical data to
enable trend analysis and future predictions.

An OLTP application is characterized by several users creating, updating or retrieving
individual records, whereas an OLAP application is characterized by higher level views of
the data.

Thus the focus of OLTP and that of OLAP are fundamentally different. The following section
gives the differences between OLTP and OLAP.
11.1. Difference between OLTP and OLAP

                   OLTP                                   OLAP

Definition         On Line Transaction Processing         On Line Analytical Processing

Data               Dynamic (day to day transaction /      Static (historical data)
                   operational data)

Data Atomicity     Data is stored at microscopic level    Data is aggregated or summarized and
                                                          stored at the higher level

Normalization      Normalized databases to facilitate     De-normalized databases to facilitate
                   insertion, deletion and updation       queries and analysis

History            Old data is purged or archived         Historical data stored to enable trend
                                                          analysis and future predictions

Queries            Simple queries and updates             Complex queries
                   Queries use small amounts of data      Queries use large amounts of data
                   (one record or a few records)          Example:
                   Example:                               Total annual sales for north region
                   update account balance                 Total monthly sales for north region
                   enroll for a course

Updates            Updates are frequent                   Updates are infrequent

Response time      Fast response time is important        Transactions are slow
                   Data must be up-to-date, consistent    Queries consume a lot of bandwidth
                   at all times

Joins in queries   Joins are many and complex as tables   Joins are few and simple as tables
                   are normalized                         are de-normalized

                   An OLTP system aims at one specific    An OLAP system integrates data from
                   process                                different processes
                   Example: ordering from an online       Example: combines sales, inventory
                   store                                  and purchasing data

Data models        Complex data models, many tables       Simple data models, fewer tables

Focus              OLTP focuses on performance            OLAP focuses on flexibility and
                                                          broader scope
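
The contrast in the Queries row can be sketched with two statements. Both tables, account
and sales, and all of their columns are hypothetical names assumed only for this
illustration.

```sql
-- OLTP style: touches a single record and must respond quickly
UPDATE account
SET balance = balance - 500
WHERE accountid = 1001;

-- OLAP style: scans and summarizes a large amount of historical data
SELECT EXTRACT(YEAR FROM saledate) AS sale_year,
       SUM(amount)                 AS total_sales
FROM sales
WHERE region = 'North'
GROUP BY EXTRACT(YEAR FROM saledate)
ORDER BY sale_year;
```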

A practical solution to enable analytical processes is to implement a data warehouse.
11.2. Data Warehouse
A data warehouse is a repository which stores integrated information for efficient querying
and analysis. Data warehouse has data collected from multiple, disparate sources of an
organization. It is the basis for decision support and data analysis systems.
11.2.1. Why data warehouse is needed?
• Analysis requires millions of records of data which are historical in nature
• Data is collected from heterogeneous sources (e.g. RDBMS, flat files, etc.)
• Need to make quick and effective strategic decisions

In essence, it is a copy of the organization's operational data, adequately modified to
support the needs of analytical processes and stored outside the operational database.
11.2.2. Characteristics of Data Warehouse:
According to Bill Inmon, known as the father of Data Warehousing, a data warehouse is a
subject oriented, integrated, time-variant, nonvolatile collection of data in support of
management decisions.

• Subject-oriented: means that all data pertinent to a subject/business area are
  collected and stored as a single unit

• Integrated: means that data from multiple disparate sources are transformed and
  stored in a globally accepted fashion

• Static/non-volatile: means data once entered into the warehouse does not change. It
  is periodically added if required

• Time variant: Data warehouse maintains historical data which are used to analyze the
  business or market trends and facilitate future predictions



Figure 11-1: Data warehouse architecture
[Figure labels: Data Sources (Operational databases, Flat Files), ETL Process, Data Warehouse
Server (Data Warehouse, Data Marts), OLAP Servers (ROLAP, MOLAP), Presentation Tier / Output
(Analysis, Reporting, Data Mining)]


11.2.3. Data Warehousing Terminology
Data sources: An organization has many functional units with their own data. Data from all
such sources have to be consolidated and put into a consistent form that reflects the
business of the organization as a whole. These sources of data for a data warehouse are
known as data sources or operational data sources.

Metadata: Metadata is the data about the data. Metadata is the layer of the data warehouse
which stores information such as the source data, transformed data, date and time of data
extraction, target databases, date and time of data loading, etc.

Measure attributes: A measure is a numerical value that can be summarized or aggregated.
Example: Consider an inventory application. Assume that the inventory store sells twenty
products in one day, each for 5 dollars. It thus generates 100 dollars in total sales for
the day. Therefore, sales dollars is one measure. The store owner might also want to know
the number of customers the store had that day: did 5 customers buy 4 products each, or did
one customer buy twenty products? Thus, customer count is another measure.

Dimension attributes: Dimensions can be defined as the perspectives used for looking at the
data. Answering the question "How do you want your data to be seen?" tells you what a
dimension is. Some examples of dimensions are:

• Product
• Time
• Location
• Customer Age
• Customer Income

There is almost always a time dimension on anything which is being analyzed. Considering
the example given for measure attributes, sales of a product can be analyzed by day, by
month, by quarter, by half year or by year. Sales can also be analyzed by category or by
product. The time, product and geographic dimensions are very common.

Data that can be modeled as dimension attributes and measure attributes are called
multidimensional data.
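
The relationship between measures and dimensions can be sketched as a query: the measures are
aggregated, and the dimensions appear in the GROUP BY clause. The sales table and its columns
(productname, saledate, customerid, amount) are hypothetical names assumed for this
illustration.

```sql
SELECT productname,                                   -- product dimension
       TRUNC(saledate, 'MM')      AS sale_month,      -- time dimension (by month)
       SUM(amount)                AS sales_dollars,   -- measure
       COUNT(DISTINCT customerid) AS customer_count   -- measure
FROM sales
GROUP BY productname, TRUNC(saledate, 'MM')
ORDER BY productname, sale_month;
```

Changing the expressions in the GROUP BY clause, for example to a year or a product
category, views the same measures from a different perspective.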

11.2.4. Data Collection for Data Warehouse Applications
Extraction, transformation and loading (ETL): This is the most important step in Data
Warehousing.

Definition of ETL: Extract, Transform and Load together are described as the process of
selecting, migrating, transforming, cleansing and converting mapped data from the
operational environment to the data warehouse environment.

Data needs to be taken from various disparate sources to the data preparation area. This
process is known as data extraction. The data preparation area, also known as the data
staging area, consists of relational tables. Data from the various heterogeneous sources is
altered into a uniform format and put into the relational tables of the data preparation
area, from which it can be readily loaded into the data warehouse database. This process is
known as transformation. The data is then loaded into the fact table and dimension tables in
the data warehouse database. This process is known as loading. Refer to Figure 11-2.

Figure 11-2: Extraction, Transformation and Loading Process
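The three ETL steps can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module as a stand-in for the warehouse database. The two sources, their formats (a semicolon-separated flat file with DD/MM/YYYY dates and an operational record holding amounts in cents) and the table name sales_fact are assumptions made for the sketch.

```python
import csv
import io
import sqlite3

# Two hypothetical sources with different formats.
FLAT_FILE = "cust_id;amount;date\n101;5.00;03/01/2011\n102;7.50;04/01/2011\n"
OPERATIONAL_ROWS = [{"customer": 103, "amount_cents": 1200, "date": "2011-01-05"}]


def extract():
    """Extraction: pull raw records from each source without changing them."""
    flat = list(csv.DictReader(io.StringIO(FLAT_FILE), delimiter=";"))
    return flat, OPERATIONAL_ROWS


def transform(flat, operational):
    """Transformation: convert both sources to (cust_id, amount, ISO date)."""
    staged = []
    for rec in flat:
        day, month, year = rec["date"].split("/")  # assumed DD/MM/YYYY
        staged.append((int(rec["cust_id"]), float(rec["amount"]), f"{year}-{month}-{day}"))
    for rec in operational:
        staged.append((rec["customer"], rec["amount_cents"] / 100, rec["date"]))
    return staged


def load(conn, staged):
    """Loading: write the uniform records into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact "
        "(cust_id INTEGER, amount REAL, sale_date TEXT)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", staged)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(*extract()))
rows = conn.execute(
    "SELECT cust_id, amount, sale_date FROM sales_fact ORDER BY cust_id"
).fetchall()
print(rows)
```

In a real warehouse the staging area would itself be a set of relational tables and the load would target separate fact and dimension tables. The sketch collapses these to keep the three steps visible.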
11.2.5. Storing of data in Data warehouse
Dimensional Modeling: Dimensional modeling is also known as star schema modeling, because it
uses a large central fact table with many dimension tables surrounding it.

Fact Tables: Each data warehouse or data mart includes one or more fact tables. A fact table
is the central table of a star or snowflake schema. This central table captures the data that
measures the organization's business operations. Fact tables usually contain large numbers of
rows.

[Figure 11-2 content: Source Systems -> Data Staging Area -> Data Warehouse. Data is periodically extracted from the source systems, cleansed and transformed in the staging area, and users query the data warehouse.]

One of the main features of a fact table is that it holds numerical data, or facts, which can
be summarized to give information about the operational history of the organization. A
fact table also contains a multipart index whose parts are foreign keys, each referring to
the primary key of a related dimension table. The dimension tables contain the attributes
that describe the fact records. Fact tables should not contain attributes that hold
descriptive information.

Dimension Tables:
The attributes in these tables describe the fact records in the fact table. Some attributes
summarize information that is useful to the analyst, while others provide descriptive
information. Some attributes form hierarchies. For example, a dimension containing
information about products may contain a hierarchy that separates products into categories,
with each category further subdivided by manufacturer.

Cube: OLAP tools allow you to turn data stored in relational databases into meaningful,
easy to navigate business information by creating a data cube. The dimensions of a cube
represent distinct categories for analyzing business data. Categories such as time, geography
or product line breakdowns are typical cube dimensions.

Dimension hierarchies: Refer to Figure 11-3. The product dimension contains individual
products. Products are further divided into categories, and further divided according to
manufacturer. The dimension table stores the hierarchy for the dimension.


Figure 11-3: Dimension Hierarchies
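A dimension hierarchy lets the same measure be summarized at different levels. The sketch below rolls daily sales up the time hierarchy (day, month, quarter, year). The sample figures and the helper names level_key and roll_up are assumptions for illustration only.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales: (day, sales dollars).
daily_sales = [
    (date(2011, 1, 15), 100),
    (date(2011, 2, 10), 50),
    (date(2011, 4, 1), 70),
    (date(2012, 1, 5), 30),
]


def level_key(day, level):
    """Map a day to its ancestor at the requested level of the time hierarchy."""
    quarter = (day.month - 1) // 3 + 1
    if level == "year":
        return (day.year,)
    if level == "quarter":
        return (day.year, quarter)
    if level == "month":
        return (day.year, quarter, day.month)
    if level == "day":
        return (day.year, quarter, day.month, day.day)
    raise ValueError(f"unknown level: {level}")


def roll_up(rows, level):
    """Sum the measure for every member of the chosen hierarchy level."""
    totals = defaultdict(int)
    for day, amount in rows:
        totals[level_key(day, level)] += amount
    return dict(totals)


print(roll_up(daily_sales, "quarter"))
print(roll_up(daily_sales, "year"))
```

Moving from "day" towards "year" is a roll-up; moving in the opposite direction is a drill-down.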

Available Schemas for dimensional modeling:

Star schema
Snowflake Schema

[Figure 11-3 content: the product hierarchy runs Products -> Products_Category -> Products_Manufacturer; the time hierarchy runs Year -> Quarter -> Month -> Day.]

Star Schema: It is the simplest data warehouse schema and it resembles a star. The center of
the star consists of one or more fact tables, and the points radiating from the center are
the dimension tables. Refer to Figure 11-4.

Figure 11-4: Star Schema
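A small star schema can be sketched with Python's built-in sqlite3 module. The table names (fact_sales, dim_product, dim_time), the columns and the sample rows are assumptions chosen for illustration. The fact table holds only the measure and the foreign keys to the dimension tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, day TEXT, month TEXT, year INTEGER);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    time_id INTEGER REFERENCES dim_time(time_id),
    sales_dollars REAL,
    PRIMARY KEY (product_id, time_id)
);
INSERT INTO dim_product VALUES
    (1, 'Pen', 'Stationery'), (2, 'Ink', 'Stationery'), (3, 'Mug', 'Kitchen');
INSERT INTO dim_time VALUES
    (1, '2011-01-03', '2011-01', 2011), (2, '2011-02-07', '2011-02', 2011);
INSERT INTO fact_sales VALUES (1, 1, 100.0), (2, 1, 40.0), (3, 1, 25.0), (1, 2, 60.0);
""")

# One join per dimension is enough, because each dimension is a single table.
by_category = conn.execute("""
    SELECT p.category, SUM(f.sales_dollars)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()

by_month = conn.execute("""
    SELECT t.month, SUM(f.sales_dollars)
    FROM fact_sales f
    JOIN dim_time t ON t.time_id = f.time_id
    GROUP BY t.month
    ORDER BY t.month
""").fetchall()

print(by_category)
print(by_month)
```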

Snowflake Schema: It is a more complex data warehouse schema than the star schema. The
snowflake schema consists of a single, central fact table, which is surrounded by normalized
dimension hierarchies. Each level of a dimension hierarchy is represented in its own table.
Refer to Figure 11-5.

Figure 11-5: Snowflake Schema

Disadvantages of Snowflake Schema:
• It increases the number of dimension tables
• It requires more foreign key joins
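Both disadvantages show up in a small sketch. Here the product dimension is normalized into two tables, so a report by category needs two joins instead of one. The table names, columns and sample rows are assumptions for illustration, and sqlite3 stands in for the warehouse database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The category level of the product hierarchy gets its own table.
CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    name TEXT,
    category_id INTEGER REFERENCES dim_category(category_id)
);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    sales_dollars REAL
);
INSERT INTO dim_category VALUES (1, 'Stationery'), (2, 'Kitchen');
INSERT INTO dim_product VALUES (1, 'Pen', 1), (2, 'Ink', 1), (3, 'Mug', 2);
INSERT INTO fact_sales VALUES (1, 100.0), (2, 40.0), (3, 25.0), (1, 60.0);
""")

# Reaching the category level now takes an extra foreign key join.
by_category = conn.execute("""
    SELECT c.category, SUM(f.sales_dollars)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_category c ON c.category_id = p.category_id
    GROUP BY c.category
    ORDER BY c.category
""").fetchall()

print(by_category)
```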

[Figure 11-4 content: a central Fact Table surrounded by four Dimension Tables.]
[Figure 11-5 content: a central Fact Table (e.g. Sales) linked to a product dimension normalized into Products, Products_Category and Products_Manufacturer tables, and to a second dimension normalized into Countries, Customers and Cities tables.]

11.2.6. Reporting of a Data warehouse application
A data mart is a subset of a data warehouse which focuses on a single area of data and it is
organized for quick analysis. It can be a small data warehouse itself.
11.2.6.1. Advantages of Data Marts:
• It focuses on presentation rather than the organization of data
• It facilitates data reporting
• It provides meaningful reports to the users pertaining to their business area, thereby
  allowing them to view and concentrate only on the data that is related to their
  business area
  Example: providing sales data to the sales department, providing financial data to
  the financial department
• It makes the data design simpler and easier. It breaks the whole design into several
  smaller sub units, which is beneficial to the customers and the team that is involved in
  development. It is also easier to maintain.
• Reporting of data becomes faster and more efficient because reporting is generally
  done at the sub unit level and data marts assist in faster retrieval compared to
  querying the entire data warehouse
• It helps in incrementally building up the enterprise data warehouse
• It helps to ensure security



Figure 11-6: Each end user works with a focused subset of Data Warehouse called Data Mart

[Figure 11-6 content: the Data Warehouse feeds Data Mart1, Data Mart2 and Data Mart3, which are used by End User 1, End User 2 and End User 3 respectively.]

Several data marts can be built, each for a particular business area provided they all conform
to the data warehouse architecture from where they get the data for reporting. Data marts
can be used in conjunction with each other. Refer to Figure 11-6.
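One simple way to picture a data mart is as a subject-specific extract of the warehouse. The sketch below builds two such extracts with sqlite3. The table names (warehouse_facts, sales_mart, finance_mart), the subject_area column and the sample values are assumptions for illustration, and real data marts are often separate databases rather than tables in the same one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE warehouse_facts (subject_area TEXT, metric TEXT, value REAL);
INSERT INTO warehouse_facts VALUES
    ('sales', 'units_sold', 20),
    ('sales', 'sales_dollars', 100),
    ('finance', 'expenses', 65),
    ('finance', 'profit', 35);

-- Each mart holds only the rows for one business area.
CREATE TABLE sales_mart AS
    SELECT metric, value FROM warehouse_facts WHERE subject_area = 'sales';
CREATE TABLE finance_mart AS
    SELECT metric, value FROM warehouse_facts WHERE subject_area = 'finance';
""")

sales_rows = conn.execute("SELECT metric, value FROM sales_mart ORDER BY metric").fetchall()
finance_rows = conn.execute("SELECT metric, value FROM finance_mart ORDER BY metric").fetchall()

print(sales_rows)
print(finance_rows)
```

A sales department user querying sales_mart sees only sales data, which is what makes reporting faster and access easier to restrict.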

11.2.7. Difference between Data Warehouse and Data Mart
1. Data Warehouse: A data warehouse is a repository which stores integrated information
   from multiple disparate sources for efficient querying and analysis.
   Data Mart: A data mart is a subset of a data warehouse which focuses on a single area
   of data and is organized for quick analysis.

2. Data Warehouse: It mainly focuses on the organization of data and places little focus
   on the presentation of data.
   Data Mart: It focuses mainly on the presentation of data to the customers rather than
   the way in which the data is organized in the data warehouse.

3. Data Warehouse: There is usually a central data warehouse system.
   Data Mart: There can be several data marts that operate on the central data warehouse.

4. Data Warehouse: It is used on an enterprise level.
   Data Mart: It is used on a business division / department level.

5. Data Warehouse: It contains data from heterogeneous sources for analysis.
   Data Mart: It contains only the required subject specific data for local analysis.
11.2.8. Popular tools available for data warehousing

Reporting / Analysis Tools:
• MicroStrategy: DSS Agent / Server
• Cognos: Impromptu
• Brio Technology: Brio Query
• Seagate Software: Crystal Reports
• MS-SQL Server 2005: SQL Server Reporting Services (SSRS)

ETL:
• Oracle Warehouse Builder
• Informatica: PowerCenter
• Acta: ActaWorks
• MS-SQL Server 2005: SQL Server Integration Services (SSIS)

Databases:
• MDDB
  ◦ Oracle
  ◦ MS-SQL Server 2005: SQL Server Analysis Services (SSAS)
11.3. Summary
• An OLAP application requires historical data, because analysis is generally based on
  a substantial amount of historical data to enable trend analysis and future predictions
• A data warehouse is a repository which stores integrated information for efficient
  querying and analysis
• The extract, transform and load (ETL) process is the process of selecting,
  migrating, transforming, cleansing and converting mapped data from the operational
  environment to the data warehouse environment
• A data mart is a subset of a data warehouse which focuses on a single area of data and
  is organized for quick analysis
• The star schema is the simplest data warehouse schema. It resembles a star. The center of
  the star consists of one or more fact tables, and the points radiating from the center
  are the dimension tables
• The snowflake schema consists of a single, central fact table, which is surrounded by
  normalized dimension hierarchies

Appendix-A

Boyce Codd Normal Form (BCNF)
A relation is said to be in Boyce Codd Normal Form (BCNF) if and only if all the determinants
are candidate keys. Every BCNF relation is also in 3NF (BCNF is a stronger form of 3NF), but
not every 3NF relation is in BCNF.

Let us understand this concept using a slightly different RESULT table structure.

Student#  EmailID            Course#  Marks
101       Davis@myuni.edu    M4       82
102       Daniel@myuni.edu   M4       62
101       Davis@myuni.edu    H6       79
103       Sandra@myuni.edu   C3       65
104       Evelyn@myuni.edu   B3       77
102       Daniel@myuni.edu   P3       68
105       Susan@myuni.edu    P3       89
103       Sandra@myuni.edu   B4       54
105       Susan@myuni.edu    H6       87
104       Evelyn@myuni.edu   M4       65
RESULT Table

Overlapping Candidate Key
In the RESULT table, we have two candidate keys, namely (Student#, Course#) and (Course#,
EmailID). Course# is common to both candidate keys. Hence these candidate keys are called
overlapping candidate keys.

The non-key attribute Marks is non-transitively and fully functionally dependent on the key
attributes. Hence the table is in 3NF. But it is not in BCNF, because there are four
determinants in this relation, namely:
• Student# (Student# decides EmailID)
• EmailID (EmailID decides Student#)
• Student# Course# (decides the rest of the attributes in the RESULT table)
• Course# EmailID (decides the rest of the attributes in the RESULT table)

Not all of the above determinants are candidate keys. EmailID decides Student#, but EmailID
on its own is not a candidate key. Similarly, Student# decides the EmailID of a student, but
Student# alone is not a candidate key. Only the combinations Student# Course# and Course#
EmailID are candidate keys.
To make this table BCNF, we need to split this table into the following structure:


STUDENT TABLE
Student#  EmailID
101       Davis@myuni.edu
102       Daniel@myuni.edu
103       Sandra@myuni.edu
104       Evelyn@myuni.edu
105       Susan@myuni.edu

RESULT TABLE
Student#  Course#  Marks
101       M4       82
102       M4       62
101       H6       79
103       C3       65
104       B3       77
102       P3       68
105       P3       89
103       B4       54
105       H6       87
104       M4       65
Boyce Codd Normal Form
Now both tables are not only in 3NF but also in BCNF, because all the determinants are
candidate keys. In the first table, Student# decides EmailID and EmailID decides Student#, and
both are candidate keys.

In the second table, Student# Course# is the only determinant and it is a candidate key. Hence
the table satisfies the BCNF definition that every determinant must be a candidate key.
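The dependencies discussed above can be checked against the sample rows with a short Python sketch. The helper name determines is an assumption for illustration. Note that a check on sample data can only show that a dependency is consistent with those rows; whether it really holds is decided by the business rules.

```python
COLUMNS = ("Student#", "EmailID", "Course#", "Marks")
RESULT = [
    (101, "Davis@myuni.edu", "M4", 82),
    (102, "Daniel@myuni.edu", "M4", 62),
    (101, "Davis@myuni.edu", "H6", 79),
    (103, "Sandra@myuni.edu", "C3", 65),
    (104, "Evelyn@myuni.edu", "B3", 77),
    (102, "Daniel@myuni.edu", "P3", 68),
    (105, "Susan@myuni.edu", "P3", 89),
    (103, "Sandra@myuni.edu", "B4", 54),
    (105, "Susan@myuni.edu", "H6", 87),
    (104, "Evelyn@myuni.edu", "M4", 65),
]


def determines(rows, lhs, rhs):
    """True if rows that agree on the lhs columns always agree on the rhs columns."""
    seen = {}
    for row in rows:
        record = dict(zip(COLUMNS, row))
        left = tuple(record[c] for c in lhs)
        right = tuple(record[c] for c in rhs)
        if seen.setdefault(left, right) != right:
            return False
    return True


# Student# is a determinant (it decides EmailID) ...
print(determines(RESULT, ("Student#",), ("EmailID",)))
# ... but it does not decide the whole row, so it is not a candidate key.
print(determines(RESULT, ("Student#",), COLUMNS))
# The composite Student# Course# does decide the whole row.
print(determines(RESULT, ("Student#", "Course#"), COLUMNS))

# The BCNF decomposition: one table per determinant.
student_table = sorted({(row[0], row[1]) for row in RESULT})
result_table = [(row[0], row[2], row[3]) for row in RESULT]
print(student_table)
```

After the split, each student's EmailID is stored once in student_table instead of once per course.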


Note: If the table has only one non-composite candidate key and if it
is in 3NF, then the table will also be in BCNF.

In essence, 2NF and 3NF remove the redundancy and anomalies that exist between the key and
non-key attributes, whereas BCNF removes the redundancy and anomalies that exist among the
key attributes. At Infosys, we rarely (in around 1% of database designs) normalize
databases to BCNF.


Embedded SQL

Purpose
Embedded SQL statements are used to blend SQL statements directly into a program written in
a host programming language such as C, Pascal, COBOL, FORTRAN or PL/I.

The following techniques are used to embed the SQL statements:

• SQL statements are intermixed with statements of the host language in the source
  program. This embedded SQL source program is submitted to a SQL precompiler,
  which processes the SQL statements
• Variables of the host programming language can be referenced in the embedded SQL
  statements, allowing values calculated by the program to be used by the SQL
  statements
• Program language variables are also used by the embedded SQL statements to receive
  the results of SQL queries, allowing the program to use and process the retrieved
  values
• Special program variables are used to assign NULL values to database columns and to
  support the retrieval of NULL values from the database

Why Embedded SQL?
SQL has the following limitations:

• No provision to declare variables
• No unconditional branching/jump statement
• No IF statement to test conditions
• No FOR, DO or WHILE statements to construct loops
• No block structure

In order to understand an embedded SQL program, one has to be familiar with the following
terms:

EXEC SQL: Every embedded SQL statement begins with an introducer that flags it as a SQL
statement. The IBM SQL products use the introducer exec sql for most host languages.

SQLCA: The sqlca (SQL Communications Area) is a data structure that contains error variables
and status indicators. By examining the SQLCA, the application program can determine the
success or failure of its embedded SQL statements.

exec sql include sqlca;
This statement tells the SQL precompiler to include a SQL Communications Area in the
program.

As the RDBMS executes each embedded SQL statement, it sets the value of the variable sqlcode in
the SQLCA to indicate the completion status of the statement.
• A sqlcode of zero indicates successful completion of the statement
• A negative sqlcode indicates a serious error that prevented the statement from
  executing correctly
• A positive sqlcode indicates a warning condition. The most common warning, with a
  value of 100, is the out-of-data warning returned when a program tries to retrieve the
  next row of query results and no more rows are left to retrieve.
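The three cases can be summarized as a small decision function. The sketch below is written in Python purely to show the logic. The function name classify_sqlcode and the returned labels are assumptions for illustration; a real embedded SQL program would test sqlca.sqlcode directly in the host language.

```python
def classify_sqlcode(sqlcode):
    """Interpret a sqlcode value according to the rules listed above."""
    if sqlcode == 0:
        return "success"
    if sqlcode < 0:
        return "error"
    if sqlcode == 100:
        return "out of data"  # the most common warning
    return "warning"


for code in (0, -104, 100, 1):
    print(code, classify_sqlcode(code))
```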

Host variables: A host variable is a program variable. It is declared using the data types of
the programming language such as C and manipulated by programming language
statements. A host variable is also used in embedded SQL statements to store/retrieve data
to/from the database. To identify the host variable, the variable is prefixed by a colon (:)
when it appears in an embedded SQL statement. A host variable can appear in an embedded
SQL statement wherever a constant can appear.

The two embedded SQL statements begin declare section and end declare section bracket the
host variable declarations and are non-executable.

Use of host variables to store data into the database:
• The input provided by the user through the standard input device is stored in the host
  variables
• The values of the host variables are then written to the database using the INSERT SQL
  statement

Use of host variables to retrieve data from the database:
• The data values retrieved from the database using the SELECT SQL statement are held
  in the host variables
• The contents of the host variables are then displayed on the standard output device
  using functions such as printf() in C

Indicator variables: To store NULL values in the database or retrieve NULL values from the
database, embedded SQL allows each host variable to have a companion host indicator
variable. In an embedded SQL statement, the host variable and the indicator variable
together specify a single SQL-style value, as follows:
• An indicator value of zero indicates that the host variable contains a valid value
• A negative indicator value indicates that the host variable should be assumed to have
  a NULL value; the actual value of the host variable is irrelevant and should be
  disregarded
• A positive indicator value indicates that the host variable contains a valid value, which
  may have been rounded off or truncated

A host variable is immediately followed by the name of the corresponding indicator variable.
Both variable names are preceded by a colon.

Example: A simple embedded SQL program written in C.

Problem statement: This program asks the customer for his Cust_ID, retrieves his record from
the Customer_Details table and displays it on the standard output device.

#include <stdio.h>

/* inclusion of the SQL Communications Area in the program */
exec sql include sqlca;

int main(int argc, char *argv[])
{
    /* declaration of the HOST VARIABLES */
    exec sql begin declare section;

    char Mem_Cust_ID[5];
    char Mem_Cust_Last_Name[25];
    char Mem_Account_No[5];
    char Mem_Bank_Branch[25];
    char Mem_Cust_Email[30];
    short iBank_Branch;    /* indicator variable for Mem_Bank_Branch */

    exec sql end declare section;

    /* Prompt the user for Customer ID (at most 4 characters fit in the buffer) */
    printf("Enter Customer ID: ");
    scanf("%4s", Mem_Cust_ID);

    /* execute the SQL query */
    /* HOST VARIABLES are preceded by a colon (:) e.g. :Mem_Cust_ID */
    /* A HOST VARIABLE may be followed by a companion */
    /* host indicator variable */
    /* e.g. :Mem_Bank_Branch :iBank_Branch */
    /* The INTO clause follows the select list and precedes FROM */
    exec sql SELECT Cust_ID, Cust_Last_Name,
                    Account_No, Bank_Branch, Cust_Email
             INTO :Mem_Cust_ID,
                  :Mem_Cust_Last_Name,
                  :Mem_Account_No,
                  :Mem_Bank_Branch :iBank_Branch,
                  :Mem_Cust_Email
             FROM Customer_Details
             WHERE Cust_ID = :Mem_Cust_ID;

    /* Display the retrieved data */
    /* sqlca.sqlcode contains status information
       of the embedded SQL statement executed */
    if (sqlca.sqlcode == 0) {
        printf("Customer ID: %s\n", Mem_Cust_ID);
        printf("Customer Name: %s\n", Mem_Cust_Last_Name);
        printf("Account No.: %s\n", Mem_Account_No);

        /* checking the value of the INDICATOR VARIABLE */
        if (iBank_Branch < 0) {
            printf("Bank Branch is NULL\n");
        }
        else {
            printf("Bank Branch: %s\n", Mem_Bank_Branch);
        }

        printf("Customer Email: %s\n", Mem_Cust_Email);
    }
    else if (sqlca.sqlcode == 100) {
        printf("No customer with that Customer ID.\n");
    }
    else {
        printf("SQL error: %ld\n", (long) sqlca.sqlcode);
    }

    /* returns success code to the operating system */
    return 0;
}

Timestamping
Another concurrency management technique is timestamping. Every resource in the database is
associated with the timestamps of its last successful read and last successful write (the
time of occurrence to millisecond precision, e.g. 12th December 2004 11:22:33.345).

Let us consider:

• An RDBMS author named Hanu is modifying this course material as one transaction
• Trainees are reading this course material as another transaction

If another RDBMS author, say Seema, also modifies the same course material at the same
time, it leads to lost-update and phantom record conditions.

If Hanu starts modifying while trainees are studying this material, it leads to dirty read or
incorrect summary problems.

To avoid these problems we can follow these two rules:

Hanu can start the course modification transaction only if:
• The last modification of the course material completed successfully before this
  transaction started
• No trainees are currently reading this course material

Similarly, trainees can start a transaction to read the course material only if:
• The course material was last successfully updated before they started reading it

Let us consider an example of database DB_BANK_DETAILS as discussed earlier. In table
ACC_DETAILS, a particular row is read successfully by transaction BalanceEnquiry at
12:11:45.345 of 12th December 1945. This will be the last read timestamp of this row. If any
other transaction reads this row after this time, that particular time will be the last read
timestamp of the row.

Similarly, every row has a last updated timestamp. If transaction BalanceUpdate updates
the row R1 at 13:32:22.345 of 12th December 1945, this is recorded as the last updated
timestamp of the row R1.

A transaction can read only rows or columns that were last updated by an older transaction;
if not, the transaction is rolled back.

Let us assume that Row R7 of the table ACC_DETAILS was successfully updated at 09:24:22.46
Hrs of 15th August 1947 by some transaction.

Any transaction started after 09:24:22.46 Hrs of 15th August 1947 can read this row.

Transactions started before 09:24:22.46 Hrs of 15th August 1947 need to be rolled back and
started afresh to read this data.

In general, the condition for a read can be defined as TS > TU, where TS is the start time of
the transaction and TU is the last successful update timestamp of the resource.

A transaction can update only rows or columns that were last read and updated by older
transactions; otherwise the transaction is rolled back.

Similarly, any transaction can update row R7 only if it started after both the last successful
update and the last successful read.

Assume a transaction started at 10:24:23.49 Hrs of 15th August 1947 and wishes to update
row R7 at 10:29:11.34 Hrs of 15th August 1947.

It is possible to update this row only if the row R7 was successfully updated and read before
10:24:23.49 Hrs of 15th August 1947.

If row R7 was read or updated after 10:24:23.49 Hrs of 15th August 1947, this transaction
cannot change the value of the row and is rolled back.

The generic rule for updating data is TS > TU and TS > TR, where TS is the transaction start
timestamp, TU is the last successful update timestamp and TR is the last successful read
timestamp.
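The read rule (TS > TU) and the update rule (TS > TU and TS > TR) can be sketched as follows. This is a simplified illustration, not how any particular RDBMS implements timestamping: the class and function names are assumptions, times are plain numbers, and the recorded read and update timestamps are the times of the operations, as in the examples above.

```python
from dataclasses import dataclass


class Rollback(Exception):
    """Raised when a transaction violates a timestamp rule and must start afresh."""


@dataclass
class Row:
    value: int
    last_update: float  # TU: time of the last successful update
    last_read: float    # TR: time of the last successful read


def read(row, ts, now):
    """Read the row for a transaction that started at ts; requires TS > TU."""
    if not ts > row.last_update:
        raise Rollback("row was updated after this transaction started")
    row.last_read = max(row.last_read, now)
    return row.value


def update(row, ts, now, value):
    """Update the row for a transaction that started at ts; requires TS > TU and TS > TR."""
    if not (ts > row.last_update and ts > row.last_read):
        raise Rollback("row was read or updated after this transaction started")
    row.value = value
    row.last_update = now


r7 = Row(value=500, last_update=100.0, last_read=0.0)
print(read(r7, ts=110.0, now=115.0))  # allowed: 110 > TU (100); TR becomes 115

try:
    update(r7, ts=112.0, now=120.0, value=0)  # 112 > TU but not > TR (115)
except Rollback as exc:
    print("rolled back:", exc)

update(r7, ts=116.0, now=120.0, value=600)  # allowed: 116 > TU and 116 > TR
print(r7)
```

No lock is ever taken in this scheme, which is why it cannot deadlock. The price is the Rollback path, which is taken whenever an older transaction arrives late.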

The biggest advantage of timestamping is that it cannot lead to a deadlock, as no resources
are locked.

However, the timestamping technique leads to a large number of rollbacks. For this reason it
is not implemented as the concurrency control mechanism in most commercial RDBMS
packages.


Note: Almost all the commercial RDBMS packages use a locking technique as
the concurrency controlling mechanism while maintaining the consistency in
the system.

Glossary
Abstract: Conceptual/theoretical object.

Abstraction: A simplified representation of something that is potentially quite complex. It is often not
necessary to know the exact details of how something works, is represented or is implemented,
because it can still be used in its simplified form.

Ambiguity: Uncertainty.

Anomalies: Irregularities.

Anomaly: A departure from the expected; an abnormality.

Atomic: The smallest level to which data can be broken down while still remaining meaningful.

Attribute: The literal meaning is quality, characteristic, trait or feature. Entities get their meaning in
a database with the help of a set of attributes. For example, in the bank system, Cust_ID,
Cust_Email etc. describe the Customer-Detail entity set.

Backup: A second copy of a file or a set of files, to be used if the primary or main file(s) are
destroyed or corrupted. Backups are essential for all data, although taking them is one of the most
trivial tasks. For critical work, two backup sets are recommended.

Business rules: The rules or the policies which govern the functioning of the application.

Business users: The users who own the application.

Cardinality of a relation: It is the number of rows or tuples in a table.

Centralized: Systems where the flow of data, the beginning of activities and decision making are
initiated at the same central point and spread to other remote points in the organization.

Conceptual: To generalize abstract ideas from specific instances.

Concurrent Access: Performing two (or more) operations on the same piece of data at the same time.

Constraints: Restriction, limitation.

Data manipulation: Data manipulation refers to the insertion of new data, modification of existing
data, etc.

Data Redundancy: The same data is stored in more than one place in a database.

Decomposable: Further split or reduce.

Degree of a relation: It is the number of attributes or columns in a table.

Distinct: Not identical.

Distributed: A computer system is said to be distributed when the different components
and objects of an application can be situated on different computers that are connected to a
network.

Encryption: The process of transforming data in such a way that it can be interpreted only by
the intended users and not by anyone else.

End User: The person for whom a system is being developed. Example: a bank teller or a bank manager
is an end user of a bank system.

Entity: An entity is a thing or object in the real world that is distinguishable from other objects.
Example: employee is an entity, and book can be considered to be another entity.

Flat files: Files containing records that have no structured interrelationship. The files used in
programming fundamentals (PF) projects were essentially flat files.

Fourth Generation Language (4GL): A 4GL is typically non-procedural and designed so that end users
can specify what they want without having to know how the computer will process their requirement.

Grant Privilege: To assign a privilege to a user or to a group.

Heterogeneous: Diverse, mixed, varied.

Heterogeneous Network: A network that consists of network interface cards, servers, workstations,
operating systems and applications from many vendors, all working together as a single unit.
The network usually uses different media and different types of protocols on different network links.

Homogeneous: All the same, uniform, harmonized.

Homogeneous Network: A network that is composed of systems of similar architecture and runs a single
network layer protocol.

Inconsistency: Lacking uniformity or agreement.

Instance: Occurrence.

Integrated: United into a larger unit. Something, which is brought together in order to form a working
whole in a satisfactory manner.

Integrity Constraints: It is a set of rules to ensure the correctness and accuracy of data.

Interrelated: Interconnected.

Intuitive: Natural.

Iterative: Process of repeating the same task.

Jargon: The specialized or technical language of a profession or a trade.

Main Memory: This concept is discussed in the OS course. All read and write operations happen in
main memory before they are written to hard disks.

Model: A representation or a scaled down structure of an object.

Page: It is part of a table. Usually in one page multiple rows are stored.

Participating entities: The entities which are joined by the relation.

Queries: A request that a user makes on the database.

Recovery: Restoration, return to an original state.

Requirement specification: A document which contains requirement for a specific application.

Revoke Privilege: Cancel, withdraw.

Schema: A description of a database. It specifies (among other things) the relations or Tables, their
attributes or columns, and the domains of the attributes.

Semantic: Meaning.

Shared: It is a type of database access, which allows multiple users to log on to the database at the
same time.

Simulate: To make a model.

Site: Geographical location.

Software application designer: The person who designs software applications.

SQL: (Structured Query Language). It is a language, which is used by relational databases to request or
to query, or to update and manage data.

Static: Something which does not change. (Example: the typical web page is static in which the
content of the webpage does not change until the owner of the web page or the web master physically
alters the document.)

Superset: Given two sets, X and Y, we say X is a superset of Y if all the elements of Y are also elements
of X. Every set is a superset of itself. Every set is a superset of the empty set.

Table: It is a two dimensional space having columns and rows. A table contains a specified number of
attributes or columns but can have any number of records or rows.

Tablespace: The logical part of the database which represents collection of the structures like tables,
etc created by various users.

Tangible: Physical object.

Transaction: A set of processing steps that are considered as a single activity or unit of work to
achieve a desired result. In a DBMS, a collection of processing steps that forms a single logical unit of work
is called a transaction. A database system ensures proper execution of transactions despite failures:
either the entire transaction executes, or none of it does.

Transient: Temporary, transitory, momentary.

Transitive: In-direct.

Tuple: A mathematical term for a finite sequence of n terms. For example, the sequence (1, 2, 3, 4) is a
four-tuple. A tuple is the equivalent of a record. In an RDBMS, the rows of a table are its tuples.

Unauthorized: Not permitted, illegal, unlawful.

View: A virtual table in the database defined by a query.

Index

A
ABORT .............................................. 137
ACID ................................................ 139
ALTER TABLE ....................................... 80
Application Programmer ........................ 21
Attribute ............................................ 37
B
Bottom-Up .......................................... 55
Boyce Codd Normal Form ....................... 249
C
Candidate Key ...................................... 27
Cardinality of a Relation ......................... 26
Cardinality of relationship ....................... 37
Cartesian product ................................ 118
Centralized ......................................... 15
CHECK OPTION .................................... 128
Check-Points ...................................... 160
COMMIT ............................................ 137
Conceptual / Logical level ...................... 18
Concurrency ....................................... 141
Concurrent Access ................................... 4
Concurrent Access Anomalies ................. 13
Co-Related Sub-Queries ......................... 114
CREATE TABLE ..................................... 75
Cube ................................................ 244
D
Data Control Language (DCL) ................... 129
Data Definition Language (DDL)................. 74
Data Isolation ...................................... 12
Data Manipulation Language (DML)............. 87
Data Model .......................................... 22
Data Redundancy ................................. 11
Data Security ...................................... 10
Data Warehouse .................................. 240
Database .............................................. 2
Database Administrator ......................... 21
Database Management System .................... 1
DBMS Interface....................................... 8
Deadlock ........................................... 154
Deferred Update .................................. 157
Degree of a Relation .............................. 26
DELETE .............................................. 90
Derived Attribute ................................. 41
Determinant ........................................ 56
Dimension Hierarchies ......................... 244
Dimension Tables ................................ 244
Dimensional Modeling .......................... 243
Dirty Read ......................................... 143
Distributed .......................................... 15
Domain Integrity ................................. 140
DROP TABLE ........................................ 83
DROP VIEW ......................................... 126
E
Embedded SQL .................................... 251
End User ............................................ 21
Entity ................................................ 37
Entity Integrity Constraint ...................... 29
Exclusive Lock ..................................... 146
EXISTS .............................................. 124
External / View level ............................ 18
F
Fact Tables ........................................ 243
File System ........................................... 6
First Normal Form ................................. 60
Flat Files .............................................. 6
Foreign Key ......................................... 30
Full Functional Dependency ................... 58
Functionally Dependent ........................ 57
G
GRANT .............................................. 130
GROUP BY .......................................... 103
H
HAVING ............................................. 107
Heterogeneous ..................................... 17
Hierarchical Data Model .......................... 23
Homogeneous ...................................... 17
Horizontal View ................................... 126
I
Immediate Update ............................... 158
Incorrect Summary ............................... 143
Independent Sub-Queries ....................... 112
Index ................................................. 84
INNER JOINS ....................................... 120
INSERT ............................................... 87
Integrated ............................................. 3
Intent Locking ..................................... 149
Internal / Physical level ......................... 19
Interrelated .......................................... 2
INTERSECT ......................................... 111
J
Joins ................................................ 118
K
Key Attribute .................................. 40, 59
L
Lack of Flexibility ................................ 13
LEFT OUTER JOIN ................................ 122
Lost Update ........................................ 142
M
Many to Many Relationship ...................... 39
Many to One Relationship ........................ 39
Master File ............................................ 8
Multivalued Attribute ............................ 41
N
Network Model ..................................... 24
Non-Key Attributes ............................... 33
Normalization ..................................... 55
NOT EXISTS ........................................ 125
O
Object Based Logical Model ..................... 23
On Line Analytical Processing (OLAP) ......... 239
One to Many Relationship ........................ 38
One to One Relationship ......................... 38
Overlapping Candidate Keys .................. 249
P
Partially Dependent ............................... 59
Participating Entities ............................. 52
Phantom Record .................................. 144
Primary Key ......................................... 29
Program/Data Dependence ..................... 12
R
RDBMS ............................................... 26
Record Based Logical Model ............... 23, 175
Recursive Relationship .......................... 41
Referential Constraint ............................ 31
Relational Database .............................. 26
Relational Model ................................... 25
Relationship ....................................... 37
REVOKE ............................................. 131
RIGHT OUTER JOIN ............................... 123
ROLLBACK ......................................... 137
S
Second Normal Form .............................. 61
SELECT ............................................... 93
SELF JOIN .......................................... 119
Self Referencing .................................. 31
Shared Intent Exclusive ......................... 150
Shared Lock ........................................ 146
Sharing ................................................ 3
Snowflake Schema .............................. 245
SQL ................................................... 71
Star Schema ....................................... 244
Super Key .......................................... 33
T
Third Normal Form ................................ 64
Timestamping ..................................... 254
Top-Down Approach ............................. 42
Transaction Log ................................... 156
Transactions .......................................... 1
Transitive Dependency ........................... 59
TRUNCATE TABLE .................................. 84
U
UNION ............................................... 109
V
Vertical View ...................................... 126
View ................................................. 126
W
WHERE ............................................... 94
