
DATABASE

What is a Database?
A database can be defined as a structured collection of interrelated data that is stored in a computer. Typically a database:
Is shared by multiple applications/users
Is independent of any single application
Reduces redundancy

Database Applications:
Banking: All transactions
Airlines: reservations, schedules
Universities: registration, grades
Sales: customers, products, purchases
Online retailers: order tracking, customized recommendations
Manufacturing: production, inventory, orders, supply chain
Human resources: employee records, salaries, tax deductions

Database vs. File Systems

Database management system
A database management system (DBMS) is computer software designed for the purpose of creating and managing databases.
Some common DBMSs: Oracle, DB2, Microsoft Access, Microsoft SQL Server, Postgres, MySQL and FileMaker.
A DBMS contains information about a particular enterprise:
Collection of interrelated data
Set of programs to access the data

Advantages of DBMS
In the early days, database applications were built directly on top of file systems. Advantages of a DBMS over a file system include:
Minimal data redundancy
Promotes data consistency
Promotes integration of data
Allows concurrent access to data
Allows enforcement of standards
Ease of application development
Better security
Ensures atomicity of transactions

Levels of Abstraction
Physical level: describes how a record (e.g., customer) is stored.
Logical level: describes data stored in the database, and the relationships among the data.
type customer = record
    customer_id : string;
    customer_name : string;
    customer_street : string;
    customer_city : string;
end;
View level: application programs hide details of data types. Views can also hide information (such as an employee's salary) for security purposes.
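The view level can be sketched with Python's standard-library sqlite3 module and an in-memory database. This is only an illustration: the table name employees, the view name employee_public and the sample row are assumptions, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full record, including the sensitive salary column.
conn.execute(
    "CREATE TABLE employees ("
    " employee_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " salary INTEGER NOT NULL)"
)
conn.execute("INSERT INTO employees VALUES (1, 'Ashok', 50000)")

# View level: expose only the non-sensitive columns (hypothetical view name).
conn.execute(
    "CREATE VIEW employee_public AS"
    " SELECT employee_id, name FROM employees"
)

cursor = conn.execute("SELECT * FROM employee_public")
view_columns = [col[0] for col in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```

A user who is given access only to the view never sees the salary column, while the underlying table is unchanged.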

Data independence
Data independence is the separation of data definition from the program, allowing the developer to change the data definition without changing the program.
Levels:
Physical data independence
Logical data independence

Data Independence
Logical Data Independence: The capacity to change the conceptual schema without having to change the external schemas and their application programs.
Physical Data Independence: The capacity to change the internal schema without having to change the conceptual schema.

Instance and schema


The overall design of the database is called the schema. It includes the structural description of the type of facts held in that database.
Example: the database consists of information about a set of customers and accounts and the relationship between them.
Analogous to type information of a variable in a program
Physical schema: database design at the physical level
Logical schema: database design at the logical level

Instance: the collection of data stored in the database at a particular moment

Data Models
There are a number of different ways of organizing a schema, that is, of modeling the database structure: these are known as database models (or data models).
A data model includes tools for representing:
Data
Data relationships
Data semantics
Data constraints

Database management systems are usually categorized according to the data model that they support: Relational, Hierarchical, Network, Object-oriented, Object-relational.

Database Models
Conceptual models: logical nature of data representation; the emphasis is on what entities are represented; used as a blueprint for database design
Implementation models: emphasis on how the data are represented in the database

RDBMS
Relational DataBase Management
Systems

Data models: Relational model
A relational model uses a collection of tables to represent both data and relationships.
Three key terms are used extensively in relational database models: relations, attributes, and domains. A relation is a table with columns and rows. The named columns of the relation are called attributes, and the domain is the set of values the attributes are allowed to take.

Name   City  Phone
Smith  Alto  192346
Jones  Park  120943
John   Park  291239

Relational Database Concept
Dr. E. F. Codd proposed the relational model for database systems in 1970.
It is the basis for the relational database management system (RDBMS).
The relational model consists of

Relational Database Properties


A relational database:
Can be accessed and modified by executing structured query language (SQL) statements
Contains a collection of tables with no physical pointers


Relational Database Model


Advantages
Structural independence: the data access path is irrelevant to database design; changing the structure will not affect the database
Improved conceptual simplicity
Easier database design, implementation, management, and use
Ad hoc query capability with SQL (a 4GL is added)
Powerful database management system

Relational Database Model


Disadvantages
Substantial hardware and system software overhead
Ease of use can lead to poor design and implementation

Data Models: Hierarchical model
In a hierarchical data model, data are organized into a tree-like structure. The structure allows repeating information using parent/child relationships: each parent can have many children, but each child has only one parent. Hierarchical relationships between different types of data can make it very easy to answer some questions, but very difficult to answer others. If a one-to-many relationship is violated (e.g. a patient can have more than one physician), then the hierarchy becomes a network.

[Figure: hierarchical model example; node labels Cname, phno, address, Bookid]

Hierarchical Database Model


Logically represented by an upside down tree

1:M relationship

Hierarchical Database Model


Advantages
Conceptual simplicity: the relationship between layers is logically simple; the design process is simple
Efficiency in 1:M relationships and when users require large numbers of transactions
Dominant in the 1970s, when mainframe systems with large databases were in use

Hierarchical Database Model


Disadvantages
Complex implementation: requires knowledge of physical data storage characteristics; database design is complicated
Difficult to manage and lacks standards
Lacks structural independence
Application programming and use are complex (pointer based)
Implementation limitations: in particular, it handles only 1:M relationships

Data Models: Network model
The network model is a database model conceived as a flexible way of representing objects and their relationships. Where the hierarchical model structures data as a tree of records, with each record having one parent record and many children, the network model allows each record to have multiple parent and child records, forming a lattice structure.
Each record can have multiple parents
Introduces sets to describe relationships
Each set has an owner record and a member record, parallel to parent and child in the hierarchical database model (HDM)
A member may have several owners
One-ownership

Network Database Model


Advantages
Conceptual simplicity, just like HDM
Data access flexibility
Promotes database integrity
Data independence
Conformance to standards

Network Database Model


Disadvantages
System complexity
Lack of structural independence

Network Database Model


Member may have several owners

Data Models: Object relational model
An object-relational database (ORD) or object-relational database management system (ORDBMS) is a relational database management system that allows developers to integrate the database with their own custom data types and methods.
Extend the relational data model by including object orientation and constructs to deal with added data types.
Allow attributes of tuples to have complex types, including non-atomic values such as nested relations.
Preserve relational foundations, in particular the declarative access to data, while extending modeling power.
Provide upward compatibility with existing relational languages.

Storage Management
The storage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.
The storage manager is responsible for the following tasks:
Interaction with the file manager
Efficient storing, retrieving and updating of data
Issues:
Storage access
File organization

Transaction Management
A transaction is a collection of operations that performs a single logical function in a database application.
The transaction-management component ensures that the database remains in a consistent (correct) state despite system failures (e.g., power failures and operating system crashes) and transaction failures.
The concurrency-control manager controls the interaction among the concurrent transactions, to ensure the consistency of the database.

Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation
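The three steps above can be observed, in a limited way, with Python's standard-library sqlite3 module. The sketch below is an assumption-laden illustration: the employees table and the index name idx_emp_dept are hypothetical, and EXPLAIN QUERY PLAN is SQLite's own way of reporting the plan its optimizer chose; other systems expose this differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    " employee_id INTEGER PRIMARY KEY,"
    " last_name TEXT,"
    " department_id INTEGER)"
)
# Hypothetical index that gives the optimizer an alternative to a full scan.
conn.execute("CREATE INDEX idx_emp_dept ON employees (department_id)")

query = "SELECT last_name FROM employees WHERE department_id = 90"

# Steps 1 and 2: the statement is parsed and a plan is chosen.
# EXPLAIN QUERY PLAN reports that plan without running the query.
plan_rows = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan_rows)
print(plan_text)

# Step 3: evaluation of the chosen plan (the table is empty here).
result = conn.execute(query).fetchall()
print(result)
```

The exact wording of the plan text varies between SQLite versions, so only the presence of the index name should be relied on.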

NORMAL FORMS

Normalization
Normalization is the process of efficiently organizing data in a database.
There are two goals of the normalization process:
Eliminating redundant data
Ensuring data dependencies make sense
Both goals reduce the amount of space a database consumes and ensure that data is logically stored.

FIRST NORMAL FORM


Unnormalized table (repeating project values per employee):
EMP ID  ENAME      PRJID       PRJNAME             DEPT  DEPTNAME
E001    Ashok      P001, P003  SQL DVLP, ORACLE    D001  DATABASE
E002    Baskar     P002        .NET                D002  ADV PROG
E004    Catherine  P004        C++                 D003  PROG
E005    Domnic     P005, P006  SYS-ADMIN, SQL DBA  D004  TECH-ADMIN

Table in first normal form:
EMP ID  ENAME      PRJID  PRJNAME    DEPT  DEPTNAME
E001    Ashok      P001   SQL DVLP   D001  DATABASE
E002    Baskar     P002   .NET       D002  ADV PROG
E001    Ashok      P003   ORACLE     D001  DATABASE
E004    Catherine  P004   C++        D003  PROG
E005    Domnic     P005   SYS-ADMIN  D004  TECH-ADMIN
E005    Domnic     P006   SQL DBA    D004  TECH-ADMIN

All data is held precisely in individual rows and columns, OR separate tables are created for each group of related data and each row is identified with a unique column or set of columns.
Remove all duplicate attributes.
Data doesn't have any dependencies.

SECOND NORMAL FORM


Table in first normal form:
EMP ID  ENAME      PRJID  PRJNAME    DEPT  DEPTNAME
E001    Ashok      P001   SQL DVLP   D001  DATABASE
E002    Baskar     P002   .NET       D002  ADV PROG
E001    Ashok      P003   ORACLE     D001  DATABASE
E003    Catherine  P004   C++        D003  PROG
E004    Domnic     P005   SYS-ADMIN  D004  TECH-ADMIN
E004    Domnic     P006   SQL DBA    D004  TECH-ADMIN

Tables in second normal form:
EMP ID  ENAME
E001    Ashok
E002    Baskar
E003    Catherine
E004    Domnic

PRJID  PRJNAME    DEPT  DEPTNAME    EMP ID
P001   SQL DVLP   D001  DATABASE    E001
P002   .NET       D002  ADV PROG    E002
P003   ORACLE     D001  DATABASE    E001
P004   C++        D003  PROG        E003
P005   SYS-ADMIN  D004  TECH-ADMIN  E004
P006   SQL DBA    D004  TECH-ADMIN  E004

Data follows 1NF, and logically related data is brought together and is dependent on the whole key.
Unrelated data is put in separate tables.
Data is functionally dependent on the key.

3rd NORMAL FORM


Tables in second normal form:
PRJID  PRJNAME    DEPT  DEPTNAME    EMP ID
P001   SQL DVLP   D001  DATABASE    E001
P002   .NET       D002  ADV PROG    E002
P003   ORACLE     D001  DATABASE    E001
P004   C++        D003  PROG        E003
P005   SYS-ADMIN  D004  TECH-ADMIN  E004
P006   SQL DBA    D004  TECH-ADMIN  E004

EMP ID  ENAME
E001    Ashok
E002    Baskar
E003    Catherine
E004    Domnic

Tables in third normal form:
DEPTID  DEPTNAME
D001    DATABASE
D002    ADV PROG
D003    PROG
D004    TECH-ADMIN

PRJID  PRJNAME    EMP ID
P001   SQL DVLP   E001
P002   .NET       E002
P003   ORACLE     E001
P004   C++        E003
P005   SYS-ADMIN  E004
P006   SQL DBA    E004

EMP ID  ENAME      DEPT
E001    Ashok      D001
E002    Baskar     D002
E003    Catherine  D003
E004    Domnic     D004

Data is in 2NF, and every non-key column depends on the primary key itself, not on a partial key or on another non-key column.
Remove columns that are not dependent upon the primary key.
Unrelated data is separated.
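As a rough check that the third normal form decomposition loses no information, the three tables can be loaded into an in-memory SQLite database and joined back into the flat rows. The data values are taken from the slides; the lowercase table and column names (department, employee, project, emp_id and so on) are naming assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (deptid TEXT PRIMARY KEY, deptname TEXT NOT NULL);
CREATE TABLE employee (
    emp_id TEXT PRIMARY KEY,
    ename TEXT NOT NULL,
    dept TEXT NOT NULL REFERENCES department(deptid)
);
CREATE TABLE project (
    prjid TEXT PRIMARY KEY,
    prjname TEXT NOT NULL,
    emp_id TEXT NOT NULL REFERENCES employee(emp_id)
);
INSERT INTO department VALUES
    ('D001', 'DATABASE'), ('D002', 'ADV PROG'),
    ('D003', 'PROG'), ('D004', 'TECH-ADMIN');
INSERT INTO employee VALUES
    ('E001', 'Ashok', 'D001'), ('E002', 'Baskar', 'D002'),
    ('E003', 'Catherine', 'D003'), ('E004', 'Domnic', 'D004');
INSERT INTO project VALUES
    ('P001', 'SQL DVLP', 'E001'), ('P002', '.NET', 'E002'),
    ('P003', 'ORACLE', 'E001'), ('P004', 'C++', 'E003'),
    ('P005', 'SYS-ADMIN', 'E004'), ('P006', 'SQL DBA', 'E004');
""")

# Joining the three tables rebuilds the six rows of the 1NF table.
flat_rows = conn.execute("""
    SELECT e.emp_id, e.ename, p.prjid, p.prjname, d.deptid, d.deptname
    FROM project p
    JOIN employee e ON e.emp_id = p.emp_id
    JOIN department d ON d.deptid = e.dept
    ORDER BY p.prjid
""").fetchall()
for row in flat_rows:
    print(row)
```

Each department name is now stored once, yet the join still yields one row per project with the employee and department details attached.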

BOYCE CODD NORMAL FORM


USED IN CASES WHERE:
Multiple candidate keys exist
Multiple composite keys exist
Candidate keys overlap each other

A table in BCNF is in 3rd normal form and also resolves all (or some) of the cases mentioned above.

BOYCE CODD NORMAL FORM
Before:
EMP ID  ENAME      DEPT  CITY     EMAIL      CITYCD
E001    Ashok      D001  CHENNAI  Ashok      C001
E002    Baskar     D002  MUMBAI   Baskar     C002
E003    Catherine  D003  CHENNAI  Catherine  C001
E004    Domnic     D004  KOLKATA  Domnic     C003

After:
EMP ID  ENAME      DEPT  CITYCD  EMAIL
E001    Ashok      D001  C001    Ashok
E002    Baskar     D002  C002    Baskar
E003    Catherine  D003  C001    Catherine
E004    Domnic     D004  C003    Domnic

CITY     CITYCD
CHENNAI  C001
MUMBAI   C002
KOLKATA  C003

Why NOT normalize?
Joins are expensive
Normalized design is difficult
Quick and dirty should be quick and dirty

What is Denormalization?
Products
Prod_ID  ProdName  UnitPrice
1001     ABCD      10
1002     EFGH      13
1003     IJKL      12
1004     MNOP      15

Sales_Order
Prod_id  Qty
1001     23
1002     21
1003     4

Sales_Prof
Prod_id  Profit_Pctg
1001     1.34
1002     1.45
1003     1.56

SELECT p.Prod_id, ProdName, UnitPrice * Qty * Profit_Pctg AS Total_Price
FROM Products p JOIN Sales_Order s ON p.Prod_id = s.Prod_id
JOIN Sales_Prof sp ON p.Prod_id = sp.Prod_id
I need to join multiple tables to get the total price of a product, which may take a long time.

What is Denormalization?
Product_Details
Prod_ID  ProdName  UnitPrice  Qty     Profit_Pctg
1001     ABCD      10         23      1.34
1002     EFGH      13         21      1.45
1003     IJKL      12         4       1.56
1004     MNOP      15         (NULL)  1.23

SELECT Prod_ID, ProdName, (UnitPrice * Qty) * Profit_Pctg AS Total_Price
FROM Product_Details
Instead, I shall introduce all the columns into one single table (even if they cause redundancy), just to get faster output.
Denormalization: intentional introduction of redundancy in data to improve query performance.
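The trade-off can be tried out with Python's standard-library sqlite3 module. The sketch below loads the slide data into an in-memory database, builds the wide Product_Details table once with a join, and then compares the join query against the single-table query. The Sales_Prof row for product 1004 is taken from the Product_Details table; building the wide table with CREATE TABLE ... AS SELECT is an assumption about how one might populate it, not something the slides prescribe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (Prod_ID INTEGER PRIMARY KEY, ProdName TEXT, UnitPrice INTEGER);
CREATE TABLE Sales_Order (Prod_id INTEGER, Qty INTEGER);
CREATE TABLE Sales_Prof (Prod_id INTEGER, Profit_Pctg REAL);
INSERT INTO Products VALUES
    (1001, 'ABCD', 10), (1002, 'EFGH', 13), (1003, 'IJKL', 12), (1004, 'MNOP', 15);
INSERT INTO Sales_Order VALUES (1001, 23), (1002, 21), (1003, 4);
INSERT INTO Sales_Prof VALUES (1001, 1.34), (1002, 1.45), (1003, 1.56), (1004, 1.23);

-- Denormalized copy: one wide table built once from the three source tables.
CREATE TABLE Product_Details AS
SELECT p.Prod_ID, p.ProdName, p.UnitPrice, s.Qty, sp.Profit_Pctg
FROM Products p
LEFT JOIN Sales_Order s ON p.Prod_ID = s.Prod_id
LEFT JOIN Sales_Prof sp ON p.Prod_ID = sp.Prod_id;
""")

# Normalized design: every request pays for two joins.
normalized = conn.execute("""
    SELECT p.Prod_ID, p.ProdName, p.UnitPrice * s.Qty * sp.Profit_Pctg AS Total_Price
    FROM Products p
    JOIN Sales_Order s ON p.Prod_ID = s.Prod_id
    JOIN Sales_Prof sp ON p.Prod_ID = sp.Prod_id
    ORDER BY p.Prod_ID
""").fetchall()

# Denormalized design: the same answer from a single table.
denormalized = conn.execute("""
    SELECT Prod_ID, ProdName, UnitPrice * Qty * Profit_Pctg AS Total_Price
    FROM Product_Details
    WHERE Qty IS NOT NULL
    ORDER BY Prod_ID
""").fetchall()

print(normalized == denormalized)
```

The price of the faster read is that Product_Details has to be kept in step with the source tables whenever they change.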

ACID PROPERTIES
ATOMICITY
Atomicity requires that each transaction is "all or nothing": if one part of the transaction fails, the entire transaction fails, and the database state is left unchanged. An atomic system must guarantee atomicity in each and every situation, including power failures, errors, and crashes.
CONSISTENCY
The consistency property ensures that any transaction will bring the database from one valid state to another. Any data written to the database must be valid according to all defined rules, including but not limited to constraints, cascades, triggers, and any combination thereof.
ISOLATION
The isolation property ensures that the concurrent execution of transactions results in a system state that could have been obtained if transactions were executed serially, i.e. one after the other.
DURABILITY
Durability means that once a transaction has been committed, it will remain so, even in the event of power loss, crashes, or errors. In a relational database, for instance, once a group of SQL statements execute, the results need to be stored permanently (even if the database crashes immediately thereafter).
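Atomicity and consistency can be illustrated with a small money-transfer sketch using Python's standard-library sqlite3 module. Everything here is an assumption made for illustration: the accounts table, the transfer helper and the CHECK constraint that forbids negative balances. The connection is used as a context manager, which commits when the block succeeds and rolls back when it raises.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(connection, src, dst, amount):
    """Move amount from src to dst as one all-or-nothing transaction."""
    try:
        with connection:  # commit on success, roll back on any exception
            connection.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            connection.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(connection):
    return connection.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()


# The debit would make account 1 negative, so the earlier credit is undone too.
failed = transfer(conn, 1, 2, 500)
balances_after_failed = balances(conn)

ok = transfer(conn, 1, 2, 30)
balances_after_ok = balances(conn)
print(failed, balances_after_failed, ok, balances_after_ok)
```

The credit is deliberately applied before the debit so that the failing case shows a partial change being rolled back rather than never happening.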

ENTITY-RELATIONSHIP
DIAGRAMS
(ER-DIAGRAMS)

Entity Relationship Model


The ER model is a conceptual data model that views the real world as entities and relationships.
Describes data as entities, attributes and relationships:
nouns = entities
adjectives = attributes
verbs = relationships

Entity Relationship Model


Advantages
Exceptional conceptual simplicity
Visual representation
It maps well to the relational model. The constructs used in the ER model can easily be transformed into relational tables.
It is simple and easy to understand with a minimum of training. Therefore, the model can be used by the database designer to communicate the design to the end user.

Entity Relationship Database Model


Disadvantages
Limited constraint representation
Limited relationship representation (internal relationships cannot be depicted; multiple relationships)
No data manipulation language (the model is not complete)
Loss of information content

ELEMENTS IN ER DIAGRAMS

Entity
Entities are the principal data objects about which information is to be collected. Entities are usually recognizable concepts, either concrete or abstract, such as persons, places, things, or events which have relevance to the database. Some specific examples of entities are EMPLOYEES, PROJECTS, INVOICES.
Entities are classified as independent or dependent. An independent entity is one that does not rely on another for identification.

Entity and Entity Types


Entity Type: Course (attributes: Name, Number, Topic)
Entity:
Number: 3C13
Name: Database and Information Management Systems
Topic:

Attributes

Each entity has a set of associated properties that describe the entity. These properties are known as attributes.
Attributes can be classified as identifiers or descriptors. Identifiers, more commonly called keys, uniquely identify an instance of an entity. A descriptor describes a non-unique characteristic of an entity instance.
Attributes can also be classified as:
Simple or Composite
Single or Multi-valued
Stored or Derived
NULL

Attributes
Simple: Start Date (entity Professor)
Composite: Name, made up of First and Last (entity Professor)

Attributes
Single: Employee ID# (entity Professor)
Multi-Valued: Email (entity Professor)

Attributes
Stored: Start Date (entity Professor)
Derived: Years Teaching (entity Professor)
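The difference between a stored and a derived attribute can be sketched in Python. In this illustrative example, Start Date is kept as a field while Years Teaching is computed on demand; the class layout, the method name years_teaching and the sample dates are assumptions, not part of the slides.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Professor:
    employee_id: int
    start_date: date  # stored attribute

    def years_teaching(self, today: date) -> int:
        """Derived attribute: whole years elapsed since start_date."""
        had_anniversary = (today.month, today.day) >= (
            self.start_date.month,
            self.start_date.day,
        )
        return today.year - self.start_date.year - (0 if had_anniversary else 1)


prof = Professor(employee_id=1, start_date=date(2010, 9, 1))
print(prof.years_teaching(date(2020, 8, 31)))  # one day before the 10th anniversary
print(prof.years_teaching(date(2020, 9, 1)))   # on the 10th anniversary
```

Because the value is recomputed from the stored date, it can never drift out of step with it, which is the usual reason for not storing derived attributes.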

Primary Keys
Entity: Professor, attribute: Employee ID
Employee ID is the primary key
Primary keys must be unique for the entity in question

Relationships
A relationship represents an association between two or more entities.
Examples of relationships:
employees are assigned to projects
projects have subtasks
departments manage one or more projects
Relationships can have attributes to define them.
Relationships are classified in terms of degree, connectivity, cardinality, and participation.

Relationships

[Figure: Professor, teaches, Course]

Weak entity
Weak entities do not have key attributes of their own.
Weak entities cannot exist without a relationship to another entity.
A partial key is the portion of the key that comes from the weak entity. The rest of the key comes from the other entity in the relationship.
Weak entities always have total participation as they cannot exist without the identifying relationship.

Weak Entity -Representation


[Figure: weak entity Payment (partial key Payment ID), identifying relationship "Made for", entity Loan (key Loan-No)]
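One common way to carry a weak entity into tables is a composite primary key made of the owner's key plus the partial key. The sketch below does this for Payment and Loan with Python's standard-library sqlite3 module. The column names, the amount column, the sample loan number and the ON DELETE CASCADE rule are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE loan (loan_no TEXT PRIMARY KEY);
CREATE TABLE payment (
    loan_no TEXT NOT NULL REFERENCES loan(loan_no) ON DELETE CASCADE,
    payment_id INTEGER NOT NULL,          -- partial key
    amount INTEGER NOT NULL,
    PRIMARY KEY (loan_no, payment_id)     -- owner key + partial key
);
INSERT INTO loan VALUES ('L-1');
INSERT INTO payment VALUES ('L-1', 1, 100), ('L-1', 2, 150);
""")

# Total participation: a payment for a loan that does not exist is rejected.
try:
    conn.execute("INSERT INTO payment VALUES ('L-404', 1, 50)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Existence dependency: deleting the loan removes its payments as well.
conn.execute("DELETE FROM loan WHERE loan_no = 'L-1'")
remaining_payments = conn.execute("SELECT COUNT(*) FROM payment").fetchone()[0]
print(orphan_rejected, remaining_payments)
```

Payment ID values only need to be unique within one loan, which is exactly what the composite key expresses.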

Sample problem:
Design an E-R schema for a database to store info about
professors, courses and course sections indicating the
following:
The name and employee ID number of each professor
The salary and email address(es) for each professor
How long each professor has been at the university
The course sections each professor teaches
The name, number and topic for each course offered
The section and room number for each course section
Each course section must have only one professor
Each course can have multiple sections

Solution:
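[Figure: E-R diagram]
Professor: Employee ID (key), Name (First, Last), Email (multi-valued), Salary, Start Date, Years Teaching (derived)
teaches: relationship between Professor and Section (N on the Section side)
Section: Section ID, Room
Part of: relationship between Section and Course (N on the Section side)
Course: Number, Name, Topic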
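One possible translation of this E-R design into relational tables is sketched below with Python's standard-library sqlite3 module. It is not the only valid mapping: the table and column names, the separate professor_email table for the multi-valued attribute, and treating Section as identified by its course plus a section number are all assumptions. Years Teaching is left out because it can be derived from the start date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE professor (
    employee_id INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name TEXT NOT NULL,
    salary INTEGER,
    start_date TEXT NOT NULL
);
CREATE TABLE professor_email (            -- multi-valued attribute
    employee_id INTEGER NOT NULL REFERENCES professor(employee_id),
    email TEXT NOT NULL,
    PRIMARY KEY (employee_id, email)
);
CREATE TABLE course (
    number TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    topic TEXT
);
CREATE TABLE section (
    course_number TEXT NOT NULL REFERENCES course(number),
    section_id INTEGER NOT NULL,
    room TEXT,
    employee_id INTEGER NOT NULL REFERENCES professor(employee_id),
    PRIMARY KEY (course_number, section_id)
);
INSERT INTO professor VALUES (1, 'Example', 'Professor', 1000, '2010-09-01');
INSERT INTO professor_email VALUES (1, 'a@example.org'), (1, 'b@example.org');
INSERT INTO course VALUES ('3C13', 'Database and Information Management Systems', NULL);
INSERT INTO section VALUES ('3C13', 1, 'R1', 1), ('3C13', 2, 'R2', 1);
""")

section_count = conn.execute(
    "SELECT COUNT(*) FROM section WHERE course_number = '3C13'"
).fetchone()[0]

# "Each course section must have only one professor": a single NOT NULL
# foreign key column means a section without a professor is rejected.
try:
    conn.execute(
        "INSERT INTO section (course_number, section_id, room) VALUES ('3C13', 3, 'R3')"
    )
    missing_professor_rejected = False
except sqlite3.IntegrityError:
    missing_professor_rejected = True
print(section_count, missing_professor_rejected)
```

The professor, email and room values are placeholders; only the course number and name come from the slides.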

SQL (STRUCTURED QUERY LANGUAGE)

SQL QUERYING
SQL commands are used to extract data from, or store data in, any database server.
They are of multiple types:
DDL (Data Definition Language)
DML (Data Manipulation Language)
DCL (Data Control Language)
TCL (Transaction Control Language)


DDL STATEMENTS
CREATE
ALTER
DROP

CREATE QUERY
SYNTAX:
CREATE TABLE <TABLE-NAME>
( <COL-1> <DATA-TYPE> <CONSTRAINTS>,
  ...
  <COL-N> <DATA-TYPE> <CONSTRAINTS>
);
<TABLE-NAME> IS USER DEFINED.

CREATE QUERY
EXAMPLE:
CREATE TABLE hire_dates
(id NUMBER(8),
hire_date DATE DEFAULT SYSDATE);

RESULT:
CREATE TABLE succeeded.

ALTER QUERY
SYNTAX
To rename the table from t1 to t2:
ALTER TABLE <TABLE-NAME>
RENAME TO <NEW-TABLE-NAME>;
IN PLACE OF RENAME, THIS COULD BE ANOTHER OPERATION SUCH AS DROP, ADD OR MODIFY.

DROP QUERY
SYNTAX
DROP TABLE <TABLE-NAME>;
THE QUERY DROPS THE TABLE: BOTH ITS STRUCTURE AND ITS DATA.
DROP CAN BE PERFORMED ON VIEWS & TRIGGERS TOO.

TRUNCATE QUERY
SYNTAX
TRUNCATE TABLE <TABLE-NAME>;
THE QUERY TRUNCATES THE DATA, LEAVING THE STRUCTURE INTACT.
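The DDL statements above can be tried end to end with Python's standard-library sqlite3 module. Note the assumptions: SQLite's dialect differs from the Oracle-style syntax on the slides, so CURRENT_DATE stands in for SYSDATE, and because SQLite has no TRUNCATE statement a DELETE without WHERE is used to empty the table. The new table name hire_history and the note column are made up for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def table_names(connection):
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


# CREATE: define the table structure.
conn.execute("CREATE TABLE hire_dates (id INTEGER, hire_date TEXT DEFAULT CURRENT_DATE)")
conn.execute("INSERT INTO hire_dates (id) VALUES (1)")
after_create = table_names(conn)

# ALTER: rename the table, then add a column.
conn.execute("ALTER TABLE hire_dates RENAME TO hire_history")
conn.execute("ALTER TABLE hire_history ADD COLUMN note TEXT")
after_alter = table_names(conn)
columns = [row[1] for row in conn.execute("PRAGMA table_info(hire_history)")]

# TRUNCATE substitute: remove every row but keep the structure.
conn.execute("DELETE FROM hire_history")
rows_after_truncate = conn.execute("SELECT COUNT(*) FROM hire_history").fetchone()[0]

# DROP: remove the table itself.
conn.execute("DROP TABLE hire_history")
after_drop = table_names(conn)
print(after_create, after_alter, columns, rows_after_truncate, after_drop)
```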

DML STATEMENTS
SELECT
INSERT
UPDATE
DELETE

SELECT QUERY
SYNTAX:
RETURNS ALL COLS & ROWS:
SELECT *
FROM <TABLE-NAME>;

ONLY SELECTED COLS & ALL ROWS:
SELECT <COL-NAME-1>, .., <COL-NAME-N>
FROM <TABLE-NAME>;

ONLY SELECTED ROWS & COLS ARE DISPLAYED:
SELECT <COL-NAME-1>, .., <COL-NAME-N>
FROM <TABLE-NAME>
WHERE <CONDITION>;

SELECT QUERY
EX-1:
SELECT *
FROM employees;
EX-2:
SELECT employee_id, last_name, job_id, department_id
FROM employees;
EX-3:
SELECT employee_id, last_name, job_id, department_id
FROM employees
WHERE department_id = 90;

INSERT QUERY
SYNTAX:
INSERT INTO <TABLE-NAME>
(<COL-NAME-1>, <COL-NAME-2>)
VALUES
(<VALUE-1>, <VALUE-2>);
EXAMPLE:
INSERT INTO departments
(department_id, department_name)
VALUES
(30, 'Purchasing');
RESULT:
1 rows inserted

UPDATE QUERY
SYNTAX:
UPDATE table
SET column = value [, column = value, ...]
[WHERE condition];
EXAMPLE:
UPDATE employees
SET department_id = 70
WHERE employee_id = 113;
RESULT:
1 rows updated

DELETE QUERY
SYNTAX:
DELETE [FROM] table
[WHERE condition];
EXAMPLE:
DELETE FROM departments
WHERE department_name = 'Finance';
RESULT:
1 rows deleted
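The four DML statements can be run together against an in-memory SQLite database using Python's standard-library sqlite3 module. The statements mirror the examples above; the trimmed table definitions and the starting rows (a Finance department and an employee 113 with the placeholder name Smith) are assumptions needed to make the sketch self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (department_id INTEGER PRIMARY KEY, department_name TEXT);
CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, last_name TEXT, department_id INTEGER);
INSERT INTO departments VALUES (10, 'Finance');
INSERT INTO employees VALUES (113, 'Smith', 10);
""")

# INSERT: cursor.rowcount reports how many rows the statement affected.
inserted = conn.execute(
    "INSERT INTO departments (department_id, department_name) VALUES (?, ?)",
    (30, "Purchasing"),
).rowcount

# UPDATE
updated = conn.execute(
    "UPDATE employees SET department_id = 70 WHERE employee_id = 113"
).rowcount

# DELETE
deleted = conn.execute(
    "DELETE FROM departments WHERE department_name = 'Finance'"
).rowcount

# SELECT with a WHERE condition
selected = conn.execute(
    "SELECT employee_id, last_name, department_id FROM employees WHERE department_id = 70"
).fetchall()

conn.commit()
print(inserted, updated, deleted, selected)
```

The value 30 and the name Purchasing are passed as parameters rather than pasted into the SQL text, which is the usual way to avoid quoting mistakes.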

DCL STATEMENTS
GRANT
REVOKE

GRANT STATEMENT
GRANT: gives users rights and privileges on database objects or schemas.
QUERY-MODEL:
GRANT privilege_name
ON object_name
TO {user_name | PUBLIC | role_name}
[WITH GRANT OPTION];

REVOKE STATEMENT
REVOKE: removes or restricts user rights or privileges on database objects.
QUERY-MODEL:
REVOKE privilege_name
ON object_name
FROM {user_name | PUBLIC | role_name};

PRIVILEGES
Privileges define the access rights provided to a user on a database object. There are two types of privileges.
1) System privileges: these allow the user to CREATE, ALTER, or DROP database objects.
2) Object privileges: these allow the user to EXECUTE, SELECT, INSERT, UPDATE, or DELETE data from database objects to which the privileges apply.

TCL STATEMENTS
COMMIT
ROLLBACK
SAVEPOINT

Advantages of COMMIT & ROLLBACK Statements
With COMMIT and ROLLBACK statements, you can:
Ensure data consistency
Preview data changes before making changes permanent
Group logically related operations

TCL - OPERATIONS

State of the Data After COMMIT
Data changes are made permanent in the database.
The previous state of the data is permanently lost.
All users can view the results.
Locks on the affected rows are released; those rows are available for other users to manipulate.
All savepoints are erased.

State of the Data After ROLLBACK
Discard all pending changes by using the ROLLBACK statement:
Data changes are undone.
Previous state of the data is restored.
Locks on the affected rows are released.
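COMMIT, ROLLBACK and SAVEPOINT can be exercised with Python's standard-library sqlite3 module. This is a sketch under stated assumptions: the connection is opened with isolation_level=None so that every transaction statement is issued explicitly, the savepoint name before_finance is made up, and SQLite's locking differs from the row locks described on the slides.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so BEGIN,
# SAVEPOINT, ROLLBACK and COMMIT are all issued explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE departments (department_id INTEGER PRIMARY KEY, department_name TEXT)"
)


def names(connection):
    rows = connection.execute(
        "SELECT department_name FROM departments ORDER BY department_id"
    )
    return [name for (name,) in rows]


conn.execute("BEGIN")
conn.execute("INSERT INTO departments VALUES (30, 'Purchasing')")
conn.execute("SAVEPOINT before_finance")
conn.execute("INSERT INTO departments VALUES (40, 'Finance')")
conn.execute("ROLLBACK TO before_finance")  # undo only the work after the savepoint
conn.execute("COMMIT")                       # make the remaining change permanent
after_commit = names(conn)

conn.execute("BEGIN")
conn.execute("DELETE FROM departments")
conn.execute("ROLLBACK")                     # discard all pending changes
after_rollback = names(conn)
print(after_commit, after_rollback)
```

After the COMMIT only the Purchasing row exists, and the later ROLLBACK restores it even though the DELETE had removed it inside the transaction.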

SUMMARY
DATABASE
DBMS
RDBMS
E-R DIAGRAMS
SQL-[STRUCTURED QUERY LANGUAGE ]

QUERIES

THANK YOU
