
Database Management System

Unit II
Database System Concepts and Architecture

UNIT II:
2.1 Data Models
2.2 DBMS architecture and Data independence
2.3 Database user languages
2.4 Database users and Database administrators
2.5 E-R Model: Entities, Attributes, Relationships, Keys, Cardinalities, Participation constraints, E-R Diagram
2.6 Data Dictionary

Data Models

A collection of conceptual tools for describing data, data relationships, data
semantics, and consistency constraints.

Types:
o E-R Model
o Relational Model
o Object-Oriented Data Model
o Object-Relational Data Model
o Network Data Model
o Hierarchical Data Model

Entity-Relationship Model
The E-R data model is based on a perception of the real world that consists of a
collection of basic objects, called entities, and of relationships among these objects.
E-R models separate the information required by a business from the activities
performed within that business. Although businesses can change their activities, the
type of information tends to remain constant. Therefore, the data structures also tend
to be constant.

Graphical representation:
Rectangles - entity sets
Ellipses - attributes
Diamonds - relationships among entity sets
Lines - links from attributes to entity sets and from entity sets to relationships

Entity - A thing of significance about which information needs to be known, e.g.
customers, sales representatives, orders.

Attribute - Something that describes or qualifies an entity. Attributes are the
information about an entity that needs to be known or held, e.g. name, phone,
identification number.

Relationship - A named association between entities showing optionality or degree, e.g.
orders and items, customers and sales representatives.

Entity-Relationship Diagram (E-R Diagram):

A methodology for documenting databases by illustrating the relationships between the
various entities in the database.

[Figure: The relationships between the entities ORDER, LINE_ITEM, PART, and SUPPLIER
that might be used to model the database]

Relationship Diagramming Conventions:

Dashed line (optional element, indicating "may be")
Solid line (mandatory element, indicating "must be")
Crow's foot (degree element, indicating "one or more")
Single line (degree element, indicating "one and only one")

Relationship Types: Relationships between related entities or tables are:

One-to-One: degree of one and only one in both directions. These relationships are
rare, and the two entities may really be the same entity, or one may be an attribute of
the other.

One-to-Many: degree of one or more in one direction and a degree of one and only one in
the other direction. Very common.

Many-to-Many: degree of one or more in both directions. Very common. Resolve them with
an intersection entity.

Example: Employee and Department

Read this relationship first from left to right, and then from right to left:
Each Employee must be assigned to one and only one Department.
Each Department may be responsible for one or more Employees.

Identify and model the entities in the following set of information requirements.

I'm the manager of a training company that provides instructor-led courses in
management techniques. We teach many courses, each of which has a code, a name and a
fee. Introduction to UNIX and C Programming are two of our more popular courses.
Courses vary in length from one day to four days. An instructor can teach several
courses. Paul Rogers and Maria Gonzales are two of our best teachers. We track each
instructor's name and phone number. Each course is taught by only one instructor. We
create a course and then line up an instructor. The students can take several courses
over time, and many of them do this. Jamie Brown from AT&T took every course we offer.
We track each student's name and phone number. Some of our students and instructors do
not give us their phone numbers.

Identify and Model Entities

COURSE: Code, Name, Fee, Duration
INSTRUCTOR: Name, Phone number
STUDENT: Name, Phone number

A COURSE has significance as a training event offered by the Training Company. For
example, Introduction to UNIX and C Programming.
A STUDENT has significance as a participant in one or more COURSEs. For example, Jamie
Brown.
An INSTRUCTOR has significance as a teacher of one or more COURSEs. For example, Paul
Rogers and Maria Gonzales.

Analyze and Model Relationships

E-R Model:
COURSE (Code, Name, Fee, Duration)
INSTRUCTOR (Name, Phone)
STUDENT (Name, Phone number)

COURSE - taught by - INSTRUCTOR
INSTRUCTOR - the teacher of - COURSE
COURSE - taken by - STUDENT
STUDENT - enrolled in - COURSE

o The E-R model represents certain constraints to which the contents of a database must
conform.
o One important constraint is mapping cardinalities, which express the number of
entities to which another entity can be associated via a relationship set.
o E.g. if each account must belong to only one customer, the E-R model can express that
constraint.
o Mapping cardinalities are most useful in describing binary relationship sets, although
they can contribute to the description of relationship sets that involve more than two
entity sets.
o For a binary relationship set R between entity sets A and B, the mapping cardinality
must be one of the following:
o One to One
o One to Many
o Many to One
o Many to Many
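
The four mapping cardinalities can be illustrated with a small sketch. It assumes a binary relationship set is given as a collection of (a, b) pairs and reports the tightest cardinality consistent with that one instance. The function name and the customer/account data are hypothetical, and a real mapping cardinality is a constraint declared on the schema, not something read off a single database state.

```python
from collections import Counter

def mapping_cardinality(pairs):
    """Classify a binary relationship set given as (a, b) pairs."""
    pairs = set(pairs)  # a relationship set holds no duplicate pairs
    per_a = Counter(a for a, _ in pairs)  # number of B entities linked to each A entity
    per_b = Counter(b for _, b in pairs)  # number of A entities linked to each B entity
    a_many = any(n > 1 for n in per_a.values())
    b_many = any(n > 1 for n in per_b.values())
    if a_many and b_many:
        return "many-to-many"
    if a_many:
        return "one-to-many"   # one A entity is associated with many B entities
    if b_many:
        return "many-to-one"   # many A entities are associated with one B entity
    return "one-to-one"

# Hypothetical data: each account belongs to only one customer.
owns = [("C1", "A1"), ("C1", "A2"), ("C2", "A3")]
cardinality = mapping_cardinality(owns)
```

Here `cardinality` is "one-to-many" from customer to account, which matches the account example in the text.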

Participation Constraints
o The participation of an entity set E in a
relationship set R is said to be Total if every
entity in E participates in at least one
relationship in R.
o If only some entities in E participate in
relationships in R, the participation of entity
set E in relationship R is said to be Partial.

Database
Development
Process

Common Data Models

UML/OO         ER               Relational
class          entity type      relation/table
object         entity           tuple/row
attribute      attribute        attribute/column
association    relationship     foreign key
               key attribute    primary key
inheritance    inheritance      foreign key

We have standard techniques for translating between data models.

Entity Types

Entity types are similar to classes:
they describe potential objects (entities) that will appear in the database.
Weak entity types describe dependent entities,
that is, entities that depend on other entities for identity.
Example: EMPLOYEE is an entity type; DEPENDENT is a weak entity type.

Attributes and Keys

Notation:
Attributes - ovals (e.g. Age)
Key attributes - underlined name (e.g. SSN)
Partial key attributes - dotted underlined name (e.g. Date)

Key attributes must be unique for each entity
Keys are used to identify particular entities
Partial keys are only partially unique; they are used for weak entity types

Entity Types and Attributes

Attributes are connected to entity types by lines
Example: EMPLOYEE with attributes EID, Name, Phone; DEPENDENT with attributes Name, Age

Entity Types and Keys


All regular entity types must
have a key attribute or set of key
attributes

Weak entity types must have partial keys


Weak entities get part of their key
(and part of their identity)
from some related entity.

Relationships

Notation:
Relationships - diamonds (e.g. WorksOn)
Identifying relationships - double diamonds (e.g. DependentOf)

Relationships indicate a meaningful connection between two entity types
Relationships may have attributes, but they cannot have key attributes
Identifying relationships connect a weak entity type to some other entity type;
they indicate where the weak entity gets a key to complete its own partial key

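Example Schema

[Figure: E-R diagram. DEPENDENT (Name, Age) is linked by DependentOf to EMPLOYEE (EID,
Name, Phone), and EMPLOYEE is linked by WorksOn to PROJECT; the labels Name, StartDate,
and Budget appear on the WorksOn/PROJECT side of the diagram.]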

Participation and Cardinality


Participation and cardinality
define constraints on relationships
Participation indicates whether an entity
is required to take part in a relationship
Cardinality ratios and structural constraints
place limits on the number of entities
that may participate in a relationship

Participation Constraints

Total participation - double or thick line; indicates required participation
Partial participation - thin line; indicates optional participation

Examples:
EMPLOYEE - WorksFor - DEPARTMENT (total participation)
EMPLOYEE - WorksOn - PROJECT (partial participation)

Participation Constraints

Arrowheads can be used to indicate an upper bound of 1 for participation:
X must participate in exactly one R
X may participate in at most one R

Cardinality Ratios

Cardinality ratios specify the maximum number of relationship instances
that an entity may participate in

EMPLOYEE - Manages - DEPARTMENT: 1:1 ratio
EMPLOYEE - WorksFor - DEPARTMENT: n:1 ratio
EMPLOYEE - WorksOn - PROJECT: n:m ratio

Structural Constraints

Structural constraints specify the minimum and maximum number of relationship instances
that an entity may participate in

EMPLOYEE (1,1) - WorksFor - (4,n) DEPARTMENT
An employee must work for exactly 1 department.
A department must have at least 4 employees.

EMPLOYEE (0,1) - Manages - (1,1) DEPARTMENT
An employee may manage at most 1 department.
A department must have exactly 1 manager.
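
A (min, max) structural constraint can be checked against one relationship instance by counting how often each entity appears. The sketch below is only an illustration under assumptions: the function name `violations`, the pair layout (employee, department), and the sample data are hypothetical, and `hi=None` stands in for the unbounded "n".

```python
from collections import Counter

def violations(entities, pairs, side, lo, hi=None):
    """Return the entities whose participation count falls outside (lo, hi).

    side selects the position of the entity within each pair (0 or 1);
    hi=None means there is no upper bound.
    """
    counts = Counter(pair[side] for pair in set(pairs))
    return sorted(
        e for e in entities
        if counts[e] < lo or (hi is not None and counts[e] > hi)
    )

# Hypothetical WorksFor instance stored as (employee, department) pairs.
employees = ["e1", "e2", "e3", "e4", "e5"]
departments = ["d1", "d2"]
works_for = [("e1", "d1"), ("e2", "d1"), ("e3", "d1"), ("e4", "d1"), ("e5", "d2")]

bad_employees = violations(employees, works_for, side=0, lo=1, hi=1)  # (1,1)
bad_departments = violations(departments, works_for, side=1, lo=4)    # (4,n)
```

In this instance every employee satisfies (1,1), while department d2 has only one employee and so violates (4,n).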

Relational Roles

It is sometimes convenient to name an entity's role in a relationship.
This is particularly useful in recursive relationships,
where it removes ambiguity in the direction of the relationship.

Recursive Relationship

EMPLOYEE participates in Supervision twice: as supervisor (0,N) and as supervisee (0,1).
In the diagram, 1 = supervisor and 2 = supervisee.

Notation Summary

Enhanced ER Model (EER)


aka Extended Entity-Relationship Model
adds Inheritance
indicates that one entity type is
an extension of another entity type
often referred to as an IS-A relationship

o E.g. we expect every loan entity to be related


to at least one customer through the borrower
relationship.
o Therefore the participation of loan in the
relationship set borrower is Total.
o In contrast, an individual can be a bank
customer whether or not s/he has a loan with
the bank. Hence, it is possible that only some
of the customer entities are related to the
loan entity set through the borrower
relationship, and the participation of
customer in the borrower relationship set is
therefore partial.
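
The loan and customer example can be sketched in a few lines. The code assumes the borrower relationship set is stored as (customer, loan) pairs; the function name and the sample entities are hypothetical. As with any check against data, it describes one database state, whereas a participation constraint is declared for all legal states.

```python
def participation(entity_set, relationship, position):
    """Return "total" if every entity appears in the relationship, else "partial".

    position is the index of the entity set within each relationship tuple.
    """
    participating = {pair[position] for pair in relationship}
    return "total" if set(entity_set) <= participating else "partial"

# Hypothetical bank data; borrower holds (customer, loan) pairs.
customers = {"Ann", "Bob", "Cid"}
loans = {"L1", "L2"}
borrower = {("Ann", "L1"), ("Bob", "L1"), ("Bob", "L2")}

loan_participation = participation(loans, borrower, position=1)
customer_participation = participation(customers, borrower, position=0)
```

Every loan is tied to at least one customer, so loan participates totally; Cid has no loan, so customer participates only partially.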

Keys:
o No two entities in an entity set are allowed to have exactly the same value for all
attributes.
o A key allows us to identify a set of attributes that suffice to distinguish entities
from each other.
o Keys also help uniquely identify relationships, and thus distinguish relationships
from each other.

Key field: A field in a record that uniquely identifies instances of that record so that
they can be retrieved, updated, or sorted.

Primary Key: Unique identifier for all the information in any row of a database table.

Foreign Key: Field in a database table that enables users to find related information in
another database table.

Relational Model
Relational DBMSs are suited for handling data, not graphics or multimedia. An
object-oriented DBMS (OODBMS) stores the data and the procedures that act on those data
as objects that can be automatically retrieved and shared, and can manage multimedia
and Java applets. However, OODBMSs are slower in handling large numbers of
transactions. Hybrid object-relational DBMSs are now available to provide the
capabilities of both object-oriented and relational DBMSs.

The select, project, and join operations enable data from two different tables to be
combined and only selected attributes to be displayed.

[Figure: The three basic operations of a relational DBMS]
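
A minimal sketch of the three operations, assuming tables are held as lists of dictionaries. The table contents (parts and suppliers) and all attribute names are hypothetical, and the join shown is a simple equality join on one shared attribute, not a full relational algebra implementation.

```python
def select(rows, predicate):
    """Keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]

def project(rows, attrs):
    """Keep only the named attributes and drop duplicate results."""
    seen, result = set(), []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result

def join(left, right, attr):
    """Combine rows of two tables that agree on the shared attribute."""
    return [{**l, **r} for l in left for r in right if l[attr] == r[attr]]

# Hypothetical tables.
parts = [
    {"part_no": 137, "part_name": "Door latch", "supplier_no": 8259},
    {"part_no": 145, "part_name": "Side mirror", "supplier_no": 8444},
    {"part_no": 150, "part_name": "Door seal", "supplier_no": 8263},
]
suppliers = [
    {"supplier_no": 8259, "supplier_name": "Acme"},
    {"supplier_no": 8263, "supplier_name": "Bolt Co"},
    {"supplier_no": 8444, "supplier_name": "Cargo Ltd"},
]

combined = join(parts, suppliers, "supplier_no")
wanted = select(combined, lambda row: row["part_no"] in (137, 150))
result = project(wanted, ["part_no", "part_name", "supplier_name"])
```

The join combines the two tables, the select keeps the rows for parts 137 and 150, and the project leaves only three attributes for display.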

Edgar F. Codd

Ted Codd proposed the relational data model in 1970
(Communications of the ACM 13(6), June 1970).
He received the ACM Turing Award in 1981.

Relational Data Model


Core of majority of modern databases
Virtually all business relies
on some form of relational database
Solid theoretical/mathematical foundation

Simple but robust implementation

Models, Schemas and States

A data model defines the constructs available for defining a schema
  defines possible schemas

A schema defines the constructs available for storing the data
  defines database structure
  limits the possible database states

A database state (or instance) is all the data at some point in time
  the database content

Models, Schemas and States

data model: fixed by the DBMS
schema: defined by the DB designer; generally fixed once defined *
database state: changes over time due to user updates

* schema modifications are possible once the database is populated, but this generally
causes difficulties

The Relational Data Model


All data is stored in relations
relations are sets, but generally viewed as 2D tables

DB schema = a set of relation specifications


the specification of a particular relation is called a relation
schema

DB state = the data stored in the relations


the data in a particular relation is called a relation state
(or relation instance or simply relation)
Principle of Uniform Representation:
The entire content of a relational database is represented in one and only one way:
namely, as attribute values within tuples within relations. (A tuple is analogous to
a record in nonrelational databases.) In the context of databases, a tuple is one
record (one row).

RDM Schemas

[Figure: three-level schema diagram. Several External Views sit above the Conceptual
Schema (relation specifications), which sits above the Internal Schema (mapping from
relations to storage layout (files)).]

Relational Data Definition

[Figure: The database designer enters the definition of relation schemas in SQL DDL (the
relation definition language, i.e. CREATE TABLE). The data definition processor records
the relation schemas. Users of the data work through application programs, which reach
the relations through the query processor, security manager, concurrency manager, and
index manager.]

Relation Schemas
A relation is defined by
a name and
a set of attributes

Each attribute has a name and a domain


a domain is a set of possible values
types are domain names
all domains are sets of atomic values
RDM does not allow complex data types
domains may contain a special null value

Example Relation Schema

Relation name: StockItem
Set of attributes:

Attribute (attribute names)    Domain (attribute domains)
ItemID                         string(4)
Description                    string(50)
Price                          currency/dollars
Taxable                        boolean

Definition: Relation

A relation is denoted by $r(R)$
$R$ is the name of the relation schema for the relation

A relation is a set of tuples
$r(R) = \{t_1, t_2, \ldots, t_m\}$

Each tuple is an ordered list of $n$ values
$t = \langle v_1, v_2, \ldots, v_n \rangle$
$n$ is the degree of $R$

Each value in the tuple must be in the domain of the corresponding attribute
$v_i \in \mathrm{dom}(A_i)$

Alternate notations: the $i$th value of tuple $t$ is also referred to as
$v_i = t[A_i]$ or $v_i = t.A_i$

Example Relation
r(STOCKITEM) =
{ < I119, "Monopoly", $29.95, true >,
< I007, "Risk", $25.45, true >,
< I801, "Bazooka Gum", $0.25, false > }
t2 = < I007, "Risk", $25.45, true >
t2[Price] = t2.Price = $25.45
t2[Price] ∈ dom(Price) = currency/dollars
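
The example relation can be mirrored in code as a set of tuples. This is a sketch under assumptions: the Python types chosen for the domains (str, Decimal, bool) are stand-ins for string(4), string(50), currency/dollars and boolean, and the class name is hypothetical.

```python
from decimal import Decimal
from typing import NamedTuple

class StockItem(NamedTuple):
    """One tuple of the STOCKITEM relation; fields follow the relation schema."""
    ItemID: str
    Description: str
    Price: Decimal
    Taxable: bool

# r(STOCKITEM) as a set: unordered and free of duplicate tuples.
r = {
    StockItem("I119", "Monopoly", Decimal("29.95"), True),
    StockItem("I007", "Risk", Decimal("25.45"), True),
    StockItem("I801", "Bazooka Gum", Decimal("0.25"), False),
}

t2 = StockItem("I007", "Risk", Decimal("25.45"), True)
degree = len(StockItem._fields)   # n, the number of attributes
by_name = t2.Price                # the t.Ai notation
by_position = t2[2]               # the ordered-list view of the same value
```

Adding `t2` to `r` again leaves the set unchanged, which is the "no duplicate tuples" property of a relation.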

Characteristics of Relations
A relation is a set
tuples are unordered
no duplicate tuples

Attribute values within tuples are ordered


values are matched to attributes by position
alternate definition defines a tuple
as a set of (name,value) pairs,
which makes ordering of tuple unnecessary

Characteristics of Relations

Values in tuples are atomic

atomic = non-structured
(similar to primitive types in C++)
implication:
no nested relations or other complex data
structures

If domain includes null values,


null may have many interpretations
"does not exist"
"not applicable"
"unknown"

Theory vs. Reality


The theoretical data model is
mathematical:
a relation is a set of tuples
this is Codd's definition

In the real world, the model is practical:
efficiency concerns
accepted standard: SQL
a relation is a table, not a set
a relation may have order and duplicates

Example Schema

Example
State

CONSTRAINTS

Constraints
Constraints are restrictions on legal relation states
they add further semantics to the schema

Domain constraints
$v_i \in \mathrm{dom}(A_i)$
values for an attribute must be from the domain associated with the attribute

Non-null constraints
the domain of some attributes may not include null,
implying that a value for that attribute
is required for all tuples

Key Constraints
By definition, all tuples in a relation are unique
Often, we want to restrict tuples further such
that some subset of the attributes
is unique for all tuples
Example: in the StockItem relation,
no ItemID should appear in more than one
tuple
ItemID is called a key attribute

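Keys and Superkeys

Any subset of attributes that must be unique is called a superkey
If no proper subset of the attributes of a superkey must also be unique,
then that superkey is called a key
Example:
VEHICLE(LicenseNumber, SerialNumber, Model, Year)
LicenseNumber is a key and SerialNumber is a key;
an attribute set that contains a key, such as (LicenseNumber, SerialNumber), is a superkey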

Candidate and Primary Keys


If a relation has more than one key,
each key is called a candidate key
One candidate key must be chosen
to be the primary key
The primary key is the one that will be
used to identify tuples
If there is only one key, it is the primary key

Candidate and Primary Keys

Primary keys are indicated by underlining the attributes that make up that key
Example:
VEHICLE(LicenseNumber, VIN, Model, Year)
LicenseNumber and VIN are both candidate keys; the underlined one is the primary key

Example Keys

STOCKITEM( ItemId, Description, Price, Taxable )
(assumes that Description is unique for all items)
superkeys (among others):
(ItemId), (Description), (ItemId, Description)
keys:
(ItemId), (Description)
candidate keys:
(ItemId), (Description)
primary key:
(ItemId)
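
The difference between a superkey and a key can be explored against a sample state. The sketch below is an illustration with hypothetical function names; it tests uniqueness in one relation state only, which is a necessary condition for a key but not a sufficient one, because a key constraint must hold in every legal state.

```python
from itertools import combinations

def unique_in_state(rows, attrs):
    """True if no two rows agree on all of the given attributes."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)

def minimal_unique_sets(rows, attributes):
    """Attribute sets that are unique in this state and have no unique proper subset."""
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(smaller) <= set(attrs) for smaller in found):
                continue  # contains a smaller unique set, so it is not minimal
            if unique_in_state(rows, attrs):
                found.append(attrs)
    return found

# The three STOCKITEM tuples from the example relation.
attributes = ("ItemId", "Description", "Price", "Taxable")
rows = [
    {"ItemId": "I119", "Description": "Monopoly", "Price": 29.95, "Taxable": True},
    {"ItemId": "I007", "Description": "Risk", "Price": 25.45, "Taxable": True},
    {"ItemId": "I801", "Description": "Bazooka Gum", "Price": 0.25, "Taxable": False},
]

candidates = minimal_unique_sets(rows, attributes)
```

For this state the result also contains (Price), simply because the three prices happen to differ. That is why keys are declared by the designer from the meaning of the data and are not inferred from whatever rows are currently stored.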

STATE CHANGE AND CONSTRAINT ENFORCEMENT

Causes of Constraint Violations

What can cause a referential integrity constraint violation?
(here R1 holds a foreign key (FK) that references the primary key (PK) of R2)
inserting a tuple in R1 with an illegal FK
modifying a tuple in R1 to have an illegal FK
deleting a tuple in R2 that had the PK referenced by some FK in R1

How can a referential integrity constraint be enforced?
reject the operation that attempts to violate it
(may cause other operations in the same transaction to be rejected)
or
repair the violation, by cascading inserts or deletes
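
The two enforcement strategies, rejecting or repairing, can be sketched with a toy in-memory structure. Everything here is hypothetical (the class names, the method names, and the reduction of R2 to a set of primary-key values); it is not how any particular DBMS implements the check.

```python
class FKViolation(Exception):
    """Raised when an operation would leave a dangling foreign key."""

class TwoRelations:
    def __init__(self):
        self.r2 = set()  # primary-key values of the referenced relation R2
        self.r1 = {}     # R1 primary key -> foreign-key value pointing into R2

    def put_r1(self, pk, fk):
        """Insert or modify an R1 tuple; reject it if the FK is illegal."""
        if fk not in self.r2:
            raise FKViolation(f"no R2 tuple with key {fk!r}")
        self.r1[pk] = fk

    def delete_r2(self, pk, cascade=False):
        """Delete an R2 tuple; either reject or cascade to referencing R1 tuples."""
        referencing = [k for k, fk in self.r1.items() if fk == pk]
        if referencing and not cascade:
            raise FKViolation(f"{pk!r} is still referenced by {referencing}")
        for k in referencing:  # repair by cascading the delete
            del self.r1[k]
        self.r2.discard(pk)

db = TwoRelations()
db.r2.add("B1")
db.put_r1("A1", "B1")  # legal: "B1" exists in R2
```

Calling `db.put_r1("A2", "B9")` or `db.delete_r2("B1")` raises `FKViolation` (the reject strategy), while `db.delete_r2("B1", cascade=True)` removes the R2 tuple together with the R1 tuple that references it (the repair strategy).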

Data Manipulation Operations


There are three ways to modify the value of a
relation:
Insert: add a new tuple to R
Delete: remove an existing tuple from R
Update: change the value of an existing tuple
in R
Delete and Update both require some way
to identify an existing tuple (a selection)

Inserting Tuples

r1(STORESTOCK) = {
< "S002", "I065", 120 >,
< "S047", "I954", 300 >,
< "S333", "I954", 198 > }

insert < "S047", "I099", 267 >

r2(STORESTOCK) = {
< "S002", "I065", 120 >,
< "S333", "I954", 198 >,
< "S047", "I099", 267 >,
< "S047", "I954", 300 > }

any constraint violations?

Deleting Tuples

r2(STORESTOCK) = {
< "S002", "I065", 120 >,
< "S333", "I954", 198 >,
< "S047", "I099", 267 >,
< "S047", "I954", 300 > }

delete tuples with Item = "I954"

r3(STORESTOCK) = {
< "S002", "I065", 120 >,
< "S047", "I099", 267 > }

Updating Tuples

r3(STORESTOCK) = {
< "S002", "I065", 120 >,
< "S047", "I099", 267 > }

change the Quantity of tuples with StoreID = "S002" and Item = "I065" to 250

r4(STORESTOCK) = {
< "S002", "I065", 250 >,
< "S047", "I099", 267 > }

Analyzing State Changes


Any update can be viewed as (delete and insert)
update: < "S002", "I065", 120 > to < "S002", "I065", 250 >
is equivalent to
delete: < "S002", "I065", 120 >
insert: < "S002", "I065", 250 >

Any database state change can be viewed


as a set of deletes and inserts on individual
relations
This makes the analysis of potential constraint
violations a well defined problem
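
The STORESTOCK walk-through can be replayed with Python sets to show that an update is a delete followed by an insert. The helper names are hypothetical, tuples are (StoreID, Item, Quantity), and the update targets the < "S002", "I065", 120 > tuple named in the equivalence above.

```python
def insert(relation, new_tuple):
    """Return a new relation state with the tuple added."""
    return relation | {new_tuple}

def delete(relation, predicate):
    """Return a new relation state without the tuples selected by the predicate."""
    return {t for t in relation if not predicate(t)}

def update(relation, predicate, change):
    """Update = delete the selected tuples, then insert their changed versions."""
    old = {t for t in relation if predicate(t)}
    return (relation - old) | {change(t) for t in old}

r1 = {("S002", "I065", 120), ("S047", "I954", 300), ("S333", "I954", 198)}
r2 = insert(r1, ("S047", "I099", 267))
r3 = delete(r2, lambda t: t[1] == "I954")
r4 = update(
    r3,
    lambda t: t[0] == "S002" and t[1] == "I065",
    lambda t: (t[0], t[1], 250),
)

# The same state reached with an explicit delete followed by an insert.
r4_alt = insert(delete(r3, lambda t: t == ("S002", "I065", 120)), ("S002", "I065", 250))
```

Because each function returns a new set, every intermediate state stays available for checking constraints before the change is accepted.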

Enforcing Constraints
constraint enforcement:
ensuring that no invalid database states can
exist
invalid state: a state in which a constraint is
violated
Possible ways to enforce constraints:
reject any operation that causes a violation, or
allow the violating operation and then attempt
to correct the database

Schema for
Airline
Database

EER Bank Schema

Step 1: Regular Entities


Regular entity types become relations
include all simple attributes
include only components of compound
attributes
keys become primary keys
if multiple keys (candidates) select a primary
key

CUSTOMER(Ssn, Name, Addr, Phone)

Step 1: Regular Entities


BANK(Code, Name, Addr)

ACCOUNT(Acct_no, Type, Balance)

LOAN(Loan_no, Type, Amount)

Step 2: Weak Entities


Weak entity types become relations
include all simple attributes
include only components of compound
attributes
create a primary key from partial key
and key of owning entity type (through
identifying relationship)
attributes acquired through identifying
relationship
become a foreign key*
* typically, deletions and insertions will be propagated
through this foreign key

Step 2: Weak Entities

Weak entity types become relations

BANK_BRANCH(Bank_code, Branch_No, Addr)
BANK(Code, Name, Addr)
FK: BANK_BRANCH(Bank_code) refers to BANK

Step 3: Binary 1:1 Relationships

Approach 1: Foreign Key
Choose one of the related entity types to hold the relationship
(choose one with total participation, if possible)
add FK to other relation
move all relationship attributes to this relation
this approach is preferable, except as noted below

Approach 2: Merged Relation
combine the relations for the related entities into a single relation
use only when both participations are total

Approach 3: Separate Relation
same as binary M:N relationship (see step 5)
not generally a good option

Step 3: Binary 1:1 Relationships

Approach 1: Foreign Key

EMPLOYEE(Ssn, Name, ...)
DEPARTMENT(Name, Number, Mgr, Mgr_start_date)
FK: DEPARTMENT(Mgr) refers to EMPLOYEE

Step 3: Binary 1:1 Relationships


Approach 2: Merged Relation

AJB(x, y, p, q, r)

or
AJB(x, y, p, q, r)

Step 4: Binary 1:N Relationships

1:N Relationships become foreign key at N side
any relationship attributes also go to N side

LOAN(Loan_no, Type, Amount, Bank, Branch)
ACCOUNT(Acct_no, Type, Balance, Bank, Branch)
BANK_BRANCH(Bank_code, Branch_No, Addr)

Step 5: Binary M:N Relationships

M:N Relationships must become a new relation
contains FKs to both related entities
combined FKs become PK for new relation
relationship attributes go in new relation

CUSTOMER(Ssn, Name, Addr, Phone)
A_C(Acct, Cust)
ACCOUNT(Acct_no, Type, Balance, Bank, Branch)

Step 6: Multivalued Attributes


Multivalued attributes must become new
relations
FK to associated entity type
PK is whole relation

DEPARTMENT(Name, Number, Mgr, Mgr_start_date)


DEPT_LOCATIONS(DName, Dno, Location)

Step 7: N-ary Relationships


Non-Binary Relationships become new
relations
FKs to all participating entity types
Combine FKs to make a PK
(exclude entities with max participation of 1)
Include any relationship attributes
SUPPLIER(SName)
PROJECT(Proj_name)
PART(Part_no)
SUPPLY(SName, PName, Part, Quantity)

Completed Bank Schema


CUSTOMER(Ssn, Name, Addr, Phone)
BANK(Code, Name, Addr)
ACCOUNT(Acct_no, Type, Balance, Bank, Branch)
LOAN(Loan_no, Type, Amount, Bank, Branch)
BANK_BRANCH(Bank_code, Branch_No, Addr)
A_C(Acct, Cust)
L_C(Loan, Cust)
BANK_BRANCH(Bank_code) refers to BANK
LOAN(Bank,Branch) refers to BANK_BRANCH
ACCOUNT(Bank,Branch) refers to BANK_BRANCH
A_C(Acct) refers to ACCOUNT
A_C(Cust) refers to CUSTOMER
L_C(Loan) refers to LOAN
L_C(Cust) refers to CUSTOMER
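
Part of the completed schema can be written as SQL DDL and tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the column types, the sample rows, and the ON DELETE CASCADE on the weak entity's foreign key are choices made for illustration, and LOAN and L_C are left out because they follow the same pattern as ACCOUNT and A_C.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs unless asked
conn.executescript("""
CREATE TABLE BANK (Code TEXT PRIMARY KEY, Name TEXT, Addr TEXT);
CREATE TABLE BANK_BRANCH (                -- weak entity: PK = owner key + partial key
    Bank_code TEXT REFERENCES BANK(Code) ON DELETE CASCADE,
    Branch_No INTEGER,
    Addr TEXT,
    PRIMARY KEY (Bank_code, Branch_No)
);
CREATE TABLE ACCOUNT (                    -- 1:N relationship: FK on the N side
    Acct_no INTEGER PRIMARY KEY, Type TEXT, Balance NUMERIC,
    Bank TEXT, Branch INTEGER,
    FOREIGN KEY (Bank, Branch) REFERENCES BANK_BRANCH(Bank_code, Branch_No)
);
CREATE TABLE CUSTOMER (Ssn TEXT PRIMARY KEY, Name TEXT, Addr TEXT, Phone TEXT);
CREATE TABLE A_C (                        -- M:N relationship: combined FKs form the PK
    Acct INTEGER REFERENCES ACCOUNT(Acct_no),
    Cust TEXT REFERENCES CUSTOMER(Ssn),
    PRIMARY KEY (Acct, Cust)
);
""")

conn.execute("INSERT INTO BANK VALUES ('B1', 'First Bank', '1 Main St')")
conn.execute("INSERT INTO BANK_BRANCH VALUES ('B1', 1, '2 Side St')")
conn.execute("INSERT INTO ACCOUNT VALUES (100, 'checking', 50, 'B1', 1)")

try:
    # Branch 99 does not exist, so this insert carries an illegal FK.
    conn.execute("INSERT INTO ACCOUNT VALUES (101, 'savings', 10, 'B1', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The second ACCOUNT insert is rejected because no BANK_BRANCH row matches (Bank, Branch), which is the "reject the operation" form of referential integrity enforcement.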

Bank Schema: MS Access

Object-Oriented Data Model

Object DBMSs add database functionality to object programming languages. They bring
much more than persistent storage of programming language objects. Object DBMSs extend
the semantics of the C++, Smalltalk and Java object programming languages to provide
full-featured database programming capability, while retaining native language
compatibility. A major benefit of this approach is the unification of the application
and database development into a seamless data model and language environment. As a
result, applications require less code, use more natural data modeling, and code bases
are easier to maintain.

OO vs. EER Data Modeling

Object Oriented              EER
Class                        Entity type
Object                       Entity instance
Association                  Relationship
Inheritance of attributes    Inheritance of attributes
Inheritance of behavior      No representation of behavior

Object-Relational Data Model

ORDBMSs add new object storage capabilities to the relational systems at the core of
modern information systems. These new facilities integrate management of traditional
fielded data, complex objects such as time-series and geospatial data, and diverse
binary media such as audio, video, images, and applets. By encapsulating methods with
data structures, an ORDBMS server can execute complex analytical and data manipulation
operations to search and transform multimedia and other complex objects.

As an evolutionary technology, the object/relational (OR) approach has inherited the
robust transaction- and performance-management features of its relational ancestor and
the flexibility of its object-oriented cousin. Database designers can work with
familiar tabular structures and data definition languages (DDLs) while assimilating new
object-management possibilities.

Network Data Model

In 1971, the Conference on Data Systems Languages (CODASYL) formally defined the
network model. The basic data modeling construct in the network model is the set
construct.
The popularity of the network data model coincided with the popularity of the
hierarchical data model. Some data were more naturally modeled with more than one
parent per child, so the network model permitted the modeling of many-to-many
relationships in data.

A set consists of an owner record type, a set name, and a member record type. A member
record type can have that role in more than one set; hence the multi-parent concept is
supported. An owner record type can also be a member or owner in another set. The data
model is a simple network, and link and intersection record types may exist, as well as
sets between them.

Thus, the complete network of relationships is represented by several pairwise sets; in
each set one record type is the owner (at the tail of the network arrow) and one or
more record types are members (at the head of the relationship arrow).

Hierarchical Data Model

The hierarchical data model organizes data in a tree structure. There is a hierarchy of
parent and child data segments. This structure implies that a record can have repeating
information, generally in the child data segments. Data is held as a series of records,
each of which has a set of field values attached to it. The model collects all the
instances of a specific record together as a record type. These record types are the
equivalent of tables in the relational model, with the individual records being the
equivalent of rows.

To create links between these record types, the hierarchical model uses parent-child
relationships. These are 1:N mappings between record types. They are expressed with
trees, a structure "borrowed" from mathematics in the same way that the relational
model borrows set theory. For example, an organization might store information about an
employee, such as name, employee number, department, and salary. The organization might
also store information about an employee's children, such as name and date of birth.

The employee and children data form a hierarchy, where the employee data represents the
parent segment and the children data represents the child segment. If an employee has
three children, then there would be three child segments associated with one employee
segment.
In a hierarchical database the parent-child relationship is one to many. This restricts
a child segment to having only one parent segment.
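
The employee and children example can be sketched as a small tree of segments. The class name, field names, and sample values are hypothetical; the point is only that each child segment records exactly one parent, while a parent may hold many children.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)
class Segment:
    """One data segment in a hierarchical database (illustrative only)."""
    record_type: str
    fields: dict
    parent: "Segment | None" = None
    children: list = field(default_factory=list)

    def add_child(self, child):
        """Attach a child segment; a child may have only one parent."""
        if child.parent is not None:
            raise ValueError("a child segment can have only one parent segment")
        child.parent = self
        self.children.append(child)
        return child

# Hypothetical employee with three children: one parent segment, three child segments.
employee = Segment("EMPLOYEE", {"name": "Lee", "employee_number": 17})
kids = [
    employee.add_child(Segment("CHILD", {"name": name, "date_of_birth": dob}))
    for name, dob in [("Ada", "2010-01-05"), ("Ben", "2012-06-30"), ("Cy", "2015-11-12")]
]
```

Attaching one of these child segments to a second employee raises `ValueError`, which mirrors the one-parent restriction; the network model relaxes exactly this rule.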

Data Models (continued)


Data Model Operations:
These operations are used for specifying
database retrievals and updates by referring to
the constructs of the data model.
Operations on the data model may include
basic model operations (e.g. generic insert,
delete, update) and user-defined operations
(e.g. compute_student_gpa, update_inventory)

Categories of Data Models


Conceptual (high-level, semantic) data models:
Provide concepts that are close to the way many users
perceive data.
(Also called entity-based or object-based data models.)

Physical (low-level, internal) data models:


Provide concepts that describe details of how data is stored
in the computer. These are usually specified in an ad-hoc
manner through DBMS design and administration manuals

Implementation (representational) data models:


Provide concepts that fall between the above two, used by
many commercial DBMS implementations (e.g. relational
data models used in many commercial systems).

Schemas versus Instances


Database Schema:
The description of a database.
Includes descriptions of the database
structure, data types, and the constraints on
the database.

Schema Diagram:
An illustrative display of (most aspects of) a
database schema.

Schema Construct:
A component of the schema or an object
within the schema, e.g., STUDENT, COURSE.

Schemas versus Instances


Database State:
The actual data stored in a database at a
particular moment in time. This includes the
collection of all the data in the database.
Also called database instance (or occurrence
or snapshot).
The term instance is also applied to
individual database components, e.g.
record instance, table instance, entity
instance

Database Schema
vs. Database State
Database State:
Refers to the content of a database at a
moment in time.

Initial Database State:
Refers to the database state when it is
initially loaded into the system.

Valid State:
A state that satisfies the structure and
constraints of the database.

Database Schema
vs. Database State (continued)
Distinction
The database schema changes very
infrequently.
The database state changes every time the
database is updated.

Schema is also called intension.
State is also called extension.
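
The schema/state distinction can be observed directly in a small Python sketch using the standard sqlite3 module. The student table and its rows are assumptions made for illustration; sqlite_master is SQLite's own catalog table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")


def schema(connection):
    """Return the stored table definitions (the intension)."""
    return [row[0] for row in connection.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' ORDER BY name")]


def state(connection):
    """Return the rows currently stored (the extension)."""
    return connection.execute("SELECT id, name FROM student ORDER BY id").fetchall()


schema_before = schema(conn)
state_initial = state(conn)  # empty state right after the schema is defined

# Every update produces a new database state; the schema is untouched.
conn.execute("INSERT INTO student (id, name) VALUES (1, 'Asha')")
conn.execute("INSERT INTO student (id, name) VALUES (2, 'Ravi')")
schema_after = schema(conn)
state_after = state(conn)

# An update that would lead to an invalid state is rejected by the DBMS.
try:
    conn.execute("INSERT INTO student (id, name) VALUES (3, NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Here the state changes with each insert while the schema text stays identical, and the NOT NULL constraint keeps the database in a valid state.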

Example of a Database Schema

Example of a database state

Three-Schema Architecture
Proposed to support DBMS
characteristics of:
Program-data independence.
Support of multiple views of the data.

Not explicitly used in commercial DBMS
products, but has been useful in
explaining database system
organization.

Three-Schema Architecture
Defines DBMS schemas at three levels:
Internal schema at the internal level to describe
physical storage structures and access paths (e.g.
indexes).
Typically uses a physical data model.
Conceptual schema at the conceptual level to describe
the structure and constraints for the whole database
for a community of users.
Uses a conceptual or an implementation data
model.
External schemas at the external level to describe the
various user views.
Usually uses the same data model as the conceptual
schema.

The three-schema architecture

Three-Schema Architecture
Mappings among schema levels are
needed to transform requests and data.
Programs refer to an external schema, and
are mapped by the DBMS to the internal
schema for execution.
Data extracted from the internal DBMS
level is reformatted to match the user's
external view (e.g. formatting the results of
an SQL query for display in a Web page).

Data Independence
Logical Data Independence:
The capacity to change the conceptual schema
without having to change the external schemas
and their associated application programs.

Physical Data Independence:
The capacity to change the internal schema
without having to change the conceptual
schema.
For example, the internal schema may be
changed when certain file structures are
reorganized or new indexes are created to
improve database performance
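
Physical data independence can be illustrated with a minimal Python sketch using the standard sqlite3 module. The employee table, its rows, and the index name are assumptions made for the example. The same query text is run before and after an index is added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", "HR"), (2, "Ravi", "IT"), (3, "Meena", "IT")],
)

# The application's query is written against the conceptual schema only.
QUERY = "SELECT name FROM employee WHERE dept = ? ORDER BY name"

rows_before = conn.execute(QUERY, ("IT",)).fetchall()

# Internal-level change: a new access path is created for performance.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")

rows_after = conn.execute(QUERY, ("IT",)).fetchall()
```

The query and its result are unchanged; only the way the DBMS may locate the rows is different.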

Data Independence (continued)


When a schema at a lower level is changed,
only the mappings between this schema
and higher-level schemas need to be
changed in a DBMS that fully supports data
independence.
The higher-level schemas themselves are
unchanged.
Hence, the application programs need not be
changed since they refer to the external
schemas.
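
The role of mappings can be sketched with a view acting as an external schema, again in Python with the standard sqlite3 module. All table, view, and column names are hypothetical. The underlying tables are restructured, the view definition (the mapping) is rewritten, and the application query stays the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept_name TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)", [(1, "Asha", "HR"), (2, "Ravi", "IT")]
)

# External schema: the application only ever sees this view.
conn.execute("CREATE VIEW staff_directory AS SELECT name, dept_name FROM employee")
APP_QUERY = "SELECT name, dept_name FROM staff_directory ORDER BY name"
result_before = conn.execute(APP_QUERY).fetchall()

# Schema change below the view: department names move into their own table.
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
conn.executemany("INSERT INTO department VALUES (?, ?)", [(10, "HR"), (20, "IT")])
conn.execute("CREATE TABLE employee_v2 (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER)")
conn.execute(
    """INSERT INTO employee_v2
       SELECT e.id, e.name, d.dept_id
       FROM employee e JOIN department d ON d.dept_name = e.dept_name"""
)

# Only the mapping (the view definition) is rewritten.
conn.execute("DROP VIEW staff_directory")
conn.execute("DROP TABLE employee")
conn.execute(
    """CREATE VIEW staff_directory AS
       SELECT e.name, d.dept_name
       FROM employee_v2 e JOIN department d ON d.dept_id = e.dept_id"""
)

result_after = conn.execute(APP_QUERY).fetchall()
```

Because the program refers only to staff_directory, it does not need to change when the tables beneath the view are reorganized.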

DBMS Languages
Data Definition Language (DDL)
Data Manipulation Language (DML)
High-Level or Non-procedural Languages:
These include the relational language SQL
May be used in a standalone way or may be
embedded in a programming language

Low Level or Procedural Languages:
These must be embedded in a programming
language

DBMS Languages
Data Definition Language (DDL):
Used by the DBA and database designers to
specify the conceptual schema of a database.
In many DBMSs, the DDL is also used to define
internal and external schemas (views).
In some DBMSs, separate storage definition
language (SDL) and view definition
language (VDL) are used to define internal
and external schemas.
SDL is typically realized via DBMS commands
provided to the DBA and database designers

DBMS Languages
Data Manipulation Language (DML):
Used to specify database retrievals and updates
DML commands (data sublanguage) can be
embedded in a general-purpose programming
language (host language), such as COBOL, C,
C++, or Java.
A library of functions can also be provided to access
the DBMS from a programming language

Alternatively, stand-alone DML commands can
be applied directly (called a query language).

Types of DML
High Level or Non-procedural
Language:
For example, the SQL relational language
Are set-oriented and specify what data to
retrieve rather than how to retrieve it.
Also called declarative languages.

Low Level or Procedural Language:
Retrieves data one record at a time.
Constructs such as looping are needed to
retrieve multiple records, along with
positioning pointers.
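
The two styles can be contrasted in a short Python sketch using the standard sqlite3 module. The account table and its rows are assumptions made for illustration, and the cursor loop only approximates a record-at-a-time DML, since SQLite itself is accessed through SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT, balance REAL)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [(1, "Asha", 500.0), (2, "Ravi", 50.0), (3, "Meena", 900.0)],
)

# Non-procedural (declarative): state WHAT is wanted; the DBMS decides how.
declarative = [row[0] for row in conn.execute(
    "SELECT owner FROM account WHERE balance > 100 ORDER BY owner")]

# Record-at-a-time: the program loops, advances a cursor, and filters itself.
cursor = conn.execute("SELECT owner, balance FROM account")
procedural = []
while True:
    record = cursor.fetchone()  # move to the next record
    if record is None:
        break                   # no more records
    owner, balance = record
    if balance > 100:
        procedural.append(owner)
procedural.sort()
```

Both approaches return the same owners, but the second one spells out the looping and positioning that the declarative query leaves to the DBMS.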

DBMS Interfaces
Stand-alone query language interfaces
Example: Entering SQL queries at the DBMS
interactive SQL interface (e.g. SQL*Plus in
ORACLE)

Programmer interfaces for embedding DML
in programming languages

User-friendly interfaces
Menu-based, forms-based, graphics-based, etc.

DBMS Programming Language Interfaces


Programmer interfaces for embedding DML
in a programming language:
Embedded Approach: e.g. embedded SQL (for
C, C++, etc.), SQLJ (for Java)
Procedure Call Approach: e.g. JDBC for Java,
ODBC for other programming languages
Database Programming Language
Approach: e.g. ORACLE has PL/SQL, a
programming language based on SQL;
language incorporates SQL and its data types as
integral components
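
The procedure call approach can be sketched in Python, whose standard sqlite3 module plays a role comparable to JDBC or ODBC: the program passes SQL text and parameters to library functions. The enrollment table and the find_students_by_course helper are hypothetical names chosen for this example.

```python
import sqlite3


def find_students_by_course(connection, course):
    """Call the DBMS through library functions, binding a host variable."""
    cur = connection.cursor()
    # The "?" placeholder is filled from the Python variable `course`.
    cur.execute(
        "SELECT name FROM enrollment WHERE course = ? ORDER BY name", (course,)
    )
    return [name for (name,) in cur.fetchall()]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrollment (name TEXT, course TEXT)")
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?)",
    [("Asha", "DB"), ("Ravi", "OS"), ("Meena", "DB")],
)

db_students = find_students_by_course(conn, "DB")
```

Unlike embedded SQL, no precompiler is involved here; the SQL statement is an ordinary string handed to the library at run time.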

User-Friendly DBMS Interfaces


Menu-based, popular for browsing on the
web
Forms-based, designed for naive users
Graphics-based
(Point and Click, Drag and Drop, etc.)

Natural language: requests in written English


Combinations of the above:
For example, both menus and forms used
extensively in Web database interfaces

Other DBMS Interfaces


Speech as Input and Output
Web Browser as an interface
Parametric interfaces, e.g., bank tellers
using function keys.
Interfaces for the DBA:
Creating user accounts, granting authorizations
Setting system parameters
Changing schemas or access paths

Database System Utilities


To perform certain functions such as:
Loading data stored in files into a database.
Includes data conversion tools.
Backing up the database periodically on tape.
Reorganizing database file structures.
Report generation utilities.
Performance monitoring utilities.
Other functions, such as sorting, user
monitoring, data compression, etc.

Other Tools
Data dictionary / repository:
Used to store schema descriptions and
other information such as design decisions,
application program descriptions, user
information, usage standards, etc.
Active data dictionary is accessed by DBMS
software and users/DBA.
Passive data dictionary is accessed by
users/DBA only.
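
As a rough illustration of schema descriptions kept by the DBMS itself, the Python sketch below reads SQLite's built-in catalog through the standard sqlite3 module. The course table is an assumption made for the example; sqlite_master and PRAGMA table_info are SQLite features, and they hold far less than a full data dictionary or repository would.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT NOT NULL, credits INTEGER)"
)

# Which tables exist? The catalog is queried like any other table.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# Column descriptions: name, declared type, and whether NOT NULL was declared.
columns = [
    (name, col_type, bool(notnull))
    for (_cid, name, col_type, notnull, _default, _pk)
    in conn.execute("PRAGMA table_info(course)")
]
```

The DBMS consults this catalog when it processes statements, and users can query it too, which is the sense in which such a dictionary is active.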

Other Tools
Application Development Environments
and CASE (computer-aided software
engineering) tools:
Examples:
PowerBuilder (Sybase)
JBuilder (Borland)
JDeveloper 10G (Oracle)

Typical DBMS Component Modules

A DBMS includes capabilities and tools for accessing
and managing data in a database, including:
Data definition language (DDL) or capability: Used to
specify the structure of the database content, creating
and defining tables and fields. DDL commands are:
CREATE
ALTER
TRUNCATE
DROP
RENAME

Data dictionary: An automated or manual file that
stores definitions of data elements and their
characteristics.

Data manipulation language (DML): A specialized
language, such as Structured Query Language, or
SQL, that is used to add, change, delete, and retrieve
the data in the database. Commands for entering new
rows, changing existing rows, and removing unwanted
rows from tables in the database are collectively known
as DML commands. They are:
SELECT
INSERT
UPDATE
DELETE
Data Control Language (DCL): Commands that give
or remove access rights to both the Oracle database
and the structures within it are collectively known
as DCL commands. They are:
GRANT
REVOKE
Other commands like COMMIT, ROLLBACK & SAVEPOINT manage the changes made by
DML statements. Changes to the data can be grouped together into logical transactions.
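
The command groups above can be tried out in a small Python sketch using the standard sqlite3 module. The item table and its values are assumptions made for illustration. SQLite has no GRANT or REVOKE, so DCL is not shown, and SAVEPOINT is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and then alter the structure of a table.
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, qty INTEGER)")
conn.execute("ALTER TABLE item ADD COLUMN price REAL")

# DML grouped into one logical transaction, made permanent with COMMIT.
conn.execute("INSERT INTO item (id, name, qty, price) VALUES (1, 'pen', 10, 1.5)")
conn.execute("UPDATE item SET qty = qty + 5 WHERE id = 1")
conn.commit()

# A second transaction that is undone with ROLLBACK.
conn.execute("DELETE FROM item WHERE id = 1")
conn.rollback()

# DML retrieval: the committed row is still present.
rows = conn.execute("SELECT id, name, qty, price FROM item").fetchall()
```

The delete never takes effect because the transaction containing it was rolled back, while the committed insert and update survive.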

Database functions
Data classification
Data dictionary
Data definition
Data organization
Data updating
Database file structuring
Data processing
Data retrieval
Data reporting
Database language

Example of an SQL query

Sample data dictionary report

An Access query
Shows the tables, fields, and selection criteria used for
the query.

In a large company, special capabilities and tools are required
for analyzing vast quantities of data and for accessing data
from multiple systems, such as:
Data warehouse: A database that stores current and
historical data from core operational transactional systems for
use in management analysis, but this data cannot be altered.

Data mart: A subset of a data warehouse in
which a summarized or highly focused portion
of the organization's data is placed in a
separate database for a specific population of
users.
Business intelligence (BI) tools: Data
analysis tools used for consolidating,
analyzing, and accessing vast stores of data
to help in decision making, such as software
for database query and reporting, tools for
multidimensional data analysis (online
analytical processing), and data mining.

A series of analytical tools works with data stored in databases to find
patterns and insights for helping managers and employees make better
decisions to improve organizational performance.
