
DATABASE MANAGEMENT SYSTEMS

TERM 2008-09

B. Tech II/IT II Semester

UNIT-I PPT SLIDES

Text Books: (1) DBMS by Raghu Ramakrishnan


(2) DBMS by Sudarshan and Korth
INDEX
UNIT-1 PPT SLIDES

S.NO  Module as per Session Planner               Lecture No  PPT Slide No
---------------------------------------------------------------------------
1.    DBS Applications and DBMS vs. File Systems  L1          L1-1 to L1-9
2.    View of Data                                L2          L2-1 to L2-7
3.    DB Languages (DML, DDL)                     L3          L3-1 to L3-6
4.    DB Users and Administrator                  L4          L4-1 to L4-3
5.    Data Storage and Querying                   L5          L5-1 to L5-5
6.    DBMS Architecture                           L6          L6-1 to L6-3
Database System Applications
• DBMS contains information about a particular enterprise
– Collection of interrelated data
– Set of programs to access the data
– An environment that is both convenient and efficient to use
• Database Applications:
– Banking: all transactions
– Airlines: reservations, schedules
– Universities: registration, grades
– Sales: customers, products, purchases
– Online retailers: order tracking, customized recommendations
– Manufacturing: production, inventory, orders, supply chain
– Human resources: employee records, salaries, tax deductions
• Databases touch all aspects of our lives

Slide No:L1-1
What Is a DBMS?

• A very large, integrated collection of data.


• Models real-world enterprise.
– Entities (e.g., students, courses)
– Relationships (e.g., Madonna is taking
CS564)
• A Database Management System (DBMS) is a
software package designed to store and
manage databases.

Slide No:L1-2
Why Use a DBMS?

• Data independence and efficient access.
• Reduced application development
time.
• Data integrity and security.
• Uniform data administration.
• Concurrent access, recovery from
crashes.

Slide No:L1-3
Why Study Databases??

• Shift from computation to information


– at the “low end”: scramble to webspace (a mess!)
– at the “high end”: scientific applications

• Datasets increasing in diversity and volume.


– Digital libraries, interactive video, Human
Genome project, EOS project
– ... need for DBMS exploding

• DBMS encompasses most of CS


– OS, languages, theory, AI, multimedia, logic

Slide No:L1-4
Files vs. DBMS

• Application must stage large datasets between main
memory and secondary storage (e.g., buffering,
page-oriented access, 32-bit addressing, etc.)
• Special code for different queries
• Must protect data from inconsistency
due to multiple concurrent users
• Crash recovery
• Security and access control

Slide No:L1-5
Purpose of Database Systems
• In the early days, database applications were built
directly on top of file systems
• Drawbacks of using file systems to store data:
– Data redundancy and inconsistency
• Multiple file formats, duplication of information in
different files
– Difficulty in accessing data
• Need to write a new program to carry out each new
task
– Data isolation — multiple files and formats
– Integrity problems
• Integrity constraints (e.g. account balance > 0)
become “buried” in program code rather than being
stated explicitly (see the sketch below)
• Hard to add new constraints or change existing
ones
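A hedged sketch of the declarative alternative (the account table is
taken from a later slide; the check clause is illustrative):

create table account (
account_number char(10),
branch_name char(10),
balance integer,
check (balance > 0))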
Slide No:L1-6
Purpose of Database Systems (Cont.)

• Drawbacks of using file systems (cont.)


– Atomicity of updates
• Failures may leave database in an inconsistent state
with partial updates carried out
• Example: Transfer of funds from one account to
another should either complete or not happen at all
– Concurrent access by multiple users
• Concurrent access needed for performance
• Uncontrolled concurrent accesses can lead to
inconsistencies
– Example: Two people reading a balance and
updating it at the same time
– Security problems
• Hard to provide user access to some, but not all, data
• Database systems offer solutions to all the above problems

Slide No:L1-7
Levels of Abstraction

• Physical level: describes how a record (e.g., customer) is stored.
• Logical level: describes data stored in database, and the
relationships among the data (see the SQL sketch below).
type customer = record
customer_id : string;
customer_name : string;
customer_street : string;
customer_city : string;
end;
• View level: application programs hide details of data
types. Views can also hide information (such as an
employee’s salary) for security purposes.
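As a hedged illustration (not in the original slide; column types are
assumptions), the logical-level customer record above corresponds to a
relational schema such as:

create table customer (
customer_id char(11),
customer_name char(20),
customer_street char(30),
customer_city char(30))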

Slide No:L1-8
Summary
• DBMS used to maintain, query large datasets.
• Benefits include recovery from system crashes,
concurrent access, quick application development,
data integrity and security.
• Levels of abstraction give data independence.
• A DBMS typically has a layered architecture.
• DBAs hold responsible jobs
and are well-paid! 
• DBMS R&D is one of the broadest,
most exciting areas in CS.

Slide No:L1-9
View of Data

An architecture for a database system

Slide No:L2-1
Instances and Schemas

• Similar to types and variables in programming languages
• Schema – the logical structure of the database
– Example: The database consists of information
about a set of customers and accounts and the
relationship between them
– Analogous to type information of a variable in a
program
– Physical schema: database design at the
physical level
– Logical schema: database design at the logical
level
Slide No:L2-2
Instances and Schemas
• Instance – the actual content of the database
at a particular point in time
– Analogous to the value of a variable
• Physical Data Independence – the ability to
modify the physical schema without changing
the logical schema
– Applications depend on the logical schema
– In general, the interfaces between the
various levels and components should be
well defined so that changes in some parts
do not seriously influence others.

Slide No:L2-3
Data Models

• A collection of tools for describing


– Data
– Data relationships
– Data semantics
– Data constraints
• Relational model
• Entity-Relationship data model (mainly for
database design)
• Object-based data models (Object-oriented and
Object-relational)
• Semi structured data model (XML)
• Other older models:
– Network model
– Hierarchical model

Slide No:L2-4
Data Models

• A data model is a collection of concepts for describing data.
• A schema is a description of a particular
collection of data, using a given data
model.
• The relational model of data is the most
widely used model today.
– Main concept: relation, basically a table with
rows and columns.
– Every relation has a schema, which describes
the columns, or fields.

Slide No:L2-5
Example: University Database
• Conceptual schema:
– Students(sid: string, name: string, login: string,
age: integer, gpa: real)
– Courses(cid: string, cname: string, credits: integer)
– Enrolled(sid:string, cid:string, grade:string)

• Physical schema:
– Relations stored as unordered files.
– Index on first column of Students.

• External Schema (View):
– Course_info(cid: string, enrollment: integer)
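A hedged sketch (the SQL is not in the slides) of how this external
schema could be realized as a view computed from Enrolled:

create view Course_info (cid, enrollment) as
select cid, count(*)
from Enrolled
group by cid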

Slide No:L2-6
Data Independence
• Applications insulated from how data
is structured and stored.
• Logical data independence: Protection
from changes in logical structure of
data.
• Physical data independence:
Protection from changes in physical
structure of data.
 One of the most important benefits of using a DBMS!

Slide No:L2-7
DATABASE LANGUAGES
Data Manipulation Language (DML)

• Language for accessing and manipulating the data
organized by the appropriate data model
– DML also known as query language
• Two classes of languages
– Procedural – user specifies what data is
required and how to get those data
– Declarative (nonprocedural) – user
specifies what data is required without
specifying how to get those data (see the example below)
• SQL is the most widely used query language
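A hedged example of a declarative DML request (the account table is
defined on the next slide; the query itself is illustrative): the user
states what is wanted, and the DBMS decides how to retrieve it.

select balance
from account
where account_number = 'A-101'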

Slide No:L3-1
Data Definition Language (DDL)
• Specification notation for defining the database schema
Example: create table account (
account_number char(10),
branch_name char(10),
balance integer)
• DDL compiler generates a set of tables stored in a data
dictionary
• Data dictionary contains metadata (i.e., data about data)
– Database schema
– Data storage and definition language
• Specifies the storage structure and access methods used
– Integrity constraints
• Domain constraints
• Referential integrity (e.g. branch_name must correspond
to a valid branch in the branch table; see the sketch below)
– Authorization
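A hedged sketch of declaring the referential-integrity constraint above
in DDL (the branch table and the references shorthand are assumptions):

create table account (
account_number char(10),
branch_name char(10) references branch,
balance integer)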

Slide No:L3-2
Relational Model

• Example of tabular data in the relational model

[Figure: a sample relation shown as a table, with its columns labeled as Attributes]

Slide No:L3-3
A Sample Relational Database

Slide No:L3-4
SQL
• SQL: widely used non-procedural language
– Example: Find the name of the customer with
customer-id 192-83-7465
select customer.customer_name
from customer
where customer.customer_id = ‘192-83-7465’
– Example: Find the balances of all accounts held by
the customer with customer-id 192-83-7465
select account.balance
from depositor, account
where depositor.customer_id = ‘192-83-7465’
and
depositor.account_number =
account.account_number

Slide No:L3-5
SQL

• Application programs generally access databases


through one of
– Language extensions to allow embedded SQL
– Application program interface (e.g., ODBC/JDBC)
which allow SQL queries to be sent to a database

Slide No:L3-6
Database Users

Users are differentiated by the way they expect to interact with


the system
• Application programmers – interact with system through
DML calls
• Sophisticated users – form requests in a database query
language
• Specialized users – write specialized database applications
that do not fit into the traditional data processing framework
• Naïve users – invoke one of the permanent application
programs that have been written previously
– Examples: people accessing database over the web, bank
tellers, clerical staff

Slide No:L4-1
Database Administrator

• Coordinates all the activities of the database system


– has a good understanding of the enterprise’s
information resources and needs.
• Database administrator's duties include:
– Storage structure and access method definition
– Schema and physical organization modification
– Granting users authority to access the database
– Backing up data
– Monitoring performance and responding to
changes
• Database tuning

Slide No:L4-2
Data storage and Querying

• Storage management
• Query processing
• Transaction processing

Slide No:L5-1
Storage Management

• Storage manager is a program module that provides the
interface between the low-level data stored in the
database and the application programs and queries
submitted to the system.
• The storage manager is responsible for the following
tasks:
– Interaction with the file manager
– Efficient storing, retrieving and updating of data
• Issues:
– Storage access
– File organization
– Indexing and hashing

Slide No:L5-2
Query Processing

1. Parsing and translation
2. Optimization
3. Evaluation

Slide No:L5-3
Query Processing (Cont.)

• Alternative ways of evaluating a given query


– Equivalent expressions
– Different algorithms for each operation
• Cost difference between a good and a bad way of
evaluating a query can be enormous
• Need to estimate the cost of operations
– Depends critically on statistical information about
relations which the database must maintain
– Need to estimate statistics for intermediate
results to compute cost of complex expressions

Slide No:L5-4
Transaction Management

• A transaction is a collection of operations that performs a
single logical function in a database application
• Transaction-management component
ensures that the database remains in a
consistent (correct) state despite system
failures (e.g., power failures and operating
system crashes) and transaction failures.
• Concurrency-control manager controls the
interaction among the concurrent transactions,
to ensure the consistency of the database.
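A hedged sketch of the funds-transfer transaction described above
(syntax for starting and ending transactions varies by system; the
account numbers are illustrative):

begin transaction;
update account set balance = balance - 100
where account_number = 'A-101';
update account set balance = balance + 100
where account_number = 'A-215';
commit;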

Slide No:L5-5
Database Architecture

The architecture of a database system is greatly influenced by
the underlying computer system on which the database is
running:
• Centralized
• Client-server
• Parallel (multiple processors and disks)
• Distributed

Slide No:L6-1
Overall System Structure

Slide No:L6-2
Database Application Architectures

[Figure: old vs. modern database application architectures; modern applications reach the database through a web browser]

Slide No:L6-3
Database Design

• Conceptual design: (ER Model is used at this stage.)


– What are the entities and relationships in the
enterprise?
– What information about these entities and
relationships should we store in the database?
– What are the integrity constraints or business rules
that hold?
– A database `schema’ in the ER Model can be
represented pictorially (ER diagrams).
– Can map an ER diagram into a relational schema.
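A hedged sketch of such a mapping, using the Employees/Works_In/
Departments diagram from later slides (column types are assumptions):

create table Employees (ssn char(11) primary key,
name char(20), lot integer)
create table Departments (did integer primary key,
dname char(20), budget real)
create table Works_In (
ssn char(11) references Employees,
did integer references Departments,
since date,
primary key (ssn, did))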

Slide No:L2-1
Modeling
• A database can be modeled as:
– a collection of entities,
– relationship among entities.
• An entity is an object that exists and is
distinguishable from other objects.
– Example: specific person, company, event, plant
• Entities have attributes
– Example: people have names and addresses
• An entity set is a set of entities of the same type
that share the same properties.
– Example: set of all persons, companies, trees,
holidays

Slide No:L2-2
Entity Sets customer and loan
[Figure: entity set customer with attributes customer_id, customer_name, customer_street, customer_city; entity set loan with attributes loan_number, amount]

Slide No:L2-3
Attributes
• An entity is represented by a set of attributes, that is
descriptive properties possessed by all members of an
entity set.
Example:
customer = (customer_id, customer_name,
customer_street, customer_city )
loan = (loan_number, amount )
• Domain – the set of permitted values for each attribute
• Attribute types:
– Simple and composite attributes.
– Single-valued and multi-valued attributes
• Example: multivalued attribute: phone_numbers
– Derived attributes
• Can be computed from other attributes
• Example: age, given date_of_birth

Slide No:L2-4
Composite Attributes

Slide No:L2-5
Mapping Cardinality Constraints

• Express the number of entities to which another entity
can be associated via a relationship set.
• Most useful in describing binary relationship sets.
• For a binary relationship set the mapping
cardinality must be one of the following types:
– One to one
– One to many
– Many to one
– Many to many

Slide No:L2-6
Mapping Cardinalities

One to one One to many


Note: Some elements in A and B may not be mapped to any
elements in the other set

Slide No:L2-7
Mapping Cardinalities

Many to one Many to many


Note: Some elements in A and B may not be mapped to any
elements in the other set

Slide No:L2-8
ER Model Basics

[Diagram: entity set Employees with attributes ssn, name, lot]

• Entity: Real-world object distinguishable from other objects.
An entity is described (in DB) using a set of attributes.
• Entity Set: A collection of similar entities. E.g., all
employees.
– All entities in an entity set have the same set of
attributes. (Until we consider ISA hierarchies,
anyway!)
– Each entity set has a key.
– Each attribute has a domain.
Slide No:L2-9
ER Model Basics (Contd.)
[Diagram: Employees Works_In Departments, with attributes ssn, name, lot, did, dname, budget, since; Employees also participates twice in Reports_To, in supervisor and subordinate roles]

• Relationship: Association among two or more entities. E.g.,
Attishoo works in Pharmacy department.
• Relationship Set: Collection of similar relationships.
– An n-ary relationship set R relates n entity sets E1 ... En;
each relationship in R involves entities e1 ∈ E1, ..., en ∈ En
• Same entity set could participate in different
relationship sets, or in different “roles” in same set.

Slide No:L2-10
Relationship Sets
• A relationship is an association among several
entities
Example:
Hayes (customer entity), depositor (relationship set), A-102 (account entity)
• A relationship set is a mathematical relation among
n ≥ 2 entities, each taken from entity sets E1, ..., En:
{(e1, e2, ..., en) | e1 ∈ E1, e2 ∈ E2, ..., en ∈ En}

where (e1, e2, …, en) is a relationship


– Example:
(Hayes, A-102) ∈ depositor

Slide No:L3-1
Relationship Set borrower

Slide No:L3-2
Relationship Sets (Cont.)
• An attribute can also be property of a relationship set.
• For instance, the depositor relationship set between entity sets
customer and account may have the attribute access-date

Slide No:L3-3
Degree of a Relationship Set

• Refers to number of entity sets that participate in a
relationship set.
• Relationship sets that involve two entity
sets are binary (or degree two).
Generally, most relationship sets in a
database system are binary.
• Relationship sets may involve more than
two entity sets.

Slide No:L3-4
Degree of a Relationship Set
Example: Suppose employees of a bank
may have jobs (responsibilities) at
multiple branches, with different jobs at
different branches. Then there is a
ternary relationship set between entity
sets employee, job, and branch
• Relationships between more than two entity sets
are rare. Most relationships are binary. (More on
this later.)

Slide No:L3-5
Additional Features of the ER Model: Key Constraints

[Diagram: Employees Manages Departments, with attributes ssn, name, lot, did, dname, budget, since; legend: 1-to-1, 1-to-many, many-to-1, many-to-many]

• Consider Works_In: An employee can work in many
departments; a dept can have many employees.
• In contrast, each dept has at most one manager,
according to the key constraint on Manages.
Slide No:L4-1
Participation Constraints
• Does every department have a manager?
– If so, this is a participation constraint: the participation
of Departments in Manages is said to be total (vs.
partial).
• Every Departments entity must appear in an instance
of the Manages relationship.

[Diagram: Employees participates in Manages and in Works_In with Departments; both relationship sets carry the attribute since]

Slide No:L4-2
Weak Entities
• A weak entity can be identified uniquely only by considering the
primary key of another (owner) entity.
– Owner entity set and weak entity set must participate in a
one-to-many relationship set (one owner, many weak entities).
– Weak entity set must have total participation in this
identifying relationship set.

[Diagram: Employees (ssn, name, lot), Policy relationship (cost), weak entity set Dependents (pname, age)]

Slide No:L4-3
Weak Entity Sets
• An entity set that does not have a primary key is referred to
as a weak entity set.
• The existence of a weak entity set depends on the existence of
an identifying entity set
– it must relate to the identifying entity set via a total, one-
to-many relationship set from the identifying to the weak
entity set
– Identifying relationship depicted using a double diamond
• The discriminator (or partial key) of a weak entity set is the
set of attributes that distinguishes among all the entities of a
weak entity set.
• The primary key of a weak entity set is formed by the primary
key of the strong entity set on which the weak entity set is
existence dependent, plus the weak entity set’s discriminator.

Slide No:L4-4
Weak Entity Sets (Cont.)
• We depict a weak entity set by double rectangles.
• We underline the discriminator of a weak entity set with a
dashed line.
• payment_number – discriminator of the payment entity set
• Primary key for payment – (loan_number, payment_number)
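A hedged SQL sketch of the weak entity set payment (column types are
assumptions; the composite primary key follows the rule above):

create table payment (
loan_number char(10) references loan,
payment_number integer,
primary key (loan_number, payment_number))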

Slide No:L4-5
Weak Entity Sets (Cont.)

• Note: the primary key of the strong entity set is not
explicitly stored with the weak entity set, since it is
implicit in the identifying relationship.
• If loan_number were explicitly stored, payment
could be made a strong entity, but then the
relationship between payment and loan would be
duplicated by an implicit relationship defined by
the attribute loan_number common to payment
and loan

Slide No:L4-6
More Weak Entity Set Examples

• In a university, a course is a strong entity and a
course_offering can be modeled as a weak entity
• The discriminator of course_offering would be semester
(including year) and section_number (if there is more than one
section)
• If we model course_offering as a strong entity we would model
course_number as an attribute.
Then the relationship with course would be implicit in the
course_number attribute

Slide No:L4-7
ISA (`is a’) Hierarchies

[Diagram: Employees (ssn, name, lot) with ISA subclasses Hourly_Emps (hourly_wages, hours_worked) and Contract_Emps (contractid)]

• As in C++, or other PLs, attributes are inherited.
• If we declare A ISA B, every A entity is also considered
to be a B entity.
• Overlap constraints: Can Joe be an Hourly_Emps as well as a
Contract_Emps entity? (Allowed/disallowed)
• Covering constraints: Does every Employees entity also have to
be an Hourly_Emps or a Contract_Emps entity? (Yes/no)
• Reasons for using ISA:
– To add descriptive attributes specific to a subclass.
– To identify entities that participate in a relationship.

Slide No:L5-1
Aggregation

[Diagram: Employees (ssn, name, lot) Monitors (until) the Sponsors (since) relationship between Projects (pid, started_on, pbudget) and Departments (did, dname, budget)]

• Used when we have to model a relationship involving
(entity sets and) a relationship set.
– Aggregation allows us to treat a relationship set as
an entity set for purposes of participation in (other)
relationships.
• Aggregation vs. ternary relationship:
– Monitors is a distinct relationship, with a descriptive
attribute.
– Also, can say that each sponsorship is monitored by at
most one employee.
Slide No:L5-2
Aggregation
Consider the ternary relationship works_on, which we
saw earlier
• Suppose we want to record managers for tasks
performed by an employee at a branch

Slide No:L5-3
Aggregation (Cont.)
• Relationship sets works_on and manages represent
overlapping information
– Every manages relationship corresponds to a works_on
relationship
– However, some works_on relationships may not
correspond to any manages relationships
• So we can’t discard the works_on relationship
• Eliminate this redundancy via aggregation
– Treat relationship as an abstract entity
– Allows relationships between relationships
– Abstraction of relationship into new entity

Slide No:L5-4
Aggregation (Cont.)
• Eliminate this redundancy via aggregation
– Treat relationship as an abstract entity
– Allows relationships between relationships
– Abstraction of relationship into new entity
• Without introducing redundancy, the following diagram
represents:
– An employee works on a particular job at a particular
branch
– An employee, branch, job combination may have an
associated manager

Slide No:L5-5
E-R Diagram With Aggregation

Slide No:L5-6
Conceptual Design Using the ER Model

• Design choices:
– Should a concept be modeled as an entity or an
attribute?
– Should a concept be modeled as an entity or a
relationship?
– Identifying relationships: Binary or ternary?
Aggregation?
• Constraints in the ER Model:
– A lot of data semantics can (and should) be captured.
– But some constraints cannot be captured in ER
diagrams.

Slide No:L6-1
Entity vs. Attribute
• Should address be an attribute of Employees or an entity
(connected to Employees by a relationship)?
• Depends upon the use we want to make of address
information, and the semantics of the data:
• If we have several addresses per employee, address
must be an entity (since attributes cannot be set-
valued).
• If the structure (city, street, etc.) is important, e.g.,
we want to retrieve employees in a given city,
address must be modeled as an entity (since
attribute values are atomic).

Slide No:L6-2
Entity vs. Attribute (Contd.)
• Works_In4 does not allow an employee to work in a
department for two or more periods.
• Similar to the problem of wanting to record several
addresses for an employee: We want to record several
values of the descriptive attributes for each instance of
this relationship. Accomplished by introducing new
entity set, Duration.

[Diagrams: Employees Works_In4 Departments with attributes from and to on Works_In4; revised design in which Works_In4 relates Employees, Departments, and a new entity set Duration (from, to)]

Slide No:L6-3
Entity vs. Relationship

• First ER diagram OK if a manager gets a separate
discretionary budget (dbudget) for each dept.
• What if a manager gets a discretionary budget that
covers all managed depts?
– Redundancy: dbudget stored for each dept
managed by manager.
– Misleading: Suggests dbudget associated
with department-mgr combination.

[Diagrams: Employees Manages2 Departments with attributes since and dbudget on Manages2; revised design in which Managers ISA Employees carries dbudget, which fixes the problem!]
Slide No:L6-4
Binary vs. Ternary Relationships

• If each policy is owned by just 1 employee, and each
dependent is tied to the covering policy, the first
diagram is inaccurate.
• What are the additional constraints in the 2nd diagram?

[Diagrams: Bad design: a ternary Covers relating Employees (ssn, name, lot), Dependents (pname, age), and Policies (policyid, cost). Better design: binary relationship sets Purchaser (Employees to Policies) and Beneficiary (Dependents to Policies)]
Slide No:L6-5
Binary vs. Ternary Relationships (Contd.)

• Previous example illustrated a case when two binary
relationships were better than one ternary relationship.
• An example in the other direction: a ternary relation
Contracts relates entity sets Parts, Departments and
Suppliers, and has descriptive attribute qty. No
combination of binary relationships is an adequate
substitute:
– S “can-supply” P, D “needs” P, and D “deals-with” S
does not imply that D has agreed to buy P from S.
– How do we record qty?

Slide No:L6-6
Summary of Conceptual Design

• Conceptual design follows requirements analysis,


– Yields a high-level description of data to be stored

• ER model popular for conceptual design


– Constructs are expressive, close to the way people think
about their applications.
• Basic constructs: entities, relationships, and attributes (of
entities and relationships).
• Some additional constructs: weak entities, ISA hierarchies,
and aggregation.
• Note: There are many variations on ER model.

Slide No:L7-1
Summary of ER (Contd.)

• Several kinds of integrity constraints can be expressed in the
ER model: key constraints, participation constraints, and
overlap/covering constraints for ISA hierarchies. Some foreign
key constraints are also implicit in the definition of a
relationship set.
– Some constraints (notably, functional dependencies) cannot
be expressed in the ER model.
– Constraints play an important role in determining the best
database design for an enterprise.

Slide No:L7-2
Summary of ER (Contd.)

• ER design is subjective. There are often many ways to
model a given scenario! Analyzing alternatives can be
tricky, especially for a large enterprise. Common choices
include:
– Entity vs. attribute, entity vs. relationship, binary or n-
ary relationship, whether or not to use ISA hierarchies,
and whether or not to use aggregation.
• Ensuring good database design: resulting relational
schema should be analyzed and refined further. FD
information and normalization techniques are especially
useful.

Slide No:L7-3
Views

• A view is just a relation, but we store a definition, rather
than a set of tuples.

CREATE VIEW YoungActiveStudents (name, grade)
AS SELECT S.name, E.grade
FROM Students S, Enrolled E
WHERE S.sid = E.sid and S.age<21

• Views can be dropped using the DROP VIEW command.
• How to handle DROP TABLE if there’s a view on the table?
– DROP TABLE command has options to let the user
specify this.

Slide No:L5-1
Views and Security

• Views can be used to present necessary information (or a
summary), while hiding details in underlying relation(s).
– Given YoungActiveStudents, but not Students or Enrolled,
we can find students who are enrolled, but not the cid’s
of the courses they are enrolled in.

Slide No:L5-2
View Definition
• A relation that is not of the conceptual model but is
made visible to a user as a “virtual relation” is called
a view.
• A view is defined using the create view statement
which has the form

create view v as < query expression >


where <query expression> is any legal SQL
expression. The view name is represented by v.
• Once a view is defined, the view name can be used to
refer to the virtual relation that the view generates.

Slide No:L5-3
Example Queries
• A view consisting of branches and their customers
create view all_customer as
(select branch_name, customer_name
from depositor, account
where depositor.account_number =
account.account_number )
union
(select branch_name, customer_name
from borrower, loan
where borrower.loan_number = loan.loan_number )

• Find all customers of the Perryridge branch

select customer_name
from all_customer
where branch_name = 'Perryridge'

Slide No:L5-4
Uses of Views
• Hiding some information from some users
– Consider a user who needs to know a customer’s name,
loan number and branch name, but has no need to see the
loan amount.
– Define a view
(create view cust_loan_data as
select customer_name, borrower.loan_number,
branch_name
from borrower, loan
where borrower.loan_number = loan.loan_number )
– Grant the user permission to read cust_loan_data, but not
borrower or loan (see the sketch below)
• Predefined queries to make writing of other queries easier
– Common example: Aggregate queries used for statistical
analysis of data
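A hedged sketch of the grant step above (the user name clerk is a
hypothetical placeholder):

grant select on cust_loan_data to clerk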

Slide No:L5-5
Processing of Views
• When a view is created
– the query expression is stored in the database along with the
view name
– the expression is substituted into any query using the view
• View definitions containing views
– One view may be used in the expression defining another view
– A view relation v1 is said to depend directly on a view relation
v2 if v2 is used in the expression defining v1
– A view relation v1 is said to depend on view relation v2 if either
v1 depends directly on v2 or there is a path of dependencies
from v1 to v2
– A view relation v is said to be recursive if it depends on itself.

Slide No:L5-6
View Expansion
• A way to define the meaning of views defined in terms of
other views.
• Let view v1 be defined by an expression e1 that may itself
contain uses of view relations.
• View expansion of an expression repeats the following
replacement step:
repeat
Find any view relation vi in e1
Replace the view relation vi by the expression defining vi
until no more view relations are present in e1
• As long as the view definitions are not recursive, this loop
will terminate

Slide No:L5-7
With Clause
• The with clause provides a way of defining a
temporary view whose definition is available only to
the query in which the with clause occurs.
• Find all accounts with the maximum balance

with max_balance (value) as
select max (balance)
from account
select account_number
from account, max_balance
where account.balance = max_balance.value

Slide No:L5-8
Complex Queries using With Clause
• Find all branches where the total account deposit is greater
than the average of the total account deposits at all branches.

with branch_total (branch_name, value) as
select branch_name, sum (balance)
from account
group by branch_name
with branch_total_avg (value) as
select avg (value)
from branch_total
select branch_name
from branch_total, branch_total_avg
where branch_total.value >= branch_total_avg.value

• Note: the exact syntax supported by your database may
vary slightly.
– E.g. Oracle syntax is of the form
with branch_total as ( select .. ),
branch_total_avg as ( select .. )
select …

Slide No:L5-9
Update of a View
• Create a view of all loan data in the loan relation,
hiding the amount attribute
create view loan_branch as
select loan_number, branch_name
from loan
• Add a new tuple to loan_branch
insert into loan_branch
values ('L-37', 'Perryridge')
This insertion must be represented by the insertion
of the tuple
('L-37', 'Perryridge', null )
into the loan relation

Slide No:L5-10
Formal Relational Query Languages

• Two mathematical Query Languages form the basis for
“real” languages (e.g. SQL), and for implementation:
– Relational Algebra: More operational, very useful for
representing execution plans.
– Relational Calculus: Lets users describe what they
want, rather than how to compute it. (Non-
operational, declarative.)

Slide No:L6-1
Preliminaries

• A query is applied to relation instances, and the result of a
query is also a relation instance.
– Schemas of input relations for a query are fixed (but query
will run regardless of instance!)
– The schema for the result of a given query is also fixed!
Determined by definition of query language constructs.
• Positional vs. named-field notation:
– Positional notation easier for formal definitions, named-
field notation more readable.
– Both used in SQL

Slide No:L6-2
Example Instances
• “Sailors” and “Reserves” relations for our examples.
• We’ll use positional or named field notation, assume that
names of fields in query results are `inherited’ from names
of fields in query input relations.

R1:
sid  bid  day
22   101  10/10/96
58   103  11/12/96

S1:
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0

S2:
sid  sname   rating  age
28   yuppy   9       35.0
31   lubber  8       55.5
44   guppy   5       35.0
58   rusty   10      35.0
Slide No:L6-3
Relational Algebra

• Basic operations:
– Selection (σ): selects a subset of rows from a relation.
– Projection (π): deletes unwanted columns from a relation.
– Cross-product (×): allows us to combine two relations.
– Set-difference (−): tuples in reln. 1, but not in reln. 2.
– Union (∪): tuples in reln. 1 and in reln. 2.
• Additional operations:
– Intersection, join, division, renaming: Not essential, but (very!)
useful.
• Since each operation returns a relation, operations can be composed!
(Algebra is “closed”.)

Slide No:L6-4
Projection
• Deletes attributes that are not in projection list.
• Schema of result contains exactly the fields in the
projection list, with the same names that they had in the
(only) input relation.
• Projection operator has to eliminate duplicates! (Why??)
– Note: real systems typically don’t do duplicate
elimination unless the user explicitly asks for it. (Why not?)

π sname,rating(S2):
sname   rating
yuppy   9
lubber  8
guppy   5
rusty   10

π age(S2):
age
35.0
55.5
Slide No:L6-5
Selection
• Selects rows that satisfy selection condition.
• No duplicates in result! (Why?)
• Schema of result identical to schema of (only) input
relation.
• Result relation can be the input for another relational
algebra operation! (Operator composition.)

σ rating>8(S2):
sid  sname  rating  age
28   yuppy  9       35.0
58   rusty  10      35.0

π sname,rating(σ rating>8(S2)):
sname  rating
yuppy  9
rusty  10
Slide No:L6-6
Union, Intersection, Set-Difference
• All of these operations take two input relations, which must
be union-compatible:
– Same number of fields.
– `Corresponding’ fields have the same type.
• What is the schema of result?

S1 ∪ S2:
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0
44   guppy   5       35.0
28   yuppy   9       35.0

S1 ∩ S2:
sid  sname   rating  age
31   lubber  8       55.5
58   rusty   10      35.0

S1 − S2:
sid  sname   rating  age
22   dustin  7       45.0
Slide No:L6-7
Cross-Product
• Each row of S1 is paired with each row of R1.
• Result schema has one field per field of S1 and R1, with
field names `inherited’ if possible.
– Conflict: Both S1 and R1 have a field called sid.

S1 × R1:
(sid)  sname   rating  age   (sid)  bid  day
22     dustin  7       45.0  22     101  10/10/96
22     dustin  7       45.0  58     103  11/12/96
31     lubber  8       55.5  22     101  10/10/96
31     lubber  8       55.5  58     103  11/12/96
58     rusty   10      35.0  22     101  10/10/96
58     rusty   10      35.0  58     103  11/12/96

• Renaming operator: ρ(C(1 → sid1, 5 → sid2), S1 × R1)

Slide No:L6-8
Joined Relations**
• Join operations take two relations and return as a
result another relation.
• These additional operations are typically used as
subquery expressions in the from clause
• Join condition – defines which tuples in the two
relations match, and what attributes are present in
the result of the join.
• Join type – defines how tuples in each relation that
do not match any tuple in the other relation (based
on the join condition) are treated.

Slide No:L7-5
Joined Relations – Datasets for Examples
• Relation borrower
• Relation loan
[Figure: sample instances of the borrower and loan relations]
• Note: borrower information missing for L-260 and loan
information missing for L-155

Slide No:L8-1
Joined Relations – Examples
• loan inner join borrower on
loan.loan_number = borrower.loan_number

• loan left outer join borrower on
loan.loan_number = borrower.loan_number

Slide No:L8-2
Joined Relations – Examples
• loan natural inner join borrower

• loan natural right outer join borrower
• Find all customers who have either an account or a loan (but not both) at the bank:

select customer_name
from (depositor natural full outer join borrower )
where account_number is null or loan_number is null
Slide No:L8-3
Joined Relations – Examples
• Natural join can get into trouble if two relations have an
attribute with the same name that should not affect the
join condition
– e.g. an attribute such as remarks may be present in
many tables
• Solution:
– loan full outer join borrower using (loan_number)

Slide No:L8-4
Division

• Not supported as a primitive operator, but useful for
expressing queries like:
Find sailors who have reserved all boats.
• Let A have 2 fields, x and y; B have only field y:
– A/B = { ⟨x⟩ | ∃ ⟨x, y⟩ ∈ A, ∀ ⟨y⟩ ∈ B }
– i.e., A/B contains all x tuples (sailors) such that for
every y tuple (boat) in B, there is an xy tuple in A.
– Or: If the set of y values (boats) associated with an x
value (sailor) in A contains all y values in B, the x value
is in A/B.
• In general, x and y can be any lists of fields; y is the list of
fields in B, and x ∪ y is the list of fields of A.
Slide No:L6-11
Examples of Division A/B

A:
sno  pno
s1   p1
s1   p2
s1   p3
s1   p4
s2   p1
s2   p2
s3   p2
s4   p2
s4   p4

B1: pno = {p2}          A/B1: sno = {s1, s2, s3, s4}
B2: pno = {p2, p4}      A/B2: sno = {s1, s4}
B3: pno = {p1, p2, p4}  A/B3: sno = {s1}
Slide No:L6-12
Expressing A/B Using Basic Operators

• Division is not an essential op; just a useful shorthand.
– (Also true of joins, but joins are so common that systems
implement joins specially.)
• Idea: For A/B, compute all x values that are not `disqualified’
by some y value in B.
– x value is disqualified if by attaching y value from B, we
obtain an xy tuple that is not in A.

Disqualified x values: π x((π x(A) × B) − A)
A/B: π x(A) − all disqualified tuples
Slide No:L6-13
Find names of sailors who’ve reserved boat #103

• Solution 1: π sname(σ bid=103(Reserves) ⋈ Sailors)
• Solution 2: ρ(Temp1, σ bid=103(Reserves))
ρ(Temp2, Temp1 ⋈ Sailors)
π sname(Temp2)
• Solution 3: π sname(σ bid=103(Reserves ⋈ Sailors))

Slide No:L6-14
Find names of sailors who’ve reserved a red boat
• Information about boat color only available in Boats; so
need an extra join:
π sname((σ color='red'(Boats)) ⋈ Reserves ⋈ Sailors)
• A more efficient solution:
π sname(π sid((π bid(σ color='red'(Boats))) ⋈ Reserves) ⋈ Sailors)

• A query optimizer can find this, given the first solution!

Slide No:L6-15
Find sailors who’ve reserved a red or a green boat
• Can identify all red or green boats, then find sailors who’ve
reserved one of these boats:
ρ(Tempboats, (σ color='red' ∨ color='green'(Boats)))
π sname(Tempboats ⋈ Reserves ⋈ Sailors)
• Can also define Tempboats using union! (How?)
• What happens if ∨ is replaced by ∧ in this query?
Slide No:L6-16
Find sailors who’ve reserved a red and a green boat
• Previous approach won’t work! Must identify sailors who’ve
reserved red boats, sailors who’ve reserved green boats, then
find the intersection (note that sid is a key for Sailors):
ρ(Tempred, π sid((σ color='red'(Boats)) ⋈ Reserves))
ρ(Tempgreen, π sid((σ color='green'(Boats)) ⋈ Reserves))
π sname((Tempred ∩ Tempgreen) ⋈ Sailors)
Slide No:L6-17
Relational Calculus

• Comes in two flavors: Tuple relational calculus (TRC)


and Domain relational calculus (DRC).
• Calculus has variables, constants, comparison ops,
logical connectives and quantifiers.
– TRC: Variables range over (i.e., get bound to) tuples.
– DRC: Variables range over domain elements (= field
values).
– Both TRC and DRC are simple subsets of first-order
logic.
• Expressions in the calculus are called formulas. An
answer tuple is essentially an assignment of constants
to variables that make the formula evaluate to true.

Slide No:L7-1
Domain Relational Calculus

• Query has the form:
{ ⟨x1, x2, ..., xn⟩ | p(x1, x2, ..., xn) }
• Answer includes all tuples ⟨x1, x2, ..., xn⟩ that make the
formula p(x1, x2, ..., xn) be true.
• Formula is recursively defined, starting with simple atomic
formulas (getting tuples from relations or making
comparisons of values), and building bigger and better
formulas using the logical connectives.
Slide No:L7-2
DRC Formulas

• Atomic formula:
– ⟨x1, x2, ..., xn⟩ ∈ Rname, or X op Y, or X op constant
– op is one of <, >, =, ≤, ≥, ≠
• Formula:
– an atomic formula, or
– ¬p, p ∧ q, p ∨ q, where p and q are formulas, or
– ∃X(p(X)), where variable X is free in p(X), or
– ∀X(p(X)), where variable X is free in p(X)
• The use of quantifiers ∃X and ∀X is said to bind X.
– A variable that is not bound is free.

Slide No:L7-3
Free and Bound Variables

• The use of quantifiers ∃X and ∀X in a formula is said to
bind X.
– A variable that is not bound is free.

• Let us revisit the definition of a query:
{ ⟨x1, x2, ..., xn⟩ | p(x1, x2, ..., xn) }
• There is an important restriction: the variables x1, ..., xn
that appear to the left of `|’ must be the only free
variables in the formula p(...).

Slide No:L8-1
Find all sailors with a rating above 7




{ ⟨I, N, T, A⟩ | ⟨I, N, T, A⟩ ∈ Sailors ∧ T > 7 }

• The condition ⟨I, N, T, A⟩ ∈ Sailors ensures that the domain
variables I, N, T and A are bound to fields of the same
Sailors tuple.
• The term ⟨I, N, T, A⟩ to the left of `|’ (which should be read
as such that) says that every tuple ⟨I, N, T, A⟩ that satisfies
T > 7 is in the answer.
• Modify this query to answer:
– Find sailors who are older than 18 or have a rating
under 9, and are called ‘Joe’.

Slide No:L8-2
Find sailors rated > 7 who have reserved boat #103




{ ⟨I, N, T, A⟩ | ⟨I, N, T, A⟩ ∈ Sailors ∧ T > 7 ∧
∃Ir, Br, D (⟨Ir, Br, D⟩ ∈ Reserves ∧ Ir = I ∧ Br = 103) }

• We have used ∃Ir, Br, D (. . .) as a shorthand for
∃Ir (∃Br (∃D (. . .)))
• Note the use of ∃ to find a tuple in Reserves that `joins with’ the
Sailors tuple under consideration.

Slide No:L8-3
Find sailors rated > 7 who’ve reserved a red boat





{ ⟨I, N, T, A⟩ | ⟨I, N, T, A⟩ ∈ Sailors ∧ T > 7 ∧
∃Ir, Br, D (⟨Ir, Br, D⟩ ∈ Reserves ∧ Ir = I ∧
∃B, BN, C (⟨B, BN, C⟩ ∈ Boats ∧ B = Br ∧ C = ‘red’)) }

• Observe how the parentheses control the scope of each quantifier’s
binding.
• This may look cumbersome, but with a good user interface, it is very
intuitive. (MS Access, QBE)

Slide No:L8-4
Find sailors who’ve reserved all boats





{ ⟨I, N, T, A⟩ | ⟨I, N, T, A⟩ ∈ Sailors ∧
∀B, BN, C (¬(⟨B, BN, C⟩ ∈ Boats) ∨
∃Ir, Br, D (⟨Ir, Br, D⟩ ∈ Reserves ∧ I = Ir ∧ Br = B)) }

• Find all sailors I such that for each 3-tuple ⟨B, BN, C⟩ either it is not
a tuple in Boats or there is a tuple in Reserves showing that sailor I has
reserved it.

Slide No:L8-5
Find sailors who’ve reserved all boats (again!)





{ ⟨I, N, T, A⟩ | ⟨I, N, T, A⟩ ∈ Sailors ∧
∀⟨B, BN, C⟩ ∈ Boats
(∃⟨Ir, Br, D⟩ ∈ Reserves (I = Ir ∧ Br = B)) }

• Simpler notation, same query. (Much clearer!)

• To find sailors who’ve reserved all red boats:
..... ∀⟨B, BN, C⟩ ∈ Boats
(C ≠ ‘red’ ∨ ∃⟨Ir, Br, D⟩ ∈ Reserves (I = Ir ∧ Br = B))

Slide No:L8-6
Unsafe Queries, Expressive Power

• It is possible to write syntactically correct calculus
queries that have an infinite number of answers!
Such queries are called unsafe.
– e.g., { S | ¬(S ∈ Sailors) }

• It is known that every query that can be expressed in
relational algebra can be expressed as a safe query
in DRC / TRC; the converse is also true.
• Relational Completeness: Query language (e.g., SQL)
can express every query that is expressible in
relational algebra/calculus.

Slide No:L8-7
Functional Dependencies (FDs)

• A functional dependency X → Y holds over relation R if, for
every allowable instance r of R:
– t1 ∈ r, t2 ∈ r, π X(t1) = π X(t2) implies π Y(t1) = π Y(t2)
– i.e., given two tuples in r, if the X values agree, then the Y values
must also agree. (X and Y are sets of attributes.)
• An FD is a statement about all allowable relations.
– Must be identified based on semantics of application.
– Given some allowable instance r1 of R, we can check if it violates
some FD f, but we cannot tell if f holds over R!
• K is a candidate key for R means that K → R
– However, K → R does not require K to be minimal!

Slide No. L2-1
Example: Constraints on Entity Set

• Consider relation obtained from Hourly_Emps:


– Hourly_Emps (ssn, name, lot, rating, hrly_wages, hrs_worked)
• Notation: We will denote this relation schema by listing the
attributes: SNLRWH
– This is really the set of attributes {S,N,L,R,W,H}.
– Sometimes, we will refer to all attributes of a relation by using
the relation name. (e.g., Hourly_Emps for SNLRWH)
• Some FDs on Hourly_Emps:
– ssn is the key: S → SNLRWH
– rating determines hrly_wages: R → W

Slide No. L2-2
Example (Contd.)

Wages:
R  W
8  10
5  7

Hourly_Emps2:
S            N          L   R  H
123-22-3666  Attishoo   48  8  40
231-31-5368  Smiley     22  8  30
131-24-3650  Smethurst  35  5  30
434-26-3751  Guldu      35  5  32
612-67-4134  Madayan    35  8  40

• Problems due to R → W:
– Update anomaly: Can we change W in just
the 1st tuple of SNLRWH?
– Insertion anomaly: What if we want to insert an
employee and don’t know the hourly wage
for his rating?
– Deletion anomaly: If we delete all employees with
rating 5, we lose the information about the
wage for rating 5!

SNLRWH:
S            N          L   R  W   H
123-22-3666  Attishoo   48  8  10  40
231-31-5368  Smiley     22  8  10  30
131-24-3650  Smethurst  35  5  7   30
434-26-3751  Guldu      35  5  7   32
612-67-4134  Madayan    35  8  10  40
Slide No. L2-3
Constraints on a Relationship Set
• Suppose that we have entity sets Parts, Suppliers,
and Departments, as well as a relationship set
Contracts that involves all of them. We refer to the
schema for Contracts as CQPSD. A contract with
contract id C specifies that a supplier S will supply some
quantity Q of a part P to a department D.
• We might have a policy that a department purchases
at most one part from any given supplier.
• Thus, if there are several contracts between the same
supplier and department, we know that the same part
must be involved in all of them. This constraint is an
FD, DS → P.

Slide No:L2-4
Reasoning About FDs

• Given some FDs, we can usually infer additional FDs:
– ssn → did, did → lot implies ssn → lot
• An FD f is implied by a set of FDs F if f holds whenever all
FDs in F hold.
– F+ = closure of F is the set of all FDs that are implied by F.
• Armstrong’s Axioms (X, Y, Z are sets of attributes):
– Reflexivity: If X ⊇ Y, then X → Y
– Augmentation: If X → Y, then XZ → YZ for any Z
– Transitivity: If X → Y and Y → Z, then X → Z
• These are sound and complete inference rules for FDs!

Slide No. L2-5


Reasoning About FDs (Contd.)

• Couple of additional rules (that follow from AA):
– Union: If X → Y and X → Z, then X → YZ
– Decomposition: If X → YZ, then X → Y and X → Z
• Example: Contracts(cid,sid,jid,did,pid,qty,value), and:
– C is the key: C → CSJDPQV
– Project purchases each part using single contract:
JP → C
– Dept purchases at most one part from a supplier:
SD → P
• JP → C, C → CSJDPQV imply JP → CSJDPQV
• SD → P implies SDJ → JP
• SDJ → JP, JP → CSJDPQV imply SDJ → CSJDPQV
Slide No. L2-6
Reasoning About FDs (Contd.)

• Computing the closure of a set of FDs can be expensive. (Size
of closure is exponential in # attrs!)
• Typically, we just want to check if a given FD X → Y is in the
closure of a set of FDs F. An efficient check:
– Compute attribute closure of X (denoted X+) wrt F:
• Set of all attributes A such that X → A is in F+
• There is a linear time algorithm to compute this.
– Check if Y is in X+
• Does F = {A → B, B → C, CD → E} imply A → E?
– i.e., is A → E in the closure F+? Equivalently, is E in A+?

Slide No. L2-7


Closure of a Set of FDs

• The set of all FDs implied by a given set F of FDs is
called the closure of F and is denoted as F+.
• An important question is how we can infer, or
compute, the closure of a given set F of FDs.
• The following three rules, called Armstrong's
Axioms, can be applied repeatedly to infer all FDs
implied by a set F of FDs.
• We use X, Y, and Z to denote sets of attributes over
a relation schema R:

Slide No:L2-8
Closure of a Set of FDs
• Reflexivity: If X ⊇ Y, then X → Y.
• Augmentation: If X → Y, then XZ → YZ for any Z.
• Transitivity: If X → Y and Y → Z, then X → Z.
• Armstrong's Axioms are sound in that they generate
only FDs in F+ when applied to a set F of FDs.
• They are complete in that repeated application of
these rules will generate all FDs in the closure F+.
• It is convenient to use some additional rules while
reasoning about F+:
• Union: If X → Y and X → Z, then X → YZ.
• Decomposition: If X → YZ, then X → Y and X → Z.
• These additional rules are not essential; their
soundness can be proved using Armstrong's Axioms.
Slide No:L2-9
Attribute Closure

• If we just want to check whether a given dependency,
say, X → Y, is in the closure of a set F of FDs,
• we can do so efficiently without computing F+. We first
compute the attribute closure X+ with respect to F,
• which is the set of attributes A such that X → A can be
inferred using the Armstrong Axioms.
• The algorithm for computing the attribute closure of a
set X of attributes is:
closure = X;
repeat until there is no change: {
if there is an FD U → V in F such that U is a subset of closure,
then set closure = closure ∪ V
}
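A worked instance of this algorithm, answering the question posed
earlier: for F = {A → B, B → C, CD → E}, compute A+. Start with
closure = {A}; A → B adds B; B → C adds C; CD → E never fires because
D is not in the closure. So A+ = {A, B, C}; E is not in A+, hence F
does not imply A → E.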
Slide No:L2-10
NORMAL FORMS
• The normal forms based on FDs are first normal form (1NF),
second normal form (2NF), third normal form (3NF), and
Boyce-Codd normal form (BCNF).
• These forms have increasingly restrictive requirements: Every
relation in BCNF is also in 3NF,
• every relation in 3NF is also in 2NF, and every relation in 2NF
is in 1NF.
• A relation is in first normal form if every field contains only
atomic values, that is, not lists or sets.
• This requirement is implicit in our definition of the relational
model.
• Although some of the newer database systems are relaxing
this requirement,
• 2NF is mainly of historical interest.
• 3NF and BCNF are important from a database design
standpoint.
Slide No:L3-1
Normal Forms

• Returning to the issue of schema refinement, the first


question to ask is whether any refinement is needed!
• If a relation is in a certain normal form (BCNF, 3NF etc.), it is
known that certain kinds of problems are avoided/minimized.
This can be used to help us decide whether decomposing the
relation will help.
• Role of FDs in detecting redundancy:
– Consider a relation R with 3 attributes, ABC.
• No FDs hold: There is no redundancy here.
• Given A → B: Several tuples could have the same A value,
and if so, they’ll all have the same B value!

Slide No. L3-2


First Normal Form
• 1NF (First Normal Form)
• a relation R is in 1NF if and only if it has only single-
valued attributes (atomic values)
• EMP_PROJ (SSN, PNO, HOURS, ENAME, PNAME,
PLOCATION)
PLOCATION is not in 1NF (multi-valued attrib.)
• solution: decompose the relation
EMP_PROJ2 (SSN, PNO, HOURS, ENAME, PNAME)
LOC (PNO, PLOCATION)

Slide No. L3-3


Second Normal Form
• 2NF (Second Normal Form)
• a relation R is in 2NF if and only if it is in 1NF and every
nonkey column depends on the whole key, not a proper
subset of the key
• all nonprime attributes of R must be fully functionally
dependent on a whole key of the relation, not a part of
the key
• no violation: single-attribute key or no nonprime attribute

Slide No. L3-4


Second Normal Form ( Contd)
• 2NF (Second Normal Form)
• violation: part of a key → nonkey
EMP_PROJ2 (SSN, PNO, HOURS, ENAME, PNAME)
SSN → ENAME
PNO → PNAME
• solution: decompose the relation
EMP_PROJ3 (SSN, PNO, HOURS)
EMP (SSN, ENAME)
PROJ (PNO, PNAME)

Slide No. L3-5


Third Normal Form

• 3NF (Third Normal Form)
• a relation R is in 3NF if and only if it is in 2NF and every
nonkey column does not depend on another nonkey
column
• all nonprime attributes of R must be non-transitively
functionally dependent on a key of the relation
• violation: nonkey → nonkey
• fig 14.10: 2NF & 3NF normalization

Slide No. L3-6


Third Normal Form (Contd)

• 3NF (Third Normal Form)
• SUPPLIER (SNAME, STREET, CITY, STATE, TAX)
SNAME → STREET, CITY, STATE
STATE → TAX (nonkey → nonkey)
SNAME → STATE → TAX (transitive FD)
• solution: decompose the relation
SUPPLIER2 (SNAME, STREET, CITY, STATE)
TAXINFO (STATE, TAX)

Slide No. L3-7


Boyce-Codd Normal Form (BCNF)

• Reln R with FDs F is in BCNF if, for all X → A in F:
– A ∈ X (called a trivial FD), or
– X contains a key for R.
• In other words, R is in BCNF if the only non-trivial FDs that
hold over R are key constraints.
– No dependency in R that can be predicted using FDs alone.
– If we are shown two tuples that agree upon the X value, we
cannot infer the A value in one tuple from the A value in
the other.
– If example relation is in BCNF, the 2 tuples must be
identical (since X is a key).

X  Y   A
x  y1  a
x  y2  ?
Slide No. L3-8
Third Normal Form (3NF)

• Reln R with FDs F is in 3NF if, for all X → A in F:
– A ∈ X (called a trivial FD), or
– X contains a key for R, or
– A is part of some key for R.
• Minimality of a key is crucial in third condition above!
• If R is in BCNF, obviously in 3NF.
• If R is in 3NF, some redundancy is possible. It is a
compromise, used when BCNF not achievable (e.g., no ``good’’
decomp, or performance considerations).
– Lossless-join, dependency-preserving decomposition of R into a
collection of 3NF relations always possible.

Slide No. L3-9


Decomposition of a Relation Scheme

• Suppose that relation R contains attributes A1 ... An. A
decomposition of R consists of replacing R by two or more
relations such that:
– Each new relation scheme contains a subset of the attributes of
R (and no attributes that do not appear in R), and
– Every attribute of R appears as an attribute of one of the new
relations.
• Intuitively, decomposing R means we will store instances of
the relation schemes produced by the decomposition, instead
of instances of R.
• E.g., Can decompose SNLRWH into SNLRH and RW.

Slide No. L4-1


Example Decomposition

• Decompositions should be used only when needed.


– SNLRWH has FDs S → SNLRWH and R → W
– Second FD causes violation of 3NF; W values repeatedly
associated with R values. Easiest way to fix this is to create a
relation RW to store these associations, and to remove W from
the main schema:
• i.e., we decompose SNLRWH into SNLRH and RW
• The information to be stored consists of SNLRWH tuples. If
we just store the projections of these tuples onto SNLRH and
RW, are there any potential problems that we should be aware
of?

Slide No. L4-2


Problems with Decompositions

• There are three potential problems to consider:


– Some queries become more expensive.
• e.g., How much did sailor Joe earn? (salary = W*H)
– Given instances of the decomposed relations, we may not be
able to reconstruct the corresponding instance of the original
relation!
• Fortunately, not in the SNLRWH example.
– Checking some dependencies may require joining the
instances of the decomposed relations.
• Fortunately, not in the SNLRWH example.
• Tradeoff: Must consider these issues vs. redundancy.

Slide No. L4-3


Lossless Join Decompositions

• Decomposition of R into X and Y is lossless-join w.r.t. a set of
FDs F if, for every instance r that satisfies F:
– π X(r) ⋈ π Y(r) = r
• It is always true that r ⊆ π X(r) ⋈ π Y(r)
– In general, the other direction does not hold! If it does, the
decomposition is lossless-join.
• Definition extended to decomposition into 3 or more relations
in a straightforward way.
• It is essential that all decompositions used to deal with
redundancy be lossless! (Avoids Problem (2).)

Slide No. L4-4


More on Lossless Join
• The decomposition of R into X and Y is lossless-join wrt F
if and only if the closure of F contains:
– X ∩ Y → X, or
– X ∩ Y → Y
• In particular, the decomposition of R into UV and R − V is
lossless-join if U → V holds over R.

R:            π AB(R):   π BC(R):
A  B  C       A  B       B  C
1  2  3       1  2       2  3
4  5  6       4  5       5  6
7  2  8       7  2       2  8

π AB(R) ⋈ π BC(R):
A  B  C
1  2  3
4  5  6
7  2  8
1  2  8
7  2  3

Slide No. L4-5
Dependency Preserving Decomposition

• Consider CSJDPQV, C is key, JP → C and SD → P.
– BCNF decomposition: CSJDQV and SDP
– Problem: Checking JP → C requires a join!

• Dependency preserving decomposition (Intuitive):
– If R is decomposed into X, Y and Z, and we enforce the FDs that
hold on X, on Y and on Z, then all FDs that were given to hold
on R must also hold. (Avoids Problem (3).)
• Projection of set of FDs F: If R is decomposed into X, ...
projection of F onto X (denoted FX) is the set of FDs U → V
in F+ (closure of F) such that U, V are in X.


Slide No. L4-6
Dependency Preserving Decompositions
(Contd.)

• Decomposition of R into X and Y is dependency preserving if


(FX union FY ) + = F +
– i.e., if we consider only dependencies in the closure F + that can
be checked in X without considering Y, and in Y without
considering X, these imply all dependencies in F +.
• Important to consider F +, not F, in this definition:
– ABC, A → B, B → C, C → A, decomposed into AB and BC.
– Is this dependency preserving? Is C → A preserved?
• Dependency preserving does not imply lossless join:
– ABC, A → B, decomposed into AB and BC.
• And vice-versa! (Example?)

Slide No. L4-7
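
Checking (FX union FY)+ = F+ directly means comparing closures of whole FD sets, which is expensive. A standard shortcut tests each FD U → V of F by growing U using only closures projected onto the decomposed schemas; a Python sketch of that test (ours, same single-letter conventions as the earlier sketch):

def closure(attrs, fds):
    # attribute closure attrs+ under the FDs
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def preserved(fd, schemas, fds):
    # Grow lhs by repeatedly intersecting its closure with each
    # schema; fd is preserved iff rhs ends up inside the result.
    lhs, rhs = fd
    z = set(lhs)
    changed = True
    while changed:
        changed = False
        for r in schemas:
            gained = closure(z & set(r), fds) & set(r)
            if not gained <= z:
                z |= gained
                changed = True
    return set(rhs) <= z

# ABC with A -> B, B -> C, C -> A, decomposed into AB and BC:
fds = [("A", "B"), ("B", "C"), ("C", "A")]
print(preserved(("C", "A"), ["AB", "BC"], fds))   # True: C -> A is preserved

This answers the question on the previous slide: C → A is in (FAB union FBC)+ even though it cannot be checked inside either schema alone.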
Decomposition into BCNF

• Consider relation R with FDs F. If X → Y violates BCNF, decompose R into R − Y and XY.
– Repeated application of this idea will give us a collection of
relations that are in BCNF; lossless join decomposition, and
guaranteed to terminate.
– e.g., CSJDPQV, key C, JP → C, SD → P, J → S
– To deal with SD → P, decompose into SDP, CSJDQV.
– To deal with J → S, decompose CSJDQV into JS and CJDQV
• In general, several dependencies may cause violation of BCNF.

The order in which we ``deal with’’ them could lead to very
different sets of relations!

Slide No. L4-8
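
A rough Python sketch of this splitting loop (ours; it checks only FDs of F whose left side falls wholly inside a schema, whereas a complete test must project F+, and the result depends on the order in which violations are found):

def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_decompose(r, fds):
    # Repeatedly split a schema on a violating FD X -> Y
    # into (schema - Y) and XY, until no violations remain.
    schemas = [set(r)]
    done = False
    while not done:
        done = True
        for s in schemas:
            for lhs, rhs in fds:
                x, y = set(lhs), (set(rhs) & s) - set(lhs)
                if x <= s and y and not s <= closure(x, fds):
                    schemas.remove(s)          # X is not a superkey of s
                    schemas.append(s - y)
                    schemas.append(x | y)
                    done = False
                    break
            if not done:
                break
    return schemas

# CSJDPQV with key C (written as the FD C -> CSJDPQV):
fds = [("C", "CSJDPQV"), ("JP", "C"), ("SD", "P"), ("J", "S")]
print(bcnf_decompose("CSJDPQV", fds))
# yields SDP, CJDQV and JS here (order-dependent, as the slide notes)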


BCNF and Dependency Preservation

• In general, there may not be a dependency preserving decomposition into BCNF.
– e.g., CSZ, CS → Z, Z → C
– Can’t decompose while preserving 1st FD; not in BCNF.
• Similarly, decomposition of CSJDQV into SDP, JS and CJDQV is not dependency preserving (w.r.t. the FDs JP → C, SD → P and J → S).
– However, it is a lossless join decomposition.
– In this case, adding JPC to the collection of relations gives us a
dependency preserving decomposition.
• JPC tuples stored only for checking FD! (Redundancy!)

Slide No. L4-9


Decomposition into 3NF

• Obviously, the algorithm for lossless join decomposition into BCNF can be used to obtain a lossless join decomposition into 3NF (typically, can stop earlier).
• To ensure dependency preservation, one idea:
– If X → Y is not preserved, add relation XY.
– Problem is that XY may violate 3NF! e.g., consider the addition of CJP to `preserve' JP → C. What if we also have J → C?
• Refinement: Instead of the given set of FDs F, use a minimal
cover for F.

Slide No. L4-10


SCHEMA REFINEMENT

Constraints on an Entity Set


• Consider the Hourly Emps relation again. The
constraint that attribute ssn is a key can be
expressed as an FD:
• { ssn }-> { ssn, name, lot, rating, hourly wages, hours
worked}
• For brevity, we will write this FD as S -> SNLRWH,
using a single letter to denote each attribute
• In addition, the constraint that the hourly wages
attribute is determined by the rating attribute is an
FD: R -> W.

Slide No:L5-1
Constraints on a Relationship Set

• The previous example illustrated how FDs can help to refine the subjective decisions made during ER design,
• but one could argue that the best possible ER diagram would have led to the same final set of relations.
• Our next example shows how FD information
can lead to a set of relations that eliminates
some redundancy problems and is unlikely to
be arrived at solely through ER design.

Slide No:L5-2
Identifying Attributes of Entities

• In particular, it shows that attributes can easily be associated with the `wrong' entity set during ER design.
• The ER diagram shows a relationship set called Works
In that is similar to the Works In relationship set
• Using the key constraint, we can translate this ER
diagram into two relations:
• Workers(ssn, name, lot, did, since)

Slide No:L5-3
Identifying Entity Sets

• Let Reserves contain attributes S, B, and D as before, indicating that sailor S has a reservation for boat B on day D.
• In addition, let there be an attribute C denoting the
credit card to which the reservation is charged.
• Suppose that every sailor uses a unique credit card
for reservations. This constraint is expressed by the
FD
S -> C. This constraint indicates that in relation
Reserves, we store the credit card number for a sailor
as often as we have reservations for that sailor, and we have redundancy and potential update anomalies.

Slide No:L5-4
Multivalued Dependencies

• Suppose that we have a relation with attributes course, teacher, and book, which we denote as CTB.
• The meaning of a tuple is that teacher T can teach
course C, and book B is a recommended text for the
course.
• There are no FDs; the key is CTB. However, the
recommended texts for a course are independent of the
instructor.

Slide No:L6-1
There are three points to note here:
• The relation schema CTB is in BCNF; thus we would not
consider decomposing it further if we looked only at the
FDs that hold over CTB.
• There is redundancy. The fact that Green can teach
Physics101 is recorded once per recommended text for the
course. Similarly, the fact that Optics is a text for
Physics101 is recorded once per potential teacher.
• The redundancy can be eliminated by decomposing CTB
into CT and CB.
• Let R be a relation schema and let X and Y be subsets of the attributes of R. Intuitively, the multivalued dependency X →→ Y is said to hold over R if, in every legal instance r of R, each X value is associated with a set of Y values, and this set is independent of the values in the other attributes.
Slide No:L6-2
• The redundancy in this example is due to the
constraint that the texts for a course are
independent of the instructors, which cannot
be expressed in terms of FDs.
• This constraint is an example of a multivalued
dependency, or MVD. Ideally, we should model
this situation using two binary relationship
sets, Instructors with attributes CT and Text
with attributes CB.
• Because these are two essentially independent
relationships, modeling them with a single
ternary relationship set with attributes CTB is
inappropriate.

Slide No:L6-3
• Three of the additional rules involve only MVDs:
• MVD Complementation: If X →→Y, then X →→ R −
XY
• MVD Augmentation: If X →→ Y and W ⊇ Z, then WX →→ YZ.
• MVD Transitivity: If X →→ Y and Y →→ Z, then
X →→ (Z − Y ).
• Fourth Normal Form
• R is said to be in fourth normal form (4NF) if for
every MVD X →→Y that holds over R, one of the
following statements is true:
• Y ⊆ X or XY = R, or
• X is a superkey.

Slide No:L6-4
Join Dependencies

• A join dependency is a further generalization of MVDs. A join dependency (JD) ⋈{ R1, …, Rn } is said to hold over a relation R if R1, …, Rn is a lossless-join decomposition of R.
• An MVD X →→ Y over a relation R can be expressed as the join dependency ⋈{ XY, X(R−Y) }
• As an example, in the CTB relation, the MVD C →→ T can be expressed as the join dependency ⋈{ CT, CB }
• Unlike FDs and MVDs, there is no set of sound and
complete inference rules for JDs.

Slide No:L7-1
Fifth Normal Form

• A relation schema R is said to be in fifth normal form (5NF) if for every JD ⋈{ R1, …, Rn } that holds over R, one of the following statements is true:
• Ri = R for some i, or
• The JD is implied by the set of those FDs over R in which the left side is a key for R.
• The following result, also due to Date and Fagin, identifies conditions, again detected using only FD information, under which we can safely ignore JD information.
• If a relation schema is in 3NF and each of its keys consists of a single attribute, it is also in 5NF.

Slide No:L7-2
Inclusion Dependencies

• MVDs and JDs can be used to guide database design, as we have seen, although they are less common than FDs and harder to recognize and reason about.
• In contrast, inclusion dependencies are very
intuitive and quite common. However, they
typically have little influence on database design
• The main point to bear in mind is that we should
not split groups of attributes that participate in an
inclusion dependency.
• Most inclusion dependencies in practice are key-
based, that is, involve only keys.

Slide No:L7-3
Transaction Concept
• A transaction is a unit of program execution that accesses
and possibly updates various data items.
• E.g. transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
• Two main issues to deal with:
– Failures of various kinds, such as hardware failures and
system crashes
– Concurrent execution of multiple transactions

Slide No.L1-1
Example of Fund Transfer
• Transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
• Atomicity requirement
– if the transaction fails after step 3 and before step 6, money will be
“lost” leading to an inconsistent database state
• Failure could be due to software or hardware
– the system should ensure that updates of a partially executed
transaction are not reflected in the database
• Durability requirement — once the user has been notified that the
transaction has completed (i.e., the transfer of the $50 has taken
place), the updates to the database by the transaction must persist
even if there are software or hardware failures.

Slide No.L1-2
Example of Fund Transfer (Cont.)
• Transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
• Consistency requirement in above example:
– the sum of A and B is unchanged by the execution of the
transaction
• In general, consistency requirements include
• Explicitly specified integrity constraints such as primary
keys and foreign keys
• Implicit integrity constraints
– e.g. sum of balances of all accounts, minus sum of
loan amounts must equal value of cash-in-hand
– A transaction must see a consistent database.
– During transaction execution the database may be temporarily
inconsistent.
– When the transaction completes successfully the database must be
consistent
• Erroneous transaction logic can lead to inconsistency

Slide No.L1-3
Example of Fund Transfer (Cont.)
• Isolation requirement — if between steps 3 and 6, another
transaction T2 is allowed to access the partially updated database, it
will see an inconsistent database (the sum A + B will be less than it
should be).
T1 T2
1. read(A)
2. A := A – 50
3. write(A)
read(A), read(B), print(A+B)
4. read(B)
5. B := B + 50
6. write(B)
• Isolation can be ensured trivially by running transactions serially
– that is, one after the other.
• However, executing multiple transactions concurrently has significant
benefits, as we will see later.

Slide No.L1-4
ACID Properties
A transaction is a unit of program execution that accesses and possibly updates various data items. To preserve the integrity of data the database system must ensure:
• Atomicity. Either all operations of the transaction are properly
reflected in the database or none are.
• Consistency. Execution of a transaction in isolation preserves the
consistency of the database.
• Isolation. Although multiple transactions may execute
concurrently, each transaction must be unaware of other
concurrently executing transactions. Intermediate transaction
results must be hidden from other concurrently executed
transactions.
– That is, for every pair of transactions Ti and Tj, it appears to Ti that either Tj finished execution before Ti started, or Tj started execution after Ti finished.
• Durability. After a transaction completes successfully, the changes
it has made to the database persist, even if there are system failures.

Slide No.L1-5
Transaction State
• Active – the initial state; the transaction stays in this state
while it is executing
• Partially committed – after the final statement has been
executed.
• Failed – after the discovery that normal execution can no
longer proceed.
• Aborted – after the transaction has been rolled back and the
database restored to its state prior to the start of the
transaction. Two options after it has been aborted:
– restart the transaction
• can be done only if no internal logical error
– kill the transaction
• Committed – after successful completion.

Slide No.L1-6
Transaction State (Cont.)

Slide No.L1-7
Implementation of Atomicity and Durability
• The recovery-management component of a database system
implements the support for atomicity and durability.
• E.g. the shadow-database scheme:
– all updates are made on a shadow copy of the database
• db_pointer is made to point to the updated shadow
copy after
– the transaction reaches partial commit and
– all updated pages have been flushed to disk.

Slide No.L2-1
Implementation of Atomicity and Durability (Cont.)
• db_pointer always points to the current consistent copy of the
database.
– In case transaction fails, old consistent copy pointed to by
db_pointer can be used, and the shadow copy can be deleted.
• The shadow-database scheme:
– Assumes that only one transaction is active at a time.
– Assumes disks do not fail
– Useful for text editors, but
• extremely inefficient for large databases (why?)
– Variant called shadow paging reduces copying of
data, but is still not practical for large databases
– Does not handle concurrent transactions
• Will study better schemes in Chapter 17.

Slide No.L2-2
Concurrent Executions
• Multiple transactions are allowed to run concurrently in the
system. Advantages are:
– increased processor and disk utilization, leading to
better transaction throughput
• E.g. one transaction can be using the CPU while
another is reading from or writing to the disk
– reduced average response time for transactions: short
transactions need not wait behind long ones.
• Concurrency control schemes – mechanisms to achieve
isolation
– that is, to control the interaction among the concurrent
transactions in order to prevent them from destroying the
consistency of the database
• Will study in Chapter 16, after studying notion
of correctness of concurrent executions.

Slide No.L2-3
Schedules
• Schedule – a sequence of instructions that specify the
chronological order in which instructions of concurrent
transactions are executed
– a schedule for a set of transactions must consist of all
instructions of those transactions
– must preserve the order in which the instructions
appear in each individual transaction.
• A transaction that successfully completes its execution
will have a commit instruction as the last statement
– by default transaction assumed to execute commit
instruction as its last step
• A transaction that fails to successfully complete its
execution will have an abort instruction as the last
statement

Slide No.L2-4
Schedule 1
• Let T1 transfer $50 from A to B, and T2 transfer 10% of the balance
from A to B.
• A serial schedule in which T1 is followed by T2 :

Slide No.L2-5
Schedule 2
• A serial schedule where T2 is followed by T1

Slide No.L2-6
Schedule 3
• Let T1 and T2 be the transactions defined previously. The following
schedule is not a serial schedule, but it is equivalent to Schedule 1.

In Schedules 1, 2 and 3, the sum A + B is preserved.


Slide No.L2-7
Schedule 4
• The following concurrent schedule does not preserve
the value of (A + B ).

Slide No.L2-8
Serializability
• Basic Assumption – Each transaction preserves database
consistency.
• Thus serial execution of a set of transactions preserves database
consistency.
• A (possibly concurrent) schedule is serializable if it is equivalent to a
serial schedule. Different forms of schedule equivalence give rise to
the notions of:
1. conflict serializability
2. view serializability
• Simplified view of transactions
– We ignore operations other than read and write instructions
– We assume that transactions may perform arbitrary computations
on data in local buffers in between reads and writes.
– Our simplified schedules consist of only read and write
instructions.

Slide No.L3-1
Conflicting Instructions
• Instructions li and lj of transactions Ti and Tj respectively,
conflict if and only if there exists some item Q accessed
by both li and lj, and at least one of these instructions
wrote Q.
1. li = read(Q), lj = read(Q). li and lj don’t conflict.
2. li = read(Q), lj = write(Q). They conflict.
3. li = write(Q), lj = read(Q). They conflict.
4. li = write(Q), lj = write(Q). They conflict.
• Intuitively, a conflict between li and lj forces a (logical)
temporal order between them.
– If li and lj are consecutive in a schedule and they do
not conflict, their results would remain the same even
if they had been interchanged in the schedule.

Slide No.L3-2
Conflict Serializability

• If a schedule S can be transformed into a schedule S´ by a series of swaps of non-conflicting instructions, we say that S and S´ are conflict equivalent.
• We say that a schedule S is conflict serializable if it is
conflict equivalent to a serial schedule

Slide No.L3-3
Conflict Serializability (Cont.)
• Schedule 3 can be transformed into Schedule 6, a serial
schedule where T2 follows T1, by a series of swaps of non-
conflicting instructions.
– Therefore Schedule 3 is conflict serializable.

Schedule 3 Schedule 6
Slide No.L3-4
Conflict Serializability (Cont.)
• Example of a schedule that is not conflict serializable:

• We are unable to swap instructions in the above schedule to obtain either the serial schedule < T3, T4 >, or the serial schedule < T4, T3 >.

Slide No.L3-5
View Serializability
• Let S and S´ be two schedules with the same set of
transactions. S and S´ are view equivalent if the following
three conditions are met, for each data item Q,
1. If in schedule S, transaction Ti reads the initial value of Q,
then in schedule S’ also transaction Ti must read the
initial value of Q.
2. If in schedule S transaction Ti executes read(Q), and that
value was produced by transaction Tj (if any), then in
schedule S’ also transaction Ti must read the value of Q
that was produced by the same write(Q) operation of
transaction Tj .
3. The transaction (if any) that performs the final write(Q)
operation in schedule S must also perform the final
write(Q) operation in schedule S’.
As can be seen, view equivalence is also based purely on reads and writes.

Slide No.L3-6
View Serializability (Cont.)
• A schedule S is view serializable if it is view equivalent to
a serial schedule.
• Every conflict serializable schedule is also view
serializable.
• Below is a schedule which is view-serializable but not
conflict serializable.

• What serial schedule is the above equivalent to?


• Every view serializable schedule that is not conflict
serializable has blind writes.

Slide No.L3-7
Other Notions of Serializability
• The schedule below produces same outcome as the serial
schedule < T1, T5 >, yet is not conflict equivalent or view
equivalent to it.

Determining such equivalence requires analysis of operations other than read and write.

Slide No.L3-8
Recoverable Schedules
Need to address the effect of transaction failures on concurrently
running transactions.
• Recoverable schedule — if a transaction Tj reads a data
item previously written by a transaction Ti , then the
commit operation of Ti appears before the commit
operation of Tj.
• The following schedule (Schedule 11) is not recoverable if
T9 commits immediately after the read

• If T8 should abort, T9 would have read (and possibly shown to the user) an inconsistent database state. Hence, database must ensure that schedules are recoverable.

Slide No.L4-1
Cascading Rollbacks
• Cascading rollback – a single transaction failure leads to
a series of transaction rollbacks. Consider the following
schedule where none of the transactions has yet
committed (so the schedule is recoverable)

If T10 fails, T11 and T12 must also be rolled back.


• Can lead to the undoing of a significant amount of work

Slide No.L4-2
Cascadeless Schedules

• Cascadeless schedules — cascading rollbacks cannot occur; for each pair of transactions Ti and Tj such that Tj reads a data item previously written by Ti, the commit operation of Ti appears before the read operation of Tj.
• Every cascadeless schedule is also recoverable
• It is desirable to restrict the schedules to those that are
cascadeless

Slide No.L4-3
Concurrency Control
• A database must provide a mechanism that will ensure
that all possible schedules are
– either conflict or view serializable, and
– are recoverable and preferably cascadeless
• A policy in which only one transaction can execute at a
time generates serial schedules, but provides a poor degree
of concurrency
– Are serial schedules recoverable/cascadeless?
• Testing a schedule for serializability after it has executed is
a little too late!
• Goal – to develop concurrency control protocols that will
assure serializability.

Slide No.L4-4
Concurrency Control vs. Serializability Tests
• Concurrency-control protocols allow concurrent schedules,
but ensure that the schedules are conflict/view serializable,
and are recoverable and cascadeless .
• Concurrency control protocols generally do not examine the
precedence graph as it is being created
– Instead a protocol imposes a discipline that avoids
nonserializable schedules.
– We study such protocols in Chapter 16.
• Different concurrency control protocols provide different
tradeoffs between the amount of concurrency they allow and
the amount of overhead that they incur.
• Tests for serializability help us understand why a
concurrency control protocol is correct.

Slide No.L4-5
Weak Levels of Consistency

• Some applications are willing to live with weak levels of consistency, allowing schedules that are not serializable
– E.g. a read-only transaction that wants to get an
approximate total balance of all accounts
– E.g. database statistics computed for query optimization
can be approximate (why?)
– Such transactions need not be serializable with respect to
other transactions
• Tradeoff accuracy for performance

Slide No.L4-6
Levels of Consistency in SQL-92
• Serializable — default
• Repeatable read — only committed records to be read,
repeated reads of same record must return same value.
However, a transaction may not be serializable – it may find
some records inserted by a transaction but not find others.
• Read committed — only committed records can be read,
but successive reads of record may return different (but
committed) values.
• Read uncommitted — even uncommitted records may be
read.

• Lower degrees of consistency useful for gathering approximate information about the database
• Warning: some database systems do not ensure serializable schedules
by default
– E.g. Oracle and PostgreSQL by default support a level of
consistency called snapshot isolation (not part of the SQL
standard)
Slide No.L4-7
Transaction Definition in SQL
• Data manipulation language must include a construct for
specifying the set of actions that comprise a transaction.
• In SQL, a transaction begins implicitly.
• A transaction in SQL ends by:
– Commit work commits current transaction and begins a
new one.
– Rollback work causes current transaction to abort.
• In almost all database systems, by default, every SQL statement
also commits implicitly if it executes successfully
– Implicit commit can be turned off by a database directive
• E.g. in JDBC, connection.setAutoCommit(false);

Slide No.L4-8
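
To make the commit/rollback constructs concrete, here is a small Python sketch using the standard sqlite3 module (our example; any SQL interface behaves similarly, e.g. the JDBC call mentioned above). The driver begins a transaction implicitly on the first update, and nothing is committed until commit() is called, so the transfer is atomic:

import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database
conn.execute("CREATE TABLE account(id TEXT PRIMARY KEY, balance INT)")
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 50)")
conn.commit()

try:
    # implicit BEGIN on the first update
    conn.execute("UPDATE account SET balance = balance - 50 WHERE id = 'A'")
    conn.execute("UPDATE account SET balance = balance + 50 WHERE id = 'B'")
    conn.commit()        # like SQL's COMMIT WORK
except Exception:
    conn.rollback()      # like SQL's ROLLBACK WORK: no partial transfer

print(conn.execute("SELECT * FROM account ORDER BY id").fetchall())
# [('A', 50), ('B', 100)]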
Implementation of Isolation
• Schedules must be conflict or view serializable, and
recoverable, for the sake of database consistency, and
preferably cascadeless.
• A policy in which only one transaction can execute at a time
generates serial schedules, but provides a poor degree of
concurrency.
• Concurrency-control schemes tradeoff between the amount of
concurrency they allow and the amount of overhead that they
incur.
• Some schemes allow only conflict-serializable schedules to be
generated, while others allow view-serializable schedules that
are not conflict-serializable.

Slide No.L5-1
Figure 15.6

Slide No.L5-2
Testing for Serializability
• Consider some schedule of a set of transactions T1, T2, ...,
Tn
• Precedence graph — a directed graph where the vertices are the transactions (names).
• We draw an arc from Ti to Tj if the two transactions conflict, and Ti accessed the data item on which the conflict arose earlier.
• We may label the arc by the item that was accessed.
• Example 1 (figure: an arc from Ti to Tj labeled with the accessed item x)

Slide No.L5-3
Example Schedule (Schedule A) + Precedence Graph

T1 T2 T3 T4 T5
read(X)
read(Y)
read(Z)
read(V)
read(W)
read(W)
read(Y)
write(Y)
write(Z)
read(U)
read(Y)
write(Y)
read(Z)
write(Z)
read(U)
write(U)

(Figure: precedence graph with vertices T1–T5)
Slide No.L5-4
Test for Conflict Serializability
• A schedule is conflict serializable if and only
if its precedence graph is acyclic.
• Cycle-detection algorithms exist which take order n² time, where n is the number of vertices in the graph.
– (Better algorithms take order n + e where e is the number of edges.)
• If precedence graph is acyclic, the
serializability order can be obtained by a
topological sorting of the graph.
– This is a linear order consistent with the
partial order of the graph.
– For example, a serializability order for
Schedule A would be
T5 → T1 → T3 → T2 → T4
• Are there others?

Slide No.L5-5
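
The precedence-graph test is easy to prototype. A minimal Python sketch (ours): a schedule is a list of (transaction, operation, item) triples; we add an arc for every conflicting pair and then attempt a topological sort, which fails exactly when the graph has a cycle:

from collections import defaultdict

def precedence_graph(schedule):
    # schedule: list of (txn, 'r' or 'w', item) in chronological order
    edges = defaultdict(set)
    for i, (ti, oi, qi) in enumerate(schedule):
        for tj, oj, qj in schedule[i + 1:]:
            # conflicting ops: same item, different txns, at least one write
            if ti != tj and qi == qj and "w" in (oi, oj):
                edges[ti].add(tj)
    return edges

def serial_order(schedule):
    # Topological sort; returns None if the graph has a cycle,
    # i.e. the schedule is not conflict serializable.
    edges = precedence_graph(schedule)
    txns = {t for t, _, _ in schedule}
    indeg = {t: 0 for t in txns}
    for t in edges:
        for u in edges[t]:
            indeg[u] += 1
    order = [t for t in txns if indeg[t] == 0]
    for t in order:
        for u in edges[t]:
            indeg[u] -= 1
            if indeg[u] == 0:
                order.append(u)
    return order if len(order) == len(txns) else None

# Schedule 3 style interleaving: conflict serializable as T1 -> T2
s = [("T1", "r", "A"), ("T1", "w", "A"), ("T2", "r", "A"),
     ("T2", "w", "A"), ("T1", "r", "B"), ("T1", "w", "B"),
     ("T2", "r", "B"), ("T2", "w", "B")]
print(serial_order(s))   # ['T1', 'T2']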
Test for View Serializability
• The precedence graph test for conflict serializability
cannot be used directly to test for view serializability.
– Extension to test for view serializability has cost
exponential in the size of the precedence graph.
• The problem of checking if a schedule is view serializable
falls in the class of NP-complete problems.
– Thus existence of an efficient algorithm is extremely
unlikely.
• However practical algorithms that just check some
sufficient conditions for view serializability can still be
used.

Slide No.L5-6
Lock-Based Protocols
• A lock is a mechanism to control concurrent access to a
data item
• Data items can be locked in two modes :
1. exclusive (X) mode. Data item can be both read as well as written. X-lock is requested using lock-X instruction.
2. shared (S) mode. Data item can only be read. S-lock is requested using lock-S instruction.


• Lock requests are made to concurrency-control manager.
Transaction can proceed only after request is granted.

Slide No.L6-1
Lock-Based Protocols (Cont.)
• Lock-compatibility matrix:

        S      X
S      true   false
X      false  false

• A transaction may be granted a lock on an item if the requested lock is compatible with locks already held on the item by other transactions
• Any number of transactions can hold shared locks on an item,
– but if any transaction holds an exclusive lock on the item no other
transaction may hold any lock on the item.
• If a lock cannot be granted, the requesting transaction is made to
wait till all incompatible locks held by other transactions have
been released. The lock is then granted.

Slide No.L6-2
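
The grant rule fits in a few lines of Python (a sketch, ours, not any particular DBMS's code):

# Compatibility matrix for shared (S) and exclusive (X) locks
COMPATIBLE = {
    ("S", "S"): True,  ("S", "X"): False,
    ("X", "S"): False, ("X", "X"): False,
}

def can_grant(requested, held_by_others):
    # Grant only if the requested mode is compatible with every
    # lock currently held on the item by other transactions.
    return all(COMPATIBLE[(requested, h)] for h in held_by_others)

print(can_grant("S", ["S", "S"]))  # True: any number of S locks coexist
print(can_grant("X", ["S"]))       # False: requester must wait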
Lock-Based Protocols (Cont.)
• Example of a transaction performing locking:
T2: lock-S(A);
read (A);
unlock(A);
lock-S(B);
read (B);
unlock(B);
display(A+B)
• Locking as above is not sufficient to guarantee serializability
— if A and B get updated in-between the read of A and B, the
displayed sum would be wrong.
• A locking protocol is a set of rules followed by all
transactions while requesting and releasing locks. Locking
protocols restrict the set of possible schedules.

Slide No.L6-3
Pitfalls of Lock-Based Protocols
• Consider the partial schedule
(Figure: T3 holds lock-X(B) and requests lock-X(A); T4 holds lock-S(A) and requests lock-S(B))
• Neither T3 nor T4 can make progress — executing lock-S(B) causes T4 to wait for T3 to release its lock on B, while executing lock-X(A) causes T3 to wait for T4 to release its lock on A.
• Such a situation is called a deadlock.
– To handle a deadlock one of T3 or T4 must be rolled back
and its locks released.

Slide No.L6-4
Pitfalls of Lock-Based Protocols (Cont.)
• The potential for deadlock exists in most locking
protocols. Deadlocks are a necessary evil.
• Starvation is also possible if concurrency control
manager is badly designed. For example:
– A transaction may be waiting for an X-lock on an
item, while a sequence of other transactions request
and are granted an S-lock on the same item.
– The same transaction is repeatedly rolled back due to
deadlocks.
• Concurrency control manager can be designed to prevent
starvation.

Slide No.L6-5
The Two-Phase Locking Protocol
• This is a protocol which ensures conflict-serializable
schedules.
• Phase 1: Growing Phase
– transaction may obtain locks
– transaction may not release locks
• Phase 2: Shrinking Phase
– transaction may release locks
– transaction may not obtain locks
• The protocol assures serializability. It can be proved that the
transactions can be serialized in the order of their lock
points (i.e. the point where a transaction acquired its final
lock).

Slide No.L7-1
The Two-Phase Locking Protocol (Cont.)
• Two-phase locking does not ensure freedom from deadlocks
• Cascading roll-back is possible under two-phase locking. To
avoid this, follow a modified protocol called strict two-phase
locking. Here a transaction must hold all its exclusive locks
till it commits/aborts.
• Rigorous two-phase locking is even stricter: here all locks
are held till commit/abort. In this protocol transactions can
be serialized in the order in which they commit.

Slide No.L7-2
The Two-Phase Locking Protocol (Cont.)

• There can be conflict serializable schedules that cannot be obtained if two-phase locking is used.
• However, in the absence of extra information (e.g., ordering of
access to data), two-phase locking is needed for conflict
serializability in the following sense:
Given a transaction Ti that does not follow two-phase locking,
we can find a transaction Tj that uses two-phase locking, and
a schedule for Ti and Tj that is not conflict serializable.

Slide No.L7-3
Lock Conversions
• Two-phase locking with lock conversions:
– First Phase:
– can acquire a lock-S on item
– can acquire a lock-X on item
– can convert a lock-S to a lock-X (upgrade)
– Second Phase:
– can release a lock-S
– can release a lock-X
– can convert a lock-X to a lock-S (downgrade)
• This protocol assures serializability. But still relies on the
programmer to insert the various locking instructions.

Slide No.L7-4
Automatic Acquisition of Locks
• A transaction Ti issues the standard read/write
instruction, without explicit locking calls.
• The operation read(D) is processed as:
if Ti has a lock on D
then
read(D)
else begin
if necessary wait until no other
transaction has a lock-X on D
grant Ti a lock-S on D;
read(D)
end

Slide No.L7-5
Automatic Acquisition of Locks (Cont.)
• write(D) is processed as:
if Ti has a lock-X on D
then
write(D)
else begin
if necessary wait until no other trans. has any lock
on D,
if Ti has a lock-S on D
then
upgrade lock on D to lock-X
else
grant Ti a lock-X on D
write(D)
end;
• All locks are released after commit or abort

Slide No.L7-6
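
Putting the pieces together, the following is a minimal single-threaded Python simulation (ours) of automatic lock acquisition under strict two-phase locking. Where a real lock manager would block the requester, the sketch raises an exception instead:

class LockManager:
    def __init__(self):
        self.locks = {}   # item -> {txn: 'S' or 'X'}

    def lock(self, txn, item, mode):
        held = self.locks.setdefault(item, {})
        others = {t: m for t, m in held.items() if t != txn}
        if mode == "S" and "X" in others.values():
            raise RuntimeError(f"{txn} must wait for S({item})")
        if mode == "X" and others:
            raise RuntimeError(f"{txn} must wait for X({item})")
        if held.get(txn) != "X":      # grant, upgrading S to X if needed
            held[txn] = mode

    def read(self, txn, item):
        self.lock(txn, item, "S")     # automatic lock-S before read

    def write(self, txn, item):
        self.lock(txn, item, "X")     # automatic lock-X (or upgrade)

    def release_all(self, txn):
        # strict 2PL: all locks released only at commit/abort
        for held in self.locks.values():
            held.pop(txn, None)

lm = LockManager()
lm.read("T1", "A"); lm.read("T2", "A")   # shared: both granted
lm.write("T1", "B")                      # exclusive on B
try:
    lm.write("T2", "A")                  # upgrade blocked: T1 holds S(A)
except RuntimeError as e:
    print(e)
lm.release_all("T1")                     # T1 commits
lm.write("T2", "A")                      # now granted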
Implementation of Locking
• A lock manager can be implemented as a separate
process to which transactions send lock and unlock
requests
• The lock manager replies to a lock request by sending a
lock grant messages (or a message asking the
transaction to roll back, in case of a deadlock)
• The requesting transaction waits until its request is
answered
• The lock manager maintains a data-structure called a
lock table to record granted locks and pending requests
• The lock table is usually implemented as an in-memory
hash table indexed on the name of the data item being
locked

Slide No.L7-7
Lock Table
• Black rectangles indicate granted
locks, white ones indicate waiting
requests
• Lock table also records the type of
lock granted or requested
• New request is added to the end of
the queue of requests for the data
item, and granted if it is compatible
with all earlier locks
• Unlock requests result in the
request being deleted, and later
requests are checked to see if they
can now be granted
Granted • If transaction aborts, all waiting or
granted requests of the transaction
Waiting are deleted
– lock manager may keep a list of
locks held by each transaction,
to implement this efficiently
Slide No.L7-8
Graph-Based Protocols

• Graph-based protocols are an alternative to two-phase locking
• Impose a partial ordering → on the set D = {d1, d2, ..., dh} of all data items.
– If di → dj then any transaction accessing both di and dj must access di before accessing dj.
– Implies that the set D may now be viewed as a directed acyclic graph, called a database graph.
• The tree-protocol is a simple kind of graph protocol.

Slide No.L7-9
Tree Protocol

1. Only exclusive locks are allowed.


2. The first lock by Ti may be on any data item. Subsequently, a data item Q can be locked by Ti only if the parent of Q is currently locked by Ti.
3. Data items may be unlocked at any time.
4. A data item that has been locked and unlocked by Ti cannot
subsequently be relocked by Ti

Slide No.L7-10
Timestamp-Based Protocols
• Each transaction is issued a timestamp when it enters the system.
If an old transaction Ti has time-stamp TS(Ti), a new transaction Tj
is assigned time-stamp TS(Tj) such that TS(Ti) <TS(Tj).
• The protocol manages concurrent execution such that the time-
stamps determine the serializability order.
• In order to assure such behavior, the protocol maintains for each
data Q two timestamp values:
– W-timestamp(Q) is the largest time-stamp of any transaction
that executed write(Q) successfully.
– R-timestamp(Q) is the largest time-stamp of any transaction
that executed read(Q) successfully.

Slide No. L8-1


Timestamp-Based Protocols (Cont.)

• The timestamp ordering protocol ensures that any conflicting read and write operations are executed in timestamp order.
• Suppose a transaction Ti issues a read(Q)
1. If TS(Ti) < W-timestamp(Q), then Ti needs to read a value of Q that was already overwritten.
– Hence, the read operation is rejected, and Ti is rolled back.
2. If TS(Ti) ≥ W-timestamp(Q), then the read operation is executed, and R-timestamp(Q) is set to max(R-timestamp(Q), TS(Ti)).

Slide No. L8-2


Timestamp-Based Protocols (Cont.)
• Suppose that transaction Ti issues write(Q).
1. If TS(Ti) < R-timestamp(Q), then the value of Q that Ti is
producing was needed previously, and the system assumed
that that value would never be produced.
– Hence, the write operation is rejected, and Ti is rolled back.
2. If TS(Ti) < W-timestamp(Q), then Ti is attempting to write an
obsolete value of Q.
– Hence, this write operation is rejected, and Ti is rolled back.
3. Otherwise, the write operation is executed, and W-
timestamp(Q) is set to TS(Ti).

Slide No. L8-3
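
The read and write tests translate directly into code. A minimal Python sketch (ours; restarting a rolled-back transaction with a fresh timestamp is omitted). The thomas flag anticipates Thomas' Write Rule, discussed below, under which an obsolete write is skipped rather than rolled back:

class TimestampScheduler:
    def __init__(self):
        self.ts = {}          # txn -> timestamp
        self.rts = {}         # item -> R-timestamp
        self.wts = {}         # item -> W-timestamp
        self.next_ts = 1

    def begin(self, txn):
        self.ts[txn] = self.next_ts
        self.next_ts += 1

    def read(self, txn, item):
        if self.ts[txn] < self.wts.get(item, 0):
            return "rollback"       # value already overwritten
        self.rts[item] = max(self.rts.get(item, 0), self.ts[txn])
        return "ok"

    def write(self, txn, item, thomas=False):
        if self.ts[txn] < self.rts.get(item, 0):
            return "rollback"       # a later reader needed the old value
        if self.ts[txn] < self.wts.get(item, 0):
            # Thomas' write rule: the obsolete write may simply be skipped
            return "skip" if thomas else "rollback"
        self.wts[item] = self.ts[txn]
        return "ok"

sched = TimestampScheduler()
sched.begin("T1"); sched.begin("T2")   # TS(T1) = 1 < TS(T2) = 2
print(sched.write("T2", "Q"))          # ok: W-timestamp(Q) becomes 2
print(sched.read("T1", "Q"))           # rollback: T1 is too old to read Q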


Example Use of the Protocol
A partial schedule for several data items for transactions with
timestamps 1, 2, 3, 4, 5

T1 T2 T3 T4 T5
read(X)
read(Y)
read(Y)
write(Y)
write(Z)
read(Z)
read(X)
abort
read(X)
write(Z)
abort
write(Y)
write(Z)

Slide No. L8-4


Correctness of Timestamp-Ordering Protocol

• The timestamp-ordering protocol guarantees serializability since all the arcs in the precedence graph are of the form:

transaction with smaller timestamp → transaction with larger timestamp

Thus, there will be no cycles in the precedence graph


• Timestamp protocol ensures freedom from deadlock as no
transaction ever waits.
• But the schedule may not be cascade-free, and may not
even be recoverable.

Slide No. L8-5


Thomas’ Write Rule
• Modified version of the timestamp-ordering protocol in which
obsolete write operations may be ignored under certain
circumstances.
• When Ti attempts to write data item Q, if TS(Ti) < W-timestamp(Q), then Ti is attempting to write an obsolete value of Q.
– Rather than rolling back Ti as the timestamp ordering protocol would have done, this write operation can be ignored.
• Otherwise this protocol is the same as the timestamp ordering
protocol.
• Thomas' Write Rule allows greater potential concurrency.
– Allows some view-serializable schedules that are not conflict-
serializable.

Slide No. L8-6


Validation-Based Protocol
• Execution of transaction Ti is done in three phases.
1. Read and execution phase: Transaction Ti writes only to
temporary local variables
2. Validation phase: Transaction Ti performs a ``validation test''
to determine if local variables can be written without violating
serializability.
3. Write phase: If Ti is validated, the updates are applied to the
database; otherwise, Ti is rolled back.
• The three phases of concurrently executing transactions can be
interleaved, but each transaction must go through the three
phases in that order.
– Assume for simplicity that the validation and write phase
occur together, atomically and serially
• I.e., only one transaction executes validation/write at a
time.
• Also called as optimistic concurrency control since transaction
executes fully in the hope that all will go well during validation

Slide No. L9-1


Validation-Based Protocol (Cont.)
• Each transaction Ti has 3 timestamps
– Start(Ti) : the time when Ti started its execution
– Validation(Ti): the time when Ti entered its validation phase
– Finish(Ti) : the time when Ti finished its write phase
• Serializability order is determined by timestamp given at validation
time, to increase concurrency.
– Thus TS(Ti) is given the value of Validation(Ti).
• This protocol is useful and gives greater degree of concurrency if
probability of conflicts is low.
– because the serializability order is not pre-decided, and
– relatively few transactions will have to be rolled back.

Slide No. L9-2


Validation Test for Transaction Tj
• If for all Ti with TS(Ti) < TS(Tj) either one of the following conditions holds:
– finish(Ti) < start(Tj)
– start(Tj) < finish(Ti) < validation(Tj) and the set of data items
written by Ti does not intersect with the set of data items read by
Tj.
then validation succeeds and Tj can be committed. Otherwise,
validation fails and Tj is aborted.
• Justification: Either the first condition is satisfied, and there is no
overlapped execution, or the second condition is satisfied and
– the writes of Tj do not affect reads of Ti since they occur after Ti has finished its reads.
– the writes of Ti do not affect reads of Tj since Tj does not read any item written by Ti.

Slide No. L9-3
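
A Python sketch of this validation test (ours; timestamps are plain integers, and a still-running Ti with unknown finish time conservatively fails both conditions):

def validate(tj, transactions):
    # transactions: txn -> dict with start, validation, finish (or None),
    # read_set, write_set. TS order is the validation order.
    me = transactions[tj]
    for ti, t in transactions.items():
        if ti == tj or t["validation"] >= me["validation"]:
            continue                    # only earlier-validating Ti matter
        if t["finish"] is not None and t["finish"] < me["start"]:
            continue                    # condition 1: no overlap at all
        if (t["finish"] is not None
                and me["start"] < t["finish"] < me["validation"]
                and not (t["write_set"] & me["read_set"])):
            continue                    # condition 2: Ti's writes missed Tj's reads
        return False                    # neither condition holds: abort Tj
    return True

txns = {
    "T14": dict(start=1, validation=4, finish=5,
                read_set={"A", "B"}, write_set=set()),
    "T15": dict(start=2, validation=6, finish=None,
                read_set={"A", "B"}, write_set={"A", "B"}),
}
print(validate("T15", txns))   # True: read-only T14 wrote nothing T15 read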


Schedule Produced by Validation
• Example of schedule produced using validation

T14                     T15
read(B)
                        read(B)
                        B := B − 50
                        read(A)
                        A := A + 50
read(A)
(validate)
display (A+B)
                        (validate)
                        write (B)
                        write (A)

Slide No. L9-4


Multiple Granularity

• Allow data items to be of various sizes and define a hierarchy of data granularities, where the small granularities are nested within larger ones
• Can be represented graphically as a tree (but don't confuse
with tree-locking protocol)
• When a transaction locks a node in the tree explicitly, it
implicitly locks all the node's descendents in the same mode.
• Granularity of locking (level in tree where locking is done):
– fine granularity (lower in tree): high concurrency, high
locking overhead
– coarse granularity (higher in tree): low locking overhead,
low concurrency

Slide No. L9-5


Example of Granularity Hierarchy

The levels, starting from the coarsest (top) level are


– database
– area
– file
– record
Slide No. L9-6
Intention Lock Modes

• In addition to S and X lock modes, there are three additional lock modes with multiple granularity:
– intention-shared (IS): indicates explicit locking at a lower
level of the tree but only with shared locks.
– intention-exclusive (IX): indicates explicit locking at a
lower level with exclusive or shared locks
– shared and intention-exclusive (SIX): the subtree rooted
by that node is locked explicitly in shared mode and
explicit locking is being done at a lower level with
exclusive-mode locks.
• intention locks allow a higher level node to be locked in S or
X mode without having to check all descendent nodes.

Slide No. L9-7


Compatibility Matrix with
Intention Lock Modes

• The compatibility matrix for all lock modes is:

        IS    IX    S     SIX   X
IS      ✓     ✓     ✓     ✓     ✗
IX      ✓     ✓     ✗     ✗     ✗
S       ✓     ✗     ✓     ✗     ✗
SIX     ✓     ✗     ✗     ✗     ✗
X       ✗     ✗     ✗     ✗     ✗

Slide No. L9-8


Multiple Granularity Locking Scheme
• Transaction Ti can lock a node Q, using the following rules:
1. The lock compatibility matrix must be observed.
2. The root of the tree must be locked first, and may be
locked in any mode.
3. A node Q can be locked by Ti in S or IS mode only if the
parent of Q is currently locked by Ti in either IX or IS
mode.
4. A node Q can be locked by Ti in X, SIX, or IX mode only if
the parent of Q is currently locked by Ti in either IX or SIX
mode.
5. Ti can lock a node only if it has not previously unlocked
any node (that is, Ti is two-phase).
6. Ti can unlock a node Q only if none of the children of Q are
currently locked by Ti.
• Observe that locks are acquired in root-to-leaf order, whereas
they are released in leaf-to-root order.
Slide No. L9-9
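
Rules 3 and 4 are pure table lookups; a small Python sketch (ours):

def parent_ok(requested, parent_mode):
    # Rules 3 and 4 of the multiple-granularity protocol:
    # S or IS on Q needs the parent held in IX or IS;
    # X, SIX or IX on Q needs the parent held in IX or SIX.
    if requested in ("S", "IS"):
        return parent_mode in ("IX", "IS")
    if requested in ("X", "SIX", "IX"):
        return parent_mode in ("IX", "SIX")
    return False

print(parent_ok("X", "IX"))    # True: can X-lock a record under an IX file
print(parent_ok("S", "SIX"))   # False: rule 3 requires IX or IS on the parent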
Recovery and Atomicity

• Modifying the database without ensuring that the transaction will commit may leave the database in an inconsistent state.
• Consider transaction Ti that transfers $50 from
account A to account B; goal is either to perform all
database modifications made by Ti or none at all.
• Several output operations may be required for Ti (to
output A and B). A failure may occur after one of
these modifications has been made but before all of
them are made.

Slide No.L1-1
Recovery and Atomicity (Cont.)
• To ensure atomicity despite failures, we first output
information describing the modifications to stable
storage without modifying the database itself.
• We study two approaches:
– log-based recovery, and
– shadow-paging
• We assume (initially) that transactions run serially, that
is, one after the other.

Slide No.L1-2
Recovery Algorithms
• Recovery algorithms are techniques to ensure database
consistency and transaction atomicity and durability
despite failures
– Focus of this chapter
• Recovery algorithms have two parts
1. Actions taken during normal transaction processing
to ensure enough information exists to recover from
failures
2. Actions taken after a failure to recover the database
contents to a state that ensures atomicity,
consistency and durability

Slide No.L1-3
Log-Based Recovery
• A log is kept on stable storage.
– The log is a sequence of log records, and maintains a record of
update activities on the database.
• When transaction Ti starts, it registers itself by writing a
<Ti start>log record
• Before Ti executes write(X), a log record <Ti, X, V1, V2> is written,
where V1 is the value of X before the write, and V2 is the value to be
written to X.
– Log record notes that Ti has performed a write on data item Xj. Xj had value V1 before the write, and will have value V2 after the write.
• When Ti finishes its last statement, the log record <Ti commit> is written.
• We assume for now that log records are written directly to stable
storage (that is, they are not buffered)
• Two approaches using logs
– Deferred database modification
– Immediate database modification

Slide No.L1-4
Deferred Database Modification
• The deferred database modification scheme records all
modifications to the log, but defers all the writes to after
partial commit.
• Assume that transactions execute serially
• Transaction starts by writing <Ti start> record to log.
• A write(X) operation results in a log record <Ti, X, V>
being written, where V is the new value for X
– Note: old value is not needed for this scheme
• The write is not performed on X at this time, but is
deferred.
• When Ti partially commits, <Ti commit> is written to the
log
• Finally, the log records are read and used to actually
execute the previously deferred writes.

Slide No.L1-5
Deferred Database Modification (Cont.)
• During recovery after a crash, a transaction needs to be redone if
and only if both <Ti start> and<Ti commit> are there in the log.
• Redoing a transaction Ti ( redoTi) sets the value of all data items
updated by the transaction to the new values.
• Crashes can occur while
– the transaction is executing the original updates, or
– while recovery action is being taken
• Example transactions T0 and T1 (T0 executes before T1):

T0: read(A)             T1: read(C)
    A := A − 50             C := C − 100
    write(A)                write(C)
    read(B)
    B := B + 50
    write(B)

Slide No.L1-6
Deferred Database Modification (Cont.)

• Below we show the log as it appears at three instances of time.
• If log on stable storage at time of crash is as in case:
(a) No redo actions need to be taken
(b) redo(T0) must be performed since <T0 commit> is present
(c) redo(T0) must be performed followed by redo(T1) since
<T0 commit> and <T1 commit> are present
Slide No.L1-7
Immediate Database Modification
• The immediate database modification scheme
allows database updates of an uncommitted
transaction to be made as the writes are issued
– since undoing may be needed, update logs must
have both old value and new value
• Update log record must be written before database
item is written
– We assume that the log record is output directly
to stable storage
– Can be extended to postpone log record output,
so long as prior to execution of an output(B)
operation for a data block B, all log records
corresponding to items in B must be flushed to
stable storage

Slide No.L2-1
Immediate Database Modification

• Output of updated blocks can take place at any time before or after transaction commit
• Order in which blocks are output can be different
from the order in which they are written.
• Recovery procedure has two operations instead of one:
– undo(Ti) restores the value of all data items
updated by Ti to their old values, going backwards
from the last log record for Ti
– redo(Ti) sets the value of all data items updated by
Ti to the new values, going forward from the first
log record for Ti

Slide No.L2-2
Immediate Database Modification (Cont.)
• Both operations must be idempotent
– That is, even if the operation is executed multiple
times the effect is the same as if it is executed once
• Needed since operations may get re-executed
during recovery
• When recovering after failure:
– Transaction Ti needs to be undone if the log contains
the record
<Ti start>, but does not contain the record <Ti
commit>.
– Transaction Ti needs to be redone if the log contains
both the record <Ti start> and the record <Ti
commit>.
• Undo operations are performed first, then redo
operations.

Slide No.L2-3
Immediate Database Modification Example
Log                        Write            Output

<T0 start>
<T0, A, 1000, 950>
<T0, B, 2000, 2050>
                           A = 950
                           B = 2050
<T0 commit>
<T1 start>
<T1, C, 700, 600>
                           C = 600
                                            BB, BC
<T1 commit>
                                            BA

• Note: BX denotes block containing X.

Slide No.L2-4
Immediate DB Modification Recovery Example
Below we show the log as it appears at three instances of
time.

Recovery actions in each case above are:


(a) undo (T0): B is restored to 2000 and A to 1000.
(b) undo (T1) and redo (T0): C is restored to 700, and then A
and B are
set to 950 and 2050 respectively.
(c) redo (T0) and redo (T1): A and B are set to 950 and 2050
respectively. Then C is set to 600

Slide No.L2-5
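
The undo/redo decision can be prototyped directly from such a log. A minimal Python sketch (ours; the dictionary returned holds only the values recovery would write back, on top of whatever survived on disk):

def recover(log):
    # log: list of records such as ("start", "T0"),
    # ("write", "T0", "A", 1000, 950) and ("commit", "T0").
    # Immediate-modification rules: undo transactions with <start> but
    # no <commit>; redo transactions with both. Undo runs first.
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    writes = {}
    # undo incomplete transactions, scanning the log backwards
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, txn, item, old, new = rec
            writes[item] = old          # restore the old value
    # redo committed transactions, scanning the log forwards
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, txn, item, old, new = rec
            writes[item] = new          # reinstall the new value
    return writes

# Case (b) of the recovery example: crash after <T1, C, 700, 600>
log = [("start", "T0"), ("write", "T0", "A", 1000, 950),
       ("write", "T0", "B", 2000, 2050), ("commit", "T0"),
       ("start", "T1"), ("write", "T1", "C", 700, 600)]
print(recover(log))   # {'C': 700, 'A': 950, 'B': 2050}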
Checkpoints
• Problems in recovery procedure as discussed earlier :
1. searching the entire log is time-consuming
2. we might unnecessarily redo transactions which have already output their updates to the database.
• Streamline recovery procedure by periodically performing
checkpointing
1. Output all log records currently residing in main
memory onto stable storage.
2. Output all modified buffer blocks to the disk.
3. Write a log record < checkpoint> onto stable storage.

Slide No.L2-6
Checkpoints (Cont.)
• During recovery we need to consider only the most recent
transaction Ti that started before the checkpoint, and transactions
that started after Ti.
1. Scan backwards from end of log to find the most recent
<checkpoint> record
2. Continue scanning backwards till a record <Ti start> is found.
3. Need only consider the part of log following above start record.
Earlier part of log can be ignored during recovery, and can be
erased whenever desired.
4. For all transactions (starting from Ti or later) with no <Ti
commit>, execute undo(Ti). (Done only in case of immediate
modification.)
5. Scanning forward in the log, for all transactions starting from Ti
or later with a <Ti commit>, execute redo(Ti).

Slide No.L2-7
Example of Checkpoints

(Figure: transactions T1–T4 on a timeline, with a checkpoint at time Tc and a system failure at Tf; T1 finishes before Tc, T2 and T3 commit between Tc and Tf, and T4 is still active at Tf)

• T1 can be ignored (updates already output to disk due to checkpoint)
• T2 and T3 redone.
• T4 undone
Slide No.L2-8
Recovery With Concurrent Transactions
• We modify the log-based recovery schemes to allow multiple
transactions to execute concurrently.
– All transactions share a single disk buffer and a single log
– A buffer block can have data items updated by one or more
transactions
• We assume concurrency control using strict two-phase locking;
– i.e. the updates of uncommitted transactions should not be
visible to other transactions
• Otherwise how to perform undo if T1 updates A, then T2
updates A and commits, and finally T1 has to abort?
• Logging is done as described earlier.
– Log records of different transactions may be interspersed in the log.

Slide No.L3-1
Recovery With Concurrent Transactions
• The checkpointing technique and actions taken
on recovery have to be changed
– since several transactions may be active when
a checkpoint is performed.

• Checkpoints are performed as before, except that the checkpoint log record is now of the form <checkpoint L>
where L is the list of transactions active at the
time of the checkpoint
– We assume no updates are in progress while
the checkpoint is carried out (will relax this
later)

Slide No.L3-2
Recovery With Concurrent Transactions (Cont.)

• When the system recovers from a crash, it first does the following:
1. Initialize undo-list and redo-list to empty
2. Scan the log backwards from the end, stopping
when the first <checkpoint L> record is found.
For each record found during the backward
scan:
– if the record is <Ti commit>, add Ti to redo-list
– if the record is <Ti start>, then if Ti is not in redo-list, add Ti to undo-list
3. For every Ti in L, if Ti is not in redo-list, add Ti to
undo-list

Slide No.L3-3
Recovery With Concurrent Transactions (Cont.)
• At this point undo-list consists of incomplete transactions which
must be undone, and redo-list consists of finished transactions
that must be redone.
• Recovery now continues as follows:
1. Scan log backwards from most recent record, stopping when
<Ti start> records have been encountered for every Ti in
undo-list.
– During the scan, perform undo for each log record that belongs to a transaction in undo-list.
2. Locate the most recent <checkpoint L> record.
3. Scan log forwards from the <checkpoint L> record till the
end of the log.
– During the scan, perform redo for each log record that belongs to a transaction on redo-list

Slide No.L3-4
Example of Recovery
• Go over the steps of the recovery algorithm on the following
log:
<T0 start>
<T0, A, 0, 10>
<T0 commit>
<T1 start> /* Scan at step 1 comes up to here */
<T1, B, 0, 10>
<T2 start>
<T2, C, 0, 10>
<T2, C, 10, 20>
<checkpoint {T1, T2}>
<T3 start>
<T3, A, 10, 20>
<T3, D, 0, 10>
<T3 commit>

Slide No.L3-5
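
A Python sketch of step 1's backward scan (ours), run on a log shaped like the example above:

def build_lists(log):
    # Backward scan to the most recent <checkpoint L> record,
    # classifying transactions into redo-list and undo-list.
    redo, undo = set(), set()
    for rec in reversed(log):
        kind = rec[0]
        if kind == "commit":
            redo.add(rec[1])
        elif kind == "start" and rec[1] not in redo:
            undo.add(rec[1])
        elif kind == "checkpoint":
            for t in rec[1]:            # L = transactions active at checkpoint
                if t not in redo:
                    undo.add(t)
            break                       # earlier log entries are not scanned
    return undo, redo

log = [("start", "T0"), ("write", "T0", "A", 0, 10), ("commit", "T0"),
       ("start", "T1"), ("write", "T1", "B", 0, 10),
       ("start", "T2"), ("write", "T2", "C", 0, 10),
       ("write", "T2", "C", 10, 20),
       ("checkpoint", ["T1", "T2"]),
       ("start", "T3"), ("write", "T3", "A", 10, 20),
       ("write", "T3", "D", 0, 10), ("commit", "T3")]
print(build_lists(log))   # undo-list {T1, T2}, redo-list {T3}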
Log Record Buffering

• Log record buffering: log records are buffered in main memory, instead of being output directly to stable storage.
– Log records are output to stable storage when a
block of log records in the buffer is full, or a log
force operation is executed.
• Log force is performed to commit a transaction by
forcing all its log records (including the commit record)
to stable storage.
• Several log records can thus be output using a single
output operation, reducing the I/O cost.

Slide No.L4-1
Log Record Buffering (Cont.)
• The rules below must be followed if log records are
buffered:
– Log records are output to stable storage in the order in
which they are created.
– Transaction Ti enters the commit state only when the
log record
<Ti commit> has been output to stable storage.
– Before a block of data in main memory is output to the
database, all log records pertaining to data in that
block must have been output to stable storage.
• This rule is called the write-ahead logging or WAL rule
– Strictly speaking WAL only requires undo information to be
output

Slide No.L4-2
Database Buffering
• Database maintains an in-memory buffer of data blocks
– When a new block is needed, if buffer is full an existing block
needs to be removed from buffer
– If the block chosen for removal has been updated, it must be
output to disk
• If a block with uncommitted updates is output to disk, log records
with undo information for the updates are output to the log on
stable storage first
– (Write ahead logging)
• No updates should be in progress on a block when it is output to
disk. Can be ensured as follows.
– Before writing a data item, transaction acquires exclusive lock
on block containing the data item
– Lock can be released once the write is completed.
• Such locks held for short duration are called latches.
– Before a block is output to disk, the system acquires an
exclusive latch on the block
• Ensures no update can be in progress on the block

Slide No.L4-3
Buffer Management (Cont.)
• Database buffer can be implemented either
– in an area of real main-memory reserved for the
database, or
– in virtual memory
• Implementing buffer in reserved main-memory has
drawbacks:
– Memory is partitioned before-hand between database
buffer and applications, limiting flexibility.
– Needs may change, and although operating system
knows best how memory should be divided up at any
time, it cannot change the partitioning of memory.
• Database buffers are generally implemented in virtual
memory in spite of some drawbacks:
– When operating system needs to evict a page that has
been modified, the page is written to swap space on disk.

Slide No.L4-4
Buffer Management (Cont.)
– When database decides to write buffer page to disk,
buffer page may be in swap space, and may have to
be read from swap space on disk and output to the
database on disk, resulting in extra I/O!
• Known as dual paging problem.
– Ideally when OS needs to evict a page from the buffer,
it should pass control to database, which in turn
should
1. Output the page to database instead of to swap
space (making sure to output log records first), if it
is modified
2. Release the page from the buffer, for the OS to use
Dual paging can thus be avoided, but common
operating systems do not support such
functionality.

Slide No.L4-5
Failure with Loss of Nonvolatile Storage

• So far we assumed no loss of non-volatile storage
• Technique similar to checkpointing used to
deal with loss of non-volatile storage
– Periodically dump the entire content of the
database to stable storage
– No transaction may be active during the dump
procedure; a procedure similar to checkpointing
must take place
• Output all log records currently residing in main
memory onto stable storage.
• Output all buffer blocks onto the disk.
• Copy the contents of the database to stable storage.
• Output a record <dump> to log on stable storage.

Slide No.L5-1
Recovering from Failure of Non-Volatile Storage

• To recover from disk failure


– restore database from most recent dump.
– Consult the log and redo all transactions that
committed after the dump
• Can be extended to allow transactions to be
active during dump;
known as fuzzy dump or online dump
– Will study fuzzy checkpointing later

Slide No.L5-2
Advanced Recovery: Key Features
• Support for high-concurrency locking techniques, such
as those used for B+-tree concurrency control, which
release locks early
– Supports “logical undo”
• Recovery based on “repeating history”, whereby recovery
executes exactly the same actions as normal processing
– including redo of log records of incomplete
transactions, followed by subsequent undo
– Key benefits
• supports logical undo
• easier to understand/show correctness

Slide No.L5-3
Advanced Recovery: Logical Undo Logging

• Operations like B+-tree insertions and deletions release locks early.
– They cannot be undone by restoring old values
(physical undo), since once a lock is released,
other transactions may have updated the B +-
tree.
– Instead, insertions (resp. deletions) are undone
by executing a deletion (resp. insertion)
operation (known as logical undo).
• For such operations, undo log records should
contain the undo operation to be executed
– Such logging is called logical undo logging, in
contrast to physical undo logging
• Operations are called logical operations

Slide No.L5-4
Advanced Recovery: Physical Redo
• Redo information is logged physically (that is, new
value for each write) even for operations with logical
undo
– Logical redo is very complicated since database state
on disk may not be “operation consistent” when
recovery starts
– Physical redo logging does not conflict with early lock
release

Slide No.L5-5
Advanced Recovery: Operation Logging
• Operation logging is done as follows:
1. When operation starts, log <Ti, Oj, operation-begin>.
Here Oj is a unique identifier of the operation
instance.
2. While operation is executing, normal log records with
physical redo and physical undo information are
logged.
3. When operation completes, <Ti, Oj, operation-end,
U> is logged, where U contains information needed
to perform a logical undo.
Example: insert of (key, record-id) pair (K5, RID7) into index I9
<T1, O1, operation-begin>
…. Physical redo of steps in insert
<T1, X, 10, K5>
<T1, Y, 45, RID7>
<T1, O1, operation-end, (delete I9, K5, RID7)>
Slide No.L5-6
Advanced Recovery: Operation Logging (Cont.)
• If crash/rollback occurs before operation completes:
– the operation-end log record is not found, and
– the physical undo information is used to undo
operation.
• If crash/rollback occurs after the operation completes:
– the operation-end log record is found, and in this
case
– logical undo is performed using U; the physical
undo information for the operation is ignored.
• Redo of operation (after crash) still uses physical redo
information.
Slide No.L5-7
Advanced Recovery: Txn Rollback
Rollback of transaction Ti is done as follows:
• Scan the log backwards
1. If a log record <Ti, X, V1, V2> is found, perform the undo
and log a special redo-only log record <Ti, X, V1>.
2. If a <Ti, Oj, operation-end, U> record is found
• Rollback the operation logically using the undo
information U.
– Updates performed during roll back are logged just
like during normal operation execution.
– At the end of the operation rollback, instead of
logging an operation-end record, generate a record
<Ti, Oj, operation-abort>.
• Skip all preceding log records for Ti until the record
<Ti, Oj, operation-begin> is found
Slide No.L5-8
Advanced Recovery: Txn Rollback (Cont.)
• Scan the log backwards (cont.):
3. If a redo-only record is found, ignore it
4. If a <Ti, Oj, operation-abort> record is found:
• skip all preceding log records for Ti until the record
<Ti, Oj, operation-begin> is found.
5. Stop the scan when the record <Ti, start> is found
6. Add a <Ti, abort> record to the log
Some points to note:
• Cases 3 and 4 above can occur only if the database
crashes while a transaction is being rolled back.
• Skipping of log records as in case 4 is important to
prevent multiple rollback of the same operation.
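A minimal Python sketch of this backward scan is given below. The dict-shaped log records and the run_logical_undo callback are illustrative assumptions; a real recovery manager works on disk-resident log pages.

def rollback(log, txn, db, run_logical_undo):
    # Backward scan implementing cases 1-6 above. run_logical_undo(U, db, log)
    # stands in for executing the inverse operation, logging its updates normally.
    i = len(log) - 1
    while i >= 0:
        rec = log[i]
        if rec["txn"] == txn:
            kind = rec["kind"]
            if kind == "update":                        # 1. physical undo + redo-only record
                db[rec["item"]] = rec["old"]
                log.append({"txn": txn, "kind": "redo-only",
                            "item": rec["item"], "new": rec["old"]})
            elif kind == "operation-end":               # 2. logical undo using U
                run_logical_undo(rec["undo"], db, log)
                log.append({"txn": txn, "kind": "operation-abort", "op": rec["op"]})
                while not (log[i]["txn"] == txn and log[i]["kind"] == "operation-begin"
                           and log[i].get("op") == rec["op"]):
                    i -= 1                              # skip to matching operation-begin
            elif kind == "redo-only":                   # 3. ignore it
                pass
            elif kind == "operation-abort":             # 4. skip to operation-begin, so the
                while not (log[i]["txn"] == txn         #    operation is not undone twice
                           and log[i]["kind"] == "operation-begin"
                           and log[i].get("op") == rec["op"]):
                    i -= 1
            elif kind == "start":                       # 5. stop the scan
                break
        i -= 1
    log.append({"txn": txn, "kind": "abort"})           # 6. write <Ti, abort>

db = {"Z": 70}
log = [{"txn": "T1", "kind": "start"},
       {"txn": "T1", "kind": "update", "item": "Z", "old": 45, "new": 70}]
rollback(log, "T1", db, lambda u, db, log: None)
print(db["Z"], log[-1]["kind"])   # 45 abort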
Slide No.L5-9
Advanced Recovery: Txn Rollback Example
• Example with a complete and an incomplete operation:
<T1, start>
<T1, O1, operation-begin>
….
<T1, X, 10, K5>
<T1, Y, 45, RID7>
<T1, O1, operation-end, (delete I9, K5, RID7)>
<T1, O2, operation-begin>
<T1, Z, 45, 70>
← T1 rollback begins here
<T1, Z, 45> ← redo-only log record written during physical undo (of incomplete O2)
<T1, Y, .., ..> ← normal redo records written during logical undo of O1
<T1, O1, operation-abort> ← what if the crash had occurred immediately after this?
<T1, abort>
Slide No.L5-10
Advanced Recovery: Crash Recovery
The following actions are taken when recovering from system crash
1. (Redo phase): Scan log forward from last <checkpoint L> record till
end of log
1. Repeat history by physically redoing all updates of all
transactions,
2. Create an undo-list during the scan as follows
• undo-list is set to L initially
• Whenever <Ti start> is found Ti is added to undo-list
• Whenever <Ti commit> or <Ti abort> is found, Ti is deleted
from undo-list
This brings database to state as of crash, with committed as well as
uncommitted transactions having been redone.
Now undo-list contains transactions that are incomplete, that is,
have neither committed nor been fully rolled back.
Slide No.L6-1
Advanced Recovery: Crash Recovery (Cont.)
Recovery from system crash (cont.):
2. (Undo phase): Scan log backwards, performing undo
on log records of transactions found in undo-list.
– Log records of transactions being rolled back are
processed as described earlier, as they are found
• Single shared scan for all transactions being undone
– When <Ti start> is found for a transaction Ti in
undo-list, write a <Ti abort> log record.
– Stop scan when <Ti start> records have been
found for all Ti in undo-list
• This undoes the effects of incomplete transactions
(those with neither commit nor abort log records).
Recovery is now complete.
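Both phases can be sketched together in Python. This is a simplified illustration assuming physical records only and a scan of the whole log (checkpoints and logical operations are omitted); the record shapes are hypothetical.

def crash_recover(log, db):
    undo_list = set()
    for rec in log:                                   # redo phase: repeat history
        kind, txn = rec["kind"], rec["txn"]
        if kind == "start":
            undo_list.add(txn)
        elif kind in ("commit", "abort"):
            undo_list.discard(txn)
        elif kind in ("update", "redo-only"):
            db[rec["item"]] = rec["new"]              # physical redo, committed or not
    for rec in reversed(list(log)):                   # undo phase: backward scan
        txn = rec["txn"]
        if txn in undo_list:
            if rec["kind"] == "update":
                db[rec["item"]] = rec["old"]          # undo, logging a redo-only record
                log.append({"txn": txn, "kind": "redo-only",
                            "item": rec["item"], "new": rec["old"]})
            elif rec["kind"] == "start":
                log.append({"txn": txn, "kind": "abort"})
                undo_list.discard(txn)
        if not undo_list:
            break

log = [{"txn": "T1", "kind": "start"},
       {"txn": "T1", "kind": "update", "item": "A", "old": 10, "new": 20},
       {"txn": "T2", "kind": "start"},
       {"txn": "T2", "kind": "update", "item": "B", "old": 1, "new": 2},
       {"txn": "T2", "kind": "commit"}]
db = {}
crash_recover(log, db)
print(db)   # {'A': 10, 'B': 2}: T2 redone, incomplete T1 rolled back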
Slide No.L6-2
Advanced Recovery: Checkpointing
• Checkpointing is done as follows:
1. Output all log records in memory to stable storage
2. Output to disk all modified buffer blocks
3. Output to log on stable storage a <checkpoint L>
record.
Transactions are not allowed to perform any actions
while checkpointing is in progress.
• Fuzzy checkpointing allows transactions to progress
while the most time consuming parts of checkpointing
are in progress
– Performed as described on next slide
Slide No.L6-3
Advanced Recovery: Fuzzy Checkpointing
• Fuzzy checkpointing is done as follows:
1. Temporarily stop all updates by transactions
2. Write a <checkpoint L> log record and force log to
stable storage
3. Note list M of modified buffer blocks
4. Now permit transactions to proceed with their
actions
5. Output to disk all modified buffer blocks in list M
• Blocks should not be updated while being output
• Follow WAL: all log records pertaining to a block
must be output before the block is output
6. Store a pointer to the checkpoint record in a fixed
position last_checkpoint on disk
[Figure: the log contains successive <checkpoint L> records; last_checkpoint points to the most recent one]

Slide No.L6-4
Advanced Recovery: Fuzzy Checkpointing (Cont.)
• When recovering using a fuzzy checkpoint, start the scan
from the checkpoint record pointed to by
last_checkpoint
– Log records before last_checkpoint have their
updates reflected in database on disk, and need not
be redone.
– Incomplete checkpoints, where system had crashed
while performing checkpoint, are handled safely
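The six steps can be sketched in Python as follows. The single state dict standing in for buffer pool, log, and disk is an illustrative assumption; in this toy version, forcing the whole log at step 2 is what makes the WAL requirement of step 5 hold.

def fuzzy_checkpoint(state):
    state["updates_allowed"] = False                   # 1. temporarily stop all updates
    ckpt_pos = len(state["log"])
    state["log"].append(("checkpoint", sorted(state["active_txns"])))
    state["stable_log"] = list(state["log"])           # 2. force <checkpoint L> to stable storage
    m = list(state["dirty_blocks"])                    # 3. note list M of modified buffer blocks
    state["updates_allowed"] = True                    # 4. transactions may proceed again
    for block in m:                                    # 5. output blocks in M (a real system
        state["disk"][block] = state["buffer"][block]  #    latches each block during the write)
        state["dirty_blocks"].discard(block)
    state["last_checkpoint"] = ckpt_pos                # 6. pointer in a fixed disk position

state = {"log": [], "stable_log": [], "active_txns": {"T1"},
         "dirty_blocks": {"B1"}, "buffer": {"B1": "new"}, "disk": {"B1": "old"},
         "updates_allowed": True}
fuzzy_checkpoint(state)
print(state["disk"]["B1"], state["last_checkpoint"])   # new 0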
Slide No.L6-5
Data on External Storage
• Disks: Can retrieve random page at fixed cost
– But reading several consecutive pages is much cheaper than
reading them in random order
• Tapes: Can only read pages in sequence
– Cheaper than disks; used for archival storage
• File organization: Method of arranging a file of records on external
storage.
– Record id (rid) is sufficient to physically locate record
– Indexes are data structures that allow us to find the record ids of
records with given values in index search key fields
• Architecture: Buffer manager stages pages from external storage to
main memory buffer pool. File and index layers make calls to the
buffer manager.
Slide No:L1-1
Alternative File Organizations
Many alternatives exist, each ideal for some situations, and not
so good in others:
– Heap (random order) files: Suitable when typical access
is a file scan retrieving all records.
– Sorted Files: Best if records must be retrieved in some
order, or only a `range’ of records is needed.
– Indexes: Data structures to organize records via trees or
hashing.
• Like sorted files, they speed up searches for a subset
of records, based on values in certain (“search key”)
fields
• Updates are much faster than in sorted files.
Slide No:L1-2
Index Classification
• Primary vs. secondary: If search key contains primary key, then
called primary index.
– Unique index: Search key contains a candidate key.
• Clustered vs. unclustered: If order of data records is the same as,
or `close to’, order of data entries, then called clustered index.
– Alternative 1 implies clustered; in practice, clustered also
implies Alternative 1 (since sorted files are rare).
– A file can be clustered on at most one search key.
– Cost of retrieving data records through index varies greatly
based on whether index is clustered or not!
Slide No:L1-3
Clustered vs. Unclustered Index
• Suppose that Alternative (2) is used for data entries, and that the data
records are stored in a Heap file.
– To build clustered index, first sort the Heap file (with some free
space on each page for future inserts).
– Overflow pages may be needed for inserts. (Thus, order of data
recs is `close to’, but not identical to, the sort order.)
[Figure: clustered vs. unclustered index. In both cases, index entries direct the search to data entries. Clustered: the data records in the data file are stored in (roughly) the same order as the data entries in the index file. Unclustered: the data records are in a different order.]
Slide No:L1-4
Indexes
• An index on a file speeds up selections on the search
key fields for the index.
– Any subset of the fields of a relation can be the
search key for an index on the relation.
– Search key is not the same as key (minimal set of
fields that uniquely identify a record in a relation).
• An index contains a collection of data entries, and
supports efficient retrieval of all data entries k* with
a given key value k.
– Given data entry k*, we can find record with key
k in at most one disk I/O. (Details soon …)
Slide No:L2-1
B+ Tree Indexes
[Figure: B+ tree — non-leaf pages above, leaf pages below; leaf pages are sorted by search key]
• Leaf pages contain data entries, and are chained (prev & next)
• Non-leaf pages have index entries, used only to direct searches:
index entry: <P0, K1, P1, K2, P2, … , Km, Pm>
Slide No:L2-2
Example B+ Tree

Root Note how data entries


17 in leaf level are sorted

Entries <= 17 Entries > 17

5 13 27 30

2* 3* 5* 7* 8* 14* 16* 22* 24* 27* 29* 33* 34* 38* 39*

• Find 28*? 29*? All > 15* and < 30*


• Insert/delete: Find data entry in leaf, then change
it. Need to adjust parent sometimes.
– And change sometimes bubbles up the tree
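A search on this example tree can be sketched as follows; the dict-based node layout is an illustrative assumption, and the usual convention is used that keys >= Ki go to the child to the right of Ki.

def search(node, key):
    # descend from the root; at each non-leaf node, key comparisons pick
    # the child to follow
    while not node["leaf"]:
        i = 0
        while i < len(node["keys"]) and key >= node["keys"][i]:
            i += 1
        node = node["children"][i]
    return [k for k in node["entries"] if k == key]

left = {"leaf": False, "keys": [5, 13],
        "children": [{"leaf": True, "entries": [2, 3]},
                     {"leaf": True, "entries": [5, 7, 8]},
                     {"leaf": True, "entries": [14, 16]}]}
right = {"leaf": False, "keys": [27, 30],
         "children": [{"leaf": True, "entries": [22, 24]},
                      {"leaf": True, "entries": [27, 29]},
                      {"leaf": True, "entries": [33, 34, 38, 39]}]}
root = {"leaf": False, "keys": [17], "children": [left, right]}

print(search(root, 29))   # [29]: found in leaf (27*, 29*)
print(search(root, 28))   # []: 28* is not in the tree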
Slide No:L2-3
Hash-Based Indexes
• Good for equality selections.
• Index is a collection of buckets.
– Bucket = primary page plus zero or more
overflow pages.
– Buckets contain data entries.
• Hashing function h: h(r) = bucket in which (data
entry for) record r belongs. h looks at the search
key fields of r.
– No need for “index entries” in this scheme.
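A minimal sketch of such an index, with a primary page per bucket plus overflow chaining; the page capacity, bucket count, and hash function are illustrative choices.

PAGE_SIZE = 2    # data entries per page (tiny, for illustration)
N_BUCKETS = 4

def h(key):
    return hash(key) % N_BUCKETS   # h looks at the search key fields

buckets = [[[]] for _ in range(N_BUCKETS)]   # bucket = primary page + overflow pages

def insert(key, rid):
    pages = buckets[h(key)]
    if len(pages[-1]) == PAGE_SIZE:          # last page full: chain a new overflow page
        pages.append([])
    pages[-1].append((key, rid))

def lookup(key):                             # equality selection touches one bucket only
    return [rid for page in buckets[h(key)] for k, rid in page if k == key]

for i, k in enumerate(["joe", "sue", "bob", "joe", "joe"]):
    insert(k, "rid%d" % i)
print(lookup("joe"))                         # all rids with search key 'joe'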
Slide No:L2-4
Alternatives for Data Entry k* in Index
• In a data entry k* we can store:
– Data record with key value k, or
– <k, rid of data record with search key value k>, or
– <k, list of rids of data records with search key k>
• Choice of alternative for data entries is orthogonal to the
indexing technique used to locate data entries with a
given key value k.
– Examples of indexing techniques: B+ trees, hash-
based structures
– Typically, index contains auxiliary information that
directs searches to the desired data entries
Slide No:L2-5
Alternatives for Data Entries (Contd.)
• Alternative 1:
– If this is used, index structure is a file organization
for data records (instead of a Heap file or sorted
file).
– At most one index on a given collection of data
records can use Alternative 1. (Otherwise, data
records are duplicated, leading to redundant storage
and potential inconsistency.)
– If data records are very large, # of pages containing
data entries is high. Implies size of auxiliary
information in the index is also large, typically.
Slide No:L2-6
Alternatives for Data Entries (Contd.)
• Alternatives 2 and 3:
– Data entries typically much smaller than data
records. So, better than Alternative 1 with large
data records, especially if search keys are small.
(Portion of index structure used to direct search,
which depends on size of data entries, is much
smaller than with Alternative 1.)
– Alternative 3 more compact than Alternative 2,
but leads to variable sized data entries even if
search keys are of fixed length.
Slide No:L2-7
Cost Model for Our Analysis
We ignore CPU costs, for simplicity:
– B: The number of data pages
– R: Number of records per page
– D: (Average) time to read or write disk
page
– Measuring number of page I/O’s ignores
gains of pre-fetching a sequence of pages;
thus, even I/O cost is only approximated.
– Average-case analysis; based on several
simplistic assumptions.
Slide No:L3-1
Comparing File Organizations
• Heap files (random order; insert at eof)
• Sorted files, sorted on <age, sal>
• Clustered B+ tree file, Alternative (1), search key
<age, sal>
• Heap file with unclustered B+ tree index on search
key <age, sal>
• Heap file with unclustered hash index on search key
<age, sal>
Slide No:L3-2
Operations to Compare
• Scan: Fetch all records from disk
• Equality search
• Range selection
• Insert a record
• Delete a record
Slide No:L3-3
Assumptions in Our Analysis
• Heap Files:
– Equality selection on key; exactly one match.
• Sorted Files:
– Files compacted after deletions.
• Indexes:
– Alt (2), (3): data entry size = 10% size of record
– Hash: No overflow buckets.
• 80% page occupancy => File size = 1.25 data size
– Tree: 67% occupancy (this is typical).
• Implies file size = 1.5 data size
Slide No:L3-4
Assumptions (contd.)
• Scans:
– Leaf levels of a tree-index are chained.
– Index data-entries plus actual file
scanned for unclustered indexes.
• Range searches:
– We use tree indexes to restrict the set
of data records fetched, but ignore hash
indexes.
Slide No:L3-5
Cost of Operations
(Per file organization; `Search' = cost of finding the record first, i.e. (b) or (c).)
(1) Heap: (a) Scan BD; (b) Equality 0.5 BD; (c) Range BD; (d) Insert 2D; (e) Delete Search + D
(2) Sorted: (a) Scan BD; (b) Equality D log2 B; (c) Range D(log2 B + # pages with matching recs); (d) Insert Search + BD; (e) Delete Search + BD
(3) Clustered: (a) Scan 1.5 BD; (b) Equality D logF 1.5B; (c) Range D(logF 1.5B + # pages with matching recs); (d) Insert Search + D; (e) Delete Search + D
(4) Unclustered tree index: (a) Scan BD(R + 0.15); (b) Equality D(1 + logF 0.15B); (c) Range D(logF 0.15B + # pages with matching recs); (d) Insert Search + 2D; (e) Delete Search + 2D
(5) Unclustered hash index: (a) Scan BD(R + 0.125); (b) Equality 2D; (c) Range BD; (d) Insert Search + 2D; (e) Delete Search + 2D
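The formulas in the table can be evaluated directly; a small sketch, with the values of B, R, D, and the fanout F chosen purely for illustration.

from math import log

# B = # data pages, R = # records per page, D = time per page I/O, F = fanout
B, R, D, F = 10000, 100, 15e-3, 100

costs = {
    "heap":             {"scan": B * D, "equality": 0.5 * B * D},
    "sorted":           {"scan": B * D, "equality": D * log(B, 2)},
    "clustered":        {"scan": 1.5 * B * D, "equality": D * log(1.5 * B, F)},
    "unclustered tree": {"scan": B * D * (R + 0.15),
                         "equality": D * (1 + log(0.15 * B, F))},
    "unclustered hash": {"scan": B * D * (R + 0.125), "equality": 2 * D},
}
for org, ops in costs.items():
    print(org, {op: round(sec, 2) for op, sec in ops.items()})  # seconds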
Slide No:L4-1
Understanding the Workload
• For each query in the workload:
– Which relations does it access?
– Which attributes are retrieved?
– Which attributes are involved in selection/join
conditions? How selective are these conditions likely
to be?
• For each update in the workload:
– Which attributes are involved in selection/join
conditions? How selective are these conditions likely
to be?
– The type of update (INSERT/DELETE/UPDATE), and the
attributes that are affected.
Slide No:L4-2
Choice of Indexes
• What indexes should we create?
– Which relations should have indexes? What field(s) should be the
search key? Should we build several indexes?
• For each index, what kind of an index should it be?
– Clustered? Hash/tree?
Slide No:L5-1
Choice of Indexes (Contd.)
• One approach: Consider the most important queries in turn. Consider
the best plan using the current indexes, and see if a better plan is
possible with an additional index. If so, create it.
– Obviously, this implies that we must understand
how a DBMS evaluates queries and creates query
evaluation plans!
– For now, we discuss simple 1-table queries.
• Before creating an index, must also consider the impact on updates in
the workload!
– Trade-off: Indexes can make queries go faster,
updates slower. Require disk space, too.
Slide No:L5-2
Index Selection Guidelines
• Attributes in WHERE clause are candidates for index keys.
– Exact match condition suggests hash index.
– Range query suggests tree index.
• Clustering is especially useful for range
queries; can also help on equality queries if
there are many duplicates.
• Multi-attribute search keys should be considered when a WHERE
clause contains several conditions.
– Order of attributes is important for range queries.
– Such indexes can sometimes enable index-only
strategies for important queries.
• For index-only strategies, clustering is not
important!
Slide No:L5-3
Examples of Clustered Indexes
• B+ tree index on E.age can be used to get qualifying tuples:
SELECT E.dno
FROM Emp E
WHERE E.age>40
– How selective is the condition?
– Is the index clustered?
• Consider the GROUP BY query:
SELECT E.dno, COUNT (*)
FROM Emp E
WHERE E.age>10
GROUP BY E.dno
– If many tuples have E.age > 10, using the E.age index and
sorting the retrieved tuples may be costly.
– Clustered E.dno index may be better!
• Equality queries and duplicates:
SELECT E.dno
FROM Emp E
WHERE E.hobby='Stamps'
– Clustering on E.hobby helps!

Slide No:L5-4
Indexes with Composite Search Keys
• Composite Search Keys: Search on a combination of fields.
– Equality query: Every field value is equal to a constant
value. E.g. wrt <sal,age> index: age=20 and sal=75
– Range query: Some field value is not a constant. E.g.:
age=20; or age=20 and sal > 10
• Data entries in index sorted by search key to support range queries.
– Lexicographic order, or
– Spatial order.
[Figure: examples of composite key indexes using lexicographic order. Data records, sorted by name (fields name, age, sal): (bob, 12, 10), (cal, 11, 80), (joe, 12, 20), (sue, 13, 75). Data entries in <age,sal> index: 11,80; 12,10; 12,20; 13,75. Data entries in <age> index: 11; 12; 12; 13. Data entries in <sal,age> index: 10,12; 20,12; 75,13; 80,11. Data entries in <sal> index: 10; 20; 75; 80.]
Slide No:L6-1
Composite Search Keys
• To retrieve Emp records with age=30 AND sal=4000, an index
on <age,sal> would be better than an index on age or an index
on sal.
– Choice of index key orthogonal to clustering etc.
• If condition is: 20<age<30 AND 3000<sal<5000:
– Clustered tree index on <age,sal> or <sal,age> is best.
• If condition is: age=30 AND 3000<sal<5000:
– Clustered <age,sal> index much better than
<sal,age> index!
• Composite indexes are larger, updated more often.
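Lexicographic order on a composite key behaves exactly like Python tuple ordering, which makes the contrast easy to see; the data values below are illustrative.

# Tuple comparison in Python is lexicographic, exactly like a composite key.
entries = sorted([(11, 80), (12, 10), (12, 20), (13, 75), (11, 15)])
# Under <age,sal> order, all entries with age=12 form one contiguous range:
print([e for e in entries if e[0] == 12])          # [(12, 10), (12, 20)]
# Under <sal,age> order the same entries are scattered across the index:
print(sorted((sal, age) for age, sal in entries))  # (10,12) … (15,11) … (20,12)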
Slide No:L6-2
Index-Only Plans
• A number of queries can be answered without retrieving any
tuples from one or more of the relations involved, if a suitable
index is available:
– <E.dno> index:
SELECT E.dno, COUNT(*)
FROM Emp E
GROUP BY E.dno
– <E.dno,E.sal> tree index:
SELECT E.dno, MIN(E.sal)
FROM Emp E
GROUP BY E.dno
– <E.age,E.sal> or <E.sal,E.age> tree index:
SELECT AVG(E.sal)
FROM Emp E
WHERE E.age=25 AND E.sal BETWEEN 3000 AND 5000
Slide No:L6-3
Summary
• Many alternative file organizations exist, each appropriate in some
situation.
• If selection queries are frequent, sorting the file or building an
index is important.
– Hash-based indexes only good for equality search.
– Sorted files and tree-based indexes best for range
search; also good for equality search. (Files rarely
kept sorted in practice; B+ tree index is better.)
• Index is a collection of data entries plus a way to quickly find
entries with given key values.
Slide No:L6-4
Summary (Contd.)
• Data entries can be actual data records, <key, rid> pairs, or
<key, rid-list> pairs.
– Choice orthogonal to indexing technique
used to locate data entries with a given key
value.
• Can have several indexes on a given file of data records,
each with a different search key.
• Indexes can be classified as clustered vs. unclustered,
primary vs. secondary, and dense vs. sparse. Differences
have important consequences for utility/performance.
Slide No:L6-5
Introduction
• As for any index, 3 alternatives for data entries k*:
– Data record with key value k
– <k, rid of data record with search key value
k>
– <k, list of rids of data records with search key
k>
• Choice is orthogonal to the indexing technique used to locate
data entries k*.
• Tree-structured indexing techniques support both range
searches and equality searches.
• ISAM: static structure; B+ tree: dynamic, adjusts gracefully
under inserts and deletes.
Slide No:L7-1
Range Searches
• “Find all students with gpa > 3.0”
– If data is in sorted file, do binary
search to find first such student, then
scan to find others.
– Cost of binary search can be quite
high.
• Simple idea: Create an `index’ file.
[Figure: index file with entries k1, k2, …, kN over a data file with pages 1 … N]
Slide No:L7-2
ISAM
index entry: <P0, K1, P1, K2, P2, … , Km, Pm>
• Index file may still be quite large. But we can apply
the idea repeatedly!
[Figure: ISAM structure — non-leaf pages above leaf pages; the leaf level consists of primary pages, each with a chain of overflow pages]
Slide No:L7-3
Comments on ISAM
• File creation: Leaf (data) pages allocated
sequentially, sorted by search key; then index pages
allocated, then space for overflow pages.
• Index entries: <search key value, page id>; they
`direct’ search for data entries, which are in leaf pages.
• Search: Start at root; use key comparisons to go to leaf.
Cost = logF N, where F = # entries per index page, N = # leaf pages
• Insert: Find the leaf the data entry belongs to, and put it there.
• Delete: Find and remove from leaf; if an overflow
page becomes empty, de-allocate it.
[Figure: static ISAM layout — index pages on top, primary data pages below, overflow pages chained to the primary pages]
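Search can be sketched as follows; the dict-based page layout and the overflow field are illustrative assumptions.

def isam_search(node, key):
    while "children" in node:                  # non-leaf: key comparisons
        i = 0
        while i < len(node["keys"]) and key >= node["keys"][i]:
            i += 1
        node = node["children"][i]
    page = node                                # primary leaf page reached
    while page is not None:                    # then scan the overflow chain
        if key in page["entries"]:
            return True
        page = page.get("overflow")
    return False

leaf = {"entries": [40, 46], "overflow": {"entries": [41, 42], "overflow": None}}
root = {"keys": [40],
        "children": [{"entries": [20, 27], "overflow": None}, leaf]}
print(isam_search(root, 42))   # True: found on the overflow chain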
Slide No:L7-4
Example ISAM Tree
• Each node can hold 2 entries; no need for `next-leaf-
page’ pointers. (Why?)
[Figure: example ISAM tree. Root: 40. Index pages: (20, 33) and (51, 63). Primary leaf pages: (10* 15*), (20* 27*), (33* 37*), (40* 46*), (51* 55*), (63* 97*).]
Slide No:L7-5
After Inserting 23*, 48*, 41*, 42* ...
[Figure: same tree after the inserts. Root: 40. Index pages: (20, 33) and (51, 63). Primary leaf pages: (10* 15*), (20* 27*), (33* 37*), (40* 46*), (51* 55*), (63* 97*). Overflow pages: (23*) chained under (20* 27*); (48* 41*) and then (42*) chained under (40* 46*).]
Slide No:L7-6
... Then Deleting 42*, 51*, 97*
[Figure: after the deletes. Root: 40. Index pages: (20, 33) and (51, 63). Primary leaf pages: (10* 15*), (20* 27*), (33* 37*), (40* 46*), (55*), (63*). Overflow pages: (23*) and (48* 41*) remain. Note that 51 still appears in the index level although 51* is no longer in a leaf.]
Slide No:L7-7
B+ Tree: Most Widely Used Index
• Insert/delete at logF N cost; keep tree height-
balanced. (F = fanout, N = # leaf pages)
• Minimum 50% occupancy (except for root). Each
node contains d <= m <= 2d entries. The parameter
d is called the order of the tree.
• Supports equality and range-searches efficiently.
[Figure: B+ tree structure — index entries on top (direct search), data entries at the leaf level (the `sequence set’)]
Slide No:L8-1
Example B+ Tree
• Search begins at root, and key comparisons direct it
to a leaf (as in ISAM).
• Search for 5*, 15*, all data entries >= 24* ...
[Figure: example B+ tree. Root: (13, 17, 24, 30). Leaves: (2* 3* 5* 7*), (14* 16*), (19* 20* 22*), (24* 27* 29*), (33* 34* 38* 39*).]
Slide No:L8-2
B+ Trees in Practice
• Typical order: 100. Typical fill-factor: 67%.
– average fanout = 133
• Typical capacities:
– Height 4: 133^4 = 312,900,721 records
– Height 3: 133^3 = 2,352,637 records
• Can often hold top levels in buffer pool:
– Level 1 = 1 page = 8 Kbytes
– Level 2 = 133 pages = 1 Mbyte
– Level 3 = 17,689 pages = 133 MBytes
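These capacities are just the fanout raised to the height, which a two-line check confirms:

fanout = 133
for height in (3, 4):
    print(height, f"{fanout ** height:,}")   # 2,352,637 and 312,900,721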
Slide No:L8-3
Inserting a Data Entry into a B+ Tree
• Find correct leaf L.
• Put data entry onto L.
– If L has enough space, done!
– Else, must split L (into L and a new node L2)

• Redistribute entries evenly, copy up middle key.


• Insert index entry pointing to L2 into parent of L.
• This can happen recursively
– To split index node, redistribute entries evenly, but
push up middle key. (Contrast with leaf splits.)
• Splits “grow” tree; root split increases height.
– Tree growth: gets wider or one level taller at top.
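The copy-up vs. push-up distinction can be sketched on plain sorted lists; the node contents below are illustrative, chosen to match the figures on the next slide.

def split_leaf(entries):
    # leaf split: middle key is COPIED up, since it must remain in the
    # leaf (leaves hold the actual data entries)
    mid = len(entries) // 2
    return entries[:mid], entries[mid:], entries[mid]

def split_index(keys):
    # index split: middle key is PUSHED up and appears only in the parent
    mid = len(keys) // 2
    return keys[:mid], keys[mid + 1:], keys[mid]

print(split_leaf([2, 3, 5, 7, 8]))       # ([2, 3], [5, 7, 8], 5): 5 is copied up
print(split_index([5, 13, 17, 24, 30]))  # ([5, 13], [24, 30], 17): 17 is pushed up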
Slide No:L8-4
Inserting 8* into Example B+ Tree
• Observe how minimum occupancy is guaranteed in
both leaf and index page splits.
• Note difference between copy-up and push-up; be
sure you understand the reasons for this.
[Figure: leaf split — the entry to be inserted in the parent node is 5. (Note that 5 is copied up and continues to appear in the leaf.) Resulting leaves: (2* 3*) and (5* 7* 8*). Index split — the entry to be inserted in the parent node is 17. (Note that 17 is pushed up and appears only once in the index; contrast this with a leaf split.) Resulting index nodes: (5, 13) and (24, 30).]
Slide No:L8-5
Example B+ Tree After Inserting 8*
[Figure: tree after inserting 8*. Root: 17. Index nodes: (5, 13) and (24, 30). Leaves: (2* 3*), (5* 7* 8*), (14* 16*), (19* 20* 22*), (24* 27* 29*), (33* 34* 38* 39*).]
• Notice that root was split, leading to increase in height.
• In this example, we could avoid the split by re-distributing
entries; however, this is usually not done in practice.
Slide No:L8-6
Deleting a Data Entry from a B+ Tree
• Start at root, find leaf L where entry belongs.
• Remove the entry.
– If L is at least half-full, done!
– If L has only d-1 entries,
• Try to re-distribute, borrowing from sibling
(adjacent node with same parent as L).
• If re-distribution fails, merge L and sibling.
• If merge occurred, must delete entry (pointing to L or sibling) from
parent of L.
• Merge could propagate to root, decreasing height.
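The redistribute-or-merge decision can be sketched on sorted lists as follows; d = 2 and the borrow-from-the-right-sibling policy are illustrative choices.

d = 2   # order of the tree: each node holds between d and 2d entries

def delete_entry(leaf, sibling, key):
    leaf.remove(key)
    if len(leaf) >= d:               # still at least half-full: done
        return "done"
    if len(sibling) > d:             # sibling can spare an entry: redistribute
        leaf.append(sibling.pop(0))  # (the parent's separating key is then updated)
        return "redistributed"
    sibling.extend(leaf)             # otherwise merge; the parent loses an index
    leaf.clear()                     # entry, which may propagate up the tree
    sibling.sort()
    return "merged"

print(delete_entry([22, 24], [27, 29], 24))   # 'merged', as in the 24* example below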
Slide No:L8-7
Example Tree After (Inserting 8*, Then)
Deleting 19* and 20* ...
[Figure: tree after (inserting 8*, then) deleting 19* and 20*. Root: 17. Index nodes: (5, 13) and (27, 30). Leaves: (2* 3*), (5* 7* 8*), (14* 16*), (22* 24*), (27* 29*), (33* 34* 38* 39*).]
• Deleting 19* is easy.
• Deleting 20* is done with re-distribution. Notice
how the middle key is copied up.
Slide No:L8-8
... And Then Deleting 24*
• Must merge.
• Observe `toss’ of index entry (on right), and
`pull down’ of index entry (below).
[Figure (right): after merging the leaves, the index node holds only 30, over leaves (22* 27* 29*) and (33* 34* 38* 39*) — the separating entry 27 was tossed. Figure (below): after merging the index nodes, the root key 17 is pulled down — Root: (5, 13, 17, 30); leaves: (2* 3*), (5* 7* 8*), (14* 16*), (22* 27* 29*), (33* 34* 38* 39*).]
Slide No:L8-9