

Query processing is the process by which a declarative query is translated into low-level data
manipulation operations. SQL is the standard query language supported in current DBMSs.
Query Processing steps:

Parsing and Translating

o Translate the query into its internal form (parse tree).
o This is then translated into an expression of the relational algebra.
o The parser checks syntax and validates relations, attributes and access permissions.

Evaluation

o The query execution engine takes a physical query plan (also known as an execution plan),
executes the plan, and returns the result.

Optimization: find the "cheapest" execution plan for a query.
A relational algebra expression may have many equivalent expressions, e.g.,


Representation as logical query plan (a tree):

Non-leaf nodes = operations of relational algebra (with parameters); Leaf nodes = relations

A relational algebra expression can be evaluated in many ways. An annotated expression
specifying a detailed evaluation strategy is called the execution plan (it includes, e.g.,
whether an index is used, which join algorithms are applied, ...).
Among all semantically equivalent expressions, the one with the least costly evaluation
plan is chosen. The cost estimate of a plan is based on statistical information in the system.

Query optimization refers to the process by which the best execution strategy for a given query is
found from among a set of alternatives.
The process typically involves two steps:
Query Decomposition: Query decomposition takes an SQL query and translates it into relational algebra.
In the process, the query is analyzed semantically so that incorrect queries are detected and rejected as
early as possible, and correct queries are simplified. Simplification involves the elimination of redundant
predicates which may be introduced as a result of query modification to deal with views, security
enforcement and semantic integrity control. The simplified query is then restructured as an algebraic
query.
Query Optimization: For a given SQL query, there is more than one possible relational algebra
expression. Some of these algebraic expressions are better than others. The quality of an algebraic
expression is defined in terms of expected performance.
The traditional procedure is to obtain an initial algebraic expression by translating the predicates and the
target statement into relational operations as they appear in the query. This initial algebraic query is then
transformed, using algebraic transformation rules, into other algebraic queries until the best one is
found. The best algebraic expression is determined according to a cost function which calculates the cost of
executing the query according to that algebraic specification. This is the process of query optimization.

Query optimization typically takes one of two forms: Heuristic (sometimes called Rule based) and
Systematic (Cost based).

Heuristic Optimization
Cost Based Optimization

In Heuristic Optimization, the query execution is refined based on heuristic rules for reordering
the individual operations.
With Cost Based Optimization, the overall cost of executing the query is systematically
reduced by estimating the costs of executing several different execution plans.

Heuristic Query Optimization

In this method, relational algebra expressions are rewritten into equivalent expressions that take
much less time and fewer resources to process. Repositioning relational algebra
operations in certain ways does not affect the result. First we present an example to show the
effect of this repositioning and then present a list of heuristic rules for optimizing relational algebra
expressions. Once an expression is optimized, it can then be implemented efficiently.

A query can be represented as a tree data structure. Operations are at the interior nodes
and data items (tables, columns) are at the leaves.
The query is evaluated in a depth-first pattern.

For Example:
PLOCATION = 'Stafford';

Or, in relational algebra:

on the following schema:


[Table: sample instances of the employee, department and project relations (8 employees, 3 departments, 6 projects)]

Which of the following query trees is more efficient ?

The left hand tree is evaluated in steps as follows:




The right hand tree is evaluated in steps as follows:

Note the two cross product operations. These require lots of space and time (nested
loops) to build.
After the two cross products, we have a temporary table with 144 records (6 projects * 3
departments * 8 employees).
An overall rule for heuristic query optimization is to perform as many select and project
operations as possible before doing any joins.
There are a number of transformation rules that can be used to transform a query:
1. Cascading selections. A list of conjunctive conditions can be broken up into
separate individual conditions.

$\sigma_{c_1}(\sigma_{c_2}(E)) = \sigma_{c_1 \wedge c_2}(E)$
2. Commutativity of the selection operation.
3. Cascading projections. All but the last projection can be ignored.
Assume that attributes A1, . . . ,An are among B1, . . . ,Bm. Then
$\pi_{A_1,\dots,A_n}(\pi_{B_1,\dots,B_m}(E)) = \pi_{A_1,\dots,A_n}(E)$
4. Commuting selection and projection. If a selection condition only involves
attributes contained in a projection clause, the two can be commuted.
5. Commutativity of Join and Cross Product.
6. Commuting selection with Join.
If c only involves attributes from E1,then

$\sigma_c(E_1 \bowtie E_2) = \sigma_c(E_1) \bowtie E_2$

7. Commuting projection with Join.
8. Commutativity of set operations. Union and Intersection are commutative.
9. Associativity of Union, Intersection, Join and Cross Product.
10. Commuting selection with set operations.

$\sigma_c(E_1 \cup E_2) = \sigma_c(E_1) \cup \sigma_c(E_2)$

11. Commuting projection with set operations.
$\pi_{A_1,\dots,A_n}(E_1 \cup E_2) = \pi_{A_1,\dots,A_n}(E_1) \cup \pi_{A_1,\dots,A_n}(E_2)$
12. Logical transformation of selection conditions. For example, using DeMorgan's
law, etc.
13. Combine Selection and Cartesian product to form Joins.
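
The benefit of applying selections before products (rule 6 and the overall heuristic above) can be checked on toy data. The following is a minimal sketch only: the relation contents and the helper names `select` and `cross` are assumptions made for illustration and are not the data of the example above.

```python
from itertools import product

# Hypothetical toy relations, stored as lists of dicts.
projects = [
    {"pnumber": i,
     "plocation": "Stafford" if i % 3 == 0 else "Houston",
     "dnum": i % 3 + 1}
    for i in range(1, 7)
]
departments = [{"dnumber": d, "dname": f"D{d}"} for d in range(1, 4)]


def select(pred, rel):
    """Selection: keep the tuples that satisfy pred."""
    return [t for t in rel if pred(t)]


def cross(r, s):
    """Cartesian product: concatenate every pair of tuples."""
    return [{**a, **b} for a, b in product(r, s)]


def is_stafford(t):
    return t["plocation"] == "Stafford"


def join_cond(t):
    return t["dnum"] == t["dnumber"]


# Plan A: build the full cross product, then apply both conditions.
tmp_a = cross(projects, departments)
plan_a = select(lambda t: is_stafford(t) and join_cond(t), tmp_a)

# Plan B: push the selection on plocation below the product.
tmp_b = cross(select(is_stafford, projects), departments)
plan_b = select(join_cond, tmp_b)

print(len(tmp_a), len(tmp_b), plan_a == plan_b)
```

Both plans return the same tuples, but the intermediate result of plan B is a third of the size of plan A's in this toy setup, because only two of the six projects survive the early selection.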

Systematic (Cost based) Query Optimization

Just looking at the Syntax of the query may not give the whole picture - need to look at
the data as well.
Several Cost components to consider:
1. Access cost to secondary storage (hard disk)
2. Storage Cost for intermediate result sets
3. Computation costs: CPU, memory transfers, etc. for performing in-memory operations
4. Communications Costs to ship data around a network. e.g., in a distributed or
client/server database.

Of these, Access cost is the most crucial in a centralized DBMS. The more work we can
do with data in cache or in memory, the better.
Access Routines are algorithms that are used to access and aggregate data in a database.
An RDBMS may have a collection of general purpose access routines that can be
combined to implement a query execution plan.
We are interested in access routines for selection, projection, join and set operations such
as union, intersection, set difference, cartesian product, etc.
As with heuristic optimization, there can be many different plans that lead to the same result.
In general, if a query contains n operations, there can be up to n! possible orderings of those operations.
However, not all plans will make sense. We should consider:
Perform all simple selections first
Perform joins next
Perform projection last
Overview of the Cost Based optimization process
1. Enumerate all of the legitimate plans (call these P1...Pn) where each plan contains
a set of operations O1...Ok
2. Select a plan
3. For each operation Oi in the plan, enumerate the access routines
4. For each possible Access routine for Oi, estimate the cost
Select the access routine with the lowest cost
5. Repeat the previous 2 steps until an efficient access routine has been selected for each operation
Sum up the costs of each access routine to determine a total cost for the plan
6. Repeat steps 2 through 5 for each plan and choose the plan with the lowest total cost
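
The selection loop described in the steps above can be sketched in a few lines. This is an illustrative sketch, not an actual optimizer: the plan names, operation names, access routine names and all cost figures (in block transfers) are invented for the example.

```python
# Hypothetical plans: each plan maps an operation to its candidate access
# routines, and each routine to an estimated cost in block transfers.
plans = {
    "P1": {
        "select_project": {"linear_scan": 100, "index_lookup": 12},
        "join_department": {"nested_loop": 400, "hash_join": 90},
    },
    "P2": {
        "join_department": {"nested_loop": 2400, "hash_join": 350},
        "select_project": {"linear_scan": 600},
    },
}


def cheapest_routines(plan):
    """Steps 3-5: for each operation pick the routine with the lowest estimate."""
    return {op: min(routines, key=routines.get) for op, routines in plan.items()}


def plan_cost(plan):
    """Step 5: total cost of a plan is the sum of its cheapest routines."""
    return sum(min(routines.values()) for routines in plan.values())


# Step 6: compare the totals of all plans and keep the cheapest one.
best_plan = min(plans, key=lambda name: plan_cost(plans[name]))

print(best_plan, plan_cost(plans[best_plan]), cheapest_routines(plans[best_plan]))
```

In this made-up setting the plan that selects first (P1) wins, because its join runs over a much smaller input.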

Catalog Information for Cost Estimation

Information about relations and attributes:
NR: number of tuples in the relation R.
BR: number of blocks that contain tuples of the relation R.
SR: size of a tuple of R.
FR: blocking factor; number of tuples from R that fit into one block
(FR = [NR/BR])
V(A,R): number of distinct values for attribute A in R.
SC(A, R): selectivity of attribute A
=average number of tuples of R that satisfy an equality condition on A.
SC(A, R) = NR/V(A, R).
Information about indexes:
HTI: number of levels in index I (B+-tree).
LBI: number of blocks occupied by leaf nodes in index I (first-level blocks).
ValI: number of distinct values for the search key.
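
As a small worked illustration of these catalog statistics, the sketch below computes the blocking factor and SC(A, R) for a made-up relation. The class name `RelationStats`, the attribute names and all numbers are assumptions; the brackets in FR = [NR/BR] are read here as rounding up, and the final block estimate additionally assumes that matching tuples are stored contiguously.

```python
import math
from dataclasses import dataclass


@dataclass
class RelationStats:
    n_tuples: int   # NR: number of tuples
    n_blocks: int   # BR: number of blocks
    distinct: dict  # attribute name -> V(A, R)

    def blocking_factor(self) -> int:
        # FR, with the brackets interpreted as a ceiling (assumption).
        return math.ceil(self.n_tuples / self.n_blocks)

    def selection_cardinality(self, attr: str) -> float:
        # SC(A, R) = NR / V(A, R)
        return self.n_tuples / self.distinct[attr]


# Hypothetical statistics for an employee relation.
employee = RelationStats(
    n_tuples=10_000,
    n_blocks=500,
    distinct={"dno": 50, "ssn": 10_000},
)

sc_dno = employee.selection_cardinality("dno")  # tuples per department value
sc_ssn = employee.selection_cardinality("ssn")  # key attribute: one tuple

# Blocks holding the matches for dno = constant, if they are stored together.
blocks_dno = math.ceil(sc_dno / employee.blocking_factor())

print(employee.blocking_factor(), sc_dno, sc_ssn, blocks_dno)
```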

Measures of Query Cost

There are many possible ways to estimate cost, e.g., based on disk accesses, CPU
time, or communication overhead.
Disk access is the predominant cost (in terms of time); relatively easy to estimate;
therefore, number of block transfers from/to disk is typically used as measure.
Simplifying assumption: each block transfer has the same cost.
Cost of algorithm (e.g., for join or selection) depends on database buffer size; more
memory for DB buffer reduces disk accesses. Thus DB buffer size is a parameter for
estimating cost.
We refer to the cost estimate of algorithm S as cost(S). We do not consider cost of
writing output to disk.

Relational Algebra Equivalences:

Equivalence Rules (for expressions E, E1, E2, conditions Fi) Applying distribution and
commutativity of relational algebra operations

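1. $\sigma_{F_1}(\sigma_{F_2}(E)) = \sigma_{F_1 \wedge F_2}(E)$

2. $\sigma_F(E_1\ [\cup, \cap, -]\ E_2) = \sigma_F(E_1)\ [\cup, \cap, -]\ \sigma_F(E_2)$

3. $\sigma_F(E_1 \times E_2) = \sigma_{F_0}(\sigma_{F_1}(E_1) \times \sigma_{F_2}(E_2))$;
$F = F_0 \wedge F_1 \wedge F_2$, where $F_i$ contains only attributes of $E_i$, $i = 1, 2$.
4. $\sigma_{A=B}(E_1 \times E_2) = E_1 \bowtie_{A=B} E_2$
5. $\pi_A(E_1\ [\cup, \cap, -]\ E_2) \neq \pi_A(E_1)\ [\cup, \cap, -]\ \pi_A(E_2)$ in general (equality holds for $\cup$ but can fail for $\cap$ and $-$)
6. $\pi_A(E_1 \times E_2) = \pi_{A_1}(E_1) \times \pi_{A_2}(E_2)$ with $A_i = A \cap \{\text{attributes in } E_i\}$, $i = 1, 2$.
7. $E_1\ [\cup, \cap]\ E_2 = E_2\ [\cup, \cap]\ E_1$
$(E_1 \cup E_2) \cup E_3 = E_1 \cup (E_2 \cup E_3)$ (the analogous rule holds for $\cap$)
8. $E_1 \times E_2 = \pi_{A_1,A_2}(E_2 \times E_1)$
$(E_1 \times E_2) \times E_3 = E_1 \times (E_2 \times E_3)$
$(E_1 \times E_2) \times E_3 = \pi((E_1 \times E_3) \times E_2)$
9. $E_1 \bowtie E_2 = E_2 \bowtie E_1$
$(E_1 \bowtie E_2) \bowtie E_3 = E_1 \bowtie (E_2 \bowtie E_3)$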

Disadvantages of RDBMS

RDBMSs are not suitable for applications with complex data structures or new data types
for large, unstructured objects, such as CAD/CAM, Geographic information systems,
multimedia databases, imaging and graphics.
RDBMSs typically do not allow users to extend the type system by adding new data types.
They also only support first-normal-form relations in which the type of every column
must be atomic, i.e., no sets, lists, or tables are allowed inside a column.
Recursive queries are difficult to write.

As a specific example of the need for object-relational systems, we focus on a new business data
processing problem that is both harder and (in our view) more entertaining than the dollars and
cents bookkeeping of previous decades. Today, companies in industries such as entertainment
are in the business of selling bits; their basic corporate assets are not tangible products, but rather
software artifacts such as video and audio.
We consider the fictional Dinky Entertainment Company, a large Hollywood conglomerate
whose main assets are a collection of cartoon characters, especially the cuddly and
internationally beloved Herbert the Worm. Dinky has a number of Herbert the Worm films,
many of which are being shown in theaters around the world at any given time. Dinky also
makes a good deal of money licensing Herbert's image, voice, and video footage for various
purposes: action figures, video games, product endorsements, and so on. Dinky's database is used
to manage the sales and leasing records for the various Herbert-related products, as well as the
video and audio data that make up Herbert's many films.
Traditional database systems, such as RDBMS, have been quite successful in developing the
database technology required for many traditional business database applications. However, they
have certain shortcomings when more complex database applications must be designed and
implemented, for example, databases for engineering design and manufacturing (CAD/CAM),
scientific experiments, telecommunications, geographic information systems, and multimedia.
These newer applications have requirements and characteristics that differ from those of
traditional business applications, such as more complex structures for objects, longer-duration
transactions, new data types for storing images or large textual items, and the need to define
nonstandard application-specific operations.
Object-oriented databases were proposed to meet the needs of these more complex applications.
The object-oriented approach offers the flexibility to handle some of these requirements without
being limited by the data types and query languages available in traditional database systems. A
key feature of object-oriented databases is the power they give the designer to specify both the
structure of complex objects and the operations that can be applied to these objects.

Object database systems combine the classical capabilities of relational database management
systems (RDBMS) with new functionality introduced by object orientation. The traditional
capabilities include:

Secondary storage management

Schema management
Concurrency control
Transaction management, recovery
Query processing
Access authorization and control, safety, security

New capabilities of object databases include:

Complex objects
Object identities
User-defined types
Type/class hierarchy with inheritance
Overloading, overriding, late binding, polymorphism

Mandatory features of object-oriented systems

Support for complex objects

A complex object mechanism allows an object to contain attributes that can themselves be
objects. In other words, the schema of an object is not in first-normal-form. Examples of
attributes that can comprise a complex object include lists, bags, and embedded objects.
Object identity
Every instance in the database has a unique identifier (OID), which is a property of an object
that distinguishes it from all other objects and remains for the lifetime of the object. In
object-oriented systems, an object has an existence (identity) independent of its value.
Each database object has identity, i.e. a unique internal identifier (OID) (with no meaning in the
problem domain). Each object has one or more external names that can be used by the programmer
to identify the object.
Properties of OID:

It is unique
It is system generated
It is invisible to the user; that is, it cannot be modified by the user.
It is immutable; that is, once generated, it is never regenerated.
It is a long integer value

Encapsulation
Object-oriented models enforce encapsulation and information hiding. This means, the state of
objects can be manipulated and read only by invoking operations that are specified within the
type definition and made visible through the public clause.
In an object-oriented database system encapsulation is achieved if only the operations are
visible to the programmer and both the data and the implementation are hidden.
Support for types or classes
Type: in an object-oriented system, summarizes the common features of a set of objects
with the same characteristics. In programming languages types can be used at
compilation time to check the correctness of programs.
Class: The concept is similar to type but associated with run-time execution. The term
class refers to a collection of all objects with the same internal structure (attributes) and
methods. These objects are called instances of the class.
Both of these two features can be used to group similar objects together, but it is normal
for a system to support either classes or types and not both.
Class or type hierarchies
Any subclass or subtype will inherit attributes and methods from its superclass or supertype.
Overriding, Overloading and Late Binding
Overloading: A class defines a method with the same name as an existing method, but with a
different number, or type, of parameters.
Overriding: A subclass redefines an inherited method, so the implementation of the operation
depends on the type of the object it is applied to.
Late binding: The implementation to execute is selected at run-time rather than at compile time.
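
A short sketch of overriding and late binding follows, using the Person, Employee and Student classes mentioned later in these notes as an illustration. The method name `describe` and the sample names are assumptions. Python does not support overloading by parameter list in the sense defined above, so only overriding and late binding are shown.

```python
class Person:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name} (person)"


class Employee(Person):
    # Overriding: same method name and parameters, new implementation.
    def describe(self):
        return f"{self.name} (employee)"


class Student(Person):
    # No override: describe() is inherited from Person.
    pass


people = [Person("Ann"), Employee("Bob"), Student("Cy")]

# Late binding: which describe() runs is decided at run time from the
# class of each object, not from the type of the variable p.
descriptions = [p.describe() for p in people]
print(descriptions)
```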
Computational Completeness

SQL does not have the full power of a conventional programming language. Languages such as
Pascal or C are said to be computationally complete because they can exploit the full
capabilities of a computer. SQL is only relationally complete, that is, it has the full power of
relational algebra. Whilst any SQL code could be rewritten as a C++ program, not all C++
programs could be rewritten in SQL.
Mandatory features of database systems
A database is a collection of data that is organized so that its contents can easily be accessed,
managed, and updated. Thus, a database system has the following five features:
Persistence
As in a conventional database, data must remain after the process that created it has
terminated. For this purpose data has to be stored permanently on secondary storage.
Secondary Storage Management
Traditional databases employ techniques that manage secondary storage in order to improve
the performance of the system. These are usually invisible to the user of the system.
Concurrency
The system should provide a concurrency mechanism similar to the concurrency
mechanisms in conventional databases.
Recovery
The system should provide a recovery mechanism similar to recovery mechanisms in
conventional databases.
Ad hoc query facility
The database should provide a high-level, efficient, application-independent query facility.
This need not be a query language; it could instead be some type of graphical interface.

Structured Data types:

A structured data type is a form of user-defined data type that contains a sequence of attributes, each of
which has a data type. An attribute is a property that helps describe an instance of the type. For
example, if we were to define a structured type called address_t, city might be one of the attributes of
this structured type. Structured types make it easy to use data, such as an address, either as a single
unit, or as separate data items, without having to store each of those items (or attributes) in a separate
column.
A structured data type can be used as the type for a column in a regular table, the type for an entire
table (or view), or as an attribute of another structured type. When used as the type for a table, the
table is known as a typed table.
Structured data types exhibit a behavior known as inheritance. A structured type can have subtypes,
other structured types that reuse all of its attributes and contain their own specific attributes. The type
from which a subtype inherits attributes is known as its supertype.
For Example:
Suppose we want to create an employee table.

create type address_t as (street varchar(12), city varchar(12),
    province varchar(12), postal_code char(6));
create type Name_t as (FName varchar(12), LName varchar(20));

Create a new structured type that uses these two structured types as attribute types:

create type employee_t as (emp_id integer, ename Name_t, address address_t);

Now we can create a table of the above structured type:

create table employee of employee_t
REF is emp_id system generated ;

We can also declare array type to define multivalued attributes

For Example:
Create type phone_t as( phoneno char(10) array[3]);

Here the user can store up to three phone numbers for an employee.

Complex objects, object identity. The database should consist of objects having arbitrary complexity
and an arbitrary number of hierarchy levels. Objects can be aggregates of (sub-) objects.
An object typically has two components: state (value) and behavior (operations). Hence, it is somewhat
similar to a program variable in a programming language, except that it will typically have a complex
data structure as well as specific operations defined by the programmer.
Types of objects:
Transient objects: Objects in an OOPL exist only during program execution and are hence called
transient objects.
Persistent objects: An OO database can extend the existence of objects so that they are stored
permanently, and hence the objects persist beyond program termination and can be retrieved later and
shared by other programs. In other words, OO databases store persistent objects permanently on
secondary storage, and allow the sharing of these objects among multiple programs and applications.
This requires the incorporation of other well-known features of database management systems, such as
indexing mechanisms, concurrency control, and recovery. An OO database system interfaces with one or
more OO programming languages to provide persistent and shared object capabilities.
Relationships, associations, links. Objects are connected by conceptual links. For instance, the
Employee and Department objects can be connected by a link worksFor. In the data structure, links
are implemented as logical pointers (bi-directional or uni-directional).

Encapsulation and information hiding. The internal properties of an object are subdivided into two
parts: public (visible from the outside) and private (invisible from the outside). The user of an object
can refer to public properties only.

Classes, types, interfaces. Each object is an instance of one or more classes. The class is understood
as a blueprint for objects; i.e. objects are instantiated according to information presented in the class,
and the class contains the properties that are common to some collection of objects (object
invariants). Each object is assigned a type. Objects are accessible through their interfaces, which
specify all the information that is necessary for using objects.

Abstract data types (ADTs): a kind of class, which assumes that any access to an object is limited to
the predefined collection of operations.

Operations, methods and messages. An object is associated with a set of operations (called
methods). The object performs the operation after receiving a message with the name of the operation
to be performed (and the parameters of this operation).

Inheritance. Classes are organized in a hierarchy reflecting the hierarchy of real world concepts. For
instance, the class Person is a superclass of the classes Employee and Student. Properties of more
abstract classes are inherited by more specific classes. Multiple inheritance means that a specific class
inherits from several independent classes.

Polymorphism, late binding, overriding. The operation to be executed on an object is chosen
dynamically, after the object receives the message with the operation name. The same message sent
to different objects can invoke different operations.

Persistence. Database objects are persistent, i.e., they live as long as necessary. They can outlive
the programs that created them.

Object Database Management Group (ODMG).

A special interest group that develops standards allowing ODBMS customers to write portable
applications.
Standards include:
Object Model
Object Specification Languages
  Object Definition Language (ODL) for schema definition
  Object Interchange Format (OIF) to exchange objects between databases
Object Query Language (OQL)
  declarative language to query and update database objects
Language Bindings (C++, Java, Smalltalk)
  Object manipulation language
  Mechanisms to invoke OQL from the language
  Procedures for operations on databases and transactions


The enhanced functionality of ORDBMSs raises several implementation challenges. Some of these are
well understood and solutions have been implemented in products; others are subjects of current
research. In this section we examine a few of the key challenges that arise in implementing an efficient,
fully functional ORDBMS. Many more issues are involved than those discussed here.

Storage and Access Methods

Since object-relational databases store new types of data, ORDBMS implementers need to revisit some
of the storage and indexing issues. In particular, the system must efficiently store ADT objects and
structured objects and provide efficient indexed access to both.

Storing Large ADT and Structured Type Objects

Large ADT objects and structured objects complicate the layout of data on disk. This problem is well
understood and has been solved in essentially all ORDBMSs and OODBMSs. We present some of the
main issues here. User-defined ADTs can be quite large. In particular, they can be bigger than a single
disk page. Large ADTs, like BLOBs, require special storage, typically in a different location on disk from
the tuples that contain them. Disk-based pointers are maintained from the tuples to the objects they
contain.

Structured objects can also be large, but unlike ADT objects they often vary in size during the
lifetime of a database. For example, consider the stars attribute of the films table. As the years
pass, some of the bit actors in an old movie may become famous. When a bit actor becomes
famous, we might want to advertise his or her presence in the earlier films. This involves an
insertion into the stars attribute of an individual tuple in films. Because these bulk attributes can
grow arbitrarily, flexible disk layout mechanisms are required. An additional complication arises
with array types. Traditionally, array elements are stored sequentially on disk in a row-by-row
fashion, for example
$A_{11}, \dots, A_{1n}, A_{21}, \dots, A_{2n}, \dots, A_{m1}, \dots, A_{mn}$
However, queries may often request sub arrays that are not stored contiguously on disk (e.g.,
A11,A21,...,Am1). Such requests can result in a very high I/O cost for retrieving the sub array. In
order to reduce the number of I/Os required in general, arrays are often broken into contiguous
chunks, which are then stored in some order on disk. Although each chunk is some contiguous
region of the array, chunks need not be row-by-row or column-by-column. For example, a chunk
of size 4 might be A11,A12,A21,A22, which is a square region if we think of the array as being
arranged row-by-row in two dimensions.
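
The trade-off between row-by-row storage and chunking can be made concrete by counting the pages touched when one row or one column is read. The sketch below is a simplified model under stated assumptions: an 8 x 8 array, a page that holds 4 elements, square 2 x 2 chunks stored one per page, and 0-based indices (so element (0, 0) corresponds to A11). The function names are hypothetical.

```python
def pages_row_major(cells, n_cols, page_size):
    """Pages touched when element (i, j) lives at linear offset i * n_cols + j."""
    return {(i * n_cols + j) // page_size for i, j in cells}


def pages_chunked(cells, chunk_side):
    """Pages touched when each chunk_side x chunk_side tile is stored on one page."""
    return {(i // chunk_side, j // chunk_side) for i, j in cells}


m, n = 8, 8          # array dimensions (assumed)
page_size = 4        # elements per page (assumed)
chunk_side = 2       # 2 x 2 chunk = 4 elements = one page

first_column = [(i, 0) for i in range(m)]
first_row = [(0, j) for j in range(n)]

col_row_major = len(pages_row_major(first_column, n, page_size))
row_row_major = len(pages_row_major(first_row, n, page_size))
col_chunked = len(pages_chunked(first_column, chunk_side))
row_chunked = len(pages_chunked(first_row, chunk_side))

print(col_row_major, row_row_major, col_chunked, row_chunked)
```

Under these assumptions the row-by-row layout is ideal for reading a row but touches a separate page for every element of a column, whereas the chunked layout costs the same moderate number of pages in both directions.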

Indexing New Types

One important reason for users to place their data in a database is to allow for efficient access via
indexes. Unfortunately, the standard RDBMS index structures support only equality conditions
(B+ trees and hash indexes) and range conditions (B+ trees). An important issue for ORDBMSs
is to provide efficient indexes for ADT methods and operators on structured objects. Many
specialized index structures have been proposed by researchers for particular applications such as
cartography, genome research, multimedia repositories, Web search, and so on. An ORDBMS
company cannot possibly implement every index that has been invented. Instead, the set of index
structures in an ORDBMS should be user-extensible. Extensibility would allow an expert in
cartography, for example, to not only register an ADT for points on a map (i.e.,
latitude/longitude pairs), but also implement an index structure that supports natural map queries
(e.g., the R-tree, which matches conditions such as "Find me all theaters within 100 miles of ...").
One way to make the set of index structures extensible is to publish an access method interface
that lets users implement an index structure outside of the DBMS. The index and data can be
stored in a file system, and the DBMS simply issues the open, next, and close iterator requests to
the user's external index code. Such functionality makes it possible for a user to connect a
DBMS to a Web search engine, for example. A main drawback of this approach is that data in an
external index is not protected by the DBMS's support for concurrency and recovery. An
alternative is for the ORDBMS to provide a generic template index structure that is sufficiently
general to encompass most index structures that users might invent. Because such a structure is
implemented within the DBMS, it can support high concurrency and recovery. The Generalized
Search Tree (GiST) is such a structure. It is a template index structure based on B+ trees, which
allows most of the tree index structures invented so far to be implemented with only a few lines
of user-defined ADT code.
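
To make the open, next, and close iterator protocol concrete, here is a minimal sketch of what a user-supplied external index might look like. Everything in it is hypothetical: the `AccessMethod` interface, the `SortedListIndex` class and the `scan` driver are invented for illustration and do not correspond to the interface of any particular DBMS.

```python
import bisect
from abc import ABC, abstractmethod


class AccessMethod(ABC):
    """Hypothetical access method interface published by the DBMS."""

    @abstractmethod
    def open(self, condition): ...

    @abstractmethod
    def next(self): ...

    @abstractmethod
    def close(self): ...


class SortedListIndex(AccessMethod):
    """Toy external index over (key, row_id) pairs; condition is a (lo, hi) range."""

    def __init__(self, entries):
        self._entries = sorted(entries)
        self._cursor = None
        self._hi = None

    def open(self, condition):
        lo, self._hi = condition
        # Position the cursor on the first entry whose key is >= lo.
        self._cursor = bisect.bisect_left(self._entries, (lo,))

    def next(self):
        if self._cursor is None or self._cursor >= len(self._entries):
            return None
        key, row_id = self._entries[self._cursor]
        if key > self._hi:
            return None
        self._cursor += 1
        return row_id

    def close(self):
        self._cursor = None


def scan(index, condition):
    """What the DBMS side does: open, call next until exhausted, close."""
    index.open(condition)
    try:
        row_ids = []
        while (row_id := index.next()) is not None:
            row_ids.append(row_id)
        return row_ids
    finally:
        index.close()


index = SortedListIndex([(10, "r1"), (3, "r2"), (7, "r3"), (15, "r4")])
print(scan(index, (5, 12)))
```

Note that nothing in this sketch gives the index concurrency control or recovery, which is exactly the drawback of the external approach discussed above.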

Query Processing
ADTs and structured types call for new functionality in processing queries in ORDBMSs. They
also change a number of assumptions that affect the efficiency of queries. In this section we look
at two functionality issues (user-defined aggregates and security) and two efficiency issues
(method caching and pointer swizzling).

User-Defined Aggregation Functions

Since users are allowed to define new methods for their ADTs, it is not unreasonable to expect
them to want to define new aggregation functions for their ADTs as well. For example, the usual
SQL aggregates (COUNT, SUM, MIN, MAX, AVG) are not particularly appropriate for the
Image type schema.
Most ORDBMSs allow users to register new aggregation functions with the system. To register
an aggregation function, a user must implement three methods, which we will call initialize,
iterate, and terminate. The initialize method initializes the internal state for the aggregation. The
iterate method updates that state for every tuple seen, while the terminate method computes the
aggregation result based on the final state and then cleans up. As an example, consider an
aggregation function to compute the second-highest value in a field. The initialize call would
allocate storage for the top two values, the iterate call would compare the current tuple's value
with the top two and update the top two as necessary, and the terminate call would delete the
storage for the top two values, returning a copy of the second-highest value.
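
The second-highest example can be sketched with the three methods named above. This is an illustration rather than the registration API of any real ORDBMS: the class `SecondHighest` and the driver `run_aggregate` are hypothetical, and the sketch assumes that duplicates count once (it returns the second-highest distinct value) and that the result is None when fewer than two distinct values are seen.

```python
class SecondHighest:
    def initialize(self):
        # Allocate storage for the top two values seen so far.
        self.top_two = []

    def iterate(self, value):
        # Merge the current value into the top two (distinct values only).
        self.top_two = sorted(set(self.top_two + [value]), reverse=True)[:2]

    def terminate(self):
        # Compute the result from the final state, then release the state.
        result = self.top_two[1] if len(self.top_two) == 2 else None
        self.top_two = None
        return result


def run_aggregate(aggregate, values):
    """Stand-in for the DBMS calling the three methods during a scan."""
    aggregate.initialize()
    for value in values:
        aggregate.iterate(value)
    return aggregate.terminate()


salaries = [25000, 43000, 38000, 43000, 30000]
print(run_aggregate(SecondHighest(), salaries))
```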

Method Security
ADTs give users the power to add code to the DBMS, and this power can be abused. A buggy or
malicious ADT method can bring down the database server or even corrupt the database. The
DBMS must have mechanisms to prevent buggy or malicious user code from causing problems.
It may make sense to override these mechanisms for efficiency in production environments with
vendor-supplied methods. However, it is important for the mechanisms to exist, if only to
support debugging of ADT methods; otherwise method writers would have to write bug-free
code before registering their methods with the DBMS, which is not a very forgiving programming
environment.
One mechanism to prevent problems is to have the user methods be interpreted
rather than compiled. The DBMS can check that the method is well behaved either by restricting
the power of the interpreted language or by ensuring that each step taken by a method is safe
before executing it. Typical interpreted languages for this purpose include Java and the
procedural portions of SQL:1999.
An alternative mechanism is to allow user methods to be compiled from a general-purpose
programming language such as C++, but to run those methods in a different address space than
the DBMS. In this case the DBMS sends explicit interprocess communications (IPCs) to the user
method, which sends IPCs back in return. This approach prevents bugs in the user methods (e.g.,
stray pointers) from corrupting the state of the DBMS or database and prevents malicious
methods from reading or modifying the DBMS state or database as well. Note that the user
writing the method need not know that the DBMS is running the method in a separate process:
the user code can be linked with a wrapper that turns method invocations and return values
into IPCs.

Method Caching
User-defined ADT methods can be very expensive to execute and can account for the bulk of the
time spent in processing a query. During query processing it may make sense to cache the results
of methods, in case they are invoked multiple times with the same argument. Within the scope of
a single query, one can avoid calling a method twice on duplicate values in a column by either
sorting the table on that column or using a hash-based scheme much like that used for
aggregation. An alternative is to maintain a cache of method inputs and matching outputs as a
table in the database. Then to find the value of a method on particular inputs, we essentially join
the input tuples with the cache table. These two approaches can also be combined.
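The in-query caching idea can be sketched as follows. This is a minimal illustration, not a DBMS implementation: expensive_method is a hypothetical stand-in for a costly ADT method, and the Python dictionary plays the role of the hash-based scheme (or of the cache table of inputs and matching outputs).

```python
# Hypothetical stand-in for an expensive ADT method; it records each real call.
calls = []

def expensive_method(image_id):
    calls.append(image_id)
    return image_id % 2 == 0

def make_cached(method):
    """Wrap `method` so that repeated arguments are answered from a cache."""
    cache = {}  # maps method input -> method output

    def cached(arg):
        if arg not in cache:
            cache[arg] = method(arg)  # only the first occurrence pays the cost
        return cache[arg]

    return cached

is_match = make_cached(expensive_method)
column = [4, 7, 4, 4, 7]  # a column with duplicate values
results = [is_match(value) for value in column]
print(results)
print(calls)
```

The method body runs once per distinct value in the column, while every row still receives its result.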

Pointer Swizzling
In some applications, objects are retrieved into memory and accessed frequently through their
oids, so dereferencing must be implemented very efficiently. Some systems maintain a table of oids
of objects that are (currently) in memory. When an object O is brought into memory, they check
each oid contained in O and replace oids of in-memory objects by in-memory pointers to those
objects. This technique is called pointer swizzling and makes references to in-memory objects
very fast. The downside is that when an object is paged out, in-memory references to it must
somehow be invalidated and replaced with its oid.
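A small sketch of swizzling and unswizzling, under simplifying assumptions: oids are plain integers, the table of resident objects is a dictionary, and the names Obj, load, and page_out are illustrative only.

```python
class Obj:
    def __init__(self, oid, refs):
        self.oid = oid
        self.refs = list(refs)  # each entry is an oid (int) or a swizzled Obj

resident = {}  # table of oids of objects currently in memory

def load(obj):
    """Bring obj into memory, swizzling its references to resident objects."""
    obj.refs = [resident.get(ref, ref) for ref in obj.refs]
    resident[obj.oid] = obj
    return obj

def page_out(oid):
    """Remove an object from memory and replace pointers to it by its oid."""
    victim = resident.pop(oid)
    for other in resident.values():
        other.refs = [victim.oid if ref is victim else ref for ref in other.refs]

a = load(Obj(1, []))
b = load(Obj(2, [1, 3]))      # oid 1 is resident, oid 3 is not
swizzled = b.refs[0] is a     # reference to oid 1 is now a direct pointer
page_out(1)
after_page_out = list(b.refs)  # pointer replaced by the oid again
```

Note that page_out scans every resident object, which illustrates why invalidating in-memory references is the costly part of the technique.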

Query Optimization
New indexes and query processing techniques widen the choices available to a query optimizer.
In order to handle the new query processing functionality, an optimizer must know about the new
functionality and use it appropriately. In this section we discuss two issues in exposing
information to the optimizer (new indexes and ADT method estimation) and an issue in query
planning that was ignored in relational systems (expensive selection optimization).

Registering Indexes with the Optimizer

As new index structures are added to a system (either via external interfaces or built-in template
structures like GiSTs), the optimizer must be informed of their existence and their costs of
access. In particular, for a given index structure the optimizer must know (a) what WHERE clause
conditions are matched by that index, and (b) what the cost of fetching a tuple is for that
index. Given this information, the optimizer can use any index structure in constructing a query
plan. Different ORDBMSs vary in the syntax for registering new index structures. Most systems
require users to state a number representing the cost of access, but an alternative is for the DBMS
to measure the structure as it is used and maintain running statistics on cost.

Expensive selection optimization

In relational systems, selection is expected to be a zero-time operation. For example, it requires
no I/Os and few CPU cycles to test if emp.salary < 10. However, conditions such as
is_herbert(Frames.image)
can be quite expensive because they may fetch large objects off the disk and process them in
memory in complicated ways. ORDBMS optimizers must consider carefully how to order
selection conditions. For example, consider a selection query that tests tuples in the Frames table
with two conditions:
Frames.frameno < 100 AND is_herbert(Frames.image). It is probably preferable to check the frameno
condition before testing is_herbert. The first condition is quick and may often return false, saving
the trouble of checking the second condition. In general, the best ordering among selections is a
function of their costs and reduction factors. It can be shown that selections should be ordered by
increasing rank, where rank = (reduction factor - 1)/cost. If a selection with very high rank
appears in a multi-table query, it may even make sense to postpone the selection until after
performing joins. Note that this approach is the opposite of the heuristic for pushing selections. The
details of optimally placing expensive selections among joins are somewhat complicated, adding
to the complexity of optimization in ORDBMSs.
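The rank rule, rank = (reduction factor - 1)/cost, can be tried on the Frames example. The reduction factors and per-tuple costs below are invented for illustration; only the formula comes from the text.

```python
def rank(reduction_factor, cost):
    # rank = (reduction factor - 1) / cost
    return (reduction_factor - 1.0) / cost

def order_selections(selections):
    """Order (name, reduction_factor, cost) triples by increasing rank."""
    return sorted(selections, key=lambda s: rank(s[1], s[2]))

def expected_cost_per_tuple(ordered):
    """Each condition is only evaluated on tuples that passed the earlier ones."""
    total, surviving = 0.0, 1.0
    for _name, reduction_factor, cost in ordered:
        total += surviving * cost
        surviving *= reduction_factor
    return total

# Hypothetical numbers: a cheap comparison and an expensive image test.
selections = [
    ("is_herbert(image)", 0.5, 1000.0),
    ("frameno < 100", 0.1, 1.0),
]
ordered = order_selections(selections)
names = [s[0] for s in ordered]
good = expected_cost_per_tuple(ordered)
bad = expected_cost_per_tuple(list(reversed(ordered)))
```

With these numbers the cheap, selective frameno test has the lower rank and is applied first, and the expected cost per tuple is roughly a tenth of the cost of the reverse order.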

Comparison Between RDBMS, OODBMS, and ORDBMS

Object-oriented features:
RDBMS: Does not support object-oriented concepts.
OODBMS: Supports user-defined ADTs, structured types, object identity and reference types, and inheritance.
ORDBMS: Supports user-defined ADTs, structured types, object identity and reference types, and inheritance.

Language support:
RDBMS: Supports SQL.
OODBMS: Aims to achieve seamless integration with a programming language such as C++, Java, or Smalltalk.
ORDBMS: Supports an extended form of SQL. Seamless integration with a programming language is not an important goal for an ORDBMS.

Intended use:
RDBMS: Aimed at management and finance systems, e.g., hotel management.
OODBMS: Aimed at applications where an object-centric viewpoint is appropriate; that is, typical user sessions consist of retrieving a few objects and working on them for long periods, with related objects (e.g., objects referenced by the original objects) fetched occasionally.

Transactions:
RDBMS: Transactions are short and ad hoc in nature.
OODBMS: Transactions are complex and of long duration.
ORDBMS: Transactions are assumed to be short, and the ordinary mechanisms of an RDBMS are used to manage them.

Identity:
RDBMS: Every record is uniquely identified by a primary key.
OODBMS: Every object is uniquely identified by a system-generated object ID.
ORDBMS: Every object is uniquely identified by a system-generated object ID.

Suitable applications:
RDBMS: Suitable for small database management systems such as university management, shop management, etc.
OODBMS: Suitable for applications like manufacturing (CIM), advanced office automation systems, hospital patient care tracking systems, etc. All of these applications are characterized by having to manage complex, highly interrelated information, which is a strength of object-oriented database systems.
ORDBMS: Suitable for applications like complex data management and geographic data.

Examples:
RDBMS: Oracle, SQL Server, MySQL, etc.
OODBMS: ObjectStore, Versant, GemStone, etc.
ORDBMS: Postgres, SQL 92.

Standard query language:
RDBMS: A standard query language is present, i.e., SQL.
OODBMS: Lacks a standard query language.
ORDBMS: Lacks a standard query language.

Parallel and Distributed Databases
A parallel database system is one that seeks to improve performance through parallel
implementation of various operations such as loading data, building indexes, and evaluating
queries.
Parallel Database Systems
The main goal of a parallel database system is to improve performance. In a distributed database
system, by contrast, data distribution is the governing factor, and the main goal is to increase
availability and reliability.
Some terms that define system performance:
Throughput: Number of tasks (transactions) that can be completed in a given time interval.
Response Time: Amount of time taken to complete a single task from the time it is submitted.
A system that processes a large number of small transactions can improve throughput by
processing many transactions in parallel.
A system that processes large transactions can improve response time and throughput by
dividing each transaction into a number of sub-transactions that can be executed in parallel.
Speed-Up: Running a given task in less time by increasing the degree of parallelism is called
speed-up.
Speed-Up = Ts/Tl
where Ts = time required on the small system
Tl = time required on the large system with more resources.
A parallel system is said to demonstrate linear speed-up if the speed-up is N when
resources are increased N times.
Scale-Up: Handling larger tasks in the same amount of time by increasing the degree of
parallelism is called scale-up.
Scale-Up = Ts/Tl
where Ts = time required to execute a task of size Q on the small system
Tl = time required to execute a task of size Q*N on the large system
The parallel system is said to demonstrate linear scale-up on a task of size Q if Ts = Tl when
resources are increased N times.
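The two ratios can be checked with a few lines of arithmetic. The timings below are made-up measurements used only to exercise the definitions.

```python
def speed_up(t_small, t_large):
    """Same task: time on the small system divided by time on the large one."""
    return t_small / t_large

def scale_up(t_small_q, t_large_qn):
    """Task of size Q on the small system vs. size Q*N on the large system."""
    return t_small_q / t_large_qn

def is_linear_speed_up(t_small, t_large, n, tol=1e-9):
    return abs(speed_up(t_small, t_large) - n) <= tol

def is_linear_scale_up(t_small_q, t_large_qn, tol=1e-9):
    return abs(scale_up(t_small_q, t_large_qn) - 1.0) <= tol

# Hypothetical timings in seconds; resources are increased N = 4 times.
s = speed_up(120.0, 30.0)          # same task, 4x resources
linear_s = is_linear_speed_up(120.0, 30.0, 4)
c = scale_up(120.0, 150.0)         # 4x larger task, 4x resources
linear_c = is_linear_scale_up(120.0, 150.0)
```

Here the speed-up is 4 with 4 times the resources (linear), while the scale-up is below 1, so the larger system does not quite keep the elapsed time constant.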
Parallel Database architectures:
Three main architectures are proposed for building parallel databases:
1. Shared memory (all processors share a common memory): multiple CPUs are
attached to an interconnection network and can access a common region of main
memory.
In a shared-memory architecture, the processors and disks have access to a common
memory via a bus or through an interconnection network.
A processor can send messages to other processors using memory writes, which is much
faster than sending a message through a communication mechanism.
Advantage: Shared memory provides extremely efficient communication between processors,
and data in shared memory can be accessed by any processor without being moved with
software.
Disadvantage: A shared-memory architecture is not scalable beyond 32 or 64 processors,
since the bus or interconnection network becomes a bottleneck.
2. Shared disk (all processors share common disks and have private memories): each
CPU has a private memory and direct access to all disks through an interconnection
network.
Advantages: Each processor has its own local memory, so the memory bus is not
a bottleneck.
This architecture provides a higher degree of fault tolerance (if a processor fails, the other
processors can take over its tasks).
Disadvantage: The interconnection to the disk subsystem is now a bottleneck.
3. Shared nothing (each node of the machine consists of a processor, memory, and one or
more disks): each CPU has local main memory and disk space, but no two CPUs
can access the same storage area; all communication between CPUs is through a
network connection.
Advantages: Instead of all I/O going through a single interconnection network, only
queries, accesses to non-local disks, and result relations pass through the network.
These architectures are more scalable and can easily support a large number of
processors.
Transmission capacity increases as more nodes are added.
Disadvantage: The costs of communication and of non-local disk access are higher than
in the other architectures because transmitting data involves software interaction at both ends.


Here we discuss parallel evaluation of a relational query in a DBMS with a shared-nothing
architecture, with emphasis on parallel execution of a single query.
A relational query execution plan is a graph of relational algebra operators, and the
operators in the graph can be executed in parallel. If an operator consumes the output of a
second operator, we have pipelined parallelism.
Each individual operator can also be executed in parallel by partitioning the input data,
working on each partition in parallel, and then combining the results of each
partition. This approach is called data-partitioned parallel evaluation.
Data Partitioning: Here large datasets are partitioned horizontally across several disks; this
enables us to exploit the I/O bandwidth of the disks by reading and writing them in parallel.
This can be done in the following ways:
a. Round Robin Partitioning
b. Hash Partitioning
c. Range Partitioning

a. Round Robin Partitioning: If there are n processors, the ith tuple is assigned to
processor i mod n.
b. Hash Partitioning: A hash function is applied to (selected fields of) a tuple to determine
its processor.
Hash partitioning has the additional virtue that it keeps data evenly distributed even if the
data grows and shrinks over time.
c. Range Partitioning: Tuples are sorted (conceptually), and n ranges are chosen for the
sort key values so that each range contains roughly the same number of tuples; tuples in
range i are assigned to processor i.
Range partitioning can lead to data skew; that is, partitions with widely varying numbers of
tuples. Skew causes processors dealing with large partitions to
become performance bottlenecks.
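The three partitioning schemes can be sketched in a few lines. The code below is illustrative only: processors are numbered from 0, tuples are plain salary values, and the range bounds are supplied by hand rather than chosen by the system.

```python
import bisect

def round_robin_partition(tuples, n):
    """The ith tuple (counting from 0) goes to processor i mod n."""
    parts = [[] for _ in range(n)]
    for i, t in enumerate(tuples):
        parts[i % n].append(t)
    return parts

def hash_partition(tuples, n, key):
    """A hash of the selected field decides the processor."""
    parts = [[] for _ in range(n)]
    for t in tuples:
        parts[hash(key(t)) % n].append(t)
    return parts

def range_partition(tuples, n, key, bounds):
    """bounds holds n - 1 ascending split values; partition i receives
    the keys k with bounds[i-1] <= k < bounds[i]."""
    assert len(bounds) == n - 1
    parts = [[] for _ in range(n)]
    for t in tuples:
        parts[bisect.bisect_right(bounds, key(t))].append(t)
    return parts

salaries = [12, 25, 7, 31, 18, 22]
rr = round_robin_partition(salaries, 3)
hp = hash_partition(salaries, 3, key=lambda s: s)
rp = range_partition(salaries, 3, key=lambda s: s, bounds=[10, 20])
```

With the hand-picked bounds 10 and 20, the third range receives three of the six tuples and the first only one, a small instance of the skew described above.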


Various operations can be implemented in parallel in a shared-nothing architecture.
Bulk Loading and Scanning:
Pages can be read in parallel while scanning a relation, and the retrieved tuples can then
be merged, if the relation is partitioned across several disks.
If a relation has associated indexes, any sorting of data entries required for building the
indexes during bulk loading can also be done in parallel.
Sorting:
Sorting can be done by first redistributing all tuples in the relation using range partitioning.
Ex. Sorting a collection of employee tuples by salary, whose values lie in a certain
range.
With N processors, each processor gets the tuples that lie in the range assigned to it. For example,
processor 1 contains all tuples in the range 10 to 20, and so on.
Each processor then sorts its tuples locally, and the sorted versions can be combined by
visiting the processors in the order of the ranges assigned to them and collecting their
tuples.
The problem with range partitioning is data skew, which limits the scalability of the
parallel sort. One good approach to range partitioning is to obtain a sample of the entire
relation by taking samples at each processor that initially contains part of the relation. The
(relatively small) sample is sorted and used to identify ranges with equal numbers of tuples.
This set of range values, called a splitting vector, is then distributed to all processors and
used to range partition the entire relation.
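The sampling approach can be simulated sequentially. In this sketch each inner list stands for the tuples initially held by one processor; the sample size, the seed, and the way split points are picked from the sorted sample are assumptions, and every processor is assumed to start with at least one tuple.

```python
import bisect
import random

def splitting_vector(local_parts, n, sample_size, rng):
    """Sample every processor's data, sort the sample, pick n - 1 split values."""
    sample = []
    for part in local_parts:
        sample.extend(rng.sample(part, min(sample_size, len(part))))
    sample.sort()
    return [sample[(i * len(sample)) // n] for i in range(1, n)]

def parallel_sort(local_parts, n, sample_size=4, seed=0):
    rng = random.Random(seed)
    splits = splitting_vector(local_parts, n, sample_size, rng)
    # Redistribute: processor i receives the values in the ith range.
    redistributed = [[] for _ in range(n)]
    for part in local_parts:
        for value in part:
            redistributed[bisect.bisect_right(splits, value)].append(value)
    # Each processor sorts locally; visiting processors in range order
    # yields the fully sorted relation.
    locally_sorted = [sorted(part) for part in redistributed]
    return [value for part in locally_sorted for value in part], splits

initial = [[45, 12, 88, 31], [70, 5, 63, 29], [54, 91, 17, 40]]
result, splits = parallel_sort(initial, n=3)
```

The result is correct for any splitting vector; the quality of the sample only affects how evenly the work is spread across processors.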
Joins:
Here we consider how the join operation can be parallelized.
Consider two relations A and B to be joined on the age attribute. A and B are initially
distributed across several disks in a way that is not useful for the join operation.
So we decompose the join into a collection of k smaller joins by partitioning both
A and B into k logical partitions.
If the same partitioning function (applied to the join attribute) is used for both A and B, then
the union of the k smaller joins computes the join of A and B.
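A sequential sketch of the partitioned join follows. Tuples are (name, age) pairs, hashing on age is the assumed partitioning function, and each of the k smaller joins would run on its own processor in a real system.

```python
AGE = 1  # position of the join attribute in each tuple

def partition(relation, k):
    """Apply the same partitioning function (hash of age) to either relation."""
    parts = [[] for _ in range(k)]
    for t in relation:
        parts[hash(t[AGE]) % k].append(t)
    return parts

def local_join(a_part, b_part):
    """Join one pair of partitions with a simple hash join."""
    by_age = {}
    for b in b_part:
        by_age.setdefault(b[AGE], []).append(b)
    return [(a, b) for a in a_part for b in by_age.get(a[AGE], [])]

def partitioned_join(a_rel, b_rel, k):
    result = []
    for a_part, b_part in zip(partition(a_rel, k), partition(b_rel, k)):
        result.extend(local_join(a_part, b_part))  # union of the k smaller joins
    return result

A = [("a1", 20), ("a2", 30), ("a3", 20)]
B = [("b1", 20), ("b2", 40), ("b3", 30)]
joined = partitioned_join(A, B, k=3)
reference = [(a, b) for a in A for b in B if a[AGE] == b[AGE]]
```

Because matching tuples always hash to the same partition number, no pair is lost by joining partition i of A only with partition i of B.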

Introduction to Distributed DBMS:
The idea of a distributed database is that the data should be physically stored at different
locations, but its distribution and access should be transparent to the user.
A distributed database should exhibit the following properties:
1) Distributed Data Independence: The user should be able to access the database
without needing to know the location of the data.
2) Distributed Transaction Atomicity: Atomicity should extend to transactions whose
operations take place at several distributed sites.

Types of distributed databases:
a) A homogeneous distributed database is one where the data stored across multiple sites is
managed by the same DBMS software at all the sites.
b) A heterogeneous distributed database is one where multiple sites, which may be autonomous,
are under the control of different DBMS software.
Architecture of DDBs:
There are 3 architectures: Client-Server, Collaborating Server, and Middleware.
Client-Server:
A Client-Server system has one or more client processes and one or more server
processes, and a client process can send a query to any one server process. Clients are
responsible for user-interface issues, and servers manage data and execute transactions.
Thus, a client process could run on a personal computer and send queries to a server
running on a mainframe.
Advantages:
1. Simple to implement because of the centralized server and separation of functionality.
2. Expensive server machines are not underutilized by simple user interactions, which are
now pushed on to inexpensive client machines.
3. The users can have a familiar and friendly client-side user interface rather than an unfamiliar
and unfriendly server interface.
Collaborating Server:
In the client-server architecture a single query cannot be split and executed across
multiple servers, because the client process would have to be complex and intelligent
enough to break a query into sub queries to be executed at different sites and then piece
their results together, making the client's capabilities overlap with the server's. This makes it
hard to distinguish between the client and the server.

In a Collaborating Server system, we can have a collection of database servers, each
capable of running transactions against local data, which cooperatively execute transactions
spanning multiple servers.
When a server receives a query that requires access to data at other servers, it
generates appropriate sub queries to be executed by other servers and puts the results
together to compute answers to the original query.
Middleware:
A middleware system is a special server, a layer of software that coordinates the
execution of queries and transactions across one or more independent database servers.
The Middleware architecture is designed to allow a single query to span multiple
servers, without requiring all database servers to be capable of managing such multi-site
execution strategies. It is especially attractive when trying to integrate several legacy
systems, whose basic capabilities cannot be extended.
We need just one database server that is capable of managing queries and transactions
spanning multiple servers; the remaining servers only need to handle local queries and
transactions.


Data storage involves 2 concepts:
1. Fragmentation
2. Replication
Fragmentation is the process in which a relation is broken into smaller relations called
fragments, possibly stored at different sites.
It is of 2 types:
1. Horizontal Fragmentation, where the original relation is broken into a number of
fragments and each fragment is a subset of rows. The union of the horizontal fragments
should reproduce the original relation.
2. Vertical Fragmentation, where the original relation is broken into a number of fragments
and each fragment consists of a subset of columns.
The system often assigns a unique tuple id to each tuple in the original relation so that the
fragments, when joined again, form a lossless join. The collection of all vertical fragments
should reproduce the original relation.
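Both reconstruction properties can be checked on a toy relation. The employees table, its column names, and the helper functions are invented for this sketch; tid plays the role of the system-assigned tuple id.

```python
employees = [
    {"tid": 1, "name": "Ana", "city": "Paris", "salary": 50},
    {"tid": 2, "name": "Bo", "city": "London", "salary": 60},
    {"tid": 3, "name": "Cy", "city": "Paris", "salary": 70},
]

def horizontal_fragment(relation, predicate):
    """A subset of the rows."""
    return [row for row in relation if predicate(row)]

def vertical_fragment(relation, columns):
    """A subset of the columns; the tuple id is kept in every fragment."""
    return [{c: row[c] for c in ("tid",) + tuple(columns)} for row in relation]

def rejoin(left, right):
    """Join two vertical fragments on the tuple id."""
    right_by_tid = {row["tid"]: row for row in right}
    return [{**row, **right_by_tid[row["tid"]]} for row in left]

# Horizontal: the union of the fragments gives back the relation.
paris = horizontal_fragment(employees, lambda r: r["city"] == "Paris")
others = horizontal_fragment(employees, lambda r: r["city"] != "Paris")
union = sorted(paris + others, key=lambda r: r["tid"])

# Vertical: joining the fragments on tid gives back the relation.
names = vertical_fragment(employees, ["name", "city"])
pay = vertical_fragment(employees, ["salary"])
rebuilt = rejoin(names, pay)
```

Without the shared tid column the two vertical fragments could not be matched up row by row, which is why the tuple id is what makes the join lossless.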
Replication occurs when we store more than one copy of a relation or its fragment at
multiple sites.
Advantages:
1. Increased availability of data: If a site that contains a replica goes down, we can find
the same data at other sites. Similarly, if local copies of remote relations are available, we
are less vulnerable to failure of communication links.
2. Faster query evaluation: Queries can execute faster by using a local copy of a relation
instead of going to a remote site.

Distributed catalog management :

Naming Objects
This concerns the unique identification of each fragment and each replica of a relation that
has been fragmented or replicated.
This can be done by using a global name server that assigns globally unique names.
Unique names can also be constructed from the following two fields:
1. Local name field: the name assigned locally by the site where the relation is created. Two
objects at different sites can have the same local name.
2. Birth site field: identifies the site at which the relation is created and where information
about its fragments and replicas is maintained.
Catalog Structure:
A centralized system catalog can be used to maintain the information about all the
relations in the distributed database, but it is vulnerable to the failure of the site containing
the catalog.
This can be avoided by maintaining a copy of the global system catalog at every site, but that
requires broadcasting every change made to a local catalog to all its replicas.
Another alternative is to maintain a local catalog at every site, which keeps track of all the
replicas of the relation.

Distributed Data Independence:

It means that the user should be able to query the database without needing to specify
the location of the fragments or replicas of a relation; locating them is the job of the DBMS.
Users can be enabled to access relations without considering how the relations are
distributed as follows:
The local name of a relation in the system catalog is a combination of a user name and a
user-defined relation name.
When a query is issued, the DBMS adds the user name to the relation name to get a local
name, then adds the user's site-id as the (default) birth site to obtain a global relation name.
By looking up the global relation name in the local catalog (if it is cached there) or in the
catalog at the birth site, the DBMS can locate replicas of the relation.
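The name-resolution steps can be sketched as follows. The representation of a global name as a (local name, birth site) pair, the dotted local-name format, and the catalog dictionaries are all assumptions made for illustration.

```python
def global_relation_name(user, relation, site_id):
    """User name + relation name give the local name; adding the (default)
    birth site gives the global relation name."""
    local_name = f"{user}.{relation}"
    return (local_name, site_id)

def locate_replicas(name, local_catalog, catalogs_by_site):
    """Look in the local catalog first, then in the catalog at the birth site."""
    if name in local_catalog:
        return local_catalog[name]
    _local_name, birth_site = name
    replicas = catalogs_by_site[birth_site][name]
    local_catalog[name] = replicas  # cache the answer locally (an assumption)
    return replicas

# Hypothetical catalogs: the birth site London knows where the replicas are.
catalogs_by_site = {
    "london": {("joe.Sailors", "london"): ["london", "paris"]},
}
local_catalog = {}

name = global_relation_name("joe", "Sailors", "london")
first = locate_replicas(name, local_catalog, catalogs_by_site)   # asks the birth site
second = locate_replicas(name, local_catalog, catalogs_by_site)  # served from the cache
```

The user only ever writes Sailors; the user name and site-id are filled in by the system, which is what keeps the query independent of where the data lives.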

Distributed query processing:

In a distributed system several factors complicate query processing.
One of the factors is the cost of transferring data over the network.
This data includes the intermediate files that are transferred to other sites for further
processing and the final result files that may have to be transferred to the site where the
query result is needed.
Although these costs may not be very high if the sites are connected via a high-speed local
network, they can become quite significant in other types of network.
Hence, DDBMS query optimization algorithms consider the goal of reducing the amount
of data transfer as an optimization criterion in choosing a distributed query execution
strategy.
Consider an EMPLOYEE relation stored at site 1 and a DEPARTMENT relation stored at site 2.

EMPLOYEE (site 1):
10,000 records
Each record is 100 bytes long
Fname field is 15 bytes long
Lname field is 15 bytes long
SSN field is 9 bytes long
Dnum field is 4 bytes long
The size of the EMPLOYEE relation is 100 * 10,000 = 10^6 bytes

DEPARTMENT (site 2):
100 records
Each record is 35 bytes long
Dnumber field is 4 bytes long
Dname field is 10 bytes long
MGRSSN field is 9 bytes long
The size of the DEPARTMENT relation is 35 * 100 = 3,500 bytes

Now consider the following query:
For each employee, retrieve the employee name and the name of the department for which
the employee works.

Using relational algebra this query can be expressed as
π Fname, Lname, Dname (EMPLOYEE ⋈ Dnum=Dnumber DEPARTMENT)

If we assume that every employee is related to a department, then the result of this query
will include 10,000 records.

Now suppose that each record in the query result is 40 bytes long and the query is
submitted at a distinct site 3, which is the result site.
Then there are 3 strategies for executing this distributed query:
1. Transfer both the EMPLOYEE and the DEPARTMENT relations to site 3 (the
result site) and perform the join at that site. In this case a total of 1,000,000 + 3,500 =
1,003,500 bytes must be transferred.
2. Transfer the EMPLOYEE relation to site 2 (where the DEPARTMENT relation is stored), execute
the join there, and send the result to site 3. The size of the query result is
40 * 10,000 = 400,000 bytes, so 400,000 + 1,000,000 = 1,400,000 bytes must be transferred.
3. Transfer the DEPARTMENT relation to site 1 (where the EMPLOYEE relation is stored), execute
the join there, and send the result to site 3. In this case 400,000 + 3,500 = 403,500 bytes
must be transferred.
If minimizing the amount of data transferred is the optimization criterion, strategy 3 is the
one to choose.
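The transfer volumes of the three strategies follow directly from the record counts and sizes given above, as this short calculation shows (the variable names are illustrative).

```python
# Sizes from the example, in bytes.
employee_bytes = 10_000 * 100    # EMPLOYEE: 10,000 records of 100 bytes
department_bytes = 100 * 35      # DEPARTMENT: 100 records of 35 bytes
result_bytes = 10_000 * 40       # query result: 10,000 records of 40 bytes

transfer = {
    1: employee_bytes + department_bytes,  # ship both relations to site 3
    2: employee_bytes + result_bytes,      # ship EMPLOYEE to site 2, result to site 3
    3: department_bytes + result_bytes,    # ship DEPARTMENT to site 1, result to site 3
}
best = min(transfer, key=transfer.get)

for strategy, size in sorted(transfer.items()):
    print(f"strategy {strategy}: {size:,} bytes")
print("least data transferred by strategy", best)
```

Shipping the small relation to the site of the large one wins here because the result is much smaller than EMPLOYEE itself.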
Nonjoin Queries in a Distributed DBMS:
Consider the following two relations:
Sailors (sid: integer, sname: string, rating: integer, age: real)
Reserves (sid: integer, bid: integer, day: date, rname: string)
Now consider the following query:
SELECT S.age FROM Sailors S WHERE S.rating > 3 AND S.rating < 7
Now suppose that the Sailors relation is horizontally fragmented, with all the tuples having a
rating less than 5 at Shanghai and all the tuples having a rating greater than 5 at Tokyo.
The DBMS will answer this query by evaluating it at both sites and then taking the union of the
answers.
Joins in a Distributed DBMS:
Joins of relations at different sites can be very expensive, so we now consider the
evaluation options that must be considered in a distributed environment.
Suppose that the Sailors relation is stored at London and the Reserves relation is stored at
Paris. We will consider strategies for computing the join of Sailors and
Reserves.
In the next example the time taken to read one page from disk (or to write one page to
disk) is denoted as td and the time taken to ship one page (from any site to another site) as
ts.


The main issues with respect to distributed transactions are:
Distributed Concurrency Control
How can deadlocks be detected in a distributed database?
How can locks for objects stored across several sites be managed?

Distributed Recovery
When a transaction commits, all its actions across all the sites at which it executes must
persist.
When a transaction aborts, none of its actions must be allowed to persist.
Concurrency Control and Recovery in Distributed Databases: For concurrency control and recovery
purposes, numerous problems arise in a distributed DBMS environment that are not encountered in a
centralized DBMS environment.
These include the following:
Dealing with multiple copies of the data items: The concurrency control method is responsible for
maintaining consistency among these copies. The recovery method is responsible for making a copy
consistent with other copies if the site on which the copy is stored fails and recovers later.
Failure of individual sites: The DBMS should continue to operate with its running sites, if possible,
when one or more individual sites fail. When a site recovers, its local database must be brought
up to date with the rest of the sites before it rejoins the system.
Failure of communication links: The system must be able to deal with failure of one or more of the
communication links that connect the sites. An extreme case of this problem is that network
partitioning may occur. This breaks up the sites into two or more partitions, where the sites within
each partition can communicate only with one another and not with sites in other partitions.
Distributed Commit: Problems can arise with committing a transaction that is accessing databases
stored on multiple sites if some sites fail during the commit process. The two-phase commit
protocol is often used to deal with this problem.
Distributed Deadlock: Deadlock may occur among several sites, so techniques for dealing with
deadlocks must be extended to take this into account.
Lock management can be distributed across sites in many ways:
Centralized: A single site is in charge of handling lock and unlock requests for all
objects.
Primary copy: One copy of each object is designated as the primary copy. All requests
to lock or unlock a copy of this object are handled by the lock manager at the site where
the primary copy is stored, regardless of where the copy itself is stored.
Fully Distributed: Requests to lock or unlock a copy of an object stored at a site are
handled by the lock manager at the site where the copy is stored.
Distributed Deadlock
One issue that requires special attention when using either primary copy or fully
distributed locking is deadlock detection.
Each site maintains a local waits-for graph, and a cycle in a local graph indicates a
deadlock. However, there can be a deadlock even if no local graph contains a cycle.
For example:
Suppose that we have two sites A and B, both containing copies of objects O1 and O2, and
that the read-any write-all technique is used.
T1, which wants to read O1 and write O2, obtains an S lock on O1 and an X lock on O2 at
site A, and requests an X lock on O2 at site B.
T2, which wants to read O2 and write O1, meanwhile obtains an S lock on O2 and an X
lock on O1 at site B, then requests an X lock on O1 at site A.
T2 is waiting for T1 at site A and T1 is waiting for T2 at
site B; thus we have a deadlock.

To detect such deadlocks, a distributed deadlock detection algorithm must be used. There are
three types of algorithms:
1. Centralized Algorithm:
It consists of periodically sending all local waits-for graphs to one site that is
responsible for global deadlock detection.
At this site, the global waits-for graph is generated by combining all local graphs: the set
of nodes is the union of the nodes in the local graphs, and there is an edge
from one node to another if there is such an edge in any of the local graphs.
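The combination step and the cycle check can be sketched for the two-site example with T1 and T2. Graphs are represented as dictionaries from a transaction to the set of transactions it waits for; this representation and the function names are assumptions.

```python
def global_waits_for(local_graphs):
    """Union of nodes and edges of all local waits-for graphs."""
    combined = {}
    for graph in local_graphs:
        for waiter, holders in graph.items():
            combined.setdefault(waiter, set()).update(holders)
            for holder in holders:
                combined.setdefault(holder, set())
    return combined

def has_cycle(graph):
    """Depth-first search; meeting a node that is still being visited is a cycle."""
    UNSEEN, ACTIVE, DONE = 0, 1, 2
    state = {node: UNSEEN for node in graph}

    def visit(node):
        state[node] = ACTIVE
        for nxt in graph[node]:
            if state[nxt] == ACTIVE:
                return True
            if state[nxt] == UNSEEN and visit(nxt):
                return True
        state[node] = DONE
        return False

    return any(state[node] == UNSEEN and visit(node) for node in graph)

site_a = {"T2": {"T1"}}  # at site A, T2 waits for T1
site_b = {"T1": {"T2"}}  # at site B, T1 waits for T2

local_a = has_cycle(global_waits_for([site_a]))
local_b = has_cycle(global_waits_for([site_b]))
global_deadlock = has_cycle(global_waits_for([site_a, site_b]))
```

Neither local graph contains a cycle, but the combined graph does, which is exactly the situation the centralized algorithm is meant to catch.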
2. Hierarchical Algorithm:
This algorithm groups the sites into a hierarchy; for example, the sites might be grouped by state,
then by country, and finally into a single group that contains all sites.
Every node in this hierarchy constructs a waits-for graph that reveals deadlocks involving
only sites contained in (the sub tree rooted at) this node.
Thus, all sites periodically (e.g., every 10 seconds) send their local waits-for graph to the
site constructing the waits-for graph for their country.
The sites constructing waits-for graphs at the country level periodically (e.g., every 10
minutes) send the country waits-for graph to the site constructing the global waits-for graph.

3. Simple Algorithm:
If a transaction waits longer than some chosen time-out interval, it is aborted.
Although this algorithm causes many unnecessary restarts, the overhead of
deadlock detection is low.
Distributed Recovery: Recovery in a distributed DBMS is more complicated than in a
centralized DBMS for the following reasons:
New kinds of failure can arise: failure of communication links and failure of a remote site at
which a sub transaction is executing.
Either all sub transactions of a given transaction must commit or none must commit, and
this property must be guaranteed despite any combination of site and link failures. This
guarantee is achieved using a commit protocol.
Normal execution and Commit Protocols:
During normal execution each site maintains a log, and the actions of a sub transaction
are logged at the site where it executes.
In addition to the regular logging activity, a commit protocol is followed to
ensure that all sub transactions of a given transaction either commit or abort uniformly.
The transaction manager at the site where the transaction originated is called the
Coordinator for the transaction, and the transaction managers at the sites where its sub
transactions execute are called Subordinates.
Two Phase Commit Protocol:
When the user decides to commit the transaction, the commit command is sent to
the coordinator for the transaction.
This initiates the 2PC protocol:
The coordinator sends a Prepare message to each subordinate.
When a subordinate receives a Prepare message, it decides whether to abort or
commit its sub transaction. It force-writes an abort or prepare log record and then sends
a No or Yes message to the coordinator.
Here we can have two conditions:
o If the coordinator receives Yes messages from all subordinates, it force-writes a
commit log record and then sends a commit message to all the subordinates.
o If it receives even one No message, or no response from some subordinate within a
specified time-out period, it force-writes an abort log record and then sends an abort
message to all subordinates.

Here again we can have two conditions:

o When a subordinate receives an abort message, it force-writes an abort log
record, sends an ack message to the coordinator, and aborts the sub transaction.

o When a subordinate receives a commit message, it force-writes a commit log
record, sends an ack message to the coordinator, and commits the sub transaction.
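The message flow of the protocol can be traced with a small simulation. This is a sketch only: votes are given up front as "yes", "no", or None (no response before the time-out), log records are just strings, and failures during the second phase are not modeled.

```python
def two_phase_commit(votes):
    """votes maps each subordinate to "yes", "no", or None (timed out)."""
    log = []
    # Phase 1: the coordinator asks every subordinate to prepare.
    for sub, vote in votes.items():
        log.append(f"coordinator -> {sub}: prepare")
        if vote == "yes":
            log.append(f"{sub}: force-write prepare record, reply yes")
        elif vote == "no":
            log.append(f"{sub}: force-write abort record, reply no")
        else:
            log.append(f"{sub}: no reply before time-out")
    # Commit only if every subordinate voted yes.
    decision = "commit" if all(v == "yes" for v in votes.values()) else "abort"
    log.append(f"coordinator: force-write {decision} record")
    # Phase 2: the decision is sent out and acknowledged.
    for sub, vote in votes.items():
        log.append(f"coordinator -> {sub}: {decision}")
        if vote is not None:
            log.append(f"{sub}: force-write {decision} record, send ack")
    return decision, log

all_yes, trace = two_phase_commit({"s1": "yes", "s2": "yes"})
one_no, _ = two_phase_commit({"s1": "yes", "s2": "no"})
timed_out, _ = two_phase_commit({"s1": "yes", "s2": None})
print("\n".join(trace))
```

A single No vote or a single missing reply is enough to abort the whole transaction, which is how the protocol keeps all sub transactions uniform.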

There are three main objectives to consider while designing a secure database application:
1. Secrecy: Information should not be disclosed to unauthorized users. For example, a student should
not be allowed to examine other students' grades.
2. Integrity: Only authorized users should be allowed to modify data. For example, students may be
allowed to see their grades, yet not allowed (obviously!) to modify them.
3. Availability: Authorized users should not be denied access. For example, an instructor who wishes to
change a grade should be allowed to do so.
A DBMS typically includes a database security and authorization subsystem that is responsible for
ensuring the security of portions of a database against unauthorized access. It is now customary to refer
to two types of database security mechanisms:
Discretionary security mechanisms: These are used to grant privileges to users, including the capability to
access specific data files, records, or fields in a specified mode (such as read, insert, delete, or update).
Mandatory security mechanisms: These are used to enforce multilevel security by classifying the data and
users into various security classes (or levels) and then implementing the appropriate security policy of the
organization. For example, a typical policy is to permit users at a certain classification level to see only
data items classified at the user's own level. An extension of this is role-based security, which enforces
policies and privileges based on the concept of roles.

A DBMS should provide mechanisms to control access to data. A DBMS offers two main approaches to
access control.
Discretionary access control
Mandatory access control

Discretionary access control: It is based on the concept of access rights, or privileges, and
mechanisms for giving users such privileges. A privilege allows a user to access some data object
in a certain manner (e.g., to read or to modify). A user who creates a database object such as a
table or a view automatically gets all applicable privileges on that object. SQL-92 supports
discretionary access control through the GRANT and REVOKE commands.
The GRANT command gives users privileges on base tables and views. The syntax of this command
is as follows:
GRANT privileges ON object TO users [WITH GRANT OPTION]
Here object is either a base table or a view.
Several privileges can be specified, including:
SELECT: The right to access (read) all columns of the table specified as object, including columns added
later through ALTER TABLE commands.
INSERT(column-name): The right to insert rows with (non-null or non-default) values in the named
column of the table named as object. The privileges UPDATE(column-name) and UPDATE are similar.
DELETE: The right to delete rows from the table named as object.
REFERENCES(column-name): The right to define foreign keys (in other tables) that refer to the
specified column of the table object. REFERENCES without a column name specified denotes this right
with respect to all columns.
For example:
Suppose that user joe has created the tables BOATS, RESERVES, and SAILORS. Some examples of
GRANT commands that joe can now execute are:
GRANT INSERT, DELETE ON Reserves TO Yuppy WITH GRANT OPTION
GRANT SELECT ON Reserves TO Michael
GRANT SELECT ON Sailors TO Michael WITH GRANT OPTION
Adding WITH GRANT OPTION at the end of the GRANT command allows the user who has been granted
the privilege to pass that privilege on to other users.
In the above examples, Yuppy can insert or delete Reserves rows and can authorize someone else to do
the same. Michael can execute SELECT queries on Sailors and Reserves, and he can pass this privilege to
others for Sailors, but not for Reserves.
The REVOKE command takes away privileges.
It is the complementary command to GRANT and allows the withdrawal of privileges.
The syntax of the REVOKE command is as follows:
REVOKE [GRANT OPTION FOR] privileges ON object FROM users {RESTRICT | CASCADE}
The command can be used to revoke either a privilege or just the grant option on a privilege (by using the
optional GRANT OPTION FOR clause).
A user who has granted a privilege to another user may change his or her mind and want to withdraw the
granted privilege. The intuition behind exactly what effect a REVOKE command has is complicated by the
fact that a user may be granted the same privilege multiple times, possibly by different users.

When a user executes a REVOKE command with the CASCADE keyword, the effect is to withdraw the
named privileges or grant option from all users who currently hold these privileges solely through a
GRANT command that was previously executed by some user who is now executing the REVOKE
command. If these users received the privileges with the grant option and passed it along, those
recipients will also lose their privileges as a consequence of the REVOKE command unless they received
these privileges independently.
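For example:

GRANT SELECT ON Sailors TO Art WITH GRANT OPTION (executed by Joe)
GRANT SELECT ON Sailors TO Bob WITH GRANT OPTION (executed by Art)
REVOKE SELECT ON Sailors FROM Art CASCADE (executed by Joe)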

Art loses the SELECT privilege on Sailors, of course. Then Bob, who received this privilege from Art, and
only Art, also loses this privilege.
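
The cascading behaviour described above can be sketched as a small graph model in Python. This is only an illustration, not how any particular DBMS implements authorization: the class PrivilegeGraph and its methods are hypothetical names, and the sketch assumes that every grant carries the grant option. A user holds a privilege only while a chain of grants leads back to the owner.

```python
from collections import defaultdict


class PrivilegeGraph:
    """Toy authorization model: one set of (grantor, grantee) edges per privilege."""

    def __init__(self, owner):
        self.owner = owner
        self.grants = defaultdict(set)

    def grant(self, grantor, grantee, privilege):
        # Assumption: every grant is made WITH GRANT OPTION.
        self.grants[privilege].add((grantor, grantee))

    def holders(self, privilege):
        """Return the users reachable from the owner through grant edges."""
        reached, frontier = {self.owner}, [self.owner]
        while frontier:
            user = frontier.pop()
            for grantor, grantee in self.grants[privilege]:
                if grantor == user and grantee not in reached:
                    reached.add(grantee)
                    frontier.append(grantee)
        return reached

    def revoke_cascade(self, grantor, grantee, privilege):
        """Remove one grant, then drop grants made by users who lost the privilege."""
        self.grants[privilege].discard((grantor, grantee))
        alive = self.holders(privilege)
        self.grants[privilege] = {
            (g, u) for (g, u) in self.grants[privilege] if g in alive
        }


g = PrivilegeGraph(owner="joe")
g.grant("joe", "art", "SELECT ON Sailors")
g.grant("art", "bob", "SELECT ON Sailors")
g.revoke_cascade("joe", "art", "SELECT ON Sailors")
print(sorted(g.holders("SELECT ON Sailors")))  # ['joe']
```

If joe had also granted the privilege to bob directly, bob would remain reachable from the owner and would keep the privilege, which matches the "unless they received these privileges independently" rule.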
If the RESTRICT keyword is specified in the REVOKE command, the command is rejected if revoking the
privileges just from the users specified in the command would result in other privileges becoming
abandoned.

Mandatory access control: It is based on system wide policies that cannot be changed by individual
users. In this approach each database object is assigned a security class, each user is assigned
clearance for a security class, and rules are imposed on reading and writing of database objects by users.
The DBMS determines whether a given user can read or write a given object based on certain rules that
involve the security level of the object and the clearance of the user.
The popular model for mandatory access control, called the Bell-LaPadula model, is described in terms of
objects (e.g., tables, views, rows, columns), subjects (e.g., users, programs), security classes, and
clearances. Each database object is assigned a security class, and each subject is assigned clearance
for a security class; we will denote the class of an object or subject A as class(A). The security classes in
a system are organized according to a partial order, with a most secure class and a least secure class.
For simplicity, we will assume that there are four classes: top secret (TS), secret (S), confidential (C), and
unclassified (U). In this system, TS > S > C > U, where A > B means that class A data is more sensitive
than class B data.
The Bell-LaPadula model imposes two restrictions on all reads and writes of database objects:
1. Simple Security Property: Subject S is allowed to read object O only if class(S) ≥ class(O). For
example, a user with TS clearance can read a table with C classification, but a user with C clearance is not
allowed to read a table with TS classification.
2. *-Property: Subject S is allowed to write object O only if class(S) ≤ class(O). For example, a user with
S clearance can only write objects with S or TS classification.
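
As a minimal sketch, the two Bell-LaPadula restrictions reduce to two comparisons. The function names can_read and can_write and the numeric ranks are assumptions made for illustration. The four classes form a simple total order here, although the model in general only requires a partial order.

```python
# Numeric ranks for the four classes, so that TS > S > C > U.
LEVELS = {"U": 0, "C": 1, "S": 2, "TS": 3}


def can_read(subject_class, object_class):
    """Simple Security Property: read only if class(S) >= class(O)."""
    return LEVELS[subject_class] >= LEVELS[object_class]


def can_write(subject_class, object_class):
    """*-Property: write only if class(S) <= class(O)."""
    return LEVELS[subject_class] <= LEVELS[object_class]


print(can_read("TS", "C"))   # True
print(can_read("C", "TS"))   # False
print(can_write("S", "TS"))  # True
print(can_write("S", "C"))   # False
```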

Multilevel Relations and Polyinstantiation

To apply mandatory access control policies in a relational DBMS, a security class must be assigned to
each database object. The objects can be at the granularity of tables, rows, or even individual column
values. Let us assume that each row is assigned a security class. This situation leads to the concept of a
multilevel table, which is a table with the surprising property that users with different security clearances
will see a different collection of rows when they access the same table.
Consider the instance of the Boats table shown in Figure below. Users with S and TS clearance will get
both rows in the answer when they ask to see all rows in Boats. A user with C clearance will get only the
second row, and a user with U clearance will get no rows.




[Figure: an instance of the Boats table, including a Security class column]

The Boats table is defined to have bid as the primary key. Suppose that a user with clearance C wishes
to enter the row <101, Picante, Scarlet, C>. We have a dilemma:
If the insertion is permitted, two distinct rows in the table will have key 101.
If the insertion is not permitted because the primary key constraint is violated, the user trying to insert the
new row, who has clearance C, can infer that there is a boat with bid=101 whose security class is higher
than C. This situation compromises the principle that users should not be able to infer any information
about objects that have a higher security classification.

This dilemma is resolved by effectively treating the security classification as part of the key. Thus, the
insertion is allowed to continue, and the table instance is modified as shown in Figure below.




[Figure: the Boats instance after the insertion, including a Security class column]

Users with clearance C see just the rows for Picante and Pinto, but users with clearance S or TS see
all three rows. The two rows with bid=101 can be interpreted in one of two ways: only the row with the
higher classification (Salsa, with classification S) actually exists, or both exist and their presence is
revealed to users according to their clearance level. The choice of interpretation is up to application
developers and users.
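
A minimal sketch of a multilevel table with polyinstantiation, in Python. The class MultilevelBoats is a hypothetical name, and the bid 102 and the colors Red and Brown are values assumed for illustration. The other values come from the text above. The security class is stored as part of the key, so two rows with bid 101 can coexist.

```python
LEVELS = {"U": 0, "C": 1, "S": 2, "TS": 3}


class MultilevelBoats:
    def __init__(self):
        # The key is (bid, security_class); the value is (bname, color).
        self.rows = {}

    def insert(self, bid, bname, color, cls):
        key = (bid, cls)
        if key in self.rows:
            raise KeyError(f"duplicate key {key}")
        self.rows[key] = (bname, color)

    def select_all(self, clearance):
        """Return only the rows classified at or below the caller's clearance."""
        return sorted(
            (bid, bname, color, cls)
            for (bid, cls), (bname, color) in self.rows.items()
            if LEVELS[cls] <= LEVELS[clearance]
        )


boats = MultilevelBoats()
boats.insert(101, "Salsa", "Red", "S")
boats.insert(102, "Pinto", "Brown", "C")
# A C-level user inserts another boat with bid 101; the key (101, "C") is new,
# so the insert succeeds and reveals nothing about the S-level row.
boats.insert(101, "Picante", "Scarlet", "C")

print(len(boats.select_all("C")))  # 2
print(len(boats.select_all("S")))  # 3
```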

Covert Channels, DoD Security Levels

Even if a DBMS enforces the mandatory access control scheme discussed above, information can flow
from a higher classification level to a lower classification level through indirect means, called covert
channels. For example, if a transaction accesses data at more than one site in a distributed DBMS, the
actions at the two sites must be coordinated. The process at one site may have a lower clearance (say C)
than the process at another site (say S), and both processes have to agree to commit before the
transaction can be committed. This requirement can be exploited to pass information with an S
classification to the process with a C clearance: The transaction is repeatedly invoked, and the process
with the C clearance always agrees to commit, whereas the process with the S clearance agrees to
commit if it wants to transmit a 1 bit and does not agree if it wants to transmit a 0 bit.
In this manner, information with an S classification can be sent to a process with a C clearance as a stream
of bits. This covert channel is an indirect violation of the intent behind the *-Property.
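
The commit-based channel can be simulated in a few lines of Python. This is a toy model with hypothetical function names, not a real commit protocol: each invocation of the transaction leaks one bit through its commit-or-abort outcome, which the C-level process is able to observe.

```python
def run_transaction(high_votes_commit):
    """The transaction commits only if both processes agree to commit."""
    low_votes_commit = True  # the C-level process always agrees
    return low_votes_commit and high_votes_commit


def send_bits(secret_bits):
    """Outcomes seen by the C-level process, one per invocation of the transaction."""
    return [1 if run_transaction(bit == 1) else 0 for bit in secret_bits]


print(send_bits([1, 0, 1, 1]))  # [1, 0, 1, 1]
```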

Role of the Database Administrator

The database administrator (DBA) plays an important role in enforcing the security-related aspects of a
database design. In conjunction with the owners of the data, the DBA will probably also contribute to
developing a security policy. The DBA has a special account, which we will call the system account, and
is responsible for the overall security of the system. In particular, the DBA deals with the following:
1. Creating new accounts: Each new user or group of users must be assigned an authorization id and a
password. Note that application programs that access the database have the same authorization id as the
user executing the program.
2. Mandatory control issues: If the DBMS supports mandatory control (some customized systems for
applications with very high security requirements, for example military data, provide such support), the
DBA must assign security classes to each database object and assign security clearances to each
authorization id in accordance with the chosen security policy.
3. Audit trail: The DBA is also responsible for maintaining the audit trail, which is essentially the log of
updates with the authorization id (of the user who is executing the transaction) added to each log entry.
This log is just a minor extension of the log mechanism used to recover from crashes. Additionally, the
DBA may choose to maintain a log of all actions, including reads, performed by a user. Analyzing such
histories of how the DBMS was accessed can help prevent security violations by identifying suspicious
patterns before an intruder finally succeeds in breaking in, or it can help track down an intruder after a
violation has been detected.

A DBMS can use encryption to protect information in certain situations where the normal security
mechanisms of the DBMS are not adequate. For example, an intruder may steal tapes containing some
data or tap a communication line. By storing and transmitting data in an encrypted form, the DBMS
ensures that such stolen data is not intelligible to the intruder.
The basic idea is to apply an encryption algorithm to the original data together with an encryption key.
The output of the algorithm is the encrypted version of the data. There is also a decryption algorithm,
which takes the encrypted data and the encryption key as input and then returns the original data. A
well-known example of this approach is the Data Encryption Standard (DES). The main weakness of this
approach is that authorized users must be told the encryption key, and the mechanism for communicating
this information is vulnerable to clever intruders.
Another approach is called public-key encryption. The encryption scheme proposed by Rivest, Shamir,
and Adleman, called RSA, is a well-known example of public-key encryption. In this scheme, each
authorized user has a public encryption key, known to everyone, and a private decryption key, chosen by
the user and known only to him or her.
For example, consider a user called sam. Anyone can send sam a secret message by encrypting the
message using sam's publicly known encryption key. Only sam can decrypt this secret message because
the decryption algorithm requires sam's decryption key, known only to sam. Since users choose their own
decryption keys, the weakness of DES is avoided.
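
The public-key idea can be illustrated with a toy RSA computation in Python. The primes 61 and 53, the exponent 17, and the message 65 are assumptions chosen only so the arithmetic stays small. Real keys use very large primes and padding, so this sketch is not secure.

```python
# Toy RSA key generation with tiny primes.
p, q = 61, 53
n = p * q                  # public modulus
phi = (p - 1) * (q - 1)
e = 17                     # public exponent, coprime with phi
d = pow(e, -1, phi)        # private exponent: inverse of e modulo phi (Python 3.8+)


def encrypt(m, public_key):
    """Anyone can encrypt using the public key (e, n)."""
    exponent, modulus = public_key
    return pow(m, exponent, modulus)


def decrypt(c, private_key):
    """Only the holder of the private key (d, n) can decrypt."""
    exponent, modulus = private_key
    return pow(c, exponent, modulus)


message = 65
ciphertext = encrypt(message, (e, n))
recovered = decrypt(ciphertext, (d, n))
print(recovered == message)  # True
```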


What is Postgres?
Traditional relational database management systems (DBMSs) support a data model consisting of a
collection of named relations, containing attributes of a specific type. In current commercial systems,
possible types include floating point numbers, integers, character strings, money, and dates. It is
commonly recognized that this model is inadequate for future data processing applications. The relational
model successfully replaced previous models in part because of its "Spartan simplicity". However, as
mentioned, this simplicity often makes the implementation of certain applications very difficult. Postgres
offers substantial additional power by incorporating the following four additional basic concepts in such a
way that users can easily extend the system:
classes
inheritance
types
functions
Other features provide additional power and flexibility:
constraints
triggers
rules
transaction integrity
These features put Postgres into the category of databases referred to as object-relational. Postgres is a
client/server application. As a user, you only need access to the client portions of the installation.
Postgres uses a simple "process per-user" client/server model. A Postgres session consists of the
following cooperating UNIX processes (programs):
A supervisory daemon process (postmaster),
The user's frontend application (e.g., the psql program), and
One or more backend database servers (the postgres process itself).
A single postmaster manages a given collection of databases on a single host. Such a collection of
databases is called an installation or site. Frontend applications that wish to access a given database
within an installation make calls to the client library (libpq). The library sends user requests over the
network to the postmaster, which in turn starts a new backend server process and
connects the frontend process to the new server. From that point on, the frontend process and the
backend server communicate without intervention by the postmaster. Hence, the postmaster is always
running, waiting for requests, whereas frontend and backend processes come and go.

Transactions in POSTGRES

Transactions are a fundamental concept of all database systems. The essential point of a
transaction is that it bundles multiple steps into a single, all-or-nothing operation. The
intermediate states between the steps are not visible to other concurrent transactions, and if some
failure occurs that prevents the transaction from completing, then none of the steps affect the
database at all.
For example, consider a bank database that contains balances for various customer accounts, as
well as total deposit balances for branches. Suppose that we want to record a payment of $100.00
from Alice's account to Bob's account. Simplifying outrageously, the SQL commands for this
might look like
UPDATE accounts SET balance = balance - 100.00
WHERE name = 'Alice';
UPDATE branches SET balance = balance - 100.00
WHERE name = (SELECT branch_name FROM accounts WHERE name = 'Alice');
UPDATE accounts SET balance = balance + 100.00
WHERE name = 'Bob';
UPDATE branches SET balance = balance + 100.00
WHERE name = (SELECT branch_name FROM accounts WHERE name = 'Bob');

The details of these commands are not important here; the important point is that there are
several separate updates involved to accomplish this rather simple operation. Our bank's officers
will want to be assured that either all these updates happen, or none of them happen. It would
certainly not do for a system failure to result in Bob receiving $100.00 that was not debited from
Alice. Nor would Alice long remain a happy customer if she was debited without Bob being
credited. We need a guarantee that if something goes wrong partway through the operation, none
of the steps executed so far will take effect. Grouping the updates into a transaction gives us this
guarantee. A transaction is said to be atomic: from the point of view of other transactions, it
either happens completely or not at all.

In PostgreSQL, a transaction is set up by surrounding the SQL commands of the transaction with
BEGIN and COMMIT commands. So our banking transaction would actually look like
BEGIN;
UPDATE accounts SET balance = balance - 100.00
WHERE name = 'Alice';
-- etc etc
COMMIT;

If, partway through the transaction, we decide we do not want to commit (perhaps we just
noticed that Alice's balance went negative), we can issue the command ROLLBACK instead of
COMMIT, and all our updates so far will be canceled.

PostgreSQL actually treats every SQL statement as being executed within a transaction. If you
do not issue a BEGIN command, then each individual statement has an implicit BEGIN and (if
successful) COMMIT wrapped around it. A group of statements surrounded by BEGIN and COMMIT
is sometimes called a transaction block.
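
The all-or-nothing behaviour can be tried out with a short Python sketch. It uses the standard-library sqlite3 module rather than PostgreSQL so that it is self-contained, which is an assumption of this example. The reduced accounts-only schema, the CHECK constraint, and the transfer function are also hypothetical. The credit to Bob is executed first, and it is undone when the debit from Alice fails.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance REAL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("Alice", 50.0), ("Bob", 0.0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money atomically; return False if the transaction was rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


failed = transfer(conn, "Alice", "Bob", 100.0)   # would make Alice negative
after_failed = balances(conn)                    # unchanged: the credit was rolled back
succeeded = transfer(conn, "Alice", "Bob", 30.0)
after_success = balances(conn)
```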

XML stands for the eXtensible Markup Language. It is a markup language developed by
the W3C (World Wide Web Consortium).
Some of the areas where XML will be useful in the near-term include:
large Web site maintenance. XML would work behind the scene to simplify the creation of
HTML documents
exchange of information between organizations
off loading and reloading of databases
syndicated content, where content is being made available to different Web sites
electronic commerce applications where different organizations collaborate to serve a customer
scientific applications with new markup languages for mathematical and chemical formulas
electronic books with new markup languages to express rights and ownership
handheld devices and smart phones with new markup languages optimized for these
alternative devices
XML makes essentially two changes to HTML:
It predefines no tags.
It is stricter.
No Predefined Tags
Because there are no predefined tags in XML, you, the author, can create the tags that you need.
<price currency="usd">499.00</price>
<toc xlink:href="/newsletter">Pineapplesoft Link</toc>
Stricter
HTML has a very forgiving syntax. This is great for authors who can be as lazy as they want, but
it also makes Web browsers more complex. According to some estimates, more than 50% of the
code in a browser handles errors or sloppiness on the author's part.
XML Example:
A List of Products in XML
<?xml version="1.0"?>
<products>
<product id="p1">
<name>XML Editor</name>
</product>
<product id="p2">
<name>DTD Editor</name>
</product>
<product id="p3">
<name>XML Book</name>
</product>
<product id="p4">
<name>XML Training</name>
</product>
</products>

In this context, XML is used to exchange information between organizations.

The XML Web is a large database on which applications can tap

Applications exchanging data over the Web

XML Syntax
The main syntax rules are:

XML documents must have a root element

XML elements must have a closing tag
XML tags are case sensitive
XML elements must be properly nested
XML attribute values must be quoted
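
These rules can be checked with any XML parser. The following sketch uses Python's standard xml.etree.ElementTree module. The helper is_well_formed and the sample strings are made up for illustration. Each of the faulty samples breaks exactly one of the rules listed above.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


samples = {
    "ok": '<note id="1"><to>x</to></note>',
    "two root elements": "<a/><b/>",
    "missing closing tag": "<note><to>x</note>",
    "case mismatch": "<Note></note>",
    "improper nesting": "<b><i>text</b></i>",
    "unquoted attribute": "<note id=1></note>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {ok}")
```

Only the first sample is accepted. A parser must reject the others instead of guessing what the author meant, which is the sense in which XML is stricter than HTML.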

XML Schemas
The DTD is the original modeling language or schema for XML.
The syntax for DTDs is different from the syntax for XML documents.

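The purpose of a DTD is to define the structure of an XML document. It defines the structure
with a list of legal elements. The following document refers to an external DTD:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE note SYSTEM "Note.dtd">
<note>
<to>Tove</to><from>Jani</from><heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
The file Note.dtd declares the legal elements:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>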
XML Schema

W3C supports an XML-based alternative to DTD, called XML Schema:

<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>


XML Namespaces provide a method to avoid element name conflicts.

An XML namespace is a collection of element and attribute names. XML namespaces provide a
means for document authors to unambiguously refer to elements with the same name (i.e.,
prevent collisions). For example, two documents may each use an element named subject to mark up
data. In the first case, the subject is something one studies in school, whereas in the second case, the
subject is a field of medicine. Namespaces can differentiate these two subject elements by qualifying
each name with a prefix bound to a distinct namespace URI.
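
A minimal sketch in Python's standard xml.etree.ElementTree shows two subject elements kept apart by namespaces. The prefixes school and medical, the urn:example URIs, the directory wrapper element, and the values Geometry and Cardiology are all hypothetical and serve only as illustration.

```python
import xml.etree.ElementTree as ET

doc = """
<directory xmlns:school="urn:example:school"
           xmlns:medical="urn:example:medical">
  <school:subject>Geometry</school:subject>
  <medical:subject>Cardiology</medical:subject>
</directory>
"""

root = ET.fromstring(doc)
# Map the prefixes used in the queries to their namespace URIs.
ns = {"school": "urn:example:school", "medical": "urn:example:medical"}

school_subject = root.find("school:subject", ns).text
medical_subject = root.find("medical:subject", ns).text
# The parser expands each name to {namespace-URI}local-name.
tags = [child.tag for child in root]

print(school_subject, medical_subject)
print(tags)
```

Both elements have the local name subject, but their expanded names differ, so an application cannot confuse them.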

Benefits of the DTD

The main benefits of using a DTD are
The XML processor enforces the structure, as defined in the DTD.
The application accesses the document structure, such as to populate an element list.
The DTD gives hints to the XML processor; that is, it helps separate indenting from content.
The DTD can declare default or fixed values for attributes. This might result in a smaller
document.

XSL stands for EXtensible Stylesheet Language.
The World Wide Web Consortium (W3C) started to develop XSL because there was a need for
an XML-based Stylesheet Language.

XSL = Style Sheets for XML

XML does not use predefined tags (we can use any tag-names we like), and therefore the
meaning of each tag is not well understood.
A <table> tag could mean an HTML table, a piece of furniture, or something else - and a
browser does not know how to display it.
XSL describes how the XML document should be displayed!
XSL consists of three parts:

XSLT - a language for transforming XML documents

XPath - a language for navigating in XML documents
XSL-FO - a language for formatting XML documents

What is XSLT?
XSLT is a language for transforming XML documents into XHTML documents or to other XML documents.

XSLT stands for XSL Transformations

XSLT is the most important part of XSL
XSLT transforms an XML document into another XML document
XSLT uses XPath to navigate in XML documents
XSLT is a W3C Recommendation

XPath is a language for navigating in XML documents. XSLT uses XPath to find information in an XML
document. XPath is used to navigate through elements and attributes in XML documents.
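
As a rough illustration, Python's standard xml.etree.ElementTree module accepts a limited subset of XPath expressions. The sketch below navigates a product list like the one shown earlier. The variable names are hypothetical, and a full XPath or XSLT processor supports far more than this subset.

```python
import xml.etree.ElementTree as ET

products = ET.fromstring("""
<products>
  <product id="p1"><name>XML Editor</name></product>
  <product id="p2"><name>DTD Editor</name></product>
  <product id="p3"><name>XML Book</name></product>
  <product id="p4"><name>XML Training</name></product>
</products>
""")

# Child path: every name element under a product element.
names = [n.text for n in products.findall("./product/name")]
# Attribute predicate: the product whose id attribute equals p3.
p3_name = products.find("./product[@id='p3']/name").text
# Descendant search: collect the id attribute of every product.
ids = [p.get("id") for p in products.findall(".//product")]

print(names)
print(p3_name)
print(ids)
```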

What is XSL-FO?

XSL-FO is a language for formatting XML data

XSL-FO stands for Extensible Stylesheet Language Formatting Objects
XSL-FO is based on XML
XSL-FO is a W3C Recommendation
XSL-FO is now formally named XSL