Silberschatz Text: 6
What are Relational Algebra, Relational Calculus?
Relational Algebra
– A basic set of operations to manipulate the database
– Procedural: specifies how to retrieve the data; its operations are incorporated in SQL
– Provides formal foundation for relational model operations
– Used as a basis for implementing and optimizing queries in
RDBMS
Relational Calculus
– A higher-level declarative notation for specifying queries
– Non-procedural, specify only what is to be retrieved
– Firm basis in mathematical logic
– Tuple Calculus: variables range over tuples (rows)
– Domain Calculus: variables range over attribute values (columns)
R.Algebra Slide -2
Topics to cover
Basic Relational Algebra Operations
– Set (∪, ∩, –, ×)
– Specific for relational database (σ, π, ⋈ , ρ, ÷)
– Unary (single relation) and Binary (2 relations)
Additional Relational Algebra Operations
– Aggregate Functions and Grouping ℑ
– Recursive Closure Operations
– OUTER JOIN Operations (⟕, ⟖, ⟗)
– The OUTER UNION Operation
Tuple Relational Calculus
– SQL based on Relational calculus
Domain Relational Calculus
– QBE based on Domain Calculus
Relational Algebra Operations
1. SET OPERATIONS
UNION ∪, INTERSECTION ∩
SET DIFFERENCE (MINUS) –
CARTESIAN PRODUCT (CROSS JOIN) ×
2. SPECIFIC OPERATIONS
SELECT σ, PROJECT π (unary operations)
JOIN ⋈ (various types: equijoin =, theta join θ, natural join *) (binary operations)
OTHERS
Sequence, rename ρ (unary operations)
Division ÷ (binary operations)
SELECT OPERATION
UNARY implies it is applied to a single relation and to each tuple individually
Select a subset of tuples from a relation that satisfy a selection condition
σ<selection condition>(R)
σ (sigma) = SELECT operation
selection condition = Boolean expression
<attribute name> <comparison op> <constant value>, or
<attribute name> <comparison op> <attribute name>
<comparison op> = {=,<, ≤,>, ≥, ≠}
<constant value> = constant value from the attribute domain
Can connect clauses by AND, OR and NOT
The degree of the relation resulting from a SELECT operation is the same as the degree of R
Number of resulting tuples <= number of tuples in R
Commutative, i.e., can be applied in any order: σ<cond1>(σ<cond2>(R)) = σ<cond2>(σ<cond1>(R))
A cascade of SELECT operations can be combined into a single SELECT operation
with a conjunctive (AND) condition
σ<cond1>(σ<cond2>(…(σ<condn>(R))…)) = σ<cond1> AND <cond2> AND … AND <condn>(R)
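The commutativity and cascade properties of SELECT can be checked on a small example. The following is a minimal sketch, assuming a relation is modelled as a list of dictionaries; the helper `select`, the two condition functions, and the sample tuples are illustrative and not part of the slides.

```python
# Hypothetical sample tuples; attribute names follow the COMPANY schema.
EMPLOYEE = [
    {"LNAME": "Smith", "DNO": 5, "SALARY": 30000},
    {"LNAME": "Wong", "DNO": 5, "SALARY": 40000},
    {"LNAME": "Zelaya", "DNO": 4, "SALARY": 25000},
]


def select(condition, relation):
    """σ<condition>(R): keep the tuples for which the condition holds."""
    return [t for t in relation if condition(t)]


def in_dept_5(t):
    return t["DNO"] == 5


def high_paid(t):
    return t["SALARY"] > 30000


# Two cascades in opposite order, and one SELECT with the AND-ed condition.
cascade_a = select(in_dept_5, select(high_paid, EMPLOYEE))
cascade_b = select(high_paid, select(in_dept_5, EMPLOYEE))
combined = select(lambda t: in_dept_5(t) and high_paid(t), EMPLOYEE)
```

All three expressions yield the same relation, and every result tuple keeps all attributes of EMPLOYEE, so the degree is unchanged.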
PROJECT OPERATION
UNARY implies it is applied to a single relation and to each tuple individually
Selects certain columns (attributes) from the table and discards the other columns
π <attribute list>(R)
π (pi) = PROJECT operation
e.g. π LNAME, FNAME, SALARY (EMPLOYEE)
Projects (chooses) only the attributes specified in <attribute list>, in the same order as they
appear in the list
Duplicate elimination: if <attribute list> contains only non-key attributes of R, duplicate tuples are removed from the result
Number of resulting tuples <= number of tuples in R
If projection list is a superkey of R, the resulting relation has the same number of
tuples as R
If <list2> contains all the attributes in <list1>: π <list1> (π <list2> (R)) = π <list1> (R)
Commutativity does not hold for the PROJECT operation
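Duplicate elimination and the superkey property of PROJECT are easy to see on a few tuples. The following is a minimal sketch, assuming relations are lists of dictionaries; the helper `project` and the sample data are illustrative and not part of the slides.

```python
def project(attributes, relation):
    """π<attributes>(R): keep the listed columns in list order, drop duplicates."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:  # duplicate elimination
            seen.add(row)
            result.append(dict(zip(attributes, row)))
    return result


# Hypothetical sample tuples; SSN is the key of EMPLOYEE.
EMPLOYEE = [
    {"SSN": "1", "LNAME": "Smith", "SEX": "M", "SALARY": 30000},
    {"SSN": "2", "LNAME": "Wong", "SEX": "M", "SALARY": 30000},
    {"SSN": "3", "LNAME": "Zelaya", "SEX": "F", "SALARY": 25000},
]

sex_salary = project(["SEX", "SALARY"], EMPLOYEE)  # non-key attributes only
with_key = project(["SSN", "SALARY"], EMPLOYEE)    # list contains the key
nested = project(["SEX"], project(["SEX", "SALARY"], EMPLOYEE))
```

Here `sex_salary` has fewer tuples than EMPLOYEE because two employees share the same (SEX, SALARY) pair, while `with_key` keeps one tuple per employee.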
Results of SELECT and PROJECT operations.
Refer to COMPANY database schema
SELECT
(a): σ(DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000)(EMPLOYEE).
PROJECT
(b): πLNAME, FNAME, SALARY(EMPLOYEE).
(c): πSEX, SALARY(EMPLOYEE).
Fig 1
SEQUENCES OF OPERATIONS
Results of a sequence of operations.
(a) πFNAME, LNAME, SALARY(σDNO=5(EMPLOYEE)).
(b) Using intermediate relations and renaming of attributes.
Fig 2
RENAME OPERATIONS
● Sometimes the attributes in the intermediate result relations are
renamed
– TEMP ← σ DNO=5 (EMPLOYEE)
– S ← ρ(FIRSTNAME, LASTNAME, SALARY)(π FNAME, LNAME, SALARY (TEMP))
● RENAME Operation
– ρ (rho), general form: ρS(B1, B2, …, Bn)(R)
– S is the new relation name
– B1, B2, …, Bn are the new attribute names
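The sequence TEMP, then projection, then renaming can be traced step by step. The following is a minimal sketch, assuming relations are lists of dictionaries whose key order stands in for attribute order; the helper `rename` and the sample tuples are illustrative and not part of the slides.

```python
def rename(new_names, relation):
    """ρ(B1, …, Bn)(R): rename the attributes of R positionally."""
    return [dict(zip(new_names, t.values())) for t in relation]


# Hypothetical sample tuples; attribute names follow the COMPANY schema.
EMPLOYEE = [
    {"FNAME": "John", "LNAME": "Smith", "SALARY": 30000, "DNO": 5},
    {"FNAME": "Alicia", "LNAME": "Zelaya", "SALARY": 25000, "DNO": 4},
]

# TEMP ← σ DNO=5 (EMPLOYEE)
TEMP = [t for t in EMPLOYEE if t["DNO"] == 5]
# π FNAME, LNAME, SALARY (TEMP)
projected = [{a: t[a] for a in ("FNAME", "LNAME", "SALARY")} for t in TEMP]
# Rename the projected attributes to FIRSTNAME, LASTNAME, SALARY
S = rename(["FIRSTNAME", "LASTNAME", "SALARY"], projected)
```

Only the attribute names change; the tuples themselves are untouched.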
Commutative operations:
– R ∪ S = S ∪ R and R ∩ S = S ∩ R
Associative operations:
– R ∪ ( S ∪ T) = (R ∪ S) ∪ T
Fig 4
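Python's built-in sets behave like union-compatible relations, so the commutative and associative laws can be checked directly. The following is a minimal sketch; the three relations and their contents are hypothetical. Set difference is included for contrast, since it is not commutative in general.

```python
# Hypothetical union-compatible relations, modelled as sets of (FN, LN) tuples.
STUDENT = {("Susan", "Yao"), ("Ramesh", "Shah"), ("John", "Smith")}
INSTRUCTOR = {("John", "Smith"), ("Ricardo", "Browne")}
STAFF = {("Susan", "Yao"), ("Francis", "Johnson")}

union = STUDENT | INSTRUCTOR         # R ∪ S
intersection = STUDENT & INSTRUCTOR  # R ∩ S
difference = STUDENT - INSTRUCTOR    # R – S

commutative = (
    STUDENT | INSTRUCTOR == INSTRUCTOR | STUDENT
    and STUDENT & INSTRUCTOR == INSTRUCTOR & STUDENT
)
associative = STUDENT | (INSTRUCTOR | STAFF) == (STUDENT | INSTRUCTOR) | STAFF
difference_commutes = STUDENT - INSTRUCTOR == INSTRUCTOR - STUDENT
```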
SET Operation:
The CARTESIAN PRODUCT (CROSS JOIN)
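CROSS JOIN = ×
R(A1, A2, …, An) × S(B1, B2, …, Bm) gives Q(A1, A2, …, An, B1, B2, …, Bm): n + m attributes and one tuple for every combination of a tuple from R with a tuple from S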
Fig 5c
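The size of a Cartesian product is easiest to see by building one. The following is a minimal sketch, assuming relations are lists of dictionaries with disjoint attribute names; the helper `cartesian_product` and the sample relations are illustrative and not part of the slides.

```python
from itertools import product

# Hypothetical relations with disjoint attribute names.
R = [{"A1": 1, "A2": "x"}, {"A1": 2, "A2": "y"}]
S = [{"B1": 10}, {"B1": 20}, {"B1": 30}]


def cartesian_product(r, s):
    """R × S: combine every tuple of R with every tuple of S."""
    return [{**t_r, **t_s} for t_r, t_s in product(r, s)]


Q = cartesian_product(R, S)
```

With 2 tuples in R and 3 in S, Q holds 2 × 3 tuples, each carrying the attributes of both relations.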
BINARY JOIN OPERATIONS – INNER JOINS
Inner joins ⋈ keep only matching tuples in the result relation
THETA JOIN ⋈θ
– R(A1, A2, …, An) ⋈<join condition> S(B1, B2, …, Bm), result
= Q(A1, …, An, B1, …, Bm)
– Only combinations of tuples that satisfy the join condition
appear in the result set
– Each <join condition> is of the form Ai θ Bj, where Ai and Bj
have the same domain. θ (theta) is one of the comparison
operators {=, <, ≤, >, ≥, ≠}
EQUIJOIN
– Similar to theta-join but only comparison operator used in the
<join condition> is =
– Join result carries superfluous attributes: each pair of join attributes has identical values in every result tuple
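A theta join can be read as a Cartesian product followed by a selection on the join condition. The following is a minimal sketch, assuming relations are lists of dictionaries with disjoint attribute names; the helper `theta_join` and the sample tuples are illustrative and not part of the slides.

```python
# Hypothetical sample tuples; attribute names follow the COMPANY schema.
DEPARTMENT = [
    {"DNAME": "Research", "DNUMBER": 5, "MGRSSN": "333445555"},
    {"DNAME": "Administration", "DNUMBER": 4, "MGRSSN": "987654321"},
]
EMPLOYEE = [
    {"SSN": "333445555", "LNAME": "Wong"},
    {"SSN": "987654321", "LNAME": "Wallace"},
    {"SSN": "123456789", "LNAME": "Smith"},
]


def theta_join(r, s, condition):
    """R ⋈<condition> S: keep only the combined tuples that satisfy the condition."""
    return [{**t_r, **t_s} for t_r in r for t_s in s if condition(t_r, t_s)]


# EQUIJOIN: the only comparison operator in the join condition is =.
DEPT_MGR = theta_join(DEPARTMENT, EMPLOYEE, lambda d, e: d["MGRSSN"] == e["SSN"])
```

Every tuple of `DEPT_MGR` carries both MGRSSN and SSN with the same value, which is the superfluous attribute pair an equijoin leaves behind. Smith manages no department and therefore does not appear.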
BINARY JOIN OPERATIONS – INNER JOINS
NATURAL JOIN *
– similar to EQUIJOIN, but superfluous attributes are removed
– Both join attributes must have same name, if not, apply
renaming operation first
– Q ← R * (<List1>), (<List2>) S
Fig 6
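A natural join matches on the attributes that share a name and keeps each of them once. The following is a minimal sketch, assuming relations are non-empty lists of dictionaries; the helper `natural_join` and the sample tuples are illustrative, and DEPT is assumed to have already had its attributes renamed so that the join attribute is called DNUM in both relations.

```python
# Hypothetical sample tuples; DNUM is the only attribute name shared by both.
PROJECT = [
    {"PNAME": "ProductX", "DNUM": 5},
    {"PNAME": "Computerization", "DNUM": 4},
]
DEPT = [
    {"DNAME": "Research", "DNUM": 5},
    {"DNAME": "Administration", "DNUM": 4},
    {"DNAME": "Headquarters", "DNUM": 1},
]


def natural_join(r, s):
    """R * S: equijoin on the attributes with the same name, each kept once."""
    # Assumes both relations are non-empty so the schema can be read from a tuple.
    common = [a for a in r[0] if a in s[0]]
    return [
        {**t_r, **t_s}  # the shared attributes collapse into a single column
        for t_r in r
        for t_s in s
        if all(t_r[a] == t_s[a] for a in common)
    ]


PROJ_DEPT = natural_join(PROJECT, DEPT)
```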
Results of two NATURAL JOIN * operations.
(a) RENAME attributes before doing JOIN
DEPT ← ρ(DNAME,DNUM,MGRSSN,MGRSTARTDATE)(DEPARTMENT)
Fig 7
The DIVISION ÷ operation
SQL does not implement DIVISION directly; it offers only a roundabout way of
dealing with it.
Fig 8b
The DIVISION ÷ operation
Example Fig 8a: Retrieve the names of employees who work on all the projects
that “John Smith” works on:
Step 1: Retrieve the list of PNOs that John Smith works on
SMITH ← σ FNAME='JOHN' AND LNAME='SMITH' (EMPLOYEE)
SMITH_PNOS ← π PNO (WORKS_ON ⋈ ESSN=SSN SMITH)
Step 2: Create a relation with a tuple <ESSN, PNO> for each project an employee works on
SSN_PNOS ← π ESSN, PNO (WORKS_ON)
Step 3: Divide SSN_PNOS by SMITH_PNOS to get the desired SSNs
SSNS (SSN) ← SSN_PNOS ÷ SMITH_PNOS
RESULT ← π FNAME, LNAME (SSNS * EMPLOYEE)
T←R÷S
Fig 8
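Step 3 of the example can be traced on a handful of tuples. The following is a minimal sketch, assuming the dividend is a set of (ESSN, PNO) pairs and the divisor a set of PNO values; the helper `divide` and the sample data are illustrative and not part of the slides.

```python
# Hypothetical sample data: (ESSN, PNO) pairs taken from WORKS_ON.
SSN_PNOS = {
    ("123456789", 1), ("123456789", 2),  # John Smith's own projects
    ("453453453", 1), ("453453453", 2),
    ("333445555", 2), ("333445555", 3),
}
SMITH_PNOS = {1, 2}


def divide(r, s):
    """R(X, Y) ÷ S(Y): the X values paired in R with every Y value in S."""
    candidates = {x for x, _ in r}
    return {x for x in candidates if s <= {y for x2, y in r if x2 == x}}


SSNS = divide(SSN_PNOS, SMITH_PNOS)
```

Employee 333445555 works on project 2 but not on project 1, so that SSN is excluded. Smith's own SSN is in the result, as expected for division.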
Additional Relational Operations
Some common database requests cannot be performed with the
basic algebra operations (in the previous list) and require
additional relational algebra operations, such as aggregate functions with grouping:
<grouping attributes> ℑ <function list> (R)
where:
grouping attributes: list of attributes of relation R
function list: list of <function> <attribute> pairs, where <function> is one of SUM,
AVERAGE, MAXIMUM, MINIMUM, COUNT
Fig 10
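Grouping followed by aggregation can be sketched as a two-pass procedure: partition the tuples by the grouping attributes, then apply each function to its attribute within each group. The following is a minimal sketch, assuming relations are lists of dictionaries; the helper `aggregate`, the result column naming `<FUNCTION>_<ATTRIBUTE>`, and the sample data are illustrative choices and not part of the slides.

```python
from collections import defaultdict
from statistics import mean


def aggregate(grouping_attributes, function_list, relation):
    """Group R by the grouping attributes and apply each (name, func, attribute)."""
    groups = defaultdict(list)
    for t in relation:
        groups[tuple(t[a] for a in grouping_attributes)].append(t)
    result = []
    for key, members in groups.items():
        row = dict(zip(grouping_attributes, key))
        for name, func, attribute in function_list:
            # Hypothetical naming convention for the result columns.
            row[f"{name}_{attribute}"] = func([m[attribute] for m in members])
        result.append(row)
    return result


# Hypothetical sample tuples; attribute names follow the COMPANY schema.
EMPLOYEE = [
    {"SSN": "1", "DNO": 5, "SALARY": 30000},
    {"SSN": "2", "DNO": 5, "SALARY": 40000},
    {"SSN": "3", "DNO": 4, "SALARY": 25000},
]

# Group by DNO, then COUNT SSN and AVERAGE SALARY within each group.
per_dept = aggregate(
    ["DNO"], [("COUNT", len, "SSN"), ("AVERAGE", mean, "SALARY")], EMPLOYEE
)
```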
LEFT, RIGHT, FULL OUTER JOINS
LEFT OUTER JOIN
– keeps every tuple in the first, or left, relation R in R ⟕ S; if
no matching tuple is found in S, then the attributes of S in the join
result are filled or “padded” with null values
Fig 11
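The padding behaviour of a left outer join can be sketched as follows, with Python's `None` standing in for null. This is a minimal sketch, assuming relations are lists of dictionaries; the helper `left_outer_join`, its explicit `s_attributes` parameter, and the sample tuples are illustrative and not part of the slides.

```python
def left_outer_join(r, s, condition, s_attributes):
    """Left outer join: keep every tuple of R, padding S's attributes with None."""
    result = []
    for t_r in r:
        matches = [t_s for t_s in s if condition(t_r, t_s)]
        if matches:
            result.extend({**t_r, **t_s} for t_s in matches)
        else:
            # No matching tuple in S: pad the S attributes with nulls.
            result.append({**t_r, **dict.fromkeys(s_attributes)})
    return result


# Hypothetical sample tuples; attribute names follow the COMPANY schema.
EMPLOYEE = [
    {"SSN": "333445555", "LNAME": "Wong"},
    {"SSN": "123456789", "LNAME": "Smith"},
]
DEPARTMENT = [{"DNAME": "Research", "MGRSSN": "333445555"}]

managers = left_outer_join(
    EMPLOYEE, DEPARTMENT, lambda e, d: e["SSN"] == d["MGRSSN"], ["DNAME", "MGRSSN"]
)
```

Smith manages no department, yet still appears in the result with null department attributes.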
OUTER UNION operation
UNION of tuples from 2 relations with partially UNION compatible attributes,
e.g. R(X, Y) and S (X, Z)
UNION compatible attributes are represented once in the result, others are kept in
the result relation T (X, Y, Z)
Example:
Apply OUTER UNION to 2 relations STUDENT (Name, SSN, Department,
Advisor) and INSTRUCTOR(Name, SSN, Department, Rank).
Result relation:
STUDENT_OR_INSTRUCTOR (Name, SSN, Department, Advisor, Rank)
Tuples with same (Name, SSN, Department) will appear once only
Tuples only in STUDENT will have a null for the Rank attribute
Tuples only in INSTRUCTOR will have a null for the Advisor attribute
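The STUDENT and INSTRUCTOR example can be sketched by merging tuples that agree on the shared attributes and padding the rest with `None`. This is a minimal sketch, assuming relations are non-empty lists of dictionaries; the helper `outer_union` and the sample people are hypothetical.

```python
def outer_union(r, s, shared):
    """Outer union of R and S, which are union compatible on `shared` only."""
    # Assumes both relations are non-empty so each schema can be read from a tuple.
    all_attributes = list(dict.fromkeys(list(r[0]) + list(s[0])))
    merged = {}
    for t in r + s:
        key = tuple(t[a] for a in shared)
        # First sighting of a key creates a row padded with None (null).
        row = merged.setdefault(key, dict.fromkeys(all_attributes))
        row.update(t)
    return list(merged.values())


# Hypothetical sample tuples.
STUDENT = [
    {"Name": "Ann", "SSN": "111", "Department": "CS", "Advisor": "Lee"},
    {"Name": "Bob", "SSN": "222", "Department": "EE", "Advisor": "Kim"},
]
INSTRUCTOR = [
    {"Name": "Ann", "SSN": "111", "Department": "CS", "Rank": "Lecturer"},
    {"Name": "Cho", "SSN": "333", "Department": "CS", "Rank": "Professor"},
]

STUDENT_OR_INSTRUCTOR = outer_union(STUDENT, INSTRUCTOR, ["Name", "SSN", "Department"])
```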
– ti.A op tj.B
where op is one of the comparison operators in {=, <, ≤, >, ≥, ≠}
A is an attribute of the relation on which ti ranges
B is an attribute of the relation on which tj ranges
– ti.A op C or C op tj.B
where C is a constant value
Existential and Universal Quantifiers
1. Every atom is a formula
– t is the only free variable (it appears to the left of | ), whereas d is bound by the existential
quantifier.
– Conditions EMPLOYEE(t) and DEPARTMENT(d) specify the range relations
for t and d.
– SELECT condition is d.DNAME = 'Research'
– JOIN condition is d.DNUMBER = t.DNO
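A tuple calculus query with an existential quantifier maps closely onto a Python comprehension with `any()`. The following is a minimal sketch: the query text in the comment is an assumption reconstructed from the conditions listed above (the projected attributes FNAME and LNAME in particular are a guess), the department number attribute is assumed to be called DNUMBER as in the COMPANY schema, and the sample tuples are hypothetical.

```python
# Hypothetical sample tuples; attribute names follow the COMPANY schema.
EMPLOYEE = [
    {"FNAME": "John", "LNAME": "Smith", "DNO": 5},
    {"FNAME": "Alicia", "LNAME": "Zelaya", "DNO": 4},
]
DEPARTMENT = [
    {"DNAME": "Research", "DNUMBER": 5},
    {"DNAME": "Administration", "DNUMBER": 4},
]

# Assumed query:
# {t.FNAME, t.LNAME | EMPLOYEE(t) AND (∃d)(DEPARTMENT(d) AND
#     d.DNAME = 'Research' AND d.DNUMBER = t.DNO)}
result = [
    (t["FNAME"], t["LNAME"])   # t is free: it appears to the left of |
    for t in EMPLOYEE          # range relation of t
    if any(                    # (∃d): d is bound inside any()
        d["DNAME"] == "Research" and d["DNUMBER"] == t["DNO"]
        for d in DEPARTMENT    # range relation of d
    )
]
```

The comprehension states only which tuples qualify, not the order in which relations are scanned or joined, which is the declarative flavour of the calculus.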
General expression:
– {x1, x2, …, xn | COND(x1, x2, …, xn, xn+1, xn+2, …, xn+m)}
where x1, x2, …, xn, xn+1, xn+2, …, xn+m are domain
variables that range over domains (of attributes) and
COND is a condition or formula of the domain
relational calculus
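In the domain calculus each variable stands for a single attribute value rather than a whole tuple, which corresponds to unpacking positional tuples in Python. The following is a minimal sketch with n = 2 output variables and m = 3 further variables; the query, the positional schemas, and the sample data are hypothetical.

```python
# Hypothetical relations as sets of positional tuples:
# EMPLOYEE(FNAME, LNAME, DNO) and DEPARTMENT(DNAME, DNUMBER).
EMPLOYEE = {("John", "Smith", 5), ("Alicia", "Zelaya", 4)}
DEPARTMENT = {("Research", 5), ("Administration", 4)}

# Assumed query:
# {x1, x2 | (∃x3)(∃x4)(∃x5)(EMPLOYEE(x1, x2, x3) AND DEPARTMENT(x4, x5)
#     AND x4 = 'Research' AND x5 = x3)}
result = {
    (x1, x2)                     # output domain variables
    for (x1, x2, x3) in EMPLOYEE
    for (x4, x5) in DEPARTMENT
    if x4 == "Research" and x5 == x3
}
```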