
# Relational Algebra Extra Problems

1. Consider a database with the following schema:

Person ( name, age, gender ) name is a key
Frequents ( name, pizzeria ) (name, pizzeria) is a key
Eats ( name, pizza ) (name, pizza) is a key
Serves ( pizzeria, pizza, price ) (pizzeria, pizza) is a key

Write relational algebra expressions for the following nine queries. (Warning: some of the later
queries are a bit challenging.)

If you know SQL, you can try running SQL queries to match your relational algebra expressions.
We've created a file for download with schema declarations and sample data. (See our quick
guide for SQL system instructions.) To check your queries, the correct results are found in the

a. Find all pizzerias frequented by at least one person under the age of 18.
b. Find the names of all females who eat either mushroom or pepperoni pizza (or both).
c. Find the names of all females who eat both mushroom and pepperoni pizza.
d. Find all pizzerias that serve at least one pizza that Amy eats for less than \$10.00.
e. Find all pizzerias that are frequented by only females or only males.
f. For each person, find all pizzas the person eats that are not served by any pizzeria the
person frequents. Return all such person (name) / pizza pairs.
g. Find the names of all people who frequent only pizzerias serving at least one pizza they eat.
h. Find the names of all people who frequent every pizzeria serving at least one pizza they eat.
i. Find the pizzeria serving the cheapest pepperoni pizza. In the case of ties, return all of the
cheapest-pepperoni pizzerias.

2. Consider a schema with two relations, R(A, B) and S(B, C), where all values are integers.
Make no assumptions about keys. Consider the following three relational algebra expressions:

a. $\pi_{A,C}(R \bowtie \sigma_{B=1} S)$

b. $\pi_A(\sigma_{B=1} R) \times \pi_C(\sigma_{B=1} S)$

c. $\pi_{A,C}(\pi_A R \times \sigma_{B=1} S)$

Two of the three expressions are equivalent (i.e., produce the same answer on all databases),
while one of them can produce a different answer. Which query can produce a different
answer? Give the simplest database instance you can think of where a different answer is
produced.

3. Consider a relation R(A, B) that contains r tuples, and a relation S(B, C) that contains s tuples;
assume r > 0 and s > 0. Make no assumptions about keys. For each of the following relational
algebra expressions, state in terms of r and s the minimum and maximum number of tuples
that could be in the result of the expression.

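a. $R \cup \rho_{S(A,B)} S$

b. $\pi_{A,C}(R \bowtie S)$

c. $\pi_B R - (\pi_B R - \pi_B S)$

d. $(R \bowtie R) \bowtie R$

e. $\sigma_{A>B} R \cup \sigma_{A<B} R$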

4. Two more exotic relational algebra operators we didn't cover are the semijoin and antijoin.
Semijoin is the same as natural join, except only attributes of the first relation are returned in
the result. For example, if we have relations Student(ID, name) and Enrolled(ID, course), and
not all students are enrolled in courses, then the query "Student $\ltimes$ Enrolled" returns the ID and
name of all students who are enrolled in at least one course. In the general case, $E_1 \ltimes E_2$ returns
all tuples in the result of expression $E_1$ such that there is at least one tuple in the result of
$E_2$ with matching values for the shared attributes. Antijoin is the converse: $E_1 \triangleright E_2$ returns all
tuples in the result of expression $E_1$ such that there are no tuples in the result of $E_2$ with
matching values for the shared attributes. For example, the query "Student $\triangleright$ Enrolled" returns
the ID and name of all students who are not enrolled in any courses.

Like some other relational operators (e.g., intersection, natural join), semijoin and antijoin are
abbreviations - they can be defined in terms of other relational operators. Define $E_1 \ltimes E_2$ in
terms of other relational operators. That is, give an equation "$E_1 \ltimes E_2$ = ??", where ?? on the
right-hand side is a relational algebra expression that doesn't use semijoin. Similarly, give an
equation "$E_1 \triangleright E_2$ = ??", where ?? on the right-hand side is a relational algebra expression that
doesn't use antijoin.

5. Consider a relation Temp(regionID, name, high, low) that records historical high and low
temperatures for various regions. Regions have names, but they are identified by regionID,
which is a key. Consider the following query, which uses the linear notation introduced at the
end of the relational algebra videos.

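$T1(rID,h) = \pi_{regionID,high}\,Temp$

$T2(rID,l) = \pi_{regionID,low}\,Temp$

$T3(regionID) = \pi_{rID}(T1 \bowtie_{h<high} Temp)$

$T4(regionID) = \pi_{rID}(T2 \bowtie_{l>low} Temp)$

$T5(regionID) = \pi_{regionID}\,Temp - T3$

$T6(regionID) = \pi_{regionID}\,Temp - T4$

$Result(n) = \pi_{name}(Temp \bowtie (T5 \cup T6))$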

State in English what is computed as the final Result. The answer can be articulated in a single
phrase.

1. Sample solutions; in general there are many correct expressions for each query.

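a. $\pi_{pizzeria}(\sigma_{age<18}(Person) \bowtie Frequents)$

b. $\pi_{name}(\sigma_{gender=\text{'female'} \wedge (pizza=\text{'mushroom'} \vee pizza=\text{'pepperoni'})}(Person \bowtie Eats))$

c. $\pi_{name}(\sigma_{gender=\text{'female'} \wedge pizza=\text{'mushroom'}}(Person \bowtie Eats)) \cap \pi_{name}(\sigma_{gender=\text{'female'} \wedge pizza=\text{'pepperoni'}}(Person \bowtie Eats))$

d. $\pi_{pizzeria}(\sigma_{name=\text{'Amy'}}(Eats) \bowtie \sigma_{price<10}(Serves))$

e. $(\pi_{pizzeria}(\sigma_{gender=\text{'female'}}(Person) \bowtie Frequents) - \pi_{pizzeria}(\sigma_{gender=\text{'male'}}(Person) \bowtie Frequents)) \cup (\pi_{pizzeria}(\sigma_{gender=\text{'male'}}(Person) \bowtie Frequents) - \pi_{pizzeria}(\sigma_{gender=\text{'female'}}(Person) \bowtie Frequents))$

f. $Eats - \pi_{name,pizza}(Frequents \bowtie Serves)$

g. $\pi_{name}(Person) - \pi_{name}(Frequents - \pi_{name,pizzeria}(Eats \bowtie Serves))$

h. $\pi_{name}(Person) - \pi_{name}(\pi_{name,pizzeria}(Eats \bowtie Serves) - Frequents)$

i. $\pi_{pizzeria}(\sigma_{pizza=\text{'pepperoni'}} Serves) - \pi_{pizzeria}(\sigma_{price>price2}(\pi_{pizzeria,price}(\sigma_{pizza=\text{'pepperoni'}} Serves) \times \rho_{pizzeria2,price2}(\pi_{pizzeria,price}(\sigma_{pizza=\text{'pepperoni'}} Serves))))$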

Query results for SQL data:
a. Straw Hat, New York Pizza, Pizza Hut
b. Amy, Fay
c. Amy
d. Little Caesars, Straw Hat, New York Pizza
e. Little Caesars, Chicago Pizza, New York Pizza
f. Amy: mushroom, Dan: mushroom, Gus: mushroom
g. Amy, Ben, Dan, Eli, Fay, Gus, Hil
h. Fay
i. Straw Hat, New York Pizza
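
To compare a relational algebra expression with SQL, one option is to run the SQL form in SQLite. The sketch below does this for solution (f) through Python's `sqlite3` module. The three rows are hypothetical stand-ins chosen for illustration, not the course's sample data, so the output will not match the results listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Frequents(name text, pizzeria text);
create table Eats(name text, pizza text);
create table Serves(pizzeria text, pizza text, price real);
-- Hypothetical rows for illustration only (not the course's sample data).
insert into Frequents values ('Amy', 'Pizza Hut');
insert into Eats values ('Amy', 'mushroom'), ('Amy', 'pepperoni');
insert into Serves values ('Pizza Hut', 'pepperoni', 12.0);
""")

# Solution (f): Eats minus the (name, pizza) projection of Frequents joined with Serves.
query_f = """
select name, pizza from Eats
except
select name, pizza from Frequents natural join Serves
"""
rows_f = conn.execute(query_f).fetchall()
print(rows_f)
```

Set difference maps to `except` and the natural join maps to `natural join`, so the SQL mirrors the algebra term by term.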

2. Query (c) is different. Let R = {(3, 4)} and S = {(1, 2)}. Then queries (a) and (b) produce an
empty result while (c) produces {(3, 2)}.
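
The counterexample can be checked mechanically. The following sketch evaluates the three expressions over Python sets of tuples; the helper name `select_b1` is made up for this illustration.

```python
# R has schema (A, B); S has schema (B, C). Relations are sets of tuples.
R = {(3, 4)}
S = {(1, 2)}

def select_b1(rel, b_index):
    """sigma_{B=1}: keep tuples whose B component equals 1."""
    return {t for t in rel if t[b_index] == 1}

# (a) pi_{A,C}(R join sigma_{B=1} S): natural join on the shared attribute B.
query_a = {(a, c) for (a, b) in R for (b2, c) in select_b1(S, 0) if b == b2}

# (b) pi_A(sigma_{B=1} R) x pi_C(sigma_{B=1} S)
query_b = {(a, c) for (a, _) in select_b1(R, 1) for (_, c) in select_b1(S, 0)}

# (c) pi_{A,C}(pi_A R x sigma_{B=1} S): the B value of R is never tested.
query_c = {(a, c) for (a, _) in R for (_, c) in select_b1(S, 0)}

print(query_a, query_b, query_c)
```

Only (c) pairs an R tuple with an S tuple without requiring B = 1 on the R side, which is why it alone returns a tuple here.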

3. a. Minimum = max(r, s) (if one relation is a subset of the other)
Maximum = r + s (if the relations are disjoint)

b. Minimum = 0 (if there are no shared B values)
Maximum = r x s (if all of the B values are the same)

c. Minimum = 0 (if there are no shared B values)
Maximum = min(r, s)
(if one relation's B values are a subset of the other's, and all B values are distinct)

d. (equivalent to R)
Minimum = r, Maximum = r

e. Minimum = 0 (if A = B in all tuples of R)
Maximum = r (if A <> B in all tuples of R)

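4.
$E_1 \ltimes E_2 = \pi_{schema(E_1)}(E_1 \bowtie E_2)$

$E_1 \triangleright E_2 = E_1 - \pi_{schema(E_1)}(E_1 \bowtie E_2)$
or
$E_1 \triangleright E_2 = E_1 - (E_1 \ltimes E_2)$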
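
As an illustration, these definitions can be written directly in terms of natural join and projection. The sketch below represents a relation as a list of dicts; the function names and the Student/Enrolled rows are assumptions made for the example.

```python
def natural_join(r1, r2):
    """Natural join of two relations given as lists of dicts (attribute -> value)."""
    out = []
    for t1 in r1:
        for t2 in r2:
            shared = t1.keys() & t2.keys()
            if all(t1[a] == t2[a] for a in shared):
                merged = {**t1, **t2}
                if merged not in out:  # keep set semantics
                    out.append(merged)
    return out

def project(rel, attrs):
    """Projection onto attrs, eliminating duplicates."""
    out = []
    for t in rel:
        p = {a: t[a] for a in attrs}
        if p not in out:
            out.append(p)
    return out

def semijoin(e1, schema1, e2):
    # E1 semijoin E2 = projection onto schema(E1) of (E1 natural-join E2)
    return project(natural_join(e1, e2), schema1)

def antijoin(e1, schema1, e2):
    # E1 antijoin E2 = E1 - (E1 semijoin E2)
    matched = semijoin(e1, schema1, e2)
    return [t for t in e1 if t not in matched]

# Hypothetical rows for illustration.
student = [{"ID": 1, "name": "Ann"}, {"ID": 2, "name": "Bob"}]
enrolled = [{"ID": 1, "course": "CS145"}, {"ID": 1, "course": "CS245"}]

enrolled_students = semijoin(student, ["ID", "name"], enrolled)
unenrolled_students = antijoin(student, ["ID", "name"], enrolled)
print(enrolled_students, unenrolled_students)
```

The schema of the first operand is passed explicitly because it cannot be read off an empty relation.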

5. Names of regions with the highest high temperature and/or lowest low temperature

# Relational Design Extra Problems

Functional Dependencies

1. Consider a relation R(A,B,C) and suppose R contains the following four tuples:

A B C
1 2 2
1 3 2
1 4 2
2 5 2

For each of the following functional dependencies, state whether or not the dependency is satisfied by
this relation instance.

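(a) $A \rightarrow B$
(b) $A \rightarrow C$
(c) $B \rightarrow A$
(d) $B \rightarrow C$
(e) $C \rightarrow A$
(f) $C \rightarrow B$
(g) $AB \rightarrow C$
(h) $AC \rightarrow B$
(i) $BC \rightarrow A$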

2. Which of the following rules for functional dependencies are correct (i.e., the rule holds over all
databases) and which are incorrect (i.e., the rule does not hold over some database)? For incorrect
rules, give the simplest example relation instance you can come up with where the rule does not hold.

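(a) If $A \rightarrow B$ and $BC \rightarrow D$, then $AC \rightarrow D$
(b) If $AB \rightarrow C$ then $A \rightarrow C$
(c) If $A \rightarrow B_1,\ldots,B_n$ and $C_1,\ldots,C_m \rightarrow D$ and $\{C_1,\ldots,C_m\}$ is a subset of $\{B_1,\ldots,B_n\}$, then $A \rightarrow D$
(d) If $A \rightarrow C$ and $B \rightarrow C$ and $ABC \rightarrow D$, then $A \rightarrow D$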

3. Consider a relation R(A,B,C,D,E) with the following functional dependencies:

$A \rightarrow B$
$CD \rightarrow E$
$E \rightarrow A$
$B \rightarrow D$

Specify all minimal keys for R.

4. Consider a relation R(A,B,C,D,E,F,G,H) with the following functional dependencies:

$A \rightarrow BCD$
$EFG \rightarrow H$
$F \rightarrow GH$

(a) Based on these functional dependencies, there is one minimal key for R. What is it?
(b) One of the four functional dependencies can be removed without altering the key. Which one?

5. Consider a relation R(A,B,C,D,E,F) with the following set of functional dependencies:

$A \rightarrow C$
$DE \rightarrow F$
$B \rightarrow D$

(a) Based on these functional dependencies, there is one minimal key for R. What is it?

(b) Add to the above set of functional dependencies the dependency $A \rightarrow B$. Now suppose we want A to
be a key. Name one more functional dependency that, if added to the set, makes A a key. As an
additional restriction, the new functional dependency must have only one attribute on the left-hand
side and only one attribute on the right-hand side.

6. Consider the following sets of functional dependencies over a relation R(A,B,C).

F1 = {$A \rightarrow B$, $B \rightarrow C$}
F2 = {$A \rightarrow B$, $A \rightarrow C$}
F3 = {$A \rightarrow B$, $AB \rightarrow C$}

Which of these sets are equivalent? (Two sets of functional dependencies (FDs) F and F' are equivalent if
all FDs in F' follow from the ones in F, and all FDs in F follow from the ones in F'.)

Multivalued Dependencies

7. Consider a relation R(A,B,C) and suppose R contains the following five tuples:

A B C
1 2 3
1 3 2
1 2 2
3 2 1
3 2 3

For each of the following multivalued dependencies, state whether or not the dependency is satisfied by
this relation instance.

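(a) $A \twoheadrightarrow B$
(b) $A \twoheadrightarrow C$
(c) $B \twoheadrightarrow A$
(d) $B \twoheadrightarrow C$
(e) $C \twoheadrightarrow A$
(f) $C \twoheadrightarrow B$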

8. Consider a relation R(A,B,C,D) that satisfies $A \twoheadrightarrow B$ and $B \twoheadrightarrow C$. Suppose R contains the tuples (1,2,3,4)
and (1,5,6,7). What other tuples must also be in R?

9. Which of the following rules for multivalued dependencies are correct (i.e., the rule holds over all
databases) and which are incorrect (i.e., the rule does not hold over some database)? For incorrect
rules, give the simplest example relation instance you can come up with where the rule does not hold.

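(a) If $A \twoheadrightarrow BC$ and $A \twoheadrightarrow CD$ then $A \twoheadrightarrow C$
(b) If $AB \twoheadrightarrow C$ then $A \twoheadrightarrow C$
(c) If $A \twoheadrightarrow BC$ then $A \twoheadrightarrow B$ and $A \twoheadrightarrow C$
(d) If $A \twoheadrightarrow BCD$ and $A \twoheadrightarrow C$ then $A \twoheadrightarrow BD$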

Functional and Multivalued Dependencies

10. The relation R(A,B,C) satisfies an unknown set of functional and multivalued dependencies. All we
know about R is that it allows at least the following two instances:

A B C
1 2 3
1 3 4

A B C
1 3 3
2 2 4
3 3 3

Consider the following possible functional and multivalued dependencies:

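(a) $A \rightarrow B$
(b) $A \rightarrow C$
(c) $B \rightarrow A$
(d) $B \rightarrow C$
(e) $C \rightarrow A$
(f) $C \rightarrow B$
(g) $AB \rightarrow C$
(h) $AC \rightarrow B$
(i) $BC \rightarrow A$
(j) $A \twoheadrightarrow B$
(k) $A \twoheadrightarrow C$
(l) $B \twoheadrightarrow A$
(m) $B \twoheadrightarrow C$
(n) $C \twoheadrightarrow A$
(o) $C \twoheadrightarrow B$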

Which of these dependencies are ruled out by the two instances of R above?

11. Consider the following relational schema:

Car(make, model, year, color, dealer)

Each tuple in relation Car specifies that one or more cars of a particular make, model, and year in a
particular color are available at a particular dealer. For example, the tuple

(Honda, Civic, 2010, Blue, Fred's Friendly Folks)

indicates that 2010 Honda Civics in blue are available at the Fred's Friendly Folks car dealer.

For each of the following English statements, write one nontrivial functional or multivalued dependency
that best captures the statement.

(a) The model name for a car is trademarked by its make, i.e., no two makes can use the same model
name.

(b) Each dealer sells only one model of each make of car.

(c) If a particular make, model, and year of a car is available in a particular color at a particular dealer,
then that color is available at all dealers carrying the same make, model, and year.

(d) Based on your answers for (a)-(c), specify all minimal keys for relation Car.

Functional and Multivalued Dependencies, Normal Forms, Decomposition

12. Consider the following two relational schemas:

Schema 1: R(A,B,C,D)
Schema 2: R1(A,B,C), R2(B,D)

(a) Consider Schema 1 and suppose that the only functional dependencies that hold on the relations in
this schema are $A \rightarrow B$, $C \rightarrow D$, and all dependencies that follow from these. Is Schema 1 in Boyce-Codd
Normal Form (BCNF)?

(b) Consider Schema 2 and suppose that the only functional dependencies that hold on the relations in
this schema are $A \rightarrow B$, $A \rightarrow C$, $B \rightarrow A$, $A \rightarrow D$, and all dependencies that follow from these. Is Schema 2
in BCNF?

(c) Suppose we omit dependency $A \rightarrow D$ from part (b). Is Schema 2 in BCNF?

(d) Consider Schema 1 and suppose that the only functional and multivalued dependencies that hold on
the relations in this schema are A BC, B D, B CD, and all dependencies that follow from these. Is
Schema 1 in Fourth Normal Form (4NF)?

(e) Consider Schema 2 and suppose that the only functional and multivalued dependencies that hold on
the relations in this schema are A BD, D C, A C, B D, and all dependencies that follow from
these. Is Schema 2 in 4NF?

13. Consider a relation R(A,B,C) and suppose R contains the following four tuples:

A B C
1 2 3
1 2 4
5 2 3
5 2 6

(a) Specify all completely nontrivial functional dependencies that hold on this instance of R.

(b) Specify all nontrivial multivalued dependencies that hold on this instance of R. Do not include
multivalued dependencies that are also functional dependencies.

(c) Is this instance of R in Boyce-Codd Normal Form (BCNF) with respect to the dependencies you gave in
part (a)? If not, specify all valid BCNF decompositions.

14. Consider a relation R(A,B,C,D,E) with the following functional dependencies:

AB C
BC D
CD E
DE A

(a) Specify all minimal keys for R.

(b) Which of the given functional dependencies are Boyce-Codd Normal Form (BCNF) violations?

(c) Give a decomposition of R into BCNF based on the given functional dependencies.

(d) Give a different decomposition of R into BCNF based on the given functional dependencies.

15. Consider the following relational schema:

UnivInfo(studID, studName, course, profID, profOffice)

Each tuple in relation UnivInfo encodes the fact that the student with the given ID and name took the
given course from the professor with the given ID and office. Assume that students have unique IDs but
not necessarily unique names, and professors have unique IDs but not necessarily unique offices. Each
student has one name; each professor has one office.

(a) Specify a set of completely nontrivial functional dependencies for relation UnivInfo that encodes the
assumptions described above and no additional assumptions.

(b) Based on your functional dependencies in part (a), specify all minimal keys for relation UnivInfo.

(c) Is UnivInfo in Boyce-Codd Normal Form (BCNF) according to your answers to (a) and (b)? If not, give a
decomposition of UnivInfo into BCNF.

(d) Now add the following two assumptions: (1) No student takes two different courses from the same
professor; (2) No course is taught by more than one professor (but a professor may teach more than one
course). Specify additional functional dependencies to take these new assumptions into account.

(e) Based on your functional dependencies for parts (a) and (d) together, specify all minimal keys for
relation UnivInfo.

(f) Is UnivInfo in BCNF according to your answers to (d) and (e)? If not, give a decomposition
of UnivInfo into BCNF.

16. Consider the following relational schema:

Sale(clerk, store, city, date, item, size, color) // a clerk sold an item on a particular day
Item(item, size, color, price) // prices and available sizes and colors for items

Make the following assumptions, and only these assumptions, about the real world being modeled:

-- Each clerk works in one store.
-- Each store is in one city.
-- A given item always has the same price, regardless of size or color.
-- Each item is available in one or more sizes and one or more colors, and each item is available in all
combinations of sizes and colors for that item.

Sale does not contain duplicates: If a clerk sells more than one of a given item in a given size and color
on a given day, still only one tuple appears in relation Sale to record that fact.

(a) Specify a set of completely nontrivial functional dependencies for relations Sale and Item that
encodes the assumptions described above and no additional assumptions.

(b) Based on your functional dependencies in part (a), specify all minimal keys for relations Sale and Item.

(c) Is the schema in Boyce-Codd Normal Form (BCNF) according to your answers to (a) and (b)? If not,
give a decomposition into BCNF.

(d) Now consider your decomposed relations from part (c), or the original relations if you did not need
to decompose them for part (c). Specify a set of nontrivial multivalued dependencies for
relations Sale and Item that encodes the assumptions described above and no additional assumptions.
Do not include multivalued dependencies that also are functional dependencies.

(e) Are the relations you used in part (d) in Fourth Normal Form (4NF) according to your answers for (a)-
(d)? If not, give a decomposition into 4NF.

1.
(a) not satisfied
(b) satisfied
(c) satisfied
(d) satisfied
(e) not satisfied
(f) not satisfied
(g) satisfied
(h) not satisfied
(i) satisfied
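
These answers can be verified with a short check: an instance satisfies X → Y when no two tuples agree on X but differ on Y. The function name `satisfies_fd` is an assumption for this sketch.

```python
def satisfies_fd(rows, lhs, rhs):
    """True if every pair of tuples that agrees on lhs also agrees on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for each lhs value; any mismatch violates the FD.
        if seen.setdefault(key, val) != val:
            return False
    return True

# The four tuples of R(A,B,C) from problem 1.
rows = [dict(zip("ABC", t)) for t in [(1, 2, 2), (1, 3, 2), (1, 4, 2), (2, 5, 2)]]

# Dependencies (a) through (i), written as (left-hand side, right-hand side).
fds = [("A", "B"), ("A", "C"), ("B", "A"), ("B", "C"), ("C", "A"),
       ("C", "B"), ("AB", "C"), ("AC", "B"), ("BC", "A")]
results = [satisfies_fd(rows, lhs, rhs) for lhs, rhs in fds]
print(results)
```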

2.
(a) is correct
(b) is incorrect - R(A,B,C) with R={(1,2,3),(1,4,5)}
(c) is correct
(d) is incorrect - R(A,B,C,D) with R={(1,2,3,4),(1,5,3,6)}

3. AC, BC, CD, CE
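
Minimal keys for a small schema can be found by brute force: compute the attribute closure of each candidate set, smallest sets first, and skip any set that contains a key already found. The following is a sketch under that approach; `closure` and `minimal_keys` are names chosen for the example.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def minimal_keys(schema, fds):
    """All minimal keys, found by trying attribute sets in order of increasing size."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # a superset of a known key is not minimal
            if closure(combo, fds) == set(schema):
                keys.append(combo)
    return ["".join(k) for k in keys]

# Problem 3: R(A,B,C,D,E) with A -> B, CD -> E, E -> A, B -> D.
fds = [("A", "B"), ("CD", "E"), ("E", "A"), ("B", "D")]
keys = minimal_keys("ABCDE", fds)
print(keys)
```

The search is exponential in the number of attributes, which is acceptable for the five-to-eight attribute schemas in these problems.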

4.
(a) AF
(b) $EFG \rightarrow H$

5.
(a) ABE
(b) $A \rightarrow E$ (or $B \rightarrow E$ or $C \rightarrow E$ or $D \rightarrow E$)

6.
F2 and F3 are equivalent - A* = ABC, B* = B, C* = C
F1 is not equivalent - A* = ABC, B* = BC, C* = C
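
The closure comparison above can be automated: F and F' are equivalent when each FD of one set follows from the other, and an FD X → Y follows from F exactly when Y is contained in the closure of X under F. A minimal sketch, with function names chosen for the example:

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def covers(f, g):
    """True if every FD in g follows from the FDs in f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)

def equivalent(f, g):
    return covers(f, g) and covers(g, f)

F1 = [("A", "B"), ("B", "C")]
F2 = [("A", "B"), ("A", "C")]
F3 = [("A", "B"), ("AB", "C")]

print(equivalent(F2, F3), equivalent(F1, F2), equivalent(F1, F3))
```

F1 fails the comparison because it implies B → C, which neither F2 nor F3 does.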

7.
(a) not satisfied
(b) not satisfied
(c) not satisfied
(d) not satisfied
(e) satisfied
(f) satisfied
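
A multivalued dependency X ↠ Y can be tested straight from its definition: for every pair of tuples t and u that agree on X, the tuple that takes X and Y from t and the remaining attributes from u must also be in the relation. The sketch below applies that test to the five tuples of problem 7; the function name is an assumption.

```python
def satisfies_mvd(rows, schema, lhs, rhs):
    """Check X ->> Y by the swap definition over a set of tuples."""
    rel = set(rows)
    idx = {a: i for i, a in enumerate(schema)}
    x = [idx[a] for a in lhs]
    y = [idx[a] for a in rhs]
    for t in rel:
        for u in rel:
            if all(t[i] == u[i] for i in x):
                # X and Y values come from t, everything else from u.
                v = tuple(t[i] if i in x or i in y else u[i]
                          for i in range(len(schema)))
                if v not in rel:
                    return False
    return True

rows = [(1, 2, 3), (1, 3, 2), (1, 2, 2), (3, 2, 1), (3, 2, 3)]

# Dependencies (a) through (f), written as (left-hand side, right-hand side).
mvds = [("A", "B"), ("A", "C"), ("B", "A"), ("B", "C"), ("C", "A"), ("C", "B")]
results = [satisfies_mvd(rows, "ABC", lhs, rhs) for lhs, rhs in mvds]
print(results)
```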

8. (1,2,6,7) (1,5,3,4) (1,2,3,7) (1,2,6,4) (1,5,6,4) (1,5,3,7)
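
One way to arrive at this answer is to keep adding the tuples that each multivalued dependency forces until nothing new appears. The following sketch does that for the two starting tuples; `close_under_mvds` is a name made up for the example.

```python
def close_under_mvds(rows, schema, mvds):
    """Repeatedly add the tuples required by each X ->> Y until a fixpoint is reached."""
    rel = set(rows)
    idx = {a: i for i, a in enumerate(schema)}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in mvds:
            x = [idx[a] for a in lhs]
            keep = {idx[a] for a in lhs + rhs}  # positions taken from t
            for t in list(rel):
                for u in list(rel):
                    if all(t[i] == u[i] for i in x):
                        v = tuple(t[i] if i in keep else u[i]
                                  for i in range(len(schema)))
                        if v not in rel:
                            rel.add(v)
                            changed = True
    return rel

start = {(1, 2, 3, 4), (1, 5, 6, 7)}
full = close_under_mvds(start, "ABCD", [("A", "B"), ("B", "C")])
extra = full - start
print(sorted(extra))
```

The loop terminates because new tuples only recombine values already present, so the relation can grow to at most the cross product of the per-attribute value sets.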

9.
(a) correct
(b) incorrect - R(A,B,C,D) with R={(1,2,3,4),(1,5,6,7)}
(c) incorrect - R(A,B,C,D) with R={(1,2,3,4),(1,5,6,7),(1,2,3,7),(1,5,6,4)}
(d) correct

10. Ruled out - (a) (b) (c) (e) (i) (j) (k)

11.
(a) model $\rightarrow$ make
(b) dealer,make $\rightarrow$ model
(c) make,model,year $\twoheadrightarrow$ color (or make,model,year $\twoheadrightarrow$ dealer)
(d) (make,year,color,dealer), (model,year,color,dealer)

12.
(a) No - neither A nor C is a key, so $A \rightarrow B$ and $C \rightarrow D$ are BCNF violations
(b) Yes
(c) Yes
(d) No - B is not a key, so B C is a 4NF violation
(e) Yes

13.
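(a) $A \rightarrow B$, $C \rightarrow B$, $AC \rightarrow B$
(b) $A \twoheadrightarrow C$, $C \twoheadrightarrow A$
(c) No - A is not a key in $A \rightarrow B$ and C is not a key in $C \rightarrow B$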
Decomposition 1: R1(A,B), R2(A,C)
Decomposition 2: R1(A,C), R2(B,C)

14.
(a) AB, BC, BDE
(b) $CD \rightarrow E$, $DE \rightarrow A$
(c) R1(C,D,E) R2(A,B,C,D)
(d) R1(A,D,E) R2(C,D,E), R3(B,C,D)

15.
(a) studID $\rightarrow$ studName, profID $\rightarrow$ profOffice
(b) (studID,profID,course)
(c) No - neither studID nor profID is a key
Decomposition: R1(studID,studName), R2(profID,profOffice), R3(studID,course,profID)
(d) studID,profID $\rightarrow$ course, course $\rightarrow$ profID
(e) (studID,profID) (studID,course)
(f) No - studID, profID, and course are not keys
Decomposition: R1(studID,studName), R2(profID,profOffice), R3(studID,course), R4(course,profID)

16.
(a) Sale: clerk $\rightarrow$ store, store $\rightarrow$ city; Item: item $\rightarrow$ price
(b) Sale: (clerk,date,item,size,color), Item: (item,size,color)
(c) No. BCNF decomposition: S1(clerk,store), S2(store,city), S3(clerk,date,item,size,color), I1(item,price), I2(item,size,color)
(d) Sale: none; Item: item $\twoheadrightarrow$ size, item $\twoheadrightarrow$ color (Also item $\twoheadrightarrow$ size,price, item $\twoheadrightarrow$ color,price)
(e) Replace I2 with I3(item,size), I4(item,color)

# UML Extra Problems

1. Consider the following UML diagram.

(a) If there are 6 authors, what's the minimum and maximum number of books? What's the
minimum and maximum number of readers?
(b) If there are 6 readers, what's the minimum and maximum number of books? What's the
minimum and maximum number of authors?

2. Consider a tiny social network containing high school students and their "crushes" (desired
romantic relationships). Each student may have a crush on at most one other student, and
associated with each crush is the length of time the crush has been going on. Students have a
name and a grade, and names are unique. Draw a UML diagram that models this information.
Make sure to capture the asymmetry and multiplicity of the crush relationship.

3. Consider a class Book with four subclasses: Anthology, Fiction, Children, and Nonfiction. Is
the subclassing relationship overlapping or disjoint (exclusive)? Is it complete or incomplete
(partial)?

4. Consider the following UML diagram.

Separate the following statements into those that are true and those that are false.

(a) No two companies can have the same name
(b) No two employees can have the same name
(c) No two companies can be at the same address
(d) No two employees can work at the same address
(e) Each employee works for at least one company
(f) No employees work for more than one company
(g) Each company has at least one employee
(h) Two employees with the same name cannot work for the same company
(i) Two employees with the same name cannot work for different companies

5. Consider the following UML diagram.

(a) According to the diagram, what are the minimum and maximum total number of instructors
for a given course?

(b) According to the diagram, what is the minimum and maximum teaching load (number of
courses) for professors? For assistants?

(c) Translate the UML diagram to a relational schema. There are several possible automatic
translations; use the translation for subclassing most appropriate for the specified properties as
described in the video. If it makes sense to eliminate any association-class relations as
described in the video, do so.

(d) Specify a minimal key for each relation in your solution to part (c).

(e) Suppose by default attribute values cannot contain null. Does your solution to part (c)
require any attributes to permit null values?

1.
(a) Books: minimum 3, maximum 18; Readers: minimum 0, maximum 72
(b) Books: minimum 2, maximum unlimited; Authors: minimum 1, maximum unlimited

2.

3. Overlapping (e.g., Fiction and Children) and complete (all books
are Fiction or Nonfiction).

4. (a),(e),(f) are true; the rest are false

5.
(a) minimum: 1, maximum: 5
(b) Professor minimum/maximum: 0/1, Assistant minimum/maximum: 3/6
(c) Professor(name,office,rank,course#,rating)
Assistant(name,office,years)
Course(course#,dept)
ATeach(name,course#)
(d) Professor: name
Assistant: name
Course: course#
ATeach: (name,course#)
(e) Professor.course# and Professor.rating must permit nulls

# Constraints Movie-Ratings Extra Problems

You will enhance the movie-ratings database that was also used for the SQL Movie-Ratings Query
Exercises. In this set of exercises you will declare integrity constraints on the data, and you will verify
that they are being enforced by the underlying database management system. You will experiment with
several types of constraints: key constraints, non-null constraints, attribute-based and tuple-based check
constraints, and referential integrity. A SQL file to set up the original schema and data for the movie-
ratings database is downloadable here. You will be using the same data, but modifying the schema to
add constraints. The original schema and data can be loaded as specified in the file into SQLite, MySQL,
or PostgreSQL. However, currently MySQL does not enforce constraints (even though it accepts some of
them syntactically). For these exercises, currently you must use SQLite or PostgreSQL. See our quick
guide for installing and using all three systems.

Schema:
Movie ( mID, title, year, director )
English: There is a movie with ID number mID, a title, a release year, and a director.

Reviewer ( rID, name )
English: The reviewer with ID number rID has a certain name.

Rating ( rID, mID, stars, ratingDate )
English: The reviewer rID gave the movie mID a number of stars rating (1-5) on a certain ratingDate.

Unlike most of our other exercises, which are a set of queries to be written individually, this exercise set
involves bigger chunks of work followed by a series of tests. If the constraints are implemented
correctly, the tests will generate or not generate errors as specified. To verify that the referential
integrity policies are implemented correctly, there is a check of the final database state.

Task 1: Constraint Declarations

Modify the three CREATE TABLE statements in the movie-rating database to add the following ten
constraints. (Note: You may want to examine the date format in the data file so you can specify date-
related constraints as string comparisons.)

Key Constraints

1. mID is a key for Movie
2. (title,year) is a key for Movie
3. rID is a key for Reviewer
4. (rID,mID,ratingDate) is a key for Rating but with null values allowed

Non-Null Constraints

5. Reviewer.name may not be NULL
6. Rating.stars may not be NULL

Attribute-Based Check Constraints

7. Movie.year must be after 1900
8. Rating.stars must be in {1,2,3,4,5}
9. Rating.ratingDate must be after 2000

Tuple-Based Check Constraints

10. "Steven Spielberg" movies must be before 1990 and "James Cameron" movies must be after 1990

After creating the three tables using your modified CREATE TABLE statements, you should be able to
load the original data (i.e., execute all of the INSERT statements in the data file) without any errors.
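
As a rough illustration of how such declarations look in SQLite, the sketch below covers only the Movie table (constraints 1, 2, 7, and 10) and tries three of the test commands from this exercise set. It is one possible formulation, not the reference solution, and the single inserted row is a stand-in rather than the course's data file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
create table Movie(
  mID int primary key,                                    -- constraint 1
  title text,
  year int check (year > 1900),                           -- constraint 7
  director text,
  unique (title, year),                                   -- constraint 2
  check (director <> 'Steven Spielberg' or year < 1990),  -- constraint 10
  check (director <> 'James Cameron' or year > 1990)
)
""")
# Stand-in row for illustration only.
conn.execute("insert into Movie values (105, 'Titanic', 1997, 'James Cameron')")

def rejected(sql):
    """Run a statement and report whether a constraint blocked it."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

outcomes = {
    12: rejected("insert into Movie values (109, 'Titanic', 1997, 'JC')"),
    20: rejected("insert into Movie values (109, 'Jurassic Park', 1993, 'Steven Spielberg')"),
    22: rejected("insert into Movie values (109, 'Titanic', 2001, null)"),
}
print(outcomes)
```

Command 22 is accepted even though its director is null: a check condition that evaluates to unknown is not treated as a violation.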

Task 3: Constraint Enforcement

Each of the following commands should generate an error.

11. update Movie set mID = mID + 1;

12. insert into Movie values (109, 'Titanic', 1997, 'JC');

13. insert into Reviewer values (201, 'Ted Codd');

14. update Rating set rID = 205, mID = 104;

15. insert into Reviewer values (209, null);

16. update Rating set stars = null where rID = 208;

17. update Movie set year = year - 40;

18. update Rating set stars = stars + 1;

19. insert into Rating values (201, 101, 1, '1999-01-01');

20. insert into Movie values (109, 'Jurassic Park', 1993, 'Steven Spielberg');

21. update Movie set year = year-10 where title = 'Titanic';

None of the following commands should generate errors.

22. insert into Movie values (109, 'Titanic', 2001, null);

23. update Rating set mID = 109;

24. update Movie set year = 1901 where director <> 'James Cameron';

25. update Rating set stars = stars - 1;

Task 4: Referential Integrity Declarations

Further modify one or more of your CREATE TABLE statements to include the following referential
integrity constraints and policies.

26. Referential integrity from Rating.rID to Reviewer.rID
Reviewers deleted: set null
All others: error

26. Referential integrity from Rating.mID to Movie.mID
All others: error

Recreate the three tables using your modified CREATE TABLE statements. You should be able to load the
original data (i.e., execute all of the INSERT statements in the data file) without any errors.
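
For orientation, here is a small sketch of a referential-integrity declaration in SQLite, run through Python's `sqlite3` module. It shows only the "Reviewers deleted: set null" policy on Rating.rID; the remaining policies and the Movie reference are left out, the tables carry no other constraints, and the rows are stand-ins. SQLite enforces foreign keys only after `pragma foreign_keys = on` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # enforcement is off by default in SQLite
conn.executescript("""
create table Reviewer(rID int primary key, name text not null);
create table Rating(
  rID int references Reviewer(rID) on delete set null,
  mID int,
  stars int,
  ratingDate date
);
-- Stand-in rows for illustration only.
insert into Reviewer values (201, 'Reviewer A');
insert into Rating values (201, 101, 2, '2011-01-22');
""")

# Inserting a rating for a reviewer that does not exist should be refused.
try:
    conn.execute("insert into Rating values (209, 101, 3, '2001-01-01')")
    dangling_insert_rejected = False
except sqlite3.IntegrityError:
    dangling_insert_rejected = True

# Deleting the reviewer should null out the referencing rID instead of failing.
conn.execute("delete from Reviewer where rID = 201")
remaining = conn.execute("select rID, mID from Rating").fetchall()
print(dangling_insert_rejected, remaining)
```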

Task 6: Referential Integrity Enforcement

Each of the following commands should generate an error.

Important Note: If using SQLite, make sure to turn on referential integrity checking with the command
"pragma foreign_keys = on;"

27. insert into Rating values (209, 109, 3, '2001-01-01');

28. update Rating set rID = 209 where rID = 208;

29. update Rating set mID = mID + 1;

30. update Movie set mID = 109 where mID = 108;

None of the following commands should generate errors, but they will make additional database
modifications according to the referential-integrity policies.

31. update Movie set mID = 109 where mID = 102;

32. update Reviewer set rID = rID + 10;

33. delete from Reviewer where rID > 215;

34. delete from Movie where mID < 105;

Final Check

35. Check the resulting database by writing SQL queries to compute:
(a) The sum of non-null rIDs in the Rating table -- should be 853
(b) The number of tuples in Rating with null rIDs -- should be 3

# Triggers Social-Network Extra Problems

You will enhance the social-network database that was also used for the SQL Social-Network Query
Exercises. In this set of exercises you will create triggers that add various behaviors to the data, and you
will verify that the triggers are enforcing the desired behavior. You will implement similar triggers to
those used in the SQL Social-Network Triggers Exercises, but these exercises explore in more depth the
interaction among multiple triggers, and trigger behavior with and without recursive triggering enabled.
A SQL file to set up the schema and data for the social-network database is downloadable here. The
schema and data can be loaded as specified in the file into SQLite, MySQL, or PostgreSQL. However,
currently only SQLite and PostgreSQL provide a rich enough trigger language for these exercises.
Furthermore since the current mechanisms for creating triggers in PostgreSQL are quite cumbersome,
we recommend SQLite. See our quick guide for installing and using all three systems.

Schema:
Highschooler ( ID, name, grade )
English: There is a high school student with unique ID and a given first name in a certain grade.

Friend ( ID1, ID2 )
English: The student with ID1 is friends with the student with ID2. Friendship is mutual, so if (123, 456) is
in the Friend table, so is (456, 123).

Likes ( ID1, ID2 )
English: The student with ID1 likes the student with ID2. Liking someone is not necessarily mutual, so if
(123, 456) is in the Likes table, there is no guarantee that (456, 123) is also present.

For your convenience, here is a graph showing the various connections between the people in our
database. 9th graders are blue, 10th graders are green, 11th graders are yellow, and 12th graders are
purple. Undirected black edges indicate friendships, and directed red edges indicate that one person
likes another person.

After each exercise to create a trigger or set of triggers, we include one or more data modification
statements that should activate the triggers, followed by one or more queries to check that the final
database state is correct. The query results over the correct final database for each exercise can be
viewed by pressing the button at the bottom of the page.

Although the exercises are presented as a sequence, not all of them depend on each other. Specifically:
-- Exercises 1-2 can be worked together from the original database, independently of the other
exercises. If performed correctly, these two exercises leave the database in its original state.
-- Exercise 3 can be worked from the original database, independently of the other exercises. If
performed correctly, the command specified at the end of the exercise will return the database to its
original state.
-- Exercises 4-6 can be worked together from the original database, independently of the other
exercises. These three exercises do not leave the database in its original state.

1. Write triggers to maintain symmetry in friend relationships: If (A,B) is in Friend then (B,A) should be
too. The initial database obeys this constraint; your triggers should monitor insertions, deletions, and
updates and perform corresponding modifications to maintain symmetry. Make sure not to create
duplicate tuples in Friend.

Begin with recursive triggering disabled ("pragma recursive_triggers = off;" in SQLite). Run the following
statements, which remove the friendship between Brittany and Kris, add a friendship between Brittany
and Andrew, and change Jessica to be friends with Austin and Andrew instead of Alexis and Kyle.
pragma recursive_triggers=off;
delete from Friend where ID1 = 1641 and ID2 = 1468;
insert into Friend values (1641,1782);
update Friend set ID2 = 1316 where ID1 = 1501 and ID2 = 1247;
update Friend set ID2 = 1782 where ID1 = 1501 and ID2 = 1934;
Check the resulting database by computing the number of tuples in the Friend table. Compare your

2. Continuing with the previous problem, now make sure your solution also works with recursive
triggering enabled ("pragma recursive_triggers = on;"). Run the following statements, which undo the
previous changes -- remove the friendship between Brittany and Andrew, add a friendship between
Brittany and Kris, and change Jessica to be friends with Alexis and Kyle instead of Austin and
Andrew. Hint: If you get the error "too many levels of recursion," your triggers are probably inserting
duplicate tuples.
pragma recursive_triggers = on;
delete from Friend where ID1 = 1641 and ID2 = 1782;
insert into Friend values (1641,1468);
update Friend set ID2 = 1247 where ID1 = 1501 and ID2 = 1316;
update Friend set ID2 = 1934 where ID1 = 1501 and ID2 = 1782;
Check the resulting database by writing a SQL query to compute the number of tuples in the Friend table.
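
To make the role of the duplicate check concrete, here is a partial sketch of symmetry triggers for insertions and deletions, run with recursive triggering enabled. The trigger names are invented for the example, updates are not handled, and the table starts empty rather than with the course data, so this is an illustration of the idea rather than a full solution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma recursive_triggers = on")
conn.executescript("""
create table Friend(ID1 int, ID2 int);

-- Mirror an inserted pair unless the reverse pair is already present.
create trigger FriendInsert after insert on Friend
for each row
when not exists (select * from Friend where ID1 = New.ID2 and ID2 = New.ID1)
begin
  insert into Friend values (New.ID2, New.ID1);
end;

-- Remove the reverse pair when a pair is deleted.
create trigger FriendDelete after delete on Friend
for each row
begin
  delete from Friend where ID1 = Old.ID2 and ID2 = Old.ID1;
end;
""")

conn.execute("insert into Friend values (1641, 1782)")
after_insert = sorted(conn.execute("select * from Friend").fetchall())

conn.execute("delete from Friend where ID1 = 1782 and ID2 = 1641")
after_delete = conn.execute("select count(*) from Friend").fetchone()[0]
print(after_insert, after_delete)
```

The `when not exists` guard is what stops the recursion: the mirrored insert fires the trigger again, but by then the reverse pair exists, so no further insert happens.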

3. Write triggers to manage the grade attribute of new highschoolers. If the inserted tuple has a non-
null value for grade, don't permit the insert unless the grade is between 9 and 12. If the inserted tuple
has a null value for grade, change it to 9.

Run the following statements to insert new highschoolers. To be on the safe side, disable recursive
triggering ("pragma recursive_triggers = off;").
pragma recursive_triggers=off;
insert into HighSchooler values (2121, 'Caitlin', 7);
insert into HighSchooler values (2121, 'Caitlin', 20);
insert into HighSchooler values (2121, 'Caitlin', null);
insert into Highschooler select ID+1000, name, grade+1 from Highschooler;
Check the resulting database by writing one or more SQL queries to compute:
(a) The number of tuples in the Highschooler table
(b) The average grade level in the Highschooler table

Before proceeding to Exercise 4, delete the new Highschooler tuples to bring the database back to its
original state:
delete from Highschooler where ID > 2000;

4. Write a trigger that automatically deletes students when they graduate, i.e., when their grade is
updated to exceed 12. Additionally, write a trigger or triggers that remove all friendships and likes
relationships of deleted students.
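
The two rules can be sketched as a pair of cascading triggers. The names `Graduation` and `RemoveRelationships`, the helper `promote_everyone`, and the two-student data set are hypothetical; this is an illustration under those assumptions rather than the intended solution.

```python
import sqlite3

GRADUATION_TRIGGERS = """
create table Highschooler (ID int, name text, grade int);
create table Friend (ID1 int, ID2 int);
create table Likes (ID1 int, ID2 int);

create trigger Graduation after update of grade on Highschooler
for each row
when New.grade > 12
begin
  delete from Highschooler where ID = New.ID;
end;

create trigger RemoveRelationships after delete on Highschooler
for each row
begin
  delete from Friend where ID1 = Old.ID or ID2 = Old.ID;
  delete from Likes where ID1 = Old.ID or ID2 = Old.ID;
end;
"""

def promote_everyone():
    """Move two hypothetical students up a grade and report what is left."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(GRADUATION_TRIGGERS)
    conn.executemany("insert into Highschooler values (?, ?, ?)",
                     [(1, 'Ann', 12), (2, 'Bob', 10)])
    conn.executemany("insert into Friend values (?, ?)", [(1, 2), (2, 1)])
    conn.execute("insert into Likes values (2, 1)")
    conn.execute("update Highschooler set grade = grade + 1")
    counts = tuple(conn.execute("select count(*) from " + table).fetchone()[0]
                   for table in ("Highschooler", "Friend", "Likes"))
    remaining = conn.execute("select name, grade from Highschooler").fetchall()
    conn.close()
    return counts, remaining
```

Here the 12th grader is deleted by `Graduation`, and that deletion in turn fires `RemoveRelationships`, which clears the friendship and the like that mentioned the deleted student.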

Run the following statement to move Austin, Kyle, and Logan up a grade. To be on the safe side, disable
recursive triggering ("pragma recursive_triggers = off;").
update Highschooler set grade = grade + 1
where name = 'Austin' or name = 'Kyle' or name = 'Logan';
Check the resulting database by writing SQL queries to compute:
(a) The number of highschoolers remaining
(b) The number of Friend relationships (including symmetric ones)
(c) The number of Likes relationships

5. Write a trigger so when a student is moved ahead one grade, then so are all of his or her friends.
Your trigger from problem 4 should delete those students who "graduate" as a result.

Make sure recursive triggering is disabled ("pragma recursive_triggers = off;"), then run the following
statement to move Andrew to 11th grade.
pragma recursive_triggers=off;
update Highschooler set grade = 11 where name = 'Andrew';
Check the resulting database by writing one or more SQL queries to compute:
(a) The number of tuples in the Highschooler table
(b) The average grade level in the Highschooler table

Now run the following statements to move Tiffany to 11th grade and Jessica to 9th grade.
pragma recursive_triggers=off;
update Highschooler set grade = 11 where name = 'Tiffany';
update Highschooler set grade = 9 where name = 'Jessica';
Check the resulting database by writing one or more SQL queries to compute:
(c) The number of tuples in the Highschooler table
(d) The average grade level in the Highschooler table

6. Continuing with the previous problem, now explore what happens when recursive triggering is
enabled ("pragma recursive_triggers = on;"). Run the following statement to move Cassandra to 10th
grade. Hint: Did you get the error "too many levels of recursion"? If so, you may need to add a condition
to ensure updates don't continue indefinitely, preventing the graduation-delete rule from being
activated.
pragma recursive_triggers = on;
update Highschooler set grade = 10 where name = 'Cassandra';
Check the resulting database by writing one or more SQL queries to compute:
(a) The names of all remaining highschoolers
(b) The number of Friend relationships (including symmetric ones)
(c) The number of Likes relationships

1. 40
2. 40
3. (a) 30 (b) 10.6333

4. (a) 14 (b) 30 (c) 8

5. (a) 12 (b) 10.333 (c) 12 (d) 10.333

6. (a) John (b) 0 (c) 0

# Transactions Extra Problems

1. Consider two tables R(A,B) and S(C). Below are pairs of transactions. For each pair, decide whether it
is possible for nonserializable behavior to be exhibited when executing the transactions concurrently,
while respecting their specified isolation levels. Assume individual statements are executed atomically,
and each transaction executes to completion.

(a) Transaction 1:
Set Transaction Isolation Level Read Committed;
Select count(*) From R;
Select count(*) From S;
Commit;
Transaction 2:
Set Transaction Isolation Level Serializable;
Insert Into R Values (1,2);
Insert Into S Values (3);
Commit;

(b) Transaction 1:
Set Transaction Isolation Level Read Committed;
Select count(*) From R;
Select count(*) From S;
Commit;
Transaction 2:
Set Transaction Isolation Level Serializable;
Insert Into R Values (1,2);
Insert Into R Values (3,4);
Commit;

(c) Transaction 1:
Set Transaction Isolation Level Repeatable Read;
Select count(*) From R;
Select count(*) From S;
Select count(*) From R;
Commit;
Transaction 2:
Set Transaction Isolation Level Serializable;
Insert Into R Values (1,2);
Commit;

2. Consider table Item(name,price) where name is a key, and the following two concurrent transactions.

T1:
Begin Transaction;
S1: Insert Into Item Values ('scissors',40);
S2: Update Item Set price = price + 30 Where name = 'pencil';
Commit;
T2:
Begin Transaction;
S3: Select Avg(price) As a1 From Item;
S4: Select Avg(price) As a2 From Item;
Commit;
Assume that the individual statements S1, S2, S3, and S4 always execute atomically. Suppose initially
there are two tuples in Item: (pencil,20) and (pen,30). Each transaction runs once and commits.
Transaction T1 always executes with isolation level Serializable.

(a) If transaction T2 executes with isolation level Serializable, what possible pairs of values a1 and a2 are
returned by T2?

(b) If transaction T2 executes with isolation level Repeatable-Read, what possible pairs of values a1 and
a2 are returned by T2?

(c) If transaction T2 executes with isolation level Read-Committed, what possible pairs of values a1 and
a2 are returned by T2?

(d) If transaction T2 executes with isolation level Read-Uncommitted, what possible pairs of values a1
and a2 are returned by T2?

3. Consider table Person(name,age) and the following transaction T:
Begin Transaction;
Q1: Select Avg(age) From Person;

Q2: Select Avg(age) From Person;
Commit;
Assume queries Q1 and Q2 always execute atomically.

(a) Suppose all other transactions in the system are declared as Serializable and Read-Only. What is the
weakest isolation level needed for transaction T to ensure that queries Q1 and Q2 will always get the same result? Choices are: Read-Uncommitted, Read-Committed, Repeatable-Read, Serializable.

(b) Suppose all other transactions in the system are declared as Serializable, and they only involve
queries, updates, and deletions. What is the weakest isolation level needed for transaction T to ensure that
queries Q1 and Q2 will always get the same result? Choices are: Read-Uncommitted, Read-Committed, Repeatable-Read, Serializable.

(c) Suppose all other transactions in the system are declared as Serializable, and we know nothing else
about them. What is the weakest isolation level needed for transaction T to ensure that queries Q1 and Q2
will always get the same result? Choices are: Read-Uncommitted, Read-Committed, Repeatable-Read, Serializable.

Now consider the following variation, where the two queries are in two different transactions:

T1:
Begin Transaction;
Q1: Select Avg(age) From Person;

Commit;
T2:
Begin Transaction;
Q2: Select Avg(age) From Person;

Commit;
(d) Suppose both transactions T1 and T2 execute with isolation level Serializable. Consider scenarios (a),
(b), and (c) above for all other transactions in the system. For which of these scenarios, if any, are we
guaranteed to always get the same result for Q1 and Q2?

4. Consider table Worker(name,pay) where name is a key, and the following two concurrent transactions.

T1:
Begin Transaction
S1: update Worker set pay = 2*pay where name = 'Amy'
S2: update Worker set pay = 3*pay where name = 'Amy'
Commit
T2:
Begin Transaction
S3: update Worker set pay = pay-20 where name = 'Amy'
S4: update Worker set pay = pay-10 where name = 'Amy'
Commit
Assume that the individual statements S1, S2, S3, and S4 always execute atomically. Let Amy's pay be 50
before either transaction begins execution.

(a) Suppose both transactions T1 and T2 execute to completion with isolation level Serializable. What are
the possible values for Amy's final pay?

(b) Suppose both transactions T1 and T2 execute to completion with isolation level Read-Committed.
What are the possible values for Amy's final pay?

(c) Suppose transaction T1 executes with isolation level Read-Committed, transaction T2 executes with
isolation level Read-Uncommitted, and both transactions execute to completion. What are the possible
values for Amy's final pay?

(d) Suppose both transactions T1 and T2 execute to completion with isolation level Read-Uncommitted.
What are the possible values for Amy's final pay?

(e) Suppose both transactions T1 and T2 execute with isolation level Serializable. Transaction T1
executes to completion, but transaction T2 rolls back after statement S3 and does not re-execute. What are
the possible values for Amy's final pay?

1.
(a) Yes, nonserializable behavior is possible (two statements of Transaction 1 execute before and after
Transaction 2, respectively)
(b) No, nonserializable behavior is not possible (state of S is same before and after Transaction 2)
(c) Yes, nonserializable behavior is possible (first and third statements of Transaction 1 execute before
and after Transaction 2, respectively)

2.
(a) (25,25) (40,40)
(b) (25,25) (40,40)
(c) (25,25) (25,40) (40,40)
(d) (25,25) (25,30) (25,40) (30,30) (30,40) (40,40)
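
The three averages that appear in these pairs (25, 30, and 40) correspond to the three states Item passes through: before T1, after S1 only, and after both S1 and S2. The sketch below recomputes them with Python's `sqlite3` module; the function name `item_averages` is an assumption, and the code only replays the statements in order without modelling isolation levels.

```python
import sqlite3

def item_averages():
    """Return Avg(price) before T1, after S1 only, and after S1 and S2."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table Item (name text primary key, price int)")
    conn.executemany("insert into Item values (?, ?)",
                     [("pencil", 20), ("pen", 30)])
    query = "select avg(price) from Item"
    before = conn.execute(query).fetchone()[0]
    conn.execute("insert into Item values ('scissors', 40)")                  # S1
    after_s1 = conn.execute(query).fetchone()[0]
    conn.execute("update Item set price = price + 30 where name = 'pencil'")  # S2
    after_s2 = conn.execute(query).fetchone()[0]
    conn.close()
    return before, after_s1, after_s2
```

The middle value is the state in which T1 has executed S1 but not yet committed, which is why 30 shows up only in the Read-Uncommitted answer.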

3.
(c) Serializable
(d) Only (a)

4.
(a) 120 270
(b) 120 270
(c) 120 210 270
(d) 120 150 170 210 230 270
(e) 300
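
The values listed for 4(a) and 4(d) can be reproduced by brute force. The sketch below enumerates every interleaving of S1 through S4 that keeps each transaction's own statement order, assuming each statement simply reads the latest value written. That matches the unrestricted case (d) and, when limited to serial schedules, case (a). It does not model the locking behavior behind (b), (c), or (e), and the function name `final_pays` is hypothetical.

```python
from itertools import combinations

# Each statement maps Amy's current pay to her new pay.
T1 = [lambda pay: 2 * pay, lambda pay: 3 * pay]      # S1, S2
T2 = [lambda pay: pay - 20, lambda pay: pay - 10]    # S3, S4

def final_pays(start=50, serial_only=False):
    """Final pay for every interleaving that preserves per-transaction order."""
    total = len(T1) + len(T2)
    results = set()
    for t1_slots in combinations(range(total), len(T1)):
        # A serial schedule runs T1 entirely first or entirely last.
        if serial_only and t1_slots not in ((0, 1), (2, 3)):
            continue
        it1, it2 = iter(T1), iter(T2)
        pay = start
        for slot in range(total):
            step = next(it1) if slot in t1_slots else next(it2)
            pay = step(pay)
        results.add(pay)
    return sorted(results)
```

With the default starting pay of 50, `final_pays()` yields the six values of answer (d), and `final_pays(serial_only=True)` yields the two values of answer (a).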

# Defining and Using Views Extra Problems

You will define (virtual) views over the movie-ratings database that was also used for the SQL Movie-
Rating Query Exercises, and you will write queries that reference the views instead of or in addition to the
base tables. A SQL file to set up the schema and data for the movie-ratings database is
downloadable here. This schema and data can be loaded as specified in the file into SQLite, MySQL, or
PostgreSQL; see our quick guide for installing and using these systems. These exercises can be performed
on any of the three systems.

Schema:
Movie ( mID, title, year, director )
English: There is a movie with ID number mID, a title, a release year, and a director.

Reviewer ( rID, name )
English: The reviewer with ID number rID has a certain name.

Rating ( rID, mID, stars, ratingDate )
English: The reviewer rID gave the movie mID a number of stars rating (1-5) on a certain ratingDate.

Each exercise asks you to create a view, and then write a query using that view, perhaps along with
previously created views and/or the base tables. The correct results for the queries over the provided data
can be seen by pressing the button at the bottom of the page.

1. Create a view called TNS containing title-name-stars triples, where the movie (title) was reviewed by a
reviewer (name) and received the rating (stars). Then referencing only view TNS and table Movie, write a
SQL query that returns the latest year of any movie reviewed by Chris Jackson. You may assume movie
names are unique.
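
A minimal sketch of the view and of a query over it follows, using Python's `sqlite3` module. The sample movies and reviewers are invented for illustration and are not the course data, so the result differs from the posted answer. The function name `latest_year_reviewed_by` is likewise an assumption.

```python
import sqlite3

def latest_year_reviewed_by(reviewer):
    """Latest release year among movies the named reviewer rated, via view TNS."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    create table Movie (mID int, title text, year int, director text);
    create table Reviewer (rID int, name text);
    create table Rating (rID int, mID int, stars int, ratingDate date);
    -- Hypothetical sample rows, not the course data set.
    insert into Movie values (1, 'Old Film', 1950, null), (2, 'New Film', 1990, null);
    insert into Reviewer values (10, 'Pat Example'), (11, 'Sam Example');
    insert into Rating values (10, 1, 4, null), (10, 2, 2, null), (11, 1, 5, null);

    create view TNS as
      select title, name, stars
      from Movie, Reviewer, Rating
      where Movie.mID = Rating.mID and Reviewer.rID = Rating.rID;
    """)
    row = conn.execute(
        "select max(year) from Movie where title in "
        "(select title from TNS where name = ?)", (reviewer,)).fetchone()
    conn.close()
    return row[0]
```

Because TNS exposes only titles, the query has to join back to Movie on title, which is why the exercise lets you assume titles are unique.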

2. Referencing view TNS from Exercise 1 and no other tables, create a view RatingStats containing each
movie title that has at least one rating, the number of ratings it received, and its average rating. Then
referencing view RatingStats and no other tables, write a SQL query to find the title of the highest-
average-rating movie with at least three ratings.

3. Create a view Favorites containing rID-mID pairs, where the reviewer with rID gave the movie with
mID the highest rating he or she gave any movie. Then referencing only view Favorites and
tables Movie and Reviewer, write a SQL query to return reviewer-reviewer-movie triples where the two
(different) reviewers have the movie as their favorite. Return each pair once, i.e., don't return a pair and
its inverse.

1. 1982

2. Raiders of the Lost Ark

3. These tuples can be returned in any order, and it's okay if the reviewer names are reversed
(Sarah Martinez, Mike Anderson, Gone with the Wind)
(Daniel Lewis, Elizabeth Thomas, Snow White)
(Brittany Harris, Chris Jackson, Raiders of the Lost Ark)

# View Modifications Using Triggers Extra Problems

You will create (virtual) views over the social-network database that was also used for the SQL Social-
Network Query Exercises, and you will enable modifications to these views using triggers. A SQL file to
set up the schema and data for the social-network database is downloadable here. The schema and data
can be loaded as specified in the file into SQLite, MySQL, or PostgreSQL. However, currently only SQLite
triggers and PostgreSQL rules can be used for these exercises. (MySQL instead supports automatic view
modifications, as explored in the Automatic View Modifications Exercises.) See our quick guide for
installing and using all three systems.

Schema:
Highschooler ( ID, name, grade )
English: There is a high school student with unique ID and a given first name in a certain grade.

Friend ( ID1, ID2 )
English: The student with ID1 is friends with the student with ID2. Friendship is mutual, so if (123, 456) is
in the Friend table, so is (456, 123).

Likes ( ID1, ID2 )
English: The student with ID1 likes the student with ID2. Liking someone is not necessarily mutual, so if
(123, 456) is in the Likes table, there is no guarantee that (456, 123) is also present.

For your convenience, here is a graph showing the various connections between the people in our
database. 9th graders are blue, 10th graders are green, 11th graders are yellow, and 12th graders are
purple. Undirected black edges indicate friendships, and directed red edges indicate that one person
likes another person.

In each exercise you are first asked to create a view, and then to use triggers (SQLite) or rules
(PostgreSQL) to enable one or more types of modifications to the view. Then you are given some
modification commands to execute against the view. To verify correctness of your view-modification
triggers, you are asked to compare your final database against our results.

1. Create a view called JordanFriend(name,grade) containing the names and grades of students with a
friend named Jordan. Your view should initially contain (in some order):

Gabriel 9
Tiffany 9
Andrew 12
Kyle 12
Logan 12

Create a trigger (SQLite) or rule (PostgreSQL) that enables update commands to be executed on
view JordanFriend. Updates should propagate to the Highschooler table under the assumption that
(name,grade) pairs uniquely identify students. Do not allow updates that take the grade out of the 9-12
range, or that violate uniqueness of (name,grade) pairs; otherwise all updates should be permitted.

(a) Execute the following update command:
update JordanFriend set grade = grade + 2;
Compare your resulting view JordanFriend with ours (in Query Results at the bottom of the page).

(b) Then execute the following update commands:
update JordanFriend set name = 'Tiffany', grade = 10 where name = 'Gabriel';
update JordanFriend set name = 'Jordan' where name = 'Tiffany';
Compare your resulting view JordanFriend with ours. (What do you think about the result?)

Reload the original social-network database before beginning the next exercise.

2. Create a view called OlderFriend(ID1,name1,grade1,ID2,name2,grade2) containing the names and
grades of friends who are at least two years apart in school, with name1/grade1 being the younger
student. After reloading the original database, your view should initially contain (in some order):

1381 Tiffany 9 1247 Alexis 11
1709 Cassandra 9 1247 Alexis 11
1782 Andrew 10 1304 Jordan 12

Create triggers (SQLite) or rules (PostgreSQL) that enable deletions and insertions to be executed on
view OlderFriend. For insertions, only permit new friendships that obey the restrictions of the view and
do not create duplicates. Make sure to maintain the symmetric nature of the underlying Friend relation
even though OlderFriend is not symmetric: a tuple (A,B) is in Friend if and only if (B,A) is also in Friend.
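
The deletion half of this exercise can be sketched with a SQLite `instead of` trigger that removes both directions of the friendship from Friend. The trigger name `OlderFriendDelete`, the helper `delete_from_view`, and the three sample students are assumptions for illustration. The insertion trigger, with its extra validity checks, is not shown.

```python
import sqlite3

SCHEMA = """
create table Highschooler (ID int, name text, grade int);
create table Friend (ID1 int, ID2 int);
-- Hypothetical rows, not the course data set.
insert into Highschooler values (1, 'Ann', 9), (2, 'Bob', 11), (3, 'Cy', 10);
insert into Friend values (1, 2), (2, 1), (1, 3), (3, 1);

create view OlderFriend as
  select H1.ID as ID1, H1.name as name1, H1.grade as grade1,
         H2.ID as ID2, H2.name as name2, H2.grade as grade2
  from Highschooler H1, Friend F, Highschooler H2
  where H1.ID = F.ID1 and H2.ID = F.ID2 and H2.grade - H1.grade >= 2;

create trigger OlderFriendDelete instead of delete on OlderFriend
for each row
begin
  delete from Friend
  where (ID1 = Old.ID1 and ID2 = Old.ID2)
     or (ID1 = Old.ID2 and ID2 = Old.ID1);
end;
"""

def delete_from_view(where_clause):
    """Delete from OlderFriend and return (remaining Friend rows, view size)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("delete from OlderFriend where " + where_clause)
    friends = sorted(conn.execute("select ID1, ID2 from Friend"))
    view_size = conn.execute("select count(*) from OlderFriend").fetchone()[0]
    conn.close()
    return friends, view_size
```

Deleting the single view tuple removes two Friend tuples, which is the behavior the symmetry requirement asks for.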

(a) Execute the following deletions:
delete from OlderFriend where name2 = 'Alexis';
delete from OlderFriend where ID1 = 1381;
Check the resulting database by writing SQL queries to compute the number of tuples in the Friend table
and OlderFriend view. Compare your results against ours (in Query Results at the bottom of the page).

(b) Then execute the following insertions:
insert into OlderFriend values (1510, 'Jordan', 9, 1304, 'Jordan', 12);
insert into OlderFriend values (1510, 'Jordan', 9, 1468, 'Kris', 10);
insert into OlderFriend values (1510, 'Jordan', 9, 1468, 'Kris', 11);
insert into OlderFriend values (1510, 'John', 9, 1247, 'Alexis', 11);
insert into OlderFriend
select H1.ID as ID1, H1.name as name1, H1.grade as grade1,
H2.ID as ID2, H2.name as name2, H2.grade as grade2
from Highschooler H1, Highschooler H2 where H1.grade >= 10;
Check the resulting database by writing SQL queries to compute the number of tuples in the Friend table
and OlderFriend view. Compare your results against ours.

1.
(a) In any order:
Gabriel 9
Tiffany 11
Andrew 12
Kyle 12
Logan 12

(b) In any order:
Jordan 9
Jordan 10
Jordan 11
Cassandra 9
Andrew 12
Alexis 11
Kyle 12
Logan 12

2.
(a) Friend contains 36 tuples and OlderFriend contains 1 tuple
(b) Friend contains 68 tuples and OlderFriend contains 17 tuples

# Automatic View Modifications Extra Problems

You will create (virtual) views over the movie-ratings database that was also used for the SQL Movie-
Rating Query Exercises, and you will experiment with how modifications over the views behave when
they are handled by the underlying database system automatically. A SQL file to set up the schema and
data for the movie-ratings database is downloadable here. This schema and data can be loaded as
specified in the file into SQLite, MySQL, or PostgreSQL. However, currently only MySQL supports
automatic view modifications. (SQLite and PostgreSQL instead require users to implement triggers or
rules for view modifications, as explored in the View Modifications Using Triggers Exercises.) Thus, you
must currently use MySQL for these exercises. See our quick guide for installing and using all three
systems.

Schema:
Movie ( mID, title, year, director )
English: There is a movie with ID number mID, a title, a release year, and a director.

Reviewer ( rID, name )
English: The reviewer with ID number rID has a certain name.

Rating ( rID, mID, stars, ratingDate )
English: The reviewer rID gave the movie mID a number of stars rating (1-5) on a certain ratingDate.

Each exercise has the same form: It asks you to create a view, and then experiment with modifications
over the view, with and without WITH CHECK OPTION. Sample correct answers can be seen by pressing
the button at the bottom of the page. Sample answers assume that each exercise is solved using the
original data in the movie-rating database.

1. Create a view called Post80 that contains the title and year of movies made after 1980. Do not use
WITH CHECK OPTION in the CREATE VIEW statement.

(a) Is it possible to have a "mishandled insertion," where an insert into view Post80 is accepted, it causes
a modification to the underlying database, but the insertion is not reflected properly in the view
afterward? (For this problem and all others in this set, assume insertion commands specify values for all
attributes.) If so, give an example.

(b) Is it possible to have a "mishandled deletion," where a delete command on view Post80 is accepted,
it causes a modification to the underlying database, but the deletion is not reflected properly in the view
afterward? If so, give an example.

(c) Is it possible to have a "mishandled update," where an update command on view Post80 is accepted,
it causes a modification to the underlying database, but the update is not reflected properly in the view
afterward? If so, give an example.

(d) Drop view Post80 and create it again, this time using WITH CHECK OPTION. For each of (a), (b), and
(c) where you gave an example of a mishandled modification, is the modification still permitted?

2. Create a view called Above3 that contains the mID and title of movies that have at least one rating of
more than 3 stars. Write the view in a form that makes it modifiable according to MySQL and the SQL
standard. Do not use WITH CHECK OPTION in the CREATE VIEW statement.

(a) Is it possible to have a "mishandled insertion" into view Above3 as described in Exercise 1(a)? If so,
give an example.

(b) Is it possible to have a "mishandled deletion" from view Above3 as described in Exercise 1(b)? If so,
give an example.

(c) Is it possible to have a "mishandled update" on view Above3 as described in Exercise 1(c)? If so, give
an example.

(d) Drop view Above3 and create it again, this time using WITH CHECK OPTION. For each of (a), (b), and
(c) where you gave an example of a mishandled modification, is the modification still permitted?

3. Create a view called NoDate(mID,rID,title,name) that contains movie-reviewer pairs where the
reviewer didn't give a date for the review. Do not use WITH CHECK OPTION in the CREATE VIEW
statement.

(a) Is it possible to have a "mishandled insertion" into view NoDate, as described in Exercise 1(a)? If so,
give an example.

(b) Is it possible to have a "mishandled deletion" from view NoDate, as described in Exercise 1(b)? If so,
give an example.

(c) Is it possible to have a "mishandled update" on view NoDate, as described in Exercise 1(c)? If so, give
an example.

(d) Drop view NoDate and create it again, this time using WITH CHECK OPTION. For each of (a), (b), and
(c) where you gave an example of a mishandled modification, is the modification still permitted?

4. Create a view called Liked(title,name) that contains title-name pairs where the reviewer (name) gave
the movie (title) more than 3 stars.

(a) Is it possible to have a "mishandled insertion" into view Liked, as described in Exercise 1(a)? If so, give
an example.

(b) Is it possible to have a "mishandled deletion" from view Liked, as described in Exercise 1(b)? If so,
give an example.

(c) Is it possible to have a "mishandled update" on view Liked, as described in Exercise 1(c)? If so, give an
example.

(d) Drop view Liked and create it again, this time using WITH CHECK OPTION. For each of (a), (b), and (c)
where you gave an example of a mishandled modification, is the modification still permitted?

1.
create view Post80 as
select title, year from Movie where year > 1980
Answers assume the original movie-rating data:
(a) Yes - "insert into Post80 values ('Jaws', 1975)"
(b) No
(c) Yes - "update Post80 set year = year - 5"
(d) Mishandled insertions and updates like the ones in (a) and (c) are now disallowed.

2.
create view Above3 as
select mID, title from Movie M
where mID in (select mID from Rating where stars > 3)
Answers assume the original movie-rating data:
(a) Yes - "insert into Above3 values (109, 'Rain Man')"
(b) No
(c) Yes - "update Above3 set mID = 110 where mID = 101"
(d) Mishandled insertions and updates like the ones in (a) and (c) are now disallowed.

3.
create view NoDate as
select M.mID, R.rID, title, name from Movie M, Reviewer R, Rating T
where M.mID = T.mID and R.rID = T.rID and ratingDate is null
Answers assume the original movie-rating data:
(a) No - system disallows inserts into views with joins
(b) No - system disallows deletions from views with joins
(c) Yes - "update NoDate set mID = 110 where mID = 106"
(d) Mishandled updates like the one in (c) are now disallowed.

4.
create view Liked as
select title, name from Movie M, Reviewer R, Rating T
where M.mID = T.mID and R.rID = T.rID and stars > 3
Answers assume the original movie-rating data:
(a) No - system disallows inserts into views with joins
(b) No - system disallows deletions from views with joins
(c) Yes - "update Liked set title = 'Jaws' where name = 'Daniel Lewis'"
(d) Mishandled updates like the one in (c) are still allowed.

# Authorization Extra Problems

1. What privileges are needed for a user to execute the following SQL statement over
tables Worker(ID,name) and Works(ID,company)?
Delete From Worker
Where ID In (Select ID From Works Group By ID Having Count(*) > 3)

2. What privileges are needed for a user to execute the following SQL statement over
tables Employee(ID,salary,rank,deptID) and Department(ID,category)?
Update Employee E1
Set salary = (Select Avg(salary) From Employee E2 Where E1.rank = E2.rank)
Where deptID In (Select ID from Department Where category = 'Sales')

3. Suppose you are the owner of table Employee(ID,salary,dept). You want to authorize user Amy to see
(but not modify) employee information for those employees who earn less than \$80,000 and work in a
department with fewer than 10 people. Specify a SQL statement or sequence of statements that achieves
this goal.

4. Consider tables Worker(ID,name) and Works(ID,company), where ID is a key for each table. Consider
the following pair of SQL statements. Assume Amy is a valid user, and the statements are issued by a
single user who is the owner of both tables Worker and Works.
Create View NoJob As
Select Distinct ID From Worker, Works Where Worker.ID = Works.ID;
Grant Delete on NoJob to Amy With Grant Option;
Why is this pair of statements disallowed by the SQL standard? Can you write an equivalent pair of
statements that conforms to the standard?

5. Consider a table T(A,B,C) with owner Amy, and the following sequence of statements related to
privileges on T. Each statement is prefaced with the user issuing it.
Amy: Grant Select, Delete on T to Bob With Grant Option
Amy: Grant Select, Delete on T to Carol With Grant Option
Bob: Grant Select(A,B), Delete on T to David With Grant Option
Carol: Grant Select(A,C) on T to David With Grant Option
David: Grant Select(A), Delete on T to Eve
Amy: Revoke Select, Delete on T From Bob Cascade
What privileges on table T does Eve have after this sequence of statements?

6. Consider a table T(A,B,C) with owner Amy, and the following sequence of statements related to
privileges on T. Each statement is numbered and prefaced with the user issuing it.
1 - Amy: Grant Select on T to Bob With Grant Option
2 - Bob: Grant Select on T to Carol With Grant Option
3 - Carol: Grant Select(A,C) on T to David With Grant Option
4 - Carol: Grant Select(A,B) on T to Eve With Grant Option
5 - Amy: Grant Select on T to Eve
6 - Amy: Grant Select(C) on T to Frank
7 - David: Grant Select(A,C) on T to Frank With Grant Option
8 - Eve: Grant Select(A,C) on T to Frank
9 - David: Grant Select(A) on T to Gary
10 - Eve: Grant Select(A) on T to Gary
11 - Amy: Revoke Select on T From Eve Restrict
12 - Carol: Revoke Select(A,C) on T From David Restrict
13 - David: Revoke Select(A) on T From Eve
14 - Bob: Revoke Select on T From Carol Cascade
15 - Amy: Revoke Select on T From Bob Restrict
(a) Which of the Grant statements, if any, would be disallowed?

(b) Which of the Revoke statements, if any, would be disallowed?

(c) After the statements complete execution (excluding any disallowed ones), what privileges does user
Frank have on table T?

1.
Worker - Delete, Select(ID)
Works - Select(ID)

2.
Employee - Update(salary), Select(salary,rank,deptID)
Department - Select

3.
Create View V As
Select * From Employee E1
Where salary < 80000
and 10 > (Select Count(*) From Employee E2 Where E2.dept = E1.dept);
Grant Select on V to Amy;

4.
NoJob is not an updatable view so delete privileges are disallowed. The following statements are
equivalent, but NoJob2 is an updatable view.
Create View NoJob2 As
Select ID From Worker
Where ID In (Select ID From Works);
Grant Delete on NoJob2 to Amy With Grant Option;

5. Select(A)

6.
(a) 8
(b) 12, 13
(c) Select(C)

# Recursion Graph-Analysis Extra Problems

You're tasked with performing graph analyses using a relational database system. You quickly recognize
that recursion is needed to analyze arbitrary graphs using SQL queries. Fortunately, recursive SQL is
available in some systems. In these exercises, you'll develop queries on a small graph with colored nodes
and weighted edges. A SQL file to set up the schema and data for these exercises is downloadable here.
This schema and data can be loaded as specified in the file into SQLite, MySQL, or PostgreSQL, but
currently only PostgreSQL supports recursion. See our quick guide for installing and using PostgreSQL.

Schema:
Node ( nID, color )
Edge ( n1, n2, weight ) // n1 and n2 identify nID's in table Node

As a guide to test the accuracy of your SQL queries, the correct query results over the provided data can
be seen by pressing the button at the bottom of the page. You can return results in any order, but you
may find it convenient to sort them in order to compare your results against the correct ones.

1. Find all node pairs n1,n2 that are both red and there's a path of length one or more from n1 to n2.
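
A minimal sketch of the straightforward approach follows: compute every reachable pair recursively, then keep the pairs whose endpoints are both red. It runs on Python's bundled SQLite purely for self-containment (SQLite 3.8.3 and later accept `with recursive`). The four-node graph and the function name `red_pairs` are assumptions, not the course data.

```python
import sqlite3

def red_pairs():
    """Red (n1, n2) pairs connected by a path of length one or more."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    create table Node (nID text, color text);
    create table Edge (n1 text, n2 text, weight int);
    -- Hypothetical four-node graph, not the course data.
    insert into Node values ('A','red'), ('B','blue'), ('C','red'), ('D','red');
    insert into Edge values ('A','B',1), ('B','C',1), ('D','A',1);
    """)
    rows = conn.execute("""
      with recursive Path(n1, n2) as (
        select n1, n2 from Edge
        union
        select Path.n1, Edge.n2 from Path, Edge where Path.n2 = Edge.n1
      )
      select Path.n1, Path.n2
      from Path, Node S, Node T
      where Path.n1 = S.nID and S.color = 'red'
        and Path.n2 = T.nID and T.color = 'red'
      order by Path.n1, Path.n2
    """).fetchall()
    conn.close()
    return rows
```

Using `union` rather than `union all` discards duplicate pairs, so the recursion stops once no new pair appears, even on a graph with cycles.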

2. If your solution to problem 1 first generates all node pairs with a path between them and then selects
the red pairs, formulate a more efficient query that incorporates "red" into the recursively-defined
relation in some fashion.

3. If your solution to problem 2 incorporates the "red" condition in the recursion by constraining the
start node to be red, modify your solution to constrain the end node in the recursion instead.
Conversely, if your solution to problem 2 incorporates the "red" condition in the recursion by
constraining the end node to be red, modify your solution to constrain the start node in the recursion

4. Modify one of your previous solutions to also return the lengths of the shortest and longest paths
between each pair of nodes. Your result should have four columns: the two nodeID's, the shortest path,
and the longest path.
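
Path lengths can be carried along in the recursion as a third column and aggregated afterward. The sketch below assumes a small acyclic graph, since the recursion would not terminate on a cycle once lengths are tracked. It omits the color filter for brevity, and the data and the name `path_length_ranges` are hypothetical.

```python
import sqlite3

def path_length_ranges():
    """(n1, n2, shortest, longest) for every connected pair in an acyclic graph."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    create table Edge (n1 text, n2 text, weight int);
    -- Hypothetical acyclic graph, not the course data.
    insert into Edge values ('A','B',1), ('B','C',1), ('A','C',1);
    """)
    rows = conn.execute("""
      with recursive Path(n1, n2, len) as (
        select n1, n2, 1 from Edge
        union
        select Path.n1, Edge.n2, Path.len + 1
        from Path, Edge where Path.n2 = Edge.n1
      )
      select n1, n2, min(len), max(len)
      from Path
      group by n1, n2
      order by n1, n2
    """).fetchall()
    conn.close()
    return rows
```

In this graph A reaches C both directly and through B, so that pair reports a shortest length of 1 and a longest length of 2.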

5. Modify your solution to problem 3 to also return (n1,n2,0,0) for every pair of nodes (n1,n2) that are
both red but there's no path from n1 to n2.

6. Find all node pairs n1,n2 that are both red and there's a path of length one or more from n1 to n2
that passes through exclusively red nodes.

7. Find all node pairs n1,n2 such that n1 is yellow and there is a path of length one or more from n1 to
n2 that alternates yellow and blue nodes.

8. Find the highest-weight path(s) in the graph. Return start node, end node, length of path, and total
weight.

9. Add one more edge to the graph: "insert into Edge values ('L','C',5);"
Your solution to problem 8 probably runs indefinitely now. Modify the query to find the highest-weight
path(s) in the graph with total weight under 100. Return the number of such paths, the minimum length,
maximum length, and total weight.

10. Continuing with the additional edge present, find all paths of length exactly 12. Return the number
of such paths and their minimum and maximum total weights.

All query results can be returned in any order and still be correct.

1. (A,D), (A,G), (A,J), (D,G), (D,J)

2. (A,D), (A,G), (A,J), (D,G), (D,J)

3. (A,D), (A,G), (A,J), (D,G), (D,J)

4. (A,D,1,1), (A,G,3,3), (A,J,2,6), (D,G,2,2), (D,J,1,5)

5. (A,D,1,1), (A,G,3,3), (A,J,2,6), (D,A,0,0), (D,G,2,2), (D,J,1,5), (G,A,0,0), (G,D,0,0), (G,J,0,0), (J,A,0,0),
(J,D,0,0), (J,G,0,0)

6. (A,D), (A,J), (D,J)

7. (E,I), (E,K), (E,L), (H,I), (H,K), (H,L), (K,L)

8. (A,L,7,19)

9. (197,28,34,99)

10. (157,32,43)

# OLAP Class-Enrollment Extra Problems

You will perform "on-line analytical processing" (OLAP) style queries over a simple "star schema"
containing information about students, instructors, classes, and students taking classes from instructors.
A SQL file to set up the schema and data for these exercises is downloadable here. This schema and data
can be loaded as specified in the file into SQLite, MySQL, or PostgreSQL. Queries 1-5 can be solved on
any of the three systems, but currently only MySQL supports the "WITH ROLLUP" construct needed for
queries 6-12. See our quick guide for installing and using the three systems.

Schema:
Student( studID, name, major ) // dimension table, studID is key
Instructor( instID, dept ); // dimension table, instID is key
Class( classID, univ, region, country ); // dimension table, classID is key
Took( studID, instID, classID, score ); // fact table, foreign key references to dimension tables

As a guide to test the accuracy of your SQL queries, the correct query results over the provided data can
be seen by pressing the button at the bottom of the page.

1. Find all students who took a class in California from an instructor not in the student's major
department and got a score over 80. Return the student name, university, and score.

2. Find average scores grouped by student and instructor for courses taught in Quebec.

3. "Roll up" your result from problem 2 so it's grouping by instructor only.

4. Find average scores grouped by student major.

5. "Drill down" on your result from problem 4 so it's grouping by instructor's department as well as
student's major.

6. Use "WITH ROLLUP" on attributes of table Class to get average scores for all geographical
granularities: by country, region, and university, as well as the overall average.

7. Create a table containing the result of your query from problem 6. Then use the table to determine
by how much students from USA outperform students from Canada in their average score.

8. Verify your result for problem 7 by writing the same query over the original tables without using
"WITH ROLLUP".

You may want to look over the next four problems before attempting them, so you know where they're
going.

9. Create the following table that simulates the unsupported "WITH CUBE" operator.

    create table Cube as
    select studID, instID, classID, avg(score) as s from Took
    group by studID, instID, classID with rollup
    union
    select studID, instID, classID, avg(score) as s from Took
    group by instID, classID, studID with rollup
    union
    select studID, instID, classID, avg(score) as s from Took
    group by classID, studID, instID with rollup;

Using table Cube instead of table Took, and taking advantage of the special tuples with NULLs, find the
average score of CS major students taking a course at MIT.

10. Verify your result for problem 9 by writing the same query over the original tables.

11. Whoops! Did you get a different answer for problem 10 than you got for problem 9? What went
wrong? Assuming the answer on the original tables is correct, create a slightly different data cube that
allows you to get the correct answer using the special NULL tuples in the cube. Hint: Change what
dependent value(s) you store in the cells of the cube; no change to the overall structure of the query or
the cube is needed.

12. Continuing with your revised cube from problem 11, compute the same value but this time don't
use the NULL tuples (but don't use table Took either). Hint: The syntactic change is very small and of
course the answer should not change.
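
To picture what "WITH ROLLUP" returns on a system that does not support it, the result can be thought of as the union of group-by queries at successively coarser granularities, with NULL filling each rolled-up column. The sketch below emulates this in SQLite for two grouping columns. The table `Scores(country, region, score)`, the function name, and the data are hypothetical and chosen only to keep the example small; it is not one of the problem solutions.

```python
import sqlite3


def rollup_country_region(rows):
    """Emulate `group by country, region with rollup` using UNION ALL.

    `rows` is a list of (country, region, score) tuples. The table name
    Scores and its contents are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("create table Scores(country text, region text, score real)")
    conn.executemany("insert into Scores values (?, ?, ?)", rows)
    return conn.execute("""
        -- finest granularity: one row per (country, region)
        select country, region, avg(score) from Scores
        group by country, region
        union all
        -- region rolled up: one row per country
        select country, null, avg(score) from Scores
        group by country
        union all
        -- everything rolled up: the overall average
        select null, null, avg(score) from Scores
    """).fetchall()


if __name__ == "__main__":
    sample = [("USA", "CA", 80), ("USA", "CA", 90),
              ("USA", "MA", 70), ("Canada", "Quebec", 60)]
    for row in rollup_country_region(sample):
        print(row)
```

Each coarser level is computed from the base rows rather than from the finer level's averages, which is why the rolled-up averages are weighted by the number of underlying tuples.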

All query results can be returned in any order and still be correct.

1.
Amy Stanford 90
Amy Berkeley 90
Amy Stanford 85
Brian Stanford 95
Carol Stanford 85
David Berkeley 85

2.
stud1 inst3 80
stud2 inst2 88.333
stud4 inst3 90
stud5 inst4 80
stud6 inst2 90
stud6 inst3 80
stud6 inst4 70
stud6 inst5 60

3.
inst2 88.75
inst3 83.333
inst4 77.5
inst5 60

4.
CS 80.5
EE 78.5

5.
CS CS 77
CS EE 84
EE CS 82.5
EE EE 78.0556

6.
Canada Ontario Toronto 78.8889
Canada Ontario Waterloo 74
Canada Ontario NULL 77.1429
Canada Quebec McGill 81.25
Canada Quebec NULL 81.25
Canada NULL NULL 79.0385
USA CA Berkeley 82.2222
USA CA Stanford 79.0909
USA CA NULL 80.5
USA MA MIT 80.3571
USA MA NULL 80.3571
USA NULL NULL 80.4412
NULL NULL NULL 79.8333

7. 1.4027

8. 1.4027

9. 80.33334

10. 80

11. 80

12. 80
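
A likely explanation for the gap between answers 9 and 10 is that each cell of the original cube stores an average, so averaging over several cells gives every cell equal weight no matter how many Took tuples it summarizes. The sketch below illustrates the effect on made-up numbers (not the course data); the function names are hypothetical. Storing a sum and a count per cell, and dividing the totals, is one way to recover the average over the underlying tuples.

```python
def mean_of_means(groups):
    """Average the per-group averages; every group counts equally."""
    return sum(sum(g) / len(g) for g in groups) / len(groups)


def pooled_mean(groups):
    """Divide the total of all values by the total count,
    which is what keeping sum and count per cell makes possible."""
    total = sum(sum(g) for g in groups)
    count = sum(len(g) for g in groups)
    return total / count


if __name__ == "__main__":
    # Made-up scores: a small group and a larger one.
    groups = [[60, 80], [100, 100, 100, 100]]
    print(mean_of_means(groups))  # 85.0
    print(pooled_mean(groups))    # 90.0
```

The two results agree only when all groups have the same size, so a cube that stores plain averages can be safely re-aggregated only in that special case.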