
Exercise 1: ER Diagrams

Due: November 14, 2005

This exercise should be submitted to the course box. Remember to write your login!

Problem 0 (0 Points)
Register for the database course using the url http://httpdyn.cs.huji.ac.il/course-admin/db/register.

Problem 1 (20 points)


Below we present several entity-relationship diagrams that model information maintained by a rental
company about its tenants, apartments, and building managers. Note that tid is the key for Tenants,
apid is the key for Apartments, and mid is the key for Managers. For each of the entity-relationship
diagrams below:

(a) State briefly the meaning of the diagram. Emphasize the constraints in the diagrams.

(b) Suppose that there are 400 tenants, 150 apartments, and 30 managers. What is the maximal number
of tuples that the relation Rents contains?

(c) Suppose that there are 400 tenants, 150 apartments, and 30 managers. What is the minimal number
of tuples that the relation Rents contains?

Note that the thick lines are participation constraints.

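Figure 1: [ER diagram not reproduced: relationship Rents among Apartments (key apid), Managers (key mid), and Tenants (key tid)]

Figure 2: [ER diagram not reproduced: relationship Rents among Apartments (key apid), Managers (key mid), and Tenants (key tid)]

Figure 3: [ER diagram not reproduced: relationship Rents among Apartments (key apid), Managers (key mid), and Tenants (key tid)]

Figure 4: [ER diagram not reproduced: relationship Rents among Apartments (key apid), Managers (key mid), and Tenants (key tid)]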

Problem 2 (20 Points)


1. As in Problem 1, we consider information about apartments, tenants, and building managers. We
wish to store information about what apartment each tenant rents. And for each rented apartment
we would like to know who manages it. Two basic approaches can be taken to model this information.
One option is to use a single relationship between all three entities. The other option is to
use two binary relationships. Here are the two diagrams that might be used:

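Left diagram: [ER diagram not reproduced: a single relationship Rents among Apartments (apid), Managers (mid), and Tenants (tid)]

Right diagram: [ER diagram not reproduced: relationship Manages between Managers (mid) and Apartments (apid), and relationship Rents between Apartments (apid) and Tenants (tid)]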

(a) Assuming that there are no further constraints on the information, can the information be
captured in both diagrams? Explain.

(b) Suppose that each apartment is managed by at most one manager. Can you add edge constraints
to the diagram on the left in order to capture this information? What about the
diagram on the right? Explain.

2. Assuming there are no further constraints, is there anything wrong with using the following diagram
to model the information?

[ER diagram not reproduced: relationship Rents between Apartments (apid) and Tenants (tid), with mid drawn as an attribute]

Problem 3: ER Modeling (60 Points)
You are asked to model a certain security trading situation. Below is the list of requirements:

• A company is uniquely determined by its name and headquarters address. We are concerned only
with companies that have publicly traded common stocks. Each such company has only one such
stock. Include the number of shares outstanding in your model.

• Every stock trades on one or more exchanges. An exchange is uniquely determined by its name.
A stock has a symbol under which it trades on an exchange. The same stock may have different
symbols on different exchanges.

• An option on a stock symbol is a security that is uniquely determined by its type, stock symbol,
strike price, and expiration date. An option trades on the same exchange as its stock symbol. The
type of an option is either a “put” or a “call”. It can not be both, and it can not be something else.

• Include the last trading price and current daily volume for every symbol and option. You do not
need to include historical trading info (such as past closing prices).

• Securities (i.e. stocks and options) are owned and traded by traders. A trader has a name and a
tax id. The tax id uniquely determines the trader.

• Traders do not trade directly, but via brokerages. A brokerage is uniquely determined by its name
and address.

• Each brokerage deals with one or more exchanges and pays a fixed yearly fee to every exchange it
deals with. The fee could be different for every brokerage/exchange pair. Include the fee in your
model.

• A trader owns at least one account with at least one broker. She may hold more than one account
with the same broker and deal with more than one broker. An account is uniquely determined by
brokerage and account number. A brokerage may have no accounts. Each account has exactly one
owner.

• Accounts hold securities and cash. Note that a stock bought on one exchange could be sold on
another, so it is stocks, not symbols, that are held. Do not forget to include options in accounts.

• Traders place trading orders via their brokerages. An order specifies the account, exactly one
symbol or option to trade, “bid” (buy) or “ask” (sell), number of shares to trade, and the order
expiration. There are two types of orders: “market” and “limit”. A limit order has the limit price
in addition to the mentioned properties.

• The brokerage and order id uniquely determine the order.

• An order could be placed only for securities that are traded on the exchanges with which the broker
deals.

• A transaction is effected in (possibly partial) fulfillment of two orders. Every transaction contains
the following information: exactly one bid order, exactly one ask order, number of shares, transaction
price, commissions paid by the buyer and the seller to their brokerages, and the timestamp.

• Exchange and transaction number uniquely determine the transaction.

• Note that an order could be filled by several transactions.

Draw an entity-relationship diagram to model the information described above. Remember to put in
constraints, key attributes, etc. If you use the ISA relationship, state any covering and overlap constraints
that hold. Make any necessary and logical assumptions. State any such assumptions clearly. Also state
clearly all the constraints you recognize but are unable to express in the diagram.
For clarity you may want to present your diagram on several sheets. Similar to this:

Sheet 1:

Widget Makes Gadget

Sheet 2:

Gadget

ISA

Doohickey Thingy

Exercise 2
Due: November 22, Midnight

This exercise has 3 problems. Part of the first problem should be submitted electronically, and the
rest should be submitted to the course box. Do not be alarmed by the length of this document. It is
detailed in order to minimize ambiguities and confusions. Remember to write your name, ID and
login on the exercise.

Problem 1 (60 points)


In this question you will design an Entity-Relationship diagram, translate the diagram into relations, and
write SQL queries to create and delete the corresponding tables.
You are asked to represent a small group of entities and relations:

1. Students: Each student has a StudID (national ID number, "mispar zehut"; 6-10 digits, can start with zero),
StudName (student name, first and last, divided by a single space), Address, and an EmailAddress.

2. Courses: Each course has a CourseNum (course number), CourseName, and TeacherName (same
format as StudName).

Note the following:

• Each student can Study 0 or more courses. For every student who studies a course, the database
holds the registration date of the student to that course

• A student can also be a Teaching Assistant in a course. In such a case his reception hour is kept in
the DB (this can be held as a string). A student cannot be a teaching assistant in more than one
course.

• The database also holds information about pairs of students who are married (for payment purposes).
For each married pair, the database stores whether they study in the same faculty or not
('yes'/'no').

With this information you should perform the following:

1. Draw an Entity-Relationship diagram which represents the above information as best you can.
Notice that you should not create any additional entity groups (besides Students and Courses).
Consider the structure, constraints, and special entities learnt in class. Include any constraints you
see fit, and explain your choices. Specify any assumptions you make.

2. Translate the diagram into relation definitions. Underline the key fields.

3. Write SQL statements which create the tables corresponding to your ER diagram, in a file called
create.sql. If there are elements in your diagram which you cannot translate, explain why (naturally,
your task is to avoid this situation as best you can).

4. Write in a file called delete.sql, SQL statements to delete the tables correctly.

The files have to run in the local version of Oracle at CS. We will check your files by running your
create.sql file, inserting values (legal and illegal) into the created tables, running queries on these tables,
and deleting them with your delete.sql file. All of this should work properly and behave according to
the required structure and constraints. Documentation in the files themselves is not required unless you
perform something special or unusual.
Create a file called ex2.tar (not jar or anything else) containing create.sql and delete.sql, and submit
it electronically. To create the tar file write: tar -cvf ex2.tar create.sql delete.sql. To decompress
the tar file (if you wish to check it before submission), write tar -xvf ex2.tar.
Notice: make sure that the names of all your files and entities are exactly as specified and that everything
works and is in the correct format. Points will be reduced for files failing automatic checks.

Problem 2 (20 points)


Consider the relations Students and Courses from question 1 (not any other relations which you might
have created). For each of the following queries, give an expression in relational algebra:

1. Find the name of the student with ID=1335.

2. Find the names of all courses taught by ’Michael Levi’.

3. Find the names of all courses whose name is lexicographically between ’d’ and ’g’.

4. Find the ID’s of all students who have the same name as some other student.

Problem 3 (20 Points)


R, S and T are relations. For each of the following equivalences, state if it is correct and prove it or give
a counter example.
Hint: In order to prove correctness, you can show inclusion in both directions.

1. Associativity for natural join: (R ⋈ S) ⋈ T ≡ R ⋈ (S ⋈ T)

2. σ_{C∧D}(R) ≡ σ_C(R) ∩ σ_D(R), where C and D are arbitrary conditions about the tuples of R.

3. σ_c(R − S) = σ_c(R) − σ_c(S)

4. π_1(R − S) = π_1(R) − π_1(S)

σ_c means selection over some condition c, and π_1 means projection of the first column. Notice that if
the equivalence is correct, it should be correct for any condition c, and for any content of the tables.

Exercise 3
Due: November 29th, Ross closing time

Again, don’t be alarmed by the length of this document, it is detailed in order to avoid confusions
and the actual amount of code you need to write is very little. In the following exercise you will practice
alteration (changing) of tables, insertion of tuples to the tables, and querying your database. Download
the file create.sql from the site. This file includes commands for creating tables which correspond to the
diagram you represented in the previous exercise. Notice that it’s a simplified version: not all constraints
mentioned in the previous exercise are captured in these tables.
Drop all existing tables from your local DB. Run the create.sql file which you downloaded and view the
created tables.
Tips: Remember the ’describe’ command, and the ’cat’ table which holds the names of all your tables.

Table alteration ( 25 points )


Create a file called alter.sql. In this file you will write SQL commands which will alter the created tables.
Notice: Do not make changes to the create.sql file. This is meaningless because we will create the
tables with the original create.sql file and then run your alter.sql file.
In the file alter.sql you should write commands which alter the tables so that the following demands are
met:

1. Delete the Address column from table Students.

2. Add a column named Faculty which holds data of type varchar2(20) to the Courses table.

3. Change the definition of the table Courses so that names of teachers with up to 40 chars will be
accepted.

Assume that the tables are empty before running your alter.sql. We will check this part by creating the
tables with the published create.sql, then running your alter.sql and then checking your alterations by
trying to insert various values to the tables. You should check your alterations in the same way.

Inserting tuples ( 20 points )


Write a file called insert1.sql. In this file you should write SQL commands which insert the following
tuples to the modified Students table:

('0123456', 'Mike Roth', 'mike@zmail.com')
('6543210', 'Bill Morris', 'bill@smail.co.il')
('0246810', 'Pete Baily', 'jimmy@pmail.com')
('1086420', 'Jo Katz', 'jo@bmail.com')

The bulk Loader ( 15 points )


Now we would like to insert the same tuples using the Oracle Bulk Loader. First empty your Students
table (you can assume we will also run your file on an empty table). Write a file called insert2.ctl, which
will be the control file and will contain the data as well. Notice that in the data part, character strings
should not be enclosed in quotation marks ('). Run your control file and view the added tuples.

Queries ( 40 points )
In the last part of the exercise you are asked to write queries about data in the modified tables. Create
a set of files called query1.sql, query2.sql, ..., query5.sql. In each of these files you should write a
single query as follows:

1. In file query1.sql write a query which finds the names of all students whose name starts with the
letter ’D’ (Case sensitive).

2. In file query2.sql write a query which finds the names of all students who study a course which
belongs to the ’Math’ faculty.

3. In file query3.sql write a query which finds the names of all students who study more than one
course.

4. In file query4.sql write a query which finds the number of students whose email account is in
hotmail: their email address contains the string ’@hotmail.’, and there is at least one char before
and after this string.

5. In file query5.sql write a query which finds the names of all students who are married, where both
they and their partners (husbands/wives) study course number 23456. (You can assume that every
married couple appears in the Married table twice: once when the first partner is in studIDA and
the second in studIDB, and once in reverse order.)

Note:

1. Your queries should return the right answer for any legal content of the tables, not just for the
tuples you have inserted in the previous section. We will check this section by entering many
different tuples (all legal) to the tables, and then running your queries.

2. All queries must return distinct tuples and returned tuples must appear in ascending alphabetical
order according to the StudName attribute

3. There is more than one way to perform some of the queries. You can choose any way you like. The
only thing we will check is if your query finds the right tuples (we will not take off points for style,
efficiency, etc.)

Create a tar file called ex3.tar which includes the files alter.sql, insert1.sql, insert2.ctl, query1.sql,
query2.sql, query3.sql, query4.sql, query5.sql and submit ex3.tar electronically.

Good luck!

Exercise 4
Due: December 6, 2005. Midnight

This exercise has 3 problems. It should be submitted to the course box. Do not forget to write
your login!!

Problem 1: Different Axioms (35 points)


Armstrong’s axioms for inferring functional dependencies are one example of an axiom system. For each
of the three axiom systems below, answer the following questions:

• Is the axiom system sound? Prove your answer.

• Is the axiom system complete? Prove your answer.

Hint: You may use soundness and completeness of Armstrong’s axioms in your positive proofs. For
negative proofs, either demonstrate a contrary example, or use induction on the length of derivation.

1. Suppose that X, Y , Z and C are (possibly empty) sets of attributes. The first axiom system
consists of the following three axioms:

⇒ X→X (B1)

X →YZ ⇒ X →Y (B2)

X → Y Z, Z → C ⇒ X → Y ZC (B3)

2. Suppose that X and Y are (possibly empty) sets of attributes. The second axiom system consists
of the following single axiom:

⇒ X→Y (C1)

3. Suppose that X, Y , and Z are (possibly empty) sets of attributes. The third axiom system consists
of the following two axioms:

⇒ X →X (D1)

X → Y, Y → Z ⇒ X → Z (D2)

Problem 2: Computing the Closure of a Set of Attributes (25 points)

Let F be a set of functional dependencies on a set of attributes U , and let X be a subset of U . The
following algorithm was given during the lecture for computing the closure of X:


ComputeClosure(X, F)
    C := X
    while there is a V → W ∈ F such that V ⊆ C and W ⊈ C, do
        C := C ∪ W
    return C

1. Prove the soundness of the algorithm.

2. Give the semantic proof of completeness of the algorithm.


Hint: See the lecture notes for proof sketch.
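
To make the algorithm concrete before proving anything about it, here is a minimal Python sketch of ComputeClosure. The representation of F as a list of (V, W) pairs of attribute strings and the example dependencies are assumptions made for illustration only; they are not taken from the lecture or from this exercise.

```python
def compute_closure(attrs, fds):
    """Return the closure of `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs; each side is an iterable of
    single-character attribute names (an assumed encoding).
    """
    closure = set(attrs)          # C := X
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply V -> W when V is contained in C but W is not yet.
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)   # C := C ∪ W
                changed = True
    return closure


# Hypothetical dependencies: A -> B, B -> C, CD -> E.
example_fds = [("A", "B"), ("B", "C"), ("CD", "E")]
closure_of_a = compute_closure("A", example_fds)
closure_of_ad = compute_closure("AD", example_fds)
```

In this toy instance the closure of A stops at {A, B, C} because CD → E never becomes applicable without D, while the closure of AD reaches all five attributes.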

Problem 3: Relational Algebra (40 Points)


1. Show that the operator ÷ is redundant, i.e. rewrite it using other operators of relational algebra
(∪, ×, π, −, σ).

2. We would like to show that each of the five operators (∪, ×, π, −, σ) of relational algebra is independent,
i.e., each operator cannot be expressed as a formula involving only the four other operators.
To help you, we have provided the proof for ∪.
Show the required for ×, π, −, σ. You have to write the full proof for at least one of them, and the
proof sketches for the others.

Proof for union: Let R be a relation containing a single tuple (a) and S be a relation containing
a single tuple (b). Clearly, the union of R and S contains 2 tuples. We show that any expression
that does not contain ∪ results in a relation that has at most one tuple. This will imply that union
cannot be expressed using any of the other operators. The proof will be by induction on the number
of operators in the expression.

For the base case, consider an expression with no operators. This expression can only be either R
or S, and clearly contains only one tuple. For the induction step, suppose that E is an expression
with k operators, where k > 0. Then, E is one of the following expressions:

• E = E1 × E2, where E1 and E2 are each expressions with at most k − 1 operators. By the
induction hypothesis, E1 and E2 result in relations with at most one tuple. The number of
tuples in E is the product of the number of tuples in E1 and in E2. Hence, clearly there is at
most one tuple in E.
• E = πL (E1 ), where E1 is an expression with at most k − 1 operators and L is a list of columns.
By the induction hypothesis, E1 has at most one tuple. The operator π can only decrease the
number of tuples, hence, the required holds.
• E = σC (E1 ), where E1 is an expression with at most k − 1 operators and C is a condition.
By the induction hypothesis, E1 has at most one tuple. The operator σ can only decrease the
number of tuples, hence, the required holds.

• E = E1 − E2 , where E1 and E2 are each expressions with at most k − 1 operators. By the
induction hypothesis, E1 has at most one tuple. Clearly, E cannot have more tuples than E1 .
Thus, the required holds.

We have shown the required for any k. Hence, ∪ cannot be expressed by the other operators in
relational algebra.

Exercise 5
Due: Wednesday, 14.12.05, Ross closing time

Problem 1 ( 100 Points )


You are the DB administrator of the exclusive ’Granada’ college. The college would like to upgrade its
administration system so that each student will be able to view his personal information details, as well
as viewing some details about his friends.
Note the following:

1. Each student receives a login when he registers to the college. The first letter of the login states
the year number in college. For example, logins of first year students start with ’A’, second year
student logins start with ’B’, etc. (CDanny is a third year student, DMoshe is a fourth year
student, etc.)

2. The logins are unique (no 2 students have the same login).

3. The Oracle Database of Granada currently holds the following tables:


Students(studid, studName, studLogin, studRating)
Courses(courseNum, CourseName)
Grades(studid, courseNum, grade)

4. Students lists all students in the college. studLogin is the login of the student. studRating is
an integer which indicates the general evaluation of the student so far. All fields are NOT NULL.
studLogin is unique.

5. Courses lists all courses taught in the college. CourseName is Not Null.

6. Grades lists for all students, the courseNum of the courses they have taken, and the grade (integer)
in each such course. There are foreign keys from studid in Grades table to studid in Students
table, and from courseNum in Grades table to courseNum in Courses table.

7. The students DO NOT have access to these tables

Your task is to create the following views which the students will be able to query:

1. myDetails

2. myGrades

3. myGroupDetails

4. myGroupCourses

You are asked to meet the following requirements (the requirements are for the general case when the
student presents a query of the form “select * from <view>”, where <view> is any of the 4 views):

1. A student queries the view myDetails to get his own details. You are asked to let him see all his
details as they appear in the Students table, apart from the studRating. He should NOT be able
to see anyone else’s details.

2. A student queries the myGrades view to view his own grades. He should see a list of the Course
Names he has taken and the grade he got in each of those courses. He should NOT be able to see
anyone else’s grades.

3. A student queries the view myGroupDetails to see some details of his fellow students. Every
student should be able to see only data about the students who study in the same year-group as
himself. He should only be able to see the names and logins of these students (again, only of the
students who started the college the same year he did).

4. A student queries the myGroupCourses view to see a list of all courses taken by (any) students
in his year-group. Selecting * from myGroupCourses should show one column of distinct course
names which are taken by students in the year group of the querying student.

5. All views should return the result in ascending order according to the first (leftmost)
column

Tips and guidelines:

1. The word user is reserved to indicate the login of the current user (in our DB as well as in Granada’s
DB). Try to insert user instead of a varchar to a table.

2. The SQL function substr might be useful.

3. A student can query the views in any way he likes, but it will be convenient for you to think of
the query “select * from <view>” when building the view.

4. All view columns should be created in the order they are mentioned above.

5. Submit a single file called ex5.sql which includes only creation commands for the 4 views, one after
the other (do not submit creation commands for the tables). Running this file should create the
correct views.

6. We will test your views by querying them from different logins, and for different contents of the
underlying tables.

7. No documentation is required unless you perform something very special.

Good luck!

Exercise 7
Due: January 5, Ross closing time
This exercise has 4 problems. To be submitted to the course box. Remember to write your login!
All answers must be substantiated and the work shown.

Problem 1: Lossless Join (10 Points)


Let R = ABCD and F = {D → A, AB → C, D → B}. Let P be the decomposition {AC, AD, BD}.
Does P have a lossless join with respect to F ?

Problem 2: Redundancies - BCNF vs. 3NF (25 points)


1. Formulate precisely what it means that a relation contains a redundancy caused by functional
dependencies.

2. Using your formulation prove that if a schema is in BCNF it does not contain redundancies caused
by functional dependencies.

3. Let R = ABC. Produce a set F of functional dependencies on R and a relation in R that satisfies
F such that R is in 3NF, but the relation contains redundancies caused by F .

Problem 3: Algorithm for Computing a Key (25 points)


Given a schema R and a set of functional dependencies F, we would like to compute some key of R. Prove
that the following algorithm is correct:

ComputeKey(S = {A_1, ..., A_n}, F)
    K := S
    for i = 1...n do
        if F ⊨ K \ {A_i} → A_i then
            K := K \ {A_i}
    return K
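
The algorithm above can be tried out with a short Python sketch. It assumes the standard fact that F ⊨ X → A holds exactly when A belongs to the closure of X under F, and it uses a hypothetical encoding (attribute strings, dependencies as (lhs, rhs) pairs) and hypothetical example dependencies that do not come from this exercise.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) attribute iterables."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def compute_key(schema, fds):
    """Follow ComputeKey: try to drop A_1, ..., A_n in order."""
    key = list(schema)                      # K := S
    for a in schema:                        # for i = 1...n
        rest = [x for x in key if x != a]   # K \ {A_i}
        # F |= K \ {A_i} -> A_i  iff  A_i is in the closure of K \ {A_i}
        # (assumed equivalence, see the lead-in).
        if a in closure(rest, fds):
            key = rest
    return set(key)


# Hypothetical inputs, for illustration only.
key_chain = compute_key("ABC", [("A", "B"), ("B", "C")])
key_cycle = compute_key("ABC", [("A", "B"), ("B", "A")])
```

With A → B and B → A the sketch returns {B, C} rather than {A, C}: the algorithm finds some key, and which one depends on the order in which attributes are visited.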

Problem 4: Design Theory (40 Points)


Let R = ABCDEH and F = {AB → CE, CE → D, BE → AC, D → C, C → BDE}.

1. Find a non-redundant cover of F .

2. Find all the keys for R.

3. Give a decomposition of R in 3NF that has a lossless join and preserves functional dependencies.

4. Give a decomposition of R in BCNF that has a lossless join.

5. Does your decomposition preserve functional dependencies?

Exercise 8
Due: Sunday, January 22, Ross closing time

This exercise consists of programming in PL/SQL, and should be submitted electronically. Please run
your code one more time just before submission to check that it works!

The year is 2253. Your academic plans did not go as you planned, and instead, you have chosen to
be a Space Taxi Driver. Since you took the DB course during your studies, you have been asked by the
boss to assist in constructing the taxis’ Database.
The Space Taxi holds the following tables (They already exist):

1. Planets: This table holds names of all planets in the galaxy, and their position in space in X,Y,Z
coordinates. Apart from all planets, there is one entry for the taxi current location, called
CurLocation.

2. ToDoRides: This table holds the rides that are offered to all taxis by the office. Instead of asking
all taxis ’mi panuy be...’, the office inserts the ride details into this table, and you as a driver can
choose to take it or not. The ride details are: the RideID, the pickup planet (FromPlanet), the
drop-off planet (ToPlanet), and Status. For every ride, Status is set to ’wait’ if no driver has taken
it, or else, it is set to the USER login of the taxi driver who has taken it (if it is you, that would
be your real USER login here in CS).

3. DoneRides: This table lists all the rides that the Taxi has made. It holds the RideID, FromPlanet,
and ToPlanet.

4. RideScore: This table is meant to help the driver decide which Ride to take by assigning a score
to the drives which appear in ToDoRides. The RideScore table holds a list of rides which appear in
table ToDoRides which are in ’wait’ status, and for every Ride it holds a score. This score shows
how appealing the ride is. For example, if it is a short ride and the FromPlanet is very far away
from the taxi current location, it would get a low score, and vice versa. To make the life of the
driver easy, you will be asked to make sure that no more than 10 tuples appear in the RideScore
table at any time (details below).

The structure of the tables with example content is presented:

PlanetName     X      Y      Z
CurLocation    2328   3225   4909
FishPlanet     3288   3245   -1901
Earth          -1899  1223   3332
MonkeyPlanet   -9887  1122   3229

Table 1: Planets

RideID   FromPlanet   ToPlanet     Status
233      FishPlanet   Earth        wait
234      Earth        FishPlanet   MOSHE

Table 2: ToDoRides

RideID   FromPlanet   ToPlanet
233      FishPlanet   MonkeyPlanet

Table 3: DoneRides

RideID   Score
233      9889

Table 4: RideScore

You are asked to write the following:

1. Write a function called dist(PlA,PlB) in a file called dist.sql, which accepts names of two planets,
and returns the (Euclidean) distance between them. The returned value is the closest integer to
the real distance.

2. Write a function called CalcScore(FromPl, ToPl) in a file called CalcScore.sql, which calculates
the score of a ride. The function accepts the names of two planets as input. The score of the ride is
computed as follows: dist(FromPlanet, ToPlanet) − dist(CurLocation, FromPlanet). Explain in
a file called answer.txt why this score evaluates the profitability of a ride. Bonus: What is (are)
the disadvantage(s) of this kind of score? How would you change it?

3. Since the office only inserts tuples into the ToDoRides table, you are asked to write a trigger
called ToDoTrig in a file called ToDoTrig.sql, which fires whenever a tuple is inserted into the
ToDoRides table, and inserts the RideID into the RideScore table as well. The trigger should insert
the RideID into the RideScore table together with the score of the ride, which should be computed
with the CalcScore function you wrote.

4. Write a procedure called TakeRide(RID integer) in a file called TakeRide.sql which accepts a
RideID (integer) as input. This procedure indicates that the driver wants to take the ride with
RideID, which appears in the ToDoRides table. This procedure should thus change the status of
this ride from ’wait’ to the taxi driver’s login (USER).

5. Write a procedure called FinishRide(RID integer) in a file called FinishRide.sql which accepts
a RideID as input. This procedure indicates that the driver has completed this ride. This procedure
should thus insert the details (RideID, FromPlanet, ToPlanet) of the ride into DoneRides, and delete
this ride from the ToDoRides table and from the RideScore table.

6. Write a trigger called RideScoreTrig in a file called RideScoreTrig.sql which makes sure that no
more than 10 rides are shown in the RideScore table, as follows: This trigger should fire whenever
there are more than 10 tuples in the table, and it should delete the tuple (or tuples) with the lowest
score in the table.
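
As a quick check of the arithmetic behind items 1 and 2, the distance and score formulas can be sketched in Python. This is only an illustration of the formulas, not the required PL/SQL; the `planets` dictionary is a hypothetical stand-in for the Planets table, with made-up coordinates chosen so that the distances come out as whole numbers.

```python
import math

# Hypothetical stand-in for the Planets table: name -> (X, Y, Z).
planets = {
    "CurLocation": (0, 0, 0),
    "P1": (3, 4, 0),
    "P2": (3, 4, 12),
}


def dist(pl_a, pl_b, table=planets):
    """Euclidean distance between two planets, rounded to an integer."""
    (x1, y1, z1), (x2, y2, z2) = table[pl_a], table[pl_b]
    real = math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2)
    # Python's round() sends exact .5 ties to the even integer.
    return round(real)


def calc_score(from_pl, to_pl, table=planets):
    """dist(FromPlanet, ToPlanet) - dist(CurLocation, FromPlanet)."""
    return dist(from_pl, to_pl, table) - dist("CurLocation", from_pl, table)


score_p1_p2 = calc_score("P1", "P2")
```

Here the ride from P1 to P2 has length 12 and the pickup point is 5 away from the taxi, so the score is 7; swapping the roles of the two planets gives a negative score.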

Here are a few important notes and technical tips which should save you a lot of time. Some of the
syntax is unique to the local DB.

1. When creating a function or a procedure, do not write the word ’declare’. Just declare the variables
between the function/procedure definition and the ’begin’.

2. Do not specify length of varchar2 in the input or output of functions/procedures.

3. You might find the PL/SQL sqrt and power functions useful. Notice that sqrt returns the closest
integer. That would be comfortable for you when writing the dist function which should return the
closest integer.

4. If you want to run a procedure which you have written from the sqlplus shell, use the following
syntax (for the TakeRide procedure as an example): execute TakeRide(RideID). (This actually
adds a ’begin’ before the command and a ’end; /’ after and runs it as a PL/SQL block).

5. This is not P-Lab. You can assume legal input for all tables: no null values and correct data types.
You can assume RideID is unique (Naturally, FromPlanet and ToPlanet are not unique for example)

6. You can implement some of the required tasks in various ways. However, we will only run the files
we specified, so your solution can’t depend on other functions (they will not be created).

7. We will take off points for style or efficiency only in extreme cases. No documentation is required
unless you do something very special.

8. Pay attention to correct naming of functions, input variables, and files. Pay attention to correct
usage of functions, valid tar file, and that everything works properly.

9. We will check your functions, procedures and triggers independently of each other and together, on
different (legal) input.

Submit a tar file called ex8.tar which contains the following files: dist.sql, CalcScore.sql, ToDoTrig.sql,
TakeRide.sql, FinishRide.sql, RideScoreTrig.sql, answer.txt.

Good luck!

Exercise 9
Due: Tuesday, 31.1.06

This exercise has 4 problems. It should be submitted to the course box. Remember to write your
login!

Problem 1: Join Optimization (35 points)


Suppose you have tables Students(sid, sname, major) that contains info about students and Courses(cid,
sid) that contains info on who takes which courses. Both tables may contain additional columns.

Consider the following query:

select S.sname, C.cid
from Students S, Courses C
where S.major = 'CS' and C.sid = S.sid

For each of the following cases, you should specify the indexes that you think should be built in order
to efficiently evaluate the query and the join method that you would choose for the evaluation. You
should assume that your query processor only implements index nested loops join and block nested loops
join. You should present an intuitive explanation of your choice, and why you think that it would be
efficient. If you make any assumptions, state them clearly.

1. Suppose that there are few students that major in CS and you may build as many indexes as you
want, but at most one may be clustered.

2. Suppose that there are few students that major in CS and you may build as many indexes as you
want, but none may be clustered.

3. Suppose that there are many students that major in CS (every course is taken by several students
that major in CS) and you may build as many indexes as you want, but at most one may be
clustered.

4. Suppose that there are many students that major in CS (every course is taken by several students
that major in CS) and you may build as many indexes as you want, but none may be clustered.
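
For readers who want to recall how the two available join methods differ, here is a small in-memory Python sketch of both. It is an illustration under assumptions: the rows are made up, a dictionary stands in for an index on Courses.sid, and `block_size` is a hypothetical parameter playing the role of the buffer size. It does not say which method or index is preferable in any of the four cases.

```python
from collections import defaultdict

# Made-up rows: Students(sid, sname, major) and Courses(cid, sid).
students = [(1, "Ann", "CS"), (2, "Bob", "Math"), (3, "Cy", "CS")]
courses = [(101, 1), (102, 1), (101, 2), (103, 3)]


def block_nested_loops_join(students, courses, block_size=2):
    """Scan Courses once per block of selected Students rows."""
    selected = [s for s in students if s[2] == "CS"]
    result = []
    for start in range(0, len(selected), block_size):
        block = selected[start:start + block_size]
        for cid, sid in courses:            # one full scan of the inner table
            for s_sid, sname, _major in block:
                if s_sid == sid:
                    result.append((sname, cid))
    return result


def index_nested_loops_join(students, courses):
    """Probe an index on Courses.sid once per selected Students row."""
    index = defaultdict(list)               # stand-in for an index on sid
    for cid, sid in courses:
        index[sid].append(cid)
    result = []
    for sid, sname, major in students:
        if major == "CS":
            for cid in index.get(sid, []):  # index lookup instead of a scan
                result.append((sname, cid))
    return result


bnl_rows = sorted(block_nested_loops_join(students, courses))
inl_rows = sorted(index_nested_loops_join(students, courses))
```

Both functions return the same rows; what differs is how often the inner table is touched, which is the quantity the cost comparison in each case depends on.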

Problem 2: Execution Plan (15 points)


In sqlplus create the following table:

CREATE TABLE R (
a numeric,
b numeric,
c numeric PRIMARY KEY);

Create PLAN TABLE and set autotrace on (please see tirgul lecture 11, slide 5 on how to do it).
Then execute the following query:

SELECT DISTINCT R.a, R.b
FROM R;

Transcribe the execution plan and explain it.

Problem 3 (25 points)


Supermarket “Fresh n’ Cheap” holds a database of the purchases performed during the last week:

transid   date      item
110       3.1.06    coffee
110       3.1.06    corn flakes
110       3.1.06    sugar
110       3.1.06    milk
111       5.1.06    bread
111       5.1.06    sugar
111       5.1.06    salmon
112       7.1.06    steak
112       7.1.06    coffee
112       7.1.06    sugar
112       7.1.06    bread
113       7.1.06    milk
113       7.1.06    sugar
113       7.1.06    coffee
113       7.1.06    corn flakes
114       10.1.06   corn flakes
114       10.1.06   milk

1. Find all itemsets with support >= 0.6. Do it using the algorithm described in the TA lecture and
describe the stages of the algorithm. How many scans of the table are needed?

2. Find all association rules of the form LHS => RHS with support >= 0.6 and confidence >= 0.6.
Do it using the algorithm described in the TA lecture and describe the stages of the algorithm.
State and show the support and confidence of each rule.

3. Finding association rules with absolute values like support=1 or confidence=1 is considered rare.
Why? What might be the reason for getting such values? Explain.
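
As a reminder of the two measures used in this problem, the following Python sketch computes them on a made-up toy dataset (not the supermarket table). It assumes the common definitions: the support of an itemset is the fraction of transactions containing it, and the confidence of LHS => RHS is support(LHS ∪ RHS) / support(LHS). If the TA lecture defines them differently, the lecture's definitions take precedence.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(lhs, rhs, transactions):
    """support(LHS ∪ RHS) / support(LHS), under the assumed definition."""
    both = set(lhs) | set(rhs)
    return support(both, transactions) / support(lhs, transactions)


# Made-up transactions, each a set of items.
toy = [
    {"tea", "honey"},
    {"tea", "honey", "lemon"},
    {"tea"},
    {"lemon"},
]

sup_tea = support({"tea"}, toy)
sup_tea_honey = support({"tea", "honey"}, toy)
conf_tea_honey = confidence({"tea"}, {"honey"}, toy)
```

In the toy data, tea appears in three of four transactions and tea together with honey in two, so the rule tea => honey has support 0.5 and confidence 2/3.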

Problem 4 ( 25 Points)
Going back to Jim’s cows from the TA lecture, consider the following relation concerning the cows’ data
and Rating:

Name       numCalves(CV)   MilkAverage(MA)   Weight(WT)   Rating
Augusta    4               12                200          5.4
April      6               13                300          7.6
Juliette   2               5                 400          5
Romeo      0               24                200          5.8
Frida      8               15                220          8

The sum of the square errors is defined to be:

\sum_i \left( Rating_i - (W_0 + W_1 \cdot CV_i + W_2 \cdot MA_i + W_3 \cdot WT_i) \right)^2

where i ranges over the rows (cows) of the relation.
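
The expression above is straightforward to evaluate mechanically, which may help when checking hand calculations. The sketch below is a plain Python evaluation of the sum over the five rows of the relation; the all-zero weights are only a baseline for illustration, not a proposed answer.

```python
# Rows of the relation: (CV, MA, WT, Rating).
cows = [
    (4, 12, 200, 5.4),   # Augusta
    (6, 13, 300, 7.6),   # April
    (2, 5, 400, 5.0),    # Juliette
    (0, 24, 200, 5.8),   # Romeo
    (8, 15, 220, 8.0),   # Frida
]


def sum_squared_errors(w0, w1, w2, w3, rows=cows):
    """Sum over rows of (Rating - (w0 + w1*CV + w2*MA + w3*WT))^2."""
    return sum(
        (rating - (w0 + w1 * cv + w2 * ma + w3 * wt)) ** 2
        for cv, ma, wt, rating in rows
    )


# Baseline: with all weights zero, the sum is just the sum of squared ratings.
baseline = sum_squared_errors(0, 0, 0, 0)
```

The baseline comes to about 209.56, which gives a sense of how far a candidate set of weights has to bring the error down.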

1. Describe the meaning of this expression.

2. Propose W_0, W_1, W_2, W_3 so that the sum of the square errors is <= 10 for the above relation. Show
your calculation for the sum of the square errors. Notice that we did not learn how to find these
weights algorithmically, so you can find them in any way, including pure intuition.

3. Using your proposed weights from above, and assuming you only want to buy one cow, which cow
of the following would you buy if they all cost the same? Why? Show your calculation.
a. Hava has 3 calves, 5 Litres of Milk Average, and weighs 300 kilos.
b. Alberta has 5 calves, 10 Litres of Milk Average, and weighs 400 kilos.
c. Mona has 8 calves, 0 Litres of Milk Average, and weighs 300 kilos.
d. Paulina has 2 calves, 15 Litres of Milk Average, and weighs 250 kilos.
e. Admonit has 7 calves, 10 Litres of Milk Average, and weighs 150 kilos.
