
Dr. J.J. Magdum College of Engineering, Jaysingpur.

Department of Information Technology


Assignment for BE (IT) Student    Subject: Advanced Database Systems
Assignment No. 2    Subject Teacher: Prof. Nadaf A.P
Date of Assignment: 22/07/2011
Deadline Date for Submission: 28/07/2011
Date of returning corrected copies: 01/08/2011

For Batch B1
Q1. For each of the three partitioning techniques, namely round-
robin, hash partitioning, and range partitioning, give an
example of a query for which that partitioning technique would
provide the fastest response.
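
Note: as a reminder of what the three techniques in Q1 do, the placement rules can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names are hypothetical, the modulo on an integer key stands in for a real hash function, and the range rule assumes a sorted vector of cut points. It deliberately does not answer the question itself.

```python
from bisect import bisect_right


def round_robin_partition(tuple_index, n_disks):
    """Send the i-th tuple (in insertion order) to disk i mod n."""
    return tuple_index % n_disks


def hash_partition(key, n_disks):
    """Send a tuple to the disk selected by hashing its partitioning attribute.

    Assumption: the key is an integer and 'key mod n' stands in for the hash.
    """
    return key % n_disks


def range_partition(key, cut_points):
    """Send a tuple to disk i, where i is the number of cut points <= key.

    With cut points [v0, v1], keys below v0 go to disk 0, keys in [v0, v1)
    go to disk 1, and keys >= v1 go to disk 2.
    """
    return bisect_right(cut_points, key)


# Hypothetical data: 4 disks for round-robin/hash, 3 disks for range.
print(round_robin_partition(5, 4))
print(hash_partition(10, 4))
print(range_partition(25, [10, 20]))
```

Which rule suits a given query depends on whether the query scans the whole relation, looks up one value, or selects a range of values on the partitioning attribute.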
Q2. What form of parallelism (interquery, interoperation, or
intraoperation) is likely to be the most important for each of
the following tasks?
a. Increasing the throughput of a system with many small
queries
b. Increasing the throughput of a system with a few large
queries, when the number of disks and
processors is large
Q3. Describe the benefits and drawbacks of pipelined parallelism.

For Batch B2
Q4. In a range selection on a range-partitioned attribute, it is
possible that only one disk may need to be accessed. Describe
the benefits and drawbacks of this property.
Q5. Some parallel database systems store an extra copy of each data
item on disks attached to a different processor, to avoid
loss of data if one of the processors fails.
a. Why is it a good idea to partition the copies of the
data items of a processor across multiple processors?
b. What are the benefits and drawbacks of using RAID storage
instead of storing an extra copy of each data item?

Q6. What factors could result in skew when a relation is partitioned
on one of its attributes by:
a. Hash partitioning
b. Range partitioning
In each case, what can be done to reduce the skew?

For Batch B3
Q7. With pipelined parallelism, it is often a good idea to
perform several operations in a pipeline on a single
processor, even when many processors are available.
a. Explain why.
b. Would the arguments you advanced in part a hold if the
machine has shared-memory architecture? Explain why or why
not.
c. Would the arguments in part a hold with independent
parallelism? (That is, are there cases where, even if the
operations are not pipelined and there are many processors
available, it is still a good idea to perform several
operations on the same processor?)
Q8. Describe a good way to parallelize each of the following.
a. The difference operation
b. Aggregation by the count operation
c. Aggregation by the count distinct operation
d. Aggregation by the avg operation
e. Left outer join, if the join condition involves only
equality
f. Left outer join, if the join condition involves
comparisons other than equality
g. Full outer join, if the join condition involves
comparisons other than equality


For Batch B4
Q9. Consider join processing using symmetric fragment and replicate
with range partitioning. How can you optimize the evaluation
if the join condition is of the form |r.A − s.B| ≤ k, where k is
a small constant? Here, |x| denotes the absolute value of
x. A join with such a join condition is called a band join.
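
Note: to make the band-join condition in Q9 concrete, here is a naive single-node sketch that checks |r.A − s.B| ≤ k for every pair of values. It is only a baseline for understanding the predicate; the function name and sample values are hypothetical, and it does not show the parallel optimization the question asks for.

```python
def band_join(r_values, s_values, k):
    """Return all pairs (a, b) with |a - b| <= k, by comparing every pair."""
    return [(a, b) for a in r_values for b in s_values if abs(a - b) <= k]


# Hypothetical attribute values for r.A and s.B, with band width k = 2.
print(band_join([1, 5, 10], [2, 7, 20], 2))  # [(1, 2), (5, 7)]
```

Note that each value of r.A can only match values of s.B that lie within k of it, which is the property a range-partitioned evaluation can exploit.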
Q10. Give an example of a join that is not a simple equi-join for
which partitioned parallelism can be used. What attributes
should be used for partitioning?
Q11. Recall that histograms are used for constructing load-balanced
range partitions.
a. Suppose you have a histogram where values are between 1
and 100, and are partitioned into 10 ranges, 1–10,
11–20, ..., 91–100, with frequencies 15, 5, 20, 10, 10, 5, 5,
20, 5, and 5, respectively. Give a load-balanced range
partitioning function to divide the values into 5
partitions.
b. Write an algorithm for computing a balanced range
partition with p partitions, given a histogram of
frequency distributions containing n ranges.
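
Note: one possible way to approach Q11 is sketched below. It assumes values are spread uniformly inside each histogram range, so a partition boundary that falls inside a range is placed by linear interpolation. The function name and the representation of the histogram as (low, high) pairs are assumptions made for this illustration; other balancing strategies are equally valid.

```python
def balanced_cut_points(ranges, freqs, p):
    """Return p - 1 cut values that split the histogram into p roughly equal parts.

    ranges: sorted list of inclusive (low, high) integer ranges.
    freqs:  number of tuples in each range.
    Assumption: values are uniformly distributed within each range.
    """
    total = sum(freqs)
    target = total / p          # ideal number of tuples per partition
    cuts = []
    cumulative = 0              # tuples seen in ranges already processed
    next_goal = target          # cumulative count at which the next cut falls
    for (low, high), f in zip(ranges, freqs):
        if f == 0:
            continue            # an empty range cannot contain a boundary
        # A single wide range may contain more than one boundary.
        while len(cuts) < p - 1 and cumulative + f >= next_goal:
            fraction = (next_goal - cumulative) / f
            width = high - low + 1
            cuts.append(low - 1 + fraction * width)
            next_goal += target
        cumulative += f
    return cuts


# Histogram from part a: ranges 1-10, 11-20, ..., 91-100.
ranges = [(10 * i + 1, 10 * i + 10) for i in range(10)]
freqs = [15, 5, 20, 10, 10, 5, 5, 20, 5, 5]
print(balanced_cut_points(ranges, freqs, 5))  # [20.0, 30.0, 50.0, 75.0]
```

Under the uniformity assumption, each of the five resulting partitions (values up to 20, 21 to 30, 31 to 50, 51 to 75, and 76 to 100) holds about 20 of the 100 tuples.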
---XX---
