Exercise 6: QEPs & Cost Estimation

(1) Query Execution Plans (QEPs)

Draw as many QEPs as you can for the following queries (you are allowed to make additional assumptions). How would you decide which QEP is the most efficient?

a)

SELECT S.sname
FROM Student S
WHERE S.major = 'CS' and age<20

b)

SELECT S.sname
FROM Student S
WHERE S.snum IN (SELECT E.snum 
                 FROM Enrolled E)

c)

SELECT F.fname
FROM Faculty F
WHERE 5 > (SELECT COUNT(E.snum)
           FROM Class C, Enrolled E
           WHERE C.name = E.cname AND
           C.fid = F.fid)

(2) DSCB Ex. 15.6.2.

Let B(R) denote the number of blocks of a relation R, T(R) the number of tuples of a relation R, and V(R.a) the number of distinct values of the attribute R.a.

Suppose B(R) = 10,000 and T(R) = 500,000. Let there be an index on R.a, and let V(R.a) = k for some number k. Give the cost of σa=0(R), as a function of k, under the following circumstances. You may neglect disk I/O’s needed to access the index itself.

a) The index is clustering

b) The index is not clustering

c) R is clustered, and the index is not used.

(3) DSCB Ex. 15.6.3.

Repeat Ex. 15.6.2. if the operation is the range query σC<=a AND a<=D(R). You may assume that C and D are constants such that k/10 of the values are in the range.