Lecture No 04
9/18/2014
"God made the integers; all else is the work of man."
(Leopold Kronecker, 19th Century Mathematician)
"Codd made relations; all else is the work of man."
(Raghu Ramakrishnan, DB textbook author)
Pre-Relational: if your data changed, your application broke.
Early RDBMS were buggy and slow (and often reviled), but required only 5% of the application code.
"Activities of users at terminals and most application programs should remain unaffected when the internal representation of data is changed and even when some aspects of the external representation are changed." -- Codd 79
Key Ideas: Programs that manipulate tabular data
exhibit an algebraic structure allowing reasoning
and manipulation independently of physical data
representation
N = ((z*2)+((z*3)+0))/1
Algebraic Laws:
1. (+) identity: x+0 = x
2. (/) identity: x/1 = x
3. (*) distributes: (n*x+n*y) = n*(x+y)
4. (*) commutes: x*y = y*x
Apply rules 1, 3, 4, 2:
N = (2+3)*z
Two operations instead of five, no division operator.
Same idea works with the Relational Algebra!
Rule of thumb:
Every paper will assume set semantics
Every implementation will assume bag semantics
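A small sketch of the difference between the two semantics, using plain Python collections as stand-ins for relations (the sample rows are made up for illustration):

```python
from collections import Counter

# Two single-column "relations"; r1 contains a duplicate row.
r1 = ["a", "a", "b"]
r2 = ["b", "c"]

# Set semantics: duplicates disappear.
set_union = set(r1) | set(r2)

# Bag semantics: multiplicities add up, as with SQL's UNION ALL.
bag_union = Counter(r1) + Counter(r2)
bag_rows = sorted(bag_union.elements())
```

Under set semantics the union has three rows; under bag semantics it has five, because "a" and "b" each appear twice.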
Union: R1 ∪ R2
SELECT * FROM R1
UNION ALL
SELECT * FROM R2
Difference: R1 − R2
SELECT * FROM R1
EXCEPT
SELECT * FROM R2
Intersection: derived operator using minus
Selection: returns all tuples which satisfy a condition
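One way to see intersection as a derived operator is the identity R1 ∩ R2 = R1 − (R1 − R2). The sketch below checks it with Python's sqlite3 module on made-up tables; the single-column schema and the sample rows are assumptions, and SQLite's EXCEPT and INTERSECT both use set semantics.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R1 (a INTEGER);
    CREATE TABLE R2 (a INTEGER);
    INSERT INTO R1 VALUES (1), (2), (3);
    INSERT INTO R2 VALUES (2), (3), (4);
""")

# Intersection written using only difference: R1 - (R1 - R2).
derived = sorted(conn.execute("""
    SELECT a FROM R1
    EXCEPT
    SELECT a FROM (SELECT a FROM R1 EXCEPT SELECT a FROM R2)
""").fetchall())

# The built-in operator, for comparison.
builtin = sorted(conn.execute(
    "SELECT a FROM R1 INTERSECT SELECT a FROM R2").fetchall())

# Selection: keep only the tuples that satisfy a condition.
selected = conn.execute(
    "SELECT a FROM R1 WHERE a >= 2 ORDER BY a").fetchall()
```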
Projection: eliminates columns
π_{A1,…,An}(R)
Join: R1 ⋈ R2
Two equivalent ways to write it in SQL:
SELECT *
FROM R1 JOIN R2
ON R1.A = R2.B
and
SELECT *
FROM R1, R2
WHERE R1.A = R2.B
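A quick check that the two join spellings return the same rows, again using sqlite3 with hypothetical tables (the extra columns x and y and the sample rows are assumptions). The last two queries illustrate projection with and without duplicate elimination.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R1 (A INTEGER, x TEXT);
    CREATE TABLE R2 (B INTEGER, y TEXT);
    INSERT INTO R1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO R2 VALUES (2, 'p'), (3, 'q'), (3, 'r'), (4, 's');
""")

# Explicit JOIN ... ON syntax.
explicit_join = sorted(conn.execute(
    "SELECT * FROM R1 JOIN R2 ON R1.A = R2.B").fetchall())

# Comma (cross product) plus WHERE filter.
implicit_join = sorted(conn.execute(
    "SELECT * FROM R1, R2 WHERE R1.A = R2.B").fetchall())

# Projection onto A: bag semantics keeps duplicates, DISTINCT removes them.
projected_bag = sorted(conn.execute(
    "SELECT A FROM R1 JOIN R2 ON R1.A = R2.B").fetchall())
projected_set = sorted(conn.execute(
    "SELECT DISTINCT A FROM R1 JOIN R2 ON R1.A = R2.B").fetchall())
```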
Theta join: a join that involves a predicate
R1 ⋈_θ R2 = σ_θ(R1 × R2)
Here θ can be any condition
Example: find all hospitals within 5 miles of a school
π_name(Hospitals ⋈_{distance(location, location) < 5} Schools)
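The definition of a theta join as a selection over a cross product can be sketched directly. Everything concrete here is hypothetical: the rows, the 2-D coordinates standing in for locations (in miles), and Euclidean distance as the distance function.

```python
import math
from itertools import product

# Hypothetical relations: (name, (x, y)) with coordinates in miles.
hospitals = [("General", (0.0, 0.0)), ("Mercy", (10.0, 10.0))]
schools = [("Lincoln", (3.0, 3.0)), ("Adams", (20.0, 20.0))]

def distance(p, q):
    """Euclidean distance between two points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def theta_join(r1, r2, theta):
    """Selection over the cross product: keep pairs where theta holds."""
    return [(t1, t2) for t1, t2 in product(r1, r2) if theta(t1, t2)]

# Join predicate: the two locations are less than 5 miles apart.
near = theta_join(hospitals, schools,
                  lambda h, s: distance(h[1], s[1]) < 5)

# Projection onto the hospital name (set semantics, so no duplicates).
hospital_names = sorted({h[0] for h, _ in near})
```

Enumerating the full cross product is the definition, not an efficient plan; a real system is free to pick a cheaper evaluation strategy as long as the result is the same.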
-- Average each measurement over fixed-width time bins.
SELECT binid,
  round(avg(cast(Fluo as float)),3) as Fluo,
  round(avg(cast(Oxygen as float)),3) as Oxygen,
  round(avg(cast(Nitrate_uM as float)),3) as Nitrate_uM,
  round(avg(cast(longitude as float)),3) as longitude,
  round(avg(cast(latitude as float)),3) as latitude
FROM (
  -- binid: start of the bin containing ts (whole days plus whole bins within the day)
  SELECT *, cast(floor(ts) +
    floor((ts - floor(ts))*24*60/binsize) *
    binsize / (24*60) as datetime) as binid
  FROM (
    -- ts is treated as fractional days (24*60 converts days to minutes); binsize is in minutes
    SELECT *, cast(timestamp as float) as ts, 5.0 as binsize
    FROM Tokyo_4_merged_data_time
  ) x
) bins
GROUP BY binid
ORDER BY binid asc
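The binning arithmetic in the query can be sketched outside the database. This assumes, as the 24*60 factor suggests, that ts is a timestamp in fractional days and binsize is in minutes; the function names and the sample readings are hypothetical.

```python
import math
from collections import defaultdict

MINUTES_PER_DAY = 24 * 60

def bin_start(ts_days, binsize_minutes):
    """Start of the bin containing ts, in fractional days (mirrors binid)."""
    day = math.floor(ts_days)
    minutes_into_day = (ts_days - day) * MINUTES_PER_DAY
    whole_bins = math.floor(minutes_into_day / binsize_minutes)
    return day + whole_bins * binsize_minutes / MINUTES_PER_DAY

def average_by_bin(samples, binsize_minutes=5.0):
    """Group (ts_days, value) pairs by bin and average each group to 3 places."""
    groups = defaultdict(list)
    for ts, value in samples:
        groups[bin_start(ts, binsize_minutes)].append(value)
    return {b: round(sum(v) / len(v), 3) for b, v in sorted(groups.items())}

# Hypothetical readings taken at 12:01, 12:04 and 12:06 on day 0.
samples = [
    (0.5 + 1 / MINUTES_PER_DAY, 1.0),
    (0.5 + 4 / MINUTES_PER_DAY, 3.0),
    (0.5 + 6 / MINUTES_PER_DAY, 10.0),
]
averages = average_by_bin(samples)
```

The first two readings fall in the bin starting at 12:00 and the third in the bin starting at 12:05, so the result has two entries.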