
CMSC424: Review

Database Management Systems


Manage data
  Store data
  Update data
  Answer questions about the data

What kind of data?
  Enterprise data
    Banking, Supermarkets, Sales, Airlines, Universities, Manufacturing, Human resources
  More recent:
    Semi-structured data (XML), Scientific data, Biological data, Sensor network data, etc.

Naïve solutions
Don't offer:
  Consistency, atomicity, durability, etc.
  Concurrency
  Declarative data retrieval
  Control of redundancy
  Dynamic data evolution

DBMS
Database Management Systems provide

Data abstraction
  Key in evolving systems
  Probably the most important purpose of a DBMS
  Goal: Hiding low-level details from the users of the system

Guarantees about data integrity
  In presence of concurrent access, failures

Speed !!

Data Abstraction

View Level (View 1, View 2, ..., View n)
  What data do users and application programs see?

Logical Level
  What data is stored?
  Describes data properties such as data semantics, data relationships

Physical Level
  How is data actually stored?
  E.g., are we using disks? Which file system?
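The three levels can be illustrated with a small sketch using Python's built-in sqlite3 module. The accounts table, its columns, and the owners view are hypothetical names chosen for illustration; they are not part of the course material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: what data is stored (hypothetical accounts table).
conn.execute(
    "create table accounts (acct_no integer primary key, owner text, balance real)"
)
conn.executemany(
    "insert into accounts values (?, ?, ?)",
    [(55, "alice", 300.0), (56, "bob", 120.0)],
)

# View level: one application sees only account owners, not balances.
conn.execute("create view owners as select acct_no, owner from accounts")

# Physical level: how the rows are laid out in pages is left to SQLite;
# neither the table definition nor the view mentions it.
view_rows = conn.execute("select * from owners order by acct_no").fetchall()
print(view_rows)
```

The application querying owners is unaffected if the accounts table later gains columns, which is one way to read "key in evolving systems".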

DBMS at a Glance
1. Data Modeling
2. Data Retrieval
3. Data Storage
4. Data Integrity

Data Modeling

A data model is a collection of concepts for describing data properties and domain knowledge:
  Data relationships
  Data semantics
  Data constraints

We discussed two models:

Entity-relationship Model
  Diagrammatic representation
  Easier to work with
  Syntax not important, but remember the meaning
  Remember what you can model

Relational Model
  Only one abstract concept
  Closer to the physical representation on disk
  Normalization

Data Retrieval

Query = Declarative data retrieval program
  Describes what data to acquire, not how to acquire it

Non-declarative:
  scan the accounts file
  look for number 55 in the 2nd field
  subtract $50 from the 3rd field

Declarative (posed against the tables abstraction):
  update accounts set balance = balance - 50 where acct_no = 55

Why?
  Easier to write
  More efficient to execute
    Database system can decide how to execute it
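A minimal sketch of the contrast, assuming a hypothetical three-field record layout (id, acct_no, balance) and made-up balances. The first half spells out how to find and change the record; the second half states only what should change and lets SQLite decide how.

```python
import sqlite3

# Non-declarative: the program itself walks the "file" record by record.
# Hypothetical layout: [id, acct_no, balance].
accounts_file = [[1, 54, 200], [2, 55, 300], [3, 56, 120]]
for record in accounts_file:      # scan the accounts file
    if record[1] == 55:           # look for number 55 in the 2nd field
        record[2] -= 50           # subtract $50 from the 3rd field

# Declarative: the same change, posed against the tables abstraction.
conn = sqlite3.connect(":memory:")
conn.execute("create table accounts (id integer, acct_no integer, balance integer)")
conn.executemany(
    "insert into accounts values (?, ?, ?)",
    [(1, 54, 200), (2, 55, 300), (3, 56, 120)],
)
conn.execute("update accounts set balance = balance - 50 where acct_no = 55")
declarative_balance = conn.execute(
    "select balance from accounts where acct_no = 55"
).fetchone()[0]
print(accounts_file[1][2], declarative_balance)
```

If an index on acct_no were added later, the update statement would stay the same while the scan loop would have to be rewritten to benefit from it.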

Data Storage
Where and how to store data?
  Main memory?
    What if the database is larger than memory size?
  Disks
    We discussed properties of disks
    RAID
How to move data between memory and disk?
  Buffer Management
    LRU, MRU, Clock
Indexes
  Closely tied to data retrieval
  B+-trees, Hashing
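As an illustration of buffer management, here is a rough sketch of an LRU replacement policy. The LRUBufferPool class and the read_page callable are hypothetical names, and the sketch ignores pinning and dirty-page write-back, which a real buffer manager has to handle.

```python
from collections import OrderedDict


class LRUBufferPool:
    """Toy buffer pool that evicts the least recently used page."""

    def __init__(self, capacity, read_page):
        self.capacity = capacity
        self.read_page = read_page    # hypothetical callable: page_id -> contents
        self.frames = OrderedDict()   # page_id -> contents, oldest use first
        self.disk_reads = 0

    def get(self, page_id):
        if page_id in self.frames:
            # Hit: mark the page as most recently used.
            self.frames.move_to_end(page_id)
            return self.frames[page_id]
        if len(self.frames) >= self.capacity:
            # Miss with a full pool: evict the least recently used page.
            self.frames.popitem(last=False)
        self.disk_reads += 1
        self.frames[page_id] = self.read_page(page_id)
        return self.frames[page_id]


pool = LRUBufferPool(capacity=2, read_page=lambda page_id: f"page-{page_id}")
for page_id in [1, 2, 1, 3, 2]:
    pool.get(page_id)
print(pool.disk_reads, list(pool.frames))
```

With two frames, the access pattern 1, 2, 1, 3, 2 evicts page 2 when page 3 arrives, so the final access to page 2 is a miss. An MRU policy would instead evict the most recently used page, which can be the better choice for repeated sequential scans.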

Data Integrity
Manage concurrency and crashes
Transaction: A sequence of database actions enclosed within special tags
Properties:
  Atomicity: Entire transaction or nothing
  Consistency: Transaction, executed completely, takes database from one consistent state to another
  Isolation: Concurrent transactions appear to run in isolation
  Durability: Effects of committed transactions are not lost
Consistency: Transaction programmer needs to guarantee that
  DBMS can do a few things, e.g., enforce constraints on the data
Rest: DBMS guarantees

Haven't covered in class yet
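Atomicity can be sketched with Python's built-in sqlite3 module, where "with conn:" commits the enclosed statements on success and rolls them back if an exception escapes. The transfer function, the fail_midway flag, and the account balances are hypothetical, and the failure is a simulated exception rather than a real crash.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table accounts (acct_no integer primary key, balance integer)")
conn.executemany("insert into accounts values (?, ?)", [(55, 300), (56, 120)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    # The "with" block plays the role of the special tags around a transaction:
    # commit if the block finishes, roll back if it raises.
    with conn:
        conn.execute(
            "update accounts set balance = balance - ? where acct_no = ?",
            (amount, src),
        )
        if fail_midway:
            raise RuntimeError("simulated failure between the two updates")
        conn.execute(
            "update accounts set balance = balance + ? where acct_no = ?",
            (amount, dst),
        )


def balances(conn):
    return conn.execute(
        "select acct_no, balance from accounts order by acct_no"
    ).fetchall()


try:
    transfer(conn, 55, 56, 50, fail_midway=True)
except RuntimeError:
    pass
balances_after_failure = balances(conn)   # the partial debit was rolled back

transfer(conn, 55, 56, 50)
balances_after_success = balances(conn)   # both updates were applied together
print(balances_after_failure, balances_after_success)
```

Keeping the total of the two balances unchanged is the programmer's consistency obligation here; the DBMS only guarantees that either both updates happen or neither does.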

Data Integrity
Semantic constraints
  Typically specified at the logical level
  E.g. balance > 0
  Assert statements
  Functional dependencies (kinda)
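The balance > 0 example can be sketched as a CHECK constraint, here in SQLite via Python's sqlite3 module. The accounts table and the amounts are hypothetical; the point is only that the constraint is declared once at the logical level and the DBMS rejects any statement that would violate it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The semantic constraint is part of the table definition.
conn.execute(
    "create table accounts ("
    " acct_no integer primary key,"
    " balance integer check (balance > 0))"
)
conn.execute("insert into accounts values (55, 300)")
conn.commit()

try:
    # Would drive the balance to -200, so the DBMS refuses the update.
    conn.execute("update accounts set balance = balance - 500 where acct_no = 55")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
    conn.rollback()

final_balance = conn.execute(
    "select balance from accounts where acct_no = 55"
).fetchone()[0]
print(rejected, final_balance)
```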

DBMS at a glance

Data Models
Conceptual representation of the data

Data Retrieval
How to ask questions of the database
How to answer those questions

Data Storage
How/where to store data, how to access it

Data Integrity
Manage crashes, concurrency
Manage semantic inconsistencies

Not fully disjoint categorization !!

SQL Assignment
Report the home run champs in the last three years (2002 to 2004).

select h.year, firstname, lastname, h.hrs
from playerinfo p, hitting h
where p.playerid = h.playerid
  and h.hrs = (select max(hrs)
               from hitting h2
               where h.year = h2.year);

SQL Assignment
Report the last name of the batter who would be reported first in alphabetical order.
select firstname, lastname
from playerinfo
where firstname <= all (select firstname from playerinfo)
  and lastname <= all (select lastname
                       from playerinfo p2
                       where p2.firstname = playerinfo.firstname);

SQL Assignment
20. Create the dream NL batting team (the one with the most total RBI) from 2004 statistics. Remember, an NL team consists of 1 LF, 1 CF, 1 RF, 1 SS, 1 2B, 1 3B, 1 1B, 1 Catcher, and 1 Pitcher. Only consider the position at which the hitter played the maximum number of games, so a player will only qualify at one fielding position. Break ties arbitrarily.

create table rbistable as
select p.firstname, p.lastname, p.playerid, h.rbis, f.pos
from playerinfo p, hitting h, fielding f
where p.playerid = h.playerid
  and f.playerid = h.playerid
  and f.year = 2004
  and h.year = 2004
  and f.numgames = (select max(f2.numgames)
                    from fielding f2
                    where f2.playerid = f.playerid
                      and f2.year = f.year);

create table rbistable2 as
select firstname, lastname, pos, rbis, playerid
from rbistable r1
where r1.rbis = (select max(r2.rbis)
                 from rbistable r2
                 where r2.pos = r1.pos);

select firstname, lastname, pos, rbis
from rbistable2 r
where playerid <= all (select playerid
                       from rbistable2 r2
                       where r2.pos = r.pos);

22. Rank the 2004 teams by their number of wins. The output should contain a table with two columns, Team Name and Rank (between 1 and 30), and it should be sorted by Rank. Two teams with the same number of wins will be ranked the same, and the next rank will be skipped in that case.

select t1.teamname, t1.wins, 31 - count(t2.teamname) as rank
from teams t1, teams t2
where t1.year = 2004
  and t1.year = t2.year
  and t1.losses <= t2.losses
group by t1.teamname, t1.wins
order by rank;
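To see how the self-join produces ranks with ties, the query can be tried on a small made-up table. This sketch assumes four hypothetical teams instead of 30, so the constant 31 becomes the number of teams plus one, and it renames the output column to team_rank to avoid any clash with a RANK keyword. The teams A to D and their records are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table teams (teamname text, year integer, wins integer, losses integer)"
)
# Hypothetical 2004 records; A and B are tied for the most wins.
sample_teams = [
    ("A", 2004, 100, 62),
    ("B", 2004, 100, 62),
    ("C", 2004, 90, 72),
    ("D", 2004, 80, 82),
]
conn.executemany("insert into teams values (?, ?, ?, ?)", sample_teams)

# For each team t1, count the teams t2 with at least as many losses
# (t1 itself included); rank = number of teams + 1 - that count.
query = """
select t1.teamname, t1.wins, ? + 1 - count(t2.teamname) as team_rank
from teams t1, teams t2
where t1.year = 2004
  and t1.year = t2.year
  and t1.losses <= t2.losses
group by t1.teamname, t1.wins
order by team_rank, t1.teamname
"""
ranking = conn.execute(query, (len(sample_teams),)).fetchall()
print(ranking)
```

A and B each match all four teams and get rank 1; C matches only itself and D, so it gets rank 3 and rank 2 is skipped, as the assignment requires. Note that the query compares losses rather than wins, which gives the same ordering only when every team has played the same number of games.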
