Acknowledgements
These slides were written by Richard T. Snodgrass (University of Arizona), Curtis Dyreson (Utah State University) and Christian S. Jensen (Aalborg University). Kristian Torp (Aalborg University) converted the slides from Island Presents to Powerpoint. Sabah Currim added some slides.
CS 5800
Introduction
I-2
Prevalence of Databases
Behind every successful website, there is a powerful database. Examples:
UPS / FedEx tracking
Amazon's website
Wal-Mart's inventory system
Dell's ordering system
Google's search engine
Needs
Which DVDs has a customer rented? Are any DVDs overdue? When will a DVD become available?
Complication: Queries
Does not address needs
Query: Which movies has Joe Jenkins rented?
Execute (not quite right): Search for Joe Jenkins.
Execute: Search for ^\s+Customer:\s*Joe\s+Jenkins\s*,\s+Rented:
Query: Are any DVDs overdue?
Execute: ???
Advantages
Text editors are easy to use
Simple to insert a record
Simple to delete a record
Requirements
Robust, sophisticated query language
Clear separation between data organization (schema) and data
DBMS Concepts
Schema
DML
SQL
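The two queries that defeat the text editor can be sketched with a query language. This is a minimal illustration using Python's built-in sqlite3 module; the table name `rented`, its columns, the sample rows, and the ISO date format are assumptions made for the example, not part of the slides.

```python
import sqlite3

# In-memory database standing in for rented.txt (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rented (customer TEXT, movie TEXT, due TEXT)")
conn.executemany(
    "INSERT INTO rented VALUES (?, ?, ?)",
    [
        ("Joe Jenkins", "Eraserhead", "2010-01-19"),
        ("Joe Jenkins", "Babe", "2010-02-02"),
        ("Jane Doe", "Babe", "2010-01-12"),
    ],
)


def movies_rented_by(name):
    """Which movies has this customer rented?"""
    rows = conn.execute(
        "SELECT movie FROM rented WHERE customer = ? ORDER BY movie", (name,)
    )
    return [movie for (movie,) in rows]


def overdue(today):
    """Are any DVDs overdue? ISO dates compare correctly as strings."""
    rows = conn.execute(
        "SELECT customer, movie FROM rented WHERE due < ? ORDER BY due", (today,)
    )
    return rows.fetchall()


print(movies_rented_by("Joe Jenkins"))
print(overdue("2010-01-20"))
```

The schema (the CREATE TABLE statement) is stated once, separately from the data, and both questions become short declarative statements instead of regular expressions.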
Complication: Integrity
Lacks data integrity, consistency
Clerk misspells value/field
Customer: Jane Doek, Rented: Eraserhead, Deu: Jan. 19, 2010
Forgets/adds/reorders field
Terms: weekly special Due: Jan. 19, 2010, Rented: Eraserhead
Requirements
Enforce constraints to permit only valid information to be input.
DBMS Concepts
Integrity constraints
Types
Complication: Update
Add/delete/update fields in every record
Record store location.
Customer: Jane Doe, Rented: Babe, Due: Jan. 19, 2011, Store: Paradise
Requirements
Ability to manipulate the way data is organized.
DBMS Concepts
DDL
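Both requirements, rejecting invalid input and reorganizing the data, can be sketched in a few lines. The example below uses Python's sqlite3 module; the table layout, the date pattern in the CHECK constraint, and the default store name are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Integrity constraints: every field is required, and due must look like
# YYYY-MM-DD (the date format is an assumption for this sketch).
conn.execute(
    """
    CREATE TABLE rented (
        customer TEXT NOT NULL,
        movie    TEXT NOT NULL,
        due      TEXT NOT NULL
                 CHECK (due GLOB '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]')
    )
    """
)


def try_insert(customer, movie, due):
    """Return True if the DBMS accepts the record, False if it rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO rented (customer, movie, due) VALUES (?, ?, ?)",
                (customer, movie, due),
            )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert("Jane Doe", "Eraserhead", "2010-01-19")
missing_field = try_insert("Jane Doe", "Eraserhead", None)  # clerk forgets a field
bad_date = try_insert("Jane Doe", "Babe", "Jan. 19")        # malformed value

# DDL: one statement adds the store location to every existing record.
conn.execute("ALTER TABLE rented ADD COLUMN store TEXT DEFAULT 'Paradise'")
columns = [row[1] for row in conn.execute("PRAGMA table_info(rented)")]
stores = conn.execute("SELECT store FROM rented").fetchall()
```

Naming the columns in the INSERT statement also removes the field-ordering problem: the clerk cannot reorder or misspell a field name without the statement failing. A misspelled value such as a customer name is not caught by these constraints; that needs a key into a customer table.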
Complication: Crashes
Crash during update may lead to inconsistent state.
Ben makes 250 of 500 edits to change Jane Doe to her preferred name Jan Doe. Before he saves it, Windows crashes!
Requirements
Must update on an all-or-nothing basis.
Implemented by commit, or rollback if necessary.
DBMS Concepts
Transactions
Commit
Rollback
Recovery
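The all-or-nothing behaviour can be sketched with Python's sqlite3 module. The example replays the 250-of-500 scenario; the crash is only simulated by raising an exception, and the table name and `id` column are assumptions for the illustration. Recovery after a real operating-system crash is handled by the DBMS itself and is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rented (id INTEGER PRIMARY KEY, customer TEXT)")
conn.executemany("INSERT INTO rented (customer) VALUES (?)", [("Jane Doe",)] * 500)
conn.commit()


def rename_all(old, new, crash_after=None):
    """Rename a customer in one transaction.

    crash_after simulates a failure after that many edits (hypothetical knob).
    """
    ids = [row[0] for row in
           conn.execute("SELECT id FROM rented WHERE customer = ?", (old,))]
    try:
        with conn:  # commit if the block finishes, roll back if it raises
            for done, row_id in enumerate(ids, start=1):
                conn.execute(
                    "UPDATE rented SET customer = ? WHERE id = ?", (new, row_id)
                )
                if crash_after is not None and done == crash_after:
                    raise RuntimeError("simulated crash")
    except RuntimeError:
        pass  # the transaction has already been rolled back


def count(name):
    return conn.execute(
        "SELECT COUNT(*) FROM rented WHERE customer = ?", (name,)
    ).fetchone()[0]


rename_all("Jane Doe", "Jan Doe", crash_after=250)
after_crash = (count("Jane Doe"), count("Jan Doe"))

rename_all("Jane Doe", "Jan Doe")
after_commit = (count("Jane Doe"), count("Jan Doe"))
```

After the simulated crash none of the 250 edits are visible; after the successful run all 500 are.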
Requirements
Must support multiple readers and writers.
Updates to data must (appear to) occur in serial order.
DBMS Concepts
Serializability
Concurrency control
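One way a DBMS makes updates appear to occur in serial order is locking. The sketch below uses two sqlite3 connections to the same file as two clerks; the `copies` table, the temporary file, and the choice of `timeout=0` (fail at once instead of waiting) are assumptions for the example. Other systems use different concurrency-control techniques.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "store.db")

# isolation_level=None: transactions are started and ended explicitly.
# timeout=0: report a lock conflict immediately rather than waiting.
clerk_a = sqlite3.connect(path, isolation_level=None, timeout=0)
clerk_b = sqlite3.connect(path, isolation_level=None, timeout=0)

clerk_a.execute("CREATE TABLE copies (movie TEXT PRIMARY KEY, available INTEGER)")
clerk_a.execute("INSERT INTO copies VALUES ('Babe', 1)")

# Clerk A starts renting out the last copy and holds the write lock.
clerk_a.execute("BEGIN IMMEDIATE")
clerk_a.execute("UPDATE copies SET available = available - 1 WHERE movie = 'Babe'")

# Clerk B tries to start a write at the same time and is turned away.
try:
    clerk_b.execute("BEGIN IMMEDIATE")
    b_blocked = False
except sqlite3.OperationalError:  # "database is locked"
    b_blocked = True

clerk_a.execute("COMMIT")

# Now B goes second and sees A's committed update.
clerk_b.execute("BEGIN IMMEDIATE")
available = clerk_b.execute(
    "SELECT available FROM copies WHERE movie = 'Babe'"
).fetchall()[0][0]
b_rented = available > 0
if b_rented:
    clerk_b.execute("UPDATE copies SET available = available - 1 WHERE movie = 'Babe'")
clerk_b.execute("COMMIT")

final = clerk_a.execute("SELECT available FROM copies").fetchall()[0][0]
clerk_a.close()
clerk_b.close()
```

Because B's transaction runs entirely after A's, the single copy is rented once, not twice.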
Complication: Security
Customers want to know how many times a movie has been rented.
Provide access to rented.txt but not to the customer field. How do I do that in an editor?
Requirements
Ability to control who has access to what information.
Method
customer.txt contains addresses of customers. Must merge with rented.txt to create mailing list.
Problems
Text editors incapable of such a merge (write a program)
Several Joe Jenkins
No information on some customers!?
Requirements
Uniquely identify each customer.
Make sure we have information on customers that rent DVDs.
DBMS Concepts
Joins
Keys
Foreign keys
Referential integrity
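The mailing-list merge and the restricted rental counts can both be sketched with Python's sqlite3 module. All table names, column names, and addresses below are made-up sample data. SQLite has no GRANT or REVOKE, so the view only shows what a restricted user would be allowed to see; in a server DBMS, access would be granted on the view and not on the underlying table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript(
    """
    -- Key: customer_id tells the two Joe Jenkins apart.
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        address     TEXT NOT NULL
    );
    -- Foreign key: every rental must refer to a known customer.
    CREATE TABLE rented (
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        movie       TEXT NOT NULL
    );
    INSERT INTO customer VALUES
        (1, 'Joe Jenkins', '12 Elm St'),
        (2, 'Joe Jenkins', '9 Oak Ave');
    INSERT INTO rented VALUES (1, 'Eraserhead'), (2, 'Babe'), (2, 'Eraserhead');

    -- View: rental counts per movie, with no customer information.
    CREATE VIEW rental_counts AS
        SELECT movie, COUNT(*) AS times_rented FROM rented GROUP BY movie;
    """
)

# Join: merge the two tables to build the mailing list.
mailing_list = conn.execute(
    """
    SELECT DISTINCT c.name, c.address
    FROM customer AS c JOIN rented AS r ON r.customer_id = c.customer_id
    ORDER BY c.address
    """
).fetchall()

# Referential integrity: a rental for an unknown customer is rejected.
try:
    with conn:
        conn.execute("INSERT INTO rented VALUES (99, 'Babe')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

counts = conn.execute(
    "SELECT movie, times_rented FROM rental_counts ORDER BY movie"
).fetchall()
```

The join replaces the hand-written merge program, the key separates customers who share a name, and the foreign key guarantees that address information exists for every customer who rents a DVD.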
Complication: Efficiency
All video store owners in the US West get together.
rented.txt file gets huge (gigabytes of data). Slow to edit. Slow to query for customer information.
Requirements
New data structures to improve query performance.
System automatically modifies queries to improve speed.
Ability of system to scale to handle huge datasets.
DBMS Concepts
Indexes
Query optimization
Database tuning
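The effect of an index on how a query is executed can be observed directly. This sketch uses Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN statement; the table, the generated rows, and the index name `idx_rented_customer` are assumptions for the example, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rented (customer TEXT, movie TEXT)")
conn.executemany(
    "INSERT INTO rented VALUES (?, ?)",
    [(f"customer {i}", f"movie {i % 100}") for i in range(10_000)],
)

QUERY = "SELECT movie FROM rented WHERE customer = ?"


def plan(sql):
    """Return the optimizer's plan as text (last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, ("customer 42",)).fetchall()
    return " ".join(row[-1] for row in rows)


result_before = conn.execute(QUERY, ("customer 42",)).fetchall()
plan_before = plan(QUERY)   # a scan of the whole table

conn.execute("CREATE INDEX idx_rented_customer ON rented (customer)")

result_after = conn.execute(QUERY, ("customer 42",)).fetchall()
plan_after = plan(QUERY)    # a search that uses the new index

print(plan_before)
print(plan_after)
```

The query text is unchanged; the optimizer picks up the new data structure on its own, which is the point of separating what is asked from how it is executed.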
Requirements
Collect and analyze summary data.
Use computer to mine for interesting trends.
Support access to data by sophisticated programs.
DBMS Concepts
Data warehouses
Data mining
Database API
Outline
Database System Overview
File-based Approach vs. Database Approach
Time Line for Database Technologies
Architecture of Database Systems
Database system: A database and a DBMS
What do we build in this course: a database, a DBMS, or a database system?
A Database System
[Figure: a database system. Labels: user, application program, DBMS, database, persistent storage.]
Basic Definitions
Miniworld: Some part of the real world about which information is stored. Also called the Universe of Discourse (UoD).
Data: Known facts about the miniworld that are recorded and have an implicit meaning.
Information: Data processed to be useful in decision making.
Metadata: Data that describes the properties or characteristics of other data, e.g., the header of a table.
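The difference between data and metadata can be seen by asking a DBMS for each separately. A minimal sketch with Python's sqlite3 module follows; the `rented` table and its single row are sample values assumed for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rented (customer TEXT, movie TEXT, due TEXT)")
conn.execute("INSERT INTO rented VALUES ('Jane Doe', 'Babe', '2011-01-19')")

# Data: the recorded facts themselves.
data = conn.execute("SELECT * FROM rented").fetchall()

# Metadata: the table header, i.e. column names and declared types.
# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk).
metadata = [
    (name, col_type)
    for _cid, name, col_type, *_rest in conn.execute("PRAGMA table_info(rented)")
]

print(data)
print(metadata)
```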
Example
(From Modern Database Management)
[Figure: example table contrasting data with metadata.]
[Figure: file-based approach. Separate files store overlapping fields: (customer name, movie name, copy number, due date, ...), (movie name, upc, copy number, ...), (customer name, phone, address, ...).]
In how many different places does a customer name, movie name and copy number appear in the system?
Observation
Many applications need these services.
Solution
Build and sell a software system to provide services! i.e. Database Management Systems
Multiple views
A (virtual) view is
[Figure: multiple views over the inventory, rented, and customer data in persistent storage.]
Pictorial Representation
[Figure: database system environment. Labels: Users/Programs; Application Programs/Queries; DBMS Software; Stored Database Definition (Meta-Data); Stored Database.]
Functions of a DBMS
Provides persistent, shared storage
Objects live beyond program execution
Shared by multiple applications
Protects against
Application programmers
Design and implement canned transactions for parametric users.
DBA - Duties
Chooses
Information content of the database
Storage structure and access strategy
Performance-enhancing data structures
Responsible for
Backups and recovery
Monitoring performance
Updates (to schema)
Database People
DBMS designers and implementers
People who design and develop the DBMS software.
Tool developers
Design and implement tools that facilitate the use of DBMS software. Tools include design tools, performance tools, special interfaces, etc.
Interfaces
Menu vs. form-based GUI
Canned interfaces for parametric users
DBA
Application
Natural language
Web search engines
Shell
Data Models
A data definition language (DDL) describes database schemas.
Data relationships
Data semantics
Integrity constraints
A data manipulation language (DML) is used for querying and updating database instances.
A data model is a data definition language along with a data manipulation language.
Conceptual
Representational
Physical
Outline
Database System Overview
    What is a Database System?
    Components of a Database System
File-based Approach vs. Database Approach
Time Line for Database Technologies
Architecture of Database Systems
Physical: How data is stored on disk
Data storage structures
Access paths to the data
Logical: How we think the data is organized
Conceptual structure
Integrity constraints
External (view): What a user sees of the data
View is often limited by security
conceptual level
physical level
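struct STAFF {
    int staff_no;
    int branch_no;
    char fname[15];
    char lname[15];
    struct date dob;       /* struct date is assumed to be defined elsewhere */
    float salary;
    struct STAFF *next;    /* pointer to the next STAFF record */
};
/* Pseudocode, not C: indexes defined on the STAFF records */
index staff_no;
index branch_no;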
Data Independence
Each level is independent in the sense that a completely different organization can be used.
Physical data independence: the physical level can change without having to change the logical level.
Logical data independence: the logical level can change without having to change the external level.
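Both kinds of independence can be sketched with a view standing in for the external level. The example uses Python's sqlite3 module; the table names, the view `rentals`, and the sample row are assumptions for the illustration. The application query never changes, while an index is added (a physical change) and the base table is split in two (a logical change, absorbed by redefining the mapping from view to tables).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE rented_base (customer TEXT, movie TEXT, due TEXT);
    INSERT INTO rented_base VALUES ('Jane Doe', 'Babe', '2011-01-19');
    -- External level: what the application is allowed to see.
    CREATE VIEW rentals AS SELECT customer, movie FROM rented_base;
    """
)

APP_QUERY = "SELECT customer, movie FROM rentals ORDER BY customer"
before = conn.execute(APP_QUERY).fetchall()

# Physical change: add an access path. Tables and view are untouched.
conn.execute("CREATE INDEX idx_customer ON rented_base (customer)")
after_physical = conn.execute(APP_QUERY).fetchall()

# Logical change: split the table and redefine the view over the new tables.
conn.executescript(
    """
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE rental (
        customer_id INTEGER REFERENCES customer(customer_id),
        movie TEXT,
        due TEXT
    );
    INSERT INTO customer (name) SELECT DISTINCT customer FROM rented_base;
    INSERT INTO rental
        SELECT c.customer_id, b.movie, b.due
        FROM rented_base AS b JOIN customer AS c ON c.name = b.customer;
    DROP VIEW rentals;
    DROP TABLE rented_base;
    CREATE VIEW rentals AS
        SELECT c.name AS customer, r.movie
        FROM rental AS r JOIN customer AS c ON c.customer_id = r.customer_id;
    """
)
after_logical = conn.execute(APP_QUERY).fetchall()
```

The application sees the same answer at all three points, even though the storage structures and the table design underneath it have changed.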
Spatial databases
Maps, cadastral applications
Many commercial products (GIS)
Text databases
Special text search capabilities
Library collections
Statistical databases
Census data
OLAP, data warehousing