
Chapter 1: Introduction

Database
- The collection of data, usually referred to as the database, contains information relevant to an enterprise.

What is a Database Management System?
- A DBMS contains information about a particular enterprise:
  - A collection of interrelated data
  - A set of programs to access the data
  - An environment that is both convenient and efficient to use

Database vs. File Systems

What are the advantages and disadvantages of a database management system compared with a conventional file system?

Disadvantages of a conventional file system compared with a database management system
- Data redundancy and inconsistency
  - Multiple file formats, duplication of information in different files
- Difficulty in accessing data
  - Need to write a new program to carry out each new task
- Data isolation
  - Multiple files and formats
- Integrity problems
  - Integrity constraints (e.g., account balance > 0) become buried in program code rather than being stated explicitly
  - Hard to add new constraints or change existing ones
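
The integrity problem above can be illustrated with a minimal sketch in which the rule "account balance > 0" is stated once in the schema instead of being buried in program code. It uses Python's standard sqlite3 module as a stand-in for a DBMS; the table layout, the helper names open_bank and try_insert, and the sample values are assumptions made for illustration.

```python
import sqlite3


def open_bank() -> sqlite3.Connection:
    """Create an in-memory database whose schema states the balance rule explicitly."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        " account_number TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance > 0))"  # the constraint lives in the schema
    )
    return conn


def try_insert(conn: sqlite3.Connection, account_number: str, balance: int) -> bool:
    """Return True if the row was accepted, False if the DBMS rejected it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute("INSERT INTO account VALUES (?, ?)", (account_number, balance))
        return True
    except sqlite3.IntegrityError:
        return False


conn = open_bank()
print(try_insert(conn, "A-101", 500))  # a positive balance satisfies the constraint
print(try_insert(conn, "A-102", -20))  # a negative balance violates it
```

Because the rule is part of the table definition, every program that writes to the table is subject to it, and changing the rule means changing one declaration rather than every application.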

Disadvantages of a conventional file system (Cont.)
- Atomicity of updates
  - Failures may leave the database in an inconsistent state with partial updates carried out
  - Example: a transfer of funds from one account to another should either complete or not happen at all
- Concurrent access by multiple users
  - Concurrent access is needed for performance
  - Uncontrolled concurrent accesses can lead to inconsistencies
  - Example: two people reading a balance and updating it at the same time
- Security problems
  - Hard to provide user access to some, but not all, data
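
The funds-transfer example above can be sketched as a transaction that either completes entirely or leaves no trace. This is a hedged illustration using Python's standard sqlite3 module; the table, the account names "A" and "B", the rule that a balance may not go negative, and the helper names are all assumptions for the sketch.

```python
import sqlite3


def make_bank() -> sqlite3.Connection:
    """Create two hypothetical accounts; balances may not go negative (assumed rule)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        " account_number TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
    conn.commit()
    return conn


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move funds atomically: both updates are committed, or neither is."""
    try:
        with conn:  # one transaction: commit on success, roll back on any error
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE account_number = ?",
                (amount, dst),
            )
            # If this debit fails, the credit above is rolled back as well.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE account_number = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn: sqlite3.Connection) -> dict:
    """Return the current balance of every account."""
    return dict(conn.execute("SELECT account_number, balance FROM account"))


conn = make_bank()
print(transfer(conn, "A", "B", 30), balances(conn))
print(transfer(conn, "A", "B", 500), balances(conn))  # overdraft: nothing changes
```

In the failed transfer the credit to "B" has already been applied when the debit is rejected, so without the surrounding transaction the database would be left with a partial update.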

Advantages of a database management system over a conventional file system
- A DBMS improves on simple file systems in that data is quicker to access, related data items are easier to link, and the data is much easier to maintain. It also makes more efficient use of backup storage resources.
- In an office environment, some of the advantages would be improved searching and the implementation of access rights management.
- A database system (typically a relational one) generally performs better at finding a record, which can make it cheaper to operate than a file system.

Levels of Abstraction
- Physical level: describes how a record (e.g., customer) is stored.
- Logical level: describes the data stored in the database and the relationships among the data.

    type customer = record
        customer_id : string;
        customer_name : string;
        customer_street : string;
        customer_city : string;
    end;

- View level: application programs hide details of data types. Views can also hide information (such as an employee's salary) for security purposes.
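
As a minimal sketch of the view level, the following uses Python's standard sqlite3 module to define a view that leaves out the salary column. The employee table, the view name employee_public, and the sample row are hypothetical. SQLite has no per-user permissions, so this only shows what a view exposes, not how access to the base table would be restricted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (employee_id TEXT PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO employee VALUES ('E-1', 'Asha', 52000)")

# The view exposes only the columns an application at this level should see.
conn.execute("CREATE VIEW employee_public AS SELECT employee_id, name FROM employee")

cursor = conn.execute("SELECT * FROM employee_public")
visible_columns = [column[0] for column in cursor.description]
rows = cursor.fetchall()

print(visible_columns)  # the salary column is not part of the view
print(rows)
```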

View of Data
An architecture for a database system

Defines the DBMS schema at three levels:
- Internal schema at the internal level to describe data storage structures and access paths. Typically uses a physical data model.
- Conceptual schema at the conceptual level to describe the structure and constraints for the whole database. Uses a conceptual or an implementation data model.
- External schema at the external level to describe the various user views. Usually uses the same data model as the conceptual level or a high-level data model.

Instances and Schemas
- Schema: the logical structure of the database
  - Example: the database consists of information about a set of customers and accounts and the relationship between them
  - Analogous to type information of a variable in a program
  - Conceptual organization of the entire database as viewed by the database administrator
  - Physical schema: database design at the physical level
  - Logical schema: database design at the logical level
- Instance: the actual content of the database at a particular point in time
  - Analogous to the value of a variable
  - Oracle uses the term in a different sense: database files themselves are useless without the memory structures and processes to interact with the database, so Oracle defines an instance as the memory structures and the background processes used to access data from a database.

Schema diagram for UNIVERSITY database
[Figure: schema diagram, with a schema construct labeled]

UNIVERSITY Database
[Figure]

Sub Schema
- The part of a database definition, to be viewed by particular applications, that describes all or a subset of the data elements, record types, set types, and areas defined in the schema.
- It is basically a portion of a schema, usually showing a particular user department's portion of the database. It identifies the subset of areas, sets, records, and data names defined in the database schema that is available to user sessions.
- Defines the portion of the database "seen" by the application programs that actually produce the desired information from the data contained within the database.
- The following are a few of the many reasons sub schemas are used:
  - Sub schemas provide different views of the data to the user and the programmer, who do not need to know all the data contained in the entire database.
  - Sub schemas enhance security and help prevent data compromise.

Data Independence
- Data independence is the ability to modify a schema definition at one level without affecting a schema definition at the next higher level.
- The interfaces between the various levels and components should be well defined so that changes in some parts do not seriously influence others.
- Allows data fields to be added, changed, and deleted from a database without necessarily affecting existing application programs.
- Two levels of data independence:
  - Physical data independence is the ability to modify the physical schema without making it necessary to rewrite application programs, e.g., changing from unblocked to blocked record storage, or from sequential to random-access files.
  - Logical data independence is the ability to modify the conceptual schema without making it necessary to rewrite application programs, e.g., adding a new field to a record. An application program's view hides this change from the program.

UNIVERSITY Conceptual Schema
STUDENT(Name, Student Number, Class, Major)
COURSE(Course Name, Course Number, Credit, Dept)
PREREQUISITE(Course Number, Prerequisite Number)
SECTION(Section Id, Course Number, Semester, Year, Instructor)
GRADE_REPORT(Student Number, Section Id, Grade)

UNIVERSITY External Schema
TRANSCRIPT(Student Name, Course Number, Grade, Semester, Year, Section Id), derived from STUDENT, SECTION, GRADE_REPORT
PREREQUISITES(Course Name, Course Number, Prerequisites), derived from PREREQUISITE, COURSE
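
As a hedged sketch of how the external schema relates to the conceptual one, the following defines TRANSCRIPT as a view over STUDENT, SECTION, and GRADE_REPORT using Python's standard sqlite3 module, then adds a field to STUDENT to show logical data independence. The column types, the sample row values, the added email column, and the helper read_transcript are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (name TEXT, student_number INTEGER PRIMARY KEY, class INTEGER, major TEXT);
CREATE TABLE section (section_id INTEGER PRIMARY KEY, course_number TEXT,
                      semester TEXT, year INTEGER, instructor TEXT);
CREATE TABLE grade_report (student_number INTEGER, section_id INTEGER, grade TEXT);

-- External schema: TRANSCRIPT derived from STUDENT, SECTION, GRADE_REPORT.
CREATE VIEW transcript AS
SELECT s.name AS student_name, sec.course_number, g.grade,
       sec.semester, sec.year, sec.section_id
FROM grade_report AS g
JOIN student AS s ON s.student_number = g.student_number
JOIN section AS sec ON sec.section_id = g.section_id;

-- Hypothetical sample data.
INSERT INTO student VALUES ('Smith', 17, 1, 'CS');
INSERT INTO section VALUES (85, 'MATH2410', 'Fall', 2004, 'King');
INSERT INTO grade_report VALUES (17, 85, 'A');
""")


def read_transcript(conn: sqlite3.Connection) -> list:
    """Stand-in for an application program that only knows the external schema."""
    return conn.execute("SELECT * FROM transcript").fetchall()


before = read_transcript(conn)
# Change the conceptual schema by adding a new field to STUDENT.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
after = read_transcript(conn)

print(before)
print(before == after)  # the application's view of the data is unaffected
```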

Data Models
- Data model: a set of concepts to describe the structure of a database, and certain constraints that the database should obey.
- A collection of tools for describing:
  - Data
  - Data relationships
  - Data semantics
  - Data constraints
- Relational model: a data model based on tables
  - The relational model uses a collection of tables to represent both data and the relationships among those data.
- Entity-Relationship data model (mainly for database design)
  - The entity-relationship (E-R) data model is based on a perception of a real world that consists of a collection of basic objects, called entities, and of relationships among these objects.
- Object-based data models (object-oriented and object-relational): data models based on the object-oriented programming paradigm
  - The object-oriented data model is another data model that has seen increasing attention.

Data Models (Cont.)
- Semi structured data model (XML)
  - The semi structured data model permits the specification of data where individual data items of the same type may have different sets of attributes.
- Other older models:
  - Network model: a data model based on graphs, with records as nodes and relationships between records as edges
  - Hierarchical model: a data model based on trees

Relational Model
- Example of tabular data in the relational model
[Figure: sample table, with its columns labeled "Attributes"]

A Sample Relational Database
[Figure]

The Entity-Relationship Model
- Models an enterprise as a collection of entities and relationships
  - Entity: a thing or object in the enterprise that is distinguishable from other objects
    - Described by a set of attributes
  - Relationship: an association among several entities
- Represented diagrammatically by an entity-relationship diagram.

Object-Relational Data Models
- Extend the relational data model by including object orientation and constructs to deal with added data types.
- Allow attributes of tuples to have complex types, including non-atomic values such as nested relations.
- Preserve relational foundations, in particular the declarative access to data, while extending modeling power.
- Provide upward compatibility with existing relational languages.

Semi Structured Data Models
- Semi structured data models permit the specification of data where individual data items of the same type may have different sets of attributes. This is in contrast with the data models mentioned earlier, where every data item of a particular type must have the same set of attributes.

XML: Extensible Markup Language
- Defined by the WWW Consortium (W3C)
- Originally intended as a document markup language, not a database language
- The ability to specify new tags, and to create nested tag structures, made XML a great way to exchange data, not just documents
- XML has become the basis for many new-generation data interchange formats.
- A wide variety of tools is available for parsing, browsing, and querying XML documents/data
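
A small sketch of what "different sets of attributes" means in practice, using Python's standard xml.etree.ElementTree parser. The customers document, its tag names, and the helper attribute_sets are invented for illustration: both items are of type customer, yet one records a city and the other two phone numbers.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two items of the same type with different attributes.
DOCUMENT = """
<customers>
  <customer>
    <customer_id>C-1</customer_id>
    <customer_name>Asha</customer_name>
    <customer_city>Pune</customer_city>
  </customer>
  <customer>
    <customer_id>C-2</customer_id>
    <customer_name>Ravi</customer_name>
    <phone>555-0100</phone>
    <phone>555-0101</phone>
  </customer>
</customers>
"""


def attribute_sets(xml_text: str) -> list:
    """Return, for each customer element, the sorted set of child tags it carries."""
    root = ET.fromstring(xml_text)
    return [
        sorted({child.tag for child in customer})
        for customer in root.findall("customer")
    ]


for tags in attribute_sets(DOCUMENT):
    print(tags)
```

A relational table would need either nullable columns or a separate phone table to hold the same data, which is the contrast the section draws.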

Data Manipulation Language (DML)
- Language for accessing and manipulating the data organized by the appropriate data model
  - DML is also known as a query language
- Two types of DML
  - Procedural: the user specifies what data is required and how to get those data
    - Also called record-at-a-time (record-oriented) or low-level DML
    - Must be embedded in a programming language
    - Searches for and retrieves individual database records, and uses looping and other constructs of the host programming language to retrieve multiple records
  - Declarative (nonprocedural): the user specifies what data is required without specifying how to get those data
    - Also called set-at-a-time (set-oriented) or high-level DML
    - Can be used as a stand-alone query language or can be embedded in a programming language
    - Searches for and retrieves information from multiple related database records in a single command
- When a DML is embedded in a programming language:
  - Host language: the general-purpose language (e.g., C++)
  - Data sublanguage: the DML
- SQL is the most widely used query language
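
The difference between the two DML styles can be sketched with Python as the host language and SQL (through the standard sqlite3 module) as the data sublanguage. The account table, its rows, and both function names are hypothetical. The first function only approximates a record-at-a-time interface: it still fetches rows with a SELECT, but all filtering and summing is done by host-language looping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("A-101", 500), ("A-215", 700), ("A-305", 350)],
)


def total_record_at_a_time(conn: sqlite3.Connection, minimum: int) -> int:
    """Procedural style: the program visits each record and says how to compute the result."""
    total = 0
    for _account_number, balance in conn.execute(
        "SELECT account_number, balance FROM account"
    ):
        if balance >= minimum:
            total += balance
    return total


def total_set_at_a_time(conn: sqlite3.Connection, minimum: int) -> int:
    """Declarative style: one command states what is wanted; the DBMS decides how."""
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(balance), 0) FROM account WHERE balance >= ?",
        (minimum,),
    ).fetchone()
    return total


print(total_record_at_a_time(conn, 400))
print(total_set_at_a_time(conn, 400))
```

Both functions return the same total; the declarative version leaves the DBMS free to choose an access path.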

Data Definition Language (DDL)
- Specification notation for defining the database schema
  Example:

    create table account (
        account_number char(10),
        balance integer)

- The DDL compiler generates a set of tables stored in a data dictionary
- The data dictionary contains metadata (i.e., data about data)
  - Database schema
  - Data storage and definition language
    - Specifies the storage structure and access methods used
  - Integrity constraints
    - Domain constraints
    - Referential integrity (references constraint in SQL)
    - Assertions: an assertion is any condition that the database must always satisfy. Domain constraints and referential integrity constraints are special forms of assertions.
  - Authorization
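
As a rough illustration of DDL, the data dictionary, and a references constraint, the sketch below uses Python's standard sqlite3 module. SQLite keeps its schema metadata in a built-in table named sqlite_master, which plays the role of a data dictionary here. The depositor table, the sample values, and the helper names are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces references only when enabled
conn.executescript("""
CREATE TABLE account (account_number CHAR(10) PRIMARY KEY, balance INTEGER);
-- Hypothetical second table with a referential integrity constraint.
CREATE TABLE depositor (
    customer_name TEXT,
    account_number CHAR(10) REFERENCES account(account_number)
);
""")


def dictionary_entries(conn: sqlite3.Connection) -> list:
    """Read the metadata (data about data) that the DDL statements produced."""
    return conn.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()


def violates_reference(conn: sqlite3.Connection) -> bool:
    """Try to insert a depositor row that points at an account that does not exist."""
    try:
        with conn:
            conn.execute("INSERT INTO depositor VALUES ('Asha', 'A-999')")
        return False
    except sqlite3.IntegrityError:
        return True


for name, sql in dictionary_entries(conn):
    print(name, "->", sql)
print(violates_reference(conn))
```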
  

Database Design
The process of designing the general structure of the database:
- Logical design: deciding on the database schema. Database design requires that we find a good collection of relation schemas.
  - Business decision: what attributes should we record in the database?
  - Computer science decision: what relation schemas should we have, and how should the attributes be distributed among the various relation schemas?
- Physical design: deciding on the physical layout of the database

Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation

Query Processing (Cont.)
- Alternative ways of evaluating a given query
  - Equivalent expressions
  - Different algorithms for each operation
- The cost difference between a good and a bad way of evaluating a query can be enormous
- Need to estimate the cost of operations
  - Depends critically on statistical information about relations, which the database must maintain
  - Need to estimate statistics for intermediate results to compute the cost of complex expressions
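
To make the cost difference concrete, here is a toy sketch in plain Python that evaluates two equivalent expressions with a nested-loop join and counts tuple comparisons as a stand-in for cost. The relations, their sizes, the city values, and the cost measure are all assumptions; a real optimizer works from maintained statistics rather than running both plans.

```python
# Hypothetical relations: 100 customers (10 of them in "Pune"), 200 accounts.
customers = [(f"C-{i}", "Pune" if i % 10 == 0 else "Delhi") for i in range(100)]
accounts = [(f"A-{i}", f"C-{i % 100}") for i in range(200)]


def join_then_select(customers, accounts, city):
    """Join every customer with every account, applying the city filter last."""
    comparisons = 0
    result = []
    for customer_id, customer_city in customers:
        for account_number, owner_id in accounts:
            comparisons += 1
            if customer_id == owner_id and customer_city == city:
                result.append((customer_id, account_number))
    return sorted(result), comparisons


def select_then_join(customers, accounts, city):
    """Filter customers by city first, then join only the survivors."""
    comparisons = 0
    selected = []
    for customer_id, customer_city in customers:
        comparisons += 1
        if customer_city == city:
            selected.append(customer_id)
    result = []
    for customer_id in selected:
        for account_number, owner_id in accounts:
            comparisons += 1
            if customer_id == owner_id:
                result.append((customer_id, account_number))
    return sorted(result), comparisons


slow_rows, slow_cost = join_then_select(customers, accounts, "Pune")
fast_rows, fast_cost = select_then_join(customers, accounts, "Pune")
print(slow_rows == fast_rows)  # the two expressions are equivalent
print(slow_cost, fast_cost)    # 20000 versus 2100 comparisons
```

Knowing in advance that only about a tenth of the customers match the selection is exactly the kind of statistic the section says the database must maintain.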

Database Architecture
The architecture of a database system is greatly influenced by the underlying computer system on which the database is running:
- Centralized
- Client-server
- Parallel (multi-processor)
- Distributed

Overall System Structure

Two Tier and Three Tier Architecture

Database Administrator
- Coordinates all the activities of the database system; the database administrator has a good understanding of the enterprise's information resources and needs.
- The database administrator's duties include:
  - Schema definition
  - Storage structure and access method definition
  - Schema and physical organization modification
  - Granting user authority to access the database
  - Specifying integrity constraints
  - Acting as liaison with users
  - Monitoring performance and responding to changes in requirements

GTU-MCA http://gtu-mca.blogspot.com