[BSc CSIT - 7th Semester]
Rishi K. Marseni
{rishimarseni@gmail.com}
Course Content – Abstract View
The Relational Model of Data and
RDBMS Implementation Techniques
Why ADBMS?
● Controlling Redundancy
● Restricting Unauthorized Access
● Providing Persistent Storage for Program Objects and Data Structures
● Permitting Inferencing and Actions Using Rules
● Providing Multiple User Interfaces
● Representing Complex Relationships Among Data
● Enforcing Integrity Constraints
● Providing Backup and Recovery
Understanding DBMS
● A database is a collection of interrelated data.
● A DBMS is a set of programs that access the data stored in a database.
● The primary goal of a DBMS is to store and manage data both conveniently and
efficiently.
● Management of data involves defining structures for the storage of information and
providing mechanisms for the manipulation of information.
● It can also be defined as a general-purpose software system that enables users
to create, maintain and manipulate databases.
● It provides fast and convenient access to information from data stored in the
database.
● A DBMS interfaces with application programs, so the data contained in the database can
be accessed by multiple applications and users.
●
Commercial DBMS Software examples
– Oracle
– Microsoft SQL Server
– IBM DB2
– MySQL
– MS Access
– PostgreSQL
– Sybase etc.
Understanding DBMS
●
Database Applications:
– Banking: all transactions
– Airlines: reservations, schedules
– Universities: registration, grades
– Sales: customers, products, purchases
– Manufacturing: production, inventory, orders, supply chain
– Human resources: employee records, salaries
●
Databases touch all aspects of our lives
Purpose of Database System
●
In the early days, database applications were built on top of file
systems
●
Drawbacks of using file systems to store data:
– Data redundancy and inconsistency
●
Multiple file formats, duplication of information in different
files
– Difficulty in accessing data
●
Need to write a new program to carry out each new task
– Data isolation — multiple files and formats
– Integrity problems
●
Integrity constraints (e.g. account balance > 0) become part
of program code
●
Hard to add new constraints or change existing ones
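As a minimal sketch of the alternative a DBMS offers, the example below declares the balance rule once in the database instead of repeating it in program code. It uses Python's built-in sqlite3 module with an in-memory database; the table name account, its columns and the sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The integrity constraint lives in the schema, not in application code.
conn.execute(
    "CREATE TABLE account ("
    " account_number INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance > 0))"
)
conn.execute("INSERT INTO account VALUES (1234, 500)")


def try_set_balance(account_number, new_balance):
    """Return True if the update is accepted, False if the constraint rejects it."""
    try:
        conn.execute(
            "UPDATE account SET balance = ? WHERE account_number = ?",
            (new_balance, account_number),
        )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_set_balance(1234, 300)   # satisfies balance > 0
rejected = try_set_balance(1234, -50)   # violates balance > 0
final_balance = conn.execute(
    "SELECT balance FROM account WHERE account_number = 1234"
).fetchone()[0]
```

Every program that updates the table is subject to the same rule, so changing the constraint means changing the schema once rather than editing each program.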
Purpose of Database System
●
Drawbacks of using file systems (cont.)
– Atomicity of updates
●
Failures may leave database in an inconsistent state with
partial updates carried out
●
E.g. transfer of funds from one account to another should
either complete or not happen at all
– Concurrent access by multiple users
– Concurrent access is needed for performance
– Uncontrolled concurrent accesses can lead to inconsistencies
●
E.g. two people reading a balance and updating it at the
same time
– Security problems
●
Database systems offer solutions to all the above problems
Levels of Abstraction
●
Physical level: describes how a record (e.g., customer) is stored.
●
Logical level: describes data stored in database, and the
relationships among the data.
type customer = record
    name : string;
    street : string;
    city : string;
end;
●
View level: application programs hide details of data types. Views
can also hide information (e.g., salary) for security purposes.
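The view level can be sketched with an SQL view that leaves out the salary column. This is only an illustration using Python's built-in sqlite3 module; the employee table, the view name employee_public and the sample row are assumptions. SQLite has no user accounts, so in a multi-user DBMS the view would additionally be combined with access privileges.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT, street TEXT, city TEXT, salary INTEGER);
    INSERT INTO employee VALUES ('Asha', 'Main Road', 'Kathmandu', 50000);
    -- The view exposes every column except salary.
    CREATE VIEW employee_public AS SELECT name, street, city FROM employee;
""")

# An application that queries the view never sees the hidden column.
cursor = conn.execute("SELECT * FROM employee_public")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
```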
View of Data
●
An architecture for a database system
The ANSI/SPARC architecture is based on 3 views of data, with a total of 43
interfaces between these views.
Instances and Schema
●
Similar to types and variables in programming languages
●
Schema – the logical structure of the database
– e.g., the database consists of information about a set of customers and
accounts and the relationship between them
– Analogous to type information of a variable in a program
– Physical schema: database design at the physical level
– Logical schema: database design at the logical level
●
Instance – the actual content of the database at a particular point in time
– Analogous to the value of a variable
●
Physical Data Independence – the ability to modify the physical schema
without changing the logical schema
– Applications depend on the logical schema
– In general, the interfaces between the various levels and components
should be well defined so that changes in some parts do not seriously
influence others.
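Physical data independence can be illustrated with an index, which is a storage-level structure. In the sketch below (Python's built-in sqlite3 module; the customer table, the index name and the sample rows are assumptions) the application's query is written against the logical schema only, and it returns the same result before and after the physical schema changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("Asha", "Kathmandu"), ("Bikash", "Pokhara"), ("Chandra", "Kathmandu")],
)

# The application's query refers only to the logical schema.
QUERY = "SELECT name FROM customer WHERE city = ? ORDER BY name"

before = conn.execute(QUERY, ("Kathmandu",)).fetchall()

# Modify the physical schema: add an index on city.
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")

# The unchanged query still works and yields the same rows.
after = conn.execute(QUERY, ("Kathmandu",)).fetchall()
```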
Data Models
●
A collection of tools for describing
– data
– data relationships
– data semantics
– data constraints
●
Entity-Relationship model
●
Relational model
●
Other models:
– object-oriented model
– semi-structured data models
– Older models: network model and hierarchical model
Database Users
●
Users are differentiated by the way they expect to interact with the
system
●
Application programmers – interact with system through DML
calls
●
Sophisticated users – form requests in a database query
language
●
Specialized users – write specialized database applications that
do not fit into the traditional data processing framework
●
Naive users – invoke one of the permanent application programs
that have been written previously. E.g. people accessing database
over the web, bank tellers, clerical staff
Database Administrator
●
Coordinates all the activities of the database system.
●
The database administrator has a good understanding of the enterprise’s
information resources and needs.
●
Database administrator's duties include:
– Schema definition
– Storage structure and access method definition
– Schema and physical organization modification
– Granting user authority to access the database
– Specifying integrity constraints
– Acting as liaison with users
– Monitoring performance and responding to changes in requirements
Transaction Management
●
A transaction is a collection of operations that performs a single
logical function in a database application
●
Transaction-management component ensures that the database
remains in a consistent (correct) state despite system failures (e.g.,
power failures and operating system crashes) and transaction
failures.
●
Concurrency-control manager controls the interaction among the
concurrent transactions, to ensure the consistency of the database.
Storage Management
●
Storage manager is a program module that provides the interface
between the low-level data stored in the database and the
application programs and queries submitted to the system.
●
The storage manager is responsible for the following tasks:
– interaction with the file manager
– efficient storing, retrieving and updating of data
Application Architectures
● An E-R model describes data with entity sets, relationship sets and attributes,
whereas the relational model describes data with tuples, attributes and
the domains of the attributes.
● The relationships among the data are easier to understand in an E-R model than
in the relational model.
● Restrictions could be the range of values that the element can have, the
default value if none is provided, and whether the element can be NULL.
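The three kinds of restriction named above (range of values, default value, and whether NULL is allowed) can be sketched in a single table definition. The example uses Python's built-in sqlite3 module; the student table, its columns and the range 1 to 8 are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        name     TEXT    NOT NULL,                  -- NULL is not allowed
        semester INTEGER DEFAULT 1                  -- used when no value is given
                 CHECK (semester BETWEEN 1 AND 8)   -- allowed range of values
    )
""")

# No semester supplied, so the default applies.
conn.execute("INSERT INTO student (name) VALUES ('Asha')")
default_semester = conn.execute(
    "SELECT semester FROM student WHERE name = 'Asha'"
).fetchone()[0]


def is_rejected(sql):
    """Return True if the DBMS refuses the statement because of a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


out_of_range = is_rejected("INSERT INTO student (name, semester) VALUES ('Bikash', 9)")
null_name = is_rejected("INSERT INTO student (name, semester) VALUES (NULL, 2)")
```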
User Defined Integrity
● A business rule is a statement that defines or constrains some aspect of
the business.
Basic Operators:
[ = , <> or != , > , < , <= , >= ]
Other Operators:
Advanced SQL programming
1. Data Manipulation: The Data Manipulation Language (DML) is the subset of
SQL used to add, update and delete data:
INSERT adds rows (formally tuples) to an existing table, e.g.:
INSERT INTO example (field1, field2, field3) VALUES ('test', 'N', NULL);
COMMIT and ROLLBACK terminate the current transaction and release data locks.
In the absence of a START TRANSACTION or similar statement, the semantics of
SQL are implementation-dependent.
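The three DML operations can be tried out with the sketch below, which runs the INSERT statement from the text and then an UPDATE and a DELETE on the same row. It uses Python's built-in sqlite3 module; the column types of the example table and the UPDATE and DELETE statements are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example (field1 TEXT, field2 TEXT, field3 TEXT)")

# INSERT adds a row to an existing table.
conn.execute("INSERT INTO example (field1, field2, field3) VALUES ('test', 'N', NULL)")
after_insert = conn.execute("SELECT * FROM example").fetchall()

# UPDATE modifies the existing rows that match the WHERE clause.
conn.execute("UPDATE example SET field2 = 'Y' WHERE field1 = 'test'")
after_update = conn.execute("SELECT * FROM example").fetchall()

# DELETE removes the existing rows that match the WHERE clause.
conn.execute("DELETE FROM example WHERE field1 = 'test'")
after_delete = conn.execute("SELECT * FROM example").fetchall()
```

SQL NULL is returned to Python as None, so the inserted row reads back as ('test', 'N', None).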
Advanced SQL programming
3. Transaction Control:
The following example shows a classic transfer of funds transaction, where
money is removed from one account and added to another. If either the removal
or the addition fails, the entire transaction is rolled back.
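START TRANSACTION;
-- Remove 200 from the first account.
UPDATE Account SET amount=amount-200 WHERE account_number=1234;
-- Add 200 to the second account.
UPDATE Account SET amount=amount+200 WHERE account_number=2345;
-- Pseudocode: how errors are detected is dialect-specific (for example in
-- application code or a stored procedure); ERRORS stands for that check.
IF ERRORS=0 COMMIT;
IF ERRORS<>0 ROLLBACK;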
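One way to make the error check concrete is to let application code decide between COMMIT and ROLLBACK. The following is a sketch under stated assumptions, not the only way to do it: it uses Python's built-in sqlite3 module, a CHECK constraint (amount >= 0) as the source of failure, and made-up starting balances. The credit is applied before the debit here so that a failed debit leaves a partial update for the rollback to undo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Account ("
    " account_number INTEGER PRIMARY KEY,"
    " amount INTEGER NOT NULL CHECK (amount >= 0))"
)
conn.executemany("INSERT INTO Account VALUES (?, ?)", [(1234, 500), (2345, 100)])
conn.commit()


def transfer(source, target, value):
    """Move value from source to target; commit only if both updates succeed."""
    try:
        conn.execute(
            "UPDATE Account SET amount = amount + ? WHERE account_number = ?",
            (value, target),
        )
        conn.execute(
            "UPDATE Account SET amount = amount - ? WHERE account_number = ?",
            (value, source),
        )
        conn.commit()      # both updates succeeded
        return True
    except sqlite3.Error:
        conn.rollback()    # undo any partial update
        return False


def balances():
    """Return the current amounts as {account_number: amount}."""
    return dict(conn.execute("SELECT account_number, amount FROM Account"))


first_ok = transfer(1234, 2345, 200)     # enough funds: committed
after_first = balances()
second_ok = transfer(1234, 2345, 1000)   # debit would go negative: rolled back
after_second = balances()
```

After the failed second transfer, the credit to account 2345 has been undone as well, so the balances are the same as after the first transfer.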
Advanced SQL programming
Conditional (CASE) expressions:
SQL has a case/when/then/else/end expression, which was introduced in SQL-92. In its
most general form, which is called a "searched case" in the SQL standard, it works like else
if in other programming languages:
CASE WHEN n > 0 THEN 'positive' WHEN n < 0 THEN 'negative' ELSE 'zero' END
The WHEN conditions are tested in the order in which they appear in the source. If no ELSE
expression is specified, it defaults to ELSE NULL. An abbreviated syntax exists mirroring
switch statements; it is called "simple case" in the SQL standard:
CASE n WHEN 1 THEN 'one' WHEN 2 THEN 'two' ELSE 'i cannot count that high' END
This syntax uses implicit equality comparisons, with the usual caveats for comparing with
NULL. For the Oracle-SQL dialect, the latter can be shortened to an equivalent DECODE
construct:
SELECT DECODE(n, 1, 'one', 2, 'two', 'i cannot count that high') FROM some_table;
The last value is the default; if none is specified, it also defaults to NULL. However, unlike
the standard's "simple case", Oracle's DECODE considers two NULLs to be equal with each
other.
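The searched and simple forms, the default of ELSE NULL, and the NULL caveat can be checked with the short sketch below. It uses Python's built-in sqlite3 module and covers only the standard CASE expression, since DECODE is Oracle-specific; the helper function names are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def run(sql, n):
    """Evaluate a one-value SELECT with n bound to the :n parameter."""
    return conn.execute(sql, {"n": n}).fetchone()[0]


def searched_case(n):
    # General form: each WHEN holds a full condition, tested in order.
    return run(
        "SELECT CASE WHEN :n > 0 THEN 'positive'"
        " WHEN :n < 0 THEN 'negative' ELSE 'zero' END",
        n,
    )


def simple_case(n):
    # Abbreviated form: n is compared for equality with each WHEN value.
    return run(
        "SELECT CASE :n WHEN 1 THEN 'one' WHEN 2 THEN 'two'"
        " ELSE 'i cannot count that high' END",
        n,
    )


def case_without_else(n):
    # With no ELSE branch, an unmatched CASE yields NULL (None in Python).
    return run("SELECT CASE WHEN :n > 0 THEN 'positive' END", n)


def simple_case_when_null(n):
    # The implicit comparison n = NULL is never true, even when n is NULL.
    return run("SELECT CASE :n WHEN NULL THEN 'is null' ELSE 'no match' END", n)
```

For example, simple_case_when_null(None) falls through to the ELSE branch, which is the caveat mentioned for comparisons with NULL.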
Query optimization
● Query optimization is a function of many relational database management systems.
● The query optimizer attempts to determine the most efficient way to execute a
given query by considering the possible query plans.
● Generally, the query optimizer cannot be accessed directly by users: once queries
are submitted to database server, and parsed by the parser, they are then passed to
the query optimizer where optimization occurs.
● However, some database engines allow guiding the query optimizer with hints.
● A query is a request for information from a database.
● It can be as simple as "finding the address of a person with SS# 123-45-6789," or
more complex like "finding the average salary of all the employed married men in
California between the ages 30 to 39, that earn less than their wives."
● Query results are generated by accessing relevant database data and manipulating
it in a way that yields the requested information.
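Although the optimizer is not called directly, many systems let users inspect the plan it chose. The sketch below does this with SQLite's EXPLAIN QUERY PLAN statement through Python's built-in sqlite3 module; the person table, the index name and the sample row are assumptions, and the exact wording of the plan text differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (ssn TEXT, name TEXT, address TEXT)")
conn.execute("INSERT INTO person VALUES ('123-45-6789', 'Sample Person', 'Sample Address')")

QUERY = "SELECT address FROM person WHERE ssn = '123-45-6789'"


def plan(sql):
    """Return the plan chosen by the optimizer as one line of text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable description.
    return "; ".join(row[-1] for row in rows)


# Without an index the only option is to scan the whole table.
plan_without_index = plan(QUERY)

# Once an index exists, the optimizer can choose an index lookup instead.
conn.execute("CREATE INDEX idx_person_ssn ON person (ssn)")
plan_with_index = plan(QUERY)
```

The query text is identical in both cases; only the plan selected for it changes.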
Query optimization
● Since database structures are complex, in most cases, and especially for non-trivial
queries, the data needed for a query can be collected from a database by
accessing it in different ways, through different data structures, and in different
orders.
● Each different way typically requires a different processing time.
● Processing times of the same query may vary widely, from a fraction of a
second to hours, depending on the way selected.
● The purpose of query optimization, which is an automated process, is to find the
way to process a given query in minimum time.
● The large possible variance in time justifies performing query optimization, though
finding the exact optimal way to execute a query, among all possibilities, is typically
very complex and time-consuming in itself, may be too costly, and is often practically
impossible.
● Thus query optimization typically tries to approximate the optimum by comparing
several common-sense alternatives to provide in a reasonable time a "good
enough" plan which typically does not deviate much from the best possible result.
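The idea of comparing alternative plans by estimated cost can be sketched with a toy model. This is not the algorithm of any particular DBMS: the table names, row counts, join selectivities and the cost measure (the sum of estimated intermediate result sizes for a left-deep join order) are all hypothetical values chosen for illustration.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical statistics: rows per table and selectivity per join predicate.
ROWS = {"customer": 1000, "orders": 10000, "item": 50000}
SELECTIVITY = {
    frozenset(("customer", "orders")): Fraction(1, 1000),
    frozenset(("orders", "item")): Fraction(1, 10000),
}


def estimated_cost(order):
    """Sum of estimated intermediate result sizes for a left-deep join order."""
    joined = [order[0]]
    size = ROWS[order[0]]
    cost = 0
    for table in order[1:]:
        size *= ROWS[table]
        # Apply each join predicate linking the new table to tables already
        # joined; if there is none, this step is a full cross product.
        for other in joined:
            size *= SELECTIVITY.get(frozenset((other, table)), 1)
        joined.append(table)
        cost += size
    return cost


# Enumerate every join order and keep the cheapest estimate.
plans = {order: estimated_cost(order) for order in permutations(ROWS)}
best_order = min(plans, key=plans.get)
best_cost = plans[best_order]
worst_cost = max(plans.values())
```

With only three tables there are six orders to compare, but the number of orders grows factorially with the number of tables, which is why real optimizers settle for a good enough plan rather than an exhaustive search.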