DATABASE: A database is a structured collection of information that is organized so that it can
easily be accessed, managed, and updated. In one view, databases can be classified according to
types of content: bibliographic, full-text, numeric, and images.

DBMS: A collection of programs that enables you to store, modify, and extract information from a
database. There are many different types of DBMSs, ranging from small systems that run on
personal computers to huge systems that run on mainframes. The following are examples of
database applications:

• computerized library systems
• automated teller machines
• flight reservation systems

RDBMS: RDBMS stands for Relational DataBase Management System. This is the most common
form of DBMS. The relational model was invented by E. F. Codd; in it, the only way to view the
data is as a set of tables. Because there can be relationships between the tables, people often
assume that is what the word "relational" means. Not so. Codd was a mathematician, and
"relation" is a mathematical term from set theory: a set of tuples, which is what a table of rows
represents. "Relational" therefore means, roughly, "based on tables".

Difference between DBMS and RDBMS

DBMS:
1) It has no concept of relationships between tables.
2) It supports a single user only.
3) It treats data as files internally.
4) It supports 3 of E. F. Codd's 12 rules.
5) It has low software and hardware requirements.
6) FoxPro and IMS are examples.

RDBMS:
1) It establishes relationships between database objects, i.e., tables.
2) It supports multiple users.
3) It treats data as tables internally.
4) It supports at least 6 of E. F. Codd's rules.
5) It has high software and hardware requirements.
6) SQL Server and Oracle are examples.

EXTRA:

• A DBMS accepts "flat file" data, meaning that there is no relation among different data,
whereas an RDBMS does not accept this type of design.

• A DBMS is used for simpler business applications, whereas an RDBMS is used for more complex
applications.

• Although the foreign key concept is supported by both DBMS and RDBMS, only an RDBMS
enforces the rules.

• Large sets of data require an RDBMS solution, whereas small sets of data can be managed by
a DBMS.

©kamrul hasan, malda 1


ATTRIBUTE: In general, an attribute is a property or characteristic. Color, for example, is an attribute of
your hair. In using or programming computers, an attribute is a changeable property or characteristic of
some component of a program that can be set to different values.

In a database management system (DBMS), an attribute may describe a component of the database, such as
a table or a field, or may be used itself as another term for a field.

Simple Attribute: An attribute that consists of a single atomic value.
Example: Salary, age, etc.

Composite Attribute: An attribute whose value is not atomic; it is made up of several parts.
Example: Address: 'House_no: City: State'
         Name: 'First Name: Middle Name: Last Name'

Single Valued Attribute: An attribute that holds a single value.
Example 1: Age
Example 2: City
Example 3: Customer id

Multi Valued Attribute: An attribute that holds multiple values.
Example 1: A customer can have multiple phone numbers, email ids, etc.
Example 2: A person may have several college degrees.

Stored Attribute: An attribute whose value is stored directly and supplies the value of a related
(derived) attribute.
Example: Date of Birth

Derived Attribute: An attribute whose value is derived from a stored attribute.
Example: Age, whose value is derived from the stored attribute Date of Birth.

Complex Attribute: An attribute that is both composite and multi valued.

Tuple: In the context of databases, a tuple is one record (one row).
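
The attribute types above can be illustrated with a small sketch. This is only an illustration: the record layout, the sample values, and the helper name `age_on` are assumptions made for the example and do not come from any particular DBMS.

```python
from datetime import date


def age_on(date_of_birth: date, today: date) -> int:
    """Derive age in whole years from the stored attribute date_of_birth."""
    years = today.year - date_of_birth.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


# One tuple (record) with hypothetical values.
customer = {
    "customer_id": 1,                                # simple, single valued
    "name": {"first": "Asha", "last": "Rao"},        # composite
    "phone_numbers": ["555-0100", "555-0101"],       # multi valued
    "date_of_birth": date(1990, 6, 15),              # stored
}

# Derived attribute: computed when needed instead of being stored.
age = age_on(customer["date_of_birth"], today=date(2024, 6, 14))
```

Keeping age out of the stored record means it never goes stale; it is recomputed from Date of Birth whenever it is needed.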

Types of Database Management Systems

DBMSs come in many shapes and sizes. For a few hundred dollars, you can purchase a DBMS for
your desktop computer. For larger computer systems, much more expensive DBMSs are required.
Many mainframe-based DBMSs are leased by organizations. DBMSs of this scale are highly
sophisticated and would be extremely expensive to develop from scratch. Therefore, it is cheaper
for an organization to lease such a DBMS program than to develop it. Since there are a variety of
DBMSs available, you should know some of the basic features, as well as strengths and weaknesses,
of the major types.

After reading this lesson, you should be able to compare and contrast the structure of different
database management systems:

• Hierarchical databases.
• Network databases.
• Relational databases.
• Object-oriented databases.

There are four structural types of database management systems: hierarchical, network, relational,
and object-oriented.

1. Hierarchical Databases

Hierarchical databases, commonly used on mainframe computers, have been around for a long time.
The hierarchical model is one of the oldest methods of organizing and storing data, and it is still
used by some organizations for making travel reservations. A hierarchical database is organized in
pyramid fashion, like the branches of a tree extending downwards. Related fields or records are
grouped together so that there are higher-level records and lower-level records, just as the
parents in a family tree sit above their children.

Based on this analogy, the parent record at the top of the pyramid is called the root record. A child
record always has only one parent record to which it is linked, just like in a normal family tree. In
contrast, a parent record may have more than one child record linked to it. Hierarchical databases
work by moving from the top down. A record search is conducted by starting at the top of the
pyramid and working down through the tree from parent to child until the appropriate child record
is found. Furthermore, each child can also be a parent with children underneath it.
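
The top-down search described above can be sketched as a walk over a tree in which every record has exactly one parent. This is a minimal illustration, not how any real hierarchical DBMS is implemented; the `Record` class, the `find` function, and the sample keys are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Record:
    key: str
    children: List["Record"] = field(default_factory=list)


def find(root: Record, key: str) -> Optional[List[str]]:
    """Search from the root downwards; return the path of keys to the match, or None."""
    if root.key == key:
        return [root.key]
    for child in root.children:
        path = find(child, key)
        if path is not None:
            return [root.key] + path
    return None


# Hypothetical reservation data: the root record sits at the top of the pyramid.
root = Record("airline", [
    Record("flight-101", [Record("seat-12A"), Record("seat-12B")]),
    Record("flight-202", [Record("seat-3C")]),
])

path = find(root, "seat-3C")
missing = find(root, "seat-99")
```

The returned path runs from parent to child, which reflects the fact that a child record can only be reached through its single parent.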

The advantage of hierarchical databases is that they can be accessed and updated rapidly because
the tree-like structure and the relationships between records are defined in advance. However, this
feature is a two-edged sword. The disadvantage of this type of database structure is that each child
in the tree may have only one parent, and relationships or linkages between children are not
permitted, even if they make sense from a logical standpoint. Hierarchical databases are so rigid in
their design that adding a new field or record requires that the entire database be redefined.

2. Network Databases

Network databases resemble hierarchical databases in that records are linked in parent-child
fashion. There are a few key differences, however. Instead of looking like an upside-down tree, a
network database looks more like a cobweb or interconnected network of records. In network
databases, children are called members and parents are called owners. The most important
difference is that each child or member can have more than one parent (or owner).
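
As a rough sketch of the owner/member idea, the links can be kept in both directions so that a member may belong to several owners. The record names and the `link` helper are assumptions for illustration only.

```python
from collections import defaultdict

# owner -> set of members, and member -> set of owners
members_of = defaultdict(set)
owners_of = defaultdict(set)


def link(owner: str, member: str) -> None:
    """Connect a member record to an owner record."""
    members_of[owner].add(member)
    owners_of[member].add(owner)


# Hypothetical records: the same part is owned by a supplier and by a project.
link("supplier:Acme", "part:bolt")
link("project:Bridge", "part:bolt")
link("supplier:Acme", "part:nut")

bolt_owners = sorted(owners_of["part:bolt"])
acme_members = sorted(members_of["supplier:Acme"])
```

In a strict hierarchy the second `link` call for `part:bolt` would not be allowed, because a child may have only one parent.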

Like hierarchical databases, network databases are principally used on mainframe computers. Since
more connections can be made between different types of data, network databases are considered
more flexible. However, two limitations must be considered when using this kind of database.
Similar to hierarchical databases, network databases must be defined in advance. There is also a
limit to the number of connections that can be made between records.

3. Relational Databases

In relational databases, the relationship between data files is relational, not hierarchical.
Hierarchical and network databases require the user to pass down through a hierarchy in order to
access needed data. Relational databases connect data in different files by using common data
elements or a key field. Data in relational databases is stored in different tables, each having a key
field that uniquely identifies each row. Relational databases are more flexible than either the
hierarchical or network database structures. In relational databases, tables or files filled with data
are called relations, a tuple designates a row or record, and columns are referred to as attributes
or fields.

Relational databases work on the principle that each table has a key field that uniquely identifies
each row, and that these key fields can be used to connect one table of data to another. Thus, one
table might have a row consisting of a customer account number as the key field along with
address and telephone number. The customer account number in this table could be linked to
another table of data that also includes customer account number (a key field), but in this case,
contains information about product returns, including an item number (another key field). This key
field can be linked to another table that contains item numbers and other product information
such as production location, color, quality control person, and other data. Therefore, using this
database, customer information can be linked to specific product information.
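
The linking of tables through key fields can be sketched with Python's built-in `sqlite3` module. The table and column names (`customers`, `product_returns`, `items`, and so on) and the sample rows are assumptions chosen to mirror the example in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    account_no INTEGER PRIMARY KEY,   -- key field
    address    TEXT,
    telephone  TEXT
);
CREATE TABLE items (
    item_no             INTEGER PRIMARY KEY,   -- key field
    production_location TEXT,
    color               TEXT
);
CREATE TABLE product_returns (
    return_id  INTEGER PRIMARY KEY,
    account_no INTEGER REFERENCES customers(account_no),
    item_no    INTEGER REFERENCES items(item_no)
);
INSERT INTO customers VALUES (1001, '12 Hill Road', '555-0100');
INSERT INTO items VALUES (7, 'Plant A', 'red');
INSERT INTO product_returns VALUES (1, 1001, 7);
""")

# Follow the key fields: customer -> return -> item.
rows = conn.execute("""
    SELECT c.account_no, c.telephone, i.item_no, i.production_location, i.color
    FROM customers AS c
    JOIN product_returns AS r ON r.account_no = c.account_no
    JOIN items           AS i ON i.item_no = r.item_no
""").fetchall()
```

No hierarchy is walked here: the query simply matches equal key values, which is why any table can be connected to any other table that shares a key field.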

The relational database has become quite popular for two major reasons. First, relational
databases can be used with little or no training. Second, database entries can be modified without
redefining the entire structure. The downside of using a relational database is that searching for
data can take more time than if other methods are used.

4. Object-oriented Databases (OODBMS)

Able to handle many new data types, including graphics, photographs, audio, and video, object-
oriented databases represent a significant advance over their other database cousins. Hierarchical
and network databases are all designed to handle structured data; that is, data that fits nicely into
fields, rows, and columns. They are useful for handling small snippets of information such as
names, addresses, zip codes, product numbers, and any kind of statistic or number you can think
of. On the other hand, an object-oriented database can be used to store data from a variety of
media sources, such as photographs and text, and produce work, as output, in a multimedia
format.

Object-oriented databases use small, reusable chunks of software called objects. The objects
themselves are stored in the object-oriented database. Each object consists of two elements: 1) a
piece of data (e.g., sound, video, text, or graphics), and 2) the instructions, or software programs
called methods, for what to do with the data. Part two of this definition requires a little more
explanation. The instructions contained within the object are used to do something with the data in
the object. For example, test scores would be stored within the object, as would the instructions
for calculating the average test score.
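
The test-score example can be sketched as a class that bundles the data with its method. The class name `ScoreSheet` and the sample scores are hypothetical; a real object-oriented database would also take care of storing such objects persistently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScoreSheet:
    # Element 1: the data held by the object.
    scores: List[float] = field(default_factory=list)

    # Element 2: a method, i.e. the instructions for what to do with the data.
    def average(self) -> float:
        if not self.scores:
            raise ValueError("no scores recorded")
        return sum(self.scores) / len(self.scores)


exam = ScoreSheet([70.0, 85.0, 100.0])
mean = exam.average()
```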

Object-oriented databases have two disadvantages. First, they are more costly to develop. Second,
most organizations are reluctant to abandon or convert from those databases that they have
already invested money in developing and implementing. However, the benefits to object-oriented
databases are compelling. The ability to mix and match reusable objects provides incredible
multimedia capability. Healthcare organizations, for example, can store, track, and recall CAT scans,
X-rays, electrocardiograms and many other forms of crucial data.

Database Architecture

DBMSs do not all conform to the same architecture. The three-level architecture forms the basis of
modern database architectures. This is in agreement with the ANSI/SPARC study group on Database
Management Systems (ANSI/SPARC is the American National Standards Institute/Standards Planning
and Requirements Committee). The architecture for DBMSs is divided into three general levels:

1. External
2. Conceptual
3. Internal or Physical

Figure 1: Three-level architecture

1. the external level: concerned with the way individual users see the data
2. the conceptual level: can be regarded as a community user view, a formal description of the
data of interest to the organisation, independent of any storage considerations
3. the internal level: concerned with the way in which the data is actually stored

Figure: How the three-level architecture works

External View

A user is anyone who needs to access some portion of the data. Users may range from application
programmers to casual users with ad hoc queries. Each user has a language at his/her disposal.

The application programmer may use a high-level language (e.g. COBOL), while the casual user will
probably use a query language.

Regardless of the language used, it will include a data sublanguage (DSL), which is that subset of
the language concerned with storage and retrieval of information in the database. The DSL may or
may not be apparent to the user.

A DSL is a combination of two languages:

• a data definition language (DDL), which provides for the definition or description of database
objects
• a data manipulation language (DML), which supports the manipulation or processing of database
objects

Each user sees the data in terms of an external view. An external view is defined by an external
schema, which consists basically of descriptions of each of the various types of external record
in that external view, together with a definition of the mapping between the external schema and
the underlying conceptual schema.
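
As a loose illustration, a SQL view can play the role of an external view: it exposes only part of an underlying table, and the view definition acts as the mapping. The sketch below uses Python's built-in `sqlite3` module; the table `employee`, the view `employee_directory`, and the sample row are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define a table and an external view over it.
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.execute(
    "CREATE VIEW employee_directory AS SELECT emp_id, name, dept FROM employee"
)

# DML: manipulate and retrieve data.
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 'Sales', 52000.0)")
cursor = conn.execute("SELECT * FROM employee_directory")
columns = [col[0] for col in cursor.description]
directory = cursor.fetchall()
```

A user who works only with `employee_directory` never sees the salary column, even though it exists in the underlying table.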

Conceptual View

The middle level in the three-level architecture is the conceptual view level, also referred to as the
logical view level. It describes the entire structure of the database: entities, attributes, data types,
relationships, constraints, and user operations. It hides the details of physical storage structures. The
conceptual view level supports the external view level in presenting data to end-users as they need it.
This view is relatively constant, and the Database Administrator designs it after determining the present
and future information needs of the organization. To expand the conceptual view, the Database
Administrator adds new objects to fulfill the requirements of the organization, without affecting the
external view.

The conceptual view is defined by means of the conceptual schema, which includes definitions of each of
the various conceptual record types. The conceptual schema is a complete description of the database
structure, such as every record type with all its fields. It also includes security and integrity rules.
The conceptual schema is written in DDL, compiled by the DBMS, and stored in its data dictionary. The DBMS
uses the conceptual schema to create the logical record interface, which the external view of a
particular user relies on to present data to that user. In effect, the conceptual view level is a
collection of logical records.

Internal View (Physical)

The internal view is a low-level representation of the entire database consisting of multiple
occurrences of multiple types of internal (stored) records.

It is, however, at one remove from the physical level, since it deals neither in physical
records or blocks nor in device-specific constraints such as cylinder or track sizes. Details of
the mapping to physical storage are highly implementation specific and are not expressed in the
three-level architecture.

The internal view is described by the internal schema, which defines:

• the various types of stored record
• what indices exist
• how stored fields are represented
• what physical sequence the stored records are in

In effect, the internal schema is the storage structure definition.

Advantages of Database Management System:

A DBMS has a number of advantages over the traditional computer file processing
approach. The DBA must keep these benefits or capabilities in mind while designing databases and
while coordinating and monitoring the DBMS.
The major advantages of a DBMS are described below.
1. Controlling Data Redundancy:
In non-database systems (traditional computer file processing), each application program has its
own files, so duplicate copies of the same data are created in many places. In a DBMS,
all the data of an organization is integrated into a single database. The data is recorded in only
one place in the database and is not duplicated. For example, the dean's faculty file and the
faculty payroll file contain several items that are identical. When they are converted into a
database, the data is integrated so that multiple copies of the same data are reduced to a
single copy.
In a DBMS, data redundancy can be controlled or reduced but is not removed completely.
Sometimes it is necessary to duplicate data items in order to relate tables to each other.
Controlling data redundancy saves storage space. It is also useful when retrieving
data from the database using queries.
2. Data Consistency:
Controlling data redundancy also brings data consistency. If a data item appears only
once, any update to its value has to be performed only once, and the updated value (the new value
of the item) is immediately available to all users.

If the DBMS has reduced redundancy to a minimum, the database system enforces
consistency. This means that when a data item appears more than once in the database and is
updated, the DBMS automatically updates each occurrence of that data item.
3. Data Sharing:
In a DBMS, data can be shared by the authorized users of the organization. The DBA manages the
data and gives users rights to access it. Many users can be authorized to access the same set of
information simultaneously. Remote users can also share the same data. Similarly, the data of the
same database can be shared between different application programs.
4. Data Integration:
In a DBMS, data is stored in tables. A single database contains multiple tables, and
relationships can be created between tables (or associated data entities). This makes it easy to
retrieve and update data.
5. Integrity Constraints:
Integrity constraints, or consistency rules, can be applied to a database so that only correct data
can be entered into it. The constraints may be applied to data items within a single record, or
they may be applied to relationships between records.
Examples of integrity constraints are:
(i) The 'Issue Date' of a book in a library system cannot be later than the corresponding 'Return Date'.
(ii) The marks obtained in a subject cannot exceed 100.
(iii) Registration numbers of BCS and MCS students must start with 'BCS' and 'MCS' respectively.
There are also some standard constraints that are intrinsic to most DBMSs. These are:

Constraint Name   Description
PRIMARY KEY       Designates a column or combination of columns as the primary key; therefore,
                  values of the columns cannot be repeated or left blank.
FOREIGN KEY       Relates one table with another table.
UNIQUE            Specifies that values of a column or combination of columns cannot be repeated.
NOT NULL          Specifies that a column cannot contain empty values.
CHECK             Specifies a condition which each row of a table must satisfy.

Most DBMSs provide facilities for applying integrity constraints. The database designer
(or DBA) identifies integrity constraints during database design. The application programmer can
also implement integrity constraints in program code while developing the application program.
Integrity constraints are checked automatically at the time of data entry or when a record is
updated. If the data entry operator (end-user) violates an integrity constraint, the data is not
inserted or updated in the database, and the system displays a message. For example,
when you withdraw money from the bank with an ATM card, your account balance is compared
with the amount you are withdrawing. If your account balance is less than the amount
you want to withdraw, a message is displayed on the screen to inform you about your account
balance.
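
Automatic checking at data entry can be sketched with Python's built-in `sqlite3` module, using rules (ii) and (iii) from the examples above. The table `marks`, its columns, and the helper `try_insert` are assumptions made for this illustration, and the exact wording of the error messages depends on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE marks (
        reg_no   TEXT PRIMARY KEY
                 CHECK (reg_no LIKE 'BCS%' OR reg_no LIKE 'MCS%'),
        subject  TEXT NOT NULL,
        obtained INTEGER NOT NULL CHECK (obtained <= 100)
    )
""")


def try_insert(row):
    """Return None on success, or the DBMS error message if a constraint is violated."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute("INSERT INTO marks VALUES (?, ?, ?)", row)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


ok = try_insert(("BCS001", "Databases", 88))
too_high = try_insert(("BCS002", "Databases", 120))    # violates CHECK on obtained
bad_prefix = try_insert(("XYZ003", "Databases", 50))   # violates CHECK on reg_no
duplicate = try_insert(("BCS001", "Networks", 70))     # violates PRIMARY KEY
count = conn.execute("SELECT COUNT(*) FROM marks").fetchone()[0]
```

Only the first row is stored; each rejected row produces an error message that the application can show to the data entry operator.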
6. Data Security:
Data security is the protection of the database from unauthorized users. Only authorized
persons are allowed to access the database. Some users may be allowed to access only a
part of the database, i.e., the data that is related to them or to their department. Usually, the
DBA or the head of a department can access all the data in the database. Some users may be
permitted only to retrieve data, whereas others are allowed to retrieve as well as update data.
Database access is controlled by the DBA, who creates user accounts and grants rights to
access the database. Typically, users or groups of users are given usernames protected by
passwords.
Most DBMSs provide a security sub-system, which the DBA uses to create user accounts
and to specify account restrictions. The user enters his/her account number (or username)
and password to access data in the database. For example, if you have an e-mail account at
"hotmail.com" (a popular website), you have to give your correct username and password
to access it. Similarly, when you insert your ATM card into the Automated Teller
Machine (ATM) at a bank, the machine reads your ID number from the card and then asks you
to enter your PIN code (or password). In this way, you can access your account.
7. Data Atomicity:
A transaction in commercial databases is referred to as an atomic unit of work. For example, when
you purchase something at a point of sale (POS) terminal, a number of tasks are performed, such as:

• Company stock is updated.
• The amount is added to the company's account.
• The salesperson's commission increases, etc.

All these tasks collectively are called an atomic unit of work, or transaction. Either all of the
tasks are completed, or any partially completed tasks are rolled back. In this way the DBMS
ensures that only consistent data exists within the database.
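
The all-or-nothing behaviour can be sketched with Python's built-in `sqlite3` module, where a connection used as a context manager commits on success and rolls back when an exception is raised. The tables `stock` and `account`, the sample values, and the function `sell` are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stock   (item TEXT PRIMARY KEY, qty INTEGER NOT NULL CHECK (qty >= 0));
CREATE TABLE account (name TEXT PRIMARY KEY, balance REAL NOT NULL);
INSERT INTO stock   VALUES ('pen', 5);
INSERT INTO account VALUES ('company', 0.0);
""")


def sell(item, qty, price):
    """Record a sale as one transaction; return True if it was committed."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = 'company'",
                (qty * price,),
            )
            conn.execute(
                "UPDATE stock SET qty = qty - ? WHERE item = ?", (qty, item)
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = sell("pen", 2, 1.5)     # enough stock: both updates are kept
second = sell("pen", 10, 1.5)   # stock would go negative: both updates are undone
qty = conn.execute("SELECT qty FROM stock WHERE item = 'pen'").fetchone()[0]
balance = conn.execute(
    "SELECT balance FROM account WHERE name = 'company'"
).fetchone()[0]
```

In the second sale the account update runs first and succeeds, but it is rolled back when the stock update fails, so the account never shows money for goods that were not sold.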
8. Database Access Language:
Most of the DBMSs provide SQL as standard database access language. It is used to access data
from multiple tables of a database.
9. Development of Application:
The cost and time for developing new applications is also reduced. The DBMS provides tools that
can be used to develop application programs. For example, some wizards are available to generate
Forms and Reports. Stored procedures (stored on server side) also reduce the size of application
programs.
10. Creating Forms:
The Form is a very important object of a DBMS. You can create Forms easily and quickly in a DBMS.
Once a Form is created, it can be used many times and modified easily. The created
Forms are also saved along with the database and behave like a software component.
A Form provides an easy, user-friendly interface for entering data into the database, editing
data, and displaying data from the database. Non-technical users can also perform various operations
on databases through Forms without going into the technical details of a database.
11. Report Writers:
Most of the DBMSs provide the report writer tools used to create reports. The users can create
reports very easily and quickly. Once a report is created, it can be used many times and it can be
modified very easily. The created reports are also saved along with database and behave like a
software component.
12. Control Over Concurrency:
In a computer file-based system, if two users are allowed to access data simultaneously, it is
possible that they will interfere with each other. For example, if both users attempt to perform an
update operation on the same record, one may overwrite the values recorded by the other.
Most DBMSs have sub-systems to control concurrency so that transactions are always
recorded accurately.
13. Backup and Recovery Procedures:
In a computer file-based system, the user must back up data regularly to protect
valuable data from damage due to failures of the computer system or application program. This is a
time-consuming method if the volume of data is large. Most DBMSs provide 'backup and
recovery' sub-systems that automatically create backups of data and restore data if required.
For example, if the computer system fails in the middle (or at the end) of an update operation of a
program, the recovery sub-system is responsible for making sure that the database is restored to
the state it was in before the program started executing.
14. Data Independence:
The separation of the data structure of the database from the application programs that access
the data is called data independence. In a DBMS, the database and the application programs are
separated from each other, and the DBMS sits in between them. You can change the structure of the
database without modifying the application program. For example, you can modify the size or data
type of a data item (a field of a database table).
In a computer file-based system, on the other hand, the structure of the data items is built into
the individual application programs. Thus the program is dependent on the data file and vice versa.
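
A small sketch of data independence, using Python's built-in `sqlite3` module: the structure of a table is changed while the application function stays untouched. The table `student`, its columns, and the function `student_names` are hypothetical names for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (reg_no TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES ('BCS001', 'Asha')")


def student_names():
    """Application code: it names the columns it needs and nothing else."""
    query = "SELECT name FROM student ORDER BY reg_no"
    return [row[0] for row in conn.execute(query)]


before = student_names()

# Structural change made by the DBA; the application function is not edited.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")

after = student_names()
```

The function keeps working because it asks the DBMS for data by column name instead of relying on the physical layout of a file. A change that removes or renames a column the program uses would still require a program change.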
15. Advanced Capabilities:
A DBMS also provides advanced capabilities for online access and reporting of data over the Internet.
Today, most database systems are online. Database technology is used in conjunction
with Internet technology to access data on web servers.

Disadvantages of Database Management System (DBMS):

Although there are many advantages, a DBMS also has some disadvantages.
These are:
1. Cost of Hardware & Software:
A processor with high data processing speed and a large memory is required to run the
DBMS software. This means that you have to upgrade the hardware used for the file-based system.
Similarly, DBMS software is also very costly.
2. Cost of Data Conversion:
When a computer file-based system is replaced with a database system, the data stored in data
files must be converted to database files. Converting data from data files into a database is a
difficult and time-consuming process. You have to hire a DBA (or database designer) and a system
designer along with application programmers; alternatively, you have to engage a software
house. So a lot of money has to be paid for developing the database and related software.
3. Cost of Staff Training:
Most DBMSs are complex systems, so users need training to use them.
Training is required at all levels, including programming, application development, and database
administration. The organization has to spend a large amount on training staff to run the
DBMS.
4. Appointing Technical Staff:
Trained technical staff, such as database administrators and application programmers, are
required to handle the DBMS. You have to pay handsome salaries to these persons. Therefore, the
system cost increases.

5. Database Failures:
In most organizations, all data is integrated into a single database. If the database is corrupted
due to a power failure or damage to the storage media, valuable data may be lost or the
whole system may stop.

Database administrator

A database administrator (short form DBA) is a person responsible for the installation, configuration,
upgrade, administration, monitoring and maintenance of databases in an organization.

The DBA's responsibilities include the following:

• deciding the information content of the database, i.e. identifying the entities of interest to
the enterprise and the information to be recorded about those entities. This is defined by
writing the conceptual schema using the DDL
• deciding the storage structure and access strategy, i.e. how the data is to be represented, by
writing the storage structure definition. The associated internal/conceptual mapping must
also be specified using the DDL
• liaising with users, i.e. ensuring that the data they require is available and writing the
necessary external schemas and conceptual/external mappings (again using the DDL)
• defining authorisation checks and validation procedures. Authorisation checks and
validation procedures are extensions to the conceptual schema and can be specified using
the DDL
• defining a strategy for backup and recovery, for example periodic dumping of the database
to a backup tape and procedures for reloading the database from backup, or use of a log file
in which each log record contains the values of database items before and after a change and
can be used for recovery purposes
• monitoring performance and responding to changes in requirements, i.e. changing details of
storage and access, thereby organising the system so as to get the performance that is 'best
for the enterprise'

The DBA's duties

• The exact set of database administration duties of each DBA depends on his/her job
profile, the IT policies of the company he/she works for and, last but not least, the
concrete parameters of the database management system in use. A DBA must be able to
think logically to solve problems and to work easily in a team with both DBA colleagues
and staff with no computer training.

Frequent updates

• Some of the basic database management tasks of a DBA supplement the work of a system
administrator and include hardware/software configuration, as well as installation of new
DBMS software versions. These are very important tasks: the proper installation and
configuration of the database management software and its regular updates are crucial for
the optimal functioning of the DBMS, and hence of the databases, since new releases
often contain bug fixes and security updates.

Data analysis and security

• Database security administration and data analysis are among the major duties of a DBA.
He/she is responsible for controlling DBMS security by adding and removing users,
managing database quotas, and checking for security issues. The DBA is also engaged in
analyzing database contents and improving data storage efficiency by optimizing the
use of indexes, enabling 'Parallel Query' execution, etc.

Database design

• Apart from the tasks related to the logical and the physical side of the database
management process, DBAs may also take part in database design operations. Their main
role is to give developers recommendations about the specifics of the DBMS, thus helping them
avoid potential database performance issues. Other important tasks of DBAs are
related to data modeling aimed at optimizing the system layout, as well as to the analysis
and creation of new databases.

Advantages and Limitations

A good database management system (DBMS) should provide the following advantages over a
conventional system:

Advantages

1. Reduced data redundancy
2. Reduced updating errors and increased consistency
3. Greater data integrity and independence from applications programs
4. Improved data access to users through use of host and query languages
5. Improved data security
6. Reduced data entry, storage, and retrieval costs

However, the following can be viewed as some of the limitations of a database:

Disadvantages

1. Database systems are complex, difficult, and time-consuming to design
2. Substantial hardware and software start-up costs
3. Damage to database affects virtually all applications programs
4. Extensive conversion costs in moving from a file-based system to a database system
5. Initial training required for all programmers and users

What is meant by constraints in DBMS ?

Constraints within a database are rules which control values allowed in columns and also enforce the
integrity between columns and tables.

An example of a column constraint would be a 'CHECK' constraint on a column to limit the values allowed.
For example, you could specify the datatype of a column to be tinyint, which (in SQL Server) can store
values from 0 to 255, but then specify a CHECK constraint that limits values to 1-99, like this:

CREATE TABLE my_example (SomeColumn tinyint, CONSTRAINT ChkValue CHECK (SomeColumn BETWEEN 1 AND 99))
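The CHECK rule above can be tried out with a small sketch using Python's built-in sqlite3 module. SQLite is an assumption made only because it ships with Python; it has no range-limited tinyint, so the column is declared INTEGER and the CHECK constraint does all of the limiting. The helper name try_insert is hypothetical.

```python
import sqlite3

# In-memory SQLite database (an assumption; the notes do not name a DBMS here).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_example ("
    " SomeColumn INTEGER,"
    " CONSTRAINT ChkValue CHECK (SomeColumn BETWEEN 1 AND 99))"
)


def try_insert(value):
    """Return True if the row is accepted, False if the CHECK rejects it."""
    try:
        conn.execute("INSERT INTO my_example (SomeColumn) VALUES (?)", (value,))
        return True
    except sqlite3.IntegrityError:
        return False


# Values inside 1..99 are stored; 0 and 100 violate the constraint.
results = {value: try_insert(value) for value in (1, 50, 99, 0, 100)}
print(results)
```

Only the three in-range values end up in the table; the rejected inserts leave it unchanged.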

Another common constraint is UNIQUE, which ensures that the value in a column is unique within the table.
This would be used on something like a customer code in a customer table, where you need to ensure that
no two customers have the same code. This can also be done with a PRIMARY KEY constraint.

To enforce integrity, you can specify that a column has a FOREIGN KEY relationship with the values in
another table, a concept at the heart of a relational database. An example of this would be a FK constraint
on a 'sales order' table linking the order customer code back to the customer table. The FK serves two
functions: (1) the code entered on the order must be valid and (2) the customer can't be deleted if there are
any orders.
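The two functions of the foreign key described above can be sketched as follows, again using Python's sqlite3 module as a stand-in DBMS. The table and column names (customer, sales_order, customer_code) and the helper is_rejected are hypothetical. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON is issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when enabled
conn.execute("CREATE TABLE customer (customer_code TEXT PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE sales_order ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_code TEXT NOT NULL REFERENCES customer (customer_code))"
)
conn.execute("INSERT INTO customer VALUES ('C001', 'Acme')")
conn.execute("INSERT INTO sales_order VALUES (1, 'C001')")


def is_rejected(sql):
    """Run a statement and report whether an integrity error stopped it."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# (1) An order for a customer code that does not exist is refused.
bad_order_rejected = is_rejected("INSERT INTO sales_order VALUES (2, 'C999')")
# (2) A customer that still has orders cannot be deleted.
delete_rejected = is_rejected("DELETE FROM customer WHERE customer_code = 'C001'")
print(bad_order_rejected, delete_rejected)
```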

The simplest and most common constraint is the nullability of a column, specified with the NULL and NOT
NULL keywords when creating columns:

CREATE TABLE MyExample (ID int NOT NULL, Description varchar(50) NULL)

Integrity Constraints
Integrity constraints ensure that changes made to the database by authorized users do not result in a
loss of data consistency. Thus, integrity constraints guard against accidental damage to the database.
Examples of integrity constraints are:

• An instructor name cannot be NULL.
• No two instructors can have the same instructor ID.
• Every department name in the course relation must have a matching department name in the
department relation.
• The budget of a department must be greater than Rs. 0.00.

In general, an integrity constraint can be an arbitrary predicate pertaining to the database. However,
arbitrary predicates may be costly to test. Thus, most database systems allow one to specify integrity
constraints that can be tested with minimal overhead.
Integrity constraints are usually identified as part of the database schema design process, and declared as
part of the create table command used to create relations. However, integrity constraints can also be added
to an existing relation by using the command alter table name add constraint, where constraint can be any
constraint on the relation. When such a command is executed, the system first ensures that the relation
satisfies the specified constraint. If it does, the constraint is added to the relation; if not, the command is
rejected.

Constraints on a single relation: The create table command may also include integrity-constraint statements.
In addition to the primary-key constraint, there are a number of other constraints that can be included in
the create table command. The allowed integrity constraints include

• Not null
• Unique
• Check (<predicate>)

Not Null Constraint -
The null value is a member of all domains and as a result is a legal value for every
attribute in SQL by default. For certain attributes, however, a null value may be inappropriate. Consider a
tuple in the student relation where name is null. Such a tuple gives student information for an unknown
student; thus, it does not contain useful information. Similarly, we would not want the department budget to
be null. In cases such as this, we wish to forbid null values, and we can do so by restricting the domain of the
attributes name and budget to exclude null values by declaring them as follows.

Name varchar2 (20) not null
Budget numeric (12,2) not null

The not null specification prohibits the insertion of a null value for the attribute. Any database modification
that would cause a null to be inserted in an attribute declared to be not null generates an error diagnostic.
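A minimal sketch of the error diagnostic mentioned above, assuming SQLite through Python's sqlite3 module (the varchar2 type in the declaration above is Oracle-specific, so generic types are used). The department table layout and the helper is_rejected are illustrative assumptions. Both an INSERT and an UPDATE that would place a null in budget are refused.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE department ("
    " dept_name VARCHAR(20) PRIMARY KEY,"
    " budget NUMERIC(12,2) NOT NULL)"
)
conn.execute("INSERT INTO department VALUES ('Physics', 70000)")


def is_rejected(sql):
    """Return True when the DBMS raises an integrity error for the statement."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Inserting a null budget violates NOT NULL ...
insert_rejected = is_rejected("INSERT INTO department VALUES ('Music', NULL)")
# ... and so does updating an existing budget to null.
update_rejected = is_rejected(
    "UPDATE department SET budget = NULL WHERE dept_name = 'Physics'"
)
print(insert_rejected, update_rejected)
```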

Unique Constraint - unique (Aj1, Aj2, … , Ajm)
The unique specification says that attributes Aj1, Aj2, … , Ajm form a candidate key; that is, no two
tuples in the relation can be equal on all the listed attributes. However, candidate key attributes
are permitted to be null unless they have explicitly been declared to be not null. Recall that a null
value does not equal any other value.
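The interaction between unique and null can be illustrated with a short sketch, assuming SQLite via Python's sqlite3 module. The instructor table and its email column are hypothetical. SQLite follows the behaviour described above and accepts several nulls in a unique column; other systems may handle this case differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " email TEXT,"
    " UNIQUE (email))"
)
conn.execute("INSERT INTO instructor VALUES (1, 'Katz', 'katz@example.edu')")
# Two rows with a null email: null does not equal null, so both are accepted.
conn.execute("INSERT INTO instructor VALUES (2, 'Kim', NULL)")
conn.execute("INSERT INTO instructor VALUES (3, 'Wu', NULL)")

try:
    # A repeated non-null value breaks the unique constraint.
    conn.execute("INSERT INTO instructor VALUES (4, 'Gold', 'katz@example.edu')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

null_rows = conn.execute(
    "SELECT COUNT(*) FROM instructor WHERE email IS NULL"
).fetchone()[0]
print(duplicate_rejected, null_rows)
```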

Check (<predicate>) Constraint -
A common use of the check clause is to ensure that attribute
values satisfy specified conditions, in effect creating a powerful type system. For instance, a clause
check (budget>0) in the create table command for relation department would ensure that the
value of budget is strictly positive.

Create table section
(course_id varchar (8),
semester varchar (6),
building varchar (15),
primary key (course_id),
check (semester in ('Fall', 'Winter', 'Spring', 'Summer')));
Here, we use the check clause to simulate an enumerated type, by specifying that semester must
be one of ‘Fall’, ‘Winter’, ‘Spring’, or ‘Summer’. Thus, the check clause permits attribute domains to
be restricted in powerful ways that most programming-language type systems do not permit.

Data Integrity vs Data Security

Data is the most important asset of any organization. Therefore, it must be kept valid and secure
at all times. Data integrity and data security are two important aspects of making
sure that data is usable by its intended users. Data integrity makes sure that the data is valid. Data
security makes sure that data is protected against loss and unauthorized access.

What is Data Integrity?

Data Integrity defines a quality of data, which guarantees the data is complete and has a whole
structure. Data integrity is most often talked about with regard to data residing in databases, and
referred to as database integrity as well. Data integrity is preserved only if and when the data is
satisfying all the business rules and other important rules. These rules might be how each piece of
data is related to each other, validity of dates, lineage, etc. According to data architecture
principles, functions such as data transformation, data storage, metadata storage and lineage
storage must guarantee the integrity of data. That means, data integrity should be maintained
during transfer, storage and retrieval.

If data integrity is preserved, the data can be considered consistent and can be certified and
reconciled with confidence. In terms of data integrity in databases (database integrity), guaranteeing
that integrity is preserved means ensuring that the data is an accurate
reflection of the universe it is modeled after. In other words, the data
stored in the database must correspond exactly to the real-world details it models. Entity
integrity, referential integrity and domain integrity are several popular types of integrity constraints
used for preserving data integrity in databases.

(Database integrity ensures that data entered into the database is accurate, valid, and consistent.
Any applicable integrity constraints and data validation rules must be satisfied before permitting a
change to the database.)

Three basic types of database integrity constraints are:

• Entity integrity, not allowing multiple rows to have the same identity within a table.
• Domain integrity, restricting data to predefined data types, e.g. dates.
• Referential integrity, requiring the existence of a related row in another table, e.g. a
customer for a given customer ID.

What is Data Security?

Data security deals with the prevention of data corruption through the use of controlled access
mechanisms. Data security makes sure that data is accessed only by its intended users, thus ensuring
the privacy and protection of personal data. Several technologies are used for ensuring data
security. OTFE (on-the-fly encryption) uses cryptographic techniques for encrypting data on hard
drives. Hardware-based security solutions prevent unauthorized read/write access to data and thus
provide stronger protection than software-based security solutions, because software-based
solutions may prevent data loss or theft but cannot prevent intentional corruption (which
makes data unrecoverable or unusable) by a hacker. Hardware-based two-factor authentication
schemes are highly secure because the attacker needs physical access to the equipment and site.
However, the dongles can be stolen and used by almost anybody else. Backing up data is also used as
a mechanism against loss of data. Data masking is another method used for data security, by which
data is obscured. This is done to maintain the security and sensitivity of personal data against
unauthorized access. Data erasure is the method of overwriting data to ensure that it is not
leaked after its lifetime has passed.

What is the difference between Data Integrity and Data Security?

Data integrity and data security are two different aspects that make sure the usability of data is
preserved at all times. The main difference between integrity and security is that integrity deals with
the validity of data, while security deals with the protection of data. Backing up, designing suitable user
interfaces and error detection/correction in data are some of the means to preserve integrity,
while authentication/authorization, encryption and masking are some of the popular means of
data security. Suitable control mechanisms can be used for both security and integrity.

What’s referential integrity?

Referential integrity is a relational database concept in which multiple tables share a relationship
based on the data stored in the tables, and that relationship must remain consistent.

The concept of referential integrity, and one way in which it’s enforced, is best illustrated by an
example. Suppose company X has 2 tables, an Employee table, and an Employee Salary table. In the
Employee table we have 2 columns – the employee ID and the employee name. In the Employee
Salary table, we have 2 columns – the employee ID and the salary for the given ID.

Now, suppose we wanted to remove an employee because he no longer works at company X.
Then, we would remove his entry in the Employee table. Because he also exists in the Employee
Salary table, we would have to manually remove him from there as well. Manually removing the
employee from the Employee Salary table can become quite a pain. And if there are other tables in
which Company X uses that employee, then he would have to be deleted from those tables as well,
which is an even bigger pain.

By enforcing referential integrity, we can solve that problem, so that we wouldn't have to manually
delete him from the Employee Salary table (or any others). Here's how: first we would define the
employee ID column in the Employee table to be our primary key. Then, we would define the
employee ID column in the Employee Salary table to be a foreign key that points to the primary key,
that is, the employee ID column in the Employee table. Once we define our foreign-to-primary key
relationship, we would need to add what's called a 'constraint' to the Employee Salary table. The
constraint that we would add in particular is called a 'cascading delete': any
time an employee is removed from the Employee table, any entries that employee has in the
Employee Salary table would also automatically be removed from the Employee Salary table.

Note in the example given above that referential integrity is something that must be enforced, and
that we enforced only one rule of referential integrity (the cascading delete). There are actually 3
rules that referential integrity enforces:

1. We may not add a record to the Employee Salary table unless the foreign key for that record
points to an existing employee in the Employee table.

2. If a record in the Employee table is deleted, all corresponding records in the Employee Salary table
must be deleted using a cascading delete. This was the example we had given earlier.

3. If the primary key for a record in the Employee table changes, all corresponding records in the
Employee Salary table must be modified using what's called a cascading update.
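The three rules above can be sketched for the Employee and Employee Salary tables, assuming SQLite through Python's sqlite3 module. The column names (emp_id, name, salary) and the sample rows are hypothetical; the cascading behaviour comes from the ON DELETE CASCADE and ON UPDATE CASCADE clauses on the foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when enabled
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE employee_salary ("
    " emp_id INTEGER NOT NULL,"
    " salary INTEGER NOT NULL,"
    " FOREIGN KEY (emp_id) REFERENCES employee (emp_id)"
    " ON DELETE CASCADE ON UPDATE CASCADE)"
)
conn.executemany("INSERT INTO employee VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])
conn.executemany("INSERT INTO employee_salary VALUES (?, ?)", [(1, 50000), (2, 42000)])

# Rule 1: a salary row for a non-existent employee is refused.
try:
    conn.execute("INSERT INTO employee_salary VALUES (99, 10000)")
    orphan_insert_rejected = False
except sqlite3.IntegrityError:
    orphan_insert_rejected = True

# Rule 2: deleting employee 1 also deletes that employee's salary row.
conn.execute("DELETE FROM employee WHERE emp_id = 1")

# Rule 3: changing employee 2's key to 20 rewrites the matching salary row.
conn.execute("UPDATE employee SET emp_id = 20 WHERE emp_id = 2")

salary_rows = conn.execute(
    "SELECT emp_id, salary FROM employee_salary ORDER BY emp_id"
).fetchall()
print(orphan_insert_rejected, salary_rows)
```

After the delete and the key change, the only remaining salary row is the one for the renumbered employee.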

It’s worth noting that most RDBMS’s – relational databases like Oracle, DB2, Teradata, etc. – can
automatically enforce referential integrity if the right settings are in place. But, a large part of the
burden of maintaining referential integrity is placed upon whoever designs the database schema –
basically whoever defined the tables and their corresponding structure/relationships in the database
that you are using. Referential integrity is an important concept and you simply must know it for any
programmer interview.

Example
Let's look at a simple example using two tables: SALES_REPS and OFFICES. The following SQL statement is syntactically
correct, and with the current state of our example database this statement would execute and add a new sales rep,
"Doug Henry", who works in office number 45:

INSERT INTO SALES_REPS (EMPL_NUM, NAME, REP_OFFICE)
VALUES (69, 'Doug Henry', 45)

No validity checking has been enforced, and even if office number 45 does not exist in the OFFICES table, Doug Henry
will still exist in our database.

To remedy this situation, we’ll define one RI rule linking the OFFICES.OFFICE field (as the primary key) to the
SALES_REPS.REP_OFFICE field (as the foreign key). With this rule in place, the previous SQL statement would not
execute without returning an error. Before adding Doug to the SALES_REPS table, the Advantage Database Server will
first ensure that all foreign keys in this new row reference existing primary keys in their parent tables. Because office
number 45 does not exist, the INSERT operation will fail. The application developer does not write any code to enforce
this rule. The database server does all the work; the developer can simply catch this error, notify the user of the
violation, and request correct data.

Update and Delete Rules


Referential Integrity allows update and delete rules to be specified for each relation you define. These rules affect the
behavior of the Advantage Database Server when updating and deleting existing parent rows. There are four possible
update/delete rules that can be performed:

Delete Rules
 RESTRICT - Prevents deletion of a row from a parent table if children of the row still exist in a child table. If
applied to our example above, this would make it illegal to delete an office if any sales representatives were
still assigned to the office.

 CASCADE - When a parent row is deleted, automatically delete all child rows. If applied to our example above,
deleting an office would automatically delete every sales representative assigned to the office.

 SET_NULL - When a parent row is deleted, automatically set all foreign key values to NULL. If applied to our
example above, this would make deleting an office set every sales representative’s office assignment to an
unknown office.

 SET_DEFAULT - When a parent row is deleted, automatically set all foreign key values to their default values.
See Advantage Data Dictionary for more information on default field values. If applied to our example above,
this rule would assign sales representatives to some default office if their office were ever removed. The
default office is stored within the data dictionary and is the default field value for the office field.

Update Rules
 RESTRICT - Prevents updating a primary key if foreign key values still exist in a child table. If applied to our
example above, this would make it illegal to change an office number if sales representatives were still
assigned to the office.

 CASCADE - When a primary key is updated, automatically update all foreign key values. If applied to our
example above, updating an office number would also update the REP_OFFICE field for each sales
representative currently assigned to the office.

 SET_NULL - When a primary key is updated, automatically set all foreign key values to NULL. If applied to our
example above, this would make updates to the office number set every sales representative’s office
assignment to an unknown office.

 SET_DEFAULT - When a primary key is updated, automatically set all foreign key values to their default values.
See Advantage Data Dictionary for more information on default field values. If applied to our example above,
this rule would assign sales representatives to some default office if their office number were ever updated.
The default office is stored within the data dictionary and is the default field value for the office field.
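The four delete rules can be compared side by side with a small sketch. It assumes SQLite through Python's sqlite3 module rather than the Advantage Database Server; standard SQL (and SQLite) spell the last two rules SET NULL and SET DEFAULT, without the underscore. Office 1 as the default office, the city names, and the helper functions are hypothetical.

```python
import sqlite3


def build(delete_rule):
    """Create OFFICES and SALES_REPS with the given ON DELETE rule."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE offices (office INTEGER PRIMARY KEY, city TEXT)")
    conn.execute(
        "CREATE TABLE sales_reps ("
        " empl_num INTEGER PRIMARY KEY,"
        " name TEXT,"
        " rep_office INTEGER DEFAULT 1"
        f" REFERENCES offices (office) ON DELETE {delete_rule})"
    )
    conn.executemany(
        "INSERT INTO offices VALUES (?, ?)", [(1, "Head office"), (45, "Branch")]
    )
    conn.execute("INSERT INTO sales_reps VALUES (69, 'Doug Henry', 45)")
    return conn


def office_after_delete(delete_rule):
    """Delete office 45 and report what happened to Doug Henry's assignment."""
    conn = build(delete_rule)
    try:
        conn.execute("DELETE FROM offices WHERE office = 45")
    except sqlite3.IntegrityError:
        return "delete rejected"
    row = conn.execute(
        "SELECT rep_office FROM sales_reps WHERE empl_num = 69"
    ).fetchone()
    return row[0] if row else "rep deleted"


outcomes = {
    rule: office_after_delete(rule)
    for rule in ("RESTRICT", "CASCADE", "SET NULL", "SET DEFAULT")
}
print(outcomes)
```

With SET NULL the assignment becomes None (an unknown office), and with SET DEFAULT the rep moves to the assumed default office 1.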

NULL Values
Advantage primary keys can contain one NULL value. Advantage foreign keys (as long as they are not
defined with the unique index type) can contain multiple NULL values. NULL values in foreign keys are often
a necessity when dealing with updates to a database that utilizes RI constraints to break a dependency
between primary and foreign keys.

Behind The Scenes


If you have defined multiple RI rules in your database (a very likely scenario) it is important to keep in mind
the operations that the server will be performing on your behalf. Tables involved in RI rules are grouped
together into graphs. The server must have all tables in the graph open to enforce the RI rules that you have
placed on the database.

For example, if your application is designed to use the example database described above, and opens the
OFFICES table, no extra work will be done. But the first time you attempt an update operation on the
OFFICES table, the server will also open the SALES_REPS table in the background, to maintain your RI
constraint.

How to Define RI Rules


Builder Tabs:

Rules for Updating: Specifies rules to apply when the key value in the parent table is modified.
Rules for Deleting: Specifies rules to apply when a record in the parent table is deleted.
Rules for Inserting: Specifies rules to apply when a new record is inserted or an existing record is updated in the child table.

Consider an example of a database that has not enforced referential integrity. In this example, there is a
foreign key (artist_id) value in the album table that references a non-existent artist; in other words,
there is a foreign key value with no corresponding primary key value in the referenced table. What
happened here was that there was an artist called "Aerosmith", with an artist_id of 4, which was
deleted from the artist table. However, the album "Eat the Rich" referred to this artist. With
referential integrity enforced, this would not have been possible.
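The situation described above can be reproduced, and the orphaned row located, with a short sketch. It assumes SQLite through Python's sqlite3 module with foreign key checking switched off to mimic a database that does not enforce referential integrity. Apart from artist_id, "Aerosmith" and "Eat the Rich", the column names and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = OFF")  # deliberately no enforcement
conn.execute("CREATE TABLE artist (artist_id INTEGER PRIMARY KEY, artist_name TEXT)")
conn.execute(
    "CREATE TABLE album ("
    " album_id INTEGER PRIMARY KEY,"
    " album_name TEXT,"
    " artist_id INTEGER REFERENCES artist (artist_id))"
)
conn.executemany(
    "INSERT INTO artist VALUES (?, ?)", [(1, "Artist A"), (4, "Aerosmith")]
)
conn.executemany(
    "INSERT INTO album VALUES (?, ?, ?)",
    [(1, "Album A", 1), (2, "Eat the Rich", 4)],
)

# With enforcement off, the artist can be deleted while an album still refers to it.
conn.execute("DELETE FROM artist WHERE artist_id = 4")

# Albums whose artist_id has no matching row in artist are orphans.
orphans = conn.execute(
    "SELECT album.album_name, album.artist_id FROM album"
    " LEFT JOIN artist ON artist.artist_id = album.artist_id"
    " WHERE artist.artist_id IS NULL"
).fetchall()
print(orphans)
```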

Entity Integrity

An entity is any person, place, or thing to be recorded in a database. Each table represents an
entity, and each row of a table represents an instance of that entity. For example, if order is an
entity, the orders table represents the idea of an order and each row in the table represents a
specific order.

To identify each row in a table, the table must have a primary key. The primary key is a unique
value that identifies each row. This requirement is called the entity integrity constraint.

For example, the orders table primary key is order_num. The order_num column holds a unique
system-generated order number for each row in the table. To access a row of data in the orders
table, use the following SELECT statement:

SELECT * FROM orders WHERE order_num = 1001

Using the order number in the WHERE clause of this statement enables you to access a row easily
because the order number uniquely identifies that row. If the table allowed duplicate order
numbers, it would be almost impossible to access one single row because all other columns of this
table allow duplicate values.

Semantic Integrity

Semantic integrity ensures that data entered into a row reflects an allowable value for that row.
The value must be within the domain, or allowable set of values, for that column. For example, the
quantity column of the items table permits only numbers. If a value outside the domain can be
entered into a column, the semantic integrity of the data is violated.

The following constraints enforce semantic integrity:

• Data type

The data type defines the types of values that you can store in a column. For example, the
data type SMALLINT allows you to enter values from -32,767 to 32,767 into a column. DATE
holds a calendar date containing a year, month, and day. TIME holds the time of day, in hours,
minutes, and seconds. A variant, time(p), can be used to specify the number of fractional digits
for seconds.

• Default value

The default value is the value inserted into the column when an explicit value is not
specified. For example, the user_id column of the cust_calls table defaults to the login
name of the user if no name is entered.

CREATE TABLE STUDENT(ID VARCHAR2(5),
NAME VARCHAR2(20) not null,
TOT_CRED NUMERIC(3,0) default 0,
PRIMARY KEY(ID));

• Check constraint

The check constraint specifies conditions on data inserted into a column. Each row inserted
into a table must meet these conditions. For example, the quantity column of the items
table might check for quantities greater than or equal to one.
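The default-value behaviour of the STUDENT table above can be sketched as follows, assuming SQLite through Python's sqlite3 module (VARCHAR2 is Oracle-specific, so VARCHAR is used instead). The sample student rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " id VARCHAR(5),"
    " name VARCHAR(20) NOT NULL,"
    " tot_cred NUMERIC(3,0) DEFAULT 0,"
    " PRIMARY KEY (id))"
)

# No tot_cred is supplied here, so the column takes its default value 0.
conn.execute("INSERT INTO student (id, name) VALUES ('00128', 'Zhang')")
# An explicit value overrides the default.
conn.execute("INSERT INTO student (id, name, tot_cred) VALUES ('12345', 'Shankar', 32)")

rows = conn.execute("SELECT id, name, tot_cred FROM student ORDER BY id").fetchall()
print(rows)
```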

What is Domain Integrity?
Domain integrity is the validity of entries for a given column. You can enforce domain integrity by
restricting the type (through data types), the format (through CHECK constraints and rules), or the
range of possible values (through FOREIGN KEY constraints, CHECK constraints, DEFAULT
definitions, NOT NULL definitions, and rules).
Example

If we have an emp_rating column which is intended to have values ranging from 1 to 10, the
database should not allow any value outside the range 1-10.
User Defined Integrity
A business rule is a statement that defines or constrains some aspect of the business. It is intended to assert
business structure or to control or influence the behaviour of the business. E.g.: Age>=18 && Age<=60

DBMS Keys
A key is an attribute (also known as a column or field) or a combination of attributes that is used to
identify records. Sometimes we might have to retrieve data from more than one table; in those
cases we need to join tables with the help of keys. The purpose of the key is to bind data
together across tables without repeating all of the data in every table.

The various types of key, with examples, are described below. (For the examples, suppose we have
an Employee table with the attributes 'ID', 'Name', 'Address', 'Department_ID' and 'Salary'.)

(I) Super Key – An attribute or a combination of attributes that is used to identify the records
uniquely is known as a Super Key. A table can have many Super Keys.
E.g. of Super Key
1 ID
2 ID, Name
3 ID, Address
4 ID, Department_ID
5 ID, Salary
6 Name, Address
7 Name, Address, Department_ID
and so on, as any combination which can identify the records uniquely will be a Super Key.

(II) Candidate Key – It can be defined as a minimal Super Key or irreducible Super Key. In other
words, it is an attribute or a combination of attributes that identifies the record uniquely, while none
of its proper subsets can identify the records uniquely.
E.g. of Candidate Key
1 ID
2 Name, Address
For the above table we have only two Candidate Keys (i.e. Irreducible Super Keys) used to identify the
records from the table uniquely. The ID key can identify the record uniquely, and similarly the
combination of Name and Address can identify the record uniquely, but neither Name nor Address alone
can be used to identify the records uniquely, as it might be possible that we have two employees
with the same name or two employees from the same house.

(III) Primary Key – A Candidate Key that is used by the database designer for unique identification
of each row in a table is known as the Primary Key. A Primary Key can consist of one or more
attributes of a table.
E.g. of Primary Key - The database designer can use one of the Candidate Keys as the Primary Key. In this
case we have "ID" and "Name, Address" as Candidate Keys; we will consider "ID" as the
Primary Key, as the other key is a combination of more than one attribute.
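The super key and candidate key definitions above can be checked mechanically against sample data. The sketch below is only an illustration: the four employee rows are invented, and a real key is a property of the schema (of every possible table content), not of one sample, so a test like this can refute a proposed key but cannot prove one.

```python
from itertools import combinations

ATTRIBUTES = ("ID", "Name", "Address", "Department_ID", "Salary")

# Hypothetical Employee rows, chosen so that only ID and (Name, Address) are minimal keys.
ROWS = [
    (1, "Amit", "12 Park St", 10, 30000),
    (2, "Amit", "7 Lake Rd", 10, 30000),
    (3, "Bina", "12 Park St", 20, 45000),
    (4, "Chen", "7 Lake Rd", 10, 30000),
]


def is_unique(columns):
    """True if no two rows agree on all of the given columns."""
    positions = [ATTRIBUTES.index(column) for column in columns]
    projected = [tuple(row[p] for p in positions) for row in ROWS]
    return len(set(projected)) == len(projected)


def super_keys():
    """Every attribute combination that identifies the sample rows uniquely."""
    return [
        combo
        for size in range(1, len(ATTRIBUTES) + 1)
        for combo in combinations(ATTRIBUTES, size)
        if is_unique(combo)
    ]


def candidate_keys():
    """Super keys with no proper subset that is itself a super key."""
    keys = super_keys()
    return [
        key
        for key in keys
        if not any(set(other) < set(key) for other in keys)
    ]


print(len(super_keys()), candidate_keys())
```

For this sample, every combination containing ID, plus every combination containing both Name and Address, is a super key, while only ID and (Name, Address) survive the minimality test.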

(IV) Foreign Key – A foreign key is an attribute or combination of attributes in one base table that
points to a candidate key (generally the primary key) of another table. The purpose of the
foreign key is to ensure referential integrity of the data, i.e. only values that are supposed to
appear in the database are permitted.
E.g. of Foreign Key – Suppose we have another table, the Department table, with attributes
"Department_ID", "Department_Name", "Manager_ID" and "Location_ID", with Department_ID as the
Primary Key. Now the Department_ID attribute of the Employee table (the dependent or child table) can
be defined as a Foreign Key, as it can reference the Department_ID attribute of the
Department table (the referenced or parent table). A Foreign Key value must match an existing
value in the parent table or be NULL.

(V) Composite Key – If we use multiple attributes to create a Primary Key then that Primary Key is
called Composite Key (also called a Compound Key or Concatenated Key).
E.g. of Composite Key, if we have used “Name, Address” as a Primary Key then it will be our
Composite Key.

(VI) Alternate Key – Alternate Key can be any of the Candidate Keys except for the Primary Key.
E.g. of Alternate Key is “Name, Address” as it is the only other Candidate Key which is not a
Primary Key.

(VII) Secondary Key – The attributes that are not even a Super Key but can still be used for
identification of records (not unique) are known as Secondary Keys.
E.g. of Secondary Key can be Name, Address, Salary, Department_ID etc., as they can identify the
records but they might not be unique.

Transaction
A transaction consists of a sequence of query and/or update statements. The SQL standard
specifies that a transaction begins implicitly when an SQL statement is executed. One of the
following SQL statements must end the transaction:
i) Commit work: Commit work commits the current transaction; that is, it makes the
updates performed by the transaction permanent in the database. After
the transaction is committed, a new transaction is automatically started.
ii) Rollback work: Rollback work causes the current transaction to be rolled back; that
is, it undoes all the updates performed by the SQL statements in the transaction.
Thus, the database state is restored to what it was before the first statement of
the transaction was executed.
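Commit and rollback can be sketched with Python's sqlite3 module, which (in its default mode) opens a transaction implicitly before an UPDATE and ends it when commit() or rollback() is called on the connection. The account table, its balances, and the helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.execute("INSERT INTO account VALUES (2, 50)")
conn.commit()  # make the starting balances permanent


def balances():
    """Return the current balances as {account id: balance}."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall())


def transfer(amount):
    """Move an amount from account 1 to account 2 inside the open transaction."""
    conn.execute("UPDATE account SET balance = balance - ? WHERE id = 1", (amount,))
    conn.execute("UPDATE account SET balance = balance + ? WHERE id = 2", (amount,))


# Rollback: both updates of the transaction are undone.
transfer(30)
conn.rollback()
after_rollback = balances()

# Commit: both updates become permanent.
transfer(30)
conn.commit()
after_commit = balances()

print(after_rollback, after_commit)
```

After the rollback the balances are back to their starting values; after the commit the transfer is visible and can no longer be rolled back.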
