
Database Management Systems Unit 13

Sikkim Manipal University Page No.: 214


Unit 13 Distributed Database
Structure
13.1 Introduction to Distributed DBMS Concepts
Objectives
Self Assessment Question(s) (SAQs)
13.2 Client-Server Model
Self Assessment Question(s) (SAQs)
13.3 Data Fragmentation, Replication, and Allocation Techniques for
Distributed Database Design
Self Assessment Question(s) (SAQs)
13.4 Summary
13.5 Terminal Questions (TQs)
13.6 Multiple Choice Questions (MCQs)
13.7 Answers to SAQs, TQs, and MCQs
13.7.1 Answers to Self Assessment Questions (SAQs)
13.7.2 Answers to Terminal Questions (TQs)
13.7.3 Answers to Multiple Choice Questions (MCQs)
13.1 Introduction to Distributed DBMS Concepts
In a centralized database system, all system components, such as data,
DBMS software, and storage devices, reside at a single computer or site,
whereas in a distributed database system the data is spread over multiple
computers connected by a network.
A distributed database (DDB) is thus a set of databases stored on multiple
computers that appears to a user as a single database. Data on several
computers (from local and remote databases) can be accessed and modified
simultaneously over a network. Each database server in the DDB is
controlled by its local DBMS, and the servers cooperate to maintain the
consistency of the global database.
As a general goal, distributed computing systems divide a big,
unmanageable problem into smaller pieces and solve it efficiently in a
coordinated manner.
Fig. 13.1: Data distribution and replication among distributed databases
Objectives
To know about
o Client-Server Model
o Data fragmentation
o Replication
o Allocation Techniques for Distributed Database Design
Advantages of Distributed Databases
1. Increased reliability and availability: Reliability is broadly defined as the
probability that a system is running at a certain time point, whereas
availability is the probability that the system is continuously available
during a time interval. When the data and DBMS software are distributed over
several sites, one site may fail while other sites continue to operate.
Only the data and software that exist at the failed site cannot be
accessed. In a centralized system, failure at a single site makes the
whole system unavailable to all users.
2. Improved performance: A large database is divided into smaller
databases, and data is kept at the site where it is needed most. This
data localization reduces contention for CPU and I/O services and also
reduces the access delays involved in wide area networks. Because each
site holds a smaller database, local queries and transactions that
access data at a single site perform better. In addition, a single large
transaction can be divided into a number of smaller transactions that
execute in parallel at different sites.
3. Data sharing: Users at remote sites can access data through the
distributed database management system (DDBMS) software.
4. Transparency: Ideally, a distributed database should be distribution
transparent, that is, it should hide the details of where each file is
physically stored within the system. It also provides network
transparency: the command used to perform a task is independent of the
location of the data and of the location of the system where the command
was issued.
5. Easier expansion: In a distributed environment, expansion of the
system in terms of adding more data, increasing database size, or
adding more processors is much easier.
Additional Functions of Distributed Databases:
A DDBMS performs the following functions in addition to those of a
centralized DBMS.
1. Distributed query processing: Distributed query processing means the
ability to access remote sites and transmit queries and data among the
various sites via the communication network.
2. Data tracing: The DDBMS should keep track of data distribution,
fragmentation, and replication by maintaining the DDBMS catalog.
3. Distributed transaction management: The DDBMS handles transactions
that access data from more than one site; it synchronizes access to
distributed data and maintains the integrity of the overall database.
4. Distributed database recovery: The ability to recover from individual site
crashes and from new types of failures, such as the failure of a
communication link.
5. Security: Distributed transactions must be executed with proper
management of the security of the data and of the authorization/access
privileges of the users.
6. Distributed directory (catalog) management: A directory contains
information (metadata) about data in the database. The directory may
be global for the entire DDB, or local for each site. The placement and
distribution of the directory are design and policy issues.
These functions increase the complexity of a DDBMS over a centralized
DBMS.
Self Assessment Question(s) (SAQs) (For Section 13.1)
1. Define distributed database system
2. What are the advantages of Distributed database systems?
13.2 Client-Server Model
The client-server model is basic to distributed systems. It allows clients to
make requests that are routed to the appropriate server in the form of
transactions. The client-server model consists of three parts.
1. Client: The client is the machine (workstation or PC) running the
front-end applications. It interacts with a user through the keyboard,
display, and mouse. The client has no direct data access
responsibilities; it provides front-end application software for
accessing the data on the server. The client initiates transactions, and
the server processes them.
The interaction between client and server during the processing of an SQL
query might proceed as follows.
1. The client parses a user query and decomposes it into a number of
independent site queries. Each site query is sent to the appropriate
server site.
2. Each server processes the local query and sends the resulting relation
to the client site.
3. The client site combines the results of the queries to produce the result
of the originally submitted query.
So the server is called the database processor or back-end machine, whereas
the client is called the application processor or front-end machine.
Another function controlled by the client is that of ensuring consistency of
replicated copies of a data item by using distributed concurrency control
techniques. The client must also ensure the atomicity of global transactions
by performing global recovery when certain sites fail. It provides distribution
transparency, that is, the client hides the details of data distribution from
the user.
2. Server: The server is the machine that runs the DBMS software. It is
referred to as the back end. The server processes SQL and other query
statements received from client applications. It can have large disk
capacity and fast processors.
3. Network: The network enables remote data access through
client-to-server and server-to-server communication.
Each computer in a network is a node and acts as a client, a server, or both,
depending on the situation.
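The three-step interaction between client and servers described earlier in this section (decompose the query, process it locally at each server, combine the results at the client) can be sketched in a few lines of Python. This is only an illustrative sketch: the site names, the employee table, and the sample rows are assumptions, and in-memory SQLite databases stand in for real server sites reached over a network.

```python
import sqlite3

def make_site(rows):
    """Create an in-memory database standing in for one server site."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee (ssn TEXT, name TEXT, dno INTEGER, salary REAL)"
    )
    conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", rows)
    return conn

# Hypothetical sites, each holding the employees of one department.
sites = {
    "site_a": make_site([("111", "Asha", 10, 500.0), ("112", "Ravi", 10, 650.0)]),
    "site_b": make_site([("221", "Meena", 20, 700.0)]),
    "site_c": make_site([("331", "Kiran", 30, 400.0), ("332", "Tara", 30, 900.0)]),
}

def run_global_query(min_salary):
    # Step 1: the client decomposes the global query into one site query
    # per server (here the same SELECT is sent to every site).
    site_query = "SELECT ssn, name, dno, salary FROM employee WHERE salary >= ?"
    partial_results = []
    for conn in sites.values():
        # Step 2: each server processes its local query and returns a relation.
        partial_results.append(conn.execute(site_query, (min_salary,)).fetchall())
    # Step 3: the client combines the partial results (a union, sorted by SSN).
    return sorted(row for rows in partial_results for row in rows)

result = run_global_query(600.0)
print(result)
```

The user of `run_global_query` never names a site, which is the distribution transparency the client is expected to provide.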
Advantages:
Client applications are not dependent on the physical location of the
data. If the data is moved or distributed to other database servers, the
application continues to function with little or no modification.
Servers provide multitasking and shared-memory facilities; as a result,
they can deliver a high degree of concurrency and data integrity.
In a networked environment, shared data is stored on the servers rather
than on all computers in the system. This makes it easier and more
efficient to manage concurrent access. Inexpensive, low-end client
workstations can access the remote data on the server effectively.
Self Assessment Question(s) (SAQs) (For Section 13.2)
1. Explain the concept of Client server model.
13.3 Data fragmentation, Replication, and Allocation Techniques
for Distributed Database Design
Data fragmentation: Fragmentation techniques break up the database into
logical units, called fragments, that may be assigned for storage at the
various sites. In a DDBMS, decisions must be made regarding which site
should store which portions of the database. There are three types of
fragmentation:
1. Horizontal fragmentation: A horizontal fragmentation divides a relation
"horizontally" by grouping rows to create subsets of tuples, where each
subset has a certain logical meaning. These fragments can then be
assigned to different sites in the distributed system. For example, we
may divide the Employee relation into three horizontal fragments with the
conditions (DNO=10), (DNO=20), and (DNO=30); each fragment contains the
Employee tuples for the employees working in a particular department.
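2. Vertical fragmentation: A vertical fragment is a collection of only
certain attributes of the relation; it divides a relation "vertically" by
columns. For example, we may want to fragment the Employee relation
into two vertical fragments. The first fragment includes personal
information (Name, Bdate, Address) and the second includes work-related
information (SSN, Salary, Mgr_no, etc.). In practice, each vertical
fragment should also carry the primary key (here SSN) so that the
original relation can be reconstructed by joining the fragments.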
3. Mixed fragmentation: Mixing of horizontal and vertical fragmentation is
called mixed fragmentation.
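The three kinds of fragmentation can be illustrated with a small Python sketch. The sample tuples and the helper names `horizontal_fragment` and `vertical_fragment` are hypothetical, chosen only for illustration. One assumption differs slightly from the example in the text: SSN is kept in both vertical fragments so that the fragments can be joined back together.

```python
# Hypothetical EMPLOYEE relation; attribute names follow the text.
EMPLOYEE = [
    {"ssn": "111", "name": "Asha", "bdate": "1990-01-05", "address": "Gangtok",
     "salary": 500.0, "mgr_no": "900", "dno": 10},
    {"ssn": "112", "name": "Ravi", "bdate": "1988-07-19", "address": "Manipal",
     "salary": 650.0, "mgr_no": "900", "dno": 10},
    {"ssn": "221", "name": "Meena", "bdate": "1992-03-11", "address": "Gangtok",
     "salary": 700.0, "mgr_no": "901", "dno": 20},
    {"ssn": "331", "name": "Kiran", "bdate": "1985-11-30", "address": "Manipal",
     "salary": 400.0, "mgr_no": "902", "dno": 30},
]

def horizontal_fragment(relation, condition):
    """Keep the tuples (rows) that satisfy the condition."""
    return [row for row in relation if condition(row)]

def vertical_fragment(relation, attributes):
    """Keep only the listed attributes (columns) of every tuple."""
    return [{a: row[a] for a in attributes} for row in relation]

# Horizontal fragmentation: one fragment per department number.
by_dept = {
    dno: horizontal_fragment(EMPLOYEE, lambda row, d=dno: row["dno"] == d)
    for dno in (10, 20, 30)
}

# Vertical fragmentation: personal versus work-related attributes.
# Assumption: the key (ssn) is repeated in both fragments.
personal = vertical_fragment(EMPLOYEE, ["ssn", "name", "bdate", "address"])
work = vertical_fragment(EMPLOYEE, ["ssn", "salary", "mgr_no", "dno"])

# Mixed fragmentation: a vertical fragment of a horizontal fragment.
dept10_personal = vertical_fragment(by_dept[10], ["ssn", "name", "bdate", "address"])

# Reconstruction: union for horizontal fragments, join on ssn for vertical ones.
union = [row for dno in (10, 20, 30) for row in by_dept[dno]]
work_by_ssn = {row["ssn"]: row for row in work}
joined = [{**p, **work_by_ssn[p["ssn"]]} for p in personal]
```

Horizontal fragments are recombined with a union and vertical fragments with a join on the shared key, which is why a vertical fragment without the key could not be matched back to its tuples.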
Data Replication and Allocation: Replication is useful in improving the
availability of data. The most extreme case, replication of the whole
database at every site in the distributed system, is called a fully
replicated distributed database. This can improve availability because the
system can continue to operate as long as at least one site is up. It also
improves the performance of retrieval for global queries, because the
result of such a query can be obtained locally from any one site. The
disadvantage is that it can slow down update operations, since an update
must be performed on every copy of the database to keep the copies
consistent. Full replication also makes the concurrency control and
recovery techniques more expensive.
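The trade-off described above (a read needs only one available copy, while an update must reach every copy) can be sketched as follows. The class `ReplicatedItem` and the site names are hypothetical, and the sketch deliberately does not model updates during a site failure, concurrency control, or recovery.

```python
class ReplicatedItem:
    """A data item with one copy at every site (read-one, write-all)."""

    def __init__(self, site_names, value):
        self.copies = {site: value for site in site_names}
        self.up = {site: True for site in site_names}

    def read(self):
        # A read succeeds as long as at least one site is up.
        for site, is_up in self.up.items():
            if is_up:
                return self.copies[site]
        raise RuntimeError("no site is available")

    def write(self, value):
        # An update is applied to every copy to keep the copies consistent.
        # The return value is the number of copies written (the update cost).
        for site in self.copies:
            self.copies[site] = value
        return len(self.copies)


item = ReplicatedItem(["site_a", "site_b", "site_c"], 100)
writes = item.write(120)      # three copies are updated
item.up["site_a"] = False     # two of the three sites fail
item.up["site_b"] = False
value = item.read()           # still answered by site_c
```

With three sites, every update costs three writes, but a read can still be answered after two site failures.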
The other extreme from full replication is no replication; that is, each
fragment is stored at only one site. Between the two lies partial
replication, in which some fragments of the database are replicated and
others are not. For example, some people carry partially replicated
databases with them on laptops.
Allocation: Each copy of a fragment must be assigned to a particular site in
the distributed system. This process is called data distribution or allocation.
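An allocation can be written down as a simple catalog that maps each fragment to the sites holding a copy of it. The sketch below is only an illustration: the fragment names, the site names, and the function `replication_scheme` are assumptions, not part of any real DDBMS catalog.

```python
ALL_SITES = ["site_a", "site_b", "site_c"]

# Hypothetical allocation: fragment name -> sites that store a copy.
allocation = {
    "EMP_DNO10": ["site_a"],
    "EMP_DNO20": ["site_b", "site_c"],
    "EMP_DNO30": ["site_c"],
}

def sites_for(fragment):
    """Look up which sites hold a copy of the given fragment."""
    return allocation[fragment]

def replication_scheme(alloc, all_sites):
    """Classify an allocation as 'full', 'none', or 'partial' replication."""
    site_sets = [set(sites) for sites in alloc.values()]
    if all(s == set(all_sites) for s in site_sets):
        return "full"      # every fragment is stored at every site
    if all(len(s) == 1 for s in site_sets):
        return "none"      # every fragment is stored at exactly one site
    return "partial"       # some fragments are replicated, others are not

scheme = replication_scheme(allocation, ALL_SITES)
```

Here `EMP_DNO20` is stored at two sites while the other fragments have a single copy, so the allocation counts as partial replication.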
Types of Distributed DB Systems:
In a DDB, the software is distributed over multiple sites connected by a
network. DDBMSs can be categorized by several factors.
The first factor is the degree of homogeneity of the DDBMS software. If all
servers (or individual local DBMSs) use identical software and all users use
identical software, the DDBMS is called homogeneous; otherwise, it is
called heterogeneous. A second factor is the degree of local autonomy. At
the high-autonomy extreme is the federated DDBMS (FDBS) or
multidatabase system. In such a system each server has an independent
DBMS with its own local users, local programmers, and DBA. In a
heterogeneous FDBS, one server may be a relational DBMS, another a
network DBMS, and a third a hierarchical DBMS. In such a case, it is
necessary to have a canonical system language and language translators
to translate the canonical language into the language of each server.
Self Assessment Question(s) (SAQs) (For Section 13.3)
1. What do you mean by data fragmentation? Explain different types.
2. Explain the concept of data replication and allocation.
13.4 Summary
In this unit we have learnt concepts such as
o Client-Server Model
o Data fragmentation
o Replication
o Allocation Techniques for Distributed Database Design
13.5 Terminal Questions (TQs)
1. Discuss briefly the advantages of distributed databases.
2. Discuss Data fragmentation, Replication, and Allocation Techniques for
Distributed Database Design.
13.6 Multiple Choice Questions (MCQs)
1. In ________, all system components such as data, DBMS
software, and storage devices reside at a single computer or site.
a) a centralized database system
b) a distributed database system
c) client and server architecture
d) None of the above
2. In ________, data is spread over multiple computers connected by a
network.
a) a centralized database system
b) a distributed database system
c) client and server architecture
d) None of the above
3. ________ is the machine (workstation or PC) running the front-end
applications.
a) Server
b) Client
c) Client and server
d) None of the above
4. ________ enables remote data access through client-to-server and
server-to-server communication.
a) The network
b) client
c) Server
d) None of the above
13.7 Answers to SAQs, TQs, and MCQs
13.7.1 Answers to Self Assessment Questions (SAQs)
For Section 13.1
1. In a distributed database system, data is spread over multiple
computers connected by a network. (Refer section 13.1)
2. Increased reliability and availability, Improved performance, Data
sharing, Transparency, Easier expansion (Refer section 13.1)
For Section 13.2
1. The client-server model is basic to distributed systems. It allows clients
to make requests that are routed to the appropriate server in the form of
transactions. (Refer section 13.2)
For Section 13.3
1. Data fragmentation: Techniques that are used to break up the database
into logical units, called fragments, that may be assigned for storage at
the various sites. (Refer section 13.3)
2. Data Replication and Allocation: Replication is useful in improving the
availability of data. Each copy of a fragment must be assigned to a
particular site in the distributed system. This process is called data
distribution or allocation. (Refer section 13.3)
13.7.2 Answers to Terminal Questions (TQs)
1. Increased reliability and availability: Reliability is broadly defined as the
probability that a system is running at a certain time point, whereas
availability is the probability that the system is continuously available
during a time interval. (Refer section 13.1)
2. Data fragmentation: Techniques that are used to break up the database
into logical units, called fragments, that may be assigned for storage at
the various sites. (Refer section 13.3)
13.7.3 Answers to Multiple Choice Questions (MCQs)
1. A
2. B
3. B
4. A
