
AN ARIES ALGORITHM
FOR OPTIMAL DATA RECOVERY IN DATABASE SERVER

NAJIHAH BT MOHD YAJID

BACHELOR OF COMPUTER SCIENCE (NETWORK SECURITY)
UNIVERSITI SULTAN ZAINAL ABIDIN

2017
AN ARIES ALGORITHM

FOR OPTIMAL DATA RECOVERY IN DATABASE SERVER

NAJIHAH BT MOHD YAJID

Bachelor of Computer Science (Network Security)

Faculty of Informatics and Computing

Universiti Sultan Zainal Abidin, Terengganu, Malaysia

MAY 2017
DECLARATION

I hereby declare that this report is based on my original work except for quotations

and citations, which have been duly acknowledged. I also declare that it has not been

previously or concurrently submitted for any other degree at Universiti Sultan Zainal

Abidin or other institutions.

Name : Najihah Bt Mohd Yajid

Date : ..................................................

CONFIRMATION

This is to confirm that the research conducted and the writing of this report were carried out under my supervision.

Name : Dr. Zarina Bt Mohamad

Date : ..................................................

DEDICATION

First and foremost, I am grateful to Allah, The Most Almighty, for enabling me to complete this proposal. I would like to express my sincere gratitude to my supervisor, Dr. Zarina Bt Mohamad, for her continuous support of my research project, and for her patience, motivation, and immense knowledge. Her guidance helped me throughout the writing of this thesis.

My sincere thanks also go to my fellow friends Siti Mazidah Bt Mohamad and Muhammad Shahrul Nizam B Zainol Rashid, who always gave me support and worked together with me throughout the writing of this proposal.

I would like to express my special appreciation to my family for supporting me financially and spiritually throughout the writing of this proposal and my life in general.

To all my friends who helped me throughout this research, and to one and all who directly or indirectly lent a hand in this research.

Thank you.

ABSTRACT

Data recovery is the process of restoring data that has been lost, accidentally deleted, corrupted or made inaccessible. Such loss can happen in a database server: if the server fails or crashes in the middle of a transaction, the system is expected to follow some sort of algorithm or technique to recover the lost data. The major problem is data loss. Preventing it is a complex task, and the data often cannot be retrieved 100% on the server; recovery may also take a long time. Multiple servers that store the same data may end up holding different amounts of data because of a server failure. In other words, when the failed server becomes active again as usual, the data that was stored in the other database servers is not kept on that server. The importance of data recovery is that it can recover data in a database server in the case of host failure, and it eases the handling of any information or file that gets lost, so that no information is lost in any event. In other words, it can also serve as an additional storage device. As a solution, ARIES (Algorithms for Recovery and Isolation Exploiting Semantics) can be used to ensure database consistency.

ABSTRAK

Data recovery is the process of restoring data that has been lost, accidentally deleted, corrupted or made inaccessible. It can occur in a database server. If a failure occurs during a transaction, some kind of technique or algorithm must be followed to recover the lost data. The main problem is data loss. Protecting the data is a complex task, and the data will not always be 100% returned to the server; it may also take a long time to recover the lost data. There are multiple servers available to store the same data, but those servers may hold different amounts of data because of a server failure. In other words, when the failed server becomes active again as usual, the data that was stored in the other database servers is not stored on that server. The importance of data recovery is that it can help retrieve lost data in a database server. It also provides information about lost files, so the problem of losing that information will not occur. In other words, it serves as an additional storage device. As a solution, ARIES (Algorithms for Recovery and Isolation Exploiting Semantics) can be used to ensure consistency in the database.

CONTENTS

PAGE

DECLARATION i
CONFIRMATION ii
DEDICATION iii
ABSTRACT iv
ABSTRAK v
CONTENTS vi
LIST OF TABLES ix
LIST OF FIGURES x
LIST OF ABBREVIATIONS xi
LIST OF APPENDICES xii

CHAPTER I INTRODUCTION
1.1 Background 1
1.2 Problem statement 4
1.3 Objectives 4
1.4 Project scope 5
1.5 Limitations of work 5
1.6 Expected results 5

CHAPTER II LITERATURE REVIEW
2.1 Overview 6
2.2 Database server in client server model 7
2.3 Fault tolerance in transaction 8
2.4 Data and transaction consistency in database system 8
2.5 Reliability and availability of database systems 9
2.6 Algorithms for optimization 10
2.6.1 Genetic Algorithm 10
2.6.2 Artificial bee colony algorithm 11
2.6.3 Ant Colony Algorithm 11
2.7 Main techniques in data recovery 13
2.7.1 Write-ahead logging (WAL) 13
2.7.2 Shadow paging 14
2.8 Related works 14
2.8.1 A Conceptual Framework for Disaster Recovery and Business Continuity of Database Services in Multi-Cloud 14
2.8.2 Instant recovery with write-ahead logging 15
2.8.3 A novel recovery mechanism enabling fine-granularity locking and fast, REDO-only recovery 16
2.8.4 Dynamic Damage Recovery for Web Databases 17
2.8.5 Data Recovery for Web Applications 17
2.8.6 Fine Grained Transaction Log for Data Recovery in Database Systems 18
2.8.7 Dynamic Data Recovery for Database Systems 19

CHAPTER III METHODOLOGY
3.1 Methodology 20
3.2 Framework of a project 21
3.3 Flowchart of a project 23
3.4 Algorithms for Recovery and Isolation Exploiting Semantics (ARIES) approach 25
3.4.1 Principles of ARIES 25
3.4.2 Steps in ARIES 26
3.4.3 Data structures used in ARIES recovery algorithm 26
3.5 Implemented ARIES algorithm in project 29
3.6 Software and hardware requirements 33

REFERENCES 34
APPENDIX 1 Gantt Chart 37
LIST OF TABLES

TABLE TITLE PAGE

3.1 Design of the ARIES algorithm 29

LIST OF FIGURES

FIGURE TITLE PAGE

2.1 Database server in client server model 7

3.1 Framework of data recovery in web servers 21

3.2 Flowchart of data recovery process 23

3.3 Fields contained in a log record 26

3.4 Redo-only processing in the log record 27

LIST OF ABBREVIATIONS / TERMS / SYMBOLS

ARIES Algorithms for Recovery and Isolation Exploiting Semantics

DBMS Database Management System

SQL Structured Query Language

ACID Atomicity, Consistency, Isolation and Durability

ABC Artificial Bee Colony

PITR Point In Time Recovery

WAL Write-ahead logging

WAFL Write Anywhere File Layout

DR Disaster Recovery

RTO Recovery Time Objective

RPO Recovery Point Objective

LSN Log Sequence Number

CLR Compensation Log Record

TT Transaction Table

DPT Dirty Page Table

LIST OF APPENDICES

APPENDIX TITLE PAGE

1 Gantt Chart 37

CHAPTER I

INTRODUCTION

1.1 Background

Data recovery can be defined as a process of retrieving or regaining inaccessible, lost, corrupted, damaged or formatted data from secondary storage, removable media or files, when the data stored in them cannot be accessed in a normal way [1]. Data recovery is used to recall or recover data from any storage that faces a data-loss disaster. In the case of a database system, data recovery is used when a transaction failure occurs.

Data recovery is an important factor in any disaster recovery plan. One of the main stumbling blocks for a disaster recovery operation is how to get a copy of the latest data onto the target system. The most common data recovery scenarios involve an operating system failure, a server going down, malfunction of a storage device, logical failure of storage devices, or accidental damage or deletion. Recovery should protect the data and the associated users from unnecessary problems and avoid or reduce the possibility of having to duplicate work manually.

Even if a failure occurs, a process can still proceed as usual based on the fault-tolerance concept. Generally, fault tolerance is the way in which an operating system responds to a hardware or software failure. The term essentially refers to a system's ability to allow for failures or malfunctions, and this ability may be provided by software, hardware or a combination of both. To handle faults gracefully, some computer systems have two or more duplicate systems. Fault tolerance in a database system involves error processing to remove errors from the system's state, which can be carried out with recovery by rolling back to a previous correct state [4].

Moreover, a database transaction made in a web server on behalf of a client is stored in the database server. A database is a collection of information that is organized so that it can easily be accessed, managed, and updated. A database server is the term used to refer to the back-end system of a database application using client-server architecture. The back-end, sometimes called a database server, performs tasks such as data analysis, storage, data manipulation, archiving, and other non-user-specific tasks. The capture and analysis of data is typically performed by database management systems, otherwise known as DBMSs. These types of database software systems are programmed in SQL; examples include MySQL on Linux.

Optimization in data recovery refers to the consistency and reliability of the database system. It defines how much of the recovered data is accurate with respect to the latest update from the transaction process. Database consistency is a set of guidelines for ensuring the accuracy of database transactions: it states that only valid data will be written to the database. If a transaction that violates the database's consistency rules is executed, the entire transaction is rolled back and the database is restored to its original state. On the other hand, if a transaction executes successfully, it takes the database from one state that is consistent with the rules to another state that is also consistent with the rules. Database consistency does not mean that the transaction is correct, only that the transaction did not break the rules defined by the program. Database consistency is important because it regulates the data that is coming in and rejects the data that does not fit the rules. A reliable database is one that can continue to process user requests even when the underlying system is unreliable because of failures. Reliability is closely related to the problem of how to maintain the atomicity and durability properties of transactions. It is very important in transaction processing, where high performance with a rapid response time is critical. Transaction processing systems are usually measured by the number of transactions they can process in a given period of time. The system must be able to handle hardware or software problems without corrupting data. Multiple users must be prevented from attempting to change the same piece of data at the same time; for example, two operators cannot sell the same seat on an airplane. Therefore, keeping data up to date plays an important role in the system.

Recovery algorithms are techniques to ensure database consistency and transaction atomicity and durability despite failures. There are two parts to a recovery algorithm. The first is the actions taken during normal transaction processing to ensure that enough information exists to recover from failures. The second is the actions taken after a failure to recover the database contents to a state that ensures atomicity, consistency and durability. Algorithms for Recovery and Isolation Exploiting Semantics (ARIES) can be used to obtain an optimal solution for data recovery. ARIES is designed to work with a no-force, steal database approach, and it is widely used as a framework for recovery management with many possible generalizations [3].

1.2 Problem statement

Data recovery is a method to get back lost data while ensuring the atomicity, consistency, isolation and durability (ACID) of a database transaction. Recovery is a process of restoring data that has been lost, accidentally deleted, corrupted or made inaccessible. This process can prevent a system crash from interrupting users' transactions for a long time. However, data updated during a transaction is not automatically written to the server at each synchronization point. Most existing recovery methods focus on fast recovery without considering whether the restored data is exactly the same as the latest data updated during the transaction. This problem leads data to become inconsistent and makes the system unreliable to its users. Therefore, a new approach needs to be proposed to handle efficient data recovery in a database system.

1.3 Objectives

The goal of this project is to apply Algorithms for Recovery and Isolation Exploiting Semantics (ARIES) for data recovery in a database server. This project focuses on the following objectives:

1. To propose an optimization framework for recovering data lost due to system failure.

2. To implement the techniques that can be used to recover the data in a database server.

3. To test whether the applied algorithm can achieve an optimal solution.

1.4 Project scope

This project works in a virtual environment, namely virtual machines. The main scope is the use of virtual machines for data recovery on a virtual server. Five servers will be developed in this project: three of them are web servers, and the rest are a database server and a backup database server respectively. The project also focuses on technical aspects such as the performance, reliability and consistency of the data in the database server. The virtual machine software used is VirtualBox.

1.5 Limitations of work

There are a few limitations in this project:

1. The project is simulated on virtual servers as a prototype.

2. Implementation on a physical server would require more cost and time.

3. Whether an optimal solution can be achieved depends on the type of failure.

1.6 Expected results

Based on the objectives, the following optimal results should be achieved:

1. A transaction process can still proceed even if a failure occurs.

2. The recovery process will retrieve accurate data, identical to the latest updated data.

3. The recovered data will maintain consistency and reliability.

CHAPTER II

LITERATURE REVIEW

2.1 Overview

Data recovery means retrieving lost, deleted, unusable or inaccessible data that was lost for various reasons. It is also known as data restoration, which not only restores lost files but also recovers corrupted data. Depending on the reason for the loss, different data recovery methods can be adopted [1]. Data corruption and recovery pose especially serious challenges for a database server. Database restoration is the activity of replacing an existing database, or creating a new database, using a previous version or copy of a backup taken at an older point in time, and using transaction logs and archive logs to apply the transactions that roll the database forward to a valid and consistent state. Loss of data files may lead to great disaster, so data recovery in Oracle has become a popular technology [2]. Inherent data recovery mechanisms in DBMSs focus on transaction-level and system-level recovery. Transaction-level recovery uses the keywords "commit" and "rollback" to guarantee the logical integrity of transactions, while system-level recovery undoes and redoes transactions between the nearest checkpoint and the crash point. For system-level recovery, the DBMS rolls back both malicious and benign transactions, but cannot process user requests during the recovery period [3].

2.2 Database server in client server model

Figure 2.1 : Database server in client server model

Figure 2.1 shows how a database server communicates in the client-server model. In the client-server model, clients are programs, such as web browsers, that need services, while servers are programs that provide services; clients and servers are separate logical objects that communicate over a network to perform tasks together [6]. A client makes a request for a service and receives a reply to that request; at the same time, the server receives and processes a request and sends back the required response. The database server holds the Database Management System (DBMS) and the databases. Upon requests from the client machines, it searches the database for selected records and passes them back over the network. All database functions are controlled by the database server. Any type of computer can be used as a database server. Some users refer to the central DBMS functions as the back-end functions, and to the application programs on the client computer as front-end programs. From this, it can be concluded that the client is the application used to interface with the DBMS, while the database server is the DBMS itself [8].

2.3 Fault tolerance in transaction

According to (J. Guo, 2004), a system is said to be fault-tolerant if it can mask the presence of one or more faults in the system by using redundancy, even though performance may be degraded. That is, fault tolerance allows a system to continue to behave according to its design objectives. Redundancy implies the presence of parts or modules of similar configuration to the one that is functioning, whose purpose is to form an error-checking quorum and possibly take over the functions of the active module when it fails [4]. Fault tolerance is strongly related to dependable systems. Dependability covers a number of favourable requirements for distributed systems, including availability, maintainability, reliability and safety. Availability refers to the property in which a system is ready to be used immediately; in particular, it is defined as the probability that the system is operating correctly at any given moment and is able to perform functions on behalf of its users. Reliability is defined as the property that a system can run continuously without failure. Safety is defined as the situation in which, when a system temporarily fails to operate correctly, nothing catastrophic happens. Finally, maintainability pertains to how easily a failed system can be restored. However, automatic recovery from failures is much harder in practice than in theory [5].

2.4 Data and transaction consistency in database system

Data consistency refers to the usability of data and is often taken for granted in the single-site environment. Data consistency problems may arise even in a single-site environment during recovery situations, when backup copies of the production data are used in place of the original data. In order to ensure that the backup data is usable, it is necessary to understand the backup methodologies that are in place as well as how the primary data is created and accessed. Another very important consideration is the consistency of the data once the recovery has been completed and the application is ready to begin processing [6]. A transaction is a logical unit of work that may include any number of file or database updates. During normal processing, transaction consistency is present only before any transactions have run, following the completion of a successful transaction and before the next transaction begins, and when the application ends normally or the database is closed. Following a failure of some kind, the data will not be transaction-consistent if transactions were in flight at the time of the failure. In most cases, once the application or database is restarted, the incomplete transactions are identified and the updates relating to these transactions are either backed out, or processing resumes with the next dependent write [7].

2.5 Reliability and availability of database systems

Since the overall application needs to provide reliability and availability, the database has to guarantee these properties as well. Non-functional database features such as replication, consistency, conflict management, and partitioning represent subsequent challenges for successfully designing and operating an available and reliable database system [8]. Availability is the degree to which a system is operational and accessible when required for use. In turn, reliability enables a component to perform its required functions under stated conditions for a specific period of time; it is defined as a measure of the continuity of correct service. Thus, availability is a liveness guarantee, while reliability is a safety guarantee. In theory, reliable systems are not necessarily available and vice versa. Yet, in practice, an available but unreliable system, as well as a reliable but unavailable system, are barely useful. Systems that provide both reliability and availability are often said to be fault-tolerant [9].

2.6 Algorithms for optimization

2.6.1 Genetic Algorithm

The genetic algorithm is a method for solving both constrained and

unconstrained optimization problems that is based on natural selection, the process

that drives biological evolution. The genetic algorithm repeatedly modifies a

population of individual solutions. At each step, the genetic algorithm selects

individuals at random from the current population to be parents and uses them to

produce the children for the next generation. Over successive generations, the

population "evolves" toward an optimal solution. The genetic algorithm can be

applied to solve a variety of optimization problems that are not well suited for

standard optimization algorithms, including problems in which the objective function

is discontinuous, non-differentiable, stochastic, or highly nonlinear. The genetic

algorithm can address problems of mixed integer programming, where some

components are restricted to be integer-valued. The genetic algorithm uses three main

types of rules at each step to create the next generation from the current population.

First, 'selection rules' select the individuals, called parents, that contribute to the population at the next generation. Second, 'crossover rules' combine two parents to form children for the next generation. Third, 'mutation rules' apply random changes to individual parents to form children [10].
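To make the three rule types concrete, the following is a minimal Python sketch of a genetic algorithm minimizing a toy objective; the population size, mutation rate, tournament selection, and the objective function itself are illustrative assumptions, not part of this project.

```python
import math
import random

def fitness(x):
    # Toy objective to minimize; any real-valued function would do.
    return x * x + 10 * math.sin(x)

def evolve(pop_size=30, generations=50, mutation_rate=0.2):
    population = [random.uniform(-10, 10) for _ in range(pop_size)]
    for _ in range(generations):
        def select():
            # Selection rule: binary tournament keeps the fitter of two.
            a, b = random.sample(population, 2)
            return a if fitness(a) < fitness(b) else b
        children = []
        for _ in range(pop_size):
            parent1, parent2 = select(), select()
            child = (parent1 + parent2) / 2.0   # crossover rule: blend parents
            if random.random() < mutation_rate:
                child += random.gauss(0, 1)     # mutation rule: random change
            children.append(child)
        population = children                   # population "evolves"
    return min(population, key=fitness)

print("best x found:", evolve())
```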

2.6.2 Artificial bee colony algorithm

The Artificial Bee Colony (ABC) algorithm is a swarm-based meta-heuristic algorithm for optimizing numerical problems. It was inspired by the intelligent foraging behaviour of honey bees. The model consists of three essential components: employed and unemployed foraging bees, and food sources. The first two components, employed and unemployed foraging bees, search for rich food sources, the third component, close to their hive. The model also defines two leading modes of behaviour which are necessary for self-organizing and collective intelligence: recruitment of foragers to rich food sources, resulting in positive feedback, and abandonment of poor sources by foragers, causing negative feedback. In ABC, a colony of artificial forager bees acts as agents searching for rich artificial food sources, which correspond to good solutions for a given problem. To apply ABC, the optimization problem under consideration is first converted into the problem of finding the best parameter vector that minimizes an objective function. The artificial bees then randomly discover a population of initial solution vectors and iteratively improve them by employing two strategies: moving towards better solutions by means of a neighbour-search mechanism, while abandoning poor solutions [11].
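As an illustration, here is a minimal Python sketch of the employed-bee and scout-bee behaviour described above, minimizing a toy sphere function; the onlooker-bee phase is omitted for brevity, and the colony size, abandonment limit and objective are illustrative assumptions.

```python
import random

def objective(x):
    return sum(v * v for v in x)   # toy sphere function to minimize

def abc(dim=2, food_sources=10, limit=5, cycles=100):
    lo, hi = -5.0, 5.0
    sources = [[random.uniform(lo, hi) for _ in range(dim)]
               for _ in range(food_sources)]
    trials = [0] * food_sources    # cycles since each source improved
    for _ in range(cycles):
        for i in range(food_sources):
            # Employed bee: neighbour search around the current source.
            j = random.randrange(dim)
            partner = random.choice(sources)
            candidate = sources[i][:]
            candidate[j] += random.uniform(-1, 1) * (sources[i][j] - partner[j])
            if objective(candidate) < objective(sources[i]):
                sources[i], trials[i] = candidate, 0   # positive feedback
            else:
                trials[i] += 1
            # Scout bee: abandon a source that has stopped improving.
            if trials[i] > limit:
                sources[i] = [random.uniform(lo, hi) for _ in range(dim)]
                trials[i] = 0
    return min(sources, key=objective)

print("best vector found:", abc())
```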

2.6.3 Ant Colony Algorithm

In the natural world, ants of some species wander randomly and, upon finding food, return to their colony while laying down pheromone trails. If other ants find such a path, they are likely not to keep travelling at random, but instead to follow the trail, returning and reinforcing it if they eventually find food. Over time, however, the pheromone trail starts to evaporate, reducing its attractive strength. The more time it takes for an ant to travel down the path and back again, the more time the pheromones have to evaporate. A short path, by comparison, gets marched over more frequently, and thus the pheromone density becomes higher on shorter paths than on longer ones. Pheromone evaporation also has the advantage of avoiding convergence to a locally optimal solution: if there were no evaporation at all, the paths chosen by the first ants would tend to be excessively attractive to the following ones, and the exploration of the solution space would be constrained. The influence of pheromone evaporation in real ant systems is unclear, but it is very important in artificial systems [12]. The overall result is that when one ant finds a good path from the colony to a food source, other ants are more likely to follow that path, and positive feedback eventually leads to all the ants following a single path. The idea of the ant colony algorithm is to mimic this behaviour with "simulated ants" walking around a graph representing the problem to solve. Because the ant colony works as a very dynamic system, the ant colony algorithm works very well on graphs with changing topologies. Examples of such systems include computer networks and artificial-intelligence simulations of workers.
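The following minimal Python sketch, under assumed path lengths and rates, illustrates the two mechanisms described above (pheromone deposit inversely proportional to path length, and evaporation) on a simple two-path choice.

```python
import random

def ant_colony(length_a=2.0, length_b=5.0, ants=20, rounds=30,
               evaporation=0.5, deposit=1.0):
    pheromone = {"A": 1.0, "B": 1.0}
    for _ in range(rounds):
        for _ in range(ants):
            # Each ant picks a path with probability proportional to pheromone.
            total = pheromone["A"] + pheromone["B"]
            path = "A" if random.random() < pheromone["A"] / total else "B"
            length = length_a if path == "A" else length_b
            # Shorter paths receive more pheromone per unit of time.
            pheromone[path] += deposit / length
        # Evaporation keeps long, stale paths from staying attractive.
        for p in pheromone:
            pheromone[p] *= (1 - evaporation)
    return pheromone

# The pheromone level on the short path "A" should dominate.
print(ant_colony())
```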

2.7 Main techniques in data recovery

Recovery techniques in today's conventional data servers are restricted to a few recovery models. The first is a time-based recovery model, also called point-in-time recovery (PITR), which recovers the data up to a specified point in time. The second is a transaction-log-based recovery model, where the database is rolled forward until the transactions from a specific transaction log file, whether archived or unarchived, have been applied. The last is the change-based recovery model, or log sequence recovery model, based on the system change number assigned by the data server [13].

Mourad Benchikh described recovery algorithms as "techniques to ensure transaction atomicity and durability despite failures". The recovery subsystem, using a recovery algorithm, ensures atomicity by undoing the actions of transactions that do not commit, and durability by making sure that all actions of committed transactions survive even if failures occur [14]. There are two general approaches to recovery: the write-ahead logging (WAL) approach and the shadow-page technique. Both WAL and shadow paging provide atomicity and durability in a database system.

2.7.1 Write-ahead logging (WAL)

In a system using WAL, all modifications are written to a log before they are applied. Usually both redo and undo information is stored in the log. The purpose of this can be illustrated by a program that is in the middle of performing some operation when the machine it is running on loses power. Upon restart, that program might well need to know whether the operation it was performing succeeded, half-succeeded, or failed [3]. If a write-ahead log is used, the program can check this log and compare what it was supposed to be doing when it unexpectedly lost power with what was actually done. On the basis of this comparison, the program can decide to undo what it had started, complete what it had started, or keep things as they are. WAL allows updates of a database to be done in place. Another way to implement atomic updates is with shadow paging, which is not in place. The main advantage of doing updates in place is that it reduces the need to modify indexes and block lists [13].
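A minimal sketch of the write-ahead rule follows, assuming a toy key-value "database" held in memory; the record layout and helper names are illustrative and not those of any particular DBMS.

```python
import json

log = []        # stands in for the stable-storage log file
database = {}   # stands in for in-place database storage

def wal_update(key, new_value):
    # Record both undo (before-image) and redo (after-image) information.
    record = {"key": key, "undo": database.get(key), "redo": new_value}
    log.append(json.dumps(record))   # 1. append to the log first (fsync here)
    database[key] = new_value        # 2. only then update the data in place

def recover():
    # After a crash, replay the log: redo information brings the data
    # forward; the undo information would let an aborted transaction be
    # rolled back to its old value.
    for line in log:
        record = json.loads(line)
        database[record["key"]] = record["redo"]

wal_update("seat_42", "booked")
recover()
print(database)   # {'seat_42': 'booked'}
```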

2.7.2 Shadow paging

Shadow paging is a copy-on-write technique for avoiding in-place updates of pages. Instead, when a page is to be modified, a shadow page is allocated. Since the shadow page has no references, it can be modified liberally, without concern for consistency constraints [3]. When the page is ready to become durable, all pages that referred to the original are updated to refer to the new replacement page instead. Because the page is "activated" only when it is ready, the operation is atomic. If the referring pages must also be updated via shadow paging, this procedure may recurse many times, becoming quite costly. One solution, employed by the WAFL file system (Write Anywhere File Layout), is to be lazy about making pages durable; this increases performance significantly by avoiding many writes on hotspots high up in the referential hierarchy, at the cost of high commit latency. Shadow paging is similar to the old-master/new-master batch processing technique used in mainframe database systems. In these systems, the output of each batch run (possibly a day's work) was written to two separate disks or other storage media: one was kept for backup, and the other was used as the starting point for the next day's work [15].
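A minimal Python sketch of the copy-on-write idea above, assuming an in-memory page store; in a real system the new pages and the new page table would be written durably before the root pointer is swapped.

```python
pages = {1: "old contents"}   # page store: page id -> data
current_root = {"A": 1}       # page table: logical name -> page id
next_page_id = 2

def shadow_update(root, name, data):
    # Allocate a shadow page; the original page is never touched.
    global next_page_id
    shadow_root = dict(root)          # copy of the page table
    pages[next_page_id] = data
    shadow_root[name] = next_page_id
    next_page_id += 1
    return shadow_root

new_root = shadow_update(current_root, "A", "new contents")
# ... here the new pages and new_root would be made durable ...
current_root = new_root              # atomic "activation" of the update
print(pages[current_root["A"]])      # -> "new contents"
```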

2.8 Related works

2.8.1 A Conceptual Framework for Disaster Recovery and Business Continuity

of Database Services in Multi- Cloud

(Mohammad M. Al-Shammari et al., 2016) describe how cloud database services have been utilized to reduce the cost of storage in information technology fields; they also provide many other benefits, such as data accessibility through the internet. A single cloud is defined as a set of servers residing in one or multiple data centres offered by a single provider. However, moving from a single cloud to multi-clouds is reasonable and important for many reasons. For instance, the services of single clouds are still subject to outages, which affect the availability of the database. Besides, in the case of disaster, the single cloud is subject to partial or full data loss. The single cloud is predicted to become less popular with customers due to the high risk of database service availability failure and the possibility of malicious insiders in the single cloud. With Disaster Recovery (DR) in the cloud, the resources of multiple cloud service providers can be utilized cooperatively by the DR services provider. Therefore, there is a need to develop a practical multi-cloud-based DR framework that aims at minimizing the backup cost with respect to the Recovery Time Objective (RTO) and Recovery Point Objective (RPO). The framework should maintain the availability of data by achieving high data reliability, low backup cost, and short recovery, and ensure business continuity before, during and after the disaster incident. This paper proposes a multi-cloud framework maintaining high availability of data before, during and after the occurrence of the disaster; it also ensures the continuity of the database services during and after the disaster [16].

2.8.2 Instant recovery with write-ahead logging

(Theo Harder et al., 2015) relate that instant recovery improves system availability by reducing the mean time to repair, i.e., the interval during which a database is not available for queries and updates due to recovery activities. Variants of instant recovery pertain to system failures, media failures, node failures, and combinations of multiple failures. After a system failure, instant restart permits new transactions immediately after log analysis, before and concurrently with "redo" and "undo" recovery actions. After a media failure, instant restore permits new transactions immediately after allocation of a replacement device, before and concurrently with restoring backups and replaying the recovery log. Write-ahead logging is already ubiquitous in data management software. The recent definition of single-page failures, and techniques for log-based single-page recovery, enable immediate, lossless repair after a localized wear-out in novel or traditional storage hardware. In addition, they form the backbone of on-demand "redo" in instant restart, instant restore, and eventually instant failover; thus, they complement on-demand invocation of traditional single-transaction "undo" or rollback. In addition to these instant recovery techniques, the discussion introduces self-repairing indexes and much faster offline restore operations, which impose no slowdown in backup operations and hardly any slowdown in log archiving operations. The new restore techniques also render differential and incremental backups obsolete, complete backup commands on a database server practically instantly, and even permit taking full up-to-date backups without imposing any load on the database server [17].

2.8.3 A novel recovery mechanism enabling fine-granularity locking and fast,

REDO-only recovery

(Caetano Sauer et al., 2014) present a series of novel techniques and algorithms for transaction commit, logging, recovery, and propagation control. In combination, they provide a recovery component that maintains the persistent state of the database (both log and data pages) always in a committed state. Recovery from system and media failures requires only REDO operations, which can happen concurrently with the processing of new transactions. The mechanism supports fine-granularity locking, partial rollbacks, and snapshot isolation for reader transactions. The design does not assume a specific hardware configuration such as non-volatile RAM or flash; it is designed for traditional disk environments. Nevertheless, it can exploit modern I/O devices for higher transaction throughput and reduced recovery time with a high degree of flexibility [18].

2.8.4 Dynamic Damage Recovery for Web Databases

(Hong Zhu et al., 2010) state that there is an urgent need for a self-healing database system that has the ability to automatically locate and undo a set of transactions corrupted by malicious attacks. The metrics of survivability and availability require a database to provide continuous service during the period of recovery, which is referred to as dynamic recovery. The paper shows that an extended read operation on corrupted data can cause damage spreading. The authors build a fine-grained transaction log to record the extended read and write operations while user transactions are processing. Based on that, they propose a dynamic recovery system to implement the damage repair. The system captures damage spreading caused by extended read-write dependencies between transactions. It also retains the execution results of blind-write transactions and gives a solution to the issue of recovery conflicts caused by forward recovery. Moreover, a confinement activity is imposed on the in-repair data to prevent further damage propagation while the data recovery is processing. The performance evaluation in their experiments shows that the system is reliable and highly efficient [19].

2.8.5 Data Recovery for Web Applications

(Akkus, I. E. et al., 2010) describe the design of a generic data recovery system for web applications that store their persistent data in a database tier. The system does not rely on the web application for recovery and is thus resilient to failures and bugs in the applications. The main goals are to allow web application administrators to diagnose application failures that corrupt persistent data, and to enable selective recovery of this data without affecting the rest of the application. The system tracks application-level dependencies during recovery by employing dynamic data flow within requests, rather than relying only on the read-write sets of queries and requests, thereby avoiding application-level inconsistencies and tracking corruption more accurately. A proposed method for recovering from malicious transactions is based on tracking inter-transaction dependencies, which are created by examining the read-write sets of transactions. The attacking transaction and affected transactions are moved to the end of the transaction history to simplify recovery. Their follow-up work proposes a system in which normal operation is allowed while recovery is performed. These methods focus entirely on database-level recovery while ignoring application-level dependencies, which can cause inconsistent recovery at the application level [20].

2.8.6 Fine Grained Transaction Log for Data Recovery in Database Systems

(Ge Fu et al., November 2008) proposed a fine-grained transaction log based on write operations, extended read operations, and the association degree of an SQL statement. Also known as a transaction journal, database log, binary log or audit trail, such a log is the history of actions executed by a database management system, used to guarantee ACID properties over crashes or hardware failures. Physically, a log is a file listing changes to the database, stored in a stable storage format. The log records all the data items of the read-only and update-involved operations (read and write) of the committed transactions, and even extracts data items read by subqueries in the SQL statements. (Ammann, 2002) introduced the read-write dependency method to database systems and proposed an "on-the-fly" recovery model based on transaction dependency. The model logs the transaction history during the execution period of transactions, establishes the dependencies of transactions based on that history, and afterwards undoes all the malicious and affected transactions [21].

2.8.7 Dynamic Data Recovery for Database Systems

(Hong Zhu et al., 2008) point out that the system must provide a fault-tolerance mechanism: when damage to data items occurs, the database system should provide continuous, though possibly degraded, service while the damage is being repaired. This mechanism is known as "dynamic recovery". There are two evaluation criteria for dynamic recovery: exactness and high efficiency. Exactness requires that a database system be able to recover from committed malicious transactions. High efficiency requires that the system spend as little time as possible on damage assessment and repair. The goal of damage recovery is to locate each affected transaction and recover the database from all malicious or affected transactions [22].

CHAPTER III

METHODOLOGY

3.1 Methodology

"Methodology" implies more than simply the methods used to collect data; it often includes a consideration of the concepts and theories which underlie those methods. This chapter describes the plan to tackle the research problem. It also provides a work plan and describes the activities necessary for the completion of the project. The project methodology used in this project is the incremental model methodology. The framework design of the project is described to show the flow based on the chosen approach. The expected result is stated at the end of this chapter as its conclusion.

3.2 Framework of a project

Figure 3.1 : Framework of data recovery in web servers

Figure 3.1 shows the framework of data recovery in web servers. In this project, five servers will be developed: three web servers, one database server and one backup server. The project uses an active-passive concept and focuses on transaction failure. This means that only one database server is running at any one time: if the database server is up, the backup server is down, and vice versa. If a transaction occurs, it connects only to the server that is up. This project will demonstrate the occurrence of fault tolerance and show how fault tolerance can manage a failure. In general, fault tolerance is the property that enables a system to continue operating properly in the event of the failure of some of its components. If its operating quality decreases at all, the decrease is proportional to the severity of the failure, as compared to a naively designed system in which even a small failure can cause total breakdown. A fault-tolerant design enables a system to continue its intended operation, possibly at a reduced level, rather than failing completely, when some part of the system fails [4].

The main part of the project covers the interaction between the database server and the backup server. If no failure occurs during a transaction, the database server runs as usual and any transaction that has been made is stored in the database server first. At the same time, the backup server stores the data that exists in the database server, according to the time that was set for backup. Thus, any transaction arriving at a web server from a client connects only to the database server. Otherwise, if the database server fails at that time, the backup server automatically becomes active and all transactions made in the web servers connect directly to the backup. Any updated data is stored in the backup server. In this case, high availability of the system operation can be achieved despite the failure: another server takes over automatically after the failure occurs, without disrupting the operation of the system. After the database server becomes active again, it requests from the backup server all the latest data updated while it was down, in order to perform a recovery process. During the recovery process, the data retrieved into the database server must match the data in the backup server. So, for optimization of the data recovery process in this project, the data recovered into the database server must be accurate with respect to the latest updated data in the backup server; the number of transactions must be the same in both the database and backup servers. For instance, if the backup server stores fifty transactions, the database server must also recover fifty transactions into its storage to keep the data reliable and consistent. Thus, data lost during a transaction failure caused by a server going down is not a concern, because the database server has a successor from which it can retrieve accurate data in case of failure. The optimization of data recovery can be achieved by implementing the ARIES algorithm to keep reliability and consistency. A minimal sketch of the active-passive routing decision follows.
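As an illustration of this routing decision, the sketch below probes the primary database server and falls back to the backup when the connection fails; the host names, port and probe logic are assumptions for illustration, not the project's actual configuration.

```python
import socket

# Hypothetical host names and the default MySQL port.
PRIMARY = ("db-server", 3306)
BACKUP = ("backup-server", 3306)

def is_reachable(host_port, timeout=1.0):
    # A simple TCP probe; a real deployment would use a proper health check.
    try:
        with socket.create_connection(host_port, timeout=timeout):
            return True
    except OSError:
        return False

def choose_server():
    # Active-passive: only one server is "up" at a time, and every
    # transaction from the web tier follows whichever one that is.
    return PRIMARY if is_reachable(PRIMARY) else BACKUP

print("Routing transactions to:", choose_server())
```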

3.3 Flowchart of a project

[Flowchart: START, then the transaction process runs and the backup server copies data from the database server. If no failure occurs, the transaction proceeds until complete, then END. If a failure occurs, the backup server takes over; if the transaction has committed, it is rolled forward (redo); otherwise the transaction is aborted and rolled back (undo).]
Figure 3.2 : Flowchart of data recovery process

Figure 3.2 shows a flowchart that describes the process of data recovery when a transaction failure occurs. When a transaction process starts and no failure occurs during that transaction, the process proceeds until the transaction is completed, and the updated data in the database server is copied into the backup server in case the database server fails later. But if any failure occurs during a transaction, the backup server automatically takes over the transaction process. After that, a recovery process starts, and the backup server traces the transaction records in the log file to determine whether each transaction has committed or not. A committed transaction is a successful transaction stored in the database server that has already been backed up. If a transaction is committed, a roll-forward of the transaction proceeds, meaning a redo action is performed. Redo log files record changes made to the database as a result of transactions; they protect the database from loss of integrity caused by system failures arising from transaction failure. Besides, redo log files must be multiplexed to ensure that the information stored in them is not lost in the event of a database storage failure. When the backup server has traced a log record that has been committed, it recovers the lost transactions in the database server caused by the server failure. A transaction in the web server can then proceed as usual, like a transaction in the database server. Otherwise, if a transaction has not committed, meaning it did not succeed, the transaction is aborted. A rollback of the transaction then occurs and performs an undo action to return the database to a consistent state. The undo records are used to undo changes that were made to the database by the uncommitted transaction. During database recovery, undo records are used to undo any uncommitted changes applied from the redo log to the data files. Undo records also provide read consistency by maintaining the before-image of the data for users who are accessing the data at the same time that another user is changing it. Once finished, a transaction can proceed until it completes.

3.4 Algorithms for Recovery and Isolation Exploiting Semantics (ARIES)

approach

The ARIES algorithm is a state-of-the-art recovery method. It incorporates numerous optimizations to reduce overheads during normal processing and to speed up recovery. ARIES can also achieve synchronized updates between two or more database servers through its principles, steps and data structures.

3.4.1 Principles of ARIES

Three main principles lie behind ARIES:

(1) Write-ahead logging: Any change to an object is first recorded in the log, and the log must be written to stable storage before changes to the object are written to database storage.

(2) Repeating history during Redo: On restart after a crash, ARIES retraces the actions of the database before the crash and brings the system back to the exact state it was in before the crash. Then it undoes the transactions that were still active at crash time.

(3) Logging changes during Undo: Changes made to the database while undoing transactions are logged to ensure such an action is not repeated in the event of repeated restarts.

3.4.2 Steps in ARIES

There are three main steps in ARIES:

(1) Analysis: This identifies the dirty (updated) pages in the buffer and the set of transactions active at the time of the crash. The appropriate point in the log where the REDO operation should start is also determined.

(2) REDO phase: This actually reapplies updates from the log to the database. Generally, the REDO operation is applied only to committed transactions; in ARIES, however, this is not the case. Certain information in the ARIES log provides the start point for REDO, from which REDO operations are applied until the end of the log is reached. Thus only the necessary REDO operations are applied during recovery.

(3) UNDO phase: The log is scanned backwards, and the operations of transactions that were active at the time of the crash are undone in reverse order. The information needed for ARIES to accomplish its recovery procedure includes the log, the transaction table, and the dirty page table. In addition, checkpointing is also used.

3.4.3 Data structures used in ARIES recovery algorithm

Four data structures are implemented in ARIES:

(1) Log records

Each log record contains the LSN of the previous log record of the same transaction; the LSN in a log record may be implicit. Figure 3.3 shows the fields contained in a log record.

Figure 3.3 : Fields contained in a log record

A special redo-only log record called a compensation log record (CLR) is used to log actions taken during recovery that never need to be undone. It serves the role of an operation-abort log record. It also has a field, UndoNextLSN, to note the next (earlier) record to be undone; the records in between will already have been undone. This field is required to avoid repeated undo of already-undone actions.

Figure 3.4 : Redo-only processing in the log record

(2) Transaction Table (TT)

The transaction table stores, for each transaction, its TransID, State, LastLSN (the LSN of the last record written by the transaction) and UndoNxtLSN (the next record to be processed in rollback). During recovery, the TT is initialized during the analysis pass from the most recent checkpoint; it is then modified during analysis as log records are encountered, and during undo.

(3) Dirty Page Table (DPT)

The DPT is a list of pages in the buffer that have been updated. It contains the PageLSN and the RecLSN. The RecLSN is an LSN such that log records before this LSN have already been applied to the page version on storage. It is set to the current end of the log when a page is inserted into the DPT, just before being updated. It is then recorded in checkpoints, which helps to minimize redo work.

(4) Checkpoint log

A checkpoint log record contains the Dirty Page Table (DPT) and a list of active transactions, with, for each transaction, LastLSN, the LSN of the last log record it wrote. A fixed position on storage notes the LSN of the last completed checkpoint log record. The dirty pages themselves are not written out at checkpoint time; instead, they are flushed out continuously. The checkpoint thus has very low overhead, so it can be done frequently. A minimal sketch of these four structures follows.
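The sketch below expresses the four structures just described as Python dataclasses; the field names follow the text (LSN, PrevLSN, UndoNextLSN, LastLSN, RecLSN), while the exact types are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class LogRecord:
    lsn: int                              # log sequence number
    trans_id: int
    prev_lsn: Optional[int]               # previous record of same transaction
    page_id: int
    undo: object                          # before-image
    redo: object                          # after-image

@dataclass
class CLR(LogRecord):                     # compensation log record (redo-only)
    undo_next_lsn: Optional[int] = None   # next earlier record to undo

@dataclass
class TransactionEntry:                   # one row of the Transaction Table
    state: str                            # e.g. "active", "committed"
    last_lsn: Optional[int] = None        # LSN of last record written

@dataclass
class DirtyPageEntry:                     # one row of the Dirty Page Table
    rec_lsn: int                          # earliest LSN not yet on storage

@dataclass
class Checkpoint:                         # checkpoint log record payload
    dirty_page_table: dict = field(default_factory=dict)
    transaction_table: dict = field(default_factory=dict)
```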

3.5 Implemented ARIES algorithm in project

START
1  Transaction failure occurs
2  Write-ahead logging (WAL)
3  Create a Log Sequence Number (LSN)
4  Dirty Page Table keeps records
5  Transaction Table is running
6  Analyse information from the log file
7  First updated log
8  If (transaction roll-forward)
9      Redo operation can be attempted
10     Repeat history
11     Scan forward from RedoLSN
12     If (page is not in DPT)
           Skip the log record
13     Else if (LSN of the log record is less than the RecLSN in DPT)
           Skip the log record
14     Else (PageLSN of the page is less than the LSN)
           Redo the log record
15 Else if (transaction roll-back)
16     Undo operation can be done to retrieve old data
17     Transactions whose abort completed earlier are not undone
18     Scan backward on the log, undoing
19     If (ordinary log record)
           Set next LSN to be undone for the transaction to the PrevLSN
           noted in the record
20     Else if (compensation log record (CLR))
           Set next LSN to be undone to the UndoNextLSN noted in the
           log record
21     Else (records have been undone already)
           Skip the log record
22 Else (both before- and after-images are logged)
23     Undo-redo operations are logged
24     Start from the end of the log
25     Proceed backward
26     Roll forward the records
27     End of log
28     Records have been retrieved
   End if
END

Table 3.1 : Design of the ARIES algorithm

1. A failure occurs during a transaction; one of the database servers is down.

2. ARIES keeps track of the changes made to the database by using a log. It implements the WAL protocol, in which all updates to all pages are logged.

3. Log records are created during the operation of the database. Log entries are sequentially ordered with sequence numbers.

4. The Dirty Page Table keeps a record of all the pages that have been modified and not yet written back to storage, together with the first sequence number that caused each page to become dirty.

5. The Transaction Table contains all transactions that are currently running and the sequence number of the last log entry they caused.

6. Analyse the information from the log file to determine which transactions to undo and which pages were dirty (data not up to date) at the time of the crash. This also determines RedoLSN, the LSN from which redo should start.

7. Every transaction implicitly begins with the first "update" type of entry for the given TransactionID.

8. Check whether the transaction needs to roll forward or not.

9. In this state, the redo operation will be done.

10. Repeat history by replaying every action not already reflected in the pages on storage.

11. Scan forward from RedoLSN. Whenever an update log record is found, apply one of the following three conditions.

12. If the page is not in the Dirty Page Table (DPT), the log record can be skipped.

13. If the LSN of the log record is less than the RecLSN in the Dirty Page Table, the log record can also be skipped.

14. If the PageLSN of the page fetched from storage is less than the LSN of the log record, the log record must be redone.

15. Check whether the transaction needs to roll back or not.

16. In this state, the undo operation can be done to retrieve old data.

17. Transactions whose abort completed earlier are not undone. There is no need to undo these transactions; their earlier undo actions were logged and are redone as required.

18. Perform a backward scan on the log, undoing all transactions in the undo-list.

19. For ordinary log records, set the next LSN to be undone for the transaction to the PrevLSN noted in the log record.

20. For compensation log records (CLRs), set the next LSN to be undone to the UndoNextLSN noted in the log record.

21. All intervening records are skipped, since they will already have been undone.

22. Check whether the transaction needs both roll-forward and roll-back or not.

23. Undo and redo operations will proceed.

24. Fetch the records, starting from the end of the log.

25. Proceed backward until a lost record is met in the log.

26. Roll forward the records and trace the latest lost record.

27. Finish retracing the records in the log.

28. The data is recorded and can be retrieved.

A minimal sketch of this recovery flow is given below.
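Putting the steps together, here is a minimal Python sketch of the analysis, redo and undo passes in Table 3.1, reusing the record shapes sketched in Section 3.4.3; the in-memory "disk" and the simplified undo pass (which writes before-images directly instead of emitting CLRs) are assumptions for illustration.

```python
def aries_recover(log, dirty_pages, trans_table, disk, page_lsn):
    """log: list of LogRecord; dirty_pages: page_id -> DirtyPageEntry;
    trans_table: trans_id -> TransactionEntry; disk: page_id -> data;
    page_lsn: page_id -> LSN of the last update applied to that page."""
    # Analysis: find RedoLSN and the losers (transactions active at crash).
    redo_lsn = min((e.rec_lsn for e in dirty_pages.values()), default=0)
    losers = {t for t, e in trans_table.items() if e.state == "active"}

    # Redo: repeat history from RedoLSN, skipping records whose effects
    # are already on storage (the three conditions in steps 12-14).
    for rec in log:
        if rec.lsn < redo_lsn:
            continue
        if rec.page_id not in dirty_pages:
            continue                                   # step 12: skip
        if rec.lsn < dirty_pages[rec.page_id].rec_lsn:
            continue                                   # step 13: skip
        if page_lsn.get(rec.page_id, -1) < rec.lsn:
            disk[rec.page_id] = rec.redo               # step 14: redo
            page_lsn[rec.page_id] = rec.lsn

    # Undo: scan the log backward, rolling back loser transactions only.
    for rec in reversed(log):
        if rec.trans_id in losers and rec.undo is not None:
            disk[rec.page_id] = rec.undo               # write before-image
            # a full implementation would append a CLR here (step 20)
    return disk
```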

3.6 Software and hardware requirements

(1) VirtualBox

A software virtualization package that installs on an operating system as an application. VirtualBox allows additional operating systems to be installed on it, as guest OSes, and run in a virtual environment. A Linux OS is used as the medium to implement the project.

(2) phpMyAdmin

A free software tool written in PHP, intended to handle the administration of MySQL over the web. phpMyAdmin supports a wide range of operations on MySQL. Frequently used operations, such as managing databases, tables, columns, relations, indexes, users and permissions, can be performed via the user interface, while users still have the ability to directly execute any SQL statement.

(3) Personal Computer/Laptop

Processor : Intel® Core ™ i3-2350M CPU @ 2.30 GHz

RAM : 6.00 GB

System Type : 64-bit Operating System

REFERENCES

[1] Kumar, A., Sahu, S. K., Tyagi, S., Sangwan, V., & Bagate, R. (2013). Data recovery using restoration tool. International Journal of Mathematics, 1(3).

[2] Zhao, F., Zhang, J. S., & Wang, Z. X. (2013). Research on data recovery of Oracle database in Linux. In Advanced Materials Research (Vol. 601, pp. 337-341). Trans Tech Publications.

[3] Speer, J., & Kirchberg, M. (2005). D-ARIES: A distributed version of the ARIES recovery algorithm. In ADBIS Research Communications.

[4] Guo, J. (2004). Fault tolerant computing. The University of Michigan-Dearborn, 7.

[5] Nasreen, M. A., Ganesh, A., & Sunitha, C. (2016). A study on Byzantine fault tolerance methods in distributed networks. Procedia Computer Science, 87, 50-54.

[6] Cong, G., Fan, W., Geerts, F., Jia, X., & Ma, S. (2007, September). Improving data quality: Consistency and accuracy. In Proceedings of the 33rd International Conference on Very Large Data Bases (pp. 315-326). VLDB Endowment.

[7] Skeel Jr, D. A., & Jackson, T. H. (2012). Transaction consistency and the new finance in bankruptcy. Columbia Law Review, 152-202.

[8] Domaschka, J., Hauser, C. B., & Erb, B. (2014, September). Reliability and availability properties of distributed database systems. In Enterprise Distributed Object Computing Conference (EDOC), 2014 IEEE 18th International (pp. 226-233). IEEE.

[9] Tanenbaum, A. S., & van Steen, M. (2007). Distributed Systems: Principles and Paradigms. Pearson Prentice Hall.

[10] Juang, C. F. (2004). A hybrid of genetic algorithm and particle swarm optimization for recurrent network design. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 34(2), 997-1006.

[11] Karaboga, D., & Gorkemli, B. (2014). A quick artificial bee colony (qABC) algorithm and its performance on optimization problems. Applied Soft Computing, 23, 227-238.

[12] Liao, T., Socha, K., de Oca, M. A. M., Stützle, T., & Dorigo, M. (2014). Ant colony optimization for mixed-variable optimization problems. IEEE Transactions on Evolutionary Computation, 18(4), 503-518.

[13] Kim, J. J., Kang, J. J., & Lee, K. Y. (2012). Recovery methods in main memory DBMS. International Journal of Advanced Smart Convergence, 1(2), 26-29.

[14] Sharma, S., Agiwal, P., Gaherwal, R., Mewada, S., & Sharma, P. (2012). Analysis of recovery techniques in data base management system. Research Journal of Computer and Information Technology Sciences, E-ISSN, 2320, 6527.

[15] Speer, J., & Kirchberg, M. (2007, September). C-ARIES: A multi-threaded version of the ARIES recovery algorithm. In International Conference on Database and Expert Systems Applications (pp. 319-328). Springer Berlin Heidelberg.

[16] Al-Shammari, M. M., & Alwan, A. A. A conceptual framework for disaster recovery and business continuity of database services in multi-cloud.

[17] Harder, T., Sauer, C., Graefe, G., & Guy, W. (2015). Instant recovery with write-ahead logging. Datenbank-Spektrum, 15(3), 235-239.

[18] Graefe, G. (2014). Instant recovery from system failures. Submitted for publication.

[19] Zhu, H., Fu, G., Feng, Y. C., & Lü, K. (2010). Dynamic damage recovery for web databases. Journal of Computer Science and Technology, 25(3), 548-561.

[20] Akkuş, İ. E., & Goel, A. (2010, June). Data recovery for web applications. In Dependable Systems and Networks (DSN), 2010 IEEE/IFIP International Conference on (pp. 81-90). IEEE.

[21] Fu, G., Zhu, H., Feng, Y., Zhu, Y., Shi, J., Chen, M., & Wang, X. (2008, October). Fine grained transaction log for data recovery in database systems. In Trusted Infrastructure Technologies Conference, 2008. APTC'08. Third Asia-Pacific (pp. 123-131). IEEE.

[22] Zhu, H., Fu, G., Zhu, Y., Jin, R., Lü, K., & Shi, J. (2008, September). Dynamic data recovery for database systems based on fine grained transaction log. In Proceedings of the 2008 International Symposium on Database Engineering & Applications (pp. 249-253). ACM.
APPENDIX 1

Gantt Chart (project schedule, Weeks 1-15). Tasks:

- Discussion of title with supervisor
- Abstract & title submission
- LR discussion & problem statement
- Proposal preparation & slides
- Proposal presentation
- Proposal correction
- Methodology
- Framework design
- Implementation of algorithm
- Conference preparation
- Conference academic project (framework)
- Proposal draft submission
- Proposal correction
- Proposal report submission
