Definition:
Transaction – refers to an action or series of actions carried out by a single user or
application program which accesses or changes the contents of the database.
Most DBMSs do not have an inbuilt way of determining which actions are grouped
together to form a single transaction.
The usual way around this is to have the user or program mark these boundaries explicitly.
Keywords such as BEGIN, END, COMMIT and/or ROLLBACK (or their equivalents) are
used for this purpose.
If these delimiters are not used, the entire program is usually regarded as a single
transaction, with the DBMS performing an automatic COMMIT when the program
terminates correctly and a ROLLBACK if it does not.
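The delimiters described above can be tried out with a minimal sketch using Python's built-in sqlite3 module. The table name account, the helper names and the "insufficient funds" rule are assumptions made for illustration only; they do not come from these notes.

```python
import sqlite3


def make_db():
    # isolation_level=None puts the connection in autocommit mode, so the
    # only transaction boundaries are the ones issued explicitly below.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO account VALUES (1, 10000)")
    return conn


def balance_of(conn, account_id):
    row = conn.execute(
        "SELECT balance FROM account WHERE id = ?", (account_id,)
    ).fetchone()
    return row[0]


def withdraw(conn, account_id, amount):
    """Withdraw inside an explicit transaction; undo the update on failure."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?",
            (amount, account_id),
        )
        if balance_of(conn, account_id) < 0:
            raise ValueError("insufficient funds")
        conn.execute("COMMIT")
        return True
    except ValueError:
        conn.execute("ROLLBACK")  # the UPDATE above is discarded
        return False


conn = make_db()
print(withdraw(conn, 1, 1000), balance_of(conn, 1))   # True 9000
print(withdraw(conn, 1, 50000), balance_of(conn, 1))  # False 9000
```

The second call shows the "all or nothing" behaviour: the update was executed, but the ROLLBACK leaves the balance as it was before the transaction began.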
Properties of transactions
1. Atomicity (the “all or nothing property”)
A transaction is an indivisible unit that is either performed in its entirety or not performed
at all.
2. Consistency
A transaction must transform the database from one consistent state to another consistent
state.
3. Isolation
Transactions execute independently of one another. This means that any partial effects of
one incomplete transaction are not “visible” to other transactions.
Transaction management & concurrency control
4. Durability
The effects of a successfully completed (committed) transaction are permanently
recorded in the database and must not be lost as a result of a subsequent failure.
If a failure occurs during a transaction, the database could be left in an inconsistent
state. Different DBMSs provide different ways of avoiding this. Oracle, for instance, has
the following provisions:
A recovery manager, which ensures that the database is restored to the state
it was in before the start of the transaction, and therefore to a consistent state.
A buffer manager, which is responsible for the transfer of data between disk storage
and main memory.
Concurrency Control
Refers to the process of managing simultaneous operations on the database without
having them interfere with each other.
A key objective of developing a database is to enable many users to access shared data
concurrently.
Concurrent access is relatively easy if all users are only reading, as there is no way that
they can interfere with one another.
However, when two or more transactions access the database simultaneously
and at least one of them is updating data, there may be interference that can result in
inconsistencies.
Example
Assume we have two transactions, T1 and T2.
T1 is withdrawing Ksh. 1,000 from an account with a balance of Ksh. 10,000 and T2 is
depositing Ksh. 5,000 into the same account.
If these transactions were executed serially, one after the other without interleaving of
operations, the final balance would be Ksh. 14,000 no matter which transaction is
performed first.
Problem (the lost update problem):
Transactions T1 and T2 start at nearly the same time, and both read the balance as Ksh.
10,000. T2 increases its copy of the balance by Ksh. 5,000 to Ksh. 15,000 and
stores the update in the database. Meanwhile, transaction T1 reduces its copy of the
balance by Ksh. 1,000 to Ksh. 9,000 and stores this value in the database, overwriting the
previous update by T2 and thereby “losing” the Ksh. 5,000 previously added to the
balance.
Illustration (Problem).
Time  T1                          T2                          Balance
t1                                Begin transaction           10,000
t2    Begin transaction           Read (Balance)              10,000
t3    Read (Balance)              Balance = Balance + 5,000   10,000
t4    Balance = Balance - 1,000   Write (Balance)             15,000
t5    Write (Balance)             Commit                      9,000
t6    Commit                                                  9,000
The loss of T2’s update can be avoided by preventing T1 from reducing the value of the
Balance until T2’s update has been completed.
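The interleaving described above can be replayed with a short simulation. This is only a sketch: the replay function and the tuple encoding of each step are assumptions made for illustration, with every transaction working on a private copy of the balance until it writes.

```python
def replay(schedule, initial_balance):
    """Replay (transaction, operation, amount) steps against one shared balance."""
    database = {"Balance": initial_balance}
    private = {}  # each transaction's private working copy of the balance
    for txn, op, amount in schedule:
        if op == "read":
            private[txn] = database["Balance"]
        elif op == "add":
            private[txn] += amount
        elif op == "write":
            database["Balance"] = private[txn]
    return database["Balance"]


# Both transactions read 10,000 before either one writes.
interleaved = [
    ("T2", "read", 0), ("T1", "read", 0),
    ("T2", "add", 5000), ("T1", "add", -1000),
    ("T2", "write", 0), ("T1", "write", 0),
]
# T2 runs to completion before T1 starts.
serial = [
    ("T2", "read", 0), ("T2", "add", 5000), ("T2", "write", 0),
    ("T1", "read", 0), ("T1", "add", -1000), ("T1", "write", 0),
]

print(replay(interleaved, 10000))  # 9000: T2's deposit is lost
print(replay(serial, 10000))       # 14000
```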
Solution:
T2 first requests a write_lock on the Balance. It can then proceed to read the value of the
balance from the database, increase it by Ksh. 5,000 and write the new value back to
the database. When T1 starts, it also requests a write_lock on the Balance. However, since
the balance is already write-locked by T2, the request is not immediately granted and T1
has to wait until the lock is released by T2. This happens only after T2 commits.
Illustration - Solution
Time  T1                          T2                          Balance
t1                                Begin transaction           10,000
t2    Begin transaction           Write_lock (Balance)        10,000
t3    Write_lock (Balance)        Read (Balance)              10,000
t4    WAIT                        Balance = Balance + 5,000   10,000
t5    WAIT                        Write (Balance)             15,000
t6    WAIT                        Commit/unlock (Balance)     15,000
t7    Read (Balance)                                          15,000
t8    Balance = Balance - 1,000                               15,000
t9    Write (Balance)                                         14,000
t10   Commit/unlock (Balance)                                 14,000
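The effect of the write lock can be sketched with two threads and a mutex. Here threading.Lock merely stands in for the DBMS write lock on Balance, and the Account class is a hypothetical helper, not part of any DBMS.

```python
import threading


class Account:
    def __init__(self, balance):
        self.balance = balance
        self._write_lock = threading.Lock()  # stands in for Write_lock (Balance)

    def apply(self, amount):
        # The lock is held across the whole read-modify-write sequence, so a
        # second transaction has to wait until the first one has finished.
        with self._write_lock:
            current = self.balance           # Read (Balance)
            current = current + amount       # Balance = Balance +/- amount
            self.balance = current           # Write (Balance), then unlock


account = Account(10000)
t2 = threading.Thread(target=account.apply, args=(5000,))   # deposit
t1 = threading.Thread(target=account.apply, args=(-1000,))  # withdrawal
for t in (t2, t1):
    t.start()
for t in (t2, t1):
    t.join()
print(account.balance)  # 14000, whichever thread obtains the lock first
```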
Problem (the uncommitted dependency problem):
Transaction T4 reads the balance as Ksh. 10,000, increases it by Ksh. 5,000 and updates
the stored figure to Ksh. 15,000, but it then aborts, so the balance is restored
to its original value of Ksh. 10,000.
However, by this time, transaction T3 has read the new value of the balance (Ksh. 15,000)
and is using it as the basis of a Ksh. 1,000 withdrawal, giving an incorrect new balance of
Ksh. 14,000 instead of Ksh. 9,000.
The reason for the abort is immaterial; what matters is that T3 assumes that T4’s
update completed successfully, even though it was rolled back.
Illustration (Problem).
Time  T3                          T4                          Balance
t1                                Begin transaction           10,000
t2                                Read (Balance)              10,000
t3                                Balance = Balance + 5,000   10,000
t4    Begin transaction           Write (Balance)             15,000
t5    Read (Balance)                                          15,000
t6    Balance = Balance - 1,000   Rollback                    10,000
t7    Write (Balance)                                         14,000
t8    Commit                                                  14,000
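The schedule in the illustration can be replayed with a small simulation in which a rollback restores the value the balance had before the transaction's first write. The function name and the step encoding are assumptions made for this sketch.

```python
def replay_with_rollback(schedule, initial_balance):
    """Replay steps; a rollback restores the transaction's before-image."""
    database = {"Balance": initial_balance}
    private = {}       # each transaction's working copy
    before_image = {}  # balance as it was before a transaction's first write
    for txn, op, amount in schedule:
        if op == "read":
            private[txn] = database["Balance"]
        elif op == "add":
            private[txn] += amount
        elif op == "write":
            before_image.setdefault(txn, database["Balance"])
            database["Balance"] = private[txn]
        elif op == "rollback" and txn in before_image:
            database["Balance"] = before_image.pop(txn)
    return database["Balance"]


# T3 reads the balance after T4's write but before T4's rollback.
dirty = [
    ("T4", "read", 0), ("T4", "add", 5000), ("T4", "write", 0),
    ("T3", "read", 0),
    ("T4", "rollback", 0),
    ("T3", "add", -1000), ("T3", "write", 0),
]
# T3 reads only after T4 has rolled back.
clean = [
    ("T4", "read", 0), ("T4", "add", 5000), ("T4", "write", 0),
    ("T4", "rollback", 0),
    ("T3", "read", 0), ("T3", "add", -1000), ("T3", "write", 0),
]

print(replay_with_rollback(dirty, 10000))  # 14000 (incorrect)
print(replay_with_rollback(clean, 10000))  # 9000
```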
The solution to the problem is to prevent T3 from reading the Balance until T4 has
finished (whether it completes successfully or not).
T4 first requests a write_lock on the balance. It then proceeds to read the value of the
balance, increments it by Ksh. 5,000 and writes the new value back to the database,
but it does not commit. Instead, a rollback is issued. When the rollback is executed, the
updates of T4 are undone and the Balance is returned to its original value of
Ksh. 10,000. Only then is T3’s write_lock request granted, so T3 reads Ksh. 10,000 and
its withdrawal produces the correct balance of Ksh. 9,000.
Illustration - solution.
Time  T3                          T4                          Balance
t1                                Begin transaction           10,000
t2    Begin transaction           Write_lock (Balance)        10,000
t3    Write_lock (Balance)        Read (Balance)              10,000
t4    WAIT                        Balance = Balance + 5,000   10,000
t5    WAIT                        Write (Balance)             15,000
t6    WAIT                        Rollback/unlock (Balance)   10,000
t7    Read (Balance)                                          10,000
t8    Balance = Balance - 1,000                               10,000
t9    Write (Balance)                                         9,000
t10   Commit/unlock (Balance)                                 9,000
On the other hand, transactions that only read the database can also produce inaccurate
results if they are allowed to read partial results of other incomplete transactions that are
simultaneously updating the database, a situation referred to as a dirty read or
unrepeatable read.
The inconsistent analysis problem occurs when a transaction reads several values from
the database but another transaction updates one or more of them during the
execution of the first.
Illustration (Problem).
                                                          Balances
Time  T5                      T6                          X      Y    Z    Total
t1                            Begin transaction           1,000  700  500  -
t2    Begin transaction       Total = 0                   1,000  700  500  0
t3    Read (X)                Read (X)                    1,000  700  500  0
t4    X = X - 100             Total = Total + X           1,000  700  500  1,000
t5    Write (X)               Read (Y)                    900    700  500  1,000
t6    Read (Z)                Total = Total + Y           900    700  500  1,700
t7    Z = Z + 100                                         900    700  500  1,700
t8    Write (Z)                                           900    700  600  1,700
t9    Commit                  Read (Z)                    900    700  600  1,700
t10                           Total = Total + Z           900    700  600  2,300
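T6 reports a total of 2,300 because it adds the old value of X (1,000) to the new value
of Z (600), whereas the three balances sum to 2,200 both before and after T5’s transfer.
The solution to this problem is to prevent transaction T6 from reading balances X, Y and
Z until T5 has completed its updates.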
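The arithmetic of the inconsistent analysis can be checked with a small replay of the interleaved steps. The step encoding ("read", "add", "write", "sum") is an assumption made for this sketch; T5 transfers 100 from X to Z while T6 totals X, Y and Z.

```python
def replay(schedule, balances):
    """Return (final balances, total computed by the summing transaction)."""
    database = dict(balances)
    private = {}  # (transaction, item) -> private copy of that item
    total = 0
    for txn, op, item, amount in schedule:
        if op == "read":
            private[(txn, item)] = database[item]
        elif op == "add":
            private[(txn, item)] += amount
        elif op == "write":
            database[item] = private[(txn, item)]
        elif op == "sum":
            total += private[(txn, item)]
    return database, total


interleaved = [
    ("T5", "read", "X", 0), ("T6", "read", "X", 0),
    ("T5", "add", "X", -100), ("T6", "sum", "X", 0),
    ("T5", "write", "X", 0), ("T6", "read", "Y", 0),
    ("T5", "read", "Z", 0), ("T6", "sum", "Y", 0),
    ("T5", "add", "Z", 100),
    ("T5", "write", "Z", 0),
    ("T6", "read", "Z", 0),
    ("T6", "sum", "Z", 0),
]

final, total = replay(interleaved, {"X": 1000, "Y": 700, "Z": 500})
print(final)                # {'X': 900, 'Y': 700, 'Z': 600}
print(total)                # 2300, the figure T6 reports
print(sum(final.values()))  # 2200, the true total
```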
Illustration (solution).
                                                          Balances
Time  T5                      T6                          X      Y    Z    Total
t1                            Begin transaction           1,000  700  500  -
t2    Begin transaction       Total = 0                   1,000  700  500  0
t3    Write_lock (X)          Read_lock (X)               1,000  700  500  0
t4    Read (X)                WAIT                        1,000  700  500  0
t5    X = X - 100             WAIT                        1,000  700  500  0
t6    Write (X)               WAIT                        900    700  500  0
t7    Write_lock (Z)          WAIT                        900    700  500  0
t8    Read (Z)                WAIT                        900    700  500  0
t9    Z = Z + 100             WAIT                        900    700  500  0
t10   Write (Z)               WAIT                        900    700  600  0
t11   Commit/unlock (X, Z)    WAIT                        900    700  600  0
t12                           Read (X)                    900    700  600  0
t13                           Total = Total + X           900    700  600  900
t14                           Read_lock (Y)               900    700  600  900
t15                           Read (Y)                    900    700  600  900
t16                           Total = Total + Y           900    700  600  1,600
t17                           Read_lock (Z)               900    700  600  1,600
t18                           Read (Z)                    900    700  600  1,600
t19                           Total = Total + Z           900    700  600  2,200
t20                           Commit/unlock (X, Y & Z)    900    700  600  2,200
Definition:
Schedule – refers to a sequence of operations by a set of concurrent transactions that
preserves the order of the operations in each of the individual transactions.
A transaction is made up of a sequence of operations consisting of reads and writes to the
database, followed by a commit or abort action.
Serial schedule – refers to a schedule where the operations of each transaction are
executed consecutively without any interleaved operations from other transactions.
In a serial schedule, the transactions are performed in serial order. For instance, if we
have 2 transactions T1 and T2, the serial order would be T1 then T2, or T2 then T1.
Evidently, in serial execution, there is no interference between transactions, since only
one transaction is executing at any given time.
The outcomes of different serial executions of a given set of transactions are not
guaranteed to be identical. For instance, in a bank, it matters a lot whether the interest
is calculated before or after a large deposit is made.
Non-serial schedule – refers to a schedule where the operations from a set of concurrent
transactions are interleaved.
The three concurrency problems described earlier arise from the mismanagement
of concurrency, which left the database in an inconsistent state in the first two
problems and presented the user with a wrong result in the last problem (inconsistent
analysis).
Serial execution prevents such problems. Whichever serial order is chosen, serial execution
never leaves the database in an inconsistent state. Thus any serial schedule is considered
correct, even though different serial schedules may produce different results.
Schedule S1:
Time  T7                  T8
t1    Begin_transaction
t2    Read (Balance X)
t3    Write (Balance X)
t4                        Begin_transaction
t5                        Read (Balance X)
t6                        Write (Balance X)
t7    Read (Balance Y)
t8    Write (Balance Y)
t9    Commit
t10                       Read (Balance Y)
t11                       Write (Balance Y)
t12                       Commit
Since the write operation on Balance X in T8 does not conflict with the subsequent read
operation on Balance Y in T7 (they access different data items), the order of these two
operations can be changed to produce an equivalent schedule S2 shown below.
Schedule S2:
Time  T7                  T8
t1    Begin_transaction
t2    Read (Balance X)
t3    Write (Balance X)
t4                        Begin_transaction
t5                        Read (Balance X)
t6    Read (Balance Y)
t7                        Write (Balance X)
t8    Write (Balance Y)
t9    Commit
t10                       Read (Balance Y)
t11                       Write (Balance Y)
t12                       Commit
Starting from S2, a schedule S3 in which T7’s reads and writes all precede those of T8
is obtained by simply:
- Changing the order of the Write (Balance X) of T8 with the Write (Balance Y) of T7.
- Changing the order of the Read (Balance X) of T8 with the Read (Balance Y) of T7.
- Changing the order of the Read (Balance X) of T8 with the Write (Balance Y) of T7.
None of these pairs of operations conflict, because each pair involves different data items.
The schedule S3 is a serial schedule and, since S1 and S2 are equivalent to S3, S1 and S2
are serializable schedules.
This type of serializability is known as conflict serializability. A conflict serializable
schedule orders any conflicting operations in the same way as some serial execution.
Under the constrained write rule (i.e. a transaction updates a data item based on its old
value, which is first read by the transaction), a precedence graph can be produced to test
for conflict serializability.
If the precedence graph contains a cycle, then the schedule is not conflict serializable.
Time  T9                              T10
t1    Begin transaction
t2    Read (Balance X)
t3    Balance X = Balance X - 1,000
t4    Write (Balance X)               Begin transaction
t5                                    Read (Balance X)
t6                                    Balance X = Balance X * 1.1
t7                                    Write (Balance X)
t8                                    Read (Balance Y)
t9                                    Balance Y = Balance Y * 1.1
t10                                   Write (Balance Y)
t11   Read (Balance Y)                Commit
t12   Balance Y = Balance Y + 1,000
t13   Write (Balance Y)
t14   Commit
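[Figure: precedence graph with nodes T9 and T10; one edge from T9 to T10 (on Balance X) and one edge from T10 to T9 (on Balance Y)]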
As the precedence graph contains a cycle, this schedule is not conflict serializable.
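The test can be sketched in a few lines of Python. The notes do not spell out how the graph is built, so this sketch assumes the usual rule: draw an edge Ti → Tj whenever an operation of Ti is followed by a conflicting operation of Tj (same data item, different transactions, at least one of the two a write). The function names and the (transaction, operation, item) encoding are illustrative assumptions.

```python
def precedence_graph(schedule):
    """Edges (Ti, Tj) for each conflicting pair where Ti's operation comes first."""
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph given as edge pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # reached a node that is still on the current path
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph[node])
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in graph)


# Reads and writes of the T9/T10 schedule, in time order.
schedule = [
    ("T9", "R", "X"), ("T9", "W", "X"),
    ("T10", "R", "X"), ("T10", "W", "X"),
    ("T10", "R", "Y"), ("T10", "W", "Y"),
    ("T9", "R", "Y"), ("T9", "W", "Y"),
]
edges = precedence_graph(schedule)
print(sorted(edges))     # [('T10', 'T9'), ('T9', 'T10')]
print(has_cycle(edges))  # True: not conflict serializable
```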
View serializability
This is another type of serializability, which offers a less stringent definition of schedule
equivalence than that offered by conflict serializability.
Two schedules S1 and S2 consisting of the same operations from n transactions T1, T2, T3,
…, Tn are view equivalent if the following three conditions hold:
1. For each data item x, if transaction Ti reads the initial value of x in schedule S1,
then transaction Ti must also read the initial value of x in schedule S2.
2. For each read operation on data item x by transaction Ti in schedule S1, if the
value read has been written by transaction Tj, then transaction Ti must also
read the value of x produced by transaction Tj in schedule S2.
3. For each data item x, if the last write operation on x was performed by transaction
Ti in schedule S1, the same transaction must perform the final write on data item x
in schedule S2.
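The three conditions can be checked mechanically. The sketch below assumes schedules are encoded as lists of (transaction, operation, item) tuples; it records, for every read, which transaction wrote the value read (None standing for the initial value) and, for every item, which transaction wrote it last. The example schedules are hypothetical and are not taken from these notes.

```python
from collections import Counter


def view_profile(schedule):
    """Return (reads-from facts, final writer of each item) for a schedule."""
    last_writer = {}        # item -> transaction that wrote it most recently
    read_count = Counter()  # numbers the reads of each (transaction, item)
    reads = set()
    for txn, op, item in schedule:
        if op == "R":
            read_count[(txn, item)] += 1
            # None as the writer means the initial value was read.
            reads.add((txn, item, read_count[(txn, item)], last_writer.get(item)))
        else:
            last_writer[item] = txn
    return reads, last_writer


def view_equivalent(s1, s2):
    # Same reads-from facts (conditions 1 and 2) and same final writers
    # (condition 3).
    return view_profile(s1) == view_profile(s2)


# A hypothetical schedule in which T2 and T3 write x without reading it.
s = [("T1", "R", "x"), ("T2", "W", "x"), ("T1", "W", "x"), ("T3", "W", "x")]
serial_123 = [("T1", "R", "x"), ("T1", "W", "x"), ("T2", "W", "x"), ("T3", "W", "x")]
serial_213 = [("T2", "W", "x"), ("T1", "R", "x"), ("T1", "W", "x"), ("T3", "W", "x")]

print(view_equivalent(s, serial_123))  # True
print(view_equivalent(s, serial_213))  # False: T1 would read T2's value
```

Schedule s is view equivalent to a serial schedule even though T1 and T2 perform conflicting operations in both orders, which illustrates why view equivalence is the less stringent of the two definitions.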
Locking and timestamping are conservative (or pessimistic) techniques in that they
delay transactions in case they conflict with other transactions at some future time.
Alternative methods, called optimistic methods, are based on the premise that
conflict is rare, so transactions are allowed to proceed unsynchronized and conflicts are
checked for only at the end, when a transaction reaches the “commit” stage.
Locking
This is a procedure used to control concurrent access to data. When one transaction is
accessing the database, a lock may deny access by other transactions to prevent incorrect
results.
There are two types of lock:
Read lock: if a transaction has a read lock on a data item, it can read the item but not
update it. A read lock is shared, i.e. many users can be granted a read lock on the same
item at the same time without an adverse effect on the database.
Write lock: if a transaction has a write lock on a data item, it can both read and update the
item. A write lock is exclusive, i.e. only one user/application can be granted a write lock
on an item at any one time.
Some systems permit a transaction to issue a read lock on an item and then later upgrade
the lock to a write lock. This allows a transaction to examine data first and then decide
whether to update or not. If upgrades are not supported, a transaction must hold write
locks on all data items that it may update at some time during the execution of the
transaction, thereby potentially reducing the level of concurrency in the system.
For similar reasons, some systems also permit a transaction to issue a write lock and then
later downgrade the lock to a read lock.
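The rules for shared and exclusive locks, including upgrading and downgrading, can be sketched as a small lock table. The LockTable class and its method names are hypothetical; a request that cannot be granted simply returns False here, whereas a real DBMS would make the transaction wait.

```python
class LockTable:
    def __init__(self):
        self.readers = {}  # item -> set of transactions holding a read lock
        self.writer = {}   # item -> the transaction holding the write lock

    def read_lock(self, txn, item):
        holder = self.writer.get(item)
        if holder == txn:
            return True    # a write lock already allows reading
        if holder is not None:
            return False   # another transaction holds the exclusive lock
        self.readers.setdefault(item, set()).add(txn)
        return True

    def write_lock(self, txn, item):
        holder = self.writer.get(item)
        if holder is not None and holder != txn:
            return False
        if self.readers.get(item, set()) - {txn}:
            return False   # other transactions still hold read locks
        self.readers.get(item, set()).discard(txn)  # upgrade: read -> write
        self.writer[item] = txn
        return True

    def downgrade(self, txn, item):
        if self.writer.get(item) != txn:
            return False
        del self.writer[item]
        self.readers.setdefault(item, set()).add(txn)
        return True

    def unlock(self, txn, item):
        self.readers.get(item, set()).discard(txn)
        if self.writer.get(item) == txn:
            del self.writer[item]


locks = LockTable()
print(locks.read_lock("T1", "X"), locks.read_lock("T2", "X"))  # True True
print(locks.write_lock("T1", "X"))  # False: T2 still holds a read lock
locks.unlock("T2", "X")
print(locks.write_lock("T1", "X"))  # True: T1 upgrades its read lock
```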
According to the rules of this protocol (two-phase locking), every transaction can be
divided into two phases: first a growing phase, in which it acquires all the locks required
but cannot release any locks, and then a shrinking phase, in which it releases its locks but
cannot acquire any new locks. It is not mandatory that all locks be acquired simultaneously:
a transaction will normally acquire some locks, do some processing and go on to acquire
additional locks as needed. However, no locks are released until the transaction has
reached a stage at which no new locks are needed.
If upgrading of locks is supported, it can only happen during the growing phase and
may require the transaction to wait until another transaction releases a read lock on the
item. Downgrading can only take place during the shrinking phase.
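The two-phase rule itself is easy to enforce in code. The sketch below tracks only the phase of a single transaction (it does not model conflicts with other transactions); the class name and its methods are assumptions made for illustration.

```python
class TwoPhaseTransaction:
    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False  # becomes True at the first release

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(
                f"{self.name}: cannot acquire {item!r} after releasing a lock"
            )
        self.held.add(item)     # growing phase

    def unlock(self, item):
        self.held.remove(item)
        self.shrinking = True   # shrinking phase has started


t = TwoPhaseTransaction("T1")
t.lock("X")
t.lock("Y")       # locks may be acquired one at a time as they are needed
t.unlock("X")     # first release: the growing phase is over
try:
    t.lock("Z")   # violates the protocol
except RuntimeError as error:
    print(error)  # T1: cannot acquire 'Z' after releasing a lock
```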
Deadlock
It refers to an impasse that may occur when two (or more) transactions are each waiting
for locks held by the other to be released.
There is only one way to break a deadlock: abort one or more of the transactions
involved, which means undoing all the changes made by those transactions.
For two deadlocked transactions TA and TB, assume we abort transaction TB. Once this is
done, the locks held by transaction TB are released and TA is able to proceed. Deadlocks
should be transparent to the users, and therefore the DBMS should automatically restart
the aborted transactions.
Deadlock prevention
A common approach used in deadlock prevention is to order transactions using
transaction timestamps.
There are two algorithms used here:
Wait-die - it allows only an older transaction to wait for a younger one; otherwise the
requesting transaction is aborted (dies) and restarted with the same timestamp, so that
eventually it will become the oldest active transaction and will not die.
Wound-wait - it works such that only younger transactions can wait for older ones. If
an older transaction requests a lock held by a younger one, the younger one is aborted
(wounded).
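Both algorithms reduce to a comparison of two timestamps, as the following sketch shows. It assumes that a smaller timestamp means an older transaction; the function names and the returned strings are illustrative only.

```python
def wait_die(requester_ts, holder_ts):
    """An older requester waits; a younger requester is aborted (dies)."""
    return "wait" if requester_ts < holder_ts else "abort requester"


def wound_wait(requester_ts, holder_ts):
    """An older requester aborts (wounds) the holder; a younger one waits."""
    return "abort holder" if requester_ts < holder_ts else "wait"


# A transaction with timestamp 5 (older) and one with timestamp 9 (younger).
print(wait_die(5, 9))    # wait
print(wait_die(9, 5))    # abort requester
print(wound_wait(5, 9))  # abort holder
print(wound_wait(9, 5))  # wait
```

In both schemes waiting is allowed in one direction of age only (old for young under wait-die, young for old under wound-wait), so a cycle of waiting transactions cannot form.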
Deadlock detection
It is usually handled by the construction of a wait-for graph (WFG), showing
transaction dependencies; i.e. transaction Ti is dependent on Tj if Tj holds a lock on a data
item that Ti is waiting for.
The WFG is constructed as follows:
- Create a node for each transaction.
- Create a directed edge Ti → Tj if transaction Ti is waiting to lock an item that is
currently locked by Tj.
Deadlock exists if and only if the WFG contains a cycle. Since a cycle in the WFG is a
necessary and sufficient condition for a deadlock to exist, the deadlock
detection algorithm generates the WFG at regular intervals and examines it for a cycle.
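The construction and the cycle test can be sketched as follows. The input format (a list of pairs meaning "Ti waits for Tj") and the function names are assumptions; returning the transactions on the cycle, rather than just True or False, gives the DBMS a set of candidates from which to choose a transaction to abort.

```python
def build_wfg(waiting_for):
    """waiting_for: (Ti, Tj) pairs meaning Ti waits for a lock held by Tj."""
    graph = {}
    for ti, tj in waiting_for:
        graph.setdefault(ti, set()).add(tj)  # directed edge Ti -> Tj
        graph.setdefault(tj, set())          # make sure Tj has a node too
    return graph


def find_deadlock(graph):
    """Return the transactions on one cycle of the WFG, or None if acyclic."""
    state = {}  # node -> "active" (on the current path) or "done"
    path = []

    def visit(node):
        state[node] = "active"
        path.append(node)
        for nxt in graph[node]:
            if state.get(nxt) == "active":
                return path[path.index(nxt):]  # the cycle closes at nxt
            if nxt not in state:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        state[node] = "done"
        path.pop()
        return None

    for node in graph:
        if node not in state:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


print(find_deadlock(build_wfg([("TA", "TB"), ("TB", "TA")])))  # ['TA', 'TB']
print(find_deadlock(build_wfg([("T1", "T2"), ("T2", "T3")])))  # None
```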
Timestamping
A timestamp is a unique identifier created by the DBMS that indicates the relative
starting time of a transaction.
Timestamping, in turn, is a concurrency control protocol in which the key
objective is to order transactions globally in such a way that older transactions (those
with smaller timestamps) get priority in the event of conflict.
Optimistic techniques
In some systems, conflicts between transactions are rare, and the additional processing
required by locking or timestamping protocols is unnecessary for many transactions.
In this approach, it is assumed that conflict is rare and that it is more efficient to allow
transactions to proceed unsynchronised. When a transaction wishes to commit, a check is
performed to determine whether conflict has occurred.
If there has been conflict, the transaction must be rolled back and restarted. Since conflict
is rare, rollback is rare too.
The overhead involved in restarting a transaction may be considerable, since it
effectively means redoing the entire transaction. This may be tolerated only if it happens
very infrequently, in which case the majority of transactions will be processed without being
subjected to any delays. This allows for greater concurrency than traditional protocols,
since no locking is needed.
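The commit-time check can be sketched with version numbers. This is one simple way of detecting conflict, assumed here for illustration rather than taken from these notes: every item carries a version counter, a transaction remembers the version of each item it reads, and at commit it is rejected if any of those versions has changed. The class names are hypothetical.

```python
class VersionedStore:
    def __init__(self, **items):
        # item -> (value, version); the version grows with every committed write
        self.data = {name: (value, 0) for name, value in items.items()}


class OptimisticTransaction:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}  # item -> version seen when first read
        self.writes = {}         # updates buffered privately until commit

    def read(self, item):
        if item in self.writes:
            return self.writes[item]
        value, version = self.store.data[item]
        self.read_versions.setdefault(item, version)
        return value

    def write(self, item, value):
        self.writes[item] = value

    def commit(self):
        # Validation: has anything this transaction read been changed since?
        for item, version in self.read_versions.items():
            if self.store.data[item][1] != version:
                return False     # conflict: discard the work and restart
        for item, value in self.writes.items():
            _, version = self.store.data[item]
            self.store.data[item] = (value, version + 1)
        return True


store = VersionedStore(Balance=10000)
t1, t2 = OptimisticTransaction(store), OptimisticTransaction(store)
t1.write("Balance", t1.read("Balance") - 1000)
t2.write("Balance", t2.read("Balance") + 5000)
print(t2.commit())  # True
print(t1.commit())  # False: Balance changed after T1 read it

retry = OptimisticTransaction(store)  # restart T1 from the beginning
retry.write("Balance", retry.read("Balance") - 1000)
print(retry.commit(), store.data["Balance"][0])  # True 14000
```

The restart redoes the whole of T1, which is the overhead referred to above; no transaction ever waits for a lock.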