
CNET343 Distributed Systems L18: Transactions

David Lancaster, B330 PSQ, David.Lancaster@plymouth.ac.uk

Context
A pervasive class of applications in which users read and update structured data in a shared database

Acknowledgement: material from CDK, Bernstein and Newcomer, CMU

Airline reservation systems
On-line banking
The UoP student record system
The Voyager system in the UoP Library
Not necessarily huge - dental booking system

Often distributed
Many large hardware and software vendors get most of their revenue from TP (transaction processing)

Characteristics

Concurrent access to DB
Often mission critical for the organisation
End users are not computer professionals
Implemented with lots of redundancy
Recovery from failures should not involve the end users; it should be centrally controlled and as automatic as possible
Performance and Resilience are crucial

Bank account example



Bank account operations:

withdraw(amount)    withdraw this amount from the account
deposit(amount)     deposit this amount into the account
getBalance()        return the balance in the account

Imagine an Account object with this interface

Synchronisation within each Operation



If you look carefully, deposit involves substeps: read the balance, increment, write it back
If the operations are not synchronised to allow only one thread at a time to perform the operation, then two separate deposits to the same account may lead to problems: the first thread may read the balance, but then be suspended while the second reads it, so one of the two deposits is lost
Henceforth assume operations are thread safe
Java synchronized methods - lab 1
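The thread-safety assumption can be illustrated with a minimal sketch of the Account interface using Java synchronized methods. The class names, the use of whole-number amounts and the two-thread test are assumptions made for illustration, not the lab code.

```java
// Hypothetical Account class; amounts are whole numbers for simplicity.
class Account {
    private long balance;

    Account(long openingBalance) { this.balance = openingBalance; }

    // synchronized: only one thread at a time may run any of these
    // methods on the same Account object.
    synchronized void deposit(long amount) {
        long current = balance;      // read the balance
        current = current + amount;  // increment
        balance = current;           // write it back
    }

    synchronized void withdraw(long amount) { balance = balance - amount; }

    synchronized long getBalance() { return balance; }
}

class AccountDemo {
    // Two threads each make 10,000 deposits of 1 into the same account.
    static long runDeposits() {
        Account account = new Account(0);
        Runnable depositor = () -> {
            for (int i = 0; i < 10_000; i++) account.deposit(1);
        };
        Thread t1 = new Thread(depositor);
        Thread t2 = new Thread(depositor);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return account.getBalance();
    }

    public static void main(String[] args) {
        System.out.println(runDeposits()); // 20000 with synchronized methods
    }
}
```

Removing the synchronized keyword from deposit would allow the read and write-back steps of two threads to interleave, and the final balance could then fall short of 20000.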

Idea of a Transaction

Transactions consist of several operations which must all be carried out or none

Buying a house

In UK - until exchange of contracts either party can withdraw

Purchasing a series of air flights when no direct route exists
You don't want to be stranded in Anchorage

Bank Account Transaction

Transaction T consists of all the following

a.withdraw(100)
b.deposit(100)
c.withdraw(200)
b.deposit(200)

Would expect that the bank, where all accounts a, b, c are held, sees no change in the total funds it holds

TP System Infrastructure

User's viewpoint
  Enter a request from a browser or other display device
  The system performs some application-specific work, which includes database accesses
  Receive a reply (usually, but not always)
The TP system ensures that each transaction
  Is an independent unit of work
  Executes exactly once
  Produces permanent results
TP system makes it easy to program transactions
TP system has tools to make it easy to manage

TP System Infrastructure

[Figure: End-User -> Front End Program -> Request Controller (routes requests and supervises their execution) -> Transaction Server -> Database System. Side labels: Client, Back-End (Server)]

Transactions

Regard each transaction as a set of operations that must all be carried out or none
It is unacceptable to have a partially successful transaction
Many separate transactions must be processed efficiently, hence concurrently
But this leads to potential conflict and must be done carefully
Like revisiting the synchronisation issue at a coarser level of granularity
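The all-or-none requirement can be sketched in Java as follows. This is a single-threaded illustration under stated assumptions: the Bank class, its runAtomically method, the snapshot-and-restore approach and the opening balances are all hypothetical, and a real TP system would not copy the whole database to undo a transaction.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory bank holding accounts a, b and c.
class Bank {
    private final Map<String, Long> balances = new HashMap<>();

    void open(String name, long amount) { balances.put(name, amount); }

    long balance(String name) { return balances.get(name); }

    long total() {
        long sum = 0;
        for (long b : balances.values()) sum += b;
        return sum;
    }

    void withdraw(String name, long amount) {
        long current = balances.get(name);
        if (current < amount) {
            throw new IllegalStateException("insufficient funds in " + name);
        }
        balances.put(name, current - amount);
    }

    void deposit(String name, long amount) {
        balances.put(name, balances.get(name) + amount);
    }

    // Carry out all the operations or none: if any operation fails,
    // restore the balances saved before the transaction started.
    boolean runAtomically(Runnable operations) {
        Map<String, Long> snapshot = new HashMap<>(balances);
        try {
            operations.run();
            return true;                 // commit
        } catch (RuntimeException e) {
            balances.clear();
            balances.putAll(snapshot);   // abort: undo partial work
            return false;
        }
    }
}

class TransferDemo {
    // Opening balances for a and b are assumed; c is a parameter.
    static Bank sampleBank(long openingC) {
        Bank bank = new Bank();
        bank.open("a", 100);
        bank.open("b", 200);
        bank.open("c", openingC);
        return bank;
    }

    // Transaction T from the slide.
    static boolean transactionT(Bank bank) {
        return bank.runAtomically(() -> {
            bank.withdraw("a", 100);
            bank.deposit("b", 100);
            bank.withdraw("c", 200);
            bank.deposit("b", 200);
        });
    }

    public static void main(String[] args) {
        Bank bank = sampleBank(150);              // c cannot cover 200
        System.out.println(transactionT(bank));   // false
        System.out.println(bank.balance("b"));    // 200, the first deposit was undone
    }
}
```

When c holds enough funds the transaction commits and the bank's total is unchanged; when c.withdraw(200) fails, the earlier withdrawal from a and deposit to b are undone as well.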

Single or Nested

Sometimes, especially in more complex distributed systems, transactions can be composed of subtransactions
Known as nested transactions
When the original model is meant, say flat transaction
When a subtransaction aborts, the parent may attempt a different subtransaction, or may record the fact and then commit. It need not fail
Nested distributed transactions offer more potential parallelism

ACID

Atomicity: Either all operations are carried out, or none.

Impossible for a transaction to end up half-done.

Consistency: Each transaction transforms the system from a consistent & valid state to another consistent & valid state. It must be impossible for any transaction to leave the system in an invalid state. eg wrong total $ in bank

Isolation: The execution of each transaction must be isolated from the execution of other transactions

Durability: Once a transaction is completed, the system guarantees that the results of its operations will persist even if there are subsequent system failures.

Atomicity

Clear based on the examples so far
More examples? ATM - issue 100, debit the account
Atomicity will be enforced by the TP system

Consistency

Means internal consistency of the DB
The DB must satisfy all its integrity constraints
  unique primary keys, no dangling references
  total in bank, salary inside a range
Application dependent constraints
Consistency is up to the programmer, not the TP system

Isolation

Technically this means that transactions run in an order that is serially equivalent
The DB gives the illusion that the transactions are being done one after another, but in fact they are run in parallel and locks are used to prevent conflict
Supported by the TP system

Durability

You expect that the bank will remember how much is in your account even though you don't use it for 3 months
At the end, all the updates will be in stable storage that will survive power or OS failure
Necessary for the transaction to have legal status
Usually achieved with log files
Supported by the TP system
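The role of a log file can be sketched as follows, assuming a very simple scheme in which every update is appended to a file and forced to disk before it is applied in memory, and the file is replayed on restart. The LoggedStore class and the "account=value" line format are hypothetical; real systems log whole transactions with commit records and use far more careful formats.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical store whose state can be rebuilt from its log file.
class LoggedStore {
    private final Path log;
    private final Map<String, Long> balances = new HashMap<>();

    LoggedStore(Path log) {
        this.log = log;
        recover();
    }

    // Append the update to the log (SYNC asks for it to reach the disk)
    // and only then apply it to the in-memory copy.
    void set(String account, long value) {
        try {
            Files.write(log, Collections.singletonList(account + "=" + value),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND,
                    StandardOpenOption.SYNC);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        balances.put(account, value);
    }

    long get(String account) { return balances.getOrDefault(account, 0L); }

    // Replay the log in order; later entries overwrite earlier ones.
    private void recover() {
        if (!Files.exists(log)) return;
        try {
            for (String line : Files.readAllLines(log)) {
                String[] parts = line.split("=");
                balances.put(parts[0], Long.parseLong(parts[1]));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class LogDemo {
    // Update through one store, then simulate a restart by building a
    // second store on the same log file.
    static long roundTrip() {
        try {
            Path log = Files.createTempFile("tp-log", ".txt");
            LoggedStore before = new LoggedStore(log);
            before.set("a", 100);
            before.set("a", 80);
            LoggedStore afterRestart = new LoggedStore(log);
            long recovered = afterRestart.get("a");
            Files.delete(log);
            return recovered;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip()); // 80
    }
}
```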

Inside a TP System

How are the ACID properties supported?
For non-distributed transactions, mainly by the database
Concurrency through locking
Resilience through logging

Concurrency

Servers must isolate transactions and not allow them to interfere
One way would be to perform each transaction in the order it arrived, i.e. serially
A performance disaster
Must run them concurrently, especially as each transaction usually involves I/O (disk access) and the processor is idle during this time

Serial Equivalence

Run the transactions concurrently, but for consistency insist that the effect is equivalent to serial execution
Serialisable (compare with on-the-wire format = serialisable data structure)
Examples of typical problems when transactions are not serially equivalent

Concurrency Examples

Three accounts

a: $100, b: $200, c: $300

T transfers from a to b so as to increase balance in b by 10%
U transfers from c to b so as to increase balance in b by 10%
Expect final balance in b to be 200 + 20 + 22 = $242

Figure 16.5 The lost update problem

Transaction T:                      Transaction U:
balance = b.getBalance();           balance = b.getBalance();
b.setBalance(balance*1.1);          b.setBalance(balance*1.1);
a.withdraw(balance/10)              c.withdraw(balance/10)

balance = b.getBalance();   $200
                                    balance = b.getBalance();   $200
                                    b.setBalance(balance*1.1);  $220
b.setBalance(balance*1.1);  $220
a.withdraw(balance/10)      $80
                                    c.withdraw(balance/10)      $280

Figure 16.7 A serially equivalent interleaving of T and U

Transaction T:                      Transaction U:
balance = b.getBalance()            balance = b.getBalance()
b.setBalance(balance*1.1)           b.setBalance(balance*1.1)
a.withdraw(balance/10)              c.withdraw(balance/10)

balance = b.getBalance()    $200
b.setBalance(balance*1.1)   $220
                                    balance = b.getBalance()    $220
                                    b.setBalance(balance*1.1)   $242
a.withdraw(balance/10)      $80
                                    c.withdraw(balance/10)      $278

Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012
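The two interleavings of Figures 16.5 and 16.7 can be replayed deterministically in a short Java sketch. The SimpleAccount and InterleavingDemo classes are hypothetical, and balance*1.1 is written as balance * 11 / 10 on whole numbers (an assumption made to avoid floating-point rounding).

```java
// Hypothetical unsynchronised account, used single-threaded here.
class SimpleAccount {
    private long balance;

    SimpleAccount(long balance) { this.balance = balance; }

    long getBalance() { return balance; }
    void setBalance(long value) { balance = value; }
    void withdraw(long amount) { balance -= amount; }
}

class InterleavingDemo {
    // Figure 16.5: both transactions read b before either writes it.
    // Returns the final balances {a, b, c}.
    static long[] lostUpdate() {
        SimpleAccount a = new SimpleAccount(100);
        SimpleAccount b = new SimpleAccount(200);
        SimpleAccount c = new SimpleAccount(300);
        long balT = b.getBalance();        // T reads 200
        long balU = b.getBalance();        // U reads 200
        b.setBalance(balU * 11 / 10);      // U writes 220
        b.setBalance(balT * 11 / 10);      // T writes 220, U's update is lost
        a.withdraw(balT / 10);             // a becomes 80
        c.withdraw(balU / 10);             // c becomes 280
        return new long[] { a.getBalance(), b.getBalance(), c.getBalance() };
    }

    // Figure 16.7: U reads b only after T has written it.
    static long[] seriallyEquivalent() {
        SimpleAccount a = new SimpleAccount(100);
        SimpleAccount b = new SimpleAccount(200);
        SimpleAccount c = new SimpleAccount(300);
        long balT = b.getBalance();        // T reads 200
        b.setBalance(balT * 11 / 10);      // T writes 220
        long balU = b.getBalance();        // U reads 220
        b.setBalance(balU * 11 / 10);      // U writes 242
        a.withdraw(balT / 10);             // a becomes 80
        c.withdraw(balU / 10);             // c becomes 278
        return new long[] { a.getBalance(), b.getBalance(), c.getBalance() };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(lostUpdate()));         // [80, 220, 280]
        System.out.println(java.util.Arrays.toString(seriallyEquivalent())); // [80, 242, 278]
    }
}
```

In the lost update case the bank's total rises from 600 to 580 + ... in fact to 80 + 220 + 280 = 580, so 20 has vanished and b holds 220 rather than the expected 242; the serially equivalent interleaving keeps the total at 600.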


Exclusive Locks

The server attempts to lock any object that is about to be used in the transaction
The lock can only be held by one process, and if it is held, other processes must wait for it to become free
The use of the locks effectively serialises the transactions - to ensure this, a transaction is not allowed to request new locks after it has released any

Figure 16.14 Transactions T and U with exclusive locks

Transaction T:                              Transaction U:
balance = b.getBalance()                    balance = b.getBalance()
b.setBalance(bal*1.1)                       b.setBalance(bal*1.1)
a.withdraw(bal/10)                          c.withdraw(bal/10)

Operations              Locks               Operations              Locks
openTransaction
bal = b.getBalance()    lock B
b.setBalance(bal*1.1)                       openTransaction
a.withdraw(bal/10)      lock A              bal = b.getBalance()    waits for T's lock on B
closeTransaction        unlock A, B
                                                                    lock B
                                            b.setBalance(bal*1.1)
                                            c.withdraw(bal/10)      lock C
                                            closeTransaction        unlock B, C


Two Phase Locking



There is a rule for the way that transactions acquire and release locks that always leads to the transactions being serially equivalent
Known as two-phase locking (not two-phase commit)
It consists of a growing phase in which more locks are acquired, followed by a shrinking phase in which the locks are released
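The growing and shrinking phases can be sketched in Java with one exclusive lock per account. This sketch uses the strict variant, in which every lock is held until the transaction closes; the class names and the use of java.util.concurrent.locks.ReentrantLock are assumptions for illustration, and whole-number arithmetic replaces balance*1.1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical account guarded by its own exclusive lock.
class LockedAccount {
    final ReentrantLock lock = new ReentrantLock();
    long balance;

    LockedAccount(long balance) { this.balance = balance; }
}

// One instance per transaction, used by a single thread.
class TwoPhaseTransaction {
    private final List<LockedAccount> held = new ArrayList<>();

    // Growing phase: lock an account the first time it is touched.
    private void acquire(LockedAccount account) {
        if (!held.contains(account)) {
            account.lock.lock();   // waits if another transaction holds it
            held.add(account);
        }
    }

    long read(LockedAccount account) {
        acquire(account);
        return account.balance;
    }

    void write(LockedAccount account, long value) {
        acquire(account);
        account.balance = value;
    }

    // Shrinking phase: release every lock; none is requested afterwards.
    void close() {
        for (LockedAccount account : held) account.lock.unlock();
        held.clear();
    }
}

class TwoPhaseDemo {
    // Increase 'to' by 10% and take the same amount out of 'from'.
    static void increaseByTenPercent(LockedAccount from, LockedAccount to) {
        TwoPhaseTransaction tx = new TwoPhaseTransaction();
        try {
            long bal = tx.read(to);
            tx.write(to, bal * 11 / 10);
            tx.write(from, tx.read(from) - bal / 10);
        } finally {
            tx.close();
        }
    }

    // Runs T (a to b) and U (c to b) concurrently; returns {a, b, c}.
    static long[] run() {
        LockedAccount a = new LockedAccount(100);
        LockedAccount b = new LockedAccount(200);
        LockedAccount c = new LockedAccount(300);
        Thread t = new Thread(() -> increaseByTenPercent(a, b));
        Thread u = new Thread(() -> increaseByTenPercent(c, b));
        t.start();
        u.start();
        try {
            t.join();
            u.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return new long[] { a.balance, b.balance, c.balance };
    }
}
```

Whichever transaction locks b first, the other waits until it closes, so b always ends at 242 and the total stays at 600; only the split between a and c depends on the order.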

Deadlock

The use of locks can lead to deadlock
Imagine a cycle where each process is waiting for the one in front to release a lock
There are software methods for detecting deadlock - in which case the deadlock can be broken somehow - but hard to do this efficiently
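One detection method is to record which transaction each waiting transaction is waiting for and look for a cycle. The sketch below assumes each transaction waits for at most one other (as with exclusive locks held by a single owner); the WaitForGraph class and its method names are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical wait-for graph: waiter -> holder of the lock it wants.
class WaitForGraph {
    private final Map<String, String> waitsFor = new HashMap<>();

    void addWait(String waiter, String holder) { waitsFor.put(waiter, holder); }

    void removeWait(String waiter) { waitsFor.remove(waiter); }

    // Follow the chain of waits from each transaction; meeting a
    // transaction twice on the same chain means there is a cycle.
    boolean hasDeadlock() {
        for (String start : waitsFor.keySet()) {
            Set<String> seen = new HashSet<>();
            String current = start;
            while (current != null && seen.add(current)) {
                current = waitsFor.get(current);
            }
            if (current != null) return true;   // revisited: cycle found
        }
        return false;
    }
}

class DeadlockDemo {
    public static void main(String[] args) {
        WaitForGraph graph = new WaitForGraph();
        graph.addWait("T", "U");                  // T waits for U
        System.out.println(graph.hasDeadlock());  // false
        graph.addWait("U", "T");                  // U waits for T: a cycle
        System.out.println(graph.hasDeadlock());  // true
    }
}
```

Once a cycle is found, the deadlock can be broken by aborting one of the transactions in it, which corresponds to calling removeWait for that transaction here.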

Lock Granularity

Lock the whole database, a single object, the row of a DB?
How long is the bit of code protected by the lock?
For efficiency want it to be short, but this leads to more complex programming
Think of Java synchronized blocks compared with synchronized methods
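The comparison with Java can be made concrete with a small sketch. CoarseBranch uses synchronized methods, so every operation locks the whole object; FineBranch uses synchronized blocks on separate lock objects, so deposits to different accounts do not wait for each other. Both classes are hypothetical.

```java
// Coarse granularity: one lock (the object itself) covers both accounts.
class CoarseBranch {
    private long a, b;

    synchronized void depositA(long amount) { a += amount; }
    synchronized void depositB(long amount) { b += amount; }
    synchronized long total() { return a + b; }
}

// Fine granularity: one lock per account, held only around the update.
class FineBranch {
    private final Object lockA = new Object();
    private final Object lockB = new Object();
    private long a, b;

    void depositA(long amount) {
        synchronized (lockA) { a += amount; }
    }

    void depositB(long amount) {
        synchronized (lockB) { b += amount; }
    }

    // Needs both locks; always taken in the same order (A then B).
    long total() {
        synchronized (lockA) {
            synchronized (lockB) {
                return a + b;
            }
        }
    }
}

class GranularityDemo {
    // One thread deposits into a, another into b, 10,000 times each.
    static long run() {
        FineBranch branch = new FineBranch();
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) branch.depositA(1);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) branch.depositB(1);
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return branch.total();
    }
}
```

The finer version allows more concurrency but is more complex: total() has to take two locks, and taking them in a fixed order is what keeps it from deadlocking against another method that also needs both.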

Drawbacks of Locking

Locking is overly restrictive on the degree of concurrency
Deadlocks produce unnecessary aborts
Lock maintenance is an overhead that may not be required
Known as a Pessimistic approach

Other Approaches to Concurrency



Locks are traditional for database systems and historically for most distributed systems - the most common way of providing concurrency control
In many cases transactions are not going to interfere and the locking mechanism is a high overhead, so there are other approaches
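As one illustration of an approach without locks, an update can be computed from an unlocked read and committed only if the value has not changed in the meantime, retrying otherwise. The sketch below assumes a single balance held in a java.util.concurrent.atomic.AtomicLong; the OptimisticAccount class is hypothetical and far simpler than the validation schemes used for whole transactions.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical account updated without locks.
class OptimisticAccount {
    private final AtomicLong balance;
    private final AtomicLong retries = new AtomicLong();

    OptimisticAccount(long openingBalance) {
        balance = new AtomicLong(openingBalance);
    }

    void increaseByTenPercent() {
        while (true) {
            long seen = balance.get();          // read without locking
            long updated = seen * 11 / 10;      // compute the new value
            // Commit only if nobody changed the balance since it was read.
            if (balance.compareAndSet(seen, updated)) return;
            retries.incrementAndGet();          // conflict detected: retry
        }
    }

    long getBalance() { return balance.get(); }
    long getRetries() { return retries.get(); }
}

class OptimisticDemo {
    // Two concurrent 10% increases on an opening balance of 200.
    static long run() {
        OptimisticAccount b = new OptimisticAccount(200);
        Thread t = new Thread(b::increaseByTenPercent);
        Thread u = new Thread(b::increaseByTenPercent);
        t.start();
        u.start();
        try {
            t.join();
            u.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return b.getBalance();
    }

    public static void main(String[] args) {
        System.out.println(run()); // 242
    }
}
```

If the two updates do not collide, neither pays any locking cost; if they do, the loser simply recomputes from the new balance, so no update is lost.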

Optimistic - monitor for conflicts, fix if found, e.g. collaborative editing

History and Examples

The problems are not new and are well understood
Solutions were developed decades ago and are still appropriate although the platforms may have evolved
Early 1960s: SABRE airline reservations, now Travelocity
1969: IBM CICS

Before the WWW



TP Monitors or on-line TP = OLTP
CICS, IMS, Tuxedo, ACMS...
Mainframe directly supports terminals
Or ATM machines talk to accounts via request controller

Early WWW
To support processing of transactions arriving over the WWW, application servers were invented to ease the programmer's job
  these talked to the legacy TP monitors
But very quickly they both evolved into combined functionality of app servers and TP monitors
All happening at the same time as MOM for enterprise integration, Web services and the SOA fashion
Large distributed systems combine parts of all these

Application Servers

The term middleware is not usually used in this context - but TP infrastructure plays the same role as middleware in other distributed systems
Programmer writes an app to process a single request
App Server scales it up to a large, distributed system
Enterprise Java Beans, IBM Websphere, Microsoft .NET (COM+), Oracle Weblogic and Application Server

IBM CICS

1969: originally for Public Utility companies who wanted their telephone staff to be able to make real time changes to customer data and not to have to wait for overnight batch jobs to run
Development moved to Hursley UK in 1974
Runs on z/OS mainframes and is very deeply entwined with the OS (ran efficiently on 32kB RAM)
Long history and commercial importance of proven performance and resilience

CICS

90% of Fortune 500 companies still use CICS on z/OS mainframes for core business
Can still program it in Cobol, but now has all the modern connectivity - EJB, Web services etc
Some modularity that fits with the mainframe hardware

Summary

Idea of a Transaction, ACID
Some idea about this mysterious bit of software kit besides the OS, DB, network and application
Performance and Resilience are crucial

For scalability - will need to go to distributed transactions - next week
