
Transaction processing system

From Wikipedia, the free encyclopedia


A transaction processing system (TPS) is a type of information system. TPSs collect, store, modify, and retrieve the transactions of an organization. A transaction is an event that generates or modifies data that is eventually stored in an information system. To be considered a transaction processing system, the system must pass the ACID test.

The essence of a transaction program is that it manages data that must be left in a consistent state. For example, if an electronic payment is made, the amount must be both withdrawn from one account and added to the other; it cannot complete only one of those steps. Either both must occur, or neither. If a failure prevents transaction completion, the partially executed transaction must be 'rolled back' by the TPS.

While this type of integrity must also be provided for batch transaction processing, it is particularly important for online processing. If, for example, an airline seat reservation system is accessed by multiple operators, then after an empty-seat inquiry the seat reservation data must be locked until the reservation is made; otherwise another user may get the impression that a seat is still free while it is actually being booked. Without proper transaction monitoring, double bookings may occur.

Other transaction monitor functions include deadlock detection and resolution (deadlocks may be inevitable in certain cases of cross-dependence on data) and transaction logging (in 'journals') for 'forward recovery' in case of massive failures.

Transaction processing is not limited to application programs. The journaled file system provided with IBM's AIX Unix operating system employs similar techniques to maintain file system integrity, including a journal.
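To make the all-or-nothing payment example concrete, here is a minimal sketch using Python's built-in sqlite3 module, whose connections support transactions natively. The table, account names, and balance check are illustrative, not part of any real TPS:

```python
import sqlite3

# Illustrative schema: two accounts with integer balances.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('savings', 1000), ('checking', 0)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Withdraw from src and deposit to dst as one unit, or not at all."""
    try:
        with conn:  # opens a transaction; commits on success, rolls back on error
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            # Illustrative integrity check: no account may go negative.
            row = conn.execute("SELECT balance FROM accounts WHERE name = ?",
                               (src,)).fetchone()
            if row[0] < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False  # the partial withdrawal was rolled back automatically

ok1 = transfer(conn, "savings", "checking", 700)  # completes both steps
ok2 = transfer(conn, "savings", "checking", 700)  # would overdraw; rolled back
```

The second call demonstrates the rollback: the withdrawal that had already executed is undone, leaving the database in the state it was in before the transaction began.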


Types

Contrasted with batch processing

Batch processing is not transaction processing. Batch processing involves processing several transactions at the same time, and the results of each transaction are not immediately available when the transaction is being entered;[1] there is a time delay. Transactions are accumulated over a period (say, a day) and then processed as a batch, typically after business hours.

Real-time and batch processing

There are a number of differences between real-time and batch processing. These are outlined below:

Each transaction in real-time processing is unique; it is not part of a group of transactions, even though those transactions are processed in the same manner.
Transactions in real-time processing are stand-alone, both in their entry to the system and in the handling of output.
Real-time processing requires the master file to be available for updating and reference more often than batch processing does. The database is not accessible all of the time for batch processing.
Real-time processing has fewer errors than batch processing, as transaction data is validated and entered immediately. With batch processing, the data is organised and stored before the master file is updated; errors can occur during these steps.
Infrequent errors may occur in real-time processing; however, they are often tolerated, as it is not practical to shut down the system for them.
More computer operators are required in real-time processing, as the operations are not centralised.
It is more difficult to maintain a real-time processing system than a batch processing system.

Features

Rapid response

Fast performance with a rapid response time is critical. Businesses cannot afford to have customers waiting for a TPS to respond; the turnaround time from the input of the transaction to the production of the output must be a few seconds or less.

Reliability

Many organizations rely heavily on their TPS; a breakdown will disrupt operations or even stop the business. For a TPS to be effective, its failure rate must be very low. If a TPS does fail, then quick and accurate recovery must be possible. This makes well-designed backup and recovery procedures essential.

Inflexibility

A TPS requires every transaction to be processed in the same way, regardless of the user, the customer, or the time of day. If a TPS were flexible, there would be too many opportunities for non-standard operations. For example, a commercial airline needs to accept airline reservations consistently from a range of travel agents; accepting different transaction data from different travel agents would be a problem.

Controlled processing

The processing in a TPS must support an organization's operations. For example, if an organization allocates roles and responsibilities to particular employees, then the TPS should enforce and maintain this requirement. An example of this is an ATM transaction.

ACID test properties: first definition

Atomicity

Main article: Atomicity (database systems)

A transaction's changes to the state are atomic: either all happen or none happen. These changes include database changes, messages, and actions on transducers.[2]

Consistency

A transaction is a correct transformation of the state. The actions taken as a group do not violate any of the integrity constraints associated with the state. This requires that the transaction be a correct program.[2]
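As a sketch of how an integrity constraint can be enforced by the database itself, SQLite's CHECK constraints reject any transaction that would violate the rule; the table and the non-negative-balance rule here are illustrative:

```python
import sqlite3

# Illustrative constraint: no account balance may ever go negative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT, balance INTEGER CHECK (balance >= 0))")
conn.execute("INSERT INTO accounts VALUES ('a', 100)")
conn.commit()

try:
    with conn:  # transaction: commits on success, rolls back on error
        conn.execute("UPDATE accounts SET balance = balance - 200 WHERE name = 'a'")
except sqlite3.IntegrityError:
    pass  # the update violated the constraint, so the transaction was refused

balance = conn.execute("SELECT balance FROM accounts").fetchone()[0]  # unchanged
```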

Isolation
Even though transactions execute concurrently, it appears to each transaction T, that others executed either before T or after T, but not both.[2]

Durability
Once a transaction completes successfully (commits), its changes to the state survive failures.[2]

Concurrency

Concurrency control ensures that two users cannot change the same data at the same time; that is, one user cannot change a piece of data before another user has finished with it. For example, if an airline ticket agent starts to reserve the last seat on a flight, then another agent cannot tell another passenger that a seat is available.
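The seat-reservation scenario can be sketched with a simple pessimistic lock that makes the inquiry and the booking a single indivisible step; the seat map and agent names are illustrative, not a real airline API:

```python
import threading

seats = {"12A": None}          # seat -> passenger, None if free (illustrative)
seat_lock = threading.Lock()   # serializes inquiry + booking as one step

def reserve(seat, passenger):
    """Check and book atomically so two agents cannot sell the same seat."""
    with seat_lock:
        if seats[seat] is None:
            seats[seat] = passenger
            return True
        return False           # another agent already booked it

results = []
threads = [threading.Thread(target=lambda p: results.append(reserve("12A", p)),
                            args=(name,))
           for name in ("alice", "bob")]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Exactly one of the two concurrent reservations succeeds.
```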

Storing and retrieving

Storing and retrieving information from a TPS must be efficient and effective. The data are stored in warehouses or other databases, and the system must be well designed for its backup and recovery procedures.

Databases and files

The storage and retrieval of data must be accurate, as the data are used many times throughout the day. A database is an organized collection of data, which stores an organization's accounting and operational records. Access to sensitive data is usually restricted, so users typically see only a limited view of the database. Databases are designed using hierarchical, network, or relational structures; each structure is effective in its own sense.

Hierarchical structure: organizes data in a series of levels, hence the name. Its top-to-bottom, tree-like structure consists of nodes and branches; each child node has branches and is linked to only one higher-level parent node.
Network structure: similar to hierarchical structures, network structures also organize data using nodes and branches, but unlike hierarchical structures, each child node can be linked to multiple higher-level parent nodes.
Relational structure: unlike network and hierarchical structures, a relational database organizes its data in a series of related tables. This gives flexibility, as relationships between the tables can be built.
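The relational structure can be sketched with two related tables joined through a foreign key; the table and column names here are illustrative:

```python
import sqlite3

# Two related tables: each order row refers back to a customer row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customers(id),
                         item TEXT);
    INSERT INTO customers VALUES (1, 'Acme Ltd');
    INSERT INTO orders VALUES (10, 1, 'widget'), (11, 1, 'gadget');
""")

# The relationship between the tables is exercised with a JOIN.
rows = conn.execute("""
    SELECT c.name, o.item
    FROM customers c JOIN orders o ON o.customer_id = c.id
    ORDER BY o.id
""").fetchall()
```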

(Diagrams omitted: a relational structure; a hierarchical structure; a network structure.)

The following features are included in real-time transaction processing systems:

Good data placement: the database should be designed for the access patterns of many simultaneous users.
Short transactions: short transactions enable quick processing. This avoids concurrency problems and paces the systems.
Real-time backup: backup should be scheduled during periods of low activity to prevent lag on the server.
High normalization: this lowers redundant information to increase speed and improve concurrency; it also improves backups.
Archiving of historical data: uncommonly used data are moved into other databases or backed-up tables. This keeps tables small and also improves backup times.
Good hardware configuration: hardware must be able to handle many users and provide quick response times.

In a TPS, there are five different types of files. The TPS uses the files to store and organize its transaction data:

Master file: contains information about an organization's business situation. Most transactions and databases are stored in the master file.
Transaction file: the collection of transaction records. It helps to update the master file and also serves as an audit trail and transaction history.
Report file: contains data that has been formatted for presentation to a user.
Work file: temporary files in the system used during processing.
Program file: contains the instructions for the processing of data.

Data warehouse

Main article: Data warehouse

A data warehouse is a database that collects information from different sources. Data gathered from real-time transactions can be analysed efficiently if it is stored in a data warehouse. It provides data that are consolidated, subject-oriented, historical, and read-only:

Consolidated: data are organised with consistent naming conventions, measurements, attributes, and semantics. This allows data from across the organization to be used effectively in a consistent manner.
Subject-oriented: large amounts of data are stored across an organization; some of it can be irrelevant for reports and makes querying the data difficult. A data warehouse organizes only the key business information from operational sources so that it is available for analysis.
Historical: real-time TPS data represent the current value at any time; an example could be stock levels. If past data are kept, querying the database can return a different response. A data warehouse stores a series of snapshots of an organisation's operational data generated over a period of time.
Read-only: once data are moved into a data warehouse, they become read-only unless they were incorrect. Since the data represent a snapshot of a certain time, they must never be updated; the only operations that occur in a data warehouse are loading and querying data.

Backup procedures

(Diagram omitted: a dataflow diagram of backup and recovery procedures.)

Since business organizations have become very dependent on TPSs, a breakdown in a TPS may stop the business's regular routines and thus its operation for a certain amount of time. In order to prevent data loss and minimize disruptions when a TPS breaks down, a well-designed backup and recovery procedure is put into use. The recovery process can rebuild the system when it goes down.

Recovery process

A TPS may fail for many reasons, including system failure, human error, hardware failure, incorrect or invalid data, computer viruses, software application errors, and natural or man-made disasters. As it is not possible to prevent all failures, a TPS must be able to cope with them by detecting and correcting errors when they occur. When the system fails, a TPS goes through a recovery of the database, which involves the backup, journal, checkpoint, and recovery manager:

Journal: a journal maintains an audit trail of transactions and database changes. Transaction logs and database change logs are used: a transaction log records all the essential data for each transaction, including data values, time of transaction, and terminal number; a database change log contains before and after copies of records that have been modified by transactions.

Checkpoint: the purpose of checkpointing is to provide a snapshot of the data within the database. A checkpoint, in general, is any identifier or other reference that identifies the state of the database at a point in time. Modifications to database pages are performed in memory and are not necessarily written to disk after every update. Therefore, the database system must periodically perform a checkpoint to write these in-memory updates to the storage disk. Writing these updates to disk creates a point in time from which the database system can apply the changes contained in a transaction log during recovery after an unexpected shutdown or crash. If a checkpoint is interrupted and a recovery is required, the database system must start recovery from a previous successful checkpoint. Checkpointing can be either transaction-consistent or non-transaction-consistent (also called fuzzy checkpointing). Transaction-consistent checkpointing produces a persistent database image that is sufficient to recover the database to the state that was externally perceived at the moment the checkpointing started. Non-transaction-consistent checkpointing results in a persistent database image that is insufficient on its own to recover the database state; additional information is needed, typically contained in transaction logs. Put another way, a transaction-consistent checkpoint is a consistent database that does not necessarily include all the latest committed transactions, but in which all modifications made by transactions committed when checkpoint creation started are fully present. A non-consistent checkpoint is not necessarily a consistent database and cannot be recovered to one without all log records generated for the transactions open at the checkpoint. Depending on the database management system, a checkpoint may incorporate indexes, storage pages (user data), or both. If indexes are not incorporated into the checkpoint, they must be re-created when the database is restored from the checkpoint image.

Recovery manager: a recovery manager is a program that restores the database to a correct condition from which transaction processing can restart. Depending on how the system failed, one of two different recovery procedures is used. Generally, the procedure involves restoring data from a backup device and then rerunning the transaction processing. The two types of recovery are backward recovery and forward recovery:

Backward recovery: used to undo unwanted changes to the database. It reverses the changes made by transactions that have been aborted.
Forward recovery: starts with a backup copy of the database. The transactions recorded in the journal between the time the backup was made and the present are then reprocessed. It is much faster and more accurate than reprocessing every transaction from scratch.

See also: Checkpoint restart
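Forward recovery can be sketched in a few lines: restore the last backup image, then replay the after-image journal recorded since the backup. The data structures here are illustrative stand-ins for a real backup file and journal:

```python
# Snapshot taken at backup time, and after-images journaled since then.
backup = {"balance": 100}
journal = [("balance", 150), ("balance", 90)]  # (key, value-after-change)

def forward_recover(backup, journal):
    """Rebuild the database: start from the backup, reapply logged changes."""
    db = dict(backup)                 # restore the backup copy
    for key, after_value in journal:  # replay committed changes in order
        db[key] = after_value
    return db

db = forward_recover(backup, journal)  # reflects all changes up to the failure
```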

Types of back-up procedures

There are two main types of back-up procedures: grandfather-father-son and partial backups.

Grandfather-father-son

This procedure keeps at least three generations of backup master files: the most recent backup is the son, and the oldest is the grandfather. It is commonly used for batch transaction processing systems with magnetic tape. If the system fails during a batch run, the master file is re-created by restoring the son backup and then restarting the batch. However, if the son backup fails, is corrupted, or is destroyed, then the next generation up (the father) is required. Likewise, if that fails, the generation above it (the grandfather) is required. Of course, the older the generation, the more the data may be out of date. Organizations can have up to twenty generations of backups.
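The rotation can be sketched with a bounded queue that keeps only the three most recent generations; the filenames are illustrative:

```python
from collections import deque

# Keep the three most recent generations: [grandfather, father, son].
generations = deque(maxlen=3)

def take_backup(master_file_snapshot):
    # Appending past maxlen silently drops the oldest generation.
    generations.append(master_file_snapshot)

for day in range(1, 6):                  # five nightly batch runs
    take_backup(f"master_day{day}.bak")  # illustrative filename

son = generations[-1]          # most recent backup, restored first
grandfather = generations[0]   # oldest surviving backup, the last resort
```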

Partial backups

This occurs when only parts of the master file are backed up. The master file is usually backed up to magnetic tape at regular intervals, which could be daily, weekly, or monthly. Transactions completed since the last backup are stored separately in files called journals, or journal files. The master file can be re-created from the journal files and the backup tape if the system fails.

Updating in a batch

This is used when transactions are recorded on paper (such as bills and invoices) or stored on magnetic tape. Transactions are collected and updated as a batch when it is convenient or economical to process them. Historically, this was the most common method, as the information technology to allow real-time processing did not exist. The two stages in batch processing are:

Collecting and storing the transaction data in a transaction file; this involves sorting the data into sequential order.
Processing the data by updating the master file, which can be difficult; this may involve data additions, updates, and deletions that need to happen in a certain order. If an error occurs, the entire batch fails.

Updating in batch requires sequential access; since it uses a magnetic tape, this is the only way to access the data. A batch run starts at the beginning of the tape and reads it in the order it was stored, so it is very time-consuming to locate specific transactions. The information technology used includes a secondary storage medium that can store large quantities of data inexpensively (thus the common choice of magnetic tape). The software used to collect the data does not have to be online; it does not even need a user interface.
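The two stages above can be sketched as a single sequential pass: transactions are first sorted into the master file's key order, then the sorted master file is read once to produce a new master file, much as a tape-based system would. The record layouts are illustrative:

```python
# Master file: (account, balance) records in sequential key order.
master = [(1, 100), (2, 200), (3, 300)]

# Stage 1: collect transactions and sort them into sequential order.
transactions = sorted([(3, -50), (1, +25)])  # (account, amount)

def batch_update(master, transactions):
    """Stage 2: one sequential pass over the master file applies all updates."""
    pending = {}
    for key, delta in transactions:
        pending[key] = pending.get(key, 0) + delta
    # Read every master record in order, writing the updated record out.
    return [(key, balance + pending.get(key, 0)) for key, balance in master]

new_master = batch_update(master, transactions)
```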

Updating in real-time

This is the immediate processing of data. It provides instant confirmation of a transaction and may involve a large number of users simultaneously performing transactions that change data. Because of advances in technology (such as increases in the speed of data transmission and larger bandwidth), real-time updating is now possible. The steps in a real-time update involve sending the transaction data to an online database in a master file. The person providing the information is usually able to help with error correction and receives confirmation of the transaction's completion.

Updating in real-time uses direct access of data. This occurs when data are accessed without reading through previous data items. The storage device stores data in a particular location based on a mathematical procedure; the same calculation then yields an approximate location of the data. If the data are not found at this location, successive locations are searched until they are found.
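The direct-access scheme described above amounts to hashing with linear probing: a hash function computes an approximate slot, and successive slots are probed until the record is found. A minimal sketch, with illustrative table size and keys:

```python
SLOTS = 8
table = [None] * SLOTS  # fixed-size store, as on a direct-access device

def store(key, value):
    i = hash(key) % SLOTS                          # the "mathematical procedure"
    while table[i] is not None and table[i][0] != key:
        i = (i + 1) % SLOTS                        # probe successive locations
    table[i] = (key, value)

def fetch(key):
    i = hash(key) % SLOTS
    while table[i] is not None:                    # stop at the first empty slot
        if table[i][0] == key:
            return table[i][1]
        i = (i + 1) % SLOTS
    return None                                    # record not present

store("ACCT-1", 500)
store("ACCT-2", 750)
value = fetch("ACCT-2")
```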

The information technology used could be a secondary storage medium that can store large amounts of data and provide quick access (thus the common choice of a magnetic disk). It requires a user-friendly interface, as rapid response time is important.

Reservation systems

Reservation systems are used for any type of business where a service or a product is set aside for a customer to use at a future time.

References

1. ^ TPS Example
2. ^ WICS TP, Chapter 2

In computer science, transaction processing is information processing that is divided into individual, indivisible operations called transactions. Each transaction must succeed or fail as a complete unit; it can never remain in an intermediate state. A transaction requires an acknowledgment as feedback that it has been completed.

Description

Transaction processing is designed to maintain a computer system (typically a database or some modern filesystems) in a known, consistent state, by ensuring that any operations carried out on the system that are interdependent are either all completed successfully or all canceled successfully.

For example, consider a typical banking transaction that involves moving $700 from a customer's savings account to a customer's checking account. This transaction is a single operation in the eyes of the bank, but it involves at least two separate operations in computer terms: debiting the savings account by $700, and crediting the checking account by $700. If the debit operation succeeds but the credit does not (or vice versa), the books of the bank will not balance at the end of the day. There must therefore be a way to ensure that either both operations succeed or both fail, so that there is never any inconsistency in the bank's database as a whole. Transaction processing is designed to provide this.

Transaction processing allows multiple individual operations to be linked together automatically as a single, indivisible transaction. The transaction-processing system ensures that either all operations in a transaction are completed without error, or none of them are. If some of the operations are completed but errors occur when the others are attempted, the transaction-processing system rolls back all of the operations of the transaction (including the successful ones), thereby erasing all traces of the transaction and restoring the system to the consistent, known state that it was in before processing of the transaction began. If all operations of a transaction are completed successfully, the transaction is committed by the system, and all changes to the database are made permanent; the transaction cannot be rolled back once this is done.
Transaction processing guards against hardware and software errors that might leave a transaction partially completed, with the system left in an unknown, inconsistent state. If the computer system crashes in the middle of a transaction, the transaction processing system guarantees that all operations in any uncommitted (i.e., not completely processed) transactions are cancelled.

Transactions are processed in a strict chronological order. If transaction n+1 intends to touch the same portion of the database as transaction n, transaction n+1 does not begin until transaction n is committed. Before any transaction is committed, all other transactions affecting the same part of the system must also be committed; there can be no holes in the sequence of preceding transactions.

Methodology

The basic principles of all transaction-processing systems are the same. However, the terminology may vary from one transaction-processing system to another, and the terms used below are not necessarily universal.

Rollback

Transaction-processing systems ensure database integrity by recording intermediate states of the database as it is modified, then using these records to restore the database to a known state if a transaction cannot be committed. For example, copies of information on the database prior to its modification by a transaction are set aside by the system before the transaction can make any modifications (this is sometimes called a before image). If any part of the transaction fails before it is committed, these copies are used to restore the database to the state it was in before the transaction began.

Rollforward

It is also possible to keep a separate journal of all modifications to a database (sometimes called after images); this is not required for rollback of failed transactions, but it is useful for updating the database in the event of a database failure, so some transaction-processing systems provide it. If the database fails entirely, it must be restored from the most recent back-up. The back-up will not reflect transactions committed since the back-up was made. However, once the database is restored, the journal of after images can be applied to the database (rollforward) to bring the database up to date. Any transactions in progress at the time of the failure can then be rolled back. The result is a database in a consistent, known state that includes the results of all transactions committed up to the moment of failure.

Deadlocks

In some cases, two transactions may, in the course of their processing, attempt to access the same portion of a database at the same time, in a way that prevents them from proceeding.
For example, transaction A may access portion X of the database, and transaction B may access portion Y of the database. If, at that point, transaction A then tries to access portion Y of the database while transaction B tries to access portion X, a deadlock occurs, and neither transaction can move forward. Transaction-processing systems are designed to detect these deadlocks when they occur. Typically both transactions will be cancelled and rolled back, and then they will be started again in a different order, automatically, so that the deadlock doesn't occur again. Or sometimes, just one of the deadlocked transactions will be cancelled, rolled back, and automatically restarted after a short delay.

Deadlocks can also occur among three or more transactions. The more transactions involved, the more difficult they are to detect, to the point that transaction processing systems find there is a practical limit to the deadlocks they can detect.
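The text above describes detecting deadlocks after they occur; a complementary technique, not mentioned in the source, is to prevent the circular wait entirely by acquiring locks in a fixed global order. A minimal sketch, with illustrative names, where transaction B wants portion Y first but is forced to lock X before Y:

```python
import threading

lock_x = threading.Lock()  # protects portion X of the database
lock_y = threading.Lock()  # protects portion Y of the database

def run_transaction(name, wants, log):
    """Acquire the wanted locks in the fixed global order X < Y, then work."""
    ordered = sorted(wants, key=lambda pair: pair[0])  # sort by portion name
    with ordered[0][1]:
        with ordered[1][1]:
            log.append(name)  # critical section touching both portions

log = []
a = threading.Thread(target=run_transaction,
                     args=("A", [("X", lock_x), ("Y", lock_y)], log))
b = threading.Thread(target=run_transaction,
                     args=("B", [("Y", lock_y), ("X", lock_x)], log))
a.start(); b.start()
a.join(); b.join()
# Both transactions complete: the circular wait can never form.
```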

Compensating transaction

In systems where commit and rollback mechanisms are not available or are undesirable, a compensating transaction is often used to undo failed transactions and restore the system to a previous state.

ACID criteria (Atomicity, Consistency, Isolation, Durability)

Main article: ACID

Transaction processing has these benefits:

It allows sharing of computer resources among many users.
It shifts the time of job processing to when the computing resources are less busy.
It avoids idling the computing resources without minute-by-minute human interaction and supervision.
It is used on expensive classes of computers to help amortize the cost by keeping high rates of utilization of those expensive resources.

A transaction is an atomic unit of processing.

Implementations

Main article: Transaction processing system

Standard transaction-processing software, notably IBM's Information Management System, was first developed in the 1960s, and was often closely coupled to particular database management systems. Client-server computing implemented similar principles in the 1980s with mixed success. However, in more recent years, the distributed client-server model has become considerably more difficult to maintain. As the number of transactions grew in response to various online services (especially the Web), a single distributed database was not a practical solution. In addition, most online systems consist of a whole suite of programs operating together, as opposed to a strict client-server model where a single server could handle the transaction processing.

Today a number of transaction processing systems are available that work at the inter-program level and that scale to large systems, including mainframes. An important open industry standard is X/Open Distributed Transaction Processing (DTP) (see JTA). However, proprietary transaction-processing environments such as IBM's CICS are still very popular, although CICS has evolved to include open industry standards as well.

A modern transaction processing implementation combines elements of object-oriented persistence with traditional transaction monitoring. One such implementation is the commercial DTS/S1 product from Obsidian Dynamics; another is the open-source product db4o.


IBM Concepts and Planning: Transaction processing systems
Transaction processing is supported by programs that are called transaction processing systems. Transaction processing systems provide three functional areas:

System runtime functions: transaction processing systems provide an execution environment that ensures the integrity, availability, and security of data. They also ensure fast response time and high transaction throughput.
System administration functions: transaction processing systems provide administrative support that lets users configure, monitor, and manage their transaction systems.
Application development functions: transaction processing systems provide functions for use in custom business applications, including functions to access data, to perform intercomputer communications, and to design and manage the user interface.

The services of a transaction processing system runtime environment include the following:

Scheduling and load balancing: controlling the rate and order in which tasks are processed to give higher-priority tasks the best response times and to adapt to the availability of application servers and other system resources.
Managing system resources: maintaining a pool of operating system resources to be used for transaction processing, loading application programs, and acquiring and releasing storage.
Monitoring: monitoring the progress of tasks, suspending those waiting for input, adjusting task priorities, and resolving problems.
Managing data: obtaining the data needed by tasks, coordinating resource managers (such as file servers and database managers), locking data for update, and logging changes.
Managing communications: monitoring communications with users and between servers and other systems, starting communications sessions as needed, managing data handling and conversion, and routing data to the right destination.
Time management: managing transaction processing in relation to the passage of time, starting tasks at predefined times, logging the date and time of events to disk, and regularly controlling parts of the business system to provide degrees of automation.

Services for systems administration and application development are described in subsequent sections.

TXSeries for Multiplatforms

TXSeries for Multiplatforms is a transaction processing system that provides the transaction processing facilities that enable application programs to be implemented as transactions. The work of many users can be processed at the same time, by a single server or by multiple servers. To users, TXSeries for Multiplatforms provides seemingly dedicated processing of their work, with security of access, reliability of data update, and the other benefits that transactions provide. It hides the complexity of these facilities from user applications by providing standard APIs.

In CICS, the transaction processing system is implemented by developing one or more CICS regions, which are individual administrative units that support multiple concurrent application programs. CICS regions do the following:

Perform work that one or more clients have requested. For example, a user application running on one machine (the client machine) requests work to be done on another machine (the server machine). Typically, the region or application server accesses some data, applies some business logic to it, then replies to the client. Such service is provided by running one or more programs on behalf of a transaction.
Maintain and use a pool of multithreaded processes, each of which provides a complete environment for running a transaction. In CICS, such processes are called application servers.
Coordinate all the facilities that its application servers need. For example, regions coordinate the security of the application servers, obtain the data and storage that they need, and log their transactions. Among other advantages, multiple CICS regions can be used to provide a distributed transaction processing environment for greater throughput and management of workload.
Subcontract many services to other servers that can do the work better, and provide extra services that are needed for integrated transaction processing. For example, regions can use Structured File Server (SFS) files or DB2 databases to store and manage user data. They also provide services to locate and interface with the resource managers, record ongoing changes to data, and coordinate the update of data across multiple resource managers.

Definition: A Transaction Processing System (TPS) is a type of information system that collects, stores, modifies and retrieves the data transactions of an enterprise. A transaction is any event, passing the ACID test, in which data is generated or modified before being stored in an information system.

Features of Transaction Processing Systems

The success of commercial enterprises depends on the reliable processing of transactions, to ensure that customer orders are met on time and that partners and suppliers are paid and can make payment. The field of transaction processing has therefore become a vital part of effective business management, led by such organisations as the Association for Work Process Improvement and the Transaction Processing Performance Council.

Transaction processing systems offer enterprises the means to rapidly process transactions, ensuring the smooth flow of data and the progression of processes throughout the enterprise. Typically, a TPS will exhibit the following characteristics:

Rapid Processing

The rapid processing of transactions is vital to the success of any enterprise, now more than ever, in the face of advancing technology and customer demand for immediate action. TPS systems are designed to process transactions virtually instantly, to ensure that customer data is available to the processes that require it.

Reliability

Similarly, customers will not tolerate mistakes. TPS systems must be designed to ensure not only that transactions never slip past the net, but also that the systems themselves remain operational permanently. TPS systems are therefore designed to incorporate comprehensive safeguards and disaster recovery systems. These measures keep the failure rate well within tolerance levels.

Standardisation

Transactions must be processed in the same way each time to maximise efficiency. To ensure this, TPS interfaces are designed to acquire identical data for each transaction, regardless of the customer.

Controlled Access

Since TPS systems can be such a powerful business tool, access must be restricted to only those employees who require their use. Restricted access to the system ensures that employees who lack the skills and ability to control it cannot influence the transaction process.

Transaction Processing Qualifiers

In order to qualify as a TPS, transactions made by the system must pass the ACID test. The ACID test refers to the following four prerequisites:

Atomicity

Atomicity means that a transaction is either completed in full or not at all. For example, if funds are transferred from one account to another, this only counts as a bona fide transaction if both the withdrawal and the deposit take place. If one account is debited and the other is not credited, it does not qualify as a transaction. TPS systems ensure that transactions take place in their entirety.

Consistency

TPS systems exist within a set of operating rules (or integrity constraints). If an integrity constraint states that all transactions in a database must have a positive value, any transaction with a negative value will be refused.

Isolation

Transactions must appear to take place in isolation. For example, when a fund transfer is made between two accounts, the debiting of one and the crediting of the other must appear to take place simultaneously. The funds cannot be credited to an account before they are debited from another.

Durability

Once transactions are completed they cannot be undone. To ensure that this is the case even if the TPS suffers a failure, a log will be created to document all completed transactions.
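The atomicity and consistency requirements above can be sketched with Python's standard sqlite3 module, whose transactions roll back automatically on failure. The account names, balances and the no-negative-balance constraint below are invented for illustration.

```python
import sqlite3

# In-memory database with two illustrative accounts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("checking", 100), ("savings", 0)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Debit src and credit dst as one atomic unit of work."""
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            # Integrity constraint: no account may go negative.
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
    except ValueError:
        pass  # the partial debit was rolled back; balances are unchanged

transfer(conn, "checking", "savings", 30)   # succeeds: balances become 70 / 30
transfer(conn, "checking", "savings", 500)  # fails: rolled back, still 70 / 30
```

The second transfer debits the account, violates the constraint, and is undone in full: the debit never becomes visible on its own.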

These four conditions ensure that TPS systems carry out their transactions in a methodical, standardised and reliable manner.

Types of Transactions

While the transaction process must be standardised to maximise efficiency, every enterprise requires a tailored transaction process that aligns with its business strategies and processes. For this reason, there are two broad types of transaction:

Batch Processing

Batch processing is a resource-saving transaction type that stores data for processing at pre-defined times. Batch processing is useful for enterprises that need to process large amounts of data using limited resources. Examples of batch processing include credit card transactions, for which the transactions are processed monthly rather than in real time. Credit card transactions need only be processed once a month in order to produce a statement for the customer, so batch processing saves IT resources from having to process each transaction individually.

Real-Time Processing

In many circumstances the primary factor is speed. For example, when a bank customer withdraws a sum of money from his or her account, it is vital that the transaction be processed and the account balance updated as soon as possible, allowing both the bank and the customer to keep track of funds. Further information regarding transaction processing systems can be found at the University of Illinois and Johns Hopkins University.
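The contrast between the two types can be sketched as follows. The charge dates and amounts are invented for illustration; the point is that batch processing defers work to one settlement run while real-time processing applies each transaction as it occurs.

```python
from datetime import date

# Hypothetical card charges captured during the month: (day, amount in cents).
charges = [(date(2024, 5, 3), 1250),
           (date(2024, 5, 11), 560),
           (date(2024, 5, 27), 3100)]

def real_time_process(charge, balance):
    """Real-time: apply a charge to the balance the moment it occurs."""
    _day, amount = charge
    return balance + amount

def batch_process(charges):
    """Batch: defer all charges and settle them together at statement time."""
    return sum(amount for _day, amount in charges)

# Real-time: three separate updates, one per charge as it arrives.
balance = 0
for charge in charges:
    balance = real_time_process(charge, balance)

# Batch: a single monthly settlement producing the same statement total.
statement_total = batch_process(charges)
assert balance == statement_total == 4910
```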

Introduction
Transaction management is one of the most crucial requirements for enterprise application development. Most large enterprise applications in the domains of finance, banking and electronic commerce rely on transaction processing for delivering their business functionality. Given the complexity of today's business requirements, transaction processing occupies one of the most complex segments of enterprise-level distributed applications to build, deploy and maintain. This article walks the reader through the following:

What is a transaction? What is ACID?

What are the issues in building transactional applications? Why is transaction management middleware important?

What is the architecture of a typical transaction processing application? What are the responsibilities of various components of this architecture?

What are the concepts involved with transaction management systems?

What are the standards and technologies in the transaction management domain?

This article is not specific to any product, and so attempts to be generic while describing various issues and concepts. This article does not aim to compare various transaction processing technologies/standards, and offers a study only.

What is a Transaction?
Enterprise applications often require concurrent access to distributed data shared amongst multiple components, to perform operations on data. Such applications should maintain integrity of data (as defined by the business rules of the application) under the following circumstances:

distributed access to a single resource of data, and access to distributed resources from a single application component.

In such cases, it may be required that a group of operations on (distributed) resources be treated as one unit of work. In a unit of work, all the participating operations should either succeed or fail and recover together. This problem is more complicated when

a unit of work is implemented across a group of distributed components operating on data from multiple resources, and/or

the participating operations are executed sequentially or in parallel threads, requiring coordination and/or synchronization.

In either case, it is required that the success or failure of a unit of work be maintained by the application. In case of a failure, all the resources should bring the data back to its previous state (i.e., the state prior to the commencement of the unit of work). The concept of a transaction, and of a transaction manager (or a transaction processing service), simplifies the construction of such enterprise-level distributed applications while maintaining the integrity of data in a unit of work. A transaction is a unit of work that has the following properties:

ATOMICITY: A transaction should be done or undone completely and unambiguously. In the event of a failure of any operation, the effects of all operations that make up the transaction should be undone, and data should be rolled back to its previous state.

CONSISTENCY: A transaction should preserve all the invariant properties (such as integrity constraints) defined on the data. On completion of a successful transaction, the data should be in a consistent state. In other words, a transaction should transform the system from one consistent state to another consistent state. For example, in the case of relational databases, a consistent transaction should preserve all the integrity constraints defined on the data.

ISOLATION: Each transaction should appear to execute independently of other transactions that may be executing concurrently in the same environment. The effect of executing a set of transactions serially should be the same as that of running them concurrently. This requires two things:

During the course of a transaction, the intermediate (possibly inconsistent) state of the data should not be exposed to other transactions.

Two concurrent transactions should not be able to operate on the same data. Database management systems usually implement this feature using locking.

DURABILITY: The effects of a completed transaction should always be persistent.

These properties, called the ACID properties, guarantee that a transaction is never incomplete, the data is never inconsistent, concurrent transactions are independent, and the effects of a transaction are persistent. For a brief description of what can go wrong in distributed transaction processing, see Fault Tolerance and Recovery in Transaction Processing Systems.
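The locking approach to isolation mentioned above can be sketched in a few lines. The `Account` class and its per-account lock are illustrative, standing in for the locks a database management system would take internally: without the lock, the read-modify-write sequences of concurrent "transactions" could interleave and lose updates.

```python
import threading

class Account:
    """Toy resource whose balance is protected by a lock, so concurrent
    transactions cannot interleave on the same data."""
    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def deposit(self, amount):
        with self._lock:                      # isolate: one transaction at a time
            current = self.balance            # read
            self.balance = current + amount   # write

account = Account(0)
threads = [threading.Thread(target=account.deposit, args=(1,))
           for _ in range(1000)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# With the lock held across each read-modify-write, no increment is lost.
assert account.balance == 1000
```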

Issues in Building Transactional Applications


To illustrate the issues involved in building transactional applications, consider an order capture and order process application with the architecture shown in Figure 1.

Figure 1: Order Capture and Order Process Application


This application consists of two client components implementing the order capture and order process operations respectively. These two operations constitute a unit of work, or transaction. The order capture and order process components access and operate on four databases, for products, orders, inventory and shipping information respectively. In this figure, the dotted arrows indicate read-only data access, while the continuous arrows are transactional operations modifying data. The following are the transactional operations in this application:

create order, update inventory, create shipping record, and update order status.

While implementing these operations as a single transaction, the following issues should be addressed:

1. The application should keep track of all transactional operations and the databases operated upon. The application should therefore define a context for every transaction to include the above four operations.

2. Since the order capture and order process transaction is distributed across two components, the transaction context should be global, and be propagated from the first component to the second along with the transfer of control.

3. The application should monitor the status of the transaction as it occurs.

4. To maintain atomicity of the transaction, the application components and/or database servers should implement a mechanism whereby changes to databases can be undone without loss of consistency of data.

5. To isolate concurrent transactions on shared data, the database servers should keep track of the data being operated upon, and lock the data during the course of a transaction.

6. The application should also maintain the association between database connections and transactions.

7. To implement reliable locking, the application components should notify the database servers of transaction termination.
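Issues 1 and 2 can be sketched as the hand-rolled transaction context an application would have to maintain without transaction-management middleware. All names here (`TransactionContext`, the database names, the four operations) are illustrative, mirroring the example application above.

```python
import threading
import uuid

# Thread-local slot holding the context of the transaction in progress.
_current = threading.local()

class TransactionContext:
    """Tracks a transaction's identity, its operations, and the databases
    operated upon (issue 1)."""
    def __init__(self):
        self.txid = uuid.uuid4().hex   # global identity, propagated across components
        self.operations = []           # transactional operations performed so far
        self.resources = set()         # databases touched so far

    def record(self, resource, operation):
        self.resources.add(resource)
        self.operations.append((resource, operation))

def begin():
    _current.ctx = TransactionContext()
    return _current.ctx

def current():
    return _current.ctx

# The order capture component begins the transaction...
ctx = begin()
current().record("orders_db", "create order")
current().record("inventory_db", "update inventory")

# ...and the context (e.g. its txid) is what must travel to the order
# process component along with the transfer of control (issue 2).
current().record("shipping_db", "create shipping record")
current().record("orders_db", "update order status")
```

Keeping such bookkeeping correct across components, threads, and failures is exactly the burden that a transaction manager takes over.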

Transaction Processing Architecture


Having seen the issues in building transactional applications from scratch, consider the same application built around a transaction processing architecture, as shown in Figure 2. Note that, although several architectures are possible, as will be discussed in a later section, the one shown in Figure 2 represents the essential features.

Figure 2: Transaction Processing Architecture


This architecture introduces a transaction manager, and a resource manager for each database (resource). These components abstract most of the transaction-specific issues from the application components (Order Capture and Order Process), and share the responsibility of implementing transactions. The various components of this architecture are discussed below.

Application Components
Application components are clients for the transactional resources. These are the programs with which the application developer implements business transactions.

Application Components: Responsibilities

Create and demarcate transactions

Propagate transaction context

Operate on data via resource managers

With the help of the transaction manager, these components create global transactions, propagate the transaction context if necessary, and operate on the transactional resources within the scope of these transactions. These components are not responsible for implementing semantics for preserving the ACID properties of transactions.

However, as part of the application logic, these components generally make the decision whether to commit or roll back transactions.

Resource Managers
A resource manager is a component that manages a persistent and stable data storage system, and participates in the two-phase commit and recovery protocols with the transaction manager.

Resource Managers: Responsibilities

Enlist resources with the transaction manager

Participate in the two-phase commit and recovery protocols

A resource manager is typically a driver or a wrapper over a stable storage system, with interfaces for operating on the data (for the application components), and for participating in the two-phase commit and recovery protocols coordinated by a transaction manager. This component may also, directly or indirectly, register resources with the transaction manager so that the transaction manager can keep track of all the resources participating in a transaction. This process is called resource enlistment. To implement the two-phase commit and recovery protocols, the resource manager should provide supplementary mechanisms by which recovery is possible. Resource managers provide two sets of interfaces: one set for the application components to get connections and perform operations on the data, and the other set for the transaction manager to participate in the two-phase commit and recovery protocols.
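The two sets of interfaces can be sketched as a toy resource manager that buffers uncommitted changes per transaction. The class and method names are illustrative, not taken from any real driver; a real resource manager would also write a recovery log behind `prepare`.

```python
class ResourceManager:
    """Toy resource manager with one interface for applications and one
    for the transaction manager's two-phase commit."""
    def __init__(self, name):
        self.name = name
        self.pending = {}   # txid -> uncommitted changes
        self.data = {}      # committed, stable state

    # --- interface for application components ---
    def write(self, txid, key, value):
        self.pending.setdefault(txid, {})[key] = value

    # --- interface for the transaction manager (two-phase commit) ---
    def prepare(self, txid):
        """Vote: return True only if the changes can definitely be committed."""
        return txid in self.pending

    def commit(self, txid):
        self.data.update(self.pending.pop(txid, {}))

    def rollback(self, txid):
        self.pending.pop(txid, None)

rm = ResourceManager("orders_db")
rm.write("tx1", "order:42", "NEW")   # application-side operation
assert rm.prepare("tx1") is True     # transaction-manager-side vote
rm.commit("tx1")                     # changes reach stable state
```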

Transaction Manager
The transaction manager is the core component of a transaction processing environment. Its primary responsibilities are to create transactions when requested by application components, to allow resource enlistment and delistment, and to conduct the two-phase commit or recovery protocol with the resource managers.

Transaction Manager: Responsibilities

Establish and maintain transaction context

Maintain the association between a transaction and the participating resources

Initiate and conduct the two-phase commit and recovery protocol with the resource managers

Make synchronization calls to the application components before the beginning and after the end of the two-phase commit and recovery process

A typical transactional application begins a transaction by issuing a request to a transaction manager to initiate a transaction. In response, the transaction manager starts a transaction and associates it with the calling thread. The transaction manager also establishes a transaction context. All application components and/or threads participating in the transaction share the transaction context. The thread that initially issued the request for beginning the transaction, or, if the transaction manager allows, any other thread, may eventually terminate the transaction by issuing a commit or rollback request. Before a transaction is terminated, any number of components and/or threads may perform transactional operations on any number of transactional resources known to the transaction manager. If allowed by the transaction manager, a transaction may be suspended and resumed before finally being completed. Once the application issues the commit request, the transaction manager prepares all the resources for a commit operation (by conducting a vote), and, based on whether or not all resources are ready to commit, issues a commit or rollback request to all the resources. The following sections discuss various concepts associated with transaction processing.
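The association between a transaction and the calling thread can be sketched with thread-local storage, one common implementation technique. The `TransactionManager` below is a toy with invented names, showing only the begin/current/commit association, not the two-phase commit it would drive.

```python
import threading

class TransactionManager:
    """Toy manager that associates a transaction id with the calling thread."""
    def __init__(self):
        self._local = threading.local()
        self._next = 0

    def begin(self):
        self._next += 1
        self._local.txid = self._next    # associate tx with the calling thread
        return self._local.txid

    def current(self):
        """The transaction associated with the calling thread, if any."""
        return getattr(self._local, "txid", None)

    def commit(self):
        txid, self._local.txid = self._local.txid, None
        return ("committed", txid)

tm = TransactionManager()

def worker(results):
    # A different thread has no transaction associated with it.
    results.append(tm.current())

tm.begin()
results = []
t = threading.Thread(target=worker, args=(results,))
t.start()
t.join()

assert tm.current() == 1     # calling thread is inside transaction 1
assert results == [None]     # the other thread sees no transaction
tm.commit()
```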

Transaction Processing Concepts

Transaction Demarcation


A transaction can be specified by what is known as transaction demarcation. Transaction demarcation enables work done by distributed components to be bound by a global transaction. It is a way of marking groups of operations to constitute a transaction.

The most common approach to demarcation is to mark the thread executing the operations for transaction processing. This is called programmatic demarcation. The transaction so established can be suspended by unmarking the thread, and be resumed later by explicitly propagating the transaction context from the point of suspension to the point of resumption.

The transaction demarcation ends after a commit or a rollback request to the transaction manager. The commit request directs all the participating resource managers to record the effects of the operations of the transaction permanently. The rollback request makes the resource managers undo the effects of all operations of the transaction.

An alternative to programmatic demarcation is declarative demarcation. Component-based transaction processing systems, such as Microsoft Transaction Server, and application servers based on the Enterprise JavaBeans specification, support declarative demarcation. In this technique, components are marked as transactional at deployment time. This has two implications. Firstly, the responsibility of demarcation is shifted from the application to the container hosting the component. For this reason, this technique is also called container-managed demarcation. Secondly, demarcation is postponed from application build time (static) to component deployment time (dynamic).
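The two demarcation styles can be contrasted with a toy transaction manager; the decorator stands in for the container, and all names are invented for illustration.

```python
import functools

class Tx:
    """Toy transaction manager that just records demarcation calls."""
    def __init__(self):
        self.log = []
    def begin(self):    self.log.append("begin")
    def commit(self):   self.log.append("commit")
    def rollback(self): self.log.append("rollback")

tm = Tx()

# Programmatic demarcation: the application marks the boundaries itself.
def transfer_programmatic():
    tm.begin()
    try:
        pass  # ... transactional operations ...
        tm.commit()
    except Exception:
        tm.rollback()
        raise

# Declarative (container-managed) demarcation: the component is merely
# marked as transactional; the "container" (here, a decorator) demarcates.
def transactional(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        tm.begin()
        try:
            result = fn(*args, **kwargs)
            tm.commit()
            return result
        except Exception:
            tm.rollback()
            raise
    return wrapper

@transactional
def transfer_declarative():
    pass  # only business logic; no demarcation calls

transfer_programmatic()
transfer_declarative()
assert tm.log == ["begin", "commit", "begin", "commit"]
```

Both runs produce the same begin/commit sequence; the difference is who writes the demarcation code, and when the marking happens.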

Transaction Context and Propagation


Since multiple application components and resources participate in a transaction, it is necessary for the transaction manager to establish and maintain the state of the transaction as it occurs. This is usually done in the form of transaction context. Transaction context is an association between the transactional operations on the resources, and the components invoking the operations. During the course of a transaction, all the threads participating in the transaction share the transaction context. Thus the transaction context logically envelops all the operations performed on transactional resources during a transaction. The transaction context is usually maintained transparently by the underlying transaction manager.

Resource Enlistment
Resource enlistment is the process by which resource managers inform the transaction manager of their participation in a transaction. This process enables the transaction manager to keep track of all the resources participating in a transaction. The transaction manager uses this information to coordinate the transactional work performed by the resource managers, and to drive the two-phase commit and recovery protocol. At the end of a transaction (after a commit or rollback), the transaction manager delists the resources. Thereafter, the association between the transaction and the resources no longer holds.

Two-Phase Commit
This protocol, between the transaction manager and all the resources enlisted for a transaction, ensures that either all the resource managers commit the transaction or they all abort it. In this protocol, when the application requests that the transaction be committed, the transaction manager issues a prepare request to all the resource managers involved. Each of these resources in turn sends a reply indicating whether or not it is ready to commit. Only when all the resource managers are ready to commit does the transaction manager issue a commit request to all the resource managers. Otherwise, the transaction manager issues a rollback request and the transaction is rolled back.
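The protocol just described can be sketched as a coordinator loop over toy participants. The `Participant` class and its names are illustrative; a real implementation would also log each phase durably so the protocol can be recovered after a crash.

```python
class Participant:
    """Toy enlisted resource that votes in phase one and applies in phase two."""
    def __init__(self, name, can_commit=True):
        self.name, self.can_commit = name, can_commit
        self.state = "active"

    def prepare(self):            # phase 1: vote
        self.state = "prepared" if self.can_commit else "aborting"
        return self.can_commit

    def commit(self):             # phase 2a: apply
        self.state = "committed"

    def rollback(self):           # phase 2b: undo
        self.state = "rolled back"

def two_phase_commit(participants):
    """Coordinator: commit only on a unanimous 'yes' vote."""
    if all(p.prepare() for p in participants):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.rollback()
    return "rolled back"

# Unanimous yes votes: the transaction commits everywhere.
ok = [Participant("orders"), Participant("inventory")]
assert two_phase_commit(ok) == "committed"

# One "no" vote: every participant rolls back.
mixed = [Participant("orders"), Participant("shipping", can_commit=False)]
assert two_phase_commit(mixed) == "rolled back"
```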

Transaction Processing Standards and Technologies

X/Open Distributed Transaction Processing Model
The X/Open Distributed Transaction Processing (DTP) model is a distributed transaction processing model proposed by the Open Group, a vendor consortium. This model is a standard among most of the commercial vendors in transaction processing and database domains. This model consists of four components:

1. Application Programs, to implement transactional operations.

2. Resource Managers, as discussed above.

3. Transaction Managers, as discussed above.

4. Communication Resource Managers, to facilitate interoperability between different transaction managers in different transaction processing domains.

This model also specifies the following interfaces:

1. TX Interface: This is an interface between the application program and the transaction manager, and is implemented by the transaction manager. This interface provides transaction demarcation services, by allowing the application programs to bind transactional operations within global transactions. This interface consists of the following
