
Ch 5.

Transaction Processing Monitors


An Overview

Dr. Kien A. Hua
2
An Overview
Existing transaction processing (TP) monitors differ widely
in functionality and scope.
There is no commonly accepted definition of precisely
what a TP monitor is or how it interfaces to other system
components.
The intent of Chapters 5 and 6 is to present a reference
architecture of a transaction-oriented system, and to define the
role of a TP monitor within that framework.
The current chapter, in particular, explains the services
provided by a TP monitor, and introduces the structure of
this system component.
3
The Role of TP Monitors
Operating systems, communication systems, etc.
are usually not designed for the needs of a
transaction-oriented environment:
A TP monitor provides either essential services absent
from the host system, or services the host performed so
poorly that a new implementation was required.
The main function of a TP monitor is to integrate other
system components to make them work together to
support transaction-oriented processing.
4
COMPUTING STYLES
Computing systems are used in a variety of ways, which
are largely determined by the type of applications these
systems were developed for:
Batch Processing
Time-Sharing
Real-Time Processing
Client-Server Processing
Transaction-Oriented Processing
It is helpful to analyze these styles to understand what
system facilities are fundamental to support each style.
5
Batch Processing
Large unit of work: Work comes in large portions at
prescheduled times and with well-defined resource
requirements.
Coarse-grained resource allocation: The programs typically
operate on their own private data.
Sequential access patterns: Batch jobs typically go sequentially
through a large number of processing steps, access files in a
sequential scan, and so on.
Application does recovery: Batch applications have to make
their own provisions against system crashes.
Few (tens of) concurrent jobs: There are not many batch jobs
running concurrently on any given system. Throughput is the
key performance criterion.
Isolated execution: each batch job executes in its own process;
this process has exclusive control of the files, data streams, and
other resources it uses.
6
Time Sharing (1)
Time-sharing is the terminal-oriented version of batch. It
is a way of giving interactive access to computing
resources via low-bandwidth (dumb) terminals.
Process per terminal: While batch processing is driven
through a predefined job control program, a time-sharing
session is controlled by the terminal user. A terminal
session gives the user a complete abstract machine with
memory, devices, and the like.
Coarse-grained resource allocation: Terminal sessions are
typically long, and resources are assigned in large
granules; as in batch processing, the application works on
private data.
7
Time Sharing (2)
Unpredictable demands: The actual resource demands are
not as predictable as they are with batch processing.
Sequential access pattern.
Application does recovery: It is up to the user to
reestablish the session and to figure out how far he/she had
gotten before the crash.
Hundreds of concurrent users: Response time is the key
performance criterion.
8
Real-Time Processing
Event driven operation: Activities in the system are driven
largely by interrupts coming from the sensor devices. The
workload pattern is not preplanned.
Repetitive workload: The set of programs that can be
activated by outside events is statically defined.
High availability: The system must be highly available
because it controls a real-world process.
High performance: Sometimes, a system is characterized
as real-time if it is supposed to react very fast. The
distinctive requirement, however, is that it is able to do
deadline scheduling.
9
Client-Server Computing
Client-server processing is the modern version of time-
sharing.
Rather than running everything a user requests in one
process, services are invoked by passing requests to
dedicated servers, which can reside in other processes on
the same machine or in different machines of a distributed
system.
All persistent data are now encapsulated in database
servers, so data are shared among many users through that
server.
Such servers have to be highly available.
10
TRANSACTION-ORIENTED
PROCESSING (1)
Data sharing: Computations read and update databases
shared among all users.
Variable requests: user requests are random.
Repetitive workload: users do not run arbitrary programs,
but rather request the system to execute certain functions
out of a predefined set.
Mostly simple functions: consume 10^5 to 10^7 instructions
and do some 10 disk I/Os.
11
TRANSACTION-ORIENTED
PROCESSING (2)
Some batch transactions: have the size and duration of a
typical batch job.
Many terminals: 10^3 to 10^5 terminals.
High availability: Because of the large number of users,
the system must be highly reliable and available.
System does recovery.
Automatic load balancing: The system should deliver high
throughput with guaranteed low response times (soft real-
time system).
12
A TAXONOMY OF
TRANSACTION EXECUTION (1)
Transaction
  Direct
    Single message (local or distributed): Direct OLTP Transaction
    Conversational (local or distributed): Complex Online Transaction
  Queued
    Short (local or distributed): Queued OLTP Transaction
    Long (local or distributed): Lone Batch Transaction
13
A TAXONOMY OF
TRANSACTION EXECUTION (2)
Direct: The terminal and the process running the
server program (handling the request) are
associated with each other.
Queued: Transactions are put in a queue and
scheduled for processing according to the queuing
discipline.
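A minimal sketch of what "scheduled according to the queuing discipline" can mean, assuming two illustrative disciplines (first-come first-served and priority order). The request names and priorities are made up for the example.

```python
import heapq
from collections import deque

# First-come, first-served discipline: requests leave in arrival order.
fifo = deque()
for request in ["t1", "t2", "t3"]:
    fifo.append(request)
fifo_order = [fifo.popleft() for _ in range(3)]

# Priority discipline: the request with the lowest priority number runs first.
by_priority = []
for priority, request in [(2, "t1"), (1, "t2"), (3, "t3")]:
    heapq.heappush(by_priority, (priority, request))
priority_order = [heapq.heappop(by_priority)[1] for _ in range(3)]
```

The same three queued transactions are processed in a different order depending only on the discipline chosen.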

14
A TAXONOMY OF
TRANSACTION EXECUTION (3)
Simple
Single message: There is a single input message from
the terminal, and upon commit a single output
message is delivered.
Short: The number of objects it touches is in the tens.
Complex
Conversational: It allows for repeated exchange of
messages between the user and the application.
Long: The number of objects it touches is in the tens
of thousands (batch-like transaction).
15
Transaction Processing Services
Transaction services must provide the
application programmer with a
programming environment that integrates
transaction control in a seamless manner.
The program need not worry about
concurrency, failures, clean-up, and so forth.
As far as data sharing is concerned,
applications can use the services provided
by a database manager.
16
The Transaction Processing
Services (1)
Manage heterogeneity: The local transaction mechanisms in
each subsystem will not be sufficient to ensure the ACID
properties for the whole function.
Control communication: The status of communication
sessions must also be subject to transaction control by
the transaction services.
Apart from the technical issue of access to shared data,
more system services are required to support
transaction-oriented processing.
17
The Transaction Processing
Services (2)
Terminal management: Since the ACID properties must be
perceived by the user and not just by the program,
sending and receiving the message must be part of the
transaction.
Presentation services: If the terminal uses sophisticated
presentation services, then reestablishing the window
environment after a crash of the workstation is also a
part of the transaction guarantee.
18
The Transaction Processing
Services (3)
Context management
Start/restart: The TP monitor must also handle the restart after
any failure. By doing so, all the subsystems are brought
up in a state that is consistent with respect to the ACID
rules.
19
Integrated Control
Note: Many textbooks create the impression that database transaction
control is all there is to transaction processing.
The need to support other resources with ACID properties forces a
more generalized transaction management.
20
Key Terms
Typically, a number of services are bunched
together in one application.
At run time, a server class is maintained for each
application program.
A server class is a group of processes that are able to
run the code of the corresponding application program.
An actual execution of a service request requires
the request to be sent to a process (server) of the
right server class.
The activation of a server on behalf of a service request
is called service invocation.
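The terms above can be illustrated with a small sketch. It assumes that "processes" are modelled as plain names, that servers are picked round-robin, and that `debit_credit` is a hypothetical application program; none of these details come from the slides.

```python
import itertools


class ServerClass:
    """A group of server 'processes' that can all run one application program."""

    def __init__(self, program, size):
        self.program = program  # callable standing in for the application code
        self.servers = [f"{program.__name__}-{i}" for i in range(size)]
        self._next = itertools.cycle(self.servers)  # assumed round-robin choice

    def invoke(self, request):
        """Service invocation: activate one server on behalf of a request."""
        server = next(self._next)
        return server, self.program(request)


def debit_credit(request):
    # Hypothetical application program.
    account, amount = request
    return f"posted {amount} to {account}"


server_classes = {"debit_credit": ServerClass(debit_credit, size=3)}


def service_request(name, request):
    # The request must reach a server of the right server class.
    return server_classes[name].invoke(request)


first = service_request("debit_credit", ("acct-1", 50))
second = service_request("debit_credit", ("acct-2", 75))
```

Each call returns the name of the server that was activated together with the result, so two consecutive requests land on different members of the same server class.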
21
One Process Per Terminal (1)
At logon, each terminal is given its own process,
which it holds on to for the rest of the session.
Example: Time-Sharing systems
Problem:
1. Too many capabilities per process: Each process can
run all applications. It comes with more capabilities
than a terminal needs.
2. Too many process switches: Process switches are very
expensive operations in most operating systems (2000 to
5000 instructions).
22
One Process Per Terminal (2)
Conclusions:
1. This approach does not work well for transaction-
oriented systems.
2. It is acceptable only for small systems of fewer than 100
clients.

23
Only One Terminal Process (1)
All terminals talk to one process, which can be the
TP monitor process itself.
The TP monitor process receives the function
requests and routes them to the programs that can
service them.
Example: CICS, ComPlete.
24
Only One Terminal Process (2)
Advantages: Simplicity! The TP monitor can
check the function requests, schedule them
according to its own policies, and so on.
Disadvantages:
Each page fault or other exception in the TP monitor's
process will stop the whole TP environment.
Since a single process can employ only one CPU at a
time, the TP system can use only one CPU.
The process is confined within one address space,
which can be a serious limitation for large applications.
25
Many Servers, One Scheduler (1)
There is only one (data communications) process that
handles all the request and response messages.
There is a group of processes (i.e., a server class) for each
application program.
Different applications are fenced off against each other.
The data communication process routes the service request to the
appropriate server.
26
Many Servers, One Scheduler (2)
Example: IMS/DC
Advantages: Simplicity! There is one place for
scheduling and load control.
Disadvantages: The data communication resource
can become a bottleneck.
27
Many Servers, Many Schedulers
(1)
A number of (functionally identical) data
communication processes do the terminal
handling.
There is a server class for data communication services.
The data communication process must multiplex itself
among the terminals it is attached to, and therefore must
be multi-threaded.
28
Many Servers, Many Schedulers
(2)
The application server classes are set up as in the
last scenario.
Example: Tandem's Pathway, DEC's ACMS.
Advantage: The data communication process is no longer
a bottleneck.
Disadvantage: Load balancing becomes more difficult.
29
The Tasks of TP Monitors (1)
Scheduling: Service requests must be mapped to
the proper servers.
Server class management: The TP monitor is
responsible for setting up the server class.
Recovery: After a crash, the TP monitor is
responsible for bringing up the TP environment.
It starts all the system processes, brings up the server
classes, and then passes control to the transaction
manager.
30
The Tasks of TP Monitors (2)
Resource administration: Information about the
terminals, databases, application programs, users,
etc. is kept in a system repository managed by the
TP monitor.
Authentication and authorization: Service
requests must be cleared by the TP monitor before
they are executed.
System operation: The TP monitor must provide
the operators with sufficient information to tune
the system, and inform them about any problems
that occur during normal operations.
31
Resource Managers
A resource manager is a software subsystem that ties into the TP
monitor to provide protected actions on its state.
It must be able to participate in transaction-oriented recovery.
BEGIN WORK;
receive (input message);
send (statistics menu) to (window w1);
COMMIT WORK;
32
Context-Sensitive Scheduling
The completion of a request typically frees the
server so that it can be reassigned to another
request.
However, there are cases in which a server is
reserved for a special user.
Example: For chained transactions, the server must be
reserved for the next transaction, because it may refer
to local context variables available only in that server
process.
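A minimal sketch of this scheduling rule, assuming a hypothetical `Scheduler` with a free list and a per-user reservation table. It does not handle the case where no server is free.

```python
class Scheduler:
    """Assigns servers to requests; a server may stay reserved for one user."""

    def __init__(self, servers):
        self.free = list(servers)  # servers available to any request
        self.reserved = {}         # user -> server holding that user's context

    def dispatch(self, user):
        # A user with a reserved server must return to it, because the local
        # context variables exist only in that server process.
        if user in self.reserved:
            return self.reserved[user]
        return self.free.pop(0)

    def complete(self, user, server, chained=False):
        if chained:
            self.reserved[user] = server   # keep it for the next transaction
        else:
            self.reserved.pop(user, None)
            self.free.append(server)       # normal case: the server is freed


sched = Scheduler(["s1", "s2"])
a = sched.dispatch("alice")
sched.complete("alice", a, chained=True)   # alice's chain continues
b = sched.dispatch("bob")                  # bob cannot be given alice's server
a_again = sched.dispatch("alice")
sched.complete("alice", a_again)           # chain ends, the server is freed
```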
33
Transaction Manager (TM)
Once the transaction program has started, the TP
monitor has little to do with transaction
management.
The coordination of the resource managers is done by the
transaction manager.
34
Transaction Manager (TM) (2)
We want to separate the components exercising
transaction control (the transaction manager) from
those that do transaction-oriented resource
scheduling (TP monitor).
Reason: There are transactions that do not come in
through the TP monitor.
Examples:
Ad hoc query interface of SQL systems.
CAD applications run their own terminal environment.
35
Responsibilities of TP Monitors
(1)
The TP monitor brings up the resource
managers upon startup.
For restart, the TP monitor only has to bring
up the resource managers.
The actual recovery protocol is completely
handled among the resource managers and the
transaction manager.

36
Responsibilities of TP Monitors
(2)
To dispatch a server for a request, the TP
monitor creates a process (or reuses an
existing one) and loads the code into it.
All the calls among resource managers are
so-called transactional remote procedure
calls (TRPCs). The mechanisms to handle
them are provided by the TP monitor.
Example: BEGIN_WORK is a TRPC to the
transaction manager.
37
Transaction Processing
Components
The TP monitor's main tasks:
To handle the incoming requests
To provide the resources for their processing
To hand back the results
Orchestrating the cooperation among the
various resource managers is the task of the
transaction manager.
38
Transaction Processing
Components
Transactional Remote Procedure
Call
(TRPC)
40
Remote Procedure Call (RPC)
An RPC system enables a client program to communicate
with server programs on different computers by calling
procedures in a similar way to the conventional use of
procedure calls in high-level languages.
At the RPC level a service may be viewed as a module
with an interface that exports a set of procedures
appropriate for operating on some data abstraction or
resource.
From the perspective of client programs a service provides
the same facilities as a software module enabling clients
to import its procedures.
41
Marshalling
Marshalling is the process of taking a collection of
data items and assembling them into a form suitable
for transmission in a message.
Flatten structured data items into a sequence of basic data
items.
Translate those data items into an external data
representation.
Unmarshalling is the process of disassembling
them on arrival to produce an equivalent collection
of data items at the destination.
Translate the external data representation to the local one.
Unflatten the data item.
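The two marshalling steps and their reversal can be sketched with Python's standard `struct` module. The record layout (a name, a balance, and a list of integers) and the length-prefixed big-endian encoding are assumptions chosen for the example, not a format prescribed by any particular RPC system.

```python
import struct


def marshal(record):
    """Flatten a (name, balance, history) record and encode each basic item
    in an external representation (network byte order, length prefixes)."""
    name, balance, history = record
    name_bytes = name.encode("utf-8")
    out = struct.pack("!H", len(name_bytes)) + name_bytes  # string: length + bytes
    out += struct.pack("!d", balance)                      # float: 8-byte IEEE 754
    out += struct.pack("!H", len(history))                 # list: count, then items
    out += struct.pack(f"!{len(history)}i", *history)
    return out


def unmarshal(data):
    """Translate the external representation back and rebuild the record."""
    offset = 0
    (n,) = struct.unpack_from("!H", data, offset)
    offset += 2
    name = data[offset:offset + n].decode("utf-8")
    offset += n
    (balance,) = struct.unpack_from("!d", data, offset)
    offset += 8
    (count,) = struct.unpack_from("!H", data, offset)
    offset += 2
    history = list(struct.unpack_from(f"!{count}i", data, offset))
    return (name, balance, history)


record = ("alice", 120.5, [10, -3, 42])
wire = marshal(record)
```

Calling `unmarshal(wire)` yields a record equal to the original, which is the "equivalent collection of data items" the slide refers to.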
42
Message Destinations
Potential clients need to know an identifier for
communicating with a server.
In the Internet protocols, the destination addresses
for messages are specified as a port number used
by a process and the Internet address of the
computer on which it runs.

[Figure: message passing between port q and port p. Labels: Send(p, message), Receive(p, message), Message.]
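As an illustration of addressing a message by Internet address and port number, here is a short sketch using UDP sockets on the loopback interface. The choice of UDP, the loopback address, and the OS-assigned port are assumptions made to keep the example self-contained.

```python
import socket

# The receiving process binds a port; port 0 asks the OS for any free port.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))
server.settimeout(5)
destination = server.getsockname()  # (Internet address, port number)

# The sender addresses its message to that (address, port) pair.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(b"request", destination)

message, sender = server.recvfrom(1024)
client.close()
server.close()
```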
43
RPC: Main Tasks
The software that supports remote procedure calling has three
main tasks:
Interface processing: Integrating the RPC mechanism with
client and server programs in conventional programming
languages.
dispatching of request messages to the appropriate procedure in the
server.
marshalling and unmarshalling of arguments in the client and the
server.
Communication handling: Transmitting and receiving
request and reply messages.
Binding: Locating an appropriate server for a particular
service.
44
Building the Client Programs
45
Building the Client Programs (2)
An RPC system provides a stub procedure to stand in for
each remote procedure that is called by the client program.
The purpose of a client stub procedure is to convert a local
procedure call to a remote procedure call to the server.
The task of a client stub procedure is to
marshal the arguments and to pack them up with the procedure
identifier into a message,
send the message to the server and then await the reply message,
unmarshal it and return the results.
46
Building the Server Programs
47
Building the Server Programs (2)
An RPC system provides a despatcher and a set of server
stub procedures.
The despatcher uses the procedure identifier in the request
message to select one of the server stub procedures and
pass on the arguments.
The task of a server stub procedure is to
unmarshal the arguments,
call the appropriate service procedure, and
when it returns, marshal the output arguments into a reply
message.
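The division of labour between client stub, despatcher, and server stub can be sketched as follows. JSON stands in for the marshalling format, the `transport` function stands in for the network, and `add` is a hypothetical service procedure; all three are assumptions for illustration only.

```python
import json


# --- server side ---
def add(a, b):
    # Hypothetical service procedure.
    return a + b


def add_server_stub(raw_args):
    args = json.loads(raw_args)   # unmarshal the arguments
    result = add(*args)           # call the appropriate service procedure
    return json.dumps(result)     # marshal the output into a reply message


SERVER_STUBS = {"add": add_server_stub}


def despatcher(request):
    # Use the procedure identifier in the request to select a server stub.
    message = json.loads(request)
    return SERVER_STUBS[message["proc"]](message["args"])


# --- client side ---
def transport(request):
    # Stand-in for the network: hands the request straight to the despatcher.
    return despatcher(request)


def add_client_stub(a, b):
    # Marshal the arguments and pack them with the procedure identifier.
    request = json.dumps({"proc": "add", "args": json.dumps([a, b])})
    reply = transport(request)    # send the message, then await the reply
    return json.loads(reply)      # unmarshal it and return the result


total = add_client_stub(2, 3)
```

From the caller's point of view `add_client_stub(2, 3)` looks like an ordinary local call, which is the purpose of the stub.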
48
Interface
The types of the arguments and results in the
client stub must conform to those expected by the
server stub. This is achieved by the use of a
common interface definition.
An RPC interface definition specifies those
characteristics of the procedures provided by a
server that are visible to the server's clients.
The characteristics that must be defined include the
names of the procedures and the types of their
parameters.
49
Interface Compilers
Interface compilers can be designed to process interfaces for use with
different languages enabling clients and servers written in different
languages to communicate by using RPCs.
50
Binding
An interface definition specifies a textual service
name for a server. However, client request
messages must be addressed to a server port.
In a distributed system, a Name Service, called a
Binder, is used to maintain a table containing
mappings from service names to server ports.
When a server process starts executing, it sends a
message to the binder requesting it to Register its
service name and server port.
When a client process starts, it sends a message to the
binder requesting it to LookUp the identifier of the
server port of a named service.
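A minimal sketch of a binder with its two operations. The class name, the in-memory table, and the example service name and address are hypothetical; a real binder would itself be reached through messages.

```python
class Binder:
    """Name service mapping textual service names to server ports."""

    def __init__(self):
        self._table = {}

    def register(self, service_name, port):
        self._table[service_name] = port

    def lookup(self, service_name):
        return self._table.get(service_name)  # None if nothing is registered


binder = Binder()
binder.register("accounts", ("10.0.0.5", 4711))  # sent by the server when it starts
port = binder.lookup("accounts")                 # sent by the client when it starts
missing = binder.lookup("payroll")
```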
51
Transactional RPC (TRPC)
TP monitors provide the mechanism to handle RPCs. In addition,
TP monitors turn each RPC into a TRPC:
Bind RPCs to transactions: Each RPC is tagged with a
TRID.
Inform the transaction manager: It makes sure that the
transaction manager always knows who is participating in
a transaction.
Binding processes to transactions: When dispatching a
server, the TP monitor remembers the transaction for
which the server is running and thus can inform the
transaction manager if that process crashes.
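The three additions that make an RPC transactional can be sketched roughly as below. All class and method names are hypothetical, and aborting the transaction when a server process crashes is an assumed reaction; the slides only say that the transaction manager is informed.

```python
class TransactionManager:
    def __init__(self):
        self.participants = {}  # TRID -> set of resource manager names
        self.aborted = set()

    def join(self, trid, rm_name):
        self.participants.setdefault(trid, set()).add(rm_name)

    def abort(self, trid):
        self.aborted.add(trid)


class TPMonitor:
    def __init__(self, tm):
        self.tm = tm
        self.running = {}  # server process id -> TRID it is working for

    def trpc(self, trid, pid, rm_name, procedure, *args):
        self.tm.join(trid, rm_name)    # inform the transaction manager
        self.running[pid] = trid       # bind the process to the transaction
        return procedure(trid, *args)  # the call itself is tagged with the TRID

    def process_crashed(self, pid):
        trid = self.running.pop(pid, None)
        if trid is not None:
            self.tm.abort(trid)        # assumed reaction to the crash report


def read_balance(trid, account):
    # Hypothetical resource manager entry point.
    return {"trid": trid, "account": account, "balance": 100}


tm = TransactionManager()
monitor = TPMonitor(tm)
reply = monitor.trpc("T1", 17, "accounts_rm", read_balance, "acct-1")
monitor.process_crashed(17)
```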
52
Transactional RPC (TRPC) (2)
Observations
TP monitors allocate resources for other system
components to do the work, rather than doing the
work itself.
Their tasks are similar to the duties of an operating
system.
Some believe it would be best if the operating system
just swallowed the TP monitor.
53
Summary
The sum of the TP monitor's functions is twofold:
1. It extends standard RPC mechanisms to include
server class management.
2. It provides the transaction manager with enough
information to keep the dynamically expanding
web of resource managers participating in a
transaction within a sphere of control.
54
Remote Procedure Calls (RPCs)
RPC makes the invocation of services at remote nodes
look like local subroutine calls.
The RPC stub on the callee acts fully complementary to
the stub at the caller's side.
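[Figure: the CALLER (client) and the CALLEE (server) each have an RPC stub. 1. Subroutine call from the procedure call into the caller's stub. 2. Request message between the stubs. 3. Subroutine call from the callee's stub to the service routine.]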
55
The Dynamics of RPCs:
Compilation
1. A pre-compiler parses and translates the SQL statement
into an internal representation that can be interpreted
directly by the SQL executor.
2. The pre-compiler also generates code for the host
language to call the SQL server:
!sqlselect(fastsql, format_CB, expression_CB, &variable_CB);

where:
! marks a resource manager invocation (recognized by the stub compiler)
sqlselect is the entry point
fastsql is the resource manager name (RMNAME)
format_CB, expression_CB, &variable_CB are the parameters
56
The Dynamics of RPCs:
Compilation (2)
3. The above statement is recognized by a stub compiler. It
does two things:
- It first translates the statement into code that prepares
parameters for the TRPC stub, and into calls to the right
entries.
- It coerces the parameter formats back and forth.
57
The Dynamics of RPCs:
Execution (1)
1. Bind the RMNAME in the invocation to a
NODEID and an RMID; information is obtained
from the name server.
2. Look up the callee's interface prototype
description (in the repository).
3. Coerce the local parameter representation into
the one expected by the invoked resource
manager.
4. Pack all the transformed parameter values into a
byte string (parameter marshalling).
58
The Dynamics of RPCs:
Execution (2)
5. Send the message to the peer TRPC stub.
6. The caller is now suspended until the response
from the server arrives.
7. Unpack the byte string (reverse marshalling).
8. Coerce the parameter values received into the
representation used by the caller.
Note: "Client makes it right": coercing the parameter values is
done at the caller's site.
"Server makes it right": coercing is done at the server's
site.
59
Execution Plans
At compile time, the client has to issue an rmCall to the SQL
server for it to compile the statement.
The SQL server compiles the statement, generates the
access plan, and hands back an ID for that plan.
At run time, the rmCalls from the client refer to the access
plan ID and thereby ask the server to run that pre-compiled
query.
Embedded SQL is compiled once, and from then on
the generated query plan is executed.
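The compile-once, execute-by-ID pattern can be sketched with a toy server. Here the "access plan" is just a Python closure over an in-memory table, and the class and method names are hypothetical stand-ins for the rmCalls to the SQL server.

```python
import itertools


class SqlServer:
    """Toy SQL server that compiles a statement once and runs it by plan ID."""

    def __init__(self, rows):
        self.rows = rows
        self.plans = {}
        self._ids = itertools.count(1)
        self.compilations = 0

    def compile(self, column, minimum):
        # Stand-in for real compilation: the access plan is a stored closure.
        self.compilations += 1
        plan_id = next(self._ids)
        self.plans[plan_id] = lambda: [r for r in self.rows if r[column] >= minimum]
        return plan_id

    def execute(self, plan_id):
        return self.plans[plan_id]()


server = SqlServer([{"id": 1, "balance": 50}, {"id": 2, "balance": 500}])
plan = server.compile("balance", 100)  # compile time: one call, returns a plan ID
first = server.execute(plan)           # run time: refer to the plan by its ID
second = server.execute(plan)
```

However many times the plan is executed, `server.compilations` stays at one, which is the saving the slide describes.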
