
Message passing

• Key concepts
  - Introduction
  - IPC
  - Remote Procedure Calls
  - Group communication
Introduction
• In a distributed system, processes executing on different computers often need to communicate with each other to achieve some common goal
• Inter-process communication (IPC) requires information sharing among two or more processes
• The two basic methods for information sharing are:
  1. Original sharing, or the shared-memory approach
  2. Copy sharing, or the message-passing approach
Introduction (contd.)
• Shared-memory approach: the sending process writes a data item A into a shared common memory area, from which the receiving process reads A
• Message-passing approach: the sending process sends A directly, and the receiving process receives it
• A message-passing system is a subsystem of a distributed OS that provides a set of message-based IPC protocols
• It serves as a suitable infrastructure for building other higher-level IPC systems, such as remote procedure call (RPC) and distributed shared memory (DSM)
Desirable Features of a good message-passing system
• Simplicity
• Uniform semantics
  - Local communication
  - Remote communication
• Efficiency
  - If the message-passing system is not efficient, IPC becomes more expensive, and users will not feel like using this mechanism
Features of a good message-passing system (contd.)
  - Some optimizations normally adopted:
    - Avoiding the cost of establishing and terminating a connection between the processes for each and every message exchange
    - Minimizing the cost of maintaining connections
    - Piggybacking of acknowledgements
• Reliability
  - Distributed systems are prone to node crashes and link failures; a reliable system retransmits the message (possibly based on timeouts)
  - Timeouts can lead to duplicate messages
  - A good message-passing system should have IPC protocols to handle these issues
Features of a good message-passing system (contd.)
• Correctness (related to group communication)
  - Atomicity: delivery either to all receivers or to none
  - Ordered delivery: an order acceptable to the application
  - Survivability: guarantees message delivery despite failures
• Flexibility
  - IPC primitives must also have the flexibility to permit any kind of control flow between the co-operating processes, including synchronous and asynchronous send/receive
Features of a good message-passing system (contd.)
• Security
  - Authentication of the receiver and sender
  - Encrypted messages
• Portability
  - Two aspects of portability:
    - The message-passing system should itself be portable
    - The applications written using the primitives of the IPC protocols of the message-passing system should be portable; so heterogeneity must be considered while designing the message-passing system
Issues in IPC by message passing
• A message is a block of information formatted by a sending process in such a manner that it is meaningful to the receiving process
• It consists of a fixed-length header and a variable-size collection of typed data objects
• The header consists of:
  - Addresses: identify the sending and receiving processes
  - Sequence number: a message identifier used for detecting lost and duplicate messages
  - Structural information:
    1. Type: data, or a pointer to the data
    2. Length of the variable-size message
Issues in IPC (contd.)

A typical message structure:
  Fixed-length header:
    - Addresses: sending process address, receiving process address
    - Sequence number or message id
    - Structural information: type (data or pointer to data), number of bytes/elements
  Variable-size collection of typed data:
    - Actual data, or a pointer to the data
Issues in IPC (contd.)
• In the design of an IPC protocol, the following important issues need to be considered:
  - Who is the sender?
  - Who is the receiver?
  - Is there one receiver or many receivers?
  - Is the message guaranteed to have been accepted by its receiver(s)?
  - Does the sender need to wait for a reply?
  - What should be done if a node crash or link failure occurs?
  - What should be done if the receiver is not ready to accept the message?
  - If there are several outstanding messages for a receiver, can it choose the order in which to service them?
Issues in IPC (contd.)
• Issues in IPC are addressed by:
  - Synchronization
  - Buffering
  - Multidatagram messages
  - Encoding and decoding
  - Process addressing
  - Failure handling
  - Group communication
Synchronization
• Semantics used for synchronization may be broadly classified as:
  - Blocking: its invocation blocks the execution of its invoker
  - Nonblocking: its invocation does not block the execution of its invoker
• How does a nonblocking RECEIVING process learn of message arrival?
  - Polling: periodically poll the kernel to check the buffer status
  - Interrupts: when the message is placed in the buffer, a software interrupt is used to notify the receiving process
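The blocking/nonblocking distinction and the polling approach can be sketched in a few lines of Python. This is a minimal single-machine sketch, not a real kernel interface: the `incoming` queue stands in for the kernel's message buffer, and all function names are illustrative.

```python
import queue
import threading
import time

# Hypothetical in-kernel message buffer for one receiving process.
incoming = queue.Queue()

def nonblocking_receive():
    """Nonblocking receive: return a buffered message if present, else None."""
    try:
        return incoming.get_nowait()
    except queue.Empty:
        return None

def blocking_receive():
    """Blocking receive: suspend the invoker until a message arrives."""
    return incoming.get()

def poll_until_arrival(interval=0.01, attempts=100):
    """Polling: periodically check the buffer status instead of blocking."""
    for _ in range(attempts):
        msg = nonblocking_receive()
        if msg is not None:
            return msg
        time.sleep(interval)   # a real process would do other work between polls
    return None

# A sender in another thread delivers a message into the buffer shortly.
threading.Timer(0.05, incoming.put, args=("hello",)).start()
print(poll_until_arrival())
```

The interrupt-based alternative would instead register a callback that the kernel invokes on arrival, so the receiver never has to check the buffer itself.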
Synchronous mode of communication with both send and receive primitives having blocking-type semantics

  Sender's execution                       Receiver's execution
  ------------------                       --------------------
                                           Receive (message);
                                           execution suspended (blocked)
  Send (message);
  execution suspended     -- message -->   execution resumed;
  (blocked)                                Send (acknowledgement)
  execution resumed  <-- acknowledgement --
Buffering
• Messages can be transmitted from one process to another by copying the body of the message from the address space of the sending process to the address space of the receiving process
• The message-buffering strategy in IPC is strongly related to the synchronization strategy
• The four types of buffering strategy are:
  - Null buffer (or no buffering)
  - Single-message buffer
  - Buffer with unbounded capacity
  - Finite-bound (or multiple-message) buffer
Buffering (contd.)
• Null buffer (or no buffering)
  - There is no place to temporarily store the message
  - Strategies used are:
    - The message remains in the sender process's address space, and the execution of the send is delayed until the receiver executes the corresponding receive
    - The message is simply discarded, and a timeout mechanism is used to resend the message after a timeout period
Buffering (contd.)
• The logical path of message transfer is directly from the sender's address space to the receiver's address space, involving a single copy operation

  Sending process --MSG--> Receiving process

  Message transfer in synchronous send with the no-buffering strategy
Buffering (contd.)
• Single-message buffer
  - The null-buffer strategy is not suitable for synchronous communication:
    - A message may have to be transferred two or more times, and the receiver of the message has to wait for the entire time taken to transfer the message across the network
  - Synchronous communication mechanisms in distributed systems therefore use a single-message buffer strategy
  - A buffer with the capacity to store a single message is used on the receiver's node
Buffering (contd.)
• The idea is to keep the message ready for use at the location of the receiver
• The request message is buffered on the receiver's node if the receiver is not ready to receive the message
• The message buffer may be either in the kernel's address space or in the receiver process's address space

  Sending process --MSG--> [single-message buffer] --> Receiving process
                           (on the receiver's node)

  Message transfer in synchronous send with the single-message buffering strategy (two copy operations needed)
Buffering (contd.)
• Unbounded-capacity buffer
  - In the asynchronous mode of communication, since a sender does not wait for the receiver to be ready, there may be several pending messages that have not yet been accepted by the receiver
  - An unbounded-capacity buffer that can store all unreceived messages is needed to support asynchronous communication
    - With the assurance that all messages sent to the receiver will be delivered
Buffering (contd.)
• Finite-bound (or multiple-message) buffer
  - Unbounded buffer capacity is practically impossible
  - When the buffer has a finite bound, the problem is buffer overflow
  - Buffer overflow can be dealt with in one of two ways:
    - Unsuccessful communication: message transfers simply fail whenever there is no more buffer space
    - Flow-controlled communication: the sender is blocked until the receiver accepts some messages, thus creating space in the buffer for new messages
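The two overflow policies above can be sketched with a bounded queue. This is a minimal single-process sketch under assumed names (`mailbox`, `BUFFER_CAPACITY` are illustrative), not a real IPC implementation.

```python
import queue

# A finite-bound (multiple-message) buffer with room for 4 messages.
BUFFER_CAPACITY = 4
mailbox = queue.Queue(maxsize=BUFFER_CAPACITY)

def send_fail_on_overflow(msg):
    """Unsuccessful communication: the transfer simply fails when the buffer is full."""
    try:
        mailbox.put_nowait(msg)
        return True
    except queue.Full:
        return False

def send_flow_controlled(msg):
    """Flow-controlled communication: block the sender until space is available."""
    mailbox.put(msg)   # blocks while the buffer is full

def receive():
    return mailbox.get()

for i in range(BUFFER_CAPACITY):
    assert send_fail_on_overflow(i)       # first four sends fill the buffer
assert not send_fail_on_overflow(99)      # fifth send fails: buffer overflow
receive()                                 # receiver accepts one message...
assert send_fail_on_overflow(99)          # ...creating space for a new one
```

Note the trade-off the slide describes: the first policy pushes the failure to the sender as an error, while the second turns overflow into sender-side blocking.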
Buffering (contd.)

  Send --MSG--> [multiple-message buffer / mailbox / port] --> Receive

  Message transfer in asynchronous send with the multiple-message buffering strategy

• The message is first copied from the sending process's memory into the receiving process's mailbox
• The message is then copied from the mailbox to the receiver's memory when the receiver calls for the message
Multidatagram messages
• Maximum transfer unit (MTU)
  - The upper bound on the size of data that can be transmitted at a time
• A message whose size is greater than the MTU has to be fragmented into MTU-sized pieces and sent separately
• Each fragment is sent in a packet (known as a datagram)
• Messages smaller than the MTU can be sent in a single packet (known as single-datagram messages)
• Messages larger than the MTU have to be split up and sent in multiple packets (known as multidatagram messages)
• The disassembling and reassembling of messages on the sender and receiver sides is the responsibility of the message-passing system
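The fragmentation and reassembly responsibility described above can be sketched as follows. A tiny MTU of 4 bytes is assumed purely for illustration, and each datagram carries its sequence number and the total packet count so the receiver can reassemble out-of-order arrivals.

```python
MTU = 4  # illustrative maximum transfer unit, in bytes

def fragment(message: bytes):
    """Disassemble a message into MTU-sized datagrams: (seq, total, data)."""
    total = (len(message) + MTU - 1) // MTU
    return [(seq, total, message[seq * MTU:(seq + 1) * MTU])
            for seq in range(total)]

def reassemble(packets):
    """Reassemble the message even if the datagrams arrive out of order."""
    total = packets[0][1]
    slots = {seq: data for seq, total, data in packets}
    assert len(slots) == total, "some datagrams were lost"
    return b"".join(slots[seq] for seq in range(total))

msg = b"multidatagram"     # 13 bytes > MTU, so this is a multidatagram message
pkts = fragment(msg)       # four packets of at most 4 bytes each
pkts.reverse()             # simulate out-of-order delivery
assert reassemble(pkts) == msg
```

A message no larger than the MTU yields a single datagram, matching the single-datagram case in the slide.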
Encoding and Decoding of message data
• The structure of the message data should be preserved between the sending and receiving processes
• It is very difficult to achieve this goal, in both heterogeneous and homogeneous systems
  - Two reasons:
    - An absolute pointer value loses its meaning when transferred from one process address space to another
    - Different program objects occupy varying amounts of storage space; a message must normally contain several types of program objects, such as long integers, short integers, variable-length character strings, and so on
Encoding and decoding of message data (contd.)
• Two representations for encoding and decoding of message data:
  - Tagged representation
    - The type of each program object is encoded in the message along with its value
    - Because of the self-describing nature of the coded data format, the receiving process does not need prior knowledge of the layout
  - Untagged representation
    - The message data contains only the program objects
    - No information is included in the message data to specify the type of each program object
    - The receiving process must have prior knowledge of how to decode the received data
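A toy tagged representation makes the self-describing property concrete. The tag values and wire layout here are invented for illustration (one tag byte plus a 4-byte value or length, in network byte order); real systems use standardized formats.

```python
import struct

# Illustrative type tags for the two object kinds this sketch supports.
TAG_INT, TAG_STR = 1, 2

def encode(objects):
    """Tagged encoding: each object is preceded by its type tag."""
    out = b""
    for obj in objects:
        if isinstance(obj, int):
            out += struct.pack("!BI", TAG_INT, obj)               # tag + value
        else:
            data = obj.encode("utf-8")
            out += struct.pack("!BI", TAG_STR, len(data)) + data  # tag + length + bytes
    return out

def decode(buf):
    """The receiver needs no prior knowledge: the tags describe the data."""
    objects, i = [], 0
    while i < len(buf):
        tag, value = struct.unpack_from("!BI", buf, i)
        i += 5
        if tag == TAG_INT:
            objects.append(value)
        else:
            objects.append(buf[i:i + value].decode("utf-8"))
            i += value
    return objects

assert decode(encode([42, "hello", 7])) == [42, "hello", 7]
```

In the untagged alternative, only the raw values would be sent, and `decode` would have to be told the sequence of types in advance.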
Process addressing
• A message-passing system usually supports two types of process addressing:
  - Explicit addressing
    - The process with which communication is desired is explicitly named as a parameter in the communication primitive used:
      - send(process_id, message): send a message to the named process
      - receive(process_id, message): receive a message from the named process
Process addressing (contd.)
  - Implicit addressing
    - A process willing to communicate does not explicitly name a process for communication:
      - send_any(service_id, message): send a message to any process that provides the service of type "service_id"
      - receive_any(process_id, message): receive a message from any process, and return the "process_id" of the process from which the message was received
Process addressing (contd.)
• Processes can be identified by the combination of three fields: machine_id, local_id, machine_id
  - The first field identifies the node on which the process was created
  - The second field is a local identifier generated by the node on which the process was created
  - The third field identifies the last known location (node) of the process
  - The values of the first two fields of a process's identifier never change; the third field, however, may
  - This method of addressing is known as link-based addressing
Process addressing (contd.)
• Link-based addressing:
  - When a process migrates from its current node to a new node, link information {process_id, machine_id of the new node} is left behind on its previous node
  - On the new node, a new local_id is assigned to the process, and its process identifier and the new local_id are entered in a mapping table maintained by the kernel of the new node for all processes created on another node but running on this node
  - If the value of the third field equals that of the first field, the message is sent to the node on which the process was created
Process addressing (contd.)
• Drawbacks: even though it supports process migration, link-based addressing suffers from two main drawbacks:
  - The overhead of locating a process may be large if the process has migrated several times during its lifetime
  - It may not be possible to locate a process if an intermediate node, on which the process once resided during its lifetime, is down
• Both process-addressing methods are nontransparent due to the need to specify the machine identifier
• What are the alternatives?
Process addressing (contd.)
1. Centralized process identifier allocator
   - Maintains a counter; when it receives a request for an identifier, it returns the current value of the counter and increments the counter
   - It suffers from poor reliability and scalability
2. Two-level naming scheme for processes
   1. A machine-independent high-level name
   2. A machine-dependent low-level name
   - With a centralized (or replicated distributed) name server maintaining a mapping table that maps each high-level name to its low-level name
Failure handling
• Possible problems in IPC due to different types of system failures:
  - Loss of the request message
    - Failure of the communication link between sender and receiver, or the receiver's node is down at the time the request reaches it
  - Loss of the response message
    - Failure of the communication link, or the sender's node is down at the time the response message reaches it
  - Unsuccessful execution of the request
    - The receiver's node crashes while the request is being processed
Failure handling (contd.)

  a) Request message is lost:
     the sender sends the request message, but it is lost in transit and never reaches the receiver

  b) Response message is lost:
     the sender sends the request; the receiver executes it successfully and sends a response, but the response is lost in transit

  c) Receiver's computer crashes:
     the sender sends the request; the receiver's computer crashes during request execution and is later restarted, so no response reaches the sender
Failure handling (contd.)
• Four-message reliable IPC protocol for client-server communication between two processes:
  1. The client sends a request message to the server and blocks
  2. The server sends an acknowledgement of the request to the client
  3. After executing the request, the server sends the reply to the client
  4. The client sends an acknowledgement of the reply to the server
Failure handling (contd.)
• Three-message reliable IPC protocol for client-server communication between two processes:
  1. The client sends a request message to the server and blocks
  2. After executing the request, the server sends the reply to the client (the reply also serves as an acknowledgement of the request)
  3. The client sends an acknowledgement of the reply to the server
Failure handling (contd.)
• Two-message reliable IPC protocol for client-server communication between two processes:
  1. The client sends a request message to the server and blocks
  2. After executing the request, the server sends the reply to the client
Failure handling (contd.)
• Fault-tolerant communication between a client and a server, based on timeouts and retransmission:
  1. The client sends the request message; it is lost, and the client times out
  2. The client retransmits the request; the server crashes during execution (unsuccessful execution) and is restarted, and the client times out again
  3. The client retransmits the request; the server executes it successfully, but the response is lost, and the client times out again
  4. The client retransmits the request; the server executes it successfully once more, and the response finally reaches the client
• Note: the two successful executions of the same message may produce different results
Failure handling (contd.)
• Idempotency and handling of duplicate request messages
  - Idempotency means repeatability
  - An idempotent operation produces the same result, without any side effects, no matter how many times it is performed with the same arguments
  - Example: a simpleInterest procedure produces the same result when executed repeatedly with the same arguments
  - A non-idempotent operation produces different results for the same set of arguments when executed repeatedly
Failure handling (contd.)
• Example: a non-idempotent operation

  int Cal_Final_Marks (int End_Sem_Marks, int attndnce)
  {
      Total_Marks += End_Sem_Marks;   /* Total_Marks is global state: a side effect */
      if (attndnce > 95)
          Total_Marks += 4;
      else if (attndnce > 90)
          Total_Marks += 3;
      else if (attndnce > 85)
          Total_Marks += 2;
      else if (attndnce > 80)
          Total_Marks += 1;
      return (Total_Marks);
  }
Failure handling (contd.)
• A non-idempotent procedure in action (initially Total_Marks = 43 on the server):
  1. The client sends the request Cal_Final_Marks(34, 87)
  2. The server executes it: Total_Marks = 43 + 34 + 2 = 79, and returns 79
  3. The reply is lost, so the client times out and retransmits the request Cal_Final_Marks(34, 87)
  4. The server executes it again: Total_Marks = 79 + 34 + 2 = 115, and returns 115
  5. The client receives Total_Marks = 115 instead of the correct value 79
Failure handling (contd.)
• When no response is received by the client, it is impossible to determine whether the failure was due to a server crash or to loss of the request or response message
• Using timeouts, the client resends the request
• Repeated execution of non-idempotent requests results in "orphan" executions
• How to ensure only one execution of non-idempotent requests?
  - Use exactly-once semantics
  - Exactly-once semantics is implemented using a unique identifier for each request on the client side and a reply cache on the server side
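The reply-cache mechanism can be sketched directly against the slides' Cal_Final_Marks example. This is a minimal single-process sketch: the attendance bonus is simplified to the one case exercised here (+2 for attendance above 85), and `handle_request` stands in for the server-side dispatch.

```python
Total_Marks = 43      # server-side global state, as in the slides
reply_cache = {}      # request identifier -> saved reply

def cal_final_marks(end_sem_marks, attndnce):
    """Non-idempotent: it updates the global Total_Marks as a side effect."""
    global Total_Marks
    Total_Marks += end_sem_marks
    Total_Marks += 2 if attndnce > 85 else 0   # simplified attendance bonus
    return Total_Marks

def handle_request(request_id, args):
    """Exactly-once semantics: consult the reply cache before executing."""
    if request_id in reply_cache:        # duplicate request: extract saved reply
        return reply_cache[request_id]
    reply = cal_final_marks(*args)       # first arrival: execute and save reply
    reply_cache[request_id] = reply
    return reply

first = handle_request("request01", (34, 87))   # executes: 43 + 34 + 2 = 79
retry = handle_request("request01", (34, 87))   # retransmission: cached reply
assert first == retry == 79                     # not 115, as without the cache
```

The unique request identifier is what lets the server distinguish a retransmitted duplicate from a genuinely new request.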
Failure handling (contd.)
• Exactly-once semantics using request identifiers and a reply cache (initially Total_Marks = 43 on the server; the reply cache maps request identifiers to the replies to be sent, e.g. request01 -> 79, request02 -> 45):
  1. The client sends request01: Cal_Final_Marks(34, 87)
  2. The server checks the reply cache for request01: NOT FOUND; it executes Cal_Final_Marks, giving Total_Marks = 43 + 34 + 2 = 79, saves the reply (request01 -> 79) in the reply cache, and returns 79
  3. The reply is lost, so the client times out and retransmits request01: Cal_Final_Marks(34, 87)
  4. The server checks the reply cache for request01: FOUND; it extracts the saved reply and returns 79 without re-executing the procedure
  5. The client receives the correct value, Total_Marks = 79
Failure handling (contd.)
• Keeping track of lost and out-of-sequence packets in multidatagram messages
• How to ensure reliable delivery of all the packets of a multidatagram message?
• A simple approach is to use the stop-and-wait protocol
  - Acknowledge each packet separately
  - Disadvantage: communication overhead
• A better approach is to use the blast protocol
  - A single acknowledgement packet covers all the packets of a multidatagram message
Failure handling (contd.)
• When the blast protocol is used, a node or communication link failure leads to:
  - Loss of packets
  - Out-of-sequence delivery of packets
• To solve this:
  - Use a bitmap to identify the packets of a message, adding two extra fields to the header: the total number of packets, and a bitmap specifying the position of this packet within the message
  - Use the "selective repeat" method to retransmit only the lost packets after the timeout period
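The bitmap bookkeeping and selective repeat can be sketched as follows. This is an illustrative single-process sketch: the received-flags list plays the role of the header bitmap, and the "network" is simulated by dropping one packet from a list.

```python
def deliver(packets, total):
    """Receiver side: record which positions of a multidatagram message arrived."""
    received = [False] * total       # the bitmap of packet positions
    buffer = [None] * total
    for seq, data in packets:
        received[seq] = True
        buffer[seq] = data
    missing = [seq for seq in range(total) if not received[seq]]
    return buffer, missing

# The sender fragments a 4-packet message; packet 2 is lost in transit.
message = [(0, b"aa"), (1, b"bb"), (2, b"cc"), (3, b"dd")]
arrived = [p for p in message if p[0] != 2]

buffer, missing = deliver(arrived, total=4)
assert missing == [2]

# Selective repeat: after the timeout, only the missing packets are resent.
resent = [p for p in message if p[0] in missing]
buffer2, missing2 = deliver(arrived + resent, total=4)
assert missing2 == []
assert b"".join(buffer2) == b"aabbccdd"
```

Compared with resending the whole multidatagram message, selective repeat retransmits only the positions whose bitmap entries are still unset.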
Failure handling (contd.)
• Example: the sender sends a request message; the receiver replies with a 4-packet response, and the sender creates a buffer for 4 packets, placing each arriving packet in its position. Some packets of the response message are lost, so after a timeout the sender retransmits a request for the missing packets, and the receiver resends only those missing packets
Group communication
• Three types of group communication:
  - One-to-many (single sender and multiple receivers)
  - Many-to-one (multiple senders and a single receiver)
  - Many-to-many (multiple senders and multiple receivers)
Group communication (contd.)
• One-to-many communication
  - Also known as multicast communication
  - A special case of multicast communication is broadcast communication, in which the message is sent to all processors connected to a network
• Group management
  - Closed group: only the members of the group can send messages to the group
  - Open group: any process in the system can send messages to the group
  - Centralized group servers (with replication) are used for dynamic management of group membership
Group communication (contd.)
• Group addressing
  - A two-level naming scheme is normally used for group addressing
  - The high-level group name is an ASCII string that is independent of the location information of the processes in the group
  - The low-level group name depends on the underlying hardware
  - A special address to which multiple machines can listen is called a multicast address
  - Networks that do not have multicast addresses may have a broadcasting facility with a broadcast address
Group communication (contd.)
• Message delivery to receiver processes
  - User applications use high-level group names in programs
  - The centralized group server maintains a mapping of high-level group names to their low-level names
  - The group server also maintains a list of the process identifiers of all the processes in each group
Group communication (contd.)
• Buffered and unbuffered multicast
  - Multicast is an asynchronous communication mechanism
  - A multicast send cannot be synchronous because:
    - It is unrealistic to expect a sending process to wait until all the receiving processes that belong to the multicast group are ready to receive the multicast message
    - The sending process may not be aware of all the receiving processes that belong to the multicast group
  - For unbuffered multicast, the message is not buffered: it is lost if a receiving process is not in a state to receive it
  - For buffered multicast, the message is buffered for the receiving processes, so each process of the group receives the message
Group communication (contd.)
• Two types of semantics for one-to-many communication:
  - Send-to-all semantics
  - Bulletin-board semantics
  - Bulletin-board semantics is more flexible than send-to-all semantics because of the following factors, which send-to-all ignores:
    - The relevance of a message to a particular receiver may depend on the receiver's state
    - Messages not accepted within a certain time after transmission may no longer be useful
Group communication (contd.)
• Flexible reliability in multicast communication
  - In one-to-many communication, the degree of reliability is normally expressed in one of the following forms:
    - The 0-reliable (e.g. time-signal generation)
    - The 1-reliable (e.g. a request for a service)
    - The m-out-of-n-reliable, 1 < m < n (e.g. a consistency-control algorithm)
    - All-reliable (e.g. updating of replicas)
Group communication (contd.)
• Atomic multicast
  - Has an all-or-nothing property
  - When a message is sent to a group, it is either received by all processes that are members of the group or else it is not received by any of them
• Many-to-one communication
  - Multiple senders send messages to a single receiver
  - The single receiver may be selective or nonselective
  - A selective receiver specifies a unique sender; a message exchange takes place only if that sender sends a message
  - A nonselective receiver specifies a set of senders; a message exchange takes place if any sender in the set sends a message to this receiver
Group communication (contd.)
• Many-to-many communication
  - Multiple senders send messages to multiple receivers
  - An important issue is ordered message delivery
  - Ordered message delivery ensures that all messages are delivered to all receivers in an order acceptable to the application
  - Ordered message delivery requires message sequencing
  - Commonly used semantics for ordered delivery of multicast messages are:
    - Absolute ordering
    - Consistent ordering
    - Causal ordering
Group communication (contd.)
• Absolute ordering
  - All messages are delivered to all receiver processes in the exact order in which they were sent
  - The system is assumed to have a clock at each machine, and the clocks are synchronized with each other
  - Global timestamps are used as message identifiers
  - The kernel of the receiver places each incoming message in a queue
  - A sliding-window mechanism is used to deliver the messages periodically
  - Messages whose timestamps fall within the current window are delivered to the receiver
Group communication (contd.)
• Absolute ordering of messages: sender S1 sends m1 at time t1 and sender S2 sends m2 at time t2, with t1 < t2; both receivers R1 and R2 deliver m1 before m2
Group communication (contd.)
• Consistent ordering
  - All messages are delivered to all receiver processes in the same order
  - However, this order may be different from the order in which the messages were sent
  - Example: S1 sends m1 at time t1 and S2 sends m2 at time t2, with t1 < t2; both receivers R1 and R2 deliver m2 before m1. The delivery order differs from the sending order, but it is the same at all receivers
Group communication (contd.)
• Implementation of consistent ordering
• First approach: centralized sequencer method
  - Makes the many-to-many scheme appear as a combination of the many-to-one and one-to-many schemes
  - The kernels of the sending machines send messages to a single receiver (known as the sequencer)
    - The sequencer assigns a sequence number to each message and then multicasts it
  - The kernel of each receiving machine saves all incoming messages meant for a receiver in a separate queue
    - Messages in the queue are delivered immediately to the receiver unless there is a gap in the sequence numbers
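The sequencer method can be sketched with two small classes. This is a single-process sketch under assumed names: `Sequencer` stands in for the sequencer node and `Member` for each receiving kernel's queue logic.

```python
import itertools

class Sequencer:
    """Central sequencer: stamps each multicast message with a global sequence number."""
    def __init__(self, group):
        self.counter = itertools.count(1)
        self.group = group

    def multicast(self, msg):
        stamped = (next(self.counter), msg)
        for member in self.group:
            member.enqueue(stamped)

class Member:
    """Receiving kernel: deliver queued messages only when there is no sequence gap."""
    def __init__(self):
        self.next_expected = 1
        self.queue = {}        # sequence number -> message, held back until in order
        self.delivered = []

    def enqueue(self, stamped):
        seq, msg = stamped
        self.queue[seq] = msg
        while self.next_expected in self.queue:   # deliver in sequence, no gaps
            self.delivered.append(self.queue.pop(self.next_expected))
            self.next_expected += 1

a, b = Member(), Member()
seq = Sequencer([a, b])
for msg in ("m1", "m2", "m3"):
    seq.multicast(msg)
assert a.delivered == b.delivered == ["m1", "m2", "m3"]
```

Because every message passes through one counter, all members see the same order, which is exactly the consistent-ordering guarantee; the cost is the single point of failure noted on the next slide.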
Group communication (contd.)
• Implementation of consistent ordering (contd.)
  - The sequencer-based method is subject to a single point of failure and therefore has poor reliability
• Second approach: the ABCAST protocol (distributed)
  - Assigns a sequence number to a message by distributed agreement among the group members and the sender
  1. The sender assigns a temporary sequence number to the message and sends it to all members of the multicast group
Group communication (contd.)
• ABCAST protocol:
  - This sequence number should be greater than any previous number used by the sender, so a counter is used
  2. On receiving the message, each member of the group returns a proposed sequence number to the sender
    - Member i calculates its proposed sequence number as

        max(Fmax, Pmax) + 1 + i/N

      where:
      - Fmax is the largest final sequence number agreed upon so far for a message received by the group
      - Pmax is the largest sequence number proposed so far by this member
      - N is the total number of members in the multicast group
      - i is the member number
Group communication (contd.)
• ABCAST protocol:
  3. When the sender has received the proposed sequence numbers from all the members, it selects the largest one as the final sequence number for the message and sends it to all members in a COMMIT message
  - On receiving the COMMIT message, each member attaches the final sequence number to the message
  - Committed messages with final sequence numbers are delivered to the application programs in order of their final sequence numbers
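The propose-and-commit agreement above can be sketched numerically. This is an illustrative single-process sketch: members are plain dictionaries, and message exchange is replaced by direct calls; the point is the arithmetic of the proposal formula and the commit rule.

```python
def propose(f_max, p_max, i, n):
    """Member i's proposed sequence number: max(Fmax, Pmax) + 1 + i/N.
    The fractional part i/N makes proposals from different members unique."""
    return max(f_max, p_max) + 1 + i / n

def abcast_commit(members):
    """Sender side: collect proposals and commit the largest as the final number."""
    proposals = []
    for m in members:
        p = propose(m["f_max"], m["p_max"], m["i"], len(members))
        m["p_max"] = p            # member remembers its largest proposal so far
        proposals.append(p)
    final = max(proposals)
    for m in members:             # COMMIT: every member adopts the final number
        m["f_max"] = max(m["f_max"], final)
    return final

group = [{"i": 1, "f_max": 4, "p_max": 4.2},
         {"i": 2, "f_max": 5, "p_max": 3.5},
         {"i": 3, "f_max": 5, "p_max": 5.9}]
final = abcast_commit(group)
assert final == max(5.9, 5) + 1 + 3 / 3   # member 3's proposal wins
assert all(m["f_max"] >= final for m in group)
```

Since every member learns the committed number, the next round's proposals are all larger than it, so final sequence numbers grow monotonically and all members deliver in the same order.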
Group communication (contd.)
• Causal ordering
  - Ensures that if the event of sending one message is causally related to the event of sending another message, the two messages are delivered to all receivers in the correct order
  - Two message-sending events are said to be causally related if they are correlated by the happened-before relation
Group communication (contd.)
• Causal ordering of messages: S1 sends m1 to R1, R2, and R3; m2 is sent only after m1 has been received, so m2 is causally related to m1, while S2 sends m3 independently. Causal ordering requires every receiver to deliver m1 before m2, but m3 may be delivered in any order relative to them
Group communication (contd.)
• Implementation of causal ordering
  - The CBCAST protocol:
  1. Each member process of a group maintains a vector of n components, where n is the total number of members in the group
  2. Each member is assigned a sequence number from 1 to n
  3. The ith component of the vector corresponds to the member with sequence number i, and it is equal to the number of the last message received in sequence from the ith member
Group communication (contd.)
  4. To send a message, a process increments the value of its own component in its own vector and sends the vector as part of the message
  5. When the message arrives at a receiver process's site, it is buffered by the runtime system; the runtime system tests two conditions to decide whether the message can be delivered or must be delayed to ensure causal-ordering semantics:
    - S[i] = R[i] + 1, and
    - S[j] <= R[j] for all j != i
    where S is the vector of the sender process (the member with sequence number i) and R is the vector of the receiver process
Group communication (contd.)
  - S[i] = R[i] + 1 ensures that the receiver has not missed any message from the sender
  - S[j] <= R[j] for all j != i ensures that the sender had not received any message, when it sent this one, that the receiver has not yet received
  6. If the message passes these two tests, the runtime system delivers it to the user process
  7. Otherwise the message is left in the buffer, and the test is carried out again for it when a new message arrives
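The two delivery conditions can be written as a single predicate and checked against the example on the next slide. This is a minimal sketch of just the test, assuming 1-based member numbers carried over Python's 0-based lists.

```python
def can_deliver(S, R, i):
    """CBCAST delivery test: S is the vector attached to the message,
    R the receiver's current vector, i the sender's member number (1-based)."""
    in_sequence = S[i - 1] == R[i - 1] + 1          # no missed message from sender
    no_causal_gap = all(S[j] <= R[j]                # sender saw nothing we haven't
                        for j in range(len(S)) if j != i - 1)
    return in_sequence and no_causal_gap

# Vectors from the slides' example: process A (member 1) multicasts with [4, 2, 5, 1].
S = [4, 2, 5, 1]
assert can_deliver(S, [3, 2, 5, 1], i=1)        # process B: deliver
assert not can_deliver(S, [2, 2, 5, 1], i=1)    # process C: A[1] = C[1] + 1 is false
assert not can_deliver(S, [3, 2, 4, 1], i=1)    # process D: A[3] <= D[3] is false
```

A delayed message is retried whenever a new message arrives, since each delivery updates R and may make the predicate true.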
Group communication (contd.)
• CBCAST protocol for implementing causal ordering: status of the vectors at some instant
  - Vector of process A: [3, 2, 5, 1]; of process B: [3, 2, 5, 1]; of process C: [2, 2, 5, 1]; of process D: [3, 2, 4, 1]
  - Process A sends a new message to the other processes, with the vector [4, 2, 5, 1] attached to the message data
  - B delivers the message (both conditions hold)
  - C delays it, because the condition A[1] = C[1] + 1 is FALSE
  - D delays it, because the condition A[3] <= D[3] is not TRUE
Remote Procedure Calls
• RPC is a special case of the general message-passing model of IPC
• RPC has become a widely accepted IPC mechanism in distributed systems because of the following features:
  - Simple call syntax
  - Familiar semantics (similar to local procedure calls)
  - Well-defined interface
  - Ease of use
  - Generality
  - Efficiency
  - It can be used as an IPC mechanism to communicate between processes on different machines as well as between different processes on the same machine
RPC model
• The RPC model is similar to the procedure-call model used for the transfer of control and data within a program, in the following manner:
  - For making a procedure call, the caller places the arguments to the procedure in some well-specified location
  - Control is then transferred to the sequence of instructions that constitutes the body of the procedure
  - The procedure body is executed in a newly created execution environment
  - After the procedure's execution, control returns to the calling point, possibly returning a result
Typical model of an RPC

  Caller (client process)                  Callee (server process)
  -----------------------                  -----------------------
  Call procedure and
  wait for reply          -- request -->   Receive request and
                                           start procedure execution
                                           Procedure executes
  Resume execution        <-- reply --     Send reply and wait
                                           for next request

• The call can also be asynchronous, so that the client can do other tasks while waiting for the reply
Transparency of RPC
• A transparent RPC mechanism is one in which local procedures and remote procedures are indistinguishable to programmers
• A transparent RPC requires:
  1. Syntactic transparency
    - An RPC should have exactly the same syntax as a local procedure call
  2. Semantic transparency
    - The semantics of an RPC should be identical to those of a local procedure call
Transparency of RPC (contd.)
• Differences between RPCs and LPCs:
  - With RPC, the called procedure is executed in an address space that is disjoint from the calling program's address space, so the remote procedure cannot access any variables or data values in the calling program's environment
  - RPCs are more vulnerable to failure than LPCs, since they involve two different processes and possibly a network and two different computers
  - RPCs consume much more time than LPCs, due to the involvement of a communication network
Implementation of RPC mechanism
• Implementation of the RPC mechanism involves five elements of program:
  1. The client
  2. The client stub
  3. The RPCRuntime
  4. The server stub
  5. The server
• The client, the client stub, and one instance of RPCRuntime execute on the client machine
• The server, the server stub, and one instance of RPCRuntime execute on the server machine
Implementation of RPC mechanism
• The ten steps of a remote procedure call:
  1. The client makes a call to a procedure in the client stub
  2. The client stub packs the call into a message
  3. The client's RPCRuntime sends the call packet to the server machine
  4. The server's RPCRuntime receives the call packet and passes it to the server stub, which unpacks it
  5. The server stub calls the appropriate procedure in the server
  6. The server executes the procedure and returns the result to the server stub
  7. The server stub packs the result into a message
  8. The server's RPCRuntime sends the result packet to the client machine
  9. The client's RPCRuntime receives the result packet and passes it to the client stub, which unpacks it
  10. The client stub returns the result to the client, which resumes execution
Implementation of RPC mechanism
• Client
  - The user process that initiates an RPC
  - It makes a perfectly normal local procedure call that in turn invokes the corresponding procedure in the client stub
• Client stub
  - Two tasks:
    - On receipt of a call request from the client, it packs a specification of the target procedure and the arguments into a message and then asks the local RPCRuntime to send it to the server stub
    - On receipt of the result of the procedure execution, it unpacks the result and passes it to the client
Implementation of RPC mechanism
• RPCRuntime
  - Handles the transmission of messages across the network between the client and server machines
  - It is responsible for retransmission, acknowledgements, packet routing, and encryption
  - The RPCRuntime on the client machine receives the call request message from the client stub and sends it to the server machine; it also receives the result message from the server and passes it to the client stub
  - The RPCRuntime on the server machine receives the result message from the server stub and sends it to the client machine; it also receives the call request message from the client and passes it to the server stub
Implementation of RPC mechanism
ô Server stub
• Two tasks:
è On receipt of a call request from the local RPCRuntime, it unpacks it and
makes a perfectly normal call to invoke the appropriate procedure in the
server
è On receipt of the result, it packs the result into a message and then asks
the local RPCRuntime to send it to the client stub
ô Server
• On receiving a call request from the server stub, the server executes the
appropriate procedure and returns the result of procedure execution to
the server stub

Implementation of RPC mechanism
ô Stub generation:
• 2 ways
è Manually: the RPC implementor provides a set of translation functions
from which a user can construct stubs
è Automatically: uses an Interface Definition Language (IDL) to define the
interface between a client and a server
ô RPC messages:
• 2 types of messages involved in the implementation of an RPC system
are:
è Call messages
è Reply messages

Implementation of RPC mechanism
ô Call messages:
• 2 basic components necessary in a call message are:
è The identification information of the remote procedure to be executed
è The arguments necessary for the execution of the procedure
• In addition to these fields, a call message normally has
è A message identification field
è A message type field
è A client identification field

Implementation of RPC mechanism

A typical RPC call message format:

+------------+--------------+------------+---------------------------------+-----------+
| Message    | Message type | Client     |  Remote procedure identifier    | Arguments |
| identifier | (Call/Reply) | identifier | Program  | Version  | Procedure |           |
| (Seq. No.) |              |            | number   | number   | number    |           |
+------------+--------------+------------+---------------------------------+-----------+

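The call message fields above can be sketched as a simple structure; field names and the sample values are illustrative, not a real wire format:

```python
from dataclasses import dataclass

@dataclass
class CallMessage:
    message_id: int        # sequence number, used to match replies to calls
    message_type: str      # "call" or "reply"
    client_id: str         # identifies the calling client
    program_number: int    # remote procedure identifier:
    version_number: int    #   program, version, and
    procedure_number: int  #   procedure numbers
    arguments: tuple       # arguments for the remote procedure

msg = CallMessage(message_id=1, message_type="call", client_id="client-A",
                  program_number=200000, version_number=1,
                  procedure_number=3, arguments=(10, 20))
print(msg)
```

A real system would additionally marshal this structure into a byte stream (e.g. with XDR, as in Sun RPC).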
Implementation of RPC mechanism

RPC reply message formats:

+------------+--------------+----------------+--------+
| Message    | Message type | Reply status   | Result |
| identifier |              | (successful)   |        |
+------------+--------------+----------------+--------+
a) A successful reply message format

+------------+--------------+----------------+------------+
| Message    | Message type | Reply status   | Reason for |
| identifier |              | (unsuccessful) | failure    |
+------------+--------------+----------------+------------+
b) An unsuccessful reply message format

Server Management
1) Server implementation
2) Server creation
ô Based on the style of server implementation, servers can be classified as
1. Stateful server – maintains client state information, so the client need
not send this information with every request
2. Stateless server – does not maintain client state information

Stateful server

Client Process                       Server Process
                                     (table: File Id | Mode | R/W pointer)

Open(Filename, Mode)     ----->
                         <-----      Return(Fid)
Read(Fid, 200, buffer)   ----->
                         <-----      Return(bytes 0 to 199)
Read(Fid, 5, buffer)     ----->
                         <-----      Return(bytes 200 to 204)
Close(Fid)               ----->
                         <-----      Return(Successful)

A stateful file server

Stateless server

Client Process                            Server Process
                                          (no file table kept between calls)

Read(Filename, 0, 200, buffer)   ----->
                                 <-----   Return(bytes 0 to 199)
Read(Filename, 400, 20, buffer)  ----->
                                 <-----   Return(bytes 400 to 419)

A stateless file server

Stateless vs. Stateful servers
ô Stateful servers provide an easier programming paradigm, since clients
need not keep track of state information
ô Stateful servers are more efficient than stateless servers
ô Stateless servers make crash recovery easy in the event of a server
crash
ô The choice of using a stateless or stateful server is purely application
dependent

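The difference can be sketched in Python: a hypothetical stateful server keeps the R/W pointer across calls, while a stateless one expects the offset in every request (class and method names are illustrative):

```python
class StatefulFileServer:
    """Keeps a per-file R/W pointer, so the client sends only a byte count."""
    def __init__(self, data):
        self.data = data
        self.pointers = {}       # fid -> current offset (server-side state)
        self.next_fid = 0
    def open(self, mode):
        fid = self.next_fid
        self.next_fid += 1
        self.pointers[fid] = 0
        return fid
    def read(self, fid, n):
        off = self.pointers[fid]
        self.pointers[fid] = off + n      # server advances the pointer
        return self.data[off:off + n]

class StatelessFileServer:
    """Keeps nothing between calls; the client supplies the offset each time."""
    def __init__(self, data):
        self.data = data
    def read(self, offset, n):
        return self.data[offset:offset + n]

data = bytes(range(256)) * 2              # a 512-byte "file"
stateful = StatefulFileServer(data)
fid = stateful.open("r")
stateful.read(fid, 200)                   # bytes 0..199
stateful.read(fid, 5)                     # bytes 200..204: pointer remembered

stateless = StatelessFileServer(data)
stateless.read(400, 20)                   # client tracks the offset itself
```

If the stateful server crashes, the pointer table is lost and open files must be recovered; the stateless server can simply resume serving requests.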
Server Creation Semantics
ô Server processes may either be created and installed before their client
processes or be created on a demand basis.
ô Based on the time duration for which RPC servers survive, RPC servers
are classified as
1. Instance-per-call servers
2. Instance-per-session servers
3. Persistent servers

1. Instance-per-call server: the server exists only for the duration of a single
call.
It is created by the RPCRuntime on the server machine only when the call
message arrives.
The server is deleted after the call execution.

Server reation Semantics
Not commonly used approach because,
• It is stateless approach, needs state information to be presented
either at client process ([ime consuming and loss of data
abstraction) or at server O.~. (uxpensive)
• Multiple invocation of same server becomes more expensive.
m. Instance ± per- session ~erver : ~erver exists for the entire session
for which client & server interact. ~erver can maintain internal state
information. Overhead involved in creation and destruction is
minimized.
3. Persistent ~erver : ~erver remains in existence indefinitely. A
persistent server can be shared unlike other two.

D m m  m
Communication protocols for RPCs
1. The Request (R) protocol

Client                                Server

First RPC:  Request message  ----->   Procedure
                                      execution

Next RPC:   Request message  ----->   Procedure
                                      execution

Communication protocols for RPCs
ô The Request protocol
• Used in RPCs in which the called procedure has nothing to return and the
client requires no confirmation that the procedure has been executed
• Only one message per call is transmitted
• An RPC that uses the R protocol is called an asynchronous RPC
• In an asynchronous RPC, the RPCRuntime does not take responsibility for
retrying a request in case of communication failure
• Asynchronous RPCs with an unreliable transport protocol are generally useful
for implementing periodic update services

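A rough sketch of an asynchronous, R-protocol-style call: the client deposits the request and continues immediately, with no reply or acknowledgement ever flowing back (a queue and thread stand in for the transport; all names are illustrative):

```python
import queue
import threading

requests = queue.Queue()
updates = []                       # state the "server" maintains

def server_loop():
    # The server consumes requests; it never sends anything back.
    while True:
        msg = requests.get()
        if msg is None:            # shutdown sentinel, for the sketch only
            break
        updates.append(msg)        # e.g. apply a periodic time/update message

server = threading.Thread(target=server_loop)
server.start()
requests.put({"clock": "12:00"})   # client returns at once: asynchronous RPC
requests.put(None)
server.join()
print(updates)
```

With an unreliable transport a lost update is simply superseded by the next periodic one, which is why this pattern suits periodic update services.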
Communication protocols for RPCs
2. The Request/Reply (RR) protocol

Client                                   Server

First RPC:  Request message      ----->  Procedure
                                         execution
            <-----  Reply message (also serves as acknowledgement
                    for the request message)

Next RPC:   Request message      ----->  Procedure
                    (also serves as acknowledgement for
                    the reply of the previous RPC)
                                         execution
            <-----  Reply message (also serves as acknowledgement
                    for the request message)

Communication protocols for RPCs
ô The Request/Reply (RR) protocol
• Suitable for simple RPCs in which all the arguments and results fit in a
single packet buffer, and the duration of a call and the interval between
calls are both short (less than the transmission time)
• It is based on the idea of using implicit acknowledgements to eliminate
explicit acknowledgement messages
• In this protocol
è A server's reply message is regarded as an acknowledgement of the
client's request message
è A subsequent call packet from a client is regarded as an
acknowledgement of the server's reply message to the previous call
made by that client

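The implicit-acknowledgement idea can be sketched as a server that keeps only the last reply per client for possible retransmission; a later request from the same client implicitly acknowledges, and therefore discards, that cached reply (illustrative code, not a real protocol implementation):

```python
class RRServer:
    """Request/Reply server sketch with implicit acknowledgements."""
    def __init__(self):
        self.last_reply = {}                  # client_id -> (seq, reply)
    def handle(self, client_id, seq, arg):
        cached = self.last_reply.get(client_id)
        if cached is not None and cached[0] == seq:
            return cached[1]                  # duplicate request: resend reply
        reply = arg * 2                       # stand-in for the real procedure
        # Overwriting the cached entry is the implicit acknowledgement of the
        # previous reply: a new request proves the client received it.
        self.last_reply[client_id] = (seq, reply)
        return reply

server = RRServer()
server.handle("c1", 1, 5)    # normal call
server.handle("c1", 1, 5)    # retransmission: answered from the cache
server.handle("c1", 2, 7)    # next call implicitly acks reply 1
```

No explicit acknowledgement message ever crosses the network; the request and reply traffic itself carries the acknowledgements.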
Communication protocols for RPCs
3. The Request/Reply/Acknowledge-reply (RRA) protocol

Client                                 Server

First RPC:  Request message   ----->   Procedure
                                       execution
            <-----  Reply message
            Reply-ack message ----->

Next RPC:   Request message   ----->   Procedure
                                       execution
            <-----  Reply message
            Reply-ack message ----->

Communication protocols for RPCs
ô The RRA protocol
• Message identifiers associated with request messages are ordered
• A client acknowledges a reply message only if it has received the replies
for all the previous requests
• The server deletes information from its cache only after receiving an
acknowledgement for it from the client
• Loss of an acknowledgement is harmless, since an acknowledgement
message guarantees the receipt of replies for earlier messages

Client Server Binding
ô Binding: the process by which a client becomes associated with a server so
that calls can take place.
• Server locating:
1. Broadcasting:
A message is broadcast to all nodes.
The node housing the desired server responds.
Easy to implement and suitable for small networks; expensive for large
networks.
2. Binding agent:
A name server used to bind a client to a server.
The name server maintains the binding table.

Client Server Binding

         Name Server (Binding Agent)

1. The server registers its service with the binding agent
2. The client asks the binding agent for the server's location
3. The binding agent returns the server's address to the client
4. The client calls the server directly

Client Server Binding
Advantages of using a binding agent:
ô Can support multiple servers having the same interface type, so that any
of the available servers may be used to service a client's request
ô The binding agent can balance the load evenly among the servers providing
the same service
ô A user authorization facility can be provided for binding

Disadvantages:
ô Overhead becomes large when many client processes are short lived
ô The binding agent may become a performance bottleneck

Client Server Binding
Binding time:
1. Compile-time binding → hard coding of the server's network address.
Extremely inflexible (if the configuration changes)
2. Link-time binding → ask the binding agent before making calls
• The server process exports its service by registering it
• The client makes an import request to the binding agent for the service
before making a call
• The binding agent returns the server details to the client
• The client caches them to avoid contacting the binding agent for
subsequent calls
3. Call-time binding
• The client is bound to a server at the time when it calls the server for the
first time during its execution

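A minimal sketch of a binding agent with client-side caching of the returned server details (class names, the service name, and the address are made up for illustration):

```python
class BindingAgent:
    """A name server mapping service names to server addresses."""
    def __init__(self):
        self.binding_table = {}
    def register(self, service, address):      # server exports its service
        self.binding_table[service] = address
    def lookup(self, service):                 # client's import request
        return self.binding_table.get(service)

class Client:
    def __init__(self, agent):
        self.agent = agent
        self.cache = {}          # avoids re-contacting the agent per call
    def bind(self, service):
        if service not in self.cache:
            self.cache[service] = self.agent.lookup(service)
        return self.cache[service]

agent = BindingAgent()
agent.register("file-service", ("10.0.0.5", 8000))
client = Client(agent)
client.bind("file-service")      # first bind goes to the binding agent
client.bind("file-service")      # subsequent binds are served from the cache
```

The cache is exactly the optimization described for link-time binding; its cost is that a client may keep calling a stale address after the configuration changes, until the cached binding is invalidated.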
Client Server Binding - Call-time Binding

Client Process          Binding Agent          Server Process

The client's first call is routed via the binding agent, which locates a
suitable server and binds the client to it; subsequent calls are sent
directly from the client process to the server process.

Complicated RPCs
ô 2 types of complicated RPCs are:
1. RPCs involving long-duration calls or large gaps between calls
è 2 methods used to handle these:
o Periodic probing of the server by the client
o Periodic generation of an acknowledgement by the server
2. RPCs involving arguments and/or results that are too large to fit in a
single-datagram packet
è A long RPC argument or result is fragmented and transmitted in
multiple packets

Special types of RPCs
1. Callback RPC
2. Broadcast RPC
3. Batch-mode RPC

1. Callback RPC
ô In a normal RPC, the caller and callee processes have a client-server
relationship, whereas callback RPC uses a peer-to-peer paradigm in
which a node acts as both client and server.
ô Callback RPC is for interactive applications, which require intermediate
user inputs
ô During procedure execution, the server process makes a callback RPC
to the client process



Special types of RPCs
ô Callback RPC

Client                                  Server

1. Client calls the server        ->    starts procedure execution
2. Server makes a callback RPC    <-    stops procedure execution temporarily
3. Client processes the callback
   request and sends the reply
   (result of callback)           ->    resumes procedure execution
4. Server returns the final
   result                         <-    procedure execution ends



Special types of RPCs
ô Implementation of callback RPC should address:
• Providing the server with the client's handle
è The server should have the client's handle to call the client back. The
client's handle uniquely identifies the client. Using the handle, the
server makes a normal RPC to the client
• Making the client process wait for the callback RPC
è A callback RPC should not be mistaken for the reply to the original RPC
• Handling callback deadlocks
è Care must be taken to avoid callback deadlocks (discussed later)



Special types of RPCs (contd...)
ô Handling callback deadlocks

    P1 --RPC--> P2 --RPC--> P3 --RPC--> P1

P1 is waiting for a reply from P2
P2 is waiting for a reply from P3
P3 is waiting for a reply from P1

Each process is blocked waiting for another's reply, so none can
proceed: a callback deadlock.

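Such a circular wait can be detected as a cycle in the wait-for graph of outstanding replies; a small illustrative sketch (the function name and graph encoding are assumptions, not part of any standard API):

```python
def has_callback_deadlock(waits_for):
    """Detect a cycle in the RPC wait-for graph.
    waits_for maps a process to the processes whose replies it awaits."""
    def visit(node, on_path):
        if node in on_path:            # revisited a node on the current path
            return True
        for nxt in waits_for.get(node, ()):
            if visit(nxt, on_path | {node}):
                return True
        return False
    return any(visit(p, set()) for p in waits_for)

# The three-process cycle from the slide: P1 -> P2 -> P3 -> P1
cycle = {"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]}
chain = {"P1": ["P2"], "P2": ["P3"]}   # no cycle: P3 is not waiting
print(has_callback_deadlock(cycle), has_callback_deadlock(chain))
```

In practice systems usually prevent such deadlocks (e.g. by letting a blocked client service incoming callbacks) rather than detecting them after the fact.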
Special types of RPCs (contd...)
2. Broadcast RPC
• The client request is broadcast on the network and processed by all the
servers providing that service.
• Two ways:
è Using a binding agent, which forwards the request to all servers
registered with it.
è Using broadcast ports of servers.
o The client process may wait for zero, one, m-out-of-n, or all replies,
depending on the reliability desired.



Special types of RPCs
3. Batch-mode RPC
• Queue separate RPC requests on the client side in a transmission buffer
and send them over the network in a batch.
è Reduces the overhead of sending each RPC separately.
è Applications requiring high RPC call rates can be implemented
efficiently.
è The transmission buffer is flushed when
o A predetermined interval elapses.
o A predetermined number of requests has been received.
o The amount of batched data exceeds the buffer size.
o A call is made to one of the server's procedures for which a result is
expected (a nonqueuing RPC).

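A sketch of the client-side batching logic, showing two of the flush triggers above (the batch-size limit and a nonqueuing call); the `transport` callable is a stand-in for the network send, and all names are illustrative:

```python
class BatchModeClient:
    """Queues RPC requests and sends them over the 'network' in a batch."""
    def __init__(self, transport, max_batch=3):
        self.buffer = []
        self.transport = transport        # callable that "sends" a batch
        self.max_batch = max_batch
    def call(self, request, expects_result=False):
        self.buffer.append(request)
        # Flush when the batch is full or a nonqueuing call needs a result.
        if expects_result or len(self.buffer) >= self.max_batch:
            self.flush()
    def flush(self):
        if self.buffer:
            self.transport(list(self.buffer))
            self.buffer.clear()

batches = []                              # records what was "sent"
client = BatchModeClient(batches.append, max_batch=3)
client.call("set_a")                      # buffered
client.call("set_b")                      # buffered
client.call("get", expects_result=True)   # nonqueuing call: flush now
```

A timer-based flush (the predetermined-interval trigger) would be added in a real runtime but is omitted here to keep the sketch deterministic.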


Optimizations in RPC for better performance
1. Concurrent access to multiple servers
a) Use of threads: each thread can independently make calls to
different servers.
b) Early reply approach:
è The RPC is split into 2 RPC calls:
1. One RPC for passing the parameters
2. One RPC for requesting the result
c) Call buffering approach



Optimizations in RPC for better performance
ô Early reply approach: to provide concurrent access to multiple servers.

Client                                 Server

Call (parameters)        ----->
                         <-----        Reply (tag)
Carry out other                        Execute the procedure;
activities                             store (result)
Request result (tag)     ----->
                         <-----        Reply (result)

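The tag mechanism can be sketched as follows (illustrative names; a real server would reply with the tag before computing, whereas this sketch computes inline for simplicity):

```python
import itertools

class EarlyReplyServer:
    """First RPC delivers the parameters and returns a tag at once;
    a second RPC later requests the stored result by tag."""
    def __init__(self):
        self.results = {}
        self.tags = itertools.count(1)
    def call(self, arg):
        tag = next(self.tags)
        self.results[tag] = arg * arg   # stand-in for the real procedure
        return tag                      # the "early reply"
    def request_result(self, tag):
        return self.results.pop(tag)

server = EarlyReplyServer()
tag = server.call(6)          # client gets the tag back immediately
# ... client carries out other activities, possibly calling other servers ...
server.request_result(tag)    # second RPC collects the stored result
```

Between the two RPCs the client is free to issue calls to other servers, which is how this approach provides concurrent access to multiple servers without threads.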


Optimizations in RPC for better performance
ô Call buffering approach: to provide concurrent access to multiple servers.
• Clients and servers do not interact directly with each other
• They interact indirectly via a call buffer server
• To make an RPC call
è A client sends its call request to the call buffer server
è The client then performs other activities until it needs the result
è The client periodically polls the call buffer server when it needs the result
è If the result is available, it recovers the result



Optimizations in RPC for better performance
ô Call buffering approach
• On the server side
è When the server is free, it periodically polls the call buffer server to see
if there is any call for it
è If there is, it recovers the call request and executes it
è It then makes a call back to the call buffer server, returning the result of
execution to it



Client, call buffer server, and server interaction:

1. The client sends its call request (with a tag) to the call buffer server
   and carries out other activities.
2. The server, while free, polls the call buffer server to check for a
   waiting request.
3. The call buffer server hands over the request (tag, parameters) to the
   server, which executes the procedure.
4. The server returns the result (with the tag) to the call buffer server.
5. The client polls the call buffer server for the result (by tag) and
   recovers it when available.

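The interaction can be sketched as a call buffer server that both sides poll; in a real system both the client's result poll and the server's request poll are periodic messages over the network (all names here are illustrative):

```python
class CallBufferServer:
    """Clients and servers never talk directly; both poll this buffer."""
    def __init__(self):
        self.pending = []           # (tag, request) entries awaiting a server
        self.results = {}           # tag -> result
    def submit(self, tag, request):           # client deposits a call
        self.pending.append((tag, request))
    def poll_request(self):                   # a free server asks for work
        return self.pending.pop(0) if self.pending else None
    def post_result(self, tag, result):       # server returns the result
        self.results[tag] = result
    def poll_result(self, tag):               # client polls for its result
        return self.results.pop(tag, None)

buf = CallBufferServer()
buf.submit(1, ("square", 7))            # client side: call deposited
tag, (op, arg) = buf.poll_request()     # server side: recover the request
buf.post_result(tag, arg * arg)         # server side: return the result
buf.poll_result(1)                      # client side, when the result is needed
```

Because the buffer decouples the two sides, the client can submit calls to many services and collect the results later, in any order.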


Optimizations in RPC for better performance
2. Serving multiple requests simultaneously
• Delays encountered in RPC systems:
è Delay caused while a server waits for a resource that is temporarily
unavailable
è Delay that can occur when a server calls a remote function involving
a considerable amount of computation or a considerable
transmission delay
• Use of a multi-threaded server with a dynamic thread creation facility,
allowing the server to accept and process other requests instead of being
idle while waiting, will provide better performance



Optimizations in RPC for better performance
3. Reducing the per-call workload of servers
• One way to achieve this improvement is to use stateless servers
4. Reply caching of idempotent remote procedures
5. Proper selection of timeout values
è Too small a timeout value will cause timers to expire too often, resulting
in unnecessary retransmissions
è Too large a timeout value will cause a needlessly long delay in the event
that a message is actually lost

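Reply caching of idempotent remote procedures can be sketched as a cache keyed by message identifier, so a retransmitted (duplicate) request is answered from the cache instead of re-executing the procedure (illustrative code; `executions` exists only to make the effect observable):

```python
class ReplyCachingServer:
    """Answers duplicate requests from a reply cache keyed by message id."""
    def __init__(self):
        self.reply_cache = {}
        self.executions = 0
    def handle(self, message_id, arg):
        if message_id in self.reply_cache:
            return self.reply_cache[message_id]   # duplicate: cached reply
        self.executions += 1                      # only new calls execute
        reply = arg + 1                           # stand-in for the procedure
        self.reply_cache[message_id] = reply
        return reply

server = ReplyCachingServer()
server.handle(41, 5)    # executes the procedure
server.handle(41, 5)    # retransmission: served from the reply cache
```

For idempotent procedures the cache is merely an optimization; for non-idempotent ones it is what keeps a retransmitted request from being executed twice.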


Optimizations in RPC for better performance
è Servers are likely to take varying amounts of time to service individual
requests, depending on factors like server load, network routing
and network congestion
è If clients continue to retry sending requests, the server loading and
network congestion problems will become worse
è One method for proper selection of timeout values is to use a back-off
strategy of exponentially increasing timeout values
6. Proper design of the RPC protocol specification

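The exponential back-off strategy can be sketched as a generator of increasing timeout values for successive retransmissions of the same request (the base and factor are arbitrary illustrative choices):

```python
def backoff_timeouts(base=1.0, factor=2.0, retries=4):
    """Yield exponentially increasing timeout values, one per retransmission."""
    timeout = base
    for _ in range(retries):
        yield timeout
        timeout *= factor

print(list(backoff_timeouts()))   # → [1.0, 2.0, 4.0, 8.0]
```

Doubling the timeout after each failed attempt keeps retransmissions from piling onto an already loaded server or congested network.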


Case studies
ô Sun RPC
• Steps in creating an RPC application in Sun RPC:
è The application programmer manually writes the client program and
server program for the application
è The client program file is compiled to get a client object file
è The server program file is compiled to get a server object file
è The client stub file and the XDR filters file are compiled to get a client
stub object file



Case studies (contd...)
ô Sun RPC
• The server stub file and the XDR filters file are compiled to get a
server stub object file
• The client object file, the client stub object file, and the client-side
RPCRuntime library are linked together to get the client executable
file
• The server object file, the server stub object file, and the server-side
RPCRuntime library are linked together to get the server executable
file



End of chapter