*Process Addressing and Group Communication


Another important issue in message-based communication is addressing (or naming) of the parties involved in an interaction. For greater flexibility, a message-passing system usually supports two types of process addressing:

Explicit addressing. The process with which communication is desired is explicitly named as a parameter in the communication primitive used.
Implicit addressing. The process willing to communicate does not explicitly name a process for communication (the sender names a service instead of a process). This type of process addressing is also known as functional addressing.

Methods to Identify a Process (naming)


A simple method to identify a process is by a combination of machine_id and local_id. The local_id part is a process identifier, a port identifier of a receiving process, or something else that can be used to uniquely identify a process on a machine. The machine_id part of the address is used by the sending machine's kernel to send the message to the receiving process's machine, and the local_id part of the address is then used by the kernel of the receiving process's machine to forward the message to the process for which it is intended.

A drawback of this method is that it does not allow a process to migrate from one machine to another if such a need arises.
To overcome this limitation, a process can instead be identified by a combination of the following three fields: machine_id, local_id, and the machine_id of the last known location.

The first field identifies the node on which the process was created.
The second field is the local identifier generated by the node on which the process was created.
The third field identifies the last known location (node) of the process.

Another method to achieve the goal of location transparency in process addressing is to use a two-level naming scheme for processes. In this method each process has two identifiers: a high-level name that is machine independent (an ASCII string) and a low-level name that is machine dependent (such as the pair (machine_id, local_id)). A name server is used to maintain a mapping table that maps high-level names of processes to their low-level names.
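As a minimal illustration of this two-level scheme (the service name, field values, and in-memory table here are hypothetical, not any particular system's API), a name server can be sketched as follows:

```python
# Sketch of a two-level naming scheme: a name server maps
# machine-independent high-level names to machine-dependent
# low-level names. All names and values are illustrative.

class NameServer:
    def __init__(self):
        # high-level name (ASCII string) -> (machine_id, local_id)
        self.table = {}

    def register(self, high_level_name, machine_id, local_id):
        self.table[high_level_name] = (machine_id, local_id)

    def resolve(self, high_level_name):
        # Returns the current low-level name; after migration the
        # process simply re-registers, so senders stay unaware.
        return self.table[high_level_name]

ns = NameServer()
ns.register("print-service", machine_id=7, local_id=42)
print(ns.resolve("print-service"))   # (7, 42)

# Migration: update the mapping; the high-level name stays valid.
ns.register("print-service", machine_id=9, local_id=13)
print(ns.resolve("print-service"))   # (9, 13)
```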

Group Communication:
The most elementary form of message-based interaction is one-to-one communication (also known as point-to-point, or unicast
communication) in which a single-sender process sends a message to a single-receiver process. For performance and ease of programming,
several highly parallel distributed applications require that a message-passing system should also provide group communication facility.
Depending on single or multiple senders and receivers, the following three types of group communication are possible:

One to many (single sender and multiple receivers).
Many to one (multiple senders and a single receiver).
Many to many (multiple senders and multiple receivers).
One-to-Many Communication:
In this scheme, there are multiple receivers for a message sent by a single sender. The one-to-many scheme is also known as multicast communication. A special case of multicast communication is broadcast communication, in which the message is sent to all processors connected to a network.

Group Management:
In the case of one-to-many communication, the receiver processes of a message form a group. Such groups are of two types: closed and open. A closed group is one in which only the members of the group can send a message to the group. An outside process cannot send a message to the group as a whole, although it may send a message to an individual member of the group. On the other hand, an open group is one in which any process in the system can send a message to the group as a whole.
Group Addressing

A two-level naming scheme is normally used for group addressing. The high-level group name is an ASCII string that is independent of the location of the processes in the group. On the other hand, the low-level group name depends to a large extent on the underlying hardware.
On some networks it is possible to create a special network address to which multiple machines can listen. Such a network address is called a multicast address. Therefore, in such systems a multicast address is used as the low-level name for a group.
Some networks that do not have the facility to create a multicast address may have a broadcast facility. A packet sent to a broadcast address is automatically delivered to all machines on the network. In this case, the software of each machine must check to see if the packet is intended for it.
If a network supports neither multicast addresses nor broadcasting, a one-to-one communication mechanism has to be used to implement the group communication facility. That is, the kernel of the sending machine sends the message packet separately to each machine that has a process belonging to the group. Therefore, in this case, the low-level name of a group contains a list of machine identifiers of all machines that have a process belonging to the group.
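A minimal sketch of this unicast fallback (the member list and transport function are hypothetical placeholders):

```python
# When neither multicast nor broadcast is available, the kernel
# sends the message once per group member. The group's low-level
# name is then just the list of member identifiers.

def group_send(message, group_members, unicast_send):
    """group_members: list of (machine_id, local_id) pairs.
    unicast_send: the underlying point-to-point primitive."""
    for machine_id, local_id in group_members:
        unicast_send(machine_id, local_id, message)

# Example with a stub transport:
def fake_unicast(machine_id, local_id, message):
    print(f"deliver {message!r} to process {local_id} on machine {machine_id}")

group_send("hello group", [(1, 10), (2, 20), (3, 30)], fake_unicast)
```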

Many-to-One Communication:
In this scheme, multiple senders send messages to a single receiver. The single receiver may be selective or nonselective. A selective
receiver specifies a unique sender; a message exchange takes place only if that sender sends a message. On the other hand, a nonselective
receiver specifies a set of senders, and if any one sender in the set sends a message to this receiver, a message exchange takes place. An
important issue related to the many-to-one communication scheme is nondeterminism. It is not known in advance which member (or
members) of the group will have its information available first.

Many-to-Many Communication:
In this scheme, multiple senders send messages to multiple receivers. An important issue related to the many-to-many communication scheme is that of ordered message delivery. Ordered message delivery ensures that all messages are delivered to all receivers in an order acceptable to the application.

(Figure: message delivery with no ordering constraints.)

*Lamport's Distributed Mutual Exclusion Algorithm


Lamport's Distributed Mutual Exclusion Algorithm is a contention-based algorithm for mutual exclusion on a distributed system.
Algorithm:
Nodal properties
Every process maintains a queue of pending requests for entering the critical section, kept in order. The queues are ordered by virtual timestamps derived from Lamport timestamps.
Algorithm
Requesting process:
Push the request onto its own queue (ordered by timestamps).
Send a request to every other node.
Wait for replies from all other nodes.
If its own request is at the head of its queue and all replies have been received, enter the critical section.
Upon exiting the critical section, remove the request from the queue and send a release message to every process.
Other processes:
After receiving a request, push the request onto the local request queue (ordered by timestamps) and reply with a timestamp.
After receiving a release message, remove the corresponding request from the local request queue.
If the process's own request is now at the head of its queue and all replies have been received, enter the critical section.
Message complexity
This algorithm creates 3(N − 1) messages per request, or (N − 1) messages and 2 broadcasts. The 3(N − 1) messages per request comprise:
(N − 1) request messages
(N − 1) reply messages
(N − 1) release messages
Drawbacks
There exist multiple points of failure.
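The per-node bookkeeping can be sketched as follows. This is a simplified single-node view, assuming reliable, ordered message delivery; the transport primitive send(dest, message) and the message formats are hypothetical:

```python
import heapq

# Minimal sketch of one node's state in Lamport's algorithm.
# Requests are ordered by (timestamp, node_id) to break ties.

class LamportMutex:
    def __init__(self, node_id, peers, send):
        self.node_id = node_id
        self.peers = peers            # ids of all other nodes
        self.send = send              # send(dest_id, message)
        self.clock = 0
        self.queue = []               # heap of (timestamp, node_id)
        self.replies = set()

    def request_cs(self):
        self.clock += 1
        heapq.heappush(self.queue, (self.clock, self.node_id))
        self.replies.clear()
        for p in self.peers:
            self.send(p, ("REQUEST", self.clock, self.node_id))

    def can_enter_cs(self):
        # Enter only when our request heads the queue and every
        # peer has replied.
        return (self.queue and self.queue[0][1] == self.node_id
                and self.replies == set(self.peers))

    def on_request(self, ts, sender):
        self.clock = max(self.clock, ts) + 1
        heapq.heappush(self.queue, (ts, sender))
        self.send(sender, ("REPLY", self.clock, self.node_id))

    def on_reply(self, ts, sender):
        self.clock = max(self.clock, ts) + 1
        self.replies.add(sender)

    def release_cs(self):
        heapq.heappop(self.queue)     # remove own request
        self.clock += 1
        for p in self.peers:
            self.send(p, ("RELEASE", self.clock, self.node_id))

    def on_release(self, ts, sender):
        self.clock = max(self.clock, ts) + 1
        # One outstanding request per node, so drop the sender's entry.
        self.queue = [(t, n) for (t, n) in self.queue if n != sender]
        heapq.heapify(self.queue)
```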
* Threads are called lightweight and processes are called heavyweight. Justify with suitable examples and appropriate diagrams. Explain LRPC in brief.
A process is a program in execution. If there are two processes, they occupy different memory locations, and context switching between processes is more expensive because moving from one memory allocation to another takes more time; that is why a process is called HEAVYWEIGHT.

A thread is the smallest unit of a program: an independent sequential path of execution within a program. Two threads of the same program share the same memory, because each is just a part of that program, and context switching between threads is less expensive than between processes; that is why a thread is called LIGHTWEIGHT.
The major differences between threads and processes are:

1. Threads (lightweight processes) share the address space of the process that created them; processes have their own address space.
2. Threads have direct access to the data segment of their process; processes have their own copy of the data segment of the parent process.
3. Threads can directly communicate with other threads of their process; processes must use interprocess communication to communicate with sibling processes.
4. Threads have almost no overhead; processes have considerable overhead.
5. New threads are easily created; new processes require duplication of the parent process.
6. Threads can exercise considerable control over threads of the same process; processes can only exercise control over child processes.
7. Changes to the main thread (cancellation, priority change, etc.) may affect the behavior of the other threads of the process; changes to the parent process do not affect child processes.

If we consider a running word-processing program to be a process, then the auto-save and spell-check features that run in the background are different threads of that process, all operating on the same data set (your document).
Eg:
Threads are typically compared in terms of processing time: a lightweight thread takes less processing time, whereas a heavyweight thread requires more processing time. Thread processing time is also contingent on the language used for thread implementation; for example, it may be more efficient to use C# to implement a program containing multiple threads.

Modern operating systems, such as macOS, allow more than a single thread in the same address space, reducing switching time between threads. A process that uses only a single thread, however, does not reap the benefits of multithreading.
Communication between processes, also known as IPC (inter-process communication), is comparatively difficult and resource-intensive.
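A small Python sketch of the difference: a thread shares the creating process's address space, while a child process works on its own copy of the data. (The behavior shown is CPython's; the counter name is illustrative.)

```python
import threading
import multiprocessing

shared = {"count": 0}

def bump():
    shared["count"] += 1

if __name__ == "__main__":
    # A thread shares the creating process's address space:
    t = threading.Thread(target=bump)
    t.start(); t.join()
    print("after thread:", shared["count"])    # 1 - change is visible

    # A child process works on its own copy of the data segment:
    p = multiprocessing.Process(target=bump)
    p.start(); p.join()
    print("after process:", shared["count"])   # still 1 in the parent
```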

*Buffering
In the standard message passing model, messages can be copied many times: from the user buffer to the kernel buffer (the output buffer of a
channel), from the kernel buffer of the sending computer (process) to the kernel buffer in the receiving computer (the input buffer of a
channel), and finally from the kernel buffer of the receiving computer (process) to a user buffer.

Null Buffer (No Buffering)


In this case, there is no place to temporarily store the message. Hence one of the following implementation strategies may be used:
The message remains in the sender process's address space, and the execution of the send is delayed until the receiver executes the corresponding receive.
The message is simply discarded, and a time-out mechanism is used to resend it after a timeout period. The sender may have to try several times before succeeding.

Single-Message Buffer:
In the single-message buffer strategy, a buffer with capacity to store a single message is used on the receiver's node. This strategy is usually used for synchronous communication, in which an application module may have at most one message outstanding at a time.

Unbounded-Capacity Buffer:
In the asynchronous mode of communication, since a sender does not wait for the receiver to be ready, there may be several pending
messages that have not yet been accepted by the receiver. Therefore, an unbounded-capacity message-buffer that can store all unreceived
messages is needed to support asynchronous communication with the assurance that all the messages sent to the receiver will be delivered.

Finite-Bound Buffer:
Unbounded capacity of a buffer is practically impossible. Therefore, in practice, systems using the asynchronous mode of communication use finite-bound buffers, also known as multiple-message buffers. In this case a message is first copied from the sending process's memory into the receiving process's mailbox and then copied from the mailbox to the receiver's memory when the receiver calls for the message.

When the buffer has finite bounds, a strategy is also needed for handling the problem of a possible buffer overflow. The buffer overflow
problem can be dealt with in one of the following two ways:
Unsuccessful communication. In this method, message transfers simply fail whenever there is no more buffer space, and an error is returned.
Flow-controlled communication. The second method is to use flow control, which means that the sender is blocked until the receiver accepts some messages, thus creating space in the buffer for new messages. This method introduces synchronization between the sender and the receiver and may result in unexpected deadlocks. Moreover, due to the synchronization imposed, the asynchronous send does not operate in a truly asynchronous mode for all send commands.
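Both overflow strategies can be illustrated with a small bounded buffer; here Python's standard queue.Queue stands in for the receiver's mailbox:

```python
import queue

# A finite-bound (multiple-message) buffer; maxsize bounds it.
mailbox = queue.Queue(maxsize=4)

# Strategy 1: unsuccessful communication - fail on overflow.
def send_or_fail(msg):
    try:
        mailbox.put_nowait(msg)
    except queue.Full:
        return False        # caller sees an error; message dropped
    return True

# Strategy 2: flow control - block the sender until space frees up.
def send_blocking(msg):
    mailbox.put(msg)        # blocks while the buffer is full

for i in range(4):
    send_or_fail(i)
print(send_or_fail(99))     # False: buffer is full

print(mailbox.get())        # receiver drains one message...
print(send_or_fail(99))     # ...now the send succeeds: True
```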

(Figure: the buffering strategies used in interprocess communication.)

*Munin
Software distributed shared memory (DSM) is a software abstraction of shared memory on a distributed-memory machine. The key problem in building an efficient DSM system is to reduce the amount of communication needed to keep the distributed memories consistent. The Munin DSM system incorporates a number of novel techniques for doing so, including the use of multiple consistency protocols and support for multiple concurrent-writer protocols. Due to these and other features, Munin is able to achieve high performance on a variety of numerical applications. The original Munin paper contains a detailed description of the design and implementation of the Munin prototype, with special emphasis on its novel write-shared protocol, and describes a number of lessons learned from the prototype implementation that are relevant to the implementation of future DSM systems.

*State the address space transfer mechanism in detail.


A process's address space consists of:
Code
Data
Program stack
The size of the process's address space (several megabytes) overshadows the size of the process's state information (a few kilobytes).
Mechanisms for address space transfer:
Total freezing
A process's execution is stopped while its address space is being transferred.
Disadvantage:
The process is suspended for a long time during migration, timeouts may occur, and if the process is interactive, the delay will be noticed by the user.

Pre-transferring
It is also known as pre-copying.
The address space is transferred while the process is still running on the source node.
It is done as an initial transfer of the complete address space, followed by repeated transfers of the pages modified during the previous transfer.
The pre-transfer operation is executed at a higher priority than all other programs on the source node.
It reduces the freezing time of the process but may increase the total time for migration due to the possibility of redundant page transfers.
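A rough sketch of the pre-copying loop, assuming hypothetical kernel hooks get_dirty_pages(), transfer(), and freeze_process():

```python
# Sketch of pre-copying. The callables are hypothetical stand-ins
# for kernel mechanisms, not a real migration API.

def precopy_migrate(all_pages, get_dirty_pages, transfer,
                    freeze_process, max_rounds=5):
    transfer(all_pages)                    # initial full transfer
    for _ in range(max_rounds):
        dirty = get_dirty_pages()          # pages modified meanwhile
        if not dirty:
            break
        transfer(dirty)                    # possibly redundant re-transfers
    freeze_process()                       # short freeze at the very end
    transfer(get_dirty_pages())            # final residual pages
```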

Transfer on reference
The process's address space is left behind on its source node, and as the relocated process executes on its destination node, attempts to reference memory pages result in requests to copy the desired blocks from their remote location.
A page is transferred from its source node to its destination node only when referenced.
This gives a very short switching time of the process from its source node to its destination node.
However, it imposes a continued load on the process's source node and results in the failure of the process if the source node fails or is rebooted.

* Explain the various object locating mechanisms in brief


Object locating is the process of mapping an object's system-oriented unique identifier (UID for short) to the replica locations of the object. In a distributed system, object locating is only the process of knowing the object's location, that is, the node on which it is located. On the other hand, object accessing involves the process of carrying out the desired operation (e.g., read, write) on the object. Therefore, the object-accessing operation starts only after the object-locating operation has been carried out successfully. Several object-locating mechanisms have been proposed and are being used by various distributed operating systems.

Broadcasting:
In this method, an object is located by broadcasting a request for the object from a client node. The request is processed by all nodes, and the nodes currently having the object reply back to the client node. Amoeba uses this method for locating a remote port.
The method is simple and enjoys a high degree of reliability because it supplies all replica locations of the target object. However, it suffers from the drawbacks of poor efficiency and scalability, because the amount of network traffic generated for each request is directly proportional to the number of nodes in the system and is prohibitive for large networks. Therefore, this method is suitable only when the number of nodes is small, communication speed is high, and object-locating requests are not too frequent.

Expanding Ring Broadcast:
Pure broadcasting is expensive for large networks. Moreover, direct broadcasting to all nodes may not be supported by wide-area networks. Therefore, a modified form of broadcasting, called expanding ring broadcast, is normally employed in an internetwork that consists of local area networks (LANs) connected by gateways. In this method, increasingly distant LANs are systematically searched until the object is found or until every LAN has been searched unsuccessfully. The distance metric used is a hop: a hop corresponds to a gateway between processors. For example, if a message from processor A to processor B must pass through at least two gateways, A and B are two hops distant. Processors on the same LAN are zero hops distant. A ring is the set of LANs a certain distance away from a processor. Thus, Ring0[A] is A's local network, Ring1[A] is the set of LANs one hop away, and so on.
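A sketch of the ring-by-ring search (ring() and query_lan() are hypothetical stand-ins for the network layer):

```python
# Sketch of expanding ring broadcast. ring(k) yields the LANs
# k hops away (ring(0) is the local LAN); query_lan() is a
# hypothetical broadcast restricted to one LAN.

def expanding_ring_locate(uid, max_hops, ring, query_lan):
    for hops in range(max_hops + 1):
        for lan in ring(hops):
            locations = query_lan(lan, uid)
            if locations:
                return locations          # found: stop searching
    return None                           # searched every ring, not found
```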

Encoding the Location of an Object within Its UID:


This scheme uses structured object identifiers. One field of the structured UID is the location of the object. Given a UID, the system simply
extracts the corresponding object's location from its UID by examining the appropriate field of the structured UID. The extracted location is the
node on which the object resides.

Searching Creator Node First and Then Broadcasting:


This scheme is a simple extension of the previous scheme. The included extension is basically meant to support an object migration facility. The method is based on the assumption that an object is very likely to remain at the node where it was created (although this may not always be true), because object migration is an expensive operation and objects do not migrate frequently.

Using Forward Location Pointers:


This scheme is an extension of the previous scheme. The goal of this extension is to avoid the use of the broadcast protocol. A forward location pointer is a reference used at a node to indicate the new location of an object.

Using Hint Cache and Broadcasting:


Another commonly used approach is the cache/broadcast scheme. In this method, a cache is maintained on each node that contains the (UID, last known location) pairs of a number of recently referenced remote objects. Given a UID, the local cache is examined to determine whether it has an entry for the UID. If an entry is found, the corresponding location information is extracted from the cache, and the object access request is sent to the node specified in the extracted location information. If no entry is found (or the cached location turns out to be stale), the broadcast protocol is used to locate the object, and the cache is updated with the result.
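A sketch of the cache/broadcast lookup path (probe_node() and broadcast_locate() are hypothetical stand-ins):

```python
# Sketch of the cache/broadcast scheme. The cache holds hints
# that may be stale, so a hit is verified before being trusted.

hint_cache = {}   # uid -> last known node (a hint, possibly stale)

def locate(uid, probe_node, broadcast_locate):
    node = hint_cache.get(uid)
    if node is not None and probe_node(node, uid):
        return node                   # fast path: the hint was right
    node = broadcast_locate(uid)      # fallback: ask all nodes
    hint_cache[uid] = node            # refresh the hint for next time
    return node
```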

*Distributed Computing Environment


DCE is a client/server architecture, defined by the Open Software Foundation (OSF), that provides an open-system platform to address the challenges of distributed computing.
DCE is middleware: software sandwiched between the application and the operating system layers.
Goals of DCE:
It should run on different computers, operating systems and networks in a distributed system.
It should provide a coherent seamless platform for distributed applications.
It should provide a mechanism for clock synchronization on different machines.
It should provide tools which make it easier to write distributed applications where multiple users at multiple locations can work together.
It should provide extensive tools for authentication and authorization.
DCE architecture:
DCE cell:
The DCE system is highly scalable. To accommodate such large systems, DCE uses the concept of cells. A cell is the basic unit of operation in DCE; it breaks down a large system into smaller, manageable units.

DCE Components: DCE uses various technologies that form the components such as:
1. Thread package
It is a package that provides a programming model for building concurrent applications. It is a collection of user-level library procedures that supports the creation, management, and synchronization of multiple threads.
It was designed to minimize the impact on existing software, so that single-threaded software can be converted into multithreaded software with little change.

A threads facility is not provided by many operating systems, and DCE components require threads; hence a threads package is included in DCE.
2. Remote procedure call
A remote procedure call is a method of implementing client/server communication.
The procedure call is translated into network communication by the underlying RPC mechanism.
In DCE RPC, one or more DCE RPC interfaces are defined using the DCE interface definition language (IDL). Each interface comprises a set of
associated RPC calls each with their input and output parameters.
RPC hides communication details and removes system and hardware dependencies. It can automatically handle data type conversions between the client and the server without considering whether they run on the same or different architectures, or have the same or different byte ordering.
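The stub idea behind RPC can be sketched generically. This is not DCE's actual API, only an illustration of marshalling and dispatch, with JSON standing in for a machine-neutral data representation; all names are hypothetical:

```python
import json

# Conceptual sketch of what an RPC stub does: marshal the call,
# ship it over a transport, unmarshal the reply. Real DCE RPC
# generates such stubs from IDL interface definitions.

def marshal(interface, operation, args):
    # Data-type conversion to a machine-neutral representation.
    return json.dumps({"if": interface, "op": operation, "args": args})

def client_stub(transport, interface, operation, *args):
    request = marshal(interface, operation, list(args))
    reply = transport(request)            # looks like a local call
    return json.loads(reply)["result"]

# Server side: unmarshal and dispatch to the real procedure.
def server_dispatch(request, procedures):
    msg = json.loads(request)
    result = procedures[msg["op"]](*msg["args"])
    return json.dumps({"result": result})

procedures = {"add": lambda a, b: a + b}
transport = lambda req: server_dispatch(req, procedures)
print(client_stub(transport, "math_v1", "add", 2, 3))   # 5
```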
3. Name service
The name service of DCE includes:
a. Cell Directory Service (CDS):The CDS manages a database of information about the resources in a group of machines called a DCE cell.
b. Global Directory Service (GDS):The Global Directory Service implements an international, standard directory service and provides a global
namespace that connects the local DCE cells into one worldwide hierarchy.
c. Global Directory Agent (GDA): The GDA acts as a go-between for cell and global directory services.
4. Distributed file service
Distributed File service is a worldwide distributed file system which allows users to access and share files stored on a file server anywhere on
the network without knowing the physical location of the file.
It provides characteristics such as high performance, high availability, and location transparency, and it is able to perform tasks such as replicating data, logging file system data, and quick recovery after a crash.
5. Time service
The DCE Time Service (DTS) provides synchronized time on the computers participating in a Distributed Computing Environment.
It enables distributed applications on different computers to determine event sequencing, duration, and scheduling.
DTS also provides services that return a time range to an application and that compare time ranges from different machines.
6. Scheduling and synchronization
Scheduling determines how long a thread runs and which thread will run next.
It uses one of three algorithms: FIFO, round robin, or the default algorithm.
Synchronization prevents multiple threads from accessing the same resource at the same time.
7. Security service
There are three aspects to DCE security which provide resource protection within a system against illegitimate access. They are:
Authentication: This verifies that a DCE user or service is who it claims to be.
Secure communications: Communication over the network can be checked for tampering or encrypted for privacy.
Authorization: The permission to access the service is given after authorization.
These services are provided using the Registry Service, Privilege Service, Access Control List (ACL) Facility, Login Facility, etc.
Advantages of DCE
Supports portability and interoperability
Supports distributed file service

*Happens before relation and Idempotent operations


Happens before:
To synchronize logical clocks, Lamport defined a relation called happens-before.
The happens-before relation can be observed directly in two situations:
1) If a and b are events in the same process and a occurs before b, then a → b is true.
2) If a is the event of a message being sent by one process and b is the event of that message being received by another process, then a → b is also true.
3) A message cannot be received before it is sent, or even at the same time it is sent, since it takes a finite (non-zero) amount of time to arrive.
4) Happens-before is a transitive relation, so if a → b and b → c, then a → c.
5) If two events x and y happen in different processes that do not exchange messages, then x → y is not true, but neither is y → x.
6) Such events are said to be concurrent, which simply means that nothing can be said about which event happened first.

(Figure: (a) three processes, each with its own clock; the clocks run at different rates. (b) Lamport's algorithm corrects the clocks.)
7) Using this method, there is a way to assign a time C(e) to every event e in a distributed system, subject to the following conditions:
If a happens before b in the same process, C(a) < C(b).
If a and b represent the sending and receiving of a message, respectively, C(a) < C(b).
For all distinct events a and b, C(a) ≠ C(b).
An example application is totally ordered multicasting.
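A minimal Lamport clock, following the clock-correction rules above (the process and message names are illustrative):

```python
# Minimal Lamport logical clock. Each process keeps a counter,
# increments it on every local event and send, and on receive
# takes max(local, received) + 1 so clocks never run backwards.

class LamportClock:
    def __init__(self):
        self.time = 0

    def local_event(self):
        self.time += 1
        return self.time

    def send(self):
        self.time += 1
        return self.time          # timestamp carried on the message

    def receive(self, msg_time):
        self.time = max(self.time, msg_time) + 1
        return self.time

a, b = LamportClock(), LamportClock()
ts = a.send()                     # a's clock becomes 1
print(b.receive(ts))              # b's clock jumps past 1 -> 2
```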
Idempotent Operation:
Idempotent means "the same"; an idempotent operation is one that produces the same result no matter how many times it is performed with the same arguments, and it has no side effects. For example, suppose a GetSqrt procedure calculates the square root of a given number: GetSqrt(81) will always return 9. This is an idempotent operation; it always gives the same result.
To understand these operations, let's consider an example. Say a client has 1000 in his account and wants to withdraw 100 from it. We will walk through this transaction with a non-idempotent operation first and then move on to the idempotent version (sketched in code after the steps), so that the difference is clear.
Step 1
The client requests the server to process a debit of 100 from his account. For this, the client sends a request message Request Debit(100).
Step 2
Processing on the server side leaves 1000 − 100 = 900 in the client's account.
Step 3
After processing, the server sends a response message Return(Success, 900) to the client.

Now consider a case where the above reply message gets lost on the connection, in a system with a timeout mechanism. The client waits for the reply from the server for a specific time, after which it assumes the request message was lost and that retransmission is needed. So it retransmits the same request to the server; this is the same situation we saw with exactly-once semantics, and it is not a reliable way to communicate.
Step 4
The client sends the same request again as Retransmit(Debit, 100).
Step 5
On the server side, the server assumes it is a new request message and processes it again: balance = 900 − 100 = 800. Notice that 100 has now been deducted twice, although the client wanted to debit 100 only once. The communication has gone wrong.
Step 6
The server sends a reply message Return(Success, 800) to the client.
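One standard way to make the debit idempotent is to tag each request with a client-chosen request identifier and have the server cache replies, so a retransmission replays the old reply instead of debiting again. A minimal sketch (names and amounts illustrative):

```python
# Making the debit idempotent with a client-supplied request id:
# a retransmitted request is answered from the cache, not re-applied.

balance = 1000
processed = {}    # request_id -> cached reply

def debit(request_id, amount):
    global balance
    if request_id in processed:          # duplicate: replay old reply
        return processed[request_id]
    balance -= amount                    # apply exactly once
    reply = ("Success", balance)
    processed[request_id] = reply
    return reply

print(debit("req-1", 100))   # ('Success', 900)
print(debit("req-1", 100))   # retransmission: still ('Success', 900)
```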

*Namespace
Namespaces are commonly structured as hierarchies to allow reuse of names in different contexts. As an analogy, consider a system of naming
of people where each person has a proper name, as well as a family name shared with their relatives. If, in each family, the names of family
members are unique, then each person can be uniquely identified by the combination of first name and family name; there is only one Jane
Doe, though there may be many Janes. Within the namespace of the Doe family, just "Jane" suffices to unambiguously designate this person,
while within the "global" namespace of all people, the full name must be used.
In a similar way, hierarchical file systems organize files in directories. Each directory is a separate namespace, so that the directories "letters"
and "invoices" may both contain a file "to_jane".
In computer programming, namespaces are typically employed for the purpose of grouping symbols and identifiers around a particular
functionality and to avoid name collisions between multiple identifiers that share the same name.
In networking, the Domain Name System organizes websites (and other resources) into hierarchical namespaces.
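A tiny sketch of the analogy in code: two separate namespaces (here plain Python dictionaries, purely illustrative) may each contain the same identifier without collision:

```python
# Two namespaces may each hold the same name without colliding,
# just as the directories "letters" and "invoices" may each
# contain a file "to_jane".

letters = {"to_jane": "Dear Jane, ..."}
invoices = {"to_jane": "Invoice #1234 ..."}

# The qualified name disambiguates, like "Jane Doe" vs just "Jane":
print(letters["to_jane"])
print(invoices["to_jane"])
```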

*Real Time Distributed System


A real-time system is a computer system in which the correctness of the system behavior depends not only on the logical results of the computations but also on the time at which the results are produced.
Real-time systems usually interact strongly with their physical environment. They receive data, process it, and return results at the right time. Examples include: process control systems, computer-integrated manufacturing systems, aerospace and avionics systems, automotive electronics, medical equipment, nuclear power plant control, defence systems, consumer electronics, multimedia, and telecommunications.
Real-time systems very often are implemented as distributed systems.
Some reasons:
Fault tolerance
Certain processing of data has to be performed at the location of the sensors and actuators.
Performance issues.

*Mach Operating System


The Mach operating system was designed to provide basic mechanisms that most current operating systems lack. The goal is to design an
operating system that is BSD-compatible and, in addition, excels in the following areas:
Support for diverse architectures, including multiprocessors with varying degrees of shared memory access: uniform memory access (UMA),
nonuniform memory access (NUMA), and no remote memory access (NORMA)

Ability to function with varying intercomputer network speeds, from wide-area networks to high-speed local-area networks and tightly
coupled multiprocessors
Simplified kernel structure, with a small number of abstractions (In turn, these abstractions are sufficiently general to allow other operating
systems to be implemented on top of Mach.)
Distributed operation, providing network transparency to clients and an object-oriented organization both internally and externally
Integrated memory management and interprocess communication, to provide efficient communication of large numbers of data as well as
communication-based memory management
Heterogeneous system support, to make Mach widely available and interoperable among computer systems from multiple vendors

The designers of Mach have been heavily influenced by BSD (and by UNIX in general), whose benefits include:
A simple programmer interface, with a good set of primitives and a consistent set of interfaces to system facilities
Easy portability to a wide class of single processors
An extensive library of utilities and applications
The ability to combine utilities easily via pipes

Of course, the designers also wanted to redress what they saw as the drawbacks of BSD:
A kernel that has become the repository of many redundant features and that consequently is difficult to manage and modify
Original design goals that made it difficult to provide support for multiprocessors, distributed systems, and shared program libraries (For instance, because the kernel was designed for single processors, it has no provisions for locking code or data that other processors might be using.)
Too many fundamental abstractions, providing too many similar, competing means with which to accomplish the same tasks
*Compare and contrast the Network Operating System and the Distributed Operating System, with a suitable example.

Difference: Network Operating System vs. Distributed Operating System
1. Definition: A network operating system (NOS) is a computer operating system designed primarily to support workstations, personal computers, and, in some instances, older terminals that are connected on a local area network. A distributed operating system (DOS) is an operating system that manages a number of computers and hardware devices which make up a distributed system.
2. Goal: The main goal of a NOS is to offer local services to remote clients; the main goal of a DOS is to hide and manage hardware resources.
3. Coupling: A NOS is a loosely coupled operating system for heterogeneous multicomputers; a DOS is a tightly coupled operating system for multiprocessors and homogeneous multicomputers.
4. Architecture: A NOS follows a 2-tier client/server architecture; a DOS follows an n-tier client/server architecture.
5. Types: The two types of NOS are peer-to-peer and client/server; the two types of DOS are the multicomputer operating system and the multiprocessor operating system.
6. Communication: A NOS uses files for communication; a DOS uses messages for communication.
7. Transparency: The degree of transparency of a NOS is low; that of a DOS is high.
8. Example: NOS - Novell NetWare; DOS - Microsoft Distributed Component Object Model (DCOM).
