
BE (CIS) Spring Semester 2014 CS-417: COMPUTER SYSTEMS MODELING

Performance Evaluation of High Parallel Systems Architecture


S. Zaffar Qasim Assistant Professor (CIS)

Evaluation of High Parallel Systems Architecture


The system chosen here is more indicative of realistic systems where multiple servers are interconnected to serve several users.

Fig 1: Multibank shared memory model

Here we have essentially the problem of memory allocation to processors. A processor can have all of the memory or none of the memory or anything in between.

Computer Systems Modeling (CS-417)



Allocations are done in units of an entire memory module. That is, a CPU cannot share a memory module with another CPU during a cycle.

On each CPU cycle, each processor makes a memory request. If the requested memory module is free, the request is filled; otherwise, the CPU must wait until the next cycle. When several processors make requests to the same memory module, only one is served (chosen at random from those requesting). New memory requests for each processor are chosen randomly from the M memory modules using a uniform distribution.

Let the system state be the number of memory requests for each memory module:

$$K = (k_1, k_2, k_3, \ldots, k_M)$$

where $k_i$ represents the number of processor requests for memory bank $i$.
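The cycle rules above can be sketched as a small Monte Carlo simulation. This is a minimal sketch and not part of the original slides: the function name `simulate`, the `seed` parameter, and the choice to report the average number of requests served per cycle are assumptions made for illustration.

```python
import random

def simulate(n_procs, n_modules, cycles, seed=0):
    """Average number of memory requests served per cycle (illustrative sketch)."""
    rng = random.Random(seed)
    # requests[p] is the module that processor p is currently asking for.
    requests = [rng.randrange(n_modules) for _ in range(n_procs)]
    served_total = 0
    for _ in range(cycles):
        # Group the waiting processors by the module they request.
        waiting = {}
        for proc, module in enumerate(requests):
            waiting.setdefault(module, []).append(proc)
        # Each requested module serves exactly one processor, chosen at random.
        for procs in waiting.values():
            winner = rng.choice(procs)
            # The served processor draws its next request uniformly over the modules.
            requests[winner] = rng.randrange(n_modules)
            served_total += 1
    return served_total / cycles

print(simulate(2, 2, 100_000))
```

Processors that lose the random choice keep their request for the next cycle, which is what makes the per-module request counts a natural system state.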



At the start of a cycle, every processor has a request outstanding, so the requests sum to the number of processors in the system, N:

$$k_1 + k_2 + k_3 + \cdots + k_M = N$$

The total number of possible states is the number of ways N processor requests can be distributed to M memory modules:

$$\binom{N + M - 1}{N} = \frac{(N + M - 1)!}{N!\,(M - 1)!}$$

or, in other terms, the number of ways to allocate N indistinguishable balls to M cells.


For N = 2 and M = 4 (see Fig 2), the possible ways to allocate the four memory modules to the processors (indistinguishable from each other) are shown in Table 1.

Fig 2: Multiprocessor system with N = 2 and M= 4.



Table 1

The number of possible allocations is found by:

$$\binom{N + M - 1}{N} = \binom{2 + 4 - 1}{2} = \binom{5}{2} = 10$$
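The states for N = 2 and M = 4 can also be listed by brute force. The following is a minimal sketch; the helper name `enumerate_states` is an assumption introduced here, and the listing order need not match the layout of Table 1.

```python
from itertools import combinations_with_replacement
from math import comb

def enumerate_states(n_procs, n_modules):
    """Return every request vector (k_1, ..., k_M) whose entries sum to n_procs."""
    states = set()
    # Each multiset of module indices is one way to place the N requests.
    for choice in combinations_with_replacement(range(n_modules), n_procs):
        state = [0] * n_modules
        for module in choice:
            state[module] += 1
        states.add(tuple(state))
    return sorted(states, reverse=True)

states = enumerate_states(2, 4)
for state in states:
    print(state)
# The enumeration and the closed-form count should agree.
print(len(states), comb(2 + 4 - 1, 2))
```

Four of the states place both requests on one module, and six place the two requests on different modules.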



We can see that as the number of processors requesting memory modules and the number of memory modules are increased, the number of possible states grows very quickly, making this analysis difficult for even relatively small problems, as shown in Table 2.

Table 2
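To get a feel for this growth, the balls-in-cells count can be evaluated for a few system sizes. This is a sketch under the assumption that the state count is the standard combination-with-repetition formula; the sizes below are arbitrary choices for illustration and are not the entries of Table 2.

```python
from math import comb

def state_count(n_procs, n_modules):
    """Ways to distribute n_procs indistinguishable requests over n_modules modules."""
    return comb(n_procs + n_modules - 1, n_procs)

# Square configurations (N = M), chosen only to illustrate the growth rate.
for size in (2, 4, 8, 16):
    print(f"N = M = {size:2d}: {state_count(size, size)} states")
```

Since the transition matrix has one row and one column per state, its size grows as the square of these counts.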



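Let $H = (h_1, h_2, \ldots, h_M)$ represent the intermediate state, when the memory accesses requested on a cycle have been filled and the new requests have not yet been made. Because each module with at least one pending request serves exactly one of them:

$$h_i = \max(k_i - 1,\ 0), \quad i = 1, \ldots, M$$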

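Let G represent a new (feasible) system state:

$$G = (g_1, g_2, g_3, \ldots, g_M)$$

First, let's define the number of new requests directed to module $i$ as $g_i - h_i$. G is reachable from K in one cycle only if $g_i - h_i \ge 0$ for every module.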

Properties
1. If G is reachable from K in one cycle, the probability it will in fact be the next state is given by:

$$P(K \to G) = \frac{x!}{\prod_{i=1}^{M} (g_i - h_i)!} \left(\frac{1}{M}\right)^{x}$$

where x represents the number of new requests.
2. The system can be described by a Markov chain, since the next-state probabilities at any time depend only on the current state.
3. The system is aperiodic, since a one-step transition from a state to itself is possible at any time.
4. The system is irreducible, since any state can be reached from any other state in a finite number of steps.
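Property 1 can be sketched in code. This is a minimal sketch, assuming the one-step probability takes a multinomial form: the x served processors each pick one of the M modules uniformly, and the intermediate state removes one request from every module that has any. The function names are assumptions introduced here.

```python
from math import factorial

def intermediate_state(k):
    """State after each requested module has served exactly one request."""
    return tuple(max(ki - 1, 0) for ki in k)

def transition_probability(k, g):
    """One-cycle probability of moving from state k to state g (assumed multinomial form)."""
    if sum(k) != sum(g):
        return 0.0                      # the number of processors is fixed
    h = intermediate_state(k)
    new = [gi - hi for gi, hi in zip(g, h)]
    if any(d < 0 for d in new):
        return 0.0                      # g is not reachable from k
    x = sum(new)                        # number of new requests
    ways = factorial(x)
    for d in new:
        ways //= factorial(d)           # multinomial coefficient, built exactly
    return ways / len(k) ** x

print(transition_probability((2, 0), (1, 1)))
print(transition_probability((1, 1), (2, 0)))
```

Summing `transition_probability(k, g)` over all feasible `g` for a fixed `k` should give 1, which is a quick check on any transition matrix built this way.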

Performance Assessment
Also, since these conditions hold, there is an equilibrium state probability distribution, $\pi$, so that:

$$\pi = \pi P$$

where P is the state transition matrix and $\pi = (\pi_1, \pi_2, \pi_3, \pi_4, \ldots, \pi_j)$.

A performance assessment typically made for such system configurations is to determine the effective processor power of the system with N processors and M memories:

o EP(N, M) = the expected number of instructions executed per second compared with an N = 1, M = 1 system.

Let Proc(i) represent the number of memory requests serviced (instructions executed) when the system is in state i:

$$EP(N, M) = \sum_{i} \mathrm{Proc}(i)\, \pi_i$$

For the simple case where N = 2 and M = 2, we have the system illustrated in Fig 3.

Fig 3: Multiprocessor system with N = 2 and M= 2.

Fig 4: Probability state transition diagram.

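The possible states of this model, representing the memory modules requested by the two processors, are (see Fig 4): (2,0), (1,1), and (0,2).

In state (2,0), one request is served and the single new request goes to either module with equal probability, so:

$$P[(2,0) \to (1,1)] = \frac{1}{2}$$

which represents the probability of being in state (2,0) and transitioning to state (1,1).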
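Similarly, the probability of being in state (1,1) and transitioning to state (2,0) would be found as:

$$P[(1,1) \to (2,0)] = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}$$

since both requests are served and both new requests must select the first module, and so on. The balance equations for this Markov chain can be found using the relationship:

Flow In = Flow Out

With $\pi_1$, $\pi_2$, $\pi_3$ the equilibrium probabilities of states (2,0), (1,1), and (0,2), balancing the flow for the two outer states gives $\tfrac{1}{2}\pi_1 = \tfrac{1}{4}\pi_2$ and $\tfrac{1}{2}\pi_3 = \tfrac{1}{4}\pi_2$. Together with $\pi_1 + \pi_2 + \pi_3 = 1$, this yields $\pi_1 = 0.25$, $\pi_2 = 0.5$, $\pi_3 = 0.25$.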


The resulting effective processor power is computed using the relationship:

$$EP(2,2) = 1 \cdot \pi_1 + 2 \cdot \pi_2 + 1 \cdot \pi_3 = 0.25 + 1.0 + 0.25 = 1.5$$
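The N = 2, M = 2 result can be checked numerically. This is a minimal sketch, not the original derivation: the transition matrix entries are worked out here from the model's rules (states ordered (2,0), (1,1), (0,2)) rather than copied from Fig 4, and the equilibrium distribution is found by repeated multiplication instead of solving the balance equations by hand.

```python
# Assumed transition matrix for N = 2, M = 2; rows and columns follow
# the state order (2,0), (1,1), (0,2).
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
# Requests served per cycle in each state: one module busy, two busy, one busy.
PROC = [1, 2, 1]

def stationary(p, iterations=200):
    """Approximate pi = pi * P by repeatedly applying P to a starting vector."""
    n = len(p)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
    return pi

pi = stationary(P)
ep = sum(proc * prob for proc, prob in zip(PROC, pi))
print([round(value, 4) for value in pi], round(ep, 4))
```

Repeated multiplication converges here because the chain is aperiodic and irreducible; for larger state spaces a linear solver would usually be the better choice.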
Limitations:
o The model does not take into account memory interference caused by I/O operations.
o It also assumes the processors and memory are synchronized, as are memory accesses and cycles.

Evaluation of Parallel Systems Architecture: Petri Net Perspective

Assumptions:
There are
o np processors,
o nm shared memory modules, and
o nb data buses.

Each processor has local memory, which it uses until a page miss requires a new page to be loaded into local memory from an external memory module. The time between page misses is exponentially distributed with rate λ. The access time to shared memory is also assumed to be exponentially distributed, with mean 1/μ.

Fig 5: Petri net model for multiprocessor system (np= 5, nm = 3, and nb = 2)

For each memory module, the model depicted contains
o two places (one place for processor tokens and one place for bus tokens) and
o one timed transition (for memory allocation and use).

There are also two immediate transitions per memory module, associated with synchronizing and controlling the memory access. In total, we have nine places, four timed transitions, and six immediate transitions.
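The totals quoted for Fig 5 can be reproduced with a short calculation. This is a sketch under stated assumptions: the three shared places are taken to be P1, P2, and P3, and the one timed transition not tied to a memory module is taken to be t1 (the page miss). Neither breakdown is stated explicitly in the text, and the function name is hypothetical.

```python
def petri_net_size(nm):
    """Count places and transitions for nm memory modules (assumed breakdown)."""
    shared_places = 3                 # assumption: P1, P2 and P3 are shared
    places = shared_places + 2 * nm   # two places per memory module
    timed = 1 + nm                    # assumption: t1 plus one timed transition per module
    immediate = 2 * nm                # two immediate transitions per module
    return places, timed, immediate

print(petri_net_size(3))  # (9, 4, 6) for the nm = 3 model of Fig 5
```

Under these assumptions the net grows linearly with the number of memory modules, while np and nb only change the initial token counts.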


Tokens in place P1 represent processors executing on their local memory. Tokens in place P2 represent data buses available for use. An important assumption is that every processor and memory module behaves in an identical manner. When a processor completes its local memory access (a page miss, which fires transition t1) and requires shared memory resources, a token is moved from place P1 to place P3.

A processor determines which memory module it needs by firing the immediate transition t2 on the memory module it has chosen using a probabilistic branch. Once t2 fires, a token is moved from place P3 to place P4. Once a token is in place P4, the processor is requesting access to a data bus. The processor acquires the memory desired, and then acquires a data bus to retrieve the needed information. Once a processor has the bus, signaled by the firing of transition t3, and has acquired the memory (indicated by the token in place P5), it begins using the memory module, which is modeled by starting the timer on transition t4.


Upon completion of using the bus, the tokens representing the processor and the bus are routed back to their initial places, P1 and P2 respectively. If we run this model with inputs similar to those applied to the queuing model, we find results that very closely match the queuing model case. That is, the effective processor power would be about 2.05 with the configuration as specified.
