This is essentially the problem of allocating memory to processors. A processor can have all of the memory, none of the memory, or anything in between.
Let G represent a new (feasible) system state: G = (g1, g2, g3, ..., gm). First, let's define:-
Properties
1. If G is reachable from K in one cycle, the probability it will in fact be the next state is given by:-
where x represents the number of new requests.
2. The system can be described by a Markov chain, since the next-state probabilities at any time depend only on the current state.
3. The system is aperiodic, since a one-step transition from a state to itself is possible at any time.
4. The system is irreducible, since any state can be reached from any other state in a finite number of steps.
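The one-step transition probability in property 1 can be sketched in code. This is a minimal sketch under assumptions the text does not state explicitly: each busy memory module serves exactly one request per cycle, and every processor that was served immediately issues a new request to a module chosen uniformly at random. The function name `transition_probability` and the tuple representation of states are illustrative choices, not taken from the original.

```python
from math import factorial

def transition_probability(k, g):
    """One-step probability of moving from state k to state g.

    k[i] and g[i] are the numbers of requests queued at memory module i.
    ASSUMPTION: one request is served per busy module per cycle, and each
    served processor re-requests a module chosen uniformly at random.
    """
    m = len(k)
    # Requests left at each module after every busy module serves one.
    leftover = [max(ki - 1, 0) for ki in k]
    # New requests that must arrive at each module in order to reach g.
    arrivals = [gi - li for gi, li in zip(g, leftover)]
    if any(a < 0 for a in arrivals):
        return 0.0  # g is not reachable from k in one cycle
    x = sum(arrivals)  # number of new requests
    if x != sum(1 for ki in k if ki > 0):
        return 0.0  # the total number of outstanding requests is conserved
    # Multinomial count of ways to spread x requests over the modules,
    # each individual assignment having probability (1/m)**x.
    ways = factorial(x)
    for a in arrivals:
        ways //= factorial(a)
    return ways * (1.0 / m) ** x

print(transition_probability((2, 0), (1, 1)))  # 0.5
print(transition_probability((1, 1), (2, 0)))  # 0.25
```

Under these assumptions, the probabilities leaving any state sum to 1, which is a quick sanity check when building the full transition matrix P.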
Performance Assessment
Since these conditions hold, there is an equilibrium state probability distribution, π, such that:
π = πP
where P is the state transition matrix and π = (π1, π2, π3, π4, ..., πj).
A performance assessment typically made for such system configurations is to determine the effective processor power of a system with N processors and M memory modules:
o EP(N, M) = the expected number of instructions executed per second compared with an N = 1, M = 1 system.
Let Proc(i) represent the number of memory requests serviced (instructions executed) when the system is in state i:-
For the simple case where N = 2 and M = 2, we have the system illustrated in Fig 3.
The possible states of this model, representing the memory requested by the two processors, are described as follows (see Fig 4):-
which represents the probability of being in state (2,0) and transitioning to state (1,1).
Similarly, the probability of being in state (1,1) and transitioning to state (2,0) would be found as:
and so on. The balance equations for this Markov chain can be found using the relationship: Flow In = Flow Out
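As a worked illustration of Flow In = Flow Out for the N = 2, M = 2 case, the balance equations can be written out as follows. The transition probabilities used here (1/2 out of state (2,0) or (0,2) into (1,1), and 1/4 out of (1,1) into each of the other two states) are an assumption: they follow if each served processor picks its next memory module uniformly at random, and the third state (0,2) is inferred by symmetry.

```latex
\begin{align*}
\tfrac{1}{2}\,\pi_{(2,0)} &= \tfrac{1}{4}\,\pi_{(1,1)} && \text{(flow out of } (2,0) = \text{flow in)}\\
\tfrac{1}{2}\,\pi_{(0,2)} &= \tfrac{1}{4}\,\pi_{(1,1)} && \text{(flow out of } (0,2) = \text{flow in)}\\
\pi_{(2,0)} + \pi_{(1,1)} + \pi_{(0,2)} &= 1 && \text{(normalization)}\\[4pt]
\Rightarrow\quad \pi_{(2,0)} = \tfrac{1}{4},\quad \pi_{(1,1)} &= \tfrac{1}{2},\quad \pi_{(0,2)} = \tfrac{1}{4}
\end{align*}
```

These values are consistent with the terms 0.25, 1.0, and 0.25 that appear in the effective processor power computation for EP(2,2).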
The effective processor power is then computed using the relationship: EP(2,2) = 1·π1 + 2·π2 + 1·π3 = 0.25 + 1.0 + 0.25 = 1.5
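The same figure can be checked numerically. The sketch below iterates π ← πP until it converges, which works because the chain is aperiodic and irreducible, and then weights each state by Proc(i). The transition matrix is an assumption (it follows from each served processor choosing its next module uniformly at random); the state ordering (2,0), (1,1), (0,2) and the function name `stationary_distribution` are illustrative.

```python
def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Approximate the equilibrium distribution by iterating pi <- pi P."""
    n = len(P)
    pi = [1.0 / n] * n  # start from a uniform guess
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

# States ordered (2,0), (1,1), (0,2).
# ASSUMPTION: uniform random choice of memory module for each new request.
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
# Proc(i): requests serviced per cycle in each state (one per busy module).
proc = [1, 2, 1]

pi = stationary_distribution(P)
ep = sum(p * s for p, s in zip(pi, proc))
print([round(p, 4) for p in pi], round(ep, 4))  # [0.25, 0.5, 0.25] 1.5
```

Power iteration is used here only because it needs no external libraries; for larger N and M, solving the balance equations with a linear algebra routine would be the more usual choice.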
Limitations: The model does not take into account memory interference caused by I/O operations. It also assumes that the processors and memory are synchronized, as are the memory access cycles.
Evaluation of Parallel Systems Architecture: Petri net Perspective
Assumptions:
There are
o np processors,
o nm shared memory modules, and
o nb data buses.
Each of the processors has local memory, which
o is used until a page miss occurs, and
o then has a new page loaded into it from an external memory module.
Page misses occur at a given miss rate, with exponentially distributed times between misses. The access time to shared memory, whose mean is the reciprocal of the access rate, is also assumed to be exponentially distributed.
The model depicted contains two places per memory module (one place for processor tokens and one place for bus tokens) and one timed transition (for memory allocation and use). There are also two immediate transitions associated with synchronizing and controlling the memory access. In total, the model has nine places, four timed transitions, and six immediate transitions.
Tokens in place P1 represent processors executing on their local memory. Tokens in place P2 represent data buses available for use. An important assumption: every processor and memory module act in an identical manner. When a processor completes its local memory access (has a page miss resulting in firing transition t1) and requires more shared memory resources, a token is moved from place P1 to place P3.
A processor determines which memory it needs by firing the immediate transition, t2, on the memory module it has chosen using a probabilistic branch. Once t2 fires, a token is moved from place P3 to place P4. Once a token is in place P4, the processor is requesting access to a data bus. The processor acquires the memory desired, and then acquires a data bus to retrieve the needed information. Once a processor has the bus, signaled by the firing of transition t3, and has acquired the memory (indicated by the token in place P5), it begins using the memory module, which is modeled by starting the timer on transition t4.
Upon completion of using the bus, the tokens representing the processor and the bus are routed back to their initial places, P1 and P2, respectively. If we run this model with inputs similar to those applied to the queuing model, we find results that very closely match the queuing model case. That is, we find that the effective processor power would be proportional to about 2.05 with the configuration as specified.
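The token flow described above can be traced with a small token-game sketch. This is a simplified, single-memory-module reading of the text, not the full nine-place model: the arc structure, the untimed treatment of t1 and t4, and the initial marking (two processors, one bus) are all assumptions made for illustration.

```python
# Places, following the text:
#   P1 processors executing on local memory   P2 data buses available
#   P3 processor has had a page miss          P4 processor requesting a bus
#   P5 processor holding the bus and using the memory module
# ASSUMPTION: arcs below are a simplified single-module reading of the text.
TRANSITIONS = {
    "t1": ({"P1": 1}, {"P3": 1}),           # timed: page miss
    "t2": ({"P3": 1}, {"P4": 1}),           # immediate: choose memory module
    "t3": ({"P4": 1, "P2": 1}, {"P5": 1}),  # immediate: acquire a data bus
    "t4": ({"P5": 1}, {"P1": 1, "P2": 1}),  # timed: memory access completes
}

def enabled(marking, name):
    """A transition is enabled when every input place holds enough tokens."""
    inputs, _ = TRANSITIONS[name]
    return all(marking.get(place, 0) >= n for place, n in inputs.items())

def fire(marking, name):
    """Return the new marking after firing an enabled transition."""
    if not enabled(marking, name):
        raise ValueError(f"{name} is not enabled")
    inputs, outputs = TRANSITIONS[name]
    new = dict(marking)
    for place, n in inputs.items():
        new[place] -= n
    for place, n in outputs.items():
        new[place] = new.get(place, 0) + n
    return new

# Hypothetical initial marking: two processors and one data bus.
initial = {"P1": 2, "P2": 1, "P3": 0, "P4": 0, "P5": 0}

m = initial
for t in ["t1", "t2", "t3"]:
    m = fire(m, t)
busy = m            # one processor now holds the bus in P5
m = fire(m, "t4")   # processor and bus tokens return to P1 and P2
print(busy)
print(m == initial)
```

A sketch like this only checks that the tokens circulate as described; reproducing the effective processor power would require attaching exponentially distributed firing delays to t1 and t4 and simulating over many cycles.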