1 eee/dsdca/sv
Parallel processing
Increases throughput
Pipeline processing
Applies to any operation that can be decomposed into a
sequence of suboperations
Application
Repeating the same task many times
[Figure: four-segment pipeline. Labels: input, Segment 1 to Segment 4, latches, H1 to H4, clock]
To complete 4 instructions on a 4-segment pipeline
Non-pipelined processor: 4 x 4 = 16 clock cycles
Pipelined processor: 4 (segments) + 4 (tasks) - 1 = 7 clock cycles
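The cycle counts above can be checked with a short sketch. The function names are illustrative and not from the slides, and the sketch assumes every segment takes exactly one clock cycle.

```python
def non_pipelined_cycles(tasks: int, segments: int) -> int:
    """Each task passes through every segment before the next task starts."""
    return tasks * segments


def pipelined_cycles(tasks: int, segments: int) -> int:
    """The first task needs `segments` cycles; each later task finishes one cycle after."""
    if tasks == 0:
        return 0
    return segments + tasks - 1


print(non_pipelined_cycles(4, 4))  # 16
print(pipelined_cycles(4, 4))      # 7
```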
Overlapped execution of
four tasks using a pipeline
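One way to visualise the overlap is a text space-time diagram, sketched here under the assumption that each task spends exactly one clock cycle in each segment. The function name and the `S`/`T` labels are hypothetical.

```python
def space_time_diagram(tasks: int, segments: int) -> list:
    """Return one text row per segment showing which task occupies it in each cycle."""
    total_cycles = segments + tasks - 1
    rows = []
    for s in range(segments):
        cells = []
        for c in range(total_cycles):
            t = c - s  # task index found in segment s during cycle c
            cells.append(f"T{t + 1}" if 0 <= t < tasks else "--")
        rows.append(f"S{s + 1}: " + " ".join(cells))
    return rows


for row in space_time_diagram(4, 4):
    print(row)
```

Each row is a segment and each column a clock cycle, so the diagonal bands show four tasks in flight at once and all of them finishing within 7 cycles.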
Let
Ti - propagation delay of segment i
TL - propagation delay of a latch
T - pipeline clock period
T = max(T1, T2, ..., Tn) + TL
The segment with the maximum delay (the bottleneck) determines the
pipeline clock period T
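As a minimal sketch of the clock-period rule, the snippet below uses made-up delay values (not taken from the slides) and a hypothetical helper name.

```python
def pipeline_clock_period(segment_delays, latch_delay):
    """T = max(T1, ..., Tn) + TL: the slowest segment plus the latch delay."""
    return max(segment_delays) + latch_delay


# Illustrative values in nanoseconds (assumed for this sketch only).
delays_ns = [30, 50, 20]
latch_ns = 5

period_ns = pipeline_clock_period(delays_ns, latch_ns)
bottleneck = delays_ns.index(max(delays_ns)) + 1  # 1-based segment number

print(period_ns)   # 55
print(bottleneck)  # 2
```

Speeding up any segment other than the bottleneck leaves T unchanged, which is why balanced segment delays matter.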
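Consider the execution of m tasks using an n-segment pipeline.
The first task takes n pipeline clocks to complete; the remaining
m-1 tasks are shipped out at the rate of one task per pipeline clock.
Pipelined time = (n + m - 1) T; non-pipelined time = m n T.
Speedup
$$S(n) = \frac{m\,n\,T}{(n + m - 1)\,T} = \frac{m\,n}{n + m - 1}$$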
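The speedup ratio can be evaluated exactly with a small sketch. The function name is an assumption, and the formula used is non-pipelined time m·n·T divided by pipelined time (n + m - 1)·T, so T cancels.

```python
from fractions import Fraction


def speedup(m: int, n: int) -> Fraction:
    """Speedup of an n-segment pipeline over a non-pipelined unit for m tasks."""
    return Fraction(m * n, n + m - 1)


# The 4-task, 4-segment case: 16 cycles versus 7 cycles.
print(speedup(4, 4))         # 16/7
print(float(speedup(4, 4)))
```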
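As the number of tasks m grows, the speedup S(n) approaches n:
$$\lim_{m \to \infty} S(n) = \lim_{m \to \infty} \frac{m\,n}{n + m - 1} = n$$
Throughput
$$w = \frac{m}{(n + m - 1)\,T} = \frac{\eta}{T}$$
where the efficiency is
$$\eta = \frac{S(n)}{n} = \frac{m}{n + m - 1}$$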
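A quick numerical sketch shows the limiting behaviour and the link between throughput and efficiency. The function names are illustrative, and the formulas assumed are speedup = m·n/(n + m - 1), efficiency = m/(n + m - 1), and throughput = m/((n + m - 1)·T).

```python
def speedup(m: int, n: int) -> float:
    return m * n / (n + m - 1)


def efficiency(m: int, n: int) -> float:
    """Fraction of the ideal speedup n that is actually achieved."""
    return m / (n + m - 1)


def throughput(m: int, n: int, clock_period: float) -> float:
    """Tasks completed per unit time; equals efficiency / clock_period."""
    return m / ((n + m - 1) * clock_period)


n = 4
for m in (1, 10, 100, 10_000):
    # Speedup climbs towards n and efficiency towards 1 as m grows.
    print(m, round(speedup(m, n), 3), round(efficiency(m, n), 3))
```

For small m the pipeline spends most of its time filling and draining, which is why the efficiency starts at 1/n for a single task.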
Example
For a 4-segment floating-point pipeline, the
propagation delays of the segments are given
below.
Assume T1 = 40 ns, T2 = 100 ns, T3 = 180 ns, T4 = 60 ns,
TL = 20 ns
a) Determine the pipeline clock rate
b) Find the time taken to add 1000 pairs of floating-point
numbers using this pipeline
c) What is the efficiency of the pipeline when 2000
pairs of numbers are added?
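Parts a) and b) can be worked through with the sketch below. It assumes the clock period is the largest segment delay plus the latch delay, and that m tasks on an n-segment pipeline take (n + m - 1) clock periods. The variable names are illustrative.

```python
segment_delays_ns = [40, 100, 180, 60]
latch_delay_ns = 20
n = len(segment_delays_ns)

# a) Clock period is set by the slowest segment (T3) plus the latch delay.
clock_period_ns = max(segment_delays_ns) + latch_delay_ns
clock_rate_mhz = 1e3 / clock_period_ns  # 1 / (period in ns) expressed in MHz

# b) Time to push m = 1000 pairs through the pipeline.
m = 1000
total_time_ns = (n + m - 1) * clock_period_ns

print(clock_period_ns)  # 200
print(clock_rate_mhz)   # 5.0
print(total_time_ns)    # 200600
```

Under these assumptions the clock rate is 5 MHz and the 1000 additions take 200.6 microseconds.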
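Efficiency for m = 2000 pairs on the n = 4 segment pipeline:
$$\eta = \frac{S(n)}{n} = \frac{m}{n + m - 1} = \frac{2000}{4 + 2000 - 1} = \frac{2000}{2003} \approx 0.9985 = 99.85\%$$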
Thank you