Académique Documents
Professionnel Documents
Culture Documents
Compare
Direct-mapped cache:
each address maps to
a unique address
Tag array
Data array
10100000
Tag
Tag array
32-byte cache
line size or
block size
Offset
Data array
Associativity
Byte address
10100000
Tag
Tag array
Way-1
Compare
Way-2
Data array
Problem 2
Assume a direct-mapped cache with just 4 sets. Assume
that block A maps to set 0, B to 1, C to 2, D to 3, E to 0, and
so on. For the following access pattern, estimate the hits
and misses:
AB B E C CAD B FAE G C G A
Problem 2
Assume a direct-mapped cache with just 4 sets. Assume
that block A maps to set 0, B to 1, C to 2, D to 3, E to 0, and
so on. For the following access pattern, estimate the hits
and misses:
AB B E C CAD B FAE G C G A
M MH MM H MM HM HMM M M M
Problem 3
Assume a 2-way set-associative cache with just 2 sets.
Assume that block A maps to set 0, B to 1, C to 0, D to 1,
E to 0, and so on. For the following access pattern,
estimate the hits and misses:
AB B E C CAD B F AE G C G A
Problem 3
Assume a 2-way set-associative cache with just 2 sets.
Assume that block A maps to set 0, B to 1, C to 0, D to 1,
E to 0, and so on. For the following access pattern,
estimate the hits and misses:
AB B E C CAD B F AE G C G A
M MH M MH MM HM HMM M H M
Problem 4
64 KB 16-way set-associative data cache array with 64
byte line sizes, assume a 40-bit address
How many sets?
How many index bits, offset bits, tag bits?
How large is the tag array?
10
Problem 4
64 KB 16-way set-associative data cache array with 64
byte line sizes, assume a 40-bit address
How many sets? 64
How many index bits (6), offset bits (6), tag bits (28)?
How large is the tag array (28 Kb)?
11
Problem 5
8 KB fully-associative data cache array with 64
byte line sizes, assume a 40-bit address
How many sets? How many ways?
How many index bits, offset bits, tag bits?
How large is the tag array?
12
Problem 5
8 KB fully-associative data cache array with 64
byte line sizes, assume a 40-bit address
How many sets (1) ? How many ways (128) ?
How many index bits (0), offset bits (6), tag bits (34) ?
How large is the tag array (544 bytes) ?
13
Capacity
Conflict
Increasing cache
capacity
Increasing
number of sets
Increasing block
size
Increasing
associativity
15
16
18
19
Victim Caches
A direct-mapped cache suffers from misses because
multiple pieces of data map to the same location
The processor often tries to access data that it recently
discarded all discards are placed in a small victim cache
(4 or 8 entries) the victim cache is checked before going
to L2
Can be viewed as additional associativity for a few sets
that tend to have the most conflicts
20
Replacement Policies
Pseudo-LRU: maintain a tree and keep track of which
side of the tree was touched more recently; simple bit ops
NRU: every block in a set has a bit; the bit is made zero
when the block is touched; if all are zero, make all one;
a block with bit set to 1 is evicted
DRRIP: use multiple (say, 3) NRU bits; incoming blocks
are set to a high number (say 6), so they are close to
being evicted; similar to placing an incoming block near
the head of the LRU list instead of near the tail
21
Prefetching
Hardware prefetching can be employed for any of the
cache levels
It can introduce cache pollution prefetched data is
often placed in a separate prefetch buffer to avoid
pollution this buffer must be looked up in parallel
with the cache access
Aggressive prefetching increases coverage, but leads
to a reduction in accuracy wasted memory bandwidth
Prefetches must be timely: they must be issued sufficiently
in advance to hide the latency, but not too early (to avoid
22
pollution and eviction before use)
Stream Buffers
Simplest form of prefetch: on every miss, bring in
multiple cache lines
When you read the top of the queue, bring in the next line
Sequential lines
L1
Stream buffer
23
Stride-Based Prefetching
For each load, keep track of the last address accessed
by the load and a possibly consistent stride
FSM detects consistent stride and issues prefetches
incorrect
init
steady
correct
incorrect
(update stride)
correct
PC
correct
state
correct
trans
no-pred
incorrect
(update stride)
incorrect
(update stride)
24
Title
Bullet
25