Chapter 17
S. Dandamudi
Outline
Introduction
How cache memory works
Why cache memory works
Cache design basics
Mapping function
Direct mapping
Associative mapping
Set-associative mapping
Replacement policies
Write policies
Space overhead
2003
S. Dandamudi
To be used with S. Dandamudi, Fundamentals of Computer Organization and Design, Springer, 2003.
Introduction
Memory hierarchy
Registers
Memory
Disk
Introduction (contd)
Important terms
Miss penalty
Hit ratio
Miss ratio = (1 - hit ratio)
Hit time
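The four terms above combine into an average access time. The following is a minimal sketch, assuming the miss penalty is the extra time paid on a miss beyond the hit time; the function names and the sample values are illustrative and do not come from the slides.

```python
def miss_ratio(hit_ratio: float) -> float:
    """Miss ratio is the complement of the hit ratio."""
    return 1.0 - hit_ratio


def average_access_time(hit_time: float, hit_ratio: float, miss_penalty: float) -> float:
    """Average access time = hit time + miss ratio * miss penalty."""
    return hit_time + miss_ratio(hit_ratio) * miss_penalty


# Hypothetical values: 1-cycle hit time, 95% hit ratio, 50-cycle miss penalty.
print(average_access_time(hit_time=1.0, hit_ratio=0.95, miss_penalty=50.0))
```

With these assumed values the average comes to about 3.5 cycles, so a 5% miss ratio already more than triples the 1-cycle hit time.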
[Figure: row-order versus column-order access, plotted against matrix size (500 to 1000)]
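The gap between row-order and column-order access can be illustrated by counting misses in a toy cache model. This is a sketch under stated assumptions, not the experiment behind the figure: the matrix is stored in row-major order, the cache is direct-mapped, and the sizes (64 x 64 elements, 8 elements per block, 32 lines) are hypothetical.

```python
def count_misses(addresses, block_size, num_lines):
    """Count misses in a direct-mapped cache for a sequence of element addresses."""
    lines = [None] * num_lines          # block number currently held by each line
    misses = 0
    for addr in addresses:
        block = addr // block_size      # memory block containing this element
        line = block % num_lines        # the single line this block may occupy
        if lines[line] != block:
            misses += 1
            lines[line] = block
    return misses


def row_order(n):
    """Addresses of an n x n row-major matrix visited row by row."""
    return [i * n + j for i in range(n) for j in range(n)]


def column_order(n):
    """Addresses of the same matrix visited column by column."""
    return [i * n + j for j in range(n) for i in range(n)]


N, BLOCK, LINES = 64, 8, 32             # assumed sizes, chosen for illustration
row_misses = count_misses(row_order(N), BLOCK, LINES)
col_misses = count_misses(column_order(N), BLOCK, LINES)
print(row_misses, col_misses)           # prints "512 4096"
```

In this model row-order access misses once per block, while column-order access misses on every reference because each block is evicted before its neighbouring elements are used.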
Block placement
Mapping function
Block replacement
Write policies
Mapping Function
Determines how memory blocks are mapped to cache lines
Three types
Direct mapping
Specifies a single cache line for each memory block
Set-associative mapping
Specifies a set of cache lines for each memory block
Associative mapping
No restrictions
Any cache line can be used for any memory block
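The three mapping types can be viewed as one scheme with a different number of lines per set. The sketch below assumes modulo placement and numbers the lines of a set contiguously; the function name and both conventions are illustrative assumptions rather than details taken from the slides.

```python
def candidate_lines(block: int, num_lines: int, ways: int) -> list[int]:
    """Cache lines that may hold `block` when each set has `ways` lines.

    ways == 1          -> direct mapping (exactly one line)
    1 < ways < lines   -> set-associative mapping (one set of lines)
    ways == num_lines  -> associative mapping (any line)
    """
    num_sets = num_lines // ways
    set_index = block % num_sets        # assumed modulo placement
    first = set_index * ways            # first line of that set
    return list(range(first, first + ways))


print(candidate_lines(9, num_lines=4, ways=1))  # direct: [1]
print(candidate_lines(9, num_lines=4, ways=2))  # 2-way set-associative: [2, 3]
print(candidate_lines(9, num_lines=4, ways=4))  # associative: [0, 1, 2, 3]
```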
Direct mapping
Reference pattern:
0, 4, 0, 8, 0, 8,
0, 4, 0, 4, 0, 4
Hit ratio = 0%
Direct mapping
Reference pattern:
0, 7, 9, 10, 0, 7,
9, 10, 0, 7, 9, 10
Hit ratio = 67%
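Both direct-mapping hit ratios can be reproduced with a short simulation. The sketch assumes a cache of four lines and modulo placement (line = block mod 4); the cache size is an assumption, since the accompanying figure is not part of this text.

```python
def direct_mapped_hit_ratio(references, num_lines=4):
    """Hit ratio of a direct-mapped cache for a sequence of block numbers."""
    lines = [None] * num_lines          # block held by each cache line
    hits = 0
    for block in references:
        line = block % num_lines        # the only line this block can use
        if lines[line] == block:
            hits += 1
        else:
            lines[line] = block         # miss: load the block, evicting the old one
    return hits / len(references)


pattern_a = [0, 4, 0, 8, 0, 8, 0, 4, 0, 4, 0, 4]
pattern_b = [0, 7, 9, 10, 0, 7, 9, 10, 0, 7, 9, 10]
print(f"{direct_mapped_hit_ratio(pattern_a):.0%}")  # prints "0%"
print(f"{direct_mapped_hit_ratio(pattern_b):.0%}")  # prints "67%"
```

In the first pattern blocks 0, 4, and 8 all compete for line 0, so every reference misses. In the second pattern the four blocks occupy four different lines, so only the first four references miss.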
Associative mapping
Reference pattern:
0, 4, 0, 8, 0, 8,
0, 4, 0, 4, 0, 4
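For comparison, the same reference pattern can be run through a fully associative cache. This sketch assumes four lines and LRU replacement; neither the cache size nor the policy is stated in this text, so the result holds only under those assumptions.

```python
from collections import OrderedDict


def associative_hit_ratio(references, num_lines=4):
    """Hit ratio of a fully associative cache with LRU replacement."""
    cache = OrderedDict()               # block numbers, least recently used first
    hits = 0
    for block in references:
        if block in cache:
            hits += 1
            cache.move_to_end(block)    # mark as most recently used
        else:
            if len(cache) == num_lines:
                cache.popitem(last=False)   # evict the least recently used block
            cache[block] = True
    return hits / len(references)


pattern = [0, 4, 0, 8, 0, 8, 0, 4, 0, 4, 0, 4]
print(f"{associative_hit_ratio(pattern):.0%}")  # prints "75%"
```

Under these assumptions only the first reference to each of the three blocks misses, because all three fit in the cache at once.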
Set-associative mapping
Reference pattern:
0, 4, 0, 8, 0, 8,
0, 4, 0, 4, 0, 4
Hit ratio = 67%
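The 67% figure can be reproduced with a small simulation. The sketch assumes a four-line cache organized as two sets of two lines each, modulo set selection, and LRU replacement within a set; these parameters are assumptions chosen to match the stated hit ratio.

```python
def set_associative_hit_ratio(references, num_lines=4, ways=2):
    """Hit ratio of a set-associative cache with LRU replacement in each set."""
    num_sets = num_lines // ways
    sets = [[] for _ in range(num_sets)]    # each list: LRU block first, MRU last
    hits = 0
    for block in references:
        lines = sets[block % num_sets]      # the set this block maps to
        if block in lines:
            hits += 1
            lines.remove(block)             # re-appended below as most recent
        elif len(lines) == ways:
            lines.pop(0)                    # set full: evict the LRU block
        lines.append(block)
    return hits / len(references)


pattern = [0, 4, 0, 8, 0, 8, 0, 4, 0, 4, 0, 4]
print(f"{set_associative_hit_ratio(pattern):.0%}")  # prints "67%"
```

Blocks 0, 4, and 8 all map to set 0, but with two lines per set block 0 stays resident after its first miss, and only blocks 4 and 8 displace each other.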
Replacement Policies
We invoke the replacement policy when there is no place in the cache to load the memory block
Write Policies
Memory write requires special attention
We have two copies
A memory copy
A cached copy
Figure 17.3a
Example: Pentium
Uses a 32-byte write buffer
Buffer is written at several trigger points
An example trigger point
Buffer full condition
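The buffer-full trigger can be sketched with a toy model. This is not the Pentium's actual mechanism: the class name, the 4-byte write size, and the rule of draining when an incoming write no longer fits are all illustrative assumptions; only the 32-byte capacity comes from the slide.

```python
class WriteBuffer:
    """Toy write buffer that drains to memory when an incoming write does not fit."""

    def __init__(self, capacity_bytes=32):
        self.capacity = capacity_bytes
        self.pending = []               # buffered (address, data) writes
        self.used = 0                   # bytes currently buffered
        self.flushes = 0                # number of drains to memory so far

    def write(self, address, data: bytes):
        if self.used + len(data) > self.capacity:
            self.flush()                # trigger point: buffer is full
        self.pending.append((address, data))
        self.used += len(data)

    def flush(self):
        # A real design would issue the pending writes to memory here.
        self.pending.clear()
        self.used = 0
        self.flushes += 1


buf = WriteBuffer()
for i in range(9):
    buf.write(4 * i, b"\x00" * 4)       # nine hypothetical 4-byte writes
print(buf.flushes, buf.used)            # prints "1 4"
```

The first eight writes fill the buffer; the ninth forces one drain and is then buffered on its own.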
Write-back
Advantage
Reduces write traffic to memory
Disadvantages
Takes longer to load new cache lines
Requires additional dirty bit
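The trade-off can be made concrete by counting memory writes under each policy. The sketch below models a cache with a single line and assumes write-allocate behaviour (a write miss first loads the block); the function name and the operation trace are hypothetical.

```python
def memory_writes(operations, policy):
    """Count memory writes for a one-line cache under the given write policy.

    operations: list of ("r" or "w", block) tuples
    policy: "write-through" or "write-back"
    """
    cached_block = None
    dirty = False                       # the dirty bit used by write-back
    writes = 0
    for op, block in operations:
        if block != cached_block:       # miss: the line must be replaced
            if policy == "write-back" and dirty:
                writes += 1             # write the dirty line back before loading
            cached_block, dirty = block, False
        if op == "w":
            if policy == "write-through":
                writes += 1             # every write also updates memory
            else:
                dirty = True            # defer the memory update
    return writes


ops = [("w", 0)] * 10 + [("r", 1)]      # ten writes to block 0, then a read of block 1
print(memory_writes(ops, "write-through"), memory_writes(ops, "write-back"))  # prints "10 1"
```

Write-back turns ten memory writes into one, but that one write happens when block 1 is loaded, which is why loading a new cache line can take longer.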
Space Overhead
The three mapping functions introduce different space overheads
Overhead increases with increasing degree of associativity
Fewer index bits leave more tag bits to store per line
Several examples in the text
4 GB address space
32 KB cache
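The tag storage for this configuration can be worked out directly. The sketch assumes 32-byte cache lines, which the slide does not specify, and counts only tag bits; valid, dirty, and replacement bits are left out.

```python
def tag_bits(address_bits, cache_bytes, line_bytes, ways):
    """Tag bits per cache line for a cache with `ways` lines per set."""
    num_lines = cache_bytes // line_bytes
    num_sets = num_lines // ways
    offset_bits = line_bytes.bit_length() - 1   # log2 of a power of two
    index_bits = num_sets.bit_length() - 1
    return address_bits - offset_bits - index_bits


ADDRESS_BITS = 32                       # 4 GB address space
CACHE_BYTES = 32 * 1024                 # 32 KB cache
LINE_BYTES = 32                         # assumed line size
NUM_LINES = CACHE_BYTES // LINE_BYTES   # 1024 lines

for name, ways in [("direct", 1), ("4-way", 4), ("associative", NUM_LINES)]:
    print(name, tag_bits(ADDRESS_BITS, CACHE_BYTES, LINE_BYTES, ways))
```

Under these assumptions the tag is 17 bits for direct mapping, 19 bits for 4-way set-associative mapping, and 27 bits for fully associative mapping.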
Capacity misses
Induced due to cache capacity limitation
Can be avoided by increasing cache size
Conflict misses
Due to conflicts caused by direct and set-associative mappings
Can be completely eliminated by fully associative mapping
Also called collision misses
Capacity misses
Reduced by increasing cache size
Law of diminishing returns
Conflict misses
Reduced by increasing degree of associativity
Fully associative mapping: no conflict misses
Types of Caches
Separate instruction and data caches
Initial cache designs used unified caches
Current trend is to use separate caches (for level 1)
Disadvantage
Rigid boundaries between data and instruction caches
Examples
Pentium
L1: 32 KB
L2: up to 2 MB
PowerPC
L1: 64 KB
L2: up to 1 MB
Example Implementations
We look at three processors
Pentium
PowerPC
MIPS
Pentium implementation
Two levels
L1 cache
Split cache design
Separate data and instruction caches
L2 cache
Unified cache design
Write combining
Not cached
Writes are buffered to reduce access to main memory
Useful for video frame buffers
Write back
Uses write-back policy
Writes are delayed as in the write-through mode
Write protected
Inhibits cache writes
Writes are done directly on the memory
Write-back
L1 cache implements a pseudo-LRU replacement policy
Each set maintains seven PLRU bits (B0-B6)
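Seven bits per set are what a binary decision tree over eight lines needs, so one common way to realize such a scheme is tree-based pseudo-LRU. The sketch below shows that general technique for one 8-way set; the bit numbering (B0 at the root, children of node k at 2k+1 and 2k+2) and the meaning of each bit are assumptions, and the processor's actual encoding may differ.

```python
class TreePLRU:
    """Tree-based pseudo-LRU for one 8-way set, using seven bits B0..B6."""

    def __init__(self):
        # bits[k] == 0 means the next victim lies in the left subtree of node k.
        self.bits = [0] * 7

    def touch(self, way):
        """Record an access to `way` (0..7): point each bit on its path away from it."""
        node = 0
        for shift in (2, 1, 0):
            went_right = (way >> shift) & 1
            self.bits[node] = 1 - went_right    # victim is in the other subtree
            node = 2 * node + 1 + went_right

    def victim(self):
        """Follow the bits from the root to the way that should be replaced."""
        node = 0
        way = 0
        for _ in range(3):
            go_right = self.bits[node]
            way = (way << 1) | go_right
            node = 2 * node + 1 + go_right
        return way


plru = TreePLRU()
for w in range(8):
    plru.touch(w)           # access ways 0..7 in order
print(plru.victim())        # prints "0", the least recently used way
plru.touch(0)
print(plru.victim())        # prints "4"
```

After way 0 is touched again the victim becomes way 4 rather than way 1, the true least recently used line. That difference is why the scheme is called pseudo-LRU: it needs only seven bits per set instead of a full ordering of eight lines.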
Location of a block
Depends on the placement policy
Replacement policy
LRU is the most popular
Pseudo-LRU is often implemented
Write policy
Write-through
Write-back
Design Issues
Several design issues
Cache capacity
Law of diminishing returns