
CSCE430/830 Computer Architecture

Memory Hierarchy: Virtual Memory


Lecturer: Prof. Hong Jiang
Courtesy of Yifeng Zhu (U. Maine)
Fall, 2006

CSCE430/830

Portions of these slides are derived from Dave Patterson, UC Berkeley.

Memory: Virtual Memory

Virtual Memory: Definitions


Key idea: simulate a larger physical memory than is actually available.
General approach:
- Break the address space up into pages
- Each program accesses a working set of pages
- Store pages:
  - in physical memory as space permits
  - on disk when no space is left in physical memory
- Access pages using virtual addresses

[Figure: individual pages of virtual memory are mapped by the memory map either to physical memory or to disk.]

Memory: Virtual Memory

Virtual Memory: Paging


Paging permits a program's memory to be physically noncontiguous, so it can be allocated from wherever space is available. This avoids external fragmentation and the need for compaction.
- Frames = physical blocks
- Pages = logical blocks
- The size of frames/pages is defined by the hardware (a power of 2, to simplify address calculations)

An address is determined by a page number (an index into the page table) plus an offset, which maps to a base address (from the table) plus the same offset.

CSCE430/830

Memory: Virtual Memory

Virtual Memory: Paging


Paging Example - 32-byte memory with 4-byte pages

Logical memory (16 bytes, pages 0-3):
  page 0: a b c d
  page 1: e f g h
  page 2: i j k l
  page 3: m n o p

Page table:
  page 0 -> frame 5
  page 1 -> frame 6
  page 2 -> frame 1
  page 3 -> frame 2

Physical memory (32 bytes, frames 0-7):
  frame 0 (addresses  0- 3): -
  frame 1 (addresses  4- 7): i j k l
  frame 2 (addresses  8-11): m n o p
  frame 3 (addresses 12-15): -
  frame 4 (addresses 16-19): -
  frame 5 (addresses 20-23): a b c d
  frame 6 (addresses 24-27): e f g h
  frame 7 (addresses 28-31): -
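The translation in the example above can be sketched in a few lines of Python (a minimal illustration; `page_table` holds the frame numbers from the figure):

```python
PAGE_SIZE = 4  # bytes per page/frame, a power of 2

# Page table from the example: logical page -> physical frame
page_table = [5, 6, 1, 2]

def translate(logical_addr):
    """Split a logical address into (page, offset) and map it to a physical address."""
    page = logical_addr // PAGE_SIZE    # index into the page table
    offset = logical_addr % PAGE_SIZE   # the offset is unchanged by translation
    frame = page_table[page]            # base frame number from the table
    return frame * PAGE_SIZE + offset

# 'a' is at logical address 0 -> page 0, frame 5 -> physical address 20
print(translate(0))   # 20
# 'n' is at logical address 13 -> page 3, frame 2 -> physical address 9
print(translate(13))  # 9
```

Because the page size is a power of 2, hardware performs the divide and modulo simply by splitting the address bits.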

Virtual Address Translation


[Figure: address translation via the page table. A page table register points to the page table. The 32-bit virtual address splits into a 20-bit virtual page number (bits 31-12) and a 12-bit page offset (bits 11-0). The virtual page number indexes the page table; if the entry's valid bit is 0, the page is not present in memory. Otherwise the entry supplies an 18-bit physical page number, which is concatenated with the unchanged page offset to form the 30-bit physical address.]


Virtual Address Translation


What happens during a memory access?
- Map the virtual address into a physical address using the page table
- If the page is in memory: access physical memory
- If the page is on disk: page fault
  - Suspend the program
  - Get the operating system to load the page from disk

The page table is in memory - this slows down access!

A translation lookaside buffer (TLB) is a special cache of translated addresses (it speeds access back up).
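The idea can be sketched in Python (a minimal software model, not a hardware description; a dict stands in for the TLB's associative lookup, and all names here are illustrative):

```python
page_table = {0: 5, 1: 6, 2: 1, 3: 2}  # virtual page -> physical frame
tlb = {}                                # cached translations
tlb_hits = tlb_misses = 0

def lookup(vpage):
    """Return the physical frame for a virtual page, consulting the TLB first."""
    global tlb_hits, tlb_misses
    if vpage in tlb:              # TLB hit: no memory access for the page table
        tlb_hits += 1
        return tlb[vpage]
    tlb_misses += 1               # TLB miss: walk the in-memory page table...
    frame = page_table[vpage]     # (a page fault would be raised here if absent)
    tlb[vpage] = frame            # ...and cache the translation for next time
    return frame

for vpage in [0, 0, 1, 0, 1]:     # locality: repeated pages hit in the TLB
    lookup(vpage)
print(tlb_hits, tlb_misses)       # 3 2
```

Because programs touch the same pages repeatedly (locality), even a small TLB captures most translations.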


TLB Structure
[Figure: TLB structure. Each TLB entry holds a valid bit, a tag (derived from the virtual page number), and a physical page address; the TLB caches a subset of the page table's entries. Each page table entry holds a valid bit and either a physical page number (page resident in physical memory) or a disk address (page on disk storage).]

TLB / Cache Interaction


[Figure: TLB and cache interaction. The 32-bit virtual address splits into a 20-bit virtual page number and a 12-bit page offset. Each TLB entry holds valid and dirty bits, a tag, and a 20-bit physical page number; on a TLB hit the physical page number joins the page offset to form the physical address. The physical address is then split into a 16-bit physical address tag, a 14-bit cache index, and a 2-bit byte offset; a tag match with the valid bit set is a cache hit, and the cache returns the 32-bit data word.]


Techniques for Fast Address Translation


Just like any other cache, the TLB can be organized as fully associative, set associative, or direct mapped.
TLBs are usually small - typically no more than 128-256 entries, even on high-end machines. This permits a fully associative lookup on those machines. Most mid-range machines use small n-way set-associative organizations.
[Figure: translation with a TLB. The CPU issues a virtual address (VA) to the TLB lookup (about 1/2 t in the figure's time units). On a TLB hit, the physical address (PA) goes straight to the cache; a cache hit returns data, a cache miss goes to main memory. On a TLB miss, the full translation through the page table (about 20 t) produces the physical address before the cache access.]

Techniques for Fast Address Translation


Virtual Address and a Cache

[Figure: a physically addressed cache. The CPU issues a virtual address (VA), translation produces the physical address (PA), and only then is the cache accessed; a cache miss goes to main memory, a hit returns data.]

It takes an extra memory access to translate a VA to a PA. This makes cache access very expensive, and this is the "innermost loop" that you want to go as fast as possible.
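A back-of-envelope comparison shows why the TLB matters (the latency and hit-rate numbers below are illustrative assumptions, not figures from these slides):

```python
# Effective time per data access, with and without a TLB (assumed numbers).
mem_access = 100      # ns for one memory access (assumed)
tlb_lookup = 1        # ns for the TLB lookup (assumed; often overlapped)
tlb_hit_rate = 0.98   # typical for workloads with good locality (assumed)

# Without a TLB: every access pays a page-table read plus the access itself.
no_tlb = 2 * mem_access

# With a TLB: hits pay only the lookup; misses additionally walk the page table.
with_tlb = tlb_lookup + mem_access + (1 - tlb_hit_rate) * mem_access

print(no_tlb, with_tlb)  # 200 103.0
```

Even a modest hit rate nearly halves the effective access time, which is why every modern processor includes a TLB.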


Virtual Memory Design

- Page size: 8 KB
- TLB: direct mapped, 256 entries
- L1 cache: direct mapped, 8 KB
- L2 cache: direct mapped, 4 MB
- L1 and L2 use 64-byte blocks
- The virtual address is 64 bits
- The physical address is 41 bits

Please show the overall picture of the memory hierarchy.
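As a starting point, the address-field widths implied by these parameters can be worked out directly (a sketch of the arithmetic only; the full picture also shows how the structures connect):

```python
from math import log2

page_size = 8 * 1024          # 8 KB pages
block_size = 64               # L1/L2 block size in bytes
va_bits, pa_bits = 64, 41     # virtual / physical address widths

page_offset = int(log2(page_size))            # 13 bits, untranslated
vpn_bits = va_bits - page_offset              # 51-bit virtual page number
ppn_bits = pa_bits - page_offset              # 28-bit physical page number

tlb_entries = 256
tlb_index = int(log2(tlb_entries))            # 8 bits (direct-mapped TLB)
tlb_tag = vpn_bits - tlb_index                # 43 bits

l1_size, l2_size = 8 * 1024, 4 * 1024 * 1024
block_offset = int(log2(block_size))          # 6 bits
l1_index = int(log2(l1_size // block_size))   # 128 blocks -> 7 bits
l1_tag = pa_bits - l1_index - block_offset    # 28 bits
l2_index = int(log2(l2_size // block_size))   # 65536 blocks -> 16 bits
l2_tag = pa_bits - l2_index - block_offset    # 19 bits

print(page_offset, vpn_bits, tlb_tag, l1_tag, l2_tag)  # 13 51 43 28 19
```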


Virtually Indexed, Physically Tagged Cache


What is the motivation?
- Fast cache hits, by accessing the TLB in parallel with the cache index
- None of the shortcomings of a virtual cache

How could it be correct?
- Require cache way size <= page size; the physical index then comes entirely from the page offset
- The virtual and physical indices are then identical - it works like a physically indexed cache!
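The way-size condition can be checked against the parameters from the earlier design slide (a quick sanity check; the cache and page sizes are the ones given there):

```python
from math import log2

# "Way size <= page size": if cache index + block offset fit inside the page
# offset, the index uses only untranslated bits, so cache indexing can proceed
# in parallel with the TLB lookup.
page_size = 8 * 1024    # 8 KB pages -> 13 untranslated offset bits
cache_size = 8 * 1024   # direct-mapped L1 from the design slide
ways = 1                # direct mapped = 1 way

way_size = cache_size // ways
index_plus_offset = int(log2(way_size))   # bits needed to pick set + byte

print(index_plus_offset <= int(log2(page_size)))  # True: 13 <= 13
```

If the L1 were doubled to 16 KB while staying direct mapped, the check would fail; the usual fix is to increase associativity so each way still fits in a page.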


Virtual Memory and Protection


Important function of virtual memory: protection
- Allows sharing of a single main memory by multiple processes
- Provides each process with its own address space
- Protects each process from memory accesses by other processes

Basic mechanism: two modes of operation
- User mode - allows access only to the user address space
- Supervisor (kernel) mode - allows access to the OS address space

A system call allows the processor to change mode.


Virtual Memory
Crosscutting Issues: The Design of Memory Hierarchies
Superscalar CPU and number of ports to the cache:
The cache must provide sufficient peak bandwidth to benefit from multiple issue. Some processors increase the complexity of instruction fetch by allowing instructions to begin on any boundary instead of, say, multiples of 4 words.

Speculative execution and the memory system:
Speculative and conditional instructions generate exceptions (by generating invalid addresses) that would otherwise not occur, which in turn can overwhelm the benefits of speculation with exception-handling overhead. Such CPUs must be matched with non-blocking caches and should only speculate on L1 misses (due to the unbearable penalty of L2 misses).

Combining the instruction cache with instruction fetch and decode mechanisms:
Increasing demand for ILP and clock rate has led to merging the first part of instruction execution with the instruction cache, by incorporating a trace cache (which combines branch prediction with instruction fetch) and storing the internal RISC operations in the trace cache (e.g., the Pentium 4's NetBurst microarchitecture). A hit in the merged cache saves a portion of the instruction execution cycles.

Virtual Memory
Crosscutting Issues: The Design of Memory Hierarchies
Embedded computer caches and real-time performance:
In real-time applications, variation in performance matters much more than average performance. Thus, caches that offer average performance enhancement have to be used carefully. Instruction caches are often used due to the high predictability of instruction streams, whereas data caches are locked down, forcing them to act as small scratchpad memories under program control.

Embedded computer caches and power:
It is much more power-efficient to access on-chip memory than off-chip memory (which requires driving the pins and buses and activating external memory chips). Other techniques, such as way prediction, can also save power (by powering only half of a two-way set-associative cache).

I/O and consistency of cached data:
The cache coherence problem must be addressed when I/O devices also share cached data.


Summary - Virtual Memory


- Bottom level of the memory hierarchy for programs
- Used in all general-purpose architectures
- Relies heavily on the OS for support

