CSCE430/830
[Figure: virtual memory overview — a memory map translates individual virtual pages into frames in physical memory or locations on disk.]
Memory: Virtual Memory
The size of frames/pages is defined by the hardware (a power of 2, to ease address calculations).
An address is determined by:
page number (index into the page table) + offset
---> mapping into --->
frame base address (from the page table) + offset.
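Because the page size is a power of 2, this split is just a shift and a mask. A minimal sketch in Python, assuming a hypothetical 4 KiB page size and a made-up page table:

```python
# Hypothetical 4 KiB pages: the split into page number and offset is a
# shift and a mask -- no division needed, which is why hardware uses
# power-of-2 page sizes.
PAGE_SIZE = 4096                           # 2**12, assumed for illustration
OFFSET_BITS = PAGE_SIZE.bit_length() - 1   # 12

def translate(virtual_addr, page_table):
    page_number = virtual_addr >> OFFSET_BITS      # index into the table
    offset = virtual_addr & (PAGE_SIZE - 1)        # low 12 bits
    base = page_table[page_number] << OFFSET_BITS  # frame base address
    return base + offset

# Made-up page table: page 0 -> frame 3, page 1 -> frame 7
table = {0: 3, 1: 7}
print(hex(translate(0x1ABC, table)))   # page 1, offset 0xABC -> 0x7ABC
```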
[Figure: paging example — a 16-byte logical memory holding bytes a–p on four 4-byte pages is mapped through a page table (page 0 → frame 5, page 1 → frame 6, page 2 → frame 1, page 3 → frame 2) into a 32-byte physical memory.]
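The mapping in this example can be traced in a short Python sketch, using the page table and 4-byte pages from the figure:

```python
# The 16-byte logical memory a..p, 4-byte pages, and the page table
# from the figure: page 0 -> frame 5, 1 -> frame 6, 2 -> frame 1, 3 -> frame 2.
PAGE_SIZE = 4
page_table = [5, 6, 1, 2]

logical = "abcdefghijklmnop"
physical = [None] * 32                       # 8 frames of 4 bytes each
for addr, byte in enumerate(logical):        # lay each page into its frame
    frame = page_table[addr // PAGE_SIZE]
    physical[frame * PAGE_SIZE + addr % PAGE_SIZE] = byte

# Logical address 5 ('f') is on page 1, offset 1 -> frame 6, offset 1,
# i.e., physical address 6*4 + 1 = 25.
print(physical[20:24])   # frame 5 holds page 0: ['a', 'b', 'c', 'd']
```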
[Figure: virtual-to-physical address translation — a 32-bit virtual address is split into a 20-bit virtual page number (bits 31–12) and a 12-bit page offset (bits 11–0). The virtual page number indexes the page table, whose entries hold a valid bit and an 18-bit physical page number; if the valid bit is 0, the page is not present in memory. The physical page number concatenated with the page offset forms the 30-bit physical address.]
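A sketch of this translation with the 20/12-bit split from the figure (the function name and the fault handling are illustrative, not from the slides):

```python
# Translation per the figure: a 32-bit VA = 20-bit virtual page number +
# 12-bit page offset; each page-table entry holds a valid bit and a
# physical page number. A clear valid bit means the page is not in memory.

def va_to_pa(va, page_table):
    vpn, offset = va >> 12, va & 0xFFF
    valid, ppn = page_table[vpn]
    if not valid:
        # In real hardware this raises a page-fault exception handled by the OS.
        raise RuntimeError("page fault: page %d not in memory" % vpn)
    return (ppn << 12) | offset

table = {0x12345: (1, 0xABC)}              # one resident page, made-up numbers
print(hex(va_to_pa((0x12345 << 12) | 0x678, table)))   # -> 0xabc678
```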
TLB Structure
[Figure: the TLB holds a small set of recent translations (valid bit, tag, physical page address) indexed by virtual page number; on a TLB miss, the full page table is consulted, each of whose entries holds a valid bit and either a physical page number or a disk address for pages not in physical memory.]
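A toy model of the TLB's role, with made-up names and a FIFO replacement policy (a real TLB is a small associative hardware structure, not a dictionary):

```python
# Toy TLB: a small cache of recent VPN -> PPN translations, backed by
# the full page table on a miss. FIFO replacement is assumed here for
# simplicity; real TLBs typically use (pseudo-)LRU or random replacement.

class TLB:
    def __init__(self, page_table, capacity=16):
        self.page_table = page_table   # the full translation
        self.capacity = capacity
        self.entries = {}              # vpn -> ppn, the cached subset
        self.hits = self.misses = 0

    def lookup(self, vpn):
        if vpn in self.entries:        # TLB hit: no page-table access
            self.hits += 1
            return self.entries[vpn]
        self.misses += 1               # TLB miss: walk the page table
        ppn = self.page_table[vpn]
        if len(self.entries) >= self.capacity:
            self.entries.pop(next(iter(self.entries)))  # evict oldest entry
        self.entries[vpn] = ppn
        return ppn
```

Repeated lookups of the same page hit in the TLB and skip the page table, which is the whole point of the structure.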
[Figure: TLB and cache access in series — the 20-bit virtual page number is compared against the TLB tags (each TLB entry carries valid and dirty bits); on a TLB hit, the 20-bit physical page number is concatenated with the 12-bit page offset to form the physical address, which is then split into a cache tag, index, and 2-bit byte offset and looked up in the cache, producing a cache hit and 32 bits of data.]
Translation with a TLB
[Figure: the CPU issues a virtual address (VA); the TLB lookup (about 1/2 t) yields the translation on a hit, while a miss requires a page-table walk; the translated address is then presented to the cache, and a cache miss goes on to main memory (about 20 t).]
[Figure: address translation without a TLB — the CPU issues a virtual address (VA), which must first be translated to a physical address (PA); only then can the cache be accessed, with a cache miss going to main memory.]
It takes an extra memory access to translate a VA to a PA. This makes every cache access very expensive, and this is the "innermost loop" that you want to go as fast as possible.
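A back-of-envelope comparison makes the cost concrete. The timings below are illustrative assumptions (not from the slides), in the spirit of the 1/2 t TLB and 20 t memory annotations:

```python
# Illustrative access times (assumed, not measured): without a TLB,
# every access pays a full page-table read from memory before the cache
# can even be consulted.
cache_t = 1.0    # assumed cache access time
mem_t   = 20.0   # assumed main-memory access time
tlb_t   = 0.5    # assumed TLB lookup time
tlb_hit = 0.98   # assumed TLB hit rate

no_tlb   = mem_t + cache_t   # translate via page table, then access cache
with_tlb = tlb_t + tlb_hit * cache_t + (1 - tlb_hit) * (mem_t + cache_t)
print(no_tlb, round(with_tlb, 2))   # 21.0 vs 1.9
```

Even with these rough numbers, a high TLB hit rate cuts the effective access time by roughly an order of magnitude.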
Virtual Memory
Crosscutting Issues: The Design of Memory Hierarchies
Superscalar CPU and Number of Ports to the Cache
The cache must provide sufficient peak bandwidth to benefit from multiple issue. Some processors increase the complexity of instruction fetch by allowing the group of instructions issued together to start on any boundary instead of, say, a multiple of 4 words.
Speculative Execution and the Memory System
Speculative and conditional instructions generate exceptions (by generating invalid addresses) that would otherwise not occur, and the exception-handling overhead can overwhelm the benefits of speculation. Such CPUs must be matched with non-blocking caches and should speculate only past L1 misses (the penalty of stalling for L2 is too large to bear).
Combining Instruction Cache with Instruction Fetch and Decode
Mechanisms
Increasing demand for ILP and clock rate has led to merging the first part of instruction execution with the instruction cache, by incorporating a trace cache (which combines branch prediction with instruction fetch) and storing the decoded internal RISC operations in the trace cache (e.g., the Pentium 4's NetBurst microarchitecture). A hit in the merged cache saves a portion of the instruction execution cycle.
Embedded Computer Caches and Real-Time Performance
In real-time applications, variation in performance matters much more than average performance, so caches that offer only an average performance enhancement have to be used carefully. Instruction caches are often used because of the high predictability of instruction streams, whereas data caches are locked down, forcing them to act as small scratchpad memories under program control.
Embedded Computer Caches and Power
It is much more power-efficient to access on-chip memory than off-chip memory (which requires driving the pins and buses and activating external memory chips). Other techniques, such as way prediction, can also be used to save power (e.g., by powering only half of a two-way set-associative cache).
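A toy sketch of way prediction for a two-way set-associative cache, counting probed ways as a rough proxy for energy (all names and the last-used prediction policy are illustrative assumptions):

```python
# Way-prediction sketch: keep one predicted way per set and probe only
# that way first; the second way is powered up only on a mispredict.
# Probing one way instead of two is where the power saving comes from.

class WayPredictedCache:
    def __init__(self, num_sets):
        self.tags = [[None, None] for _ in range(num_sets)]  # 2 ways per set
        self.predicted = [0] * num_sets    # last-used way per set (assumed policy)
        self.ways_probed = 0               # proxy for energy spent

    def access(self, set_index, tag):
        ways = self.tags[set_index]
        guess = self.predicted[set_index]
        self.ways_probed += 1              # power only the predicted way
        if ways[guess] == tag:
            return True                    # hit on the first probe
        other = 1 - guess
        self.ways_probed += 1              # mispredict: probe the other way
        if ways[other] == tag:
            self.predicted[set_index] = other
            return True
        ways[other] = tag                  # miss: fill and predict that way
        self.predicted[set_index] = other
        return False
```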
I/O and Consistency of Cached Data
The cache coherence problem must be addressed when I/O devices also share the same cached data.