
CHAPTER 3: MEMORY ORGANIZATION

3.1 COMPUTER MEMORY SYSTEM OVERVIEW


Characteristics of memory systems
Location
Capacity
Unit of transfer
Access method
Performance
Physical type
Physical characteristics
Organization

i. Location
Internal memory is often equated with main memory.
The processor requires its own local memory in the form of registers (e.g.,
PC, IR, ID).
The control unit portion of the processor may also require its own internal
memory.
Cache is another form of internal memory.
External memory consists of peripheral storage devices (e.g., disk, tape) that
are accessible to the processor via I/O controllers.

ii. Capacity
For internal memory, capacity is typically expressed in terms of bytes or words.
Common word lengths are 8, 16, and 32 bits.
External memory capacity is typically expressed in terms of bytes (e.g., 1 GB,
20 MB).

iii. Unit of transfer


For internal memory, the unit of transfer is equal to the number of data lines of the
memory module. This may be equal to the word length, but is often larger, such as
64, 128, or 256 bits. Consider 3 related concepts for internal memory:
Word:
-The natural unit of organization of memory.
-The size of a word is typically equal to the number of bits used to represent
an integer and to the instruction length.
-The Intel x86 architecture has a wide variety of instruction lengths,
expressed as multiples of bytes, and a word size of 32 bits.
Addressable units:
-Usually bytes, but can be words. N = 2^A, where N is the number of
addressable units and A is the number of bits in an address (a worked
sketch follows this list).

Unit of transfer:
- The number of bits read out of or written into memory at a time. On the
Pentium, 64 bits.
- For external memory, data are often transferred in much larger units than
a word; these are referred to as blocks.
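The relationship between address bits and addressable units can be checked with a short program. Below is a minimal C sketch; the 16-bit address width is an assumed example value, not a figure from these notes:

```c
#include <stdio.h>

int main(void) {
    /* A = number of bits in an address (assumed example value) */
    unsigned A = 16;
    /* N = 2^A addressable units (bytes, on a byte-addressable machine) */
    unsigned long long N = 1ULL << A;
    printf("%u address bits -> %llu addressable units\n", A, N);
    return 0;
}
```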

iv. Access method


Sequential:
-Start at the beginning and read through in order.
-Access time depends on location of data and previous location.
-E.g. tape

Direct:
-Individual blocks have unique address.
-Access time depends on location of data and previous location.
-E.g. disk

Random:
-Individual addresses identify locations exactly.
-Access time independent of location or previous access.
-E.g. RAM, cache

Associative:
-Data is located by a comparison with contents of a portion of the store.
-Access time is independent of location or previous access.
-E.g. cache

v. Performance
Three performance parameters are used:
Access time (latency): for random-access memory, this is the time it takes
to perform a read or write operation, that is, the time from the instant that
an address is presented to the memory to the instant that data have been
stored or made available for use. For non-random-access memory, it is the
time it takes to position the read/write mechanism at the desired location.
Memory cycle time: applied to random-access memory, this consists of the
access time plus any additional time required before a second access can
commence. This additional time may be required for transients to die out
on signal lines or to regenerate data if they are read destructively.
Transfer rate: the rate at which data can be transferred into or out of a
memory unit.
Transfer rate calculation
For random-access memory:
-The transfer rate is 1/(cycle time).

For non-random-access memory:
-Tn = TA + N/R

Where:
Tn = average time to read or write N bits;
TA = average access time;
N = number of bits;
R = transfer rate (bps)
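As a worked example of Tn = TA + N/R, the C sketch below computes the average time to read one block from a non-random-access device; the access time, transfer rate, and block size are assumed illustrative values, not figures from these notes:

```c
#include <stdio.h>

int main(void) {
    /* Assumed example values for a disk-like device: */
    double TA = 0.010;       /* average access (positioning) time: 10 ms */
    double R  = 8e6;         /* transfer rate: 8 Mbit/s                  */
    double N  = 4096 * 8;    /* number of bits: one 4 KB block           */

    double Tn = TA + N / R;  /* Tn = TA + N/R                            */
    printf("Average time to read %.0f bits: %.6f s\n", N, Tn);
    return 0;
}
```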

vi. Physical types of memory


Semiconductor (RAM)
Magnetic (Disk & tape)
Optical (CD & DVD)
Magneto-optical

vii. Physical characteristic


Volatile: information decays naturally or is lost when electrical power is
switched off.
Nonvolatile: information once recorded remains without deterioration until
deliberately changed, and no electrical power is needed to retain the
information. E.g. magnetic-surface memory.
Nonerasable: cannot be altered, except by destroying the storage unit. E.g.
ROM.
Power consumption

THE MEMORY HIERARCHY


There is a trade-off among the 3 key characteristics of memory (cost, capacity and access time):
Faster access time, greater cost per bit
Greater capacity, smaller cost per bit
Greater capacity, slower access time

Registers
In CPU
Internal or Main memory
May include one or more levels of cache
RAM
External memory
Backing store
A typical hierarchy is illustrated in Figure 3.1. As one goes down the hierarchy, the
following occur:
a) Decreasing cost per bit;
b) Increasing capacity;
c) Increasing access time;
d) Decreasing frequency of access of the memory by the processor.

Figure 3.1: The memory hierarchy

Locality of Reference

Two or more levels of memory can be used to produce an average access time
approaching that of the highest (fastest) level.
The reason this works well is called locality of reference.
In practice, memory references (both instructions and data) tend to cluster.
- Instructions: iterative loops and repetitive subroutine calls.
- Data: tables, arrays, etc. Memory references cluster in the short run, as the
sketch below illustrates.
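A minimal C sketch of both kinds of clustering (the array size is an arbitrary choice):

```c
#include <stdio.h>

#define SIZE 1024

int main(void) {
    int table[SIZE];
    long sum = 0;

    /* Spatial locality: consecutive array elements share cache blocks,
       so most references after the first in each block are cache hits. */
    for (int i = 0; i < SIZE; i++)
        table[i] = i;

    /* Temporal locality: the loop instructions and the variable sum are
       referenced repeatedly, so they tend to stay cache-resident. */
    for (int i = 0; i < SIZE; i++)
        sum += table[i];

    printf("sum = %ld\n", sum);
    return 0;
}
```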
3.2 CACHE MEMORY PRINCIPLES
Small amount of fast memory
Sits between normal main memory and CPU
May be located on CPU chip or module

Figure 3.2: Cache and main memory


Principles

1. Intended to give memory speed approaching that of the fastest memories available, but
with a large size, at close to the price of slower memories
2. Cache is checked first for all memory references.
3. If not found, the entire block in which that reference resides in main memory is stored in a
cache slot, called a line.
4. Each line includes a tag (usually a portion of the main memory address) which identifies
which particular block is being stored.
5. Locality of reference implies that future references will likely come from this block of
memory, so that cache line will probably be utilized repeatedly.
6. The proportion of memory references that are found already stored in the cache is called
the hit ratio.
Figure 3.2b depicts the use of multilevel cache memory, with three levels: level 1 (L1),
level 2 (L2), and level 3 (L3). The L2 cache is slower and typically larger than the L1
cache, and the L3 cache is slower and larger than the L2 cache.
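The hit ratio determines the effective access time of a two-level memory. Below is a minimal sketch using the commonly quoted two-level formula Tavg = H*T1 + (1-H)*(T1+T2); all numeric values are assumed for illustration:

```c
#include <stdio.h>

int main(void) {
    /* Assumed illustrative values, not taken from these notes: */
    double H  = 0.95;   /* hit ratio: fraction of references found in cache */
    double T1 = 1.0;    /* cache access time (ns)                           */
    double T2 = 60.0;   /* main memory access time (ns)                     */

    /* On a miss, the reference costs a cache probe plus a memory access. */
    double Tavg = H * T1 + (1.0 - H) * (T1 + T2);
    printf("Average access time: %.2f ns\n", Tavg);
    return 0;
}
```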

Figure 3.3: Cache/main memory structure


Cache view of memory:
n address lines => 2^n words of memory
Cache stores fixed-length blocks of K words
Cache views memory as an array of M blocks, where M = 2^n/K
A block of memory in cache is referred to as a line; K is the line size
Cache size of C lines, where C < M (considerably smaller)
Each line includes a tag that identifies the block being stored
The tag is usually the upper portion of the memory address
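A quick numeric check of these relationships; the address width, line size, and cache size are assumed example parameters:

```c
#include <stdio.h>

int main(void) {
    unsigned n = 24;       /* address lines: 2^24 addressable words */
    unsigned K = 4;        /* line (block) size in words            */
    unsigned C = 16384;    /* cache capacity in lines               */

    unsigned long long words = 1ULL << n;
    unsigned long long M     = words / K;  /* M = 2^n / K blocks    */

    printf("Memory: %llu words = M = %llu blocks of %u words\n", words, M, K);
    printf("Cache holds C = %u lines, far fewer than M\n", C);
    return 0;
}
```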

Cache operation - overview


CPU requests contents of memory location
Check cache for this data
If present, get from cache (fast)
If not present, read required block from main memory to cache
Then deliver from cache to CPU
Cache includes tags to identify which block of main memory is in each cache slot
Figure 3.4: Cache read operation
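The read flow of Figure 3.4 can be sketched in C. The sketch below assumes a small direct-mapped cache; the geometry, the CacheLine type, and the memory array are hypothetical stand-ins, not part of these notes:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_LINES 256
#define WORDS_PER_LINE 4
#define MEM_WORDS 65536

typedef struct {
    bool     valid;
    uint32_t tag;
    uint32_t data[WORDS_PER_LINE];
} CacheLine;

static CacheLine cache[NUM_LINES];
static uint32_t  memory[MEM_WORDS];  /* stand-in for main memory */

uint32_t cache_read(uint32_t addr) {
    uint32_t word = addr % WORDS_PER_LINE;
    uint32_t line = (addr / WORDS_PER_LINE) % NUM_LINES;
    uint32_t tag  = addr / (WORDS_PER_LINE * NUM_LINES);

    /* Hit: the tag matches, so deliver the word straight from the cache. */
    if (cache[line].valid && cache[line].tag == tag)
        return cache[line].data[word];

    /* Miss: copy the whole block into the line, record its tag, then
       deliver the requested word to the CPU. */
    uint32_t base = addr - word;
    for (uint32_t i = 0; i < WORDS_PER_LINE; i++)
        cache[line].data[i] = memory[base + i];
    cache[line].valid = true;
    cache[line].tag   = tag;
    return cache[line].data[word];
}

int main(void) {
    memory[1000] = 42;
    printf("first read:  %u (miss, fills the line)\n", cache_read(1000));
    printf("second read: %u (hit)\n", cache_read(1000));
    return 0;
}
```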

Figure 3.5: Typical cache design


Cache organization:
The preceding diagram illustrates a shared connection between the processor, the cache
and the system bus (look-aside cache).
Another way to organize this system is to interpose the cache between the processor
and the system bus for all lines (look-through cache).
3.3 ELEMENTS OF CACHE DESIGN
CACHE ADDRESSES
1. VIRTUAL MEMORY
- Allows a program to address memory from a logical point of view, without regard
to the amount of main memory physically available.
- For reads from and writes to memory, a hardware memory management unit (MMU)
translates each virtual address into a physical address in main memory

(a) Logical Cache

(b) Physical Cache


Comparison of Cache Sizes

MAPPING FUNCTION
- An algorithm is needed for mapping main memory blocks into cache lines, because
there are fewer cache lines than main memory blocks.
- Three techniques are used:
i. Direct mapping
The simplest technique
Maps each block of main memory into only one possible cache line
ii. Associative mapping
Permits each main memory block to be loaded into any cache line
Cache control logic interprets a memory address simply as a Tag and a
Word field
To determine whether a block is in the cache, the cache control logic
must simultaneously examine every line's tag for a match
iii. Set Associative Mapping
A compromise that exhibits the strengths of both direct and associative
approaches while reducing their disadvantages
The cache consists of a number of sets
Each set contains a number of lines, and a block maps into any line of
a given set (see the address-split sketch after this list)
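A minimal sketch of how a memory address is split under each mapping; the geometry (24-bit addresses, 4-word blocks, 16K lines or 8K two-way sets) is an assumed example:

```c
#include <stdint.h>
#include <stdio.h>

#define WORD_BITS 2   /* 4 words per block         */
#define LINE_BITS 14  /* 2^14 = 16384 lines        */
#define SET_BITS  13  /* 2^13 = 8192 sets, two-way */

int main(void) {
    uint32_t addr = 0x16339C; /* arbitrary 24-bit example address */

    /* Direct mapping: address = | tag | line | word | */
    uint32_t word = addr & ((1u << WORD_BITS) - 1);
    uint32_t line = (addr >> WORD_BITS) & ((1u << LINE_BITS) - 1);
    uint32_t tag  = addr >> (WORD_BITS + LINE_BITS);
    printf("direct:          tag=%u line=%u word=%u\n", tag, line, word);

    /* Associative mapping: address = | tag | word |; the tag is
       compared against every line simultaneously. */
    uint32_t atag = addr >> WORD_BITS;
    printf("associative:     tag=%u word=%u\n", atag, word);

    /* Set-associative mapping: address = | tag | set | word |; the
       block may occupy any line of the selected set. */
    uint32_t set  = (addr >> WORD_BITS) & ((1u << SET_BITS) - 1);
    uint32_t stag = addr >> (WORD_BITS + SET_BITS);
    printf("set associative: tag=%u set=%u word=%u\n", stag, set, word);
    return 0;
}
```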

VICTIM CACHE
- Originally proposed as an approach to reduce the conflict misses of direct-mapped
caches without affecting their fast access time
- Fully associative cache
- Typical size is 4 to 16 cache lines
- Resides between the direct-mapped L1 cache and the next level of memory

REPLACEMENT ALGORITHMS
- Once the cache has been filled, when a new block is brought into the cache, one of
the existing blocks must be replaced
- For direct mapping, there is only one possible line for any particular block, so no
choice is possible
- For the associative and set-associative techniques, a replacement algorithm is needed
- To achieve high speed, the algorithm must be implemented in hardware

COMMON REPLACEMENT ALGORITHM


1. Least recently used (LRU)
- Most effective
- Replace the block in the set that has been in the cache longest with no reference
to it
- Because of its simplicity of implementation, LRU is the most popular replacement
algorithm (a software sketch follows this list)
2. First-in-first-out (FIFO)
- Replace the block in the set that has been in the cache longest
- Easily implemented as a round-robin or circular buffer technique
3. Least frequently used (LFU)
- Replace the block in the set that has experienced the fewest references
- Could be implemented by associating a counter with each line
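Although real caches implement replacement in hardware, LRU is easy to illustrate in software. Below is a minimal sketch for one set of a set-associative cache; the names (touch, lru_victim) and the four-way set size are hypothetical:

```c
#include <stdint.h>
#include <stdio.h>

#define LINES_PER_SET 4

static uint64_t last_used[LINES_PER_SET]; /* per-line reference stamp */
static uint64_t now;                      /* logical clock            */

void touch(int line) {        /* call on every reference to a line */
    last_used[line] = ++now;
}

int lru_victim(void) {        /* pick the line unused the longest */
    int victim = 0;
    for (int i = 1; i < LINES_PER_SET; i++)
        if (last_used[i] < last_used[victim])
            victim = i;
    return victim;
}

int main(void) {
    touch(0); touch(1); touch(2); touch(3);
    touch(1); touch(0);       /* lines 2 and 3 are now the coldest */
    printf("LRU victim: line %d\n", lru_victim()); /* prints 2 */
    return 0;
}
```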

WRITE POLICY
- When a block that is resident in the cache is to be replaced, there are two cases to consider:
1. If the old block in the cache has not been altered, then it may be overwritten with
the new block without first writing out the old block.
2. If at least one write operation has been performed on a word in that line of the
cache, then main memory must be updated by writing the line of cache out to the
block of memory before bringing in the new block.
- There are two problems to contend with:
1. More than one device may have access to main memory
2. A more complex problem occurs when multiple processors are attached to the same
bus and each processor has its own local cache: if a word is altered in one cache, it
could conceivably invalidate a word in other caches
WRITE THROUGH AND WRITE BACK
Write through
- Simplest technique
- All write operations are made to main memory as well as to the cache
- The main disadvantage of this technique is that it generates substantial memory
traffic and may create a bottleneck
Write back
- Minimizes memory writes
- Updates are made only in the cache (a dirty-bit sketch follows below)
- Portions of main memory are invalid, and hence accesses by I/O modules can be
allowed only through the cache
- Makes for complex circuitry and a potential bottleneck
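Write-back bookkeeping is usually implemented with a dirty (use) bit per line. Below is a minimal C sketch, with hypothetical names and a printf standing in for the actual memory write:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical write-back bookkeeping for one cache line: the dirty
   bit records whether the line was modified while cached. */
typedef struct {
    bool     valid;
    bool     dirty;
    uint32_t tag;
    uint32_t data;
} Line;

void cache_write(Line *l, uint32_t value) {
    l->data  = value;
    l->dirty = true;          /* write back: update the cache only */
}

void replace_line(Line *l, uint32_t new_tag, uint32_t new_data) {
    /* Only a dirty line must be written to main memory first. */
    if (l->valid && l->dirty)
        printf("writing back tag %u before replacement\n", l->tag);
    l->tag   = new_tag;
    l->data  = new_data;
    l->valid = true;
    l->dirty = false;
}

int main(void) {
    Line l = { .valid = true, .dirty = false, .tag = 7, .data = 1 };
    cache_write(&l, 99);      /* marks the line dirty            */
    replace_line(&l, 8, 0);   /* triggers the write-back message */
    return 0;
}
```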
LINE SIZE
- When a block of data is retrieved and placed in the cache, not only the desired word
but also some number of adjacent words are retrieved
- As block size increases, the hit ratio will at first increase because of the principle of
locality: more useful data are brought into the cache
- The hit ratio begins to decrease as the block becomes bigger and the probability of
using the newly fetched information becomes less than the probability of reusing the
information that has to be replaced
- Two specific effects come into play:
Larger blocks reduce the number of blocks that fit into the cache
As a block becomes larger, each additional word is farther from the requested word

MULTILEVEL CACHES
- As logic density has increased, it has become possible to have a cache on the same
chip as the processor
- The on-chip cache reduces the processor's external bus activity, speeds up execution,
and increases overall system performance
When a requested instruction or data item is found in the on-chip cache, the bus
access is eliminated
On-chip cache accesses will complete appreciably faster than would even zero-
wait-state bus cycles
During this period the bus is free to support other transfers
- Two-level cache:
Internal cache designated as level 1 (L1)
External cache designated as level 2 (L2)
- The potential savings due to the use of an L2 cache depend on the hit rates in both
the L1 and L2 caches
- The use of multilevel caches complicates all of the design issues related to caches,
including size, replacement algorithm, and write policy
UNIFIED VERSUS SPLIT CACHES
- It has become common to split the cache:
One dedicated to instructions
One dedicated to data
Both exist at the same level, typically as two L1 caches
- Advantages of unified cache:
Higher hit rate
Balances load of instruction and data fetches automatically
Only one cache needs to be designed and implemented
- Trend is toward split caches at L1 and unified caches for higher levels
- Advantages of split cache:
Eliminates cache contention between instruction fetch/decode unit and
execution unit
Important in pipelining

INTEL CACHE EVOLUTION


TUTORIAL
3.1 State the eight key characteristics of computer memory system.
1. Location
Internal (e.g., processor registers, cache, main memory)
External (e.g., optical disks, magnetic disks, tapes)
2. Capacity
Number of words
Number of bytes
3. Unit of Transfer
Word
Block
4. Access Method
Sequential
Direct
Random
Associative
5. Performance
Access time
Cycle time
Transfer rate
6. Physical type
Semiconductor (RAM)
Magnetic (Disk & tape)
Optical (CD & DVD)
Magneto-optical
7. Physical characteristic
Volatile / nonvolatile
Erasable / nonerasable
8. Organization
Memory modules

3.2 Describe any four of these characteristics of memory system.


Organization:
- key design issue for random-access memory.
- refers to the physical arrangement of bits to form words.
- the obvious arrangement is not always used.

Physical types:
- the most common today are semiconductor memory, magnetic surface
memory, used for disk and tape, and optical and magneto-optical.
Physical characteristics:
- in a volatile memory, information decays naturally or is lost when electrical
power is switched off.
- in a nonvolatile memory, information once recorded remains without
deterioration until deliberately changed; no electrical power is needed to
retain information. Magnetic-surface memories are nonvolatile.
- semiconductor memory (memory on integrated circuits) may be either volatile
or nonvolatile.
- nonerasable memory cannot be altered, except by destroying the storage unit.
Semiconductor memory of this type is known as read-only memory (ROM).
- of necessity, a practical nonerasable memory must also be nonvolatile.

Capacity:
- for internal memory, this is typically expressed in term of bytes (1 byte = 8
bits) or words.
- common word lengths are 8, 16, and 32 bits.
- external memory capacity is typically expressed in terms of bytes.

3.7 With the aid of a suitable diagram, describe the factors to be considered when
designing a memory system, in terms of capacity, access time, frequency of access and
cost.

Factors to be considered when designing a memory system are:


1. Faster access time, greater cost per bit
2. Greater capacity, smaller cost per bit
3. Greater capacity, slower access time
As one goes down the hierarchy, the following occur:
a) Decreasing cost per bit
b) Increasing capacity
c) Increasing access time
d) Decreasing frequency of access of the memory by the processor

It can be concluded that smaller, more expensive, faster memories are supplemented by larger,
cheaper, slower memories. The key to the success of this organization is the decreasing
frequency of access. The use of two levels of memory to reduce average access time works in
principle, but only if conditions (a) through (d) apply. Fortunately, condition (d) is generally valid.

3.8 Explain what is meant by the term cache memory, and how it differs from the main
memory.
Cache memory is designed to combine the memory access time of expensive,
high-speed memory with the large memory size of less expensive,
lower-speed memory.
Cache memory differs from main memory in that it contains a copy of
portions of main memory. The cache is closer to the CPU, so it is faster. The
size of cache memory is also smaller than that of main memory.

3.9 Describe the operations of a single and a multiple level cache memory.
A single-level cache memory is faster than a multiple-level one, but it is
smaller in size.
A multiple-level cache memory is a bit slower. It has three levels: level 1
(L1), level 2 (L2), and level 3 (L3). The L2 cache is slower and larger than
the L1 cache, and the L3 cache is slower and larger than the L2 cache.

3.10 State the seven elements of cache design.


1) Cache addressing
2) Cache size
3) Mapping function
4) Replacement algorithm
5) Write policy
6) Block size / line size
7) Number of Caches
3.11 With the aid of suitable diagrams, describe the logical and physical cache addresses.

The cache is between the processor and MMU.


A logical cache stores data using virtual addresses. The processor accesses the
cache directly, without going through the MMU.

The cache is between the main memory and MMU.


A physical cache stores data using main memory physical addresses.
