
Homework 4

CDA 3101: Fall 2017


Due Date: 11/30/2017 11:55 PM

You are not allowed to take or give help in completing this assignment. Submit a typewritten PDF
or Microsoft Word version of your submission on the Sakai website before the deadline. Scanned
handwritten submissions will NOT be accepted. You are given a 24-hour grace period with a 20%
penalty. No submission will be accepted after that. Exceptions will be made for legitimate reasons
communicated to the instructor well ahead of time. The necessary calculation procedure or explanation
MUST be provided in your answer. Submit the PDF version of the homework solution in e-Learning (Canvas).

Total Points: 100 pts

1.) [20 points] (Exercise 5.3) For a direct-mapped cache design with a 32-bit
address, the following bits of the address are used to access the cache.

Tag      Index    Offset
31-10    9-4      3-0

a.) [5 points] What is the cache block size (in words)?

b.) [5 points] How many entries does the cache have?

c.) [10 points] What is the ratio of the total bits required for such a
cache implementation (tag + data) to the data storage bits (data)?
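
The field widths follow directly from the bit ranges, and the quantities asked for in parts (a) to (c) follow from the widths. The snippet below is a minimal sketch of that arithmetic, assuming byte addressing and 4-byte (32-bit) words; the helper name `cache_geometry` is hypothetical, and the ratio counts only tag and data bits, as part (c) states.

```python
# Sketch: derive direct-mapped cache geometry from address field widths.
# Assumptions (not stated in the problem): byte addressing, 4-byte words.
# The function name is hypothetical.

def cache_geometry(tag_bits, index_bits, offset_bits, word_bytes=4):
    block_bytes = 2 ** offset_bits           # bytes selected by the offset field
    block_words = block_bytes // word_bytes  # block size in words
    entries = 2 ** index_bits                # one block per index value
    data_bits = block_bytes * 8              # data storage per entry
    total_bits = data_bits + tag_bits        # tag + data, as in part (c)
    return block_words, entries, total_bits / data_bits

# Field widths from the table above: tag 31-10, index 9-4, offset 3-0.
tag_bits = 31 - 10 + 1
index_bits = 9 - 4 + 1
offset_bits = 3 - 0 + 1

block_words, entries, ratio = cache_geometry(tag_bits, index_bits, offset_bits)
print(block_words, entries, ratio)
```

If a valid bit per entry is also counted, add 1 to `total_bits`; the problem statement names only tag and data.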

2.) [30 points] (Exercise 5.6) In this exercise, we will look at the different ways
capacity affects overall performance. In general, cache access time is proportional
to capacity. Assume that main memory accesses take 70 ns and that memory
accesses are 36% of all instructions. The following table shows data for L1 caches
attached to each of two processors, P1 and P2.

      L1 Size    L1 Miss Rate    L1 Hit Time
P1    2 KiB      7.5%            0.55 ns
P2    4 KiB      7.0%            1.70 ns

a.) [10 points] Assuming that the L1 hit time determines the cycle
times for P1 and P2, what are their respective clock rates?

b.) [10 points] What is the Average Memory Access Time for P1 and
P2?

c.) [10 points] Assuming a base CPI of 1.0 without any memory stalls,
what is the total CPI for P1 and P2? Which processor is faster?
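
The three parts rest on a handful of standard formulas: clock rate is the reciprocal of the cycle time, AMAT is hit time plus miss rate times miss penalty, and total CPI is base CPI plus memory stall cycles per instruction. The sketch below writes them out with hypothetical helper names. The CPI model is one common reading and is an assumption: every instruction performs one fetch plus 0.36 data accesses through L1, and the miss penalty is the 70 ns memory time expressed in (fractional) cycles. Other readings, such as rounding the penalty up to whole cycles, give different numbers.

```python
# Sketch of the standard cache-performance formulas; helper names are
# hypothetical and the CPI model is an assumption (see the text above).

def clock_rate_ghz(hit_time_ns):
    # One cycle equals the L1 hit time, so f = 1 / T (1/ns is GHz).
    return 1.0 / hit_time_ns

def amat_ns(hit_time_ns, miss_rate, mem_time_ns):
    # AMAT = hit time + miss rate * miss penalty
    return hit_time_ns + miss_rate * mem_time_ns

def total_cpi(base_cpi, hit_time_ns, miss_rate, mem_time_ns, data_frac):
    # Assumed model: one instruction fetch plus `data_frac` data accesses
    # per instruction, all going through L1; penalty in fractional cycles.
    penalty_cycles = mem_time_ns / hit_time_ns
    accesses_per_instr = 1.0 + data_frac
    return base_cpi + accesses_per_instr * miss_rate * penalty_cycles

# Values from the table above: (name, L1 hit time in ns, L1 miss rate).
for name, hit, miss in [("P1", 0.55, 0.075), ("P2", 1.70, 0.070)]:
    cpi = total_cpi(1.0, hit, miss, 70.0, 0.36)
    time_per_instr_ns = cpi * hit  # CPI times cycle time
    print(name, clock_rate_ghz(hit), amat_ns(hit, miss, 70.0), cpi, time_per_instr_ns)
```

Comparing time per instruction (CPI times cycle time) rather than CPI alone is what decides which processor is faster, since the two cycle times differ.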

3.) [10 points] (Exercise 5.12) In this exercise, we will examine space/time
optimizations for page tables. The following list provides parameters of a virtual
memory system.

[Parameter table: only the column headings "Virtual Address (bits)" and "Physical DRAM" survive]

a.) [10 points] For a single-level page table, how many page table entries
(PTEs) are needed? How much physical memory is needed for storing the
page table?
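
For a single-level page table, the number of PTEs is the virtual address space divided by the page size, and the table size is that count times the PTE size. The sketch below shows the arithmetic with a hypothetical helper name. The parameters in the example call (32-bit virtual addresses, 4 KiB pages, 4-byte PTEs) are illustrative assumptions only and are NOT the values from this assignment's parameter table.

```python
# Sketch: size of a single-level page table. The function name and the
# example parameters are hypothetical, not taken from the assignment.

def single_level_page_table(va_bits, page_bytes, pte_bytes):
    num_ptes = 2 ** va_bits // page_bytes  # one PTE per virtual page
    table_bytes = num_ptes * pte_bytes     # physical memory for the table
    return num_ptes, table_bytes

# Illustrative values only: 32-bit VA, 4 KiB pages, 4-byte PTEs.
num_ptes, table_bytes = single_level_page_table(32, 4 * 1024, 4)
print(num_ptes, "PTEs,", table_bytes // 2**20, "MiB")
```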

4.) [20 points] (Exercise 5.11) As described in Section 5.7 of the textbook,
virtual memory uses a page table to track the mapping of virtual addresses to
physical addresses. This exercise shows how this table must be updated as
addresses are accessed. The following data constitutes a stream of virtual
addresses as seen on a system. Assume 4 KiB pages, a 4-entry fully associative
TLB, and true LRU replacement. The LRU status at the beginning of the
address sequence is shown in the TLB table, with 0 being the Least Recently
Used and 3 being the Most Recently Used. If pages must be brought in from disk,
increment the next largest page number; i.e., when the page table is accessed,
if that particular page is on disk, the page number would be 13, since the largest
page number in the page table at the beginning is 12.

4669, 2227, 13916, 34587, 48870, 12608, 49225


TLB

Valid    Tag    Physical Page Number    LRU
1        11     12                      1
1        7      4                       2
1        3      6                       3
0        4      9                       0

Page Table

Valid    Physical Page or in Disk
1        5
0        Disk
0        Disk
1        6
1        9
1        11
0        Disk
1        4
0        Disk
0        Disk
1        3
1        12

a.) [20 points] Given the address stream shown, and the initial TLB and
page table states provided above, show the final state of the system.
Also list for each reference if it is a hit in the TLB, a hit in the page
table, or a page fault.
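
The procedure described above can be checked mechanically. The following is a minimal sketch of a simulator, not a definitive model: the function name and data layout are hypothetical, the page table rows are assumed to be virtual pages 0 through 11 in the order listed, and any virtual page without a valid row (including one beyond the listed rows) is treated as being on disk. On a TLB miss, the entry with the lowest LRU rank is refilled.

```python
# Sketch: fully associative TLB with true LRU in front of a page table.
# Names and data layout are hypothetical; see the stated assumptions.

PAGE_SIZE = 4096  # 4 KiB pages

def simulate(addresses, tlb, page_table):
    """tlb: list of dicts with keys valid, tag, ppn, lru (0 = LRU, 3 = MRU).
    page_table: dict of resident virtual page -> physical page number;
    a missing key means the page is on disk."""
    next_ppn = max(page_table.values()) + 1  # next page brought in from disk
    outcomes = []
    for addr in addresses:
        vpn = addr // PAGE_SIZE  # virtual page number
        entry = next((e for e in tlb if e["valid"] and e["tag"] == vpn), None)
        if entry is not None:
            outcomes.append("TLB hit")
        else:
            if vpn in page_table:
                outcomes.append("page table hit")
            else:
                outcomes.append("page fault")
                page_table[vpn] = next_ppn
                next_ppn += 1
            # Refill the TLB entry that currently has the lowest LRU rank.
            entry = min(tlb, key=lambda e: e["lru"])
            entry.update(valid=1, tag=vpn, ppn=page_table[vpn])
        # Promote the touched entry to MRU and shift higher ranks down by one.
        old_rank = entry["lru"]
        for e in tlb:
            if e["lru"] > old_rank:
                e["lru"] -= 1
        entry["lru"] = len(tlb) - 1
    return outcomes

# Initial state copied from the TLB and Page Table listings above.
tlb = [
    dict(valid=1, tag=11, ppn=12, lru=1),
    dict(valid=1, tag=7, ppn=4, lru=2),
    dict(valid=1, tag=3, ppn=6, lru=3),
    dict(valid=0, tag=4, ppn=9, lru=0),
]
rows = [5, None, None, 6, 9, 11, None, 4, None, None, 3, 12]  # None = on disk
page_table = {vpn: ppn for vpn, ppn in enumerate(rows) if ppn is not None}

ADDRESSES = [4669, 2227, 13916, 34587, 48870, 12608, 49225]
outcomes = simulate(ADDRESSES, tlb, page_table)
for addr, outcome in zip(ADDRESSES, outcomes):
    print(addr, addr // PAGE_SIZE, outcome)
```

After the call, `tlb` and `page_table` hold the final state under these assumptions, so they can be compared against a hand-worked answer.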
5.) [10 points] (Exercise 5.13) In this exercise, we will examine how replacement
policies impact miss rate. Assume a 2-way set associative cache with 4 blocks. To
solve the problems in this exercise, you may find it helpful to draw a table like the
one below, as demonstrated for the address sequence 0, 1, 2, 3, 4.

Consider the following address sequence: 0, 2, 4, 8, 10, 12, 14, 16, 0

a.) [10 points] Assuming an LRU replacement policy, how many hits does
this address sequence exhibit?
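
A hand-drawn table can be cross-checked with a short simulation. The sketch below is one possible model, with a hypothetical function name: it assumes the addresses are block addresses (one-word blocks), that the set index is the address modulo the number of sets, and that a 4-block, 2-way cache therefore has 2 sets.

```python
# Sketch: hit counter for a set-associative cache with LRU replacement.
# Assumptions: addresses are block addresses; set index = address mod sets.
from collections import OrderedDict

def count_hits(addresses, num_blocks=4, ways=2):
    num_sets = num_blocks // ways
    # Each set keeps its blocks ordered from least to most recently used.
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = 0
    for addr in addresses:
        s = sets[addr % num_sets]
        if addr in s:
            hits += 1
            s.move_to_end(addr)        # mark as most recently used
        else:
            if len(s) == ways:
                s.popitem(last=False)  # evict the least recently used block
            s[addr] = True
    return hits

print(count_hits([0, 2, 4, 8, 10, 12, 14, 16, 0]))
```

Under this indexing every address in the sequence is even and maps to the same set, which is why the replacement policy dominates the result.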

6.) [10 points] (Exercise 5.17) Cache coherence concerns the views of multiple
processors on a given cache block. The following data shows two processors
and their read/write operations on two different words of a cache block X
(initially X[0] = X[1] = 0). Assume the size of integers is 32 bits.

a.) [10 points] For a snooping protocol, list a valid operation sequence on
each processor/cache to finish the above read/write operations.
