multicore
multiplexor
opcode
original equipment manufacturer (OEM)
program counter (PC)
rate metric
ratio
SPEC
speed metric
stored-program concept
upward compatible
von Neumann machine
wafer
word
Review Questions
2.1
2.2
2.3
2.4
2.5
2.6
Problems
2.1 You are to write an IAS program to compute the results of the following equation.
$$Y = \sum_{X=1}^{N} X$$
Assume that the computation does not result in an arithmetic overflow and that X, Y,
and N are positive integers with N ≥ 1. Note: The IAS did not have assembly language,
only machine language.
a. Use the equation Sum(Y) = N(N+1)/2 when writing the IAS program.
b. Do it the hard way, without using the equation from part (a).
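The two approaches in Problem 2.1 can be compared in a high-level language before committing either to IAS machine code. The following is a minimal Python sketch, not an IAS program; the function names `sum_closed_form` and `sum_by_loop` are illustrative and do not come from the text.

```python
def sum_closed_form(n: int) -> int:
    """Part (a): Y = N(N+1)/2, using integer arithmetic only."""
    return n * (n + 1) // 2


def sum_by_loop(n: int) -> int:
    """Part (b): accumulate X = 1..N one term at a time, as a loop would."""
    total = 0
    for x in range(1, n + 1):
        total += x
    return total


# Both methods should agree for any N >= 1.
for n in (1, 5, 100):
    assert sum_closed_form(n) == sum_by_loop(n)

print(sum_closed_form(100))  # 5050
```

The loop version mirrors what part (b) asks for: a running total held in one location and a counter that is updated on each pass.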
2.2 a. On the IAS, what would the machine code instruction look like to load the
contents of memory address 2 to the accumulator?
b. How many trips to memory does the CPU need to make to complete this instruction during the instruction cycle?
2.3 On the IAS, describe in English the process that the CPU must undertake to read a
value from memory and to write a value to memory in terms of what is put into the
MAR, MBR, address bus, data bus, and control bus.
2.4 Given the memory contents of the IAS computer shown below,

Address    Contents
08A        010FA210FB
08B        010FA0F08D
08C        020FA210FB

show the assembly language code for the program, starting at address 08A. Explain
what this program does.
2.5 In Figure 2.3, indicate the width, in bits, of each data path (e.g., between AC and
ALU).
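For Problem 2.4, a first step is to split each 40-bit memory word into its two instructions. The sketch below assumes the standard IAS word layout: two 20-bit instructions per word, each made up of an 8-bit opcode followed by a 12-bit address. The helper name `split_ias_word` is hypothetical, and the sketch deliberately stops at the field values; mapping opcodes to mnemonics is left to the IAS instruction table.

```python
def split_ias_word(word_hex: str):
    """Split a 40-bit IAS word (10 hex digits) into two (opcode, address) pairs."""
    word = int(word_hex, 16)
    if word >= 1 << 40:
        raise ValueError("an IAS word holds at most 40 bits")

    left = (word >> 20) & 0xFFFFF   # bits 0-19: left instruction
    right = word & 0xFFFFF          # bits 20-39: right instruction

    def fields(instruction: int):
        opcode = (instruction >> 12) & 0xFF   # upper 8 bits
        address = instruction & 0xFFF         # lower 12 bits
        return opcode, address

    return fields(left), fields(right)


for contents in ("010FA210FB", "010FA0F08D", "020FA210FB"):
    (op_l, addr_l), (op_r, addr_r) = split_ias_word(contents)
    print(f"{contents}: left {op_l:02X} {addr_l:03X} | right {op_r:02X} {addr_r:03X}")
```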
distributed arbitration
error control function
execute cycle
fetch cycle
flit
flow control function
instruction cycle
interrupt
interrupt handler
interrupt service routine (ISR)
lane
memory address register (MAR)
Review Questions
3.1
3.2
3.3
3.4
3.5
3.6
3.7
hit ratio
instruction cache
L1 cache
L2 cache
L3 cache
line
locality
logical cache
memory hierarchy
miss
multilevel cache
physical address
physical cache
random access
replacement algorithm
secondary memory
sequential access
set-associative mapping
spatial locality
split cache
tag
temporal locality
unified cache
virtual address
virtual cache
write back
write through
Review Questions
4.1 What are the differences among sequential access, direct access, and random access?
4.2 What is the general relationship among access time, memory cost, and capacity?
4.3 How does the principle of locality relate to the use of multiple memory levels?
4.4 What are the differences among direct mapping, associative mapping, and set-associative mapping?
4.5 For a direct-mapped cache, a main memory address is viewed as consisting of three
fields. List and define the three fields.
4.6 For an associative cache, a main memory address is viewed as consisting of two fields.
List and define the two fields.
4.7 For a set-associative cache, a main memory address is viewed as consisting of three
fields. List and define the three fields.
4.8 What is the distinction between spatial locality and temporal locality?
4.9 In general, what are the strategies for exploiting spatial locality and temporal locality?
Problems
4.1 A set-associative cache consists of 64 lines, or slots, divided into four-line sets. Main
memory contains 4K blocks of 128 words each. Show the format of main memory
addresses.
4.2 A two-way set-associative cache has lines of 16 bytes and a total size of 8 Kbytes. The
64-Mbyte main memory is byte addressable. Show the format of main memory addresses.
4.3 For the hexadecimal main memory addresses 111111, 666666, BBBBBB, show the
following information, in hexadecimal format:
a. Tag, Line, and Word values for a direct-mapped cache, using the format of
Figure 4.10
b. Tag and Word values for an associative cache, using the format of Figure 4.12
c. Tag, Set, and Word values for a two-way set-associative cache, using the format of
Figure 4.15
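The field widths asked for in Problems 4.1 and 4.2 follow from powers of two: the word field selects a unit within a line, the set field selects a set, and the tag takes the remaining address bits. The sketch below is one way to check such a calculation; `set_associative_format` and its parameter names are illustrative, and it assumes every size is an exact power of two.

```python
def log2_exact(n: int) -> int:
    """Return log2(n), requiring n to be a positive power of two."""
    bits = n.bit_length() - 1
    if n <= 0 or (1 << bits) != n:
        raise ValueError(f"{n} is not a power of two")
    return bits


def set_associative_format(address_bits: int, units_per_line: int,
                           num_lines: int, ways: int) -> dict:
    """Widths, in bits, of the tag, set, and word fields of a memory address."""
    word = log2_exact(units_per_line)        # selects a word/byte in the line
    set_bits = log2_exact(num_lines // ways) # selects one set
    tag = address_bits - set_bits - word     # whatever is left over
    return {"tag": tag, "set": set_bits, "word": word}


# Problem 4.1: 4K blocks x 128 words = 2**19 words, 64 lines, four-line sets.
print(set_associative_format(19, 128, 64, 4))

# Problem 4.2: 64 Mbytes = 2**26 bytes, 8 Kbytes / 16 bytes = 512 lines, two-way.
print(set_associative_format(26, 16, 512, 2))
```

Setting `ways` to 1 gives the direct-mapped case, where the set field plays the role of the line field.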
Hamming code
hard failure
nonvolatile memory
programmable ROM (PROM)
Rambus DRAM (RDRAM)
read-mostly memory
read-only memory (ROM)
semiconductor memory
single-error-correcting (SEC) code
single-error-correcting, double-error-detecting (SEC-DED) code
soft error
static RAM (SRAM)
synchronous DRAM (SDRAM)
syndrome
volatile memory
Review Questions
5.1
5.2
5.3
5.4
5.5
5.6
5.7
5.8
Problems
5.1 Suggest reasons why RAMs traditionally have been organized as only 1 bit per chip
whereas ROMs are usually organized with multiple bits per chip.
5.2 Consider a dynamic RAM that must be given a refresh cycle 64 times per ms. Each
refresh operation requires 150 ns; a memory cycle requires 250 ns. What percentage of
the memory's total operating time must be given to refreshes?
5.3 Figure 5.16 shows a simplified timing diagram for a DRAM read operation over a bus.
The access time is considered to last from t1 to t2. Then there is a recharge time, lasting
from t2 to t3, during which the DRAM chips will have to recharge before the processor can access them again.
a. Assume that the access time is 60 ns and the recharge time is 40 ns. What is the
memory cycle time? What is the maximum data rate this DRAM can sustain, assuming a 1-bit output?
b. Constructing a 32-bit wide memory system using these chips yields what data
transfer rate?
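The arithmetic behind Problems 5.2 and 5.3 can be sketched as follows. This is one reading of the problems, not a definitive solution: it assumes refresh time is simply the number of refreshes multiplied by the duration of each, measured against elapsed time, and that one transfer of `width_bits` bits completes per memory cycle (access time plus recharge time). The function and parameter names are illustrative.

```python
def refresh_overhead_percent(refreshes_per_ms: float, refresh_ns: float) -> float:
    """Percentage of each millisecond (1,000,000 ns) spent on refresh."""
    return 100.0 * refreshes_per_ms * refresh_ns / 1_000_000


def max_data_rate_bps(access_ns: float, recharge_ns: float,
                      width_bits: int = 1) -> float:
    """Bits per second when one transfer completes every memory cycle."""
    cycle_ns = access_ns + recharge_ns   # memory cycle time
    return width_bits * 1e9 / cycle_ns


print(refresh_overhead_percent(64, 150))           # percent of time refreshing
print(max_data_rate_bps(60, 40))                   # 1-bit output
print(max_data_rate_bps(60, 40, width_bits=32))    # 32-bit wide memory
```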
5.4 Figure 5.6 indicates how to construct a module of chips that can store 1 Mbyte based
on a group of four 256-Kbyte chips. Let's say this module of chips is packaged as a
single 1-Mbyte chip, where the word size is 1 byte. Give a high-level chip diagram of
how to construct an 8-Mbyte computer memory using eight 1-Mbyte chips. Be sure to
show the address lines in your diagram and what the address lines are used for.
5.5 On a typical Intel 8086-based system, connected via system bus to DRAM memory,
for a read operation, RAS is activated by the trailing edge of the Address Enable
signal (Figure 3.19). However, due to propagation and other delays, RAS does not go
active until 50 ns after Address Enable returns to a low. Assume the latter occurs in
the middle of the second half of state T1 (somewhat earlier than in Figure 3.19). Data
are read by the processor at the end of T3. For timely presentation to the processor,
however, data must be provided 60 ns earlier by memory. This interval accounts for
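[Figure 5.16: Simplified DRAM read timing diagram. Signals shown: address lines (row address, column address), RAS, CAS, R/W, and data lines, with time markers t1, t2, and t3.]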
DVD-RW
fixed-head disk
flash memory
floppy disk
gap
hard disk drive (HDD)
head
land
magnetic disk
magnetic tape
magnetoresistive
movable-head disk
multiple zoned recording
nonremovable disk
optical memory
pit
platter
RAID
removable disk
rotational delay
sector
seek time
serpentine recording
solid state drive (SSD)
striped data
substrate
track
transfer time
Review Questions
6.1 What are the advantages of using a glass substrate for a magnetic disk?
6.2 How are data written onto a magnetic disk?
6.3 How are data read from a magnetic disk?
6.4 Explain the difference between a simple CAV system and a multiple zoned recording
system.
6.5 Define the terms track, cylinder, and sector.
6.6 What is the typical disk sector size?
6.7 Define the terms seek time, rotational delay, access time, and transfer time.
6.8 What common characteristics are shared by all RAID levels?
6.9 Briefly define the seven RAID levels.
6.10 Explain the term striped data.
6.11 How is redundancy achieved in a RAID system?
6.12 In the context of RAID, what is the distinction between parallel access and independent access?
6.13 What is the difference between CAV and CLV?
6.14 What differences between a CD and a DVD account for the larger capacity of the latter?
6.15 Explain serpentine recording.
Problems
6.1 Consider a disk with N tracks numbered from 0 to (N - 1) and assume that requested
sectors are distributed randomly and evenly over the disk. We want to calculate the
average number of tracks traversed by a seek.
a. First, calculate the probability of a seek of length j when the head is currently
positioned over track t. Hint: This is a matter of determining the total number of
combinations, recognizing that all track positions for the destination of the seek
are equally likely.
b. Next, calculate the probability of a seek of length K. Hint: This involves summing
over all possible combinations of movements of K tracks.
c. Calculate the average number of tracks traversed by a seek, using the formula for
expected value
$$E[x] = \sum_{i=0}^{N-1} i \times \Pr[x = i]$$
Hint: Use the equalities
$$\sum_{i=1}^{n} i = \frac{n(n+1)}{2}; \qquad \sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}$$
d. Show that for large values of N, the average number of tracks traversed by a seek
approaches N/3.
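The N/3 limit in part (d) can be checked numerically before attempting the derivation. The sketch below assumes, as the problem states, that the current and destination tracks are independent and uniformly distributed, and it enumerates every (start, destination) pair by brute force. The function name `mean_seek_distance` is illustrative.

```python
from fractions import Fraction


def mean_seek_distance(n_tracks: int) -> Fraction:
    """Exact average of |start - destination| over all equally likely pairs."""
    total = sum(abs(start - dest)
                for start in range(n_tracks)
                for dest in range(n_tracks))
    return Fraction(total, n_tracks * n_tracks)


for n in (4, 50, 200):
    mean = mean_seek_distance(n)
    # The ratio to N/3 should move toward 1 as N grows.
    print(n, float(mean), float(mean) / (n / 3))
```

Brute force is quadratic in N, so it is only practical for small disks; it serves as a cross-check on the closed-form result derived in parts (a) through (c).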
6.2 Define the following for a disk system:
ts = seek time; average time to position head over track
r = rotation speed of the disk, in revolutions per second
n = number of bits per sector
N = capacity of a track, in bits
tA = time to access a sector
Develop a formula for tA as a function of the other parameters.
6.3 Consider a magnetic disk drive with 8 surfaces, 512 tracks per surface, and 64 sectors
per track. Sector size is 1 kB. The average seek time is 8 ms, the track-to-track access
time is 1.5 ms, and the drive rotates at 3600 rpm. Successive tracks in a cylinder can be
read without head movement.
a. What is the disk capacity?
b. What is the average access time? Assume this file is stored in successive sectors
and tracks of successive cylinders, starting at sector 0, track 0, of cylinder i.
c. Estimate the time required to transfer a 5-MB file.
d. What is the burst transfer rate?
6.4 Consider a single-platter disk with the following parameters: rotation speed: 7200 rpm;
number of tracks on one side of platter: 30,000; number of sectors per track: 600; seek
time: one ms for every hundred tracks traversed. Let the disk receive a request to
access a random sector on a random track and assume the disk head starts at track 0.
a. What is the average seek time?
b. What is the average rotational latency?
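The quantities in Problems 6.3 and 6.4 reduce to a few unit conversions. The sketch below is a set of hypothetical helpers under stated assumptions: average rotational latency is half a revolution, seek time is strictly proportional to the number of tracks crossed, the target track is uniform over all tracks, and the sector size in bytes is supplied by the caller because "1 kB" may mean 1000 or 1024 bytes.

```python
def rotation_time_ms(rpm: float) -> float:
    """Time for one full revolution, in milliseconds."""
    return 60_000.0 / rpm


def average_rotational_latency_ms(rpm: float) -> float:
    """On average the desired sector is half a revolution away."""
    return rotation_time_ms(rpm) / 2


def average_seek_from_track0_ms(n_tracks: int,
                                ms_per_100_tracks: float = 1.0) -> float:
    """Mean seek time from track 0 to a uniformly chosen track 0..n_tracks-1."""
    mean_distance = (n_tracks - 1) / 2
    return mean_distance * ms_per_100_tracks / 100


def capacity_bytes(surfaces: int, tracks: int, sectors: int,
                   sector_bytes: int) -> int:
    """Total capacity as the product of the geometry parameters."""
    return surfaces * tracks * sectors * sector_bytes


print(average_seek_from_track0_ms(30_000))     # Problem 6.4a, in ms
print(average_rotational_latency_ms(7200))     # Problem 6.4b, in ms
print(capacity_bytes(8, 512, 64, 1024))        # Problem 6.3a, if 1 kB = 1024 bytes
```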
Review Questions
7.1
7.2
7.3
7.4
7.5
7.6
7.7
Problems
7.1 On a typical microprocessor, a distinct I/O address is used to refer to the I/O data
registers and a distinct address for the control and status registers in an I/O controller
for a given device. Such registers are referred to as ports. In the Intel 8088, two I/O
instruction formats are used. In one format, the 8-bit opcode specifies an I/O operation; this is followed by an 8-bit port address. Other I/O opcodes imply that the port
address is in the 16-bit DX register. How many ports can the 8088 address in each I/O
addressing mode?
7.2 A similar instruction format is used in the Zilog Z8000 microprocessor family. In this
case, there is a direct port addressing capability, in which a 16-bit port address is part
of the instruction, and an indirect port addressing capability, in which the instruction
references one of the 16-bit general purpose registers, which contains the port address. How many ports can the Z8000 address in each I/O addressing mode?
7.3 The Z8000 also includes a block I/O transfer capability that, unlike DMA, is under the
direct control of the processor. The block transfer instructions specify a port address
register (Rp), a count register (Rc), and a destination register (Rd). Rd contains the
main memory address at which the first byte read from the input port is to be stored. Rc
is any of the 16-bit general purpose registers. How large a data block can be transferred?
7.4 Consider a microprocessor that has a block I/O transfer instruction such as that found
on the Z8000. Following its first execution, such an instruction takes five clock cycles
to re-execute. However, if we employ a nonblocking I/O instruction, it takes a total
of 20 clock cycles for fetching and execution. Calculate the increase in speed with the
block I/O instruction when transferring blocks of 128 bytes.
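The comparison in Problem 7.4 can be sketched as a cycle count. The problem does not state the cost of the first execution of the block instruction, so the sketch below assumes it equals the 20 cycles of a single nonblocking instruction; that value, and the function names, are assumptions made for illustration only.

```python
def nonblock_cycles(n_bytes: int, cycles_per_instruction: int = 20) -> int:
    """One full fetch-and-execute per byte transferred."""
    return n_bytes * cycles_per_instruction


def block_cycles(n_bytes: int, first_cycles: int = 20,
                 repeat_cycles: int = 5) -> int:
    """First execution at full cost (assumed), then cheap re-executions."""
    return first_cycles + (n_bytes - 1) * repeat_cycles


n = 128
speedup = nonblock_cycles(n) / block_cycles(n)
print(nonblock_cycles(n), block_cycles(n), round(speedup, 2))
```

Changing `first_cycles` shows how sensitive the result is to that assumption; for large blocks the ratio tends toward 20/5.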
7.5 A system is based on an 8-bit microprocessor and has two I/O devices. The I/O controllers for this system use separate control and status registers. Both devices handle
data on a 1-byte-at-a-time basis. The first device has two status lines and three control
lines. The second device has three status lines and four control lines.
a. How many 8-bit I/O control module registers do we need for status reading and
control of each device?
b. What is the total number of needed control module registers given that the first
device is an output-only device?
c. How many distinct addresses are needed to control the two devices?
7.6 For programmed I/O, Figure 7.5 indicates that the processor is stuck in a wait loop
doing status checking of an I/O device. To increase efficiency, the I/O software could
be written so that the processor periodically checks the status of the device. If the
device is not ready, the processor can jump to other tasks. After some timed interval,
the processor comes back to check status again.
a. Consider the above scheme for outputting data one character at a time to a
printer that operates at 10 characters per second (cps). What will happen if its
status is scanned every 200 ms?
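The effect of the polling interval in Problem 7.6 can be estimated with a small model. The sketch below assumes the printer accepts at most one character per status check, has no buffer, and becomes ready again one character period after accepting a character; these are simplifying assumptions, and `effective_rate_cps` is an illustrative name.

```python
import math


def effective_rate_cps(device_cps: float, poll_interval_ms: float) -> float:
    """Characters per second actually delivered under periodic polling."""
    device_period_ms = 1000 / device_cps
    # The device is found ready only at a poll that falls at or after the end
    # of its character period, so each character costs a whole number of polls.
    polls_per_char = max(1, math.ceil(device_period_ms / poll_interval_ms))
    return 1000 / (polls_per_char * poll_interval_ms)


print(effective_rate_cps(10, 200))  # polling slower than the printer
print(effective_rate_cps(10, 50))   # polling faster than the printer
```

Under this model a poll interval longer than the printer's character period leaves the printer idle between checks, so throughput is set by the polling rate and not by the device.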