10CS46
Unit I
Basic Structure of Computers
Computer Architecture in general covers three aspects of computer design namely: Computer
Hardware, Instruction set Architecture and Computer Organization.
Computer hardware consists of electronic circuits, displays, magnetic and optical storage
media and communication facilities.
Instruction set Architecture is the programmer-visible machine interface, such as the instruction set,
registers, memory organization and exception handling. The two main approaches are
CISC (Complex Instruction Set Computer) and RISC (Reduced Instruction Set Computer).
Computer Organization includes the high level aspects of a design, such as memory system, the bus
structure and the design of the internal CPU.
Computer Types
A computer is a fast electronic calculating machine which accepts digital input, processes it
according to the internally stored instructions (programs) and produces the result on the
output device. The internal operation of the computer can be as depicted in the figure below:
Computer Organization
Functional Units
A computer in its simplest form comprises five functional units, namely the input unit, output unit,
memory unit, arithmetic & logic unit and control unit. Figure 2 depicts the functional units of a
computer system.
5. Control Unit: Control unit co-ordinates activities of all units by issuing control signals.
Control signals issued by control unit govern the data transfers and then appropriate operations
take place. Control unit interprets or decides the operation/action to be performed.
The operations of a computer can be summarized as follows:
1. A set of instructions called a program resides in the main memory of the computer.
2. The CPU fetches those instructions sequentially one-by-one from the main memory, decodes
them and performs the specified operation on associated data operands in ALU.
3. Processed data and results will be displayed on an output unit.
4. All activities pertaining to processing and data movement inside the computer machine are
governed by control unit.
The interaction between the processor and the memory and the direction of flow of information
is as shown in the diagram below:
BUS STRUCTURES
A group of lines that serves as a connecting path for several devices is called a bus (one bit per
line). Individual parts must communicate over a communication line or path for exchanging data,
address and control information as shown in the diagram below. The connected devices differ in
speed; for example, a processor sends characters much faster than a printer can print them. A
common approach is to use buffer registers to hold the content during the transfer.
SOFTWARE
If a user wants to enter and run an application program, he/she needs system software. System
software is a collection of programs that are executed as needed to perform functions such as:
Receiving and interpreting user commands
Entering and editing application programs and storing them as files in secondary storage
devices
Running standard application programs such as word processors, spreadsheets,
games etc.
Operating system - the key system software component, which helps the user to exploit the
underlying hardware with the programs.
PERFORMANCE
The total time required to execute a program is the most important measure of performance for a
computer (t0-t5 of earlier example). The compiler, the instruction set, the hardware architecture
and the program itself all have an impact on performance.
Basic Performance Equation: The basic performance equation is given by
T = (N * S) / R
where T=execution time, N=number of instructions, S=average cycles per instruction, R=clock
rate in cycles per second
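The basic performance equation can be checked with a few lines of Python. This is only a sketch: the function name `execution_time` and the workload figures are hypothetical, chosen purely for illustration.

```python
def execution_time(n_instructions: int, cycles_per_instruction: float,
                   clock_rate_hz: float) -> float:
    """Return T = (N * S) / R in seconds."""
    if clock_rate_hz <= 0:
        raise ValueError("clock rate must be positive")
    return (n_instructions * cycles_per_instruction) / clock_rate_hz


# Hypothetical workload: 2 million instructions, 4 cycles each, 500 MHz clock.
t = execution_time(2_000_000, 4, 500e6)
print(f"T = {t:.3f} s")
```

Halving S (for example through a better compiler or pipelining) or doubling R halves T, which is why all three factors matter.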
CACHING
Commonly used data are copied to on-processor memory (cache) to reduce access time.
Small memories can be made with higher speed than large ones. In a computer, we need both.
PERFORMANCE MEASUREMENT
Dept Of CSE, ACE Bangalore
A benchmark is a standard task used to measure how well a processor operates. To evaluate
the performance of computers, a non-profit organization known as SPEC (Standard Performance
Evaluation Corporation) selects agreed-upon real-world application programs as
benchmarks. Accordingly, it gives a performance measure for a computer as the time required to
execute a given benchmark program. The SPEC rating is computed by comparing this time with the
running time of the same program on a reference computer.
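The formula itself did not survive in these notes. The sketch below assumes the usual textbook definition: the rating for one program is the running time on a reference computer divided by the running time on the computer under test, and the overall rating is the geometric mean over all programs in the suite. The function names and the timings are hypothetical.

```python
import math


def spec_rating(reference_time: float, test_time: float) -> float:
    """Rating for one benchmark: reference time / time on the machine under test."""
    return reference_time / test_time


def overall_spec_rating(ratings: list[float]) -> float:
    """Geometric mean of the individual benchmark ratings."""
    return math.exp(sum(math.log(r) for r in ratings) / len(ratings))


# Hypothetical timings (seconds) for two benchmark programs.
ratings = [spec_rating(100.0, 50.0), spec_rating(400.0, 50.0)]
print(ratings, overall_spec_rating(ratings))
```

A rating of 2 means the machine under test runs that program twice as fast as the reference machine.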
GENERATION OF COMPUTERS
Development of technologies used to fabricate the processors, memories and I/O units of the
computers has been divided into various generations as given below:
First generation
Second generation
Third generation
Fourth generation
Beyond the fourth generation
First generation:
1946 to 1955: Computers of this generation used vacuum tubes. The computers were built using the
stored program concept. Ex: ENIAC, EDSAC, IBM 701.
Computers of this age typically used about ten thousand vacuum tubes. They were bulky in size and
had slow operating speed, short life time and limited programming facilities.
Second generation:
1955 to 1965: Computers of this generation used germanium transistors as the active
switching electronic device. Ex: IBM 7000, B5000, IBM 1401. They were comparatively smaller in size
and had about ten times faster operating speed than first generation vacuum tube based
computers. They consumed less power and had fairly good reliability. Availability of large memory was
an added advantage.
Third generation:
1965 to 1975: The computers of this generation used integrated circuits as the active
electronic components. Ex: IBM System 360, PDP minicomputer etc. They were still smaller in
size. They had powerful CPUs with the capacity of executing 1 million instructions per
second (MIPS). They consumed very little power.
Fourth generation:
1976 to 1990: The computers of this generation used LSI chips such as the microprocessor as their
active electronic element. Ex: HCL Horizen III, WIPRO's Uniplus+, HCL's Busybee PC etc.
They used high speed microprocessor as CPU. They were more user friendly and highly reliable
systems. They had large storage capacity disk memories.
Beyond Fourth Generation:
1990 onwards: Specialized and dedicated VLSI chips are used to control specific functions of
these computers. Ex: modern desktop PCs, laptops or notebook computers.
Unit II:
This unit deals with concepts as to how programs are executed in the computer from the
machine instruction set viewpoint. All computers deal with numbers. They have instructions that
perform operations on the data operands. Hence we start with basic number systems.
NUMBER, ARITHMETIC OPERATIONS and CHARACTERS
Binary numbers (0, 1) are used in computers. Various number systems are used in computers.
Numbers in binary are represented as vectors
$B = b_{n-1} \ldots b_1 b_0$
Unsigned numbers are in the range 0 to $2^n - 1$ and are represented by
$V(B) = b_{n-1} \times 2^{n-1} + \cdots + b_1 \times 2^1 + b_0 \times 2^0$
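The positional formula for V(B) can be evaluated directly. The helper below is a minimal sketch (the name `unsigned_value` is hypothetical); it takes the bit vector as a string written with the most significant bit first.

```python
def unsigned_value(bits: str) -> int:
    """Evaluate V(B) = sum of b_i * 2**i for a bit string written MSB first."""
    if not bits or any(b not in "01" for b in bits):
        raise ValueError("bits must be a non-empty string of 0s and 1s")
    n = len(bits)
    # The character at position i (from the left) is bit b_(n-1-i).
    return sum(int(b) * 2 ** (n - 1 - i) for i, b in enumerate(bits))


print(unsigned_value("1011"))  # 8 + 0 + 2 + 1
```

For n = 4 the largest value is `unsigned_value("1111")`, which equals 2**4 - 1.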
The operation (7+4) mod 16 yields the value 11. To perform this graphically using the above
representation, locate 7 on the circle and then move 4 units in the clockwise direction to arrive
at the answer 11.
The 2's complement representation of 7 is 0111 and of -3 is 1101. Thus locate 0111 and then move
1101 (13 steps) in the clockwise direction to arrive at 0100 = +4.
Some more examples of 2's complement add and subtract operations are as shown in the figure
below:
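The circle picture corresponds to addition modulo 16 on 4-bit patterns. The following sketch reproduces the 7 + (-3) example; the helper names are hypothetical and the 4-bit width is the one used in the text.

```python
def to_twos_complement(value: int, n_bits: int = 4) -> int:
    """Return the n-bit 2's complement pattern of a signed integer."""
    low, high = -(1 << (n_bits - 1)), (1 << (n_bits - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {n_bits} bits")
    return value & ((1 << n_bits) - 1)


def from_twos_complement(pattern: int, n_bits: int = 4) -> int:
    """Interpret an n-bit pattern as a signed 2's complement number."""
    if pattern & (1 << (n_bits - 1)):      # sign bit set: negative value
        return pattern - (1 << n_bits)
    return pattern


def add_mod(a: int, b: int, n_bits: int = 4) -> int:
    """Add two patterns and discard the carry out (addition mod 2**n_bits)."""
    return (a + b) & ((1 << n_bits) - 1)


seven = to_twos_complement(7)          # 0111
minus_three = to_twos_complement(-3)   # 1101
result = add_mod(seven, minus_three)   # moving 13 steps clockwise from 0111
print(f"{seven:04b} + {minus_three:04b} = {result:04b}")
```

Discarding the carry out of the most significant bit is what "going around the circle" means in hardware.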
Big Endian and Little Endian: Consider a 32-bit integer (in hex): 0xabcdef12. It consists of 4
bytes: ab, cd, ef, and 12. Hence this integer will occupy 4 bytes in memory. Say we store it
starting at memory address 1000. There are 24 different orderings possible to store these 4 bytes
in 4 locations (1000 - 1003). Two of these 24 possibilities are very popular. These are called
little endian and big endian.
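The two popular orderings can be made concrete with Python's built-in `int.to_bytes`. This is a sketch: `memory_layout` is a hypothetical helper, and the base address 1000 is the one used in the example above.

```python
def memory_layout(value: int, byteorder: str, base_address: int = 1000) -> dict[int, str]:
    """Map each byte address to the byte stored there for a 32-bit value."""
    raw = value.to_bytes(4, byteorder)   # byteorder is "big" or "little"
    return {base_address + i: f"{b:02x}" for i, b in enumerate(raw)}


value = 0xABCDEF12
# Big endian: most significant byte (ab) at the lowest address.
print(memory_layout(value, "big"))
# Little endian: least significant byte (12) at the lowest address.
print(memory_layout(value, "little"))
```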
Branching: The addresses of the memory locations containing the n numbers are symbolically
given as NUM1, NUM2, ..., NUMn.
In a straight-line program, a separate Add instruction is used to add each number to the contents of register R0.
After all the numbers have been added, the result is placed in memory location SUM.
ADDRESSING MODE
The different ways in which the location of an operand is specified in an instruction are called
addressing modes.
Generic Addressing Modes:
Immediate mode
Register mode
Absolute mode
Indirect mode
Index mode
Base with index
Base with index and offset
Relative mode
Auto-increment mode
Auto-decrement mode
Implementation of Variables and Constants:
Variables:
The value can be changed as needed using the appropriate instructions. There are 2 accessing
modes to access the variables. They are
Register Mode
Absolute Mode
Register Mode:
The operand is the contents of the processor register. The name (address) of the register is given
in the instruction.
Absolute Mode (Direct Mode):
The operand is in a memory location. The address of this location is given explicitly in the
instruction. The various addressing modes and their assembler syntax and functions are as shown
in the figure below:
Auto-increment mode: The Effective Address of the operand is the contents of a register specified in the
instruction. After accessing the operand, the contents of this register are automatically
incremented to point to the next item in the list.
Auto-decrement mode: The contents of a register specified in the instruction are first automatically
decremented and are then used as the Effective Address of the operand, so that the register
points to the previous item in the list.
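The two modes can be modelled with a tiny register-and-memory simulation. This is a sketch under stated assumptions: the `Machine` class, the 4-byte word size and the sample data are hypothetical, and auto-decrement is taken to decrement the register before the access, which is the convention the -(SP) push in the stack example of these notes relies on.

```python
class Machine:
    """Minimal model of registers and word-addressed data for (Ri)+ and -(Ri)."""

    def __init__(self, memory: dict[int, int], word_size: int = 4):
        self.memory = dict(memory)
        self.registers: dict[str, int] = {}
        self.word_size = word_size

    def autoincrement(self, reg: str) -> int:
        """(Ri)+ : use the register as the address, then increment it."""
        address = self.registers[reg]
        self.registers[reg] += self.word_size
        return self.memory[address]

    def autodecrement(self, reg: str) -> int:
        """-(Ri) : decrement the register first, then use it as the address."""
        self.registers[reg] -= self.word_size
        return self.memory[self.registers[reg]]


machine = Machine({1000: 10, 1004: 20, 1008: 30})
machine.registers["R2"] = 1000
total = sum(machine.autoincrement("R2") for _ in range(3))  # walks the list forward
last = machine.autodecrement("R2")                          # steps back to the last item
print(total, last, machine.registers["R2"])
```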
Assembly Language
We generally use symbolic names to write a program. A complete set of such symbolic names
and rules for their use constitutes a programming language, referred to as an assembly language.
LOAD - To load operand from memory
STORE - To store operand to memory
MOVE - To transfer data from one location to another location/Register
Assembler Directives: Directives are commands to the assembler concerning the
program being assembled. These commands are neither translated into machine opcodes nor
assigned any memory location in the object program.
S EQU 150
EQU tells the assembler that the symbolic name S must be replaced with the value 150 (here used
as a memory address).
ORIGIN 201
Instructs the assembler to place the data block in main memory locations starting from 201.
N DATAWORD 40
Informs the assembler that the value of N, i.e. the data value 40, is to be placed in memory location
201.
ORIGIN 100
Instructs the assembler to load the machine instructions of the object program in
main memory starting from location 100.
END START
Marks the end of the program; START is the label of the instruction where execution begins.
N1 RESERVE 400
Reserves a memory block of 400 bytes.
Assembler: Has to know
How to interpret assembly language (directives, instructions, addressing modes etc)
Where to place the instructions in memory
Where to place the data in memory
It scans through the source program and keeps track of all names and their corresponding
numerical values in a symbol table, e.g. what all the labels mean
It calculates branch addresses. Forward branch problem: how can it work out the address of a
label that has not yet been seen? A common answer is a two-pass assembler, which builds the
symbol table in the first pass and generates code in the second.
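One common way to handle the forward branch problem is to make two passes over the source. The sketch below is a toy illustration, not a real assembler: it assumes every instruction occupies one 4-byte word, and the `assemble` function and the sample program are hypothetical.

```python
def assemble(source, origin=100, word_size=4):
    """Toy two-pass assembler over (label, mnemonic, operand) tuples."""
    # Pass 1: assign an address to every label and record it in the symbol table.
    symbols = {}
    address = origin
    for label, _mnemonic, _operand in source:
        if label is not None:
            symbols[label] = address
        address += word_size

    # Pass 2: every label is now known, so forward references resolve too.
    resolved = []
    address = origin
    for _label, mnemonic, operand in source:
        target = symbols.get(operand, operand)  # non-label operands pass through
        resolved.append((address, mnemonic, target))
        address += word_size
    return symbols, resolved


program = [
    (None, "Branch", "DONE"),      # forward reference: DONE is defined later
    ("LOOP", "Add", "R1,R0"),
    (None, "Branch>0", "LOOP"),    # backward reference
    ("DONE", "Move", "R0,SUM"),
]
symbols, resolved = assemble(program)
print(symbols)
print(resolved)
```

A one-pass assembler would instead have to remember each unresolved branch and patch it once the label is finally seen.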
READWAIT and WRITEWAIT wait loops (figure not reproduced):
Branch=0 WRITEWAIT
MoveByte R1, DATAOUT
Program to read a line of characters and display it:
Move #LOC, R0
Stack Implementation:
Subtract #4, SP
Move NEWITEM,(SP) ; push
Move (SP), ITEM; pop
Add #4, SP
With auto increment and auto decrement
Move NEWITEM,-(SP); push
Move (SP) +, ITEM; pop
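The push and pop sequences above can be mirrored in a small simulation of a stack that grows toward lower addresses. The `Stack` class, the starting address 2000 and the 4-byte word size are assumptions made for illustration only.

```python
class Stack:
    """Stack growing toward lower addresses, with SP pointing at the top item."""

    def __init__(self, bottom_address: int = 2000, word_size: int = 4):
        self.memory: dict[int, int] = {}
        self.bottom = bottom_address
        self.sp = bottom_address          # an empty stack has SP at the bottom
        self.word_size = word_size

    def push(self, item: int) -> None:
        self.sp -= self.word_size         # Subtract #4, SP
        self.memory[self.sp] = item       # Move NEWITEM, (SP)

    def pop(self) -> int:
        if self.sp == self.bottom:
            raise IndexError("pop from empty stack")
        item = self.memory[self.sp]       # Move (SP), ITEM
        self.sp += self.word_size         # Add #4, SP
        return item


stack = Stack()
stack.push(5)
stack.push(9)
print(stack.sp, stack.pop(), stack.pop(), stack.sp)
```

The auto-decrement and auto-increment forms simply fuse each pair of steps into a single instruction.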
Queue:
First-in-first-out
Unlike a stack, need to keep track of both the front and end for removal and insertion
respectively
Need two pointers to keep track of both ends
As items are inserted at the end and removed from the front, both pointers move toward
higher addresses, so the queue walks through memory as it is used.
Circular buffers:
Avoid this problem by limiting to a fixed region in memory
Start at BEGINNING and entries appended until it reaches END after which it wraps
back around to BEGINNING
Need to deal with cases when it is completely full and completely empty
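A circular buffer can be sketched as follows. The class name, the use of a Python list for the fixed region and the explicit item count (one of several ways to tell a full queue from an empty one) are assumptions, not something prescribed by these notes.

```python
class CircularQueue:
    """Fixed-size FIFO queue whose pointers wrap around at the end of the region."""

    def __init__(self, capacity: int):
        self.slots = [None] * capacity
        self.front = 0    # index of the next item to remove
        self.rear = 0     # index of the next free slot
        self.count = 0    # distinguishes "full" from "empty" when front == rear

    def enqueue(self, item) -> None:
        if self.count == len(self.slots):
            raise OverflowError("queue is full")
        self.slots[self.rear] = item
        self.rear = (self.rear + 1) % len(self.slots)    # wrap to BEGINNING
        self.count += 1

    def dequeue(self):
        if self.count == 0:
            raise IndexError("queue is empty")
        item = self.slots[self.front]
        self.front = (self.front + 1) % len(self.slots)  # wrap to BEGINNING
        self.count -= 1
        return item


queue = CircularQueue(3)
for value in (1, 2, 3):
    queue.enqueue(value)
first = queue.dequeue()
queue.enqueue(4)          # reuses the slot freed at the beginning of the region
print(first, queue.slots)
```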
SUBROUTINES
Often a program needs to perform the same subtask on different data; such a subtask is called a subroutine
Rather than include the same sequence of instructions everywhere it is needed, call a
subroutine instead
One copy of subroutine stored in memory
Subroutine call causes a branch to the subroutine
At the end of the subroutine, a return instruction is executed
Program resumes execution at the instruction immediately following the subroutine call
Parameter Passing:
Subroutine call
e.g. SUM = listadd (N, NUM);
N is a variable in memory and NUM is an address pointing to the start of the NUM list
How do we send the parameters N, NUM to the subroutine?
How do we receive the return value SUM?
Parameters may be passed in registers or on the stack. In either case, a parameter may be passed
by value (as with N) or by reference (as with NUM, the address of the list).
Complex instruction set computers (CISC) have complex instruction encodings (e.g.
IA-32).
Reduced instruction set computers (RISC) only allow simple 32-bit formats and few addressing
modes, and all data to be manipulated must be in registers. For example, Add (R3),R2 is not allowed;
instead use Move (R3),R1 followed by Add R1,R2 (e.g. ARM).
RISC machines are often 3-address machines, as the addressing mode field is either not necessary
or simplified, e.g. Add R1, R2, R3.
CISC machines usually require fewer instructions but have a lower clock rate; RISC machines require more
instructions but have a higher clock rate.
Unit 3 and Unit 4:
INPUT/OUTPUT ORGANIZATION
A general purpose computer should have the ability to exchange information with a wide range of devices
in varying environments. Computers can communicate with other computers over the Internet and access
information around the globe. They are an integral part of home appliances, manufacturing equipment,
transportation systems, banking and point-of-sale terminals. In this chapter, we study the various ways in
which I/O operations are performed.
A single-bus structure
A simple arrangement to connect I/O devices to a computer is to use a single bus
arrangement, as shown in the figure above. Each I/O device is assigned a unique set of
addresses. When the processor places a particular address on the address lines, the
device that recognizes this address responds to the commands issued on the control
lines. The processor requests either a read or a write operation, and the requested data are transferred
over the data lines. When I/O devices and the memory share the same address
space, the arrangement is called memory-mapped I/O.
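With memory-mapped I/O, any machine instruction that can access memory can be used to
transfer data to or from an I/O device. For instance, if DATAIN is the address of the
input buffer of the keyboard, the instruction
Move DATAIN, R0
reads the data from DATAIN and stores them in processor register R0. Similarly, if
DATAOUT is the address of the output buffer of the display/printer, the instruction
Move R0, DATAOUT
sends the contents of register R0 to location DATAOUT, which may be the
output data buffer of a display unit or a printer.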
Most computer systems use memory-mapped I/O. Some processors have special
I/O instructions to perform I/O transfers. The hardware required to connect an I/O
device to the bus is shown below:
Interrupts
There are many situations where other tasks can be performed while waiting for an I/O device to become
ready. A hardware signal called an Interrupt will alert the processor when an I/O device becomes ready.
Interrupt-request line is usually dedicated for this purpose.
For example, consider a program made up of two routines, COMPUTE and PRINT. The routine executed in response to an interrupt
request is called the interrupt-service routine, and control is transferred to it through the use of interrupts. The
processor must inform the device that its request has been recognized by sending an interrupt-acknowledge
signal. An interrupt-service routine differs from a subroutine: a subroutine performs a function required by the
program that calls it, whereas an interrupt-service routine may have nothing in common with the program being interrupted.
Saving information in registers increases the delay between the time an interrupt request is received and the
start of execution of the interrupt-service routine. This delay is called interrupt latency.
Interrupt hardware
Most computers have several I/O devices that can request an interrupt. A single interrupt request line may
be used to serve n devices.
Interrupt nesting
Without nesting, the processor executes an interrupt-service routine to completion before it accepts an
interrupt request from a second device. A privilege exception occurs when a user program attempts to execute a privileged instruction.
Individual interrupt-request and acknowledge lines can also be implemented. Implementation of interrupt
priority using individual interrupt-request and acknowledge lines is shown in figure 4.7.
Simultaneous requests
memory address and keep track of the number of transfers. The operations of DMA controller must be
under the control of a program executed by the processor. To initiate the transfer of block of words, the
processor sends the starting address, the number of words in the block and the direction of the transfer.
On receiving this information, DMA controller transfers the entire block and informs the processor by
raising an interrupt signal. While a DMA transfer is taking place, the processor can be used to execute
another program. After the DMA transfer is completed, the processor can return to the program that
requested the transfer.
Three registers in a DMA interface are:
Starting address
Word count
Status and control flag
Bus Arbitration
The device that is allowed to initiate data transfers on the bus at any given time is called the bus master.
Arbitration is the process by which the next device to become the bus master is selected and bus
mastership is transferred to it. The two approaches are centralized and distributed arbitrations.
In centralized arbitration, a single bus arbiter performs the required arbitration, whereas in distributed arbitration, all devices
participate in the selection of the next bus master. The bus arbiter may be the processor or a separate unit
connected to the bus. The processor is normally the bus master unless it grants bus mastership to one of
the DMA controllers. A simple arrangement for bus arbitration using a daisy chain and a distributed
arbitration scheme are shown in figures 4.20 and 4.22 respectively.
In centralized arbitration, a simple arrangement connects the bus-grant signal through the devices in a daisy
chain. A rotating priority scheme may be used to give all devices an equal chance of being
serviced (BR1 to BR4). In distributed arbitration, all devices waiting to use the bus have equal
responsibility in carrying out the arbitration process, without using a central arbiter. The drivers are of the
open-collector type. Hence, if the input to one driver is equal to 1 and the input to another driver
connected to the same bus line is equal to 0, the bus will be in the low-voltage state. This scheme uses the
arbitration lines ARB0 to ARB3.
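The notes only name the ARB0 to ARB3 lines, so the following is a sketch of one commonly described procedure, stated here as an assumption: every contending device drives its 4-bit ID onto the open-collector lines (which behave as a logical OR), compares the result with its own ID starting from the most significant bit, and stops driving the first bit position where it differs together with all lower positions. The device whose ID remains on the lines wins. The function name and the IDs are hypothetical.

```python
def arbitrate(device_ids: list[int], n_lines: int = 4) -> int:
    """Simulate wired-OR arbitration; return the ID left on the ARB lines."""
    driven = {d: d for d in device_ids}   # pattern each device currently drives
    while True:
        lines = 0
        for pattern in driven.values():   # open-collector lines OR all drivers
            lines |= pattern
        changed = False
        for d in device_ids:
            new_pattern = d
            for bit in range(n_lines - 1, -1, -1):   # compare from the MSB down
                mask = 1 << bit
                if (lines & mask) and not (d & mask):
                    # Line is 1 but own bit is 0: release this bit and all lower bits.
                    new_pattern = d & ~((mask << 1) - 1)
                    break
            if new_pattern != driven[d]:
                driven[d] = new_pattern
                changed = True
        if not changed:
            return lines                  # stable pattern: the winner's ID


print(arbitrate([5, 6]))   # IDs 0101 and 0110 contend
```

In this model the device with the highest ID always wins, so the IDs double as priorities.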
Buses
The primary function of the bus is to provide a communication path for the transfer of data. A bus protocol must also
specify:
When to place information on the bus
When to assert control signals
Bus protocols govern the use of the data, address and control lines. A variety
of schemes have been devised for the timing of data transfers over a bus. They are classified as
synchronous and asynchronous schemes.
The bus master is the initiator of a transfer. Usually, the processor acts as master, but under a DMA setup another device can
be master. The device addressed by the master is the slave or target.
Synchronous bus
All devices derive timing information from a common clock line. Equally spaced pulses on this line define
equal time intervals. Each of these intervals constitutes a bus cycle during which one data transfer can take
place. Timing of an input/output transfer on a synchronous bus is shown in figure 4.23.
Asynchronous bus
This is a scheme based on the use of a handshake between the master and the slave for controlling data
transfers on the bus. The common clock is replaced by two timing control lines, master-ready and slave-ready.
The first is asserted by the master to indicate that it is ready for a transaction and the second is a
response from the slave. The master places the address and command information on the bus. It indicates
to all devices that it has done so by activating the master-ready line. This causes all devices on the bus to
decode the address. The selected slave performs the required operation and informs the master it has
done so by activating the slave-ready line. Typical handshake control of data transfer during an input
and an output operation is shown in figures 4.26 and 4.27 respectively. The master waits for slave-ready
to become asserted before it removes its signals from the bus. The
handshake signals are fully interlocked. A change of state in one signal is followed by a change in the
other signal. Hence this scheme is known as a full handshake.
Interface Circuits
An I/O interface consists of the circuitry required to connect an I/O device to a computer bus. On one side
of the interface, we have bus signals. On the other side, we have a data path with its associated controls to
transfer data between the interface and the I/O device port. We have two types:
Serial port and
Parallel port
A parallel port transfers data in the form of a number of bits (8 or 16) simultaneously to or from the
device. A serial port transmits and receives data one bit at a time. Communication with the bus is the
same for both formats. The conversion from the parallel to the serial format, and vice versa, takes place
inside the interface circuit. With a parallel port, the connection between the device and the computer uses a
multiple-pin connector and a cable with as many wires. This arrangement is suitable for devices that are
physically close to the computer. A serial port is much more convenient and cost-effective where
longer cables are needed.
Typically, the functions of an I/O interface are:
Provides a storage buffer for at least one word of data
Contains status flags that can be accessed by the processor to determine whether the buffer is full
or empty
Contains address-decoding circuitry to determine when it is being addressed
by the processor
Generates the appropriate timing signals required by the bus control scheme
Performs any format conversion that may be necessary to transfer data
between the bus and the I/O device, such as parallel-serial conversion in the
case of a serial port
Parallel Port
Consider the hardware components needed for connecting a keyboard to a processor.
The input interface circuit (as shown in the figure below) encompasses:
Status flag, SIN
R/~W
Master-ready
Address decoder
A detailed figure showing the input interface circuit is presented in figure 4.29. Now,
consider the circuit for the status flag (figure 4.30). An edge-triggered D flip-flop is
used along with read-data and master-ready signals.
A serial interface circuit involves Chip and register select, Status and control, Output shift register,
DATAOUT, DATAIN, Input shift register and Serial input/output as shown in figure 4.37.
The SCSI controller sends an interrupt to the processor to inform it that the requested operation
has been completed.
The bus signals, arbitration, selection, information transfer and reselection are the
topics discussed in addition to the above.
Universal Serial Bus (USB)
The USB has been designed to meet several key objectives such as:
Provide a simple, low-cost and easy to use interconnection system that overcomes the difficulties
due to the limited number of I/O ports available on a computer
Accommodate a wide range of data transfer characteristics for I/O devices, including telephone
and Internet connections
Enhance user convenience through a plug-and-play mode of operation
Port Limitation
To add new ports, a user must normally open the computer box to gain access to the internal expansion bus and
install a new interface card. The user may also need to know how to configure the device and the software.
An objective of the USB is to make it possible to add many devices to a computer system at any time, without opening
the computer box.
Device Characteristics
The kinds of devices that may be connected to a computer cover a wide range of functionality, with different speed,
volume and timing constraints. Many simple devices attached to a computer generate data
asynchronously and at different rates. A sampled signal must be sampled quickly enough to track its highest-frequency
components.
Plug-and-play
It should not be necessary to turn the computer off or restart it to connect or disconnect a device. Whenever a device is introduced, the
system should detect the existence of this new device automatically, identify the appropriate device-driver
software and any other facilities needed to service that device, and establish the appropriate addresses and
logical connections to enable them to communicate.
USB architecture
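To accommodate a large number of devices that can be added or removed at any time, the USB has a
tree structure. Each node of the tree has a device called a hub. The root hub connects the tree to the host
computer, and the leaves of the tree are the I/O devices, called functions. Split bus operation allows high-speed
(HS) and full/low-speed (F/LS) devices to share the bus.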
Unit 5.
Memory Systems
Basic Concepts
The two primary classifications of memory are
1)Primary memory
2) Secondary memory
Under Primary memory the two classifications are
1) RAM (Random Access Memory)
2) ROM (Read Only Memory)
RAM: It is further classified into
a) Static RAM (SRAM)
b) Dynamic RAM (DRAM)
DRAM: The dynamic RAM is again further subclassified into
i) Synchronous DRAM
ii) Asynchronous DRAM
The maximum size of the memory in any computer is determined by the number of address lines provided
by the processor used in the computer. For example, if a processor has 20 address lines, it is capable of addressing
$2^{20}$ = 1M (mega) memory locations.
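The relation between address lines and addressable locations is a single power of two. A quick check, with a hypothetical helper name:

```python
def addressable_locations(address_lines: int) -> int:
    """Number of distinct addresses a processor with k address lines can generate."""
    if address_lines < 0:
        raise ValueError("number of address lines cannot be negative")
    return 2 ** address_lines


for k in (16, 20, 32):
    print(k, addressable_locations(k))
```

Each extra address line doubles the addressable space, which is why 20 lines give 1M locations and 32 lines give 4G.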
The maximum number of bits that can be transferred to or from the memory at a time depends on the number of data lines
supported by the processor. From the system standpoint, the memory unit is viewed as a black box. Data
transfer between the memory and the processor takes place through the two processor registers
MAR (memory address register) and MDR (memory data register). If MAR is k bits long and MDR is n bits
long, then the memory unit may contain up to $2^k$ addressable locations, each n bits wide. The bus also includes the control
lines Read/Write (R/W) and MFC (memory function completed) for coordinating data transfers.
Random access memory (RAM) is the best known form of computer memory. RAM is considered
"random access" because you can access any memory cell directly if you know the row and column that
intersect at that cell. RAM data can therefore be accessed in any order.
RAM memory consists of memory cells. Each memory cell represents a single bit of data (logic 1 or logic
0). Memory cells are etched onto a silicon wafer in an array of columns (bit lines) and rows (word lines).
The intersection of a bit line and word line constitutes the address of the memory cell.
RAM memory is available in many physical forms. Memory chips in desktop computers originally used a
pin configuration called dual inline package (DIP). This arrangement was later replaced with memory
modules that consist of memory chips, along with all of the support components, on a separate printed
circuit board (PCB) that could then be plugged into a special connector (memory bank) on the
motherboard. The type of board and connector used for RAM in desktop computers has evolved over the
past years. First there were proprietary memory
modules, then came SIMMs (single in-line memory modules). There have been two different SIMM types
widely in use: a 30-pin connector version (8-bit bus version) and a 72-pin connector version (wider bus, more
address lines). As processors grew in speed and bandwidth capability, the industry adopted a new
standard, the dual in-line memory module (DIMM). Many brands of notebook computers use the small outline
dual in-line memory module (SODIMM).
Memory chips are nowadays normally available to the general public only in the form of a card called a
module. Most memory available today is highly reliable. Most systems simply have the memory
controller check for errors at start-up and rely on that. Memory chips with built-in error checking
typically use a method known as parity to check for errors. Parity chips have an extra bit for every 8 bits of
data. The way parity works is simple. With even parity, the extra bit is set so that the total number of 1s in
the data bits plus the parity bit is even; if a later read finds an odd number of 1s, an error has occurred.
Computers in critical positions need a
higher level of fault tolerance. High-end servers often have a form of error-checking known as
error-correction code (ECC). The majority of computers sold today use nonparity memory chips. These chips
do not provide any type of built-in error checking, but instead rely on the memory controller for error
detection.
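Even parity is easy to demonstrate in code. The sketch below assumes one parity bit per 8-bit byte; the function names are hypothetical.

```python
def even_parity_bit(data_byte: int) -> int:
    """Return the bit that makes the total number of 1s (data + parity) even."""
    if not 0 <= data_byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    return bin(data_byte).count("1") % 2


def check_even_parity(data_byte: int, parity_bit: int) -> bool:
    """True if the stored byte and its parity bit still contain an even number of 1s."""
    return (bin(data_byte).count("1") + parity_bit) % 2 == 0


stored = 0b10110010                      # four 1s, so the parity bit is 0
parity = even_parity_bit(stored)
corrupted = stored ^ 0b00000100          # a single bit flips in memory
print(parity, check_even_parity(stored, parity), check_even_parity(corrupted, parity))
```

Parity detects any single-bit error but cannot tell which bit flipped, and it misses errors that flip an even number of bits; locating and correcting errors is what ECC adds.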
A capacitor is like a small bucket that is able to store electrons. To store a 1 in the memory cell, the
bucket is filled with electrons. To store a 0, it is emptied. The problem with the capacitor's bucket is that it
has a leak (usually a full bucket becomes empty in a few milliseconds).
Therefore, for dynamic memory to work, either the CPU or the memory controller has to come
along and recharge all of the capacitors holding a logic 1 state voltage level before they discharge. To do
this, the memory controller reads the memory and then writes it right back. This refresh operation
typically happens automatically thousands of times per second.
DRAM works by sending a charge through the appropriate column (CAS) to activate the transistor at each
bit in the column. When writing, the row lines contain the state the capacitor should take on. When
reading, the sense-amplifier determines the level of charge in the capacitor. If it is more than 50 percent, it
reads it as a 1; otherwise it reads it as a 0. The counter tracks the refresh sequence based on which rows
have been accessed in what order. The length of time necessary to do all this is expressed in nanoseconds
(billionths of a second). A memory chip rating of 70ns means that it takes 70 nanoseconds to completely
read and recharge each cell.
The amount of time that RAM takes to write data or to read it once the request has been received from the
processor is called the access time. Typical access times vary from 9 nanoseconds to 70 nanoseconds,
depending on the kind of RAM. Although a shorter access time is better, user-perceived
performance is based on coordinating access times with the computer's clock cycles. Access time consists
of latency and transfer time. Latency is the time needed to coordinate signal timing and refresh data after reading
it.
A typical DRAM memory access procedure is the following. To read a memory cell, we place a row
address on the address bus lines (all the address lines together are called an address bus) and activate the
Row Address Strobe (RAS) line, then wait about 15 ns while the holding circuitry latches the row address.
Then we place the column address on the address bus and activate the Column Address Strobe (CAS) line.
Now we have to wait for the level-checking circuitry to determine whether the location contains a 0 or a 1. This
data will appear as a high or low voltage on the data output pin.
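The two-step addressing in this procedure can be sketched as follows. The 16 x 16 cell array and the function names are assumptions made for illustration; real chips use far larger arrays and have timing constraints that are not modeled here.

```python
ROW_BITS = 4   # ASSUMED: 16 rows
COL_BITS = 4   # ASSUMED: 16 columns


def split_address(addr):
    """Split a cell address into the row part and the column part."""
    row = addr >> COL_BITS               # upper bits, sent first with RAS
    col = addr & ((1 << COL_BITS) - 1)   # lower bits, sent second with CAS
    return row, col


def read_cell(memory, addr):
    """Read one bit from memory, a list of rows that each hold a list of bits."""
    row, col = split_address(addr)
    latched_row = memory[row]   # RAS step: the row address is latched
    return latched_row[col]     # CAS step: the column selects one bit


memory = [[0] * (1 << COL_BITS) for _ in range(1 << ROW_BITS)]
memory[3][5] = 1
addr = (3 << COL_BITS) | 5
bit = read_cell(memory, addr)
```

Sending the address in two halves is what lets the same address pins carry both the row and the column.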
Synchronous DRAMs
More recent developments in memory technology have resulted in DRAMs whose operation is directly
synchronized with a clock signal. Such memories are known as synchronous DRAMs (SDRAMs). The
figure below describes the structure of an SDRAM. The cell array is the same as in asynchronous
DRAMs. The address and data connections are buffered by means of registers. The output of each sense
amplifier is connected to a latch.
SDRAMs have several different modes of operation, which can be selected by writing control
information into a mode register.
Asynchronous DRAMs: In the DRAM, the timing of the memory device is controlled asynchronously.
A specialized memory controller circuit provides the necessary control signals, RAS and CAS, that
govern the timing. The processor must take into account the delay in the response of the memory. Such
memories are referred to as asynchronous DRAMs.
the state is changed with a write signal to that memory cell. A flip-flop for a memory cell takes four or six
transistors along with some wiring, which is much more than what is needed by DRAM. Therefore, you
get less memory per chip, and that makes static RAM a lot more expensive.
ASRAM: Async SRAM has been with us since the days of the 386, and is still in place in the L2
cache of many PCs. It's called asynchronous because it's not in sync with the system clock, and
therefore the CPU must wait for data requested from the L2 cache. However, the wait isn't as long
as it is with DRAM.
BSRAM: Burst SRAM (also known as Synch Burst SRAM) is synchronized with the system
clock or, in some cases, the cache bus clock. This allows it to be more easily synchronized with any
device that accesses it and reduces access waiting time. It is used as the external level-2 cache
memory for the Pentium II microprocessor chipset.
PB SRAM: Using burst technology, SRAM requests can be pipelined, or collected so that
requests within the burst are executed on a nearly instantaneous basis. PB SRAM uses pipelining,
and while it's slightly behind system synchronization speeds, it's a possible improvement over
Sync SRAM because it's designed to work well with bus speeds of 75 MHz and higher.
Static RAM is typically fast and expensive, so it is typically used to create the CPU's
speed-sensitive cache.
In addition, SRAM is sometimes used to store data "semi-permanently": when the system is
not powered up, the data in the SRAM chip is retained with the help of a small backup battery that
provides operating power to the memory while the rest of the system is not operating (there are special
SRAM ICs that consume very little power when they are not accessed, so they are suitable for
battery-backed applications).
ROM chips contain a grid of columns and rows. Where the columns and rows intersect, there
is a diode to connect the lines if the value is 1. If the value is 0, then the lines are not connected
at all.
PROM
Programmable read-only memory (PROM) is an integrated non-volatile memory circuit that is
manufactured blank. It can later be programmed with specific data. The programming can be done
only once; after programming, the data is stored permanently in the IC. Blank PROM chips can be bought
inexpensively and coded by anyone with a special tool called a programmer.
PROM chips have a grid of columns and rows just as ordinary ROMs do. The difference is that every
intersection of a column and row in a PROM chip has a fuse connecting them. A charge sent through a
column will pass through the fuse in a cell to a grounded row indicating a value of 1. Since all the cells
have a fuse, the initial (blank) state of a PROM chip is all 1s. To change the value of a cell to 0, you use a
programmer to send a specific amount of current to the cell. The higher voltage breaks the connection
between the column and row by burning out the fuse. This process is known as burning the PROM.
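The one-way nature of burning can be sketched in a few lines. This is a minimal model that assumes 8-bit cells; the class and method names are hypothetical. Because a blown fuse cannot be restored, programming can only turn 1s into 0s, which the bitwise AND captures.

```python
class Prom:
    """Toy PROM: every bit starts as 1 (fuse intact) and can only be cleared."""

    def __init__(self, size):
        self.cells = [0xFF] * size   # blank chip: all fuses intact, all 1s

    def burn(self, address, value):
        # Bits that are 0 in value blow the corresponding fuses.
        # Bits that are already 0 stay 0: a blown fuse cannot be repaired.
        self.cells[address] &= value

    def read(self, address):
        return self.cells[address]


prom = Prom(4)
blank_value = prom.read(0)
prom.burn(0, 0x0F)
first_value = prom.read(0)
prom.burn(0, 0xF0)            # an attempt to "reprogram" the same cell
second_value = prom.read(0)
```

The second burn does not produce 0xF0; it clears the remaining bits, which is why a PROM can be programmed only once in any useful sense.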
EPROM
Erasable programmable read-only memory (EPROM) chips work like PROM chips, but they can be rewritten
many times. EPROM is constructed to have a grid of columns and rows. In an EPROM, the cell at each
intersection has two transistors. The two transistors are separated from each other by a thin oxide layer.
One of the transistors is known as the floating gate and the other as the control gate. The floating gate's
only link to the row (wordline) is through the control gate.
As long as this link is in place, the cell has a value of 1. To change the value to 0 requires altering the
placement of electrons in the floating gate. An electrical charge, usually 10 to 13 volts, is applied to the
floating gate to charge it and thus turn the bit to 0.
A blank EPROM has all of the gates fully open, giving each cell a value of 1. Programming changes the
wanted cells to 0. To rewrite an EPROM, you must erase it first. Erasing an EPROM requires a special
tool that emits a certain frequency of ultraviolet (UV) light (253.7 nm wavelength). An EPROM eraser is
not selective; it will erase the entire EPROM. Erasing an EPROM typically takes several minutes (be careful
with the erasing time, because over-erasing can damage the IC). EPROMs are configured using an EPROM
programmer that provides voltage at specified levels depending on the type of EPROM used.
EEPROM
Electrically erasable programmable read-only memory (EEPROM) chips can be electrically
programmed and erased. EEPROMs are typically changed 1 byte at a time. Erasing an EEPROM
typically takes quite long.
The drawback of EEPROMs is their speed. EEPROM chips are too slow to use in many products
that make quick changes to the data stored on the chip.
EEPROMs are typically found in electronic devices for storing small amounts of nonvolatile
data in applications where speed is not the most important factor. Small EEPROMs with serial interfaces are
commonly found in many electronic devices.
Flash
Flash memory is a type of EEPROM that uses in-circuit wiring to erase by applying an electrical
field to the entire chip or to predetermined sections of the chip called blocks. Flash memory works much
faster than traditional EEPROMs because it writes data in chunks, usually 512 bytes in size, instead of 1
byte at a time.
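Block-oriented erasing can be sketched as below. The sketch assumes, for simplicity, that the erase block is the same size as the 512-byte write chunk mentioned above (real devices often use larger erase blocks), and the class and method names are hypothetical.

```python
BLOCK_SIZE = 512  # bytes per block; erase block size is an ASSUMPTION


class Flash:
    """Toy Flash device: erased bytes read as 0xFF, erasing works per block."""

    def __init__(self, num_blocks):
        self.data = bytearray([0xFF]) * (num_blocks * BLOCK_SIZE)
        self.erase_count = 0

    def erase_block(self, block):
        start = block * BLOCK_SIZE
        self.data[start:start + BLOCK_SIZE] = b"\xFF" * BLOCK_SIZE
        self.erase_count += 1

    def update_byte(self, address, value):
        """Change one byte by erasing and rewriting its whole block."""
        block = address // BLOCK_SIZE
        start = block * BLOCK_SIZE
        copy = bytearray(self.data[start:start + BLOCK_SIZE])  # save block
        copy[address - start] = value
        self.erase_block(block)                                # erase block
        self.data[start:start + BLOCK_SIZE] = copy             # write chunk


flash = Flash(num_blocks=2)
flash.update_byte(600, 0x12)   # address 600 lies in block 1
```

Changing a single byte costs a whole block erase and rewrite, but moving data in whole chunks is also what makes bulk writes faster than byte-at-a-time EEPROM updates.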
Flash memory has many applications. The PC BIOS chip might be the most common form of Flash memory.
Removable solid-state storage devices are becoming increasingly popular. Smart Media and Compact
Flash cards are both well-known, especially as "electronic film" for digital cameras. Other removable
Flash memory products include Sony's Memory Stick, PCMCIA memory cards, and memory cards for
video game systems.
The correspondence between the main memory blocks and those in the cache is specified by a mapping function.
When the cache is full and a memory word that is not in the cache is referenced, the cache control
hardware must decide which block should be removed to create space for the new block. The collection of
rules for making this decision constitutes the replacement algorithm.
Mapping Functions
There are three main mapping techniques that decide the cache organization:
1. Direct-mapping technique
2. Associative-mapping technique
3. Set-associative mapping technique
To discuss possible methods for specifying where memory blocks are placed in the cache, we use a
specific small example: a cache consisting of 128 blocks of 16 words each, for a total of 2048 (2K) words,
and a main memory that is addressable by a 16-bit address. The main memory has 64K words,
which will be viewed as 4K blocks of 16 words each. Consecutive addresses refer to consecutive words.
Direct mapping is the simplest mapping technique: each block from the main memory has only one possible
location in the cache. Block j of the main memory maps onto block j
modulo 128 of the cache. Therefore, whenever one of the main memory blocks 0, 128, 256, ... is
loaded into the cache, it is stored in cache block 0. Blocks 1, 129, 257, ... are stored in cache block 1, and
so on.
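For the example cache (128 blocks of 16 words, 16-bit addresses), the direct-mapped placement can be sketched as follows. The field widths follow from the example: 16 words per block need 4 bits, 128 cache blocks need 7 bits, and the remaining 5 bits form the tag. The function name is illustrative.

```python
WORD_BITS = 4     # 16 words per block
BLOCK_BITS = 7    # 128 blocks in the cache
CACHE_BLOCKS = 1 << BLOCK_BITS


def direct_map(address):
    """Split a 16-bit word address into (tag, cache block, word)."""
    word = address & ((1 << WORD_BITS) - 1)
    memory_block = address >> WORD_BITS        # block number 0..4095
    cache_block = memory_block % CACHE_BLOCKS  # j modulo 128
    tag = memory_block // CACHE_BLOCKS         # upper 5 bits
    return tag, cache_block, word


# Word 3 of main memory block 129
fields = direct_map(129 * 16 + 3)

# Main memory blocks 0, 128 and 256 all compete for cache block 0
shared = [direct_map(block * 16)[1] for block in (0, 128, 256)]
```

The tag stored with each cache block is what distinguishes memory blocks 0, 128, 256, ... when they occupy the same cache position.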
Set-Associative Mapping
Set-associative mapping is a combination of the direct- and associative-mapping techniques. Blocks of the cache are
grouped into sets, and the mapping allows a block of main memory to reside in any block of a specific
set. With two blocks per set, memory blocks 0, 64, 128, ..., 4032 map into cache set 0, and they can occupy either
of the two block positions within this set. The cache might contain
the desired block. The tag field of the address must then be associatively compared to the tags of the two
blocks of the set to check if the desired block is present. This two-way associative search is simple to implement.
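The two-way lookup can be sketched for the same example cache. With two blocks per set, the 128 cache blocks form 64 sets, so the address splits into a 6-bit tag, a 6-bit set number and a 4-bit word. Representing each set as a list of tags is an assumption made for illustration.

```python
WORD_BITS = 4
NUM_SETS = 64   # 128 cache blocks grouped two per set


def set_and_tag(address):
    """Return (set number, tag) for a 16-bit word address."""
    memory_block = address >> WORD_BITS
    return memory_block % NUM_SETS, memory_block // NUM_SETS


def is_hit(cache, address):
    """cache[s] holds the tags (at most two) of the blocks now in set s."""
    s, tag = set_and_tag(address)
    # Compare the tag against both blocks of the selected set only.
    return any(tag == stored for stored in cache[s])


cache = [[] for _ in range(NUM_SETS)]
cache[0] = [0, 63]   # set 0 currently holds memory blocks 0 and 4032

hit_block_4032 = is_hit(cache, 4032 * 16)
hit_block_64 = is_hit(cache, 64 * 16)
```

Only two tag comparisons are needed per access, instead of one per cache block as in a fully associative search.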
Replacement Algorithms
In a direct-mapped cache, the position of each block is fixed, hence no replacement strategy exists. In
associative and set-associative caches, when a new block is to be brought into the cache and all the
positions that it may occupy are full, the cache controller must decide which of the old blocks to
overwrite. This is an important issue because the decision can be a significant factor in system performance.
The objective is to keep blocks in the cache that are likely to be referenced in the near future. It is not easy
to determine which blocks are about to be referenced. The property of locality of reference gives a clue to
a reasonable strategy. When a block is to be overwritten, it is sensible to overwrite the one that has gone
the longest time without being referenced. This block is called the least recently used (LRU) block, and
the technique is called the LRU replacement algorithm. The LRU algorithm has been used extensively for
many access patterns, but it can lead to poor performance in some cases. For example, it produces
disappointing results when accesses are made to sequential elements of an array that is slightly too large
to fit into the cache. Performance of the LRU algorithm can be improved by introducing a small amount of
randomness in deciding which block to replace.
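LRU replacement and its weak case can be sketched with a small fully associative cache. The 4-block capacity and the access traces are assumptions chosen to keep the example short.

```python
from collections import OrderedDict


class LRUCache:
    """Fully associative cache of block numbers with LRU replacement."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()   # oldest entry = least recently used
        self.hits = 0
        self.misses = 0

    def access(self, block):
        if block in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block)        # now most recently used
        else:
            self.misses += 1
            if len(self.blocks) >= self.capacity:
                self.blocks.popitem(last=False)   # evict the LRU block
            self.blocks[block] = True


def run(capacity, trace):
    cache = LRUCache(capacity)
    for block in trace:
        cache.access(block)
    return cache.hits, cache.misses


fits = run(4, [0, 1, 2, 3] * 3)          # the loop fits in the cache
too_large = run(4, [0, 1, 2, 3, 4] * 3)  # the loop is one block too large
```

When the loop is just one block larger than the cache, LRU always evicts the block that will be needed next, so every access misses. This is the case in which a little randomness in the choice of victim can help.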
Virtual Memory
Dept Of CSE, ACE Bangalore
A cache stores a subset of the address space of RAM. An address space is the set of valid addresses.
Thus, for each address in cache, there is a corresponding address in RAM. This subset of addresses (and
corresponding copy of data) changes over time, based on the behavior of your program.
Cache is used to keep the most commonly used sections of RAM in the cache, where it can be accessed
quickly. This is necessary because CPU speeds increase much faster than speed of memory access. If we
could access RAM at 3 GHz, there wouldn't be any need for cache, because RAM could keep up. Because
it can't keep up, we use cache.
One way to extend the amount of memory accessible by a program is to use disk. Thus, we can use 10
Megs of disk space. At any time, only 1 Meg resides in RAM. In effect, RAM acts like cache for disk.
This idea of extending memory is called virtual memory. It's called "virtual" only because it's not RAM. It
doesn't mean it's fake.
The real problem with disk is that it's really, really slow to access. If registers can be accessed in 1
nanosecond, and cache in 5 ns and RAM in about 100 ns, then disk is accessed in fractions of seconds. It
can be a million times slower to access disk than a register.
The advantage of disk is it's easy to get lots of disk space for a small cost. Still, because disk is so slow to
access, we want to avoid accessing disk unnecessarily.
A load instruction
A store instruction
Fetching an instruction
Load/store create data addresses, while fetching an instruction creates instruction addresses. Of course,
RAM doesn't distinguish between the two kinds of addresses. It just sees it as an address.
Each address generated by a program is considered virtual. It must be translated to a real physical address.
Thus, address translation is occurring all the time. As you might imagine, this must be handled in
hardware, if it's to be done efficiently.
You might think translating each address from virtual to physical is a crazy idea, because of how slow it
is. However, you get memory protection from address translation, so it's worth the hardware needed to get
memory protection.
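Address translation with a protection check can be sketched as below. The text does not fix a translation scheme, so the page-based table, the 4096-byte page size and all names here are assumptions made for illustration; in a real system this work is done in hardware.

```python
PAGE_SIZE = 4096  # ASSUMED page size in bytes


class PageFault(Exception):
    """The virtual page is not currently in RAM."""


class ProtectionFault(Exception):
    """The program is not allowed to perform this access."""


def translate(page_table, virtual_address, write=False):
    """Map a virtual address to a physical address.

    page_table maps a virtual page number to (frame number, writable).
    """
    page, offset = divmod(virtual_address, PAGE_SIZE)
    entry = page_table.get(page)
    if entry is None:
        raise PageFault(page)
    frame, writable = entry
    if write and not writable:
        raise ProtectionFault(page)
    return frame * PAGE_SIZE + offset


page_table = {0: (5, True), 1: (2, False)}   # hypothetical mappings
load_address = translate(page_table, 4100)   # a load from virtual page 1

try:
    translate(page_table, 4100, write=True)  # a store to a read-only page
    store_blocked = False
except ProtectionFault:
    store_blocked = True
```

Because every load, store and instruction fetch passes through the same translation step, that step is a natural place to enforce memory protection.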
Secondary Storage
Electronic data is a sequence of bits. This data can either reside in
primary storage - main memory (RAM), relatively small, fast access, expensive (cost per MB), volatile
(goes away when power goes off)
secondary storage - disks, tape, large amounts of data, slower access, cheap (cost per MB), persistent
(remains even when power is off)
We will focus on secondary storage since the collections of data in databases are usually both too large to
fit in primary storage and must be persistent.
Hard Disks
Features
spinning platter of special material
mechanical arm with read/write head must be close to the platter to read/write data
data is stored magnetically (if you'd like to keep your data it is usually best to avoid using
powerful magnets near your hard disk)
sometimes the mechanical arm digs into platter, resulting in a very bad crash and subsequent
loss of data on part of your hard disk
storage capacity is commonly between 2GB - 11GB
disks are random access, meaning data can be read/written anywhere on the disk
to read a piece of data, the mechanical arm must be repositioned over the place on the platter where
that data is stored; this is called the disk seek. 8 to 15 milliseconds is a common seek time.
once the arm has been positioned the data transfer rate varies, but commonly is between 1MB
and 10MB a second
a 5GB hard disk will cost anywhere from $300 to $1500, there are many options and vendors
SCSI (Small Computer System Interface), special hardware to improve throughput, 100s MB
per second transfer rates
solid state hard disks, with no mechanical parts, are starting to become commercially available,
they are generally faster and more expensive
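The seek time and transfer rate listed above combine into a rough estimate of how long one read takes. This is a minimal sketch: the 10 ms seek and 5 MB per second rate are mid-range values assumed from the quoted ranges, rotational delay is ignored, and the function name is illustrative.

```python
def read_time_ms(size_mb, seek_ms, transfer_mb_per_s):
    """Seek time plus transfer time for one contiguous read, in milliseconds."""
    transfer_ms = size_mb * 1000.0 / transfer_mb_per_s
    return seek_ms + transfer_ms


SEEK_MS = 10.0            # ASSUMED, within the 8 to 15 ms range
TRANSFER_MB_PER_S = 5.0   # ASSUMED, within the 1 to 10 MB per second range

small_read = read_time_ms(0.004, SEEK_MS, TRANSFER_MB_PER_S)  # about 4 KB
large_read = read_time_ms(1.0, SEEK_MS, TRANSFER_MB_PER_S)    # 1 MB

seek_share_small = SEEK_MS / small_read
seek_share_large = SEEK_MS / large_read
```

Under these assumptions the seek dominates small reads and is a minor part of large ones, which is why data that is used together is best stored together on disk.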
Optical Disks
CD-ROM - read only (books, software releases)
WORM - write once, read many (archival storage)
laser encoding, not magnetic
30-50 ms seek times
640MB - 17GB storage capacity
cheaper than hard disks per MB of storage capacity, but slower
portable
Jukeboxes of optical disks are becoming popular for storing really, really large collections of
data. The Mercury-20 jukebox (no, I'm not selling these, just using it as a typical example)
provides access to up to 150 CD-ROMs, or in other words 94 GB of storage capacity. The
Mercury jukebox takes a maximum of four seconds to exchange and load a disc into a drive, 2.5
seconds to spin up and access the data, and 10 seconds to transfer a 6.0 MB file to the computer or
server.
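The jukebox figures can be checked with a little arithmetic. The 640 MB per disc is an assumption taken from the low end of the capacity range listed above; the timings are the ones quoted for the jukebox.

```python
DISCS = 150
MB_PER_DISC = 640          # ASSUMED capacity of one CD-ROM

capacity_gb = DISCS * MB_PER_DISC / 1024   # total capacity in GB

EXCHANGE_S = 4.0           # maximum time to exchange and load a disc
SPIN_UP_S = 2.5            # spin up and access the data
TRANSFER_S = 10.0          # transfer a 6.0 MB file
FILE_MB = 6.0

total_s = EXCHANGE_S + SPIN_UP_S + TRANSFER_S
effective_mb_per_s = FILE_MB / total_s     # rate seen by the requester
```

With these numbers the capacity comes to just under 94 GB, and fetching a 6 MB file from a disc that is not yet loaded takes 16.5 seconds in the worst case, an effective rate well below 1 MB per second.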