Vous êtes sur la page 1sur 71

Memory

Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University

Big Picture: Building a Processor

memory

inst

register file
+4

alu

+4

PC
offset

=? control
imm cmp

addr din dout memory

new pc

target extend

A Single cycle processor

Goals for today


Review
Finite State Machines

Memory
Register Files Tri-state devices SRAM (Static RAMrandom access memory) DRAM (Dynamic RAM)

Which statement(s) is true


(A) In a Moore Machine output depends on both current state and input (B) In a Mealy Machine output depends on current state and input (C) In a Mealy Machine output depends on next state and input (D) All the above are true (E) None are true

Mealy Machine
General Case: Mealy Machine
Registers

Current State Input

Comb. Logic

Output Next State

Outputs and next state depend on both current state and input

Moore Machine
Special Case: Moore Machine
Registers

Current State Input

Comb. Logic Comb. Logic

Output Next State

Outputs depend only on current state

Example #2: Digital Door Lock


Digital Door Lock Inputs:
keycodes from keypad clock

Outputs:
unlock signal display how many keys pressed so far

Door Lock: Inputs


Assumptions:
signals are synchronized to clock Password is B-A-B
K A B Meaning 0 0 0 (no key) 1 1 0 A pressed 1 0 1 B pressed

K A B

Door Lock: Outputs


Assumptions:
High pulse on U unlocks door D3D2D1D0
4 LED 8 dec

Strategy: (1) Draw a state diagram (e.g. Moore Machine) (2) Write output and next-state tables (3) Encode states, inputs, and outputs as bits (4) Determine logic equations for next state and outputs

Door Lock: Simplified State Diagram

G1 1

A else
G2 2

G3 3, U

B
Idle 0

else

any

else
B1 1

any
else B2
2

else

B3 3

(1) Draw a state diagram (e.g. Moore Machine)

Door Lock: Simplified State Diagram

G1 1

A else
G2 2

G3 3, U

B
Idle 0

else

any

else
B1 1

else
else B2
2

(1) Draw a state diagram (e.g. Moore Machine)

Door Lock: Simplified State Diagram

G1 1

A else
G2 2

B Cur. State

G3 3, U

B
Idle 0

else

any Output

else
B1 1

else
else B2
2

(2) Write output and next-state tables

Door Lock: Simplified State Diagram

G1 1

A else
G2 2

B Cur. State Idle G1 G2 G3 B1 B2

G3 3, U

B
Idle 0

else

any Output 0 1 2 3, U 1 2

else
B1 1

else
else B2
2

(2) Write output and next-state tables

Door Lock: Simplified State Diagram

G1 1

Cur. State

Input

Next State

A
else

G2 2

G3 3, U

B
Idle 0

else

any

else
B1 1

else

else B2

(2) Write output and next-state tables

Door Lock: Simplified State Diagram

G1 1

A
else

B
Idle 0

else
B1 1

else

else

(2) Write output and next-state tables

Cur. State Idle G2 Idle B 2Idle G1 else G1 G1 G2 G2 G2 G3 B2 B1 2 B1 B2 B2

Input Next State Idle BG3 G1 3, U A B1 any G1 A G2 B B2 B2 B G3 A Idle any Idle B1 K B2 B2 K Idle

State Table Encoding


S2 S1 S0 Cur. State 0 Idle 0 0 0 G1 1 0 0 G2 0 1 0 G3 1 1 1 B1 0 0 1 B2 1 0 D3 0 0 0 0 0 0 D2 D1 D0 Output 0 0 0 0 0 1 1 0 0 2 0 1 03, U1 1 0 1 1 0 0 2 0 1 U 0 0 0 1 0 0 Cur.S1 S0 S2 State 0 Idle 0 0 0 Idle 0 0 0 Idle 0 0 0 G1 1 0 0 G1 1 0 0 G1 1 0 0 G2 0 1 0 G2 0 1 0 G2 0 1 0 G3 1 1 1 B1 0 0 1 B1 0 0 1 B2 1 0 1 B2 1 0 K Input B A 0 0 0 1 B 1 0 1 A 0 1 0 0 0 1 A 0 1 1 B 1 0 0 0 0 1 B 1 0 1 A 0 1 x any x x 0 0 0 1 K x x 0 0 0 1 K x x S2 SState0 Next 1 S 0 Idle 0 0 0 G1 1 0 1 B1 0 0 0 G1 1 0 0 G2 0 1 1 B2 1 0 0 B2 0 1 0 G3 1 1 0 Idle 0 0 0 Idle 0 0 1 B1 0 0 1 B2 1 0 1 B2 1 0 0 Idle 0 0

State B4 Meaning S0 S2 S1 8 KD A D3D2D1 0 Idle 0 dec 0 0 0 (no key) 0 0 U 0 0 G1 0 A pressed1 1 1 G2 1 B pressed0 0 1K 1 0 A G3 0 1 1 B B1 1 0 0 (3) Encode states, inputs, and outputs as bits B2 1 0 1

Door Lock: Implementation


3bit Reg

S2-0

D3-0
U

clk

S2-0 K A S2-0

U = S2S1S0 D0 = S2S1S0 + S2S1S0 + S2 S1S0 D1 = S2S1S0 + S2S1S0 + S2S1S0

S2 0 0 0 0 1 1

dec

S1 0 0 1 1 0 0

S0 0 1 0 1 0 1

D3 0 0 0 0 0 0

D2 0 0 0 0 0 0

D1 0 0 1 1 0 1

D0 0 1 0 1 1 0

U 0 0 0 1 0 0

(4) Determine logic equations for next state and outputs

Door Lock: Implementation


3bit Reg

S2-0

clk

S2-0 K A

S0 = ? S1 = ? S2 = S2S1S0KAB + S2S1S0KAB + S2S1S2KAB + S2S1S0K + S2 S1S0 KAB

S2 S1 S 0 4 D0 0 0 3-0 0 0 0 U 0 0 0 0 0 1 0 0 1 0 0 1 S 1 0 2-0 0 0 1 0 0 1 0 0 1 1 1 0 0 1 0 0 1 0 1 1 0 1

K 0 1 1 0 1 1 0 1 1 x 0 1 0 1

A 0 0 1 0 1 0 0 0 1 x 0 x 0 x

B 0 1 0 0 0 1 0 1 0 x 0 x 0 x

S2 S1 S0 0 0 0 0 0 1 1 0 0 0 0 1 0 1 0 1 0 1 0 1 0 0 1 1 0 0 0 0 0 0 1 0 0 1 0 1 1 0 1 0 0 0

dec

Door Lock: Implementation


3bit Reg

S2-0

D3-0
U

clk

S2-0 K A S2-0

Strategy: (1) Draw a state diagram (e.g. Moore Machine) (2) Write output and next-state tables (3) Encode states, inputs, and outputs as bits (4) Determine logic equations for next state and outputs

dec

Administrivia
Make sure to go to your Lab Section this week
Find project partners this week (for upcoming project1 next week) Lab2 due in class this week (it is not homework) Design Doc for Lab1 due yesterday, Monday, Feb 4th Completed Lab1 due next week, Monday, Feb 11th Work alone

Homework1 is due Wednesday


Work alone Academic Integrity

BUT, use your resources


Lab Section, Piazza.com, Office Hours, Homework Help Session, Class notes, book, Sections, CSUGLab

Second C Primer: Thursday, B14 Hollister, 6-8pm

Administrivia
Check online syllabus/schedule
http://www.cs.cornell.edu/Courses/CS3410/2013sp/schedule.html

Slides and Reading for lectures Office Hours Homework and Programming Assignments Prelims (in evenings):
Tuesday, February 26th Thursday, March 28th Thursday, April 25th

Schedule is subject to change

Collaboration, Late, Re-grading Policies


Black Board Collaboration Policy
Can discuss approach together on a black board Leave and write up solution independently Do not copy solutions

Late Policy
Each person has a total of four slip days Max of two slip days for any individual assignment Slip days deducted first for any late assignment,

cannot selectively apply slip days For projects, slip days are deducted from all partners 25% deducted per day late after slip days are exhausted

Regrade policy
Submit written request to lead TA,

and lead TA will pick a different grader Submit another written request, lead TA will regrade directly Submit yet another written request for professor to regrade.

Lab1: Thoughts thus far


Great experience! But, read all instructions (multiple times)! Sign up for design doc meetings early
We were very lenient for Lab1 In future, CMS will lock the Friday after release lose points for not signing up before lock, canceling last minute, and/or not showing up

Design doc meeting


Not the same as office hours You are presenting your current design during design doc meeting Bring printout of your design document TA will grade and give feedback based on your presentation of your design

Goals for today


Memory
CPU: Register Files (i.e. Memory w/in the CPU) Scaling Memory: Tri-state devices Cache: SRAM (Static RAMrandom access memory) Memory: DRAM (Dynamic RAM)

Goal:
How do we store results from ALU computations? How do we use stored results in subsequent operations? Register File How does a Register File work? How do we design it?

Big Picture: Building a Processor

memory

inst

register file
+4

alu

+4

PC
offset

=? control
imm cmp

addr din dout memory

new pc

target extend

A Single cycle processor

Register File
Register File
N read/write registers Indexed by register number
32

DW Dual-Read-Port

QA

32

Single-Write-Port Q B 32 x 32 Register File W


1

32

RW RA RB
5 5 5

Register File
D0 D1 D2 D3
4

Recall: Register D flip-flops in parallel shared clock extra clocked inputs: write_enable, reset,

clk

4-bit reg 4 clk

Register File
Register File
N read/write registers Indexed by register number
32 D
5-to-32 decoder

Reg 0 Reg 1

Reg 30 Reg 31

RW W

How to write to one register in the register file?


Need a decoder

Activity# write truth table for 3-to-8 decoder


Register File
N read/write registers Indexed by register number
32 D
5-to-32 decoder

Reg 0 Reg 1

Reg 30 Reg 31

RW W

How to write to one register in the register file?


Need a decoder

Register File
Register File
N read/write registers Indexed by register number
Reg 0 Reg 1 32
M U X

32 QA

Reg 30 Reg 31

. How to read from two registers?


Need a multiplexor
5 RA

M U X

32 QB

5 RB

Register File
Register File
N read/write registers Indexed by register number
32 D
Reg 0 Reg 1
5-to-32 decoder

32

Reg 30 Reg 31

M U X

32 QA

Implementation:
D flip flops to store bits Decoder for each write port Mux for each read port

M U X

32 QB

5 W RW

5 RA

5 RB

Register File
Register File
N read/write registers Indexed by register number
32

DW Dual-Read-Port

QA

32

Single-Write-Port Q B 32 x 32 Register File W


1

32

Implementation:
D flip flops to store bits Decoder for each write port Mux for each read port

RW RA RB
5 5 5

Register File
Register File
What happens if same N read/write registers register read and written Indexed by during same clock cycle?
register number

Implementation:
D flip flops to store bits Decoder for each write port Mux for each read port

Tradeoffs
Register File tradeoffs
a
+ Very fast (a few gate delays for b both read and write) c + Adding extra ports is d straightforward e Doesnt scale f e.g. 32MB register file with g 32 bit registers Need 32x 1M-to-1 multiplexor h and 32x 10-to-1M decoder How many logic gates/transistors?

8-to-1 mux

s2s1 0 s

Takeway
Register files are very fast storage (only a few gate delays), but does not scale to large memory sizes.

Goals for today


Memory
CPU: Register Files (i.e. Memory w/in the CPU) Scaling Memory: Tri-state devices Cache: SRAM (Static RAMrandom access memory) Memory: DRAM (Dynamic RAM)

Next Goal
How do we scale/build larger memories?

Building Large Memories


Need a shared bus (or shared bit line)
Many FlipFlops/outputs/etc. connected to single wire Only one output drives the bus at a time

D0 S0

D1 S1

D2 S2

D3 S3

D1023 S1023

shared line
How do we build such a device?

Tri-State Devices
Tri-State Buffers
If enabled (E=1), then Q = D Otherwise, Q is not connected (z = high impedance)

D
E 0 0 1 1 D Q 0 z 1 z 0 0 1 1

Activity#2: Tri-State Buffer from an Inverter


Tri-State Buffers
If enabled (E=1), then Q = D Otherwise, Q is not connected (z = high impedance)

E
Vsupply

D
E 0 0 1 1 D Q 0 z 1 z 0 0 1 1

Q
D

Gnd

Tri-State Devices
Tri-State Buffers
If enabled (E=1), then Q = D Otherwise, Q is not connected (z = high impedance)

D
E 0 0 1 1 D Q 0 z 1 z 0 0 1 1

Q
D

E D

Vsupply

Gnd

Tri-State Devices
Tri-State Buffers
If enabled (E=1), then Q = D Otherwise, Q is not connected (z = high impedance)

D
E 0 0 1 1 D Q 0 z 1 z 0 0 1 1

Q
D

E
0 0
A
0 0 1

Vsupply
1 0

off
off

B
0 1 0

AND NAND
0 0 0 1 1 1

A B OR NOR
0 0 0 0 1 1 1 0 1 1 0 0

Gnd

1 1 1

Tri-State Devices
Tri-State Buffers
If enabled (E=1), then Q = D Otherwise, Q is not connected (z = high impedance)

D
E 0 0 1 1 D Q 0 z 1 z 0 0 1 1

Q
D
0

E
1 0 1 0
A
0 0 1

Vsupply
1 1

off
on

B
0 1 0

AND NAND
0 0 0 1 1 1

A B OR NOR
0 0 0 0 1 1 1 0 1 1 0 0

Gnd

1 1 1

Tri-State Devices
Tri-State Buffers
If enabled (E=1), then Q = D Otherwise, Q is not connected (z = high impedance)

D
E 0 0 1 1 D Q 0 z 1 z 0 0 1 1

Q
D
1

E
1 1 1 1
A
0 0 1

Vsupply
0 0

on
off

B
0 1 0

AND NAND
0 0 0 1 1 1

A B OR NOR
0 0 0 0 1 1 1 0 1 1 0 0

Gnd

1 1 1

Shared Bus
D0 S0 D1 S1 D2 S2 D3 S3 D1023 S1023

shared line

Takeway
Register files are very fast storage (only a few gate delays), but does not scale to large memory sizes. Tri-state Buffers allow scaling since multiple registers can be connected to a single output, while only one register actually drives the output.

Goals for today


Memory
CPU: Register Files (i.e. Memory w/in the CPU) Scaling Memory: Tri-state devices Cache: SRAM (Static RAMrandom access memory) Memory: DRAM (Dynamic RAM)

Next Goal
How do we build large memories?

Use similar designs as Tri-state Buffers to connect multiple registers to output line. Only one register will drive output line.

SRAM
Static RAM (SRAM)Static Random Access Memory
Essentially just D-Latches plus Tri-State Buffers A decoder selects which line of memory to access Data (i.e. word line) A R/W selector determines the type of access That line is then coupled to the data lines
Address Decoder

SRAM
Static RAM (SRAM)Static Random Access Memory
Essentially just D-Latches plus Tri-State Buffers A decoder selects which line of memory to access (i.e. word line) A R/W selector determines the 22 Address type of access That line is then coupled to SRAM 8 8 4M x 8 the data lines Din
Dout

Chip Select Write Enable Output Enable

SRAM
Din[1] Din[2]
D Q
enable

E.g. How do we design a 4 x 2 SRAM Module?


0

D Q
enable

(i.e. 4 word lines that are each 2 bits wide)?


Address 2

2-to-4 decoder

D Q
enable

D Q
enable

D Q 4 x 2 SRAM
enable

D Q
enable

2 D Q 3
enable

D Q
enable

Write Enable Output Enable Dout[1]

Dout[2]

SRAM
Din[1] Din[2]
D Q
enable

E.g. How do we design a 4 x 2 SRAM Module?


0

D Q
enable

(i.e. 4 word lines that are each 2 bits wide)?


Address 2

2-to-4 decoder

D Q
enable

D Q
enable

D Q
enable

D Q
enable

2 D Q 3
enable

D Q
enable

Write Enable Output Enable Dout[1]

Dout[2]

SRAM
Word line

Bit line

Din[1]
D Q
enable

Din[2]
D Q
enable

E.g. How do we design a 4 x 2 SRAM Module?


0

(i.e. 4 word lines that are each 2 bits wide)?


Address 2

2-to-4 decoder

D Q
enable

D Q
enable

D Q
enable

D Q
enable

2 D Q 3
enable

D Q
enable

Write Enable Output Enable Dout[1]

Dout[2]

SRAM Cell
Pass-Through Transistors

bit line

Typical SRAM Cell

word line

Each cell stores one bit, and requires 4 8 transistors (6 is typical)

SRAM Cell
2) Enable(wordline = 0) Disabled (wordline = 1)
1) Pre-charge B = Vsupply/2 3) Cell pulls B high i.e. B = 1

bit line

Typical SRAM Cell


1
on off

word line
1) Pre-charge B = Vsupply/2 3) Cell pulls B low i.e. B = 0

0
on off

Each cell stores one bit, and requires 4 8 transistors (6 is typical) Read: pre-charge B and B to Vsupply/2 pull word line high cell pulls B or B low, sense amp detects voltage difference

SRAM Cell
1) Enable(wordline = 0) Disabled (wordline = 1)

bit line

Typical SRAM Cell


1 0
2) Drive B low i.e. B = 0

word line

0 1
on off
2) Drive B high i.e. B = 1

on off

Each cell stores one bit, and requires 4 8 transistors (6 is typical) Read: pre-charge B and B to Vsupply/2 pull word line high cell pulls B or B low, sense amp detects voltage difference Write: pull word line high drive B and B to flip cell

SRAM
Word line

Bit line

Din[1]
D Q
enable

Din[2]
D Q
enable

E.g. How do we design a 4 x 2 SRAM Module?


0

(i.e. 4 word lines that are each 2 bits wide)?


Address 2

2-to-4 decoder

D Q
enable

D Q
enable

D Q
enable

D Q
enable

2 D Q 3
enable

D Q
enable

Write Enable Output Enable Dout[1]

Dout[2]

SRAM
Din[1] Din[2]
D Q
enable

E.g. How do we design a 4 x 2 SRAM Module?


0

D Q
enable

(i.e. 4 word lines that are each 2 bits wide)?


Address 2

2-to-4 decoder

D Q
enable

D Q
enable

D Q 4 x 2 SRAM
enable

D Q
enable

2 D Q 3
enable

D Q
enable

Write Enable Output Enable Dout[1]

Dout[2]

SRAM
Din

E.g. How do we design a 4M x 8 SRAM Module? (i.e. 4M word lines that are each 8 bits wide)?
22 Address

4M x 8 SRAM

Chip Select Write Enable Output Enable Dout

SRAM
E.g. How do we design a 4M x 8 SRAM Module? 4M x 8 SRAM
12 x 4096 decoder

Address [21-10]

12

4k x 4k x 4k x 4k x 4k x 4k x 4k x 4k x 1024 1024 1024 1024 1024 1024 1024 1024 SRAM SRAMSRAM SRAMSRAM SRAMSRAM SRAM

Address [9-0]

10

1024

1024 1024

1024 1024

1024 1024

1024

mux mux mux


1 1 1

mux mux mux mux


1 1 1 1

mux
1

Dout[7]Dout[6] Dout[5]Dout[4] Dout[3]Dout[2] Dout[1]Dout[0]

SRAM
E.g. How do we design a 4M x 8 SRAM Module? 4M x 8 SRAM
Row decoder

Address [21-10]

12

4k x 4k x 4k x 4k x 4k x 4k x 4k x 4k x 1024 1024 1024 1024 1024 1024 1024 1024 SRAM SRAMSRAM SRAMSRAM SRAMSRAM SRAM

Address [9-0]

10

1024

1024 1024

1024 1024

1024 1024

1024

Chip Select (CS) R/W Enable

column selector, sense amp, and I/O circuits

Shared Data Bus

SRAM Modules and Arrays


4M x 8 SRAM 4M x 8 SRAM 4M x 8 SRAM 4M x 8 SRAM

A21-0

R/W

CS
msb lsb CS

Bank 2 Bank 3 Bank 4

CS

CS

SRAM Summary
SRAM A few transistors (~6) per cell Used for working memory (caches) But for even higher density

Dynamic RAM: DRAM


Dynamic-RAM (DRAM)
Data values require constant refresh bit line

word line

Capacitor Gnd
Each cell stores one bit, and requires 1 transistors

Dynamic RAM: DRAM


Dynamic-RAM (DRAM)
Pass-Through Transistors

Data values require constant refresh

bit line

word line

Capacitor Gnd
Each cell stores one bit, and requires 1 transistors

Dynamic RAM: DRAM


Dynamic-RAM (DRAM)
Disabled (wordline = 0) 2) Enable (wordline = 1)

bit line

word line

0 Capacitor Gnd
off on

1) Pre-charge B = Vsupply/2 3) Cell pulls B low i.e. B = 0

Each cell stores one bit, and requires 1 transistors Read: pre-charge B and B to Vsupply/2 pull word line high cell pulls B low, sense amp detects voltage difference

Dynamic RAM: DRAM


Dynamic-RAM (DRAM)
Disabled (wordline = 0) 1) Enable (wordline = 1)

bit line

word line

0 1 Capacitor Gnd
on off
2) Drive B high i.e. B = 1 Charges capacitor

Each cell stores one bit, and requires 4 8 transistors (6 is typical) Read: pre-charge B and B to Vsupply/2 pull word line high cell pulls B or B low, sense amp detects voltage difference Write: pull word line high drive B charges capacitor

DRAM vs. SRAM


Single transistor vs. many gates
Denser, cheaper ($30/1GB vs. $30/2MB) But more complicated, and has analog sensing

Also needs refresh


Read and write back every few milliseconds Organized in 2D grid, so can do rows at a time Chip can do refresh internally

Hence slower and energy inefficient

Memory
Register File tradeoffs
+ + Very fast (a few gate delays for both read and write) Adding extra ports is straightforward Expensive, doesnt scale Volatile

Volatile Memory alternatives: SRAM, DRAM,


Slower + Cheaper, and scales well Volatile

Non-Volatile Memory (NV-RAM): Flash, EEPROM,


+ Scales well Limited lifetime; degrades after 100000 to 1M writes

Summary
We now have enough building blocks to build machines that can perform non-trivial computational tasks Register File: Tens of words of working memory SRAM: Millions of words of working memory DRAM: Billions of words of working memory NVRAM: long term storage (usb fob, solid state disks, BIOS, ) Next time we will build a simple processor!

Vous aimerez peut-être aussi