[Figure: processor datapath with PC, +4 adders, instruction memory, register file, ALU, control, immediate extend, compare (=?), and data memory.]
Memory
Register Files
Tri-state devices
SRAM (Static RAM: static random access memory)
DRAM (Dynamic RAM)
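Mealy Machine
General Case: Mealy Machine
[Figure: registers and combinational logic.]
Outputs and next state depend on both current state and input
Moore Machine
Special Case: Moore Machine
[Figure: registers and combinational logic.]
Outputs depend only on current state
Example: digital lock
Inputs: K (a key is pressed), A, B
Outputs: unlock signal; display how many keys pressed so far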
Strategy:
(1) Draw a state diagram (e.g. Moore Machine)
(2) Write output and next-state tables
(3) Encode states, inputs, and outputs as bits
(4) Determine logic equations for next state and outputs
[Figure: state diagram, repeated across several slides. States and outputs: Idle (0), G1 (1), G2 (2), G3 (3, U), B1 (1), B2 (2). Edges are labeled A, B, "else", and "any". The first version also has a state B3 (3), which later versions drop.]
Output table:
Cur. State   Output
Idle         0
G1           1
G2           2
G3           3, U
B1           1
B2           2
Next-state table (K = any key pressed):
Cur. State   Input   Next State
Idle         B       G1
Idle         A       B1
G1           A       G2
G1           B       B2
G2           B       G3
G2           A       Idle
G3           any     Idle
B1           K       B2
B2           K       Idle
(3) Encode states, inputs, and outputs as bits
State   S2 S1 S0
Idle    0  0  0
G1      0  0  1
G2      0  1  0
G3      0  1  1
B1      1  0  0
B2      1  0  1
Meaning      K A B
(no key)     0 0 0
A pressed    1 1 0
B pressed    1 0 1
Outputs: D3 D2 D1 D0 is the 4-bit count sent to the display decoder (dec); U is the unlock signal.
[Figure: circuit with state register S2-0 clocked by clk, inputs K, A, B, and outputs D3-0 (through dec) and U.]
Output table (encoded):
S2 S1 S0   D3 D2 D1 D0   U
0  0  0    0  0  0  0    0
0  0  1    0  0  0  1    0
0  1  0    0  0  1  0    0
0  1  1    0  0  1  1    1
1  0  0    0  0  0  1    0
1  0  1    0  0  1  0    0
Next-state table (encoded; x = don't care):
S2 S1 S0   K A B   next S2 S1 S0
0  0  0    0 0 0   0  0  0
0  0  0    1 0 1   0  0  1
0  0  0    1 1 0   1  0  0
0  0  1    0 0 0   0  0  1
0  0  1    1 1 0   0  1  0
0  0  1    1 0 1   1  0  1
0  1  0    0 0 0   0  1  0
0  1  0    1 0 1   0  1  1
0  1  0    1 1 0   0  0  0
0  1  1    x x x   0  0  0
1  0  0    0 0 0   1  0  0
1  0  0    1 x x   1  0  1
1  0  1    0 0 0   1  0  1
1  0  1    1 x x   0  0  0
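The tables above can be exercised in software before deriving logic equations. The following is a minimal Python sketch of the lock as a Moore machine, not a hardware description. The names `OUTPUT`, `ON_KEY`, `next_state`, and `run` are illustrative assumptions, and the transitions are the ones read off the next-state table (the accepted sequence is B, A, B).

```python
# State encoding (S2 S1 S0), taken from the encoding table.
IDLE, G1, G2, G3, B1, B2 = 0b000, 0b001, 0b010, 0b011, 0b100, 0b101

# Moore outputs: state -> (count shown on display, unlock signal U).
OUTPUT = {IDLE: (0, 0), G1: (1, 0), G2: (2, 0), G3: (3, 1), B1: (1, 0), B2: (2, 0)}

# Next state when a key is pressed (K = 1).
ON_KEY = {
    IDLE: {"A": B1, "B": G1},
    G1: {"A": G2, "B": B2},
    G2: {"A": IDLE, "B": G3},
    B1: {"A": B2, "B": B2},
    B2: {"A": IDLE, "B": IDLE},
}


def next_state(state, key):
    """key is "A", "B", or None when no key is pressed this cycle."""
    if state == G3:
        return IDLE   # any input leaves G3
    if key is None:
        return state  # K = 0: hold the current state
    return ON_KEY[state][key]


def run(keys):
    """Feed one input per clock cycle; return the outputs after each edge."""
    state, trace = IDLE, []
    for key in keys:
        state = next_state(state, key)
        trace.append(OUTPUT[state])
    return trace
```

For example, `run(["B", "A", "B"])` ends with the output `(3, 1)`, i.e. three keys counted and U asserted, while `run(["A", "A", "B"])` never asserts U.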
Administrivia
Make sure to go to your Lab Section this week
Find project partners this week (for upcoming project 1 next week)
Lab2 due in class this week (it is not homework)
Design Doc for Lab1 due yesterday, Monday, Feb 4th
Completed Lab1 due next week, Monday, Feb 11th
Work alone
Administrivia
Check online syllabus/schedule
http://www.cs.cornell.edu/Courses/CS3410/2013sp/schedule.html
Slides and Reading for lectures
Office Hours
Homework and Programming Assignments
Prelims (in evenings):
Tuesday, February 26th
Thursday, March 28th
Thursday, April 25th
Late Policy
Each person has a total of four slip days
Max of two slip days for any individual assignment
Slip days deducted first for any late assignment, cannot selectively apply slip days
For projects, slip days are deducted from all partners
25% deducted per day late after slip days are exhausted
Regrade policy
Submit written request to lead TA, and lead TA will pick a different grader
Submit another written request, lead TA will regrade directly
Submit yet another written request for professor to regrade.
Goal:
How do we store results from ALU computations?
How do we use stored results in subsequent operations?
Register File
How does a Register File work? How do we design it?
[Figure: processor datapath with PC, +4 adders, instruction memory, register file, ALU, control, immediate extend, compare (=?), and data memory.]
Register File
N read/write registers
Indexed by register number
[Figure: Dual-Read-Port register file block. Data input DW (32 bits), data outputs including QA (32 bits each), register numbers RW, RA, RB (5 bits each).]
Register File
Recall: Register
D flip-flops in parallel
shared clock
extra clocked inputs: write_enable, reset, ...
[Figure: 4-bit register with inputs D0 to D3 and a shared clk.]
Register File
N read/write registers
Indexed by register number
[Figure: write port. A 5-to-32 decoder, driven by RW and W, enables one of Reg 0 ... Reg 31; D (32 bits) feeds the registers.]
[Figure: read ports. Two muxes select among Reg 0 ... Reg 31 and produce QA and QB (32 bits each); RB (5 bits) selects for QB.]
Register File
N read/write registers
Indexed by register number
Implementation:
D flip flops to store bits
Decoder for each write port
Mux for each read port
[Figure: complete register file. D (32 bits), W, and RW (5 bits) drive the 5-to-32 decoder and Reg 0 ... Reg 31; muxes selected by RA and RB (5 bits each) produce QA and QB (32 bits each).]
Register File
What happens if the same register is read and written during the same clock cycle?
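The implementation recipe for the register file (storage bits, a decoder for the write port, a mux for each read port) can be sketched as a behavioral Python model. This is an illustration under assumptions, not the circuit itself: the class name `RegisterFile` and the method names `read` and `clock_edge` are hypothetical, and port names follow the figure labels (DW, W, RW, RA, RB, QA, QB).

```python
class RegisterFile:
    """Behavioral model: n_regs registers, one write port, two read ports."""

    def __init__(self, n_regs=32, width=32):
        self.regs = [0] * n_regs          # stands in for the D flip-flops
        self.mask = (1 << width) - 1      # keep values to `width` bits

    def read(self, ra, rb):
        # Each read port acts as a mux: the register number selects one register.
        qa = self.regs[ra]
        qb = self.regs[rb]
        return qa, qb

    def clock_edge(self, w, rw, dw):
        # The decoder turns RW into one enable line per register; W gates them all.
        enables = [bool(w) and i == rw for i in range(len(self.regs))]
        for i, enabled in enumerate(enables):
            if enabled:
                self.regs[i] = dw & self.mask
```

In this model a `read` issued before `clock_edge` returns the old value and a `read` issued after it returns the new one. That ordering is a property of the sketch only; what the hardware does in the same-cycle case depends on the design.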
Tradeoffs
Register File tradeoffs
+ Very fast (a few gate delays for both read and write)
+ Adding extra ports is straightforward
- Doesn't scale
e.g. 32 Mbit register file with 32 bit registers
Need 32x 1M-to-1 multiplexor and 32x 20-to-1M decoder
How many logic gates/transistors?
[Figure: 8-to-1 mux with inputs a to h and select bits s2 s1 s0.]
Takeaway
Register files are very fast storage (only a few gate delays), but do not scale to large memory sizes.
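The scaling example can be checked with a little arithmetic. The sketch below assumes the capacity is 32 Mbit (2^25 bits) split into 32-bit registers, which is what yields 1M registers; the function name `register_file_cost` is a hypothetical helper, not part of the course material.

```python
def register_file_cost(capacity_bits, register_bits):
    """Return (number of registers, select bits needed to index them)."""
    n_regs = capacity_bits // register_bits
    # Smallest k with 2**k >= n_regs, computed without floating point.
    select_bits = (n_regs - 1).bit_length()
    return n_regs, select_bits


n_regs, select_bits = register_file_cost(32 * 2**20, 32)
print(n_regs, select_bits)
```

With these inputs each read port needs a mux with `n_regs` inputs per data bit, and the write decoder needs `select_bits` inputs and `n_regs` outputs, which is why the gate count grows so quickly.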
Next Goal
How do we scale/build larger memories?
[Figure: registers D0 ... D1023, each with a select signal S0 ... S1023, all connected to one shared line.]
How do we build such a device?
Tri-State Devices
Tri-State Buffers
If enabled (E=1), then Q = D
Otherwise, Q is not connected (z = high impedance)
E  D  Q
0  0  z
0  1  z
1  0  0
1  1  1
[Figure: buffer symbol with input D, enable E, and output Q, next to a transistor-level circuit between Vsupply and Gnd.]
Tri-State Devices
Tri-State Buffers
[Figure: transistor-level tri-state buffer, shown for three input cases. Gates fed from D and E switch a pull-up transistor (to Vsupply) and a pull-down transistor (to Gnd) that drive Q.]
E = 0: pull-up off, pull-down off, so Q = z
E = 1, D = 0: pull-up off, pull-down on, so Q = 0
E = 1, D = 1: pull-up on, pull-down off, so Q = 1
A  B  AND  NAND  OR  NOR
0  0  0    1     0   1
0  1  0    1     1   0
1  0  0    1     1   0
1  1  1    0     1   0
Shared Bus
[Figure: D0 ... D1023, each enabled by its own select signal S0 ... S1023, all driving one shared line.]
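The shared line can be modeled in a few lines of Python. This is a minimal sketch under assumptions: `Z`, `tri_state`, and `shared_line` are illustrative names, and raising an error when two buffers are enabled at once is a modeling choice made here, not something the slides specify.

```python
Z = "z"  # high impedance: the buffer is not driving the line


def tri_state(e, d):
    """Tri-state buffer: Q = D when enabled (E = 1), otherwise not connected."""
    return d if e == 1 else Z


def shared_line(drivers):
    """drivers: list of (enable, data) pairs, one per buffer on the line."""
    outputs = [tri_state(e, d) for e, d in drivers]
    driven = [q for q in outputs if q != Z]
    if not driven:
        return Z  # nobody is driving the line
    if len(driven) > 1:
        # Assumption of this sketch: treat two enabled buffers as an error.
        raise ValueError("more than one buffer is driving the shared line")
    return driven[0]
```

For example, `shared_line([(0, 1), (1, 1), (0, 0)])` returns the data of the single enabled buffer, and a line with no enabled buffer stays at `"z"`.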
Takeaway
Register files are very fast storage (only a few gate delays), but do not scale to large memory sizes.
Tri-state buffers allow scaling, since multiple registers can be connected to a single output while only one register actually drives the output.
Next Goal
How do we build large memories?
Use a design similar to tri-state buffers to connect multiple registers to one output line. Only one register drives the output line at a time.
SRAM
Static RAM (SRAM): Static Random Access Memory
Essentially just D-Latches plus Tri-State Buffers
A decoder selects which line of memory to access (i.e. word line)
A R/W selector determines the type of access
That line is then coupled to the data lines
[Figure: 4M x 8 SRAM block with a 22-bit Address input, 8-bit Din, and 8-bit Dout; an address decoder selects the word line.]
SRAM
[Figure: 4 x 2 SRAM, repeated across several slides. A 2-to-4 decoder takes the 2-bit address and selects one word line; each of the four rows holds two D latches with enable inputs; Din[1] and Din[2] feed the two columns, and the bit lines carry the selected row to Dout.]
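A behavioral Python sketch of the 4 x 2 SRAM may help tie the pieces together: a decoder raises one word line, and the R/W input decides whether the selected row is written from Din or copied to Dout. The class name `Sram` and its `access` method are hypothetical names for this illustration, and the model ignores timing and the electrical behavior of the bit lines.

```python
class Sram:
    """Behavioral model of a words x width SRAM (4 x 2 by default)."""

    def __init__(self, words=4, width=2):
        self.cells = [[0] * width for _ in range(words)]

    def _decode(self, address):
        # Address decoder: exactly one word line is high.
        return [int(i == address) for i in range(len(self.cells))]

    def access(self, address, write, din=None):
        """Write din into the selected row, or return the row as Dout."""
        word_lines = self._decode(address)
        dout = None
        for row, selected in zip(self.cells, word_lines):
            if not selected:
                continue          # unselected rows stay off the bit lines
            if write:
                row[:] = din      # the bit lines overwrite the selected row
            else:
                dout = list(row)  # the selected row drives the bit lines
        return dout
```

A write followed by a read of the same address returns the written bits, and rows that were never selected keep their contents.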
SRAM Cell
[Figure: SRAM cell connected to the bit lines B and B̄ through pass-through transistors that the word line switches on and off.]
Each cell stores one bit, and requires 4 to 8 transistors (6 is typical)
The cell is disabled when word line = 0 and enabled when word line = 1
Read:
1) pre-charge B and B̄ to Vsupply/2
2) pull word line high
3) cell pulls B or B̄ low, sense amp detects voltage difference
In the example shown, the cell pulls B high (B = 1) and B̄ low (B̄ = 0)
Write:
1) pull word line high
2) drive B and B̄ to flip cell (in the example shown, drive B high, i.e. B = 1)
SRAM
E.g. How do we design a 4M x 8 SRAM Module?
(i.e. 4M word lines that are each 8 bits wide)
[Figure: 4M x 8 SRAM block with a 22-bit Address input and Din.]
4M x 8 SRAM, built from eight 4k x 1024 SRAM arrays, one per data bit:
Address [21-10] (12 bits) drives a 12 x 4096 row decoder
Each array outputs the 1024 bits of the selected row
Address [9-0] (10 bits) drives a mux per array that picks 1 of the 1024 bits
[Figure: bank of SRAM chips sharing A21-0 and R/W, each with its own chip select (CS); labels msb and lsb.]
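The address split used in the 4M x 8 design can be written out directly. The sketch below is illustrative only: `ROW_BITS`, `COL_BITS`, and `split_address` are names chosen here, with the bit ranges taken from the slide (Address [21-10] for the row decoder, Address [9-0] for the column mux).

```python
ROW_BITS = 12  # Address [21-10] selects one of 4096 rows
COL_BITS = 10  # Address [9-0] selects one of 1024 bits in that row

# 4096 rows x 1024 columns gives 4M addressable words.
assert (1 << ROW_BITS) * (1 << COL_BITS) == 4 * 2**20


def split_address(addr):
    """Split a 22-bit address into (row, column)."""
    if not 0 <= addr < (1 << (ROW_BITS + COL_BITS)):
        raise ValueError("address does not fit in 22 bits")
    row = addr >> COL_BITS               # upper 12 bits
    col = addr & ((1 << COL_BITS) - 1)   # lower 10 bits
    return row, col
```

Every one of the eight arrays receives the same row and column, so each access yields one bit per array and therefore an 8-bit word.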
SRAM Summary
SRAM
A few transistors (~6) per cell
Used for working memory (caches)
But for even higher density...
Dynamic RAM (DRAM)
[Figure: DRAM cell. A transistor switched by the word line connects the bit line to a capacitor whose other side is tied to Gnd.]
Each cell stores one bit, and requires 1 transistor
Read:
pre-charge B and B̄ to Vsupply/2
pull word line high
cell pulls B low, sense amp detects voltage difference
Write:
pull word line high
drive B, which charges the capacitor (in the example shown, drive B high, i.e. B = 1)
Memory
Register File tradeoffs
+ Very fast (a few gate delays for both read and write)
+ Adding extra ports is straightforward
- Expensive, doesn't scale
- Volatile
Summary
We now have enough building blocks to build machines that can perform non-trivial computational tasks
Register File: Tens of words of working memory
SRAM: Millions of words of working memory
DRAM: Billions of words of working memory
NVRAM: long term storage (USB fob, solid state disks, BIOS, ...)
Next time we will build a simple processor!