ECEN 4303 Digital VLSI Design

Clocked Systems

A synchronous design style with a system clock is used in the vast majority of digital systems. Asynchronous designs without a system clock must be carefully designed to guarantee correct operation even though there can be considerable variation in the delay of circuit elements. Clocked systems, on the other hand, can be designed more easily by meeting simple constraints on the minimum and maximum delay. Specialized circuit elements are used with the clock signal.

Fig. 7.1, p. 384 Latches and Flip-Flops

Clocked Circuit Elements

In the following, the circuit implementations of latches and flip-flops are grouped according to the number of global clock lines needed. The trend in modern design has been to use implementations with the minimum number of global clock lines, both to save area and to limit the engineering effort needed to ensure proper distribution of the clock signal to all parts of the chip.

True Single Phase Clock.

Static implementations:

Full CMOS latches

[Figure: full CMOS latch schematics with inputs D and clk and outputs Q and its complement: one level high sensitive latch and one level low sensitive latch.]

Can synthesize as 2 complex CMOS gates plus inverter => no. of FETs = 2x6 + 2 = 14

Full CMOS gate implementation of master-slave D-FF

[Figure: full CMOS gate master-slave D flip-flops with inputs D and clk and outputs Q and its complement: one rising edge sensitive and one falling edge sensitive.]

Can synthesize as 4 complex CMOS gates plus inverter => no. of FETs = 4x6 +2 = 26

Differential Pair master-slave D-FF: Fig. 7.29a, p. 413 is faster than full CMOS, but has about the same number of transistors if the fast asynchronous slave latch in Fig. 7.29b is used.

MUX Pass Transistor implementation is not practical for single phase clocking.

[Figure: MUX pass transistor latch with D, clk, Q and its complement; an internal node is annotated "reduced noise margin here".]

As we saw earlier, both the nFET and the pFET reduce the noise margin, which is not tolerable in modern processes.

Dynamic implementations:

Dynamic registers and latches depend on charge storage on internal nodes in the circuit for memory storage. If the charge leaks off between clock pulses, the memory state is lost. This puts a minimum frequency requirement on dynamic registers and latches. The increased leakage in deep sub-micron processes has reduced storage times to a µsec or less which means that the minimum operating frequency must be 1MHz or higher. Since current clock rates are in excess of 1GHz, this is not a major problem at present. However, since leakage is anticipated to be worse in the future, dynamic techniques could become impractical.
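The link between storage time and minimum clock frequency is a simple reciprocal. The sketch below illustrates it in Python; the function name and the second sample value are illustrative assumptions, not process data.

```python
def min_clock_frequency_hz(retention_time_s: float) -> float:
    """Lowest clock frequency that rewrites a dynamic node before its charge leaks away."""
    if retention_time_s <= 0:
        raise ValueError("retention time must be positive")
    return 1.0 / retention_time_s

# A 1 us storage time gives the 1 MHz floor quoted in the notes.
# The 100 ns entry is a hypothetical higher-leakage case.
for t_store in (1e-6, 100e-9):
    f_min_mhz = min_clock_frequency_hz(t_store) / 1e6
    print(f"storage time {t_store:.0e} s -> minimum clock {f_min_mhz:.0f} MHz")
```

As long as the real clock rate stays far above this floor, leakage does not limit the design.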

Non-full swing pass transistor: Fig. 7.17a, p. 404. The output is usually buffered as shown below since the output can be floating.

[Figure: buffered non-full swing pass transistor latches with D, clk and Q: a level high sensitive latch and a level low sensitive latch.]

The necessity of using the pFET for the level low sensitive latch makes this implementation unattractive, but it does have the minimum transistor count: only 3 transistors are required for a latch and 6 for a flip-flop. Also, power consumption is a problem unless some of the techniques discussed previously are used to make it full swing.

True Single Phase Clock (TSPC): Fig. 7.30, p. 414. These circuits are slower than other alternatives since the latch and flip-flop implementations require an extra level of logic. Also, the output usually needs buffering since it is a dynamic node.

Pulsed Latches:

Pulsed latches are difficult to get to work correctly, and for this reason, have not been used traditionally. With the emphasis on single global clock lines, pulsed latches have become more popular recently.

Any latch can be used with clock pulses as in the bottom of Fig. 7.2, p. 385. Signals are prevented from going through the latches except for the short period of time when the pulse is high. The trick is to keep the pulses short enough so that signals cannot propagate through more than one latch at a time. On the other hand, the pulse must be long enough to allow propagation of the signals through the latch.

Note that latches take the place of flip-flops. Flip-flops are not needed. Only level high sensitive latches are needed which makes a non-full swing implementation more attractive since pMOS pass FETs are not required.

[Figure: pulsed latches separated by blocks of combinational logic, all clocked by the pulsed clock φp.]

Special care must be taken when driving the pulsed clock, φp, over long distances. Recall that the wires act like RC transmission lines, which not only delay but also spread out any edges. Over long clock lines, the edges of the clock pulse gradually merge and the whole pulse can disappear.

One solution to this problem is to have a normal global clock with a square wave where the edges are much farther apart and do not grow together so easily. Then, each pulsed latch includes a local pulse generator. It is relatively straightforward to design the local pulse generator so that it produces pulses of the desired length triggered by an edge on the global clock. Pulse generators Fig. 7.22, p. 408; combined pulse generator and latch Fig. 7.23 (since Q is a dynamic floating node, do not use it without buffering it first).

Pseudo Single Phase Clock

Use of both clk and its complement (clk-bar) as control signals allows the use of transmission gates to make full swing logic, which reduces power consumption and improves noise margin. To avoid two global clock lines, clk-bar is usually generated from clk with a local inverter as shown in Fig. 7.20, p. 406.

Static implementations:

MUX transmission gate implementation for latches Fig. 7.17e,f,g,h p. 404; registers Fig. 7.19b p. 405

[Figure: MUX transmission gate latches with D, clk, clk-bar, Q and its complement: a level high sensitive latch and a level low sensitive latch.]

Can synthesize as 2 transmission gates plus 2 inverters => no. of FETs = 2x2 + 2x2 = 8

Note that this is a static implementation since charge storage is not used.

Clocked CMOS (C²MOS) Fig. 7.18, p. 405 is just a combined inverter and transmission gate. These gates can also be used in static latches, Fig. 7.17g,h p. 404.

“jamb” latch Fig. 7.17i p. 404 uses a “weak” feedback inverter fabricated with high impedance FETs (small width, possibly longer than minimum length).

Dynamic implementations:

Latches Fig. 7.17c,d p. 404; registers Fig. 7.19a

Latches synthesized as 1 transmission gate plus 1 inverter => no. of FETs = 2 + 2 = 4
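The transistor counts quoted so far can be tallied in one place for comparison. This is a small Python sketch; the dictionary labels are illustrative, and the breakdown of the 3-FET non-full swing latch into one pass transistor plus a 2-FET output buffer is an assumption based on the buffered latch described in the notes.

```python
# FETs per building block, matching the arithmetic quoted in the notes.
COMPLEX_GATE = 6        # one complex CMOS gate
INVERTER = 2
TRANSMISSION_GATE = 2
PASS_TRANSISTOR = 1     # assumed split of the quoted 3-FET latch: pass FET + inverter

fet_counts = {
    "full CMOS latch": 2 * COMPLEX_GATE + INVERTER,
    "full CMOS master-slave D-FF": 4 * COMPLEX_GATE + INVERTER,
    "non-full swing pass transistor latch": PASS_TRANSISTOR + INVERTER,
    "static transmission gate latch": 2 * TRANSMISSION_GATE + 2 * INVERTER,
    "dynamic transmission gate latch": TRANSMISSION_GATE + INVERTER,
}

# Print from smallest to largest to make the area trade-off visible.
for name, count in sorted(fet_counts.items(), key=lambda item: item[1]):
    print(f"{count:2d} FETs  {name}")
```

The ordering shows the area cost of moving from dynamic or non-full swing storage to fully static CMOS gates.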

True Two Phase Clock

Two non-overlapping clock signals can be used to ensure that no signal can propagate through more than one latch at a time (middle of Fig. 7.2, p. 385). The non-overlap time, t_nonoverlap, is kept small compared with the clock period to avoid wasting time. It is very difficult to keep the two clock signals precisely aligned when t_nonoverlap is short. To avoid two global clock lines, the two clock signals can be generated from a single global clock. The local two phase clocks can be shared between several nearby latches and/or registers.

[Figure: waveforms of the global clock clk and the locally generated non-overlapping phases φ1 and φ2.]

Static Implementations:

[Figure: static latches with D, Q and its complement, clocked by φ1 and φ2: a level high φ1 sensitive latch and a level high φ2 sensitive latch.]

Use different clock phases on alternate stages as shown in the middle of Fig. 7.2, p. 385.

Dynamic Implementations:

[Figure: dynamic latches with D and Q: a level high φ1 sensitive latch and a level high φ2 sensitive latch.]

Pseudo Two Phase Clocks

Use of both φ1, φ2 and their complements as control signals allows the use of transmission gates to make full swing logic, which reduces power consumption and improves noise margin. To avoid four global clock lines, the four clock signals can be generated from a single global clock. The increased complexity needed to properly synchronize 4 clock lines makes it necessary to share the local clock signals over several latches and registers.

[Figure: waveforms of clk and the four derived clock signals: φ1, φ2 and their complements.]

Static Implementations: Fig. 7.21, p. 407

Dynamic Implementations:

[Figure: dynamic transmission gate latches with D and Q, each clocked by a phase and its complement: a level high φ1 sensitive latch and a level high φ2 sensitive latch.]

Register and Latch Timing with Two Phase Clocks

When using the two phase clocks, care must be taken to get the correct timing. φ1, φ2 and their complements are all distinctly different functions of time and must be connected correctly to the transmission gates.

φ1 and φ2 are non-overlapping high and therefore should be used to control the nFETs in the transmission gates.

The complements of φ1 and φ2 are non-overlapping low and therefore should be used to control the pFETs in the transmission gates.
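The non-overlap property of the four clock signals can be checked on sampled waveforms. This is a minimal Python sketch under stated assumptions: time is divided into equal samples, and the helper names (`two_phase_clocks`, `complement`, `never_both`) and the 20-sample period are hypothetical choices for illustration only.

```python
def two_phase_clocks(steps_per_period, nonoverlap_steps, periods=2):
    """Sampled phi1 and phi2 with nonoverlap_steps low samples between their high pulses."""
    if steps_per_period % 2:
        raise ValueError("use an even number of samples per period")
    half = steps_per_period // 2
    high = half - nonoverlap_steps          # samples each phase stays high
    if nonoverlap_steps < 0 or high <= 0:
        raise ValueError("non-overlap must be >= 0 and shorter than half a period")
    phi1 = [1 if k < high else 0 for k in range(steps_per_period)] * periods
    phi2 = [1 if half <= k < half + high else 0 for k in range(steps_per_period)] * periods
    return phi1, phi2

def complement(wave):
    """Logical inverse of a sampled waveform."""
    return [1 - v for v in wave]

def never_both(level, a, b):
    """True if waveforms a and b are never at `level` in the same sample."""
    return not any(x == level and y == level for x, y in zip(a, b))

phi1, phi2 = two_phase_clocks(steps_per_period=20, nonoverlap_steps=2)
phi1_b, phi2_b = complement(phi1), complement(phi2)

print(never_both(1, phi1, phi2))      # True: true phases are never high together (nFET gates)
print(never_both(0, phi1_b, phi2_b))  # True: complements are never low together (pFET gates)
print(never_both(0, phi1, phi2))      # False: both true phases are low during the gap
```

The last line shows why the signals cannot be swapped: the true phases are low together during the non-overlap gap, so driving pFETs with them would turn on both transmission gates at once.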

[Figure: timing diagrams versus t of φ1, φ2 and their complements, marking the sampling instants of the positive edge φ1 register and the positive edge φ2 register, together with transmission gate schematics (D, Q) of a positive edge φ1 sensitive register and a positive edge φ2 sensitive register.]

It is also necessary to specify which clock signal is used when classifying latches as high or low level sensitive and when classifying registers as positive or negative edge sensitive. Note that a positive edge φ1 sensitive register behaves differently from a positive edge φ2 sensitive register, and the two should not be mixed.

Timing Constraints

Timing Diagrams: Fig. 7.4, p. 387; Table 7.1, p. 386.

Note that latches have setup and hold requirements relative to the clock edge that turns them off, not the clock edge that turns them on.

We will examine the timing constraints necessary to ensure correct operation of the three clocking schemes in Fig. 7.2, p. 385. The two phase clocking scheme is the traditional way of using latches. The pulsed latch scheme has become more popular as a way to reduce the number of clock lines that must be distributed across the chip.

Flip-Flop Timing Constraints. The maximum delay of a logic block between two flip-flops is constrained as shown in Fig. 7.5 and eqs. 7.1, 7.2, p. 388.

The minimum delay of a logic block between two flip-flops is constrained as shown in fig. 7.9, and eq 7.7, p. 393. Flip-flops are often designed so that the right side of the inequality is negative, meaning that the flip-flops can be connected directly together without any intervening combinational logic.
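The two flip-flop constraints can be sketched numerically. The cited equations are not reproduced in these notes, so the code below assumes the usual setup and hold forms: the logic propagation delay must not exceed T_c - (t_setup + t_pcq), and the logic contamination delay must be at least t_hold - t_ccq. The symbol names and all numeric values (in picoseconds) are hypothetical.

```python
def ff_max_logic_delay(t_clk, t_setup, t_pcq):
    """Largest logic propagation delay that still meets setup at the receiving flip-flop."""
    return t_clk - (t_setup + t_pcq)

def ff_min_logic_delay(t_hold, t_ccq):
    """Smallest logic contamination delay that meets hold; zero or less means none is needed."""
    return t_hold - t_ccq

# Hypothetical flip-flop parameters in picoseconds.
print(ff_max_logic_delay(t_clk=1000, t_setup=65, t_pcq=50))  # 885
print(ff_min_logic_delay(t_hold=30, t_ccq=35))               # -5
```

The negative minimum in this example corresponds to the case described above, where flip-flops can be connected back to back with no logic in between.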

Two Phase Latch Timing Constraints. A flip-flop can be regarded as being composed of two series latches clocked with complementary signals (fig. 7.3, p. 386). There is no real need to put combinational logic only between latch pairs; combinational logic can be put between all latches (fig. 7.7, p. 391) with the timing requirements in eq. 7.4.

Note that the constraint on the maximum logic delay in eq. 7.4 does not replace the requirement for D1 to stabilize a setup time before the falling edge of φ1 and for D2 to stabilize a setup time before the falling edge of φ2. In Fig. 7.7, the D inputs stabilize near the beginning of the clock phase, but the circuit will still work correctly if the D inputs are delayed by less than half of a clock period (Fig. 7.12, p. 397).

The combinational logic in the first half of the clock period "borrows" time from the combinational logic in the second half of the clock period. The φ2 latch can be "moved" in time to anywhere within the φ2 clock phase. Borrowing can always take place across the internal half cycle boundary. Borrowing is also possible across the clock period boundary if there are no loops in the circuit (for example, pipelines). We will make use of this technique next semester to design high performance pipelines.

Maximum borrowing: Fig. 7.13, p. 397 and eq. 7.10, p. 396.
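The two phase latch budget and the borrowing limit can be sketched in the same way. The cited equations are not reproduced in these notes, so the forms used here are assumptions based on the common textbook treatment: total logic delay over one cycle at most T_c - 2 t_pdq, and borrowed time at most T_c/2 - (t_setup + t_nonoverlap). Symbol names and values (in picoseconds) are hypothetical.

```python
def two_phase_max_logic_delay(t_clk, t_pdq):
    """Total logic delay (both half cycles) allowed when data passes through two transparent latches."""
    return t_clk - 2 * t_pdq

def max_borrow(t_clk, t_setup, t_nonoverlap):
    """Longest time one half cycle may borrow from the next before setup is violated."""
    return t_clk / 2 - (t_setup + t_nonoverlap)

# Hypothetical latch parameters in picoseconds.
print(two_phase_max_logic_delay(t_clk=1000, t_pdq=40))        # 920
print(max_borrow(t_clk=1000, t_setup=25, t_nonoverlap=50))    # 425.0
```

Under these assumptions, a larger non-overlap time directly reduces how much time can be borrowed.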

The minimum delay of a logic block between two latches is constrained as shown in Fig. 7.10, p. 395 and eq. 7.8, p. 394. Increasing t_nonoverlap forces the right side of the inequality negative, meaning that the latches can be connected directly together without any intervening combinational logic.
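The effect of t_nonoverlap on the minimum delay can be seen with a short calculation. The cited equation is not reproduced in these notes; the sketch assumes the common form in which the logic contamination delay must be at least t_hold - t_ccq - t_nonoverlap. Symbol names and values (in picoseconds) are hypothetical.

```python
def latch_min_logic_delay(t_hold, t_ccq, t_nonoverlap):
    """Smallest logic contamination delay that meets hold between two-phase latches."""
    return t_hold - t_ccq - t_nonoverlap

# Hypothetical latch with a hold time longer than its clock-to-Q contamination delay.
for t_nonoverlap in (0, 50):
    required = latch_min_logic_delay(t_hold=60, t_ccq=35, t_nonoverlap=t_nonoverlap)
    verdict = "direct connection is safe" if required <= 0 else "needs some logic delay"
    print(f"t_nonoverlap = {t_nonoverlap:3d} ps -> min logic delay {required:4d} ps ({verdict})")
```

With no non-overlap, this example latch needs 25 ps of logic delay; with 50 ps of non-overlap the requirement becomes negative.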

Pulsed Latch Timing Constraints. The maximum delay of a logic block between two pulsed latches is constrained as shown in fig. 7.8, and eq 7.5, 7.6, p. 392. No borrowing is possible if the pulse width is short.

The minimum delay of a logic block between two pulsed latches is constrained as shown in fig. 7.11, p. 395, and eq 7.9, p. 394.
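The pulse width trade-off can be sketched numerically. The cited equations are not reproduced in these notes, so the forms below are assumptions taken from the common textbook treatment: logic propagation delay at most T_c - max(t_pdq, t_pcq + t_setup - t_pw), and logic contamination delay at least t_hold - t_ccq + t_pw. Symbol names and values (in picoseconds) are hypothetical.

```python
def pulsed_max_logic_delay(t_clk, t_pdq, t_pcq, t_setup, t_pw):
    """Largest logic propagation delay between pulsed latches for pulse width t_pw."""
    return t_clk - max(t_pdq, t_pcq + t_setup - t_pw)

def pulsed_min_logic_delay(t_hold, t_ccq, t_pw):
    """Smallest logic contamination delay that keeps data from racing through two latches."""
    return t_hold - t_ccq + t_pw

# Compare a short pulse with a wide pulse for the same hypothetical latch.
for t_pw in (10, 150):
    t_max = pulsed_max_logic_delay(t_clk=1000, t_pdq=40, t_pcq=40, t_setup=25, t_pw=t_pw)
    t_min = pulsed_min_logic_delay(t_hold=30, t_ccq=35, t_pw=t_pw)
    print(f"t_pw = {t_pw:3d} ps -> max logic delay {t_max} ps, min logic delay {t_min} ps")
```

In this example the wide pulse gains only 15 ps of maximum delay (960 ps versus 945 ps) while raising the minimum delay requirement from 5 ps to 145 ps, which illustrates why pulses are kept short.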