INTRODUCTION
1.1 AIM
The challenge of verifying a large design is growing exponentially. There is a
need to define new methods that make functional verification easier. Several strategies
have been proposed in recent years to achieve good functional verification with less
effort. A recent advancement towards this goal is the use of verification methodologies.
A methodology defines a skeleton onto which designers can add the details of their own
requirements to achieve functional verification.
The report is organized in two major portions. The first part is a brief introduction
to the regular carry select adder (CSLA) and the ripple carry adder (RCA) architecture,
covering the advantages of the CSLA and the drawback of the regular model. The second
part presents the modified CSLA architecture that has been designed to overcome that
drawback.
Thus, the aim of this project is to design a simple and efficient gate-level
modification that significantly reduces the area and power of the CSLA. Based on this
modification, 8-, 16-, 32-, and 64-b square-root CSLA (SQRT CSLA) architectures have
been developed and compared with the regular SQRT CSLA architecture.
Here in this design, the carry select adder is designed using Ripple carry adders and
multiplexers. The design can be viewed as groups where the groups are internally
designed using n-bit RCA and multiplexers.
Since this design uses multiple pairs of ripple carry adders to generate the partial sum
and carry, it is not area efficient. Thus a Binary to Excess-1 Converter (BEC) is used
instead of the RCA with Cin=1 in the regular CSLA to achieve lower area and power
consumption. The main advantage of this BEC logic is that it needs fewer logic gates
than the n-bit Full Adder (FA) structure.
CHAPTER 2
LITERATURE REVIEW
2.1 INTRODUCTION TO VLSI:
The electronics industry has achieved phenomenal growth over the last two
decades, mainly due to the rapid advances in integration technologies and large-scale
systems design, in short, due to the advent of VLSI. The number of applications of
integrated circuits in high-performance computing, telecommunications and consumer
electronics has been rising steadily and at a very fast pace. Typically, the required
computational power of these applications is the driving force for the fast development
of this field.
Figure 2.1 gives an overview of the prominent trends in information technologies
over the next few decades. The current leading-edge technologies already provide end
users with a certain amount of processing power and portability.
As more and more complex functions are required in various data processing and
telecommunications devices, the need to integrate these functions in a small
system/package is also increasing.
The level of integration, as measured by the number of logic gates in a monolithic
chip, has been steadily rising for almost three decades mainly due to the rapid progress in
processing technology and interconnect technology.
Table 2.1 shows the evolution of logic complexity in integrated circuits over the last
three decades and marks the milestones of each era. Here, the numbers for circuit
complexity should be interpreted only as representative examples to show the order of
magnitude. A logic block can contain anywhere from 10 to 100 transistors, depending on
the function. State-of-the-art examples of ULSI chips, such as the DEC Alpha or the
INTEL Pentium, contain 3 to 6 million transistors.
Table 2.1: Evolution of logic complexity in integrated circuits
The most important message here is that the logic complexity per chip has been (and
still is) increasing exponentially. The monolithic integration of a large number of
functions on a single chip usually provides:
The different types of adders are discussed here and compared with respect to their
functionality.
The design of area- and power-efficient high-speed data path logic systems is one of the
most substantial areas of research in VLSI system design. In digital adders, the speed of
addition is limited by the time required to propagate a carry through the adder. The sum
for each bit position in an elementary adder is generated sequentially only after the
previous bit position has been summed and a carry propagated into the next position.
The CSLA is used in many computational systems to alleviate the problem of carry
propagation delay by independently generating multiple carries and then selecting a carry
to generate the sum. However, the CSLA is not area efficient because it uses multiple
pairs of Ripple Carry Adders (RCA) to generate the partial sum and carry by considering
carry input Cin=0 and Cin=1; the final sum and carry are then selected by the
multiplexers (mux).
In electronics, an adder or summer is a digital circuit that performs addition of
numbers. In many computers and other kinds of processors, adders are used not only in
the arithmetic logic units but also in other parts of the processor, where they are used
to calculate addresses, table indices and similar quantities.
Although adders can be constructed for many numerical representations, such as
binary-coded decimal or excess-3, the most common adders operate on binary numbers. In
cases where two's complement or ones' complement is used to represent negative numbers,
it is trivial to modify an adder into an adder-subtractor. Other signed number
representations require a more complex adder.
2.2.1 Basic Adders
2.2.1.1 Half Adder
Cout = (A . B) + (Cin . (A ⊕ B)).
A full adder can be constructed from two half adders by connecting A and B to the
inputs of one half adder, connecting the sum from that to an input of the second half
adder, connecting Ci to the other input and ORing the two carry outputs. Equivalently, S
could be made the three-bit XOR of A, B and Ci, and Cout could be made the three-bit
majority function of A, B and Ci.
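The construction described above can be sketched behaviourally in Python. This is only an illustration: the function names half_adder, full_adder and majority are chosen here and do not come from the report.

```python
def half_adder(a, b):
    """Return (sum, carry) for two 1-bit inputs."""
    return a ^ b, a & b


def full_adder(a, b, ci):
    """Full adder built from two half adders and one OR gate."""
    s1, c1 = half_adder(a, b)      # first half adder takes A and B
    s, c2 = half_adder(s1, ci)     # second half adder takes that sum and Ci
    return s, c1 | c2              # OR of the two carry outputs


def majority(a, b, c):
    """Three-input majority function."""
    return (a & b) | (a & c) | (b & c)


# The two descriptions of the full adder agree for every input combination.
for a in (0, 1):
    for b in (0, 1):
        for ci in (0, 1):
            s, cout = full_adder(a, b, ci)
            assert s == a ^ b ^ ci
            assert cout == majority(a, b, ci)
```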
Other adder designs include the carry-save adder, carry-select adder, conditional-sum
adder, carry-skip adder and carry-complete adder.
next block.
If Ai = Bi = 1 for some i in the group, a carry is generated which may be
The basic idea of a carry-skip adder is to detect whether Ai ≠ Bi for every bit in a
group and, when this happens, to let the block's carry-in skip the block, as shown in
figure 2.8.
In general, a block-skip delay can be different from the delay due to the propagation
of a carry to the next bit position. With carry-skip adders, the linear growth of carry
chain delay with the size of the input operands is improved by allowing carries to skip
across blocks of bits, rather than rippling through them.
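A minimal behavioural sketch of the skip condition is given below. It assumes a 16-bit adder with uniform 4-bit blocks; the function name and the parameters n and block are illustrative choices, not values taken from the report.

```python
def carry_skip_add(a, b, n=16, block=4, cin=0):
    """Add two n-bit integers block by block, skipping propagate-only blocks."""
    result, carry = 0, cin
    for lo in range(0, n, block):
        width = min(block, n - lo)
        mask = (1 << width) - 1
        xa, xb = (a >> lo) & mask, (b >> lo) & mask
        total = xa + xb + carry            # stands in for the block's ripple chain
        result |= (total & mask) << lo
        if (xa ^ xb) == mask:
            # Ai != Bi at every position: the carry-in skips straight to the output.
            carry_out = carry
        else:
            # Otherwise the carry is produced inside the block.
            carry_out = total >> width
        carry = carry_out
    return result | (carry << n)
```

In hardware the skip path is a multiplexer controlled by the block propagate signal; here it only shows that bypassing the block gives the same carry as rippling through it.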
2.2.2.4 Carry Select Adder (CSLA)
The carry select adder belongs to the category of conditional sum adders. A conditional
sum adder computes the sum and carry in advance for both possible values of the input
carry, 1 and 0, before the actual input carry arrives. When the actual carry input
arrives, the matching precomputed sum and carry are selected using a multiplexer.
The conventional carry select adder consists of one k/2-bit adder for the lower half of
the bits, i.e. the least significant bits (LSBs), and two k/2-bit adders for the upper
half, i.e. the most significant bits (MSBs).
Of the two MSB adders, one assumes a carry input of one and the other assumes a carry
input of zero. The carry out of the LSB stage is used to select the actual values of the
output carry and sum. The selection is done by a multiplexer. Dividing the adder into
stages in this way increases the area but speeds up the addition.
In electronics, a carry-select adder is a particular way to implement an adder, which
is a logic element that computes the (n+1)-bit sum of two n-bit numbers. The carry-select
adder is simple but rather fast, having a gate-level depth of O(√n).
The carry select adder generally consists of two ripple carry adders and
a multiplexer. Adding two n-bit numbers with a carry select adder is done with two
ripple carry adders that perform the calculation twice, once assuming a carry-in of zero
and once assuming a carry-in of one. After the two results are calculated, the correct
sum, as well as the correct carry, is selected with the multiplexer once the correct
carry is known.
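The selection scheme can be sketched for a single block as follows. The helper names ripple_carry_add, mux2 and carry_select_add are hypothetical and serve only to illustrate the two precomputed results and the final multiplexer.

```python
def ripple_carry_add(a, b, cin, width):
    """Bit-by-bit ripple carry addition; returns (sum, carry_out)."""
    s, carry = 0, cin
    for i in range(width):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        s |= (ai ^ bi ^ carry) << i
        carry = (ai & bi) | (carry & (ai ^ bi))
    return s, carry


def mux2(sel, in0, in1):
    """2-to-1 multiplexer: in0 when sel is 0, in1 when sel is 1."""
    return in1 if sel else in0


def carry_select_add(a, b, cin, width):
    """Compute both candidate results, then select with the real carry-in."""
    s0, c0 = ripple_carry_add(a, b, 0, width)   # assumes carry-in = 0
    s1, c1 = ripple_carry_add(a, b, 1, width)   # assumes carry-in = 1
    return mux2(cin, s0, s1), mux2(cin, c0, c1)
```

Both ripple carry adders can start as soon as the operands are available, so only the multiplexer delay is added once the real carry arrives.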
The number of bits in each carry select block can be uniform or variable. In the
uniform case, the optimal delay occurs for a block size of ⌊√n⌋. When variable, the block
size should have a delay, from addition inputs A and B to the carry out, equal to that of
the multiplexer chain leading into it, so that the carry out is calculated just in time.
The O(√n) delay is derived from uniform sizing, where the ideal number of full-adder
elements per block is equal to the square root of the number of bits being added, since
that yields an equal number of MUX delays.
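The square-root block size can be illustrated with a simplified delay model. The model below is an assumption made for this sketch, not a result from the report: the first block ripples through k full adders, and the carry then passes through one multiplexer per remaining block, with unit delays t_fa and t_mux.

```python
import math


def uniform_csla_delay(n, k, t_fa=1, t_mux=1):
    """Delay of an n-bit carry select adder with uniform k-bit blocks
    under the simplified model: k full adders, then one mux per later block."""
    blocks = math.ceil(n / k)
    return k * t_fa + (blocks - 1) * t_mux


def best_block_size(n, t_fa=1, t_mux=1):
    """Block size that minimises the modelled delay (smallest k on ties)."""
    return min(range(1, n + 1),
               key=lambda k: uniform_csla_delay(n, k, t_fa, t_mux))
```

With equal full-adder and multiplexer delays, the minimum of this model falls at the square root of n for n = 16 and n = 64, which is the balance between ripple length and multiplexer chain length that the paragraph above describes.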
The basic idea of this work is to use a Binary to Excess-1 Converter (BEC) instead of
the RCA with Cin=1 in the regular CSLA to achieve lower area and power consumption. The
main advantage of this BEC logic is that it needs fewer logic gates than the n-bit Full
Adder (FA) structure. The details of the BEC logic are discussed in the next chapter.
Thus, for example, a binary input of 101 results in an output of 1+0+1=10 (decimal
number '2'). The carry out represents bit 1 of the result, while the sum represents
bit 0. Likewise, the half adder can be used as a 2:2 lossy compressor, compressing the
four possible inputs into three possible outputs.
Such compressors can be used to speed up the summation of three or more addends.
If the addends are exactly three, the layout is known as the carry-save adder. If the
addends are four or more, more than one layer of compressors is necessary and there are
various possible designs for the circuit: the most common are Dadda and Wallace trees.
This kind of circuit is most notably used in multipliers, which is why these circuits are
also known as Dadda and Wallace multipliers.
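The three-addend case can be sketched as a bitwise 3:2 compression followed by one ordinary addition. The function names carry_save and add_three are illustrative only.

```python
def carry_save(a, b, c):
    """Apply a 3:2 compressor (full adder) at every bit position.
    Returns the partial-sum word and the carry word shifted one place left."""
    partial_sum = a ^ b ^ c
    carries = ((a & b) | (a & c) | (b & c)) << 1
    return partial_sum, carries


def add_three(a, b, c):
    """Sum three addends with one carry-save stage and one final addition."""
    partial_sum, carries = carry_save(a, b, c)
    return partial_sum + carries
```

For the single-bit input 1, 0, 1 the compressor returns a sum bit of 0 and a carry word of binary 10, matching the example in the text.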
2.4 Multiplexer
In electronics, a multiplexer (or MUX) is a device that selects one of several analog
or digital input signals and forwards the selected input into a single line. A
multiplexer of 2^n inputs has n select lines, which are used to select which input line
to send to the output.
Multiplexers are mainly used to increase the amount of data that can be sent over a
network within a certain amount of time and bandwidth. A multiplexer is also called a
data selector. Multiplexers are also used in CCTV systems, and almost every business
that has CCTV fitted will own one.
An electronic multiplexer makes it possible for several signals to share one device or
resource, for example one A/D converter or one communication line, instead of having
one device per input signal.
On the other hand, a demultiplexer (or demux) is a device taking a single input signal
and selecting one of many data-output-lines, which is connected to the single input. A
multiplexer is often used with a complementary demultiplexer on the receiving end.
An electronic multiplexer can be considered as a multiple-input, single-output switch
and a demultiplexer as a single-input, multiple-output switch.
The schematic symbol for a multiplexer is an isosceles trapezoid with the longer
parallel side containing the input pins and the short parallel side containing the output pin.
The wire connects the desired input to the output based on the selection line.
In digital circuit design, the selector wires are of digital value. In the case of a 2-to-1
multiplexer, a logic value of 0 would connect the first input to the output, while a
logic value of 1 would connect the second input to the output. In larger multiplexers,
the number of selector pins is equal to ⌈log2(n)⌉, where n is the number of inputs.
For example, 9 to 16 inputs would require no fewer than 4 selector pins and 17 to 32
inputs would require no fewer than 5 selector pins. The binary value expressed on these
selector pins determines the selected input pin.
Larger multiplexers are also common and, as stated above, require ⌈log2(n)⌉ selector
pins for n inputs. Other common sizes are 4-to-1, 8-to-1 and 16-to-1. Since digital logic
uses binary values, powers of 2 are used (4, 8, 16) to maximally control a number of
inputs for the given number of selector inputs.
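The relation between the number of inputs and the number of selector pins can be checked with a short sketch. The names selector_pins and mux are hypothetical, and the select value is modelled as the integer formed by the selector pins.

```python
def selector_pins(n_inputs):
    """Smallest number of selector pins that can address n_inputs,
    i.e. the ceiling of log2(n_inputs), computed with integer arithmetic."""
    return max(1, (n_inputs - 1).bit_length())


def mux(inputs, select):
    """Behavioural multiplexer: forward the input addressed by select."""
    return inputs[select]


# Inputs of 9 to 16 need 4 selector pins, inputs of 17 to 32 need 5.
assert all(selector_pins(n) == 4 for n in range(9, 17))
assert all(selector_pins(n) == 5 for n in range(17, 33))
```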
one realized from 3-state buffers and AND gates (the AND gates are acting as the
decoder)
CHAPTER 3
DESIGN OF 16-bit SQUARE ROOT CARRY SELECT ADDER AND
CARRY SELECT ADDER WITH AOI
Basically, this project can be classified into three major parts.
The design procedure and the delay propagation of the 16-bit square root CSLA
can best be explained with figure 3.4. As the figure shows, the model consists of 5
groups of different sizes. In each group, the addition is carried out for both Cin=0 and
Cin=1, and the actual sum and carry are then selected using the actual carry from the
previous stage.
3.1.2 Architecture of 16-bit square root CSLA:
Fig 3.5: Delay and area evaluation of regular SQRT CSLA: (a) group2, (b) group3, (c)
group4 and (d) group5. F is a Full Adder.
This 16-bit square root CSLA consists of five groups of variable size. The 16-bit
data is divided into 2-bit, 2-bit, 3-bit, 4-bit and 5-bit groups. The first group
consists of a 2-bit ripple carry adder. The actual input carry is applied to this
adder. The ripple carry adder receives the carry and performs the 2-bit addition
(a[1:0], b[1:0]).
The 2-bit sum generated by this adder is written as sum[1:0]. The carry
generated by this adder is propagated to the next group with a delay. This delay is
calculated using the basic circuit shown in Fig 3.6.
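The grouping described above can be sketched behaviourally as follows. The 2-, 2-, 3-, 4- and 5-bit group widths come from the text; the names GROUP_WIDTHS, rca and sqrt_csla_16 are illustrative, and rca uses integer addition as a stand-in for the gate-level ripple carry adder.

```python
GROUP_WIDTHS = [2, 2, 3, 4, 5]   # group sizes of the 16-bit SQRT CSLA


def rca(a, b, cin, width):
    """Behavioural stand-in for a width-bit ripple carry adder."""
    total = a + b + cin
    return total & ((1 << width) - 1), total >> width


def sqrt_csla_16(a, b, cin=0):
    """16-bit square root carry select addition; returns (sum, carry_out)."""
    result, carry, lo = 0, cin, 0
    for index, width in enumerate(GROUP_WIDTHS):
        mask = (1 << width) - 1
        xa, xb = (a >> lo) & mask, (b >> lo) & mask
        if index == 0:
            # Group 1: a single RCA driven by the actual input carry.
            s, carry = rca(xa, xb, carry, width)
        else:
            # Later groups: one RCA per assumed carry-in, then a mux.
            s0, c0 = rca(xa, xb, 0, width)
            s1, c1 = rca(xa, xb, 1, width)
            s, carry = (s1, c1) if carry else (s0, c0)
        result |= s << lo
        lo += width
    return result, carry
```

The group boundaries produced by this loop are bits [1:0], [3:2], [6:4], [10:7] and [15:11].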
3.1.3 Delay and Area Evaluation Methodology of the basic adder blocks:
The AND, OR and Inverter (AOI) implementation of an XOR gate is shown in the
figure 3.6. The gates between the dotted lines are performing the operations in parallel
and the numeric representation of each gate indicates the delay contributed by that gate.
Table 3.1: Delay and area count of the basic blocks of CSLA
3.1.4 Delay and Area Evaluation of CSLA groups:
The structure of the 16-b regular SQRT CSLA is shown in the figure 3.5. It has five
groups of different size RCA. The delay and area evaluation of each group are shown in
Fig 3.5, in which the numerals within [ ] specify the delay values, e.g., sum2 requires
10 gate delays. The steps leading to the evaluation are as follows.
1) The group2 has two sets of 2-b RCA. Based on the delay values of Table 3.1, the
arrival time of selection input c1[time(t)=7] of the 6:3 mux is earlier than
s3[t=8] and later than s2[t=6]. Thus, sum3[t=11] is the summation of s3 and mux[t=3] and
sum2[t=10] is the summation of c1 and mux.
2) Except for group2, the arrival time of the mux selection input is always greater than
the arrival time of data outputs from the RCAs. Thus, the delay of group3 to group5 is
determined, respectively, as follows:
{c6, sum [6:4]}=c3 [t=10] +mux
{c10, sum [10:7]}=c6 [t=13] +mux
{Cout, sum [15:11]}=c10 [t=16] +mux
3) One set of 2-b RCA in group2 has 2 FA for Cin=1 and the other set has 1 FA and 1
HA for Cin=0.
Based on the area count of Table 3.1, the total number of gate counts in group2 is
determined as follows:
Gate count = 57 (FA + HA + Mux)
FA = 39 (3*13)
HA = 6 (1*6)
Mux = 12 (3*4)
4) Similarly, the estimated maximum delay and area of the other groups in the regular
SQRT CSLA are evaluated and listed.
The area and delay values of all the groups of the square root CSLA are shown in
Table 3.2.
Table 3.2: Delay and area count of regular SQRT CSLA groups
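The arithmetic in the steps above can be reproduced with a short sketch. The mux delay of 3 and the per-block gate counts (FA = 13, HA = 6, Mux = 4) are read off the figures quoted in the text; the function and variable names are hypothetical.

```python
MUX_DELAY = 3                             # mux[t=3] in step 1
GATES = {"FA": 13, "HA": 6, "MUX": 4}     # gates per block, from 3*13, 1*6 and 3*4


def group_output_times(select_arrival, n_groups, mux_delay=MUX_DELAY):
    """Output times of successive groups when each group waits only for its
    mux selection input and then adds one mux delay."""
    times, t = [], select_arrival
    for _ in range(n_groups):
        t += mux_delay
        times.append(t)
    return times


def gate_count(blocks):
    """Total gate count for a group given how many of each block it uses."""
    return sum(GATES[name] * count for name, count in blocks.items())


# group3 to group5 are driven by c3[t=10]: c6, c10 and Cout follow in turn.
assert group_output_times(10, 3) == [13, 16, 19]

# group2 of the regular design: 3 FA, 1 HA and a 6:3 mux (three 2:1 muxes).
assert gate_count({"FA": 3, "HA": 1, "MUX": 3}) == 57
```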
3.2. AOI
3.2.1 Design of CSLA with AOI:
As stated above, the main idea of this work is to use a BEC instead of the RCA with
Cin=1 in order to reduce the area and power consumption of the regular CSLA. To replace
the n-bit RCA, an (n+1)-bit BEC, referred to in this chapter as the AOI block, is
required.
The structure and the function table of a 4-b AOI are shown in the figure 3.7 and
Table 3.3 respectively.
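As a sketch of what the 4-bit excess-1 conversion computes, the block below uses the Boolean expressions commonly given for a 4-bit BEC. These expressions are an assumption made for illustration; they are not copied from figure 3.7 or Table 3.3, and the function name bec4 is hypothetical.

```python
def bec4(b):
    """4-bit binary to excess-1 conversion, written at gate level.
    Assumed expressions: X0 = NOT B0, X1 = B0 XOR B1,
    X2 = B2 XOR (B0 AND B1), X3 = B3 XOR (B0 AND B1 AND B2)."""
    b0, b1, b2, b3 = [(b >> i) & 1 for i in range(4)]
    x0 = b0 ^ 1
    x1 = b0 ^ b1
    x2 = b2 ^ (b0 & b1)
    x3 = b3 ^ (b0 & b1 & b2)
    return x0 | (x1 << 1) | (x2 << 2) | (x3 << 3)


# Every 4-bit input maps to the input plus one, wrapping from 1111 to 0000.
for value in range(16):
    assert bec4(value) == (value + 1) % 16
```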
3.2.3 Block diagram of CSLA with AOI:
The structure of the proposed 16-b SQRT CSLA using BEC for the RCA with Cin=1 to
optimize the area and power is shown in the figure 3.9.
Comparing the block diagram of the regular square root CSLA with that of the modified
square root CSLA, it can be seen that the RCA with Cin=1 is replaced by AOI. This is
done to reduce the area consumption, which can be seen after evaluating the group delay
and the number of gates required for the design.
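Why the replacement preserves the result can be sketched for one group: adding one to the (n+1)-bit word formed by the carry and sum of the Cin=0 adder gives the same word that a second adder with Cin=1 would produce. The names rca, excess_one and modified_group are illustrative, and integer arithmetic stands in for the gate-level blocks.

```python
def rca(a, b, cin, width):
    """Behavioural stand-in for a width-bit ripple carry adder."""
    total = a + b + cin
    return total & ((1 << width) - 1), total >> width


def excess_one(value, width):
    """Behavioural excess-1 conversion on a width-bit word."""
    return (value + 1) & ((1 << width) - 1)


def modified_group(a, b, cin, width):
    """One group of the modified design: RCA with Cin=0, an excess-1 block
    on its (width+1)-bit output, and a mux driven by the actual carry-in."""
    s0, c0 = rca(a, b, 0, width)
    packed = s0 | (c0 << width)                # word {c0, s0}
    plus_one = excess_one(packed, width + 1)   # equals the Cin=1 result
    chosen = plus_one if cin else packed
    return chosen & ((1 << width) - 1), chosen >> width
```

Because the largest Cin=0 result is 2^(n+1) - 2, adding one never overflows the (n+1)-bit word, so no information is lost in the conversion.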
3.2.4 Architecture of modified model:
The architecture of this modified model is shown in figure 3.10.
Fig 3.10: Delay and area evaluation of modified SQRT CSLA: (a) group2, (b) group3, (c)
group4 (d) group5. H is a Half Adder.
The structure is split into five groups. The delay and area estimation of each group
are shown in the figure 3.10. The steps leading to the evaluation are given here.
1. The group2 has one 2-b RCA which has 1 FA and 1 HA for Cin=0. Instead of
another 2-b RCA with Cin=1, a 3-b BEC is used which adds one to the output
from the 2-b RCA. Based on the delay values of the gates, the arrival
time of selection input c1[time(t)=7] of the 6:3 mux is earlier than s3[t=9] and
c3[t=10] and later than s2[t=4]. Thus, sum3 depends on s3 and mux, and the final c3
(output from mux) depends on the partial c3 (input to mux) and mux.
The sum2 depends on c1 and mux.
2. For the remaining groups the arrival time of the mux selection input is always
greater than the arrival time of data inputs from the BECs. Thus, the delay of the
remaining groups depends on the arrival time of the mux selection input and the mux
delay.
3. The area count of group2 is determined as follows:
Gate count = 40 (FA + HA + Mux + AOI)
FA= 13(1*13)
HA= 6(1*6)
AND=4
OR=2
NOT=3
MUX = 12 (3*4)
4. Similarly, the estimated maximum delay and area of the other groups of the
modified SQRT CSLA are evaluated and listed in the Table 3.4.
Table 3.4: Delay and area count of modified SQRT CSLA groups
Delay   Delay   Area
11      11      40
13      16      57
16      23      74
19      30      98
Table 3.5: Comparison Table of Regular CSLA and CSLA with AOI
CHAPTER 5
regular and modified designs have been developed using
5.2 Applications:
1. Image processing
In image processing with interpolation, an output of the gamma circuit and the input
data are input to an adder circuit so as to obtain the added and averaged values at a
predetermined ratio.
2. Signal processing
Addition is by far the most fundamental arithmetic operation. It has been ranked the
most extensively used operation among a set of real-time digital signal processing
benchmarks, from application-specific DSPs to general-purpose processors.
3. Arithmetic logic units
The carry select adder is used in arithmetic logic units to perform addition and
multiplication in less time.
4. Advanced microprocessor design
In microprocessor design, the adder is used for the conversion mechanism in
calculating the physical address using the offset address and segment address.
5. High speed multiplications
In multiplier, each bit of the Product P is obtained by a summation of bits AiBj using
an array of single bit adders. The bits AiBj are formed using AND gates.
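The multiplication scheme in item 5 can be sketched behaviourally: AND gates form the bits AiBj, and the rows of partial products are then summed. The function name array_multiply and the default width n=8 are illustrative assumptions, and integer addition stands in for the array of single-bit adders.

```python
def array_multiply(a, b, n=8):
    """Multiply two n-bit integers from AND-gate partial products."""
    product = 0
    for j in range(n):
        bj = (b >> j) & 1
        row = 0
        for i in range(n):
            ai = (a >> i) & 1
            row |= (ai & bj) << i      # AND gate forms the bit AiBj
        product += row << j            # row j is weighted by 2^j and summed
    return product
```

Each of these row additions is where a fast adder such as the CSLA would be used in a hardware multiplier.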
5.3 Advantages:
1. Low area:
The modified carry select adder uses fewer logic gates (low area) because it no longer
needs a pair of ripple carry adders in each group.
5.4 Disadvantages:
1. Increased delay:
Even though there is reduction in area, a slight increase in delay can be seen in the
modified CSLA.
CHAPTER 6