Low Power
                           LIST OF FIGURES
FIGURE NO.    NAME
   1.0        VLSI design flow
   1.1        Physical design process
   2.1        GDI basic cell
   3.1        Schematic diagram of half adder
   3.2        Schematic diagram of full adder
   3.3        Ripple carry adder
   3.4        Carry skip adder
   3.5        Full adder
   3.6        Carry-lookahead adder
   3.7        Parallel carry adder
   3.8        16-bit carry-lookahead adder
   3.9        Carry-lookahead adder IC
   3.10       4-bit parallel adder 74LS283
   3.11       CIA_CLA
   3.12       CIA-RCA
   3.13       Block diagram of CBA
   5.1        AND gate using GDI method
   5.2        Input and output waveform of an AND gate using GDI
   5.3        OR gate using GDI method
   5.4        Input and output waveform of an OR gate using GDI
   5.5        NOT gate using GDI method
   5.6        Input and output waveform of a NOT gate using GDI
   5.7        XOR gate using GDI method
   5.8        Input and output waveform of an XOR gate using GDI
   5.9        Graphical analysis of the average power consumption
   6.1        AND gate design
   6.2        NOT gate design
   6.3        OR gate design
   6.4        XOR gate design
   6.5        Half adder design
   6.6        1-bit full adder design
   6.7        4-bit RCA adder design
   6.8        8-bit CIA adder design
   6.9        8-bit CBA adder design
   6.10       8-bit CSKA adder design
   6.11       8-bit CLA adder design
   6.12       Output waveforms for AND gate
   6.13       Output waveforms for MUX
   6.14       Output waveforms for OR gate
   6.15       Output waveforms for XOR gate
   6.16       Output waveforms for half adder
   6.17       Output waveforms for 1-bit full adder
   6.18       Output waveforms for 8-bit CIA adding (1101 1011)2 and (1110 1110)2
   6.19       Output waveforms for 8-bit CBA adding (0101 1101)2 and (1011 1101)2
   6.20       Output waveforms for 8-bit CSKA adding (1111 0111)2 and (1100 1111)2
   6.21       Output waveforms for 8-bit CLA adding (1100 1111)2 and (1111 1101)2
                       LIST OF TABLES
                                       CONTENTS
Declaration
Acknowledgement
Abstract
List of Figures
List of Tables
                                 INTRODUCTION
1.1 INTRODUCTION
          Adders are among the most widely used circuit elements in Very Large Scale
Integration (VLSI) systems such as Digital Signal Processing (DSP) processors and
microprocessors. The adder is the nucleus of many other operations, including subtraction,
multiplication, division and address calculation. In most digital systems, adders lie on the
critical path, which determines the overall system performance. Hence, enhancing adder
performance is an important goal. The explosive growth in portable systems such as laptops
has intensified research efforts in low power microelectronics, because battery technology
does not advance at the same rate as microelectronics technology and only a limited amount
of power is available to mobile systems. Low power design has therefore become a major
design consideration. Advances in VLSI technology allow hardware realization of most
compute-intensive applications, such as multimedia processing and DSP, to enhance the
speed of operation. Moreover, the increasing demand for and popularity of portable
electronic products drive researchers to strive for smaller silicon area, higher speed,
longer battery life and enhanced reliability. Full adder design is central to digital
computing, and the design criteria for a full adder are usually multifold.
        Transistor count, one of these criteria, determines the system complexity of
arithmetic circuits such as multipliers and Arithmetic Logic Units (ALUs). Power
consumption and speed are the other two important criteria in the design of full adders.
However, they conflict with each other. Therefore, the power delay product, or energy
consumption per operation, has been introduced to capture the design tradeoff. The
performance of digital circuits can be optimized by proper selection of the logic style.
Different logic styles tend to favor one performance aspect at the expense of others.
Although they implement the same function, the logic styles differ in the method of
computing intermediate nodes and in transistor count. Numerous full adder designs in the
classes of static CMOS, dynamic circuit, transmission gate, GDI logic and Pass Transistor
Logic (PTL) are discussed in the literature.
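The power delay product mentioned above is simply average power multiplied by delay. A minimal sketch follows; the design names and the power and delay figures are hypothetical values chosen for illustration, not results from this work.

```python
def power_delay_product(avg_power_w, delay_s):
    """Return the power delay product (energy per operation) in joules."""
    return avg_power_w * delay_s

# Hypothetical figures for two adder designs: (average power in W, delay in s).
designs = {
    "design_a": (2.0e-6, 150e-12),  # lower power, slower
    "design_b": (3.0e-6, 80e-12),   # higher power, faster
}

for name, (power, delay) in designs.items():
    print(name, power_delay_product(power, delay))

# The design with the smallest PDP offers the best power/speed tradeoff.
best = min(designs, key=lambda n: power_delay_product(*designs[n]))
print("lowest PDP:", best)
```

Under these assumed numbers the faster design wins on PDP even though it draws more power, which is why PDP rather than power alone is used for comparison.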
       The well-known static CMOS adder, with complementary pull-up PMOS and pull-down
NMOS networks, requires 28 transistors to generate the sum and carry outputs. PTL is an
alternative to CMOS and implements most functions with fewer transistors. This may reduce
the overall capacitance, which in turn increases the speed and decreases the power
dissipation. However, in a PTL based design the output voltage is degraded by the threshold
voltage drop between the input and the output. This problem can be resolved by adopting
Complementary Pass Logic (CPL) or Swing Restored PTL (SRPL). But these logic styles produce
larger short circuit current, higher transistor count and increased wiring complexity
because they require complementary input signals. Building logic from transmission gates
is another way to minimize complexity. The full adder design implemented using
transmission gates is discussed in Reference 13.
       It requires 20 transistors. Further reduction in transistor count is possible using
the transmission function adder, which needs 16 transistors and is discussed in Reference 14.
GDI logic is introduced as an alternative to CMOS logic. It is a low power design
technique which implements logic functions with fewer transistors. GDI gates provide
reduced voltage swing at their outputs, i.e. the output high (or low) voltage deviates
from VDD (or ground) by the threshold voltage Vt. The reduction in voltage swing is
beneficial to power consumption. On the other hand, it may lead to slow switching in
cascaded operation. At low VDD, the degraded output may even cause circuit malfunction.
Therefore, special attention must be paid to achieving full swing operation. In this paper,
an efficient methodology for full swing digital circuits such as AND, OR and XOR gates is
implemented. After that, three full adders based on the full swing gates are proposed in a
standard 45 nm technology. The performance of the three proposed full adder designs is
compared with other adders based on CMOS, CPL, hybrid and GDI logic cited in the literature.
       Wide utilization of memory storage systems and sequential logic in modern electronics
triggers a demand for high-performance and low area implementations of basic memory
components. In these circuits, the output depends not only upon the current values of the
inputs, but also upon preceding input values. These circuits are often called cyclic logic
circuits.
         These inputs are often used to initialize the state of digital ICs at the time the power
is first applied. Normally, a set or reset input is required, but seldom both. A T flip flop
alternately sends an output signal to two different outputs when an input signal is applied.
(vi)    Design verification: In this step, the layout is verified to ensure that it
meets the system specifications and the fabrication requirements. Design verification
consists of design rule checking (DRC) and circuit extraction. DRC is a process which
verifies that all geometric patterns meet the design rules imposed by the fabrication
process. After the layout has been checked for design rule violations and they have been
removed, the functionality of the layout is verified by circuit extraction. This is a
reverse engineering process which generates the circuit representation from the layout.
The reverse engineered circuit representation can then be compared to the original circuit
representation to verify the correctness of the layout.
(vii) Fabrication: This step follows design verification. The fabrication process
consists of several steps, such as preparation of the wafer and deposition and diffusion of
various materials on the wafer according to the layout description. A typical wafer is 10 cm
in diameter and can be used to produce between 12 and 30 chips. Before the chip is mass
produced, a prototype is made and tested.
(viii)   Packaging, testing, and debugging: In this step, the chip is fabricated and
diced in a fabrication facility. Each chip is then packaged and tested to ensure that it meets
all the design specifications and that it functions properly. Chips used in printed circuit
boards (PCBs) are packaged in a dual in-line package (DIP) or pin grid array (PGA). Chips
which are to be used in a multichip module (MCM) are not packaged because MCMs use
bare or naked chips.
(a) Partitioning: Chip layout is a complex task and hence it is divided into several
smaller tasks. A chip may contain several million transistors. The layout of the entire
circuit cannot be handled at once because of the limited memory space and computation
power available. Therefore, it is normally partitioned by grouping the components into
blocks. The actual partitioning process considers many factors such as the size of the
blocks, the number of blocks, and the number of interconnections between the blocks. The
output of partitioning is a set of blocks along with the interconnections required between
blocks. The set of interconnections required is referred to as a net list. In large circuits
the partitioning process is hierarchical, and at the topmost level a chip may have between
5 and 25 blocks. Each module is then partitioned recursively into smaller blocks.
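One of the factors named above, the number of interconnections between blocks, can be counted directly from a net list. The sketch below is illustrative only: the component names, nets and the two candidate partitions are hypothetical, and `cut_size` is a helper introduced here rather than part of any tool.

```python
def cut_size(nets, block_of):
    """Count the nets whose pins fall in more than one block."""
    return sum(1 for pins in nets if len({block_of[p] for p in pins}) > 1)

# Hypothetical net list: each net connects the listed components.
nets = [("g1", "g2"), ("g2", "g3"), ("g3", "g4"), ("g1", "g4")]

# Two candidate assignments of components to blocks 0 and 1.
partition_a = {"g1": 0, "g2": 0, "g3": 1, "g4": 1}
partition_b = {"g1": 0, "g2": 1, "g3": 0, "g4": 1}

print("partition_a cut:", cut_size(nets, partition_a))
print("partition_b cut:", cut_size(nets, partition_b))
```

A partition with fewer cut nets needs fewer interconnections between blocks, so it is usually preferred when block sizes are comparable.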
(b)   Placement: It is the process of arranging a set of modules on the layout surface.
Each module has fixed shape and fixed terminal locations. A poor placement uses larger
area and hence results in performance degradation.
         The placement process determines the exact positions of the blocks on the chip, so
as to find a minimum area arrangement for the blocks that allows completion of
interconnections between the blocks. Placement is typically done in two phases. In the first
phase an initial placement is created. In the second phase the initial placement is evaluated
and iterative improvements are made until the layout has minimum area and conforms to
design specifications.
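The two-phase idea (initial placement, then iterative improvement) can be sketched as a greedy pairwise-swap loop. This is a simplified illustration, not the algorithm used by any particular tool: it assumes a fixed set of slots, uses total Manhattan wirelength as a stand-in cost instead of area, and the module names and coordinates are hypothetical.

```python
from itertools import combinations

def wirelength(pos, nets):
    """Sum of Manhattan distances over two-pin nets."""
    return sum(abs(pos[a][0] - pos[b][0]) + abs(pos[a][1] - pos[b][1])
               for a, b in nets)

def improve(pos, nets):
    """Swap pairs of modules greedily until no swap lowers the cost."""
    pos = dict(pos)
    improved = True
    while improved:
        improved = False
        for a, b in combinations(sorted(pos), 2):
            before = wirelength(pos, nets)
            pos[a], pos[b] = pos[b], pos[a]
            if wirelength(pos, nets) < before:
                improved = True                  # keep the swap
            else:
                pos[a], pos[b] = pos[b], pos[a]  # undo the swap
    return pos

# Phase 1: a hypothetical initial placement on a 2 x 2 grid of slots.
nets = [("m1", "m2"), ("m2", "m3"), ("m3", "m4")]
initial = {"m1": (0, 0), "m2": (1, 1), "m3": (0, 1), "m4": (1, 0)}

# Phase 2: iterative improvement.
final = improve(initial, nets)
print(wirelength(initial, nets), "->", wirelength(final, nets))
```

The loop terminates because every accepted swap strictly lowers the cost. A greedy scheme like this can stop in a local minimum, which is one reason practical placers use more elaborate search.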
            It is important to note that some space between the blocks is intentionally left
empty to allow interconnections between blocks. Placement may lead to un-routable
design, i.e., routing may not be possible in the space provided. Thus, another iteration of
placement is necessary. To limit the number of iterations of the placement algorithm, an
estimate of the required routing space is used during the placement phase. A good routing
and circuit performance heavily depend on a good placement algorithm. This is due to the
fact that once the position of each block is fixed, very little can be done to improve the
routing and the overall circuit performance. There are various types of placements.
System-level placement: Place all the PCBs together such that the area occupied is
minimum and heat dissipation is within limits.
Board-level placement: All the chips have to be placed on a PCB. The area is fixed and all
modules are of rectangular shape. The objective is to minimize the number of routing layers
and meet system performance requirements.
Chip-level placement: Normally, floor planning / placement is carried out along with
pin assignment. The number of routing layers is limited (2 to 4). Bad placements may be
unroutable, which can be detected only later (during routing) and causes costly delays in
the design cycle. The objective is minimization of area.
1.5 Floor planning:
        Floor-plan design is an important step in physical design of VLSI circuits to plan
the positions of a set of circuit modules on a chip in order to optimize the circuit
performance. In floor-planning, the information of a set of modules, including their areas
and interconnection is considered and the goal is to plan their positions on a chip to
minimize the total chip area and interconnect cost.
        In the floor planning phase, the macro cells are positioned on the layout surface in
such a way that no blocks overlap and that there is enough space left to complete the
interconnections. The input for the floor planning is a set of modules, a list of terminals
(pins for interconnections) for each module and a net list, which describes the terminals
which have to be connected.
(c) Routing:
Compaction: The operation of minimizing layout area without violating the design rules
and without altering the original functionality of the layout is called compaction. The
input of compaction is a layout and the output is also a layout, but with reduced area.
Compaction is done in three ways:
(i) By reducing the space between blocks without violating the design space rule.
(ii) By reducing the size of each block without violating the design size rule.
(iii) By reshaping blocks without violating the electrical characteristics of the blocks.
The objective of any simulation tool is to create a computer based model for verifying the
design and analyzing the behavior of the circuit under construction at the current level of
abstraction.
Types of Simulation:
Device level simulation, Circuit level simulation, Timing level & Macro level simulation,
Switch level simulation, Gate level simulation, RTL simulation, System level simulation.
computes the next state and the output based on the current state and the input. Here the
important consideration is the state transitions; the precise timing of intermediate
signals in the computation of the next state is not considered.
System level Simulation:
It deals with the hardware described in terms of primitives that need not correspond with
hardware building blocks. VHDL is the most popular hardware description language used
for system level simulation. When used in the initial stages of a design, it can describe the
behavior of a circuit or a processor as a set of communicating processes.
                                    CHAPTER 2
         LOW POWER HIGH SPEED DIFFUSION INPUT
                     TECHNIQUE
        The Gate Diffusion Input (GDI) method is based on the use of a simple cell, shown in
Fig. 1, which can be used for low power digital circuits. This technique is implemented in
twin-well CMOS or Silicon on Insulator (SOI) technologies. In this process, the bulks of
both NMOS and PMOS transistors are hardwired to their diffusions to reduce the bulk effect,
that is, the dependence of threshold voltage on source-to-bulk voltage.
The dependence of transistor threshold voltage on source-to-bulk voltage is as follows:
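For reference, the textbook body-effect relation is usually written in the form below. The symbols V_TH0 (threshold voltage at zero source-to-bulk voltage), γ (body-effect coefficient) and φ_F (Fermi potential) are standard notation assumed here; they are not taken from this document.

```latex
V_{TH} = V_{TH0} + \gamma \left( \sqrt{\left| 2\phi_F + V_{SB} \right|} - \sqrt{\left| 2\phi_F \right|} \right)
```

When the bulk is tied to the source, V_SB = 0 and the bracket vanishes, so V_TH reduces to V_TH0. This is the effect that hardwiring the bulks to the diffusions is meant to exploit.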
design are as follows: 1) a lower transistor count results in lower power dissipation and
shorter delay; 2) fewer transistors occupy a smaller area and produce fewer interconnect
effects. However, PTL technologies also suffer from two main problems: reduced circuit
speed at low power operation and greater static power dissipation.
       GDI is a technique that can be used to design fast, low power circuits using only a
few transistors. The GDI cell is similar in structure to a CMOS inverter. In a CMOS inverter
the source of the PMOS is connected to VDD and the source of the NMOS is grounded, but in a
GDI cell this is not necessarily the case. There are some important differences between
the two. The three inputs in GDI are:
1) G: common input to the gates of the NMOS and PMOS
2) N: input to the source/drain of the NMOS
3) P: input to the source/drain of the PMOS
The bulks of the NMOS and PMOS are connected to N and P respectively, so they can be
arbitrarily biased, unlike in a CMOS inverter. Moreover, the most important difference
between CMOS and GDI is that in GDI the N, P and G terminals can be connected to the supply
VDD, grounded, or driven by an input signal, depending on the circuit to be designed, which
minimizes the number of transistors used in most logic circuits (e.g. AND, OR, XOR, MUX).
Because the allotment of supply and ground to the PMOS and NMOS is not fixed in GDI, the
problem of low voltage swing arises. This is a drawback of GDI and makes it difficult to use
for analog circuits.
       The most common problem with the PTL technique is its low voltage swing. Extra
buffer circuitry may be used to eliminate the problem of low swing and improve drivability.
The problem of low swing can be understood with the help of the random function shown in
the figure and table.
Functionality of any random function using GDI (inputs A and B, output Y)
A          B          Functionality          Y (logic level)
0          0          pMOS Trans Gate        Vtp
0          1          CMOS Inverter          1
1          0          nMOS Trans Gate        0
1          1          CMOS Inverter          0
The problem of low swing occurs only when A=0 and B=0, where the output voltage level is
VTP instead of 0. This occurs due to the poor high-to-low transition characteristics of the
PMOS. In the rest of the cases the cell provides full swing.
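The rows of the table above can be reproduced with an idealized switch-level model. This is only a sketch under stated assumptions: it assumes the random function is the configuration G = A, P = B, N = ground, it treats the PMOS as passing a low level only down to |Vtp| and the NMOS as passing a high level only up to VDD - Vtn, and the supply and threshold values are hypothetical.

```python
VDD, VTN, VTP = 1.0, 0.3, 0.3  # hypothetical voltages, for illustration only

def gdi_cell_voltage(g, p, n):
    """Idealized GDI output voltage for logic inputs g, p, n (each 0 or 1)."""
    if g == 0:
        # PMOS conducts: a high passes fully, a low only reaches |Vtp|.
        return VDD if p == 1 else VTP
    # NMOS conducts: a low passes fully, a high only reaches VDD - Vtn.
    return 0.0 if n == 0 else VDD - VTN

# Assumed configuration: G = A, P = B, N tied to ground.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, gdi_cell_voltage(g=a, p=b, n=0))
```

In this model only the A = 0, B = 0 case leaves the output at the threshold voltage instead of 0, which agrees with the single degraded entry in the table.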
        The basic GDI cell is shown in Fig. 1. Though it resembles a conventional CMOS
inverter, the source/drain diffusion inputs of the PMOS and NMOS transistors are treated
differently. In a conventional inverter, the source diffusions of the PMOS and NMOS
transistors are always tied to VDD and GND respectively. In the GDI cell, on the other hand,
the diffusion terminals act as external inputs. This helps in the realization of various
Boolean functions such as AND, OR, MUX, INVERTER, F1 and F2, as listed in Table 1. The main
drawback of the GDI gate is that it suffers from threshold voltage drop, which reduces
current drive and affects the performance of the gate. The output voltage reduction can be
compensated by the use of swing restoration buffers at the output.
        However, the presence of inverters in the buffers increases the transistor count and
also increases the static power consumption when they are connected in cascade. A
multiple Vt technique is presented in lieu of the swing restoration buffer in Reference 15.
This approach uses low threshold transistors where a voltage drop would occur and high
threshold transistors for the inverters. Though this hybrid threshold voltage method
minimizes power consumption, it complicates the transistor fabrication process. Another
method of swing restoration for the GDI based full adder output, using an Ultra Low Power
Diode (ULPD) technique, is detailed in Reference 17. This technique configures the MOS
transistor to work as a diode and uses 8 additional transistors to provide full swing. It
mitigates the static power dissipation of a conventional swing restoration buffer, but the
complexity of fabricating the ULPD must still be taken into account. The techniques
presented so far to achieve full swing at the full adder output either increase the number
of transistors (by more than half compared with the non-full swing design) or increase the
power consumption (through the use of buffers). So, a general method is required to achieve
full swing at the gate level, for gates such as AND, OR and XOR. Hence, an attempt is made
to design full swing gates and subsequently three adders using the proposed gates; a
detailed explanation is given in the following section.
               The GDI method is based on the simple cell shown in Fig. 2.2. A basic GDI
cell contains four terminals: G (the common gate input of the nMOS and pMOS
transistors), P (the outer diffusion node of the pMOS transistor), N (the outer diffusion
node of the nMOS transistor) and D (the common diffusion node of both transistors). P,
N and D may be used as either input or output ports, depending on the circuit structure.
Table 2.3 shows how various configurations of the inputs P, N and G in the basic GDI cell
correspond to different Boolean functions at the output D. GDI enables simpler gates,
lower transistor count, and lower power dissipation in many implementations, as
compared with standard CMOS and Pass-transistor Logic (PTL) design techniques.
Where
K denotes the device transconductance parameter,
VTH denotes the threshold voltage,
W denotes the channel width,
L denotes the channel length.
N                  P                  G                  D                    Function
'0'                B                  A                  A'B                  F1
B                  '1'                A                  A'+B                 F2
'1'                B                  A                  A+B                  OR
B                  '0'                A                  AB                   AND
C                  B                  A                  A'B+AC               MUX
'0'                '1'                A                  A'                   NOT
(A' denotes the complement of A.)
Table 2.2: Some logic functions that can be implemented with a single GDI cell.
From the table, it can be seen that various functions can be performed using only 2
transistors. For instance, an OR gate can be designed using a single GDI cell, whereas an
OR gate built from transmission gates requires 6 transistors. Similarly, an AND gate can be
designed using only 2 transistors, and even a multiplexer (MUX) can be built from a single
GDI cell. Thus, a simple change to the input configuration of the GDI cell yields a wide
variety of Boolean functions. Multiple-input gates can be implemented by combining several
GDI cells. The main advantage of the GDI cell is that a large number of functions can be
implemented using the basic cell. GDI gates are more compact and flexible than static CMOS
gates and have very low leakage current. This is attributed to the structure of the GDI
cell, which reduces both sub-threshold and gate leakage current.
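The single-cell functions can be checked with a logic-level model in which the PMOS passes P to the output when G = 0 and the NMOS passes N when G = 1. The sketch below ignores voltage swing entirely; the function name `gdi` and the dictionary layout are illustrative choices, not taken from the original.

```python
from itertools import product

def gdi(g, p, n):
    """Logic-level GDI cell: output follows P when G = 0 and N when G = 1."""
    return p if g == 0 else n

# name -> (cell wired as in the table, expected Boolean function of A, B, C)
functions = {
    "F1":  (lambda a, b, c: gdi(a, b, 0), lambda a, b, c: (1 - a) & b),
    "F2":  (lambda a, b, c: gdi(a, 1, b), lambda a, b, c: (1 - a) | b),
    "OR":  (lambda a, b, c: gdi(a, b, 1), lambda a, b, c: a | b),
    "AND": (lambda a, b, c: gdi(a, 0, b), lambda a, b, c: a & b),
    "MUX": (lambda a, b, c: gdi(a, b, c), lambda a, b, c: ((1 - a) & b) | (a & c)),
    "NOT": (lambda a, b, c: gdi(a, 1, 0), lambda a, b, c: 1 - a),
}

results = {}
for name, (cell, expected) in functions.items():
    results[name] = all(cell(a, b, c) == expected(a, b, c)
                        for a, b, c in product((0, 1), repeat=3))
    print(name, "matches" if results[name] else "differs")
```

Every function in the table is the same two-transistor cell; only the signals wired to G, P and N change.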
                                  CHAPTER-3
                               TYPES OF ADDERS
The half adder is an example of a simple, functional digital circuit built from two logic
gates. The half adder adds two one-bit binary numbers (A and B). The outputs are the sum of
the two bits (S) and the carry (C). Note how the same two inputs are directed to two
different gates.
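As a minimal sketch, the two gates can be modelled as an XOR for the sum and an AND for the carry, which is the standard half adder; the function name is an illustrative choice.

```python
def half_adder(a, b):
    """Return (sum, carry) for two one-bit inputs."""
    return a ^ b, a & b

# Print the full truth table: A, B, S, C.
for a in (0, 1):
    for b in (0, 1):
        s, c = half_adder(a, b)
        print(a, b, s, c)
```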
3.3 Full adder:
A full adder adds binary numbers and accounts for values carried in as well as out. A
one-bit full adder adds three one-bit numbers, often written as A, B, and Cin; A and B are
the operands, and Cin is a bit carried in from the previous less significant stage [2]. The
full adder is usually a component in a cascade of adders, which add 8, 16, 32, etc. bit
binary numbers. The circuit produces a two-bit output: output carry and sum.
S = A XOR B XOR Cin ------------------------------------------ (1)
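A one-bit full adder can be sketched with the standard equations S = A XOR B XOR Cin and Cout = A·B + Cin·(A XOR B). The function name is illustrative; the check at the end compares the two output bits with ordinary integer addition.

```python
from itertools import product

def full_adder(a, b, cin):
    """Return (sum, carry_out) for three one-bit inputs."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

# The two-bit output (cout, s) should equal a + b + cin for all inputs.
for a, b, cin in product((0, 1), repeat=3):
    s, cout = full_adder(a, b, cin)
    print(a, b, cin, "->", cout, s)
```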
3.4 Ripple carry adder:
Arithmetic operations such as addition, subtraction, multiplication and division are basic
operations to be implemented in a digital computer using basic gates. Among all the
arithmetic operations, once addition is implemented it is easy to perform multiplication by
repeated addition. Half adders can be used to add two one-bit binary numbers. It is also
possible to create a logical circuit using multiple full adders to add N-bit binary
numbers. Each full adder takes as its carry input the carry output of the previous adder.
This kind of adder is a ripple carry adder, since each carry bit "ripples" to the next full
adder. The first full adder may be replaced by a half adder.
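The chaining of full adders can be sketched as a loop that feeds each stage's carry output into the next stage. The function names, the 8-bit default width and the example operands are illustrative assumptions.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

def ripple_carry_add(x, y, n_bits=8, cin=0):
    """Add two n-bit integers stage by stage, LSB first.

    Returns (n-bit sum, carry out of the last stage).
    """
    carry, total = cin, 0
    for i in range(n_bits):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total, carry

total, carry = ripple_carry_add(0b11001111, 0b11111101)
print(format(total, "08b"), carry)
```

With cin fixed at 0 the first stage behaves like a half adder, which is why it can be replaced by one.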
transmit state is true for each pair (ai,bi). For each operand input bit pair (ai,bi) the transmit
conditions are determined using an XOR gate. When all transmit conditions are true, then
the carry-in bit determines the carry-out bit. The n-bit CSKA consists of mux, AND gate
with n-inputs and n-bit carry ripple chain. The carry ripple chain provides transmit bit,
which is connected to the AND gate. The mux uses the resultant bit
In a parallel adder, the binary addition of two numbers can begin only when all the bits
of the augend and the addend are available at the same time. In a parallel adder circuit,
the carry output of each full adder stage is connected to the carry input of the next
higher-order stage; hence it is also called a ripple carry type adder.
        In such adder circuits, it is not possible to produce the sum and carry outputs of
any stage until the input carry arrives. So there is a considerable time delay in the
addition process, which is known as carry propagation delay. In any combinational circuit,
a signal must propagate through the gates before the correct output sum is available at the
output terminals.
                                      Fig 3.5: Full adder
Consider the above figure, in which the sum S4 is produced by the corresponding full
adder as soon as the input signals are applied to it. But the carry input C4 does not reach
its final steady-state value until carry C3 is available at its steady-state value.
Similarly, C3 depends on C2 and C2 on C1. Therefore, the carry must propagate through all
the stages before output S4 and carry C5 settle to their final steady-state values.
The propagation time is equal to the propagation delay of a typical gate times the number
of gate levels in the circuit. For example, if each full adder stage has a propagation delay
of 20 ns, then S4 will reach its final correct value after 80 ns (20 × 4). If the number of
stages is extended to add more bits, the situation becomes much worse. So the speed at
which bits are added in the parallel adder depends on the carry propagation time, and the
signals must be given enough time to propagate through the gates to produce the correct
output.
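The arithmetic in the example above scales linearly with the number of stages. The sketch below uses the same 20 ns per-stage figure; the function name and the wider stage counts are illustrative.

```python
def ripple_settle_time_ns(n_stages, stage_delay_ns=20):
    """Worst-case time for the last sum bit of an n-stage ripple adder to settle."""
    return n_stages * stage_delay_ns

for n in (4, 8, 16, 32):
    print(n, "stages:", ripple_settle_time_ns(n), "ns")
```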
The following are the methods for achieving high speed binary addition in the parallel
adder.
1.     By employing faster gates with reduced delays, we can reduce the propagation
delay. But every physical logic gate has a limit on its speed.
2.     Another way is to increase the circuit complexity in order to reduce the carry delay
time. Several methods are available for speeding up the parallel adder; one commonly
used method employs the principle of look-ahead carry addition, which eliminates the
inter-stage carry logic.
Carry-Lookahead Adder
This method makes use of logic gates so as to look at the lower order bits of the augend
and addend to see whether a higher order carry is to be generated or not. Let us discuss in
detail.
                                Table 3.3: Truth table CLA
Consider the full adder circuit shown above with the corresponding truth table. If we
define two variables, carry generate Gi and carry propagate Pi, then
Pi = Ai ⊕ Bi
Gi = Ai Bi
Si = Pi ⊕ Ci
Ci+1 = Gi + Pi Ci
where Gi is the carry generate, which produces a carry when both Ai and Bi are one,
regardless of the input carry. Pi is the carry propagate and is associated with the
propagation of the carry from Ci to Ci+1.
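The four relations for one bit position can be written directly in code. This is a minimal sketch; the function name `pg_stage` is illustrative, and the loop checks the outputs against ordinary addition of the three input bits.

```python
from itertools import product

def pg_stage(a, b, c):
    """One bit position: returns (P, G, S, C_next)."""
    p = a ^ b             # carry propagate
    g = a & b             # carry generate
    s = p ^ c             # sum bit
    c_next = g | (p & c)  # carry into the next position
    return p, g, s, c_next

for a, b, c in product((0, 1), repeat=3):
    p, g, s, c_next = pg_stage(a, b, c)
    print(a, b, c, "P =", p, "G =", g, "S =", s, "C_next =", c_next)
```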
The carry output Boolean function of each stage in a 4-stage carry-lookahead adder can be
expressed as
C1 = G0 + P0 Cin
C2 = G1 + P1 C1
= G1 + P1 G0 + P1 P0 Cin
C3 = G2 + P2 C2
= G2 + P2 G1+ P2 P1 G0 + P2 P1 P0 Cin
C4 = G3 + P3 C3
= G3 + P3 G2+ P3 P2 G1 + P3 P2 P1 G0 + P3 P2 P1 P0 Cin
From the above Boolean equations we can observe that C4 does not have to wait for C3 and
C2 to propagate; C4 is produced at the same time as C3 and C2. Since the Boolean
expression for each carry output is a sum of products, it can be implemented with one
level of AND gates followed by an OR gate.
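The expanded carry equations can be checked with a short model of a 4-bit carry-lookahead adder in which every carry is written as a sum of products of P, G and Cin only. The function name `cla4` is illustrative.

```python
def cla4(a, b, cin=0):
    """4-bit carry-lookahead addition; returns (4-bit sum, carry out C4)."""
    A = [(a >> i) & 1 for i in range(4)]
    B = [(b >> i) & 1 for i in range(4)]
    P = [x ^ y for x, y in zip(A, B)]
    G = [x & y for x, y in zip(A, B)]
    # Each carry depends only on P, G and cin, never on another carry.
    c1 = G[0] | (P[0] & cin)
    c2 = G[1] | (P[1] & G[0]) | (P[1] & P[0] & cin)
    c3 = (G[2] | (P[2] & G[1]) | (P[2] & P[1] & G[0])
          | (P[2] & P[1] & P[0] & cin))
    c4 = (G[3] | (P[3] & G[2]) | (P[3] & P[2] & G[1])
          | (P[3] & P[2] & P[1] & G[0])
          | (P[3] & P[2] & P[1] & P[0] & cin))
    carries = [cin, c1, c2, c3]
    total = sum((P[i] ^ carries[i]) << i for i in range(4))
    return total, c4

print(cla4(0b1011, 0b0110))
```

Because no carry expression refers to another carry, all four can be evaluated in parallel once the P and G signals are available.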
The implementation of the three Boolean functions for the carry outputs (C2, C3 and C4) of
a carry-lookahead carry generator is shown in the figure below.
                         Fig 3.7: Parallel carry-lookahead adder
Therefore, a 4-bit parallel adder can be implemented with the carry-lookahead scheme to
increase the speed of binary addition, as shown in the figure below. In this circuit, two
Ex-OR gates are required for each sum output. The first Ex-OR gate generates the Pi variable
and the AND gate generates the Gi variable.
Hence, all the P's and G's are generated in two gate levels. The carry-lookahead generator
allows all these P and G signals to propagate after they settle into their steady-state
values and produces the output carries after a delay of two gate levels. Therefore, the
sum outputs S2 to S4 have equal propagation delay times.
It is also possible to construct 16-bit and 32-bit parallel adders by cascading 4-bit
adders with carry logic. A 16-bit carry-lookahead adder is constructed by cascading four
4-bit adders with two more gate delays, whereas the 32-bit carry-lookahead adder is formed
by cascading two 16-bit adders.
In a 16 bit carry-Lookahead adder, 5 and 8 gate delays are required to get C16 and S15
respectively, which is less than the 9 and 10 gate delays for C16 and S15 respectively in
cascaded 4 bit carry-Lookahead adder blocks. Similarly, in a 32 bit adder, 7 and 10 gate
delays are required for C32 and S31, which is less than the 18 and 17 gate delays for the
same outputs if the 32 bit adder is implemented with eight 4 bit adders.
High speed carry-Lookahead adders are available as integrated circuits in different bit
configurations from several manufacturers. Several individual carry generator ICs are also
available; these must be connected with additional logic gates to perform the addition
operation.
A typical carry-Lookahead generator IC is the 74182, which accepts four pairs of active low
carry propagate (P0, P1, P2 and P3) and carry generate (G0, G1, G2 and G3) signals and
an active high carry input (Cn).
It provides active high carries (Cn+x, Cn+y, Cn+z) across the four groups of binary
adders. This IC also supports further levels of look-ahead through its active low propagate
and carry generate outputs.
                           Fig 3.9: Carry-Look ahead Adder ICs
On the other hand, there are many high speed adder ICs which combine a set of full adders
with carry-Lookahead circuitry. The most popular IC of this kind is the 74LS83/74S283,
a high speed 4 bit parallel adder that contains four interconnected full adders
together with carry-Lookahead circuitry.
The functional symbol for this type of IC is shown in the figure below. It accepts two 4 bit
numbers, A3A2A1A0 and B3B2B1B0, and an input carry Cin0 into the LSB position. The
IC produces the output sum bits S3S2S1S0 and the carry output Cout3 from the MSB position.
               Fig3.10: 4-bit parallel adder -74LS283
By cascading two or more parallel adder ICs we can add larger binary numbers, for
example 8 bit, 24 bit and 32 bit operands.
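The cascading idea can be sketched behaviourally in Python. The names adder4 and cascaded_add are hypothetical, and adder4 models only the arithmetic of a 4 bit parallel adder block, not the internal gates or timing of any particular IC: the carry output of each block feeds the carry input of the next more significant block.

```python
def adder4(a: int, b: int, c_in: int) -> tuple[int, int]:
    """Behavioural model of one 4-bit parallel adder block: (sum, carry out)."""
    total = (a & 0xF) + (b & 0xF) + (c_in & 1)
    return total & 0xF, total >> 4


def cascaded_add(a: int, b: int, n_blocks: int = 2, c_in: int = 0) -> tuple[int, int]:
    """Chain n_blocks 4-bit adders; each carry out drives the next carry in."""
    result, carry = 0, c_in
    for k in range(n_blocks):
        s, carry = adder4(a >> (4 * k), b >> (4 * k), carry)
        result |= s << (4 * k)
    return result, carry


# Two blocks give an 8 bit adder: 0xAB + 0xCD = 0x178, i.e. sum 0x78 with carry 1.
assert cascaded_add(0xAB, 0xCD) == (0x78, 1)
```

With n_blocks set to 6 or 8 the same loop models the 24 bit and 32 bit cases mentioned above.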
3.7.1 CIA_CLA
In this subsection we present the modified carry increment adder, CIA_CLA.
The RCA is the basic binary adder circuit and is popular because of its simple design.
However, it suffers from the worst propagation delay, which affects the overall
performance of the system. It is well established that the CLA performs better than the RCA
in terms of delay, at the expense of increased design complexity. We have therefore modified
CIA_RCA by replacing each RCA block with a CLA block; because of the faster carry
generation of the CLA, the overall delay performance improves. As in CIA_RCA, the
incremental circuit can be designed using HAs in a ripple carry chain in sequential order.
The block diagram of CIA_CLA is shown in Fig 3.11.
                                    Fig3.11: CIA_CLA
3.7.2 CIA_RCA
       The standard Carry Increment Adder (CIA) consists of RCAs and incremental
circuitry. The incremental circuit is designed using HAs in a ripple carry chain in
sequential order. The operands are split into groups of 4 bits, and the addition is
performed by several 4-bit RCAs. Instead of computing two partial sums for each group and
selecting the correct one, only one partial sum is calculated, and it is incremented if
necessary according to the input carry. Thus the second adder and the multiplexers of the
carry-select scheme are replaced by a much smaller incremental circuit; the modified
architecture is the Carry Increment Adder (CIA). For example, an 8-bit CIA comprises two
4-bit RCAs. The first RCA block adds the lower 4 bits to produce a 4-bit partial sum and a
carry output, so the lower 4 bits of the CIA sum are obtained directly from the first RCA
block. The carry output of the first RCA block is given as the cin input of the incremental
circuit, which consists of Half Adders (HA). The partial sum obtained from the second RCA
block is given to the incremental circuit.
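The 8-bit example can be sketched at the logic level in Python. All function names here are illustrative. The text does not state how the final carry output is formed; the sketch assumes it is the OR of the second block's carry and the incrementer's carry, which is safe because the two can never both be 1.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    return a ^ b, a & b


def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    return a ^ b ^ c, (a & b) | (c & (a ^ b))


def rca4(a: int, b: int, c_in: int = 0) -> tuple[int, int]:
    """4-bit ripple carry adder: the carry ripples through four full adders."""
    total, carry = 0, c_in
    for i in range(4):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= s << i
    return total, carry


def increment4(value: int, c_in: int) -> tuple[int, int]:
    """Incremental circuit: a chain of four half adders adds c_in to value."""
    total, carry = 0, c_in
    for i in range(4):
        s, carry = half_adder((value >> i) & 1, carry)
        total |= s << i
    return total, carry


def cia8(a: int, b: int) -> tuple[int, int]:
    """8-bit carry increment adder built from two RCA blocks and one incrementer."""
    low_sum, low_carry = rca4(a & 0xF, b & 0xF)                 # first RCA block
    partial, block_carry = rca4((a >> 4) & 0xF, (b >> 4) & 0xF)  # second block, cin = 0
    high_sum, inc_carry = increment4(partial, low_carry)
    carry_out = block_carry | inc_carry   # assumption: not specified in the text
    return (high_sum << 4) | low_sum, carry_out


# Operands of Fig 6.18: (1101 1011)2 + (1110 1110)2 = 219 + 238 = 457.
assert cia8(0b11011011, 0b11101110) == (0b11001001, 1)
```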
                                    Fig3.12: CIA-RCA
              Figure3.13: Block Diagram of CBA
CMOS vs GDI
                                      CHAPTER-4
                  INTRODUCTION TO TANNER TOOL
4.1 What Is Tanner Tool
Tanner tool is a SPICE computer analysis program for analogue integrated circuits.
Tanner tool consists of the following engines: S-Edit (schematic editor), T-Spice (circuit
simulator), W-Edit (waveform viewer) and L-Edit (layout editor).
Using these tools, the SPICE program allows the user to design & simulate
new ideas in analogue integrated circuits before going to the time-consuming & costly
process of chip fabrication.
An S-Edit design is a hierarchy of files, modules & pages. S-Edit introduces symbol &
schematic modes and provides the facility of:
1. Beginning a design.
3. Design connectivity.
Beginning a design: It explains the design process in detail in terms of file and module
operations.
Browser: Effective schematic design requires a working knowledge of the S-Edit design
hierarchy of files & modules. S-Edit design files consist of modules. A module is a
functional unit of design such as a transistor, a gate or an amplifier.
1) Primitives: Geometrical objects created with drawing tools.
2) Instances: References to other modules in file. The instanced module is the original.
An introduction to the integrated components of the T-Spice Pro circuit analysis suite:
Schematic data files (.sdb): describe the circuits to be analyzed in graphical form, for
display and editing by the S-Edit Schematic Editor.
Simulation input files (.sp): describe the circuits to be analyzed in textual form, for
editing and simulation by the T-Spice Circuit Simulator.
Simulation output files (.out): contain the numerical results of the circuit analyses, for
manipulation and display by the W-Edit Waveform Viewer.
T-Spice Pro's waveform probing feature integrates S-Edit, T-Spice and W-Edit to allow
individual points in a circuit to be specified and analyzed.
The heart of T-Spice operation is the input file (also known as the circuit description, the
netlist or the input deck). This is a plain text file that contains the device statements &
simulation commands, drawn from the SPICE circuit description language, with which
T-Spice constructs a model of the circuit to be simulated. Input files can be created and
modified with any text editor.
T-Spice is a tool used for simulation of the circuit. It provides the facility of
1. Design Simulation
2. Simulation Commands
3. Device Statements
4. User-Designed External Models
T-Spice uses Kirchhoff's Current Law (KCL) to solve circuit problems. To T-Spice, a
circuit is a set of devices attached to nodes. The voltages at all nodes represent the circuit
state. T-Spice solves for a set of node voltages that satisfies KCL (implying that the sum of
currents flowing into each node is zero). In order to evaluate whether a set of node voltages
is a solution, T-Spice computes and sums all the currents flowing out of each device into
the nodes connected to it (its terminals). The relationship between the voltages at the device
terminals and the currents through the terminals is determined by the device model. For
example, the device model for a resistor of resistance R is
I = ∆V/R
where ∆V represents the voltage difference across the device. A few analyses are
discussed below:
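Before turning to the individual analyses, the KCL formulation can be illustrated with a minimal Python sketch for a single unknown node connected only through resistors. This is a simplification made for illustration: the function names are hypothetical, and a real simulator handles many nodes and nonlinear device models with iterative numerical methods.

```python
def solve_node_voltage(branches: list[tuple[float, float]]) -> float:
    """Solve KCL for one unknown node.

    branches holds (neighbour_voltage, resistance) pairs. KCL requires
    sum((v - v_k) / R_k) = 0, which gives v = sum(v_k / R_k) / sum(1 / R_k).
    """
    conductance_sum = sum(1.0 / r for _, r in branches)
    weighted_sum = sum(v_k / r for v_k, r in branches)
    return weighted_sum / conductance_sum


def kcl_residual(v: float, branches: list[tuple[float, float]]) -> float:
    """Net current leaving the node through all resistors, using I = dV / R."""
    return sum((v - v_k) / r for v_k, r in branches)


# Voltage divider: 1 kOhm to a 5 V source, 4 kOhm to ground.
branches = [(5.0, 1000.0), (0.0, 4000.0)]
v_out = solve_node_voltage(branches)
assert abs(v_out - 4.0) < 1e-9
assert abs(kcl_residual(v_out, branches)) < 1e-12
```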
DC operating point analysis finds a circuit's steady-state condition, obtained (in principle)
after the input voltages have been applied for an infinite amount of time. The .include
command causes T-Spice to read in the contents of the model file for the evaluation of
NMOS and PMOS transistors.
The technology file assigns values to MOSFET model parameters for both n-type and p-type
devices. When the file is read in through the input file, these parameters are used to
evaluate the MOSFET model equations, and the results are used to construct internal tables
of current and charge values. Values read or interpolated from these tables are used in the
computations called for by the simulation. Following each transistor name are the names of
its terminals. The required order of terminal names is: drain, gate, source, bulk. Then the
model name (NMOS or PMOS in this example) and physical characteristics such as length and
width are specified. The .op command performs a DC operating point calculation and writes
the results to the file specified in the Simulate > Start Simulation dialog. The output file
lists the DC operating point information for the circuit described by the input file.
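The statement layout described above can be sketched with a small Python helper that assembles an input deck as plain text. Everything specific in it is an assumption for illustration: the model file name models.md, the node names, the device sizes and the supply value are hypothetical, and the statements follow generic SPICE conventions rather than a verified T-Spice example.

```python
def mosfet(name: str, drain: str, gate: str, source: str, bulk: str,
           model: str, length: str, width: str) -> str:
    """Format one MOSFET statement: name, then drain, gate, source, bulk,
    then the model name and the physical dimensions."""
    return f"{name} {drain} {gate} {source} {bulk} {model} L={length} W={width}"


def inverter_deck(model_file: str = "models.md") -> str:
    """Assemble an illustrative deck for a CMOS inverter operating point."""
    lines = [
        "* CMOS inverter, DC operating point (illustrative only)",
        f".include {model_file}",          # hypothetical model file name
        "vdd vdd 0 5",                     # supply source between node vdd and ground
        "vin in 0 0",                      # input source held at 0 V
        mosfet("mp1", "out", "in", "vdd", "vdd", "PMOS", "2u", "6u"),
        mosfet("mn1", "out", "in", "0", "0", "NMOS", "2u", "3u"),
        ".op",                             # request the operating point analysis
        ".end",
    ]
    return "\n".join(lines)


deck = inverter_deck()
```

Printing deck shows the text that would be saved as a .sp input file under these assumptions.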
4.3.2 DC Transfer Analysis
DC transfer analysis is used to study the voltage or current at one set of points in a circuit
as a function of the voltage or current at another set of points. This is done by sweeping the
source variables over specified ranges, and recording the output. A list of sources to be
swept, and the voltage ranges across which the sweeps are to take place follow the .dc
command, indicating transfer analysis. The transfer analysis will be performed as follows:
vdd will be set at 5 volts and vin will be swept over its specified range; vdd will then be
incremented and vin will be reswept over its range; and so on, until vdd reaches the upper
limit of its range. The .dc command ignores the values assigned to the voltage sources vdd
and vin in the voltage source statements, but they must still be declared in those
statements. The results for nodes in and out are reported by the .print dc command to the
specified destination.
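The nesting order of the two sweeps can be sketched in Python. The ranges and step sizes below are hypothetical (only the 5 volt starting value of vdd comes from the text), and voltages are held as integer millivolts to avoid floating point drift in the step arithmetic.

```python
def sweep(start_mv: int, stop_mv: int, step_mv: int) -> list[int]:
    """Inclusive list of sweep values in millivolts."""
    return list(range(start_mv, stop_mv + 1, step_mv))


def dc_transfer_points(vdd_values: list[int], vin_values: list[int]) -> list[tuple[int, int]]:
    """Outer loop over vdd, inner loop over vin: vin is reswept for every vdd."""
    points = []
    for vdd in vdd_values:
        for vin in vin_values:
            points.append((vdd, vin))
    return points


vdd_values = sweep(5000, 6000, 500)   # hypothetical: 5.0 V to 6.0 V in 0.5 V steps
vin_values = sweep(0, 5000, 2500)     # hypothetical: 0 V to 5.0 V in 2.5 V steps
points = dc_transfer_points(vdd_values, vin_values)

# vin completes its whole range before vdd moves to its next value.
assert points[:4] == [(5000, 0), (5000, 2500), (5000, 5000), (5500, 0)]
```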
Transient analysis provides information on how circuit elements vary with time. The basic
T-Spice command for transient analysis has three modes. In the default mode, the DC
operating point is computed, and T-Spice uses this as the starting point for the transient
simulation. The .tran command specifies the characteristics of the transient analysis to be
performed.
4.3.4 AC Analysis
and currents for all circuit devices
Real circuits, of course, are never immune from small, random fluctuations in voltage and
current levels. In T-Spice, the influence of noise in a circuit can be simulated and reported
in conjunction with AC analysis. The purpose of noise analysis is to compute the effect of
the noise associated with various circuit devices on an output voltage or voltages as a
function of frequency. If the .ac command is missing, the .noise command is ignored. With
the .ac command present, the .noise command causes noise analysis to be performed at the
same frequencies. The .noise command takes two arguments: the output at which the effects
of noise are to be computed, and the input at which the noise can be considered to be
concentrated for the purposes of estimating the equivalent noise spectral density. The
.print command is used to print the results.
The ability to visualize the complex numerical data resulting from VLSI circuit simulation
is critical to testing, understanding & improving these circuits. W-Edit is a waveform
viewer that provides ease of use, power & speed in a flexible environment designed for
graphical data representation. The advantages of W-Edit include:
1. Tight integration with T-Spice, Tanner EDA's circuit level simulator. W-Edit can chart
data generated by T-Spice directly, without modification of the output text data files. The
data can also be charted dynamically as it is produced during the simulation.
2. Charts are configured automatically for the type of data being presented.
3. Data is treated by W-Edit as a unit called a trace. Multiple traces from different output
files can be viewed simultaneously in a single window or in several windows; traces can be
copied and moved between charts & windows. Trace arithmetic can be performed on existing
traces to create new ones.
4. Chart views can be panned back & forth and zoomed in & out, including specifying the
exact X-Y co-ordinate range.
5. Properties of axes, traces, grids, charts, text & colors can be customized.
Numerical data is input to W-Edit in the form of plain text or binary files. Header &
comment information supplied by T-Spice is used for automatic chart configuration.
Runtime update of results is made possible by linking W-Edit to a running simulation in
T-Spice. W-Edit saves data with chart, trace, axis & environment settings in files with the
WDB (W-Edit Database) extension.
4.5 LAYOUT(L-EDIT)
L-Edit is a tool that represents the masks that are used to fabricate an integrated circuit.
It describes a layout design in terms of files, cells & mask primitives. At the layout level,
the component parameters are totally different from those at the schematic level, so L-Edit
lets the user analyze the response of the circuit before forwarding it to the time-consuming
& costly process of fabrication. There are rules for drawing the layout of a schematic
circuit, and following them lets the user compare the output response with the expected one.
In L-Edit, layers are associated with masks used in the fabrication process. Different
layers can be conveniently represented by different colors and patterns. L-Edit describes a
layout design in terms of files, cells, instances, and mask primitives. You may load as
many files as desired into memory. A file may be composed of any number of cells. These
cells may be hierarchically related, as in a typical design, or they may be independent, as
in a library file. Cells may contain any number or combination of mask primitives and
instances of other cells.
The basic building block of the integrated circuit design in L-Edit is a cell. Design layout
occurs within cells. A cell can:
L-Edit supports fully hierarchical mask design. Cells may contain instances of other cells.
An instance is a reference to a cell; should you edit the instanced cell, the change is
reflected in all the instances of that cell. Instances simplify the process of updating a
design and also reduce data storage requirements, because an instance does not need to
store all the data within the instanced cell; instead, only a reference to the instanced cell
is stored, along with information on the position of the instance and on how the instance is
rotated and mirrored.
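The reference semantics of instances can be sketched with two small Python data classes. The class and field names are hypothetical and do not reflect L-Edit's actual file format; the sketch only shows that an instance stores a reference plus placement data, so an edit to the instanced cell is seen through every instance.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    name: str
    primitives: list = field(default_factory=list)   # mask geometry, e.g. boxes
    instances: list = field(default_factory=list)    # Instance objects


@dataclass
class Instance:
    cell: Cell              # reference to the instanced cell, not a copy
    x: float = 0.0          # position of the instance
    y: float = 0.0
    rotation_deg: int = 0   # how the instance is rotated
    mirrored: bool = False  # whether the instance is mirrored


def count_primitives(cell: Cell) -> int:
    """Count the mask primitives in a cell, following instances recursively."""
    return len(cell.primitives) + sum(count_primitives(i.cell) for i in cell.instances)


inverter = Cell("inverter", primitives=["poly_box", "active_box"])
top = Cell("top", instances=[Instance(inverter, x=0.0), Instance(inverter, x=20.0)])
assert count_primitives(top) == 4

# Editing the instanced cell once changes what both instances show.
inverter.primitives.append("metal1_box")
assert count_primitives(top) == 6
```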
L-Edit does not use a "separated" hierarchy: instances and primitives may coexist in the
same cell at any level in the hierarchy. Design files are self-contained. The pointer to a cell
contained in an instance always points to a cell within the same design file. When cells are
copied from one file to another, L-Edit automatically copies across any cells that are
instanced by the copied cell, to maintain the self-contained nature of the destination file.
Design Rules
Design Features
L-Edit is a full-custom mask editor. Manual layout can be accomplished more quickly
because of L-Edit's intuitive user interface. In addition, one can construct special structures
to utilize a technology without worrying about problems caused by automatic
transformations. Phototransistors, guard bars, vertical and horizontal bipolar transistors,
static structures, and Schottky diodes, for example, are as easy to design in CMOS-Bulk
technology as are conventional MOS transistors.
Floor plans
L-Edit is a manual floor planning tool. You have the choice of displaying instances in
outline, identified only by name, or as fully fleshed-out mask geometry. When you display
your design in outline, you can manipulate the arrangement of the cells in your design
quickly and easily to achieve the desired floor plan. You can manipulate instances at any
level in the hierarchy, with insides hidden or displayed, using the same graphical
move/select operations or rotation/mirror commands that you use on primitive mask geometry.
Memory Limits
In L-Edit, you can make your design files as large as you like, given available RAM and
disk space.
Hard Copy
L-Edit provides the capability to print hard copy of the design. A multipage option allows
very large plots to be printed to a specific scale on multiple 8 1/2 x 11 inch pages. An
L-Edit macro is available to support large-format, high-resolution, color plotting on inkjet
plotters.
Variable Grid
L-Edit's grid options support lambda-based design as well as micron-based and
mil-based design.
Error Recovery
L-Edit's error-trapping mechanism catches system errors and in most cases provides a
means to recover without losing or damaging data.
L-Edit Modules
L-Edit/Extract creates SPICE-compatible circuit netlists from L-Edit layouts. It can
recognize active and passive devices, subcircuits, and the most common device
parameters, including resistance, capacitance, device length, width, and area, and device
source and drain area.
L-Edit/DRC features user-programmable rules and handles minimum width, exact
width, minimum space, minimum surround, non-exist, overlap, and extension rules. It can
handle full-chip and region-only DRC. DRC offers Error Browser and Object Browser
functions for quickly and easily cycling through rule-checking errors.
                          CHAPTER-5
                GATES DESIGN USING GDI METHOD
Fig. 5.2 shows the input and output waveforms of an AND gate designed using GDI.
            Figure5.2 : Input and output waveform of an AND gate using GDI
OR Gate based on GDI method
     The OR gate consists of a single GDI cell as shown in Fig. 5.3, where port P is given
input B, port G is given input A, and port N is tied to Vdd.
Fig5.3 : OR GATE-GDI
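The OR configuration described above can be checked with an idealised logic-level model of the GDI cell, in which the output follows port P when G is 0 and port N when G is 1. This is a sketch under stated assumptions: the function names are hypothetical, the model ignores threshold-voltage drops and output swing, and the NOT wiring (P at Vdd, N at ground) is inferred from the statement that the GDI NOT gate is equivalent to a CMOS inverter.

```python
def gdi_cell(g: int, p: int, n: int) -> int:
    """Idealised GDI cell: the PMOS passes P when G = 0, the NMOS passes N when G = 1."""
    return n if g else p


def gdi_or(a: int, b: int) -> int:
    """OR gate: G = A, P = B, N = Vdd (logic 1)."""
    return gdi_cell(g=a, p=b, n=1)


def gdi_not(a: int) -> int:
    """NOT gate (assumed wiring): G = A, P = Vdd (logic 1), N = ground (logic 0)."""
    return gdi_cell(g=a, p=1, n=0)


# Compare against the Boolean definitions for every input combination.
for a in (0, 1):
    assert gdi_not(a) == 1 - a
    for b in (0, 1):
        assert gdi_or(a, b) == (a | b)
```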
Fig. 5.4 shows the input and output waveforms of an OR gate constructed using the GDI method.
           Figure5.7 : XOR gate using GDI method
5.4 PERFORMANCE ANALYSIS:
     Fig. 5.9 displays a graphical comparative study of the average power consumption of
the XOR, AND and OR gates based on the GDI method with respect to those designed using
conventional CMOS logic, transmission gates (TG)
Since the design of a NOT gate based on the GDI method is equivalent to the conventional
CMOS inverter, a comparison between them would not be meaningful. The average power
consumption of the NOT gate designed with the GDI method in 150nm technology is
3.01 x 10^-10 watts for a power supply voltage (VDD) of 1.0 volt.
                        CHAPTER-6
                   RESULTS AND ANALYSIS
AND Gate
NOT Gate
XOR Gate
Half Adder
4BIT RCA ADDER: (RIPPLE CARRY ADDER)
BIT CBA ADDER (CARRY BYPASS ADDER):
BIT CLA ADDER (CARRY LOOK AHEAD ADDER):
OR Gate
XOR Gate
Half Adder
Full Adder
CIA
Fig6.18: Output waveforms for 8-Bit CIA for Adding of (1101 1011)2 and (1110 1110)2
CBA
Fig6.19: Output waveforms for 8-Bit CBA for Adding of (0101 1101)2 and (1011 1101)2
CKA
Fig6.20: Output waveforms for 8-Bit CKA for Adding of (1111 0111)2 and (1100 1111)2
6.21 CLA
Fig6.21: Output waveforms for 8-Bit CLA for Adding of (1100 1111)2 and (1111 1101)2
                         CHAPTER-7
                CONCLUSION AND FUTURE SCOPE
7.1 CONCLUSION
A novel methodology for asynchronous circuits, based on two-transistor GDI cells, was
presented. In this project, GDI adders for low-power design were proposed. The proposed
circuits have a simple structure based on low power consumption principles, and the basic
gates were also described using the GDI method.
       The current work proposes the design of a full subtractor using the Gate Diffusion
Input (GDI) procedure, which on simulation has been found to consume low power with a
shorter delay time and fewer transistors while maintaining proper output-voltage swing.
In order to establish technology independence, the present work has been performed in
150nm technology using Tanner SPICE, and the layout has been drawn in Microwind.
Comparisons were made with standard CMOS, transmission gate and CPL techniques.
The Gate Diffusion Input (GDI) technique can also be used for designing flip-flops and
memory devices, in which power consumption, delay, chip area, and interconnect and
parasitic capacitances are reduced.