
# Synthesis and relative performance of Carry Select and Carry Skip Adder in 65nm FPGA

Conference Paper · August 2011


Romana Yousuf, Jyoti Sharma and Najeeb-ud-Din (Senior Member IEEE)

*Abstract*— With the rapid growth of the system-on-chip industry, designers of digital circuits must deliver not only faster units but also smaller area and lower power. The important performance parameters to consider when designing any digital circuit therefore include speed, power consumption, and area. However, it is very difficult to integrate all of these performance criteria into a single cost function, because an improvement in one parameter can degrade the others. Careful parameter optimization in digital circuits is therefore of utmost importance. Adders are fundamental components of any digital system: other operations such as subtraction, multiplication, division, and complementation are implemented using adders. Hence, the speed of the adder determines the speed of a processor or system.

The goal of this work is to synthesize a new topology for carry select and carry skip adders in 65 nm technology using FPGAs. Their delay is then optimized, and their relative performance is evaluated on the basis of delay and Power-Delay Product (PDP).

*Index Terms*— Adders, Carry Select Adder, Carry Skip Adder, Delay, FPGA, Multilevelism in adders.

## I. INTRODUCTION

With technology scaling into the deep sub-micron regime, circuit speed increases rapidly. At the same time, power density also increases significantly because of the increasing density of the chip. Thermal considerations therefore become a major challenge in VLSI circuit design, which in turn constrains further speed improvement. Consequently, in realizing modern VLSI circuits, low power, small area, and high speed are the predominant factors that must be considered in the design flow.

Romana Yousuf was with Department of Electronics & Communication Engineering, National Institute of Technology Srinagar J&K 190006, India, and is currently with the Electronics & Communication Engineering Department, IUST, Awantipora, J&K India, (email: rozi13@rediffmail.com ,Ph No: +91 9622914268)

Jyoti Sharma was with the Department of Electronics & Communication Engineering, National Institute of Technology Srinagar, J&K 190006, India, and with the School of Electronics & Communication Engineering, SMVDU Katra, J&K, India. She is currently in Australia. (email: jyoti.sharma22@yahoo.co.in, Ph No: +91 9419 04936)

Najeeb-ud-din (Corresponding Author) is with Department of Electronics & Communication Engineering, National Institute of Technology Srinagar J&K 190006, India, (email: najeeb@nitsri.net, Ph No: +91 194 2429423 \*2706, Fax: +91 194 2420475, mobile: +91 9906 666033)

Digital circuits make use of digital arithmetic, which encompasses the study of number representation, algorithms for operations on numbers, and the implementation of arithmetic units in hardware. Adders are considered one of the fundamental arithmetic components in computer systems. Other operations such as subtraction, multiplication, and division are implemented using adders. The adder delay defines the maximum operating frequency of an integrated circuit, and fast addition requires fast carry generation. The demand for fast adders keeps increasing, since present-day microprocessors use high clock frequencies and low supply voltages. Various kinds of adders have been implemented to satisfy area, delay, and power requirements. They include the Ripple Carry Adder, Carry Look-ahead Adder, Carry Skip Adder, and Carry Select Adder, which provide trade-offs between delay and other parameters such as area and energy dissipation. No single design is the most efficient in every respect; rather, each provides an alternative to choose in a specific context within a set of specific requirements and constraints. In this paper we focus on the Carry Select Adder and the Carry Skip Adder, comparing them on the basis of two important performance parameters: delay and power-delay product.

## II. CARRY SELECT ADDER

In the conventional Carry Select Adder (CSA) [1], two ripple carry adders are used to compute the sum, one for carry-in $C_{in} = 0$ and the other for $C_{in} = 1$. The output is then obtained by selecting the correct result through a multiplexer. The disadvantage of this topology is that it consumes more power [1]. To eliminate this disadvantage, Chang and Hsiao [2] proposed a carry select adder that uses a single ripple carry adder. This scheme requires 29.2% fewer transistors with a speed penalty of 5.9% for a bit length of n = 64. To further improve the speed, a new design for high speed and high density carry select adders was proposed in [3], giving an algorithm with reduced hardware that makes use of Field Programmable Gate Arrays (FPGA). The algorithm was implemented for a 64-bit adder, but is also efficient for other sizes of adders or subtractors. In the scheme given by Hashemian [3], carries are grouped and the carries within a group are generated concurrently, which reduces the overall delay of the adder. A further reduction in area with negligible speed penalty was achieved by Kim and Kim [4]. Their adder uses Chang's complement scheme to design an add-one circuit, with some small changes to improve the circuit. It has 38% fewer transistors than the conventional adder proposed by Bedrij [1] and 29% fewer transistors than Chang and Hsiao's [2] carry select adder. Another architecture for the carry select adder was proposed in [5]; it is based on sharing of clock and is hence called the carry select adder with sharing (CSAS). It has been found that the CSAS is $\sqrt{2}$ times slower than the CSA, but the area of the CSA is larger than that of the CSAS. To increase the speed of the carry select adder, the carry select scheme can be combined with the Carry Look-Ahead Adder (CLA) architecture [6]. This combination of CSA and CLA results in a hybrid adder.

Dual Transition Skewed Logic (DTSL) can also be used to design a low power and high performance carry select adder [7]. Alioto, Palumbo, and Poli [8] proposed a gate-level design of the carry select adder, where the requirement was to minimize the delay through a proper selection of full adder group sizes. Further improvements in the CSA have been made by introducing a pipelined adder architecture [9] and a reconfigurable carry select adder with fixed-sized and variable-sized blocks [10]. To obtain a lower transistor count, and hence an area-efficient carry select adder, a modified add-one circuit was proposed in [11]. Further, to reduce the area and power with minimum speed penalty, a modified carry select adder was proposed in [12]. The authors analyzed two versions of the carry select adder, one using a ripple carry adder for low power and the other using a carry look-ahead adder for higher speed.

## III. CARRY SKIP ADDER

The carry skip mechanism was originally invented by Babbage in the nineteenth century. The main consideration in carry skip adders is finding the optimum group distribution of blocks. Silvio Turrini simplified the group distribution problem in the late 1980s with an algorithm that, for a varying worst-case delay, generates optimal distributions of variable-sized blocks [13]. Initially, fixed-size block carry skip adders were proposed. However, it was found that the delay of variable-size block carry skip adders is much lower than that of fixed-size ones [14, 15]. Designs were optimized by first considering the worst-case delay of the adder and then finding the optimal distribution of blocks, each characterized by a different number of bits. In 1993 Kantabutra proposed building the largest possible adder for a given carry propagation delay [16]. Enhancements to the basic carry-skip structure that use multiple skip levels were proposed in [17, 18]. An accelerated two-level carry skip adder consists of two halves, called the ascending half and the descending half [17]. A 3-level fully static carry skip adder has also been implemented with high performance and low power consumption [18]. Power consumption can be reduced by using a pass-transistor-based full adder, in which every cell has a full voltage swing for all input combinations [19]. Here the carry skip adder employs a network of transmission gates to enable carry signals to bypass blocks of full adder cells for faster computation. A generalized block distribution algorithm (GBDA) was proposed to determine the block sizes, which achieved better topological regularity and layout simplicity [20]. To achieve topological regularity and reduce carry propagation time relative to existing carry skip adders, a recursive partial full adder (PFA) hierarchy was proposed. An improvement can be obtained by replacing the blocks with high speed adders, such as a carry look-ahead adder structure, to reduce the first block delay [14]. The excessive delay due to the skip multiplexer at the most significant bit (MSB) end of the adder can be reduced by introducing a carry select adder at that position [21]. A more recent adder was implemented in 130 nm CMOS technology, where the main concerns were minimizing dynamic switching power and achieving the target processor time [15].

## IV. LOGIC AND COMPARISON

### A. Carry Select Adder

The basic principle behind the Carry Select Adder (CSA) is that each level has two adder units that perform the addition in parallel [1]. The first unit performs the addition assuming a carry-in of '0', generating the sum and carry-out bits. The second unit performs the same operation assuming a carry-in of '1'. Two sums are thus computed in parallel, one for each possible carry-in, and the correct one is selected by a logic circuit such as a multiplexer once the carry of the previous stage is available.

Mathematically:

$$(C^{0}, S^{0}) = ADD(A, B, C_{0} = 0) \tag{1}$$

$$(C^{1}, S^{1}) = ADD(A, B, C_{0} = 1) \tag{2}$$

Thus, when the carry-in of the group is known, one of the above two functions is selected as

$$(C, S) = \begin{cases} (C^{0}, S^{0}) & \text{if } C_{0} = 0 \\ (C^{1}, S^{1}) & \text{if } C_{0} = 1 \end{cases} \tag{3}$$
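The selection rule of Eqs. (1) to (3) can be illustrated with a small behavioural model. The following is a minimal sketch, not the circuit evaluated in this paper: the function names `ripple_add` and `carry_select_add` and the `block_sizes` parameter are hypothetical, and integers stand in for bit vectors.

```python
def ripple_add(a, b, cin, width):
    """Add two width-bit integers with a carry-in; return (carry_out, sum)."""
    total = a + b + cin
    return total >> width, total & ((1 << width) - 1)


def carry_select_add(a, b, block_sizes, cin=0):
    """Carry-select addition over blocks listed from LSB to MSB.

    block_sizes is a hypothetical parameter: a list of block widths, which
    may be equal (fixed blocks) or different (variable blocks).
    """
    result, shift, carry = 0, 0, cin
    for width in block_sizes:
        mask = (1 << width) - 1
        a_blk, b_blk = (a >> shift) & mask, (b >> shift) & mask
        # Eq. (1) and (2): both candidates are computed before the carry is known.
        c0, s0 = ripple_add(a_blk, b_blk, 0, width)
        c1, s1 = ripple_add(a_blk, b_blk, 1, width)
        # Eq. (3): the incoming carry acts as the multiplexer select.
        carry, s = (c1, s1) if carry else (c0, s0)
        result |= s << shift
        shift += width
    return carry, result
```

In hardware the two candidate additions of every block run concurrently, so only the multiplexer selection lies on the carry path; the sequential loop above models the function, not the timing.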

### B. Carry Skip Adder

The basic principle behind the carry skip adder is that the adder is divided into groups of m bits. The carry input for group j+1 is determined by one of the following two conditions:

The carry is propagated by group j, so that the carry-out of group j equals the carry-in of that group. This happens only if the sum of the inputs to that group equals $2^{m} - 1$, i.e. every bit position has exactly one input bit set. If x(j) and y(j) are the integers corresponding to these inputs, the group propagate signal is:

$$P(j) = \begin{cases} 1 & \text{if } x(j) + y(j) = 2^{m} - 1 \\ 0 & \text{otherwise} \end{cases} \tag{4}$$

The carry is not propagated by the group, i.e., it is either killed or generated inside the group.
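The two conditions can be checked with a short behavioural sketch. It assumes the standard group propagate condition, namely that every bit position of the m-bit group has exactly one input bit set, which is equivalent to x(j) + y(j) = 2^m - 1. The function names are hypothetical.

```python
def group_propagate(x, y, m):
    """Return 1 when every bit position of an m-bit group propagates a carry."""
    all_ones = (1 << m) - 1
    x, y = x & all_ones, y & all_ones
    # Bitwise view: each position has exactly one input bit set.
    bitwise = (x ^ y) == all_ones
    # Arithmetic view: the two group operands sum to 2**m - 1.
    arithmetic = (x + y) == all_ones
    assert bitwise == arithmetic
    return int(bitwise)


def group_carry_out(x, y, cin, m):
    """Carry out of an m-bit group, decided without rippling through it."""
    all_ones = (1 << m) - 1
    x, y = x & all_ones, y & all_ones
    if group_propagate(x, y, m):
        return cin          # first condition: the carry skips the group
    return (x + y) >> m     # second condition: generated (1) or killed (0)
```

When the group does not propagate, its carry-out does not depend on the carry-in, which is why the second branch can ignore `cin`.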

### C. Comparison

Carry skip and carry select adders provide a trade-off between delay and area. Which of the two should be used depends on the application; neither is preferred in general. For a high-speed circuit where area is not a concern, the carry select adder may be used. For applications where area is of utmost importance and lower speed is tolerable, the carry skip adder is a good choice. In the carry skip adder, area is a linear function of the number of bits n to be added, $f(n)$, and delay is a function of the square root of the number of bits, $f(\sqrt{n})$, while the carry select adder has $f(n \log n)$ area and $f(\log n)$ delay. Hence, these adders provide a good compromise in terms of area and delay.

Carry skip and carry select adders can use fixed or variable block sizes. The main aim of the adder design is to find the optimal bit partitioning that balances the propagation delay of the inputs to the carry chain. Dividing the adder into variable-sized blocks that balance these delays reduces both delay and power dissipation. In a fixed block size adder all blocks have the same length, whereas in a variable block size adder the block size depends on the position of the block in the adder. The main design problem is to work out how best to group the bits into blocks. Smaller blocks are preferred at the LSB end of the adder for fast carry generation, so that the following blocks receive their carry as soon as possible and sum generation can take place earlier.

## V. PROPOSED LOGIC

A new topology for the carry select adder and the carry skip adder is proposed here, which reduces the delay considerably and hence increases the performance of the adders. A comparison between the two has also been carried out.

### A. Carry Select Adder

Our aim was to synthesize a carry select adder that is optimized for delay, power, and area. A new logic is proposed for the carry select adder that reduces the delay in comparison with the existing carry select adder, and this methodology shows better results than the existing ones. It is based on the observation that the sum for a carry-in of '0' is the complement of the sum for a carry-in of '1'. The logic of the circuit shown in Figure 1 has been developed on this basis. The sum for a carry-in of '0' is given as:

$$Sum^0 = (A \oplus B) \tag{5}$$

And for carry-in of '1' we have:

$$Sum^{1} = (A \oplus B)^{\prime} \tag{6}$$



Figure 1: Proposed Basic carry select adder Cell

Both $Sum^0$ and $Sum^1$ are fed to a multiplexer whose strobe (select input) is the previous carry; this multiplexer gives the desired sum. The carry-out is obtained from a second multiplexer whose inputs are 'A' and the previous carry 'C<sub>in</sub>', with Sum<sup>0</sup> as its strobe.
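The behaviour of this cell can be checked against the full adder truth table. The sketch below is a behavioural model only: the names `mux` and `proposed_cell` are hypothetical, and the multiplexer polarity (strobe 0 selects the first data input) is an assumption chosen so that the carry-out equals 'A' when Sum<sup>0</sup> is 0 and C<sub>in</sub> when Sum<sup>0</sup> is 1.

```python
def mux(sel, d0, d1):
    """2:1 multiplexer: returns d0 when sel is 0 and d1 when sel is 1."""
    return d1 if sel else d0


def proposed_cell(a, b, cin):
    """One-bit cell following Eqs. (5) and (6); returns (carry_out, sum)."""
    sum0 = a ^ b                # Eq. (5): sum for carry-in 0
    sum1 = sum0 ^ 1             # Eq. (6): sum for carry-in 1 is the complement
    s = mux(cin, sum0, sum1)    # previous carry selects the sum
    cout = mux(sum0, a, cin)    # Sum0 selects between A and the previous carry
    return cout, s


# Exhaustive comparison with ordinary one-bit addition.
for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            total = a + b + cin
            assert proposed_cell(a, b, cin) == (total >> 1, total & 1)
```

The carry multiplexer works because Sum<sup>0</sup> = 1 means the inputs differ, so the incoming carry passes through, while Sum<sup>0</sup> = 0 means both inputs are equal and the carry-out equals either of them.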

If the operands are divided into blocks of equal or different lengths, the delay of the adder can be reduced by a proper arrangement of the blocks. It has been observed that the delay can be reduced considerably by using blocks of variable length.

A further reduction in delay, and hence in the PDP of the carry select adder, is obtained by using the universal NAND gate, which has a lower delay than other gates. Replacing all the logic elements of Figure 1, including the multiplexers, by NAND gates therefore reduces the delay of the proposed logic by a substantial amount.
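As an illustration of the NAND-only idea, the sketch below builds the XOR and the 2:1 multiplexer from NAND gates using the common textbook decompositions and checks the resulting cell exhaustively. This is an assumption for illustration; the exact gate arrangement used in the optimized circuit is not reproduced here, and all function names are hypothetical.

```python
def nand(x, y):
    """Two-input NAND on single bits."""
    return 1 - (x & y)


def xor_nand(a, b):
    """XOR from four NAND gates (textbook decomposition)."""
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


def mux_nand(sel, d0, d1):
    """2:1 multiplexer from four NAND gates: d0 when sel is 0, d1 when sel is 1."""
    sel_n = nand(sel, sel)                        # inverter
    return nand(nand(d0, sel_n), nand(d1, sel))


def nand_cell(a, b, cin):
    """One-bit carry select cell using only NAND gates; returns (carry_out, sum)."""
    sum0 = xor_nand(a, b)             # sum for carry-in 0
    sum1 = nand(sum0, sum0)           # complement: sum for carry-in 1
    s = mux_nand(cin, sum0, sum1)     # previous carry selects the sum
    cout = mux_nand(sum0, a, cin)     # Sum0 selects between A and the carry
    return cout, s


for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            total = a + b + cin
            assert nand_cell(a, b, cin) == (total >> 1, total & 1)
```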

The circuit diagram of the optimized proposed logic is shown in Figure 2. Here each multiplexer used to select the sum and carry is implemented with NAND gates, which further reduces the delay of the proposed logic. In addition, proper arrangement of blocks reduces glitches, which are spurious transitions that occur as a result of the finite propagation delay from one logic block to the next. Because of these glitches, a node can exhibit multiple transitions in a single clock cycle before settling to the correct logic level.



Figure 2: Optimized carry select adder Cell



Figure 3: Proposed carry skip adder

### B. Carry Skip Adder

The logic diagram for the carry skip adder is shown in Figure 3. The proposed logic uses two types of full adder. The first generates the carry signal, the propagate signal, and the sum output; the second generates, in addition to these outputs, the generate and kill signals.

The latter full adder, with its two additional outputs, is used only as the last block of any carry skip adder. It produces the generate output if both of its inputs are '1' and the kill signal if both of its inputs are '0'. These two outputs are ORed together. Either one of the two outputs is '1' or the propagate signal is '1'. This OR gate selects one of the two signals, viz. the generate or the kill signal. The output of this OR gate is further ORed with the output of the AND gate that is used to predict the block propagate signal. Of the three signals, one may be '1'; that signal is further ORed with the carry that is propagated through the adder as ripple carry, and hence the exact carry can be predicted.

Further reduction in delay can be achieved by introducing "multilevelism". An adder is called multilevel if more than one level is introduced in the adder. In two-level carry skip adders, not only are bits grouped into blocks, but blocks are further grouped into several disjoint sections. If every block in a section contains only bits whose propagate signal is '1', the whole section is skipped in a single step, which further reduces the delay. Adders with up to three levels have already been implemented. The proposed logic gives a lower delay for fixed block size adders with few bits. Larger blocks of 8 bits and 12 bits can therefore reduce the delay if they are built from sub-blocks of the proposed 2-bit fixed block size adder. The 8-bit block is divided into four 2-bit adders, each with its own skip mechanism, and the skip mechanism of the 8-bit block itself acts as the second-level skip. Similarly, the 12-bit block is divided into six blocks of 2 bits each, and the skip mechanism of the 12-bit block acts as the second level of skip. Levels are thus introduced within the block, which reduces the delay. Hence multilevelism of order two has been implemented in this adder. A block diagram of the multilevelism is shown in Figure 4.
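The two-level arrangement described above can be sketched as a behavioural model. It is a functional illustration under stated assumptions, not the synthesized circuit: the names `block_signals` and `two_level_skip_add` are hypothetical, integers stand in for bit vectors, and the loop shows which signal decides each carry rather than the actual timing.

```python
def block_signals(a, b, width):
    """Return (propagate, internal_carry) for the low width bits of a and b."""
    mask = (1 << width) - 1
    a, b = a & mask, b & mask
    propagate = (a ^ b) == mask          # every bit pair propagates
    internal_carry = (a + b) >> width    # generated (1) or killed (0)
    return propagate, internal_carry


def two_level_skip_add(a, b, n_bits, sub=2, group=8, cin=0):
    """Two-level carry skip addition: sub-bit blocks inside group-bit blocks."""
    assert n_bits % group == 0 and group % sub == 0
    sub_mask = (1 << sub) - 1
    carry, result = cin, 0
    for g in range(0, n_bits, group):
        ga, gb = a >> g, b >> g
        local = carry                    # carry entering this group
        for s in range(0, group, sub):
            total = ((ga >> s) & sub_mask) + ((gb >> s) & sub_mask) + local
            result |= (total & sub_mask) << (g + s)
            p, c = block_signals(ga >> s, gb >> s, sub)
            local = local if p else c    # first-level skip over a sub-block
        gp, gc = block_signals(ga, gb, group)
        carry = carry if gp else gc      # second-level skip over the group
    return carry, result
```

For example, `two_level_skip_add(a, b, 16)` models a 16-bit adder built from two 8-bit blocks of four 2-bit sub-blocks each, and passing `group=12` with `n_bits=24` models the 12-bit blocks of six 2-bit sub-blocks.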



Figure 4: Block diagram showing multilevelism in carry skip adder

## VI. SIMULATION RESULTS

In this section the simulation results of the carry select and carry skip adders are discussed and compared. The adders are implemented in Altera's Quartus-II [22] simulator using the Stratix-III device family, which is built on a 65 nm technology node. Simulation results for the optimized adders for various bit lengths are given in Table 1. Delay was plotted against the number of bits for bit lengths of 8, 16, 24, 32, 64, and 128 bits. From Table 1 it is evident that the delay of the carry select adder is lower than that of the carry skip adder at each of the listed bit lengths.

Table 1: Delays of carry select adder and carry skip adder

| DELAY              | 32 BITS | 64 BITS | 128 BITS |
|--------------------|---------|---------|----------|
| CARRY SELECT ADDER | 5.25    | 6.29    | 7.74     |
| CARRY SKIP ADDER   | 9.37    | 8.6     | 12.19    |

A comparison of delays in the proposed logic for fixed and variable size blocks is shown in Figure 5. From the plot it is evident that, for lower bit lengths, the delay is almost the same for fixed and variable size blocks, so the choice between them makes no difference in performance. For higher bit lengths, however, the delay is lower for the carry select adder based on variable-sized blocks, although the difference is very small.



Figure 5: Comparison of delays in proposed logic for fixed and variable size blocks carry select adder

Figure 6 shows the delay of three different topologies of the carry select adder: the existing logic, the proposed logic, and the optimized proposed logic. From the plot it is evident that for lower bit lengths, up to approximately 24 bits, the delays of the existing and proposed logic almost coincide. For higher bit lengths, the delay of the proposed logic is lower than that of the existing logic.



Figure 6: Comparison of delays in existing, proposed and optimized proposed carry select adder



Figure 7: Proposed carry skip adder showing results of multilevelism



Figure 8: Comparison of delays in Existing and proposed carry skip adder


Figure 7 shows the delay of the carry skip adder for three different levels of multilevelism, i.e. the third, fourth, and fifth levels. From this graph it is clear that the delay for the fifth level is smaller than for the third level. Figure 8 shows the delay difference between the existing and proposed topologies of the carry skip adder. Figure 9 shows the difference in delay between the carry select adder and the carry skip adder with reference to Table 1. From this graph it is clear that the delay of the proposed carry select adder is less than that of the proposed carry skip adder.



Figure 9: Comparison of delays in carry select and carry skip adder

## VII. POWER-DELAY PRODUCT

Table 2 shows the Power-Delay Product (PDP) of the 32-bit carry select and carry skip adders. These results show that the PDP of the proposed carry select adder topology is lower than that of the proposed carry skip adder topology, so the carry select adder is better than the carry skip adder in terms of PDP.

Table 2: Comparison of power-delay product of 32-bit carry select and carry skip adder

|     | CARRY SELECT ADDER | CARRY SKIP ADDER |
|-----|--------------------|------------------|
| PDP | 1872.15            | 3206.49          |
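The relative figures implied by Tables 1 and 2 can be worked out with a few lines of arithmetic. The sketch below only restates the tabulated values; the variable names are hypothetical, and because the tables give no units, only ratios are computed.

```python
# Delays from Table 1 and 32-bit PDP values from Table 2 (units as in the tables).
delay = {
    "carry_select": {32: 5.25, 64: 6.29, 128: 7.74},
    "carry_skip": {32: 9.37, 64: 8.6, 128: 12.19},
}
pdp_32 = {"carry_select": 1872.15, "carry_skip": 3206.49}

# Ratio of carry skip delay to carry select delay at each tabulated bit length.
delay_ratio = {
    n: delay["carry_skip"][n] / delay["carry_select"][n] for n in (32, 64, 128)
}

# Ratio of the 32-bit power-delay products.
pdp_ratio = pdp_32["carry_skip"] / pdp_32["carry_select"]

# PDP = power x delay, so the power implied by the tables is PDP / delay.
implied_power_32 = {name: pdp_32[name] / delay[name][32] for name in pdp_32}

for n, r in delay_ratio.items():
    print(f"{n}-bit delay ratio (skip/select): {r:.2f}")
print(f"32-bit PDP ratio (skip/select): {pdp_ratio:.2f}")
```

On these numbers the 32-bit carry skip adder has roughly 1.7 times the PDP of the carry select adder, which is close to its delay ratio at the same bit length.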

## VIII. CONCLUSION

This work has presented and verified a new methodology for carry select and carry skip adders. Among the adders developed so far, the carry select adder has faster speed and the carry skip adder has better area overhead. To increase the performance of the proposed topologies, multilevelism was used for the carry skip adder to improve its speed, and universal gates were used for the carry select adder, replacing each multiplexer of the circuit with NAND gates. These methodologies have been implemented in Altera's Quartus-II simulator using the Stratix-III family in 65 nm technology. Simulations have been carried out under various configurations for different bit lengths. The concept of fixed and variable size blocks has also been introduced, and the results have been analyzed for different bit lengths. Timing analysis reveals that the proposed logic is better than the existing logic in both cases, and the optimization of the proposed logic reduces the delay further. Finally, the carry select adder is found to be better in terms of speed than the carry skip adder.

## ACKNOWLEDGMENT

This work has been carried out in the SMDP-II VLSI laboratory of the Electronics and Communication Engineering Department of the National Institute of Technology Srinagar, India. The SMDP-II VLSI project is funded by the Ministry of Communication and Information Technology, Government of India. The authors are grateful to the Ministry for the facilities provided under this project.

## REFERENCES

- [1] Bedrij, J. O. "Carry Select Adder," *IRE Transactions on Electronic computers*, vol.11, pp. 340-346, 1962.
- [2] Chang, T. Y. and Hsiao, M. J., "Carry Select Adder using single Ripple Carry Adder," *Electronics Letters*, vol. 34, pp. 2101-2103, 1998.
- [3] Hashemian, R., "A new design for high speed and high density carry select Adders," *Proceedings of IEEE Midwest Symposium on Circuits and Systems*, 2000, pp. 1300-1303.
- [4] Kim, Y. and Kim, L. S., "A low power Carry Select Adder with Reduced area," *Proceedings of IEEE Symposium on Circuits and Systems*, 2001, vol.4, pp. 218-221.
- [5] Amelifard, B., Fallah, F., and Pedram, M., "Closing the gap between Carry Select Adder and ripple carry adder: A new class for low power high performance Adders," *Proceedings* of 6<sup>th</sup> International Symposium on Quality of Electronic Design, 2005, pp. 148-152.
- [6] Kwon, O., Swartzlander, E., and Nowka, K., "A Fast hybrid Carry look ahead / Carry Select Adder design," *Proceedings* of 11<sup>th</sup> Great Lakes Symposium on VLSI, 2001, pp. 149-152.
- [7] Jeon, W., Roy, K., and Koh, C. K., "High performance Low-Power Carry Select Adders using dual transistor skewed logic," *Proceedings of IEEE Symposium on Solid State Circuits*, 2001, pp. 145-148.

- [8] Alioto, M., Palumbo, G., and Poli, M., "A Gate level strategy to design Carry Select Adders," *Proceedings of IEEE Symposium on Circuits and Systems*, 2004, vol. 2, pp. 465-468.
- [9] Kim, Y., Sung K. H., and Kim, L. S., "A 1.67 GHz 32-bit pipelined Carry Select Adder using the complementary scheme," *Proceedings of IEEE Symposium on Circuits and Systems*, 2002, vol. 1, pp. 461-464.
- [10] Li, J. F., Kuo, Y. C., Huang, C. D., Tseng, T. W., and Wey, C. L., "Design of Reconfigurable Carry Select Adders," *IEEE Asia-Pacific Conference on Circuits and Systems*, 2004, pp. 825-828.
- [11] He, Y., Chang, C-H, and Gu, J., "An Area Efficient 64-bit Square Root Carry Select Adder for Low Power Applications," *Proceedings of IEEE Symposium on Circuits and Systems*, 2005, vol. 4, pp. 4082-4085.
- [12] Rawat, K., Darwaish, T., and Bayoumi, M., "A Low Power and Reduced Area Carry Select Adder," *Proceedings of IEEE Symposium on Circuits and Systems*, 2002, vol. 1, pp. 467-469.
- [13] Turrini, S., "Optimal Group Distribution in Carry Adders," Proceedings of 9<sup>th</sup> IEEE Symposium on Computer Arithmetic, Santa Monica CA, 1989, pp. 96-103.
- [14] Yu, C. C., Lin C. S., and Liu, B. D., "A Generalized Block Algorithm for Fast Carry Skip Adder Design," *Proceedings of IEEE TENCON Conference*, 1999, pp. 844-890.
- [15] Chirca, K., Schulte, M., Glossner, J., Wang, H., Mamidi, S., Balzola, P., and Vassiliadis, S., "A Static Low Power High-Performance 32-bit Carry Skip Adder," *Proceedings of IEEE Symposium on Digital System Design*, 2004, pp. 615-619.
- [16] Kantabutra, V., "Designing Optimum One-Level Carry Skip Adders," *IEEE Transactions on Computers*, vol. 42, pp. 759-764, 1993.
- [17] Kantabutra, V., "Accelerated Two-Level Carry Skip Adders-A Type of Very Fast Adders," *IEEE Transactions on Computers*, vol. 42, pp. 1389-1393, 1993.
- [18] Turrini, S., and Menon, S., "A Fully Static Low Power, High Performance 64 bit 3-Level Carry Skip Adder," *IEEE Symposium on Low Power Electronics*, 1994, pp. 68-69.
- [19] Goel, A. K., and Bapat, P. S., "A New Time-Position Algorithm for the Modelling of Multilevel Carry Skip Adders in VHDL," *IEEE Canadian Conference on Electrical and Computer Engineering*, 1996, pp. 158-161.
- [20] Gayles, E. S., Owens, R. M., Irwin, M. J., "Low Power Circuit Techniques for Fast Carry Skip Adders," *IEEE Midwest Symposium on Circuits and Systems*, 1996, pp. 87-90.
- [21] Cha, M., and Swartzlander, Jr E. E., "Modified Skip Adder for Reducing First Block Delay," *Proceedings of 43<sup>rd</sup> IEEE Midwest Symposium on Circuits and Systems MI*, 2000.
- [22] Altera's Design Software Suite, "Quartus-II Simulator," Altera Corporation, San Jose, CA 95134, 2008.

ICFoCS-2011
