Quantum cellular automata (QCA) are a promising alternative to
conventional CMOS technology because of their low power consumption, small area, and high-speed operation. This paper describes a synthesizable QCA implementation of squaring. The squaring algorithm is constructed from a Vedic sutra. Based on the concept of this sutra, the paper presents 2-bit and
4-bit square units built from QCA logic gates. Squaring is an operation on which
the area of many circuits depends, so a compact QCA-based square unit is important for the
miniaturization of devices. The proposed square units use significantly lower QCA parameters
than other competitive circuits such as Wallace, Dadda, serial-parallel,
and Baugh-Wooley.
2. 24 B. K. Bhoi et al.
CMOS technology. However, when the dimensions of MOS transistors are scaled down to
the nanometer range, the design faces two important problems: (i) tunneling effects
occur, which change the functionality of the design, and (ii) because of wire
resistance and capacitance, the interconnections do not scale automatically
[1]. There are two alternative approaches for solving these problems
of CMOS technology: (i) new transistor-based devices such as the tunnel
FET (T-FET), single-electron transistor (SET), and carbon nanotube FET (CNT-FET),
and (ii) alternatives to transistor-based devices [2]. The first approach is
suitable for implementing a single computation unit, but the integration of
several computational blocks remains a challenge [2]. Within the second
approach, quantum cellular automata (QCA) is a new and efficient technology at
the nanoscale level. QCA was introduced in [3] to construct classical cellular
automata with the help of quantum dots (q-dots); the name QCA differentiates it from
other models of cellular automata performing digital computation.
QCA is a promising technology that offers high density and low power with high
performance for digital circuits. Unlike CMOS technology, QCA involves no physical
transport of charge, because the Coulombic force alone is responsible for the interaction between
QCA cells. Thus, QCA emerges as a possible alternative to CMOS technology.
The primary advantage of QCA is that it can represent a data bit in
only a small area: a QCA cell holds two electrons whose two polarizations (P = –1
and P = +1) represent logic “0” and logic “1”, respectively. In CMOS transistor
technology only the base layer is treated as an active layer, whereas in QCA all
layers can be used as active layers on which a design can be constructed.
In digital design, the most important computational units are binary addition and
multiplication. Squaring is also one of the most important operations in many
cryptographic algorithms and in high-performance computing. Normally the binary
square of a number is calculated using a multiplier; however, many dedicated
squaring techniques are also presented in the literature [4, 5]. Vedic mathematics is
built on various Vedic sutras (or aphorisms) that have been applied to fast
multipliers and other arithmetic operations. The importance of Vedic
mathematics lies in the fact that it reduces large calculations in conventional
mathematics to very simple ones [6, 7].
The paper is outlined as follows: Sections 2 and 3 discuss the QCA computing
paradigm and existing work. In Sect. 4, the square circuit is constructed and
explained. QCA design, simulation outcomes, and parameter comparison are discussed
in Sect. 5. Finally, the conclusion is presented in Sect. 6.
2 Preliminaries
Figure 1a shows a QCA cell and its two polarizations, P = –1 and P = +1. Figure 1c
and d show the inverter and the three-input majority gate. Clocking has
a very important impact on every QCA logic design. Clocking in QCA
not only controls data transitions but also supplies power to the circuit [2].
3. A Novel and Efficient Design for Squaring Units … 25
Fig. 1 QCA design a Cell polarization –1 and +1, b wires, c inverter, d three-input (A, B, C)
majority gate
The most widely used clocking scheme is the four-phase clock: each cell is clocked by one of four
clock signals, and each signal is shifted in phase by 90° from the previous one.
The four phases of the clock (switch, hold, release, and relax) govern the data flow. In the switch phase, cells
start unpolarized with low potential barriers, and the barriers are raised. The
barriers are kept high in the hold phase and are lowered in the release phase. The barriers
remain lowered in the relax phase, which leaves the cells in an unpolarized state. The
logic transition of data occurs during the switch phase.
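The logic primitives and the clock phases described above can be modeled at the Boolean level with a short Python sketch. It is only an illustration: the function names and the `phase_of_zone` helper are hypothetical, and building AND and OR by tying one majority input to 0 or 1 is the standard QCA construction rather than something stated in this section.

```python
# Boolean-level model of the QCA primitives in Fig. 1 (illustrative only).

def majority(a: int, b: int, c: int) -> int:
    """Three-input majority voter: output is 1 when at least two inputs are 1."""
    return (a & b) | (b & c) | (a & c)

def inverter(a: int) -> int:
    """QCA inverter modeled as a logical NOT on a single bit."""
    return a ^ 1

def and_gate(a: int, b: int) -> int:
    """AND obtained by fixing one majority input to logic 0 (assumed construction)."""
    return majority(a, b, 0)

def or_gate(a: int, b: int) -> int:
    """OR obtained by fixing one majority input to logic 1 (assumed construction)."""
    return majority(a, b, 1)

# The four clock phases in the order a cell passes through them.
CLOCK_PHASES = ("switch", "hold", "release", "relax")

def phase_of_zone(zone: int, step: int) -> str:
    """Phase of clock zone `zone` at time step `step`.

    Each zone lags the previous one by a quarter cycle (90 degrees),
    so zone k+1 switches while zone k holds its value.
    """
    return CLOCK_PHASES[(step - zone) % 4]
```

For example, `phase_of_zone(0, 1)` is `"hold"` while `phase_of_zone(1, 1)` is `"switch"`, which mirrors how data moves from one clock zone to the next.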
3 Existing Squaring Circuits
Many state-of-the-art designs using Vedic techniques have been presented for
various platforms, such as microprocessors and FPGAs [8–14]. In [8], a
Vedic multiplier is implemented on the 8085 and 8086 microprocessors and compared with
the conventional multiplier. In [9], the multiplier architecture uses the
vertical and crosswise algorithm “Urdhva Tiryagbhyam” of Vedic mathematics. Implemented
on an FPGA, this design is faster than a fast Booth multiplier.
In [10], the multiplier architecture uses the “Nikhilam Sutra” of
Vedic mathematics. This architecture finds the complement of a large
number with respect to its nearest base to perform the multiplication. Therefore, the
multiplication of two large numbers is reduced to the multiplication of their
complements and an addition. The work in [10] is extended in [11] by adding a carry
save adder to the Vedic multiplier architecture, which reduces the propagation delay
significantly. Both Vedic multipliers [10, 11] are synthesized and simulated using
Xilinx ISE 10.1 software and also implemented on FPGA devices. On the FPGA
platform, squaring architectures using Vedic mathematics are also proposed in [12,
13]. Kasliwal et al. [12] presented squaring units in VHDL that use concurrent operation
of the multiplier (Vedic multiplier) and the addition. The authors
compared the results with the conventional Booth's algorithm in terms of time delay
and area on a Xilinx Virtex 4vlx15sf36-12 device. The study in [13]
shows the logical implementation of a squaring architecture using a Vedic sutra,
targeting hardware such as FPGAs.
The use of Vedic mathematics on the QCA platform has not been reported previously,
which motivates us to bring the advantages of Vedic mathematics to the QCA platform. This
paper presents a multiplier-less squaring design on the QCA platform using the Yavadunam
Sutra of Vedic mathematics. The design of the proposed architecture is derived from
ancient Indian Vedic mathematics [6, 7]. The meaning of the “Yavadunam” Sutra
is “whatever the deficiency, subtract that deficit from the number and
write alongside the square of that deficit”. The Yavadunam algorithm reduces the
square of a large operand to the square of a smaller-magnitude operand
plus an addition operation [14].
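The sutra can be read as an algebraic identity. Writing the operand as N = B − d, where B is a nearby base and d is the deficit, gives N² = B·(N − d) + d², so the left part of the result is N − d and the right part is d². The following Python sketch illustrates this reading; the function name `yavadunam_square` and the choice of base are assumptions made for illustration, not part of the cited works.

```python
def yavadunam_square(n: int, base: int) -> int:
    """Square n using its deficit from `base` (illustrative sketch).

    Follows the identity n^2 = base * (n - d) + d^2 with d = base - n.
    """
    deficit = base - n                 # "whatever the deficiency"
    left_part = n - deficit            # subtract the deficit from the number
    right_part = deficit * deficit     # square of the deficit, written alongside
    return left_part * base + right_part

# Decimal example: 98 is 2 short of 100, so 98^2 = (98 - 2) | 2^2 = 96 | 04.
print(yavadunam_square(98, 100))
```

With a power-of-two base, the multiplication by the base becomes a shift and "writing alongside" becomes bit concatenation, which is the form used for binary operands.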
4 The Proposed Efficient QCA Squaring Unit
Table 1 presents the algorithm for squaring a binary operand reported in [13] for
the FPGA platform. Case 1 of step 4 in the algorithm covers the range of deficits
that avoids the extra binary addition and leads to a reduced-bit
multiplication. The new squaring architecture retains the benefit of Vedic
mathematics, in which complex calculations are reduced to very simple
ones. Here, the proposed algorithm is used to design the 4-bit squaring unit on the
QCA platform.
According to the algorithm described above, the design of the 4-bit square
unit requires a 2-bit square unit, a complement unit, and a left shifter. The proposed
design is further simplified by using a dedicated (i) 2-bit square unit and (ii) 2s complement
unit. Table 2 shows the squaring operation and the 2s complement operation
for the 2-bit binary operand. For the square operation (2 bits), the Boolean output P is
expressed as: $p_3 = a_1 \land a_0$ (I), $p_2 = a_1 \land \lnot a_0$ (II), $p_1 = 0$ (III), $p_0 = a_0$ (IV).
Similarly, for the 2s complement operation (2 bits), the Boolean output B is expressed
as: $b_1 = a_1 \oplus a_0$ (V), $b_0 = a_0$ (VI). The proposed 2-bit square requires only two AND gates
and one inverter, resulting in a very small cell count and area for the overall design.
Figures 2a and 3a show the circuit diagrams for the proposed 2-bit square design
and 4-bit square design, respectively. The layouts of the new designs are shown in
Figs. 2b and 3b.
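The Boolean expressions for the 2-bit square and the 2-bit 2s complement can be checked against Table 2 with a short Python sketch. The function names are hypothetical, and the code models only the logic, not the QCA layout or its timing.

```python
def square_2bit(a1: int, a0: int) -> tuple:
    """2-bit square: returns (p3, p2, p1, p0) from two ANDs and one inverter."""
    p3 = a1 & a0
    p2 = a1 & (a0 ^ 1)   # a1 AND (NOT a0)
    p1 = 0               # bit 1 of a 2-bit square is always 0
    p0 = a0
    return p3, p2, p1, p0

def twos_complement_2bit(a1: int, a0: int) -> tuple:
    """2-bit 2s complement: returns (b1, b0)."""
    b1 = a1 ^ a0
    b0 = a0
    return b1, b0

# Exhaustive comparison with ordinary integer arithmetic.
for a in range(4):
    a1, a0 = (a >> 1) & 1, a & 1
    p3, p2, p1, p0 = square_2bit(a1, a0)
    b1, b0 = twos_complement_2bit(a1, a0)
    assert (p3 << 3 | p2 << 2 | p1 << 1 | p0) == a * a
    assert (b1 << 1 | b0) == (-a) % 4
```

The loop runs without error for all four operands, which agrees with the rows of Table 2.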
Example 1 presents the square operation for the 4-bit operand (1110), evaluated
in four steps. In step I, the 2s complement of the lower two bits of the input
operand is obtained using equations (V, VI). In step II, the 2-bit result of step I is squared
using equations (I–IV) to give the RPR of the final result. In step III, the LPR is
Table 1 Algorithm for squaring of a binary operand reported in [13]
1. Initialize n-bit binary number a;
2. Complement of a: ā = not(a);
3. Calculate the 2's complement of a: compa = ā + 1;
4. CASE 1:
   If compa(n-1 downto n/2) = zero then
      deficit = compa(n/2-1 downto 0);
      RPR = deficit * deficit;
      LPR = a left shifted by 1 bit;
   CASE 2:
   temp: temporary signal;
   If compa(n-1 downto n/2) /= zero then
      deficit = compa;
      temp = deficit * deficit;
      RPR = temp(n-1 downto 0);
      LPR = a left shifted by 1 bit + temp(2n-1 downto n);
5. Final square result P is obtained as
   P = LPR & RPR (concatenation);
Table 2 Squaring and 2s complement operation of 2-bit binary operand
Operand A (a1 a0)    Square result P (p3 p2 p1 p0)    2s complement result B (b1 b0)
0 0                  0 0 0 0                          0 0
0 1                  0 0 0 1                          1 1
1 0                  0 1 0 0                          1 0
1 1                  1 0 0 1                          0 1
obtained by a one-bit left shift of the input operand. Finally, the
LPR and RPR together give the final square result.
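The algorithm of Table 1 can be sketched in Python as follows. This is an interpretation under stated assumptions, not the implementation of [13]: the operand is treated as unsigned, n is taken to be even, the left shift and the case 2 addition are truncated to n bits (Table 1 does not state the width of LPR), and the function name `vedic_square` is hypothetical.

```python
def vedic_square(a: int, n: int) -> int:
    """Square an unsigned n-bit operand following the steps of Table 1 (sketch)."""
    if n % 2 != 0 or not 0 <= a < (1 << n):
        raise ValueError("expects an even width n and 0 <= a < 2**n")
    mask = (1 << n) - 1
    half = n // 2

    # Steps 2-3: 2's complement of a, kept to n bits.
    compa = ((~a) + 1) & mask

    # One-bit left shift of a, truncated to n bits (assumed width of LPR).
    lpr = (a << 1) & mask

    if (compa >> half) == 0:
        # Case 1: the deficit fits in n/2 bits, so its square fits in n bits
        # and no extra addition is needed.
        deficit = compa & ((1 << half) - 1)
        rpr = deficit * deficit
    else:
        # Case 2: the square of the deficit needs 2n bits; its upper half
        # is added to the shifted operand (modulo 2**n, an assumption).
        deficit = compa
        temp = deficit * deficit
        rpr = temp & mask
        lpr = (lpr + (temp >> n)) & mask

    # Step 5: concatenate LPR and RPR.
    return (lpr << n) | rpr

print(format(vedic_square(0b1110, 4), "08b"))
```

Under these assumptions the sketch agrees with ordinary multiplication for every operand of width 2, 4, and 8. Only case 1 avoids the full-width multiplication, which is the case the proposed 4-bit design builds on.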
Example 1: Squaring algorithm
Step-I
For the 4-bit number a = 14 (“1110”), the 2s complement operation on the lower
2 bits ($a_1 a_0$ = “10”) gives the Boolean output B:
$b_1 = a_1 \oplus a_0 = 1$, $b_0 = a_0 = 0$
Step-II
For the square operation on $b_1 b_0$ = “10” (2 bits), the Boolean output P is expressed as:
$p_3 = b_1 \land b_0 = 0$, $p_2 = b_1 \land \lnot b_0 = 1$, $p_1 = 0$, $p_0 = b_0 = 0$
So RPR (right part of result) = $p_3 p_2 p_1 p_0$ = “0100”
Fig. 2 Proposed two-bit square architecture a Logic diagram. b Layout design
Fig. 3 Proposed four-bit square architecture a Logic diagram. b Layout design
Step-III
One-bit left shift of the 4-bit number ($a_3 a_2 a_1 a_0$ = “1110”):
$p_7 p_6 p_5 p_4$ = “1100” as LPR (left part of result)
Step-IV
The concatenation of LPR and RPR gives the square result:
LPR RPR = “1100 0100” = $(196)_{10}$.
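The four steps of Example 1 can be traced with a small Python sketch of the simplified 4-bit datapath (2-bit 2s complement, 2-bit square, left shift, concatenation). The function name is hypothetical, and the sketch models only case 1 of Table 1, in which the deficit fits in two bits; it is checked here only for the operands 13, 14, and 15.

```python
def square_4bit_case1(a: int) -> int:
    """Trace of Example 1 for a 4-bit operand with a 2-bit deficit (sketch)."""
    a1, a0 = (a >> 1) & 1, a & 1

    # Step I: 2s complement of the lower two bits, equations (V, VI).
    b1, b0 = a1 ^ a0, a0

    # Step II: 2-bit square of b1 b0, equations (I-IV), gives the RPR.
    p3, p2, p1, p0 = b1 & b0, b1 & (b0 ^ 1), 0, b0
    rpr = (p3 << 3) | (p2 << 2) | (p1 << 1) | p0

    # Step III: one-bit left shift of the operand, kept to 4 bits, gives the LPR.
    lpr = (a << 1) & 0b1111

    # Step IV: concatenate LPR and RPR into the 8-bit result p7..p0.
    return (lpr << 4) | rpr

result = square_4bit_case1(0b1110)
print(format(result, "08b"), result)
```

For the operand “1110” the sketch yields “11000100”, that is, 196, matching the example.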
Fig. 4 Simulation result of the proposed 2-bit square architecture
5 Simulation Results and Comparison
The proposed squaring layouts are simulated and characterized using a QCA tool
[15–17]. Figure 4 shows the simulation outcomes of the 2-bit square design [18–20].
The design parameters, namely the number of clocks, area, and cell count, of the proposed
designs are compared (in Table 3) with previously reported state-of-the-art techniques
in QCA technology.
Table 3 Comparative results of the square circuit
Design                       Cell count    Area (µm²)    Latency
Baugh-Wooley 4×4 [2]         1982          1.8           4.75
Serial–parallel (4)c [18]    –             0.1664        14
Serial–parallel (4)b [19]    406           0.4935        1
4×4 Wallace [20]             3295          7.39          10
4×4 Dadda [20]               3384          7.51          12
New 2×2 array multiplier     376           0.88          10
New 2 bit square             77            0.17          4
New 4 bit square             211           0.43          8
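The comparison in Table 3 can be expressed as relative reductions with a few lines of Python. The rows below are copied from Table 3; the helper names are hypothetical, and a negative value simply means that the proposed design is larger or slower than the reference on that parameter.

```python
# Rows copied from Table 3: (cell count, area in um^2, latency).
# None marks an entry that Table 3 leaves blank.
TABLE3 = {
    "Baugh-Wooley 4x4 [2]": (1982, 1.8, 4.75),
    "Serial-parallel (4)c [18]": (None, 0.1664, 14),
    "Serial-parallel (4)b [19]": (406, 0.4935, 1),
    "4x4 Wallace [20]": (3295, 7.39, 10),
    "4x4 Dadda [20]": (3384, 7.51, 12),
    "New 4 bit square": (211, 0.43, 8),
}

def reduction_percent(reference, proposed):
    """Percentage by which `proposed` is below `reference` (None if unknown)."""
    if reference is None or proposed is None:
        return None
    return 100.0 * (reference - proposed) / reference

def compare(proposed_name="New 4 bit square"):
    """Reductions in (cell count, area, latency) against every other row."""
    proposed = TABLE3[proposed_name]
    return {
        name: tuple(reduction_percent(r, p) for r, p in zip(row, proposed))
        for name, row in TABLE3.items()
        if name != proposed_name
    }

for name, (cells, area, latency) in compare().items():
    print(name, cells, area, latency)
```

Reading the output per parameter, rather than as a single figure of merit, keeps the cases visible in which a reference design is smaller or faster on one parameter.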
6 Conclusions
Novel square units based on a Vedic sutra have been proposed in this paper and
constructed in the QCA paradigm. The proposed circuits are miniaturized to
limit their area. An algorithm has been presented for the
logic computation of an n-bit square architecture. The benefits of the proposed
2-bit and 4-bit squaring units are low architectural complexity, low latency, and
a small footprint area. As primitive square circuits in the QCA
paradigm, they can be used in recent nanoelectronics applications. They can also be utilized
in signal processing applications such as microprocessors and emerging computing
devices.
References
1. Porod W (1997) Quantum-dot devices and quantum-dot cellular automata. J Franklin Inst
334(5–6):1147–1175
2. Sridharan K, Pudi V (2015) Design of basic digital circuits in QCA. In: Design of arithmetic
circuits in quantum dot cellular automata nanotechnology. Studies in computational intelligence,
vol 599. Springer, Cham
3. Lent CS, Tougaw PD, Porod W, Bernstein GH (1993) Quantum cellular automata.
Nanotechnology 4:49–57
4. Paar C, Fleischmann P, Soria-Rodriguez P (1999) Fast arithmetic for public-key algorithms in
Galois fields with composite exponents. IEEE Trans Comput 48(10):1025–1034
5. Sethi K, Panda R (2012) An improved squaring circuit for binary numbers. Int J Adv
Comput Sci Appl 3(2):111–116
6. Mishra NK, Wairya S (2013) Low Power 32×32 bit multiplier architecture based on Vedic
mathematics using virtex 7 low power device. Int J Res Rev Eng Sci Technol. 2(2)
7. Thapliyal H, Kotiyal S, Srinivas MB (2005) Design and analysis of a novel parallel square and
cube architecture based on ancient Indian Vedic mathematics. In: 48th Midwest Symposium on
Circuits and Systems. IEEE, pp 1462–1465
8. Chidgupkar PD, Karad MT (2004) The implementation of Vedic algorithms in digital signal
processing. Glob J Eng Educ 8(2):153–158
9. Pradhan M, Panda R (2010) Design and Implementation of Vedic Multiplier. AMSE J Comput
Sci Stat Fr 15:1–19
10. Pradhan M, Panda R (2012) Speed optimization of Vedic multiplier. AMSE J Gen Math
49:21–35
11. Pradhan M, Panda R (2013) High speed multiplier using Nikhilam Sutra algorithm of Vedic
mathematics. Int J Electron 101(3):300–307 (2014). https://doi.org/10.1080/00207217.2013.780298
12. Kasliwal PS, Patil BP, Gautam DK (2011) Performance evaluation of squaring operation by
Vedic mathematics. IETE J Res 57(1):39–41
13. Barik RK, Pradhan M (2015) Area-time efficient square architecture. Adv Model Anal D
20(1):21–34
14. Pushpangadan R, Sukumaran V, Innocent R, Sasikumar D, Sundar V (2009) High speed vedic
multiplier for digital signal processors. IETE J Res 55(6):282–286
15. Vankamamidi V, Ottavi M, Lombardi F (2008) Two-dimensional schemes for clocking/timing
of QCA circuits. IEEE Trans Comput Aided Des Integr Circuits Syst 27(1):34–44
16. Misra NK, Wairya S, Sen B (2017) Design of conservative, reversible sequential logic for cost
efficient emerging nano circuits with enhanced testability. Ain Shams Eng J
17. Walus K, Dysart TJ, Jullien GA, Budiman RA (2004) QCADesigner: A rapid design and
simulation tool for quantum-dot cellular automata. IEEE Trans Nanotechnol 3(1):26–31
18. Walus K, Jullien G, Dimitrov V (2003) Computer arithmetic structures for quantum cellular
automata. In: Record of Thirty-seventh Asilomar Conference on Signals, Systems and
Computers, pp 1435–1439
19. Cho H, Swartzlander EE Jr (2009) Adder and multiplier design in quantum-dot cellular
automata. IEEE Trans Comput 58(6):721–727
20. Kim SW, Swartzlander EE (2009) Parallel multipliers for quantum-dot cellular automata. In:
Nanotechnology Materials and Devices Conference, 2009. NMDC’09. IEEE, pp 68–72