How to transform
an architectural synthesis tool for
low power VLSI designs
S. Gailhard, N. Julien, J.-Ph. Diguet, E. Martin
IUP-LESTER
2, rue le Coat Saint Haouen
56100 Lorient
France
E-mail: gailhard@univ-ubs.fr
Fax : (33) 2.97.88.05.51 Phone : (33) 2.97.88.05.45
Abstract - High Level Synthesis (HLS) for low power VLSI
design is a complex optimization problem because of the
Area/Time/Power (ATP) interdependence. As few low power design
tools are available, a new approach providing a modular low
power synthesis method is proposed. Although it is currently
based on the generic architectural synthesis tool Gaut, it can
also be used with different "commercial" tools.
Our Gaut_w HLS tool is made up of low power modules:
High level power dissipation estimation, Assignment, Module
selection (operators and supply voltage), Optimization criteria
and Operators library.
As an illustration, power saving factors on DWT algorithms are
presented.
Keywords - Digital Signal Processing (DSP), High Level
Synthesis (HLS), Low-Power Methods, Modular approach,
Discrete Wavelet Transform (DWT).
1. Introduction
Most VLSI research has focused on optimizing circuit speed and area to
perform complex signal processing. Indeed, the rapid advance in VLSI technology and the
increasing computation required by recent real time DSP applications lead to high chip density
and high clock frequencies. Power dissipation minimization in modern circuits, such as
those of mobile systems, has therefore become a crucial problem. To minimize the power
consumption of a system, VLSI researchers work on low power HLS tools. Power optimization can
be carried out at several levels, and the expected power saving at each level can be expressed as follows:
behavioral (10-100%), architectural (10-100%), logical (10-50%) or physical (<20%) [3].
The expected power saving is therefore most significant at the behavioral and architectural levels,
although these two domains have been less explored in the literature. The method integrated in our
HLS tool Gaut_w proposes techniques to optimize the design at these two levels.
Many architectural synthesis tools have been proposed. These tools differ in
the kind of applications they can implement and in the targeted architecture. It is therefore
important to develop low power methods that can be quickly integrated into each user's
architectural synthesis tool. For this purpose, the best way is to develop low power modules
that are used before the architectural synthesis tool in the design flow.
In this paper, a new approach to HLS for low power and time constrained DSP
systems is proposed. Section 2 presents previous work on power estimation and
optimization. Section 3 briefly describes our Gaut_w HLS tool and its different modules (High
Level Power Estimation, Module Selection, Optimization Criteria, Assignment). In Section
4, results on a DWT algorithm [2] illustrating our method are given before a conclusion.
2. Previous Works
2.1 Power estimation
There are three major sources of power dissipation in digital CMOS circuits that are
summarized in the following equation (1). The first term represents the power switching
component, the second term is due to direct-path short circuit current and the last term is the
power dissipation due to leakage current.
$$P = P_{switching} + P_{shortcircuit} + P_{leakage}, \qquad P_{switching} = (C \cdot p) \cdot V_{dd}^{2} \cdot f \qquad (1)$$
where C is the physical capacitance of the CMOS circuit, p is the probability of a power consuming
transition (0→1) and Vdd is the supply voltage.
However, for CMOS circuits, it is assumed that both short-circuit and leakage power
dissipation are negligible. Therefore, the power dissipation is given by (2) [4]:
$$P = C_{comp} \cdot V_{dd}^{2} \cdot f \qquad (2)$$
where Ccomp is the average capacitance switched to perform a computation and f is the sampling
frequency.
It is well known that signal statistics influence the power dissipation [5]. Nevertheless, at a
high level, signal values can be assumed to be uniformly distributed. The
capacitance Ccomp is then described by the following expression (3) [6]:
$$C_{comp} = \sum_{i} N_i \cdot C_i + N_{reg} \cdot C_{reg} + C_{interconnect} + C_{control} \qquad (3)$$
where Ni is the number of operations of type i, Ci is the capacitance of operator i, which includes the
intrablock routing and gate capacitance, Nreg.Creg is the effective register capacitance, Cinterconnect is
the estimated physical interconnect capacitance and Ccontrol is the estimated controller
capacitance.
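As a numerical illustration of expressions (2) and (3), the following Python sketch adds up the capacitance terms and evaluates the switching power at two supply voltages. All numerical values (operation counts, capacitances, sampling frequency) are hypothetical and only serve the example; they are not taken from the operators library.

```python
def effective_capacitance(op_counts, op_caps, n_reg, c_reg, c_interconnect, c_control):
    """Expression (3): sum of N_i * C_i over operator types, plus register,
    interconnect and controller capacitances (all in farads)."""
    c_operators = sum(n * op_caps[name] for name, n in op_counts.items())
    return c_operators + n_reg * c_reg + c_interconnect + c_control


def dynamic_power(c_comp, vdd, f):
    """Expression (2): P = C_comp * Vdd^2 * f, in watts."""
    return c_comp * vdd ** 2 * f


# Hypothetical values, not taken from the operators library.
op_counts = {"add": 8, "mult": 8}            # N_i: operations per computation
op_caps = {"add": 1e-12, "mult": 20e-12}     # C_i in farads
c_comp = effective_capacitance(op_counts, op_caps, n_reg=16, c_reg=0.5e-12,
                               c_interconnect=10e-12, c_control=5e-12)
power_5v = dynamic_power(c_comp, vdd=5.0, f=1e6)
power_3v3 = dynamic_power(c_comp, vdd=3.3, f=1e6)
print(f"C_comp = {c_comp * 1e12:.1f} pF")
print(f"P(5 V) = {power_5v * 1e3:.3f} mW, P(3.3 V) = {power_3v3 * 1e3:.3f} mW")
```

With a fixed capacitance and frequency, the ratio of the two powers depends only on the square of the voltage ratio.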
At the architectural level, the complete architecture is known and signal statistics in the
circuit can be calculated. Therefore, the power dissipation of a typical execution can be
estimated more precisely [5].
2.2 Power optimization
Low power optimization is achieved by reducing either Ci through technology considerations
or Ccomp through operator selection and architectural synthesis. Moreover, voltage scaling
has a great impact on the power dissipation because of its quadratic dependence on Vdd (2).
Unfortunately, reducing Vdd increases the operator latency T, and significantly so when the supply
voltage approaches the device threshold voltage Vt (4) [4]:
$$T \propto \frac{V_{dd}}{(V_{dd} - V_t)^{2}} \qquad (4)$$
Therefore it is important to find a good trade-off between power dissipation and time
performance for DSP applications.
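The trade-off can be made concrete with a short Python sketch that compares, for a few supply voltages, the latency increase predicted by expression (4) with the power reduction predicted by the quadratic dependence in expression (2). The threshold voltage of 0.8 V and the 5 V reference are assumptions chosen for the example, not characteristics of the technology used in this paper.

```python
def latency_ratio(vdd, vdd_ref, vt):
    """T(vdd) / T(vdd_ref), with T proportional to Vdd / (Vdd - Vt)^2 (expression (4))."""
    if min(vdd, vdd_ref) <= vt:
        raise ValueError("supply voltages must be above the threshold voltage")

    def delay_factor(v):
        return v / (v - vt) ** 2

    return delay_factor(vdd) / delay_factor(vdd_ref)


def power_ratio(vdd, vdd_ref):
    """P(vdd) / P(vdd_ref) for a fixed capacitance and frequency (expression (2))."""
    return (vdd / vdd_ref) ** 2


VT = 0.8      # hypothetical threshold voltage (V)
V_REF = 5.0   # hypothetical reference supply voltage (V)

for vdd in (5.0, 4.0, 3.3, 2.0):
    print(f"Vdd = {vdd:.1f} V: latency x{latency_ratio(vdd, V_REF, VT):.2f}, "
          f"power x{power_ratio(vdd, V_REF):.2f}")
```

Under these assumptions, the latency grows slowly at first and then much faster as Vdd gets closer to Vt, while the power keeps decreasing quadratically.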
2.3 Low power HLS tools
Few low power HLS optimization tools have been proposed in the literature. Hyper_LP [6]
integrates high level transformations and an ATP space exploration that shows the impact of supply
voltage scaling on the design. Concerning module selection, most previous work
targets only design area optimization [7]. A module selection method (different operators at
different supply voltages) has also been proposed, which gives lower bounds on Area,
Performance and Power dissipation [8]. Scheduling methods have been presented for
Multi-Voltage Datapaths [9].
Nevertheless, no HLS tool combining a modular approach with a classical
architectural synthesis tool has been proposed yet.
3. HLS tool Gaut_w
3.1 Overview of the design flow
The design flow of our HLS tool is illustrated in Figure 1. The three inputs are the
algorithmic description of the application (behavioral VHDL), the operators library and the
optimization criteria. At present, this tool is based on the Gaut architectural synthesis tool
[1,14], but the overall method could be extended to any other tool. Gaut_w contains
two low power modules: Module Selection, which selects the best operators set and the best
supply voltage for a given time constraint, and Assignment, which assigns operations of the Data
Flow Graph (DFG) to physical operators in order to decrease the transition rate at operator
inputs. These modules minimize a cost function (depending on both area and power) whose
weighting parameters are chosen by the user (Optimization Criteria). The design is optimized
using a Fast Power dissipation estimation based on the average power dissipation of the resources
(Operators library) used in the architecture and on a probabilistic estimation of the accesses to each
resource (registers, memory, operators).
[Figure 1 block diagram. Inputs: algorithmic specification, operators library, optimisation criteria (user's needs). Gaut_w blocks: compilation, data flow graph, assignment, assigned DFG, module selection (operators, supply voltage), high level power dissipation estimation, fast power dissipation estimation. Downstream: architectural synthesis tool, RTL specification of the circuit, logic synthesis tool, VLSI circuit.]
Figure 1 : Low Power DSP HLS Tool
3.2 Formal library
The synchronous architecture synthesized by the Gaut tool uses operators, memories,
registers and a clock tree.
3.2.1 Operators
Each operator has been simulated with the logic simulator Compass to estimate its average
power dissipation [10]. The power estimate is the mean result of 5 simulations with stimuli of 10,000
values [11] (using Compass and Powercalc), which gives a 95% confidence interval
and an error of less than 1%. This method provides a good trade-off between simulation time
and precision of the results.
The library therefore contains operators with different ATP characteristics. For example,
an addition in the DFG can be realized with 4 different operators (Table I).
3.2.2 Memory library
The great number of transistors in a memory implies that the power dissipation due to the
leakage current is no longer negligible compared with the switching component. Therefore, the
power dissipation estimator integrated in the Compass CAD tool is used to determine the
switching and leakage components of the power dissipation. These terms depend on the
memory size.
The memory power dissipation is thus modeled by a static power dissipation
Stat(S).Vdd^2 and by a dynamic effective capacitance Cdyn(S). We do not have an analytic
expression, but only values corresponding to each memory size.
3.2.3 Clock library
The estimation of the clock tree capacitance Cclk is based on the estimated number of registers NReg(S)
and on the clock tree capacitance per register Creg:
$$C_{Clk}(S) = N_{reg} \cdot C_{reg}$$
3.2.4 Library example
Table I shows an extract of the library, where T is the latency for a 5.5 Volts
supply voltage, S the area and P the estimated power dissipation for a 5.5 Volts supply
voltage. The effective capacitance Ci of an operator can be calculated with expression (2) (P, Vdd
and f are known). The latency at different supply voltages is estimated using expression
(4) (Vt is a known device characteristic).
16-bit operators                     T (ns)   S (10^-3 mm2)   P (µW)
Add (DataPath)                       14       73              412
Fast add (DataPath)                  8.6      122             711
Add (VHDL)                           18.9     53              229
Fast add (VHDL)                      9.9      117             626
Mult (DataPath)                      32.3     1337            18827
Pipeline mult (DataPath)             22       1337            18821
Mult (VHDL)                          53.3     1023            11820
Fast mult (VHDL)                     33.2     1270            16231
Register                             4        70              71
Clock per register                   -        -               65
Memory 28×16 bits (static: 36 mW)    20       1163            961
Table I : Library example (CMOS 1 µm technology)
3.3 Power dissipation estimation
The aim of this module is to give a fast estimation of the power dissipation at a high
level of abstraction. Currently, the method targets time constrained DSP applications, in which the
operators used in the circuit have an activation rate close to 100%.
The first estimation step is the computation of the scheduling probability of each operation in the
DFG [12]. This makes it possible to estimate the storage probability of each variable (transfer via a
register, the memory and a register, or transfer via a register only) and then the probable activity of
the memory (Nmem) and of the registers (NReg). The probable number of registers can also be
estimated, which allows the power dissipation of the clock tree to be estimated. The operator
activity Ni is calculated on the DFG. This probable activity of the resources, combined with their
average power dissipation available in the library, leads to the total power dissipation Power(S)
(5):
$$Power(S) = V_{dd}^{2} \left[ \frac{1}{T_r}\left( C_{Op}(S) + C_{Reg}(S) + C_{Dyn\_Mem}(S) \right) + \frac{1}{T_{Clk}} \cdot C_{Clk} + Static_{Mem}(S) \right] \qquad (5)$$
With:
$$C_{Op}(S) = \sum_{i=1}^{N} N_i \cdot C_i$$
$$C_{Reg}(S) = N_{reg} \cdot C_{reg}$$
$$C_{Clk}(S) = N_{reg} \cdot C_{reg}$$
$$C_{Dyn\_Mem}(S) = N_{mem} \cdot C_{dyn}$$
where COp(S) and CReg(S) are respectively the effective capacitances of the operators and of the
registers, each of these capacitances switching every Tr ns (time constraint). CClk is the estimated clock
tree capacitance, NReg(S) the probable number of registers and Creg the clock tree capacitance
per register (CClk switches every TClk ns, the internal clock period). Stat(S).Vdd^2 is the static
power dissipation of the memory and CDyn_mem(S) its dynamic effective capacitance.
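The estimation can be sketched in a few lines of Python. The sketch assumes that expression (5) reads as Vdd^2 multiplied by the sum of three contributions: the operator, register and dynamic memory capacitances switched every Tr, the clock tree capacitance switched every TClk, and the static memory term. The function name, the SI units and all numerical values are hypothetical.

```python
def estimate_power(vdd, t_r, t_clk, c_op, c_reg, c_dyn_mem, c_clk, static_mem):
    """Assumed reading of expression (5), in SI units.

    vdd        : supply voltage (V)
    t_r        : time constraint, period of the data path activity (s)
    t_clk      : internal clock period (s)
    c_op, c_reg, c_dyn_mem, c_clk : effective capacitances (F)
    static_mem : static memory coefficient Stat(S), so that
                 Stat(S) * Vdd^2 is a power (W)
    """
    data_path = (c_op + c_reg + c_dyn_mem) / t_r
    clock_tree = c_clk / t_clk
    return vdd ** 2 * (data_path + clock_tree + static_mem)


# Hypothetical resource activity and capacitances for one candidate solution S.
power = estimate_power(vdd=5.0, t_r=300e-9, t_clk=20e-9,
                       c_op=60e-12, c_reg=20e-12, c_dyn_mem=10e-12,
                       c_clk=2e-12, static_mem=1e-4)
print(f"Power(S) = {power * 1e3:.2f} mW")
```

Because every term is multiplied by Vdd^2, evaluating the same solution at another supply voltage only requires changing the first argument, which keeps the estimation fast enough to be called inside an exploration loop.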
3.4 Optimization criteria
Each designer has his or her own optimization problem: circuits for mobile electronic systems
require low power dissipation, whereas some designs need a minimum area circuit in order to
be cheaper. Therefore, the optimization criteria must depend on both area and power (Figure
2).
[Figure 2 diagram: Area, Performance and Power axes, with the time constraint as a lower limit on performance; the quantity to optimize is power or area.]
Figure 2 : Optimization criteria
Many functions can be integrated in our optimization criteria module. Cost functions
that are linear in both area and power dissipation have been chosen; a factor α (0 ≤ α
≤ 1) is introduced to set the relative weight of power and area, and β is a normalization coefficient (6).
$$Cost_{\alpha}(S) = (1 - \alpha) \cdot \beta \cdot Area(S) + \alpha \cdot Power(S) \qquad (6)$$
where Area(S) is the area estimation (operators, registers, interconnections and bus [7]),
Power(S) is the power dissipation estimation (5).
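The effect of α can be illustrated with a small Python sketch. It assumes that expression (6) reads Cost = (1 - α).β.Area(S) + α.Power(S); the two candidate solutions and the value of β are hypothetical and are not results of this paper.

```python
def cost(area, power, alpha, beta):
    """Assumed reading of expression (6): (1 - alpha) * beta * Area + alpha * Power."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    return (1.0 - alpha) * beta * area + alpha * power


# Hypothetical candidates: name -> (area in mm2, power in mW).
candidates = {"small_area": (4.0, 300.0), "low_power": (6.0, 200.0)}
BETA = 40.0   # hypothetical normalization coefficient (mW per mm2)


def best_candidate(alpha):
    """Name of the candidate with the lowest cost for a given alpha."""
    return min(candidates, key=lambda name: cost(*candidates[name], alpha, BETA))


for alpha in (0.0, 0.2, 0.8, 1.0):
    print(f"alpha = {alpha:.1f}: best = {best_candidate(alpha)}")
```

With α close to 0 the area term dominates and the smaller solution wins; with α close to 1 the solution with the lower power dissipation is selected.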
3.5 Assignment
The assignment of operations to operators is an important task of architectural
synthesis with respect to power dissipation. A high switching factor at the operator inputs
implies a high switching rate of the operator gates and hence a high power dissipation [5]. However, at a
high level of abstraction, it is impossible to know the circuit architecture exactly and to
calculate the activity rate of all operator inputs. Therefore, only two models of operator
power dissipation are used:
• both operator inputs change between two operations (white noise inputs)
• only one operator input changes between two operations, the other one being
constant
The power dissipation is 25 to 50% greater in the first model than in the second one,
depending on the operator. The first model estimates the power dissipation of an
operation on two variables, whereas the second model estimates the power dissipation of an operation on a
variable and a constant. Therefore, at a high level of abstraction, it is important to guide
the architectural synthesis towards using the second model as often as possible in order to
optimize the design power dissipation. Consider the DFG of Figure 3 as an example. A, B, C,
D and E are variables whose values change every clock cycle, whereas Cst is a constant that
keeps the same value for every clock cycle.
[Figure 3 diagram: DFG with four multiplications and two additions on the operands A, B, C, D, E, R and Cst.]
Figure 3 : A DFG example
The multiplications of the DFG can be scheduled in different ways on one or more multipliers:
the operator power dissipation depends on the scheduling. With the power dissipation
model used, the minimum power dissipation is clearly obtained when one multiplier is
dedicated to the multiplications by the constant Cst: to compute the DFG, two multiplications are
done in the second model (Cst*R, Cst*C) and two multiplications in the first model (A*B
and D*E). If only one multiplier is used, the operator power dissipation is
greater: for example, if Cst*R is scheduled after A*B, then both operator inputs
change (A→Cst and B→R) and the multiplication is done in the first model.
Therefore, an assignment module that detects operations with a constant has been developed.
The Assignment Module then decides whether or not to dedicate operators to operations with a constant, so as to
optimize the cost function (6) defined by the user (Optimization Criteria).
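The multiplier example can be reproduced with a short Python sketch of the two power models. It is only an illustration under stated assumptions: powers are normalised to 1.0 for the first model, the second model is assumed to dissipate 30% less (the text gives a 25 to 50% range), the schedule of each multiplier is assumed to repeat cyclically, and operand order on the two inputs is ignored. All names are hypothetical.

```python
CONSTANTS = {"Cst"}       # operands that keep the same value every cycle
P_TWO_VARIABLES = 1.0     # first model: both inputs change (normalised power)
P_ONE_CONSTANT = 0.7      # second model: hypothetical 30% reduction


def keeps_constant_input(op, previous_op):
    """True if op reuses a constant operand already present on the operator."""
    return any(x in CONSTANTS and x in previous_op for x in op)


def operator_power(schedule):
    """Normalised power of one operator executing `schedule` cyclically.

    schedule is a list of (operand, operand) tuples; the operation before
    the first one is the last one, since the DFG is iterated for each sample.
    """
    total = 0.0
    for k, op in enumerate(schedule):
        previous_op = schedule[k - 1]   # k = 0 wraps around to the last operation
        if keeps_constant_input(op, previous_op):
            total += P_ONE_CONSTANT
        else:
            total += P_TWO_VARIABLES
    return total


# One multiplier shared by all four multiplications.
single = operator_power([("A", "B"), ("Cst", "R"), ("D", "E"), ("Cst", "C")])
# One multiplier for the variables, one dedicated to the products with Cst.
dedicated = (operator_power([("A", "B"), ("D", "E")])
             + operator_power([("Cst", "R"), ("Cst", "C")]))
print(f"single multiplier: {single:.1f}, dedicated multiplier: {dedicated:.1f}")
```

The dedicated multiplier lowers the operator power at the price of one more operator, which is why the decision is left to the cost function (6).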
3.6 Module selection
Module selection is the second step in the HLS GAUT design flow (Figure 1). Its
purpose is to determine the optimal supply voltage and the optimal operators set from a given
library for a DSP application (an algorithm and a time constraint), so as to optimize the design
(cost functions defined above).
The module selection presented here explores all the solutions combining
different operators (ATP, mono- or multi-function, pipelined operators) and different supply
voltages. The DFG is processed in the order defined in Figure 4. To estimate the cost
function (6), the operator allocation has to be obtained from the number of pipeline stages.
This requires knowing the operator latencies, and therefore the selected operators and the supply
voltage (4). To simplify the ATP space exploration and reduce the number of cost estimations, the
following lemma is proposed [7]: the use of a multi-functional unit is necessary only if one of
its functions is underused, i.e. if there is at least one mono-functional operator in the operators
set S with an efficiency of less than 100%. Therefore, a first step provides the best selection of
mono-functional operators. If mono-functional operators are underused, an optimization with
multi-functional units is then carried out to reduce the circuit area.
In conclusion, our module selection algorithm finds the best operators set for each
supply voltage (mono- and multi-function) and the best supply voltage in a range
from Vdd_min to Vdd_max (Figure 4) [13].
for supply voltage from Vdd_min to Vdd_max
    for all mono-functional operators sets
        Latency time calculation of each selected operator (4)
        Number of pipeline stages calculation
        Allocation estimation in each pipeline stage
        Cost calculation (6)
    end mono-functional operators sets
    Multi-functional operators resolution
    ⇒ Best operators set selection for each supply voltage
end for supply voltage
⇒ Best selection (supply voltage and operators set)
Figure 4 : Module Selection Algorithm
The exploration can also be restricted to standard supply voltages (5, 3.3 ... Volts).
This ATP exploration leads to the minimum of the cost function chosen by the designer.
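The exploration loop of Figure 4 can be sketched in Python as follows. This is a simplified illustration and not the Gaut_w implementation: it only handles mono-functional operators, it replaces the pipeline stage computation by a simple allocation estimate ceil(Ni × latency / Tr), and it only counts operator capacitances in the power. The `Operator` class, the `V_REF` and `V_T` values, the library contents and β are hypothetical.

```python
import itertools
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    name: str
    t_ref: float   # latency at V_REF (s)
    area: float    # area (mm2)
    c_eff: float   # effective capacitance per operation (F)


V_REF, V_T = 5.0, 0.8   # hypothetical reference and threshold voltages (V)


def latency(op, vdd):
    """Latency at vdd, scaled from V_REF with T proportional to Vdd / (Vdd - Vt)^2."""
    def delay_factor(v):
        return v / (v - V_T) ** 2
    return op.t_ref * delay_factor(vdd) / delay_factor(V_REF)


def select_modules(library, op_counts, t_r, voltages, alpha, beta):
    """Exhaustive search; returns (cost, vdd, operator names) of the best solution."""
    best = None
    for vdd in voltages:
        # One operator choice per function of the DFG (mono-functional sets only).
        for ops in itertools.product(*(library[func] for func in op_counts)):
            area, c_total = 0.0, 0.0
            for func, op in zip(op_counts, ops):
                n_alloc = math.ceil(op_counts[func] * latency(op, vdd) / t_r)
                area += n_alloc * op.area
                c_total += op_counts[func] * op.c_eff
            power = vdd ** 2 * c_total / t_r
            cost = (1.0 - alpha) * beta * area + alpha * power
            if best is None or cost < best[0]:
                best = (cost, vdd, tuple(op.name for op in ops))
    return best


LIBRARY = {
    "add": [Operator("add", 19e-9, 0.053, 1e-12),
            Operator("fast_add", 10e-9, 0.117, 2.5e-12)],
    "mult": [Operator("mult", 53e-9, 1.023, 40e-12),
             Operator("fast_mult", 33e-9, 1.270, 55e-12)],
}
OP_COUNTS = {"add": 8, "mult": 8}        # operations per iteration of the DFG
VOLTAGES = [5.0, 4.5, 4.0, 3.5, 3.0]

cost, vdd, names = select_modules(LIBRARY, OP_COUNTS, t_r=300e-9,
                                  voltages=VOLTAGES, alpha=0.5, beta=0.01)
print(f"best: Vdd = {vdd} V, operators = {names}, cost = {cost:.4f}")
```

The exhaustive search is affordable here because each cost evaluation only needs the fast estimations, not a full synthesis.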
4. Results
In this submission, only a few results on a DWT algorithm are presented. More
results will be given in the final version of this paper, with the impact of each method
(Voltage scaling, Selection of different operators, Assignment) on the power dissipation.
4.1 Discrete Wavelet Transform algorithm
There are many image transformations, such as the Fast Fourier Transform, the Discrete Cosine
Transform and the Wavelet Transform. The last one has several interesting properties and is
increasingly used. The wavelet transform used here is based on a high pass filter "g" and on
a low pass filter "h" [2]. Several steps are needed to calculate the results, as shown in
Figures 5 and 6. The H and G filters are first applied to the rows and then to the columns of the
initial image. To calculate the DWT with one more resolution level, the same computation is
carried out on a quarter of the image.
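The row and column processing can be sketched in plain Python. The sketch below is only an illustration of the principle: it uses circular filtering at the image borders and two-tap averaging and differencing filters as stand-ins for the h and g filters, which is an assumption and not the filter pair used in this paper. The function names and the sub-band keys are hypothetical.

```python
def filter_and_decimate(signal, taps):
    """Circular FIR filtering, keeping one output sample out of two."""
    n = len(signal)
    return [sum(t * signal[(2 * k + j) % n] for j, t in enumerate(taps))
            for k in range(n // 2)]


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def dwt_level(image, h, g):
    """One resolution level: filter the rows, then the columns.

    Returns four sub-images of half the size in each dimension. In each key,
    the first letter is the filter applied to the rows and the second letter
    the filter applied to the columns.
    """
    low = [filter_and_decimate(row, h) for row in image]
    high = [filter_and_decimate(row, g) for row in image]

    def on_columns(block, taps):
        return transpose([filter_and_decimate(col, taps) for col in transpose(block)])

    return {"hh": on_columns(low, h), "hg": on_columns(low, g),
            "gh": on_columns(high, h), "gg": on_columns(high, g)}


# Stand-in filters (assumption): two-tap average and difference.
H = [0.5, 0.5]
G = [0.5, -0.5]

image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
level_1 = dwt_level(image, H, G)
# One more resolution level: same computation on the low-pass quarter.
level_2 = dwt_level(level_1["hh"], H, G)
print(level_1["hh"], level_2["hh"])
```

Each additional level processes a quarter of the previous data, so the total number of filtered points grows by less than a third of the first level, whatever the number of levels.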
[Figure 5 diagram: starting from the image C0, the H and G filters are applied to the rows and then to the columns, each followed by a decimation (2:1 : keep one column out of 2; 1:2 : keep one row out of 2). The first resolution level produces C-1, D-1h, D-1v and D-1d; the second resolution level produces C-2, D-2h, D-2v and D-2d.]
Figure 5 : DWT with three resolution levels
[Figure 6 diagram: layout of the sub-images D-1d, D-1v, D-1h, D-2d, D-2h, D-2v and C-2 in the transformed image.]
Figure 6 : Result after the DWT algorithm with two resolution levels
Several h and g filters are possible; two pairs of linear phase FIR filters with different sizes
have been chosen because of their good results [2]:
9-3 : 9th linear phase FIR filter (h), 3rd linear phase FIR filter (g)
9-7 : 9th linear phase FIR filter (h), 7th linear phase FIR filter (g)
The algorithm can be used with different resolution levels. The example treated here is that of
512×512 pixel images at a rate of 10 images per second, which implies a time constraint of
1/10 s = 100 ms per image. To satisfy this constraint, one algorithm is specified on an n×n block
(algo_bloc) and another one performs the computation on a single image point (algo_point).
The power dissipation of both algorithms is compared right after their description.
The power dissipation estimation module thus makes it possible to select, at the first step of the
system design, the most suitable algorithm with respect to power dissipation.
Table II gives the results of the power dissipation estimation and the required CPU time on a
Sparc Station. It has been shown on a motion detection application [11] that the absolute
error on the power dissipation is less than 20% compared with the value obtained with the logic simulator
Compass (after architectural and logic synthesis with Gaut [1,14]).
Results                         algo_bloc     algo_point
Power dissipation estimation    1131 mW       436 mW
CPU time on a Sparc Station     13 minutes    7 seconds
Table II : Comparison of two implementations of the DWT
Therefore, at a high level of abstraction, the fast power dissipation estimation makes it possible to
compare two different implementations and to choose the better algorithm at an early stage of
the design flow. These results show that the algo_point algorithm is better suited to
power minimisation.
4.2 Module selection results
The algo_point algorithm is selected to illustrate the results of Module Selection. The circuit
has to implement the algorithm with three resolution levels on 10 images per second. A processor
that performs the two FIR filters (g, a 7th, and h, a 9th linear phase
FIR filter) (7) every 300 ns therefore has to be specified:
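[Equation (7): the two filter outputs y(n), one for h and one for g, each written as the product of a coefficient and an input sample plus a sum over i (from i = 0 up to 3 for h and up to 2 for g) of a coefficient multiplying a sum of two input samples x; the sample indices are not legible in the source.] (7)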
Table I in section 3 is the part of the operators library used in our HLS tool for this
application. The time constraint is 300 ns. Figure 7 shows the results (Power(S) (5), Area(S)
and Cost(S) (6)) of the best operators set for each supply voltage, with α equal to 0.5.
Figure 7 : Cost as a function of Vdd on a DWT algorithm
The power dissipation decreases with the supply voltage because of its quadratic
dependence on Vdd (1). However, when the supply voltage decreases, the
operator latency (4) increases, which requires more operators to be allocated, and the design area
increases.
Table III shows the HLS GAUT tool results for some supply voltages : best operators set
(Module selection), operators power dissipation estimation, total area and operators number in
the circuit after an architectural synthesis.
Vdd (Volts)   Number of selected operators (latency time (ns))   Power (mW)   Area (mm2)
4.9           1 Adder Datapath (20), 3 Mult Vhdl (70)            360          5.16
3.9           2 Adder Vhdl (40), 3 Mult Vhdl (90)                222          5.76
3.8           2 Add Vhdl (40), 2 Fast Mult Vhdl (60)             277          4.82
3.2           2 Add Datapath (40), 3 Fast Mult Vhdl (80)         199          6.14
Table III : Module selection results on a DWT algorithm
The breaks in Figure 7 are explained by Table III. For a 3.9 Volts supply voltage, three
multipliers and two adders are allocated, whereas for a 3.8 Volts supply voltage, two adders
and only two multipliers are allocated. As a result, the area cost is smaller at 3.8 Volts
whereas the estimated power dissipation is greater.
The cost function (6) depends on the α term, which represents the relative importance of
area and power dissipation. Figure 8 shows the impact of the variation of this term on the cost
function. The optimal supply voltage when 0.2<α<1 is around 3.8 Volts. This means that the
supply voltage can be decreased from 5 Volts to 3.8 Volts without a significant area penalty (only
a few percent) and with a 40% reduction in power dissipation. The α term therefore allows
each user to optimize the design according to the needs of the application (area or power
dissipation).
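As a rough consistency check of this figure, and assuming that the switched capacitance and the sampling frequency are unchanged between the two supply voltages (which is only an approximation, since the selected operators differ), expression (2) gives:

```latex
\frac{P(3.8\,\mathrm{V})}{P(5\,\mathrm{V})}
  = \left(\frac{3.8}{5}\right)^{2}
  = 0.76^{2}
  = 0.5776
\qquad\Rightarrow\qquad
1 - 0.5776 \approx 42\,\%
```

The quadratic dependence alone thus accounts for a reduction of about 42%, which is in line with the 40% obtained by Module Selection.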
Figure 8 : Alpha impact on the cost function
Module Selection requires less than one CPU minute on a Sparc station. This implies that
the method can be applied on a large operators library and on complex applications without
computation time problems.
5. Conclusion
A generic method for low power VLSI design, integrated in the Gaut_w HLS tool
(Estimation and Optimization), has been presented. This new approach is modular and is applied
before the architectural synthesis. It can therefore be used with any other architectural
synthesis tool. A fast power dissipation estimation at a high level of abstraction has also been
developed; it compares different time constrained DSP algorithms at an early stage of the
design.
The results presented on DWT algorithms show that the high level power dissipation
estimation quickly leads to a power saving factor of about 2.6 (1131 mW versus 436 mW) through
the selection of the best algorithm. Moreover, a 40% power reduction can reasonably be expected
from Module selection without a significant area penalty.
References :
[1] E. Martin, O. Sentieys, H. Dubois, J.-L. Philippe
GAUT, an architectural synthesis tool for dedicated signal processors
Proceedings EURO-DAC 93
[2] M. Antonini, M. Barlaud, P. Mathieu, I. Daubechies,
Image coding using wavelet transform
IEEE Trans. on Image Processing, Vol. 1, No 2, April 1992, pp. 205-220
[3] P. E. Landman
Low-power Architectural design methodologies
PhD Thesis, Berkeley, August 94
[4] A. Chandrakasan, S. Sheng, R. W. Brodersen
Low Power CMOS digital Design
IEEE Journal of Solid-State Circuits, Vol. 27, No 4, pp. 473-484
[5] P. E. Landman, J. M. Rabaey
Architectural power analysis: the dual bit method
IEEE Trans. on VLSI Systems, Vol. 3, No 2, June 1995, pp. 173-187
[6] A. Chandrakasan, M. Potkonjak, R. Mehra, J. Rabaey, R. Brodersen
Optimising Power Using Transformations
IEEE Trans. on CAD and Systems, Vol. 14, No 1, pp. 12-30, Jan. 95
[7] O. Sentieys, J.-Ph. Diguet, J. L. Philippe, E. Martin
Hardware Module Selection for Real Time Pipeline Architectures using
Probabilistic Estimation
9th Annual IEEE Inter. ASIC Conf. and Exhibit, 1996, pp.147-150
[8] Z. S. Shen, C. C. Jong
Exploring module selection space for architectural synthesis of low power designs
IEEE Int. Symp. on circuits and systems, Hong Kong, June 97, pp.1532-1535
[9] M. C. Johnson, K. Roy
Scheduling and optimal voltage selection for Low-Power multi-voltage DSP
Datapaths
IEEE Int. Symp. on circuits and systems, Hong Kong, June 97, pp.2152-2155
[10] R. Burch, F. Najm, P. Yang, T. Trick
McPOWER : a monte carlo approach to power estimation
IEEE CAD 1992, pp. 90-97
[11] S. Gailhard, O. Ingremeau, N. Julien, J.-Ph. Diguet, E. Martin
Une méthode probabiliste pour estimer la consommation à un niveau
algorithmique
Colloque CAO, Villars de Lans, Jan. 97, pp. 124-127
[12] J-Ph. Diguet, O. Sentieys, J. L. Philippe, E. Martin
Probabilistic resource Estimation for Pipeline Architecture
IEEE Workshop on VLSI S.P.,Sakai, Japan, Oct. 95
[13] S. Gailhard, O. Sentieys, N. Julien, E. Martin
Area/Time /Power Space Exploration in Module Selection for DSP High Level
Synthesis
Int. Workshop, PATMOS’97, Louvain la Neuve, Belgium, Sept. 97, pp.35-44
[14] A. Baganne, J.-L Philippe, E. Martin
Hardware Interface Design For Real Time Embedded Systems
7th IEEE GLS-VLSI 97, Urbana-Champaign, Illinois, March 1997

More Related Content

What's hot

IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD Editor
 
Iaetsd vlsi architecture for exploiting carry save arithmetic using verilog hdl
Iaetsd vlsi architecture for exploiting carry save arithmetic using verilog hdlIaetsd vlsi architecture for exploiting carry save arithmetic using verilog hdl
Iaetsd vlsi architecture for exploiting carry save arithmetic using verilog hdl
Iaetsd Iaetsd
 
D0212326
D0212326D0212326
Keep Calm and React with Foresight: Strategies for Low-Latency and Energy-Eff...
Keep Calm and React with Foresight: Strategies for Low-Latency and Energy-Eff...Keep Calm and React with Foresight: Strategies for Low-Latency and Energy-Eff...
Keep Calm and React with Foresight: Strategies for Low-Latency and Energy-Eff...
Tiziano De Matteis
 
Design and Implementation of a Programmable Truncated Multiplier
Design and Implementation of a Programmable Truncated MultiplierDesign and Implementation of a Programmable Truncated Multiplier
Design and Implementation of a Programmable Truncated Multiplier
ijsrd.com
 
Scheduling of Heterogeneous Tasks in Cloud Computing using Multi Queue (MQ) A...
Scheduling of Heterogeneous Tasks in Cloud Computing using Multi Queue (MQ) A...Scheduling of Heterogeneous Tasks in Cloud Computing using Multi Queue (MQ) A...
Scheduling of Heterogeneous Tasks in Cloud Computing using Multi Queue (MQ) A...
IRJET Journal
 
IMPLEMENTATION OF UNSIGNED MULTIPLIER USING MODIFIED CSLA
IMPLEMENTATION OF UNSIGNED MULTIPLIER USING MODIFIED CSLAIMPLEMENTATION OF UNSIGNED MULTIPLIER USING MODIFIED CSLA
IMPLEMENTATION OF UNSIGNED MULTIPLIER USING MODIFIED CSLA
eeiej_journal
 
post119s1-file3
post119s1-file3post119s1-file3
post119s1-file3
Venkata Suhas Maringanti
 
Memory Polynomial Based Adaptive Digital Predistorter
Memory Polynomial Based Adaptive Digital PredistorterMemory Polynomial Based Adaptive Digital Predistorter
Memory Polynomial Based Adaptive Digital Predistorter
IJERA Editor
 
Design and Implementation of Low Power DSP Core with Programmable Truncated V...
Design and Implementation of Low Power DSP Core with Programmable Truncated V...Design and Implementation of Low Power DSP Core with Programmable Truncated V...
Design and Implementation of Low Power DSP Core with Programmable Truncated V...
ijsrd.com
 
On the-joint-optimization-of-performance-and-power-consumption-in-data-centers
On the-joint-optimization-of-performance-and-power-consumption-in-data-centersOn the-joint-optimization-of-performance-and-power-consumption-in-data-centers
On the-joint-optimization-of-performance-and-power-consumption-in-data-centers
Cemal Ardil
 
19 avadhanam kartikeya sarma 177-185
19 avadhanam kartikeya sarma 177-18519 avadhanam kartikeya sarma 177-185
19 avadhanam kartikeya sarma 177-185
Alexander Decker
 
Paper id 25201467
Paper id 25201467Paper id 25201467
Paper id 25201467
IJRAT
 
Improved Max-Min Scheduling Algorithm
Improved Max-Min Scheduling AlgorithmImproved Max-Min Scheduling Algorithm
Improved Max-Min Scheduling Algorithm
iosrjce
 
Edd clustering algorithm for
Edd clustering algorithm forEdd clustering algorithm for
Edd clustering algorithm for
csandit
 
A multi objective hybrid aco-pso optimization algorithm for virtual machine p...
A multi objective hybrid aco-pso optimization algorithm for virtual machine p...A multi objective hybrid aco-pso optimization algorithm for virtual machine p...
A multi objective hybrid aco-pso optimization algorithm for virtual machine p...
eSAT Publishing House
 
Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Progres...
Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Progres...Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Progres...
Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Progres...
AtakanAral
 

Low power tool paper

  • 1. How to transform an architectural synthesis tool for low power VLSI designs S. Gailhard, N. Julien, J.-Ph. Diguet, E. Martin IUP-LESTER 2, rue le Coat Saint Haouen 56100 Lorient France E-mail: gailhard@univ-ubs.fr Fax: (33) 2.97.88.05.51 Phone: (33) 2.97.88.05.45
  • 2. How to transform an architectural synthesis tool for low power VLSI designs

Abstract - High Level Synthesis (HLS) for low-power VLSI design is a complex optimization problem because of the Area/Time/Power interdependence. As few low-power design tools are available, a new approach providing a modular low-power synthesis method is proposed. Although it is based for the moment on the generic architectural synthesis tool Gaut, different "commercial" tools could be used instead. Our Gaut_w HLS tool consists of the following low-power modules: High level power dissipation estimation, Assignment, Module selection (Operators and supply voltage), Optimization criteria and Operators library. As an illustration, power saving factors on DWT algorithms are presented.

Keywords - Digital Signal Processing (DSP), High Level Synthesis (HLS), Low-Power Methods, Modular approach, Discrete Wavelet Transform (DWT).
  • 3. 1. Introduction

Most VLSI research has focused on optimizing circuit speed and area to perform complex signal processing. Indeed, the rapid advance of VLSI technology and the increasing computation required by recent real-time DSP applications lead to high chip density and high clock frequencies. As a result, minimizing power dissipation in modern circuits, particularly in mobile systems, is a crucial problem. To reduce the power consumption of a system, VLSI researchers are working on low-power HLS tools. Power optimization can be applied at several levels, and the expected power saving at each level is as follows: behavioral (10-100%), architectural (10-100%), logical (10-50%) and physical (<20%) [3]. The expected power saving is therefore most significant at the behavioral and architectural levels, yet these two domains have been less explored in the literature. The method integrated in our HLS tool Gaut_w proposes techniques to optimize the design at these two levels.

Many architectural synthesis tools have been proposed. They differ in the kind of applications they can implement and in the targeted architecture. It is therefore important to develop low-power methods that can be quickly integrated into each user's architectural synthesis tool. For this purpose, the best way is to develop low-power modules that are used before the architectural synthesis tool in the design flow.

In this paper, a new approach to HLS for low-power, time-constrained DSP systems is proposed. Section 2 presents previous work on power estimation and optimization. Section 3 briefly describes our Gaut_w HLS tool and its modules (High Level Power Estimation, Module Selection, Optimization Criteria, Assignment). In Section 4, results on the DWT algorithm [2] illustrating our method are given before a conclusion.

2. Previous Works

2.1 Power estimation

There are three major sources of power dissipation in digital CMOS circuits, summarized in equation (1). The first term is the switching component, the second is due to the direct-path short-circuit current, and the last is the power dissipation due to leakage current.

\[ P = P_{switching} + P_{shortcircuit} + P_{leakage}, \qquad P_{switching} = (C \cdot p) \cdot V_{dd}^{2} \cdot f \qquad (1) \]

where C is the physical capacitance of the CMOS circuit, p is the probability of a power-consuming transition (0→1) and Vdd is the supply voltage. For CMOS circuits, however, it is assumed that both short-circuit and leakage power dissipation are negligible. The power dissipation is therefore given by (2) [4]:

\[ P = C_{comp} \cdot V_{dd}^{2} \cdot f \qquad (2) \]

where Ccomp is the average capacitance switched to perform a computation and f is the sampling frequency. It is well known that signal statistics influence the power dissipation [5]. Nevertheless, at a high level, signal values can be assumed to be uniformly distributed. The capacitance Ccomp is then described by the following expression (3) [6]:
$$C_{comp} = \sum_i N_i \cdot C_i + N_{reg} \cdot C_{reg} + C_{interconnect} + C_{control} \qquad (3)$$

where N_i is the number of operations of type i, C_i the capacitance of operator i (including intrablock routing and gate capacitance), N_reg·C_reg the effective register capacitance, C_interconnect the estimated physical interconnect capacitance and C_control the estimated controller capacitance. At the architectural level, the complete architecture is known and the signal statistics in the circuit can be calculated. The power dissipation of a typical execution can then be estimated more precisely [5].

2.2 Power optimization

Low power optimization is achieved by reducing either C_i, through technology considerations, or C_comp, through operator selection and architectural synthesis. Moreover, voltage scaling has a great impact on power dissipation because of its quadratic dependence on V_dd (2). Unfortunately, reducing V_dd significantly increases the operator latency T as the supply voltage approaches the device threshold voltage V_t (4) [4]:

$$T \propto \frac{V_{dd}}{(V_{dd} - V_t)^2} \qquad (4)$$

It is therefore important to find a good trade-off between power dissipation and time performance for DSP applications.

2.3 Low power HLS tools

Few HLS optimization tools have been proposed in the literature. Hyper_LP [6] integrates high level transformations and an Area/Time/Power (ATP) space exploration that shows the impact of supply voltage scaling on the design. Concerning module selection, most previous work targets only design area optimization [7]. A module selection (different operators at different supply voltages) that gives lower bounds on area, performance and power dissipation has also been proposed [8], and scheduling methods for multi-voltage datapaths have been presented [9]. Nevertheless, no HLS tool combining a modular approach with a classical architectural synthesis tool has been proposed yet.

3. HLS tool Gaut_w

3.1 Overview of the design flow

The design flow of our HLS tool is illustrated in Figure 1. The three inputs are the algorithmic description of the application (behavioral VHDL), the operators library and the optimization criteria. At present, the tool is based on the Gaut architectural synthesis tool [1,14], but the overall method could be extended to any other tool. Gaut_w contains two low power modules: Module Selection, which selects the best operators set and the best supply voltage for a given time constraint, and Assignment, which assigns the operations of the Data Flow Graph (DFG) to physical operators so as to decrease the transition rate at the operator inputs. These modules minimize a cost function (depending on both area and power) whose weighting parameters are chosen by the user (Optimization Criteria). The design is optimized through a Fast Power dissipation estimation based on the average power dissipation of the resources (Operators library) used in the architecture and on a probabilistic estimation of the accesses to each resource (registers, memory, operators).

[Figure 1 : Low Power DSP HLS Tool. Blocks: Algorithmic specification, Compilation, Data Flow Graph, Module Selection (operators, supply voltage), Assignment, Assigned DFG, High Level Power Dissipation Estimation, Fast Power dissipation estimation, Operators library, Optimisation criteria (users needs), Architectural synthesis tool, RTL specification of the circuit, Logic synthesis tool, VLSI circuit]

3.2 Formal library

The synchronous architecture synthesized by the Gaut tool uses operators, memories, registers and a clock tree.

3.2.1 Operators

Each operator has been simulated with the logic simulator Compass to estimate its average power dissipation [10]. The power estimate is the mean result of 5 simulations with stimuli of 10,000 values each [11] (using Compass and Powercalc), which gives a 95% confidence interval and an error below 1%. This method provides a good trade-off between simulation time and precision. The library thus contains operators with different ATP characteristics. For example, an addition in the DFG can be realized with 4 different operators (Table I).
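As a minimal sketch of the averaging step described for the operators library, the following Python computes the mean of five simulation results and a 95% confidence interval based on Student's t distribution. The five power values are made-up placeholders, not measurements from Compass or Powercalc, and the function name `mean_with_ci95` is hypothetical.

```python
import math
import statistics

# Two-sided 95% Student t quantile for 4 degrees of freedom (5 samples).
T_95_DOF4 = 2.776

def mean_with_ci95(samples):
    """Return (mean, half_width) of a 95% confidence interval for 5 samples."""
    if len(samples) != 5:
        raise ValueError("this sketch hard-codes the t quantile for 5 samples")
    mean = statistics.mean(samples)
    std = statistics.stdev(samples)  # sample standard deviation (n - 1)
    half_width = T_95_DOF4 * std / math.sqrt(len(samples))
    return mean, half_width

# HYPOTHETICAL power results (in µW) of 5 simulation runs of one operator.
runs = [410.0, 413.0, 412.0, 411.0, 414.0]
mean, half = mean_with_ci95(runs)
relative_error = half / mean
print(f"mean = {mean:.1f} uW, 95% CI = +/- {half:.2f} uW ({relative_error:.2%})")
```

With these placeholder values the half-width of the interval stays below 1% of the mean, which is the kind of check the "less than 1% error" statement suggests.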
3.2.2 Memory library

Because of the large number of transistors in a memory, the power dissipation due to leakage current is no longer negligible compared with the switching component. The power dissipation estimator integrated in the Compass CAD tool is therefore used to determine the switching and leakage components of the power dissipation. These terms depend on the memory size. The memory power dissipation is thus modeled by a static power dissipation Stat(S)·V_dd² and a dynamic effective capacitance C_dyn(S). No analytic expression is available, only values for each memory size.

3.2.3 Clock library

The clock tree capacitance C_Clk is estimated from the estimated number of registers N_Reg(S) and from the clock tree capacitance per register C_reg:

$$C_{Clk}(S) = N_{reg} \cdot C_{reg}$$

3.2.4 Library example

Table I shows a library extract, where T is the latency and P the estimated power dissipation, both for a 5.5 Volts supply voltage, and S is the area. The effective capacitance C_i of an operator can be calculated with expression (2) (P, V_dd and f are known). The latency at other supply voltages is estimated using expression (4) (V_t is a known device characteristic).

16 bits Operators              T (ns)   S (10^-3 mm^2)   P (µW)
Add (DataPath)                 14       73               412
Fast add (DataPath)            8.6      122              711
Add (VHDL)                     18.9     53               229
Fast add (VHDL)                9.9      117              626
Mult (DataPath)                32.3     1337             18827
Pipeline mult (DataPath)       22       1337             18821
Mult (VHDL)                    53.3     1023             11820
Fast mult (VHDL)               33.2     1270             16231
Register                       4        70               71
Clock per register             -        -                65
Memory 28×16 bits              20       1163             961      (static : 36 mW)

Table I : Library example (CMOS 1 µm technology)

3.3 Power dissipation estimation

The aim of this module is to give a fast power dissipation estimate calculated at a high level of abstraction. Currently, the method targets time-constrained DSP applications: the operators used in the circuit have an activation rate close to 100%.
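The two derivations mentioned for the library (effective capacitance from expression (2), latency scaling from expression (4)) can be sketched in Python as follows. The threshold voltage `V_T` and the characterization frequency `F_HZ` are assumptions chosen for illustration, since the paper only states that they are known, and the function names are hypothetical.

```python
V_REF = 5.5   # supply voltage (V) at which the Table I values are given
V_T = 0.8     # ASSUMPTION: device threshold voltage (V), not given in the paper
F_HZ = 10e6   # ASSUMPTION: frequency (Hz) at which P was characterized

def effective_capacitance(power_w, vdd=V_REF, f_hz=F_HZ):
    """Invert expression (2): C = P / (Vdd^2 * f), in farads."""
    return power_w / (vdd ** 2 * f_hz)

def delay_factor(vdd, vt=V_T):
    """Right-hand side of expression (4): Vdd / (Vdd - Vt)^2."""
    if vdd <= vt:
        raise ValueError("supply voltage must exceed the threshold voltage")
    return vdd / (vdd - vt) ** 2

def scaled_latency(t_ref_ns, vdd, v_ref=V_REF, vt=V_T):
    """Scale a latency known at v_ref to another supply voltage using (4)."""
    return t_ref_ns * delay_factor(vdd, vt) / delay_factor(v_ref, vt)

# Add (DataPath) from Table I: T = 14 ns, P = 412 µW at 5.5 V.
c_add = effective_capacitance(412e-6)
t_add_3v3 = scaled_latency(14.0, 3.3)
print(f"C_add = {c_add * 1e12:.2f} pF (under the assumed frequency)")
print(f"Add latency at 3.3 V = {t_add_3v3:.1f} ns (under the assumed V_T)")
```

Because (4) is only a proportionality, the sketch uses the ratio of the two delay factors, so the unknown proportionality constant cancels out.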
The first estimation step is the computation of the scheduling probability of each operation in the DFG [12]. This makes it possible to estimate the storage probability of each variable (transfer via a register, the memory and a register, or transfer via a register) and then the probable activity of the memory (N_mem) and of the registers (N_Reg). The probable number of registers can also be estimated, which allows the power dissipation of the clock tree to be estimated. The operator activity N_i is calculated on the DFG. The probable activity of the resources, combined with their average power dissipation available in the library, leads to the total power dissipation Power(S) (5):

$$Power(S) = V_{dd}^2 \left[ \frac{1}{T_r} \left( C_{Op}(S) + C_{Reg}(S) + C_{Dyn\_Mem}(S) \right) + \frac{1}{T_{Clk}} C_{Clk}(S) + Static_{Mem}(S) \right] \qquad (5)$$

with:

$$C_{Op}(S) = \sum_i N_i \cdot C_i, \qquad C_{Reg}(S) = N_{reg} \cdot C_{reg}, \qquad C_{Clk}(S) = N_{reg} \cdot C_{reg}, \qquad C_{Dyn\_Mem}(S) = N_{mem} \cdot C_{dyn}$$

where C_Op(S) and C_Reg(S) are respectively the effective capacitances of the operators and of the registers, each switching every T_r ns (time constraint). C_Clk(S) is the estimated clock tree capacitance, computed from the probable number of registers N_Reg(S) and the clock tree capacitance per register C_reg; it switches every T_Clk ns (internal clock period). Static_Mem(S)·V_dd² is the static power dissipation of the memory (the term Stat(S)·V_dd² of Section 3.2.2) and C_Dyn_Mem(S) its dynamic effective capacitance.

3.4 Optimization criteria

Each designer has a specific optimization problem: circuits for mobile electronic systems require low power dissipation, whereas other designs need a minimum-area circuit in order to be cheaper. The optimization criteria must therefore depend on both area and power (Figure 2).

[Figure 2 : Optimization criteria. Axes: Area, Performance, Power; time constraint (lower limit); optimization target: power or area]
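Expression (5) can be sketched in Python as follows, reading it as V_dd² × [(C_Op + C_Reg + C_Dyn_Mem)/T_r + C_Clk/T_Clk + Static_Mem]. All numerical values below are hypothetical and only illustrate the bookkeeping. The names `c_reg_unit` and `c_clk_unit` are introduced here to distinguish the register capacitance from the clock tree capacitance per register, which the text both calls C_reg.

```python
def operators_capacitance(activity, cap):
    """C_Op(S) = sum over operator types i of N_i * C_i."""
    return sum(activity[name] * cap[name] for name in activity)

def total_power(vdd, t_r, t_clk, c_op, c_reg, c_dyn_mem, c_clk, static_mem):
    """Expression (5): capacitances in F, periods in s, static_mem in W/V^2."""
    switched = (c_op + c_reg + c_dyn_mem) / t_r  # switches once per time constraint
    clock = c_clk / t_clk                        # switches once per clock period
    return vdd ** 2 * (switched + clock + static_mem)

# HYPOTHETICAL figures, for illustration only.
n_i = {"add": 4, "mult": 2}               # operations per iteration
c_i = {"add": 1.0e-12, "mult": 40.0e-12}  # effective capacitance per operation (F)
n_reg, c_reg_unit, c_clk_unit = 10, 0.5e-12, 0.2e-12
n_mem, c_dyn = 3, 5.0e-12

p = total_power(
    vdd=3.3, t_r=300e-9, t_clk=20e-9,
    c_op=operators_capacitance(n_i, c_i),
    c_reg=n_reg * c_reg_unit,
    c_dyn_mem=n_mem * c_dyn,
    c_clk=n_reg * c_clk_unit,
    static_mem=1.0e-4,
)
print(f"Power(S) = {p * 1e3:.3f} mW")
```

Since every term is multiplied by V_dd², doubling the supply voltage in this model multiplies the estimate by four, which is the quadratic dependence exploited by voltage scaling.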
Many functions can be integrated in our optimization criteria module. Cost functions that are linear in both area and power dissipation have been chosen; a factor α (0 ≤ α ≤ 1) sets the relative weight of power and area, and β is a normalization coefficient (6).

$$Cost_{\alpha}(S) = (1 - \alpha) \cdot Area(S) + \alpha \cdot \beta \cdot Power(S) \qquad (6)$$

where Area(S) is the area estimate (operators, registers, interconnections and bus [7]) and Power(S) is the power dissipation estimate (5).

3.5 Assignment

The assignment of operations to operators is an important task of architectural synthesis for decreasing power dissipation. A high switching factor at the operator inputs implies a high switching rate of the operator gates and hence a high power dissipation [5]. At a high level of abstraction, however, it is impossible to know the circuit architecture exactly and to calculate the activity rate of all operator inputs. Only two models of operator power dissipation are therefore used:
• both operator inputs change between two operations, with white noise inputs
• only one operator input changes between two operations, the other one is constant
The power dissipation is 25 to 50% greater in the first model than in the second, depending on the operator. The first model estimates the power dissipation of an operation on two variables, whereas the second model estimates that of an operation on a variable and a constant. At a high level of abstraction, it is therefore important to guide the architectural synthesis towards using the second model as often as possible in order to reduce the design power dissipation.

Consider the DFG example of Figure 3. A, B, C, D and E are variables whose values change every clock cycle, whereas Cst is a constant that keeps the same value at every clock cycle.
[Figure 3 : A DFG example, with four multiplications and two additions on the operands A, B, C, D, E, R and the constant Cst]

The multiplications of the DFG can be scheduled in different ways on one or more multipliers, and the operator power dissipation depends on this scheduling. With the power dissipation model used, the minimum power dissipation is clearly obtained when one multiplier is dedicated to the multiplications by the constant Cst: two multiplications are then done in the second model (Cst*R, Cst*C) and two in the first model (A*B and D*E). If only one multiplier is used, the operator power dissipation is greater: for example, if Cst*R is scheduled after A*B, both operator inputs change (A→Cst and B→R) and the multiplication is done in the first model. An assignment module detecting operations with a constant has therefore been developed. The Assignment module then decides whether or not to dedicate operators to operations with a constant so as to optimize the cost function (6) defined by the user (Optimization Criteria).

3.6 Module selection

Module selection is the second step in the HLS GAUT design flow (Figure 1). Its purpose is to determine, from a given library and for a given DSP application (an algorithm and a time constraint), the supply voltage and the operators set that optimize the design (cost functions defined above). The module selection presented here explores all the solutions built from different operators (ATP characteristics, mono- or multi-function, pipelined operators) and different supply voltages. The DFG is processed in the order defined in Figure 4.

To estimate the cost function (6), the operator allocation has to be obtained from the number of pipeline stages. This requires the operator latencies, and therefore the selected operators and the supply voltage (4). To simplify the ATP space exploration and reduce the number of cost estimations, the following lemma is used [7]: multi-functional units are necessary only if one of their functions is underused, i.e. if there is at least one mono-functional operator in the operators set S with an efficiency below 100%. A first step therefore provides the best selection of mono-functional operators. If mono-functional operators are underused, an optimization with multi-functional units is then carried out to reduce the circuit area.
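A minimal Python sketch of the weighted cost (6), assuming it has the form (1 − α)·Area + α·β·Power. The two candidate designs and the value of β are hypothetical and only show how α moves the optimum between an area-driven and a power-driven choice.

```python
def cost(area, power, alpha, beta):
    """Weighted cost: (1 - alpha) * area + alpha * beta * power."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return (1.0 - alpha) * area + alpha * beta * power

# HYPOTHETICAL candidates: (name, area in mm^2, power in mW).
candidates = [
    ("small_high_power", 4.0, 300.0),
    ("large_low_power", 6.0, 200.0),
]
BETA = 0.02  # ASSUMPTION: brings mW to the same order of magnitude as mm^2

def best(alpha):
    """Name of the candidate with the lowest cost for a given alpha."""
    return min(candidates, key=lambda c: cost(c[1], c[2], alpha, BETA))[0]

print(best(0.0))  # area is the only criterion
print(best(1.0))  # power is the only criterion
```

With α = 0 only the area counts and the smaller design is chosen; with α = 1 only the power counts and the lower-power design is chosen.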
In conclusion, our module selection algorithm finds the best operators set (mono- and multi-functional) for each supply voltage and the best supply voltage in a range from Vdd_min to Vdd_max (Figure 4) [13].

    for Supply voltage from Vdd_min to Vdd_max
        for all mono-functional operators set
            Latency time calculation of each selected operator (4)
            Number of pipeline stages calculation
            Allocation estimation in each pipeline stage
            Cost calculation (6)
        End mono-functional operators set
        Multifunctional operators resolution
        ⇒ Best operators set selection for each supply voltage
    End for Supply voltage
    ⇒ Best selection (Supply voltage and operators set)

Figure 4 : Module Selection Algorithm

The exploration can also be restricted to standard supply voltages (5, 3.3 ... Volts). This ATP exploration leads to the minimum of the cost function chosen by the designer.

4. Results

In this submission, only a few results on the DWT algorithm are presented. More results will be given in the final version of this paper, with the impact of each method (voltage scaling, selection of different operators, assignment) on the power dissipation.
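The exploration loop of Figure 4 can be sketched in Python as below. This is a deliberately simplified illustration and not the Gaut_w implementation: the allocation is a plain ceiling estimate instead of the probabilistic pipeline-stage estimation of [7, 12], the multi-functional step is omitted, and the operation counts, threshold voltage, β and the energy-per-operation model are assumptions. Only the library figures come from Table I.

```python
import itertools
import math

V_REF, V_T = 5.5, 0.8          # V_T is an ASSUMED threshold voltage (V)
T_R_NS = 300.0                 # time constraint per iteration (ns)
N_OPS = {"add": 8, "mult": 5}  # HYPOTHETICAL operation counts of the DFG
BETA = 1.0                     # HYPOTHETICAL normalization coefficient

# Extract of Table I: name -> (function, T ns, S 1e-3 mm^2, P µW) at 5.5 V.
LIBRARY = {
    "Add (DataPath)": ("add", 14.0, 73, 412),
    "Fast add (DataPath)": ("add", 8.6, 122, 711),
    "Mult (VHDL)": ("mult", 53.3, 1023, 11820),
    "Fast mult (VHDL)": ("mult", 33.2, 1270, 16231),
}

def latency(t_ref, vdd):
    """Scale a 5.5 V latency to vdd with expression (4)."""
    factor = lambda v: v / (v - V_T) ** 2
    return t_ref * factor(vdd) / factor(V_REF)

def evaluate(op_set, vdd, alpha):
    """Return (cost, area_mm2, power_mw) of one operators set, or None."""
    area_mm2 = power_mw = 0.0
    for name in op_set:
        func, t_ref, s, p = LIBRARY[name]
        t = latency(t_ref, vdd)
        if t > T_R_NS:
            return None  # operator too slow for the time constraint
        # SIMPLIFICATION: units needed to fit all operations in T_R_NS.
        units = math.ceil(N_OPS[func] * t / T_R_NS)
        area_mm2 += units * s * 1e-3
        # ASSUMPTION: energy per operation = P * T at 5.5 V, scaled by Vdd^2.
        energy_pj = p * t_ref * 1e-3 * (vdd / V_REF) ** 2
        power_mw += N_OPS[func] * energy_pj / T_R_NS  # pJ / ns = mW
    cost = (1 - alpha) * area_mm2 + alpha * BETA * power_mw
    return cost, area_mm2, power_mw

def module_selection(voltages, alpha):
    """Exhaustive search over supply voltages and mono-functional sets."""
    by_func = {}
    for name, (func, *_rest) in LIBRARY.items():
        by_func.setdefault(func, []).append(name)
    best = None
    for vdd in voltages:
        for op_set in itertools.product(*by_func.values()):
            result = evaluate(op_set, vdd, alpha)
            if result is not None and (best is None or result[0] < best[0][0]):
                best = (result, vdd, op_set)
    return best

(cost, area, power), vdd, ops = module_selection([5.5, 5.0, 3.3], alpha=0.5)
print(f"Vdd = {vdd} V, operators = {ops}, cost = {cost:.3f}")
```

The search is exhaustive, which stays cheap because the number of candidate sets is the product of the number of operators per function, evaluated once per supply voltage.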
4.1 Discrete Wavelet Transform algorithm

Many image transformations exist, such as the Fast Fourier Transform, the Discrete Cosine Transform and the wavelet transform. The latter has several interesting properties and is increasingly used. The wavelet transform used here is based on a high pass filter "g" and a low pass filter "h" [2]. Several steps are needed to compute the result, as shown in Figures 5 and 6. The H and G filters are first applied to the rows and then to the columns of the initial image. To compute the DWT with one more resolution level, the same computation is performed on a quarter of the image.

[Figure 5 : DWT with three resolution levels. The image C0 is filtered by H and G on rows, then on columns, with decimation (2:1 : keep one column out of 2; 1:2 : keep one row out of 2). Outputs: C-1, D-1h, D-1v, D-1d (first resolution level) and C-2, D-2h, D-2v, D-2d (second resolution level)]
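The row-then-column scheme described above can be sketched in plain Python as follows. The three-tap filters are placeholders and not the 9-3 or 9-7 linear phase filters of [2], and the periodic border handling is an assumption. The three detail outputs are left unnamed, since the text does not say which one corresponds to the h, v or d sub-image.

```python
def fir_periodic(signal, taps):
    """Centered FIR filter with periodic extension at the borders (assumption)."""
    n, half = len(signal), len(taps) // 2
    return [sum(t * signal[(k + j - half) % n] for j, t in enumerate(taps))
            for k in range(n)]

def filter_rows(image, taps):
    """Filter every row, then keep one column out of 2."""
    return [fir_periodic(row, taps)[::2] for row in image]

def filter_columns(image, taps):
    """Filter every column, then keep one row out of 2."""
    columns = [list(col) for col in zip(*image)]
    filtered = [fir_periodic(col, taps)[::2] for col in columns]
    return [list(row) for row in zip(*filtered)]

def dwt_level(image, h, g):
    """One resolution level: low-pass image followed by three detail images."""
    low, high = filter_rows(image, h), filter_rows(image, g)
    return (filter_columns(low, h),   # low-pass on rows and on columns
            filter_columns(low, g),
            filter_columns(high, h),
            filter_columns(high, g))

H = [0.25, 0.5, 0.25]   # PLACEHOLDER low pass taps
G = [-0.5, 1.0, -0.5]   # PLACEHOLDER high pass taps

image = [[float((r * 8 + c) % 7) for c in range(8)] for r in range(8)]
c1, d1_a, d1_b, d1_c = dwt_level(image, H, G)
c2, *_details = dwt_level(c1, H, G)   # next level works on a quarter of the image
print(len(c1), len(c1[0]), len(c2), len(c2[0]))
```

Each level returns four sub-images with half the rows and half the columns of its input, so the next level indeed works on a quarter of the data.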
[Figure 6 : Result after the DWT algorithm with two resolution levels. Sub-images: D-1d, D-1v, D-1h, D-2d, D-2h, D-2v, C-2]

Several h and g filters are possible; two pairs of linear phase FIR filters of different sizes have been chosen because of their good results [2]:
9-3 : 9th linear phase FIR filter (h), 3rd linear phase FIR filter (g)
9-7 : 9th linear phase FIR filter (h), 7th linear phase FIR filter (g)

The algorithm can be used with different numbers of resolution levels. The example treated here is that of 512×512 pixel images at a rate of 10 images per second, which implies a time constraint of 1/10 s = 100 ms per image. To satisfy this constraint, one algorithm is specified on an n×n block (algo_bloc) and another performs the computation on a single image point (algo_point). The power dissipation of both algorithms is compared right after their description: the power dissipation estimation module thus makes it possible to select, at the first step of the system design, the more suitable algorithm with respect to power dissipation. Table II gives the power dissipation estimates and the required CPU time on a Sparc Station. It has been shown on a motion detection application [11] that the absolute power dissipation error is less than 20% compared with the value obtained with the logic simulator Compass (after architectural and logic synthesis with Gaut [1,14]).

Results                          algo_bloc     algo_point
Power dissipation estimation     1131 mW       436 mW
CPU time on a Sparc Station      13 minutes    7 seconds

Table II : Comparison of two implementations of the DWT

At a high level of abstraction, the fast power dissipation estimation therefore allows two different implementations to be compared and the better algorithm to be chosen at an early stage of the design flow. These results show that the algo_point algorithm is better suited to power minimisation.

4.2 Module selection results

The algo_point algorithm is selected to illustrate the Module Selection results.
The circuit has to implement the algorithm with three resolution levels at 10 images per second. A processor that computes two FIR filters (g, a 7th, and h, a 9th linear phase FIR filter) (7) every 300 ns has therefore been specified:

[Equation (7): expressions of the two filter outputs y(n), for h and for g, as sums of coefficient-weighted input samples x; the indices are not recoverable from the source]

Table I in Section 3 is the part of the operators library used in our HLS tool for this application. The time constraint is 300 ns. Figure 7 shows the results (Power(S) (5), Area(S) and Cost(S) (6)) of the best operators set for each supply voltage, with α equal to 0.5.

[Figure 7 : Cost as a function of Vdd on a DWT algorithm]

The power dissipation decreases with the supply voltage because of its quadratic dependence on V_dd (1). However, when the supply voltage decreases, the operator latency (4) increases, which requires more operators to be allocated, and the design area increases. Table III shows the HLS GAUT tool results for some supply voltages: best operators set (Module selection), estimated operator power dissipation, total area and number of operators in the circuit after architectural synthesis.

Vdd (Volts)   Number of selected operators (latency time (ns))   Power (mW)   Area (mm^2)
4.9           1 Adder Datapath (20), 3 Mult Vhdl (70)            360          5.16
3.9           2 Adder Vhdl (40), 3 Mult Vhdl (90)                222          5.76
3.8           2 Add Vhdl (40), 2 Fast Mult Vhdl (60)             277          4.82
3.2           2 Add Datapath (40), 3 Fast Mult Vhdl (80)         199          6.14

Table III : Module selection results on a DWT algorithm
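One plausible way to relate the 300 ns time constraint to the image format is the following back-of-the-envelope computation in Python. It assumes one filter evaluation per image point at each resolution level, each level working on a quarter of the previous image. This accounting is an assumption and is not stated in the paper.

```python
IMAGE_SIDE = 512         # pixels per row and per column
IMAGES_PER_SECOND = 10
LEVELS = 3               # resolution levels

time_per_image_ns = 1e9 / IMAGES_PER_SECOND   # 100 ms per image
# ASSUMPTION: every level processes a quarter of the previous image.
points = sum((IMAGE_SIDE // 2 ** level) ** 2 for level in range(LEVELS))
budget_ns = time_per_image_ns / points
print(f"{points} points per image, {budget_ns:.1f} ns per point")
```

Under this assumption the budget per point is a little under 300 ns, which is of the same order as the time constraint used for the module selection results.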
The breaks in Figure 7 are explained by Table III. For a 3.9 Volts supply voltage, three multipliers and two adders are allocated, whereas for a 3.8 Volts supply voltage, two adders and only two multipliers are allocated. As a result, the area is smaller at 3.8 Volts whereas the estimated power dissipation is greater.

The cost function (6) depends on the α term, which represents the relative importance of area and power dissipation. Figure 8 shows the impact of varying this term on the cost function. The optimal supply voltage for 0.2<α<1 is around 3.8 Volts. This means that the supply voltage can be decreased from 5 Volts to 3.8 Volts without a significant area penalty (only a few per cent) and with a 40% reduction in power dissipation. The α term thus allows each user to optimize the design according to the needs of the application (area or power dissipation).

[Figure 8 : Alpha impact on the cost function]

Module Selection requires less than one CPU minute on a Sparc station. The method can therefore be applied to a large operators library and to complex applications without computation time problems.

5. Conclusion

A generic method for low power VLSI design, integrated in the Gaut_w HLS tool (Estimation and Optimization), has been presented. This new approach is modular and is applied before the architectural synthesis, so it can be used with any other architectural synthesis tool. A fast power dissipation estimation at a high level of abstraction has also been developed; it compares different time-constrained DSP algorithms at an early stage of the design. The results presented on DWT algorithms show that the high level power dissipation estimation quickly identifies the best algorithm, with a power saving factor close to 3 (1131 mW against 436 mW in Table II, a ratio of about 2.6). Moreover, a 40% power reduction can reasonably be expected from Module selection without a significant area penalty.
References :
[1] E. Martin, O. Sentieys, H. Dubois, J.-L. Philippe, "GAUT, an architectural synthesis tool for dedicated signal processors", Proceedings EURO-DAC 93.
[2] M. Antonini, M. Barlaud, P. Mathieu, I. Daubechies, "Image coding using wavelet transform", IEEE Trans. on Image Processing, Vol. 1, No 2, April 92, pp. 205-220.
[3] P. E. Landman, "Low-power architectural design methodologies", PhD Thesis, Berkeley, August 94.
[4] A. Chandrakasan, S. Sheng, R. W. Brodersen, "Low Power CMOS Digital Design", IEEE Journ. of Solid State Circuits, Vol. 27, No 4, pp. 473-484.
[5] P. E. Landman, J. M. Rabaey, "Architectural power analysis: the dual bit method", IEEE Trans. on VLSI Systems, Vol. 3, No 2, June 1995, pp. 173-187.
[6] A. Chandrakasan, M. Potkonjak, R. Mehra, J. Rabaey, R. Brodersen, "Optimising Power Using Transformations", IEEE Trans. on CAD and Systems, Vol. 14, No 1, pp. 12-30, Jan. 95.
[7] O. Sentieys, J.-Ph. Diguet, J. L. Philippe, E. Martin, "Hardware Module Selection for Real Time Pipeline Architectures using Probabilistic Estimation", 9th Annual IEEE Inter. ASIC Conf. and Exhibit, 1996, pp. 147-150.
[8] Z. S. Shen, C. C. Jong, "Exploring module selection space for architectural synthesis of low power designs", IEEE Int. Symp. on Circuits and Systems, Hong Kong, June 97, pp. 1532-1535.
[9] M. C. Johnson, K. Roy, "Scheduling and optimal voltage selection for Low-Power multi-voltage DSP Datapaths", IEEE Int. Symp. on Circuits and Systems, Hong Kong, June 97, pp. 2152-2155.
[10] R. Burch, F. Najm, P. Yang, T. Trick, "McPOWER : a Monte Carlo approach to power estimation", IEEE CAD 1992, pp. 90-97.
[11] S. Gailhard, O. Ingremeau, N. Julien, J.-Ph. Diguet, E. Martin, "Une méthode probabiliste pour estimer la consommation à un niveau algorithmique", Colloque CAO, Villard de Lans, Jan. 97, pp. 124-127.
[12] J.-Ph. Diguet, O. Sentieys, J. L. Philippe, E. Martin, "Probabilistic Resource Estimation for Pipeline Architecture", IEEE Workshop on VLSI S.P., Sakai, Japan, Oct. 95.
[13] S. Gailhard, O. Sentieys, N. Julien, E. Martin, "Area/Time/Power Space Exploration in Module Selection for DSP High Level Synthesis", Int. Workshop PATMOS'97, Louvain la Neuve, Belgium, Sept. 97, pp. 35-44.
[14] A. Baganne, J.-L. Philippe, E. Martin, "Hardware Interface Design For Real Time Embedded Systems", 7th IEEE GLS-VLSI 97, Urbana-Champaign, Illinois, March 1997.