Power reduction is one of the biggest challenges in modern systems and tends to become a severe issue dealing with complex scenarios. To provide high-performance and exibility, designers often opt for coarse-grained recongurable (CGR) systems. Nevertheless, these systems require specic attention to the power problem, since large set of resources may be underutilized while computing a certain task.
This paper focuses on this issue. Targeting CGR devices, we propose a way to model in advance power and clock gating costs on the basis of the functional, technological and architectural parameters of the baseline CGR system. The
proposed flow guides designers towards optimal implementations, saving designer eort and time.
This paper presents a recursive designing approach for high energy efficient carry select adder (CSA). Nowadays, the portable equipment’s like mobile phones and laptops have higher demands in the market. So, the designers must focus greater attention while designing such devices. Which means that have the devices must have lesser power consumption, low cost and have a better performance. The customers mainly focus on the equipment’s which have lesser power consumption, low cost and better performance. As we all know that the adders are the basic building block of microprocessors. The performance of the adders greatly influences the performance of those processors. The carry select adder is most suitable among other adders which have fast addition operation at low cost. The carry select adder (CSA) consists of chain full adders (FAs) and multiplexers. Here a carry select adder is designed with four FAs and four multiplexers. The proposed structure is assessed by the power consumption of the carry select adder using a 32-nm static CMOS technology for a wide range of supply voltages. The simulation results are obtained using Tanner EDA which reveals that the carry select adder has low power consumption.
This paper presents a recursive designing approach for high energy efficient carry select adder (CSA). Nowadays, the portable equipment’s like mobile phones and laptops have higher demands in the market. So, the designers must focus greater attention while designing such devices. Which means that have the devices must have lesser power consumption, low cost and have a better performance. The customers mainly focus on the equipment’s which have lesser power consumption, low cost and better performance. As we all know that the adders are the basic building block of microprocessors. The performance of the adders greatly influences the performance of those processors. The carry select adder is most suitable among other adders which have fast addition operation at low cost. The carry select adder (CSA) consists of chain full adders (FAs) and multiplexers. Here a carry select adder is designed with four FAs and four multiplexers. The proposed structure is assessed by the power consumption of the carry select adder using a 32-nm static CMOS technology for a wide range of supply voltages. The simulation results are obtained using Tanner EDA which reveals that the carry select adder has low power consumption.
Area Efficient Pulsed Clock Generator Using Pulsed Latch Shift RegisterIJMTST Journal
Shift Registers are building blocks used for the storage of data in many devices. Currently, Flip flops which have been used in Shift Registers consume more Power and impose a heavy load on Clock distribution networks. The Proposed work overcomes the Power consumption and reduces the delay by using the Pulsed Latches instead of the Flip flops. Conventional Latches-Static differential Sense Amplifier Shared Pulse Latch (SSASPL) has been used where the number of Transistors has been reduced. Trigger generator is used to give non overlap clock signals to the memory elements, which reduced the delay and produced the fast implementation of the data. The Power consumed reduces by 27% and delay reduces by 21% when compared to the Shift Registers using Flip flops.
Voltage stability Analysis using GridCalAnmol Dwivedi
Power system voltage stability is characterized as being capable of maintaining load voltage magnitudes within specified operating limits under steady state conditions. This presentation deals with the modeling of two standard power systems test cases i.e the Nordic-32 and the Nordic-68, comparing the power flows results obtained from GridCal against PSS/E, finding the respective P-V curves for the two test cases using the continuation power flow under contingencies, and finally proposing a graph-based test statistic which can be used for an imminent voltage instability. The simulations are carried out using an open-source power system software called GridCal and the scripts for this project are written in python.
Design of High Speed Low Power 15-4 Compressor Using Complementary Energy Pat...CSCJournals
This paper presents the implementation of a novel high speed low power 15-4 Compressor for high speed multiplication applications using single phase clocked quasi static adiabatic logic namely CEPAL (Complementary Energy Path Adiabatic Logic). The main advantage of this static adiabatic logic is the minimization of the 1/2CVth2 energy dissipation occurring every cycle in the multi-phase power-clocked adiabatic circuits. The proposed Compressor uses bit sliced architecture to exploit the parallelism in the computation of sum of 15 input bits by five full adders. The newly proposed Compressor is also centered around the design of a novel 5-3 Compressor that attempts to minimize the stage delays of a conventional 5-3 Compressor that is designed using single bit full adder and half adder architectures. Firstly, the performance characteristics of CEPAL 15-3 Compressor with 14 transistor and 10 transistor adder designs are compared against the conventional static CMOS logic counterpart to identify its adiabatic power advantage. The analyses are carried out using the industry standard Tanner EDA design environment using 250 nm technology libraries. The results prove that CEPAL 14T 15-4 Compressor is 68.11% power efficient, 75.31% faster over its static CMOS counterpart.
ECMFA 2015 - Energy Consumption Analysis and Design with Foundational UMLLuca Berardinelli
Wireless Sensor Networks (WSN) are nowadays applied to a
wide set of domains (e.g., security, health). WSN are networks of spatially distributed, radio-communicating, battery-powered, autonomous sensor nodes. WSN are characterized by scarcity of resources, hence an application running on them should carefully manage its resources. The most critical resource in WSN is the nodes’ battery.
In this paper, we propose model-based engineering facilities to analyze the energy consumption and to develop energy-aware applications for WSN that are based on Agilla Middleware. For this aim i) we extend the Agilla Instruction Set with the new battery instruction able to retrieve the battery Voltage of a WSN node at run-time; ii) we measure the energy that the execution of each Agilla instruction consumes on a target platform; and iii) we extend the Agilla Modeling Framework with a new analysis that, leveraging the conducted energy consumption measurements, predicts the energy required by the Agilla agents running on the WSN. Such analysis, implemented in fUML, is based on simulation and it guides the design of WSN applications that guarantee low energy consumption. The approach is showed on the Reader agent used in the Wild Fire Tracker Application.
Area Efficient Pulsed Clock Generator Using Pulsed Latch Shift RegisterIJMTST Journal
Shift Registers are building blocks used for the storage of data in many devices. Currently, Flip flops which have been used in Shift Registers consume more Power and impose a heavy load on Clock distribution networks. The Proposed work overcomes the Power consumption and reduces the delay by using the Pulsed Latches instead of the Flip flops. Conventional Latches-Static differential Sense Amplifier Shared Pulse Latch (SSASPL) has been used where the number of Transistors has been reduced. Trigger generator is used to give non overlap clock signals to the memory elements, which reduced the delay and produced the fast implementation of the data. The Power consumed reduces by 27% and delay reduces by 21% when compared to the Shift Registers using Flip flops.
Voltage stability Analysis using GridCalAnmol Dwivedi
Power system voltage stability is characterized as being capable of maintaining load voltage magnitudes within specified operating limits under steady state conditions. This presentation deals with the modeling of two standard power systems test cases i.e the Nordic-32 and the Nordic-68, comparing the power flows results obtained from GridCal against PSS/E, finding the respective P-V curves for the two test cases using the continuation power flow under contingencies, and finally proposing a graph-based test statistic which can be used for an imminent voltage instability. The simulations are carried out using an open-source power system software called GridCal and the scripts for this project are written in python.
Design of High Speed Low Power 15-4 Compressor Using Complementary Energy Pat...CSCJournals
This paper presents the implementation of a novel high speed low power 15-4 Compressor for high speed multiplication applications using single phase clocked quasi static adiabatic logic namely CEPAL (Complementary Energy Path Adiabatic Logic). The main advantage of this static adiabatic logic is the minimization of the 1/2CVth2 energy dissipation occurring every cycle in the multi-phase power-clocked adiabatic circuits. The proposed Compressor uses bit sliced architecture to exploit the parallelism in the computation of sum of 15 input bits by five full adders. The newly proposed Compressor is also centered around the design of a novel 5-3 Compressor that attempts to minimize the stage delays of a conventional 5-3 Compressor that is designed using single bit full adder and half adder architectures. Firstly, the performance characteristics of CEPAL 15-3 Compressor with 14 transistor and 10 transistor adder designs are compared against the conventional static CMOS logic counterpart to identify its adiabatic power advantage. The analyses are carried out using the industry standard Tanner EDA design environment using 250 nm technology libraries. The results prove that CEPAL 14T 15-4 Compressor is 68.11% power efficient, 75.31% faster over its static CMOS counterpart.
ECMFA 2015 - Energy Consumption Analysis and Design with Foundational UMLLuca Berardinelli
Wireless Sensor Networks (WSN) are nowadays applied to a
wide set of domains (e.g., security, health). WSN are networks of spatially distributed, radio-communicating, battery-powered, autonomous sensor nodes. WSN are characterized by scarcity of resources, hence an application running on them should carefully manage its resources. The most critical resource in WSN is the nodes’ battery.
In this paper, we propose model-based engineering facilities to analyze the energy consumption and to develop energy-aware applications for WSN that are based on Agilla Middleware. For this aim i) we extend the Agilla Instruction Set with the new battery instruction able to retrieve the battery Voltage of a WSN node at run-time; ii) we measure the energy that the execution of each Agilla instruction consumes on a target platform; and iii) we extend the Agilla Modeling Framework with a new analysis that, leveraging the conducted energy consumption measurements, predicts the energy required by the Agilla agents running on the WSN. Such analysis, implemented in fUML, is based on simulation and it guides the design of WSN applications that guarantee low energy consumption. The approach is showed on the Reader agent used in the Wild Fire Tracker Application.
Power gating is the main power reduction techniques for the static power. As long as technology scaling is taking place, static power becomes paramount important factor to the VLSI designs.Therefore Power gating is the recent power reduction technique that is actively in research areas.
The low power has been the main concern for the VLSI industry with the technology scaling in CMOS process from 130 nm to 22nm. The presentation here gives a brief idea about the several low power VLSI techniques being used in VLSI circuits to reduce the power and delay. for any query feel free to visit us at: http://www.siliconmentor.com/
Low Power VLSI design architecture for EDA (Electronic Design Automation) and Modern Power Estimation, Reduction and Fixing technologies including clock gating and power gating
Efficient Design of Ripple Carry Adder and Carry Skip Adder with Low Quantum ...IJERA Editor
The addition of two binary numbers is the important and most frequently used arithmetic process on
microprocessors, digital signal processors (DSP), and data-processing application-specific integrated circuits
(ASIC). Therefore, binary adders are critical structure blocks in very large-scale integrated (VLSI) circuits.
Their effective application is not trivial because a costly carry spread operation involving all operand bits has to
be achieved. Many different circuit constructions for binary addition have been planned over the last decades,
covering a wide range of presentation characteristics. In today era, reversibility has become essential part of
digital world to make digital circuits more efficient. In this paper, we have proposed a new method to reduce
quantum cost for ripple carry adder and carry skip adder. The results are simulated in Xilinx by using VHDL
language.
This paper investigates about the possibility to reduce power consumption in Neural Network using approximated computing techniques. Authors compare a traditional fixed-point neuron with an approximated neuron composed of approximated multipliers and adder. Experiments show that in the proposed case of study (a wine classifier) the approximated neuron allows to save up to the 43% of the area, a power consumption saving of 35% and an improvement in the maximum clock frequency of 20%.
Parallel Processing Technique for Time Efficient Matrix MultiplicationIJERA Editor
In this paper, we have proposed one designs for parallel-parallel input and single output (PPI-SO) matrix-matrix multiplication. In this design differs by high speed area efficient, throughput rate and user defined input format to match application needs. We have compared the proposed designs with the existing similar design and found that, the proposed designs offer higher throughput rate, reduce area at relatively lower hardware cost. We have synthesized the proposed design and the existing design using Xilinx software. Synthesis results shows that proposed design on average consumes nearly 30% less energy than the existing design and involves nearly 70% less area-delay-product than other. Interestingly, the proposed parallel-parallel input and single output (PPI-SO) structure consumes 40% less energy than the existing structure.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
High Performance Layout Design of SR Flip Flop using NAND Gates IJEEE
This paper presents optimized layout of SR flip flop using NAND gates on 90nm technology. The proposed SR flip flop has been designed using different technology namely fully-automatic design and semi-custom design. In the first approach layout has been generated using fully automatic technique with the help of SR flip flop schematic. In second approach layout has been generated manually by using one finger. In the last approach semicustom layout is further optimized by using two fingers. The area and power consumption of all the designs has been compared and analyzed. It can be observed from the simulated results that SR flip flop with two fingers and SR flip flop with one finger using semicustom technique consumes (62.92% , 91.20%) and (40.04%, 80.40%) less (area, power) as compare to the SR flip flop using fully automatic technique respectively.
Low Power High-Performance Computing on the BeagleBoard Platforma3labdsp
The ever increasing energy requirements of supercomputers and server farms is driving the scientific and industrial communities to take in deeper consideration the energy efficiency of computing equipments. This contribution addresses the issue proposing a cluster of ARM processors for high-performance computing. The cluster is composed of five BeagleBoard-xM, with one board managing the cluster, and the other boards executing the actual processing. The software platform is based on the Angstrom GNU/Linux distribution and is equipped with a distributed file system to ease sharing data and code among the nodes of the cluster, and with tools for managing tasks and monitoring the status of each node. The computational capabilities of the cluster have been assessed through High-Performance Linpack and a cluster-wide speaker diarization algorithm, while power consumption has been measured using a clamp meter. Experimental results obtained in the speaker diarization task showed that the energy efficiency of the BeagleBoard-xM cluster is comparable to the one of a laptop computer equipped with a Intel Core2 Duo T8300 running at 2.4 GHz. Furthermore, removing the bottleneck due to the Ethernet interface, the BeagleBoard-xM cluster is able to achieve a superior energy efficiency.
POWER GATING STRUCTURE FOR REVERSIBLE PROGRAMMABLE LOGIC ARRAYecij
Throughout the world, the numbers of researchers or hardware designer struggle for the reducing of
power dissipation in low power VLSI systems. This paper presented an idea of using the power gating
structure for reducing the sub threshold leakage in the reversible system. This concept presented in the
paper is entirely new and presented in the literature of reversible logics. By using the reversible logics for
the digital systems, the energy can be saved up to the gate level implementation. But at the physical level
designing of the reversible logics by the modern CMOS technology the heat or energy is dissipated due the
sub-threshold leakage at the time of inactivity or standby mode. The Reversible Programming logic array
(RPLA) is one of the important parts of the low power industrial applications and in this paper the physical
design of the RPLA is presented by using the sleep transistor and the results is shown with the help of
TINA- PRO software. The results for the proposed design is also compare with the CMOS design and
shown that of 40.8% of energy saving. The Transient response is also produces in the paper for the
switching activity and showing that the proposed design is much better that the modern CMOS design of
the RPLA.
POWER GATING STRUCTURE FOR REVERSIBLE PROGRAMMABLE LOGIC ARRAYecij
Throughout the world, the numbers of researchers or hardware designer struggle for the reducing of power dissipation in low power VLSI systems. This paper presented an idea of using the power gating structure for reducing the sub threshold leakage in the reversible system. This concept presented in the paper is entirely new and presented in the literature of reversible logics. By using the reversible logics for the digital systems, the energy can be saved up to the gate level implementation. But at the physical level designing of the reversible logics by the modern CMOS technology the heat or energy is dissipated due the
sub-threshold leakage at the time of inactivity or standby mode. The Reversible Programming logic array (RPLA) is one of the important parts of the low power industrial applications and in this paper the physical design of the RPLA is presented by using the sleep transistor and the results is shown with the help of TINA- PRO software. The results for the proposed design is also compare with the CMOS design and shown that of 40.8% of energy saving. The Transient response is also produces in the paper for the switching activity and showing that the proposed design is much better that the modern CMOS design of the RPLA.
AN EFFICIENT COUPLED GENETIC ALGORITHM AND LOAD FLOW ALGORITHM FOR OPTIMAL PL...ijiert bestjournal
A genetic algorithm is used in conjunction with an efficient load flow programme to determine the optimal locations of the predefined D Gs with MATLAB & MATPOWER software. The best location for the DGs is determin ed using the genetic algorithm. The branch electrical loss is considered as the objecti ve function and the system total loss represent the fitness evaluation function for drivi ng the GA. The load flow equations are considered as equality constraints and the equation s of nodal voltage and branch capacity are considered as inequality constraints. The approach is tested on a 9 &14 bus IEEE distribution feeder.
Similar to Power and Clock Gating Modelling in Coarse Grained Reconfigurable Systems (20)
The Reconfigurable Video Coding (RVC) framework is a recent ISO standard aiming at providing a unified specification of MPEG video technology in the form of a library of components. The word “reconfigurable” evokes run-time instantiation of different decoders starting from an on-the-fly analysis of the input bitstream.
In this paper we move a first step towards the definition of systematic procedures that, based on the MPEG RVC specification formalism, are able to produce multi-decoder platforms, capable of fast switching between different configurations. Looking at the similarities between the decoding algorithms to implement, the papers describes an automatic tool for their composition into a single configurable multi-decoder built of all the required modules, and able to reuse the shared components so as to reduce the overall footprint (either from a hardware or software perspective).
The proposed approach, implemented in C++ leveraging on Flex and Bison code generation tools, typically exploited in the compilers front-end, demonstrates to be successful in the composition of two different decoders MPEG-4 Part 2 (SP): serial and parallel.
The Multi-Dataflow Composer Tool: a Runtime Reconfigurable HDL Platform ComposerMDC_UNICA
Dataflow Model of Computation (D-MoC) is particularly suitable to close the gap between hardware architects and software developers. Leveraging on the combination of the D-MoC with a coarse-grained reconfigurable approach to hardware design, we propose a tool, the Multi-Dataflow Composer (MDC) tool, able to improve time-to-market of modern complex multi-purpose systems by allowing the derivation of HDL runtime reconfigurable platforms starting from the D-MoC models of the targeted set of applications. MDC tool has proven to provide a considerable on-chip area saving: the 82% of saving has been reached combining of different applications in the image processing domain, adopting a 90 nm CMOS technology. In future the MDC tool, with a very small integration effort, will also be extremely useful to create multi-standard codec platforms for MPEG RVC applications.
A Coarse-Grained Reconfigurable Wavelet Denoiser Exploiting the Multi-Dataflo...MDC_UNICA
In the last few years, efficient resource management turned out to be one of the major challenges for hardware designers. Strategies of reusability through reconfiguration have demonstrated interesting potentials to address it, providing also power and area minimization. The Multi-Dataflow Composer (MDC) tool has been presented to the scientific community to automatically build-up runtime coarse-grained reconfigurable platforms. Originally conceived in the field of Reconfigurable Video Coding (RVC), the MDC allows achieving attractive results also for hardware designers operating in different application fields. In this work, we intend to demonstrate the potential orthogonality of the MDC approach with respect to the RVC domain. A runtime reconfigurable wavelet denoiser, targeted for biomedical applications, has been developed and prototyped onto an FPGA Development Board Spartan 3E 1600. Impressive results in terms of resource minimization have been achieved with respect to more traditional solutions.
DSE and Profiling of Multi-Context Coarse-Grained Reconfigurable SystemsMDC_UNICA
The implementation of multi-context systems over coarse-grained reconfigurable platforms could bring several benefits in terms of efficient resource usage and power management. Nevertheless on-the-fly reconfiguration and mapping are not so straightforward and the optimal configuration of the substrate could be extremely time consuming. In this paper we present an early stage design space exploration methodology intended for dataflow-based design flows where multiple input specifications have to be taken into account. The proposed approach, coupled to the Multi-Dataflow Composer tool, has been exploited to assemble the central reconfigurable computing core of an accelerator for video/image processing.
Automatic Generation of Dataflow-Based Reconfigurable Co-processing UnitsMDC_UNICA
Hardware accelerators are widely adopted to speed up computationally onerous applications. However their design is not trivial, especially if multiple applications/kernels need to be served. To this aim the Multi-Dataflow Composer (MDC) tool can be adopted to generate the internal computing core of flexible and reconfigurable hardware accelerators. Nevertheless, MDC is not able, as it is, to deploy ready to use accelerators. To address this lack, we conceived a fully automated design flow for coarse-grained reconfigurable and memory-mapped hardware accelerators, which required: a) the definition of a generic co-processing template, the Template Interface Layer; b) the extension of the MDC tool to characterize such a template and deploy the accelerators. This methodology represents, within MPEG Reconfigurable Video Coding studies, the first framework for the automatic generation of reconfigurable hardware accelerators and, as it will be discussed, it may be beneficial also in other contexts of execution with fairly limited adjustments. Results validated the proposed approach in a real use case scenario, comparing the automatically generated co-processor with a previous custom one.
Automated Design Flow for Coarse-Grained Reconfigurable Platforms: an RVC-CAL...MDC_UNICA
pecialized hardware infrastructures for efficient multi-application runtime reconfigurable platforms require to address several issues. The higher is the system complexity, the more error prone and time consuming is the entire design flow. Moreover, system configuration along with resource management and mapping are challenging, especially when runtime adaptivity is required. In order to address these issues, the Reconfigurable Video Coding Group within the MPEG group has developed the MPEG RMC standards ISO/IEC 23001-4 and 23002-4, based on the dataflow Model of Computation. In this paper, we propose an integrated design flow, leveraging on Xronos, TURNUS, and the Multi-Dataflow Composer tool, capable of automatic synthesis and mapping of reconfigurable systems. In particular, an RVC MPEG-4 SP decoder and the RVC Intra MPEG-4 SP decoder have been implemented on the same coarse-grained reconfigurable platform, targeting a Xilinx Virtex 5 330 FPGA board. Results confirmed the potentiality of the approach, capable of completely preserving the single decoders functionality and of providing, in addition, considerable power/area benefits with respect to the parallel implementation of the considered decoders on the same platform.
Power-Awarness in Coarse-Grained Reconfigurable Designs: a Dataflow Based Str...MDC_UNICA
Applications and hardware complexity management in modern systems tend to collide with efficient resource and power balance. Therefore, dedicated and power-aware design frameworks are necessary to implement efficient multi-functional runtime reconfigurable signal processing platforms. In this work, we adopt dataflow specifications as a starting point to challenge power minimization.
Automated Power Gating Methodology for Dataflow-Based Reconfigurable SystemsMDC_UNICA
Modern embedded systems designers are required to implement efficient multi-functional applications, over portable platforms under strong energy and resources constraints. Automatic tools may help them in challenging such a complex scenario: to develop complex reconfigurable systems while reducing time-to-market. At the same time, automated methodologies can aid them to manage power consumption. Dataflow models of computation, thanks to their modularity, turned out to be extremely useful to these purposes. In this paper, we will demonstrate as they can be used to automatically achieve power management since the earliest stage of the design flow. In particular, we are focussing on the automation of power gating. The methodology has been evaluated on an image processing use case targeting an ASIC 90 nm CMOS technology.
Reconfigurable Coprocessors Synthesis in the MPEG-RVC DomainMDC_UNICA
Flexibility and high efficiency are common design drivers in the embedded systems domain. Coarse-grained reconfigurable coprocessors can tackle these issues, but they suffer of complex design, debugging and applications mapping problems. In this paper, we propose an automated design flow that aids developers in design and managing coarse-grained reconfigurable coprocessors. It provides both the hardware IP and the software drivers, featuring two different levels of coupling with the host processor. The presented solution has been tested on a JPEG codec, targeting a commercial Xilinx Virtex-5 FPGA.
Power Modelling for Saving Strategies in Coarse Grained Reconfigurable SystemsMDC_UNICA
In the context of coarse-grained reconfigurable systems we present a power estimation model to guide the designer in deciding which part of the design may benefit from the application of a power gating technique. The model is assessed by adopting a reconfigurable core for image processing targeting an ASIC 90 nm technology.
Adaptable AES Implementation with Power-Gating SupportMDC_UNICA
In this paper, we propose a reconfigurable design of the Advanced Encryption Standard capable of adapting at run-time to the requirements of the target application. Reconfiguration is achieved by activating only a specific subset of all the instantiated processing elements. Further, we explore the effectiveness of power gating and clock gating methodologies to minimize the energy consumption of the processing elements not involved in computation.
The RPCT project aims to define a software framework providing an efficient support for the automatic generation and management of dedicated reconfigurable systems. The starting points of this framework are the high-level specifications provided as dataflow models.
The main outcome of this research is the Multi Dataflow Composer (MDC) tool, which automatically generates a reconfigurable hardware platform. The MDC tool is extremely suitable in video and image processing contexts, portable platforms and bio-medical signal processing applications (i.e. implantable or wearable devices).
The RPCT research project has been developed within the Microelectronics and Bioengineering Lab (EOLAB) of the Department of Electrical and Electronic Engineering at the University of Cagliari.
Building a Raspberry Pi Robot with Dot NET 8, Blazor and SignalR - Slides Onl...Peter Gallagher
In this session delivered at Leeds IoT, I talk about how you can control a 3D printed Robot Arm with a Raspberry Pi, .NET 8, Blazor and SignalR.
I also show how you can use a Unity app on an Meta Quest 3 to control the arm VR too.
You can find the GitHub repo and workshop instructions here;
https://bit.ly/dotnetrobotgithub
MATHEMATICS BRIDGE COURSE (TEN DAYS PLANNER) (FOR CLASS XI STUDENTS GOING TO ...PinkySharma900491
Class khatm kaam kaam karne kk kabhi uske kk innings evening karni nnod ennu Tak add djdhejs a Nissan s isme sniff kaam GCC bagg GB g ghan HD smart karmathtaa Niven ken many bhej kaam karne Nissan kaam kaam Karo kaam lal mam cell pal xoxo
Power and Clock Gating Modelling in Coarse Grained Reconfigurable Systems
1. POWER AND CLOCK GATING
MODELLING
IN
COARSE GRAINED
RECONFIGURABLE SYSTEMS
Francesca Palumbo
University of Sassari
PolComIng
Information Engineering Unit
Tiziana Fanni, Carlo Sau, Paolo Meloni
and Luigi Raffo
University of Cagliari
Dept. of Electrical and Electronics Eng.
tiziana.fanni@diee.unica.it
ACM International Conference on
Computing Frontiers 2016
E
O
L
A
B
2. OUTLINE
Problem Statement
Background
Dataflow Model of Computation
Coarse Grain Reconfiguration
Multi-Dataflow Composer Tool
Proposed approach
Parameters Analysis
Power Estimation Model
Logic Regions Analysis
Metodology assesment
Conclusions
2
6. PROBLEM STATEMENT
Consumers need:
Integrated complex and fancy
resurce-intensive applications
Long battery life
Possible solutions:
3
7. PROBLEM STATEMENT
Consumers need:
Integrated complex and fancy
resurce-intensive applications
Long battery life
Dataflow Model of Computation
Modularity and parallelism
INTEGRATION AND RE-USABILITY
Possible solutions:
3
8. PROBLEM STATEMENT
Consumers need:
Integrated complex and fancy
resurce-intensive applications
Long battery life
Dataflow Model of Computation
Modularity and parallelism
INTEGRATION AND RE-USABILITY
Coarse-graine reconfiguration
Flexibility and resource sharing
MULTI-APPLICATION PORTABLE DEVICES
Possible solutions:
3
9. BACKGROUND
Dataflow Model of Computation
DATAFLOW FORMALISM
Directed graph of actors
(functional units).
Actors exchange tokens
(data packets) through
dedicated channels
4
10. BACKGROUND
Dataflow Model of Computation
DATAFLOW FORMALISM
Directed graph of actors
(functional units).
Actors exchange tokens
(data packets) through
dedicated channels
actions
state
4
11. BACKGROUND
Dataflow Model of Computation
DATAFLOW FORMALISM
Directed graph of actors
(functional units).
Actors exchange tokens
(data packets) through
dedicated channels
CHARACTERISTICS
Explicit the intrinsic application parallelism.
Modularity favours model
re-usability/adaptivity.
actions
state
4
37. PROPOSED APPROACH
LRs analysis
NO
YES
NO
YES
NO
YES
NO
YES
AREA
THRESHOLDING
ESTIMATE PG ESTIMATE CG
Select PDi to be
Implement with PG
Select PDi to be
Implement with CG
START
i=0
Increment i
STOP
1
2 3
∀PDi
i<N+1
Is Above?
Is there
Saving%?
Is there
Saving%?
LRs LR1 LR2 LR3
Area 2% 5.8% 23%
PG --- +2% -18%
CG +1% -5% ---
Area_th: 5%
Area < Area_th Evaluate CG
10
38. PROPOSED APPROACH
LRs analysis
NO
YES
NO
YES
NO
YES
NO
YES
AREA
THRESHOLDING
ESTIMATE PG ESTIMATE CG
Select PDi to be
Implement with PG
Select PDi to be
Implement with CG
START
i=0
Increment i
STOP
1
2 3
∀PDi
i<N+1
Is Above?
Is there
Saving%?
Is there
Saving%?
LRs LR1 LR2 LR3
Area 2% 5.8% 23%
PG --- +2% -18%
CG +1% -5% ---
Area_th: 5%
Area < Area_th Evaluate CG
Power CG: +1% Discard
10
39. PROPOSED APPROACH
LRs analysis
NO
YES
NO
YES
NO
YES
NO
YES
AREA
THRESHOLDING
ESTIMATE PG ESTIMATE CG
Select PDi to be
Implement with PG
Select PDi to be
Implement with CG
START
i=0
Increment i
STOP
1
2 3
∀PDi
i<N+1
Is Above?
Is there
Saving%?
Is there
Saving%?
LRs LR1 LR2 LR3
Area 2% 5.8% 23%
PG --- +2% -18%
CG +1% -5% ---
Area_th: 5%
Area < Area_th Evaluate CG
Power CG: +1% Discard
10
40. PROPOSED APPROACH
LRs analysis
NO
YES
NO
YES
NO
YES
NO
YES
AREA
THRESHOLDING
ESTIMATE PG ESTIMATE CG
Select PDi to be
Implement with PG
Select PDi to be
Implement with CG
START
i=0
Increment i
STOP
1
2 3
∀PDi
i<N+1
Is Above?
Is there
Saving%?
Is there
Saving%?
LRs LR1 LR2 LR3
Area 2% 5.8% 23%
PG --- +2% -18%
CG +1% -5% ---
Area_th: 5%
Area < Area_th Evaluate CG
Power CG: +1% Discard
Area > Area_th Evaluate PG
10
41. PROPOSED APPROACH
LRs analysis
NO
YES
NO
YES
NO
YES
NO
YES
AREA
THRESHOLDING
ESTIMATE PG ESTIMATE CG
Select PDi to be
Implement with PG
Select PDi to be
Implement with CG
START
i=0
Increment i
STOP
1
2 3
∀PDi
i<N+1
Is Above?
Is there
Saving%?
Is there
Saving%?
LRs LR1 LR2 LR3
Area 2% 5.8% 23%
PG --- +2% -18%
CG +1% -5% ---
Area_th: 5%
Area < Area_th Evaluate CG
Power CG: +1% Discard
Area > Area_th Evaluate PG
Power PG: +2% Evaluate CG
10
42. PROPOSED APPROACH
LRs analysis
NO
YES
NO
YES
NO
YES
NO
YES
AREA
THRESHOLDING
ESTIMATE PG ESTIMATE CG
Select PDi to be
Implement with PG
Select PDi to be
Implement with CG
START
i=0
Increment i
STOP
1
2 3
∀PDi
i<N+1
Is Above?
Is there
Saving%?
Is there
Saving%?
LRs LR1 LR2 LR3
Area 2% 5.8% 23%
PG --- +2% -18%
CG +1% -5% ---
Area_th: 5%
Area < Area_th Evaluate CG
Power CG: +1% Discard
Area > Area_th Evaluate PG
Power PG: +2% Evaluate CG
Power CG: -5% Apply CG
10
43. PROPOSED APPROACH
LRs analysis
NO
YES
NO
YES
NO
YES
NO
YES
AREA
THRESHOLDING
ESTIMATE PG ESTIMATE CG
Select PDi to be
Implement with PG
Select PDi to be
Implement with CG
START
i=0
Increment i
STOP
1
2 3
∀PDi
i<N+1
Is Above?
Is there
Saving%?
Is there
Saving%?
LRs LR1 LR2 LR3
Area 2% 5.8% 23%
PG --- +2% -18%
CG +1% -5% ---
Area_th: 5%
Area < Area_th Evaluate CG
Power CG: +1% Discard
Area > Area_th Evaluate PG
Power PG: +2% Evaluate CG
Power CG: -5% Apply CG
10
44. PROPOSED APPROACH
LRs analysis
NO
YES
NO
YES
NO
YES
NO
YES
AREA
THRESHOLDING
ESTIMATE PG ESTIMATE CG
Select PDi to be
Implement with PG
Select PDi to be
Implement with CG
START
i=0
Increment i
STOP
1
2 3
∀PDi
i<N+1
Is Above?
Is there
Saving%?
Is there
Saving%?
LRs LR1 LR2 LR3
Area 2% 5.8% 23%
PG --- +2% -18%
CG +1% -5% ---
Area_th: 5%
Area < Area_th Evaluate CG
Power CG: +1% Discard
Area > Area_th Evaluate PG
Power PG: +2% Evaluate CG
Power CG: -5% Apply CG
Area > Area_th Evaluate PG
10
45. PROPOSED APPROACH
LRs analysis
NO
YES
NO
YES
NO
YES
NO
YES
AREA
THRESHOLDING
ESTIMATE PG ESTIMATE CG
Select PDi to be
Implement with PG
Select PDi to be
Implement with CG
START
i=0
Increment i
STOP
1
2 3
∀PDi
i<N+1
Is Above?
Is there
Saving%?
Is there
Saving%?
LRs LR1 LR2 LR3
Area 2% 5.8% 23%
PG --- +2% -18%
CG +1% -5% ---
Area_th: 5%
Area < Area_th Evaluate CG
Power CG: +1% Discard
Area > Area_th Evaluate PG
Power PG: +2% Evaluate CG
Power CG: -5% Apply CG
Area > Area_th Evaluate PG
Power CG: -18% Apply PG
10
65. CONCLUSIONS
Advantages of the proposed approach
Given a CGR design with n kernel and k LRs
The proposed approach requires uniquely the
analysis of the baseline CGR design:
one synthesis
n simulations (one foreach kernel)
14
66. CONCLUSIONS
Advantages of the proposed approach
Given a CGR design with n kernel and k LRs
The proposed approach requires uniquely the
analysis of the baseline CGR design:
one synthesis
n simulations (one foreach kernel)
Otherwise:
(2*k+1) (one PG and one CG for each LR + one of
Base the design)
(2*k+1)*n simulation (n simulation for each design) 14
67. ACKNOWLEDGEMENTS
The research leading to these results has
received funding from:
the Region of Sardinia L.R.7/2007 under grant
agreement CRP-18324 [RPCT Project].
the Region of Sardinia, Young Researchers Grant, POR
Sardegna FSE 2007-2013, L.R.7/2007 “Promotion of the
scientific research and technological innovation in
Sardinia”
15
68. Francesca Palumbo
University of Sassari
PolComIng
Information Engineering Unit
Tiziana Fanni, Carlo Sau, Paolo Meloni
and Luigi Raffo
University of Cagliari
Dept. of Electrical and Electronics Eng.
E
O
L
A
B
?
ACM International Conference on
Computing Frontiers 2016
tiziana.fanni@diee.unica.it
69. READ FULL ARTICLE
ACM Digital Library
http://dl.acm.org/citation.cfm?id=2911713
RPCT Project Website:
http://sites.unica.it/rpct/references/
70. REFERENCE
@inproceedings{Fanni_CF2016,
author = {Fanni, Tiziana and Sau, Carlo and Meloni, Paolo
and Raffo, Luigi and Palumbo, Francesca},
title = {Power and Clock Gating Modelling in Coarse
Grained Reconfigurable Systems},
booktitle = {Proceedings of the ACM International
Conference on Computing Frontiers},
series = {CF '16},
year = {2016},
location = {Como, Italy},
doi = {10.1145/2903150.2911713},
publisher = {ACM},
address = {New York, NY, USA}, }
BibTex