International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews across the whole field of engineering science and technology, including new teaching methods, assessment, validation and the impact of new technologies, and it will continue to provide information on the latest trends and developments in this ever-expanding subject. Papers are selected for publication through double peer review to ensure originality, relevance, and readability. The articles published in our journal can be accessed online.
Bandpass Filter in S-Band by D.C. Vaghela, LJIET, Ahmedabad, Gujarat. Dipak Vaghela
This paper presents the design of a bandpass filter centered at 2.5 GHz. The application lies in the S band, where the 2.5 GHz center frequency is currently used by Indian Regional Navigation Satellite System (IRNSS) receivers. The filter covers the 2.5 GHz center frequency with a bandwidth of 80 MHz. The project began with a theoretical study of the various filter types and their applications, after which a suitable type was selected. The filter passes the desired frequencies within the band and blocks unwanted frequencies. In addition, filters are needed to remove harmonics present in the communication system. The filter was designed and simulated using ADS (Advanced Design System) software.
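As a rough illustration of the numbers quoted above, the following sketch derives the passband edges and the fractional bandwidth from the 2.5 GHz center frequency and the 80 MHz bandwidth. It assumes the passband is arithmetically symmetric about the center frequency, which the abstract does not state; the function names are hypothetical.

```python
# Figures taken from the abstract: 2.5 GHz center frequency, 80 MHz bandwidth.
F_CENTER_HZ = 2.5e9
BANDWIDTH_HZ = 80e6


def band_edges(f_center_hz, bandwidth_hz):
    """Return (lower, upper) passband edges in Hz.

    Assumption: the band is arithmetically symmetric about the center.
    """
    half = bandwidth_hz / 2.0
    return f_center_hz - half, f_center_hz + half


def fractional_bandwidth(f_center_hz, bandwidth_hz):
    """Return the bandwidth as a fraction of the center frequency."""
    return bandwidth_hz / f_center_hz


lower_hz, upper_hz = band_edges(F_CENTER_HZ, BANDWIDTH_HZ)
fbw = fractional_bandwidth(F_CENTER_HZ, BANDWIDTH_HZ)

print(f"passband: {lower_hz / 1e9:.2f} GHz to {upper_hz / 1e9:.2f} GHz")
print(f"fractional bandwidth: {fbw * 100:.1f} %")
```

Under that assumption the passband runs from 2.46 GHz to 2.54 GHz, a fractional bandwidth of about 3.2%, which places this design in the narrowband category.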
COMPARATIVE ANALYSIS OF ROUTE INFORMATION BASED ENHANCED DIVIDE AND RULE STRA... ijsc
In wireless sensor networks, routing data efficiently to the base station is a major issue, and researchers have devised a number of routing algorithms for this purpose. Clustering plays a very important role in the design and development of wireless sensor networks, both for distributing the network well and for routing data efficiently. In this paper, we enhance the divide-and-rule strategy, which is essentially a route information protocol based on static clustering and dynamic cluster head selection. Simulation results show that our technique outperforms DR, LEACH, and AODV in terms of packet loss, delay, and throughput.
In this paper, a lowpass filter based on a T-shaped resonator is presented. The T-shaped resonator consists of meandered lines and rectangular patches. The LC model and transfer function of the proposed resonator are also presented. To suppress spurious harmonics, a bandstop structure consisting of hexangular patches and open stubs is used. Finally, the wide-stopband microstrip lowpass filter, with a cutoff frequency of 2.72 GHz, has been simulated, fabricated and measured. The LPF has good characteristics, such as a wide stopband and an insertion loss lower than 0.18 dB in the passband. The rejection level is below -20 dB from 2.98 GHz up to 21.3 GHz. The filter size is 10.5 mm × 12.7 mm, or 0.131 λg × 0.158 λg, where λg is the guided wavelength. The measured and simulated results of the filter are in good agreement, demonstrating the merits of low insertion loss and a wide stopband.
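The two size figures above can be cross-checked against each other. The sketch below back-computes the guided wavelength λg from each physical dimension and its quoted electrical length. It is only an arithmetic consistency check on the rounded values in the abstract, and the function name is hypothetical.

```python
def guided_wavelength_mm(physical_mm, electrical_length):
    """Return the guided wavelength implied by a physical length (mm)
    and the same length expressed as a fraction of the guided wavelength."""
    return physical_mm / electrical_length


# Values quoted in the abstract: 10.5 mm x 12.7 mm, 0.131 x 0.158 guided wavelengths.
lambda_from_width = guided_wavelength_mm(10.5, 0.131)
lambda_from_length = guided_wavelength_mm(12.7, 0.158)

print(f"guided wavelength from 10.5 mm side: {lambda_from_width:.1f} mm")
print(f"guided wavelength from 12.7 mm side: {lambda_from_length:.1f} mm")
```

Both dimensions imply a guided wavelength of roughly 80 mm. The small difference between the two estimates is consistent with rounding in the quoted figures.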
High performance novel dual stack gating technique for reduction of ground bo... eSAT Journals
Abstract The development of digital integrated circuits is challenged by higher power consumption. The combination of higher clock speeds, greater functional integration, and smaller process geometries has contributed to significant growth in power density. Today leakage power has become an increasingly important issue in processor hardware and software design. To reduce leakage in the circuit, many low-power strategies have been identified and evaluated experimentally. However, the leakage due to the ground connection to the active part of the circuit is much higher than all other leakages. Because it is mainly due to the back EMF of the ground connection, we call it ground bounce noise. Different methodologies have been designed to reduce this noise. This paper addresses a number of critical considerations in sleep transistor design and implementation, including header or footer switch selection, sleep transistor distribution choices, and sleep transistor gate length, width and body bias optimization for area, leakage and efficiency. A novel dual stack technique is proposed that reduces not only leakage power but also dynamic power. The previous techniques are summarized and compared with this new approach, and the comparison is carried out with the Digital Schematic (DSCH) and Microwind low-power tools. The stacking power gating technique has been analyzed, and the conditions for the important design parameters (minimum ground bounce noise) have been derived. Monte Carlo simulation is performed in Microwind to calculate the values of all the parameters needed for the comparison. Index Terms: Ground Bounce Noise, Power gating schemes, Static power dissipation, Dynamic power dissipation, Power gating parameters, Sleep transistors, Novel dual stack approach, Transistor leakage power
The International Journal of Engineering and Science (The IJES) theijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
Evaluation the affects of mimo based rayleigh network cascaded with unstable ... eSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academicians, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
CONCURRENT TERNARY GALOIS-BASED COMPUTATION USING NANO-APEX MULTIPLEXING NIBS... VLSICS Design
Novel realizations of concurrent computations utilizing three-dimensional lattice networks and their
corresponding carbon-based field-emission controlled switching are introduced in this article. The
formalistic ternary nano-based implementation utilizes recent findings in field emission and nano
applications, which include carbon-based nanotubes and nanotips for three-valued lattice computing via
field-emission methods. The presented work implements multi-valued Galois functions by utilizing
concurrent nano-based lattice systems, which use two-to-one controlled switching via carbon-based field
emission devices built from the nano-apex carbon fibers and carbon nanotubes that were presented in the first
part of the article. The introduced computational extension utilizing many-to-one carbon field-emission
devices will be further utilized in implementing congestion-free architectures within the third part of the
article. The emerging nano-based technologies form important directions in low-power, compact-size
regular lattice realizations, in which carbon-based devices switch at lower cost and more reliably, using
much less power than silicon-based devices. Applications include low-power design of VLSI circuits for
signal processing and control of autonomous robots.
Erca energy efficient routing and reclustering aciijournal
The pervasive application of wireless sensor networks (WSNs) is challenged by the scarce energy of sensor nodes. En-route filtering schemes, especially commutative cipher based en-route filtering (CCEF), can save energy while offering better filtering capacity. However, this approach suffers from fixed paths and from inefficient underlying routing designed for ad-hoc networks. Moreover, as the number of remaining sensor nodes decreases, the probability of network partition increases. In this paper, we propose an energy-efficient routing and re-clustering algorithm (ERCA) to address these limitations. In the proposed scheme, when the number of sensor nodes falls to a certain threshold, the cluster size and transmission range are adjusted dynamically to maintain cluster node density. Performance results show that our approach demonstrates filtering power, better energy efficiency, and an average gain of over 285% in network lifetime.
A REVIEW OF LOW POWER AND AREA EFFICIENT FSM BASED LFSR FOR LOGIC BIST jedt_journal
Built-in self-test (BIST) circuits enable an integrated circuit to test itself. BIST reduces test and maintenance costs for an integrated circuit by eliminating the need for expensive test equipment. BIST also allows an integrated circuit to be tested at its normal operating speed, which is very important for detecting timing faults. Despite all of these advantages, BIST has seen limited use in industry because of area and performance overhead and increased design time. This paper presents automated techniques for implementing BIST in a way that minimizes area and performance overhead. This approach allows at-speed test patterns to be applied and eliminates the need for an external tester. Proper design of the test pattern generator helps reduce the power consumption of the circuit under test (CUT) and the overall power consumption of the BIST circuitry. We propose an FSM-based LFSR that generates maximum correlation among the patterns and apply it to the c432, c1908 and c3540 benchmark circuits to validate test power. It achieves a significant power improvement of up to 15% compared with a conventional test generator, together with optimal area overhead. The design is written in Verilog HDL and implemented using Xilinx 14.3 and Cadence tools.
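The abstract above does not describe the structure of the FSM-based LFSR, so the following is only a sketch of a conventional Fibonacci LFSR of the kind used as a baseline test pattern generator. The 4-bit width, the tap positions, the seed and the `toggles` helper are all assumptions chosen for illustration. Counting bit flips between consecutive patterns is a common rough proxy for switching activity, not necessarily the metric used in the paper.

```python
from itertools import islice


def lfsr_patterns(seed, taps, width):
    """Yield successive states of a left-shifting Fibonacci LFSR.

    seed  -- non-zero initial state
    taps  -- bit positions (0 = least significant) XORed to form the feedback
    width -- register width in bits
    """
    mask = (1 << width) - 1
    state = seed & mask
    while True:
        yield state
        feedback = 0
        for tap in taps:
            feedback ^= (state >> tap) & 1
        # Shift left by one and insert the feedback bit at position 0.
        state = ((state << 1) | feedback) & mask


def toggles(a, b):
    """Number of bit positions that differ between two patterns."""
    return bin(a ^ b).count("1")


# Hypothetical 4-bit configuration: taps at bits 3 and 2 give a maximal-length sequence.
WIDTH = 4
SEED = 0b0001
patterns = list(islice(lfsr_patterns(SEED, taps=(3, 2), width=WIDTH), 16))

period = patterns[:15]
total_toggles = sum(toggles(a, b) for a, b in zip(patterns, patterns[1:]))

print([format(p, "04b") for p in period])
print("toggles over one period:", total_toggles)
```

With these taps the register visits all 15 non-zero states before repeating. A pattern generator designed for higher correlation between consecutive patterns would aim for a lower toggle count over the same number of patterns.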
Multilayered low pass microstrip filter using csrr eSAT Journals
Abstract The multi-tracking system is a real-time tracking platform that integrates technologies such as GPS and GSM. The platform supports multiple tracking devices for a variety of applications, such as live vehicle tracking, personal tracking and asset tracking. The GPS device installed in the vehicle moves with the vehicle, calculates the coordinates and other related information at each position, and transmits this information via GSM to the tracking server, which stores it in the database. The data can then be viewed on an electronic map, i.e., Google Maps, via the Internet, providing up-to-date information. The proposed system also supports real-time control: for example, if the owner sends an SMS, it automatically turns off the vehicle's ignition or performs other actions. The overall system will be implemented in Microsoft .NET technology, with C#.NET used for the system components and ASP.NET for the web-based parts. Keywords: GPS, GSM, SMS, Socket Listener, Tracking server.
MODELLING AND SIMULATION OF 128-BIT CROSSBAR SWITCH FOR NETWORK-ON-CHIP VLSICS Design
It is widely accepted that Network-on-Chip (NoC) represents a promising solution for forthcoming complex embedded systems. Current SoC solutions are built from heterogeneous hardware and software components integrated around a complex communication infrastructure. The crossbar is a vital component of any NoC router. In this work, we have designed a crossbar interconnect for serial-bit data transfer and for 128-bit parallel data transfer. We compare power and delay for serial-bit and parallel-bit data transfer through the crossbar switch. The design is implemented in 0.180 micron TSM technology. The bit rate achieved in serial transfer is low compared with parallel data transfer. The simulation results show that the critical path delay is lower for parallel-bit data transfer, but the power dissipation is higher.
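To make the serial versus parallel comparison concrete, here is a small behavioural sketch. It models a crossbar as a table that selects which input drives each output, and counts the clock cycles needed to move a 128-bit word over a 1-bit lane versus a 128-bit lane. The port count, the selection table and the one-bit-per-lane-per-cycle timing are assumptions for illustration; the sketch says nothing about the power or critical path figures reported in the paper.

```python
def crossbar(inputs, select):
    """Route inputs to outputs: output j is driven by inputs[select[j]]."""
    return [inputs[source] for source in select]


def transfer_cycles(word_bits, lane_bits):
    """Clock cycles to move one word, assuming lane_bits bits move per cycle."""
    return -(-word_bits // lane_bits)  # ceiling division


WORD_BITS = 128

# Hypothetical 4-port crossbar: each output picks a different input.
routed = crossbar(["A", "B", "C", "D"], select=[2, 0, 3, 1])

serial_cycles = transfer_cycles(WORD_BITS, lane_bits=1)
parallel_cycles = transfer_cycles(WORD_BITS, lane_bits=128)

print("routed outputs:", routed)
print("serial cycles:", serial_cycles, "parallel cycles:", parallel_cycles)
```

Under these assumptions the serial link needs 128 cycles per word while the parallel link needs one, which illustrates why the serial bit rate is reported as lower. The parallel design pays for this with 128 switched wires per port, in line with the higher power dissipation noted above.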
Hardware Complexity of Microprocessor Design According to Moore's Law csandit
The increase in the number of transistors on a chip, which plays the main role in improving the performance and speed of a microprocessor, causes a rapid increase in microprocessor design complexity. According to Moore's Law, the number of transistors should double every 24 months. This doubling of the transistor count increases microprocessor design complexity, power dissipation, and the cost of the design effort.
This article presents a proposal to discuss the scaling of the hardware complexity of a microprocessor design in relation to Moore's Law. Based on the discussion, a hardware complexity measure is presented.
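The doubling rule quoted above can be written as N(t) = N0 · 2^(t/24), with t in months. The sketch below evaluates it for a hypothetical starting count. The starting value and the function name are assumptions, and the code illustrates only the growth rule, not the complexity measure proposed in the article.

```python
DOUBLING_PERIOD_MONTHS = 24  # figure quoted in the abstract


def projected_transistors(initial_count, months):
    """Transistor count after `months`, doubling every 24 months."""
    return initial_count * 2 ** (months / DOUBLING_PERIOD_MONTHS)


# Hypothetical starting point of one million transistors.
start = 1_000_000
after_4_years = projected_transistors(start, 48)
after_10_years = projected_transistors(start, 120)

print(f"after 4 years:  {after_4_years:,.0f}")
print(f"after 10 years: {after_10_years:,.0f}")
```

Two doublings in four years give a fourfold increase, and five doublings in ten years give a 32-fold increase, which shows how quickly design complexity compounds under this rule.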
MICROSTRIP COUPLED LINE FILTER DESIGN FOR ULTRA WIDEBAND APPLICATIONS jmicro
A compact microstrip parallel coupled line filter for ultra-wideband applications is proposed, combining a network of coupled lines with a defected ground. The design equations for three and five interconnected networks are derived and implemented. Simulations of three different filter configurations are optimized. Three prototype circuits are then constructed: a bandpass filter with a center frequency of 2.25 GHz and two different bandpass filters (in terms of perturbations) with a center frequency of 2.33 GHz.
For the 2.25 GHz circuit, a wide fractional bandwidth of about 90% is obtained, but an undesired high return loss exists. For the 2.33 GHz circuit with grooves in the sides, a fractional bandwidth of about 60% is obtained at a center frequency of about 3.4 GHz. Undesired return loss exists for this circuit, although good out-of-band performance is observed. For the 2.33 GHz circuit with grooves in all sections, the center frequency shifts to about 3.4 GHz and a fractional bandwidth of about 50% is obtained, with very good out-of-band performance.
Performance Analysis of Rake Receivers in IR–UWB System IOSR Journals
Suppression of interference in time-domain equalizers is attempted for a high data rate impulse
radio (IR) ultra-wideband (UWB) communication system. Narrowband systems may interfere with UWB
devices, because UWB has very low transmission power and a large bandwidth. The SRAKE receiver improves system
performance by equalizing signals from different paths. This enables the use of SRAKE receiver techniques in IR
UWB systems. A semi-analytical approach is used to investigate the BER performance of the SRAKE receiver on
IEEE 802.15.3a UWB channel models. A study of non-line-of-sight indoor channel models (both CM3 and CM4)
shows that the bit error rate performance of the SRAKE receiver with NBI is better than that of the Rake receiver
without NBI.
The International Journal of Engineering and Science (The IJES)theijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
Evaluation the affects of mimo based rayleigh network cascaded with unstable ...eSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
CONCURRENT TERNARY GALOIS-BASED COMPUTATION USING NANO-APEX MULTIPLEXING NIBS...VLSICS Design
Novel realizations of concurrent computations utilizing three-dimensional lattice networks and their
corresponding carbon-based field emission controlled switching is introduced in this article. The
formalistic ternary nano-based implementation utilizes recent findings in field emission and nano
applications which include carbon-based nanotubes and nanotips for three-valued lattice computing via
field-emission methods. The presented work implements multi-valued Galois functions by utilizing
concurrent nano-based lattice systems, which use two-to-one controlled switching via carbon-based field
emission devices by using nano-apex carbon fibers and carbon nanotubes that were presented in the first
part of the article. The introduced computational extension utilizing many-to-one carbon field-emission
devices will be further utilized in implementing congestion-free architectures within the third part of the
article. The emerging nano-based technologies form important directions in low-power compact-size
regular lattice realizations, in which carbon-based devices switch less-costly and more-reliably using
much less power than silicon-based devices. Applications include low-power design of VLSI circuits for
signal processing and control of autonomous robots.
CONCURRENT TERNARY GALOIS-BASED COMPUTATION USING NANO-APEX MULTIPLEXING NIBS...VLSICS Design
Novel realizations of concurrent computations utilizing three-dimensional lattice networks and their
corresponding carbon-based field emission controlled switching is introduced in this article. The
formalistic ternary nano-based implementation utilizes recent findings in field emission and nano
applications which include carbon-based nanotubes and nanotips for three-valued lattice computing via
field-emission methods. The presented work implements multi-valued Galois functions by utilizing
concurrent nano-based lattice systems, which use two-to-one controlled switching via carbon-based field
emission devices by using nano-apex carbon fibers and carbon nanotubes that were presented in the first
part of the article. The introduced computational extension utilizing many-to-one carbon field-emission
devices will be further utilized in implementing congestion-free architectures within the third part of the
article. The emerging nano-based technologies form important directions in low-power compact-size
regular lattice realizations, in which carbon-based devices switch less-costly and more-reliably using
much less power than silicon-based devices. Applications include low-power design of VLSI circuits for
signal processing and control of autonomous robots.
Erca energy efficient routing and reclusteringaciijournal
The pervasive application of wireless sensor networks (WNSs) is challenged by the scarce energy constraints of sensor nodes. En-route filtering schemes, especially commutative cipher based en-route filtering (CCEF) can saves energy with better filtering capacity. However, this approach suffer from fixed paths and inefficient underlying routing designed for ad-hoc networks. Moreover, with decrease in remaining sensor nodes, the probability of network partition increases. In this paper, we propose energy-efficient routing and re-clustering algorithm (ERCA) to address these limitations. In proposed scheme with reduction in the number of sensor nodes to certain thresh-hold the cluster size and transmission range dynamically maintain cluster node-density. Performance results show that our approach demonstrate filtering-power, better energy-efficiency, and an average gain over 285% in network lifetime.
A REVIEW OF LOW POWERAND AREA EFFICIENT FSM BASED LFSR FOR LOGIC BISTjedt_journal
Built in Self Test circuits enable an integrated circuit to test itself. Built in Self Test reduces test and maintenance costs for an integrated circuit by eliminating the need for expensive test equipment. Built in Self Test also allows an integrated circuit to test at its normal operating speed which is very important for detecting timing faults. Despite all of these advantages Built in Self Test has seen limited use in industry because of area and performance overhead and increased design time. This paper presents automated techniques for implementing BIST in a way that minimizes area and performance overhead. This approach allows applying at-speed test patterns and eliminates the need for an external tester. Proper design of the test pattern generator contributes to reduction in the power consumption of the CUT and the overall power consumption of the BIST circuitry. We have proposed FSM based LFSR which generate maximum correlation among the patterns and targeted on c432, c1908 and c3540 benchmark circuits to validate test power, achieved significant improved in power up to 15% compared to conventional test generator and also achieved optimal area overhead,designed using Verilog HDL and implemented using Xilinx 14.3 and Cadence tool.
Multilayered low pass microstrip filter using csrreSAT Journals
Abstract Multi-tracking system is a real time tracking platform which uses integration of technologies such as GPS and GSM. The platform supports multiple tracking devices for variety of applications such as live vehicle tracking, personal tracking and also assets tracking. The GPS device installed in the vehicle continuously moves with the vehicle and will calculate the co-ordinates with other related information at each position and then transmit this information via GSM to the tracking server, thus storing it in the database; which further can be viewed on electronic map, i.e., Google Map via Internet providing up-to-date information. This proposed system also supports for real time control like, if owner sends an SMS, it automatically turns of the ignition of vehicle or other different purposes. The overall system will be implemented in Micro-soft .NET technology in which C#.Net will be used for system components & for web based ASP.Net will be used. Keywords: GPS, GSM, SMS, Socket Listener, Tracking server.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
MODELLING AND SIMULATION OF 128-BIT CROSSBAR SWITCH FOR NETWORK -ONCHIPVLSICS Design
It is widely accepted that Network-on-Chip represents a promising solution for forthcoming complex embedded systems. Current SoC solutions are built from heterogeneous hardware and software components integrated around a complex communication infrastructure. The crossbar is a vital component of any NoC router. In this work, we have designed a crossbar interconnect for serial bit data transfer and 128-bit parallel data transfer. We show a comparison of power and delay for serial-bit and parallel-bit data transfer through the crossbar switch. The design is implemented in 0.180 micron TSM technology. The bit rate achieved in serial transfer is slow compared with parallel data transfer. The simulation results show that the critical path delay is less for parallel bit data transfer but power dissipation is high.
Comparative Analysis of Route Information Based Enhanced Divide and Rule Stra...ijsc
In wireless sensor networks, routing data efficiently to the base station is a major issue, and researchers have devised a number of routing algorithms for this purpose. Clustering plays a very important role in the design and development of wireless sensor networks, both for distributing the network well and for routing data efficiently. In this paper, we enhance the divide-and-rule strategy, a route information protocol based on static clustering and dynamic cluster head selection. Simulation results show that our technique outperforms DR, LEACH, and AODV on the basis of packet loss, delay, and throughput.
Hardware Complexity of Microprocessor Design According to Moore's Lawcsandit
The increasing number of transistors on a chip, which plays the main role in improving the performance and increasing the speed of a microprocessor, causes a rapid increase in microprocessor design complexity. Based on Moore's Law, the number of transistors should double every 24 months. The doubling of the transistor count increases microprocessor design complexity, power dissipation, and the cost of design effort. This article presents a proposal to discuss the matter of scaling the hardware complexity of a microprocessor design in relation to Moore's Law. Based on the discussion, a hardware complexity measure is presented.
MICROSTRIP COUPLED LINE FILTER DESIGN FOR ULTRA WIDEBAND APPLICATIONSjmicro
A compact microstrip parallel coupled line filter for ultra wide band applications by means of combining a network of coupled line and defected ground is proposed. The design equations for three and five interconnected networks are derived and implemented. Simulations for three different configurations for filters are optimized. Then three prototype circuits are constructed, a bandpass filter with center frequency 2.25 GHz and two different bandpass filters (in terms of perturbations) with center frequencies 2.33GHz.
For the 2.25 GHz circuit, a wide fractional bandwidth of about 90% is obtained, but undesired high return loss existed. For the 2.33 GHz circuit with grooves in the sides, a fractional bandwidth of about 60% is obtained at about 3.4 GHz center frequency; however, undesired return loss existed for this circuit, whereas good out-of-band performance was observed. For the 2.33 GHz circuit with grooves in whole sections, the center frequency shifted to about 3.4 GHz and about 50% fractional bandwidth is obtained, with very good out-of-band performance observed.
Performance Analysis of Rake Receivers in IR–UWB System IOSR Journals
Suppression of interference in time domain equalizers is attempted for a high data rate impulse radio (IR) ultra wideband (UWB) communication system. Narrowband systems may cause interference with UWB devices, as UWB has very low transmission power and a large bandwidth. The SRAKE receiver improves system performance by equalizing signals from different paths. This enables the use of SRAKE receiver techniques in IR-UWB systems. A semi-analytical approach is used to investigate the BER performance of the SRAKE receiver on IEEE 802.15.3a UWB channel models. A study of non-line-of-sight indoor channel models (both CM3 and CM4) illustrates that the bit error rate performance of the SRAKE receiver with NBI is better than that of the Rake receiver without NBI.
Hierarchical Digital Twin of a Naval Power SystemKerry Sado
A hierarchical digital twin of a Naval DC power system has been developed and experimentally verified. Similar to other state-of-the-art digital twins, this technology creates a digital replica of the physical system executed in real-time or faster, which can modify hardware controls. However, its advantage stems from distributing computational efforts by utilizing a hierarchical structure composed of lower-level digital twin blocks and a higher-level system digital twin. Each digital twin block is associated with a physical subsystem of the hardware and communicates with a singular system digital twin, which creates a system-level response. By extracting information from each level of the hierarchy, power system controls of the hardware were reconfigured autonomously. This hierarchical digital twin development offers several advantages over other digital twins, particularly in the field of naval power systems. The hierarchical structure allows for greater computational efficiency and scalability while the ability to autonomously reconfigure hardware controls offers increased flexibility and responsiveness. The hierarchical decomposition and models utilized were well aligned with the physical twin, as indicated by the maximum deviations between the developed digital twin hierarchy and the hardware.
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxR&R Consult
CFD analysis is incredibly effective at solving mysteries and improving the performance of complex systems!
Here's a great example: At a large natural gas-fired power plant, where they use waste heat to generate steam and energy, they were puzzled that their boiler wasn't producing as much steam as expected.
R&R and Tetra Engineering Group Inc. were asked to solve the issue with reduced steam production.
An inspection had shown that a significant amount of hot flue gas was bypassing the boiler tubes, where the heat was supposed to be transferred.
R&R Consult conducted a CFD analysis, which revealed that 6.3% of the flue gas was bypassing the boiler tubes without transferring heat. The analysis also showed that the flue gas was instead being directed along the sides of the boiler and between the modules that were supposed to capture the heat. This was the cause of the reduced performance.
Based on our results, Tetra Engineering installed covering plates to reduce the bypass flow. This improved the boiler's performance and increased electricity production.
It is always satisfying when we can help solve complex challenges like this. Do your systems also need a check-up or optimization? Give us a call!
Work done in cooperation with James Malloy and David Moelling from Tetra Engineering.
More examples of our work https://www.r-r-consult.dk/en/cases-en/
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)MdTanvirMahtab2
This presentation is about the working procedure of Shahjalal Fertilizer Company Limited (SFCL). A Govt. owned Company of Bangladesh Chemical Industries Corporation under Ministry of Industries.
Student information management system project report ii.pdfKamal Acharya
Our project explains about the student management. This project mainly explains the various actions related to student details. This project shows some ease in adding, editing and deleting the student details. It also provides a less time consuming process for viewing, adding, editing and deleting the marks of the students.
Saudi Arabia stands as a titan in the global energy landscape, renowned for its abundant oil and gas resources. It's the largest exporter of petroleum and holds some of the world's most significant reserves. Let's delve into the top 10 oil and gas projects shaping Saudi Arabia's energy future in 2024.
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdffxintegritypublishin
Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. 
Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.
Immunizing Image Classifiers Against Localized Adversary Attacksgerogepatton
This paper addresses the vulnerability of deep learning models, particularly convolutional neural networks
(CNN)s, to adversarial attacks and presents a proactive training technique designed to counter them. We
introduce a novel volumization algorithm, which transforms 2D images into 3D volumetric representations.
When combined with 3D convolution and deep curriculum learning optimization (CLO), it significantly improves
the immunity of models against localized universal attacks by up to 40%. We evaluate our proposed approach
using contemporary CNN architectures and the modified Canadian Institute for Advanced Research (CIFAR-10
and CIFAR-100) and ImageNet Large Scale Visual Recognition Challenge (ILSVRC12) datasets, showcasing
accuracy improvements over previous techniques. The results indicate that the combination of the volumetric
input and curriculum learning holds significant promise for mitigating adversarial attacks without necessitating
adversary training.
Cosmetic shop management system project report.pdfKamal Acharya
Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it's tough to interpret those ingredient lists unless you have a background in chemistry.
Instead of buying and hoping for the best, we can use data science to help us predict which products may be good fits for us. It includes various function programs to do the above mentioned tasks.
Data file handling has been effectively used in the program.
The automated cosmetic shop management system should deal with the automation of the general workflow and administration process of the shop. The main processes of the system focus on the customer's request, where the system is able to search for the most appropriate products and deliver them to the customers. It should help the employees to quickly identify the list of cosmetic products that have reached the minimum quantity and also keep track of the expiry date of each cosmetic product. It should help the employees to find the rack number in which the product is placed. It is also a faster and more efficient way.
L03.pdf
1. L03-1
Sze and Emer
6.5930/1
Hardware Architectures for Deep Learning
Popular DNN Models
Joel Emer and Vivienne Sze
Massachusetts Institute of Technology
Electrical Engineering & Computer Science
February 13, 2023
2. L03-2
Goals of Today’s Lecture
• Last lecture covered the building blocks of CNNs; this lecture
describes how we put these blocks together to form a CNN.
• Overview of various well-known CNN models
– CNN ‘models’ are also referred to as ‘network architectures’; however, we prefer to
use the term ‘model’ in this class to avoid overloading the term ‘architecture’
• We group the CNN models into two categories
– High Accuracy CNN Models: Designed to maximize accuracy to compete in the
ImageNet Challenge
– Efficient CNN Models: Designed to reduce the number of weights and
operations (specifically MACs) while maintaining accuracy
8. L03-8
ImageNet
http://www.image-net.org/challenges/LSVRC/
Image Classification
~256x256 pixels (color)
1000 Classes
1.3M Training
100,000 Testing (50,000 Validation)
Image Source: http://karpathy.github.io/
For ImageNet Large Scale Visual
Recognition Challenge (ILSVRC)
accuracy of classification task reported
based on top-1 and top-5 error
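As a rough illustration of the top-1 and top-5 error metrics, the sketch below computes a top-k error for any k. The function name `top_k_error` and the score values are hypothetical; only the metric itself comes from the slide.

```python
def top_k_error(scores, labels, k):
    """Fraction of examples whose true label is NOT among the k highest scores."""
    misses = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this example.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)

# Hypothetical scores for 3 images over 4 classes.
scores = [
    [0.1, 0.6, 0.2, 0.1],   # true class 1: hit for k=1
    [0.5, 0.1, 0.3, 0.1],   # true class 2: miss for k=1, hit for k=2
    [0.4, 0.3, 0.2, 0.1],   # true class 3: miss for k=1 and k=2
]
labels = [1, 2, 3]
top1 = top_k_error(scores, labels, k=1)
top2 = top_k_error(scores, labels, k=2)
```

For ILSVRC the same function would be called with k=1 and k=5 over 1000-class score vectors.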
9. L03-9
AlexNet
CONV Layers: 5
Fully Connected Layers: 3
Weights: 61M
MACs: 724M
ReLU used for non-linearity
ILSVRC12 Winner
Uses Local Response Normalization (LRN)
[Figure: AlexNet layers, from input through ofmap1 to ofmap8]
[Krizhevsky, NeurIPS 2012]
20. L03-20
GoogLeNet/Inception (v1)
CONV Layers: 21 (depth), 57 (total)
Fully Connected Layers: 1
Weights: 7.0M
MACs: 1.43G [Szegedy, CVPR 2015]
Also, v2, v3 and v4
ILSVRC14 Winner
9 Inception Blocks*
3 CONV layers 1 FC layer
(reduced from 3)
Auxiliary Classifiers
(helps with training,
not used during inference)
*referred to as inception
module in textbook
21. L03-21
GoogLeNet/Inception (v1)
parallel* filters of different size have the effect
of processing image at different scales
1x1 ‘bottleneck’ to
reduce number of
weights and
multiplications
Inception
Block
CONV Layers: 21 (depth), 57 (total)
Fully Connected Layers: 1
Weights: 7.0M
MACs: 1.43G
Also, v2, v3 and v4
ILSVRC14 Winner
[Szegedy, CVPR 2015]
*also referred to as “multi-branch” and
“split-transform-merge”
22. L03-22
1x1 Bottleneck
Modified image from source:
Stanford cs231n
[Lin, Network in Network, ICLR 2014]
Use 1x1 filter to capture cross-channel correlation, but not spatial correlation.
Can be used to reduce the number of channels in next layer (compress).
(Filter dimensions for bottleneck: R=1, S=1, C > M)
[Figure: filter1 (1x1x64); output fmap 56 x 56 x 1]
23. L03-23
1x1 Bottleneck
Modified image from source:
Stanford cs231n
[Figure: filter2 (1x1x64); output fmap 56 x 56 x 2]
Use 1x1 filter to capture cross-channel correlation, but not spatial correlation.
Can be used to reduce the number of channels in next layer (compress).
(Filter dimensions for bottleneck: R=1, S=1, C > M)
[Lin, Network in Network, ICLR 2014]
24. L03-24
1x1 Bottleneck
Modified image from source:
Stanford cs231n
[Figure: output fmap 56 x 56 x 32]
Use 1x1 filter to capture cross-channel correlation, but not spatial correlation.
Can be used to reduce the number of channels in next layer (compress).
(Filter dimensions for bottleneck: R=1, S=1, C > M)
[Lin, Network in Network, ICLR 2014]
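The compression described above can be sketched in plain Python. This is a minimal illustration, not the lecture's code: `conv1x1` and `conv_macs` are hypothetical helper names, the 5x5 layer with 64 filters is an assumed example, and the MAC count assumes stride 1 with padding that keeps the output at H x W.

```python
def conv1x1(fmap, filters):
    """fmap: [C][H][W]; filters: [M][C]. Returns an [M][H][W] feature map."""
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    # Each output pixel mixes the C channels at the same spatial position.
    return [[[sum(f[c] * fmap[c][h][w] for c in range(C))
              for w in range(W)] for h in range(H)] for f in filters]

def conv_macs(H, W, C, R, S, M):
    """MACs for M filters of size R x S x C producing an H x W output (assumed)."""
    return H * W * C * R * S * M

# Tiny check: 2 input channels compressed to 1 output channel (C > M).
fmap = [[[1, 2], [3, 4]], [[10, 20], [30, 40]]]
out = conv1x1(fmap, filters=[[1, 0.5]])

# Slide dimensions: a 56x56x64 input compressed to 32 channels,
# followed by an assumed 5x5 CONV layer with 64 filters.
direct = conv_macs(56, 56, 64, 5, 5, 64)
bottleneck = conv_macs(56, 56, 64, 1, 1, 32) + conv_macs(56, 56, 32, 5, 5, 64)
```

With these assumed sizes the bottleneck path needs roughly half the MACs of the direct 5x5 layer, because the expensive filter only sees 32 channels.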
25. L03-25
GoogLeNet:1x1 Bottleneck
1x1 ‘bottleneck’ to
reduce number of
weights and
multiplications
Inception
Block
[Szegedy, CVPR 2015]
Apply 1x1 bottleneck before ‘large’ convolution filters.
Reduce weights such that entire CNN can be trained on one GPU.
Number of multiplications reduced from 854M → 358M
26. L03-26
Reduce Cost of FC Layers
First FC layer accounts for a
significant portion of weights
38M of 61M for AlexNet
[Figure: AlexNet pipeline from a 224x224 input image to 1000 scores, with # of MACs and # of weights per layer]
Conv (11x11), Non-Linearity, Norm (LRN), Max Pool: 105M MACs, 34k weights
Conv (5x5), Non-Linearity, Norm (LRN), Max Pooling: 224M MACs, 307k weights
Conv (3x3), Non-Linearity: 150M MACs, 885k weights
Conv (3x3), Non-Linearity: 112M MACs, 664k weights
Conv (3x3), Non-Linearity, Max Pooling: 75M MACs, 442k weights
Fully Connect, Non-Linearity: 38M MACs, 37.7M weights
Fully Connect, Non-Linearity: 17M MACs, 16.8M weights
Fully Connect: 4M MACs, 4.1M weights
[Krizhevsky, NeurIPS 2012]
27. L03-27
Global Pooling
GoogLeNet uses global pooling to reduce number of FC layers from three to one
[Lin, ICLR 2014]
Use Global Pooling to reduce size of input to the first FC layer and the FC layer itself
[Figure: without global pooling, the FC layer applies H x W x C filters to the H x W x C input fmap. With global pooling, Step 1 pools the H x W x C input fmap to a 1 x 1 x C fmap, and Step 2 applies an FC layer with 1 x 1 x C filters.]
Size of FC layer:
HxWxCxM → 1x1xCxM
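A small sketch of the two steps and the resulting weight counts follows. Average pooling is an assumption (the slide only says "global pooling"), as are the helper names and the 6 x 6 x 256 input with M = 4096 outputs, which is chosen because it matches the 37.7M weights listed for AlexNet's first FC layer.

```python
def global_avg_pool(fmap):
    """fmap: [C][H][W] -> list of C per-channel means (a 1 x 1 x C fmap)."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]

def fc_weights(H, W, C, M):
    """Number of weights in an FC layer with M outputs on an H x W x C input."""
    return H * W * C * M

# Step 1 on a tiny 2-channel, 2x2 feature map.
pooled = global_avg_pool([[[1, 3], [5, 7]], [[2, 2], [2, 2]]])

# Step 2: FC layer size before and after pooling (assumed AlexNet-like shape).
before = fc_weights(6, 6, 256, 4096)
after = fc_weights(1, 1, 256, 4096)
```

Under these assumptions the FC layer shrinks by a factor of H x W = 36.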
28. L03-28
ResNet
Image Source: http://icml.cc/2016/tutorials/icml2016_tutorial_deep_residual_networks_kaiminghe.pdf
Go Deeper!
ILSVRC15 Winner
(better than human level accuracy!)
29. L03-29
ResNet: Training
Training and validation error increases with more layers;
this is due to vanishing gradients, not overfitting.
Introduce short cut block to address this!
[Figure: error curves without shortcut and with shortcut. Thin curves denote training error, and bold curves denote validation error.]
[He, CVPR 2016]
30. L03-30
ResNet: Short Cut Block
Helps address the vanishing gradient challenge for
training very deep networks
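[Figure: ResNet-34 consists of 1 CONV layer, 16 Short Cut Blocks, and 1 FC layer. Each Short Cut Block applies 3x3 CONV, ReLU, and 3x3 CONV to the input x to produce F(x), adds x through an identity skip connection (also referred to as highway), and applies ReLU.]
H(x) = F(x) + x
Learns Residual: F(x) = H(x) - x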
[He, CVPR 2016]
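The relation H(x) = F(x) + x can be sketched on plain vectors. This is a simplified illustration under stated assumptions: `shortcut_block` is a hypothetical name, and `residual_fn` stands in for the CONV, ReLU, CONV stack that learns F(x).

```python
def relu(v):
    return [max(0.0, x) for x in v]

def shortcut_block(x, residual_fn):
    """Compute ReLU(F(x) + x), where residual_fn plays the role of F."""
    fx = residual_fn(x)
    # Skip connection: add the unmodified input to the residual branch.
    return relu([f + xi for f, xi in zip(fx, x)])

x = [0.5, 2.0, 1.0]
# If the residual branch outputs zeros, the block passes x through unchanged.
identity_out = shortcut_block(x, lambda v: [0.0] * len(v))
# If the residual branch outputs -x, the sum is zero everywhere.
zero_out = shortcut_block(x, lambda v: [-e for e in v])
```

The first case shows why an extra block does not have to hurt: learning F(x) = 0 is enough for it to behave like an identity mapping on non-negative inputs.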
31. L03-31
ResNet: Bottleneck
Apply 1x1 bottleneck to reduce computation and size
Also makes network deeper (ResNet-34 → ResNet-50)
compress: C > M
expand: C < M
[He, CVPR 2016]
32. L03-32
ResNet-50
CONV Layers: 49
Fully Connected Layers: 1
Weights: 25.5M
MACs: 3.9G
Also, 34-, 152-, and 1202-layer versions
ILSVRC15 Winner
[Figure: ResNet-50 consists of 1 CONV layer, 16 Short Cut Blocks, and 1 FC layer]
[He, CVPR 2016]
34. L03-34
Summary of Popular CNNs
• AlexNet
– First CNN Winner of ILSVRC
– Uses LRN (deprecated after this)
• VGG-16
– Goes Deeper (16+ layers)
– Uses only 3x3 filters (stack for larger filters)
• GoogLeNet (v1)
– Reduces weights with Inception and uses Global Pooling so that only one FC layer is needed
– Inception Block: 1x1 and parallel connections
– Batch Normalization
• ResNet
– Goes Deeper (24+ layers)
– Short cut Block: Skip connections
35. L03-35
DenseNet
[Huang, CVPR 2017]
Feature maps are concatenated rather than added.
Break into blocks to limit depth and thus size of combined feature map.
More Skip Connections!
Connections not only from previous layer, but
many past layers to strengthen feature map
propagation and feature reuse.
[Figure: Dense Blocks separated by transition layers]
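To make the concatenation idea concrete, here is a minimal sketch of how the channel count grows inside one dense block. The helper names, the initial channel count, and the per-layer "growth" value are assumptions chosen for illustration.

```python
def dense_block_input_channels(c0, growth, num_layers):
    """Channels seen by each layer when every layer's output (growth channels)
    is concatenated onto all earlier feature maps."""
    return [c0 + growth * i for i in range(num_layers)]

def dense_forward(x, layers):
    """x: list of channels; each layer maps the concatenated list to new channels."""
    features = list(x)
    for layer in layers:
        features = features + layer(features)   # concatenate, do not add
    return features

channels = dense_block_input_channels(c0=64, growth=32, num_layers=4)
# Toy "layers" that each emit one new channel computed from everything so far.
out = dense_forward([1.0, 2.0], [lambda f: [sum(f)], lambda f: [max(f)]])
```

Because the input to each layer keeps growing, the network is broken into blocks so that the combined feature map stays bounded.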
36. L03-36
DenseNet
Note: 1 MAC = 2 FLOPS
Higher accuracy than ResNet with fewer weights and multiplications
[Huang, CVPR 2017]
[Figure: two plots of Top-1 error]
37. L03-37
Wide ResNet
Increase width (# of filters) rather than depth of network
• 50-layer wide ResNet outperforms 152-layer original ResNet
• Increasing width instead of depth is also more parallel-friendly
Image Source: Stanford cs231n
[Zagoruyko, BMVC 2016]
38. L03-38
Squeeze and Excitation
[Figure: Input fmap (H x W x C) → 1) Global Pooling (1 x 1 x C) → 2) Multiple FC Layers producing dynamic weights (1 x 1 x C filters) → 3) Depthwise Convolution of the input fmap with the dynamic weights → Output fmap (H x W x C)]
Depth-wise convolution with dynamic weights, where the weights change based
on the input feature map.
• Squeeze: Summarize each channel of input features map with global pooling
• Excitation: Determine weights using FC layers to increase attention on
certain channels of the input features map
[Hu, CVPR 2018]
Attention (input → dynamic weights)
Used by SENet
ILSVRC 2017 Winner
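The squeeze and excitation steps can be sketched in plain Python as follows. This is a simplified illustration: the function name, the two-FC-layer structure with ReLU and sigmoid, and the weight values are assumptions, since the slide only states "multiple FC layers".

```python
import math

def squeeze_excite(fmap, w1, w2):
    """fmap: [C][H][W]; w1: [C_mid][C]; w2: [C][C_mid].
    Returns the rescaled fmap and the per-channel weights."""
    # Squeeze: summarize each channel with global average pooling.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]
    # Excitation: FC -> ReLU -> FC -> sigmoid gives one weight per channel.
    hidden = [max(0.0, sum(w * zi for w, zi in zip(row, z))) for row in w1]
    scale = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale every value of a channel by that channel's input-dependent weight
    # (equivalent to a depth-wise convolution with a 1x1 dynamic filter).
    scaled = [[[v * s for v in row] for row in ch] for ch, s in zip(fmap, scale)]
    return scaled, scale

fmap = [[[1.0, 1.0], [1.0, 1.0]], [[4.0, 4.0], [4.0, 4.0]]]
w1 = [[0.5, 0.5]]     # hypothetical weights, C=2 -> C_mid=1
w2 = [[0.0], [1.0]]   # hypothetical weights, C_mid=1 -> C=2
out, scale = squeeze_excite(fmap, w1, w2)
```

Here channel 0 is scaled by exactly 0.5 and channel 1 by a weight above 0.9, so the second channel receives more attention for this input.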
39. L03-39
Convolution versus Attention Mechanism
• Convolution
– Only models dependencies between spatial neighbors
– Use sparsely connected layer to spatial neighbors; no support for dependencies outside of
spatial dimensions of filter (R x S)
• Attention
– “Allows modeling of [global] dependencies without regard to their distance” [Vaswani,
NeurIPS 2017]
– However, fully connected layer too expensive; develop mechanism to bias “the allocation of
available computational resources towards the most informative components of a signal”
[Hu, CVPR 2018]
• Transformer is a type of DNN that is built entirely using Attention Mechanism
[Vaswani, NeurIPS 2017] (Next Lecture)
42. L03-42
Manual Network Design
• Reduce Spatial Size (R, S)
– stacked filters
• Reduce Channels (C)
– 1x1 convolution, grouped convolution
• Reduce Filters (M)
– feature map reuse across layers
[Figure: CONV layer dimensions: N input fmaps (H x W x C), M filters (R x S x C), and N output fmaps (P x Q x M)]
43. L03-43
Reduce Spatial Size (R, S): Stacked Small Filters
VGG: decompose a 5x5 filter into two 3x3 filters, applied sequentially
GoogLeNet/Inception v3 (separable filters): decompose a 5x5 filter into a 5x1 filter and a 1x5 filter, applied sequentially
Replace a large filter with a series of smaller filters (reduces degrees of freedom)
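The saving from decomposition is simple arithmetic, sketched below for a single input channel. The helper names are hypothetical, and the receptive-field formula assumes stride-1 filters applied one after another.

```python
def weights_per_channel(filters):
    """Total weights for a sequence of (R, S) filters applied to one channel."""
    return sum(r * s for r, s in filters)

def receptive_field(filters):
    """Receptive field (height, width) of stride-1 filters applied sequentially."""
    h = 1 + sum(r - 1 for r, _ in filters)
    w = 1 + sum(s - 1 for _, s in filters)
    return h, w

single = [(5, 5)]              # one 5x5 filter
stacked = [(3, 3), (3, 3)]     # two 3x3 filters
separable = [(5, 1), (1, 5)]   # 5x1 then 1x5

counts = [weights_per_channel(f) for f in (single, stacked, separable)]
fields = [receptive_field(f) for f in (single, stacked, separable)]
```

All three cover the same 5x5 window, but with 25, 18, and 10 weights respectively, which is the reduction in degrees of freedom the slide refers to.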
44. L03-44
Example: Inception V3
Go deeper (v1: 22 layers → v3: 40+ layers) by reducing the number
of weights per filter using filter decomposition
~3.5% higher accuracy than v1
[Szegedy, CVPR 2016]
5x5 filter → 3x3 filters; 3x3 filter → 3x1 and 1x3 filters
Separable filters
45. L03-45
Reduce Channels (C): 1x1 Convolution
[Figure: 1x1 convolutions in GoogLeNet (compress, C > M) and in ResNet (compress, C > M, then expand, C < M)]
• Use 1x1 (bottleneck) filter to capture cross-channel correlation, but not spatial correlation
• Reduce the number of channels in next layer (compress), where C > M
46. L03-46
Example: SqueezeNet
[Iandola, ICLR 2017]
Fire Block
Reduce number of weights by reducing number of input
channels by “squeezing” with 1x1
50x fewer weights than AlexNet (no accuracy loss)
However, 1.2x more operations than AlexNet*
*for SqueezeNetv1.0
(reduce operations by 2x in
SqueezeNetv1.1)
47. L03-47
Reduce Channels (C): Grouped Convolutions
Grouped convolutions reduce the number of weights and multiplications at the
cost of not sharing information between groups
• Divide filters into groups (G) operating on subset of channels.
• Each group has M/G filters and processes C/G channels.
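The effect of G on cost can be sketched with a small counting function. The name `grouped_conv_cost` and the layer sizes are hypothetical, and the MAC count assumes stride 1 with padding so that each filter produces H x W outputs.

```python
def grouped_conv_cost(H, W, C, R, S, M, G=1):
    """Return (weights, MACs) for a CONV layer whose M filters are split into G groups."""
    if C % G or M % G:
        raise ValueError("C and M must both be divisible by G")
    weights = M * R * S * (C // G)   # each filter only sees C/G channels
    macs = H * W * weights           # assumed: one H x W output plane per filter
    return weights, macs

standard = grouped_conv_cost(56, 56, 64, 3, 3, 64, G=1)
two_groups = grouped_conv_cost(56, 56, 64, 3, 3, 64, G=2)
depthwise = grouped_conv_cost(56, 56, 64, 3, 3, 64, G=64)   # one channel per group
```

Weights and MACs both scale as 1/G; the price is that filters in one group never see channels assigned to another group.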
[Figure: grouped convolution with G=2. The 4 filters are divided into Group 1 and Group 2; each filter has dimensions R x S x C/2 and processes half of the channels of the H x W x C input fmap, producing a P x Q output fmap with 4 channels.]
Example for G=2: Each filter requires 2x fewer weights and MACs (C → C/2)
In this example, N=1 & M=4
48. L03-48
Reduce Channels (C): Grouped Convolutions
Two ways of mixing information from groups
Shuffle Operation (mix in multiple steps): ShuffleNet
[Figure: fmap 0 → layer 1 → fmap 1 → layer 2 → fmap 2]
Pointwise (1x1) Convolution (mix in one step): MobileNet
Also referred to as depth-wise separable:
decouple the cross-channel correlations and
spatial correlations in the feature maps of the CNN
[Figure: an R x S x C filter decomposed into depth-wise R x S filters (one per channel, C in total) plus 1 x 1 x C pointwise filters (M in total)]
49. L03-49
Depth-wise Convolutions
The extreme case of Grouped Convolutions is Depth-wise Convolutions,
where the number of groups (G) equals the number of channels (C) (i.e., one input channel per group)
[Figure: depth-wise convolution. Group 1 through Group C each apply one R x S filter (filter1 ... filterC) to one channel of the H x W x C input fmap, producing P x Q output fmap1 ... output fmapC.]
Typically, M=C
(but does not have to be)
50. L03-50
Example: MobileNets
[Howard, arXiv 2017]
Depth-wise filter decomposition
[Figure: M filters of size R x S x C decomposed into C depthwise filters (R x S) and M pointwise filters (1 x 1 x C)]
Reduction in MACs:
$$\frac{HWCRSM}{HWC(RS+M)} = \frac{RSM}{RS+M}$$
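The reduction in MACs from depth-wise filter decomposition, HWCRSM / (HWC(RS+M)) = RSM / (RS+M), can be checked numerically. The function names and the layer sizes below are hypothetical; the MAC expressions follow the slide's notation.

```python
from fractions import Fraction

def standard_macs(H, W, C, R, S, M):
    """MACs for M standard filters of size R x S x C."""
    return H * W * C * R * S * M

def depthwise_separable_macs(H, W, C, R, S, M):
    """Depthwise (R x S per channel) plus pointwise (1x1xC, M filters) MACs."""
    return H * W * C * (R * S + M)

H, W, C, R, S, M = 14, 14, 256, 3, 3, 256   # hypothetical layer shape
reduction = Fraction(standard_macs(H, W, C, R, S, M),
                     depthwise_separable_macs(H, W, C, R, S, M))
closed_form = Fraction(R * S * M, R * S + M)
```

For a 3x3 filter with 256 output channels the reduction is 2304/265, a little under 9x, and it approaches RS as M grows.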
51. L03-51
MobileNets: Comparison
Comparison with other CNN Models
[Image source: Github]
[Howard, arXiv 2017]
52. L03-52
Example: Xception
• An Inception block based on depth-wise separable convolutions
• Claims to learn richer features with similar number of weights as Inception V3 (i.e., more
efficient use of weights)
– Similar performance on ImageNet; 4.3% better on larger dataset (JFT)
– However, 1.5x more operations required than Inception V3
[Chollet, CVPR 2017]
Spatial correlation
Cross-channel correlation
53. L03-53
Example: ResNeXt
ResNet ResNeXt
[Xie, CVPR 2017]
Used by ILSVRC 2017 Winner SENet
Inspired by Inception’s “split-transform-merge”
Increase number of convolution groups (G) (referred to as cardinality in the paper)
instead of depth and width of network
54. L03-54
Example: ResNeXt
Improved accuracy vs. ‘complexity’ tradeoff compared to
other ResNet based models
[Xie, CVPR 2017]
55. L03-55
Shuffle Operation
[Figure: shuffle operation. Group 1 and Group 2 each convolve part of the H x W x C input fmap into P x Q output fmaps (channels 1, 2, 3, 4); the shuffle then reorders the output fmap channels across the two groups.]
56. L03-56
Example: ShuffleNet
Shuffle order such that channels are not isolated across groups
(up to 4% increase in accuracy)
[Zhang, CVPR 2018]
No interaction between
channels from different groups
Shuffling allows interaction between
channels from different groups
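One common way to implement a channel shuffle is to view the channels as a groups x channels-per-group grid, transpose it, and flatten. The sketch below assumes that formulation; the function name is hypothetical and the exact ordering may differ from the one drawn in the slide's figure.

```python
def channel_shuffle(channels, groups):
    """Interleave channels so every group in the next layer sees every input group."""
    n = len(channels)
    if n % groups:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    # Read the [groups][per_group] grid column by column (i.e., transposed).
    return [channels[g * per_group + i]
            for i in range(per_group) for g in range(groups)]

two_groups = channel_shuffle([1, 2, 3, 4], groups=2)
three_groups = channel_shuffle([0, 1, 2, 3, 4, 5], groups=3)
```

After the shuffle, each consecutive block of channels contains one channel from every original group, so the following grouped convolution mixes information across groups.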
57. L03-57
AlexNet: Grouped Convolutions
AlexNet uses grouped convolutions to train on two separate GPUs
(Drawback: correlation between channels of different groups is not used)
[Figure: AlexNet with layers split into 2 groups; information is mixed at a 3x3 CONV layer and at the FC layers]
58. L03-58
Reduce Filters (M): Feature Map Reuse
Reuse (M-K) channels in feature maps from
previously processed layers
DenseNet reuses feature map from
multiple layers
[Huang, CVPR 2017]
[Figure: layers L1, L2, L3. Instead of M filters (R x S x C), the layer computes only K filters; the output fmap with M channels (P x Q) is formed from the K new channels plus M-K reused channels.]
59. L03-59
Neural Architecture Search (NAS)
3x3? 5x5?
128 Filters?
Pool? CONV?
Rather than handcrafting the model, automatically search for it
60. L03-60
Neural Architecture Search (NAS)
• Three main components:
– Search Space (what is the set of all samples)
– Optimization Algorithm (where to sample)
– Performance Evaluation (how to evaluate samples)
Key Metrics: Achievable DNN accuracy and required search time
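The three components can be seen in a toy search loop. This is a minimal sketch, not a real NAS system: random sampling stands in for the optimization algorithm, `toy_evaluate` is a hypothetical stand-in for training and validating a network, and the search space entries are made up.

```python
import random

def random_search_nas(search_space, evaluate, num_samples, seed=0):
    """Sample networks at random and keep the best-scoring one."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(num_samples):
        # Optimization algorithm: decide where to sample (here: uniformly at random).
        candidate = {name: rng.choice(options) for name, options in search_space.items()}
        # Performance evaluation: score the sampled network.
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score

# Search space: the set of all networks that can be sampled.
search_space = {"filter_size": [3, 5], "num_filters": [64, 128], "op": ["conv", "pool"]}

def toy_evaluate(net):
    # Hypothetical proxy for accuracy; a real system would train the network.
    bonus = 0.2 if net["op"] == "conv" else 0.0
    return net["num_filters"] / 128 - 0.1 * net["filter_size"] + bonus

best, score = random_search_nas(search_space, toy_evaluate, num_samples=50)
```

The search time is dominated by `num_samples` calls to `evaluate`, which is why the key metrics pair achievable accuracy with required search time.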
[Figure: NAS loop. The optimization algorithm picks the next location to sample in the search space; the sampled network goes to performance evaluation; the evaluation result is fed back to the optimization algorithm; the loop outputs the final network.]
61. L03-61
Evaluate NAS Search Time
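$$\text{time}_{nas} = \text{num}_{samples} \times \text{time}_{sample}$$
$$\text{time}_{nas} \propto \text{size}_{search\_space} \times \frac{\text{num}_{alg\_tuning}}{\text{efficiency}_{alg}} \times (\text{time}_{eval} + \text{time}_{train})$$
(1) Shrink the search space
(2) Improve the optimization algorithm
(3) Simplify the performance evaluation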
Goal: Improve the efficiency of NAS in the three main components
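The relation between search time, the number of samples, and the cost per sample can be tried out with made-up numbers. The sketch assumes that the time per sample is the sum of evaluation and training time, as the proportionality on this slide suggests; all values are hypothetical GPU-hours.

```python
def nas_time(num_samples, time_eval, time_train):
    """Search time = number of samples x (evaluation time + training time)."""
    return num_samples * (time_eval + time_train)

baseline = nas_time(num_samples=1000, time_eval=0.1, time_train=10.0)
# (1)/(2): a smaller search space or a better algorithm needs fewer samples.
fewer_samples = nas_time(num_samples=100, time_eval=0.1, time_train=10.0)
# (3): a simpler performance evaluation makes each sample cheaper.
cheaper_eval = nas_time(num_samples=1000, time_eval=0.1, time_train=1.0)
```

Either lever reduces the total roughly tenfold in this example, and the two can be combined.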
62. L03-62
(1) Shrink the Search Space
• Trade the breadth of models for
search speed
• May limit the performance that can be
achieved
• Use domain knowledge from manual
network design to help guide the
reduction of the search space
[Figure: the search space as a subset of the model universe, with the samples and the optimal model marked]
63. L03-63
(1) Shrink the Search Space
• Search space = layer operations + connections between layers
Common layer operations [Zoph, CVPR 2018]:
• Identity
• 1x3 then 3x1 convolution
• 1x7 then 7x1 convolution
• 3x3 dilated convolution
• 1x1 convolution
• 3x3 convolution
• 3x3 separable convolution
• 5x5 separable convolution
• 3x3 average pooling
• 3x3 max pooling
• 5x5 max pooling
• 7x7 max pooling
64. L03-64
(1) Shrink the Search Space
• Search space = layer operations + connections between layers
Smaller Search Space
Image Source: [Zoph, CVPR 2018]
65. L03-65
(2) Improve Optimization Algorithm
• Random
• Gradient Descent
• Coordinate Descent
• Reinforcement Learning
• Bayesian
• Evolutionary
66. L03-66
(3) Simplify the Performance Evaluation
• NAS needs only the rank of the performance values
• Method 1: approximate accuracy
– Proxy Task: e.g., smaller resolution, simpler tasks
– Early Termination: stop training earlier
– Accuracy Prediction: extrapolate accuracy
[Figure: accuracy vs. iteration plots marking where training stops and where accuracy is predicted]
67. L03-67
(3) Simplify the Performance Evaluation
• NAS needs only the rank of the performance values
• Method 2: approximate weights
– Copy Weights: reuse weights from other similar networks
– Estimate Weights: infer the weights from the previous feature maps
[Figure: weights copied from a previous network to a new network; filter weights generated from the feature map]
68. L03-68
(3) Simplify the Performance Evaluation
• NAS needs only the rank of the performance values
• Method 3: approximate metrics (e.g., latency, energy)
– Proxy Metric: use an easy-to-compute metric to approximate target (e.g., # MACs for latency)
– Look-Up Table: use table lookup
69. L03-69
Design Considerations for NAS
• The components may not be chosen individually
– Some optimization algorithms limit the search space
– Type of performance metric may limit the selection of the optimization
algorithms
• Commonly overlooked properties
– The complexity of implementation
– The ease of tuning hyperparameters of the optimization
– The probability of convergence to a good architecture
70. L03-70
Example: NASNet
• Search Space: Build model from popular layers
• Identity
• 1x3 then 3x1 convolution
• 1x7 then 7x1 convolution
• 3x3 dilated convolution
• 1x1 convolution
• 3x3 convolution
• 3x3 separable convolution
• 5x5 separable convolution
• 3x3 average pooling
• 3x3 max pooling
• 5x5 max pooling
• 7x7 max pooling
[Zoph, CVPR 2018]
72. L03-72
NASNet: Comparison with Existing Networks
Learned models have improved accuracy vs. ‘complexity’ tradeoff
compared to handcrafted models
[Zoph, CVPR 2018]
73. L03-73
EfficientNet
[Tan, ICML 2019]
Uniformly scale all dimensions (depth, width, and resolution), since there is an interplay between the different dimensions.
Use NAS to search for a baseline model and then scale it up.
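The uniform scaling can be sketched with the compound-scaling rule from the cited EfficientNet paper, in which a single coefficient phi scales depth, width, and resolution together. The base values 1.2, 1.1, and 1.15 are those reported in that paper, not on the slide; the function names are hypothetical.

```python
# Base multipliers for depth, width, and resolution as reported in the
# EfficientNet paper (chosen there so that ALPHA * BETA**2 * GAMMA**2 is about 2).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi):
    """Multipliers applied to the baseline model for a given coefficient phi."""
    return {
        "depth": ALPHA ** phi,
        "width": BETA ** phi,
        "resolution": GAMMA ** phi,
    }


def flops_factor(phi):
    """Approximate growth in operations: linear in depth, quadratic in width and resolution."""
    s = compound_scale(phi)
    return s["depth"] * s["width"] ** 2 * s["resolution"] ** 2


baseline = compound_scale(0)  # phi = 0 leaves the NAS-found baseline unchanged
scaled = compound_scale(1)
growth = flops_factor(1)      # roughly doubles the operation count
```

Under these assumptions, each increment of phi roughly doubles the number of operations while keeping the three dimensions in balance.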
74. L03-74
Summary
• Approaches used by popular CNN models in the ImageNet Challenge to improve accuracy
– Go deeper (i.e., more layers)
– Stack smaller filters and apply 1x1 bottlenecks to reduce the number of weights so that deeper models can fit into a GPU (faster training)
– Use multiple connections across layers (e.g., parallel and shortcut)
• Efficient models aim to reduce the number of weights and the number of operations
– Most use some form of filter decomposition (spatial, depth, and channel)
– Note: The number of weights and operations does not directly map to storage, speed, and power/energy; that depends on the hardware!
• Filter shapes vary across layers and models
– Need flexible hardware!
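The effect of stacking smaller filters and of filter decomposition on the number of weights can be checked with a few lines of arithmetic. This is a minimal sketch that ignores biases and assumes equal input and output channel counts for the stacked case; the function names are hypothetical.

```python
def conv_weights(in_ch, out_ch, k):
    """Weights of a standard k x k convolution (biases ignored)."""
    return in_ch * out_ch * k * k


def stacked_3x3_weights(ch):
    """Two stacked 3x3 layers, which cover the same 5x5 receptive field."""
    return 2 * conv_weights(ch, ch, 3)


def depthwise_separable_weights(in_ch, out_ch, k):
    """One k x k filter per input channel, followed by a 1x1 pointwise convolution."""
    return in_ch * k * k + conv_weights(in_ch, out_ch, 1)


single_5x5 = conv_weights(64, 64, 5)                  # 102400 weights
two_3x3 = stacked_3x3_weights(64)                     # 73728 weights
standard_3x3 = conv_weights(64, 64, 3)                # 36864 weights
separable_3x3 = depthwise_separable_weights(64, 64, 3)  # 4672 weights
```

These counts describe weights only; as the note in the summary says, how they translate to storage, speed, and energy depends on the hardware.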
75. L03-75
Warning!
• These works often use the number of weights and operations to measure “complexity”
• The number of weights provides an indication of the storage cost for inference
• However, later in the course we will see that
– The number of operations doesn’t directly translate to latency/throughput
– The number of weights and operations doesn’t directly translate to power/energy consumption
• Understanding the underlying hardware is important for evaluating the impact of these “efficient” CNN models
76. L03-76
References
• Book: Chapter 2 & 9
– https://doi.org/10.1007/978-3-031-01766-7
• Other Works Cited in Lecture (increase accuracy)
– LeNet: LeCun, Yann, et al. "Gradient-based learning applied to document recognition." Proc. IEEE 1998.
– AlexNet: Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." NeurIPS 2012.
– VGGNet: Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." ICLR 2015.
– Network in Network: Lin, Min, Qiang Chen, and Shuicheng Yan. "Network in network." ICLR 2014.
– GoogleNet: Szegedy, Christian, et al. "Going deeper with convolutions." CVPR 2015.
– ResNet: He, Kaiming, et al. "Deep residual learning for image recognition." CVPR 2016.
– DenseNet: Huang, Gao, et al. "Densely connected convolutional networks." CVPR 2017.
– Wide ResNet: Zagoruyko, Sergey, and Nikos Komodakis. "Wide residual networks." BMVC 2016.
– ResNeXt: Xie, Saining, et al. "Aggregated residual transformations for deep neural networks." CVPR 2017.
– SENets: Hu, Jie, et al. "Squeeze-and-Excitation Networks." CVPR 2018.
– NFNet: Brock, Andrew, et al. "High-Performance Large-Scale Image Recognition Without Normalization." arXiv 2021.
77. L03-77
References
• Other Works Cited in Lecture (increase efficiency)
– InceptionV3: Szegedy, Christian, et al. "Rethinking the inception architecture for computer vision." CVPR 2016.
– SqueezeNet: Iandola, Forrest N., et al. "SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and < 0.5 MB model size." ICLR 2017.
– Xception: Chollet, François. "Xception: Deep Learning with Depthwise Separable Convolutions." CVPR 2017.
– MobileNet: Howard, Andrew G., et al. "Mobilenets: Efficient convolutional neural networks for mobile vision applications." arXiv preprint arXiv:1704.04861 (2017).
– MobileNetv2: Sandler, Mark, et al. "MobileNetV2: Inverted Residuals and Linear Bottlenecks." CVPR 2018.
– MobileNetv3: Howard, Andrew, et al. "Searching for MobileNetV3." ICCV 2019.
– ShuffleNet: Zhang, Xiangyu, et al. "ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices." CVPR 2018.
– Learning Network Architecture: Zoph, Barret, et al. "Learning Transferable Architectures for Scalable Image Recognition." CVPR 2018.
• Other Works Cited in Lecture (increase accuracy and efficiency)
– EfficientNet: Tan, Mingxing, et al. "EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks." ICML 2019.