Energy Consumption Reduction in Wireless Sensor Network Based on Clustering IJCNCJournal
ABSTRACT
One of the important issues in routing protocol design for Wireless Sensor Networks (WSNs) is minimizing energy consumption and maximizing network lifetime. Networks and information systems are now such a central part of modern life that people cannot do without them. On the other hand, the impairment of these networks leads to great and incalculable costs. In this paper, a new clustering-based method is presented to address the problem of energy consumption. The proposed energy-based clustering algorithm creates clusters of the same energy level and distributes energy efficiently across the WSN nodes. The clustering protocol classifies network nodes based on energy and neighbourhood criteria and attempts to better balance energy in clusters, ultimately increasing network lifetime and maintaining network coverage. Results show that the proposed algorithm is on average 40% better than the LEACH algorithm and 14% better than the IBLEACH algorithm.
KEYWORDS
Wireless Sensor Network, Clustering, LEACH Algorithm, IBLEACH Algorithm
Abstract Link : http://aircconline.com/abstract/ijcnc/v11n2/11219cnc03.html
Full Details : http://aircconline.com/ijcnc/V11N2/11219cnc03.pdf
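For readers unfamiliar with the LEACH baseline named in the abstract, the following is a minimal sketch of the standard LEACH cluster-head election rule, in which a node that has not yet served as cluster head in the current cycle elects itself when a random draw falls below T(n) = p / (1 - p (r mod 1/p)). It illustrates the baseline only; the abstract does not specify the proposed energy-based protocol. The function names and the `eligible` bookkeeping are assumptions made for this example.

```python
import random


def leach_threshold(p: float, round_no: int) -> float:
    """Standard LEACH threshold T(n) for a node that has not yet been
    cluster head in the current cycle of 1/p rounds."""
    cycle = round(1 / p)  # number of rounds per cycle
    return p / (1 - p * (round_no % cycle))


def elect_cluster_heads(node_ids, eligible, p, round_no, rng=random):
    """Return the nodes whose random draw falls below T(n).

    `eligible` is the set of nodes that have not served as cluster head
    in the current cycle (hypothetical bookkeeping for this sketch).
    """
    t = leach_threshold(p, round_no)
    return [n for n in node_ids if n in eligible and rng.random() < t]
```

The threshold rises towards 1 as the cycle progresses, so every eligible node eventually takes a turn as cluster head regardless of its remaining energy, which is the imbalance that energy-aware variants try to correct.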
Optimized Design of 2D Mesh NOC Router using Custom SRAM & Common Buffer Util... VLSICS Design
With shrinking technology, reduced scale and power-hungry chip I/O lead to System on Chip (SoC) designs. SoC design using a traditional standard bus scheme encounters issues such as non-uniform delay and routing problems. Crossbars scale better than buses but tend to become huge as the number of nodes increases. Network on Chip (NoC) has become the design paradigm for SoC design because of its highly regular interconnect structure, good scalability and linear design effort. The main components of an NoC topology are the network adapters, routing nodes, and network interconnect links. This paper mainly deals with the implementation of full custom SRAM-based arrays in place of D flip-flop (D FF) based register arrays in the input module of a routing node in a 2D mesh NoC topology. The custom SRAM blocks replace D FF memory implementations to optimize the area and power of the input block. Full custom design of the SRAMs was carried out with MILKYWAY, while physical implementation of the input module with SRAMs was carried out with Synopsys IC Compiler. The improved design occupies approximately 30% of the area of the original design. This is in conformity with the ratio of the area of an SRAM cell to the area of a D flip-flop, which is approximately 6:28. The power consumption is almost halved, to 1.5 mW. Maximum operating frequency is improved from 50 MHz to 200 MHz. The paper also studies and quantifies the behavior of the single packet array design in relation to the multiple packet array design. Intuitively, a common packet buffer would result in better utilization of available buffer space, which in turn would translate into lower transmission delays. A MATLAB model is used to show quantitatively how performance is improved in a common packet array design.
TriBA (Triplet Based Architecture) is a Network on Chip (NoC) processor architecture which merges the
core philosophy of Object Oriented Design with the hardware design of multicore processors [1]. We
present TriBASim in this paper, a NoC simulator specifically designed for TriBA. In TriBA, nodes are
connected in recursive triplets. Performance analysis of the TriBA network topology has been carried out from
different perspectives [2] and routing algorithms have been developed [3][4], but the architecture still lacks
a simulator that researchers can use to run simple and fast behavioural analysis on the architecture
based on common parameters in the Network on Chip arena. TriBASim, a simulator for TriBA based on
SystemC [6], lessens the burden on TriBA researchers by letting them plug in desired parameters and
have nodes and topology set up ready for analysis.
ERROR PERFORMANCE ANALYSIS USING COOPERATIVE CONTENTION-BASED ROUTING IN WIRE... IJCSEIT Journal
In wireless ad hoc networks, cooperation among nodes can be achieved through more interaction at the higher
protocol layers; the MAC (Medium Access Control) and network layers in particular play a vital role. The MAC
layer facilitates a position-based routing protocol at the network layer, known as beacon-less geographic
routing (BLGR), which uses a contention-based selection process. This paper proposes a two-level
cross-layer framework: a MAC-network cross-layer design for forwarder selection (or routing) and a
MAC-PHY design for relay selection. Wireless networks carry a large number of simultaneous communications,
which increases collisions and energy consumption; the paper therefore focuses on a new contention access
method that dynamically changes the channel access probability to reduce the number of contention
attempts and collisions. Simulation results demonstrate the best relay selection, compare direct
mode with cooperative networks, and evaluate the performance of the contention probability with
collision avoidance.
PR-169: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks Jinwon Lee
This is the 169th paper review of PR12, the TensorFlow-KR paper reading group.
The paper reviewed this time is EfficientNet, published by Google. Efficient neural networks have mostly been studied as small networks for edge devices with limited computing power, such as mobile devices. This paper instead starts from the common practice of making a network progressively larger to improve performance, and studies how to scale a network up more efficiently. Please see the video for details.
Paper link: https://arxiv.org/abs/1905.11946
Video link: https://youtu.be/Vhz0quyvR7I
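As a rough illustration of the scaling idea, the EfficientNet paper scales depth, width and input resolution together using a single compound coefficient phi. The sketch below uses the base coefficients reported in the paper (alpha = 1.2, beta = 1.1, gamma = 1.15, chosen so that alpha * beta^2 * gamma^2 is approximately 2); the function names are assumptions made for this example, and real implementations also round the resulting layer counts and channel widths.

```python
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # base coefficients reported in the paper


def compound_scale(phi: float):
    """Return (depth, width, resolution) multipliers for coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


def flops_multiplier(phi: float) -> float:
    """FLOPs grow roughly with depth * width^2 * resolution^2."""
    depth, width, resolution = compound_scale(phi)
    return depth * width ** 2 * resolution ** 2
```

With these coefficients each unit increase in phi roughly doubles the FLOPs budget, which is what makes a single knob sufficient for trading accuracy against cost.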
Stochastic analysis of random ad hoc networks with maximum entropy deployments ijwmn
In this paper, we present the first stochastic analysis of the link performance of an ad hoc network modelled
by a single homogeneous Poisson point process (HPPP). According to the maximum entropy principle, the
single HPPP model is mathematically the best model for random deployments with a given node density.
However, previous works in the literature only consider a modified model which shows a discrepancy in the
interference distribution with the more suitable single HPPP model. The main contributions of this paper
are as follows. 1) It presents a new mathematical framework leading to closed form expressions of the
probability of success of both one-way transmissions and handshakes for a deployment modelled by a
single HPPP. Our approach, based on stochastic geometry, can be extended to complex protocols. 2) From
the obtained results, all confirmed by comparison to simulated data, optimal PHY and MAC layer
parameters are determined and the relations between them are described in detail. 3) The influence of the
routing protocol on handshake performance is taken into account in a realistic manner, leading to the
confirmation of the intuitive result that the effect of imperfect feedback on the probability of success of a
handshake is only negligible for transmissions to the first neighbour node.
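To make the deployment model concrete, the following is a minimal sketch of how a homogeneous Poisson point process on a square region can be sampled: draw the number of nodes from a Poisson distribution with mean density times area, then place each node uniformly at random. The function names and the square region are assumptions made for this example, not taken from the paper.

```python
import math
import random


def sample_poisson(mean: float, rng: random.Random) -> int:
    """Draw a Poisson-distributed integer (Knuth's method, fine for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def sample_hppp(density: float, side: float, rng: random.Random):
    """Sample a homogeneous Poisson point process on a side x side square."""
    n = sample_poisson(density * side * side, rng)
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]
```

Simulated link statistics obtained from many such samples are the kind of data against which closed-form results can be checked.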
AREA-EFFICIENT DESIGN OF SCHEDULER FOR ROUTING NODE OF NETWORK-ON-CHIP VLSICS Design
Traditional System-on-Chip (SoC) design employed shared buses for data transfer among various subsystems. As SoCs become more complex and involve a larger number of subsystems, the traditional bus-based architecture is giving way to a new paradigm for on-chip communication, called Network-on-Chip (NoC). A communication network of point-to-point links and routing switches is used to facilitate communication between subsystems. The routing switch proposed in this paper consists of four components, namely the input ports, output ports, switching fabric, and scheduler. The scheduler design is described in this paper. The function of the scheduler is to arbitrate between requests by data packets for use of the switching fabric. The scheduler uses an improved round-robin arbitration algorithm. Owing to the symmetric structure of the scheduler, an area-efficient design is proposed by folding the scheduler onto itself, thereby reducing its area by roughly 50%.
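The arbitration step can be pictured with a plain round-robin arbiter, sketched below as a behavioural model. This is the textbook scheme only; the abstract does not describe what the improved variant changes, and the class and method names are assumptions made for this example.

```python
class RoundRobinArbiter:
    """Plain round-robin arbiter over a fixed number of request lines."""

    def __init__(self, num_ports: int):
        self.num_ports = num_ports
        self.last_grant = num_ports - 1  # so port 0 has priority first

    def grant(self, requests):
        """Return the granted port index, or None if nobody requests."""
        for offset in range(1, self.num_ports + 1):
            port = (self.last_grant + offset) % self.num_ports
            if requests[port]:
                self.last_grant = port  # granted port gets lowest priority next
                return port
        return None
```

Because the search always starts just after the most recently granted port, no requesting port can be starved by the others.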
PR-155: Exploring Randomly Wired Neural Networks for Image Recognition Jinwon Lee
This is the 155th paper review of PR12, the TensorFlow-KR paper reading group.
This time I reviewed Exploring Randomly Wired Neural Networks for Image Recognition, recently released (April 2) by Facebook AI Research. The paper shocked many people with its result that randomly generated networks perform as well as or better than networks that people have put enormous effort into designing. Please see the slides and video for details.
Paper link: https://arxiv.org/abs/1904.01569
Video link: https://youtu.be/NrmLteQ5BC4
DIA-TORUS: A NOVEL TOPOLOGY FOR NETWORK ON CHIP DESIGN IJCNCJournal
The shortcomings of conventional bus architectures lie in scalability and the ever-increasing
demand for bandwidth. In addition, feature sizes in the sub-micron domain keep decreasing, making it
difficult for bus architectures to fulfill the requirements of modern System on Chip (SoC) designs. Network
on Chip (NoC) architectures present a solution to these shortcomings by employing a
packet-based network for inter-IP communication. A pivotal feature of NoC systems is the topology in
which the system is arranged. Several topology-dependent parameters, such as hop count, path
diversity and degree, affect system performance. We propose a novel
topology for NoC architectures, which has been thoroughly compared with existing topologies on the
basis of different network parameters.
OpenFlow is one of the most commonly used protocols for communication between the
controller and the forwarding element in a software-defined network (SDN). A model based on
M/M/1 queues is proposed in [1] to capture the communication between the forwarding element
and the controller. Although the model provides useful insight, it is accurate only when
the probability of expecting a new flow is small.
Moreover, it is not straightforward to extend the model in [1] to more than one forwarding
element in the data plane. In this work we propose a model which addresses both of these
challenges. The model is based on the Jackson assumption but with corrections tailored to
OpenFlow-based SDN. Performance analysis using the proposed model indicates that
the model is accurate even when the probability of a new flow is quite large. Further,
we show by a toy example that the model can be extended to more than one node in the data plane.
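As background for the queueing terminology, the sketch below computes the standard steady-state quantities of a single M/M/1 queue: utilization rho = lambda / mu, mean number in the system rho / (1 - rho), and mean time in the system 1 / (mu - lambda). It covers one isolated queue only, not the corrected Jackson-network model proposed in the abstract; the function name and dictionary keys are assumptions made for this example.

```python
def mm1_metrics(arrival_rate: float, service_rate: float) -> dict:
    """Steady-state metrics of an M/M/1 queue (requires arrival < service rate)."""
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival rate must be below service rate")
    rho = arrival_rate / service_rate
    return {
        "utilization": rho,
        "mean_in_system": rho / (1 - rho),
        "mean_sojourn_time": 1 / (service_rate - arrival_rate),
    }
```

For example, a controller that serves 10 requests per second and receives 8 per second is 80% utilized and holds each request for half a second on average.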
Stochastic Computing Correlation Utilization in Convolutional Neural Network ... TELKOMNIKA JOURNAL
In recent years, many applications have been implemented in embedded systems and mobile Internet of Things (IoT) devices that typically have constrained resources and smaller power budgets, and that exhibit "smartness" or intelligence. To implement computation-intensive and resource-hungry Convolutional Neural Networks (CNNs) in this class of devices, many research groups have developed specialized parallel accelerators using Graphical Processing Units (GPU), Field-Programmable Gate Arrays (FPGA), or Application-Specific Integrated Circuits (ASIC). An alternative computing paradigm called Stochastic Computing (SC) can implement CNNs with low hardware footprint and power consumption. To enable building more efficient SC CNNs, this work implements the basic CNN functions in SC in a way that exploits correlation, shares Random Number Generators (RNG), and is more robust to rounding error. Experimental results show that the proposed solution provides significant savings in hardware footprint and increased accuracy for the SC CNN basic function circuits compared to previous work.
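The role of correlation can be seen in a small software model of unipolar stochastic computing, where a value in [0, 1] is encoded as the fraction of ones in a bitstream. In the sketch below, an AND gate fed by independently generated streams approximates the product of the two values, while the same gate fed by streams generated from one shared random sequence yields their minimum. This is a generic illustration of the effect, not the circuits from the paper, and all names are assumptions made for this example.

```python
import random


def to_bitstream(value, randoms):
    """Unipolar SC encoding: a bit is 1 when the random sample is below value."""
    return [1 if r < value else 0 for r in randoms]


def and_gate(xs, ys):
    """Bitwise AND of two equal-length bitstreams."""
    return [x & y for x, y in zip(xs, ys)]


def estimate(bits):
    """Decode a bitstream as the fraction of ones."""
    return sum(bits) / len(bits)


def demo(a=0.5, b=0.8, length=4096, seed=0):
    """Return (estimate with independent RNGs, estimate with a shared RNG)."""
    rng = random.Random(seed)
    r1 = [rng.random() for _ in range(length)]
    r2 = [rng.random() for _ in range(length)]
    independent = estimate(and_gate(to_bitstream(a, r1), to_bitstream(b, r2)))
    shared = estimate(and_gate(to_bitstream(a, r1), to_bitstream(b, r1)))
    return independent, shared
```

Sharing one generator therefore saves hardware and changes what the gate computes, which is why correlation has to be managed deliberately rather than treated only as an error source.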
Partially connected 3D NoC - Access Noxim. Abhishek Madav
Project for building a partially connected 3D NoC using Access Noxim co-simulator as a part of the EECS 213 - Advanced Computer Architecture course at University of California, Irvine.
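NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis taeseon ryu
This paper presents a 3D-aware model. With StyleGAN, when you want to edit a single feature, you find the latent vector corresponding to the input and modify it, which changes that feature. Building directly on this concept,
the GAN Space paper tried to edit even the spatial information of an input. Looking at the results, rotation information appears to be learned reasonably well, but the output is sometimes perceived as a different person. This problem is described as a lack of disentanglement: instead of only the desired feature changing, other features change as well. This paper was created to address that by giving the model a better and more efficient understanding of 3D.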
PR-183: MixNet: Mixed Depthwise Convolutional Kernels Jinwon Lee
This is the 183rd paper review of PR12 (12PR), the TensorFlow-KR paper reading group.
The paper reviewed this time is MixNet, published by Google Brain. Depthwise convolution is widely used in CNNs that aim for efficiency, and this paper proposes using depthwise convolution filters of varying sizes to improve both accuracy and efficiency. Please see the video for details.
Paper link: https://arxiv.org/abs/1907.09595
Presentation video: https://youtu.be/252YxqpHzsg
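#PR12 #PR366
Hello, this is the 366th paper review of the PR-12 paper reading group.
This year marks the 10th anniversary of AlexNet.
After AlexNet burst onto the scene like a comet in 2012, the 2010s, when "solve a computer vision problem = use a CNN" was treated as a formula, have passed,
and in the 2020s, starting with the arrival of ViT, Transformer-based networks are threatening CNNs and have already taken over a considerable share of their territory.
Where should CNNs go in the 2020s?
Is it true that a Transformer, with its weaker inductive bias, is always better than a CNN when trained on large amounts of data?
Under the title of a CNN for the 2020s, this paper proposes a new(?) architecture called ConvNeXt.
In fact there is nothing new; it copies existing techniques and techniques used in Transformers and applies them to a CNN,
and the result is reported to outperform Transformers in both accuracy and speed.
There is some controversy about the results on Twitter; details, including this point, can be found in the video.
Thank you to everyone who always enjoys the videos and likes, comments and subscribes :)
Paper link: https://arxiv.org/abs/2201.03545
Video link: https://youtu.be/Mw7IhO2uBGc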
Contemporary systems on chip (SoCs) contain many intellectual property (IP) modules. This can create problems with interconnection among the different IP modules, which limits the system's ability to scale. Traditional bus-based SoC architectures have a connectivity bottleneck, and network on chip (NoC) has evolved as an embedded switching network to address this issue. The interconnections between the various cores or IP modules on a chip have a significant impact on communication and chip performance in terms of power, area, latency and throughput. Designing a reliable fault-tolerant NoC has also become a significant concern. In a fault-tolerant NoC it is critical to identify a faulty node and dynamically reroute packets while keeping latency to a minimum. This study provides insight into the NoC domain, with the intention of understanding a fault-tolerant approach based on the XY routing algorithm for a 4×4 mesh architecture. The fault-tolerant NoC design is synthesized on a field programmable gate array (FPGA).
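For reference, plain XY routing on a mesh moves a packet along the X dimension until the destination column is reached and only then along the Y dimension. The sketch below models that baseline behaviour with (x, y) router coordinates; it does not include the fault detection or rerouting that the study adds, and the function name is an assumption made for this example.

```python
def xy_route(src, dst):
    """Return the list of (x, y) routers visited from src to dst under XY routing."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:  # first correct the X coordinate
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:  # then correct the Y coordinate
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path
```

On a 4×4 mesh the longest such path, between opposite corners, takes 6 hops. Because the route is fully determined by source and destination, a single faulty router on it blocks the packet, which is why a fault-tolerant scheme needs a rerouting rule.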
APPLYING GENETIC ALGORITHM TO SOLVE PARTITIONING AND MAPPING PROBLEM FOR MESH... ijcsit
This paper presents a genetic-algorithm-based approach to the partitioning and mapping of multicore SoC cores over a NoC system that uses a mesh topology. The proposed algorithm performs the partitioning and mapping so as to reduce communication cost and minimize power consumption by placing intercommunicating cores as close together as possible. A program was developed in C++ in which the provided specification of the multicore MPSoC system captures all data dependencies before the design process starts. Experimental results on several multimedia benchmarks demonstrate that the genetic-based approach is able to find different satisfactory implementations of the partitioning and mapping of MPSoC cores over a mesh-based NoC system that satisfy the design goals.
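One common way to score a candidate mapping, and a plausible fitness term for a genetic search of this kind, is the traffic volume between each pair of cores weighted by the number of mesh hops between the tiles they are placed on. The sketch below assumes that cost model; the abstract does not give the exact objective used, and the function names and data layout are hypothetical.

```python
def manhattan(a, b):
    """Hop count between two mesh tiles given as (x, y) coordinates."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def communication_cost(traffic, placement):
    """Sum of traffic volume times hop distance over all communicating pairs.

    traffic:   {(src_core, dst_core): volume}
    placement: {core: (x, y) tile on the mesh}
    """
    return sum(
        volume * manhattan(placement[src], placement[dst])
        for (src, dst), volume in traffic.items()
    )
```

A mapping that places heavily communicating cores on adjacent tiles scores lower, so a search minimizing this value pulls such cores together.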
A Flexible Software/Hardware Adaptive Network for Embedded Distributed Archit... csijjournal
Embedded platforms are projected to integrate hundreds of cores in the near future, and expanding the interconnection network remains a key challenge. We propose SNet, a new Scalable NETwork paradigm that extends the NoC domain to include a software/hardware dynamic routing mechanism. To design routing pathways among communicating processes, it uses a distributed, adaptive, non-supervised routing method based on the Ant Colony Optimization (ACO) algorithm. A small-footprint hardware unit called DMC (Direct Management of Communications) speeds up data transfer. SNet has the benefit of being extremely versatile, allowing for the creation of a broad range of routing topologies to meet the needs of various applications. We present the DMC module in this work and assess SNet performance by executing a large number of test cases.
Performance Comparison and Analysis of Mobile Ad Hoc Routing Protocols CSEIJJournal
A mobile ad hoc network (MANET) is a wireless network that uses multi-hop peer-to-peer routing instead
of static network infrastructure to provide network connectivity. MANETs have applications in rapidly
deployed and dynamic military and civilian systems. The network topology in a MANET usually changes
with time. Therefore, there are new challenges for routing protocols in MANETs since traditional routing
protocols may not be suitable for MANETs. Researchers are designing new MANET routing protocols
and comparing and improving existing MANET routing protocols before any routing protocols are
standardized using simulations. However, the simulation results from different research groups are not
consistent with each other. This is because of a lack of consistency in MANET routing protocol models
and application environments, including networking and user traffic profiles. Therefore, the simulation
scenarios are not equitable for all protocols and conclusions cannot be generalized. Furthermore, it is
difficult for one to choose a proper routing protocol for a given MANET application. According to the
aforementioned issues, this paper focuses on MANET routing protocols. Specifically, my contribution
includes the characterization of different routing protocols and the comparison and analysis of their
performance.
Estimation of Optimized Energy and Latency Constraint for Task Allocation in ...ijcsit
In Network on Chip (NoC) based systems, energy consumption is affected by the task scheduling and allocation
schemes, which in turn affect the performance of the system. In this paper we test previously proposed
algorithms and introduce a new energy-efficient algorithm for 3D NoC architectures. Efficient dynamic
and cluster-based approaches are proposed, along with optimization using a bio-inspired algorithm. The
proposed algorithm has been implemented and evaluated on randomly generated benchmarks and real-life
applications such as MMS, Telecom and VOPD. The algorithm has also been tested with the E3S benchmark
and compared with the existing mapping algorithms Spiral and Crinkle; it shows a greater
reduction in communication energy consumption and an improvement in the performance of the
system. Experimental analysis of the proposed algorithm shows an average reduction of 49%
in energy consumption, 48% in communication cost and 34% in average latency.
The cluster-based approach is mapped onto the NoC using the Dynamic Diagonal Mapping (DDMap), Crinkle and
Spiral algorithms, and DDMap is found to provide improved results. On analysis and comparison of cluster
mapping, the DDMap approach gives an average energy reduction of 14% and 9% relative to Crinkle and Spiral, respectively.
FLEXIBLE VIRTUAL ROUTING FUNCTION DEPLOYMENT IN NFV-BASED NETWORK WITH MINIMU...IJCNCJournal
In a conventional network, most network devices, such as routers, are dedicated devices that do not
have much variation in capacity. In recent years, a new concept of Network Functions
Virtualisation (NFV) has come into use. The intention is to implement a variety of network functions
with software on general-purpose servers and this allows the network operator to select any
capabilities and locations of network functions without any physical constraints.
This paper focuses on the deployment of NFV-based routing functions, which are among the critical
virtual network functions, and presents an algorithm for virtual routing function allocation that
minimizes the total network cost. In addition, this paper presents a useful allocation policy for
virtual routing functions, based on an evaluation with a ladder-shaped network model. This policy
takes into consideration the ratio of the cost of a routing function to that of a circuit, as well as the
traffic distribution in the network. Furthermore, this paper shows that there are cases where the use of
NFV-based routing functions makes it possible to reduce the total network cost dramatically in
comparison to a conventional network, in which it is not economically viable to distribute small-capacity
routing functions.
Simulator for Energy Efficient Clustering in Mobile Ad Hoc Networkscscpconf
Research on various issues in mobile ad hoc networks is becoming popular because of their
challenging nature and all-time connectivity. Network simulators provide the
platform to analyse and imitate the working of the nodes in the networks along with the traffic
and other entities. The current work proposes the design of a simulator for the mobile ad hoc
networks that provides a test bed for the energy efficient clustering in the dynamic network.
Node parameters like degree of connectivity and average transmission power are considered for
calculating the energy consumption of the mobile devices. Nodes that consume minimum energy among their 1-hop neighbours are selected as the cluster heads.
RIVERBED-BASED NETWORK MODELING FOR MULTI-BEAM CONCURRENT TRANSMISSIONSijwmn
The paper presents a Riverbed simulator implementation with both routing and medium access control
(MAC) protocols for mobile ad hoc wireless networks with multi-beam smart antennas (MBSAs).
As one of the latest promising antenna techniques, MBSAs can achieve concurrent transmissions /
receptions in multiple directions/beams. Thus it can significantly improve the network throughput.
However, so far there is still no accurate network simulator that can measure the MBSA-based
routing/MAC protocol performance. In this paper, we describe the simulation models with the
implementation of MBSA antenna model in physical layer, MAC layer, and routing layer protocols, all in
Riverbed Modeler. We will compare two routing scenarios, i.e., multi-hop diamond routing scenario and
multi-path pipe routing. We will analyze the network performance for those two scenarios and illustrate the
advantages of using MBSAs in wireless networks.
Area-Efficient Design of Scheduler for Routing Node of Network-On-ChipVLSICS Design
Traditional System-on-Chip (SoC) design employed shared buses for data transfer among various subsystems. As SoCs become more complex, involving a larger number of subsystems, the traditional bus-based architecture is giving way to a new paradigm for on-chip communication. This paradigm is called Network-on-Chip (NoC). A communication network of point-to-point links and routing switches is used to facilitate communication between subsystems. The routing switch proposed in this paper consists of four components, namely the input ports, output ports, switching fabric, and scheduler. The scheduler design is described in this paper. The function of the scheduler is to arbitrate between requests by data packets for use of the switching fabric. The scheduler uses an improved round-robin-based arbitration algorithm. Due to the symmetric structure of the scheduler, an area-efficient design is proposed by folding the scheduler onto itself, thereby reducing its area by roughly 50%.
Performance Analysis of Mesh-based NoC’s on Routing Algorithms IJECEIAES
The advent of Systems-on-Chip (SoCs) has brought about a need to increase the scale of multi-core chip networks. Bus-based communication has proved to be limited in terms of performance and ease of scalability; the alternative to both bus-based and Point-to-Point (P2P) communication systems is a communication infrastructure called Network-on-Chip (NoC). The performance of an NoC depends on various factors such as network topology, routing strategy, switching technique and traffic patterns. In this paper, we compile a comparative analysis of different Network-on-Chip infrastructures based on the classification of routing algorithm, switching technique, and traffic patterns. The goal is to show how varied combinations of the three factors perform differently based on the size of the mesh network, using NOXIM, an open source SystemC simulator of mesh-based NoCs. The analysis has shown tenable evidence highlighting the novelty of the XY routing algorithm.
Evaluating feasibility of using wireless sensor networks in a coffee crop thr...IJCNCJournal
A Wireless Sensor Network (WSN) is a network formed of sensors that sense an area to
extract a specific metric, depending on the application.
We would like to analyse the feasibility of using sensors in a coffee crop. In this work we evaluate routing protocols using the real dimensions and characteristics of a coffee crop. We evaluate, through simulation, AODV, DSDV and AOMDV and two variants known in this work as AODVMOD and AOMDVMOD with the 802.15.4 MAC protocol.
For this comparison, we defined three performance metrics: Packet Delivery Ratio (PDR), End-to-End Delay
and Average Energy Consumption. Simulation results show that AOMDVMOD overall outperforms the other
routing protocols evaluated, showing that it is possible to use a WSN in a real coffee crop environment.
Traffic Characterization for Multicasting in NoC
V. Laxmi, Roopesh Chuggani, M. S. Gaur, Pankaj Khandelwal, Prateek Bansal
Department of Computer Engineering
National Institute of Technology
Jaipur
{vlaxmi | gaurms}@mnit.ac.in, {roopesh.chuggani2 | pankaj1394 | prateekbansal.895}@gmail.com
Abstract—NoC (Network on Chip) is an emerging paradigm for the design of VLSI/ULSI circuits to overcome the communication bottleneck of traditional bus-based systems. The NoC communication framework consists of regularly placed routers, which are connected to processing cores. NoC performance is determined by latency and throughput for communication requirements. NoC communication traffic modelling plays an important role in the design of NoC simulators and/or prototypes. This paper presents a framework for modelling source traffic for multipoint communication from one source to different destinations, as is required for multicasting. Such a traffic model captures real-world scenarios such as multicasting and the execution of multiple concurrent tasks on a single core (each task requiring communication with different destinations). The model proposes how concurrent traffic streams from a single core to different destinations can be mathematically characterized as a single stream at the source end. The model is derived from the statistical behaviour of probabilistic demultiplexing of a single traffic stream. In its nascent stage, the method is proposed for a scenario of one source concurrently communicating with two destinations, as is required for mapping two concurrent tasks to the same core or for simultaneous broadcast to two destinations.
Index Terms—Network on Chip, Multicasting, Bursty Traffic, Probabilistic Demultiplexing, Exponential Distribution
978-1-4244-8971-8/10$26.00 © 2010 IEEE
I. INTRODUCTION
VLSI designs are becoming increasingly complex as the scale of integration grows, resulting in more components being fabricated on the same chip. With the resulting increase in the number of processing cores (CPU, DSP, memory, etc.), the increased inter-core communication requirement cannot be satisfied by the traditional bus-based communication architecture [1], [2]. Network on Chip (NoC) has been proposed as an alternative [3]. NoC provides a communication layer of regularly placed, interconnected routers. Inter-core communication takes place through these routers. Decoupling of communication and computation simplifies the IC design process. Regularity in the NoC structure results in better scalability and fault tolerance [2], [4]. Because of its modular structure, many components can be reused from previous designs, resulting in reduced time to market for new NoC designs.
NoC design parameters include topology selection, router design and choice of routing function. An NoC simulator can assist the designer in the evaluation of different NoC designs. One important aspect of simulator design is the characterization of inter-core traffic. Traffic modelling of the cores is an important step in NoC design [5], [6]. Traffic models are mathematical characterizations of the statistical properties of data flowing from one core to another. Traffic modelling has been proposed as an open area of research in recent papers [7]. Most evaluations and analyses of NoC design parameters are still based on basic synthetic traffic patterns such as CBR (Constant Bit Rate), bursty, bit-complement, transpose, etc. These traffic patterns do not capture real-world scenarios, as each of them comprises only point-to-point communications, i.e. for each source there is only one destination. Traffic modelling of multicast communication for NoC is still in its infancy.
In multimedia applications, such as NoC designs for modules of an MPEG encoder/decoder, point-to-multipoint communication patterns are also needed, as experienced by the authors while extending the capability of an NoC simulator. This requires generation of multiple traffic streams originating from the same source but destined for different cores. A similar traffic pattern is observed when a core is running concurrent tasks, each task requiring communication with a different destination.
In this paper, we propose how multicast communication, i.e. multiple traffic streams originating at the source, can be viewed as a single traffic stream without any adverse impact on the statistical characteristics of the destination traffic streams. The model is derived from observations of the statistical behaviour of received streams at the destinations in a single-source, multiple-destination scenario. To the best of our knowledge, no traffic model has so far been proposed to accurately characterize this scenario. In this initial work, we present a model for two destinations. This can be used as a basis for n (n > 2) destinations.
The model is based on the observation that probabilistic division of a bursty traffic stream into two separate streams results in both streams being bursty. The burst parameter of each stream is related to that of the original stream. The proposed traffic model has been implemented and tested on an open source NoC simulator, NIRGAM [8].
This paper is organized as follows: In Section II, we present the background survey in this field. In Section III, we present the objectives of the presented work and the motivation for the proposed traffic model. In Section IV, we derive how the statistical characteristics of traffic streams received at the destinations are related to those of the source traffic. These relationships are derived from observations of the experiments conducted. Section V briefly describes the NoC simulator NIRGAM, on which the proposed model is implemented. In Section VI, the implementation of the proposed model on NIRGAM is described. Experimental results are presented in Section VII, followed by conclusions and pointers for further extension in Section VIII.
II. RELATED WORK
Applications need to be mapped to the underlying NoC architecture by dividing the functionality of the application into smaller tasks. Each task is mapped onto one NoC core. Many algorithms for mapping these tasks onto IP cores have been proposed [9]–[11]. In each of these previous works, a single task is mapped onto one IP core. Most of the past work has been done to map a single application onto the underlying network. In [9], the tasks of a process control platform are mapped onto NoC cores in a one-to-one manner. In [11], Hu et al. propose an energy-constrained mapping of a communication task graph to an NoC. This work considers a single task per core.
NoC evaluation is based on the assumption of mapping a single task per core and on traditional point-to-point traffic patterns like bit complement and transpose [3]. This type of communication is limited to only a few applications, because a node rarely communicates with just a single node or with all the other nodes in the network. For modelling a multicast (point-to-multipoint) scenario, uniform random traffic is used by selecting a random destination for each packet; the probability of each destination being selected is the same. In [12], a new traffic pattern is proposed to create the scenario where tasks with higher inter-task communication are mapped to cores in adjacent regions. In this traffic pattern, communication is point to point, but traffic is distributed to multiple destinations.
These traffic patterns cannot model the point-to-multipoint traffic generated by multiple tasks executing on a single core. This is because when we map multiple tasks onto a single core, the traffic of the core is composed of the individual traffic generated by each task. Each individual traffic stream can have different statistical properties and a different destination pattern, but traditional traffic generators do not provide functionality for such communication.
III. MOTIVATION
In this paper, we try to model a point-to-multipoint source traffic pattern given the statistical behaviour of the traffic received at the destinations. This will result in multiple traffic streams emerging from the same core. Each traffic stream may have a different destination and is likely to have different statistical properties.
Figure 1 shows one such scenario in an NoC of size 4 × 4 wherein cores are numbered 0 to 15. Core 0 is multicasting to cores 9 and 10. Core 7 is multicasting to cores 10 and 12. There is one unicast communication from core 15 to core 13.
Fig. 1: NoC Architecture with Multiple Task per core [4 × 4 mesh of cores numbered 0 to 15; legend: IP Core, Task, Data Flow]
IV. PROPOSED MODEL
The main objective of the work presented here is to determine how a point-to-multipoint traffic pattern can be modelled at the source end. We need to derive the statistical characteristics of the traffic at the source given the traffic characteristics at the destinations. For such a derivation, we first consider the inverse of the objective: given the source traffic characteristics, what are the statistical characteristics of the traffic received at the destinations? The following are the assumptions for our model.
1) There is one source and two destinations. This can happen when at most two traffic streams are emanating from a single core.
2) Each stream (task) generates bursty traffic; the average OFF time of this traffic is modelled using an exponential distribution.
3) The traffic model is independent of burst size (the number of packets in a particular burst). Experimental results suggest that the traffic statistics appear to be independent of burst size. Details are discussed in Section VII.
We define the following parameters for our traffic model:
1) mc : Average (mean) OFF time of the traffic generated by the core node.
2) p1 : Probability that a packet is destined for the first destination.
3) p2 : Probability that a packet is destined for the second destination.
4) mt1 : Average (mean) OFF time of the traffic received by the first destination.
5) mt2 : Average (mean) OFF time of the traffic received by the second destination.
Our model is based on the observation that when bursty traffic generated using an exponential distribution with average OFF time mc is demultiplexed probabilistically into two traffic streams, the demultiplexed traffic streams still follow an exponential distribution. The average OFF times of the two streams/tasks are mt1 and mt2 respectively. Probabilistic demultiplexing means that each packet is assigned to one of the streams/tasks as per the probabilities (p1, p2). A random number is generated and, if it is less than p1, this burst of packets belongs to the first stream, otherwise to the second one.
We investigate the dependence of mt1 and mt2 on mc, p1 and p2.
A. Bursty Traffic Model
Bursty traffic is modelled using an exponential distribution [8]. Both the inter-packet interval and the packet size follow an exponential distribution. We are concerned only with inter-packet intervals.
The exponential distribution is parametrized by the average value of the distribution, denoted by m. The probability density function (PDF) of an exponential distribution is
$$f(x; m) = \begin{cases} \frac{1}{m} e^{-x/m}, & x \ge 0 \\ 0, & x < 0 \end{cases} \qquad (1)$$
m is also known as the expected value of the distribution. The following variables are required in the traffic model.
B. Observation of Demultiplexed Trace
We generated a traffic trace with a random average OFF time mc. This traffic trace was divided into two different traces using probabilities (p1, p2). The PDF of the original trace was exponential, as expected. The PDF of each demultiplexed trace was observed to follow a similar exponential distribution. This observation was significant because it meant that we can generate two different exponential distributions from a single distribution by probabilistic demultiplexing.
To verify this observation, we generated and demultiplexed traffic for multiple values of mc. One such instance is shown in Figure 2. Here, Figure 2(a) shows the probability distribution of the original trace with m = 30, while Figure 2(b) shows the PDF of one of the demultiplexed traces with probability 0.6. As can be seen, both approximate an exponential distribution.
Fig. 2: (a) PDF for Original Trace, (b) PDF for a demultiplexed trace (probability = 0.6) [x-axis: Value of inter packet time; y-axis: Frequency]
C. Deriving the relation
To seek the relationship between mc and mt1, and between mc and mt2, we generated and demultiplexed traces for various values of mc and calculated the values of mt1 and mt2. It was found that the average OFF time of the traffic generated by each stream is directly proportional to the average core OFF time.
$$m_{t1} \propto m_c \qquad (2a)$$
$$m_{t2} \propto m_c \qquad (2b)$$
Fig. 3: mt1 v/s mc and mt2 v/s mc [x-axis: Average Off time at Core; y-axis: Average Off time for tasks; series: Offtime of task 1 (mt1) with probability 0.4, Offtime of task 2 (mt2) with probability 0.6]
Figure 3 shows the plot of the average OFF time of the core against that of the demultiplexed traffic streams. On the X axis is the average OFF time of the core (mc), while on the Y axis is the OFF time of both streams. As can be seen, the curve comes out to be approximately linear, hence showing direct proportionality.
Next, we deduce the relationship between mt1, mt2 and p1, p2. To achieve this we kept mc constant and varied the probability of generation from 0.1 to 0.95 (p1 + p2 = 1). It was found that the average OFF time of the traffic generated by each stream is inversely proportional to the respective probability.
$$m_{t1} \propto \frac{1}{p_1} \qquad (3a)$$
$$m_{t2} \propto \frac{1}{p_2} \qquad (3b)$$
Figure 4 shows the plot of mt1 versus the probability (p1) for mc = 50. Probability is on the X-axis and average OFF time is on the Y-axis. As can be seen from the plot, the curve precisely shows the inverse relationship.
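The demultiplexing experiment described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' NIRGAM implementation: the function name `demultiplex_off_times`, the burst count and the seed are hypothetical, and bursts are assumed to be instantaneous so that only the OFF times matter.

```python
import random


def demultiplex_off_times(mc, p1, n_bursts=100_000, seed=0):
    """Return (mt1, mt2): mean gap between consecutive bursts of each stream.

    Assumptions (not from the paper): bursts are instantaneous, and the
    function name, n_bursts and seed are illustrative only.
    """
    rng = random.Random(seed)
    now = 0.0
    last_seen = [None, None]   # arrival time of the previous burst per stream
    gaps = [[], []]            # observed OFF times per stream
    for _ in range(n_bursts):
        # OFF time before the next burst: exponential with mean mc.
        now += rng.expovariate(1.0 / mc)
        # Probabilistic demultiplexing: stream 1 with probability p1.
        stream = 0 if rng.random() < p1 else 1
        if last_seen[stream] is not None:
            gaps[stream].append(now - last_seen[stream])
        last_seen[stream] = now
    mt1 = sum(gaps[0]) / len(gaps[0])
    mt2 = sum(gaps[1]) / len(gaps[1])
    return mt1, mt2


if __name__ == "__main__":
    for p1 in (0.2, 0.4, 0.6):
        mt1, mt2 = demultiplex_off_times(mc=30.0, p1=p1)
        print(f"p1={p1:.1f}  mt1={mt1:.1f}  mt2={mt2:.1f}")
```

Under these assumptions the mean gaps come out close to mc/p1 and mc/p2, in line with the proportionalities (2a) to (3b); for example, mc = 30 and p1 = 0.6 give roughly 50 and 75. The fitted constants c1 to c4 are not reproduced by this idealised sketch, since it ignores burst duration.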
Fig. 4: Variation of mt1 w.r.t. p1 [x-axis: Probability (p1); y-axis: Average Off time (mt1)]
As the probability approaches unity, the case reduces from a point-to-multipoint scenario to a point-to-point scenario and mt1 approaches mc, while for the other destination the OFF time attains a very high value. Using Equations (2a), (2b), (3a) and (3b) with curve fitting of both curves, the empirical relationship for the average OFF time of each stream was derived as:
$$m_{t1} \approx \frac{m_c + \frac{1}{p_2} + c_1}{p_1} + c_2 \qquad (4)$$
$$m_{t2} \approx \frac{m_c + \frac{1}{p_1} + c_3}{p_2} + c_4 \qquad (5)$$
c1, c2, c3, c4 are constants. In our case, when curve fitting was applied, the following values were obtained: c1 = c3 = 6 and c2 = c4 = −6.
Verification of Equations (4) and (5) is performed in two steps. We calculate the average OFF time of the traffic generated by each stream in two ways:
1) The values of mt1 and mt2 are calculated from the demultiplexed traces obtained with different values of p1, p2 and mc. These values are referred to as the ‘calculated’ or actual OFF time from the trace.
2) For all the corresponding values of p1, p2 and mc, the values of mt1 and mt2 are calculated using Equations (4) and (5). These values are referred to as the ‘analytical’ OFF time.
Analytical and actual values are plotted on the same figure to verify the derived Equations (4) and (5). Figures 5 and 6 show the result of the verification. The results are shown for different values of mc to verify our model for a range of core OFF time values. On the X axis is the probability of traffic generation and transmission for each stream, and on the Y axis is the OFF time of the traffic generated for that stream. As can be seen from Figures 5 and 6, the values from the analytical formula estimate the actual OFF time calculated from the demultiplexed trace very accurately.
Fig. 5: Analytical v/s actual OFF time of Task 1 for different values of mc [x-axis: Probability; y-axis: Average Off Time; series: actual and analytical OFF time for source OFF times 15, 25 and 35]
Fig. 6: Analytical v/s actual OFF time of Task 2 for different values of mc [x-axis: Probability; y-axis: Average Off time; series: actual and analytical OFF time for source OFF times 15, 25 and 35]
V. NIRGAM
Network-on-chip Interconnect Routing and Application Modelling (NIRGAM) [8] is a discrete-event, cycle-accurate simulator targeted at Network on Chip (NoC) research. NIRGAM is written in SystemC, which is a class library for hardware modelling built on top of C++. NIRGAM allows users to change various options of the NoC simulation at every stage, such as routing algorithm, topology, virtual channels, buffers, etc. The simulation framework allows results to be analysed in terms of various performance metrics such as latency, throughput, etc. Orion [13] has been integrated into NIRGAM and allows users to create and analyse power estimation graphs. NIRGAM provides support for fault tolerance [14] and QoS [15].
NIRGAM supports 2D mesh and 2D torus topologies. Routing in NIRGAM is done using flits. These are the units that
flow between routers. NIRGAM supports the wormhole switching mechanism. Presently it supports a number of routing algorithms such as XY, OE, DyAD, source, Q-routing, MaXY and PROM. A large number of options are available for traffic modelling in NIRGAM, as it supports various types of traffic patterns such as Hotspot and NED [12] as well as traffic injection models.
Other user-configurable parameters in NIRGAM are the number of virtual channels per physical channel, the buffer size of an input channel and the clock frequency. All these parameters can be specified in the NIRGAM configuration file before starting the simulation.

of mt1 and mt2 while the last two columns represent values calculated from traces generated by our traffic model. It can be observed that calculated values and input values are nearly equal.

TABLE I: Calculated vs Input mean OFF time
Input OFF time     Calculated probability    Calculated    Calculated OFF time
Task1    Task2     p1       p2               mc            Task1    Task2
16       25        0.60     0.40             4             15.4     22.2
20       40        0.66     0.34             8             21.3     43.0
16       16        0.50     0.50             3             17.1     18.0
15       20        0.56     0.44             3             15.3     20.6
10       20        0.65     0.35             1             12.8     22.7
30       10        0.26     0.74             1             32.7     10.9

VI. IMPLEMENTATION OF PROPOSED MODEL
As discussed in Section IV, given the values of mc, p1 and p2 we can calculate mt1 and mt2 using Equations (4) and (5). However, to implement the proposed traffic model as a traffic generator in any simulator, it is desirable that mt1 and mt2 be the input parameters. Different values of these average OFF times will represent different classes of streams.
To derive the values of mc, p1 and p2 for given values of mt1 and mt2, we use Equations (4) and (5) and the fact that p1 + p2 = 1, along with the derived values of c1, c2, c3 and c4. A generalized version of the equation needed to solve for p1 is given in Equation (6).

We ran simulations for different values of the flit interval. Simulation was done for three values of flit interval: 2, 4 and 8 clock cycles. Results are shown in Table II. It is observed that the mean OFF time calculated from the generated trace is independent of the flit interval. Hence, the proposed traffic model can be used with different flit intervals.

TABLE II: Calculated vs Input mean OFF time for different Flit Intervals
Input OFF time     Calculated OFF time
                   Flit Interval = 2    Flit Interval = 4    Flit Interval = 8
Task1    Task2     Task1    Task2       Task1    Task2       Task1    Task2
15       20        15.8     20.0        16.2     19.0        15.6     20.2
11       30        11.0     31.4        11.2     29.7        11.4     29.1
8        11        8.7      11.4        8.7      11.5        8.6      11.6
18       18        17.8     18.5        18.5     18.4        18.0     18.2

$$p_1^3 (m_{t1} + m_{t2} + 12) - p_1^2 (m_{t1} + 2 m_{t2} + 18) + p_1 (m_{t2} + 8) - 1 = 0 \qquad (6)$$

Equation (6) has three possible roots; the one between 0 and 1 is selected, as probability values are in the range [0, 1]. The computed root is assigned to p1, and p2 is computed as 1 − p1. mc can then be calculated using Equation (4).
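The root-finding step can be sketched as follows. This is a minimal Python illustration, not the NIRGAM implementation: it uses plain scalar bisection, and the names `cubic`, `bisect` and `solve_parameters` are hypothetical. One caveat shapes the design: more than one root of Equation (6) can fall inside (0, 1). For mt1 = mt2 = 16, for example, the cubic factors as (p − 0.5)(44p² − 44p + 2), giving roots near 0.048, 0.5 and 0.952. The sketch therefore scans [0, 1] for sign changes and keeps the first root for which mc, obtained by rearranging Equation (4) with c1 = 6 and c2 = −6, is non-negative. That selection rule is an assumption made here and is not stated in the text.

```python
def cubic(p, mt1, mt2):
    """Left-hand side of Equation (6)."""
    return (p**3 * (mt1 + mt2 + 12)
            - p**2 * (mt1 + 2 * mt2 + 18)
            + p * (mt2 + 8)
            - 1)


def bisect(f, lo, hi, tol=1e-10):
    """Plain bisection on [lo, hi]; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)


def solve_parameters(mt1, mt2, steps=1000):
    """Return (p1, p2, mc) for the requested mean OFF times.

    Scans [0, 1] in `steps` sub-intervals, refines each sign change by
    bisection, and keeps the first root with mc >= 0 (assumed rule).
    """
    f = lambda p: cubic(p, mt1, mt2)
    grid = [i / steps for i in range(steps + 1)]
    for lo, hi in zip(grid, grid[1:]):
        if (f(lo) < 0) != (f(hi) < 0):
            p1 = bisect(f, lo, hi)
            p2 = 1.0 - p1
            mc = (mt1 + 6) * p1 - 1.0 / p2 - 6   # Equation (4) solved for mc
            if mc >= 0:
                return p1, p2, mc
    raise ValueError("no root in (0, 1) gives a non-negative mc")


if __name__ == "__main__":
    print(solve_parameters(16, 16))
```

For the Table I input pair (16, 16) this returns p1 = p2 = 0.5 (to within the tolerance) and mc = 3, matching the corresponding table row.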
TABLE III: Calculated vs Input mean OFF time for different Burst Lengths
Input OFF time     Calculated OFF time
                   Burst size = 4       Burst size = 8       Burst size = 12
Task1    Task2     Task1    Task2       Task1    Task2       Task1    Task2
15       20        14.8     20.2        14.6     19.1        14.5     18.9
11       30        11.4     31.6        11.3     28.6        11.0     30.6
8        11        8.6      11.6        8.3      11.9        8.0      12.0
18       18        18.3     17.2        17.2     18.4        18.4     18.4

When implementing the traffic model in NIRGAM, the values of mt1 and mt2 are read from a configuration file. Using these values, Equation (6) is solved for p1 using the bisection method [16]. Once mc, p1 and p2 are known, mc is used to generate bursty traffic. Each time a new burst starts, a random number is generated in the range [0, 1]. If the generated number is less than p1, the first stream is allowed to transmit, i.e. the destination is chosen according to the first stream for the current burst; otherwise the destination is chosen according to the second stream.
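The per-burst destination choice described above can be sketched in a few lines. This is an illustrative Python version rather than the SystemC code used in NIRGAM. The function name `choose_destinations` and its parameters are hypothetical, and the generation of the bursts themselves (ON/OFF periods driven by mc) is deliberately left out.

```python
import random


def choose_destinations(p1, num_bursts, dest1, dest2, rng=None):
    """Pick a destination for each burst by probabilistic demultiplexing.

    For every new burst a uniform random number is drawn; if it is less
    than p1 the burst goes to dest1 (first stream), otherwise to dest2
    (second stream).
    """
    rng = rng or random.Random()
    destinations = []
    for _ in range(num_bursts):
        u = rng.random()          # uniform in [0, 1)
        destinations.append(dest1 if u < p1 else dest2)
    return destinations


if __name__ == "__main__":
    # Destinations 7 and 10 are used purely as example core numbers.
    picks = choose_destinations(0.6, 10000, 7, 10, random.Random(1))
    print("share of bursts sent to the first stream:",
          picks.count(7) / len(picks))
```

Over many bursts the share sent to the first stream approaches p1, which is what lets the two demultiplexed streams end up with different average OFF times.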
VII. EXPERIMENTAL RESULTS
We ran the NIRGAM simulator for different values of mt1 and mt2 on a 4 × 4 mesh topology. The traffic model was attached to core 0, and the two destinations were cores 7 and 10 respectively. Traffic was generated for 5000 clock cycles and the simulation was run for 8000 clock cycles. The number of virtual channels was eight.

Simulation was run with different values of the burst size. We used three values of burst size: 4, 8 and 12 packets. The results obtained are shown in Table III. The mean OFF time calculated from the trace is independent of the burst size of the traffic. This observation allows the use of different burst sizes for modelling different streams/tasks.
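As a quick arithmetic check of the burst-size observation, the Table III entries can be compared with their input values. The short Python sketch below copies the numbers from Table III; the names `TABLE_III` and `max_relative_deviation` are hypothetical.

```python
# Table III: (input Task1, input Task2) -> calculated (Task1, Task2)
# for burst sizes 4, 8 and 12 packets, in that order.
TABLE_III = {
    (15, 20): [(14.8, 20.2), (14.6, 19.1), (14.5, 18.9)],
    (11, 30): [(11.4, 31.6), (11.3, 28.6), (11.0, 30.6)],
    (8, 11): [(8.6, 11.6), (8.3, 11.9), (8.0, 12.0)],
    (18, 18): [(18.3, 17.2), (17.2, 18.4), (18.4, 18.4)],
}


def max_relative_deviation(table):
    """Largest |calculated - input| / input over all table entries."""
    worst = 0.0
    for inputs, results in table.items():
        for calculated in results:
            for target, value in zip(inputs, calculated):
                worst = max(worst, abs(value - target) / target)
    return worst


if __name__ == "__main__":
    print(f"largest relative deviation: {max_relative_deviation(TABLE_III):.1%}")
```

The largest deviation is about 9% (Task 2 with input 11 at burst size 12), and no burst size is consistently closer to the inputs than the others.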
To verify the traffic model, the input values of mt1 and mt2 (values read from the configuration file as specified by the user) are compared with the values calculated from the demultiplexed trace. These values, along with the calculated values of mc, p1 and p2, are shown in Table I. Columns 1 and 2 show the input values

VIII. CONCLUSION
This paper presented a traffic model for multicast communication in NoC. It also models the traffic scenario of concurrent tasks mapped to the same core, each task requiring communication with a different destination. Mapping multiple tasks on a single
NoC core will reduce the size of the NoC chip and the cost, and shall provide more optimal use of network resources. To further analyse this concept of multicasting/multitasking, we provide a traffic model under the assumption that each task generates bursty traffic. For point-multipoint communication, the core can be viewed as generating a single stream with a fixed average OFF time. This burst is probabilistically demultiplexed into two streams. The probabilities for demultiplexing are calculated based on the specified average OFF time of the traffic generated by each communication stream. The traffic model is implemented and verified on an open-source NoC simulator, NIRGAM. The multicast traffic model is independent of inter-flit interval and burst size. In this paper, we have presented a novel model for simultaneous broadcast to two destinations, but the model can be extended to n (n > 2) destinations. In the latter case, the solution will require a numerical method. Further analysis of the performance of the various routing algorithms and topologies under other traffic distributions shall be part of our future work.

REFERENCES
[1] L. Carloni, P. Pande, and Y. Xie, “Networks-on-chip in emerging interconnect paradigms: Advantages and challenges,” in Networks-on-Chip, 2009. NoCS 2009, May 2009, pp. 93–102.
[2] L. Benini and G. D. Micheli, “Networks on chips: A new soc paradigm,” Computer, vol. 35, pp. 70–78, 2002.
[3] W. J. Dally and B. Towles, “Route packets, not wires: on-chip interconnection networks,” in DAC ’01: Proceedings of the 38th annual Design Automation Conference, 2001, pp. 684–689.
[4] J. Duato, S. Yalamanchili, and L. Ni, Interconnection Networks: An Engineering Approach. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2002.
[5] M. Ali, M. Welzl, and S. Hellebrand, “A dynamic routing mechanism for network on chip,” in NORCHIP Conference, 2005. 23rd, 21-22 2005, pp. 70–73.
[6] L. Tedesco, A. Mello, L. Giacomet, N. Calazans, and F. Moraes, “Application driven traffic modeling for nocs,” in SBCCI ’06: Proceedings of the 19th annual symposium on Integrated circuits and systems design, 2006, pp. 62–67.
[7] R. Marculescu and P. Bogdan, “The chip is the network: Toward a science of network-on-chip design,” Foundations and Trends in Electronic Design Automation, vol. 2, no. 4, pp. 371–461, 2009.
[8] “NIRGAM,” 2009. [Online]. Available: http://cse-trac.mnit.ac.in
[9] T. Ahonen, D. A. Sigüenza-Tortosa, H. Bin, and J. Nurmi, “Topology optimization for application-specific networks-on-chip,” in SLIP ’04: Proceedings of the 2004 international workshop on System level interconnect prediction. New York, NY, USA: ACM, 2004, pp. 53–60.
[10] W. H. Ho and T. M. Pinkston, “A methodology for designing efficient on-chip interconnects on well-behaved communication patterns,” in HPCA ’03: Proceedings of the 9th International Symposium on High-Performance Computer Architecture. Washington, DC, USA: IEEE Computer Society, 2003, p. 377.
[11] J. Hu and R. Marculescu, “Energy-aware mapping for tile-based noc architectures under performance constraints,” in ASP-DAC ’03: Proceedings of the 2003 Asia and South Pacific Design Automation Conference. New York, NY, USA: ACM, 2003, pp. 233–239.
[12] A.-M. Rahmani, I. Kamali, P. Lotfi-Kamran, A. Afzali-Kusha, and S. Safari, “Negative exponential distribution traffic pattern for power/performance analysis of network on chips,” in VLSI Design, 2009 22nd International Conference on, 5-9 2009, pp. 157–162.
[13] A. B. Kahng, B. Li, L.-S. Peh, and K. Samadi, “Orion 2.0: A fast and accurate noc power and area model for early-stage design space exploration,” in DATE’09, 2009, pp. 423–428.
[14] C. Grecu, L. Anghel, P. P. Pande, A. Ivanov, and R. Saleh, “Essential fault-tolerance metrics for noc infrastructures,” in IOLTS ’07: Proceedings of the 13th IEEE International On-Line Testing Symposium. Washington, DC, USA: IEEE Computer Society, 2007, pp. 37–42.
[15] K. K. Paliwal, J. S. George, N. Rameshan, V. Laxmi, M. S. Gaur, V. Janyani, and R. Narasimhan, “Implementation of QoS aware Q-routing algorithm for network-on-chip,” in Communications in Computer and Information Science, 2009.
[16] A. Eiger, K. Sikorski, and F. Stenger, “A bisection method for systems of nonlinear equations,” ACM Trans. Math. Softw., vol. 10, no. 4, pp. 367–377, December 1984.