A HYBRID NoC COMBINING SDM-BASED CIRCUIT SWITCHING WITH PACKET SWITCHING FOR REAL-TIME APPLICATIONS

Angelo Kuti Lusala
Microelectronics laboratory, UCL
1348 Louvain-la-Neuve, Belgium
email@example.com

Jean-Didier Legat
Microelectronics laboratory, UCL
1348 Louvain-la-Neuve, Belgium
Jean-Didier.Legat@uclouvain.be

978-1-4244-8971-8/10/$26.00 © 2010 IEEE

Abstract: In this paper we propose a hybrid network-on-chip which combines Spatial Division Multiplexing “SDM”-based circuit switching and packet switching in order to handle, efficiently and separately, the streaming and best-effort traffic generated by real-time applications. The SDM technique is used in the circuit-switched sub-network in order to increase path diversity, thereby improving throughput and mitigating low resource utilization, while the packet-switched sub-network is kept as simple as possible. In this way QoS is guaranteed without having to share resources, which often leads to a complex design. The proposed hybrid router architecture has been synthesized in FPGA and ASIC, and results show that a practical hybrid network-on-chip can be built using the proposed approach.

Keywords: SDM; TDM; QoS; circuit-switching; packet-switching; network-on-chip

I. INTRODUCTION

Multiprocessor Systems-on-Chip “MPSoCs” constitute suitable platforms for real-time applications, since they offer the high computing power and parallelism that real-time applications require. In MPSoC platforms, the performance of applications strongly relies on the on-chip interconnection network used to carry communications between the cores of the platform. Since real-time applications generate both streaming and best-effort traffic, the on-chip interconnection network must provide quality of service “QoS” for streaming traffic and data completion for best-effort traffic. Several networks-on-chip which handle both streaming and best-effort traffic have been proposed in the literature; some of them are Time Division Multiplexing “TDM”-based and connection-oriented, while others are connectionless and assign priority to traffic . However, handling both streaming and best-effort traffic in a network-on-chip by sharing resources is very hard and often leads to a complex design with power consumption and area overhead .

Since streaming traffic is well handled in a connection-oriented or circuit-switched network and best-effort traffic is well handled in a connectionless or packet-switched network, we propose in this paper a hybrid network-on-chip which separately and efficiently handles both streaming and best-effort traffic. The proposed hybrid network consists of two disjoint sub-networks, an SDM-based circuit-switched sub-network and a packet-switched sub-network. The streaming traffic is handled in the SDM-based circuit-switched sub-network while the best-effort traffic is handled in the packet-switched sub-network. We use the SDM technique in order to increase path diversity in the circuit-switched sub-network, thereby improving throughput and mitigating the low resource utilization which affects circuit-switched networks. In this way multiple connections can use channels in a given direction; we thus take advantage of the abundance of wires resulting from the increasing integration density of CMOS circuits. SDM has been proposed as a valid alternative to TDM . The SDM technique consists of allocating a sub-set of channel wires to a given connection between a source node and a destination node .

In the proposed architecture, the SDM variant used consists in allocating more than one channel between adjacent circuit-switched sub-routers. We define an SDM-Channel as a set of n-bit wide sub-channels, as shown in Fig. 1. Only one sub-channel in an SDM-Channel is dedicated to a connection. The SDM-based circuit-switched sub-router is configured by the packet-switched sub-router.

The proposed hybrid router architecture was implemented in both ASIC and FPGA technologies. Results show that increasing the number of sub-channels in an SDM-Channel does not greatly affect the size and the maximum clock frequency of the proposed hybrid router. A practical network-on-chip can then be built using the proposed router architecture.

The rest of the paper is organized as follows. Related work is explored in Section II. Section III introduces the proposed architecture. Section IV discusses synthesis results of the proposed router architecture. Finally, a conclusion is drawn in Section V.

II. RELATED WORK

Many hybrid networks-on-chip have been proposed in the literature. In this paper we focus on those which combine different switching techniques. In , the ÆTHEREAL NoC is presented. It consists of two disjoint sub-networks: a Guaranteed Service “GS” sub-network and a Best-Effort “BE” sub-network. The GS sub-network is TDM-based and connection-oriented, while the BE sub-network is packet-switched. The BE sub-network is responsible for configuring the GS sub-network. The reserved time slots are used to carry streaming traffic while non-reserved time slots are used to carry best-effort traffic. Store-and-forward flow control is used in the GS sub-network and wormhole flow control is used in the BE sub-network, which implies the use of buffers in both the GS and BE sub-networks. This often leads to a complex design with area and power consumption overhead . In , a
technique called Hybrid Circuit Switching “HCS” is presented; it consists of a circuit-switched network which intermingles circuit-switched flits with packet-switched flits. A circuit-switched packet is immediately injected in the network behind the circuit setup request. If there is no unused resource, the circuit-switched packet is transformed into a packet-switched packet and is buffered; it then keeps its new state until it is delivered. Although this technique can reduce the circuit setup time overhead in the circuit-switched network, it is still difficult to provide QoS. The NoC presented in  is quite similar to the one presented in . In this NoC, a packet can use alternately a circuit-switched and a packet-switched sub-network. The authors claim that in each router, traffic is split between the two sub-networks in such a way that the power and the performance metrics of the NoC are improved. With this technique it is still difficult to provide QoS for streaming traffic. In , one of the first works using SDM in a NoC in order to provide QoS is presented. This NoC covers only streaming traffic.

III. PROPOSED NETWORK ARCHITECTURE

A. Router architecture

The proposed router architecture consists of two major components, as illustrated in Fig. 1: a packet-switched sub-router and an SDM-based circuit-switched sub-router.

Figure 1. Hybrid router architecture

The two sub-routers handle traffic independently. The SDM-based circuit-switched sub-router is configured by the packet-switched sub-router, while the SDM-based circuit-switched sub-router notifies the packet-switched sub-router when a transaction of transferring streaming traffic is completed. As seen previously, an SDM-Channel consists of a given number of sub-channels. Each sub-channel is n bits wide and is identified by a number called the “number identifier”. A connection can only acquire one sub-channel in an SDM-Channel. For example, for an SDM-Channel of five sub-channels, up to five connections can simultaneously use this SDM-Channel. With the SDM approach, the router offers increased path diversity, improving the throughput compared to a simple circuit-switched router or a TDM connection-oriented router.

When a tile needs to send streaming traffic to another tile in the network, a path or connection must first be established between the two tiles. To establish the connection, the source tile sends a setup best-effort packet to its packet-switched sub-router. This setup packet reserves an available sub-channel in each packet-switched sub-router crossed along its path from the source to the destination. The packet-switched sub-router configures the attached SDM-based circuit-switched sub-router by indicating the number identifier of the sub-channel to use for the connection concerned. When the transaction of transferring streaming traffic is completed, each SDM-based circuit-switched sub-router along the path notifies its attached packet-switched sub-router to release the sub-channel concerned. When a tile needs to send best-effort traffic, it directly sends a “Normal” best-effort packet to the packet-switched sub-router.

B. Packet-Switched sub-router

The packet-switched sub-router is responsible for handling best-effort traffic and configuring the attached SDM-based circuit-switched sub-router, as shown in Fig. 1. It uses the XY routing algorithm with cut-through flow control. We impose that packets coming from a given direction cannot return in the same direction. The packet-switched sub-router has five bidirectional ports, as shown in Fig. 1. Routing is distributed so that up to five packets can be simultaneously routed when they request different channels.

A best-effort packet consists of five fields; its structure is given in Fig. 2. The type field is 2 bits wide. The source and destination addresses are each 6 bits wide, since we consider a 7x7 mesh topology NoC. The sub-channel number identifier is 3 bits wide, thus an SDM-Channel can contain up to 7 sub-channels. The payload is 8 bits wide. The size of the best-effort packets varies according to the number of sub-channels in an SDM-Channel.

Figure 2. Best-effort packet

We define three types of best-effort packets for the proposed architecture:
- A setup best-effort packet, which is responsible for establishing paths for streaming traffic through the circuit-switched sub-network between a source and a destination. For a setup best-effort packet, the values of the fields type and payload are respectively “10” and “00000000”.
- An acknowledge “ACK” best-effort packet, which is generated when a setup packet reaches its destination. It is built by swapping the fields destination address and source address of the setup packet. Its fields type, sub-channel number identifier and payload are respectively “01”, “000” and “00000000”.
- A Normal best-effort packet, which carries best-effort payload. Its fields type and sub-channel number identifier are respectively “11” and “000”.

Figure 3. Packet-switched sub-router

The packet-switched sub-router consists of input buffers, link controllers and allocators, as shown in Fig. 3. The input buffers store incoming best-effort packets. The link controller is responsible for reading packets from the attached input buffer and deciding, according to the destination address, to which allocator to send the packet read. Since no packet loss is allowed, the link controller keeps the packet in a register until it receives a signal from the allocator indicating that the packet has been successfully sent to the output port. This strategy ensures that no packet is lost in the network.

The allocators are responsible for writing best-effort packets to the input buffers of the next packet-switched sub-router and for configuring the SDM-based circuit-switched sub-router. They first check the type of the best-effort packet. If the packet is an ACK or a Normal packet, the allocator directly sends it to the attached output link without modifying it. If the packet is a setup packet, the allocator reserves an available sub-channel in the SDM-Channel in the direction concerned and builds a new setup packet by replacing the sub-channel number field of the incoming setup packet with the number identifier of the reserved sub-channel. If the SDM-Channel has four sub-channels, the number identifiers of the sub-channels are respectively 1, 2, 3 and 4.

The incoming sub-channel number identifier and the outgoing sub-channel number identifier are concatenated and stored in a register called “reg_identifier”. The incoming sub-channel number identifier forms the most significant bits of this register, while the outgoing sub-channel number identifier forms the least significant bits. This register helps to find the sub-channel to release when a setup packet fails. When a setup packet fails to reserve a sub-channel in a packet-switched sub-router, a negative acknowledge “NACK” signal is sent back and propagates to all previous packet-switched sub-routers crossed by this setup packet, in order to release all sub-channels reserved by this setup packet. The NACK signal in the packet-switched sub-router where the setup packet failed is equal to the sub-channel number identifier contained in this setup packet. The NACK signal indicates to the previous packet-switched sub-router the sub-channel to release. In the previous packet-switched sub-router the NACK is equal to the MSB part of reg_identifier.

Configuring the attached SDM-based circuit-switched sub-router consists in indicating to each crossbar in the SDM-based circuit-switched sub-router the number identifier of the sub-channels to send to the output SDM-Channel.

C. SDM-Based Circuit-Switched sub-router

The SDM-based circuit-switched sub-router is responsible for carrying streaming traffic. It has five bidirectional ports. Four bidirectional ports are SDM-based and are used to connect the sub-router to the four adjacent circuit-switched sub-routers. The fifth bidirectional port, which consists of a single sub-channel, is used to connect the sub-router to the local tile, as shown in Fig. 1. The SDM-Channel consists of a given number of sub-channels. Each sub-channel is n bits wide. Streaming data is organized in packets, like cells in ATM. The streaming packet data unit structure is shown in Fig. 4.

Figure 4. Streaming packet data unit

The SDM-based circuit-switched sub-router consists of five crossbars and five header detectors. A crossbar and header detectors are placed in each direction. The crossbar consists of multiplexers. Since the crossbars are configured by the packet-switched allocators, the use of the XY routing algorithm in the packet-switched sub-router determines the number of input ports of each crossbar. The crossbar in the EAST direction can carry streaming traffic from either the SDM-Channel from the west or the local tile sub-channel. The crossbar in the WEST direction has the same structure as the one in the EAST direction. The crossbars in the NORTH, SOUTH and LOCAL directions can carry streaming data coming from four possible directions. Figure 5 gives the block diagram of the crossbar in the EAST direction and Figure 6 shows its implementation using multiplexers. Figure 7 gives the block diagram of the crossbar in the NORTH direction.

Figure 5. Block Diagram Crossbar East Direction

The size of the crossbars depends on the number of input ports. For an SDM-Channel consisting of three sub-channels, the size of the crossbars in the EAST and WEST directions is 5 x 3,
those in the NORTH and SOUTH directions are 11 x 3, while the size of the crossbar in the local direction is 13 x 1. Signals “Sel1”, “Sel2” and “Sel3” are provided by the best-effort allocator. Each sub-channel of the SDM output channel is connected to a header detector.

Figure 6. Detailed Crossbar East Direction

Figure 7. Block Diagram Crossbar North Direction

The header detector extracts the header of the streaming data packet; if the header is equal to 2, a signal is sent to the best-effort allocator attached to the crossbar to release the associated sub-channel. The SDM-based circuit-switched sub-router is entirely combinational. Once a circuit is set up, communication latency is determined by the serialization time needed to send the entire streaming message. Latency and throughput can be configured by inserting pipeline stages between routers, thus providing configurable “QoS”.

IV. SYNTHESIS RESULTS

The proposed hybrid router architecture has been implemented in Verilog, synthesized in ASIC, and placed and routed in FPGA. Input buffers holding 4 packets are used in the packet-switched sub-router. Synthesis results for ASIC and FPGA are shown in Table I and Table II respectively. Results for ASIC (65 nm CMOS from STMicroelectronics) show that the area and the power consumption per MHz of the router vary according to the number of sub-channels in an SDM-Channel. Results for FPGA (Stratix III from Altera) show that the total logic utilization of FPGA resources increases slightly with the number of sub-channels in an SDM-Channel, while the maximum clock frequency decreases. The increase in router area is mainly due to the crossbars, whose size varies according to the number of input ports. Synthesis results show that the proposed hybrid router architecture can be used to build a practical network-on-chip.

TABLE I. ASIC SYNTHESIS RESULTS FOR THE ROUTER

Number of sub-channels    Frequency    Area      Power
in an SDM-Channel         (GHz)        (µm2)     (µW/MHz)
3                         2.8          36352     13.21
4                         2.8          40743     14.28
5                         2.8          44157     15

TABLE II. FPGA P&R RESULTS FOR THE ROUTER

Number of sub-channels    Frequency    Total logic utilization
in an SDM-Channel         (MHz)        (Stratix III)
3                         179          1.4 %
4                         164          1.7 %
5                         148          2 %

V. CONCLUSION

In this paper, a hybrid network-on-chip architecture which combines SDM-based circuit switching with packet switching for real-time applications is proposed. The proposed hybrid router architecture efficiently and separately handles streaming and best-effort traffic, respectively in an SDM-based circuit-switched sub-router and a packet-switched sub-router. The SDM approach is used to allow path diversity, thereby mitigating low resource utilization in circuit switching and improving throughput. QoS is then easily provided. The packet-switched sub-router is kept as simple as possible. The proposed router architecture was implemented in Verilog and synthesized in both ASIC and FPGA. Synthesis results show that a practical network-on-chip can be built using the proposed approach.

REFERENCES

M. Faruque and J. Henkel, “QoS supported On-chip Communication for Multi-processors,” International Journal of Parallel Programming, Vol. 36 (1), pp. 114-139, Feb. 2008.
A. Leroy et al., “Spatial Division Multiplexing: a Novel Approach for Guaranteed Throughput on NoCs,” in Proc. of ISLPED, 2005.
K. Goossens, J. Dielissen and A. Radulescu, “ÆTHEREAL Network-on-Chip Concepts,” in IEEE Design and Test of Computers, Vol. 22 (5), pp. 414-421, 2005.
N. E. Jerger, L-S. Peh and M. Lipasti, “Circuit-Switched Coherence,” in Computer Architecture Letters, Vol. 6 (1), pp. 5-8, June 2007.
M. Modarressi et al., “A Hybrid Packet-Circuit Switched On-Chip Network Based on SDM,” in Proc. of DATE 09, pp. 566-569, April 2009.
N. Kavaldjiev et al., “A Virtual Channel Network-on-Chip for GT and BE Traffic,” in IEEE Computer Society Annual Symposium on VLSI: Emerging VLSI Technologies and Architectures, pp. 211-216, 2006.
P. T. Wolkotte et al., “An energy efficient reconfigurable circuit switched network-on-chip,” in Proc. of IEEE IPDPS, pp. 155a-155a, Apr. 2005.
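
As an illustration of the best-effort packet format and the sub-channel reservation step described in Section III.B, the following Python sketch packs the five fields into a 25-bit word, derives an ACK from a setup packet, and builds reg_identifier. It is not the Verilog implementation of the router. The field widths and type codes come from the text; the field order (defined in Fig. 2), the names BestEffortPacket, make_ack and reserve_subchannel, and the lowest-free-identifier selection policy are assumptions made for this example.

```python
from dataclasses import dataclass

# Field widths from the text: 2-bit type, 6-bit addresses,
# 3-bit sub-channel identifier, 8-bit payload (25 bits in total).
# The field ORDER used here is an assumption; the paper defines it in Fig. 2.
TYPE_BITS, ADDR_BITS, SUBCH_BITS, PAYLOAD_BITS = 2, 6, 3, 8

SETUP, ACK, NORMAL = 0b10, 0b01, 0b11  # type codes given in the text


@dataclass(frozen=True)
class BestEffortPacket:
    ptype: int
    src: int
    dst: int
    subch: int = 0
    payload: int = 0

    def pack(self) -> int:
        """Concatenate the five fields into one word, type field first."""
        word = 0
        for value, width in ((self.ptype, TYPE_BITS), (self.src, ADDR_BITS),
                             (self.dst, ADDR_BITS), (self.subch, SUBCH_BITS),
                             (self.payload, PAYLOAD_BITS)):
            if not 0 <= value < (1 << width):
                raise ValueError("field does not fit its width")
            word = (word << width) | value
        return word

    @staticmethod
    def unpack(word: int) -> "BestEffortPacket":
        """Split a packed word back into its fields, last field first."""
        fields = []
        for width in (PAYLOAD_BITS, SUBCH_BITS, ADDR_BITS, ADDR_BITS, TYPE_BITS):
            fields.append(word & ((1 << width) - 1))
            word >>= width
        payload, subch, dst, src, ptype = fields
        return BestEffortPacket(ptype, src, dst, subch, payload)


def make_ack(setup: BestEffortPacket) -> BestEffortPacket:
    """Build the ACK by swapping source and destination of the setup packet."""
    return BestEffortPacket(ACK, src=setup.dst, dst=setup.src, subch=0, payload=0)


def reserve_subchannel(free: set, incoming_id: int):
    """Hypothetical allocator step for a setup packet.

    Returns (outgoing_id, reg_identifier), or None when no sub-channel is
    free, which is the case in which the router would send back a NACK.
    """
    if not free:
        return None
    outgoing_id = min(free)  # selection policy is an assumption
    free.remove(outgoing_id)
    # Incoming identifier in the MSBs, outgoing identifier in the LSBs.
    reg_identifier = (incoming_id << SUBCH_BITS) | outgoing_id
    return outgoing_id, reg_identifier
```

In this sketch a failed reservation is signalled by returning None; in the router described above, the previous sub-router would then release the sub-channel named by the upper bits of its own reg_identifier.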