• Designed a Network-on-Chip router circuit in Cadence with error detection, error correction and a loopback module that routes data back, away from faulty routers, at any port of the router.
• Synthesized the RTL code with Synopsys Design Compiler, using our own library characterized with SiliconSmart ACE from the Cadence SPICE netlists of our own standard cells. Applied DC-Tcl scripts for logic optimization under various design constraints.
• Used the Synopsys DC SPEF file for timing-driven placement in the layout. The router layout was automatically placed and routed in Cadence Encounter APR from the LEF file of our own Cadence cell library.
• Achieved power reduction and timing closure with Synopsys PrimeTime; the chip operated at 1.86 GHz with a low power consumption of 17.37 mW.
RELIABLE NoC ROUTER ARCHITECTURE DESIGN USING IBM 130NM TECHNOLOGY
THE UNIVERSITY OF TEXAS AT DALLAS
EECT/CE 6325 VLSI DESIGN
Fall 2016
PROJECT #6
ROUTER DESIGN
BY:
ILANGO JEYASUBRAMANIAN - 2021270958
MUKESH TRITH SWAIN - 2021288272
ADITYA MANISHBHAI MEHTA - 2021287096
GENERAL DESCRIPTION OF THE DESIGN
We implemented a simple four-port router in Verilog. Each port contains the following modules:
LOOPBACK MODULE:
Data enters and leaves each port only through this module. Its main function is to loop the output back into the input of the same port of the router whenever the input port of the corresponding destination router becomes unavailable.
FINITE STATE MACHINE MODULE:
This module detects whether the buffers are full at all the ports by checking the input data at each port in turn, one port per clock pulse. It works like a 4-bit sequence detector, with bit ‘1’ when input is present at the port and bit ‘0’ when there is none. A sequence of ‘1111’ means that data is present at all four ports. The module then sets the port-unavailable signals id_un1, id_un2, id_un3 and id_un4 to ‘1’ at the corresponding ports of the router, indicating to the adjacent routers that the ports are unavailable because the buffers are full.
This lets the other routers route their data without losing packets by making use of the loopback module.
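The sequence-detector idea can also be written in a simpler form. The following is a minimal sketch, not the project's FSM module: it assumes a synchronous reset `rst`, and the module name `port_busy_detector` and the internal names `sel`, `present`, `hist` and `all_busy` are hypothetical. It samples one port per clock, shifts the presence bit into a 4-bit history, and raises all four id_un signals when the history reads ‘1111’.

```verilog
// Sketch only (assumption): shift-register form of the '1111' detector.
// Module name, rst, sel, present, hist and all_busy are hypothetical.
module port_busy_detector (
    input  wire        clk,
    input  wire        rst,      // synchronous reset, assumed for this sketch
    input  wire [15:0] D_chk1, D_chk2, D_chk3, D_chk4,
    output wire        id_un1, id_un2, id_un3, id_un4
);
    reg [1:0] sel;      // port sampled in the current cycle
    reg [3:0] hist;     // presence bits of the last four samples
    reg       present;  // 1 when the selected port holds a non-zero word

    // Pick the presence bit of the port selected by sel.
    always @(*) begin
        case (sel)
            2'd0:    present = (D_chk1 != 16'h0000);
            2'd1:    present = (D_chk2 != 16'h0000);
            2'd2:    present = (D_chk3 != 16'h0000);
            default: present = (D_chk4 != 16'h0000);
        endcase
    end

    always @(posedge clk) begin
        if (rst) begin
            sel  <= 2'd0;
            hist <= 4'b0000;
        end else begin
            sel  <= sel + 2'd1;             // wraps from 3 back to 0
            hist <= {hist[2:0], present};   // shift in the newest sample
        end
    end

    // Four consecutive '1' samples: flag every port as unavailable.
    wire all_busy = &hist;
    assign id_un1 = all_busy;
    assign id_un2 = all_busy;
    assign id_un3 = all_busy;
    assign id_un4 = all_busy;
endmodule
```

Unlike a detector that restarts after each match, this form keeps the signals high for as long as every sampled port holds data. Which behaviour is wanted depends on how the adjacent routers use id_un.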
ECC MODULE:
Data that enters through the loopback module is first sent to the Hamming-code error correction module, which checks the bits for errors and corrects a single-bit error. When it finds a double-bit error, the data is discarded, because the Hamming code cannot correct errors in two or more bits.
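The report does not list the bit layout of its Hamming code, so the following is only an illustrative sketch on a 4-bit payload: an extended Hamming (8,4) code with three check bits plus one overall parity bit. The module names `secded_enc`, `secded_dec` and `secded_tb` and all port names are hypothetical. It shows the two behaviours described above: a single-bit error is corrected, and a double-bit error is flagged so that the word can be discarded (here, forced to zero).

```verilog
// Sketch only: extended Hamming (8,4) code, not the project's ECC module.
// Codeword bit k sits at Hamming position k; bit 0 is the overall parity.
module secded_enc (input wire [3:0] d, output wire [7:0] cw);
    wire p1 = d[0] ^ d[1] ^ d[3];   // covers positions 1,3,5,7
    wire p2 = d[0] ^ d[2] ^ d[3];   // covers positions 2,3,6,7
    wire p4 = d[1] ^ d[2] ^ d[3];   // covers positions 4,5,6,7
    wire [6:0] h = {d[3], d[2], d[1], p4, d[0], p2, p1}; // positions 7..1
    assign cw = {h, ^h};            // append overall parity as bit 0
endmodule

module secded_dec (
    input  wire [7:0] cw,
    output wire [3:0] d,
    output wire       corrected,    // 1 when a single-bit error was fixed
    output wire       discard       // 1 when a double-bit error was detected
);
    // Syndrome: position of a single flipped bit (0 means the parity bit).
    wire [2:0] s = { cw[4] ^ cw[5] ^ cw[6] ^ cw[7],
                     cw[2] ^ cw[3] ^ cw[6] ^ cw[7],
                     cw[1] ^ cw[3] ^ cw[5] ^ cw[7] };
    wire odd = ^cw;                 // odd overall parity: odd number of errors
    wire [7:0] fixed = odd ? (cw ^ (8'b1 << s)) : cw;
    assign corrected = odd;
    assign discard   = (s != 3'b000) & ~odd;
    assign d = discard ? 4'b0000 : {fixed[7], fixed[6], fixed[5], fixed[3]};
endmodule

// Small testbench: clean word, single-bit error, double-bit error.
module secded_tb;
    reg  [3:0] d_in;
    reg  [7:0] err;
    wire [7:0] cw;
    wire [3:0] d_out;
    wire       corrected, discard;
    secded_enc enc (.d(d_in), .cw(cw));
    secded_dec dec (.cw(cw ^ err), .d(d_out),
                    .corrected(corrected), .discard(discard));
    initial begin
        d_in = 4'b1011;
        err = 8'h00; #1 $display("clean : d=%b corrected=%b discard=%b", d_out, corrected, discard);
        err = 8'h20; #1 $display("single: d=%b corrected=%b discard=%b", d_out, corrected, discard);
        err = 8'h28; #1 $display("double: d=%b corrected=%b discard=%b", d_out, corrected, discard);
        $finish;
    end
endmodule
```

With the payload 1011, the single-bit case recovers 1011 with `corrected` high, and the double-bit case raises `discard` and outputs zero. A 16-bit word would need more check bits, but the decision logic follows the same pattern.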
ROUTING-LOGIC:
We use the XY routing algorithm, in which the data moves along the Y and X directions to reach the destination. We also use adaptive XY routing modes:
The SH-XY (surround horizontal XY) mode is used when the router’s left or right neighbor is deactivated.
Correspondingly, the SV-XY (surround vertical XY) mode is used when the upper or lower neighbor of the router is inactive. This helps reduce packet loss in the router.
INPUT-DATA:
We use a 16-bit input word whose four most significant bits (bits 15 to 12) hold the destination address.
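As a minimal sketch of the address comparison, assume the destination address occupies bits 15 to 12 of the word (as the routing-logic modules in this report index it) and is compared with the router address `A` from the most significant bit down. The module name `xy_decide` and the `dir` encoding are hypothetical. Mapping each outcome to a physical output port is port-specific and is not shown, and the adaptive SH-XY and SV-XY modes are left out.

```verilog
// Sketch only: module name and dir encoding are hypothetical.
module xy_decide (
    input  wire [15:0] pkt,   // 16-bit word; pkt[15:12] is the destination
    input  wire [3:0]  A,     // address of the current router
    output reg  [2:0]  dir
);
    localparam [2:0] DIR_NONE    = 3'd0, // empty word or destination reached
                     DIR_HI_ONE  = 3'd1, // upper field differs, dest bit is 1
                     DIR_HI_ZERO = 3'd2, // upper field differs, dest bit is 0
                     DIR_LO_ONE  = 3'd3, // lower field differs, dest bit is 1
                     DIR_LO_ZERO = 3'd4; // lower field differs, dest bit is 0

    wire [3:0] dest = pkt[15:12];

    // The first differing bit, scanned from the top, decides the move.
    always @(*) begin
        if (pkt == 16'h0000)
            dir = DIR_NONE;
        else if (dest[3] != A[3])
            dir = dest[3] ? DIR_HI_ONE : DIR_HI_ZERO;
        else if (dest[2] != A[2])
            dir = dest[2] ? DIR_HI_ONE : DIR_HI_ZERO;
        else if (dest[1] != A[1])
            dir = dest[1] ? DIR_LO_ONE : DIR_LO_ZERO;
        else if (dest[0] != A[0])
            dir = dest[0] ? DIR_LO_ONE : DIR_LO_ZERO;
        else
            dir = DIR_NONE;
    end
endmodule
```

For the word 16'b0011111001100111 (destination ‘0011’) at a router with A = 4'b1000, the top bit differs and the destination bit is ‘0’, so `dir` is `DIR_HI_ZERO`. At A = 4'b0000 the upper field matches and `dir` is `DIR_LO_ONE`.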
ROUTER BLOCK - DIAGRAM:
[Figure: router block diagram. PORT-1 to PORT-4 each contain a LOOPBACK MODULE, an ECC block, a BUFFER and ROUTING LOGIC, with per-port signals Data_in, Data_out, Id_un, Id_uno, D_in, D_chk and D_out (suffixed 1 to 4). The routing logic blocks are interconnected through D_out12, D_out13, D_out14, D_out21, D_out23, D_out24, D_out31, D_out32, D_out34, D_out41, D_out42 and D_out43. The FINITE STATE MACHINE MODULE takes D_chk1 to D_chk4 and drives Id_un1 to Id_un4.]
CONNECTION BETWEEN MODULE AND TESTBENCH:
[Figure: connection between the router module and the testbench. The diagram shows the ROUTER MODULE together with four loopback modules (lbm1 to lbm4), four ECC blocks (ECC1 to ECC4) and four BUFFER DFF modules (DFF1 to DFF4).]
LOOPBACK MODULE:
module lbm(D_out,id_uno,Data_in,Data_req_in,Data_req_out,Data_out,id_un,D_in,clk);
    input clk;
    input [15:0] D_out,Data_in;    // D_out: word from the routing logic, Data_in: word from the adjacent router
    input id_uno;                  // '1' when the adjacent router's input port is unavailable
    inout Data_req_out,Data_req_in;
    output [15:0] Data_out,D_in;
    output id_un;
    wire sm,sd;
    wire [15:0] mux_out,D1;
    wire [15:0] D_loop;
    // A non-zero word counts as a pending request.
    assign Data_req_out = (D_out   != 16'h0000);
    assign Data_req_in  = (Data_in != 16'h0000);
    // sm = 1: take the external word; sm = 0: take the looped-back word.
    assign sm = ~id_uno & Data_req_in;
    assign mux_out = (sm==1'b0) ? D_loop : Data_in;
    dff b1(D_in,mux_out,clk);
    // sd = 1 when the adjacent router is available and there is a word to send.
    assign sd = ~id_uno & Data_req_out;
    demux21 a(clk,D_loop,D1,D_out,sd);
    dff a1(Data_out,D1,clk);
endmodule
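The `dff` and `demux21` submodules instantiated in the loopback module are not listed in this report. The following is a hypothetical sketch of what they could look like, inferred only from the positional port connections. The port names, the registered demultiplexer outputs and the choice that `sel` = 1 steers the word to the second output are all assumptions.

```verilog
// Hypothetical 16-bit register, matching the call dff b1(D_in, mux_out, clk).
module dff (
    output reg  [15:0] q,
    input  wire [15:0] d,
    input  wire        clk
);
    always @(posedge clk)
        q <= d;
endmodule

// Hypothetical 1-to-2 demultiplexer, matching the call
// demux21 a(clk, D_loop, D1, D_out, sd).
// Assumption: sel = 0 sends the word to y0 (loopback path),
//             sel = 1 sends the word to y1 (outgoing path).
module demux21 (
    input  wire        clk,
    output reg  [15:0] y0,
    output reg  [15:0] y1,
    input  wire [15:0] din,
    input  wire        sel
);
    always @(posedge clk) begin
        y0 <= sel ? 16'h0000 : din;
        y1 <= sel ? din      : 16'h0000;
    end
endmodule
```

The unselected output is driven to zero here because the loopback module treats an all-zero word as "no request".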
ROUTE-LOGIC-1 MODULE:
module routelogic1(clk,id_un2,id_un4,A,D_O1,D_chk,D_out1,D_out2,D_out3);
    input clk;
    input id_un2,id_un4;
    input [3:0] A;
    input [15:0] D_O1,D_chk;
    output reg [15:0] D_out1,D_out2,D_out3;
    always @(posedge clk)
    begin
        // Loopback detected: the checked word equals the word last sent out.
        if (D_O1==D_chk) begin
            if (id_un2!=1'b1) begin
                D_out1<=D_chk;
                D_out2<=16'h0000;
                D_out3<=16'h0000;
            end
            else if (id_un4!=1'b1) begin
                D_out1<=16'h0000;
                D_out2<=16'h0000;
                D_out3<=D_chk;
            end
        end
        // Compare destination bits [15:12] with the router address A, MSB first.
        if (D_chk==16'h0000) begin
            D_out1<=16'h0000;
            D_out2<=16'h0000;
            D_out3<=16'h0000;
        end
        else if (D_chk[15]!=A[3] && D_chk[15]==1'b1) begin
            D_out1<=16'h0000;
            D_out2<=16'h0000;
            D_out3<=D_chk;
        end
        else if (D_chk[15]!=A[3] && D_chk[15]==1'b0) begin
            D_out1<=D_chk;
            D_out2<=16'h0000;
            D_out3<=16'h0000;
        end
        else if (D_chk[14]!=A[2] && D_chk[14]==1'b1) begin
            D_out1<=16'h0000;
            D_out2<=16'h0000;
            D_out3<=D_chk;
        end
        else if (D_chk[14]!=A[2] && D_chk[14]==1'b0) begin
            D_out1<=D_chk;
            D_out2<=16'h0000;
            D_out3<=16'h0000;
        end
        else if (D_chk[13]!=A[1] && D_chk[13]==1'b1) begin
            D_out1<=16'h0000;
            D_out2<=D_chk;
            D_out3<=16'h0000;
        end
        else if (D_chk[12]!=A[0] && D_chk[12]==1'b1) begin
            D_out1<=16'h0000;
            D_out2<=D_chk;
            D_out3<=16'h0000;
        end
    end
endmodule
ROUTE-LOGIC-2 MODULE:
module routelogic2(clk,id_un1,id_un3,A,D_O2,D_chk,D_out1,D_out2,D_out3);
    input clk;
    input id_un1,id_un3;
    input [3:0] A;
    input [15:0] D_O2,D_chk;
    output reg [15:0] D_out1,D_out2,D_out3;
    always @(posedge clk)
    begin
        // Loopback detected: the checked word equals the word last sent out.
        if (D_O2==D_chk) begin
            if (id_un3!=1'b1) begin
                D_out1<=D_chk;
                D_out2<=16'h0000;
                D_out3<=16'h0000;
            end
            else if (id_un1!=1'b1) begin
                D_out1<=16'h0000;
                D_out2<=16'h0000;
                D_out3<=D_chk;
            end
        end
        // Compare destination bits [15:12] with the router address A, MSB first.
        if (D_chk==16'h0000) begin
            D_out1<=16'h0000;
            D_out2<=16'h0000;
            D_out3<=16'h0000;
        end
        else if (D_chk[15]!=A[3] && D_chk[15]==1'b1) begin
            D_out1<=16'h0000;
            D_out2<=D_chk;
            D_out3<=16'h0000;
        end
        else if (D_chk[14]!=A[2] && D_chk[14]==1'b1) begin
            D_out1<=16'h0000;
            D_out2<=D_chk;
            D_out3<=16'h0000;
        end
        else if (D_chk[13]!=A[1] && D_chk[13]==1'b1) begin
            D_out1<=D_chk;
            D_out2<=16'h0000;
            D_out3<=16'h0000;
        end
        else if (D_chk[13]!=A[1] && D_chk[13]==1'b0) begin
            D_out1<=16'h0000;
            D_out2<=16'h0000;
            D_out3<=D_chk;
        end
        else if (D_chk[12]!=A[0] && D_chk[12]==1'b1) begin
            D_out1<=D_chk;
            D_out2<=16'h0000;
            D_out3<=16'h0000;
        end
        else if (D_chk[12]!=A[0] && D_chk[12]==1'b0) begin
            D_out1<=16'h0000;
            D_out2<=16'h0000;
            D_out3<=D_chk;
        end
    end
endmodule
ROUTE-LOGIC-3 MODULE:
module routelogic3(clk,id_un2,id_un4,A,D_O3,D_chk,D_out1,D_out2,D_out3);
    input clk;
    input id_un4,id_un2;
    input [3:0] A;
    input [15:0] D_O3,D_chk;
    output reg [15:0] D_out1,D_out2,D_out3;
    always @(posedge clk)
    begin
        // Loopback detected: the checked word equals the word last sent out.
        if (D_O3==D_chk) begin
            if (id_un4!=1'b1) begin
                D_out1<=D_chk;
                D_out2<=16'h0000;
                D_out3<=16'h0000;
            end
            else if (id_un2!=1'b1) begin
                D_out1<=16'h0000;
                D_out2<=16'h0000;
                D_out3<=D_chk;
            end
        end
        // Compare destination bits [15:12] with the router address A, MSB first.
        if (D_chk==16'h0000) begin
            D_out1<=16'h0000;
            D_out2<=16'h0000;
            D_out3<=16'h0000;
        end
        else if (D_chk[15]!=A[3] && D_chk[15]==1'b1) begin
            D_out1<=D_chk;
            D_out2<=16'h0000;
            D_out3<=16'h0000;
        end
        else if (D_chk[15]!=A[3] && D_chk[15]==1'b0) begin
            D_out1<=16'h0000;
            D_out2<=16'h0000;
            D_out3<=D_chk;
        end
        else if (D_chk[14]!=A[2] && D_chk[14]==1'b1) begin
            D_out1<=D_chk;
            D_out2<=16'h0000;
            D_out3<=16'h0000;
        end
        else if (D_chk[14]!=A[2] && D_chk[14]==1'b0) begin
            D_out1<=16'h0000;
            D_out2<=16'h0000;
            D_out3<=D_chk;
        end
        else if (D_chk[13]!=A[1] && D_chk[13]==1'b0) begin
            D_out1<=16'h0000;
            D_out2<=D_chk;
            D_out3<=16'h0000;
        end
        else if (D_chk[12]!=A[0] && D_chk[12]==1'b0) begin
            D_out1<=16'h0000;
            D_out2<=D_chk;
            D_out3<=16'h0000;
        end
    end
endmodule
ROUTE-LOGIC-4 MODULE:
module routelogic4(clk,id_un1,id_un3,A,D_O4,D_chk,D_out1,D_out2,D_out3);
    input clk;
    input id_un1,id_un3;
    input [3:0] A;
    input [15:0] D_O4,D_chk;
    output reg [15:0] D_out1,D_out2,D_out3;
    always @(posedge clk)
    begin
        // Loopback detected: the checked word equals the word last sent out.
        if (D_O4==D_chk) begin
            if (id_un1!=1'b1) begin
                D_out1<=D_chk;
                D_out2<=16'h0000;
                D_out3<=16'h0000;
            end
            else if (id_un3!=1'b1) begin
                D_out1<=16'h0000;
                D_out2<=16'h0000;
                D_out3<=D_chk;
            end
        end
        // Compare destination bits [15:12] with the router address A, MSB first.
        if (D_chk==16'h0000) begin
            D_out1<=16'h0000;
            D_out2<=16'h0000;
            D_out3<=16'h0000;
        end
        else if (D_chk[15]!=A[3] && D_chk[15]==1'b0) begin
            D_out1<=16'h0000;
            D_out2<=D_chk;
            D_out3<=16'h0000;
        end
        else if (D_chk[14]!=A[2] && D_chk[14]==1'b0) begin
            D_out1<=16'h0000;
            D_out2<=D_chk;
            D_out3<=16'h0000;
        end
        else if (D_chk[13]!=A[1] && D_chk[13]==1'b0) begin
            D_out1<=D_chk;
            D_out2<=16'h0000;
            D_out3<=16'h0000;
        end
        else if (D_chk[13]!=A[1] && D_chk[13]==1'b1) begin
            D_out1<=16'h0000;
            D_out2<=16'h0000;
            D_out3<=D_chk;
        end
        else if (D_chk[12]!=A[0] && D_chk[12]==1'b0) begin
            D_out1<=D_chk;
            D_out2<=16'h0000;
            D_out3<=16'h0000;
        end
        else if (D_chk[12]!=A[0] && D_chk[12]==1'b1) begin
            D_out1<=16'h0000;
            D_out2<=16'h0000;
            D_out3<=D_chk;
        end
    end
endmodule
FSM MODULE:
module FSM(id_un1,id_un2,id_un3,id_un4,D_chk1,D_chk2,D_chk3,D_chk4,clk);
    input clk;
    input [15:0] D_chk1,D_chk2,D_chk3,D_chk4;
    output reg id_un1, id_un2, id_un3, id_un4;
    reg I;          // sampled presence bit: '1' when the checked port holds data
    reg out;        // '1' when the sequence '1111' has been detected
    reg [2:0] Q;    // current state
    reg [2:0] D;    // next state
    initial begin
        Q=3'b000;
    end
    always @(posedge clk)
    begin
        // Each state samples one port; a non-zero word means data is present.
        case (Q)
            3'b000:
                if (D_chk1!=16'h0000) I<=1'b1;
                else                  I<=1'b0;
            3'b100:
                if (D_chk2!=16'h0000) I<=1'b1;
                else                  I<=1'b0;
            3'b010:
                if (D_chk3!=16'h0000) I<=1'b1;
                else                  I<=1'b0;
            3'b110:
                if (D_chk4!=16'h0000) I<=1'b1;
                else                  I<=1'b0;
            3'b001:
                if (D_chk1!=16'h0000) I<=1'b1;
                else                  I<=1'b0;
        endcase
        // Next-state and output equations of the sequence detector.
        out  <= Q[1] & Q[2] & I;
        D[0] <= Q[1] & Q[2] & I;
        D[1] <= (~Q[1] & Q[2] & I) | (~Q[0] & Q[1] & ~Q[2] & I);
        D[2] <= ~Q[2] & I;
        // Load the next state; any other encoding leaves Q unchanged.
        case (D)
            3'b000: Q<=3'b000;
            3'b100: Q<=3'b100;
            3'b010: Q<=3'b010;
            3'b110: Q<=3'b110;
            3'b001: Q<=3'b001;
        endcase
        // Flag all four ports as unavailable when the detector fires.
        if (out==1'b1) begin
            id_un1<=1'b1;
            id_un2<=1'b1;
            id_un3<=1'b1;
            id_un4<=1'b1;
        end
        else begin
            id_un1<=1'b0;
            id_un2<=1'b0;
            id_un3<=1'b0;
            id_un4<=1'b0;
        end
    end
endmodule
SIMULATION-1-GREEN PATH DESCRIPTION:
The router address is given as ‘A’ in the simulation. We manually changed the value of the router address ‘A’ to represent 16 routers, as shown in the above simulation diagram.
Consider a 16-bit word “0011111001100111” at the input of the 1st port of the router with address A=‘1000’, applied as Data_in1. The word passes through the loopback module, comes out as D_in1, is checked by the error correction module (ECC), and comes out as D_chk1.
The four most significant bits of the word hold the destination router address, so the routing logic of port 1 first compares the 16th bit of the word with the first bit of the current router address A=‘1000’. The bits differ and the data bit is ‘0’, so the word is routed to the 2nd port to go up, according to the XY routing algorithm. The word therefore enters the 4th port of the router with address A=‘0100’.
The data again passes through the loopback module and the error correction module (ECC) in the same way at port 4 of the router with address A=‘0100’. The routing logic in port 4 finds that the 16th bit matches the first bit of the current router address. It therefore compares the next bit, the 15th, and finds that it differs, with the data bit being ‘0’. Hence the data is routed to port 2 of the router with address A=‘0100’ and enters port 4 of the router with address A=‘0000’.
The routing logic in port 4 of the router with address A=‘0000’ finds that the first two bits (16th and 15th) of the destination address in the data match the first two bits of the current router address A=‘0000’. Hence the routing is based on the 14th bit of the data and the third bit of the current router address. This sends the data out through the third port of the router with address A=‘0000’, and it enters the 1st port of the router with address A=‘0001’. This continues until the data reaches the destination router with address A=‘0011’.
MAPPED VERILOG SIMULATION:
SIMULATION-2-RED PATH DESCRIPTION:
In this simulation, we manually made the 4th port of the router with address A=‘0000’ unavailable by setting the port unavailability indication signal id_uno2, which runs from port 4 of the router with address A=‘0000’ to port 2 of the router with address A=‘0100’, to ‘1’. Hence the data coming out of port 2 of the router with address A=‘0100’ is looped back into port 2 of the same router as Data_in2.
This data then comes out as D_chk2 from the ECC (error correction module) of port 2 of the same router. The routing logic compares D_chk2 with D_out2 and finds that they match, which indicates that a loopback has taken place.
According to the adaptive XY routing algorithm, when the adjacent router along port 2 is unavailable, the data has to be routed to either port 3 or port 1. Hence we send the data out to port 3 of the router with address A=‘0100’.
We also manually injected errors into a data bit and a parity bit at the inputs of port 1 of the router with address A=‘0101’ and port 1 of the router with address A=‘0010’; these are corrected by the Hamming-code ECC module. However, data with a two-bit error at the input of port 1 of the router with address ‘0011’ cannot be corrected and is discarded.
MAPPED VERILOG SIMULATION:
In the last screenshot, the finite state machine module, which checks whether data is present at all the ports and the buffers are full, outputs the link-unavailable signals id_un1, id_un2, id_un3 and id_un4 as ‘1’, because data is present at all ports.
However, in the last router (0011), the two-bit error causes the data to be discarded at the 1st input port. The FSM module then sets all the unavailable signals to ‘0’, as only three valid inputs are now present in the router (0011), and the acknowledgment signal ACK1 of port 1 also becomes ‘0’ because the data is discarded.
STATIC_TIMING_ANALYSIS USING SYNOPSYS PRIMETIME TOOL:
For a clock pulse of 95 ns, the setup time was met at 0.98 ns and the hold time was met at 0.17 ps, with an output capacitance of 25 fF and a slew of 60 ps.
SETUP_TIME_REPORT:
TRADE_OFFS IN THE DESIGN:
1. The heights of all the cells were increased to 8.65 um so that every standard cell has the same height.
2. In the D flip-flop design, we used Metal-2 connections for the clock gate inputs, which helped reduce the height of the cell.
3. To avoid DRC errors in automatic place and route and to meet the design rules of the Cadence Encounter tool, we used filler cells and a pitch of 0.48 um in the design.