NAT 64 FPGA Implementation



EN3020 - Digital System Design Group Project

NAT64 SERVER

D.P.G.S.R Fernando 080104U
I.U. Liyanage 080269D
J.R. Kodagoda 080238H
R.S.A De Silva 080073v

OVERVIEW

Our core objective is to build an interface that connects a set of IPv4 clients to a set of IPv6 servers and vice versa. IPv6 was developed by the Internet Engineering Task Force (IETF) to deal with the long-anticipated problem of IPv4 running out of addresses. Until IPv6 completely supplants IPv4, a number of transition mechanisms are needed to enable IPv6-only hosts to reach IPv4 services and to allow isolated IPv6 hosts and networks to reach each other over IPv4-only infrastructure. NAT64 is one such mechanism and is valuable at present.

PROCEDURE

The workflow of our project is as follows:

  • Implement the Tri-Mode Ethernet MAC wrapper
  • Implement the IPv4 to IPv6 conversion algorithm
  • Implement the IPv6 to IPv4 conversion algorithm
  • Combine the two algorithms to get the NAT64 module

Tri-Mode Ethernet MAC Wrapper

First of all, we had to implement the Ethernet MAC layer on an FPGA evaluation board. Soft Ethernet MAC cores are not freely available, so we had to use either a Virtex-5 or a Virtex-6 device, in which Ethernet MAC cores are available. Of the three choices open to us (Virtex-5, Virtex-2 and Altera), we therefore chose the Virtex-5.

To make the core functional, we had to use the Tri-Mode Ethernet MAC wrapper, which is free in the Xilinx Core Generator.

The Tri-Mode Ethernet MAC wrapper consists of two components:

  • Block Level Wrapper
  • Local Link Wrapper
      o rx_client_fifo
      o tx_client_fifo

rx_client_fifo

The rx_client_fifo is built around two dual-port block RAMs, providing a total memory capacity of 4096 bytes of frame data. The receive FIFO writes in data received through the Ethernet MAC. If the frame is marked as good, it is presented on the LocalLink interface for the user to read. If the frame is marked as bad, it is dropped by the receive FIFO. If the receive FIFO memory overflows, the frame currently being received is dropped, regardless of whether it is good or bad, and the signal rx_overflow is asserted.

tx_client_fifo

The tx_client_fifo is built around two dual-port block RAMs, providing a total memory capacity of 4096 bytes of frame data. When a full frame has been written into the transmit FIFO, the FIFO presents data to the MAC transmitter. On receiving the acknowledge signal from the Ethernet MAC, the rest of the frame is transmitted, provided the Ethernet MAC does not output a retransmit request. If a retransmission request is received, the frame is queued for retransmission.

IPv4 to IPv6 Conversion Algorithm

The following diagram illustrates the timing of an incoming IPv4 Ethernet packet and the converted outgoing IPv6 packet.

[Figure: timing diagram of the incoming IPv4 packet and the outgoing IPv6 packet]

The Ethernet header is 14 bytes long. An IPv4 header is generally 20 bytes long but may be longer if optional fields exist, which is why the IPv4 header is labeled 'x' bytes in the diagram. The IPv6 header has a fixed length of 40 bytes. Our Verilog code reads one byte of data in each clock cycle. We read the required fields of the Ethernet header and the IPv4 header within 34 clocks (14 bytes of Ethernet header plus 20 bytes of IPv4 header) and start sending the IPv6 packet after 36 clocks, leaving extra clock cycles for the data needed to form the IPv6 packet to be ready.

IPv6 to IPv4 Conversion Algorithm

The following diagram illustrates the timing of an incoming IPv6 Ethernet packet and the converted outgoing IPv4 packet.

[Figure: timing diagram of the incoming IPv6 packet and the outgoing IPv4 packet]

The Verilog code reads the required fields of the Ethernet and IPv6 headers within 34 clocks and starts sending the IPv4 packet after 36 clocks. Here the converted IPv4 header has only 20 bytes because our algorithm does not create optional fields for the IPv4 header.

Header Mapping

IPv4 and IPv6 are both layer 3 protocols for IP communication, but their headers differ as well as overlap, so a one-to-one mapping of all fields is impossible. The following table shows the fields that serve similar purposes in the IPv4 and IPv6 headers, although they go by different names. Ether Type is a layer 2 field that indicates the layer 3 protocol.

  IPv4                           IPv6
  Ether Type: 0x0800             Ether Type: 0x86dd
  Version = 4                    Version = 6
  DSCP, ECN                      Traffic Class
  Header Length, Total Length    Payload Length
  Protocol                       Next Header
  Time to Live                   Hop Limit
  IPv4 address                   IPv6 address

Static NAT Table

A NAT64 server can be implemented in two ways: static and dynamic. We chose the static mapping mechanism, and the following table illustrates how the mapping is done between IPv4 and IPv6 addresses.

  IPv4                           IPv6
  (not given)                    2000:2000:2000:2000:2000:2000:2000:2000
  (not given)                    3000:3000:3000:3000:3000:3000:3000:3000
  (not given)                    4000:4000:4000:4000:4000:4000:4000:4000
  (not given)                    5000:5000:5000:5000:5000:5000:5000:5000
  Default Source Address         6000:6000:6000:6000:6000:6000:6000:6000
  Broadcast Address              ff02::1
  Multicast (to all hosts)       ff02::1
  Multicast (to all routers)     ff02::2
  Unspecified                    ::
  Loopback Address               ::1
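
As a rough software model of the header mapping and static NAT table above, the following Python sketch builds a 40-byte IPv6 header from an IPv4 header. It is not the project's Verilog. The function names, the 192.0.2.x IPv4 keys (documentation-range placeholders, since the IPv4 side of the table is not available), the zero flow label, and the use of the default source address as a fallback for unmapped sources are all assumptions made for illustration.

```python
import ipaddress
import struct

# Hypothetical static NAT table. The IPv6 values come from the report's
# table; the IPv4 keys are placeholders chosen for this sketch.
STATIC_NAT = {
    "192.0.2.1": "2000:2000:2000:2000:2000:2000:2000:2000",
    "192.0.2.2": "3000:3000:3000:3000:3000:3000:3000:3000",
}
# "Default Source Address" row of the report's table.
DEFAULT_SOURCE = "6000:6000:6000:6000:6000:6000:6000:6000"


def map_source(ipv4: str) -> bytes:
    """Map a source address; unmapped sources fall back to the default
    source address (an assumption about how that table row is used)."""
    return ipaddress.IPv6Address(STATIC_NAT.get(ipv4, DEFAULT_SOURCE)).packed


def map_destination(ipv4: str) -> bytes:
    """Map a destination address; raises KeyError if it is not in the table."""
    return ipaddress.IPv6Address(STATIC_NAT[ipv4]).packed


def ipv4_to_ipv6_header(ipv4_header: bytes) -> bytes:
    """Build an IPv6 header from the fields listed in the header mapping."""
    version_ihl, tos, total_length = struct.unpack("!BBH", ipv4_header[:4])
    if version_ihl >> 4 != 4:
        raise ValueError("not an IPv4 header")
    header_length = (version_ihl & 0x0F) * 4      # IHL counts 32-bit words
    ttl, protocol = ipv4_header[8], ipv4_header[9]
    src = str(ipaddress.IPv4Address(ipv4_header[12:16]))
    dst = str(ipaddress.IPv4Address(ipv4_header[16:20]))

    # Header Length + Total Length -> Payload Length
    payload_length = total_length - header_length
    # Version = 6, DSCP/ECN byte -> Traffic Class, flow label assumed 0
    first_word = (6 << 28) | (tos << 20)
    # Protocol -> Next Header, Time to Live -> Hop Limit
    fixed = struct.pack("!IHBB", first_word, payload_length, protocol, ttl)
    return fixed + map_source(src) + map_destination(dst)
```

For a 20-byte IPv4 header with total length 60, the sketch produces a payload length of 40. A complete translator would also have to deal with transport-layer checksums and ICMP, which this sketch leaves out.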

Hardware Debug Tools

ChipScope Pro Analyzer

We used the 'ChipScope Pro Inserter flow' to capture signals of our project while the FPGA was in operation.

Wireshark

We used this packet sniffer to capture the packets sent and received on our PC, which acts as both the IPv4 side and the IPv6 side, in order to test the implementation and identify problems.

Challenges

  • A NAT64 server needs two ports to connect the IPv6 and IPv4 sides to the interface, but the Virtex-5 FPGA board has only one Ethernet port. As a solution, we implemented the system using only the available port, with the same machine acting as both the IPv4 and IPv6 sides.
  • Virtex-5 device designs of the Tri-Mode Ethernet MAC require a simulator that complies with the encryption support of the Verilog LRM (IEEE 1364-2005), such as:
      o ModelSim v6.6d
      o Cadence Incisive Enterprise Simulator (IES) 10.2
      o Synopsys VCS and VCS MX 2010.06

None of these simulators has a free version, so we did not perform functional or timing simulations for our project. Instead, we had to use hardware debug tools. Hardware debugging is time consuming because we had to synthesize, implement and generate the programming file each time we debugged the design.