SOC CHIP SCHEDULER EMBODYING I-SLIP ALGORITHM

Trupti B. Salankar, Vilas A Nitnaware
SRKNEC, NAGPUR
Truptis_135@yahoo.com, vilasan30@yahoo.com


Abstract

We describe the methodology, design, and implementation of the scheduler block of an on-chip interconnect. The scheduler block is implemented in Verilog using the SYNOPSYS tools DVE and Design_vision. The interconnect is capable of handling 72-bit packets, with a total of 32 packets at a time. There are 8 devices in total, and communication has to be established between them. Each device consists of an input block and an output block. The input block receives the 72-bit packets one by one, up to a total of 32 packets. Internally, the input block consists of four arrays (destination head, destination tail, packet array, and linked list array) and a shift register. It stores the packets in the packet array. When the scheduler sends a transmit request, these packets are given to the scheduler. The scheduler internally consists of grant and accept arbiters. It performs its operation in three steps (request, grant, and accept) and works on the principle of the i-SLIP algorithm. The scheduler decides which packet should be sent from the input block to the output block of a device. The output block simply receives the packet. Packets are sent and received in two phases: 36 bits are sent in the first phase and 36 bits in the second. Thus the connection between the devices is established through the interconnect. We are also modifying the scheduler design to reduce the area required for on-chip implementation. To this end, we combine the two sets of arbiters into one, so the modified scheduler requires only 8 arbiters compared with 16 for the original scheduler.

1. Introduction

With improving fabrication technology, the integration of system components onto a single die increases. Communication between these components can become the limiting factor for performance unless careful attention is given to designing high-performance interconnects.

Communication networks connect geographically distributed points, and switching systems reduce overall network costs by reducing the number of transmission links required to enable a given population of users to communicate. Various switching techniques exist, chosen on the basis of optimizing the usage of bandwidth in the network. The two main switching techniques are circuit switching and packet switching. In data networks, there are gaps between messages. User devices do not need the transmission link all the time, but when they do, they require relatively high bandwidth. Assigning a continuous high-bandwidth connection to such traffic wastes resources and results in low utilization. If a high-bandwidth circuit were set up and released for each message transmission, the setup time incurred for each message would be high compared to the transmission time of the message. Thus, switches in data networks use the store-and-forward technique for transmitting messages.

In store and forward, a message is first sent from the source to the switch to which it is attached. The switch scans the header of the message and decides to which output to forward it. The same scheme is repeated from switch to switch until the message reaches its destination. The advantage of such a switching scheme is that the transmission links are occupied only for the duration of the transmission of a message. After that, the links are released to transmit other messages. In other words, bandwidth allocation in the store-and-forward scheme is determined dynamically on the basis of a particular message and a particular link in the network.

Packet switching is an extension of message switching. In packet switching, messages are broken into blocks called packets, and packets are transmitted independently using the store-and-forward scheme. Some of the advantages of packet switching over message switching are as follows:
1) Messages are fragmented into packets that cannot exceed a maximum size. This leads to fairness in network utilization, even when messages are long.
2) Successive packets in a message can be transmitted simultaneously on different links,




978-1-4244-8971-8/10$26.00 © 2010 IEEE
reducing the end-to-end transmission delay. (This effect is called pipelining.)
3) Because packets are smaller than messages, they are less likely to be rejected at intermediate nodes due to limited storage capacity at the switches.
4) Both the probability of error and the error recovery time are lower for packets since they are smaller. Once an error occurs, only the packet with the error needs to be retransmitted rather than the whole message. This leads to a more efficient use of the transmission bandwidth.

A packet switch is a box with N inputs and N outputs that routes the packets arriving on its inputs to their requested outputs. The main functions of packet switches are buffering and routing. Besides these basic operations, a switch can have other capabilities, such as handling multicast traffic and priority functions. Small N×N packet switches are the key components of the interconnection networks used in multiprocessors and in integrated communication networking for data, voice, and video. A popular choice in the hardware implementation of packet switches is the crossbar architecture. The crossbar is a non-blocking architecture: any input-output pair can communicate as long as it does not interfere with the other input-output pairs. In other words, any permutation of inputs and outputs is possible as long as each input sends data to a different output, and each output receives data from at most one input.

This document describes the design and implementation of an asynchronous transfer mode (ATM) crossbar switch. ATM is a means of digital communication with the potential for replacing the conflicting communication infrastructures (telephone networks, cable TV networks, and computer networks) that nowadays need to be integrated into one.

These three information infrastructures overlap to some extent and are all moving from analog technology to digital technology for transmission, switching, and multiplexing. New technologies are being developed that move toward merging these three communication infrastructures. ATM technology is intended to be used in networks that transport a variety of different types of information, including voice traffic that was traditionally carried over telephone networks, data traffic typically carried on computer networks, and multimedia traffic consisting of a mixture of image, audio and video information. Each of these types of traffic can have different requirements and places different demands on switching and transmission facilities. Although ATM has not replaced datagram networks altogether and has not become the single dominant technology (as it promised to 10 years ago), it has been deployed in many networks. Vendors continue to study and improve ATM technology to support more and more Quality of Service (QoS) features. In ATM networks, data is transferred over Virtual Circuits (VCs) in 53-byte packets called cells.

Our implementation is written in Verilog using the SYNOPSYS tool DVE and synthesized using Design_vision. The ATM crossbar switch that we have implemented is a modular design (it can be scaled) and consists of three main components: input port modules, a crossbar scheduler, and output ports. The functionality of the switch can be described as follows. Packets first enter the input ports of the switch, where they are queued in order of arrival. Each input port has a port controller that determines the destination of a packet. The port controller then sends a request to the scheduler for the destination output port. The scheduler grants a request based on a priority algorithm that ensures fair service to all the input ports. Once a grant is issued, the crossbar fabric is configured to map the granted input ports to their destination output ports. This project implements an on-chip SOC interconnect (switch) embodying the i-SLIP algorithm for efficient communication between SOC devices.

The name of this algorithm is derived from the serial line internet protocol (SLIP). SLIP is merely a packet framing protocol: it defines a sequence of characters that frame IP packets on a serial line, and nothing more. It provides no addressing, packet type identification, error detection/correction, or compression mechanisms. It is a TCP/IP protocol used for communication between two machines that have previously been configured to communicate with each other. SLIP is commonly used on dedicated serial links and sometimes for dial-up purposes, and is usually used with line speeds between 1200 bps and 19.2 kbps. It is useful for allowing mixes of hosts and routers to communicate with one another. For example, an Internet service provider may provide the user with a SLIP connection so that the provider's server can respond to requests, pass them on to the Internet, and forward the requested responses back to the user. Hence the name of this algorithm is the iterative serial line internet protocol algorithm. The algorithm is derived from the round-robin scheduling algorithm.

2. Internal interconnect of the switch
There have been discussions about what the internal interconnect of the switch should be. It can take the form of a single-stage network (shared bus, ring, crossbar) or a multi-stage network of smaller switches arranged in a banyan. Even with a non-blocking interconnect such as the crossbar, some buffering is necessary because packets that arrive at the interconnect are unscheduled and the switch has to multiplex them. There are three basic conditions where buffering is necessary: 1) The output port through which the packet needs to be routed is blocked by the next stage of the network. 2) Two packets destined for the same output port arrive simultaneously at different input ports, but the output port can accept only one packet at a time. 3) The packet needs to be held while the routing module in the switch determines the output port to which the packet is sent.

Figure 1: Input-queued packet switches

Many subsequent studies have tackled improving the performance of input-queued packet switches. Some of the proposed techniques are listed below:

1) Using non-FIFO buffers: One scheme in this category is virtual output queuing (VOQ). In this scheme each input has N queues or blocks of memory instead of one single FIFO queue. In other words, there is a separate queue for each input-output pair (Figure 2).

Figure 2

2) Operating the switch fabric at a faster speed than the input/output lines (speedup): This scheme reduces the effect of head-of-line (HOL) blocking but does not remove it completely [6]. A speedup by a factor of S can remove S packets from each input port within each time slot. Therefore, for an N×N switch, if output buffers are used, the speedup is N, and if input buffers are used, the speedup is equal to one. For switches that use speedup, both input and output buffers are required.

3) Examining the first K cells in a FIFO queue, where K > 1:

Consider a switch with input port buffers as shown in Figure 2.9. The packet labels are destination port numbers. We define the array A_i = [a_i1, a_i2, a_i3, …, a_iN]^T, where a_is = d is the destination port number, i is the column number, and s is the source port number.
We also define the transmission array T = [t_1, t_2, …, t_N]^T, where t_s = d indicates that input port s is assigned to transmit a packet to output port d.
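
To make the difference between a single FIFO input buffer and virtual output queuing (technique 1 above) concrete, here is a minimal Python sketch. The class and method names (`FIFOInput`, `VOQInput`, `requests`) are hypothetical and chosen only for illustration; they do not come from the authors' Verilog design.

```python
from collections import deque


class FIFOInput:
    """One input port with a single FIFO: only the head packet can be scheduled."""

    def __init__(self):
        self.queue = deque()

    def enqueue(self, dest, packet):
        self.queue.append((dest, packet))

    def requests(self):
        # Only the destination of the head-of-line packet is visible.
        return [self.queue[0][0]] if self.queue else []


class VOQInput:
    """One input port with a separate queue per output (virtual output queuing)."""

    def __init__(self, num_ports):
        self.queues = [deque() for _ in range(num_ports)]

    def enqueue(self, dest, packet):
        self.queues[dest].append(packet)

    def requests(self):
        # Every output with at least one waiting packet can be requested.
        return [d for d, q in enumerate(self.queues) if q]

    def dequeue(self, dest):
        return self.queues[dest].popleft()


fifo = FIFOInput()
voq = VOQInput(num_ports=8)
for dest, packet in [(2, "a"), (5, "b")]:
    fifo.enqueue(dest, packet)
    voq.enqueue(dest, packet)

print(fifo.requests())  # only output 2; the packet for output 5 waits behind it
print(voq.requests())   # outputs 2 and 5 can both be requested
```

If output 2 is busy, the FIFO input cannot offer its packet for output 5 even when output 5 is idle, which is the head-of-line blocking that VOQ avoids.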
Figure 3

Scheduling algorithms:

The scheduler module in a packet switch decides when data is sent from particular inputs to their desired outputs. The scheduling algorithm has to be fast, fair, and easy to implement in hardware. The problem of scheduling, that is, determining which input and output should be connected to each other in each time slot, is equivalent to finding a matching in a bipartite graph. Several scheduling algorithms are:
     1) Maximum Size Matching scheduling algorithm
     2) Maximum Weight Matching algorithm
     3) Oldest Cell First (OCF) scheduling
     4) Longest Port First (LPF) algorithm
     5) Parallel iterative matching (PIM) algorithm
     6) Round robin matching (RRM)

The basic round-robin algorithm is designed to overcome two problems: complexity and unfairness. This scheduling algorithm is used when all tasks are equally important. The three steps of arbitration are:

Figure 4

7) iSLIP is an iterative algorithm achieved by making a small change to the RRM scheme. iSLIP has the same three steps as RRM; only the second step (the Grant step) is changed slightly. The SLIP algorithm is a variation of RRM designed to reduce the synchronization of the output arbiters. SLIP achieves this by not moving the grant pointers unless the grant is accepted, leading to a desynchronization of the arbiters under high load. SLIP is identical to RRM except for a condition placed on updating the grant pointers. The grant step of RRM is changed to:
Step 2: Grant. If an output receives any requests, it chooses the one that appears next in a fixed, round-robin schedule starting from the highest priority element. The output notifies each input whether or not its request was granted. The pointer g_i to the highest priority element of the round-robin schedule is incremented to one location beyond the granted input if and only if the grant is accepted in step 3.
In other words, the round-robin priority pointer at the output side is incremented (provided that the grant was accepted) after the Accept step is passed.
Those inputs and outputs that are not matched at the end of one iteration are eligible for matching in the next. This small change to the RRM algorithm makes i-SLIP capable of handling heavy loads of traffic and eliminates starvation of any connection. The algorithm converges in an average of O(log N) and a maximum of N iterations. i-SLIP can fit in a single chip and is readily implemented in hardware.
The SLIP algorithm is modified as follows

Properties for high performance:
For practical high-performance systems, we desire algorithms with the following properties:
• High Throughput: An algorithm that keeps the backlog low in the VOQs; ideally, the algorithm will sustain an offered load up to 100% on each input and output.
• Starvation Free: The algorithm should not allow a nonempty VOQ to remain unserved indefinitely.
• Fast: To achieve the highest bandwidth switch, it is important that the scheduling algorithm does not become the performance bottleneck; the algorithm should therefore find a match as quickly as possible.
• Simple to Implement: If the algorithm is to be fast in practice, it must be implemented in special-purpose hardware, preferably within a single chip.

The i-SLIP algorithm is able to achieve all of these properties.

3. Interconnect overview

On-chip communication design has so far been done using rather ad hoc and informal approaches that fail to meet the challenges posed by next-generation SOC designs.
The goal of this design is to provide a fast, efficient SOC interconnect between 8 on-chip devices. The eight devices are connected to one another through a single instance of the routing switch to be designed.
Figure 5

There are 8 devices in total, and each device consists of an input block and an output block. Each input block holds 32 packets that are to be sent to the output blocks. To establish the connection between the input blocks and output blocks, we design this interconnect with the help of the i-SLIP scheduling algorithm. The design is an 8x8 crossbar for use as an on-chip SOC interconnect. The interconnect serves as a communication portal between 8 on-chip devices.

Figure 6

Packets used in this interconnect design:

The devices communicate using a simple packet-based protocol. The packets are of fixed size and include a 6-bit header and 66 bits of packet data, for a total of 72 bits. The header comprises a 3-bit source identifier and a 3-bit destination identifier (Table 2).

The Packet Data field is multipurpose and may contain commands, addresses, data, CRC, or any other payload. The interconnect pays no attention to the contents of the Packet Data field and simply passes it through as a payload. The Src field specifies the originating device, and the Dest field specifies the destination of the packet. Since the packets travel a relatively short distance on a well-characterizable chip, it is assumed that the interconnect will be robust enough not to require additional parity, ECC, or CRC.

4. Concept of linked lists

Many non-numeric applications require that an ordered list of information items be represented and stored in memory in such a way that it is easy to add items to the list or to delete items from the list at any position while maintaining the desired order of items.
There are 8 devices in total, and we have to establish the connection between them. Each device consists of an input block and an output block. Each input block holds 32 packets, and each packet consists of 72 bits of data as seen earlier. The input blocks have three responsibilities:
1. Receive incoming packets
2. Store the packets while waiting for scheduling
3. Transmit the next packet to the selected destination once scheduling is complete
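
As a rough illustration of the 72-bit packet format and of how an input block might store packets in per-destination linked lists, here is a minimal Python sketch. It is a software model only, not the authors' Verilog. The bit ordering (Src, then Dest, then Packet Data, most significant bits first), the order of the two 36-bit phases, the free-slot list, and all function and attribute names are assumptions made for this example; only the field widths, the 8 devices, the 32 packet slots, and the four array names come from the text.

```python
SRC_BITS, DEST_BITS, DATA_BITS = 3, 3, 66   # 6-bit header + 66-bit payload = 72 bits
NUM_PORTS, DEPTH = 8, 32                    # 8 devices, 32 packet slots per input block
NIL = -1                                    # assumed marker for "no slot"


def make_packet(src, dest, data):
    """Pack the fields into one 72-bit integer (assumed order: Src | Dest | Data)."""
    assert 0 <= src < NUM_PORTS and 0 <= dest < NUM_PORTS
    assert 0 <= data < (1 << DATA_BITS)
    return (src << (DEST_BITS + DATA_BITS)) | (dest << DATA_BITS) | data


def dest_of(packet):
    """Extract the 3-bit destination identifier."""
    return (packet >> DATA_BITS) & ((1 << DEST_BITS) - 1)


def split_phases(packet):
    """Split a packet into two 36-bit halves (assumed: upper half sent first)."""
    mask = (1 << 36) - 1
    return (packet >> 36) & mask, packet & mask


class InputBlock:
    """Packet array plus one linked list of slots per destination."""

    def __init__(self):
        self.packet_array = [None] * DEPTH
        self.linked_list = [NIL] * DEPTH       # next slot in the same destination list
        self.dest_head = [NIL] * NUM_PORTS     # oldest slot per destination
        self.dest_tail = [NIL] * NUM_PORTS     # newest slot per destination
        self.free_slots = list(range(DEPTH))   # assumed free-slot bookkeeping

    def receive(self, packet):
        """Store a packet; returns False when all 32 slots are occupied."""
        if not self.free_slots:
            return False
        slot = self.free_slots.pop(0)
        dest = dest_of(packet)
        self.packet_array[slot] = packet
        self.linked_list[slot] = NIL
        if self.dest_head[dest] == NIL:
            self.dest_head[dest] = slot
        else:
            self.linked_list[self.dest_tail[dest]] = slot
        self.dest_tail[dest] = slot
        return True

    def requests(self):
        """Destinations for which at least one packet is waiting."""
        return [d for d in range(NUM_PORTS) if self.dest_head[d] != NIL]

    def transmit(self, dest):
        """Remove and return the oldest packet for dest, or None if there is none."""
        slot = self.dest_head[dest]
        if slot == NIL:
            return None
        packet = self.packet_array[slot]
        self.dest_head[dest] = self.linked_list[slot]
        if self.dest_head[dest] == NIL:
            self.dest_tail[dest] = NIL
        self.packet_array[slot] = None
        self.free_slots.append(slot)
        return packet


block = InputBlock()
first = make_packet(src=1, dest=5, data=0xABC)
second = make_packet(src=1, dest=2, data=7)
third = make_packet(src=1, dest=5, data=9)
for p in (first, second, third):
    block.receive(p)
print(block.requests())             # destinations with waiting packets
print(block.transmit(5) == first)   # packets for one destination leave in arrival order
```

Keeping a head and a tail index per destination means a packet can be appended or removed in constant time without moving any stored packet, which is the property the linked-list section describes.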
Figure 7: Input block

The input block comprises four memory arrays, a FIFO, and a shift register.

5. Scheduler:

The scheduler acts as a central switch arbiter. The goal of the scheduling algorithm is to match input queues containing waiting packets with output queues so as to achieve maximum throughput while maintaining stability and eliminating starvation. The SLIP algorithm matches inputs to outputs in a single iteration. However, after this iteration, several possible input and output ports may remain unutilized. The i-SLIP algorithm uses multiple iterations to find paths that utilize as many input and output ports as possible, until it converges and finds no more possible matches.
The single-iteration SLIP is a specialization of i-SLIP and may be characterized as i-SLIP with only a single iteration, or 1-SLIP.

Figure 8: High-level block diagram of scheduler.

HIGH LEVEL DESIGN: Figure 8 is a very high-level block diagram of the scheduler. The scheduling algorithm's three phases (request, grant, and accept) correspond to the three blocks shown in the figure. Because the algorithm's request phase corresponds just to forwarding the requests to the grant arbiters, our implementation combines the request and grant phases. The figure also shows the decision feedback information from the accept arbiters, which the scheduler uses in successive iterations to mask off requests from already matched inputs and outputs.
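
The request, grant, and accept steps, together with the rule that a pointer moves only when a grant is accepted, can be sketched in Python as follows. This is a behavioural sketch under stated assumptions, not the authors' Verilog scheduler: the function names `rr_pick` and `islip` are hypothetical, and the choice to update pointers only for matches made in the first iteration follows the published i-SLIP description rather than anything stated in this text.

```python
def rr_pick(candidates, pointer, n):
    """Round-robin arbiter: first candidate at or after `pointer`, wrapping modulo n."""
    for offset in range(n):
        idx = (pointer + offset) % n
        if idx in candidates:
            return idx
    return None


def islip(requests, grant_ptr, accept_ptr, iterations=1):
    """Match inputs to outputs for one time slot.

    requests[i] is the set of outputs for which input i has waiting packets.
    grant_ptr[o] and accept_ptr[i] are round-robin pointers, updated in place.
    Returns a dict mapping each matched input to its output.
    """
    n = len(requests)
    match = {}             # input -> output
    matched_outputs = set()
    for iteration in range(iterations):
        # Request + grant: each unmatched output grants one unmatched requester.
        grants = {}        # input -> set of outputs that granted it
        for out in range(n):
            if out in matched_outputs:
                continue
            requesters = {i for i in range(n)
                          if i not in match and out in requests[i]}
            granted = rr_pick(requesters, grant_ptr[out], n)
            if granted is not None:
                grants.setdefault(granted, set()).add(out)
        if not grants:
            break          # converged: no further matches are possible
        # Accept: each input that received grants accepts exactly one of them.
        for inp, outs in grants.items():
            accepted = rr_pick(outs, accept_ptr[inp], n)
            match[inp] = accepted
            matched_outputs.add(accepted)
            if iteration == 0:
                # Pointers move one position beyond the matched port, and only
                # because the grant was accepted (assumed: first iteration only).
                grant_ptr[accepted] = (inp + 1) % n
                accept_ptr[inp] = (accepted + 1) % n
    return match


reqs = [{0, 1}, {0, 1}, set(), set()]
gp, ap = [0] * 4, [0] * 4
print(islip(reqs, gp, ap, iterations=1))
gp2, ap2 = [0] * 4, [0] * 4
print(islip(reqs, gp2, ap2, iterations=2))
```

With one iteration both outputs grant input 0, so only one match is made and output 1 stays idle; the second iteration masks the matched input and output and lets output 1 grant input 1. Because the grant pointer of output 1 did not move in the first iteration, the two output arbiters no longer point at the same input in the next time slot.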
Figure 9: Scheduler block diagram.

6. References

[1] "A High-Speed and Lightweight On-Chip Crossbar Switch for On-Chip Interconnection Networks", paper written by Kangmin Lee, See-Joong Lee and Hui-Jun Yoo, Semiconductor System Laboratory, Department of Electrical Engineering, KAIST, Daejeon, Korea.

[2] "Addressing the System-on-a-Chip Interconnect Woes through Communication-Based Design", University of California at Berkeley, 2001.

[3] "Algorithm-Hardware Co-design of Fast Parallel Round Robin Arbiters", University of Texas, 2004.

[4] "An Adaptive Round Robin Scheduler for Head-of-Line Blocking Problem in Wireless LANs", Department of Information Engineering, Li Bin Jiang and Soung Chang Liew, 1999.

[5] "Concept of linked list", from the book written by Andrewson Tenanbum on Computer Architecture and Organization.

[6] "Fair queuing in data networks, Internetworking 2002" by Rodrigo Sieera.

[7] "Head of line blocking", from Wikipedia, the free encyclopedia.

[8] "High speed symmetric crossbar switch" by Maryam Keyvani, B. Sc. University of Tehran, 1998.

[9] "IEEE Paper on designing and implementing a fast crossbar scheduler" by Pankaj Gupta, Stanford University, 1999.

[10] "Implementation of an On-chip Interconnect using I-slip Scheduling Algorithm" by John D. Pape, December 11, 2006.

[11] "Quality of Service for Asynchronous On-chip Networks", thesis submitted by Tomaz Felicijan, Dept. of Computer Science, 2004.

[12] "Study of VOQ crossbar switches for Multicast Traffic", National Yunlin University of Science and Technology.

[13] "The I-slip scheduling algorithm for input queued switches", IEEE transaction, Vol. 7, April 1999.

[14] Tutorial on "The slip algorithm with multiple Iterations".

[15] Tutorial on "The slip algorithm with single Iteration".

SIMULATION RESULT OF SCHEDULER

Scheduler timing report on SYNOPSYS:

****************************************
Report : timing
        -path full
        -delay max
        -max_paths 1
        -sort_by group
Design : sc
Version: Y-2006.06-SP6
Date   : Wed Jul 28 16:48:35 2010
****************************************

Operating Conditions: TYPICAL   Library: saed90nm_typ
Wire Load Model Mode: enclosed

  Startpoint: datactrl4_reg[4]
              (rising edge-triggered flip-flop)
  Endpoint: in_dec_valid[4]
            (output port)
  Path Group: (none)
  Path Type: max

  Des/Clust/Port     Wire Load Model       Library
  ------------------------------------------------
  sc                 35000                 saed90nm_typ

  Point                                    Incr       Path
  -----------------------------------------------------------
  datactrl4_reg[4]/CLK (DFFX1)             0.00       0.00 r
  datactrl4_reg[4]/Q (DFFX1)               0.22       0.22 f
  U456/QN (NOR4X0)                         0.17       0.38 r
  U455/QN (NAND4X0)                        0.10       0.48 f
  U369/Q (AND2X1)                          0.09       0.57 f
  in_dec_valid[4] (out)                    0.00       0.57 f
  data arrival time                                   0.57
  -----------------------------------------------------------
  (Path is unconstrained)

Scheduler area report on SYNOPSYS:

****************************************
Report : area
Design : sc
Version: Y-2006.06-SP6
Date   : Wed Jul 28 17:25:51 2010
****************************************

Library(s) Used:

    saed90nm_typ (File: /home/student1/today/saed90nm_typ.db)

Number of ports:          220
Number of nets:           852
Number of cells:          380
Number of references:      37

Combinational area:       13071.848633
Noncombinational area:     3035.730469
Net Interconnect area:     1093.291504

Total cell area:          16107.509766
Total area:               17200.800781
