CS 498 Lecture 9 Traffic Control for QoS


  1. 1. CS 498 Lecture 9: Traffic Control for QoS. Jennifer Hou, Department of Computer Science, University of Illinois at Urbana-Champaign. Reading: Chapter 18, The Linux Networking Architecture: Design and Implementation of Network Protocols in the Linux Kernel
  2. 2. Traffic Control <ul><li>Two major functions </li></ul><ul><ul><li>Policing </li></ul></ul><ul><ul><ul><li>Usually implemented at the router. </li></ul></ul></ul><ul><ul><ul><li>Data connections are monitored, and packets transmitted in violation of a specified policy are discarded. </li></ul></ul></ul><ul><ul><li>Traffic shaping </li></ul></ul><ul><ul><ul><li>Usually implemented at end hosts. </li></ul></ul></ul><ul><ul><ul><li>Data connections are regulated to conform to a certain rate. Surplus packets are either marked and then sent, or delayed at the sender until sending them no longer violates the rate constraint. </li></ul></ul></ul>
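The difference between policing (discard surplus packets) and shaping (delay them at the sender) can be sketched with a token bucket. The slide does not name a rate-control mechanism, so the token bucket, the millisecond clock, and every identifier below are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical token bucket; none of these names are kernel API. */
struct bucket {
    uint64_t rate;     /* bytes credited per millisecond (must be > 0) */
    uint64_t burst;    /* bucket capacity in bytes */
    uint64_t tokens;   /* bytes currently available */
    uint64_t last_ms;  /* time of the last refill */
};

/* Credit tokens for the time elapsed since the last refill, capped at burst. */
static void bucket_refill(struct bucket *b, uint64_t now_ms)
{
    assert(now_ms >= b->last_ms && b->rate > 0);
    uint64_t added = (now_ms - b->last_ms) * b->rate;
    b->tokens = (b->tokens + added > b->burst) ? b->burst : b->tokens + added;
    b->last_ms = now_ms;
}

/* Policing: returns 1 if the packet conforms and may pass,
 * 0 if it violates the rate and must be discarded. */
static int police(struct bucket *b, uint64_t now_ms, uint64_t len)
{
    bucket_refill(b, now_ms);
    if (len > b->tokens)
        return 0;
    b->tokens -= len;
    return 1;
}

/* Shaping: returns how many milliseconds the sender should hold the
 * packet before it conforms (0 means it can be sent now). No tokens
 * are consumed here; the caller sends through police() after waiting.
 * A packet larger than burst never conforms in this sketch. */
static uint64_t shape_delay_ms(struct bucket *b, uint64_t now_ms, uint64_t len)
{
    bucket_refill(b, now_ms);
    if (len <= b->tokens)
        return 0;
    uint64_t missing = len - b->tokens;
    return (missing + b->rate - 1) / b->rate;   /* round up */
}
```

A router-side policer would call only `police()`, while an end host acting as a shaper would call `shape_delay_ms()`, wait, and then send.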
  3. 3. Processing of Network Data [Figure: packet processing path with blocks labeled Input de-multiplexing, Ingress policing, Forwarding, Upper layers (TCP, UDP, …), Output queuing, and Traffic control]
  4. 4. Traffic Control in Linux Kernel [Figure: incoming path through driver.c, net/core/dev.c, and net/ipv4/ip_input.c, with traffic control in the incoming direction (net/sched/sch_ingress.c); forwarding and local delivery; locally created data; outgoing path through dev_queue_xmit (net/core/dev.c) and traffic control in the outgoing direction (net/sched/sch_*.c, net/sched/cls_*.c) to dev->hard_start_xmit]
  5. 5. Traffic Control in Linux Kernel [Figure: packet path through the kernel. Receive-side labels: net_interrupt, dev_alloc_skb(), eth_type_trans(), netif_rx, softnet_data[cpu n].input_pkt_queue (CPU1, CPU2), do_softirq, net_rx_action, handle_bridge (CONFIG_BRIDGE, br_input.c), arp_rcv, ip_rcv, p8022_rcv (ETH_P_802_2). Transmit-side labels: ip_queue_xmit, arp_send, ..., dev_queue_xmit, dev->qdisc->enqueue, qdisc_run, qdisc_restart, dev->qdisc->dequeue, dev->hard_start_xmit, net_tx_action. Source files shown: dev.c, driver.c. Devices: eth0, eth1. A Scheduler block appears on both sides.]
  6. 6. Components of Traffic Control <ul><li>Queuing disciplines </li></ul><ul><ul><li>Packets to be sent are passed to a queueing discipline and ordered within the queue according to specific rules. </li></ul></ul><ul><ul><li>A packet can be removed only after the queueing discipline has marked it as ready for transmission. </li></ul></ul><ul><li>Classes (within a queuing discipline) </li></ul><ul><ul><li>Within a queuing discipline, packets can be assigned to different classes. </li></ul></ul><ul><li>Filters are used to assign packets to classes within a queueing discipline. </li></ul>
  7. 7. Queuing Discipline <ul><li>Each network device has a queuing discipline </li></ul><ul><li>It controls how packets enqueued on the device are treated </li></ul><ul><ul><li>Possible operations: keep, drop, mark </li></ul></ul><ul><li>A simple one may consist of just a single queue </li></ul>[Figure: Queuing discipline]
  8. 8. Complex Queuing Discipline <ul><li>Queuing discipline </li></ul><ul><ul><li>May use filters to distinguish among different classes of packets </li></ul></ul><ul><ul><li>Processes each class in a specific way </li></ul></ul><ul><ul><li>Two filters can point to one class </li></ul></ul><ul><li>Classes </li></ul><ul><ul><li>Do not store packets themselves </li></ul></ul><ul><ul><li>They use another queuing discipline to do that </li></ul></ul>[Figure: a queueing discipline with three filters feeding Class 1 and Class 2, each class backed by its own inner queueing discipline, between enqueue and dequeue]
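The structure on this slide (filters select a class, and each class hands the packet to an inner queueing discipline) can be sketched in user-space C as follows. This is a toy model, not kernel code: `struct pkt` stands in for `struct sk_buff`, the port-based filters are invented for the example, and the strict-priority order in `classful_dequeue()` is an assumption, since the slide does not say how classes are served.

```c
#include <assert.h>
#include <stddef.h>

#define FIFO_CAP 4
#define NCLASS   2
#define NFILTER  3

struct pkt { int dport; };               /* stand-in for struct sk_buff */

struct fifo {                            /* inner queueing discipline */
    struct pkt *slot[FIFO_CAP];
    size_t head, len;
};

static int fifo_enqueue(struct fifo *f, struct pkt *p)
{
    if (f->len == FIFO_CAP)
        return -1;                       /* inner queue full: drop */
    f->slot[(f->head + f->len) % FIFO_CAP] = p;
    f->len++;
    return 0;
}

static struct pkt *fifo_dequeue(struct fifo *f)
{
    if (f->len == 0)
        return NULL;
    struct pkt *p = f->slot[f->head];
    f->head = (f->head + 1) % FIFO_CAP;
    f->len--;
    return p;
}

/* A filter returns a class index, or -1 if it does not match. */
typedef int (*filter_fn)(const struct pkt *);

/* Two filters point to class 0, as the slide allows. */
static int match_ssh(const struct pkt *p) { return p->dport == 22 ? 0 : -1; }
static int match_dns(const struct pkt *p) { return p->dport == 53 ? 0 : -1; }
static int match_any(const struct pkt *p) { (void)p; return 1; }

struct classful {
    filter_fn   filters[NFILTER];
    struct fifo inner[NCLASS];           /* classes do not store packets;
                                            their inner qdiscs do */
};

static int classful_enqueue(struct classful *q, struct pkt *p)
{
    for (size_t i = 0; i < NFILTER; i++) {
        int cl = q->filters[i](p);
        if (cl >= 0) {
            assert(cl < NCLASS);
            return fifo_enqueue(&q->inner[cl], p);
        }
    }
    return -1;                           /* no filter matched: drop */
}

/* Assumed policy: class 0 is always served before class 1. */
static struct pkt *classful_dequeue(struct classful *q)
{
    for (size_t c = 0; c < NCLASS; c++) {
        struct pkt *p = fifo_dequeue(&q->inner[c]);
        if (p)
            return p;
    }
    return NULL;
}
```

Enqueueing a packet for port 80, then one for port 22, then one for port 53 yields the port 22 and port 53 packets first on dequeue, because both land in class 0.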
  9. 9. Complex Queuing Discipline
  10. 10. Policing <ul><li>When packets of a connection are enqueued, the connection can be policed by: </li></ul><ul><ul><li>Letting the packets go </li></ul></ul><ul><ul><li>Dropping the packets </li></ul></ul><ul><ul><li>Letting the packets go but marking them </li></ul></ul>
  11. 11. Data Structures: include/net/pkt_sched.h, include/net/sch_generic.h
  12. 12. Traffic Control in Linux Kernel <ul><li>Traffic control kernel code resides mainly in net/sched </li></ul><ul><ul><li>Traffic control in the incoming direction is handled by net/sched/sch_ingress.c. </li></ul></ul><ul><ul><li>Queuing disciplines and classifiers for the outgoing direction are implemented in </li></ul></ul><ul><ul><ul><li>net/sched/sch_*.c </li></ul></ul></ul><ul><ul><ul><li>net/sched/cls_*.c </li></ul></ul></ul>
  13. 13. Traffic Control in Linux Kernel <ul><li>Interfaces used inside the kernel are declared in </li></ul><ul><ul><ul><li>/usr/src/linux-(version)/include/net/pkt_cls.h </li></ul></ul></ul><ul><ul><ul><li>/usr/src/linux-(version)/include/net/pkt_sched.h </li></ul></ul></ul><ul><li>Interfaces between kernel traffic control and user-space programs are declared in </li></ul><ul><ul><ul><li>/usr/include/linux/pkt_cls.h </li></ul></ul></ul><ul><ul><ul><li>/usr/include/linux/pkt_sched.h </li></ul></ul></ul>
  14. 14. Inserting Traffic Control [Figure: the classful queueing discipline (three filters, Class 1, Class 2, inner queueing disciplines, enqueue and dequeue) placed in the transmit path. Labels: dev_queue_xmit, dev->qdisc->enqueue, qdisc_run, qdisc_restart, dev->qdisc->dequeue, dev->hard_start_xmit; Timer, timer_handler, netif_schedule, cpu_raise_softirq, NET_TX_SOFTIRQ, do_softirq, net_tx_action, Scheduler. Source files shown: dev.c, net/sched/*, softirq.c, netdevice.h, driver.c.]
  15. 15. Queueing Discipline -- Qdisc <ul><li>struct Qdisc </li></ul><ul><li>{ </li></ul><ul><li>int (*enqueue)(struct sk_buff *skb, struct Qdisc *dev); </li></ul><ul><li>struct sk_buff *(*dequeue)(struct Qdisc *dev); </li></ul><ul><li>unsigned flags; </li></ul><ul><li>#define TCQ_F_BUILTIN 1 </li></ul><ul><li>#define TCQ_F_THROTTLED 2 </li></ul><ul><li>#define TCQ_F_INGRESS 4 </li></ul><ul><li>int padded; </li></ul><ul><li>struct Qdisc_ops *ops; </li></ul><ul><li>u32 handle; </li></ul><ul><li>u32 parent; </li></ul><ul><li>atomic_t refcnt; </li></ul><ul><li>struct sk_buff_head q; </li></ul><ul><li>struct net_device *dev; </li></ul><ul><li>struct list_head list; </li></ul><ul><li>struct gnet_stats_basic bstats; </li></ul><ul><li>struct gnet_stats_queue qstats; </li></ul><ul><li>struct gnet_stats_rate_est rate_est; </li></ul><ul><li>spinlock_t *stats_lock; </li></ul><ul><li>struct rcu_head q_rcu; </li></ul><ul><li>int (*reshape_fail)(struct sk_buff *skb, </li></ul><ul><li>struct Qdisc *q); </li></ul><ul><li>struct Qdisc *__parent; </li></ul><ul><li>}; </li></ul>Annotations: dev is the network device to which the Qdisc is allocated. ops points to the Qdisc_ops data structure. q is the socket buffer queue governed by this qdisc. reshape_fail: when an outer queue passes a packet to an inner queue, the packet may have to be discarded; if the outer queueing discipline implements the callback function reshape_fail, the inner queueing discipline can invoke it.
  16. 16. Queuing Disciplines -- Qdisc_ops <ul><li>struct Qdisc_ops { </li></ul><ul><li>struct Qdisc_ops *next; </li></ul><ul><li>struct Qdisc_class_ops *cl_ops; </li></ul><ul><li>char id[IFNAMSIZ]; </li></ul><ul><li>int priv_size; </li></ul><ul><li>int (*enqueue)(struct sk_buff *, struct Qdisc *); </li></ul><ul><li>struct sk_buff *(*dequeue)(struct Qdisc *); </li></ul><ul><li>int (*requeue)(struct sk_buff *, struct Qdisc *); </li></ul><ul><li>unsigned int (*drop)(struct Qdisc *); </li></ul><ul><li>int (*init)(struct Qdisc *, struct rtattr *arg); </li></ul><ul><li>void (*reset)(struct Qdisc *); </li></ul><ul><li>void (*destroy)(struct Qdisc *); </li></ul><ul><li>int (*change)(struct Qdisc *, struct rtattr *arg); </li></ul><ul><li>int (*dump)(struct Qdisc *, struct sk_buff *); </li></ul><ul><li>}; </li></ul>Annotations: requeue() should put the packet back at the position in the queueing discipline where it was before. A queueing discipline can be registered via register_qdisc() in init_module().
  17. 17. Qdisc_ops <ul><li>enqueue() </li></ul><ul><ul><li>Enqueues a packet </li></ul></ul><ul><ul><li>Return values are </li></ul></ul><ul><ul><ul><li>NET_XMIT_SUCCESS, if the packet is accepted </li></ul></ul></ul><ul><ul><ul><li>NET_XMIT_DROP, if the packet is discarded </li></ul></ul></ul><ul><ul><ul><li>NET_XMIT_CN, if the packet is discarded because of buffer overflow </li></ul></ul></ul><ul><ul><ul><li>NET_XMIT_POLICED, if the packet is discarded because of violation of a policing rule. </li></ul></ul></ul><ul><ul><ul><li>NET_XMIT_BYPASS, if the packet is accepted, but will not leave the queue via the regular dequeue() function. </li></ul></ul></ul>
  18. 18. Qdisc_ops <ul><li>dequeue() </li></ul><ul><ul><li>Returns a pointer to a packet (skb) eligible for sending </li></ul></ul><ul><ul><li>A return value of NULL means that no packet is ready to be sent. (The total number of packets in the queue is given by q->q.qlen for a struct Qdisc *q.) </li></ul></ul><ul><li>requeue() </li></ul><ul><ul><li>Puts a packet back into the position in the queue where it was before. </li></ul></ul><ul><ul><li>The count of packets that have passed through the queue should not be increased. </li></ul></ul><ul><li>drop() </li></ul><ul><ul><li>Drops one packet from the queue </li></ul></ul>
  19. 19. Qdisc_ops <ul><li>init() </li></ul><ul><ul><li>Initializes the queuing discipline </li></ul></ul><ul><li>reset() </li></ul><ul><ul><li>Resets the queuing discipline to its initial state (empty queue, reset counters, delete timers) </li></ul></ul><ul><li>destroy() </li></ul><ul><ul><li>Removes a queuing discipline and frees all the resources reserved during the runtime of the queueing discipline. </li></ul></ul><ul><li>change() </li></ul><ul><ul><li>Changes the parameters of a queuing discipline </li></ul></ul><ul><li>dump() </li></ul><ul><ul><li>Outputs the configuration parameters and statistics of a queueing discipline. </li></ul></ul>
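The Qdisc_ops callbacks described on the last three slides can be illustrated with a user-space FIFO that is driven only through a function-pointer table. This is a sketch under stated assumptions: `struct pkt`, `struct toy_qdisc`, `struct toy_qdisc_ops`, and the `XMIT_*` constants are hypothetical stand-ins for `struct sk_buff`, `struct Qdisc`, `struct Qdisc_ops`, and `NET_XMIT_*`, and the return value of `drop()` (1 if a packet was removed) is a choice made for this example.

```c
#include <assert.h>
#include <stddef.h>

enum { XMIT_SUCCESS = 0, XMIT_DROP = 1 };    /* local stand-ins for NET_XMIT_* */

struct pkt { int id; struct pkt *next; };     /* stand-in for struct sk_buff */

struct toy_qdisc;
struct toy_qdisc_ops {                         /* mirrors a subset of Qdisc_ops */
    int         (*enqueue)(struct pkt *, struct toy_qdisc *);
    struct pkt *(*dequeue)(struct toy_qdisc *);
    int         (*requeue)(struct pkt *, struct toy_qdisc *);
    unsigned    (*drop)(struct toy_qdisc *);
    void        (*reset)(struct toy_qdisc *);
};

struct toy_qdisc {
    const struct toy_qdisc_ops *ops;
    struct pkt *head, *tail;
    unsigned qlen, limit;
};

/* enqueue(): accept the packet at the tail, or report a drop when full. */
static int fifo_enqueue(struct pkt *p, struct toy_qdisc *q)
{
    assert(p != NULL);
    if (q->qlen >= q->limit)
        return XMIT_DROP;
    p->next = NULL;
    if (q->tail)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
    q->qlen++;
    return XMIT_SUCCESS;
}

/* dequeue(): return the next packet to send, or NULL if none is ready. */
static struct pkt *fifo_dequeue(struct toy_qdisc *q)
{
    struct pkt *p = q->head;
    if (!p)
        return NULL;
    q->head = p->next;
    if (!q->head)
        q->tail = NULL;
    q->qlen--;
    return p;
}

/* requeue(): put the packet back at the head, where it was before. */
static int fifo_requeue(struct pkt *p, struct toy_qdisc *q)
{
    assert(p != NULL);
    p->next = q->head;
    q->head = p;
    if (!q->tail)
        q->tail = p;
    q->qlen++;
    return XMIT_SUCCESS;
}

/* drop(): remove one packet (here the oldest); 1 if one was removed. */
static unsigned fifo_drop(struct toy_qdisc *q)
{
    return fifo_dequeue(q) ? 1 : 0;
}

/* reset(): return to the initial, empty state. */
static void fifo_reset(struct toy_qdisc *q)
{
    q->head = q->tail = NULL;
    q->qlen = 0;
}

static const struct toy_qdisc_ops toy_fifo_ops = {
    .enqueue = fifo_enqueue, .dequeue = fifo_dequeue,
    .requeue = fifo_requeue, .drop = fifo_drop, .reset = fifo_reset,
};
```

A caller holds only a `struct toy_qdisc` with `ops` set to `&toy_fifo_ops` and invokes `q.ops->enqueue(&p, &q)` and so on, which is the same indirection the kernel uses to swap queueing disciplines without changing the transmit path.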
  20. 20. Qdisc_class_ops <ul><li>struct Qdisc_class_ops </li></ul><ul><li>{ </li></ul><ul><li>/* Child qdisc manipulation */ </li></ul><ul><li>int (*graft)(struct Qdisc *, unsigned long cl, </li></ul><ul><li>struct Qdisc *, struct Qdisc **); </li></ul><ul><li>struct Qdisc *(*leaf)(struct Qdisc *, unsigned long cl); </li></ul><ul><li>/* Class manipulation routines */ </li></ul><ul><li>unsigned long (*get)(struct Qdisc *, u32 classid); </li></ul><ul><li>void (*put)(struct Qdisc *, unsigned long); </li></ul><ul><li>int (*change)(struct Qdisc *, u32, u32, </li></ul><ul><li>struct rtattr **, unsigned long *); </li></ul><ul><li>int (*delete)(struct Qdisc *, unsigned long); </li></ul><ul><li>void (*walk)(struct Qdisc *, struct qdisc_walker *arg); </li></ul><ul><li>/* Filter manipulation */ </li></ul><ul><li>struct tcf_proto **(*tcf_chain)(struct Qdisc *, unsigned long); </li></ul><ul><li>unsigned long (*bind_tcf)(struct Qdisc *, unsigned long, </li></ul><ul><li>u32 classid); </li></ul><ul><li>void (*unbind_tcf)(struct Qdisc *, unsigned long); </li></ul><ul><li>/* rtnetlink specific */ </li></ul><ul><li>int (*dump)(struct Qdisc *, unsigned long, </li></ul><ul><li>struct sk_buff *skb, struct tcmsg *); </li></ul><ul><li>int (*dump_stats)(struct Qdisc *, unsigned long, </li></ul><ul><li>struct gnet_dump *); </li></ul><ul><li>}; </li></ul>
  21. 21. Qdisc_class_ops <ul><li>graft(): binds a queueing discipline to a class </li></ul><ul><li>leaf(): returns a pointer to the queueing discipline currently bound to the class </li></ul><ul><li>get(): maps the classid to the internal identification and increments the usage counter by one. </li></ul><ul><ul><li>Each class is associated with two ids </li></ul></ul><ul><ul><ul><li>classid (of type u32) is used by the user and by user-space configuration tools. </li></ul></ul></ul><ul><ul><ul><li>Internal identification (of type unsigned long) is used within the kernel </li></ul></ul></ul><ul><li>put(): decrements the usage counter. </li></ul>
  22. 22. Qdisc_class_ops <ul><li>change(): changes the class parameters </li></ul><ul><li>delete(): checks whether the class is still referenced and, if it is not, deletes the class. </li></ul><ul><li>walk(): walks through the linked list of all the classes of a queueing discipline and invokes the associated callback functions to obtain configuration/statistics data. </li></ul><ul><li>tcf_chain(): returns a pointer to the linked list of filters bound to the class. </li></ul><ul><li>bind_tcf(): binds a filter to a class. </li></ul><ul><li>dump(): gives configuration and statistics data of a class. </li></ul>
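The interplay of get(), put(), and delete() can be sketched with a small class table. Everything here is hypothetical: the internal identification is modeled as a table index plus one (0 meaning "no such class"), and the -1 returned by `toy_delete()` is a stand-in for a busy error. Real queueing disciplines choose their own internal representation.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define MAX_CLASSES 4

struct toy_class {
    uint32_t classid;   /* identifier seen by the user and user-space tools */
    int      refcnt;    /* usage counter */
    int      in_use;    /* slot holds a live class */
};

struct toy_sched { struct toy_class cls[MAX_CLASSES]; };

/* get(): map classid to the internal identification and take a reference.
 * Returns 0 if no class with that classid exists. */
static unsigned long toy_get(struct toy_sched *s, uint32_t classid)
{
    for (size_t i = 0; i < MAX_CLASSES; i++) {
        struct toy_class *c = &s->cls[i];
        if (c->in_use && c->classid == classid) {
            c->refcnt++;
            return (unsigned long)i + 1;
        }
    }
    return 0;
}

static struct toy_class *toy_lookup(struct toy_sched *s, unsigned long cl)
{
    assert(cl >= 1 && cl <= MAX_CLASSES);
    return &s->cls[cl - 1];
}

/* put(): release a reference taken by get(). */
static void toy_put(struct toy_sched *s, unsigned long cl)
{
    struct toy_class *c = toy_lookup(s, cl);
    assert(c->refcnt > 0);
    c->refcnt--;
}

/* delete(): refuse while the class is still referenced, else remove it. */
static int toy_delete(struct toy_sched *s, unsigned long cl)
{
    struct toy_class *c = toy_lookup(s, cl);
    if (c->refcnt > 0)
        return -1;
    c->in_use = 0;
    return 0;
}
```

In this model a class that was looked up with `toy_get()` cannot be deleted until the matching `toy_put()` has run.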
  23. 23. tcf_proto <ul><li>struct tcf_proto </li></ul><ul><li>{ </li></ul><ul><li>/* Fast access part */ </li></ul><ul><li>struct tcf_proto *next; </li></ul><ul><li>void *root; </li></ul><ul><li>int (*classify)(struct sk_buff *, struct tcf_proto *, </li></ul><ul><li>struct tcf_result *); </li></ul><ul><li>u32 protocol; </li></ul><ul><li>/* All the rest */ </li></ul><ul><li>u32 prio; </li></ul><ul><li>u32 classid; </li></ul><ul><li>struct Qdisc *q; </li></ul><ul><li>void *data; </li></ul><ul><li>struct tcf_proto_ops *ops; </li></ul><ul><li>}; </li></ul>
  24. 24. tcf_proto_ops <ul><li>struct tcf_proto_ops </li></ul><ul><li>{ </li></ul><ul><li>struct tcf_proto_ops *next; </li></ul><ul><li>char kind[IFNAMSIZ]; </li></ul><ul><li>int (*classify)(struct sk_buff *, struct tcf_proto *, </li></ul><ul><li>struct tcf_result *); </li></ul><ul><li>int (*init)(struct tcf_proto *); </li></ul><ul><li>void (*destroy)(struct tcf_proto *); </li></ul><ul><li>unsigned long (*get)(struct tcf_proto *, u32 handle); </li></ul><ul><li>void (*put)(struct tcf_proto *, unsigned long); </li></ul><ul><li>int (*change)(struct tcf_proto *, unsigned long, </li></ul><ul><li>u32 handle, struct rtattr **, unsigned long *); </li></ul><ul><li>int (*delete)(struct tcf_proto *, unsigned long); </li></ul><ul><li>void (*walk)(struct tcf_proto *, struct tcf_walker *arg); </li></ul><ul><li>/* rtnetlink specific */ </li></ul><ul><li>int (*dump)(struct tcf_proto *, unsigned long, </li></ul><ul><li>struct sk_buff *skb, struct tcmsg *); </li></ul><ul><li>struct module *owner; </li></ul><ul><li>}; </li></ul>
  25. 25. tcf_proto_ops <ul><li>classify(): classifies a packet (checks if the filtering rule applies to the packet) </li></ul><ul><ul><li>Possible return values are </li></ul></ul><ul><ul><ul><li>TC_POLICE_OK: the packet is accepted by the filter. </li></ul></ul></ul><ul><ul><ul><li>TC_POLICE_RECLASSIFY: the packet violates agreed parameters and should be allocated to a different class. </li></ul></ul></ul><ul><ul><ul><li>TC_POLICE_SHOT: the packet was dropped because of violation of agreed parameters. </li></ul></ul></ul><ul><ul><ul><li>TC_POLICE_UNSPEC: the rule does not match the packet, and the packet should be passed to the next filter. </li></ul></ul></ul><ul><ul><li>tcf_result contains the classid and the internal identification of the class. </li></ul></ul>
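How a chain of filters is consulted, with "unspecified" meaning "try the next filter", can be sketched as follows. The types and the `V_*` verdicts are local stand-ins for `struct tcf_proto`, `struct tcf_result`, and the `TC_POLICE_*` values (the numeric values are chosen for this sketch only), and the port-and-length rule is an invented example of a filtering rule.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Local stand-ins for TC_POLICE_UNSPEC / OK / RECLASSIFY / SHOT. */
enum verdict { V_UNSPEC = -1, V_OK = 0, V_RECLASSIFY = 1, V_SHOT = 2 };

struct pkt    { uint16_t dport; uint32_t len; };   /* stand-in for sk_buff */
struct result { uint32_t classid; };               /* stand-in for tcf_result */

struct filter {                                     /* stand-in for tcf_proto */
    struct filter *next;
    enum verdict (*classify)(const struct pkt *, const struct filter *,
                             struct result *);
    uint16_t dport;      /* hypothetical rule: match this destination port */
    uint32_t max_len;    /* hypothetical policing limit on packet length */
    uint32_t classid;    /* class assigned on a match */
};

static enum verdict port_classify(const struct pkt *p, const struct filter *f,
                                  struct result *res)
{
    if (p->dport != f->dport)
        return V_UNSPEC;             /* rule does not apply to this packet */
    if (p->len > f->max_len)
        return V_SHOT;               /* matches but violates the limit */
    res->classid = f->classid;
    return V_OK;
}

/* Walk the chain until one filter returns something other than V_UNSPEC. */
static enum verdict run_chain(const struct filter *chain, const struct pkt *p,
                              struct result *res)
{
    for (const struct filter *f = chain; f; f = f->next) {
        assert(f->classify != NULL);
        enum verdict v = f->classify(p, f, res);
        if (v != V_UNSPEC)
            return v;
    }
    return V_UNSPEC;                 /* no filter matched */
}
```

With a chain of an SSH filter followed by a DNS filter, a DNS packet falls through the first filter and is classified by the second, while a packet for any other port comes back unclassified.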
  26. 26. Queueing Discipline Example: net/sched/sch_red.c
  27. 27. RED
  28. 28. Dropping Probability [Figure: RED dropping probability curves p_a and p_b, with the Linux implementation marked; axis ticks min_th, max_th, max_p, and 1]
  29. 29. RED implementation I <ul><li>struct red_sched_data { /* Parameters */ </li></ul><ul><ul><li>u32 limit; /* HARD maximal queue length */ </li></ul></ul><ul><ul><li>u32 qth_min; /* Min average length threshold: A scaled */ </li></ul></ul><ul><ul><li>u32 qth_max; /* Max average length threshold: A scaled */ </li></ul></ul><ul><ul><li>char Wlog; /* log(W) */ </li></ul></ul><ul><ul><li>char Plog; /* random number bits */ </li></ul></ul><ul><ul><li>… </li></ul></ul><ul><ul><li>unsigned long qave; /* Average queue length: A scaled */ </li></ul></ul><ul><ul><li>int qcount; /* Packets since last random number generation */ </li></ul></ul><ul><ul><li>u32 qR; /* Cached random number */ </li></ul></ul><ul><ul><li>psched_time_t qidlestart; /* Start of idle period */ </li></ul></ul><ul><ul><li>struct tc_red_xstats st; }; </li></ul></ul>
  30. 30. RED implementation II: Compute average queue length <ul><li>We want: </li></ul><ul><ul><li>avg = avg * (1 - w) + w * backlog </li></ul></ul><ul><li>Code in Linux: </li></ul><ul><ul><li>q->qave += sch->stats.backlog - (q->qave >> q->Wlog); </li></ul></ul><ul><li>Why: q->qave stores the average scaled by 1/w, so multiplying the code line by w gives avg += w * backlog - w * avg </li></ul><ul><ul><li>avg = q->qave * w </li></ul></ul><ul><ul><li>w = 2^(-Wlog) </li></ul></ul>
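The scaled update can be checked numerically with a few lines of user-space C. This is a sketch of the arithmetic only: the function names are hypothetical, `uint32_t` is used instead of the kernel's field types, and overflow for very large backlogs is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* qave holds the average scaled by 2^wlog (that is, avg / w with
 * w = 2^-wlog), so one add, one subtract, and one shift implement
 *     avg = avg * (1 - w) + w * backlog
 * without any multiplication or floating point. */
static uint32_t red_update_qave(uint32_t qave, uint32_t backlog, unsigned wlog)
{
    assert(wlog < 32);
    return qave + backlog - (qave >> wlog);
}

/* Recover the unscaled average: avg = qave * w = qave >> wlog. */
static uint32_t red_avg(uint32_t qave, unsigned wlog)
{
    assert(wlog < 32);
    return qave >> wlog;
}
```

With wlog = 2 (w = 1/4) and a constant backlog of 100, the scaled value goes 0, 100, 175, ... and settles at 400, so the recovered average converges to the backlog of 100, as the floating-point formula predicts.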
  31. 31. RED implementation III <ul><li>Ideally, avg should be recalculated at a constant clock interval </li></ul><ul><li>In Linux, it is updated once per outgoing packet </li></ul><ul><li>Care needs to be taken for idle periods </li></ul>
  32. 32. RED implementation IV: Decide dropping probability <ul><li>We want: enqueue if max_P * (avg - min_th) / (max_th - min_th) < rnd / qcount, where rnd is a random number in [0, 1) </li></ul><ul><li>Linux code: </li></ul><ul><ul><li>if (((q->qave - q->qth_min) >> q->Wlog) * q->qcount < q->qR) goto enqueue; </li></ul></ul><ul><li>max_P = (qth_max - qth_min) / 2^Plog </li></ul><ul><li>q->qR = rnd * 2^Plog </li></ul>
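The integer test from the slide can be wrapped in a small decision function. This is a simplified sketch, not the code of sch_red.c: the struct and enum names are hypothetical, the random number is passed in through `qR` rather than generated, and the two threshold branches (always enqueue below qth_min, always drop at or above qth_max) come from the standard RED algorithm rather than from this slide.

```c
#include <assert.h>
#include <stdint.h>

enum red_action { RED_ENQUEUE, RED_DROP };

struct red_state {
    uint32_t qth_min, qth_max;  /* thresholds, scaled by 2^wlog like qave */
    unsigned wlog;              /* log2 of 1/w */
    uint32_t qave;              /* scaled average queue length */
    uint32_t qcount;            /* packets since the last random number */
    uint32_t qR;                /* cached random number, rnd * 2^Plog */
};

static enum red_action red_decide(const struct red_state *q)
{
    assert(q->qth_min < q->qth_max && q->wlog < 32);

    if (q->qave < q->qth_min)
        return RED_ENQUEUE;     /* average below min threshold */
    if (q->qave >= q->qth_max)
        return RED_DROP;        /* average at or above max threshold */

    /* Between the thresholds: the comparison from the slide, widened to
     * 64 bits so the product cannot overflow. */
    if ((uint64_t)((q->qave - q->qth_min) >> q->wlog) * q->qcount < q->qR)
        return RED_ENQUEUE;
    return RED_DROP;
}
```

For example, with wlog = 2, thresholds of 20 and 60 packets (scaled: 80 and 240), an average of 40 packets (scaled: 160), and qcount = 3, the left-hand side is 20 * 3 = 60, so the packet is enqueued when qR is 100 and dropped when qR is 50.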