Traffic Control on the command line. A presentation on the many factors and settings that affect IP transmission performance in Linux, and on the tc (Traffic Control) command used to set scheduling and shaping rules with queueing disciplines, traffic classes and filters.
Comprehensive XDP Offload - Handling the Edge Cases (Netronome)
While XDP is less complex to offload than other forms of kernel functionality due to the fact that it sits at the bottom of the stack, there are a number of items that lead to complexity when dealing with XDP offload. This talk will explore some of the ideas around how to implement these concepts as well as share some of the results we have seen while implementing offload on a 32 bit architecture.
BPF: Next Generation of Programmable Datapath (Thomas Graf)
This session covers lessons learned while exploring BPF as the basis for a programmable datapath and discusses options for OVS to leverage the technology.
netfilter is a framework provided by the Linux kernel that allows various networking-related operations to be implemented in the form of customized handlers.
iptables is a user-space application program that allows a system administrator to configure the tables provided by the Linux kernel firewall (implemented as different netfilter modules) and the chains and rules it stores.
Many systems use iptables/netfilter, Linux's native packet filtering/mangling framework since Linux 2.4, be it home routers or sophisticated cloud network stacks.
In this session, we will talk about the netfilter framework and its facilities, explain how basic filtering and mangling use-cases are implemented using iptables, and introduce some less common but powerful extensions of iptables.
Shmulik Ladkani, Chief Architect at Nsof Networks.
Long time network veteran and kernel geek.
Shmulik started his career at Jungo (acquired by NDS/Cisco) implementing residential gateway software, focusing on embedded Linux, Linux kernel, networking and hardware/software integration.
Some billions of forwarded packets later, Shmulik left his position as Jungo's lead architect and joined Ravello Systems (acquired by Oracle) as tech lead, developing a virtual data center as a cloud-based service, focusing around virtualization systems, network virtualization and SDN.
Recently he co-founded Nsof Networks, where he's been busy architecting network infrastructure as a cloud-based service, gazing at internet routes in astonishment, and playing the chkuku.
OSDC 2015: Roland Kammerer | DRBD9: Managing High-Available Storage in Many-N... (NETWAYS)
Recent publications show an ever-increasing demand in the area of cloud computing, where highly available storage is one important cornerstone.
DRBD (Distributed Replicated Block Device) has been a building block for high-availability clusters for years. Currently, DRBD is essentially limited to two-node cluster setups. In this talk we will provide an overview of recent developments in DRBD9 that make DRBD ready for upcoming challenges like many-node cluster setups and highly automated cloud deployments.
For the upcoming release we added an abstraction layer which is handled by “drbdmanage”. It is a tool that takes over management of logical volumes (LVM) and of configuration files for DRBD. Features of drbdmanage include creating, resizing, and removing replicated volumes. Additionally, drbdmanage handles taking snapshots and creating volumes in consistency groups. In order to support cloud deployments, a Cinder (OpenStack) driver is in development.
All the mentioned components are currently under active development and this talk will provide an overview about these tools as well as about our vision for DRBD in general.
With the upcoming DRBD9 release we will be able to support a higher number of nodes (up to 30) per replication group, ease the deployment of DRBD setups with drbdmanage (up to 1000 nodes planned) and provide the foundations for cloud integration.
Cilium - Fast IPv6 Container Networking with BPF and XDP (Thomas Graf)
We present a new open source project which provides IPv6 networking for Linux Containers by generating programs for each individual container on the fly and then running them as JITed BPF code in the kernel. By generating and compiling the code, the program is reduced to the minimally required feature set and then heavily optimised by the compiler, as parameters become plain variables. The upcoming addition of the eXpress Data Path (XDP) to the kernel will make this approach even more efficient, as the programs will be invoked directly from the network driver.
LinuxCon 2015 Linux Kernel Networking Walkthrough (Thomas Graf)
This presentation features a walk through the Linux kernel networking stack for users and developers. It will cover insights into both existing essential networking features and recent developments, and will show how to use them properly. Our starting point is the network card driver as it feeds a packet into the stack. We will follow the packet as it traverses through various subsystems such as packet filtering, routing, protocol stacks, and the socket layer. We will pause here and there to look into concepts such as networking namespaces, segmentation offloading, TCP small queues, and low latency polling and will discuss how to configure them.
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDP (Thomas Graf)
This talk will start with a deep dive and hands-on examples of BPF, possibly the most promising low level technology to address challenges in application and network security, tracing, and visibility. We will discuss how BPF evolved from a simple bytecode language to filter raw sockets for tcpdump to a JITable virtual machine capable of universally extending and instrumenting both the Linux kernel and user space applications. The introduction is followed by a concrete example of how the Cilium open source project applies BPF to solve networking, security, and load balancing for highly distributed applications. We will discuss and demonstrate how Cilium with the help of BPF can be combined with distributed system orchestration such as Docker to simplify security, operations, and troubleshooting of distributed applications.
The Next Generation Firewall for Red Hat Enterprise Linux 7 RC (Thomas Graf)
The Linux packet filtering technology, iptables, has its roots in times when networking was relatively simple and network bandwidth was measured in mere megabits. Emerging technologies, such as distributed NAT, overlay networks and containers require enhanced functionality and additional flexibility. In parallel, the next generation of network cards with speeds of 40Gb and 100Gb will put additional pressure on performance.
In the upcoming Red Hat Enterprise Linux 7, a new dynamic firewall service, FirewallD, is planned to provide greater flexibility over iptables by eliminating service disruptions during rule updates, abstraction, and support for different network trust zones. Additionally, a new virtual machine-based packet filtering technology, nftables, addresses the functionality and flexibility requirements of modern network workloads.
In this session you’ll:
Deep dive into the newly introduced packet filtering capabilities of Red Hat Enterprise Linux 7 beta.
Learn best practices.
See the new set of configuration utilities that allow new optimization possibilities.
Building Network Functions with eBPF & BCC (Kernel TLV)
eBPF (Extended Berkeley Packet Filter) is an in-kernel virtual machine that allows running user-supplied sandboxed programs inside of the kernel. It is especially well-suited to network programs and it's possible to write programs that filter traffic, classify traffic and perform high-performance custom packet processing.
BCC (BPF Compiler Collection) is a toolkit for creating efficient kernel tracing and manipulation programs. It makes use of eBPF.
BCC provides an end-to-end workflow for developing eBPF programs and supplies Python bindings, making eBPF programs much easier to write.
Together, eBPF and BCC allow you to develop and deploy network functions safely and easily, focusing on your application logic (instead of kernel datapath integration).
In this session, we will introduce eBPF and BCC, explain how to implement a network function using BCC, discuss some real-life use-cases and show a live demonstration of the technology.
About the speaker
Shmulik Ladkani, Chief Technology Officer at Meta Networks.
Long time network veteran and kernel geek.
Shmulik started his career at Jungo (acquired by NDS/Cisco) implementing residential gateway software, focusing on embedded Linux, Linux kernel, networking and hardware/software integration.
Some billions of forwarded packets later, Shmulik left his position as Jungo's lead architect and joined Ravello Systems (acquired by Oracle) as tech lead, developing a virtual data center as a cloud-based service, focusing around virtualization systems, network virtualization and SDN.
Recently he co-founded Meta Networks where he's been busy architecting secure, multi-tenant, large-scale network infrastructure as a cloud-based service.
This presentation features a walk through the Linux kernel networking stack covering the essentials and recent developments a developer needs to know. Our starting point is the network card driver as it feeds a packet into the stack. We will follow the packet as it traverses through various subsystems such as packet filtering, routing, protocol stacks, and the socket layer. We will pause here and there to look into concepts such as segmentation offloading, TCP small queues, and low latency polling. We will cover APIs exposed by the kernel that go beyond use of write()/read() on sockets and will look into how they are implemented on the kernel side.
An Introduction to eBPF (and cBPF). Topics covered include history, implementation, program types & maps. Also gives a brief introduction to XDP and DPDK.
The TC Flower Classifier allows control of packets based on flows determined by matching of well-known packet fields and metadata. This is inspired by similar flow classification described by OpenFlow and implemented by Open vSwitch. Offload of the TC Flower classifier and related modules provides a powerful mechanism to both increase throughput and reduce CPU utilisation for users of such flow-based systems. This presentation will give an overview of the evolution of offload of the TC Flower classifier: where it came from, the current status and possible future directions.
eBPF is an exciting new technology that is poised to transform Linux performance engineering. eBPF enables users to dynamically and programmatically trace any kernel or user space code path, safely and efficiently. However, understanding eBPF is not so simple. The goal of this talk is to give audiences a fundamental understanding of eBPF, how it interconnects existing Linux tracing technologies, and how it provides a powerful platform to solve any Linux performance problem.
Fully programmable SmartNICs allow new offloads like OVS, eBPF, P4 or vRouter, and the Linux kernel is changing to support them. Having these same offloads when using DPDK is a possibility, although the implications are not clear yet. Alejandro Lucero presented Netronome’s perspective on adding such support to DPDK, mainly for OVS and eBPF.
The WiMAX standard 802.16 defines the following five QoS classes:
• Unsolicited Grant Service (UGS)
• Extended Real Time Polling Service (ertPS)
• Real Time Polling Service (rtPS)
• Non Real Time Polling Service (nrtPS)
• Best Effort Service (BE)
Each of these WiMAX QoS classes has its own parameters, such as bandwidth request type, minimum throughput requirement, and delay or jitter constraints.
I had the pleasure of delivering the keynote presentation at Informa's 3G, HSPA & LTE Optimization conference in Prague. A great event with many very important presentations.
Ed Warnicke's talk at Open Networking Summit.
All open source networking projects depend on having access to a Universal Dataplane that is:
Able to fit their deployment models: Bare Metal/Embedded/Cloud/Containers/NFVi/VNFs
High performance
Feature Rich
Open with Broad Community support/participation
FD.io provides all of this and more. Come learn more about FD.io and how you can begin using it.
XDP in Practice: DDoS Mitigation @Cloudflare (C4Media)
Video and slides synchronized, mp3 and slide download available at URL https://bit.ly/2NtlaER.
Gilberto Bertin discusses the architecture of Cloudflare’s automatic DDoS mitigation pipeline, the initial packet filtering solution based on Iptables, and why Cloudflare had to introduce userspace offload. Bertin also describes how they switched from a proprietary offload technology to XDP for network stack bypass and how they are using XDP to load balance traffic. Filmed at qconlondon.com.
Gilberto Bertin works as a System Engineer at Cloudflare London. After working on a variety of technologies like P2P VPNs and userspace TCP/IP stacks, he joined the Cloudflare DDoS team in London to help filter all the bad internet traffic.
CONFidence 2017: Escaping the (sand)box: The promises and pitfalls of modern ... (PROIDEA)
Users of modern Linux containerization technologies are frequently at a loss as to what kind of security guarantees are delivered by the tools they use. Typical questions range from “Can these be used to isolate software with known security shortcomings and a rich history of security vulnerabilities?” to even “Can I use such a technique to isolate user-generated and potentially hostile assembler payloads?”
The modern Linux OS code base, as well as independent authors, provides a plethora of options for those who want to make sure that their computational loads are solidly confined. Potential users can choose from solutions ranging from Docker-like confinement projects, through Xen hypervisors, seccomp-bpf and ptrace-based sandboxes, to isolation frameworks based on hardware virtualization (e.g. KVM).
The talk will discuss the techniques available today, with a focus on the (frequently overstated) promises regarding their strength. In the end, as they say: “many speed bumps don’t make a wall”.
This presentation introduces Data Plane Development Kit overview and basics. It is a part of a Network Programming Series.
First, the presentation focuses on the network performance challenges on modern systems by comparing modern CPUs with modern 10 Gbps Ethernet links. Then it touches on the memory hierarchy and kernel bottlenecks.
The following part explains the main DPDK techniques, like polling, bursts, hugepages and multicore processing.
The DPDK overview explains how a DPDK application is initialized and run, and touches on lockless queues (rte_ring), memory pools (rte_mempool), memory buffers (rte_mbuf), hashes (rte_hash), cuckoo hashing, the longest prefix match library (rte_lpm), poll mode drivers (PMDs) and the kernel NIC interface (KNI).
At the end, there are a few DPDK performance tips.
Tags: access time, burst, cache, dpdk, driver, ethernet, hub, hugepage, ip, kernel, lcore, linux, memory, pmd, polling, rss, softswitch, switch, userspace, xeon
This Booklet is designed to help CCIE candidates prepare for the CCIE written and/or lab exam. However, it is not a complete study reference. It is just a series of the author’s personal notes, written down during his pre-lab and further studies, in the form of mind maps, based mainly on Cisco documentation for IOS 12.4T. The main goal of this material is to provide a quick and easy-to-skim way of refreshing a candidate’s existing knowledge. All effort has been made to make this Booklet as precise and correct as possible, but no warranty is implied. CCIE candidates are strongly encouraged to prepare using other comprehensive study materials, like the Cisco documentation (www.cisco.com/web/psa/products/index.html), Cisco Press books (www.ciscopress.com), and other well-known vendors’ products, before going through this Booklet. The author of this Booklet takes no responsibility, nor liability, to any person or entity with respect to loss of any information or failed tests or exams arising from the information contained in this Booklet.
Kernel Recipes 2014 - NDIV: a low overhead network traffic diverter (Anne Nicolas)
NDIV is a young, very simple, yet efficient network traffic diverter. Its purpose is to help build network applications that intercept packets at line rate with a very low processing overhead. A first example application is a stateless HTTP server reaching line rate on all packet sizes.
Willy Tarreau, HaproxyTech
The networking revolution of the last 6-7 years. This document gives a very brief, high-level view of how networking technology is changing, from legacy networking to future ideas.
Apache Flink(tm) - A Next-Generation Stream Processor (Aljoscha Krettek)
In this talk we will first give a short overview of the current state of streaming data analysis. We will then continue with a small introduction to the Apache Flink system for real-time data analysis, before diving deeper into some of the interesting properties that set Flink apart from the other players in this space. For this we will look at example use cases that either come directly from users or are based on our experience with users. Specific features we will look at include, for example, support for splitting events into individual sessions based on the time an event happened (event time), determining points in time at which to save the state of a streaming program for later restarts, efficient handling of very large stateful streaming computations, and accessibility of that state from the outside.
5. TC: Total Control
Contents
Traffic
Copyright Notice
1 TC: Traffic Control
2 TC origins and development
3 The Naming of Schedulers ...
4 What traffic? Where?
5 Sizes that Matter
6 What Control?
7 How is Traffic Controlled?
8 Queueing Schedulers
9 Throughput vs Lag
10 /sys and /proc
11 pfifo_fast Qdisc
12 ToS/DS-Prio Mapping
13 Example: Display Qdisc
14 The TBF Qdisc
15 Classless TBF Qdisc At Work
16 Classful HTB Qdisc
17 Classful HTB Qdisc At Work
18 Filter classifiers
19 Wish list
20 Docs & Credits
6. TC origins and development
● TC first appeared in kernel 2.2, developed by Alexey N. Kuznetsov
● Many additions and extensions developed ever since
● Latest addition(1) is the Berkeley Packet Filter, a “programmable classifier and actions for ingress/egress queueing disciplines”, available since kernel 3.18
● A new qdisc, cake, is being worked on
1) That I am aware of :-)
2
7. It isn’t just one of your holiday games.
The Naming of Schedulers Is a Difficult Matter
● Queueing (|Packet) (|Discipline) (Qdisc)
● (|Packet) Scheduler (|Algorithm)
● (|Packet) Queueing (|Algorithm)
Any random combination of strings will do: packet, scheduler, queueing, algorithm, discipline
3
8.-11. What traffic? Where?
Whatever goes through a socket can be scheduled:
(Diagram: Process data is copied by write() into the socket send buffer, passes through the qdisc - this is where the packet scheduler does its job - and the SKB, Socket Buffer, AKA "Driver Queue", and reaches the NIC egress buffer through the driver's HW_hard_start_xmit function / transmit queue, where HW is the name of the hardware NIC driver.)
● Scheduling and shaping only affect outbound packets, not incoming ones
● Filtering, on the other hand, can affect both inbound and outbound packets
12. Sizes that Matter
Factors impacting packet transmission timings:
● Socket buffer sizes:
  ➢ Each driver sets its own tx_ring, rx_ring
  ➢ Applications can set SO_SNDBUF and SO_RCVBUF with setsockopt(2)
    /proc/sys/net/core/rmem_default
    /proc/sys/net/core/wmem_default
● Default transmit queue size: 1000 packets
  ether_setup(): net/ethernet/eth.c
  dev->tx_queue_len = 1000; /* Ethernet wants good queues */
5
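Not part of the original slides: a minimal sketch of how these defaults can be inspected and adjusted from the shell. The 262144-byte value is purely illustrative, and an application can still request its own buffer sizes per socket with SO_SNDBUF/SO_RCVBUF.
# read the current socket buffer defaults (bytes)
sysctl net.core.rmem_default net.core.wmem_default
# also worth checking: the allowed per-socket maximums
sysctl net.core.rmem_max net.core.wmem_max
# raise the defaults for newly created sockets (illustrative value)
sysctl -w net.core.rmem_default=262144 net.core.wmem_default=262144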
13. Sizes that Matter
● Receiving-end backlog(1) size: 1000 packets(2)
  (/proc/sys/net/core/netdev_max_backlog)
● Queueing disciplines have their own buffer(s)
  ➢ See pfifo_fast ahead, for instance
● Packet size (standard, jumbo or super sized)
● Capability of the kernel/CPU to keep up with the flux (load, jiffies…)
● Number of hops (switches, routers, …)
  ➢ And funny hardware interactions
1) Maximum number of input packets received before the kernel can process them
2) For non-NAPI devices/drivers (< 2.4.20)
5
14.-15. What Control?
Traffic Control is multifaceted:
● SHAPING (egress): control the traffic rate of transmission
● SCHEDULING (egress): packet transmission scheduling (reordering/prioritizing)
● POLICING (ingress): rules to apply to ingress traffic
● DROPPING (ingress and egress): clipping of traffic exceeding a set bandwidth
Not covered in this talk: the ingress/policing side.
16.-17. How is Traffic Controlled?
Traffic Control uses Queue Disciplines:
(Diagram: a root qdisc with FILTERS; a classful qdisc contains Classes, each with its own filters and child qdiscs; a classless qdisc has no classes. Only leaves can be classless.)
7
18. Queueing Schedulers
1 pfifo_fast three-band packet-FIFO (default classless qdisc)
2 prio priority queueing discipline (classful)
3 pfifo packet-limited FIFO
4 bfifo byte-limited FIFO
5 cbq Class Based Queueing
6 htb Hierarchical Token Bucket (replacement for CBQ, 2.4.20)
7 tbf Token Bucket Filter
8 red Random Early Detection
9 choke Choose and Keep for (un)responsive flow
10 codel Controlled-Delay Active Queue Management
11 drr Deficit Round Robin scheduler
/proc/sys/net/core/default_qdisc
8
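As a side note not in the deck: on reasonably recent kernels (the sysctl appeared around 3.12) the default qdisc referenced above can be read and changed with sysctl. fq_codel below is just an example value; existing interfaces keep their current qdisc until it is replaced with tc.
# show which qdisc newly created interfaces/queues will get by default
sysctl net.core.default_qdisc
# switch the default (example value; applies to queues created afterwards)
sysctl -w net.core.default_qdisc=fq_codel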
20. Queueing Schedulers
Can be attached to qdiscs for filtering:
1 ematch Extended matches for use with "basic" or "flow" filters
2 bpf BPF programmable classifier and actions (3.18) - NEW!
2a cBPF Classic Berkeley Packet Filter
2b eBPF Extended Berkeley Packet Filter
cBPF actually always executes eBPF
8
21. Queueing Schedulers
Classful qdiscs use one of three methods to classify packets:
1) Type Of Service/Differentiated Services
2) filters
3) the skb->priority field, i.e. the SO_PRIORITY option set by the application
8
22.-23. Queueing Schedulers
Root qdisc and default queue length:
[alessandro@localhost ~]$ ip link list dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 00:1a:92:5f:1a:73 brd ff:ff:ff:ff:ff:ff
[alessandro@localhost ~]$
(pfifo_fast is the default qdisc; qlen 1000 is the default queue length, in packets)
24.-26. Queueing Schedulers
They can be changed this way:
[root@localhost ~]# ip link set dev eth0 txqueuelen 2000
[root@localhost ~]# ip link list dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 2000
    link/ether 00:1a:92:5f:1a:73 brd ff:ff:ff:ff:ff:ff
[root@localhost ~]# tc qdisc replace dev eth0 root prio
[root@localhost ~]# ip link list dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc prio state UP mode DEFAULT group default qlen 2000
    link/ether 00:1a:92:5f:1a:73 brd ff:ff:ff:ff:ff:ff
[root@localhost ~]#
27.-28. Queueing Schedulers
Queues are run by the kernel at each jiffy.
Jiffies are set to:
● 100 Hz, fixed value, up to all 2.4 kernels
● 1000 Hz, kernels 2.6.0 to 2.6.12
● Selectable among the values 100, 250 (default) and 1000, from kernel 2.6.13
● Beginning with kernel 2.6.20, selectable among the values 100, 250, 300 and 1000
# CONFIG_HZ_PERIODIC is not set
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
CONFIG_HZ_300=y
# CONFIG_HZ_1000 is not set
CONFIG_HZ=300
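A quick way, not shown in the slides, to check which HZ value the running kernel was built with, assuming your distribution ships the kernel config under /boot (or exposes /proc/config.gz):
# kernel config installed by the distribution
grep 'CONFIG_HZ' /boot/config-$(uname -r)
# or, if the kernel was built with IKCONFIG_PROC
zgrep 'CONFIG_HZ' /proc/config.gz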
29. Throughput vs Lag
Sending out 300 packets/sec of 1500 bytes each, 450KB/sec of traffic is generated.
How do I get more?
Of course, the first thing that comes to mind is:
● At each jiffy, flush all ready-to-go packets queued in the buffer
9
30. Throughput vs Lag
Ways to use more bandwidth/lower the load:
● Jumbo frames (9000-byte packets; an old idea that did not become a standard)
● LRO, Large Receive Offload(1) (and friends(2): TSO or LSO, UFO, GSO, since 2.6.18)
● Qdiscs do not queue single packets' data, but descriptors to SKBs that hold several packets
1) Available in NAPI drivers
2) TCP Segmentation Offload, aka Large Segmentation Offload; UDP Fragmentation Offload; Generic Segmentation Offload.
9
31. Throughput vs Lag
How large a packet can an SKB hold?
● An SKB can hold packets larger than 1500 bytes
● For IPv4, the top limit is 65,535 bytes (the Total Length header field is 16 bit)
● NIC hardware splits this into <= MTU units
  ➢ The qdisc queues SKB descriptors of super-packets sized > 1500 bytes
  ➢ Software or hardware splits them before putting them on the wire
9
32.-33. Throughput vs Lag
Settings visible/settable with ethtool:
[root@localhost ~]# ethtool --show-features eth0 | grep -E
> '(segmentation|offload):'
tcp-segmentation-offload: off
tx-tcp-segmentation: off
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp6-segmentation: off
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
l2-fwd-offload: off [fixed]
[root@localhost ~]#
Let's unset this one (generic-segmentation-offload)
34. Throughput vs Lag
Settings visible/settable with ethtool:
[root@localhost ~]# ethtool --features eth0 gso off
[root@localhost ~]# ethtool --show-features eth0 | grep -E
> generic-segmentation-offload
generic-segmentation-offload: off
[root@localhost ~]#
9
35. Throughput vs Lag
The larger the queue, the higher the lag...
● Take a 100Mbit link = 12,500,000 bytes/sec
● Take a qdisc buffer of 128 descriptors
● Let's assume 1 descriptor per 1500B packet
● Queue holds 127 high-throughput packets
● One small low-delay UDP packet arrives
➢ How long shall it wait before it is sent out?
1,500*127/12,500,000=0.01524sec
15.24ms! Far from Real Time!
9
36. Throughput vs Lag
The larger the queue, the higher the lag...
● BQL, Byte Queue Limit, designed (kernel 3.3.0) to dynamically limit the amount of data queued into the driver's queue
● It does not resize the buffer, it regulates its use
● /sys interface directory:
  find /sys/devices -name byte_queue_limits
● Available on a limited set of drivers
More about this on http://www.bufferbloat.net/
9
38. BQL /sys interface directory:
[alessandro@localhost ~]$ find /sys/devices/ -type d -name byte_queue_limits
/sys/devices/pci0000:00/0000:00:1c.0/0000:01:00.0/net/wlan0/queues/tx-0/byte_queue_limits
/sys/devices/pci0000:00/0000:00:1c.0/0000:01:00.0/net/wlan0/queues/tx-1/byte_queue_limits
/sys/devices/pci0000:00/0000:00:1c.0/0000:01:00.0/net/wlan0/queues/tx-2/byte_queue_limits
/sys/devices/pci0000:00/0000:00:1c.0/0000:01:00.0/net/wlan0/queues/tx-3/byte_queue_limits
/sys/devices/pci0000:00/0000:00:1c.2/0000:02:00.0/net/eth0/queues/tx-0/byte_queue_limits
/sys/devices/virtual/net/lo/queues/tx-0/byte_queue_limits
[alessandro@localhost ~]$ ls /sys/devices/pci0000:00/0000:00:1c.2/0000:02:00.0/net/eth0/queues/tx-0/byte_queue_limits
hold_time inflight limit limit_max limit_min
[alessandro@localhost ~]$ cat /sys/devices/pci0000:00/0000:00:1c.2/0000:02:00.0/net/eth0/queues/tx-0/byte_queue_limits/limit_max
1879048192
[alessandro@localhost ~]$
This is 1792 MiB!
/sys and /proc
10
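The same BQL files can also be written. Below is a small sketch, not from the deck, of capping how much data BQL lets the driver queue on eth0's first tx queue; the 30000-byte cap is an arbitrary example.
# the shorter /sys/class/net path points at the same per-queue directory
cd /sys/class/net/eth0/queues/tx-0/byte_queue_limits
cat inflight limit limit_max        # current BQL state (bytes)
echo 30000 > limit_max              # cap the queue at ~30 KB (example value)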
39. /proc interface files to take note of:
[alessandro@localhost ~]$ ll -o /proc/sys/net/ipv4/tcp_{{r,w}mem,tso_win_divisor,min_tso_segs,low_latency,limit_output_bytes}
-rw-r--r-- 1 root 0 set 10 20:19 /proc/sys/net/ipv4/tcp_limit_output_bytes
-rw-r--r-- 1 root 0 set 10 20:22 /proc/sys/net/ipv4/tcp_low_latency
-rw-r--r-- 1 root 0 set 10 20:22 /proc/sys/net/ipv4/tcp_min_tso_segs
-rw-r--r-- 1 root 0 set 10 20:22 /proc/sys/net/ipv4/tcp_rmem
-rw-r--r-- 1 root 0 set 10 20:22 /proc/sys/net/ipv4/tcp_tso_win_divisor
-rw-r--r-- 1 root 0 set 10 20:22 /proc/sys/net/ipv4/tcp_wmem
[alessandro@localhost ~]$
...just in case you didn't have enough knobs, levers, dials, buttons, switches, throttles, valves, levees, readings, metres, settings, options, queues, limits, buffers, rings, warnings, lights, sirens, signs, tables, alarms, interfaces, keys, ...
/sys and /proc
10
40. pfifo_fast qdisc
Simple, fast and the default Queueing Discipline:
● pfifo_fast
FIFO queue: the first packet arrived is the first served
Three-band FIFO queue organization:
0 = Minimum Delay/Interactive
1 = Best Effort
2 = Bulk
The kernel maps them to DSCP (previously TOS) bits
11
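For reference (this command is not on this slide), the bands and the priority map of the root pfifo_fast qdisc can be displayed with tc; the statistics in the sample output are illustrative only.
tc -s qdisc show dev eth0
# qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
#  Sent 184230 bytes 1247 pkt (dropped 0, overlimits 0 requeues 0)
#  backlog 0b 0p requeues 0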
42. pfifo_fast qdisc
RFC 1349 (1992) defined TOS like this:
  bits 0-2: Precedence
  bits 3-6: TOS: D (Min. Delay), T (Max. Throughput), R (Max. Reliability), MC (Min. Monetary Cost)
  bit 7: MBZ (Must Be Zero)
11
43. pfifo_fast qdisc
The same TOS bits, in Tuxese:
  bits 0-2: Precedence
  bit 3: Interactive
  bit 4: Bulk
  bit 5: Best Effort
  bit 6: Filler
  bit 7: MBZ
11
44. pfifo_fast qdisc
Linux mapped TOS bits into bands:
Bits  TOS                 Band
0000  Normal Service       1
0001  Min. Monetary Cost   2
0010  Max. Reliability     1
0011  mmc+mr               1
0100  Max. Throughput      2
0101  mmc+mt               2
0110  mr+mt                2
0111  mmc+mr+mt            2
1000  Min. Delay           0
1001  mmc+md               0
1010  mr+md                0
1011  mmc+mr+md            0
1100  mt+md                1
1101  mmc+mt+md            1
1110  mr+mt+md             1
1111  mmc+mr+mt+md         1
11
45. pfifo_fast qdisc
RFC 2474 (1998) turned TOS into DS and RFC 3168 (2001) added ECN:
  bits 0-5: Differentiated Services Code Point (traffic classes)
  bits 6-7: ECN
● DSCP indexes up to 64 distinct Per Hop Behaviours
● Default Forwarding PHB is the only mandatory one
11
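A small aside, not in the deck: one easy way to generate packets with a chosen ToS value and see which pfifo_fast band they land in is ping's -Q option (the address below is a placeholder). 0x10 sets the old Minimum Delay bit, which the mapping above sends to band 0; 0x08 sets Maximize Throughput, which goes to band 2.
# send ICMP echoes with TOS byte 0x10 (Minimum Delay -> band 0)
ping -Q 0x10 -c 3 192.0.2.1
# send with TOS byte 0x08 (Maximize Throughput -> band 2)
ping -Q 0x08 -c 3 192.0.2.1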
50. The TBF Qdisc
● TBF, Token Bucket Filter: only shapes traffic, does no scheduling
● Easy qdisc to limit egress traffic
● Simple, low overhead
(Diagram: Input Data and Tokens flow into the Token Bucket)
14
51. The TBF Qdisc
● Token Bucket: it is the qdisc buffer
● Tokens: the virtual unit of data managed
● Token Flow: a set number of tokens per unit of time that replenish the bucket
(Diagram: the Input Data Flow and the Input Token Flow both feed the Token Bucket)
14
52.-53. The TBF Qdisc
Tokens are removed from the bucket when a packet is sent.
(Animation frames: Input Data and Tokens enter the Token Bucket; data leaves towards the NIC.)
54.-65. The TBF Qdisc, Scenario I
● Input Token Flow is faster than Input Data Flow
● The bucket contains both empty tokens and tokens that hold data to output
● Data-holding tokens are output and deleted from the bucket
● Empty tokens and data-holding tokens keep filling the bucket
● Eventually the bucket is full of empty tokens and the token flow slows down
● Empty tokens in the bucket can be used to burst data out faster than the token rate
(Animation frames: Input Data Flow, Input Token Flow, Token Bucket, NIC; legend: data-holding token, empty token.)
66.-73. The TBF Qdisc, Scenario II
● Token Flow slower than Input Data Flow
● Eventually the bucket will be empty
● When the SKB is full, packets start being dropped (STOP)
(Animation frames: Input Data Flow, Input Token Flow, Token Bucket, NIC.)
74. Classless TBF Qdisc At Work
TBF, Token Bucket Filter: only shapes traffic, does no scheduling
[root@server ~]# tc qdisc show dev eth0
qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
[root@server ~]#
Client creates as much TCP traffic as possible:
[alessandro@client ~]$ telnet server chargen > /dev/null
Traffic is monitored on the client:
[root@client ~]# sar -n DEV 1
Linux 4.1.6.atom0 (client) 31/08/2015 _x86_64_ (2 CPU)
15
77.-79. Classless TBF Qdisc At Work
We can only shape traffic from the origin, that is on the server:
[root@server ~]# tc qdisc show dev eth0
qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
[root@server ~]# tc qdisc add dev eth0 root tbf rate 220kbps burst 3kb limit 200kb
[root@server ~]#
("root" is the ID of the parent qdisc)
● limit (bucket buffer size [bytes]) or latency (time data can sit in the bucket) must be set (MBS); 200kb here is roughly 128 1,500-byte packets
● burst (aka buffer or maxburst [bytes]) MBS; at minimum rate/HZ, and never < MTU!
● rate MBS
Rant: why did they choose these names?!
80.-82. Classless TBF Qdisc At Work
We can only shape traffic from the origin, that is on the server:
[root@server ~]# tc qdisc show dev eth0
qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
[root@server ~]# tc qdisc add dev eth0 root tbf rate 220kbps burst 3kb limit 200kb
[root@server ~]# tc qdisc show dev eth0
qdisc tbf 8003: root refcnt 2 rate 1760Kbit burst 3Kb lat 916.9ms
[root@server ~]#
(8003: is the handle used to reference this qdisc)
Where does that latency figure come from? Easy enough:
● qdisc buffer size: 200KB
● at a 220KB/sec rate, the buffer empties in 200/220 = 0.909 seconds
● the difference is due to burstiness
85. Classful HTB QdiscClassful HTB Qdisc
Hierarchical Token Bucket: a qdisc that splits
data flow into several TBF-like throttled classes
HTB
root qdisc
1:0
class 1:1
class 1:2
class 1:3
100mbit
50mbit
40mbit
10mbit
16
86. Classful HTB Qdisc
Hierarchical Token Bucket: a qdisc that splits data flow into several TBF-like throttled classes
[Diagram, annotated: the root qdisc handle is 1:0 (major number arbitrary; minor number must be 0 and can be left implied); class ids 1:1, 1:2, 1:3 (minor numbers arbitrary, but must be > 0); 100mbit link, classes at 50mbit / 40mbit / 10mbit]
87. Classful HTB Qdisc
Hierarchical Token Bucket: a qdisc that splits data flow into several TBF-like throttled classes
Qdiscs are attached to classes
[Diagram: HTB root qdisc 1:0 (100mbit); class 1:1 (50mbit), class 1:2 (40mbit), class 1:3 (10mbit), each with its own qdisc attached]
88. Classful HTB Qdisc
Hierarchical Token Bucket: a qdisc that splits data flow into several TBF-like throttled classes
We need filters to associate traffic with each class
[Diagram: HTB root qdisc 1:0 (100mbit); Filter 1 → class 1:1 (50mbit), Filter 2 → class 1:2 (40mbit), Filter 3 → class 1:3 (10mbit), each class with its own qdisc attached]
89. Classful HTB Qdisc At Work
[root@server ~]# tc qdisc add dev eth0 root handle 1: htb default 1
[root@server ~]#
HTB: setup of the root qdisc
This queue's handle is 1:0 (for a qdisc, the :0 can be left implied)
Default class has minor-id :1
Handle identifier: X:Y, Y = 0 qdisc, Y > 0 class
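If a root qdisc is already installed (e.g. the tbf from the previous section), it has to be removed before the HTB root can be added; a minimal sketch:
[root@server ~]# tc qdisc del dev eth0 root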
90. Classful HTB Qdisc At Work
[root@server ~]# tc qdisc add dev eth0 root handle 1: htb default 1
[root@server ~]# tc qdisc show dev eth0
qdisc htb 1: root refcnt 2 r2q 10 default 1 direct_packets_stat 6 direct_qlen 1000
[root@server ~]# tc class add dev eth0 parent 1: classid 1:1 htb rate 50mbit
> ceil 55mbit
[root@server ~]#
HTB: classes are attached to root qdisc
Handle identifier: X:Y, Y = 0 qdisc, Y > 0 class
92. Classful HTB Qdisc At Work
[root@server ~]# tc qdisc add dev eth0 root handle 1: htb default 1
[root@server ~]# tc qdisc show dev eth0
qdisc htb 1: root refcnt 2 r2q 10 default 1 direct_packets_stat 6 direct_qlen 1000
[root@server ~]# tc class add dev eth0 parent 1: classid 1:1 htb rate 50mbit
> ceil 55mbit
[root@server ~]#
HTB: classes are attached to root qdisc
Handle identifier: X:Y, Y = 0 qdisc, Y > 0 class
This class' parent is 1:0, i.e. the root qdisc (must be set!)
This class' id is 1:1
rate = guaranteed rate, ceil = top rate
93. Classful HTB Qdisc At Work
[root@server ~]# tc qdisc add dev eth0 root handle 1: htb default 1
[root@server ~]# tc qdisc show dev eth0
qdisc htb 1: root refcnt 2 r2q 10 default 1 direct_packets_stat 6 direct_qlen 1000
[root@server ~]# tc class add dev eth0 parent 1: classid 1:1 htb rate 50mbit
> ceil 55mbit
[root@server ~]# tc class add dev eth0 parent 1: classid 1:2 htb rate 40mbit
> ceil 44mbit
[root@server ~]# tc class add dev eth0 parent 1: classid 1:3 htb rate 10mbit
> ceil 11mbit
[root@server ~]# tc class show dev eth0
class htb 1:1 root prio 0 rate 50Mbit ceil 55Mbit burst 22425b cburst 24502b
class htb 1:2 root prio 0 rate 40Mbit ceil 44Mbit burst 18260b cburst 19932b
class htb 1:3 root prio 0 rate 10Mbit ceil 11Mbit burst 5763b cburst 6182b
[root@server ~]#
HTB: classes are attached to the root qdisc and are checked
Handle identifier: X:Y, Y = 0 qdisc, Y > 0 class
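Class parameters can also be adjusted later without tearing down the hierarchy; a minimal sketch (the new rates are purely illustrative):
[root@server ~]# tc class change dev eth0 parent 1: classid 1:3 htb rate 15mbit ceil 16mbit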
94. Classful HTB Qdisc At Work
[root@server ~]# tc -statistics class show dev eth0
class htb 1:1 root prio 0 rate 50Mbit ceil 55Mbit burst 22425b cburst 24502b
Sent 41298 bytes 359 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
lended: 359 borrowed: 0 giants: 0
tokens: 55843 ctokens: 55489
class htb 1:2 root prio 0 rate 40Mbit ceil 44Mbit burst 18260b cburst 19932b
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
lended: 0 borrowed: 0 giants: 0
tokens: 57078 ctokens: 56625
class htb 1:3 root prio 0 rate 10Mbit ceil 11Mbit burst 5763b cburst 6182b
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
lended: 0 borrowed: 0 giants: 0
tokens: 72062 ctokens: 70250
[root@server ~]#
All traffic is now handled by default class 1:1
96. Classful HTB Qdisc At Work
[root@server ~]# tc -statistics class show dev eth0
class htb 1:1 root prio 0 rate 50Mbit ceil 55Mbit burst 22425b cburst 24502b
Sent 41298 bytes 359 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
lended: 359 borrowed: 0 giants: 0
tokens: 55843 ctokens: 55489
class htb 1:2 root prio 0 rate 40Mbit ceil 44Mbit burst 18260b cburst 19932b
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
lended: 0 borrowed: 0 giants: 0
tokens: 57078 ctokens: 56625
class htb 1:3 root prio 0 rate 10Mbit ceil 11Mbit burst 5763b cburst 6182b
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
lended: 0 borrowed: 0 giants: 0
tokens: 72062 ctokens: 70250
[root@server ~]#
All traffic is now handled by default class 1:1
Qdisc statistics are available only on non-default qdiscs
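To follow these counters while traffic is flowing, the statistics view can simply be re-run periodically; a minimal sketch:
[root@server ~]# watch -n 1 tc -s class show dev eth0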
99. Classful HTB Qdisc At Work
[root@server ~]# tc qdisc add dev eth0 parent 1:1 handle 2:1 pfifo &&
tc qdisc add dev eth0 parent 1:2 handle 3:2 pfifo limit 800 &&
tc qdisc add dev eth0 parent 1:3 handle 4:2 pfifo limit 200
[root@server ~]# ip link list dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc htb state UP mode
DEFAULT group default qlen 1000
link/ether 00:aa:00:ca:83:af brd ff:ff:ff:ff:ff:ff
[root@server ~]#
Classes can be given qdiscs:
100. Classful HTB Qdisc At Work
[root@server ~]# tc qdisc add dev eth0 parent 1:1 handle 2:1 pfifo &&
tc qdisc add dev eth0 parent 1:2 handle 3:2 pfifo limit 800 &&
tc qdisc add dev eth0 parent 1:3 handle 4:2 pfifo limit 200
[root@server ~]# ip link list dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc htb state UP mode
DEFAULT group default qlen 1000
link/ether 00:aa:00:ca:83:af brd ff:ff:ff:ff:ff:ff
[root@server ~]#
Classes can be given qdiscs:
limit = qdisc queue size (in packets)
When no limit is set, the default qdisc queue size is the interface's txqueuelen
Sigh! What ip link set calls txqueuelen is what ip link list calls qlen, which is what pfifo calls limit
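As a quick check of that naming tangle, the same number shows up under all three names; a minimal sketch (illustrative, not from the measured session):
[root@server ~]# ip link set dev eth0 txqueuelen 500
[root@server ~]# ip link list dev eth0 | grep -o "qlen [0-9]*"
qlen 500
A pfifo added afterwards without an explicit limit would then default to 500 packets.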
101. Classful HTB Qdisc At Work
[root@server ~]# tc qdisc add dev eth0 parent 1:1 handle 2:1 pfifo &&
tc qdisc add dev eth0 parent 1:2 handle 3:2 pfifo limit 800 &&
tc qdisc add dev eth0 parent 1:3 handle 4:2 pfifo limit 200
[root@server ~]# ip link list dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc htb state UP mode
DEFAULT group default qlen 1000
link/ether 00:aa:00:ca:83:af brd ff:ff:ff:ff:ff:ff
[root@server ~]# tc qdisc show dev eth0
qdisc htb 1: root refcnt 2 r2q 10 default 1 direct_packets_stat 133 direct_qlen
1000
qdisc pfifo 2: parent 1:1 limit 1000p
qdisc pfifo 3: parent 1:2 limit 800p
qdisc pfifo 4: parent 1:3 limit 200p
[root@server ~]#
Qdiscs check:
103. Classful HTB Qdisc At Work
[root@server ~]# tc filter add dev eth0 protocol ip parent 1:0 prio 1
u32 match ip sport 19 0xffff flowid 1:1
[root@server ~]# tc filter add dev eth0 protocol ip parent 1:0 prio 0
u32 match ip sport 70 0xffff flowid 1:2
[root@server ~]# tc filter add dev eth0 protocol ip parent 1:0 prio 0
u32 match ip sport 80 0xffff flowid 1:3
[root@server ~]#
Filters tell tc which traffic to direct to which class:
104. Classful HTB Qdisc At Work
[root@server ~]# tc filter add dev eth0 protocol ip parent 1:0 prio 1
u32 match ip sport 19 0xffff flowid 1:1
[root@server ~]# tc filter add dev eth0 protocol ip parent 1:0 prio 0
u32 match ip sport 70 0xffff flowid 1:2
[root@server ~]# tc filter add dev eth0 protocol ip parent 1:0 prio 0
u32 match ip sport 80 0xffff flowid 1:3
[root@server ~]#
Filters tell tc which traffic to direct to which class:
parent 1:0 = attach to the root qdisc
prio: filters with a lower priority number are evaluated first
flowid 1:3 = class ID that gets sport 80 IP traffic
105. Classful HTB Qdisc At Work
[root@server ~]# tc filter add dev eth0 protocol ip parent 1:0 prio 1
u32 match ip sport 19 0xffff flowid 1:1
[root@server ~]# tc filter add dev eth0 protocol ip parent 1:0 prio 0
u32 match ip sport 70 0xffff flowid 1:2
[root@server ~]# tc filter add dev eth0 protocol ip parent 1:0 prio 0
u32 match ip sport 80 0xffff flowid 1:3
[root@server ~]#
Filters tell tc which traffic to direct to which class:
u32 filter: matches on anything
0xffff = 16-bit mask, exact match
flowid = where matching packets go, e.g. class htb 1:3 (rate 10Mbit ceil 11Mbit burst 5763b cburst 6182b)
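u32 can also match other header fields with partial masks; a minimal sketch (the destination network is purely illustrative), directing everything sent to 192.168.1.0/24 into class 1:2:
[root@server ~]# tc filter add dev eth0 protocol ip parent 1:0 prio 2 u32 match ip dst 192.168.1.0/24 flowid 1:2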
106. Classful HTB Qdisc At Work
[root@server ~]# tc filter add dev eth0 protocol ip parent 1:0 prio 1
u32 match ip sport 19 0xffff flowid 1:1
[root@server ~]# tc filter add dev eth0 protocol ip parent 1:0 prio 0
u32 match ip sport 70 0xffff flowid 1:2
[root@server ~]# tc filter add dev eth0 protocol ip parent 1:0 prio 0
u32 match ip sport 80 0xffff flowid 1:3
[root@server ~]# tc filter show dev eth0
filter parent 1: protocol ip pref 1 u32
filter parent 1: protocol ip pref 1 u32 fh 800: ht divisor 1
filter parent 1: protocol ip pref 1 u32 fh 800::800 order 2048 key ht 800 bkt 0
flowid 1:1
match 00130000/ffff0000 at 20
filter parent 1: protocol ip pref 49151 u32
filter parent 1: protocol ip pref 49151 u32 fh 802: ht divisor 1
filter parent 1: protocol ip pref 49151 u32 fh 802::800 order 2048 key ht 802
bkt 0 flowid 1:3
match 00500000/ffff0000 at 20
filter parent 1: protocol ip pref 49152 u32
filter parent 1: protocol ip pref 49152 u32 fh 801: ht divisor 1
filter parent 1: protocol ip pref 49152 u32 fh 801::800 order 2048 key ht 801
bkt 0 flowid 1:2
[root@server ~]#
Filter check:
107. Classful HTB Qdisc At Work
Recap: how we are set up now on the server:
[Diagram: HTB root qdisc 1:0 (100mbit); sport 19 → class 1:1 (50mbit), sport 70 → class 1:2 (40mbit), sport 80 → class 1:3 (10mbit), each class with its own qdisc attached]
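For reference, the whole setup gathered from the previous slides into one block (same commands as shown earlier):
tc qdisc add dev eth0 root handle 1: htb default 1
tc class add dev eth0 parent 1: classid 1:1 htb rate 50mbit ceil 55mbit
tc class add dev eth0 parent 1: classid 1:2 htb rate 40mbit ceil 44mbit
tc class add dev eth0 parent 1: classid 1:3 htb rate 10mbit ceil 11mbit
tc qdisc add dev eth0 parent 1:1 handle 2:1 pfifo
tc qdisc add dev eth0 parent 1:2 handle 3:2 pfifo limit 800
tc qdisc add dev eth0 parent 1:3 handle 4:2 pfifo limit 200
tc filter add dev eth0 protocol ip parent 1:0 prio 1 u32 match ip sport 19 0xffff flowid 1:1
tc filter add dev eth0 protocol ip parent 1:0 prio 0 u32 match ip sport 70 0xffff flowid 1:2
tc filter add dev eth0 protocol ip parent 1:0 prio 0 u32 match ip sport 80 0xffff flowid 1:3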
108. Classful HTB Qdisc At Work
● Traffic from server:19 (chargen) to client: no changes.
● Same 50%, ~6100kBps measured on the client as before.
115. Classful HTB Qdisc At Work
● Similar story for traffic from server:70 (gopher) to the client.
● We get 39.99%, 4881.78kBps measured on the client.
116. Classful HTB Qdisc At Work
● Similar story for traffic from server:70 (gopher) to the client.
● We get 39.99%, 4881.78kBps measured on the client.
● If the client opens multiple connections to separate ports on the server, traffic adds up, e.g.:
➢ chargen+http = 50%+10% = 60%
➢ chargen+gopher = 50%+40% = 90%
➢ gopher+http = 40%+10% = 50%
➢ chargen+gopher+http = 50%+40%+10% = 100%
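A minimal sketch of driving two classes at once from the client (the name of the downloaded object is purely hypothetical; any sufficiently large HTTP download will do):
[alessandro@client ~]$ telnet server chargen > /dev/null &
[alessandro@client ~]$ wget -O /dev/null http://server/some-large-file &
[alessandro@client ~]$ sar -n DEV 1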
117. Filter classifiers
The same thing can be done with iptables marks instead of u32 matches:
[root@server ~]# iptables -A OUTPUT -t mangle -o eth0 -p tcp --sport chargen -j MARK --set-mark 5
[root@server ~]# iptables -A OUTPUT -t mangle -o eth0 -p tcp --sport gopher -j MARK --set-mark 6
[root@server ~]# iptables -A OUTPUT -t mangle -o eth0 -p tcp --sport http -j MARK --set-mark 7
[root@server ~]# tc filter add dev eth0 protocol ip parent 1:0 prio 1 handle 5 fw flowid 1:1
[root@server ~]# tc filter add dev eth0 protocol ip parent 1:0 prio 1 handle 6 fw flowid 1:2
[root@server ~]# tc filter add dev eth0 protocol ip parent 1:0 prio 1 handle 7 fw flowid 1:3
[root@server ~]#
118. Filter classifiers
The same thing can be done with iptables marks instead of u32 matches:
[root@server ~]# iptables -A OUTPUT -t mangle -o eth0 -p tcp --sport chargen -j MARK --set-mark 5
[root@server ~]# iptables -A OUTPUT -t mangle -o eth0 -p tcp --sport gopher -j MARK --set-mark 6
[root@server ~]# iptables -A OUTPUT -t mangle -o eth0 -p tcp --sport http -j MARK --set-mark 7
[root@server ~]# tc filter add dev eth0 protocol ip parent 1:0 prio 1 handle 5 fw flowid 1:1
[root@server ~]# tc filter add dev eth0 protocol ip parent 1:0 prio 1 handle 6 fw flowid 1:2
[root@server ~]# tc filter add dev eth0 protocol ip parent 1:0 prio 1 handle 7 fw flowid 1:3
[root@server ~]#
fw: not a u32 filter
What iptables calls a mark, tc calls a handle
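To check that packets are actually being marked and then classified, the per-rule counters in the mangle table can be compared with the per-class counters; a minimal sketch:
[root@server ~]# iptables -t mangle -L OUTPUT -v -n
[root@server ~]# tc -s class show dev eth0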
119. Filter classifiers
Filters can use several classifiers, including:
1) fw (firewall mark)
2) route
3) tcindex
4) u32
5) basic
6) cgroup (Control Group Classifier)
They allow tc to do many of the things that can be done with netfilter.
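As an example of the last one, the cgroup classifier keys on a class id assigned to a control group; a minimal sketch, assuming the cgroup-v1 net_cls controller is mounted and using a hypothetical group name:
[root@server ~]# mkdir /sys/fs/cgroup/net_cls/bulk
[root@server ~]# echo 0x00010003 > /sys/fs/cgroup/net_cls/bulk/net_cls.classid    (0xAAAABBBB maps to class AAAA:BBBB, here 1:3)
[root@server ~]# echo $$ > /sys/fs/cgroup/net_cls/bulk/tasks
[root@server ~]# tc filter add dev eth0 parent 1: protocol ip prio 2 handle 1: cgroup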
121. Filter classifiers
But:
1) netfilter does more things
2) netfilter is faster than packet scheduling
➢ except when eBPF is used [1]
3) packet flow inside the Linux networking stack must be kept in mind, especially when NATting
[1] According to Michael Holzheu, "eBPF on the Mainframe - Packet Filtering and More", LinuxCon 2015, Mon. Oct. 5th
122. Wish list
● Documentation!
● lartc.org stopped in 2006, kernel 2.4
● man tc-filter(8) missing
● Tutorials lacking
● Wider:
➢ awareness of tc
➢ user base
➢ mention in certifications (LF and LPI)
● Naming consistency (txqueuelen = qlen, limit = buffer, burst ≠ buffer, etc.)
123. Docs & Credits
● Alexey Kuznetsov, first/main Linux scheduler developer
● Authors and maintainers of the Linux Advanced Routing and Traffic Control
HOWTO http://lartc.org/ (project needs to be revived and updated)
● OpenWRT developers:
http://wiki.openwrt.org/doc/howto/packet.scheduler/packet.scheduler
● Dan Siemon: Queueing in the Linux Network Stack (2013)
http://www.coverfire.com/articles/queueing-in-the-linux-network-stack/
● Advanced traffic control – ArchWiki
https://wiki.archlinux.org/index.php/Advanced_traffic_control
● Linux Kernel documentation
https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt
● Linux Foundation's iproute2_examples
http://www.linuxfoundation.org/collaborate/workgroups/networking/iproute2_examples
● The author thanks Mr. Giovambattista Vieri for his tips and encouragement
<g.vieri@ent-it.com>