Kube-proxy enables access to Kubernetes services (virtual IPs backed by pods) by configuring client-side load-balancing on nodes. The first implementation relied on a userspace proxy which was not very performant. The second implementation used iptables and is still the one used in most Kubernetes clusters. Recently, the community introduced an alternative based on IPVS. This talk will start with a description of the different modes and how they work. It will then focus on the IPVS implementation, the improvements it brings, the issues we encountered and how we fixed them as well as the remaining challenges and how they could be addressed. Finally, the talk will present alternative solutions based on eBPF such as Cilium.
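The behavioral difference between the iptables and IPVS modes can be sketched in a few lines of Python (an illustrative simplification; real kube-proxy programs kernel rules rather than selecting endpoints in userspace, and the names here are made up):

```python
import random

# iptables mode: rules match with fixed probabilities, so endpoint
# selection is effectively random, with no connection awareness.
def iptables_pick(endpoints, rng):
    return rng.choice(endpoints)

# IPVS mode: an in-kernel load balancer with pluggable schedulers;
# round-robin is sketched here as a rotating index.
class IpvsRoundRobin:
    def __init__(self, endpoints):
        self.endpoints = list(endpoints)
        self.i = 0

    def pick(self):
        ep = self.endpoints[self.i % len(self.endpoints)]
        self.i += 1
        return ep

backends = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]
rr = IpvsRoundRobin(backends)
print([rr.pick() for _ in range(4)])  # cycles .1, .2, .3, then .1 again
```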
Cilium - Bringing the BPF Revolution to Kubernetes Networking and Security (Thomas Graf)
BPF is one of the fastest emerging technologies in the Linux kernel. The talk provides an introduction to Cilium, which brings the power of BPF to Kubernetes and other orchestration systems to provide highly scalable and efficient networking, security, and load balancing for containers and microservices. The talk covers the capabilities of Cilium today, but also dives deep into the emerging roadmap involving networking at the socket layer and service mesh datapath capabilities to provide highly efficient connectivity between cloud native apps and sidecar proxies.
Cilium - API-aware Networking and Security for Containers based on BPF (Thomas Graf)
Cilium is open source software for providing and transparently securing network connectivity and load balancing between application workloads such as application containers or processes. Cilium operates at Layer 3/4 to provide traditional networking and security services, as well as at Layer 7 to protect and secure the use of modern application protocols such as HTTP, gRPC, and Kafka. Cilium is integrated into common orchestration frameworks such as Kubernetes and Mesos.
Are you interested in dipping your toes in the cloud native observability waters, but as an engineer you are not sure where to get started with tracing problems through your microservices and application landscapes on Kubernetes? Then this is the session for you, where we take you on your first steps in an active open-source project that offers a buffet of languages, challenges, and opportunities for getting started with telemetry data.
The project is called OpenTelemetry, but before diving into the specifics, we’ll start by de-mystifying key concepts and terms such as observability, telemetry, instrumentation, cardinality, and percentiles to lay a foundation. After understanding the nuts and bolts of observability and distributed traces, we’ll explore the OpenTelemetry community: its Special Interest Groups (SIGs), its repositories, and how to become not only an end user but possibly a contributor. We will wrap up with an overview of the components in this project, such as the Collector, the OpenTelemetry protocol (OTLP), its APIs, and its SDKs.
Attendees will leave with an understanding of key observability concepts, become grounded in distributed tracing terminology, be aware of the components of OpenTelemetry, and know how to take their first steps toward an open-source contribution!
Key Takeaways: Open-source, vendor-neutral instrumentation is an exciting new reality as the industry standardizes on OpenTelemetry for observability. OpenTelemetry is on a mission to enable effective observability by making high-quality, portable telemetry ubiquitous. The world of observability and monitoring today has a steep learning curve, and in order to achieve ubiquity, the project would benefit from growing its contributor community.
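To make the tracing terminology concrete, here is a toy span in Python. This is deliberately not the real OpenTelemetry API (the actual SDK handles context propagation, ID generation, and export for you); it only illustrates what a span fundamentally is:

```python
import time
import secrets

class ToySpan:
    """Minimal stand-in for a tracing span: a named, timed operation
    carrying a trace ID shared by every span in one request."""
    def __init__(self, name, trace_id=None):
        self.name = name
        self.trace_id = trace_id or secrets.token_hex(16)
        self.span_id = secrets.token_hex(8)
        self.attributes = {}
        self.start = time.time()
        self.end = None

    def set_attribute(self, key, value):
        self.attributes[key] = value

    def finish(self):
        self.end = time.time()
        return self.end - self.start

root = ToySpan("GET /checkout")
child = ToySpan("SELECT orders", trace_id=root.trace_id)  # same trace
child.set_attribute("db.system", "postgresql")
child.finish()
root.finish()
```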
Cilium - Container Networking with BPF & XDP (Thomas Graf)
This talk demonstrates that programmability and performance do not require user-space networking; they can be achieved in the kernel by generating BPF programs and leveraging existing kernel subsystems. We will demo an early prototype which provides fast IPv6 & IPv4 connectivity to containers, security policy based on container labels with average cost O(1), and debugging and monitoring based on the per-CPU perf ring buffer. We encourage a lively discussion on the approach taken and next steps.
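The claimed O(1) average cost for label-based policy comes from hashing: each label set maps to a numeric identity, and a policy check becomes a single hash lookup. A hypothetical Python sketch (the identities and policy entries are invented for illustration):

```python
# Each unique label set is assigned a numeric security identity;
# policy is then a single hash lookup on (source identity, port),
# independent of how many rules or containers exist.
ALLOW = {
    (1001, 8080),  # identity 1001 (e.g. role=frontend) -> port 8080
    (1002, 5432),  # identity 1002 (e.g. role=backend)  -> port 5432
}

def allowed(src_identity, dst_port):
    return (src_identity, dst_port) in ALLOW  # O(1) on average

print(allowed(1001, 8080))  # frontend may reach 8080
print(allowed(1001, 5432))  # but not the database port
```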
Kube-proxy is the Kubernetes component responsible for reconciling the state of Service resources. It can be configured in four different modes: userspace, iptables, IPVS, or kernel space (Windows). At large scale, the IPVS mode offers better performance, making it an attractive option. In this session, I'll explain the IPVS internals and how Kubernetes automates the management of services through basic examples.
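The reconciliation mentioned above reduces to diffing desired state against actual state; a minimal hypothetical sketch:

```python
def reconcile(desired, actual):
    """Return the operations needed to make `actual` match `desired`.
    kube-proxy does the moral equivalent for Service endpoints,
    translating the result into iptables or IPVS rules."""
    to_add = desired - actual
    to_remove = actual - desired
    return sorted(to_add), sorted(to_remove)

desired_endpoints = {"10.0.0.1:80", "10.0.0.2:80"}
actual_endpoints = {"10.0.0.2:80", "10.0.0.9:80"}  # one stale entry
add, remove = reconcile(desired_endpoints, actual_endpoints)
print(add, remove)  # adds 10.0.0.1:80, removes 10.0.0.9:80
```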
Kubernetes Networking with Cilium - Deep Dive (Michal Rostecki)
Cilium is open source software for providing and transparently securing network connectivity and load balancing between application workloads such as application containers or processes. Cilium operates at Layer 3/4 to provide traditional networking and security services, as well as at Layer 7 to protect and secure the use of modern application protocols such as HTTP, gRPC, and Kafka. The foundation of Cilium is the Linux kernel technology BPF, which supports the dynamic insertion of BPF bytecode into the Linux kernel at various integration points. This presentation reveals the secrets of Kubernetes networking and gives you a deep dive into Cilium and why it is awesome!
SOSCON 2019.10.17
What are the methods for packet processing on Linux, and how fast is each of them? In this presentation, we will learn how to handle packets on Linux (user space, socket filter, netfilter, tc) and compare their performance, analyzing where in the network stack each processing step happens (hook points). We will also discuss packet processing using XDP, an in-kernel fast path recently added to the Linux kernel. eXpress Data Path (XDP) is a high-performance programmable network data path within the Linux kernel. XDP sits at the lowest software-accessible level of the network stack, the point at which the driver receives the packet. By using the eBPF infrastructure at this hook point, the network stack can be extended without modifying the kernel.
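An XDP program's first task is parsing raw frame bytes at the driver hook before returning a verdict (XDP_PASS, XDP_DROP, XDP_TX, ...). The equivalent parse, sketched in Python rather than the restricted C that real XDP programs are written in:

```python
import struct

def parse_frame(frame: bytes):
    """Parse Ethernet + IPv4 headers, roughly what an XDP program does
    before deciding to pass, drop, or retransmit a packet."""
    eth_proto = struct.unpack("!H", frame[12:14])[0]
    if eth_proto != 0x0800:          # not IPv4
        return None
    ver_ihl, proto = frame[14], frame[23]
    src = ".".join(str(b) for b in frame[26:30])
    dst = ".".join(str(b) for b in frame[30:34])
    return {"version": ver_ihl >> 4, "proto": proto, "src": src, "dst": dst}

# A hand-built minimal frame: zeroed MACs, EtherType 0x0800, IPv4 header
# with TTL 64, protocol 6 (TCP), src 10.0.0.1, dst 10.0.0.2.
frame = (bytes(12) + b"\x08\x00" + bytes([0x45, 0]) + bytes(6)
         + bytes([64, 6]) + bytes(2)
         + bytes([10, 0, 0, 1]) + bytes([10, 0, 0, 2]))
print(parse_frame(frame))
```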
Daniel T. Lee (Hoyeon Lee)
@danieltimlee
Daniel T. Lee currently works as a Software Engineer at Kosslab and contributes to the Linux kernel BPF project. He is interested in cloud, Linux networking, and tracing technologies, and likes to analyze the kernel's internals using BPF technology.
As your service footprint grows, adding traffic control capabilities beyond stock solutions like kube-proxy becomes critical. Envoy provides fine grained routing control, load shedding, and metrics that help you scale your environment smoothly. We'll walk through several traffic control strategies using Envoy.
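Load shedding of the kind Envoy provides can be reduced to a concurrency gate: fail fast when in-flight requests exceed a limit instead of queueing until everything times out. A toy sketch (the names and limit are illustrative, not Envoy's actual configuration):

```python
class ConcurrencyGate:
    """Shed load by rejecting requests beyond a concurrency limit."""
    def __init__(self, limit):
        self.limit = limit
        self.in_flight = 0

    def try_acquire(self):
        if self.in_flight >= self.limit:
            return False          # shed: fail fast, e.g. with HTTP 503
        self.in_flight += 1
        return True

    def release(self):
        self.in_flight -= 1

gate = ConcurrencyGate(limit=2)
results = [gate.try_acquire() for _ in range(3)]
print(results)  # the third request is shed
```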
In the Cloud Native community, eBPF is gaining popularity and is often the best solution for challenges that require deep observability of the system. eBPF is currently being embraced by major players.
Mydbops co-founder Kabilesh P.R (MySQL and MongoDB consultant) illustrates debugging Linux issues with eBPF: a brief introduction to BPF and eBPF, BPF internals, and the tools in action for faster resolution.
This talk discusses the core concepts behind the Kubernetes extensibility model. We are going to see how to implement new CRDs, operators and when to use them to automate the most critical aspects of your Kubernetes clusters.
Accelerating Envoy and Istio with Cilium and the Linux Kernel (Thomas Graf)
This talk will provide an introduction to injection options of Envoy and then deep dive into ongoing Linux kernel work that enables injecting Envoy while introducing as little latency as possible.
The service mesh and the sidecar proxy model are on a steep trajectory to redefine many networking and security use cases. This talk explains and demos a new socket-redirect Linux kernel technology that allows running Envoy with performance similar to what you would get if the sidecar were linked to the application over a UNIX domain socket. The talk will also give an outlook on how Envoy can use the recently merged kernel TLS functionality to transparently access the clear-text payload of end-to-end encrypted applications, without having to decrypt and re-encrypt any data, further reducing overhead and latency.
In this session, we’ll review how previous efforts, including Netfilter, Berkeley Packet Filter (BPF), Open vSwitch (OVS), and TC, approached the problem of extensibility. We’ll show you an open source solution available within the Red Hat Enterprise Linux kernel, where extending and merging some of the existing concepts leads to an extensible framework that satisfies the networking needs of datacenter and cloud virtualization.
Open vSwitch Offload: Conntrack and the Upstream Kernel (Netronome)
Offloading all or part of the Open vSwitch datapath to SmartNICs has been shown not only to free CPU resources on the server, but also to improve traffic processing performance. Recently, steps have been made to support such offloading in the upstream Linux kernel, focusing on creating an OVS datapath using the TC flower filter and utilizing the offload hooks already present there. This presentation focuses on how connection tracking (conntrack) may fit into this model. It describes current work being undertaken with the Netfilter community to allow offloading of conntrack entries, and links this work to the offloading of conntrack rules within OVS-TC.
Using eBPF for High-Performance Networking in Cilium (ScyllaDB)
The Cilium project is a popular networking solution for Kubernetes, based on eBPF. This talk uses eBPF code and demos to explore the basics of how Cilium makes network connections, and manipulates packets so that they can avoid traversing the kernel's built-in networking stack. You'll see how eBPF enables high-performance networking as well as deep network observability and security.
Kubernetes Concepts And Architecture Powerpoint Presentation Slides (SlideTeam)
Get these visually appealing Kubernetes Concepts And Architecture PowerPoint Presentation Slides to discuss the process of operating containerized applications. You can present the company's need for containers with the help of an open-source architecture PPT slideshow. The architecture of containers can be demonstrated with a visually appealing PPT slideshow. The reasons an organization opts for Kubernetes can be explained to your teammates with the help of containers PowerPoint infographics. Highlight the roadmap for installing Kubernetes in the organization by using content-ready PPT slides. Use visually appealing PPT templates to depict the major advantages of Kubernetes, such as improved productivity, stable application runs, and many more. After that, display a 30-60-90 day plan to implement Kubernetes in the organization. Display the key components of Kubernetes with the help of a diagram using these professionally designed cluster architecture PPT layouts. Describe the functionality of each component of Kubernetes. Hence, download the Kubernetes architecture PPT slides to easily and efficiently manage clusters. https://bit.ly/34DWa7x
If you’re working with just a few containers, managing them isn't too complicated. But what if you have hundreds or thousands? Think about having to handle multiple upgrades for each container, keeping track of container and node state, available resources, and more. That’s where Kubernetes comes in. Kubernetes is an open source container management platform that helps you run containers at scale. This talk will cover Kubernetes components and show how to run applications on it.
Deep dive in container service discovery (Docker, Inc.)
Service discovery and traffic load-balancing in the container ecosystem rely on different technologies, such as IPVS and iptables, and container orchestrators use different approaches. This talk will present in detail how Docker Swarm and Kubernetes achieve this. The talk will continue with a demo showing how applications that are not managed by Kubernetes can take advantage of its native load-balancing. Finally, it will compare these approaches to service-mesh solutions.
Francisco Javier Ramírez Urea - IT Architect, Hoplasoftware
Guillaume Morini - SE, Docker
The integration of Kubernetes orchestration into the Docker Enterprise Platform presents deployments with interesting new abstractions for application connectivity. Devs and Ops are often challenged to rationalize how pod networking (with CNI plugins like Calico or Flannel), Services (via kube-proxy), and Ingress work in concert to enable application connectivity within and outside a cluster. Similarly, given the dynamic and transient nature of containerized microservice workloads, they must work out how to leverage scalable and declarative approaches like network policies to express segmentation and security primitives. This session provides an illustrative walkthrough of these core concepts, going through common deployment architectures and offering design, operations, and scale considerations based on experience from numerous production deployments. We will discuss Kubernetes publishing methods and take a deep dive into Ingress Controllers. This session will also showcase how to complement application and operations workflows with policy-driven business, compliance, and security controls typically required in enterprise production deployments, going further into limiting traffic to services, session persistence, rewriting, and activating container health checks.
KubeCon EU 2016: Creating an Advanced Load Balancing Solution for Kubernetes ... (KubeAcademy)
Load balancing is an important part of any resilient web application. Kubernetes supports a few options for external load balancing, but they are limited in features. After a brief discussion of those options and the features they lack, we’ll show how to build an advanced load balancing solution for Kubernetes on top of NGINX, utilizing Kubernetes features including Ingress, Annotations, and ConfigMap. We’ll conclude with a demo of how to use NGINX and NGINX Plus to expose services to the Internet.
Sched Link: http://sched.co/6Bc9
Kubernetes currently has two load-balancing modes: userspace and iptables. Both have limitations in scalability and performance. We introduced IPVS as a third kube-proxy mode, which scales the Kubernetes load balancer to support 50,000 services. Beyond that, the control plane needs to be optimized in order to deploy 50,000 services. We will introduce alternative solutions and our prototypes with detailed performance data.
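Much of the scalability gap is data-structure driven: iptables evaluates service rules as a linear chain, while IPVS keeps its virtual services in a hash table. A hypothetical sketch of the difference:

```python
# iptables-like: a flat rule list scanned top to bottom, O(n) per lookup
rules = [(f"svc-{i}", [f"pod-{i}"]) for i in range(50000)]

def match_linear(vip):
    for rule_vip, backends in rules:
        if rule_vip == vip:
            return backends
    return None

# IPVS-like: a hash table keyed by virtual service, O(1) on average
table = dict(rules)

def match_hash(vip):
    return table.get(vip)

# Same answer, very different per-packet cost at 50,000 services
print(match_linear("svc-42"), match_hash("svc-42"))
```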
Running large Kubernetes clusters is challenging. At large scales, practitioners need to adapt and tune both their architectures and component configurations in specialized ways.
Our organisation has been running large scale Kubernetes clusters (up to 2000 nodes, and growing) for more than a year, and we have learned several lessons the hard way. This talk will dive into complex runtime and networking issues that occur when running Kubernetes in production at scale. We will provide examples of how to improve the architecture of clusters to increase scalability and performance, both on the control plane and the data plane. Further, tools from the greater ecosystem will be examined, as they are rarely tested within the context of very large clusters.
Finally, the talk will also discuss the mutually beneficial relationship we built with the larger Kubernetes community by providing feedback on the tools and contributing both fixes and improvements upstream.
Networking @Scale '19 - Getting a Taste of Your Network (Sergey Fedorov)
Sergey Fedorov, Senior Software Engineer at Netflix, describes a client-side network measurement system called "Probnik", and how it can be used to improve performance, reliability and control of client-server network interactions.
An introductory look at Kubernetes and how it leverages AWS IaaS features to provide its own virtual clustering, and demonstration of some of the behaviour inside the cluster that makes Kubernetes a popular choice for microservice deployments.
Knative - Deploy and Manage Modern Container-Based Serverless Workloads (Elad Hirsch)
Knative is the new kid in town in the Serverless community.
As Kubernetes has become our de facto cloud infrastructure, Knative allows us to focus more on our business logic and less on infrastructure, all while committing to the new paradigm of serverless computing.
This session will explore a high-level overview of Knative and follow the architectural design of a modern data pipeline shifting from AWS Lambda to Knative.
KubeCon EU 2016: Using Traffic Control to Test Apps in Kubernetes (KubeAcademy)
Testing applications is important, as shown by the rise of continuous integration and automated testing. In this talk, I will focus on one area of testing that is difficult to automate: poor network connectivity. Developers usually work within reliable networking conditions so they might not notice issues that arise in other networking conditions. I will give examples of software that would benefit from test scenarios with varying connectivity. I will explain how traffic control on Linux can help to simulate various network connectivity. Finally, I will run a demo showing how an application running in Kubernetes behaves when changing network parameters.
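On Linux this is typically done with `tc qdisc ... netem`, which injects delay and loss at the interface; the same idea can be approximated in-process for unit tests. A hypothetical sketch (seeded so the loss pattern is reproducible):

```python
import random

def degraded_link(messages, loss=0.3, seed=0):
    """Pass messages through a lossy link, roughly what
    `tc qdisc add dev eth0 root netem loss 30%` does to real packets."""
    rng = random.Random(seed)
    return [m for m in messages if rng.random() >= loss]

sent = list(range(10))
received = degraded_link(sent)
print(f"sent {len(sent)}, received {len(received)}")
```

The application under test then consumes `received` instead of `sent`, letting you assert its behavior under packet loss without touching real network interfaces.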
Sched Link: http://sched.co/6Bb3
Zero downtime deployment of micro-services with Kubernetes (Wojciech Barczyński)
Talk on deployment strategies with Kubernetes, covering Kubernetes configuration files and the actual implementation of your service in Golang and .NET Core.
You will find demos for recreate, rolling updates, blue-green, and canary deployments.
Source and demos, you will find on github: https://github.com/wojciech12/talk_zero_downtime_deployment_with_kubernetes
Testing applications with traffic control in containers / Alban Crequy (Kinvolk) - Ontico
Testing applications is important, as shown by the rise of continuous integration and automated testing. In this talk, I will focus on one area of testing that is difficult to automate: poor network connectivity. Developers usually work within reliable networking conditions so they might not notice issues that arise in other networking conditions. I will give examples of software that would benefit from test scenarios with varying connectivity. I will explain how traffic control on Linux can help to simulate various network connectivity. Finally, I will run a demo showing how an application running in Kubernetes behaves when changing network parameters.
Similar to Evolution of kube-proxy (Brussels, FOSDEM 2020)
Running Kubernetes at scale is challenging and you can often end up in situations where you have to debug complex and unexpected issues. This requires understanding in detail how the different components work and interact with each other. Over the last 3 years, Datadog migrated most of its workloads to Kubernetes and now manages dozens of clusters consisting of thousands of nodes each. During this journey, engineers have debugged complex issues with root causes that were sometimes very surprising. In this talk Laurent and Tabitha will share some of these stories, including a favorite: how a complex interaction between familiar Kubernetes components allowed an OOM-killer invocation to trigger the deletion of a namespace.
DNS is one of the Kubernetes core systems and can quickly become a source of issues when you’re running clusters at scale. For over a year at Datadog, we’ve run Kubernetes clusters with thousands of nodes that host workloads generating tens of thousands of DNS queries per second. It wasn’t easy to build an architecture able to handle this load, and we’ve had our share of problems along the way.
This talk starts with a presentation of how Kubernetes DNS works. It then dives into the challenges we’ve faced, which span a variety of topics related to load, connection tracking, upstream servers, rolling updates, resolver implementations, and performance. We then show how our DNS architecture evolved over time to address or mitigate these problems. Finally, we share our solutions for detecting these problems before they happen—and identifying misbehaving clients.
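One classic amplifier of DNS load in Kubernetes is the resolver's ndots/search-path behavior: with the default ndots:5, short names are tried against every search domain before being queried as-is. A sketch of that expansion (the search domains are illustrative):

```python
def candidate_queries(name, search_domains, ndots=5):
    """Mimic glibc resolver behavior with the Kubernetes default
    ndots:5 - names with fewer than `ndots` dots are expanded with
    every search domain first, multiplying DNS traffic."""
    if name.endswith("."):
        return [name]                      # fully qualified: one query
    if name.count(".") >= ndots:
        return [name] + [f"{name}.{d}" for d in search_domains]
    return [f"{name}.{d}" for d in search_domains] + [name]

search = ["ns.svc.cluster.local", "svc.cluster.local", "cluster.local"]
print(candidate_queries("backend", search))
print(candidate_queries("example.com", search))  # external names expand too
```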
The Kubernetes audit logs are a rich source of information: all of the calls made to the API server are stored, along with additional metadata such as usernames, timings, and source IPs. They help to answer questions such as “What is overloading my control plane?” or “Which sequence of events led to this problematic situation?”. These questions are hard to answer otherwise—especially in large clusters. At Datadog, we have been running clusters with 1000+ nodes for more than a year and during that time, the audit logs have proved invaluable.
In this presentation, we will first introduce the audit logs, explain how they are configured, and review the type of data they store. Finally, we will describe in detail several scenarios where they have helped us to diagnose complex problems.
Running large Kubernetes clusters is difficult. Datadog has been running large-scale Kubernetes clusters (thousands of nodes) for more than a year and has learned several lessons the hard way.
Laurent Bernaille examines the challenges Datadog faced during this journey. He dives into problems that arise when you run large clusters—and, crucially, how to address them—by providing detailed examples based on Datadog’s experience across different cloud providers. You’ll explore complex runtime and networking issues: at scale, you discover issues in low-level components that are individually very rare but occur regularly when you have a large number of nodes.
Additionally, Laurent provides examples of how to improve the architecture of clusters to increase scalability and performance, both on the control plane and the data plane (communication between pods and ingress traffic). If scale can be hard on the control plane, it’s even harder on tools from the ecosystem, which have rarely been tested on very large clusters. He explains several examples of the tools Datadog uses and how it had to improve them to handle its scale. And you’ll leave with practical advice on how to build a good relationship with the community and start contributing back.
10 ways to shoot yourself in the foot with kubernetes, #9 will surprise you! - Laurent Bernaille
Kubernetes is a very powerful and complicated system, and many users don’t understand the underlying systems. Come learn how your users can abuse container runtimes, overwhelm your control plane, and cause outages - it’s actually quite easy!
In the last year, we have containerized hundreds of applications and deployed them in large-scale clusters (more than 1000 nodes). The journey was eventful and we learned a lot along the way. We’ll share stories of our ten favorite Kubernetes foot guns, including the dangers of cargo culting, rolling updates gone wrong, the pitfalls of initContainers, and nightmarish daemonset upgrades. The talk will present solutions we adopted to avoid or work around some of these problems and will finally show several improvements we plan to deploy in the future.
Similar to the Kubecon talk with the same title, with a few new incidents.
10 ways to shoot yourself in the foot with kubernetes, #9 will surprise you! - Laurent Bernaille
Running large Kubernetes clusters is challenging. This talk focuses on how you can optimize your network setup in clusters with 1000-2000 nodes. It discusses standard ingress solutions and their drawbacks, as well as potential alternatives.
This presentation describes the challenges we faced building, scaling and operating a Kubernetes cluster of more than 1000 nodes to host the Datadog applications
The Docker network overlay driver relies on several technologies: network namespaces, VXLAN, Netlink and a distributed key-value store. This talk will present each of these mechanisms one by one along with their userland tools and show hands-on how they interact together when setting up an overlay to connect containers.
The talk will continue with a demo showing how to build your own simple overlay using these technologies.
The Docker network overlay driver relies on several technologies: network namespaces, VXLAN, Netlink and a distributed key-value store. This talk will present each of these mechanisms one by one along with their userland tools and show hands-on how they interact together when setting up an overlay to connect containers. The talk will continue with a demo showing how to build your own simple overlay using these technologies. Finally, it will show how we can dynamically distribute IP and MAC information to every host in the overlay using BGP EVPN.
Serverless architectures are promising and will play an important role in the coming years but the ecosystem around serverless is still pretty young. We have been operating Lambda based applications for about a year and faced several challenges. In this presentation we share these challenges and propose some solutions to work around them.
Most tools to recognize the application associated with network connections use well-known signatures as the basis for their classification. This approach is very effective in enterprise and campus networks to pinpoint forbidden applications (peer to peer, for instance) or security threats. However, it is easy to use encryption to evade these mechanisms. In particular, Secure Sockets Layer (SSL) libraries such as OpenSSL are widely available and can easily be used to encrypt any type of traffic. In this paper, we propose a method to detect applications in SSL encrypted connections. Our method uses only the size of the first few packets of an SSL connection to recognize the application, which enables an early classification. We test our method on packet traces collected on two campus networks and on manually-encrypted traces. Our results show that we are able to recognize the application in an SSL connection with more than 85% accuracy.
The automatic detection of applications associated with network traffic is an essential step for network security and traffic engineering. Unfortunately, simple port-based classification methods are not always efficient and systematic analysis of packet payloads is too slow. Most recent research proposals use flow statistics to classify traffic flows once they are finished, which limit their applicability for online classification. In this paper, we evaluate the feasibility of application identification at the beginning of a TCP connection. Based on an analysis of packet traces collected on eight different networks, we find that it is possible to distinguish the behavior of an application from the observation of the size and the direction of the first few packets of the TCP connection. We apply three techniques to cluster TCP connections: K-Means, Gaussian Mixture Model and spectral clustering. Resulting clusters are used together with assignment and labeling heuristics to design classifiers. We evaluate these classifiers on different packet traces. Our results show that the first four packets of a TCP connection are sufficient to classify known applications with an accuracy over 90% and to identify new applications as unknown with a probability of 60%.
2. Datadog
Monitoring service
Over 350 integrations
Over 1,200 employees
Over 8,000 customers
Runs on millions of hosts
Trillions of data points per day
10,000s of hosts in our infra
10s of Kubernetes clusters
Clusters from 50 to 3000 nodes
Multi-cloud
Very fast growth
4. ● Network proxy running on each Kubernetes node
● Implements the Kubernetes service abstraction
○ Map service Cluster IPs (virtual IPs) to pods
○ Enable ingress traffic
Kube-proxy
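As a sketch of this abstraction, a ClusterIP service could be declared as follows (the names and ports are illustrative, modeled on the echo example used in the next slides):

```shell
# Hypothetical Service: gives pods labeled app=echo a stable virtual IP.
# kube-proxy on every node then load-balances traffic for that IP to the pods.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: echo
spec:
  type: ClusterIP
  selector:
    app: echo
  ports:
  - name: http
    port: 80          # the virtual (service) port
    targetPort: 5000  # the port the pods actually listen on
EOF
```

The Cluster IP assigned to this service is the VIP that the rest of the deck refers to.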
6. Created resources
Deployment ReplicaSet Pod 1
label: app=echo
Pod 2
label: app=echo
Pod 3
label: app=echo
Service
Selector: app=echo
kubectl get all
NAME AGE
deploy/echodeploy 16s
NAME AGE
rs/echodeploy-75dddcf5f6 16s
NAME READY
po/echodeploy-75dddcf5f6-jtjts 1/1
po/echodeploy-75dddcf5f6-r7nmk 1/1
po/echodeploy-75dddcf5f6-zvqhv 1/1
NAME TYPE CLUSTER-IP
svc/echo ClusterIP 10.200.246.139
7. The endpoint object
Deployment ReplicaSet Pod 1
label: app=echo
Pod 2
label: app=echo
Pod 3
label: app=echo
kubectl describe endpoints echo
Name: echo
Namespace: datadog
Labels: app=echo
Annotations: <none>
Subsets:
Addresses: 10.150.4.10,10.150.6.16,10.150.7.10
NotReadyAddresses: <none>
Ports:
Name Port Protocol
---- ---- --------
http 5000 TCP
Endpoints
Addresses:
10.150.4.10
10.150.6.16
10.150.7.10
Service
Selector: app=echo
12. Endpoints
API Server
Node
kubelet pod
HC
Status updates
Controller Manager
Watch
- pods
- services
endpoint
controller
Node
kubelet pod
HC
Sync endpoints:
- list pods matching selector
- add IP to endpoints
ETCD
pods
services
endpoints
13. Accessing services
API Server
Node A
Controller Manager
Watch
- pods
- services
endpoint
controller
Sync endpoints:
- list pods matching selector
- add IP to endpoints
ETCD
pods
services
endpoints
kube-proxy proxier
Watch
- services
- endpoints
14. Accessing services
API Server
Node A
Controller Manager
endpoint
controller
ETCD
pods
services
endpoints
kube-proxy proxier
client Node Bpod 1
Node Cpod 2
19. proxy-mode=userspace
PREROUTING
any / any =>
KUBE-PORTALS-CONTAINER
All traffic is processed by kube chains
KUBE-PORTALS-CONTAINER
any / VIP:PORT => NodeIP:PortX
Portal chain
One rule per service
One port allocated for each service (and bound by kube-proxy)
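These portal rules can be sketched in iptables terms (the service IP and both ports are assumed, not taken from a real cluster). Note that iptables only redirects traffic to a port owned by the kube-proxy process, which then proxies every byte in userland — hence the cost:

```shell
# Userspace mode, reconstructed: all traffic enters the portal chain...
iptables -t nat -A PREROUTING -j KUBE-PORTALS-CONTAINER
# ...and each service VIP:port is redirected to a local port bound by kube-proxy.
iptables -t nat -A KUBE-PORTALS-CONTAINER -d 10.200.246.139/32 -p tcp --dport 80 \
  -j REDIRECT --to-ports 36789
```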
20. ● Performance (userland proxying)
● Source IP cannot be kept
● Not recommended anymore
● Since Kubernetes 1.2, default is iptables
Limitations
22. proxy-mode=iptables
API Server
Node A
kube-proxy iptables
client
Node B
Node C
pod 1
pod 2
Outgoing traffic
1. Client to Service IP
2. DNAT: Client to Pod1 IP
Reverse path
1. Pod1 IP to Client
2. Reverse NAT: Service IP to client
24. proxy-mode=iptables
PREROUTING / OUTPUT
any / any => KUBE-SERVICES
KUBE-SERVICES
any / VIP:PORT => KUBE-SVC-XXX
Global Service chain
Identify service and jump to appropriate service chain
25. proxy-mode=iptables
PREROUTING / OUTPUT
any / any => KUBE-SERVICES
KUBE-SERVICES
any / VIP:PORT => KUBE-SVC-XXX
KUBE-SVC-XXX
any / any proba 33% => KUBE-SEP-AAA
any / any proba 50% => KUBE-SEP-BBB
any / any => KUBE-SEP-CCC
Service chain (one per service)
Uses the iptables statistic module (probability of a rule being applied)
Rules are evaluated sequentially (hence the 33%, 50%, 100%)
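Because the rules are tried in order, uniform balancing over N endpoints needs probability 1/N on the first rule, 1/(N-1) on the second, and so on. A sketch of the chain for three endpoints (chain names as on the slide; the flags are those of the real statistic module):

```shell
# 3 endpoints: match 1/3 of traffic, then 1/2 of the rest, then everything.
iptables -t nat -A KUBE-SVC-XXX -m statistic --mode random --probability 0.33333 -j KUBE-SEP-AAA
iptables -t nat -A KUBE-SVC-XXX -m statistic --mode random --probability 0.50000 -j KUBE-SEP-BBB
iptables -t nat -A KUBE-SVC-XXX -j KUBE-SEP-CCC
```

Each endpoint chain ends up receiving one third of new connections overall.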
26. proxy-mode=iptables
PREROUTING / OUTPUT
any / any => KUBE-SERVICES
KUBE-SERVICES
any / VIP:PORT => KUBE-SVC-XXX
KUBE-SVC-XXX
any / any proba 33% => KUBE-SEP-AAA
any / any proba 50% => KUBE-SEP-BBB
any / any => KUBE-SEP-CCC
KUBE-SEP-AAA
endpoint IP / any => KUBE-MARK-MASQ
any / any => DNAT endpoint IP:Port
Endpoint Chain
Mark hairpin traffic (client = target) for SNAT
DNAT to the endpoint
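An endpoint chain can be sketched like this (the endpoint IP and port come from the echo example; the exact flags are an approximation of what kube-proxy programs):

```shell
# Hairpin case: the packet comes from the endpoint itself, mark it for SNAT.
iptables -t nat -A KUBE-SEP-AAA -s 10.150.4.10/32 -j KUBE-MARK-MASQ
# Everyone else: plain DNAT to the endpoint.
iptables -t nat -A KUBE-SEP-AAA -p tcp -j DNAT --to-destination 10.150.4.10:5000
```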
27. Hairpin traffic?
API Server
Node A
kube-proxy iptables
pod 1
Node B
Node C
pod 2
pod 3
Client can also be a destination
After DNAT:
Src IP= Pod1, Dst IP= Pod1
No reverse NAT possible
=> SNAT on host for this traffic
1. Pod1 IP => SVC IP
2. SNAT: HostIP => SVC IP
3. DNAT: HostIP => Pod1 IP
Reverse path
1. Pod1 IP => Host IP
2. Reverse NAT: SVC IP => Pod1IP
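The SNAT itself happens later, in POSTROUTING. A sketch of how the mark set by KUBE-MARK-MASQ turns into a masquerade (0x4000 is the mark kube-proxy uses by default, but treat the exact values as illustrative):

```shell
# KUBE-MARK-MASQ only tags the packet...
iptables -t nat -A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
# ...and POSTROUTING masquerades anything tagged, rewriting the source to the
# host IP so the reverse path goes back through the NAT.
iptables -t nat -A POSTROUTING -j KUBE-POSTROUTING
iptables -t nat -A KUBE-POSTROUTING -m mark --mark 0x4000/0x4000 -j MASQUERADE
```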
29. Persistency
KUBE-SEP-AAA
endpoint IP / any => KUBE-MARK-MASQ
any / any => DNAT endpoint IP:Port
recent : set rsource KUBE-SEP-AAA
Use “recent” module
Add Source IP to set named KUBE-SEP-AAA
KUBE-SVC-XXX
any / any recent: rcheck KUBE-SEP-AAA => KUBE-SEP-AAA
any / any recent: rcheck KUBE-SEP-BBB => KUBE-SEP-BBB
any / any recent: rcheck KUBE-SEP-CCC => KUBE-SEP-CCC
Load-balancing rules
Use recent module
If Source IP is in set named KUBE-SEP-AAA,
jump to KUBE-SEP-AAA
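In iptables terms, the persistency rules can be sketched like this (chain names from the slide; the 10800s window matches the default sessionAffinity timeout, but treat it as an assumption):

```shell
# Service chain: a client already recorded in the KUBE-SEP-AAA set sticks to it.
iptables -t nat -A KUBE-SVC-XXX -m recent --name KUBE-SEP-AAA --rcheck \
  --seconds 10800 --reap -j KUBE-SEP-AAA
# Endpoint chain: record the client source IP, then DNAT as usual.
iptables -t nat -A KUBE-SEP-AAA -m recent --name KUBE-SEP-AAA --set --rsource
iptables -t nat -A KUBE-SEP-AAA -p tcp -j DNAT --to-destination 10.150.4.10:5000
```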
30. ● iptables was not designed for load-balancing
● hard to debug (rule counts very quickly reach the 10,000s)
● performance impact
○ control plane: syncing rules
○ data plane: going through rules
Limitations
31. “Scale Kubernetes to Support 50,000 Services” Haibin Xie & Quinton Hoole
Kubecon Berlin 2017
34. ● L4 load-balancer built into the Linux kernel
● Many load-balancing algorithms
● Native persistency
● Fast atomic update (add/remove endpoint)
● Still relies on iptables for some use cases (SNAT)
● GA since 1.11
proxy-mode=ipvs
35. proxy-mode=ipvs
Pod X
Pod Y
Pod Z
Service ClusterIP:Port S
Backed by pod:port X, Y, Z
Virtual Server S
Realservers
- X
- Y
- Z
kube-proxy apiserver
Watches endpoints
Updates IPVS
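What kube-proxy programs here can be reproduced by hand with ipvsadm (addresses from the echo example; the scheduler and ports are assumptions):

```shell
# Virtual server S: the service ClusterIP, round-robin scheduling.
ipvsadm -A -t 10.200.246.139:80 -s rr
# Realservers X, Y, Z in NAT (masquerade) mode.
ipvsadm -a -t 10.200.246.139:80 -r 10.150.4.10:5000 -m
ipvsadm -a -t 10.200.246.139:80 -r 10.150.6.16:5000 -m
ipvsadm -a -t 10.200.246.139:80 -r 10.150.7.10:5000 -m
ipvsadm -L -n   # inspect the resulting table
```

Adding or removing a realserver is a single atomic operation, unlike rewriting an iptables chain.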
36. New connection
Pod X
Pod Y
Pod Z
Virtual Server S
Realservers
- X
- Y
- Z
kube-proxy apiserver
IPVS conntrack
App A => S >> A => X
App establishes connection to S
IPVS associates Realserver X
37. Pod X deleted
Pod X
Pod Y
Pod Z
Virtual Server S
Realservers
- Y
- Z
kube-proxy apiserver
IPVS conntrack
App A => S >> A => X
Apiserver removes X from S endpoints
Kube-proxy removes X from realservers
Kernel drops traffic (no realserver)
38. Clean-up
Pod Y
Pod Z
Virtual Server S
Realservers
- Y
- Z
kube-proxy apiserver
IPVS conntrack
App A => S >> A => X
net/ipv4/vs/expire_nodest_conn
Delete conntrack entry on next packet
Forcefully terminate (RST)
RST
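The sysctl referenced on the slide can be toggled directly on a node (requires root):

```shell
# With expire_nodest_conn=1, IPVS expires the connection entry on the next
# packet once the realserver is gone and resets the client (RST) instead of
# silently dropping its traffic.
sysctl -w net.ipv4.vs.expire_nodest_conn=1
```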
39. Limit?
● No graceful termination
● As soon as a pod is Terminating, connections are destroyed
● Addressing this took time
40. Graceful termination (1.12)
Pod X
Pod Y
Pod Z
Virtual Server S
Realservers
- X Weight:0
- Y
- Z
kube-proxy apiserver
IPVS conntrack
App A => S >> A => X
Apiserver removes X from S endpoints
Kube-proxy sets Weight to 0
No new connection
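Setting the weight to 0 can also be done by hand with ipvsadm, which is essentially what kube-proxy does here (addresses assumed from the echo example):

```shell
# Weight 0: no new connections are scheduled to X; established ones continue.
ipvsadm -e -t 10.200.246.139:80 -r 10.150.4.10:5000 -m -w 0
```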
41. Garbage collection
Pod X
Pod Y
Pod Z
Virtual Server S
Realservers
- Y
- Z
kube-proxy apiserver
IPVS conntrack
App A => S >> A => X
Pod exits and sends FIN
Kube-proxy removes realserver when it
has no active connection
FIN
42. What if X doesn’t send FIN?
Pod Y
Pod Z
Virtual Server S
Realservers
- X Weight:0
- Y
- Z
kube-proxy
IPVS conntrack
App A => S >> A => X
Conntrack entries expire after 900s
If A sends traffic, the entry never expires
Traffic blackholed until App detects issue
~15 min
> Mitigation: lower tcp_retries2
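The mitigation can be sketched as a sysctl change (5 is an illustrative value, not a recommendation from the deck):

```shell
# tcp_retries2 bounds how many times unacknowledged data is retransmitted
# before the kernel aborts the connection. The default of 15 corresponds to
# roughly 15 minutes; 5 gives up after tens of seconds with default backoff.
sysctl -w net.ipv4.tcp_retries2=5
```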
43. ● Default timeouts
○ tcp / tcpfin : 900s / 120s
○ udp: 300s
○ Under high load, ports can be reused quickly, keeping entries active
○ Especially with conn_reuse_mode=0
● Very bad for DNS => No graceful termination for UDP
IPVS connection tracking
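The timeouts listed above can be inspected and tuned with ipvsadm (the values shown are the defaults from the slide):

```shell
ipvsadm -L --timeout                 # prints: Timeout (tcp tcpfin udp): ...
ipvsadm --set 900 120 300            # tcp / tcpfin / udp, in seconds
sysctl net.ipv4.vs.conn_reuse_mode   # the port-reuse behavior mentioned above
```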
44. ● Very recent feature: allows for configurable timeouts
● Ideally we’d want:
○ Terminating pod: set weight to 0
○ Deleted pod: remove backend
> This would require an API change for Endpoints
IPVS connection tracking
45. ● Works pretty well at large scale
● Not 100% feature parity (a few edge cases)
● Tuning the IPVS parameters took time
● Improvements under discussion
IPVS status
47. Control plane scalability
● On each update to a service (new pod, readiness change)
○ The full endpoint object is recomputed
○ The endpoint object is sent to every kube-proxy
● Example of a single endpoint change
○ Service with 2000 endpoints, ~100B / endpoint, 5000 nodes
○ Each node gets 2000 x 100B = 200kB
○ Apiservers send 5000 x 200kB = 1GB
● Rolling update?
○ Total traffic => 2000 x 1GB = 2TB
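The fan-out arithmetic above, spelled out (the figures are the slide's example values, not measurements):

```shell
endpoints=2000          # pods behind the service
bytes_per_endpoint=100  # approximate size of one endpoint entry
nodes=5000              # each node runs a kube-proxy watching endpoints

per_node_kb=$(( endpoints * bytes_per_endpoint / 1000 ))  # 200 kB per node per change
per_change_mb=$(( per_node_kb * nodes / 1000 ))           # ~1 GB fan-out per change
rollout_gb=$(( per_change_mb * endpoints / 1000 ))        # ~2 TB for a full rolling update
echo "${per_node_kb}kB ${per_change_mb}MB ${rollout_gb}GB"
```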
48. => Endpoint Slices
● Service potentially mapped to multiple EndpointSlices
● Maximum 100 endpoints in each slice
● Much more efficient for services with many endpoints
● Beta in 1.17
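On a 1.17+ cluster, the slices backing a service can be listed directly (service name from the echo example; the label key is the standard one):

```shell
# Each slice holds at most 100 endpoints, so a large service maps to several.
kubectl get endpointslices -l kubernetes.io/service-name=echo
```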
49. Conntrack size
● kube-proxy relies on the conntrack
○ Not great for services receiving a lot of connections
○ Pretty bad for the cluster DNS
● DNS strategy
○ Use node-local cache and force TCP for upstream
○ Much more manageable
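Conntrack pressure on a node can be checked with two sysctls (output values are node-specific):

```shell
sysctl net.netfilter.nf_conntrack_count  # entries currently tracked
sysctl net.netfilter.nf_conntrack_max    # hard limit; packets are dropped beyond it
```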
54. Conclusion
● kube-proxy works at significant scale
● A lot of ongoing efforts to improve kube-proxy
● iptables and IPVS are not perfect matches for the service abstraction
○ Not designed for client-side load-balancing
● Very promising eBPF alternative: Cilium
○ No need for iptables / IPVS
○ Works without conntrack
○ See Daniel Borkmann’s talk