SlideShare a Scribd company logo
1 of 29
Download to read offline
SAMSUNG OPEN SOURCE CONFERENCE 2019
SOSCON
Method of NUMA-Aware Resource Management for
Kubernetes 5G NFV Cluster
Samsung Electronics | Samsung Research | Byonggon Chun
10.16, 2019
SOSCON2019BIO
Byonggon Chun
Projects Details
Tizen
(2015~2016)
Tizen Web-Device API development
Iotivity
(2016~2017)
Iotivity development based on OCF 1.0 spec
(Endpoint, Smarthome, etc)
Edge Computing
(2017~2018)
Factory Edge Computing PoC
(Based on EdgeX, DDS)
FaaS based Home Edge Computing PoC
(Based on Greengrass Core, OSS FaaS, etc)
5G MEC
(2018~2019)
5G MEC PoC
(Based on LF Akraino, Openstak-helm, ETSI MEC Standard)
Container-based
NFV Infra
(2019~)
NUMA-aware Resource Manager for CNF(PoC)
(CPU, Memory, Hugepages)
Opensource Contribution~
(Kubernetes, Docker, Containerd)
SAMSUNG OPEN SOURCE CONFERENCE 2019
SOSCON2019
Background
Deep dive into Kubernetes at the node level
How Kubernetes supports NUMA
Kubernetes Contribution
01
02
03
04
Agenda
SOSCON2019Background
Major benefits of Network Function Containerization
• Faster startup speed(quick to deploy)
• Lower performance overhead(no overhead from guest kernel)
Cyclictest Benchmark with generic kernel
VM Docker Native
6532
391
90
39 38
Startuptimeinms
App
GuestOS
VM Docker Native
39422
13 8
LatencyinuSec
Startup Benchmark with generic kernel
Cyclictest Benchmark source: Minimizing Latency of Real-TimeContainer Cloud for Software Radio Access Networks, IEEE CloudCom,2015
SOSCON2019Background
Virtual Machine vs Container
Host Kernel
Hypervisor
(KVM/QEMU)
Guest Kernel
Bins/Libs
User Apps
Host Kernel
Container Runtime
(namespaces/cgroups)
VM Container
Difference between VM and Container
• Q. So…is Container a new kind of Virtual Machine without kernel emulating?
• A. Nope, you should know about “Linux namespaces” and “Linux control groups”.
Guest Kernel
Bins/Libs
User Apps
SOSCON2019Background
What is Container?
• The concept of container is lightweight mechanism to provide isolated environment.
• Processes are “isolated by linux namespaces”.
• The resource usage is “restricted by linux cgroup”.
• So the most of containers share host kernel.
• Sometimes containers running on the isolated kernel similar to virtual machine.
(kata-runtime, gvisor, etc)
Host Kernel
Host Space
The fundamental concept of container
Isolated Space
Processes Processes
Container
SOSCON2019Deep dive into Kuburnetes at the node level
The structure of Kubernetes is straightforward.
• Kubernetes consists of master components(APIs, scheduler, etc)
and node components(kubelet, container-runtime, mandatory-services).
Overall architecture of Kubernetes
API Server etcd Scheduler Controller Manager
Master
Kubelet Proxy
Node
Pod
Pod
Pod
Pod
Kubelet Proxy
Node
Pod
Pod
Pod
Pod
…
SOSCON2019Deep dive into Kuburnetes at the node level
Let’s tear down Pod.
• Pod is usually known as the basic execution unit or smallest deployable unit in Kubernetes.
• Let’s see the Pod at the point of namespaces and cgroup.
The concept of Pod
Pod
Infra Container
Container foo
Container goo
Container hoo
provide
network, ipc
namespaces
for containers
pod-cgroup
├── container-foo-cgroup
│ └── control-files
├── container-goo-cgroup
│ └── control-files
├── container-hoo-cgroup
│ └── control-files
├── infra-container-cgroup
│ └── control-files
└── control-files Reousce limits
are set on
cgroup.
SOSCON2019Deep dive into Kuburnetes at the node level
• Kubelet communicates with container runtimes over CRI.
• CRI is developed for loosely coupled structure between kubelet and container runtimes.
(But Kubelet still communicates with docker over dockershim which is part of kubelet)
• CRI offers set of gRPC APIs and protobuf messages for pod/container lifecycle management.
(CRI runtime runs CRI runtime service server, kubelet is client)
Let’s talk about kubelet and container runtime.
source: Kubernetes Blog, Introducing Container Runtime Interface (CRI) in Kubernetes
CRI RuntimeKubelet
GRPC/CRI
RunPodSandbox
CreateContainer
StartContainer
...
Docker,
containerd,
CRI-O
…
The concept of CRI
SOSCON2019Deep dive into Kuburnetes at the node level
What is OCI and OCI compliant runtime?
source: OCI Runtime Specification
• OCI(Open Container Initiative) offers “image-spec” and “runtime-spec” as open industry standards.
• Image-spec specifies image format for “OCI Runtime bundle” which is set of files.
• Runtime-spec defines the concept of runtime bundle and configuration & lifecycle of a container.
• OCI compliant runtime means runtime which can run “OCI Runtime bundle”.
(opencontainers/runc is known as the iconicOCI runtime and reference implementation.)
The concept of OCI image spec and runtime spec
OCI Runtime bundleOCI Image OCI Container
Extraction Execution
SOSCON2019Deep dive into Kuburnetes at the node level
Now we can draw clear picture with CRI and OCI runtime.
CRI RuntimeKubelet
CRI
Lifecycle related CRI APIs and OCI Runtime event
OCI Runtime
OCI
RunPodSandbox,
CreateContainer,
UpdateContainerResources,
StartContainer,
StopContainer,
RemoveContainer,
StopPodSandbox,
RemovePodSandbox,
…
State,
Create,
Start,
Kill,
Delete
source: Container Runtime Interface, OCI Runtime Specification
OCI Container
SOSCON2019Deep dive into Kuburnetes at the node level
Now we can draw clear picture with CRI and OCI runtime.
CRI RuntimeKubelet
CRI
List of CRI and OCI Runtimes
OCI Runtime
OCI
docker,
containerd,
CRI-O,
rkt,
frakti,
singularity-cri,
…
runc,
crun,
gVisor,
kata-runtimes,
nabula,
firecracker-runtime,
singularity,
…
OCI Container
SOSCON2019Deep dive into Kuburnetes at the node level
But in the real world, there is a “shim”.
Kubelet/dockershim dockerd/containerd runc
CRI-plugin/ContainerdKubelet runc
containerd-shim-v2
containerd-shim-v2
CRI-OKubelet runc
CRI
CRI
CRI
3 ways to runc
OCI
OCI
OCI
shim
shim
SOSCON2019Deep dive into Kuburnetes at the node level
But in the real world, there is a “shim”.
CRI-plugin/ContainerdKubelet containerd-shim-v2
CRI-OKubelet Kata-runtime
CRI
CRI
OCI
OCI
containerd-shim-v2
Kata-runtime
2 ways to Kata-runtime
shim
shim
SOSCON2019Deep dive into Kuburnetes at the node level
Do we have to know all of this for resource management?
CRI RuntimeKubelet
CRI
Sequence of pod and container creation
OCI Runtime
OCI
Create a Pod level
cgroup, then set
resource restriction
for a pod.
OCI Container
Create a container
level cgroup, then
set resource
restriction for a
container. Lastly, run
container on
dedicated cgroup.
• It is required to know how to manage resources at the low level.
(to use Node Allocatable Feature, and Resource Managers in Kubernetes like CPU manager.)
• It is required to know to run hardware accelerated application like DPDK with low level resource management.
• In the case of kata-container with KVM/QEMU, the way to manage resources is little bit different.
SOSCON2019How Kubernetes supports NUMA
What is NUMA?
• NUMA(Non-Uniform Memory Access) is modern style architecture for multi processors.
• Each socket(NUMA node) has own CPU Processor, Memory, PCI Devices.
(Typically, one socket equal to one NUMA node.)
• Processor is able to access remote memory and I/O devices on other sockets.
(But the remote access of resources shows performance decrement)
Typical 2 sockets configuration of Intel Xeon
CPU CPU
3 x PCIe(16x) 3 x PCIe(16x)
UPI
6 x DDR4 6 x DDR4
SOSCON2019How Kubernetes supports NUMA
When NUMA aware resource allocation is required?
• NUMA aware resource allocation should be made for following applications.
• Latency-sensitive applications such as real-time AR/VR and game streaming.
• Hardware acceleration based applications such as DPDK and CUDA.
Socket 0 Socket 1
136
194
LatencyinnSec
DPDK l2fwd Throughput with 10Gbps NIC
(Intel Xeon Scalable Gold 6148)
Memory access latency from socket 0
(Intel Xeon E7-4800)
aligned misaligned
9.9
7.9
ThroughputinGbps
Latency test source: Memory Latencies on Intel® Xeon® Processor E5-4600 and E7-4800 product families,Intel
SOSCON2019How Kubernetes supports NUMA
CPU Pinning in Kubernetes
• CPU pinning allows exclusive usage of CPUs for process or thread.
• CPU Manager in Kubernetes responsible for allocating logical threads(SMT) to containers.
(CPU Manager attempts to allocate siblingthreads to containers, when siblings are available.)
• CPU Manager allocates exclusive CPUs using CPUSET cgroup controller.
(It is possible to adjust container’s cpu affinity at thread level by “sched_setaffintiy”.)
• Alternative(Intel CMK) also available.
(Both solutions and NTM are contributed by Intel.)
Comparison between CPU Manager and Intel CMK
Solution Part of
Kubelet
Approach Allowed
CPUSET
NUMA Support Node Allocatable
Feature
Node Topology
Manager
CPU Manager Yes cgroup
(CPUSET)
Allocated CPUs
only
CPU, I/O Devices, etc
over NTM
supported supported
Intel CMK No(Plugin) sched_setaffinity
subprocess
Entire CPUs on
machine
CPU Only Not supported Not supported
SOSCON2019How Kubernetes supports NUMA
How it Works: CPU Manager
apiVersion: v1
kind: Pod
metadata:
name: dpdk-sample
spec:
containers:
- image: dpdk-sample
name: simple-l2fwd
resources:
requests:
cpu: "4"
memory: "1Gi"
hugepages-1Gi: "2Gi"
limits:
cpu: "4"
memory: "1Gi"
hugepages-1Gi: "2Gi"
cat <container-cgroup>.cpuset.cpus
1-2,41-42
Socket 0 Socket 1
-------- --------
Core 0 [0, 40] [20, 60]
Core 1 [1, 41] [21, 61]
Core 2 [2, 42] [22, 62]
Core 3 [3, 43] [23, 63]
Core 4 [4, 44] [24, 64]
Core 5 [5, 45] [25, 65]
Core 6 [6, 46] [26, 66]
Core 7 [7, 47] [27, 67]
Core 8 [8, 48] [28, 68]
…
Yaml example for CPU pinning CPU Layout(2socket, SMT enabled)
Allocated CPUs for container
“sched_setaffinity” usage in DPDK
(pin pThread3 to lCore42)
Process A
pThread0 pThread1
pThread2 pThread3
lCore2lCore1 lCore41 lCore42
Thread_creation
├─pthread_create
├─pthread_setname_np
└─pthread_setaffinity_np
└─ sched_setaffinity(tid, cpuset)
SOSCON2019How Kubernetes supports NUMA
Resource Manager and Plugins in Kubernetes
• Device Manager
(Component of Kubelet, advertises/allocates extendedresources.)
• Device Plugins
(nvidia-gpu-plugin,amd-gpu-plugin, gpu-sharing-plugin, sr-iov-plugin, rdma-device-plugin,etc)
Sequence of extended resource allocation in Kubernetes
Device Plugin
Register
Device Manager
ListWatch
Allocate
Normally, plugins are
gRPC server and
containerized.
Kubelet
PodAdmitHandler
SOSCON2019How Kubernetes supports NUMA
The concept of Topology Manager
• Topology Manager provides the way of NUMA-aware resource allocation for containers at the node level.
• Topology Manager retrieves Topology Hint from Hint Providers
• Topology Manager calculates NUMA node affinity then judges whether admit pod or not by given policy.
(pod admission will be rejected, if chosen policycannot be satisfied.)
Sequence of Pod admission with Topology Manager
Topology ManagerKubelet
Hints
Admit()
HintProviders
GetTopologyHints()
SyncPod()
admitorrejectPod
CPU Manager,
Device Manager,
…
SOSCON2019How Kubernetes supports NUMA
What is Topology Hint and Topology Policy?
• Topology Hint is data structure to represent NUMA nodes of allocable resources as bits.
• Topology Manager collects hints then merges the hints to find best hint.
(Policies share the same merging algorithm in 1.16, each policywill have own one in future release)
• Each policy has own pod admission criteria.
Policy Description
none Do nothing, Topology Manager will not working.
best-effort Calculate best hint then just use it whatever it is
restricted* Reject pod admission if best hint is not preferred hint
single-numa* Reject pod admission if best hint does not fit to single
NUMA nodeTopology Hint Structure
Topology Policies
//TopologyHint is a struct containing the NUMANodeAffinity for a Container
type TopologyHint struct {
NUMANodeAffinity bitmask.BitMask
// Preferred is set to true when the NUMANodeAffinity encodes a preferred
// allocation for the Container. It is set to false otherwise.
Preferred bool
}
SOSCON2019How Kubernetes supports NUMA
How it Works: Topology Manager (w/single-numa policy)
apiVersion: v1
kind: Pod
metadata:
name: ntm-sample
spec:
containers:
- image: simple-sample
name: simple-sample
resources:
requests:
cpu: “4"
memory: "1Gi"
nvidia.com/gpu: 1
limits:
cpu: “4"
memory: "1Gi"
nvidia.com/gpu: 1
Yaml example for NTM
Test Case Resource
availability at
scheduler level
Available Resources
on Socket 0
Available
Resources
on Socket 1
Expected Result
Positive Case 1 CPU: 20, GPU: 4 CPU: 10, GPU: 2 CPU: 10, GPU: 2 Socket0,
Socket1
Positive Case 2 CPU: 20, GPU: 2 CPU: 10, GPU: 2 CPU: 10, GPU: 0 Socket0
Positive Case 3 CPU: 7, GPU: 3 CPU: 3, GPU: 2 CPU: 4, GPU: 1 Socket1
Negative Case 1 CPU: 13, GPU: 2 CPU: 3 , GPU 2: CPU: 10, GPU: 0 Admit Rejected
Negative Case 2 CPU: 6, GPU: 4 CPU: 3 , GPU 2: CPU: 3 , GPU 2: Admit Rejected
PodAdmit TestCase
SOSCON2019How Kubernetes supports NUMA
Issues(in 1.16)
Issue Description
Kubernetes/Issues/#83476* Unreliable Topology Hint generation when multiple containers in the same pod require alignment.
Kubernetes/PR/#83697 Topology Manager wouldn’t allow pod admit with single-numa policy when any of hint providers had
no NUMA preferences.
(Merged)
Kubernetes/PR/#83492 Topology Manager supports only guaranteed QoS class.
(Merged)
Kubernetes/Issue/#83483 To support “inter-device” topology contstraints(i.e. GPU-direct, Nvlink, RDMA)
Kubernetes/Issues/#83478 Same affinity calculation algorithm for various policies.
(Refactoring has been already started.)
TBD Alignment is limited at the container level, Topology Manager doesn't support Pod level alignment.
SOSCON2019How Kubernetes supports NUMA
Helpful Links
Title Link
Cgroup https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt
CPU Manager KEP https://github.com/kubernetes/community/blob/master/contributors/design-
proposals/node/cpu-manager.md
Device Manager KEP https://github.com/kubernetes/community/blob/master/contributors/design-
proposals/resource-management/device-plugin.md
Topology Manager KEP https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/0035-20190130-
topology-manager.md
CPU Manager Guide https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/
Topology Manager Guide https://kubernetes.io/docs/tasks/administer-cluster/topology-manager/
Kubelet
(Container Manager)
https://github.com/kubernetes/kubernetes/tree/master/pkg/kubelet/cm
SOSCON2019Kubernetes Contribution
Special Interest Groups(SIGs) are open to new contributors
source: https://github.com/kubernetes/community
SOSCON2019Kubernetes Contribution
Hugepages Enhancement
But…What is hugepages?
• Hugepages are literally page which has huge size, typical Linux machine supports two page sizes(2MB, 1GB).
(Default page size is 4kb)
• The concept of hugepages is reducing TLB miss to reduce memory access latency.
(Hugepages also allow high utilization of hardware cache by reducing PageTable Entries.)
• DPDK and Database are usually known as applications which consumes hugepages.
(DPDK is Data Plane Development Kit for packet processing.)
• Kubernetes supports to consume pre-allocated hugepages but it does not support NUMA
and container isolation of hugepages.
SOSCON2019Kubernetes Contribution
Hugepages Enhancement
What is the goal of hugepages enhancement?
• Support container isolation of hugepages
• Support multi size hugepages at host and container level.
• Support NUMA aware hugepages management.
SOSCON2019
SAMSUNG OPEN SOURCE CONFERENCE 2019
THANK YOU

More Related Content

What's hot

[D20] 高速Software Switch/Router 開発から得られた高性能ソフトウェアルータ・スイッチ活用の知見 (July Tech Fest...
[D20] 高速Software Switch/Router 開発から得られた高性能ソフトウェアルータ・スイッチ活用の知見 (July Tech Fest...[D20] 高速Software Switch/Router 開発から得られた高性能ソフトウェアルータ・スイッチ活用の知見 (July Tech Fest...
[D20] 高速Software Switch/Router 開発から得られた高性能ソフトウェアルータ・スイッチ活用の知見 (July Tech Fest...Tomoya Hibi
 
Accelerating Envoy and Istio with Cilium and the Linux Kernel
Accelerating Envoy and Istio with Cilium and the Linux KernelAccelerating Envoy and Istio with Cilium and the Linux Kernel
Accelerating Envoy and Istio with Cilium and the Linux KernelThomas Graf
 
Zebra SRv6 CLI on Linux Dataplane (ENOG#49)
Zebra SRv6 CLI on Linux Dataplane (ENOG#49)Zebra SRv6 CLI on Linux Dataplane (ENOG#49)
Zebra SRv6 CLI on Linux Dataplane (ENOG#49)Kentaro Ebisawa
 
Ieee nfv-sdn-2020-srv6-tutorial
Ieee nfv-sdn-2020-srv6-tutorialIeee nfv-sdn-2020-srv6-tutorial
Ieee nfv-sdn-2020-srv6-tutorialStefano Salsano
 
High Performance Networking with DPDK & Multi/Many Core
High Performance Networking with DPDK & Multi/Many CoreHigh Performance Networking with DPDK & Multi/Many Core
High Performance Networking with DPDK & Multi/Many Coreslankdev
 
Beginners: Open RAN Terminology – Virtualization, Disaggregation & Decomposition
Beginners: Open RAN Terminology – Virtualization, Disaggregation & DecompositionBeginners: Open RAN Terminology – Virtualization, Disaggregation & Decomposition
Beginners: Open RAN Terminology – Virtualization, Disaggregation & Decomposition3G4G
 
VXLAN and FRRouting
VXLAN and FRRoutingVXLAN and FRRouting
VXLAN and FRRoutingFaisal Reza
 
NGINX High-performance Caching
NGINX High-performance CachingNGINX High-performance Caching
NGINX High-performance CachingNGINX, Inc.
 
Linux Networking Explained
Linux Networking ExplainedLinux Networking Explained
Linux Networking ExplainedThomas Graf
 
Part 10: 5G Use cases - 5G for Absolute Beginners
Part 10: 5G Use cases - 5G for Absolute BeginnersPart 10: 5G Use cases - 5G for Absolute Beginners
Part 10: 5G Use cases - 5G for Absolute Beginners3G4G
 
FD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingFD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingKernel TLV
 
EBPF and Linux Networking
EBPF and Linux NetworkingEBPF and Linux Networking
EBPF and Linux NetworkingPLUMgrid
 
DPDK: Multi Architecture High Performance Packet Processing
DPDK: Multi Architecture High Performance Packet ProcessingDPDK: Multi Architecture High Performance Packet Processing
DPDK: Multi Architecture High Performance Packet ProcessingMichelle Holley
 
LF_DPDK17_Serverless DPDK - How SmartNIC resident DPDK Accelerates Packet Pro...
LF_DPDK17_Serverless DPDK - How SmartNIC resident DPDK Accelerates Packet Pro...LF_DPDK17_Serverless DPDK - How SmartNIC resident DPDK Accelerates Packet Pro...
LF_DPDK17_Serverless DPDK - How SmartNIC resident DPDK Accelerates Packet Pro...LF_DPDK
 
2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific Dashboard2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific DashboardCeph Community
 

What's hot (20)

Open v ran
Open v ranOpen v ran
Open v ran
 
[D20] 高速Software Switch/Router 開発から得られた高性能ソフトウェアルータ・スイッチ活用の知見 (July Tech Fest...
[D20] 高速Software Switch/Router 開発から得られた高性能ソフトウェアルータ・スイッチ活用の知見 (July Tech Fest...[D20] 高速Software Switch/Router 開発から得られた高性能ソフトウェアルータ・スイッチ活用の知見 (July Tech Fest...
[D20] 高速Software Switch/Router 開発から得られた高性能ソフトウェアルータ・スイッチ活用の知見 (July Tech Fest...
 
Accelerating Envoy and Istio with Cilium and the Linux Kernel
Accelerating Envoy and Istio with Cilium and the Linux KernelAccelerating Envoy and Istio with Cilium and the Linux Kernel
Accelerating Envoy and Istio with Cilium and the Linux Kernel
 
Zebra SRv6 CLI on Linux Dataplane (ENOG#49)
Zebra SRv6 CLI on Linux Dataplane (ENOG#49)Zebra SRv6 CLI on Linux Dataplane (ENOG#49)
Zebra SRv6 CLI on Linux Dataplane (ENOG#49)
 
Ieee nfv-sdn-2020-srv6-tutorial
Ieee nfv-sdn-2020-srv6-tutorialIeee nfv-sdn-2020-srv6-tutorial
Ieee nfv-sdn-2020-srv6-tutorial
 
High Performance Networking with DPDK & Multi/Many Core
High Performance Networking with DPDK & Multi/Many CoreHigh Performance Networking with DPDK & Multi/Many Core
High Performance Networking with DPDK & Multi/Many Core
 
Beginners: Open RAN Terminology – Virtualization, Disaggregation & Decomposition
Beginners: Open RAN Terminology – Virtualization, Disaggregation & DecompositionBeginners: Open RAN Terminology – Virtualization, Disaggregation & Decomposition
Beginners: Open RAN Terminology – Virtualization, Disaggregation & Decomposition
 
VXLAN and FRRouting
VXLAN and FRRoutingVXLAN and FRRouting
VXLAN and FRRouting
 
NGINX High-performance Caching
NGINX High-performance CachingNGINX High-performance Caching
NGINX High-performance Caching
 
Linux Networking Explained
Linux Networking ExplainedLinux Networking Explained
Linux Networking Explained
 
NIDD (Non-IP Data Delivery) のご紹介
NIDD (Non-IP Data Delivery) のご紹介NIDD (Non-IP Data Delivery) のご紹介
NIDD (Non-IP Data Delivery) のご紹介
 
Part 10: 5G Use cases - 5G for Absolute Beginners
Part 10: 5G Use cases - 5G for Absolute BeginnersPart 10: 5G Use cases - 5G for Absolute Beginners
Part 10: 5G Use cases - 5G for Absolute Beginners
 
FD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingFD.IO Vector Packet Processing
FD.IO Vector Packet Processing
 
EBPF and Linux Networking
EBPF and Linux NetworkingEBPF and Linux Networking
EBPF and Linux Networking
 
DPDK: Multi Architecture High Performance Packet Processing
DPDK: Multi Architecture High Performance Packet ProcessingDPDK: Multi Architecture High Performance Packet Processing
DPDK: Multi Architecture High Performance Packet Processing
 
DPDK & Cloud Native
DPDK & Cloud NativeDPDK & Cloud Native
DPDK & Cloud Native
 
LF_DPDK17_Serverless DPDK - How SmartNIC resident DPDK Accelerates Packet Pro...
LF_DPDK17_Serverless DPDK - How SmartNIC resident DPDK Accelerates Packet Pro...LF_DPDK17_Serverless DPDK - How SmartNIC resident DPDK Accelerates Packet Pro...
LF_DPDK17_Serverless DPDK - How SmartNIC resident DPDK Accelerates Packet Pro...
 
eBPF/XDP
eBPF/XDP eBPF/XDP
eBPF/XDP
 
2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific Dashboard2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific Dashboard
 
Notes on NUMA architecture
Notes on NUMA architectureNotes on NUMA architecture
Notes on NUMA architecture
 

Similar to Method of NUMA-Aware Resource Management for Kubernetes 5G NFV Cluster

DevNetCreate - ACI and Kubernetes Integration
DevNetCreate - ACI and Kubernetes IntegrationDevNetCreate - ACI and Kubernetes Integration
DevNetCreate - ACI and Kubernetes IntegrationHank Preston
 
”Bare-Metal Container" presented at HPCC2016
”Bare-Metal Container" presented at HPCC2016”Bare-Metal Container" presented at HPCC2016
”Bare-Metal Container" presented at HPCC2016Kuniyasu Suzaki
 
Comparison of existing cni plugins for kubernetes
Comparison of existing cni plugins for kubernetesComparison of existing cni plugins for kubernetes
Comparison of existing cni plugins for kubernetesAdam Hamsik
 
Kubernetes presentation
Kubernetes presentationKubernetes presentation
Kubernetes presentationGauranG Bajpai
 
Get you Java application ready for Kubernetes !
Get you Java application ready for Kubernetes !Get you Java application ready for Kubernetes !
Get you Java application ready for Kubernetes !Anthony Dahanne
 
4. CNCF kubernetes Comparison of-existing-cni-plugins-for-kubernetes
4. CNCF kubernetes Comparison of-existing-cni-plugins-for-kubernetes4. CNCF kubernetes Comparison of-existing-cni-plugins-for-kubernetes
4. CNCF kubernetes Comparison of-existing-cni-plugins-for-kubernetesJuraj Hantak
 
Kubernetes for java developers - Tutorial at Oracle Code One 2018
Kubernetes for java developers - Tutorial at Oracle Code One 2018Kubernetes for java developers - Tutorial at Oracle Code One 2018
Kubernetes for java developers - Tutorial at Oracle Code One 2018Anthony Dahanne
 
Docker and kubernetes
Docker and kubernetesDocker and kubernetes
Docker and kubernetesDongwon Kim
 
kata-containers-onboarding-deck.pptx
kata-containers-onboarding-deck.pptxkata-containers-onboarding-deck.pptx
kata-containers-onboarding-deck.pptxQforQA
 
OSDC 2016 | rkt and Kubernetes: What’s new with Container Runtimes and Orches...
OSDC 2016 | rkt and Kubernetes: What’s new with Container Runtimes and Orches...OSDC 2016 | rkt and Kubernetes: What’s new with Container Runtimes and Orches...
OSDC 2016 | rkt and Kubernetes: What’s new with Container Runtimes and Orches...NETWAYS
 
OSDC 2016 - rkt and Kubernentes what's new with Container Runtimes and Orches...
OSDC 2016 - rkt and Kubernentes what's new with Container Runtimes and Orches...OSDC 2016 - rkt and Kubernentes what's new with Container Runtimes and Orches...
OSDC 2016 - rkt and Kubernentes what's new with Container Runtimes and Orches...NETWAYS
 
1. CNCF kubernetes meetup - Ondrej Sika
1. CNCF kubernetes meetup - Ondrej Sika1. CNCF kubernetes meetup - Ondrej Sika
1. CNCF kubernetes meetup - Ondrej SikaJuraj Hantak
 
Kubernetes @ Squarespace: Kubernetes in the Datacenter
Kubernetes @ Squarespace: Kubernetes in the DatacenterKubernetes @ Squarespace: Kubernetes in the Datacenter
Kubernetes @ Squarespace: Kubernetes in the DatacenterKevin Lynch
 
OpenEBS hangout #4
OpenEBS hangout #4OpenEBS hangout #4
OpenEBS hangout #4OpenEBS
 
Seminar Accelerating Business Using Microservices Architecture in Digital Age...
Seminar Accelerating Business Using Microservices Architecture in Digital Age...Seminar Accelerating Business Using Microservices Architecture in Digital Age...
Seminar Accelerating Business Using Microservices Architecture in Digital Age...PT Datacomm Diangraha
 
Container & kubernetes
Container & kubernetesContainer & kubernetes
Container & kubernetesTed Jung
 
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageWebinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageMayaData Inc
 
DevJam 2019 - Introduction to Kubernetes
DevJam 2019 - Introduction to KubernetesDevJam 2019 - Introduction to Kubernetes
DevJam 2019 - Introduction to KubernetesRonny Trommer
 
Container Attached Storage (CAS) with OpenEBS - Berlin Kubernetes Meetup - Ma...
Container Attached Storage (CAS) with OpenEBS - Berlin Kubernetes Meetup - Ma...Container Attached Storage (CAS) with OpenEBS - Berlin Kubernetes Meetup - Ma...
Container Attached Storage (CAS) with OpenEBS - Berlin Kubernetes Meetup - Ma...OpenEBS
 

Similar to Method of NUMA-Aware Resource Management for Kubernetes 5G NFV Cluster (20)

Kubernetes
KubernetesKubernetes
Kubernetes
 
DevNetCreate - ACI and Kubernetes Integration
DevNetCreate - ACI and Kubernetes IntegrationDevNetCreate - ACI and Kubernetes Integration
DevNetCreate - ACI and Kubernetes Integration
 
”Bare-Metal Container" presented at HPCC2016
”Bare-Metal Container" presented at HPCC2016”Bare-Metal Container" presented at HPCC2016
”Bare-Metal Container" presented at HPCC2016
 
Comparison of existing cni plugins for kubernetes
Comparison of existing cni plugins for kubernetesComparison of existing cni plugins for kubernetes
Comparison of existing cni plugins for kubernetes
 
Kubernetes presentation
Kubernetes presentationKubernetes presentation
Kubernetes presentation
 
Get you Java application ready for Kubernetes !
Get you Java application ready for Kubernetes !Get you Java application ready for Kubernetes !
Get you Java application ready for Kubernetes !
 
4. CNCF kubernetes Comparison of-existing-cni-plugins-for-kubernetes
4. CNCF kubernetes Comparison of-existing-cni-plugins-for-kubernetes4. CNCF kubernetes Comparison of-existing-cni-plugins-for-kubernetes
4. CNCF kubernetes Comparison of-existing-cni-plugins-for-kubernetes
 
Kubernetes for java developers - Tutorial at Oracle Code One 2018
Kubernetes for java developers - Tutorial at Oracle Code One 2018Kubernetes for java developers - Tutorial at Oracle Code One 2018
Kubernetes for java developers - Tutorial at Oracle Code One 2018
 
Docker and kubernetes
Docker and kubernetesDocker and kubernetes
Docker and kubernetes
 
kata-containers-onboarding-deck.pptx
kata-containers-onboarding-deck.pptxkata-containers-onboarding-deck.pptx
kata-containers-onboarding-deck.pptx
 
OSDC 2016 | rkt and Kubernetes: What’s new with Container Runtimes and Orches...
OSDC 2016 | rkt and Kubernetes: What’s new with Container Runtimes and Orches...OSDC 2016 | rkt and Kubernetes: What’s new with Container Runtimes and Orches...
OSDC 2016 | rkt and Kubernetes: What’s new with Container Runtimes and Orches...
 
OSDC 2016 - rkt and Kubernentes what's new with Container Runtimes and Orches...
OSDC 2016 - rkt and Kubernentes what's new with Container Runtimes and Orches...OSDC 2016 - rkt and Kubernentes what's new with Container Runtimes and Orches...
OSDC 2016 - rkt and Kubernentes what's new with Container Runtimes and Orches...
 
1. CNCF kubernetes meetup - Ondrej Sika
1. CNCF kubernetes meetup - Ondrej Sika1. CNCF kubernetes meetup - Ondrej Sika
1. CNCF kubernetes meetup - Ondrej Sika
 
Kubernetes @ Squarespace: Kubernetes in the Datacenter
Kubernetes @ Squarespace: Kubernetes in the DatacenterKubernetes @ Squarespace: Kubernetes in the Datacenter
Kubernetes @ Squarespace: Kubernetes in the Datacenter
 
OpenEBS hangout #4
OpenEBS hangout #4OpenEBS hangout #4
OpenEBS hangout #4
 
Seminar Accelerating Business Using Microservices Architecture in Digital Age...
Seminar Accelerating Business Using Microservices Architecture in Digital Age...Seminar Accelerating Business Using Microservices Architecture in Digital Age...
Seminar Accelerating Business Using Microservices Architecture in Digital Age...
 
Container & kubernetes
Container & kubernetesContainer & kubernetes
Container & kubernetes
 
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageWebinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
 
DevJam 2019 - Introduction to Kubernetes
DevJam 2019 - Introduction to KubernetesDevJam 2019 - Introduction to Kubernetes
DevJam 2019 - Introduction to Kubernetes
 
Container Attached Storage (CAS) with OpenEBS - Berlin Kubernetes Meetup - Ma...
Container Attached Storage (CAS) with OpenEBS - Berlin Kubernetes Meetup - Ma...Container Attached Storage (CAS) with OpenEBS - Berlin Kubernetes Meetup - Ma...
Container Attached Storage (CAS) with OpenEBS - Berlin Kubernetes Meetup - Ma...
 

Recently uploaded

Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 

Recently uploaded (20)

Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 

Method of NUMA-Aware Resource Management for Kubernetes 5G NFV Cluster

  • 1. SAMSUNG OPEN SOURCE CONFERENCE 2019 SOSCON Method of NUMA-Aware Resource Management for Kubernetes 5G NFV Cluster Samsung Electronics | Samsung Research | Byonggon Chun 10.16, 2019
  • 2. SOSCON2019BIO Byonggon Chun Projects Details Tizen (2015~2016) Tizen Web-Device API development Iotivity (2016~2017) Iotivity development based on OCF 1.0 spec (Endpoint, Smarthome, etc) Edge Computing (2017~2018) Factory Edge Computing PoC (Based on EdgeX, DDS) FaaS based Home Edge Computing PoC (Based on Greengrass Core, OSS FaaS, etc) 5G MEC (2018~2019) 5G MEC PoC (Based on LF Akraino, Openstak-helm, ETSI MEC Standard) Container-based NFV Infra (2019~) NUMA-aware Resource Manager for CNF(PoC) (CPU, Memory, Hugepages) Opensource Contribution~ (Kubernetes, Docker, Containerd)
  • 3. SAMSUNG OPEN SOURCE CONFERENCE 2019 SOSCON2019 Background Deep dive into Kubernetes at the node level How Kubernetes supports NUMA Kubernetes Contribution 01 02 03 04 Agenda
  • 4. SOSCON2019Background Major benefits of Network Function Containerization • Faster startup speed(quick to deploy) • Lower performance overhead(no overhead from guest kernel) Cyclictest Benchmark with generic kernel VM Docker Native 6532 391 90 39 38 Startuptimeinms App GuestOS VM Docker Native 39422 13 8 LatencyinuSec Startup Benchmark with generic kernel Cyclictest Benchmark source: Minimizing Latency of Real-TimeContainer Cloud for Software Radio Access Networks, IEEE CloudCom,2015
  • 5. SOSCON2019Background Virtual Machine vs Container Host Kernel Hypervisor (KVM/QEMU) Guest Kernel Bins/Libs User Apps Host Kernel Container Runtime (namespaces/cgroups) VM Container Difference between VM and Container • Q. So…is Container a new kind of Virtual Machine without kernel emulating? • A. Nope, you should know about “Linux namespaces” and “Linux control groups”. Guest Kernel Bins/Libs User Apps
  • 6. SOSCON2019Background What is Container? • The concept of container is lightweight mechanism to provide isolated environment. • Processes are “isolated by linux namespaces”. • The resource usage is “restricted by linux cgroup”. • So the most of containers share host kernel. • Sometimes containers running on the isolated kernel similar to virtual machine. (kata-runtime, gvisor, etc) Host Kernel Host Space The fundamental concept of container Isolated Space Processes Processes Container
  • 7. SOSCON2019Deep dive into Kuburnetes at the node level The structure of Kubernetes is straightforward. • Kubernetes consists of master components(APIs, scheduler, etc) and node components(kubelet, container-runtime, mandatory-services). Overall architecture of Kubernetes API Server etcd Scheduler Controller Manager Master Kubelet Proxy Node Pod Pod Pod Pod Kubelet Proxy Node Pod Pod Pod Pod …
  • 8. SOSCON2019Deep dive into Kuburnetes at the node level Let’s tear down Pod. • Pod is usually known as the basic execution unit or smallest deployable unit in Kubernetes. • Let’s see the Pod at the point of namespaces and cgroup. The concept of Pod Pod Infra Container Container foo Container goo Container hoo provide network, ipc namespaces for containers pod-cgroup ├── container-foo-cgroup │ └── control-files ├── container-goo-cgroup │ └── control-files ├── container-hoo-cgroup │ └── control-files ├── infra-container-cgroup │ └── control-files └── control-files Reousce limits are set on cgroup.
  • 9. SOSCON2019Deep dive into Kuburnetes at the node level • Kubelet communicates with container runtimes over CRI. • CRI is developed for loosely coupled structure between kubelet and container runtimes. (But Kubelet still communicates with docker over dockershim which is part of kubelet) • CRI offers set of gRPC APIs and protobuf messages for pod/container lifecycle management. (CRI runtime runs CRI runtime service server, kubelet is client) Let’s talk about kubelet and container runtime. source: Kubernetes Blog, Introducing Container Runtime Interface (CRI) in Kubernetes CRI RuntimeKubelet GRPC/CRI RunPodSandbox CreateContainer StartContainer ... Docker, containerd, CRI-O … The concept of CRI
  • 10. SOSCON2019Deep dive into Kuburnetes at the node level What is OCI and OCI compliant runtime? source: OCI Runtime Specification • OCI(Open Container Initiative) offers “image-spec” and “runtime-spec” as open industry standards. • Image-spec specifies image format for “OCI Runtime bundle” which is set of files. • Runtime-spec defines the concept of runtime bundle and configuration & lifecycle of a container. • OCI compliant runtime means runtime which can run “OCI Runtime bundle”. (opencontainers/runc is known as the iconicOCI runtime and reference implementation.) The concept of OCI image spec and runtime spec OCI Runtime bundleOCI Image OCI Container Extraction Execution
  • 11. SOSCON2019Deep dive into Kuburnetes at the node level Now we can draw clear picture with CRI and OCI runtime. CRI RuntimeKubelet CRI Lifecycle related CRI APIs and OCI Runtime event OCI Runtime OCI RunPodSandbox, CreateContainer, UpdateContainerResources, StartContainer, StopContainer, RemoveContainer, StopPodSandbox, RemovePodSandbox, … State, Create, Start, Kill, Delete source: Container Runtime Interface, OCI Runtime Specification OCI Container
  • 12. SOSCON2019Deep dive into Kuburnetes at the node level Now we can draw clear picture with CRI and OCI runtime. CRI RuntimeKubelet CRI List of CRI and OCI Runtimes OCI Runtime OCI docker, containerd, CRI-O, rkt, frakti, singularity-cri, … runc, crun, gVisor, kata-runtimes, nabula, firecracker-runtime, singularity, … OCI Container
  • 13. SOSCON2019Deep dive into Kuburnetes at the node level But in the real world, there is a “shim”. Kubelet/dockershim dockerd/containerd runc CRI-plugin/ContainerdKubelet runc containerd-shim-v2 containerd-shim-v2 CRI-OKubelet runc CRI CRI CRI 3 ways to runc OCI OCI OCI shim shim
  • 14. SOSCON2019Deep dive into Kuburnetes at the node level But in the real world, there is a “shim”. CRI-plugin/ContainerdKubelet containerd-shim-v2 CRI-OKubelet Kata-runtime CRI CRI OCI OCI containerd-shim-v2 Kata-runtime 2 ways to Kata-runtime shim shim
  • 15. SOSCON2019Deep dive into Kuburnetes at the node level Do we have to know all of this for resource management? CRI RuntimeKubelet CRI Sequence of pod and container creation OCI Runtime OCI Create a Pod level cgroup, then set resource restriction for a pod. OCI Container Create a container level cgroup, then set resource restriction for a container. Lastly, run container on dedicated cgroup. • It is required to know how to manage resources at the low level. (to use Node Allocatable Feature, and Resource Managers in Kubernetes like CPU manager.) • It is required to know to run hardware accelerated application like DPDK with low level resource management. • In the case of kata-container with KVM/QEMU, the way to manage resources is little bit different.
  • 16. SOSCON2019How Kubernetes supports NUMA What is NUMA? • NUMA(Non-Uniform Memory Access) is modern style architecture for multi processors. • Each socket(NUMA node) has own CPU Processor, Memory, PCI Devices. (Typically, one socket equal to one NUMA node.) • Processor is able to access remote memory and I/O devices on other sockets. (But the remote access of resources shows performance decrement) Typical 2 sockets configuration of Intel Xeon CPU CPU 3 x PCIe(16x) 3 x PCIe(16x) UPI 6 x DDR4 6 x DDR4
  • 17. SOSCON2019How Kubernetes supports NUMA When NUMA aware resource allocation is required? • NUMA aware resource allocation should be made for following applications. • Latency-sensitive applications such as real-time AR/VR and game streaming. • Hardware acceleration based applications such as DPDK and CUDA. Socket 0 Socket 1 136 194 LatencyinnSec DPDK l2fwd Throughput with 10Gbps NIC (Intel Xeon Scalable Gold 6148) Memory access latency from socket 0 (Intel Xeon E7-4800) aligned misaligned 9.9 7.9 ThroughputinGbps Latency test source: Memory Latencies on Intel® Xeon® Processor E5-4600 and E7-4800 product families,Intel
  • 18. SOSCON2019How Kubernetes supports NUMA CPU Pinning in Kubernetes • CPU pinning allows exclusive usage of CPUs for process or thread. • CPU Manager in Kubernetes responsible for allocating logical threads(SMT) to containers. (CPU Manager attempts to allocate siblingthreads to containers, when siblings are available.) • CPU Manager allocates exclusive CPUs using CPUSET cgroup controller. (It is possible to adjust container’s cpu affinity at thread level by “sched_setaffintiy”.) • Alternative(Intel CMK) also available. (Both solutions and NTM are contributed by Intel.) Comparison between CPU Manager and Intel CMK Solution Part of Kubelet Approach Allowed CPUSET NUMA Support Node Allocatable Feature Node Topology Manager CPU Manager Yes cgroup (CPUSET) Allocated CPUs only CPU, I/O Devices, etc over NTM supported supported Intel CMK No(Plugin) sched_setaffinity subprocess Entire CPUs on machine CPU Only Not supported Not supported
  • 19. SOSCON2019How Kubernetes supports NUMA How it Works: CPU Manager apiVersion: v1 kind: Pod metadata: name: dpdk-sample spec: containers: - image: dpdk-sample name: simple-l2fwd resources: requests: cpu: "4" memory: "1Gi" hugepages-1Gi: "2Gi" limits: cpu: "4" memory: "1Gi" hugepages-1Gi: "2Gi" cat <container-cgroup>.cpuset.cpus 1-2,41-42 Socket 0 Socket 1 -------- -------- Core 0 [0, 40] [20, 60] Core 1 [1, 41] [21, 61] Core 2 [2, 42] [22, 62] Core 3 [3, 43] [23, 63] Core 4 [4, 44] [24, 64] Core 5 [5, 45] [25, 65] Core 6 [6, 46] [26, 66] Core 7 [7, 47] [27, 67] Core 8 [8, 48] [28, 68] … Yaml example for CPU pinning CPU Layout(2socket, SMT enabled) Allocated CPUs for container “sched_setaffinity” usage in DPDK (pin pThread3 to lCore42) Process A pThread0 pThread1 pThread2 pThread3 lCore2lCore1 lCore41 lCore42 Thread_creation ├─pthread_create ├─pthread_setname_np └─pthread_setaffinity_np └─ sched_setaffinity(tid, cpuset)
  • 20. SOSCON2019How Kubernetes supports NUMA Resource Manager and Plugins in Kubernetes • Device Manager (Component of Kubelet, advertises/allocates extendedresources.) • Device Plugins (nvidia-gpu-plugin,amd-gpu-plugin, gpu-sharing-plugin, sr-iov-plugin, rdma-device-plugin,etc) Sequence of extended resource allocation in Kubernetes Device Plugin Register Device Manager ListWatch Allocate Normally, plugins are gRPC server and containerized. Kubelet PodAdmitHandler
  • 21. SOSCON2019How Kubernetes supports NUMA The concept of Topology Manager • Topology Manager provides the way of NUMA-aware resource allocation for containers at the node level. • Topology Manager retrieves Topology Hint from Hint Providers • Topology Manager calculates NUMA node affinity then judges whether admit pod or not by given policy. (pod admission will be rejected, if chosen policycannot be satisfied.) Sequence of Pod admission with Topology Manager Topology ManagerKubelet Hints Admit() HintProviders GetTopologyHints() SyncPod() admitorrejectPod CPU Manager, Device Manager, …
  • 22. SOSCON2019How Kubernetes supports NUMA What is Topology Hint and Topology Policy? • Topology Hint is data structure to represent NUMA nodes of allocable resources as bits. • Topology Manager collects hints then merges the hints to find best hint. (Policies share the same merging algorithm in 1.16, each policywill have own one in future release) • Each policy has own pod admission criteria. Policy Description none Do nothing, Topology Manager will not working. best-effort Calculate best hint then just use it whatever it is restricted* Reject pod admission if best hint is not preferred hint single-numa* Reject pod admission if best hint does not fit to single NUMA nodeTopology Hint Structure Topology Policies //TopologyHint is a struct containing the NUMANodeAffinity for a Container type TopologyHint struct { NUMANodeAffinity bitmask.BitMask // Preferred is set to true when the NUMANodeAffinity encodes a preferred // allocation for the Container. It is set to false otherwise. Preferred bool }
  • 23. SOSCON2019How Kubernetes supports NUMA How it Works: Topology Manager (w/single-numa policy) apiVersion: v1 kind: Pod metadata: name: ntm-sample spec: containers: - image: simple-sample name: simple-sample resources: requests: cpu: “4" memory: "1Gi" nvidia.com/gpu: 1 limits: cpu: “4" memory: "1Gi" nvidia.com/gpu: 1 Yaml example for NTM Test Case Resource availability at scheduler level Available Resources on Socket 0 Available Resources on Socket 1 Expected Result Positive Case 1 CPU: 20, GPU: 4 CPU: 10, GPU: 2 CPU: 10, GPU: 2 Socket0, Socket1 Positive Case 2 CPU: 20, GPU: 2 CPU: 10, GPU: 2 CPU: 10, GPU: 0 Socket0 Positive Case 3 CPU: 7, GPU: 3 CPU: 3, GPU: 2 CPU: 4, GPU: 1 Socket1 Negative Case 1 CPU: 13, GPU: 2 CPU: 3 , GPU 2: CPU: 10, GPU: 0 Admit Rejected Negative Case 2 CPU: 6, GPU: 4 CPU: 3 , GPU 2: CPU: 3 , GPU 2: Admit Rejected PodAdmit TestCase
  • 24. SOSCON2019How Kubernetes supports NUMA Issues(in 1.16) Issue Description Kubernetes/Issues/#83476* Unreliable Topology Hint generation when multiple containers in the same pod require alignment. Kubernetes/PR/#83697 Topology Manager wouldn’t allow pod admit with single-numa policy when any of hint providers had no NUMA preferences. (Merged) Kubernetes/PR/#83492 Topology Manager supports only guaranteed QoS class. (Merged) Kubernetes/Issue/#83483 To support “inter-device” topology contstraints(i.e. GPU-direct, Nvlink, RDMA) Kubernetes/Issues/#83478 Same affinity calculation algorithm for various policies. (Refactoring has been already started.) TBD Alignment is limited at the container level, Topology Manager doesn't support Pod level alignment.
  • 25. SOSCON2019How Kubernetes supports NUMA Helpful Links Title Link Cgroup https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt CPU Manager KEP https://github.com/kubernetes/community/blob/master/contributors/design- proposals/node/cpu-manager.md Device Manager KEP https://github.com/kubernetes/community/blob/master/contributors/design- proposals/resource-management/device-plugin.md Topology Manager KEP https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/0035-20190130- topology-manager.md CPU Manager Guide https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/ Topology Manager Guide https://kubernetes.io/docs/tasks/administer-cluster/topology-manager/ Kubelet (Container Manager) https://github.com/kubernetes/kubernetes/tree/master/pkg/kubelet/cm
  • 26. SOSCON2019Kubernetes Contribution Special Interest Groups(SIGs) are open to new contributors source: https://github.com/kubernetes/community
  • 27. SOSCON2019Kubernetes Contribution Hugepages Enhancement But…What is hugepages? • Hugepages are literally page which has huge size, typical Linux machine supports two page sizes(2MB, 1GB). (Default page size is 4kb) • The concept of hugepages is reducing TLB miss to reduce memory access latency. (Hugepages also allow high utilization of hardware cache by reducing PageTable Entries.) • DPDK and Database are usually known as applications which consumes hugepages. (DPDK is Data Plane Development Kit for packet processing.) • Kubernetes supports to consume pre-allocated hugepages but it does not support NUMA and container isolation of hugepages.
  • 28. SOSCON2019Kubernetes Contribution Hugepages Enhancement What is the goal of hugepages enhancement? • Support container isolation of hugepages • Support multi size hugepages at host and container level. • Support NUMA aware hugepages management.
  • 29. SOSCON2019 SAMSUNG OPEN SOURCE CONFERENCE 2019 THANK YOU