IPVS 101
HungWei Chiu
HungWei Chiu
•MTS @ ONF
•SDNDS-TW/CNTUG
•Linux/Network/Container/
Kubernetes
•Kubernetes Courses @Hiskio
Agenda
Introduction
Demonstration
Implementation
Integration with Kubernetes
Compare IPVS with IPTABLES
LVS
Linux Virtual Server
A highly scalable and highly available server built on a cluster of real servers, with the load
balancer running on the Linux operating system.
Users interact with the cluster as if it were a single server.
Development
IPVS
KTCPVS
...etc
LVS
[Figure: the user sends a request to the load-balancer on a Linux host, which forwards the request to the cluster of real servers (Server1, Server2, Server3, ... Server N).]
LVS
[Figure: the same topology, annotated with the labels "Service" and "Servers" on top of the load-balancer and the cluster of real servers.]
LVS
[Figure: a master LVS Linux host and a backup LVS Linux host kept in sync; user requests pass through LVS to the cluster of real servers (Server1, Server2, Server3, ... Server N).]
IPVS
IP Virtual Server
Implements transport-layer (Layer 4) load balancing inside the Linux kernel.
Tool: ipvsadm
Terminology
Load-Balancer -> Services + Scheduling algorithm
Real Servers -> Real Servers
IPVS
Forwarding method
Network Address Translation (NAT), IP Tunneling (TUN), Direct Routing (DR)
Scheduling
Round-Robin (weighted or unweighted)
Least-Connection (weighted or unweighted)
Source/Destination Hashing
...etc
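To make two of the schedulers above concrete, here is a minimal Python sketch of least-connection and source-hashing selection. It is a simplified model, not the kernel code: the connection counts are made up, and crc32 is only a stand-in for whatever hash the kernel module uses.

```python
import zlib

def least_connection(active, weights=None):
    """Pick the server with the fewest active connections per unit of weight."""
    weights = weights or {name: 1 for name in active}
    # min() keeps the first server when several are tied.
    return min(active, key=lambda name: active[name] / weights[name])

def source_hash(client_ip, servers):
    """Map a client IP to a fixed server so repeat requests land on the same one."""
    # crc32 is an assumption for illustration, not the kernel's hash function.
    return servers[zlib.crc32(client_ip.encode()) % len(servers)]

# Hypothetical connection counts per real server.
active = {"Server1": 4, "Server2": 1, "Server3": 3}
print(least_connection(active))  # Server2
# With weights, Server1 (4/8 = 0.5) beats Server2 (1/1) and Server3 (3/3).
print(least_connection(active, {"Server1": 8, "Server2": 1, "Server3": 3}))  # Server1

servers = ["Server1", "Server2", "Server3"]
print(source_hash("192.168.0.5", servers) == source_hash("192.168.0.5", servers))  # True
```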
IPVS
[Figure: several users send requests to the load-balancer on the Linux host; the scheduler algorithm (RR) picks Server 1 from the cluster of real servers (Server1, Server2, Server3).]
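Round-robin simply hands each new request to the next server in the list. A minimal Python sketch, with the server names taken from the slide:

```python
from itertools import cycle

def round_robin(servers):
    """Yield servers in order, wrapping around forever."""
    return cycle(servers)

scheduler = round_robin(["Server1", "Server2", "Server3"])
picks = [next(scheduler) for _ in range(4)]
print(picks)  # ['Server1', 'Server2', 'Server3', 'Server1']
```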
IPVS
[Figure: several users send requests to the load-balancer on the Linux host; the scheduler algorithm (WRR) with weights 1:2:1 picks Server 2 from the cluster of real servers (Server1, Server2, Server3).]
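The 1:2:1 weighting can be sketched in Python as below. This follows the classic weighted round-robin scheme described in the LVS documentation (lower a weight threshold by the GCD of the weights on each pass); it is an illustration only and has not been checked line by line against the current kernel module.

```python
from collections import Counter
from functools import reduce
from itertools import islice
from math import gcd

def weighted_round_robin(weights):
    """Yield server names so that each appears in proportion to its weight."""
    names = list(weights)
    step = reduce(gcd, weights.values())
    max_weight = max(weights.values())
    if max_weight <= 0:
        raise ValueError("at least one server needs a positive weight")
    index, threshold = -1, 0
    while True:
        index = (index + 1) % len(names)
        if index == 0:
            # Start of a new pass: lower the threshold, wrapping back to the maximum.
            threshold -= step
            if threshold <= 0:
                threshold = max_weight
        if weights[names[index]] >= threshold:
            yield names[index]

weights = {"Server1": 1, "Server2": 2, "Server3": 1}
picks = list(islice(weighted_round_robin(weights), 8))
print(picks[:4])       # ['Server2', 'Server1', 'Server2', 'Server3']
print(Counter(picks))  # Server2 is picked 4 times, Server1 and Server3 twice each
```

Server 2 is selected first, and over any four consecutive picks it receives two of the four requests.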
IPVS
[Figure: the same WRR example (weights 1:2:1, Server 2 selected), now highlighting the packet forwarding step that follows the scheduler algorithm.]
Tools
ipvsadm
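-A (add service) + -s (scheduler)
tcp (-t), udp (-u), sctp (--sctp-service), followed by ip:port
-a (add server) + -r (server ip:port) + [-g|-m|-i]
NAT (masquerading, -m), Tunnel (IPIP, -i), Direct Routing (gatewaying, -g)
-D (delete service) / -d (delete server)
-L (list services and servers)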
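The flags above combine as in the following Python sketch, which only builds and prints the command lines (running them needs root and the ip_vs module). The helper names and the real-server addresses (172.18.0.x) are hypothetical; the virtual address is the one used later in the demo.

```python
import shlex

# Flag tables taken from the ipvsadm options listed above.
PROTOCOL_FLAGS = {"tcp": "-t", "udp": "-u"}
FORWARDING_FLAGS = {"dr": "-g", "nat": "-m", "tunnel": "-i"}

def add_service(vip, port, scheduler="rr", protocol="tcp"):
    """Command that creates a virtual service with the given scheduler."""
    return ["ipvsadm", "-A", PROTOCOL_FLAGS[protocol], f"{vip}:{port}", "-s", scheduler]

def add_server(vip, port, real_ip, real_port, method="nat", protocol="tcp"):
    """Command that attaches one real server to an existing virtual service."""
    return ["ipvsadm", "-a", PROTOCOL_FLAGS[protocol], f"{vip}:{port}",
            "-r", f"{real_ip}:{real_port}", FORWARDING_FLAGS[method]]

commands = [add_service("172.17.8.111", 80)]
# Hypothetical container addresses for the three Nginx servers.
for real_ip in ("172.18.0.2", "172.18.0.3", "172.18.0.4"):
    commands.append(add_server("172.17.8.111", 80, real_ip, 80))
commands.append(["ipvsadm", "-L", "-n"])  # list the table with numeric addresses

for command in commands:
    print(shlex.join(command))
```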
Demonstration
Linux (Ubuntu 18.04)
Required packages
Kernel modules
ip_vs.ko
ip_vs_rr.ko
ip_vs_xx.ko
Userspace tool: ipvsadm
sudo apt-get install ipvsadm
[Figure: one Linux host running three Docker/Nginx containers, each with its own volume (volume/server1, volume/server2, volume/server3), plus the IPVS server where curl and ipvsadm are run.]
[Figure: the same Linux host split into user space (the three Docker/Nginx containers Server1, Server2, Server3, and ipvsadm) and kernel space (the IPVS functions and the Linux network stack).]
Commands
Question
ipvsadm -A -t 172.17.8.111:80
Does the service IP (172.17.8.111) need to exist on any network interface?
Demo
Create IPVS with a fake address
Create a dummy interface
Bind the fake IP address to the dummy interface
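The three demo steps map roughly to the commands below. This Python sketch only prints them as a dry run; the interface name dummy0, the /32 prefix, and the rr scheduler are assumptions, not values taken from the demo.

```python
def dummy_interface_plan(vip, port, interface="dummy0"):
    """Return the shell commands for the demo, in the order of the slide."""
    return [
        # 1. Create the IPVS service on an address no interface owns yet.
        f"ipvsadm -A -t {vip}:{port} -s rr",
        # 2. Create a dummy interface.
        f"ip link add {interface} type dummy",
        # 3. Put the fake address on the dummy interface and bring it up.
        f"ip addr add {vip}/32 dev {interface}",
        f"ip link set {interface} up",
    ]

for command in dummy_interface_plan("172.17.8.111", 80):
    print(command)
```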
Demo
Why does it matter?
Kubernetes Service
ClusterIP (virtual IP)
That's also how IPVS works in Kubernetes
Implementation
ipvsadm -> Userspace tool
Talk to kernel
Kernel module
ip_vs.ko -> main function
ip_vs_xx.ko -> scheduler
Location
net/netfilter/ipvs
Based on netfilter
Kernel Module
ipvsadm tries to load the core module (ip_vs.ko).
The kernel code then tries to load the scheduler modules.
net/netfilter/ipvs/ip_vs_sched.c
ip_vs_scheduler_get
Per network namespace
Workflow (ipvsadm -A)
[Figure: User space: ipvsadm -> check module -> load module -> setsockopt. Kernel space: handler function -> check scheduler -> load module -> create service.]
Workflow (ipvsadm -a)
[Figure: User space: ipvsadm -> check module -> load module -> setsockopt. Kernel space: handler function -> check service -> create real servers.]
ipvs
Hook functions on netfilter
6 hook functions
LOCAL_IN * 2
LOCAL_OUT * 2
FORWARD * 2
nf_hook_ops
[Figure: position of the six ip_vs hook functions on the netfilter hooks, shown alongside the mangle and filter tables.]
Load-balancing
Two types of load-balancer
Client <----> LB <----> Server (Nginx)
Two connections
Client <----LB-----> Server (IPVS, IPTABLES)
One connection.
Just changes the packet header
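The second type can be illustrated with a small Python model of NAT-style forwarding: the balancer never terminates the connection, it only rewrites addresses in the packet header and remembers the choice per client. The class, the packet fields, and all addresses except the virtual one are hypothetical.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: str

class HeaderRewritingBalancer:
    """Toy model: one end-to-end connection, only the header changes."""

    def __init__(self, vip, real_servers):
        self.vip = vip
        self.real_servers = real_servers
        self.connections = {}  # client address -> chosen real server
        self.next_index = 0

    def forward(self, packet):
        """Client -> server: rewrite only the destination address."""
        if packet.src not in self.connections:
            chosen = self.real_servers[self.next_index % len(self.real_servers)]
            self.connections[packet.src] = chosen
            self.next_index += 1
        return replace(packet, dst=self.connections[packet.src])

    def reply(self, packet):
        """Server -> client: restore the virtual address as the source."""
        return replace(packet, src=self.vip)

lb = HeaderRewritingBalancer("172.17.8.111:80", ["10.0.0.2:80", "10.0.0.3:80"])
request = Packet("192.168.0.5:40000", "172.17.8.111:80", "GET /")
forwarded = lb.forward(request)
print(forwarded.dst)      # 10.0.0.2:80
print(forwarded.payload)  # GET /
response = lb.reply(Packet(forwarded.dst, forwarded.src, "200 OK"))
print(response.src)       # 172.17.8.111:80
```

A proxy such as Nginx would instead accept the client connection itself and open a second connection to the server.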
Kubernetes
Demo environment
Vagrant + Kubeadm + kubeadm_config
Kube-proxy -> ipvs (default is iptables)
Service
iptables
[Figure: packets enter from PREROUTING or OUTPUT and jump to KUBE-SERVICES. If the protocol (module: TCP/UDP) and clusterIP match, they jump to KUBE-SVC-XXX. If the random module returns true, they jump to a KUBE-SEP-XXX chain, which is how the endpoint is chosen, and DNAT is applied there.]
Random
Endpoint1..Endpoint4: 10.244.1.3, 10.244.2.32, 10.244.3.63, 10.244.3.23
Rule 1: if P < 0.25 -> Endpoint1, overall P = 1/4 = 0.25
Rule 2: if P < 0.33 -> Endpoint2, overall P = 3/4 * 1/3 = 0.25
Rule 3: if P < 0.5 -> Endpoint3, overall P = 3/4 * 2/3 * 1/2 = 0.25
Otherwise -> Endpoint4, overall P = 1 - 0.75 = 0.25
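The arithmetic above generalizes to any number of endpoints: rule i matches with probability 1/(n - i) among the packets that reached it, which gives every endpoint an overall share of 1/n. A short Python check using exact fractions (the function names are illustrative only):

```python
from fractions import Fraction

def rule_probabilities(n):
    """Per-rule match probability for n endpoints: 1/n, 1/(n-1), ..., 1."""
    return [Fraction(1, n - i) for i in range(n)]

def overall_probabilities(n):
    """Chance that a packet ends up at each endpoint after walking the rules."""
    remaining = Fraction(1)  # share of packets that reached the current rule
    result = []
    for p in rule_probabilities(n):
        result.append(remaining * p)
        remaining *= 1 - p
    return result

print([str(p) for p in rule_probabilities(4)])     # ['1/4', '1/3', '1/2', '1']
print([str(p) for p in overall_probabilities(4)])  # ['1/4', '1/4', '1/4', '1/4']
```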
Workflow
[Figure: the same chain walk for a UDP packet. dest=10.96.121.30 (the clusterIP) is unchanged through PREROUTING, KUBE-SERVICES, and KUBE-SVC-XXX; after DNAT in KUBE-SEP-XXX the packet carries dest=10.244.0.3.]
IPVS
IPVS as load-balancer
ClusterIP -> 1 IPVS record
NodePort -> n IPVS records (n = # of NICs)
Use IPSET to minimize the number of iptables rules
SNAT
IPSET
IP+SET
Matches on L3 + L4 header fields
Lookup by hash
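The benefit of a hashed set over a rule list can be sketched in Python: a list of rules is walked one by one, while a set answers with a single hashed membership test. The (ip, protocol, port) tuples and the 10.96.0.x addresses are made up for the illustration.

```python
def match_with_rules(rules, key):
    """iptables style: walk the rules in order until one matches."""
    for checked, rule in enumerate(rules, start=1):
        if rule == key:
            return True, checked
    return False, len(rules)

def match_with_set(entries, key):
    """ipset style: one hashed membership test, whatever the set size."""
    return key in entries

# 200 hypothetical service entries keyed by (ip, protocol, port).
rules = [(f"10.96.0.{i}", "tcp", 80) for i in range(1, 201)]
entries = set(rules)

key = ("10.96.0.200", "tcp", 80)
print(match_with_rules(rules, key))  # (True, 200): every rule was checked
print(match_with_set(entries, key))  # True
```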
Example
Dummy interface
Debug
Dynamic Debug
/sys/kernel/debug/dynamic_debug/control
echo 'module ip_vs +fp' > /sys/.../control
Debug Level (Disabled by default)
Need to recompile the kernel module
/proc/sys/net/ipv4/vs/debug_level
echo to modify the debug level
IMO, IPVS is hard to debug, even harder than iptables.
One more thing
COSCUP
Cloud Native Hub
Telegram: https://t.me/cntug
Github: https://github.com/cloud-native-taiwan/meetups
MyBlog: https://www.hwchiu.com
Q&A
