Container Cluster(s)
[Figure: traffic from the Internet passes through two routers and a tier of load balancers (LB) before reaching the container hosts in each container cluster.]
[Figure: four L4-LB instances in front of the container cluster(s). Each one serves the same VIP, 10.0.0.1:80, and forwards to Destination IP 1 through Destination IP N.]
[Figure: containers on the container hosts of the cluster(s) are continuously started, stopped, and restarted.]
[Figure: four L4-LBs map VIP 10.0.0.1:80 to DIP 1 ... N while the containers behind them start, stop, and restart.]
[Figure: the Internet, router, LB, and container host topology, with containers starting, stopping, and restarting.]
[Figure: a large tier of L4-LB instances sits between the Internet and the container hosts of the container cluster(s).]
Large Scale Load Balancing
High Availability
[Figure: six L4-LB hosts, 1.1.1.1 through 1.1.1.6, each configured with the same VIP, 10.0.0.1.]
IP Advertise
[Figure: each of the six L4-LB hosts, 1.1.1.1 through 1.1.1.6, advertises the VIP 10.0.0.1.]
PATH = Hashing(IP.Src, IP.Dst, IP.Protocol, Port.Src, Port.Dst) mod N
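The path selection formula can be sketched in a few lines of Python. This is a minimal illustration, not a router implementation: `ecmp_path` is a hypothetical helper, SHA-256 stands in for whatever vendor-specific hash a real router uses, and the example flow values are made up.

```python
import hashlib

def ecmp_path(src_ip, dst_ip, protocol, src_port, dst_port, n_paths):
    """Pick one of n_paths equal-cost next hops from the flow 5-tuple."""
    key = f"{src_ip}|{dst_ip}|{protocol}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()  # assumption: stand-in hash
    return int.from_bytes(digest[:8], "big") % n_paths

# Hypothetical TCP flow (protocol 6) towards the VIP, with N = 6 L4-LBs.
flow = ("192.68.0.2", "10.0.0.1", 6, 40000, 80)
path = ecmp_path(*flow, n_paths=6)
print("flow is pinned to path", path)
```

Every packet of the same flow yields the same 5-tuple and therefore the same path, as long as N does not change.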
[Figure: traffic from the Internet is spread over the L4-LB hosts that hold VIP 10.0.0.1. With one L4-LB gone, the path is computed over N - 1 hosts.]
PATH = Hashing() mod (N - 1)
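To get a feel for how many flows change path when the divisor drops from N to N - 1, the following sketch counts them. It is a simplification under stated assumptions: SHA-256 replaces the router's hash, the path index is treated as the identity of the L4-LB, and the 10,000 flows are synthetic.

```python
import hashlib

def flow_hash(flow):
    """Stand-in flow hash (assumption: real routers use their own function)."""
    return int.from_bytes(hashlib.sha256(repr(flow).encode()).digest()[:8], "big")

def remapped_fraction(flows, n_before, n_after):
    """Fraction of flows whose path index differs between the two divisors."""
    moved = sum(1 for f in flows
                if flow_hash(f) % n_before != flow_hash(f) % n_after)
    return moved / len(flows)

# Synthetic flows towards the VIP, differing only in source port.
flows = [("192.68.0.2", "10.0.0.1", 6, 1024 + i, 80) for i in range(10000)]
fraction = remapped_fraction(flows, n_before=6, n_after=5)
print(f"{fraction:.1%} of flows changed path")
```

For a uniform hash, a flow keeps its index only when the hash modulo 30 is below 5, so roughly five out of six flows move, including flows whose original L4-LB is still alive.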
[Figure: the remaining L4-LB hosts, 1.1.1.1, 1.1.1.3, 1.1.1.5, and 1.1.1.6, still serve VIP 10.0.0.1.]
Client 192.68.0.2, Container[0] to Container[3] behind the L4-LB tier:
1. A connection is established with Container[2].
2. A load balancer goes down.
3. ECMP disruption: the packet is forwarded to another L4-LB.
4. That L4-LB sends it to Container[0], which has no idea about the connection with Container[2].
5. Container[0] sends an RST. The connection is closed.
Client 192.68.0.3, Container[0] to Container[3] behind the L4-LB tier:
1. A connection is established with Container[3].
2. A load balancer goes down.
3. ECMP disruption: the packet is forwarded to another L4-LB, although the serving LB is alive.
4. That L4-LB sends it to Container[1], which has no idea about the connection with Container[3].
5. Container[1] sends an RST. The connection is closed.
[Figure: the L4-LB tier between the Internet and the container hosts, annotated with the labels "BGP/ECMP", "High Availability", "Not Reliable", and "Not Scalable".]
[Figure: clients 192.68.0.1 through 192.68.0.4 are mapped with Hashing(IP) % 2 onto Container[0] and Container[1], then with Hashing(IP) % 4 onto Container[0] through Container[3].]
[Figure: a hash ring holding clients 192.68.0.1 through 192.68.0.4 and Container[0] through Container[3], each container appearing twice on the ring. Guaranteed to remap only K/n keys.]
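The K/n property of a hash ring can be checked with a small sketch. The `HashRing` class, the choice of 100 virtual nodes per container, and the synthetic client addresses are all assumptions made for illustration; the slides do not specify any of them.

```python
import bisect
import hashlib

def h(value):
    """Stand-in hash that places a string on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")

class HashRing:
    def __init__(self, nodes, vnodes=100):
        # Each node is placed on the ring at `vnodes` positions.
        self.points = sorted((h(f"{node}#{i}"), node)
                             for node in nodes for i in range(vnodes))
        self.hashes = [point for point, _ in self.points]

    def lookup(self, key):
        # First ring position clockwise from the key, wrapping around.
        idx = bisect.bisect_right(self.hashes, h(key)) % len(self.points)
        return self.points[idx][1]

nodes = [f"Container[{i}]" for i in range(4)]
keys = [f"192.68.{i // 256}.{i % 256}" for i in range(4000)]

before = HashRing(nodes)
after = HashRing(nodes[:-1])  # Container[3] is removed
moved = [k for k in keys if before.lookup(k) != after.lookup(k)]
print(f"{len(moved)} of {len(keys)} keys remapped")
```

Only the keys that were on the removed container move, which is roughly K/n of them; every other key keeps its container.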
[Figure: the same hash ring of clients and containers, labeled "Efficient Load Balancing".]
Consistent Hashing
Backend Selection
Packet Processing
Packet Forwarding
Permutate (hashing)
Backends = 3, table size = 7
Permutation table (3 x 7): each column is the preference list of one backend.

B0  B1  B2
 3   0   3
 0   2   4
 4   4   5
 1   6   6
 5   1   0
 2   3   1
 6   5   2

Lookup_table (size = 7, empty)
Population
Assign backends to the lookup table by preference list, taken from the permutation table (3 x 7).

Lookup_table (size = 7):
[0] [1] [2] [3] [4] [5] [6]
B1  B0  B1  B0  B2  B2  B0

Each backend will receive an almost equal number of connections.
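The permutation and population steps can be reproduced with a short sketch. The (offset, skip) pairs below are inferred from the three columns of the permutation table; in a real Maglev scheduler they would come from hashing each backend, so treat them and the helper names `permutation` and `populate` as illustrative assumptions.

```python
def permutation(offset, skip, size):
    """Preference list of one backend: a permutation of the table slots."""
    return [(offset + j * skip) % size for j in range(size)]

def populate(prefs, size):
    """Backends take turns claiming their next free preferred slot."""
    table = [None] * size
    next_idx = {backend: 0 for backend in prefs}
    filled = 0
    while True:
        for backend, pref in prefs.items():
            # Skip preferred slots that another backend already claimed.
            while table[pref[next_idx[backend]]] is not None:
                next_idx[backend] += 1
            table[pref[next_idx[backend]]] = backend
            next_idx[backend] += 1
            filled += 1
            if filled == size:
                return table

M = 7  # table size, as on the slide
prefs = {
    "B0": permutation(3, 4, M),  # 3 0 4 1 5 2 6
    "B1": permutation(0, 2, M),  # 0 2 4 6 1 3 5
    "B2": permutation(3, 1, M),  # 3 4 5 6 0 1 2
}
lookup = populate(prefs, M)
print(lookup)  # ['B1', 'B0', 'B1', 'B0', 'B2', 'B2', 'B0']
```

Because the backends alternate turns, their slot counts differ by at most one, which is why each backend receives an almost equal share.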
[Figure: two L4-LBs using this lookup table. Out of the 7 entries:]
B0 = 3 x connection
B1 = 2 x connection
B2 = 2 x connection
B1 is removed: re-permutate (hashing) and populate again.
Permutation table (2 x 7):

B0  B2
 3   3
 0   4
 4   5
 1   6
 5   0
 2   1
 6   2

Lookup_table (size = 7), assigned by preference list:

Slot  Before  After
[0]   B1      B0
[1]   B0      B0
[2]   B1      B0
[3]   B0      B0
[4]   B2      B2
[5]   B2      B2
[6]   B0      B2

Slots [1], [3], [4], and [5] keep the same value by hashing. The two B1 slots move to B0, and slot [6] moves from B0 to B2.
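The effect of removing B1 can be recomputed from the preference lists. This is a sketch with a hypothetical `populate` helper; the preference lists are copied from the permutation table columns. Recomputing gives slots [0] and [2] moving from B1 to B0 and, in addition, slot [6] moving from B0 to B2, since Maglev hashing keeps disruption small but does not guarantee the strict minimum.

```python
def populate(prefs, size):
    """Backends take turns claiming their next free preferred slot."""
    table = [None] * size
    nxt = {backend: 0 for backend in prefs}
    filled = 0
    while filled < size:
        for backend, pref in prefs.items():
            while table[pref[nxt[backend]]] is not None:
                nxt[backend] += 1
            table[pref[nxt[backend]]] = backend
            nxt[backend] += 1
            filled += 1
            if filled == size:
                break
    return table

prefs = {
    "B0": [3, 0, 4, 1, 5, 2, 6],
    "B1": [0, 2, 4, 6, 1, 3, 5],
    "B2": [3, 4, 5, 6, 0, 1, 2],
}
before = populate(prefs, 7)
after = populate({b: p for b, p in prefs.items() if b != "B1"}, 7)
changed = [slot for slot in range(7) if before[slot] != after[slot]]

print("before:", before)
print("after: ", after)
print("changed slots:", changed)
```

Four of the seven slots keep their backend, so flows hashing to those slots stay on the same container.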
[Figure: L4-LBs between the Internet and the backends, built from three parts: backend selection with consistent hashing, packet processing, and packet forwarding.]
Backend Selection: FUNCTION
Packet Processing: NETFILTER
Packet Forwarding: NAT, DR, IP Tunneling
[Figure: scheduling modules: Round Robin, Weighted Round Robin, Least Connection, Source Hashing, ... A Maglev Hashing scheduling module is added next to them (Makefile).]
[Figure: disruption % vs. memory usage for lookup table sizes from 251 to 131071.]
[Figure: traffic split between two services: New Service 5%, Old Service 95%.]
[Figure: container hosts with weights W: 10, W: 40, and W: 80. The weight of one host is then changed from 40 to 0.]
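One way to make weights influence a Maglev-style table is to give each backend a number of turns per round proportional to its weight. The sketch below follows that idea and is a simplification, not the kernel's exact code: the host names, the SHA-256 based offsets and skips, and the table size of 251 are assumptions chosen for illustration.

```python
import hashlib
from collections import Counter
from functools import reduce
from math import gcd

def h(text, seed):
    data = f"{seed}:{text}".encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

def weighted_lookup(weights, size):
    """weights: backend -> weight; size: prime table size."""
    active = {b: w for b, w in weights.items() if w > 0}  # weight 0 gets no slot
    table = [None] * size
    if not active:
        return table
    unit = reduce(gcd, active.values())
    turns = {b: w // unit for b, w in active.items()}
    offset = {b: h(b, "offset") % size for b in active}
    skip = {b: h(b, "skip") % (size - 1) + 1 for b in active}
    nxt = {b: 0 for b in active}
    filled = 0
    while True:
        for b in active:
            for _ in range(turns[b]):  # heavier backends claim more slots
                slot = (offset[b] + nxt[b] * skip[b]) % size
                while table[slot] is not None:
                    nxt[b] += 1
                    slot = (offset[b] + nxt[b] * skip[b]) % size
                table[slot] = b
                nxt[b] += 1
                filled += 1
                if filled == size:
                    return table

normal = weighted_lookup({"host-a": 10, "host-b": 40, "host-c": 80}, 251)
drained = weighted_lookup({"host-a": 10, "host-b": 0, "host-c": 80}, 251)
print(Counter(normal))
print(Counter(drained))
```

Setting a weight to 0 removes that host from the table entirely, so it receives no new flows while the other hosts keep their relative shares.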
IPVS 1, VIP: 10.0.0.1:80, scheduler MH
DIP: 172.16.0.1:80
DIP: 172.16.0.2:80
DIP: 172.16.0.3:80
DIP: 172.16.0.10:80

IPVS 2, VIP: 10.0.0.1:80, scheduler MH
DIP: 172.16.0.3:80
DIP: 172.16.0.1:80
DIP: 172.16.0.5:80
DIP: 172.16.0.7:80

Each IPVS builds its own ip_vs_mh_Lookup[] of 10.0.0.1:80 with ip_vs_mh_permutate() and ip_vs_mh_populate().
[Figure: client 192.68.0.2 reaches Container[2] through any of the IPVS MH instances in front of Container[0] to Container[3]. The IPs of containers are added to IPVS dynamically.]
[Figure: IPVS 1 and IPVS 2 serve VIP 10.0.0.1:80 with MH but hold different DIP lists (172.16.0.1, .2, .3, .10 versus 172.16.0.3, .1, .5, .7). Only K/n disruption.]
[Figure: IPVS 1 and IPVS 2 serve VIP 10.0.0.1:80 with MH and hold the same DIP list (172.16.0.1:80, 172.16.0.2:80, 172.16.0.3:80, 172.16.0.10:80), so ip_vs_mh_Lookup[] of 10.0.0.1:80 in IPVS 1 matches the one in IPVS 2. No disruption.]
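The reason two load balancers with the same DIP list cause no disruption is that the lookup table is a pure function of that list. The sketch below builds the table twice, independently, and checks that a flow lands on the same DIP. The helper names, the hash, the table size, and the decision to sort the DIPs (so that insertion order does not matter) are assumptions of this sketch; how the kernel scheduler treats ordering is not covered here.

```python
import hashlib

def h(text, seed):
    data = f"{seed}:{text}".encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

def build_lookup(dips, size=251):
    """Maglev-style table computed only from the DIP list (size must be prime)."""
    dips = sorted(dips)  # assumption: same ordering on every instance
    table = [None] * size
    if not dips:
        return table
    offset = {d: h(d, "offset") % size for d in dips}
    skip = {d: h(d, "skip") % (size - 1) + 1 for d in dips}
    nxt = {d: 0 for d in dips}
    filled = 0
    while filled < size:
        for d in dips:
            slot = (offset[d] + nxt[d] * skip[d]) % size
            while table[slot] is not None:
                nxt[d] += 1
                slot = (offset[d] + nxt[d] * skip[d]) % size
            table[slot] = d
            nxt[d] += 1
            filled += 1
            if filled == size:
                break
    return table

def pick(table, flow):
    """Select a DIP for a flow without any per-connection state."""
    return table[h(repr(flow), "flow") % len(table)]

dips = ["172.16.0.1:80", "172.16.0.2:80", "172.16.0.3:80", "172.16.0.10:80"]
ipvs1 = build_lookup(dips)
ipvs2 = build_lookup(list(reversed(dips)))  # second instance, built separately

flow = ("192.68.0.2", 40000)  # hypothetical client address and port
print(pick(ipvs1, flow), pick(ipvs2, flow))
```

If ECMP moves the flow from one instance to the other mid-connection, the second instance still selects the same DIP even though it has never seen the connection.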
Client 192.68.0.2, Container[0] to Container[3] behind IPVS MH:
1. A connection is established with Container[2].
2. A load balancer goes down.
3. ECMP disruption: the packet is forwarded to another L4-LB.
4. IPVS MH on that L4-LB can forward the packet to the same destination by hashing.
5. The established connection continues.
Client 192.68.0.3, Container[0] to Container[3] behind IPVS MH:
1. A connection is established with Container[3].
2. A load balancer goes down.
3. ECMP disruption: the packet is forwarded to another L4-LB, although the serving LB is alive.
4. IPVS MH on that L4-LB can forward the packet to the same destination by hashing, with no connection info.
5. The established connection continues.
BGP/ECMP
IPVS Maglev Hashing Scheduler
Linux Kernel >= 4.18
Choose M (module) for IP_VS_MH in the kernel menuconfig.
echo 1 > /proc/sys/net/ipv4/vs/sloppy_tcp
echo 2 > /proc/sys/net/ipv4/vs/conn_reuse_mode
Old version: [233] High-Availability Network Load Balancing in Large Container Clusters