How VXLAN works on Linux

Etsuji Nakai, Cloud Solutions Architect at Google
Basic mechanism and application to OpenStack and Docker
中井悦司 / Etsuji Nakai
Senior Solution Architect and Cloud Evangelist
Red Hat K.K.
v1.1 2015/07/09
$ who am i
■ 中井悦司 / Etsuji Nakai
– Twitter @enakai00
– Senior Solution Architect and Cloud Evangelist at Red Hat.
– The author of some OpenStack books.
Contents
■ VXLAN basics
■ OpenStack Neutron OVS Plugin
■ VTEP implementation with Flannel
■ References
VXLAN basics
The objective of VXLAN
■ Create a virtual L2 network over a physical L3 network.
[Figure: VXLAN switches in Tokyo, Osaka, and Fukuoka, shown in a physical view and in a logical view from the servers. Labels: 10.1.0.0/16; 10.1.1.0, 10.1.2.0, 10.1.3.0.]
Packet encapsulation with VXLAN header
■ VXLAN encapsulates an L2 packet inside an L3 packet.
[Figure: the VXLAN switch in Tokyo (xx.xx.xx.xx) wraps the original packet with a VXLAN header, source address xx.xx.xx.xx, and destination address yy.yy.yy.yy, then sends it to the VXLAN switch in Osaka (yy.yy.yy.yy), which recovers the original packet.]
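To make the encapsulation step concrete, here is a minimal Python sketch of the 8-byte VXLAN header defined in RFC 7348 (a flags byte carrying the "VNI valid" bit, a 24-bit VNI, and reserved bits). The function names are illustrative assumptions, not part of any library, and the outer UDP/IP headers are assumed to be added by whatever socket sends the result. The last helper estimates the overlay MTU over IPv4, assuming an inner Ethernet header without a VLAN tag.

```python
import struct
from typing import Tuple

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag in RFC 7348: the VNI field is valid

# Bytes added to each inner IP packet (assumption: IPv4 underlay, untagged frames)
INNER_ETHERNET = 14
VXLAN_HEADER = 8
OUTER_UDP = 8
OUTER_IPV4 = 20


def vxlan_encap(vni: int, inner_frame: bytes) -> bytes:
    """Prepend a VXLAN header to an inner Ethernet frame."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 0: flags (8 bits) + reserved (24 bits)
    # Word 1: VNI (24 bits) + reserved (8 bits)
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def vxlan_decap(payload: bytes) -> Tuple[int, bytes]:
    """Split a VXLAN payload into (VNI, inner Ethernet frame)."""
    if len(payload) < VXLAN_HEADER:
        raise ValueError("payload shorter than a VXLAN header")
    word0, word1 = struct.unpack("!II", payload[:VXLAN_HEADER])
    if not (word0 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return word1 >> 8, payload[VXLAN_HEADER:]


def overlay_mtu(underlay_mtu: int = 1500) -> int:
    """MTU left for the inner IP packet after VXLAN encapsulation."""
    overhead = INNER_ETHERNET + VXLAN_HEADER + OUTER_UDP + OUTER_IPV4
    return underlay_mtu - overhead
```

With a 1500-byte underlay MTU, `overlay_mtu()` gives 1450, which matches the `mtu 1450` reported for `flannel.1` in the `ip -d l show` output later in the deck.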
The fundamental problem of L2 over L3
■ How does the switch find the correct location of the packet's destination?
[Figure: the same encapsulation diagram as on the previous slide, with a callout on the Tokyo VXLAN switch: "How did you know that the destination is in Osaka!?"]
ARP resolution on L2 layer
■ VXLAN switches need to emulate the ARP resolution mechanism.
[Figure: ARP resolution between a sender (IP 10.1.1.0, MAC xx:xx:xx:xx:xx:xx) and a receiver (IP 10.1.2.0, MAC zz:zz:zz:zz:zz:zz).]
① ARP Request: "What's the MAC for IP 10.1.2.0?"
② ARP Reply: "zz:zz:zz:zz:zz:zz"
③ The port <-> MAC association is recorded in the MAC table.
④ Send the L2 packet to "zz:zz:zz:zz:zz:zz".
Packet layout: L2 header (Dest MAC zz:zz:zz:..., Source MAC xx:xx:xx:...), L3 header (Dest IP 10.1.2.0, Source IP 10.1.1.0), Payload.
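The port <-> MAC learning in step ③ can be sketched as a toy learning switch. This is only an illustration of ordinary L2 behavior under simplifying assumptions (no ageing, no VLANs); the class name, port numbers, and MAC addresses are made up for the example.

```python
from typing import Dict, List

BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Toy L2 switch: learns source MACs and floods unknown destinations."""

    def __init__(self, ports: List[int]) -> None:
        self.ports = list(ports)
        self.mac_table: Dict[str, int] = {}  # MAC address -> port

    def receive(self, in_port: int, src_mac: str, dst_mac: str) -> List[int]:
        """Return the list of ports the frame is forwarded to."""
        # Step 3: remember which port the sender is reachable through.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # ARP requests and unknown destinations are flooded.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Destination is behind the ingress port; nothing to forward.
            return []
        return [out_port]
```

The flooding branch is the part that does not carry over to L2 over L3: a VXLAN switch would have to flood to every remote site, which is why the following slides look for ways to answer ARP and locate destinations without broadcasting.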
Additional features for L2 over L3
■ Packet encapsulation alone is not enough for L2 over L3. VXLAN switches also need to implement the following features.
– ARP resolution: reply to ARP requests from local servers without broadcasting the ARP packet.
– Destination search: find the destination location corresponding to the destination MAC.
■ The VXLAN endpoint that provides these features is referred to as a "VTEP" (VXLAN Tunnel End Point).
[Figure: the VXLAN switch in Tokyo (xx.xx.xx.xx) acting as a VTEP.]
① ARP Request: "What's the MAC for IP 10.1.2.0?" The switch itself sends the ARP Reply: "zz:zz:zz:zz:zz:zz".
④ Send the L2 packet to "zz:zz:zz:zz:zz:zz". The switch knows that dest "zz:zz:zz:zz:zz:zz" is located in Osaka.
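The two features above boil down to two lookup tables. The following sketch is a hypothetical in-memory model, not code from any VTEP implementation; the class and method names are assumptions, and the addresses are the placeholders used in the figures.

```python
from typing import Dict, Optional


class VtepTables:
    """Hypothetical model of the two tables a VTEP consults."""

    def __init__(self) -> None:
        self.arp_table: Dict[str, str] = {}  # overlay IP -> MAC
        self.fdb: Dict[str, str] = {}        # MAC -> address of the remote VTEP

    def register(self, mac: str, ip: str, location: str) -> None:
        """Record one (MAC, IP address, location) tuple."""
        self.arp_table[ip] = mac
        self.fdb[mac] = location

    def proxy_arp(self, target_ip: str) -> Optional[str]:
        """ARP resolution: answer locally instead of broadcasting."""
        return self.arp_table.get(target_ip)

    def lookup_destination(self, dst_mac: str) -> Optional[str]:
        """Destination search: which remote VTEP hosts this MAC?"""
        return self.fdb.get(dst_mac)
```

For example, after `register("zz:zz:zz:zz:zz:zz", "10.1.2.0", "yy.yy.yy.yy")`, an ARP request for 10.1.2.0 is answered from `arp_table`, and the frame addressed to that MAC is encapsulated toward `yy.yy.yy.yy`. A `None` result means the tuple has not been shared yet.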
Variations of VTEP implementation
■ To implement the VTEP features, there must be some mechanism to share the tuple (MAC, IP address, location) of every server.
■ The following are some variations of VTEP implementation.
– Exchange MAC/IP information among switches using L3 multicast.
– Use an SDN controller as a central MAC/IP database.
– Use a local agent and a virtual VXLAN switch running on Linux servers.
OpenStack Neutron OVS Plugin
ML2 l2population driver
■ With the OpenStack Neutron OVS plugin, VXLAN encapsulation is done by the local Open vSwitch on each compute node.
– MAC/IP information is sent by the L2 agent and populated by the l2population ML2 driver.
– The l2population driver populates the following entries in OVS.
• FDB (forwarding database): a lookup table for finding the destination node corresponding to the destination MAC address.
• Flow table entries for replying to ARP requests from local VMs.
[Figure: two compute nodes, each running VMs on OVS (br-int) with an L2 agent and the l2population driver, connected through a messaging server (RabbitMQ).]
① Attaching new VM
② Send MAC/IP information
③ Populate flow table in OVS
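The three steps in the figure can be modeled as a small publish/subscribe exchange. This is a simplified sketch under stated assumptions, not Neutron code: `PortUp`, `Agent`, and `MessageBus` are hypothetical names, the in-process bus stands in for RabbitMQ, and plain dictionaries stand in for the OVS flow table entries.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class PortUp:
    """Hypothetical message sent when a VM is attached (steps 1 and 2)."""
    mac: str
    ip: str
    host: str  # underlay address of the compute node hosting the VM


@dataclass
class Agent:
    """Stand-in for the agent on one compute node."""
    host: str
    fdb: Dict[str, str] = field(default_factory=dict)            # MAC -> remote node
    arp_responder: Dict[str, str] = field(default_factory=dict)  # IP -> MAC

    def on_port_up(self, event: PortUp) -> None:
        # Step 3: populate the local tables for VMs on other nodes.
        if event.host == self.host:
            return  # local VM: reachable without a tunnel
        self.fdb[event.mac] = event.host
        self.arp_responder[event.ip] = event.mac


class MessageBus:
    """In-process stand-in for the messaging server."""

    def __init__(self) -> None:
        self.agents: List[Agent] = []

    def subscribe(self, agent: Agent) -> None:
        self.agents.append(agent)

    def publish(self, event: PortUp) -> None:
        for agent in self.agents:
            agent.on_port_up(event)
```

The point of the design is that tables are filled in ahead of time, when a VM is attached, so no ARP broadcast has to cross the tunnel later.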
■ Reference: ML2 – Address Population
– http://assafmuller.com/2014/02/23/ml2-address-population/
VTEP implementation with Flannel
Overlay network with Flannel
■ Flannel is an open source tool for creating an overlay network for Docker containers. It's often used with Kubernetes.
– It uses the Linux kernel's native VXLAN devices for packet encapsulation.
– The Flannel daemon dynamically populates the FDB and the ARP table in response to kernel requests delivered via the "L2/L3 MISS" notification mechanism.
• The mechanism was originally introduced as the "DOVE extensions".
• https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e4f67addf158f98f8197e08974966b18480dc751
– The IP/MAC information is shared through the backend KVS (etcd).
[Figure: three minions, each with a flannel.1 VXLAN device, attached to the physical network 192.168.122.0/24 together with etcd. The flannel.1 devices form the internal network for container communication, 10.1.0.0/16.]
Kernel's DOVE extensions
■ You can use the native VXLAN device with the current Linux kernel.
– You don't necessarily need OVS to use VXLAN.
– It's just like using the traditional VLAN device with Linux :)
■ VTEP features are implemented by a userland agent via the "L2/L3 MISS" notification mechanism. (The notifications are sent via netlink.)
– L3MISS
• The kernel asks the agent to populate the local ARP table when necessary, instead of broadcasting the ARP request packet.
– L2MISS
• The kernel asks the agent to populate the FDB when necessary.
# ip -d l show flannel.1
3: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT
link/ether 82:ce:d5:09:06:2c brd ff:ff:ff:ff:ff:ff promiscuity 0
vxlan id 1 local 192.168.122.101 dev eth0 srcport 0 0 dstport 8472 proxy l2miss ageing 300
# bridge fdb show dev flannel.1
56:e1:c1:d6:b7:51 dst 192.168.122.102 self
# cat /proc/sys/net/ipv4/neigh/flannel.1/app_solicit
3
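As a rough illustration of what a userland agent does with these notifications, the sketch below maps a miss event to the `ip neigh` or `bridge fdb` command that would fill the missing entry. It is not taken from Flannel's source, and an agent may just as well talk netlink directly instead of running commands. The handler names and the in-memory lookup dictionaries (standing in for etcd) are assumptions, and the commands are only built as argument lists, not executed.

```python
from typing import Dict, List, Optional


def handle_l3miss(dev: str, ip: str, arp_db: Dict[str, str]) -> Optional[List[str]]:
    """L3MISS: the kernel has no neighbor (ARP) entry for `ip` on `dev`."""
    mac = arp_db.get(ip)
    if mac is None:
        return None  # unknown address: nothing to install
    return ["ip", "neigh", "replace", ip, "lladdr", mac,
            "dev", dev, "nud", "reachable"]


def handle_l2miss(dev: str, mac: str, fdb_db: Dict[str, str]) -> Optional[List[str]]:
    """L2MISS: the kernel has no FDB entry for `mac` on `dev`."""
    vtep = fdb_db.get(mac)
    if vtep is None:
        return None  # unknown MAC: nothing to install
    return ["bridge", "fdb", "replace", mac, "dev", dev, "dst", vtep, "self"]
```

Using the FDB entry from the output above as data, `handle_l2miss("flannel.1", "56:e1:c1:d6:b7:51", {"56:e1:c1:d6:b7:51": "192.168.122.102"})` builds the command that would produce the `56:e1:c1:d6:b7:51 dst 192.168.122.102 self` line shown by `bridge fdb show`. Running the resulting commands requires root privileges.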
■ Reference: Kernel patch - add DOVE extensions for VXLAN
– https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e4f67addf158f98f8197e08974966b18480dc751
References
■ ML2 – Address Population
– http://assafmuller.com/2014/02/23/ml2-address-population/
■ Kernel patch: add DOVE extensions for VXLAN
– https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e4f67addf158f98f8197e08974966b18480dc751
■ FlannelのVXLANバックエンドの仕組み (How Flannel's VXLAN backend works; in Japanese)
– http://enakai00.hatenablog.com/entry/2015/04/02/173739