3. 2017 | www.mirantis.com
You should have basic knowledge about...
● Linuxbridge
● Open vSwitch
● VXLAN
● VLAN
● iptables
● Linux routing
● Linux network namespaces
VM / Linuxbridge : Identify the Instance
Prerequisites:
We already know the OpenStack project and the instance we would like to
investigate. In our case: instance "VM1" in project "brtest".
So we source a project-based keystone.rc file
and use "openstack server list" to find the UUID of the instance we would
like to troubleshoot.
OS-CNTRL> source brctl.rc
OS-CNTRL> openstack server list
+--------------------------------------+------+--------+---------------------+
| ID | Name | Status | Networks |
+--------------------------------------+------+--------+---------------------+
| 7152a65a-30f5-41c4-8384-93d9f311331e | VM1 | ACTIVE | private=192.168.0.6 |
+--------------------------------------+------+--------+---------------------+
VM / Linuxbridge : Identify the Port
We use the instance ID of VM1 to identify the port ID:
OS-CNTRL> nova interface-list 7152a65a-30f5-41c4-8384-93d9f311331e
+------------+--------------------------------------+--------------------------------------+--------------+-------------------+
| Port State | Port ID | Net ID | IP addresses | MAC Addr |
+------------+--------------------------------------+--------------------------------------+--------------+-------------------+
| ACTIVE | 114792f5-7014-4428-b698-c26740f5fd35 | 2ded7b2d-24c8-40b7-ba2e-f82ac7403875 | 192.168.0.6 | fa:16:3e:dc:8a:79 |
+------------+--------------------------------------+--------------------------------------+--------------+-------------------+
The first 8 characters of the port ID ("114792f5") are required for the next step:
finding the corresponding Linux bridge on the compute node.
VM / Linuxbridge : Identify hosting compute node
First we need to know on which compute node our VM is hosted:
OS-CNTRL> nova show 7152a65a-30f5-41c4-8384-93d9f311331e | grep ":host "
| OS-EXT-SRV-ATTR:host | node-105.debs-2-adm.os.cloud.vwfs.com
So let's jump to "node-105" and use the first 8 characters of the port ID to find the
corresponding Linux bridge:
root@node-105:~# brctl show | grep 114792f5
qbr114792f5-70 8000.16ff58d02c8b no qvb114792f5-70
tap114792f5-70
qbr114792f5-70 : the name of the Linux bridge
qvb114792f5-70 : one end of the veth pair between the Linux bridge and br-int (more about the other end later)
tap114792f5-70 : the tap interface representing the NIC inside the VM
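The interface names in the brctl output all share one suffix. As a minimal sketch, assuming (as this output suggests) that each name is a fixed prefix plus the first 11 characters of the port ID, the names can be derived without logging in to the compute node. The function name `port_ifnames` is hypothetical.

```sh
#!/bin/sh
# Derive the expected interface names from a Neutron port ID.
# Assumption: name = prefix + first 11 characters of the port ID,
# which matches the names seen in this walkthrough.
port_ifnames() {
    short=$(printf '%s' "$1" | cut -c1-11)
    printf '%s\n' "qbr${short}" "qvb${short}" "qvo${short}" "tap${short}"
}

port_ifnames 114792f5-7014-4428-b698-c26740f5fd35
```

The output can be compared directly against `brctl show` and `ip -d link` on the compute node.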
We are here
[Diagram: VM1 (192.168.0.6), eth0 -> tap device tap114792f5-70 -> Linux bridge qbr114792f5-70 -> veth pair end qvb114792f5-70. The Linux bridge is the place where "access & security groups" live.]
VM / Linuxbridge : debugging access & security groups
Using the same 8-character prefix, we can find several iptables chains:
root@node-105:~# iptables -S | grep 114792f5
-N neutron-openvswi-i114792f5-7
-N neutron-openvswi-o114792f5-7
-N neutron-openvswi-s114792f5-7
i= is for inbound traffic
o= is for outbound traffic
s= is the spoof filter for IP/MAC pairs (let's have a look)
root@node-105:~# iptables -S neutron-openvswi-s114792f5-7
-N neutron-openvswi-s114792f5-7
-A neutron-openvswi-s114792f5-7 -s 192.168.0.6/32 -m mac --mac-source FA:16:3E:DC:8A:79 -m comment --comment "Allow traffic from defined IP/MAC pairs." -j RETURN
-A neutron-openvswi-s114792f5-7 -m comment --comment "Drop traffic without an IP/MAC allow rule." -j DROP
These rules enforce the mapping between the instance's MAC and IP address.
All traffic is dropped if a user changes the MAC or IP address inside the instance.
Linuxbridge / br-int: identify the veth-pair
As mentioned before, qvb114792f5-70 is the end of the veth pair attached to the Linux bridge. Now we
need to identify the corresponding end connected to br-int.
root@node-105:~# ip -d link | grep -A2 114792f5
2747: qbr114792f5-70: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8950 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 16:ff:58:d0:2c:8b brd ff:ff:ff:ff:ff:ff promiscuity 0
bridge
2748: qvo114792f5-70@qvb114792f5-70: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 8950 qdisc noqueue master ovs-system state UP mode DEFAULT group default qlen 1000
link/ether 1e:95:2e:f8:7f:fc brd ff:ff:ff:ff:ff:ff promiscuity 2
veth
2749: qvb114792f5-70@qvo114792f5-70: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 8950 qdisc noqueue master qbr114792f5-70 state UP mode DEFAULT group default qlen 1000
link/ether 16:ff:58:d0:2c:8b brd ff:ff:ff:ff:ff:ff promiscuity 2
veth
2750: tap114792f5-70: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8950 qdisc htb master qbr114792f5-70 state UNKNOWN mode DEFAULT group default qlen 1000
link/ether fe:16:3e:dc:8a:79 brd ff:ff:ff:ff:ff:ff promiscuity 1
tun
We can see the mapping between:
2748: qvo114792f5-70@qvb114792f5-70
2749: qvb114792f5-70@qvo114792f5-70
We are here
[Diagram: VM1 (192.168.0.6), eth0 -> tap114792f5-70 -> Linux bridge qbr114792f5-70 -> qvb114792f5-70 -> (veth pair) -> qvo114792f5-70]
br-int on compute node
On br-int we can grep for "qvo114792f5-70" to find the port on br-int:
root@node-105:~# ovs-ofctl show br-int | grep qvo114792f5-70 -A3
658(qvo114792f5-70): addr:1e:95:2e:f8:7f:fc
config: 0
state: 0
current: 10GB-FD COPPER
root@node-105:~# ovs-vsctl show | grep qvo114792f5-70 -A1
Port "qvo114792f5-70"
tag: 201
Interface "qvo114792f5-70"
Port "qvo57f8d362-a3"
We can identify port 658 as belonging to "qvo114792f5-70",
with an assigned VLAN tag of 201.
So the next step is to see what happens to traffic on port 658.
br-int on compute node: as standard mac learning switch
Let's dump all OVS flows on br-int related to port 658:
root@node-105:~# ovs-ofctl dump-flows br-int | grep "in_port=658"
cookie=0x8980c4df9f9bced9, duration=3016.077s, table=0, n_packets=0, n_bytes=0, idle_age=3016, priority=10,icmp6,in_port=658,icmp_type=136 actions=resubmit(,24)
cookie=0x8980c4df9f9bced9, duration=3016.059s, table=0, n_packets=65, n_bytes=2730, idle_age=5, priority=10,arp,in_port=658 actions=resubmit(,24)
cookie=0x8980c4df9f9bced9, duration=3016.095s, table=0, n_packets=120, n_bytes=13367, idle_age=10, priority=9,in_port=658 actions=resubmit(,25)
cookie=0x8980c4df9f9bced9, duration=3016.086s, table=24, n_packets=0, n_bytes=0, idle_age=3016, priority=2,icmp6,in_port=658,icmp_type=136,nd_target=fe80::f816:3eff:fedc:8a79 actions=NORMAL
cookie=0x8980c4df9f9bced9, duration=3016.068s, table=24, n_packets=65, n_bytes=2730, idle_age=5, priority=2,arp,in_port=658,arp_spa=192.168.0.6 actions=resubmit(,25)
cookie=0x8980c4df9f9bced9, duration=3016.113s, table=25, n_packets=182, n_bytes=15867, idle_age=5, priority=2,in_port=658,dl_src=fa:16:3e:dc:8a:79 actions=NORMAL
We want to follow unicast traffic, so:
● 1st line: doesn't match because it catches "icmp6" traffic
● 2nd line: doesn't match because it catches "arp" traffic
● 3rd line: does match and resubmits this traffic to table=25
● Last line: is in table=25 and handles traffic with the source MAC of our VM1 as "NORMAL", which means
as a standard layer 2 MAC learning switch.
Table 0 is always the initial table when following a flow.
Priority is a number between 0 and 65535. A higher value will match before a lower one.
br-int on compute node: ports sharing the same VLAN
As we now know that our unicast traffic on port 658 will be tagged with vlan-id 201 and
handled by an ovs flow action=NORMAL (standard mac learning switch),
we need to know which other ports are part of the same VLAN=201.
root@node-105:~# ovs-appctl fdb/show br-int | grep " 201 "
1 201 fa:16:3e:74:a5:28 17
1 201 fa:16:3e:8b:8d:e3 17
1 201 fa:16:3e:2d:2c:66 17
1 201 fa:16:3e:59:b6:80 17
1 201 fa:16:3e:fd:9e:61 16
658 201 fa:16:3e:dc:8a:79 8
And we can see port 658 with the MAC of our VM1 and
port 1 with several different MAC addresses (router and DHCP servers).
Ethernet frames from port 658 will be forwarded to port 1 and vice versa.
Let's identify port 1:
root@node-105:~# ovs-ofctl show br-int | grep " 1(" -A3
1(patch-tun): addr:ea:3b:af:d1:a1:30
config: 0
state: 0
speed: 0 Mbps now, 0 Mbps max
Port 1(patch-tun)
Port 658(qvo114792f5-70)
br-tun: verify the connection to br-int
We know that port 1 on br-int is related to br-tun.
So let's check which port on br-tun is related to br-int:
root@node-105:~# ovs-ofctl show br-tun | grep "(patch-int)" -A3
1(patch-int): addr:1a:68:07:fb:5a:00
config: 0
state: 0
speed: 0 Mbps now, 0 Mbps max
Port 1 on br-int = (patch-tun)
Port 1 on br-tun = (patch-int)
We are here
[Diagram, Node-105: VM1 (192.168.0.6), eth0 -> tap114792f5-70 -> Linux bridge qbr114792f5-70 -> qvb114792f5-70 -> port 658 (qvo114792f5-70) on br-int, VLAN 201 -> port 1 (patch-tun) on br-int -> port 1 (patch-int) on br-tun]
br-tun compute: following the incoming frame flow
Frames from br-int arrive on port 1 of br-tun:
root@node-105:~# ovs-ofctl dump-flows br-tun table=0 | grep "in_port=1 "
cookie=0xaca790c9b6155a8d, duration=6295554.312s, table=0, n_packets=3962107607, n_bytes=1161207200267, idle_age=0, hard_age=65534, priority=1,in_port=1 actions=resubmit(,2)
root@node-105:~# ovs-ofctl dump-flows br-tun table=2
cookie=0xaca790c9b6155a8d, duration=6295613.530s, table=2, n_packets=3790968, n_bytes=159220674, idle_age=0, hard_age=65534, priority=1,arp,dl_dst=ff:ff:ff:ff:ff:ff actions=resubmit(,21)
cookie=0xaca790c9b6155a8d, duration=6295613.529s, table=2, n_packets=3957685311, n_bytes=1161019540610, idle_age=0, hard_age=65534, priority=0,dl_dst=00:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,20)
cookie=0xaca790c9b6155a8d, duration=6295613.528s, table=2, n_packets=664604, n_bytes=37156210, idle_age=7798, hard_age=65534, priority=0,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,22)
All frames from in_port=1 are resubmitted to table=2.
In table=2 we separate unicast, multicast and broadcast traffic into different tables.
Unicast is handled by table=20.
So let's continue following unicast traffic in table=20...
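The two masked dl_dst matches in table=2 test only the lowest bit of the first octet of the destination MAC. A minimal sketch of that test follows; the function name `mac_class` is hypothetical, and the ARP broadcast rule that goes to table=21 is not modelled.

```sh
#!/bin/sh
# Classify a destination MAC the way the masked dl_dst matches do:
# mask 01:00:00:00:00:00 selects the lowest bit of the first octet,
# which is 0 for unicast and 1 for multicast (including broadcast).
mac_class() {
    first_octet=${1%%:*}
    if [ $(( 0x$first_octet & 1 )) -eq 1 ]; then
        echo multicast
    else
        echo unicast
    fi
}

mac_class fa:16:3e:fd:9e:61   # router MAC from this walkthrough
mac_class ff:ff:ff:ff:ff:ff   # broadcast
```

In terms of the flows above, a "unicast" result corresponds to resubmit(,20) and "multicast" to resubmit(,22).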
br-tun compute: following frame flow table=20
Before we continue, note that VLAN tags in flows are shown in hex format:
VLAN 201 = 0x00c9
root@node-105:~# ovs-ofctl dump-flows br-tun table=20 | grep 0x00c9
cookie=0xaca790c9b6155a8d, duration=15152.537s, table=20, n_packets=0, n_bytes=0, hard_timeout=300, idle_age=15152, hard_age=187, priority=1,vlan_tci=0x00c9/0x0fff,dl_dst=fa:16:3e:74:a5:28 actions=load:0->NXM_OF_VLAN_TCI[],load:0x114->NXM_NX_TUN_ID[],output:71
cookie=0xaca790c9b6155a8d, duration=245.474s, table=20, n_packets=0, n_bytes=0, hard_timeout=300, idle_age=245, hard_age=240, priority=1,vlan_tci=0x00c9/0x0fff,dl_dst=fa:16:3e:2d:2c:66 actions=load:0->NXM_OF_VLAN_TCI[],load:0x114->NXM_NX_TUN_ID[],output:70
cookie=0xaca790c9b6155a8d, duration=234.619s, table=20, n_packets=96, n_bytes=9016, hard_timeout=300, idle_age=0, hard_age=0, priority=1,vlan_tci=0x00c9/0x0fff,dl_dst=fa:16:3e:fd:9e:61 actions=load:0->NXM_OF_VLAN_TCI[],load:0x114->NXM_NX_TUN_ID[],output:67
And we can see the magic happen:
- We send out traffic on port 67 when
- the VLAN tag is 0x00c9 (VLAN 201) and
- the destination MAC address is fa:16:3e:fd:9e:61 (default gateway)
- We remove the VLAN tag
- We encapsulate the frame in a VXLAN header with a tunnel-id of "0x114"
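Because flows print VLAN tags and tunnel IDs in hex, a quick conversion helps when building the grep pattern. This is a small sketch using only printf; the helper names are hypothetical.

```sh
#!/bin/sh
# VLAN ID (decimal) -> the 4-digit hex form used in vlan_tci matches.
vlan_to_hex() {
    printf '0x%04x\n' "$1"
}

# Tunnel ID as printed in flows (hex) -> decimal.
tun_id_to_dec() {
    printf '%d\n' "$1"
}

vlan_to_hex 201       # prints 0x00c9
tun_id_to_dec 0x114   # prints 276
```

The decimal tunnel ID is presumably the segmentation ID that Neutron reports for the network, but that mapping is an assumption not shown in this walkthrough.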
br-tun compute: outgoing port
Let's look at the details of port 67:
root@node-105:~# ovs-ofctl show br-tun | grep " 67(" -A3
67(vxlan-c612fc36): addr:ee:41:a4:f8:81:ed
config: 0
state: 0
speed: 0 Mbps now, 0 Mbps max
The VXLAN tunnel port is "vxlan-c612fc36".
The suffix of the port name is the IP address of the other end of the tunnel in hex,
so we know the tunnel endpoint IP is 198.18.252.54:
c6 12 fc 36
198 18 252 54
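The hex mapping above can be reproduced with printf. A minimal sketch, assuming the port name is always "vxlan-" followed by the four octets of the remote IP as two hex digits each (the pattern seen in this output); the function name is hypothetical.

```sh
#!/bin/sh
# Build the expected VXLAN port name from the remote tunnel endpoint IP.
# Assumption: name = "vxlan-" + each octet as two lowercase hex digits.
ip_to_vxlan_port() {
    # The unquoted substitution deliberately splits the IP into four arguments.
    printf 'vxlan-%02x%02x%02x%02x\n' $(printf '%s' "$1" | tr '.' ' ')
}

ip_to_vxlan_port 198.18.252.54   # prints vxlan-c612fc36
```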
Just to double check:
root@node-105:~# ovs-vsctl show | grep vxlan-c612fc36 -A3
Port "vxlan-c612fc36"
Interface "vxlan-c612fc36"
type: vxlan
options: {df_default="true", in_key=flow, local_ip="198.18.252.36", out_key=flow, remote_ip="198.18.252.54"}
Port "vxlan-c612fc20"
br-tun: finding the corresponding tunnel node
We need to identify the node that owns the IP address of the tunnel endpoint.
Since we already know that the destination MAC addresses for this tunnel belong to the router and DHCP server,
we can expect to find the corresponding IP on a network/neutron node.
Let's do some FUEL magic to find the right neutron node:
root@fuel# x=$(fuel node | grep neutron | cut -d'|' -f1 | xargs) ; for i in $x ; do echo Node-$i && ssh node-$i "ip a | grep 198.18.252. " ; done
Node-300
inet 198.18.252.54/22 brd 198.18.255.255 scope global br-mesh
Node-311
inet 198.18.252.62/22 scope global br-mesh
Node-301
inet 198.18.252.55/22 scope global br-mesh
Node-286
inet 198.18.252.64/22 scope global br-mesh
And we can see that Node-300 is the one we need.
On the network node
So we continue on node-300.
We know the local IP=198.18.252.54
and we know the remote tunnel endpoint IP=198.18.252.36,
so we are able to calculate the VXLAN tunnel name:
198 18 252 36
c6 12 fc 24
Therefore the VXLAN tunnel is "vxlan-c612fc24".
Just to double check:
root@node-300:~# ovs-vsctl show | grep c612fc24 -A2
Port "vxlan-c612fc24"
Interface "vxlan-c612fc24"
type: vxlan
options: {df_default="true", in_key=flow, local_ip="198.18.252.54", out_key=flow, remote_ip="198.18.252.36"}
root@node-300:~# ovs-ofctl show br-tun | grep c612fc24 -A3
32(vxlan-c612fc24): addr:7e:44:41:c6:7b:d0
config: 0
state: 0
speed: 0 Mbps now, 0 Mbps max
And we can identify port=32 on br-tun.
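Going the other way, a port name seen in `ovs-vsctl show` can be decoded back into the remote endpoint IP. A minimal sketch under the same naming assumption ("vxlan-" plus eight hex digits); the function name is hypothetical.

```sh
#!/bin/sh
# Decode a VXLAN port name back into the remote tunnel endpoint IP.
vxlan_port_to_ip() {
    hex=${1#vxlan-}
    # sed turns "c612fc24" into "0xc6 0x12 0xfc 0x24 "; printf converts each to decimal.
    printf '%d.%d.%d.%d\n' $(printf '%s' "$hex" | sed 's/../0x& /g')
}

vxlan_port_to_ip vxlan-c612fc24   # prints 198.18.252.36
```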
We are here
[Diagram, Node-105: VM1 (192.168.0.6), eth0 -> tap114792f5-70 -> Linux bridge qbr114792f5-70 -> qvb114792f5-70 -> port 658 (qvo114792f5-70) on br-int, VLAN 201 -> port 1 (patch-tun) -> port 1 (patch-int) on br-tun -> port 67 (vxlan-c612fc36). Node-300: port 32 (vxlan-c612fc24) on br-tun]
br-tun neutron: following the incoming packet flow
Packets from br-tun on the compute node arrive on port 32 of br-tun on the network node:
root@node-300:~# ovs-ofctl dump-flows br-tun table=0| grep port=32
cookie=0xa23669016eb15a51, duration=1949006.986s, table=0, n_packets=15763474, n_bytes=2045964079, idle_age=0, hard_age=65534, priority=1,in_port=32 actions=resubmit(,4)
root@node-300:~# ovs-ofctl dump-flows br-tun table=4 | grep tun_id=0x114
cookie=0xa23669016eb15a51, duration=16637.692s, table=4, n_packets=1937, n_bytes=179301, idle_age=0, priority=1,tun_id=0x114 actions=mod_vlan_vid:304,resubmit(,10)
These packets are resubmitted to table=4.
Since we already know the tunnel-id=0x114, we can grep for this specific entry.
Here we add a locally significant vlan-id of 304 and jump to table 10.
br-tun neutron: following the incoming packet flow
Dump of table 10:
root@node-300:~# ovs-ofctl dump-flows br-tun table=10
NXST_FLOW reply (xid=0x4):
cookie=0xa23669016eb15a51, duration=1949418.737s, table=10, n_packets=306100085, n_bytes=39605946471, idle_age=0, hard_age=65534, priority=1 actions=learn(table=20,hard_timeout=300,priority=1,cookie=0xa23669016eb15a51,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:NXM_OF_IN_PORT[]),output:1
And the corresponding magic happens:
- We update table=20 with the learned combination of src_mac, tunnel-id and vlan-tag
- We remove the tunnel-id and send the traffic out on port 1
- Hence we turn the VXLAN-encapsulated packet back into an Ethernet frame
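The effect of the learn() action can be pictured as a small lookup table. The following is a toy model only, not how Open vSwitch stores flows: a frame arriving from a tunnel records (VLAN, source MAC) -> (tunnel ID, in_port), and a later lookup by destination MAC either finds that entry or falls back to flooding. All function and variable names are hypothetical.

```sh
#!/bin/sh
# Toy model of table=20 as filled by the learn() action in table=10.
table20=""

# args: vlan src_mac tun_id in_port
learn_from_tunnel() {
    table20="${table20}$1 $2 $3 $4
"
}

# args: vlan dst_mac; prints "tun_id,out_port" for a learned entry, else "flood"
lookup_unicast() {
    printf '%s' "$table20" | awk -v vlan="$1" -v mac="$2" '
        $1 == vlan && $2 == mac { print $3 "," $4; found = 1; exit }
        END { if (!found) print "flood" }'
}

# Frame from VM1 arriving on node-300: VLAN 304, tunnel 0x114, br-tun port 32.
learn_from_tunnel 304 fa:16:3e:dc:8a:79 0x114 32
lookup_unicast 304 fa:16:3e:dc:8a:79
lookup_unicast 304 fa:16:3e:00:00:01
```

The first lookup returns the learned tunnel and port, so return traffic to VM1 can go out as unicast on the right tunnel; the second has no entry and is flooded.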
Let's identify port=1 on br-tun on the network node:
root@node-300:~# ovs-ofctl show br-tun | grep " 1(" -A3
1(patch-int): addr:96:a1:79:56:f2:66
config: 0
state: 0
speed: 0 Mbps now, 0 Mbps max
And we have our connection from br-tun to br-int
We are here
[Diagram, Node-105: VM1 (192.168.0.6), eth0 -> tap114792f5-70 -> Linux bridge qbr114792f5-70 -> qvb114792f5-70 -> port 658 (qvo114792f5-70) on br-int, VLAN 201 -> port 1 (patch-tun) -> port 1 (patch-int) on br-tun -> port 67 (vxlan-c612fc36). Node-300: port 32 (vxlan-c612fc24) on br-tun -> port 1 (patch-int) -> br-int, VLAN 304]
br-tun to br-int on network node
We already know this traffic will be forwarded to br-int on the network node.
So let's see which ports are related to VLAN=304 on br-int:
root@node-300:~# ovs-appctl fdb/show br-int | grep " 304 "
32 304 fa:16:3e:dc:8a:79 0
4299 304 fa:16:3e:fd:9e:61 0
There are port=32 and port=4299.
Let's have a look at these ports on br-int.
br-tun to br-int on network node
Ports 32 and 4299 on br-int:
root@node-300:~# ovs-ofctl show br-int | grep " 32(" -A3
32(patch-tun): addr:72:97:a6:b7:37:7f
config: 0
state: 0
speed: 0 Mbps now, 0 Mbps max
root@node-300:~# ovs-ofctl show br-int | grep " 4299(" -A3
4299(qr-2deb344d-fb): addr:00:00:00:00:80:a2
config: PORT_DOWN
state: LINK_DOWN
speed: 0 Mbps now, 0 Mbps max
32(patch-tun) :the connection to br-tun
4299(qr-2deb344d-fb) :connection to qrouter-namespace
But how do we find the related namespace mapped to a specific interface?
We are here
[Diagram, Node-105: VM1 (192.168.0.6), eth0 -> tap114792f5-70 -> Linux bridge qbr114792f5-70 -> qvb114792f5-70 -> port 658 (qvo114792f5-70) on br-int, VLAN 201 -> port 1 (patch-tun) -> port 1 (patch-int) on br-tun -> port 67 (vxlan-c612fc36). Node-300: port 32 (vxlan-c612fc24) on br-tun -> port 1 (patch-int) -> port 32 (patch-tun) on br-int, VLAN 304 -> port 4299 (qr-2deb344d-fb)]
How to find a specific namespace - Router
To find the qrouter namespace we need to go back to the OpenStack CLI.
Every virtual router we create in OpenStack gets its own qrouter namespace.
We just need to identify the UUID of our vRouter-1:
OS-CNTRL> openstack router list
+--------------------------------------+-----------+--------+-------+-------------+----+----------------------------------+
| ID | Name | Status | State | Distributed | HA | Project |
+--------------------------------------+-----------+--------+-------+-------------+----+----------------------------------+
| f2a1fb01-46fa-43c7-bbf9-9bff27a73ac9 | vRouter-1 | ACTIVE | UP | | | 7d04b4d168ea4cdf843f5615d7078360 |
+--------------------------------------+-----------+--------+-------+-------------+----+----------------------------------+
On the network node-300 we should find a namespace containing the <Router ID>
root@node-300:~# ip netns | grep f2a1fb01-46fa-43c7-bbf9-9bff27a73ac9
qrouter-f2a1fb01-46fa-43c7-bbf9-9bff27a73ac9
How to find a specific namespace - Router
And inside this namespace we should find the interface qr-2deb344d-fb, our connection to the qrouter namespace (1301: qr-2deb344d-fb):
root@node-300:~# ip netns exec qrouter-f2a1fb01-46fa-43c7-bbf9-9bff27a73ac9 ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
1299: ha-a73b05a2-b8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8950 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
link/ether fa:16:3e:0d:01:e3 brd ff:ff:ff:ff:ff:ff
1300: qg-5ec3d71d-ec: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
link/ether fa:16:3e:59:03:0f brd ff:ff:ff:ff:ff:ff
1301: qr-2deb344d-fb: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8950 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
link/ether fa:16:3e:fd:9e:61 brd ff:ff:ff:ff:ff:ff
Linux Routing in namespace
We have now followed the traffic from an instance through the network overlay
and have reached the network namespace acting as
router between our private virtual network and the
external network.
Let's take a deeper look at its implementation.
Router namespace
We have found our "qr-xxx" interface within the namespace, along with
another interface called "qg-xxx".
Let's have a look at the routing table:
root@node-300:~# ip netns exec qrouter-f2a1fb01-46fa-43c7-bbf9-9bff27a73ac9 ip route
default via 10.37.192.1 dev qg-5ec3d71d-ec
10.37.192.0/22 dev qg-5ec3d71d-ec proto kernel scope link src 10.37.193.93
192.168.0.0/24 dev qr-2deb344d-fb proto kernel scope link src 192.168.0.1
Traffic within the subnet of the virtual network (192.168.0.0/24) will be routed to the "qr-xxx" interface.
Any other traffic will be routed to the "qg-xxx" interface.
Let's try to find the "qg-xxx" interface on br-int:
root@node-300:~# ovs-ofctl show br-int | grep -A3 qg-5ec3d71d-ec
4298(qg-5ec3d71d-ec): addr:00:00:00:00:60:45
config: PORT_DOWN
state: LINK_DOWN
speed: 0 Mbps now, 0 Mbps max
root@node-300:~# ovs-vsctl show | grep qg-5ec3d71d-ec -A1
Port "qg-5ec3d71d-ec"
tag: 66
Interface "qg-5ec3d71d-ec"
type: internal
And you can see port=4298 and a VLAN tag=66.
So let's see which other ports are within the same VLAN 66.
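The routing decision inside the namespace (192.168.0.0/24 via "qr-xxx", everything else via "qg-xxx") is a prefix match. The sketch below hard-codes the three routes from this walkthrough to illustrate that match; it is not a general routing implementation, and the function names are hypothetical. On the node itself, `ip netns exec <namespace> ip route get <IP>` gives the authoritative answer.

```sh
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    set -- $(printf '%s' "$1" | tr '.' ' ')
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Pick the outgoing device for a destination IP using the routes shown above.
route_dev() {
    dst=$(ip_to_int "$1")
    # Connected routes, most specific first; the default route is the fallback.
    for entry in "192.168.0.0/24:qr-2deb344d-fb" "10.37.192.0/22:qg-5ec3d71d-ec"; do
        cidr=${entry%%:*}
        dev=${entry#*:}
        net=$(ip_to_int "${cidr%/*}")
        len=${cidr#*/}
        mask=$(( (0xffffffff << (32 - $len)) & 0xffffffff ))
        if [ $(( $dst & $mask )) -eq $(( $net & $mask )) ]; then
            echo "$dev"
            return
        fi
    done
    echo "qg-5ec3d71d-ec"   # default via 10.37.192.1
}

route_dev 192.168.0.6   # stays on the private network
route_dev 8.8.8.8       # leaves via the gateway interface
```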
We are here
[Diagram, Node-105: VM1 (192.168.0.6), eth0 -> tap114792f5-70 -> Linux bridge qbr114792f5-70 -> qvb114792f5-70 -> port 658 (qvo114792f5-70) on br-int, VLAN 201 -> port 1 (patch-tun) -> port 1 (patch-int) on br-tun -> port 67 (vxlan-c612fc36). Node-300: port 32 (vxlan-c612fc24) on br-tun -> port 1 (patch-int) -> port 32 (patch-tun) on br-int, VLAN 304 -> port 4299 (qr-2deb344d-fb) -> namespace qrouter-f2a1fb01-46fa-43c7-bbf9-9bff27a73ac9 with qr-xxx (1301), qg-xxx (1300) and the routing table shown above -> port 4298 (qg-5ec3d71d-ec) on br-int, VLAN 66]
Back on br-int again
And you can see port=4298 and a VLAN tag=66.
So let's find out which other ports are within the same VLAN 66:
root@node-300:~# ovs-appctl fdb/show br-int | grep " 66 " | sort
10 66 00:00:00:00:fa:00 0
10 66 00:00:00:00:fa:01 0
10 66 00:1c:7f:00:00:fa 48
10 66 00:1c:7f:61:59:ed 1
10 66 0a:91:42:92:7e:34 0
10 66 3a:1b:98:de:3f:22 5
10 66 52:54:00:45:88:aa 1
10 66 9a:cc:22:de:59:78 1
10 66 fa:16:3e:98:83:ad 9
4162 66 fa:16:3e:a6:f9:c2 67
4173 66 fa:16:3e:db:8b:a1 0
4236 66 fa:16:3e:b3:24:6c 2
4298 66 fa:16:3e:59:03:0f 25
We can see our port 4298 and, several times, port 10. The other 4xxx ports are connections from other virtual
routers connected to the admin_floating_net.
Let's investigate port 10:
root@node-300:~# ovs-ofctl show br-int | grep " 10(" -A3
10(int-br-floating): addr:66:d7:7b:bf:0d:c0
config: 0
state: 0
speed: 0 Mbps now, 0 Mbps max
We are here
[Diagram, Node-105: VM1 (192.168.0.6), eth0 -> tap114792f5-70 -> Linux bridge qbr114792f5-70 -> qvb114792f5-70 -> port 658 (qvo114792f5-70) on br-int, VLAN 201 -> port 1 (patch-tun) -> port 1 (patch-int) on br-tun -> port 67 (vxlan-c612fc36). Node-300: port 32 (vxlan-c612fc24) on br-tun -> port 1 (patch-int) -> port 32 (patch-tun) on br-int, VLAN 304 -> port 4299 (qr-2deb344d-fb) -> namespace qrouter-f2a1fb01-46fa-43c7-bbf9-9bff27a73ac9 with qr-xxx (1301), qg-xxx (1300) and the routing table shown above -> port 4298 (qg-5ec3d71d-ec) on br-int, VLAN 66 -> port 10 (int-br-floating) -> port 2 (phy-br-floating) on br-floating]
br-floating
What are the flows and ports on br-floating?
root@node-300:~# ovs-ofctl dump-flows br-floating
NXST_FLOW reply (xid=0x4):
cookie=0x8af4e62a69d8b808, duration=1967711.545s, table=0, n_packets=41153448, n_bytes=7620357055, idle_age=0, hard_age=65534, priority=4,in_port=2,dl_vlan=66 actions=strip_vlan,NORMAL
cookie=0x8af4e62a69d8b808, duration=1967908.876s, table=0, n_packets=269270838, n_bytes=32007278160, idle_age=0, hard_age=65534, priority=2,in_port=2 actions=drop
cookie=0x8af4e62a69d8b808, duration=1967909.502s, table=0, n_packets=94054070, n_bytes=187612841160, idle_age=0, hard_age=65534, priority=0 actions=NORMAL
We remove the VLAN tag 66 and forward the traffic as a MAC learning switch (actions=NORMAL).
Let's see the learned MACs and ports within the same VLAN=0:
root@node-300:~# ovs-appctl fdb/show br-floating
port VLAN MAC Age
1 0 fa:16:3e:77:2a:fc 69
1 0 00:1c:7f:00:00:fa 19
2 0 fa:16:3e:59:03:0f 11
1 0 3a:1b:98:de:3f:22 5
1 0 52:54:00:45:88:aa 4
2 0 fa:16:3e:b3:24:6c 4
1 0 fa:16:3e:98:83:ad 2
2 0 fa:16:3e:a6:f9:c2 1
1 0 9a:cc:22:de:59:78 1
1 0 00:1c:7f:61:59:ed 0
2 0 fa:16:3e:db:8b:a1 0
1 0 00:00:00:00:fa:00 0
1 0 00:00:00:00:fa:01 0
1 0 0a:91:42:92:7e:34 0
br-floating
Port 1 and port 2 are acting as regular switch ports. Let's have a look at them:
root@node-300:~# ovs-ofctl show br-floating
OFPT_FEATURES_REPLY (xid=0x2): dpid:0000ceb8ff61d545
n_tables:254, n_buffers:256
capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP
actions: output enqueue set_vlan_vid set_vlan_pcp strip_vlan mod_dl_src mod_dl_dst mod_nw_src mod_nw_dst mod_nw_tos mod_tp_src mod_tp_dst
1(p_ff798dba-0): addr:6e:dc:5b:62:85:fc
config: 0
state: 0
speed: 0 Mbps now, 0 Mbps max
2(phy-br-floating): addr:c2:79:cc:c5:fa:cb
config: 0
state: 0
speed: 0 Mbps now, 0 Mbps max
LOCAL(br-floating): addr:ce:b8:ff:61:d5:45
config: 0
state: 0
speed: 0 Mbps now, 0 Mbps max
OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0
Port 2(phy-br-floating) is our connection to br-int
Port 1(p_ff798dba-0) see next slide....
br-ex and the physical NIC
Using the good old "ip link" to find "p_ff798dba-0":
root@node-300:~# ip -d l | grep p_ff798dba-0 -A3
527: p_ff798dba-0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master br-ex state UNKNOWN mode DEFAULT group default qlen 1
link/ether 6e:dc:5b:62:85:fc brd ff:ff:ff:ff:ff:ff promiscuity 2
openvswitch
Here we can see the connection from Open vSwitch (br-floating) to its master br-ex.
br-ex is a Linux bridge:
root@node-300:~# brctl show br-ex
bridge name bridge id STP enabled interfaces
br-ex 8000.525400ade67f no eth1
p_ff798dba-0
And we have finally found the connection to the physical network via the eth1 interface.
br-ex internals
Let's have a detailed look at the learned
MAC forwarding table of br-ex.....
root@node-300:~# brctl showmacs br-ex
port no mac addr is local? ageing timer
1 00:00:00:00:fa:00 no 0.08
1 00:00:00:00:fa:01 no 0.13
1 00:1c:7f:00:00:fa no 58.57
1 00:1c:7f:61:59:ed no 0.94
1 0a:91:42:92:7e:34 no 0.03
1 3a:1b:98:de:3f:22 no 4.77
1 52:54:00:45:88:aa no 3.07
1 52:54:00:ad:e6:7f yes 0.00
1 52:54:00:ad:e6:7f yes 0.00
2 6e:dc:5b:62:85:fc yes 0.00
2 6e:dc:5b:62:85:fc yes 0.00
1 9a:cc:22:de:59:78 no 0.72
1 fa:16:3e:98:83:ad no 7.46
2 fa:16:3e:a6:f9:c2 no 0.70
2 fa:16:3e:b3:24:6c no 3.68
1 fa:16:3e:cc:ba:5e no 119.58
2 fa:16:3e:db:8b:a1 no 0.07
..and investigate the ports 1 and 2 on that
Linux bridge:
root@node-300:~# brctl showstp br-ex
br-ex
bridge id 8000.525400ade67f
….
eth1 (1)
port id 8001 state forwarding
designated root 8000.525400ade67f path cost 100
designated bridge 8000.525400ade67f message age timer 0.00
p_ff798dba-0 (2)
port id 8002 state forwarding
designated root 8000.525400ade67f path cost 100
designated bridge 8000.525400ade67f message age timer 0.00
So we forward Ethernet frames on br-ex to the physical NIC eth1.
We are here
[Diagram, Node-105: VM1 (192.168.0.6), eth0 -> tap114792f5-70 -> Linux bridge qbr114792f5-70 -> qvb114792f5-70 -> port 658 (qvo114792f5-70) on br-int, VLAN 201 -> port 1 (patch-tun) -> port 1 (patch-int) on br-tun -> port 67 (vxlan-c612fc36). Node-300: port 32 (vxlan-c612fc24) on br-tun -> port 1 (patch-int) -> port 32 (patch-tun) on br-int, VLAN 304 -> port 4299 (qr-2deb344d-fb) -> namespace qrouter-f2a1fb01-46fa-43c7-bbf9-9bff27a73ac9 with qr-xxx (1301), qg-xxx (1300) and the routing table shown above -> port 4298 (qg-5ec3d71d-ec) on br-int, VLAN 66 -> port 10 (int-br-floating) -> port 2 (phy-br-floating) on br-floating -> port 1 (p_ff798dba-0) -> br-ex -> eth1]
SNAT within the namespace
root@node-300:~# ip netns exec qrouter-f2a1fb01-46fa-43c7-bbf9-9bff27a73ac9 ip a
1300: qg-5ec3d71d-ec: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UNKNOWN group default qlen 1
link/ether fa:16:3e:59:03:0f brd ff:ff:ff:ff:ff:ff
inet 10.37.193.93/22 scope global qg-5ec3d71d-ec
1301: qr-2deb344d-fb: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8950 qdisc noqueue state UNKNOWN group default qlen 1
link/ether fa:16:3e:fd:9e:61 brd ff:ff:ff:ff:ff:ff
inet 192.168.0.1/24 scope global qr-2deb344d-fb
root@node-300:~# ip netns exec qrouter-f2a1fb01-46fa-43c7-bbf9-9bff27a73ac9 ip r
default via 10.37.192.1 dev qg-5ec3d71d-ec
10.37.192.0/22 dev qg-5ec3d71d-ec proto kernel scope link src 10.37.193.93
192.168.0.0/24 dev qr-2deb344d-fb proto kernel scope link src 192.168.0.1
The private network is connected to the admin_floating_net via vRouter-1.
The IP address of the qg-xxx interface is the IP of the vRouter-1 interface connected to the
admin_floating_net. It is the public IP used for SNAT.
SNAT within the namespace
root@node-300:~# ip netns exec qrouter-f2a1fb01-46fa-43c7-bbf9-9bff27a73ac9 iptables -t nat -S
-P PREROUTING ACCEPT
-P INPUT ACCEPT
-P OUTPUT ACCEPT
-P POSTROUTING ACCEPT
-N neutron-l3-agent-OUTPUT
-N neutron-l3-agent-POSTROUTING
-N neutron-l3-agent-PREROUTING
-N neutron-l3-agent-float-snat
-N neutron-l3-agent-snat
-N neutron-postrouting-bottom
-A PREROUTING -j neutron-l3-agent-PREROUTING
-A OUTPUT -j neutron-l3-agent-OUTPUT
-A POSTROUTING -j neutron-l3-agent-POSTROUTING
-A POSTROUTING -j neutron-postrouting-bottom
-A neutron-l3-agent-POSTROUTING ! -i qg-5ec3d71d-ec ! -o qg-5ec3d71d-ec -m conntrack ! --ctstate DNAT -j ACCEPT
-A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 8775
-A neutron-l3-agent-snat -j neutron-l3-agent-float-snat
-A neutron-l3-agent-snat -o qg-5ec3d71d-ec -j SNAT --to-source 10.37.193.93
-A neutron-l3-agent-snat -m mark ! --mark 0x2/0xffff -m conntrack --ctstate DNAT -j SNAT --to-source 10.37.193.93
-A neutron-postrouting-bottom -m comment --comment "Perform source NAT on outgoing traffic." -j neutron-l3-agent-snat
In the iptables NAT table we can find the SNAT rules.
1:1 NAT – floating IP
Now I've assigned the floating IP 10.37.193.94/32 to VM1,
and we can see 10.37.193.94 as an additional IP address on the qg-xxx interface:
root@node-300:~# ip netns exec qrouter-f2a1fb01-46fa-43c7-bbf9-9bff27a73ac9 ip a
1300: qg-5ec3d71d-ec: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UNKNOWN group default qlen 1
link/ether fa:16:3e:59:03:0f brd ff:ff:ff:ff:ff:ff
inet 10.37.193.93/22 scope global qg-5ec3d71d-ec
inet 10.37.193.94/32 scope global qg-5ec3d71d-ec
valid_lft forever preferred_lft forever
1301: qr-2deb344d-fb: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8950 qdisc noqueue state UNKNOWN group default qlen 1
link/ether fa:16:3e:fd:9e:61 brd ff:ff:ff:ff:ff:ff
inet 192.168.0.1/24 scope global qr-2deb344d-fb
The routing table is still the same:
root@node-300:~# ip netns exec qrouter-f2a1fb01-46fa-43c7-bbf9-9bff27a73ac9 ip r
default via 10.37.192.1 dev qg-5ec3d71d-ec
10.37.192.0/22 dev qg-5ec3d71d-ec proto kernel scope link src 10.37.193.93
192.168.0.0/24 dev qr-2deb344d-fb proto kernel scope link src 192.168.0.1
1:1 NAT – floating IP
root@node-300:~# ip netns exec qrouter-f2a1fb01-46fa-43c7-bbf9-9bff27a73ac9 iptables -t nat -S
-P PREROUTING ACCEPT
-P INPUT ACCEPT
-P OUTPUT ACCEPT
-P POSTROUTING ACCEPT
-N neutron-l3-agent-OUTPUT
-N neutron-l3-agent-POSTROUTING
-N neutron-l3-agent-PREROUTING
-N neutron-l3-agent-float-snat
-N neutron-l3-agent-snat
-N neutron-postrouting-bottom
-A PREROUTING -j neutron-l3-agent-PREROUTING
-A OUTPUT -j neutron-l3-agent-OUTPUT
-A POSTROUTING -j neutron-l3-agent-POSTROUTING
-A POSTROUTING -j neutron-postrouting-bottom
-A neutron-l3-agent-OUTPUT -d 10.37.193.94/32 -j DNAT --to-destination 192.168.0.6
-A neutron-l3-agent-POSTROUTING ! -i qg-5ec3d71d-ec ! -o qg-5ec3d71d-ec -m conntrack ! --ctstate DNAT -j ACCEPT
-A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 8775
-A neutron-l3-agent-PREROUTING -d 10.37.193.94/32 -j DNAT --to-destination 192.168.0.6
-A neutron-l3-agent-float-snat -s 192.168.0.6/32 -j SNAT --to-source 10.37.193.94
-A neutron-l3-agent-snat -j neutron-l3-agent-float-snat
-A neutron-l3-agent-snat -o qg-5ec3d71d-ec -j SNAT --to-source 10.37.193.93
-A neutron-l3-agent-snat -m mark ! --mark 0x2/0xffff -m conntrack --ctstate DNAT -j SNAT --to-source 10.37.193.93
-A neutron-postrouting-bottom -m comment --comment "Perform source NAT on outgoing traffic." -j neutron-l3-agent-snat
In the iptables NAT table we can find the 1:1 NAT rules.
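The combined effect of the float-snat, snat and PREROUTING rules can be summarised as two small translations. This is a toy model using the addresses from this walkthrough, not a replacement for reading the chains; the function and variable names are hypothetical.

```sh
#!/bin/sh
FLOATING_IP=10.37.193.94   # floating IP assigned to VM1
FIXED_IP=192.168.0.6       # VM1's private address
ROUTER_GW_IP=10.37.193.93  # address of the qg-xxx interface

# Outbound: neutron-l3-agent-float-snat is evaluated first,
# then the router-wide SNAT rule in neutron-l3-agent-snat.
snat_source() {
    if [ "$1" = "$FIXED_IP" ]; then
        echo "$FLOATING_IP"
    else
        echo "$ROUTER_GW_IP"
    fi
}

# Inbound: the floating IP is DNATed to the instance; other destinations are unchanged.
dnat_destination() {
    if [ "$1" = "$FLOATING_IP" ]; then
        echo "$FIXED_IP"
    else
        echo "$1"
    fi
}

snat_source 192.168.0.6         # VM1 leaves with its floating IP
snat_source 192.168.0.7         # any other instance shares the router IP
dnat_destination 10.37.193.94   # traffic to the floating IP reaches VM1
```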
br-tun: MAC learning switch tunnel twist
Table 0: in_port=1 (br-int) -> table=2; in_port=x-y (br-tun) -> table=4
Table 2: if unicast -> table=20; if broadcast -> table=22
Table 4: Tunnel=x: set VLAN=x -> table=10
Table 10: populate table 20 with SRC_MAC, VLAN, TUNNEL_ID, then output to br-int
Table 20: MAC=x,VLAN=x,Tunnel=x,out_port=x; MAC=y,VLAN=y,Tunnel=y,out_port=y; if unknown unicast -> table=22
Table 22: remove VLAN-ID, send to all tunnel devices (VXLAN overlay network to the other br-tun)
dl_dst=00:00:00:00:00:00/01:00:00:00:00:00 matches all unicast Ethernet traffic
dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 matches all multicast (including broadcast) Ethernet traffic
Table numbers:
PATCH_LocalVS_TO_TUN = 2
GRE_TUN_TO_LocalVS = 3
VXLAN_TUN_TO_LocalVS = 4
LEARN_FROM_TUN = 10
UCAST_TO_TUN = 20
FLOOD_TO_TUN = 22
Command cheat sheet
brctl show
brctl showmacs <bridge>
brctl showstp <bridge>
brctl setageing <bridge> <time>
iptables -S
iptables -t nat -S
ip link
ip -d link
ip route [ get <IP> ]
ip n [flush all]
ip address
ethtool -S <interface veth>
ip netns [exec <namespace> [command] ]
ip netns exec <NS> tcpdump -e -n -l -i <interface>
ovs-vsctl show
ovs-vsctl list-br
ovs-vsctl list-ports <ovs-bridge>
ovs-vsctl list-ifaces <ovs-bridge>
ovs-vsctl list manager
ovs-vsctl list controller
ovs-ofctl show <ovs-bridge>
ovs-ofctl dump-flows <ovs-bridge> [table=<n>]
ovs-appctl fdb/show <ovs-bridge>
tcpdump -i <tap-device> -n -e icmp
Special thanks to….
David Mahler and his fantastic tutorials:
https://www.youtube.com/channel/UCEoaojfEY_6L5TWWjIn9t9Q
Networking in too much detail:
https://www.rdoproject.org/networking/networking-in-too-much-detail/