iptables and Kubernetes

These slides cover the architecture of iptables and show how to implement your own iptables module.
Building on that understanding, we implement Layer 7 DNS parsing in an iptables module.
We then study how Kubernetes Services work and explain why Kubernetes cannot do Layer 7 load balancing for TCP connections but can for UDP.

Published in: Software

  1. IPTABLES & Kubernetes (HungWei Chiu)
  2. HungWei Chiu
     • MTS @ ONF
     • SDNDS-TW/CNTUG
     • Linux/Network/Container/Kubernetes
     • Kubernetes Courses @ Hiskio
  3. IPTABLES Series
     • Introduction to IPTABLES: learn IPTABLES in a Docker environment.
     • Implementation of IPTABLES: user space/kernel space; implement our own iptables modules.
     • Kubernetes Service discussion: Layer 4 load-balancing, why? Modify the kernel module to make it support Layer 7, really?
  4. IPTABLES/EBTABLES
     • Examples:
       iptables -t nat -A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
       ebtables -t filter -I INPUT --log --log-prefix 'ctc/ebtable/filter-input' --log-level debug
     • Components:
       Chain -> Insert/Append (-I/-A)
       Table
       Match (module) -> built-in/module
       Target (module) -> built-in/module
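
To illustrate the components listed on this slide, the following minimal sketch splits an iptables command line into its table, chain operation, match modules, and target. `split_rule` is a hypothetical helper written for this example; it is not part of iptables, and it only recognises the `-t`, `-A`/`-I`, `-m`, and `-j` options.

```python
import shlex


def split_rule(command):
    """Split an iptables command line into the components named on the slide.

    Hypothetical helper for illustration only: every option other than
    -t, -A, -I, -m and -j is ignored.
    """
    tokens = shlex.split(command)[1:]  # drop the program name
    parts = {
        "table": "filter",  # iptables uses the filter table when -t is omitted
        "operation": None,
        "chain": None,
        "matches": [],
        "target": None,
    }
    it = iter(tokens)
    for tok in it:
        if tok == "-t":
            parts["table"] = next(it, None)
        elif tok in ("-A", "-I"):
            parts["operation"] = "append" if tok == "-A" else "insert"
            parts["chain"] = next(it, None)
        elif tok == "-m":
            parts["matches"].append(next(it, None))
        elif tok == "-j":
            parts["target"] = next(it, None)
    return parts


EXAMPLE = ("iptables -t nat -A OUTPUT ! -d 127.0.0.0/8 "
           "-m addrtype --dst-type LOCAL -j DOCKER")

if __name__ == "__main__":
    for name, value in split_rule(EXAMPLE).items():
        print(f"{name}: {value}")
```

For the iptables example from the slide, the sketch reports the nat table, an append to the OUTPUT chain, the addrtype match module, and the DOCKER target.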
  5. Last Time
  6. Demo Environment
     • Diagram labels: ContainerA, Linux Bridge, Eth0, Eth0, Veth0
     • https://github.com/hwchiu/network-study/tree/master/iptables/k8s
     • Customized IPTABLES module
     • Drop a DNS packet if its requested domain is the one we expect.
     • Layer 7 processing.
  7. Architecture
     • User space: parameters, usage
     • Kernel space: implementation
     • Communicate via getsockopt/setsockopt
  8. User Space
     • Repo: git.netfilter.org/iptables
     • Version: 1.6.1 (in my demo)
     • Includes the iptables commands and modules.
     • Parses parameters and then sends them to the kernel.
     • Written in C.
  9. User Space
     • Directories: extension (old/new style), iptables
     • Extension: parses the arguments and stores them in memory.
     • iptables: sends the data to kernel space later.
  10. Implement
      • Implement a basic module.
      • Supports one argument: domain (string).
      • Name -> hwchiu
      • Output: iptables -L, iptables-save
  11. Kernel Space
      • Repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
      • Version: 4.15 (in my demo)
      • The whole Linux kernel.
      • Implements the function (match/target).
      • Built either as a built-in function or as a kernel module.
      • Written in C.
  12. Kernel Space
      • net/netfilter/
      • Implement the function, then register it with the kernel.
      • It is called when someone (iptables) invokes it through setsockopt/getsockopt.
  13. DNS Format (RFC 1035)
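
As a rough illustration of the RFC 1035 layout a DNS-matching module has to walk, the sketch below builds a query and extracts the question name. A DNS message starts with a 12-byte header, followed by the question, whose QNAME is a sequence of length-prefixed labels ending in a zero byte. The module in the talk is written in C for the kernel; this Python version only mirrors the parsing logic. The function names, including `should_drop`, are hypothetical and are not taken from the demo repository.

```python
import struct

DNS_HEADER_LEN = 12  # ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT (2 bytes each)


def encode_qname(domain):
    """Encode a domain as length-prefixed labels terminated by a zero byte."""
    out = b""
    for label in domain.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError("each label must be 1..63 bytes long")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(domain, txid=0x1234):
    """Build a minimal query: header (RD flag set, one question) + question."""
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    question = encode_qname(domain) + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def parse_qname(payload):
    """Return the first question name found in a DNS message."""
    if len(payload) < DNS_HEADER_LEN:
        raise ValueError("payload shorter than a DNS header")
    pos = DNS_HEADER_LEN
    labels = []
    while True:
        if pos >= len(payload):
            raise ValueError("truncated question name")
        length = payload[pos]
        if length == 0:  # the zero byte ends the name
            break
        if length & 0xC0:
            raise ValueError("compression pointers are not handled in this sketch")
        pos += 1
        if pos + length > len(payload):
            raise ValueError("truncated label")
        labels.append(payload[pos:pos + length].decode("ascii"))
        pos += length
    return ".".join(labels)


def should_drop(payload, blocked_domain):
    """Hypothetical match decision: drop when the queried name is the blocked one."""
    return parse_qname(payload).lower() == blocked_domain.lower()
```

A kernel module would apply the same walk to the UDP payload of the sk_buff, with the same bounds checks, before returning its match result.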
  14. Summary
      • You can inspect the packet in iptables modules.
      • You can do anything you want.
      • You are coding in kernel space.
  15. Kubernetes Service
      • Why do people call it a Layer 4 load balancer?
      • We could do Layer 7 parsing in iptables.
  16. Kubernetes Service
      • ClusterIP
      • NodePort
      • LoadBalancer
      • Headless
  17. Demo Environment
      • Kubernetes
      • Pod
      • Service: ClusterIP
      • Deployment: three Pods
      • Diagram labels: Pod, Pod, ClusterIP
  18. Workflow
      • Packets enter from PRE_ROUTING or OUTPUT and go to KUBE-SERVICES.
      • KUBE-SERVICES -> KUBE-SVC-XXX: jump if the protocol/ClusterIP matches (match protocol; module: TCP/UDP).
      • KUBE-SVC-XXX -> KUBE-SEP-XXX: jump if the random module returns true (module: Random; chooses the endpoint).
      • KUBE-SEP-XXX: DNAT.
  19. Again
  20. Random
      • Request -> if P < 0.25 -> Endpoint1: P = 1/4 = 0.25
      • Otherwise, if P < 0.33 -> Endpoint2: P = 3/4 * 1/3 = 0.25
      • Otherwise, if P < 0.5 -> Endpoint3: P = 3/4 * 2/3 * 1/2 = 0.25
      • Otherwise -> Endpoint4: P = 1 - 0.75 = 0.25
      • Endpoints 1 to 4: 10.244.1.3, 10.244.2.32, 10.244.3.63, 10.244.3.23
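
The arithmetic on this slide can be checked with a short sketch. It assumes the rules are evaluated in order, that rule i of n matches with probability 1/(n - i), and that the last rule always matches, which is what the 0.25 / 0.33 / 0.5 thresholds suggest. The function names are made up for this example.

```python
from fractions import Fraction


def rule_probabilities(n):
    """Per-rule match probability for n endpoints: 1/n, 1/(n-1), ..., 1/2, 1."""
    return [Fraction(1, n - i) for i in range(n)]


def overall_probabilities(n):
    """Probability that a request ends up at each endpoint.

    A rule is only reached when every earlier rule did not match, so the
    chance of reaching it is the product of the earlier miss probabilities.
    """
    reach = Fraction(1)  # probability that evaluation reaches the current rule
    result = []
    for p in rule_probabilities(n):
        result.append(reach * p)
        reach *= 1 - p
    return result


if __name__ == "__main__":
    for index, p in enumerate(overall_probabilities(4), start=1):
        print(f"Endpoint{index}: {p}")
```

With exact fractions, each of the four endpoints comes out at 1/4, matching the slide, and the same holds for any number of endpoints.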
  21. Workflow
      • PRE_ROUTING: packets arrive with dest=10.96.121.30.
      • KUBE-SERVICES: jump if the protocol/ClusterIP matches (match protocol; module: UDP); dest=10.96.121.30.
      • KUBE-SVC-XXX: jump if the random module returns true (module: Random; chooses the endpoint); dest=10.96.121.30.
      • KUBE-SEP-XXX: DNAT; dest=10.244.0.3.
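
The chain traversal on this slide can be modelled in a few lines. This is a simplified simulation, not kube-proxy code: the `Packet` class, the function names, and the `SERVICES` table are hypothetical, and the endpoint list is illustrative (only 10.96.121.30 and 10.244.0.3 come from this slide).

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    proto: str
    dst: str


def kube_sep(packet, endpoint_ip):
    """KUBE-SEP-XXX: DNAT rewrites the destination to the chosen endpoint."""
    return replace(packet, dst=endpoint_ip)


def kube_svc(packet, endpoints, rng):
    """KUBE-SVC-XXX: try the endpoint rules in order; the last one always matches."""
    n = len(endpoints)
    for i, endpoint_ip in enumerate(endpoints):
        is_last = i == n - 1
        if is_last or rng.random() < 1.0 / (n - i):
            return kube_sep(packet, endpoint_ip)
    return packet  # only reached when the endpoint list is empty


def kube_services(packet, services, rng):
    """KUBE-SERVICES: jump to the service chain if protocol and ClusterIP match."""
    endpoints = services.get((packet.proto, packet.dst))
    if not endpoints:
        return packet  # not addressed to a known service, leave it untouched
    return kube_svc(packet, endpoints, rng)


# Illustrative service table: (protocol, ClusterIP) -> endpoint IPs.
SERVICES = {("udp", "10.96.121.30"): ["10.244.0.3", "10.244.1.3", "10.244.2.32"]}

if __name__ == "__main__":
    before = Packet(proto="udp", dst="10.96.121.30")
    after = kube_services(before, SERVICES, random.Random())
    print(before.dst, "->", after.dst)
```

The destination stays at the ClusterIP through KUBE-SERVICES and KUBE-SVC-XXX and only changes at the DNAT step, as the dest= annotations on the slide indicate.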
  22. Kubernetes Service
      • Two types of load balancer:
      • Client <----> LB <----> Server: two connections.
      • Client <----LB-----> Server: one connection.
  23. Kubernetes
      • Conntrack helps cache the NAT result.
      • Packets skip the NAT table if a conntrack entry already exists for them.
      • For TCP packets: Kubernetes does the DNAT during the three-way handshake, before any application data is sent.
      • For UDP packets: there is no three-way handshake.
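
The consequence of the conntrack behaviour on this slide can be shown with a toy model. This is an illustrative simulation, not kernel code; the class, its attributes, and the addresses are hypothetical. It records which payload was visible at the moment the NAT decision was made.

```python
import random


class NatWithConntrack:
    """Toy model: the NAT decision is made once per flow, then cached."""

    def __init__(self, endpoints, rng=None):
        self.endpoints = list(endpoints)
        self.rng = rng or random.Random()
        self.conntrack = {}          # flow -> chosen backend
        self.seen_at_decision = {}   # flow -> payload visible when NAT ran
        self.nat_table_visits = 0

    def forward(self, flow, payload=b""):
        """flow is a (proto, src_ip, src_port, dst_ip, dst_port) tuple."""
        if flow in self.conntrack:
            # An existing entry means the packet skips the NAT table.
            return self.conntrack[flow]
        self.nat_table_visits += 1
        backend = self.rng.choice(self.endpoints)
        self.conntrack[flow] = backend
        self.seen_at_decision[flow] = payload
        return backend


if __name__ == "__main__":
    nat = NatWithConntrack(["10.244.1.3", "10.244.2.32"])
    tcp_flow = ("tcp", "10.0.0.5", 40000, "10.96.121.30", 80)
    nat.forward(tcp_flow, b"")                 # SYN: carries no application data
    nat.forward(tcp_flow, b"GET / HTTP/1.1")   # data arrives after the decision
    udp_flow = ("udp", "10.0.0.5", 40001, "10.96.121.30", 53)
    nat.forward(udp_flow, b"dns-query-bytes")  # first UDP packet already has data
    print(nat.seen_at_decision)
```

For the TCP flow the decision is taken on an empty payload, so a Layer 7 rule has nothing to inspect; for the UDP flow the first packet already carries the application data.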
  24. Iptables Layer 7
      • Firewall (IPS/IDS): NFQUEUE, nDPI
      • Layer 7 load balancer: connectionless protocols (limited)
  25. (Demo) Kubernetes
      • Modify the statistic module to support UDP load-balancing.
      • Just for fun, to demonstrate what we can do from iptables.
      • Parameters (/proc):
        ClusterIP (focus on a specific service)
        Content, PodIP (forward to PodIP if the data includes the content)
        PodIP list (to know where we are)
      • Match workflow (limited by Random and the iptables structure):
        Roll back if it is a TCP packet.
        Roll back if its destination IP != ClusterIP.
        Return true if (1) the data contains the content and (2) PodIP equals the iptables rule's target.
        Else, return false.
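
The match workflow on this slide can be sketched as a single decision function. This is a Python approximation of logic that the demo implements inside the kernel's statistic module, not the demo's code. It assumes that "roll back" means falling back to the module's original random decision; the `L7Config` class, the `fallback` callable, and all parameter names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class L7Config:
    """Stand-in for the parameters the slide says are passed through /proc."""
    cluster_ip: str
    content: bytes
    pod_ip: str


def l7_match(proto, dst_ip, payload, rule_target_ip, config, fallback):
    """Decide whether the current KUBE-SVC rule should match this packet.

    fallback is a zero-argument callable standing in for the original
    random decision of the statistic module (an assumption of this sketch).
    """
    if proto == "tcp":
        return fallback()  # roll back: TCP packets keep the stock behaviour
    if dst_ip != config.cluster_ip:
        return fallback()  # roll back: packet is not for the service in focus
    # True only when the data contains the content AND this rule's target
    # is the configured PodIP; every other rule returns false.
    return config.content in payload and rule_target_ip == config.pod_ip


if __name__ == "__main__":
    cfg = L7Config(cluster_ip="10.96.121.30", content=b"hello", pod_ip="10.244.0.3")
    decision = l7_match("udp", "10.96.121.30", b"say hello", "10.244.0.3",
                        cfg, fallback=lambda: False)
    print("match" if decision else "no match")
```

Because the function is evaluated once per rule in KUBE-SVC-XXX, only the rule whose target is the configured PodIP can match a packet carrying the content.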
  26. One Question
      • If we send packets to the ClusterIP from a Kubernetes node, what happens?
      • UDP (Statistic + UDP)
      • How about ARP?
  27. Again
  28. Socket Programming (TCP)
      • https://wrytin.com/ramathakur/tcpip-reference-model-jvb79y39
  29. Kernel call chain (diagram), passing an sk_buff:
      • (fs/read_write.c) SYSCALL_DEFINE3(....)
      • (fs/read_write.c) vfs_write(....)
      • (fs/read_write.c) do_sync_write(....)
      • (net/socket.c) sock_aio_write(....)
      • (net/socket.c) __sock_sendmsg(....)
      • (security/security.c) security_socket_sendmsg(...)
      • (net/socket.c) __sock_sendmsg_nosec(...)
      • (net/ipv4/af_inet.c) inet_sendmsg(....)
      • (net/ipv4/tcp.c) tcp_sendmsg(...)
      • (net/ipv4/tcp.c) __tcp_push_pending_frames(...)
      • (net/ipv4/tcp.c) tcp_push_one(...)
      • (net/ipv4/tcp.c) tcp_write_xmit(...)
      • (net/ipv4/tcp.c) tcp_transmit_skb(...)
  30. Kernel call chain (diagram), passing an sk_buff:
      • (net/ipv4/ip_output.c) ip_queue_xmit(...)
      • (include/net/route.h) ip_route_output_ports(...)
      • (net/ipv4/route.c) ip_route_output_flow(...)
      • (net/xfrm/xfrm_policy.c) xfrm_lookup(...)
      • (net/ipv4/ip_output.c) ip_local_out(...)
      • (include/net/dst.h) dst_output(...)
      • (net/ipv4/ip_output.c) ip_output(....)
      • (net/ipv4/ip_output.c) ip_finish_output(...)
      • (net/ipv4/ip_output.c) ip_fragment(...)
      • (net/ipv4/ip_output.c) ip_finish_output2(...)
      • (include/net/neighbour.h) neigh_output(...)
      • (net/core/dev.c) dev_queue_xmit(...)
      • (net/core/dev.c) dev_hard_start_xmit(...)
      • (net/core/neighbour.c) neigh_resolve_output(...)
      • (net/ipv4/ip_output.c) __ip_local_out(....): nf_hook(NFPROTO_IPV4, NF_INET_LOCAL_OUT...), calls IPTABLES
      • (net/ipv4/ip_output.c) __ipv4_neigh_lookup_noref(....): look up the ARP table
      • (include/net/neighbour.h) __neigh_create(...): create the ARP record
  31. One more thing
      • COSCUP CFP: https://j.mp/2WmOemq
      • Cloud Native Hub
      • Telegram: https://t.me/cntug
      • Github: https://github.com/cloud-native-taiwan/meetups
      • MyBlog: https://www.hwchiu.com
  32. Q&A
