Data Science Applications Seen Through Network Packets
HWCHIU @ RLADIES-TAIPEI
Who Am I
◦ Hung-Wei Chiu (hwchiu)
◦ MTS @ Open Networking Foundation
◦ Microsoft MVP
◦ Cloud and Datacenter Management
◦ Blog
◦ https://hwchiu.com
◦ Co-Organizer of SDNDS-TW/CNTUG
Meetup
◦ https://www.meetup.com/CloudNative-Taiwan/
COSCUP
◦ https://coscup.org/2019/
◦ 8/17
◦ SDN x CNTUG x Golang
◦ 180-person conference room
Outline
◦ Networking Model Introduction
◦ How do AI and networking work together?
◦ Traffic Classification
◦ Network Security
◦ Performance
Do You Know
◦ What happens when you type `google.com` into a browser?
Simple Answer
◦ DNS request
◦ What is the IP address of google.com?
◦ DNS reply
◦ google.com is 172.217.160.110
◦ HTTP request
◦ Send HTTP request to 172.217.160.110
◦ HTTP reply
◦ Get HTTP reply from 172.217.160.110
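The four steps above can be sketched with Python's standard `socket` module. This is a minimal illustration, not what a browser actually does: the helper names `build_http_request` and `fetch` are made up for this example, and real browsers add caching, HTTPS, redirects, and much more.

```python
import socket


def build_http_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request (hypothetical helper)."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, port: int = 80) -> bytes:
    """Resolve the name, then exchange one HTTP request and reply."""
    # DNS request and DNS reply: name -> IPv4 address
    ip = socket.gethostbyname(host)
    with socket.create_connection((ip, port), timeout=5) as conn:
        # HTTP request sent to the resolved address
        conn.sendall(build_http_request(host))
        # HTTP reply: read until the server closes the connection
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling `fetch("google.com")` on a machine with network access would walk through all four steps; the address returned by DNS may differ from the one shown on the slide.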
Better Answer
◦ Layer2, MAC
◦ Layer3, ICMP, ARP, IPv4
◦ Layer4, TCP, port 80
◦ Layer7, DNS, HTTP
◦ Routing Table
◦ ARP Table
◦ …etc
OSI – TCP/IP
https://techdifferences.com/difference-between-tcp-ip-and-osi-model.html
[Figure: OSI model (Layer1 to Layer7) compared with the TCP/IP model]
How it works
https://www.javatpoint.com/computer-network-tcp-ip-model
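[Figure: encapsulation between two hosts. Each host has the stack Apps (Layer7), TCP/UDP (Layer4), IP (Layer3), Layer2, Layer1. Going down the stack, the Layer7 data becomes the Layer4 data behind a Layer4 header; the Layer4 header and data become the Layer3 data behind a Layer3 header; the Layer3 header and data become the Layer2 data between a Layer2 header and a Layer2 footer. Layer1 carries the result as bits, e.g. 0101010101010101010110111010101010101010111111110100011111111110.]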
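The nesting of headers and data in the figure can be sketched in a few lines of Python. The headers here are deliberately simplified assumptions (ports only for Layer4, addresses only for Layer3, and a CRC32 standing in for the Layer2 footer); real TCP, IP, and Ethernet headers carry many more fields. The addresses are illustrative example values.

```python
import socket
import struct
import zlib


def mac_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def layer4(src_port: int, dst_port: int, data: bytes) -> bytes:
    # Simplified Layer4 header: source and destination port only
    return struct.pack("!HH", src_port, dst_port) + data


def layer3(src_ip: str, dst_ip: str, data: bytes) -> bytes:
    # Simplified Layer3 header: source and destination IPv4 address only
    return socket.inet_aton(src_ip) + socket.inet_aton(dst_ip) + data


def layer2(src_mac: str, dst_mac: str, data: bytes) -> bytes:
    # Simplified Layer2 header plus a checksum footer
    header = mac_bytes(dst_mac) + mac_bytes(src_mac)
    footer = struct.pack("!I", zlib.crc32(header + data))
    return header + data + footer


payload = b"hello"                                          # Layer7 data
segment = layer4(54321, 80, payload)                        # Layer4 header + data
packet = layer3("192.168.1.2", "140.112.172.17", segment)   # Layer3 header + data
frame = layer2("38:f9:d3:27:45:ca", "00:11:32:aa:bb:cc", packet)

# Layer1 view: the same frame as a string of bits
bits = "".join(f"{byte:08b}" for byte in frame)
print(len(payload), len(segment), len(packet), len(frame))
print(bits[:64], "...")
```

Each layer treats everything handed down from above as opaque data and only adds its own header (and, at Layer2, a footer).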
Layer4
◦ TCP/UDP
◦ TCP
◦ Transmission Control Protocol
◦ Reliable Protocol
◦ TCP guarantees the recipient will receive the packets in order by numbering them.
◦ UDP
◦ User Datagram Protocol
◦ Drops TCP's acknowledgment, retransmission, and ordering (only a checksum remains)
◦ Used when speed matters more than guaranteed delivery.
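As a rough sketch of how numbering lets the receiver restore order, the function below reassembles segments that arrive out of order or duplicated. It assumes sequence numbers count bytes and start at 0, and it leaves out everything else TCP does (acknowledgments, retransmission, windows); `reassemble` is a name invented for this example.

```python
def reassemble(segments):
    """Rebuild the byte stream from (sequence_number, data) pairs."""
    expected = 0          # next byte offset the application is waiting for
    buffered = {}         # out-of-order segments, keyed by sequence number
    stream = bytearray()
    for seq, data in segments:
        if seq < expected:
            continue      # duplicate of data that was already delivered
        buffered.setdefault(seq, data)
        # Deliver every segment that is now contiguous with the stream
        while expected in buffered:
            chunk = buffered.pop(expected)
            stream += chunk
            expected += len(chunk)
    return bytes(stream)


arrived = [(5, b"world"), (0, b"hello"), (5, b"world"), (10, b"!")]
print(reassemble(arrived))
```

UDP has no such numbering, so an application using UDP would see the datagrams in whatever order they arrive.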
Layer3
◦ IPv4/IPv6/ARP/ICMP/IGMP
◦ IPv4
◦ 32 bits
◦ 140.112.1.1 (8 bits * 4)
◦ 255.255.255.255
◦ IPv6
◦ 128 bits
◦ 2001:db8:85a3:8d3:1319:8a2e:0370:7348 (4 bits * 4 * 8)
◦ HEX format (0-f)
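The address sizes above can be checked with Python's standard `ipaddress` module. This is only a small illustration using the two example addresses from the slide.

```python
import ipaddress

v4 = ipaddress.ip_address("140.112.1.1")
v6 = ipaddress.ip_address("2001:db8:85a3:8d3:1319:8a2e:0370:7348")

# Address width in bits for each version
print(v4.version, v4.max_prefixlen)
print(v6.version, v6.max_prefixlen)

# The IPv4 address as one 32-bit number, shown as 4 groups of 8 bits
v4_bits = f"{int(v4):032b}"
print(".".join(v4_bits[i:i + 8] for i in range(0, 32, 8)))

# The IPv6 address with every group written as 4 hex digits (8 groups)
print(v6.exploded)

# The largest IPv4 address has all 32 bits set
largest_v4 = ipaddress.ip_address("255.255.255.255")
print(int(largest_v4) == 2**32 - 1)
```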
Layer2
◦ MAC address
◦ 42:00:f5:18:39:01
◦ VLAN
◦ Switching
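Switching by MAC address can be sketched as a learning switch: it remembers which port each source MAC was seen on, forwards to that port when the destination is known, and floods otherwise. The class below is a simplified illustration (no VLANs, no table aging), and its names are invented for this example.

```python
class LearningSwitch:
    """Toy Layer2 switch that learns MAC-to-port mappings."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.table = {}  # MAC address -> port where it was last seen

    def handle(self, in_port, src_mac, dst_mac):
        """Learn the sender's port and return the list of output ports."""
        self.table[src_mac] = in_port
        if dst_mac in self.table:
            out_port = self.table[dst_mac]
            # Never send a frame back out of the port it came in on
            return [] if out_port == in_port else [out_port]
        # Unknown destination: flood to every other port
        return [port for port in self.ports if port != in_port]


switch = LearningSwitch(ports=[1, 2, 3])
host_a, host_b = "42:00:f5:18:39:01", "00:11:32:aa:bb:cc"
first = switch.handle(1, host_a, host_b)    # host_b unknown, so flood
second = switch.handle(2, host_b, host_a)   # host_a was learned on port 1
third = switch.handle(1, host_a, host_b)    # host_b was learned on port 2
print(first, second, third)
```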
TCP/IP Analogy
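1. The mail carrier collects letters from each mailbox.
2. Back at the branch office, sorters split the mail into local and non-local (non-local mail is roughly grouped by county or city into piles).
3. Mail is consolidated from the branch offices at the regional administration office (e.g. Taipei goes to the Northern Taiwan Postal Administration), where letters for the same county or city are gathered in bulk.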
https://tw.answers.yahoo.com/question/index?qid=20130317000016KK04415
TCP/IP Analogy
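1. Mail is transported to the destination's regional administration office (Taipei → Tainan; Northern Taiwan Postal Administration → Southern Taiwan Postal Administration).
2. Sorters at the administration office split the mail among branch offices by postal code.
3. Mail is delivered to each branch office, where sorters divide it by street zone (for example, odd and even house numbers, or different sections of the same road, may be handled by different carriers).
4. Each zone's mail is assigned to the responsible carrier, who sorts it into delivery-route order.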
https://tw.answers.yahoo.com/question/index?qid=20130317000016KK04415
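[Figure: sender and receiver each pass a letter through the same stages (letter contents, mail carrier, local post office (branch), sorting by destination, regional administration office); the two sides are linked by car, plane, ship, etc.]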
TCP/IP Analogy
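[Figure: one side with all stages (letter contents, mail carrier, local post office (branch), sorting by destination, regional administration office) and one stop with only the post office, sorting, and administration stages; the link is car, plane, ship, etc.]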
Cross Regions
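[Figure: two end points with all stages (letter contents, mail carrier, local post office (branch), sorting by destination, regional administration office) and an intermediate region with only the post office, sorting, and administration stages.]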
Cable/WIFI
TCP/IP Analogy
◦ Layer1
◦ Layer2: MAC Address, 38:f9:d3:27:45:ca -> 00:11:32:aa:bb:cc
◦ Layer3: IP Address, 192.168.1.3 -> 140.112.172.12
◦ Layer4: TCP/UDP Port Number, 52136 -> 80
◦ Layer7: Data, hwchiu
Frame
Request
◦ Layer2: 38:f9:d3:27:45:ca -> 00:11:32:aa:bb:cc
◦ Layer3: 192.168.1.2 -> 140.112.172.17
◦ Layer4: 54321 -> 80
◦ Layer7: data
Reply
◦ Layer2: 00:11:32:aa:bb:cc -> 38:f9:d3:27:45:ca
◦ Layer3: 140.112.172.17 -> 192.168.1.2
◦ Layer4: 80 -> 54321
◦ Layer7: data
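The request and reply frames mirror each other: every source and destination pair is swapped at Layer2, Layer3, and Layer4. A small sketch with a hypothetical `Frame` dataclass (the field names and payload bytes are assumptions made for this example) makes that explicit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    data: bytes

    def reply(self, data: bytes) -> "Frame":
        """Build the reply by swapping source and destination at each layer."""
        return Frame(
            src_mac=self.dst_mac, dst_mac=self.src_mac,      # Layer2
            src_ip=self.dst_ip, dst_ip=self.src_ip,          # Layer3
            src_port=self.dst_port, dst_port=self.src_port,  # Layer4
            data=data,                                       # Layer7
        )


request = Frame(
    src_mac="38:f9:d3:27:45:ca", dst_mac="00:11:32:aa:bb:cc",
    src_ip="192.168.1.2", dst_ip="140.112.172.17",
    src_port=54321, dst_port=80,
    data=b"request",
)
response = request.reply(b"reply")
print(response)
```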
Demo Example
◦ Download Wireshark
◦ Use Wireshark to capture and analyze packets.
◦ Use telnet to connect to ptt.cc
Ptt.cc (Ideally)
[Figure: Laptop and PTT Server exchange one request and one reply directly.]
Ptt.cc (Real World)
[Figure: Laptop -> Wifi Router -> Building Gateway -> CHT Router -> NTU Gateway -> CS Server -> PTT Server. Each hop forwards its own request (Request 1 ... Request n) and reply (Reply 1 ... Reply n). The intermediate nodes handle only MAC and IP; the two end points handle MAC, IP, TCP, and DATA.]
◦ https://www.researchgate.net/figure/Simple-fat-tree-topology-Using-the-two-level-routing-tables-packets-from-source_fig5_283841929
Trace Routing
◦ Demo
◦ Traceroute/tracert
◦ `traceroute ptt.cc`
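Traceroute discovers hops by sending probes with TTL 1, 2, 3, and so on: each router decrements the TTL and, when it reaches zero, discards the probe and reports back. The simulation below illustrates that idea only. It sends no real packets, and the hop list is an assumption borrowed from the "Ptt.cc (Real World)" slide rather than a measured path.

```python
PATH = [
    "Wifi Router", "Building Gateway", "CHT Router",
    "NTU Gateway", "CS Server", "PTT Server",
]


def send_probe(path, ttl):
    """Return (node that answered, whether it was the destination)."""
    for index, hop in enumerate(path):
        if index == len(path) - 1:
            return hop, True      # probe reached the destination
        ttl -= 1                  # each router decrements the TTL
        if ttl == 0:
            return hop, False     # router discards the probe and reports back
    raise ValueError("path must not be empty")


def traceroute(path, max_ttl=30):
    """Probe with increasing TTL until the destination answers."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        hop, reached = send_probe(path, ttl)
        hops.append(hop)
        if reached:
            break
    return hops


for number, hop in enumerate(traceroute(PATH), start=1):
    print(number, hop)
```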
AI & Networking
◦ Which services and functions do we use in networking?
◦ Traffic Classification
◦ Security
◦ Performance
◦ Management
Traffic Classification
◦ Can we handle packets according to their application?
◦ Layer7 (Apps)
◦ Difficult to identify
◦ No rules
◦ Maybe some patterns?
Traffic Classification
◦ Layer2/Layer3/Layer4/Layer7
◦ Put them all together
◦ What about HTTPS?
◦ Layer 7 is encrypted.
◦ DPI
◦ Deep Packet Inspection
◦ IPS/IDS
◦ Intrusion Prevention System
◦ Intrusion Detection System
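A toy version of payload inspection looks at the first bytes of the Layer7 data and matches well-known patterns. The sketch below is an assumption-laden illustration, not how any particular DPI engine works: it recognizes HTTP by its request method, SSH by its `SSH-` identification string, and TLS by the handshake record header (content type 0x16, major version 0x03).

```python
HTTP_METHODS = {b"GET", b"POST", b"PUT", b"DELETE", b"HEAD", b"OPTIONS", b"PATCH"}


def classify_payload(payload: bytes) -> str:
    """Guess the application protocol from the start of the payload."""
    if payload.startswith(b"SSH-"):
        return "ssh"
    if payload.split(b" ", 1)[0] in HTTP_METHODS:
        return "http"
    # TLS handshake record: content type 0x16, major version 0x03
    if len(payload) >= 3 and payload[0] == 0x16 and payload[1] == 0x03:
        return "tls"
    return "unknown"


samples = [
    b"GET / HTTP/1.1\r\nHost: ptt.cc\r\n\r\n",
    b"SSH-2.0-OpenSSH_8.0\r\n",
    bytes([0x16, 0x03, 0x01, 0x02, 0x00]),
    b"\x00\x01",
]
print([classify_payload(sample) for sample in samples])
```

For HTTPS the sketch can only answer "tls": once Layer7 is encrypted, the payload no longer reveals which application is inside.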
Example
◦ Dropbox
◦ https://github.com/ntop/nDPI/blob/dev/src/lib/protocols/dropbox.c
◦ RADIUS
◦ https://github.com/ntop/nDPI/blob/dev/src/lib/protocols/radius.c
◦ OpenVPN
◦ https://github.com/ntop/nDPI/blob/f47be6ef6045a97a20f7a929d15a0354260c0414/src/lib/protocols/openvpn.c
Traffic Classification
◦ Payload-based traffic classification
◦ Higher Computation
◦ Storage Cost
◦ Encryption
◦ Host behavior-based traffic classification
◦ Observe from the edge of the network and examine traffic between hosts
◦ How many hosts are contacted
◦ How many different ports are involved
◦ Flow Feature-based traffic classification
◦ Consider a communication session, which consists of a pair of complete flows.
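The host behavior and flow features mentioned above could be extracted from captured packet records roughly as follows. The record layout (dicts with `time`, `src_ip`, `dst_ip`, `src_port`, `dst_port`, `proto`, `length`), the chosen features, and the sample packets are all assumptions made for this sketch; the papers in this area use richer feature sets.

```python
from collections import defaultdict


def session_key(pkt):
    """Same key for both directions, so a session pairs its two flows."""
    a = (pkt["src_ip"], pkt["src_port"])
    b = (pkt["dst_ip"], pkt["dst_port"])
    return (pkt["proto"],) + tuple(sorted([a, b]))


def flow_features(packets):
    """Per-session packet count, byte count, duration, and mean size."""
    sessions = defaultdict(list)
    for pkt in packets:
        sessions[session_key(pkt)].append(pkt)
    features = {}
    for key, pkts in sessions.items():
        times = [p["time"] for p in pkts]
        sizes = [p["length"] for p in pkts]
        features[key] = {
            "packets": len(pkts),
            "bytes": sum(sizes),
            "duration": max(times) - min(times),
            "mean_size": sum(sizes) / len(sizes),
        }
    return features


def host_behavior(packets, host):
    """How many hosts and destination ports one host contacted."""
    sent = [p for p in packets if p["src_ip"] == host]
    return {
        "hosts_contacted": len({p["dst_ip"] for p in sent}),
        "ports_used": len({p["dst_port"] for p in sent}),
    }


def pkt(time, src_ip, src_port, dst_ip, dst_port, proto, length):
    return {"time": time, "src_ip": src_ip, "src_port": src_port,
            "dst_ip": dst_ip, "dst_port": dst_port,
            "proto": proto, "length": length}


# Made-up sample capture
packets = [
    pkt(0.0, "192.168.1.2", 54321, "140.112.172.17", 80, "tcp", 60),
    pkt(0.1, "140.112.172.17", 80, "192.168.1.2", 54321, "tcp", 1500),
    pkt(0.3, "192.168.1.2", 54321, "140.112.172.17", 80, "tcp", 40),
    pkt(0.2, "192.168.1.2", 40000, "192.168.1.1", 53, "udp", 70),
]
features = flow_features(packets)
behavior = host_behavior(packets, "192.168.1.2")
print(features)
print(behavior)
```

Feature vectors like these, rather than raw payloads, are what a flow-based classifier would be trained on, which is why the approach still works when the payload is encrypted.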
Paper Study
◦ QoS-aware Traffic Classification Architecture Using Machine Learning and Deep Packet Inspection in SDNs
◦ MultiClassifier: A combination of DPI and ML for application-layer classification in SDN
◦ On Internet Traffic Classification: A Two-Phased Machine Learning Approach
◦ ...etc
Security (Simple Approach)
◦ Rule-based filtering
◦ Iptables (Linux)
◦ Based on packet headers
◦ Layer2
◦ MAC address
◦ Layer3
◦ IPv4, IPv6
◦ Layer4
◦ TCP/UDP
◦ Port number
Simple Approach
◦ Drop all SSH connections (port 22)
◦ 22 is only the default; users can change it
◦ Drop HTTP connections (port 80)
◦ 80 is only the default; users can change it as well
◦ Drop a source IP (1.2.3.4)
◦ Users can change their source IP with a VPN or proxy
◦ Drop destination IP
◦ Drop destination Port
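The rules above amount to a first-match lookup on header fields. The sketch below uses a made-up rule format (plain Python dicts, not iptables syntax) to show both the idea and its weakness: moving SSH to another port slips past the port 22 rule.

```python
RULES = [
    {"dst_port": 22, "action": "drop"},       # SSH on its default port
    {"dst_port": 80, "action": "drop"},       # HTTP on its default port
    {"src_ip": "1.2.3.4", "action": "drop"},  # one specific source address
]


def decide(packet, rules=RULES, default="accept"):
    """Return the action of the first rule whose fields all match."""
    for rule in rules:
        conditions = {k: v for k, v in rule.items() if k != "action"}
        if all(packet.get(field) == value for field, value in conditions.items()):
            return rule["action"]
    return default


print(decide({"src_ip": "10.0.0.5", "dst_port": 22}))    # SSH, default port
print(decide({"src_ip": "10.0.0.5", "dst_port": 2222}))  # SSH moved to 2222
print(decide({"src_ip": "1.2.3.4", "dst_port": 443}))    # blocked source
```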
Amazon
Security
◦ Misuse-based intrusion detection
◦ Monitor the network and match the network activities against the expected behavior of an attack
◦ Anomaly-based intrusion detection
◦ Flow feature-based
◦ Payload-based anomaly detection
◦ Deep and reinforcement learning for intrusion detection
◦ Hybrid intrusion detection
https://jisajournal.springeropen.com/articles/10.1186/s13174-018-0087-2#Sec49
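To give a feel for the anomaly-based approach, the sketch below learns a baseline from a flow feature observed during normal operation and flags values that deviate too far from it. A z-score threshold is used here purely as a simple stand-in for the machine learning models the slide refers to; the feature (bytes per flow), the sample values, and the threshold of 3 are assumptions.

```python
from statistics import mean, stdev


def fit_baseline(values):
    """Summarize normal traffic as (mean, standard deviation)."""
    return mean(values), stdev(values)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Made-up bytes-per-flow values observed during normal operation
normal_bytes = [100, 110, 95, 105, 90, 102, 98]
baseline = fit_baseline(normal_bytes)

print(is_anomalous(104, baseline))
print(is_anomalous(5000, baseline))
```

Unlike misuse-based detection, nothing here describes a known attack; the detector only knows what normal looks like, so it can flag unseen attacks but also raises false alarms on unusual yet harmless traffic.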
Paper Study
◦ Artificial Neural Networks for Misuse Detection
◦ HYBRID NEURAL NETWORK AND C4.5 FOR MISUSE DETECTION
◦ Modeling intrusion detection system using hybrid intelligent systems
◦ Network Anomaly Detection by Cascading K-Means Clustering and C4.5 Decision Tree algorithm
◦ …
Performance
◦ Traffic Routing
◦ Traffic Prediction
◦ Resource Management
◦ QoS
Traffic Routing
◦ Select a path for packet transmission
◦ Cost minimization
◦ Maximization of link utilization
◦ QoS provisioning
https://www.youtube.com/watch?v=3MSYBAK-Y_E
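Path selection with cost minimization is classically done with a shortest-path algorithm. The sketch below uses Dijkstra's algorithm over a small made-up topology; the node names and link costs are hypothetical, and a learning-based router would replace or tune the static costs rather than this search itself.

```python
import heapq


def cheapest_path(graph, source, target):
    """Dijkstra's algorithm: return (total cost, path) or None if unreachable."""
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return None


# Hypothetical topology: node -> {neighbor: link cost}
graph = {
    "A": {"B": 1, "C": 1, "D": 4},
    "B": {"A": 1, "D": 1},
    "C": {"A": 1, "D": 5},
    "D": {"A": 4, "B": 1, "C": 5},
}
print(cheapest_path(graph, "A", "D"))
```

Changing a link cost (for example, raising it when a link is congested) changes the selected path, which is where traffic prediction and classification can feed into routing.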
Traffic Routing
◦ Traffic Prediction
◦ Bandwidth?
◦ Traffic Classification
◦ What application?
◦ Latency sensitive?
◦ Traffic Routing
◦ Modify routing rules.
Project ONAP
Q&A

How Networking works with Data Science