A summary of RFC 5996 (IKEv2).
・Contents
Overview of IPsec (original material)
Introduction (Section 1)
Header and Payload Formats (Section 3)
Exchanges and Payloads (Appendix C)
IKE Protocol Details and Variations (Section 2)
Differences from RFC 4306 (the previous IKEv2 RFC)
XDP in Practice: DDoS Mitigation @Cloudflare (C4Media)
Video and slides synchronized; mp3 and slide download available at https://bit.ly/2NtlaER.
Gilberto Bertin discusses the architecture of Cloudflare’s automatic DDoS mitigation pipeline, the initial packet filtering solution based on Iptables, and why Cloudflare had to introduce userspace offload. Bertin also describes how they switched from a proprietary offload technology to XDP for network stack bypass and how they are using XDP to load balance traffic. Filmed at qconlondon.com.
Gilberto Bertin works as a System Engineer at Cloudflare London. After working on a variety of technologies, such as P2P VPNs and userspace TCP/IP stacks, he joined the Cloudflare DDoS team in London to help filter out bad internet traffic.
“MPLS is a technique, not a service.”
The fundamental concept behind MPLS is that of labeling packets. In a traditional routed IP network,
each router makes an independent forwarding decision for each packet based solely on the packet’s
network-layer header. Thus, every time a packet arrives at a router, the router has to “think through”
where to send the packet next.
Segment routing is gaining popularity as a way to simplify MPLS networks. It interfaces well with software-defined networking and enables source-based routing. It does this without keeping state in the core of the network and without the need for LDP or RSVP-TE.
SOSCON 2019.10.17
What are the methods for packet processing on Linux, and how fast is each of them? In this presentation, we will learn how to handle packets on Linux (user space, socket filter, netfilter, tc) and compare their performance, analyzing where in the network stack each kind of processing is done (its hook point). We will also discuss packet processing using XDP, an in-kernel fast path recently added to the Linux kernel. eXpress Data Path (XDP) is a high-performance programmable network data path within the Linux kernel. XDP sits at the lowest software-accessible level of the network stack, the point at which the driver receives the packet. By using the eBPF infrastructure at this hook point, the network stack can be extended without modifying the kernel.
Daniel T. Lee (Hoyeon Lee)
@danieltimlee
Daniel T. Lee currently works as a Software Engineer at Kosslab and contributes to the Linux kernel BPF project. He is interested in cloud, Linux networking, and tracing technologies, and likes to analyze the kernel's internals using BPF.
[Container Plumbing Days 2023] Why was nerdctl made? (Akihiro Suda)
nerdctl (contaiNERD CTL) was made to facilitate development of new technologies in the containerd platform.
Such technologies include:
- Lazy-pulling with Stargz/Nydus/OverlayBD
- P2P image distribution with IPFS
- Image encryption with OCIcrypt
- Image signing with Cosign
- “Real” read-only mounts with mount_setattr
- Slirp-less rootless containers with bypass4netns
- Interactive debugging of Dockerfiles, with buildg
nerdctl is also useful for debugging Kubernetes nodes that are running containerd.
Through this session, the audience will learn about these functionalities of nerdctl, related projects, and the roadmap for the future.
https://containerplumbing.org/sessions/2023/why_was_nerdctl_
Kernel Recipes 2019 - XDP closer integration with network stack (Anne Nicolas)
XDP (eXpress Data Path) is the new programmable in-kernel fast-path, which is placed as a layer before the existing Linux kernel network stack (netstack).
We claim XDP is not kernel bypass, as it is a layer before the netstack and can easily fall through to it. In reality, it can easily be (ab)used to create a kernel-bypass situation in which none of the kernel facilities (BPF helpers and in-kernel tables) are used. The main disadvantage of kernel bypass is the need to re-implement everything, even basic building blocks like routing tables and ARP protocol handling.
It is part of the concept, and of the speed gain, that XDP allows users to avoid calling parts of the kernel code. Users have the freedom to do kernel bypass and re-implement everything, but the kernel should provide access to more in-kernel tables via BPF helpers, so that users can leverage other parts of the open source ecosystem, such as routing daemons.
This talk is about how XDP can work in concert with the netstack, and proposes how we can take this even further. Crazy ideas, like using XDP frames to move SKB allocation out of driver code, will also be presented.
4. The Limits of IPv4
Not enough IP addresses
Class-based IP address allocation assigns addresses to an organization in units of 8 bits, so the number of IP addresses an organization really needs and the number it is actually allocated can differ considerably.
Routing tables too large
As the number of connected networks grows, routing tables grow with it.
Solutions
CIDR
NAT
IPv6
5. CIDR
Abolishes the IP class scheme: the number of network-address bits no longer has to be a multiple of 8 and becomes variable.
Adds a technique for containing routing-table size: route aggregation keeps the table small.
Difficulty encountered:
The aggregated networks must all be located near one another. Addresses would have to be divided into blocks and allocated along regional, national, and network-topology lines. This differs greatly from today's practice of allocating addresses by organization, scattered across the globe, making it almost impossible to achieve in practice.
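The route aggregation CIDR relies on can be sketched with Python's standard `ipaddress` module (an illustration, not part of the slides): four contiguous /26 blocks collapse into one /24 route, while a block from an unrelated range cannot be merged, which is exactly why allocation must follow network topology for aggregation to shrink routing tables.

```python
import ipaddress

# Four contiguous /26 blocks held by nearby sites collapse into one
# /24 route, so upstream routers need only a single table entry.
blocks = [ipaddress.ip_network(f"203.0.113.{i}/26") for i in (0, 64, 128, 192)]
aggregated = list(ipaddress.collapse_addresses(blocks))
print(aggregated)  # [IPv4Network('203.0.113.0/24')]

# Aggregation only works when the prefixes are adjacent: a block from
# an unrelated range stays as a separate route.
mixed = blocks[:2] + [ipaddress.ip_network("198.51.100.0/26")]
print(list(ipaddress.collapse_addresses(mixed)))
```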
6. NAT
Network Address Translation
Only the gateway is assigned real (public) addresses; internal hosts use private addresses.
The gateway performs the translation.
Problems encountered:
Without a real address, P2P communication cannot be established.
Security issues
Poor performance
7. The Development of IPv6
Initially called the next-generation IP protocol (IPng: IP The Next Generation).
CATNIP, TUBA, and SIPP were the three most widely recognized candidates.
The IETF finally decided to adopt SIPP.
In 1995, SIPP was renamed IPv6.
8. Why Do We Need IPv6?
Not enough addresses
Support for wireless devices
Peer-to-peer networking (the NAT problem)
3G
Security
9. Toward IPv6
Considerations for a next-generation IP protocol:
Fits where IPv4 sits in the current protocol stack
Same basic operation
A simpler protocol
Solves the problems seen so far:
Insufficient address space
Multicast, Mobile
Easier to operate:
Plug and Play
Security
Can support long-term future development:
Easy to add new features
Easy to migrate from IPv4
10. The Functions of the OSI Layers
To give communication within and between local networks a unified standard, the International Organization for Standardization (ISO) proposed a common network communication reference model. It consists of seven layers, as follows:
Application Layer
Presentation Layer
Session Layer
Transport Layer
Network Layer
Data Link Layer
Physical Layer
12. IPv6 Address Types
Unicast
A single-interface address, used for one-to-one delivery.
Multicast
A set of interface addresses, used for one-to-many delivery.
Anycast
A set of interface addresses, used for one-to-one-of-many delivery, to the nearest member of the set.
13. IPv4 vs IPv6
To solve the address shortage once and for all, the Source Address and Destination Address fields of the IPv6 packet header were expanded from the original 32 bits to 128 bits. In addition, to improve internet service quality, other fields were added, removed, or revised.
14. Header Simplification
Compared with IPv4, IPv6 simplifies the header by dropping the following IPv4 fields: Header Length, Type of Service, Identification, Flags, Fragment Offset, and Header Checksum.
17. IPv4 Header Fields
Version (4 bits)
Indicates the IP version.
IHL: Internet Header Length (4 bits)
Indicates the length of the IP header.
TOS: Type of Service (8 bits)
Specifies the service-quality requirements for IP. Because its definition was ambiguous and interoperability suffered, it is not widely used in practice.
Total Length (16 bits)
The length of the whole datagram, including the header.
18. IPv4 Header Fields
Identification (16 bits)
Used by the originating host to identify the datagram; every datagram gets its own identification number.
Flags (3 bits)
Records whether further fragments follow the current fragment.
Fragment Offset (13 bits)
Indicates the original position of the fragmented data within the datagram.
Time to Live (8 bits)
Records the maximum number of seconds the packet may remain in the network.
19. IPv4 Header Fields
Protocol (8 bits)
The code of the upper-layer protocol (TCP, UDP, etc.) carried above IP.
Header Checksum (16 bits)
Used to check the header for errors.
Source Address (32 bits)
The IPv4 address of the packet's source.
Destination Address (32 bits)
The IPv4 address of the packet's destination.
20. IPv4 Header Fields
Option (variable length)
Padding (variable length)
When the Option field is used and its size is not a multiple of 32 bits, this field pads it with zeros so that the header becomes a multiple of 32 bits.
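As an illustrative sketch (the field layout follows the slides; the sample bytes and function name are made up), the fixed 20-byte IPv4 header can be unpacked with Python's `struct` module:

```python
import struct

def parse_ipv4_header(data: bytes) -> dict:
    """Parse the fixed 20-byte IPv4 header into its named fields."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", data[:20])
    return {
        "version": ver_ihl >> 4,
        "ihl": ver_ihl & 0x0F,          # header length in 32-bit words
        "tos": tos,
        "total_length": total_len,
        "identification": ident,
        "flags": flags_frag >> 13,
        "fragment_offset": flags_frag & 0x1FFF,
        "ttl": ttl,
        "protocol": proto,              # 6 = TCP, 17 = UDP
        "checksum": checksum,
        "src": ".".join(str(b) for b in src),
        "dst": ".".join(str(b) for b in dst),
    }

# A hand-built sample header: version 4, IHL 5, TTL 64, protocol 6 (TCP),
# 10.0.0.1 -> 10.0.0.2 (checksum left as zero for brevity)
sample = bytes.fromhex("45000034abcd4000"
                       "400600000a000001"
                       "0a000002")
fields = parse_ipv4_header(sample)
print(fields["version"], fields["ihl"], fields["ttl"], fields["protocol"])  # 4 5 64 6
```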
22. IPv6 Header Fields
Version (4 bits)
Indicates the Internet Protocol version number.
Traffic Class (8 bits)
Indicates the class or priority of the packet. This field provides the same function as IPv4's Type of Service.
Flow Label (20 bits)
Identifies packets that belong to the same flow.
Payload Length (16 bits)
Records the length of the packet's data field, in bytes.
23. IPv6 Header Fields
Hop Limit (8 bits)
Each time a node forwards the packet, this 8-bit field is decremented by one; when it reaches 0, the packet is discarded.
Source Address (128 bits)
The packet's originating address.
Destination Address (128 bits)
The packet's receiving address.
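The fixed IPv6 header described above is only 40 bytes with no options, which makes parsing noticeably simpler than IPv4. A sketch with made-up sample bytes (the function name is my own):

```python
import ipaddress
import struct

def parse_ipv6_header(data: bytes) -> dict:
    """Parse the fixed 40-byte IPv6 header into its named fields."""
    vtf, payload_len, next_header, hop_limit = struct.unpack("!IHBB", data[:8])
    return {
        "version": vtf >> 28,
        "traffic_class": (vtf >> 20) & 0xFF,
        "flow_label": vtf & 0xFFFFF,
        "payload_length": payload_len,
        "next_header": next_header,   # 58 = ICMPv6
        "hop_limit": hop_limit,
        "src": str(ipaddress.IPv6Address(data[8:24])),
        "dst": str(ipaddress.IPv6Address(data[24:40])),
    }

# Sample: version 6, payload 8 bytes, Next Header 58 (ICMPv6),
# hop limit 64, from ::1 to ::1
sample = bytes.fromhex("6000000000083a40") + (b"\x00" * 15 + b"\x01") * 2
fields = parse_ipv6_header(sample)
print(fields["version"], fields["next_header"], fields["src"])  # 6 58 ::1
```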
24. IPv6 vs. IPv4 Packet Data Unit
IPv4 PDU: a header of at least 20 octets, followed by the data field; up to 65535 octets in total.
IPv6 PDU: a fixed 40-octet header, followed by 0 or more extension headers and the transport-level PDU; payload of up to 65535 octets.
25. Advantages of IPv6
Expanded addressing: from 32 bits to 128 bits, supporting more levels of addressing hierarchy and far more addressable nodes. A scope field improves multicasting, and a new address category, the anycast address, is defined for delivering a packet to one node within a group.
Simplified header format: some IPv4 header fields were dropped or made optional, reducing IPv6 processing overhead and bandwidth consumption.
Better support for extension headers and options: the encoded headers allow more efficient forwarding, new header options can be added, and option fields can be used more flexibly.
26. Cont.
Flow labeling capability:
Packets of a particular flow can be labeled as the basis for special treatment, for example QoS guarantees for real-time services.
Authentication and privacy capability:
The IPv6 standard is extended with support for authentication, data integrity, and data confidentiality.
27. IPv6 Address Format
An IPv6 address consists of 128 bits (16 bytes), divided into 8 groups of 2 bytes each, written as:
x:x:x:x:x:x:x:x
Each x represents one group of 2 bytes, written in hexadecimal.
28. IPv6 Addresses
The localhost address
This is an address defined specifically for the loopback interface, like "127.0.0.1" in IPv4. The IPv6 localhost address is:
0000:0000:0000:0000:0000:0000:0000:0001
or, abbreviated:
::1
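The abbreviation rule can be checked with Python's `ipaddress` module (an illustrative sketch, not part of the slides): the longest run of zero groups collapses to `::`, and leading zeros within each group are dropped.

```python
import ipaddress

# The full and abbreviated spellings name the same 128-bit address.
full = ipaddress.IPv6Address("0000:0000:0000:0000:0000:0000:0000:0001")
print(full.compressed)   # ::1
print(full.exploded)     # 0000:0000:0000:0000:0000:0000:0000:0001
print(full == ipaddress.IPv6Address("::1"))  # True

# The same rules apply to any address.
addr = ipaddress.IPv6Address("2001:0db8:0000:0000:0000:0000:0000:0042")
print(addr.compressed)   # 2001:db8::42
```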
29. ICMPv6
ICMP: Internet Control Message Protocol
For transmission over the internet, data is split into packets of suitable size. These packets rely on the IP header to select paths to the destination, but problems can occur in transit (network congestion, host failure, unreachable network, and so on), so a mechanism is needed that delivers control messages to the source host, letting it handle the situation.
ICMP is the protocol used for notification and control messages. It has no addressing or routing capability of its own, so it must be embedded in IP packets for transmission, as shown below:
30. ICMP (cont.)
The ICMP packet is carried as the payload of the IP packet:
IPv6 packet header | ICMP packet header | ICMP data
When the Next Header value in the IPv6 packet is 58, the packet carries an ICMPv6 message.
31. ICMP Format
Type (8 bits) | Code (8 bits) | Checksum (16 bits)
Message Body
Type: 1, 2, 3, 4, 128, 129
Code: describes sub-functions within each type
Checksum: used to detect whether the data has been corrupted
Message Body: holds the message, describing the likely cause and the related handling.
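One detail the slide does not show: the ICMPv6 Checksum is computed over an IPv6 pseudo-header (source and destination addresses, upper-layer length, and Next Header 58) in addition to the message itself. A minimal sketch of building an Echo Request (Type 128); the helper names are my own:

```python
import ipaddress
import struct

def checksum16(data: bytes) -> int:
    """Ones'-complement sum (RFC 1071), as used in the Checksum field."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data)//2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def icmpv6_echo_request(src: str, dst: str, ident: int, seq: int,
                        payload: bytes = b"") -> bytes:
    """Build a Type 128 (Echo Request) message with a valid checksum."""
    # Message with the Checksum field zeroed for the computation.
    msg = struct.pack("!BBHHH", 128, 0, 0, ident, seq) + payload
    # Pseudo-header: src, dst, upper-layer length, 3 zero bytes, Next Header 58.
    pseudo = (ipaddress.IPv6Address(src).packed +
              ipaddress.IPv6Address(dst).packed +
              struct.pack("!IHBB", len(msg), 0, 0, 58))
    csum = checksum16(pseudo + msg)
    return msg[:2] + struct.pack("!H", csum) + msg[4:]

pkt = icmpv6_echo_request("::1", "::1", ident=1, seq=1, payload=b"ping")
print(pkt[0])  # 128, i.e. Echo Request
```

A packet with a correct checksum has the property that re-summing the pseudo-header plus the whole message yields zero, which is how a receiver verifies it.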
32. Common ICMPv6 Packet Types
Type  Function
1     Destination Unreachable
2     Packet Too Big
3     Time Exceeded
4     Parameter Problem
128   Echo Request
129   Echo Reply
33. Example
Parameter Problem
When an IPv6 node, while processing a packet, finds a field value in the IPv6 header or an extension header that is faulty and prevents normal processing, it discards the packet and sends an ICMPv6 Parameter Problem message back to the packet's source node.
Type | Code | Checksum
Pointer
As much of the invoking packet as will fit without the ICMPv6 packet exceeding the minimum IPv6 MTU
34. IPv4 and IPv6 Coexistence
The Internet Engineering Task Force (IETF) has proposed three IPv4/IPv6 transition techniques:
Dual stack (Dual-Stack)
Tunneling
Network Address Translation - Protocol Translation (NAT-PT)
35. Dual Stack (Dual-Stack)
IPv4 and IPv6 coexist on the same node. Because TCP/IP is a family of protocols, all the members of the TCP/IP suite together form a multi-layer stacked architecture, whose greatest advantage is extensibility.
Suitable for end systems that must communicate with both IPv4-only devices and IPv6-only devices, for example a server that serves both 3G mobile clients and ordinary PCs, so that they can all intercommunicate.
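A dual-stack listener can be sketched with Python's `socket` module (an assumption-laden sketch: it requires a platform where the `IPV6_V6ONLY` option can be cleared, and the default for that option varies across operating systems). One AF_INET6 socket then accepts both IPv6 clients and IPv4 clients, the latter appearing as IPv4-mapped addresses such as ::ffff:192.0.2.1.

```python
import socket

# One AF_INET6 socket with IPV6_V6ONLY disabled serves both protocols.
srv = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
srv.bind(("::", 0))   # "::" = all interfaces; port 0 = pick any free port
srv.listen()
print("listening on port", srv.getsockname()[1])
srv.close()
```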
39. Network Address Translation - Protocol Translation (NAT-PT)
NAT-PT is the most complex of the three techniques, because it must map each packet field to its counterpart. When a packet is an IPv6 packet, its fields are mapped to the corresponding IPv4 fields, converting it into an IPv4 packet; likewise, an IPv4 packet is fully converted into an IPv6 packet through the same field mapping.
NAT-PT is used to connect pure-IPv6 and pure-IPv4 domains; by fully translating the protocol, it makes packet delivery across all-IPv4 or all-IPv6 domains possible.