SlideShare a Scribd company logo
1 of 23
Generic External Memory
for Switch Data Planes
Daehyeok Kim
Yibo Zhu, Changhoon Kim, Jeongkeun Lee, Srinivasan Seshan
Enabling Virtual Switching on ToR Switch
2
Customers’ Bare-metal servers
(Tenant, VM IP) Host IP
(1, 20.0.0.1) 10.0.0.1
(1, 20.0.0.2) 10.0.1.1
(1, 20.0.0.3) 10.0.2.1
… …
Multi-million
Entries
≫ SRAM size!
Cannot install virtual
switches on the servers
Limited SRAM space is bottleneck for memory-intensive applications!
Move virtual switch
to ToR switch
Current Trend: Moving Functionality to Switches
3
All these applications can benefit from large memory space!
Programmable Switch Chips Need More Memory
•Programmable data plane technology
• E.g., Protocol-Independent Switch Architecture (PISA) + P4
• Flexible but only with on-chip SRAM cache
4
+
Programmable
switch chip
w/ SRAM cache
DRAM
= Lots of innovative applications!
Status quo
5
• Fixed-function switch chips built with fixed-function external memory
• These aren’t very useful
• Inflexible: Usage fixed at design time
• Fixed and small scale: Memory size and bandwidth fixed at design time
• Expensive: Chip getting larger and complex
Is programmable switch chip + general-purpose memory possible?
GEM: Generic External Memory for
Programmable Data Planes
6
General-purpose DRAM pool
Programmable switch chip
Flexible memory
access BW and size
Re-use commodity
hardware
Key Components
7
Match Action
20.0.0.1:80 10.0.0.1:20
20.0.0.2:80 10.0.1.1:20
20.0.0.3:80 10.0.2.1:20
… …
Match Action
20.0.0.1:80 10.0.0.1:20
… …
miss
Dst:
20.0.0.2:80
Dst:
10.0.1.1:20
C1: Remote memory
access channel
C3: Remote data structures
and APIs
C2: Packet management
during remote memory access
C1: Remote Memory Access Channel
• Goal: Enable programmable switch chip to directly access memory
• Purely access DRAM: No impact to the server’s existing compute and
networking workloads
• Minimal latency between the chip and memory
• Challenge: How to generate RDMA requests from the data plane?
• Programmable switch chip cannot generate arbitrary new packets
8
Leverage RDMA!
*RDMA: Remote Direct Memory Access
Accessing Remote Memory
from Data Plane via RDMA
9
Match Action
20.0.0.1:80 10.0.0.1:20
20.0.0.2:80 10.0.1.1:20
20.0.0.3:80 10.0.2.1:20
… …
Match Action
20.0.0.1:80 10.0.0.1:20
… …
miss
Dst:
20.0.0.2:80
Dst:
10.0.1.1:20
Generating RDMA request
1. Clone and truncate a packet
2. Add RDMA headers
READ Resp.
READ (entry)
DRAM server Context
Server #1 QP#, SEQ#, ACK#, …
… …
Implementable in P4
ETH Header
RDMA Header
(READ)
Key Components
10
Match Action
20.0.0.1:80 10.0.0.1:20
20.0.0.2:80 10.0.1.1:20
20.0.0.3:80 10.0.2.1:20
… …
Match Action
20.0.0.1:80 10.0.0.1:20
… …
miss
Dst:
20.0.0.2:80
Dst:
10.0.1.1:20
C1: Remote memory
access channel
C3: Remote data structures
and APIs
C2: Packet management
during remote memory access
C2: Packet Management during
Remote Memory Access
11
Match Action
20.0.0.1:80 10.0.0.1:20
20.0.0.2:80 10.0.1.1:20
20.0.0.3:80 10.0.2.1:20
… …
Match Action
20.0.0.1:80 10.0.0.1:20
… …
miss
Dst:
20.0.0.2:80
READ (entry)
READ Resp.
Dst:
10.0.1.1:20
Packet buffer in on-chip SRAM
Consuming too much
SRAM space! 
Depositing Packets on Remote Buffer
12
Match Action Packet
20.0.0.1:80 10.0.0.2:20
20.0.0.2:80 10.0.1.1:20
20.0.0.3:80 10.0.2.1:20
… …
Match Action
20.0.0.1:80 10.0.0.1:20
… …
miss
Dst:
20.0.0.2:80
Dst:
10.0.1.1:20
READ (entry)WRITE (pkt)
READ Resp.
Key Components
13
Match Action
20.0.0.1:80 10.0.0.1:20
20.0.0.2:80 10.0.1.1:20
20.0.0.3:80 10.0.2.1:20
… …
Match Action
20.0.0.1:80 10.0.0.1:20
… …
miss
Dst:
20.0.0.2:80
Dst:
10.0.1.1:20
C1: Remote memory
access channel
C3: Remote data structures
and APIs
C2: Packet management
during remote memory access
C3: Remote Data Structures and APIs
14
Match Action Packet
0xA 20.0.0.1:80 10.0.0.2:20
0xB 20.0.0.2:80 10.0.1.1:20
0xC 20.0.0.3:80 10.0.2.1:20
… … …
Match Action
20.0.0.1:80 10.0.0.1:20
… …
miss
Dst:
20.0.0.2:80
READ (entry)
@ 0xB
READ Resp.
WRITE (pkt)
@ 0xB
How to locate
remote entry?
Match Mem addr
20.0.0.1:80 0xA
20.0.0.2:80 0xB
20.0.0.3:80 0xC
… …
Consuming too much
SRAM space! 
General Data Structures and APIs?
• Ongoing work: designing general data structures for remote memory
• Proof-of-concept use cases for specific applications
• Lookup table extension for extending virtual switch table
• Packet buffer extension for mitigating packet drops due to incast
• State store extension for network telemetry
15
Use Case: Extending Lookup Table
16
Customers’ Bare-metal servers
Remote table
Match Action
20.0.0.1:80 10.0.0.1:20
… …
Fetch entries from
remote tables 
Match Action
20.0.0.1:80 10.0.0.1:20
20.0.0.2:80 10.0.1.1:20
20.0.0.3:80 10.0.2.1:20
… …
Hot entries on SRAM
Entire
entries
Other Use Cases
17
• Packet buffer extension for
mitigating packet drops
Remote buffer servers
Queue is full…
Dropping packets 
Reduce
packet drops 
Remote
State stores
Can’t maintain many
stateful objects 
Update the
remote stores 
• State store extension for
network telemetry
Experiment Setup
18
ETH
IP/DSCP=0x00
TCP
Payload
ETH
IP/DSCP=0x28
TCP
Payload
Action
DSCP -> 0x28
Run NPTcp
Server Server
*Baseline: Simple L2 switch
Results
• End-to-end latency
• Packet store / load throughput: close to the line rate (≈ 37.5 Gbps)
19
1 - 2 μs
additional latency
Summary
20
Vision: Generic External Memory for Programmable Data Plane
Q1: Efficient caching on SRAM
GEM will be a key enabler
for innovations in networking and computational networking!
Q3: Handling server failures
Q2: Dynamically scaling DRAM pool
Backup slides
21
State Store Extension
• Q: How much bandwidth does remote counter update consume?
22
Max IOPS of RDMA FA
with a single NIC
Prototype Implementation
23
Control plane
Data Plane
• RDMA channel controller in C
• Data plane configuration scripts
in Python• Remote memory extension
in P4
• Example programs in P4
Barefoot Tofino Switch
Server 1 Server 2

More Related Content

What's hot

Revisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS SchedulerRevisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS SchedulerYongseok Oh
 
z/VM Performance Analysis
z/VM Performance Analysisz/VM Performance Analysis
z/VM Performance AnalysisRodrigo Campos
 
Analysis of Open-Source Drivers for IEEE 802.11 WLANs
Analysis of Open-Source Drivers for IEEE 802.11 WLANsAnalysis of Open-Source Drivers for IEEE 802.11 WLANs
Analysis of Open-Source Drivers for IEEE 802.11 WLANsDanh Nguyen
 
Implementation of MAC-level sleep-scheduling
Implementation of MAC-level sleep-schedulingImplementation of MAC-level sleep-scheduling
Implementation of MAC-level sleep-schedulingOlivier Cervello
 
Measuring a 25 and 40Gb/s Data Plane
Measuring a 25 and 40Gb/s Data PlaneMeasuring a 25 and 40Gb/s Data Plane
Measuring a 25 and 40Gb/s Data PlaneOpen-NFP
 
The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardson
The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce RichardsonThe 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardson
The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardsonharryvanhaaren
 
Tempesta FW - Framework и Firewall для WAF и DDoS mitigation, Александр Крижа...
Tempesta FW - Framework и Firewall для WAF и DDoS mitigation, Александр Крижа...Tempesta FW - Framework и Firewall для WAF и DDoS mitigation, Александр Крижа...
Tempesta FW - Framework и Firewall для WAF и DDoS mitigation, Александр Крижа...Ontico
 
PASTE: Network Stacks Must Integrate with NVMM Abstractions
PASTE: Network Stacks Must Integrate with NVMM AbstractionsPASTE: Network Stacks Must Integrate with NVMM Abstractions
PASTE: Network Stacks Must Integrate with NVMM Abstractionsmicchie
 
Dpdk accelerated Ostinato
Dpdk accelerated OstinatoDpdk accelerated Ostinato
Dpdk accelerated Ostinatopstavirs
 
High Performance Networking Leveraging the DPDK and Growing Community
High Performance Networking Leveraging the DPDK and Growing CommunityHigh Performance Networking Leveraging the DPDK and Growing Community
High Performance Networking Leveraging the DPDK and Growing Community6WIND
 
Ovs perf
Ovs perfOvs perf
Ovs perfMadhu c
 
BKK16-208 EAS
BKK16-208 EASBKK16-208 EAS
BKK16-208 EASLinaro
 
DPDK Integration: A Product's Journey - Roger B. Melton
DPDK Integration: A Product's Journey - Roger B. MeltonDPDK Integration: A Product's Journey - Roger B. Melton
DPDK Integration: A Product's Journey - Roger B. Meltonharryvanhaaren
 
A Performance Characterization of Postgres on Different Storage Systems
A Performance Characterization of Postgres on Different Storage SystemsA Performance Characterization of Postgres on Different Storage Systems
A Performance Characterization of Postgres on Different Storage SystemsDong Ye
 
DPDK Support for New HW Offloads
DPDK Support for New HW OffloadsDPDK Support for New HW Offloads
DPDK Support for New HW OffloadsNetronome
 
Understand and optimize Linux I/O
Understand and optimize Linux I/OUnderstand and optimize Linux I/O
Understand and optimize Linux I/OAndrea Righi
 

What's hot (20)

Revisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS SchedulerRevisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS Scheduler
 
100 M pps on PC.
100 M pps on PC.100 M pps on PC.
100 M pps on PC.
 
z/VM Performance Analysis
z/VM Performance Analysisz/VM Performance Analysis
z/VM Performance Analysis
 
Analysis of Open-Source Drivers for IEEE 802.11 WLANs
Analysis of Open-Source Drivers for IEEE 802.11 WLANsAnalysis of Open-Source Drivers for IEEE 802.11 WLANs
Analysis of Open-Source Drivers for IEEE 802.11 WLANs
 
Implementation of MAC-level sleep-scheduling
Implementation of MAC-level sleep-schedulingImplementation of MAC-level sleep-scheduling
Implementation of MAC-level sleep-scheduling
 
Measuring a 25 and 40Gb/s Data Plane
Measuring a 25 and 40Gb/s Data PlaneMeasuring a 25 and 40Gb/s Data Plane
Measuring a 25 and 40Gb/s Data Plane
 
The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardson
The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce RichardsonThe 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardson
The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardson
 
Tempesta FW - Framework и Firewall для WAF и DDoS mitigation, Александр Крижа...
Tempesta FW - Framework и Firewall для WAF и DDoS mitigation, Александр Крижа...Tempesta FW - Framework и Firewall для WAF и DDoS mitigation, Александр Крижа...
Tempesta FW - Framework и Firewall для WAF и DDoS mitigation, Александр Крижа...
 
Dpdk performance
Dpdk performanceDpdk performance
Dpdk performance
 
Linux Network Stack
Linux Network StackLinux Network Stack
Linux Network Stack
 
PASTE: Network Stacks Must Integrate with NVMM Abstractions
PASTE: Network Stacks Must Integrate with NVMM AbstractionsPASTE: Network Stacks Must Integrate with NVMM Abstractions
PASTE: Network Stacks Must Integrate with NVMM Abstractions
 
Tacc Infinite Memory Engine
Tacc Infinite Memory EngineTacc Infinite Memory Engine
Tacc Infinite Memory Engine
 
Dpdk accelerated Ostinato
Dpdk accelerated OstinatoDpdk accelerated Ostinato
Dpdk accelerated Ostinato
 
High Performance Networking Leveraging the DPDK and Growing Community
High Performance Networking Leveraging the DPDK and Growing CommunityHigh Performance Networking Leveraging the DPDK and Growing Community
High Performance Networking Leveraging the DPDK and Growing Community
 
Ovs perf
Ovs perfOvs perf
Ovs perf
 
BKK16-208 EAS
BKK16-208 EASBKK16-208 EAS
BKK16-208 EAS
 
DPDK Integration: A Product's Journey - Roger B. Melton
DPDK Integration: A Product's Journey - Roger B. MeltonDPDK Integration: A Product's Journey - Roger B. Melton
DPDK Integration: A Product's Journey - Roger B. Melton
 
A Performance Characterization of Postgres on Different Storage Systems
A Performance Characterization of Postgres on Different Storage SystemsA Performance Characterization of Postgres on Different Storage Systems
A Performance Characterization of Postgres on Different Storage Systems
 
DPDK Support for New HW Offloads
DPDK Support for New HW OffloadsDPDK Support for New HW Offloads
DPDK Support for New HW Offloads
 
Understand and optimize Linux I/O
Understand and optimize Linux I/OUnderstand and optimize Linux I/O
Understand and optimize Linux I/O
 

Similar to Generic External Memory for Switch Data Planes

Installing Oracle Database on LDOM
Installing Oracle Database on LDOMInstalling Oracle Database on LDOM
Installing Oracle Database on LDOMPhilippe Fierens
 
Memory, Big Data, NoSQL and Virtualization
Memory, Big Data, NoSQL and VirtualizationMemory, Big Data, NoSQL and Virtualization
Memory, Big Data, NoSQL and VirtualizationBigstep
 
MQTC V2.0.1.3 - WMQ & TCP Buffers – Size DOES Matter! (pps)
MQTC V2.0.1.3 - WMQ & TCP Buffers – Size DOES Matter! (pps)MQTC V2.0.1.3 - WMQ & TCP Buffers – Size DOES Matter! (pps)
MQTC V2.0.1.3 - WMQ & TCP Buffers – Size DOES Matter! (pps)Art Schanz
 
PLNOG16: Obsługa 100M pps na platformie PC , Przemysław Frasunek, Paweł Mała...
PLNOG16: Obsługa 100M pps na platformie PC, Przemysław Frasunek, Paweł Mała...PLNOG16: Obsługa 100M pps na platformie PC, Przemysław Frasunek, Paweł Mała...
PLNOG16: Obsługa 100M pps na platformie PC , Przemysław Frasunek, Paweł Mała...PROIDEA
 
A Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural NetworksA Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural Networksinside-BigData.com
 
ODSA Use Case - SmartNIC
ODSA Use Case - SmartNICODSA Use Case - SmartNIC
ODSA Use Case - SmartNICODSA Workgroup
 
Introduction to Real Time Java
Introduction to Real Time JavaIntroduction to Real Time Java
Introduction to Real Time JavaDeniz Oguz
 
High performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User GroupHigh performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User GroupHungWei Chiu
 
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...DataWorks Summit/Hadoop Summit
 
Introduction to embedded System.pptx
Introduction to embedded System.pptxIntroduction to embedded System.pptx
Introduction to embedded System.pptxPratik Gohel
 
DataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax Academy
 
Gunjae_ISCA15_slides.pdf
Gunjae_ISCA15_slides.pdfGunjae_ISCA15_slides.pdf
Gunjae_ISCA15_slides.pdfssuser30e7d2
 
LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...
LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...
LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...Linaro
 
Fast datastacks - fast and flexible nfv solution stacks leveraging fd.io
Fast datastacks - fast and flexible nfv solution stacks leveraging fd.ioFast datastacks - fast and flexible nfv solution stacks leveraging fd.io
Fast datastacks - fast and flexible nfv solution stacks leveraging fd.ioOPNFV
 
New idc architecture
New idc architectureNew idc architecture
New idc architectureMason Mei
 
Not bridge south bridge archexture
Not bridge  south bridge archextureNot bridge  south bridge archexture
Not bridge south bridge archexturesunil kumar
 
On the feasibility of 40 Gbps network data capture and retention with general...
On the feasibility of 40 Gbps network data capture and retention with general...On the feasibility of 40 Gbps network data capture and retention with general...
On the feasibility of 40 Gbps network data capture and retention with general...Jorge E. López de Vergara Méndez
 
Revisão: Forwarding Metamorphosis: Fast Programmable Match-Action Processing ...
Revisão: Forwarding Metamorphosis: Fast Programmable Match-Action Processing ...Revisão: Forwarding Metamorphosis: Fast Programmable Match-Action Processing ...
Revisão: Forwarding Metamorphosis: Fast Programmable Match-Action Processing ...Bruno Castelucci
 
Microsofts Configurable Cloud
Microsofts Configurable CloudMicrosofts Configurable Cloud
Microsofts Configurable CloudChris Genazzio
 

Similar to Generic External Memory for Switch Data Planes (20)

Installing Oracle Database on LDOM
Installing Oracle Database on LDOMInstalling Oracle Database on LDOM
Installing Oracle Database on LDOM
 
Memory, Big Data, NoSQL and Virtualization
Memory, Big Data, NoSQL and VirtualizationMemory, Big Data, NoSQL and Virtualization
Memory, Big Data, NoSQL and Virtualization
 
MQTC V2.0.1.3 - WMQ & TCP Buffers – Size DOES Matter! (pps)
MQTC V2.0.1.3 - WMQ & TCP Buffers – Size DOES Matter! (pps)MQTC V2.0.1.3 - WMQ & TCP Buffers – Size DOES Matter! (pps)
MQTC V2.0.1.3 - WMQ & TCP Buffers – Size DOES Matter! (pps)
 
PLNOG16: Obsługa 100M pps na platformie PC , Przemysław Frasunek, Paweł Mała...
PLNOG16: Obsługa 100M pps na platformie PC, Przemysław Frasunek, Paweł Mała...PLNOG16: Obsługa 100M pps na platformie PC, Przemysław Frasunek, Paweł Mała...
PLNOG16: Obsługa 100M pps na platformie PC , Przemysław Frasunek, Paweł Mała...
 
A Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural NetworksA Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural Networks
 
ODSA Use Case - SmartNIC
ODSA Use Case - SmartNICODSA Use Case - SmartNIC
ODSA Use Case - SmartNIC
 
Introduction to Real Time Java
Introduction to Real Time JavaIntroduction to Real Time Java
Introduction to Real Time Java
 
High performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User GroupHigh performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User Group
 
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
 
Introduction to embedded System.pptx
Introduction to embedded System.pptxIntroduction to embedded System.pptx
Introduction to embedded System.pptx
 
DataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The Sequel
 
Gunjae_ISCA15_slides.pdf
Gunjae_ISCA15_slides.pdfGunjae_ISCA15_slides.pdf
Gunjae_ISCA15_slides.pdf
 
LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...
LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...
LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...
 
Fast datastacks - fast and flexible nfv solution stacks leveraging fd.io
Fast datastacks - fast and flexible nfv solution stacks leveraging fd.ioFast datastacks - fast and flexible nfv solution stacks leveraging fd.io
Fast datastacks - fast and flexible nfv solution stacks leveraging fd.io
 
New idc architecture
New idc architectureNew idc architecture
New idc architecture
 
Not bridge south bridge archexture
Not bridge  south bridge archextureNot bridge  south bridge archexture
Not bridge south bridge archexture
 
On the feasibility of 40 Gbps network data capture and retention with general...
On the feasibility of 40 Gbps network data capture and retention with general...On the feasibility of 40 Gbps network data capture and retention with general...
On the feasibility of 40 Gbps network data capture and retention with general...
 
Exascale Capabl
Exascale CapablExascale Capabl
Exascale Capabl
 
Revisão: Forwarding Metamorphosis: Fast Programmable Match-Action Processing ...
Revisão: Forwarding Metamorphosis: Fast Programmable Match-Action Processing ...Revisão: Forwarding Metamorphosis: Fast Programmable Match-Action Processing ...
Revisão: Forwarding Metamorphosis: Fast Programmable Match-Action Processing ...
 
Microsofts Configurable Cloud
Microsofts Configurable CloudMicrosofts Configurable Cloud
Microsofts Configurable Cloud
 

Recently uploaded

GBSN - Microbiology (Unit 4) Concept of Asepsis
GBSN - Microbiology (Unit 4) Concept of AsepsisGBSN - Microbiology (Unit 4) Concept of Asepsis
GBSN - Microbiology (Unit 4) Concept of AsepsisAreesha Ahmad
 
Technical english Technical english.pptx
Technical english Technical english.pptxTechnical english Technical english.pptx
Technical english Technical english.pptxyoussefboujtat3
 
Mining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptxMining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptxKyawThanTint
 
Heat Units in plant physiology and the importance of Growing Degree days
Heat Units in plant physiology and the importance of Growing Degree daysHeat Units in plant physiology and the importance of Growing Degree days
Heat Units in plant physiology and the importance of Growing Degree daysBrahmesh Reddy B R
 
A Scientific PowerPoint on Albert Einstein
A Scientific PowerPoint on Albert EinsteinA Scientific PowerPoint on Albert Einstein
A Scientific PowerPoint on Albert Einsteinxgamestudios8
 
X-rays from a Central “Exhaust Vent” of the Galactic Center Chimney
X-rays from a Central “Exhaust Vent” of the Galactic Center ChimneyX-rays from a Central “Exhaust Vent” of the Galactic Center Chimney
X-rays from a Central “Exhaust Vent” of the Galactic Center ChimneySérgio Sacani
 
GBSN - Biochemistry (Unit 8) Enzymology
GBSN - Biochemistry (Unit 8) EnzymologyGBSN - Biochemistry (Unit 8) Enzymology
GBSN - Biochemistry (Unit 8) EnzymologyAreesha Ahmad
 
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...Ansari Aashif Raza Mohd Imtiyaz
 
Continuum emission from within the plunging region of black hole discs
Continuum emission from within the plunging region of black hole discsContinuum emission from within the plunging region of black hole discs
Continuum emission from within the plunging region of black hole discsSérgio Sacani
 
PHOTOSYNTHETIC BACTERIA (OXYGENIC AND ANOXYGENIC)
PHOTOSYNTHETIC BACTERIA  (OXYGENIC AND ANOXYGENIC)PHOTOSYNTHETIC BACTERIA  (OXYGENIC AND ANOXYGENIC)
PHOTOSYNTHETIC BACTERIA (OXYGENIC AND ANOXYGENIC)kushbuR
 
Isolation of AMF by wet sieving and decantation method pptx
Isolation of AMF by wet sieving and decantation method pptxIsolation of AMF by wet sieving and decantation method pptx
Isolation of AMF by wet sieving and decantation method pptxGOWTHAMIM22
 
Biochemistry and Biomolecules - Science - 9th Grade by Slidesgo.pptx
Biochemistry and Biomolecules - Science - 9th Grade by Slidesgo.pptxBiochemistry and Biomolecules - Science - 9th Grade by Slidesgo.pptx
Biochemistry and Biomolecules - Science - 9th Grade by Slidesgo.pptxjayabahari688
 
In-pond Race way systems for Aquaculture (IPRS).pptx
In-pond Race way systems for Aquaculture (IPRS).pptxIn-pond Race way systems for Aquaculture (IPRS).pptx
In-pond Race way systems for Aquaculture (IPRS).pptxMAGOTI ERNEST
 
EU START PROJECT. START-Newsletter_Issue_4.pdf
EU START PROJECT. START-Newsletter_Issue_4.pdfEU START PROJECT. START-Newsletter_Issue_4.pdf
EU START PROJECT. START-Newsletter_Issue_4.pdfStart Project
 
Efficient spin-up of Earth System Models usingsequence acceleration
Efficient spin-up of Earth System Models usingsequence accelerationEfficient spin-up of Earth System Models usingsequence acceleration
Efficient spin-up of Earth System Models usingsequence accelerationSérgio Sacani
 
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...yogeshlabana357357
 
Adaptive Restore algorithm & importance Monte Carlo
Adaptive Restore algorithm & importance Monte CarloAdaptive Restore algorithm & importance Monte Carlo
Adaptive Restore algorithm & importance Monte CarloChristian Robert
 
Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...
Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...
Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...Sahil Suleman
 
Film Coated Tablet and Film Coating raw materials.pdf
Film Coated Tablet and Film Coating raw materials.pdfFilm Coated Tablet and Film Coating raw materials.pdf
Film Coated Tablet and Film Coating raw materials.pdfPharmatech-rx
 

Recently uploaded (20)

GBSN - Microbiology (Unit 4) Concept of Asepsis
GBSN - Microbiology (Unit 4) Concept of AsepsisGBSN - Microbiology (Unit 4) Concept of Asepsis
GBSN - Microbiology (Unit 4) Concept of Asepsis
 
Technical english Technical english.pptx
Technical english Technical english.pptxTechnical english Technical english.pptx
Technical english Technical english.pptx
 
Mining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptxMining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptx
 
Heat Units in plant physiology and the importance of Growing Degree days
Heat Units in plant physiology and the importance of Growing Degree daysHeat Units in plant physiology and the importance of Growing Degree days
Heat Units in plant physiology and the importance of Growing Degree days
 
A Scientific PowerPoint on Albert Einstein
A Scientific PowerPoint on Albert EinsteinA Scientific PowerPoint on Albert Einstein
A Scientific PowerPoint on Albert Einstein
 
X-rays from a Central “Exhaust Vent” of the Galactic Center Chimney
X-rays from a Central “Exhaust Vent” of the Galactic Center ChimneyX-rays from a Central “Exhaust Vent” of the Galactic Center Chimney
X-rays from a Central “Exhaust Vent” of the Galactic Center Chimney
 
GBSN - Biochemistry (Unit 8) Enzymology
GBSN - Biochemistry (Unit 8) EnzymologyGBSN - Biochemistry (Unit 8) Enzymology
GBSN - Biochemistry (Unit 8) Enzymology
 
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
 
Continuum emission from within the plunging region of black hole discs
Continuum emission from within the plunging region of black hole discsContinuum emission from within the plunging region of black hole discs
Continuum emission from within the plunging region of black hole discs
 
PHOTOSYNTHETIC BACTERIA (OXYGENIC AND ANOXYGENIC)
PHOTOSYNTHETIC BACTERIA  (OXYGENIC AND ANOXYGENIC)PHOTOSYNTHETIC BACTERIA  (OXYGENIC AND ANOXYGENIC)
PHOTOSYNTHETIC BACTERIA (OXYGENIC AND ANOXYGENIC)
 
Isolation of AMF by wet sieving and decantation method pptx
Isolation of AMF by wet sieving and decantation method pptxIsolation of AMF by wet sieving and decantation method pptx
Isolation of AMF by wet sieving and decantation method pptx
 
Biochemistry and Biomolecules - Science - 9th Grade by Slidesgo.pptx
Biochemistry and Biomolecules - Science - 9th Grade by Slidesgo.pptxBiochemistry and Biomolecules - Science - 9th Grade by Slidesgo.pptx
Biochemistry and Biomolecules - Science - 9th Grade by Slidesgo.pptx
 
HIV AND INFULENZA VIRUS PPT HIV PPT INFULENZA VIRUS PPT
HIV AND INFULENZA VIRUS PPT HIV PPT  INFULENZA VIRUS PPTHIV AND INFULENZA VIRUS PPT HIV PPT  INFULENZA VIRUS PPT
HIV AND INFULENZA VIRUS PPT HIV PPT INFULENZA VIRUS PPT
 
In-pond Race way systems for Aquaculture (IPRS).pptx
In-pond Race way systems for Aquaculture (IPRS).pptxIn-pond Race way systems for Aquaculture (IPRS).pptx
In-pond Race way systems for Aquaculture (IPRS).pptx
 
EU START PROJECT. START-Newsletter_Issue_4.pdf
EU START PROJECT. START-Newsletter_Issue_4.pdfEU START PROJECT. START-Newsletter_Issue_4.pdf
EU START PROJECT. START-Newsletter_Issue_4.pdf
 
Efficient spin-up of Earth System Models usingsequence acceleration
Efficient spin-up of Earth System Models usingsequence accelerationEfficient spin-up of Earth System Models usingsequence acceleration
Efficient spin-up of Earth System Models usingsequence acceleration
 
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...
 
Adaptive Restore algorithm & importance Monte Carlo
Adaptive Restore algorithm & importance Monte CarloAdaptive Restore algorithm & importance Monte Carlo
Adaptive Restore algorithm & importance Monte Carlo
 
Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...
Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...
Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...
 
Film Coated Tablet and Film Coating raw materials.pdf
Film Coated Tablet and Film Coating raw materials.pdfFilm Coated Tablet and Film Coating raw materials.pdf
Film Coated Tablet and Film Coating raw materials.pdf
 

Generic External Memory for Switch Data Planes

  • 1. Generic External Memory for Switch Data Planes Daehyeok Kim Yibo Zhu, Changhoon Kim, Jeongkeun Lee, Srinivasan Seshan
  • 2. Enabling Virtual Switching on ToR Switch 2 Customers’ Bare-metal servers (Tenant, VM IP) Host IP (1, 20.0.0.1) 10.0.0.1 (1, 20.0.0.2) 10.0.1.1 (1, 20.0.0.3) 10.0.2.1 … … Multi-million Entries ≫ SRAM size! Cannot install virtual switches on the servers Limited SRAM space is bottleneck for memory-intensive applications! Move virtual switch to ToR switch
  • 3. Current Trend: Moving Functionality to Switches 3 All these applications can benefit from large memory space!
  • 4. Programmable Switch Chips Need More Memory •Programmable data plane technology • E.g., Protocol-Independent Switch Architecture (PISA) + P4 • Flexible but only with on-chip SRAM cache 4 + Programmable switch chip w/ SRAM cache DRAM = Lots of innovative applications!
  • 5. Status quo 5 • Fixed-function switch chips built with fixed-function external memory • These aren’t very useful • Inflexible: Usage fixed at design time • Fixed and small scale: Memory size and bandwidth fixed at design time • Expensive: Chip getting larger and complex Is programmable switch chip + general-purpose memory possible?
  • 6. GEM: Generic External Memory for Programmable Data Planes 6 General-purpose DRAM pool Programmable switch chip Flexible memory access BW and size Re-use commodity hardware
  • 7. Key Components 7 Match Action 20.0.0.1:80 10.0.0.1:20 20.0.0.2:80 10.0.1.1:20 20.0.0.3:80 10.0.2.1:20 … … Match Action 20.0.0.1:80 10.0.0.1:20 … … miss Dst: 20.0.0.2:80 Dst: 10.0.1.1:20 C1: Remote memory access channel C3: Remote data structures and APIs C2: Packet management during remote memory access
  • 8. C1: Remote Memory Access Channel • Goal: Enable programmable switch chip to directly access memory • Purely access DRAM: No impact to the server’s existing compute and networking workloads • Minimal latency between the chip and memory • Challenge: How to generate RDMA requests from the data plane? • Programmable switch chip cannot generate arbitrary new packets 8 Leverage RDMA! *RDMA: Remote Direct Memory Access
  • 9. Accessing Remote Memory from Data Plane via RDMA 9 Match Action 20.0.0.1:80 10.0.0.1:20 20.0.0.2:80 10.0.1.1:20 20.0.0.3:80 10.0.2.1:20 … … Match Action 20.0.0.1:80 10.0.0.1:20 … … miss Dst: 20.0.0.2:80 Dst: 10.0.1.1:20 Generating RDMA request 1. Clone and truncate a packet 2. Add RDMA headers READ Resp. READ (entry) DRAM server Context Server #1 QP#, SEQ#, ACK#, … … … Implementable in P4 ETH Header RDMA Header (READ)
  • 10. Key Components 10 Match Action 20.0.0.1:80 10.0.0.1:20 20.0.0.2:80 10.0.1.1:20 20.0.0.3:80 10.0.2.1:20 … … Match Action 20.0.0.1:80 10.0.0.1:20 … … miss Dst: 20.0.0.2:80 Dst: 10.0.1.1:20 C1: Remote memory access channel C3: Remote data structures and APIs C2: Packet management during remote memory access
  • 11. C2: Packet Management during Remote Memory Access 11 Match Action 20.0.0.1:80 10.0.0.1:20 20.0.0.2:80 10.0.1.1:20 20.0.0.3:80 10.0.2.1:20 … … Match Action 20.0.0.1:80 10.0.0.1:20 … … miss Dst: 20.0.0.2:80 READ (entry) READ Resp. Dst: 10.0.1.1:20 Packet buffer in on-chip SRAM Consuming too much SRAM space! 
  • 12. Depositing Packets on Remote Buffer 12 Match Action Packet 20.0.0.1:80 10.0.0.2:20 20.0.0.2:80 10.0.1.1:20 20.0.0.3:80 10.0.2.1:20 … … Match Action 20.0.0.1:80 10.0.0.1:20 … … miss Dst: 20.0.0.2:80 Dst: 10.0.1.1:20 READ (entry)WRITE (pkt) READ Resp.
  • 13. Key Components 13 Match Action 20.0.0.1:80 10.0.0.1:20 20.0.0.2:80 10.0.1.1:20 20.0.0.3:80 10.0.2.1:20 … … Match Action 20.0.0.1:80 10.0.0.1:20 … … miss Dst: 20.0.0.2:80 Dst: 10.0.1.1:20 C1: Remote memory access channel C3: Remote data structures and APIs C2: Packet management during remote memory access
  • 14. C3: Remote Data Structures and APIs 14 Match Action Packet 0xA 20.0.0.1:80 10.0.0.2:20 0xB 20.0.0.2:80 10.0.1.1:20 0xC 20.0.0.3:80 10.0.2.1:20 … … … Match Action 20.0.0.1:80 10.0.0.1:20 … … miss Dst: 20.0.0.2:80 READ (entry) @ 0xB READ Resp. WRITE (pkt) @ 0xB How to locate remote entry? Match Mem addr 20.0.0.1:80 0xA 20.0.0.2:80 0xB 20.0.0.3:80 0xC … … Consuming too much SRAM space! 
  • 15. General Data Structures and APIs? • Ongoing work: designing general data structures for remote memory • Proof-of-concept use cases for specific applications • Lookup table extension for extending virtual switch table • Packet buffer extension for mitigating packet drops due to incast • State store extension for network telemetry 15
  • 16. Use Case: Extending Lookup Table 16 Customers’ Bare-metal servers Remote table Match Action 20.0.0.1:80 10.0.0.1:20 … … Fetch entries from remote tables  Match Action 20.0.0.1:80 10.0.0.1:20 20.0.0.2:80 10.0.1.1:20 20.0.0.3:80 10.0.2.1:20 … … Hot entries on SRAM Entire entries
  • 17. Other Use Cases 17 • Packet buffer extension for mitigating packet drops Remote buffer servers Queue is full… Dropping packets  Reduce packet drops  Remote State stores Can’t maintain many stateful objects  Update the remote stores  • State store extension for network telemetry
  • 19. Results • End-to-end latency • Packet store / load throughput: close to the line rate (≈ 37.5 Gbps) 19 1 - 2 μs additional latency
  • 20. Summary 20 Vision: Generic External Memory for Programmable Data Plane Q1: Efficient caching on SRAM GEM will be a key enabler for innovations in networking and computational networking! Q3: Handling server failures Q2: Dynamically scaling DRAM pool
  • 22. State Store Extension • Q: How much bandwidth does remote counter update consume? 22 Max IOPS of RDMA FA with a single NIC
  • 23. Prototype Implementation 23 Control plane Data Plane • RDMA channel controller in C • Data plane configuration scripts in Python• Remote memory extension in P4 • Example programs in P4 Barefoot Tofino Switch Server 1 Server 2