LCU14-310: Cisco ODP 
Robbie King, LCU14 
LCU14 BURLINGAME
Agenda 
● Cisco’s Data Plane 
● Background 
● Today 
● Initial Merchant Silicon Deployment 
● Subsequent Deployments 
● OpenDataPlane Project 
● Cisco’s Interest 
● Cisco Crypto API 
● ODP Crypto API 
● ODP Crypto API Status 
● Definition 
● Applications 
● IPsec Example App 
● Performance Test App & Results 
● HW Implementations 
● Cisco DP on ODP 
● Introduction 
● Block Diagram 
● Status 
● Going Forward
Cisco Data Plane - Background 
● Originally developed for Cisco QFP ASIC, ASR1000 series routers 
● Over one bazillion lines of code (OK, not a bazillion, but a LOT) 
● Deployed on ASICs ranging from 160 to 1024 threads 
● New ASICs continue to be developed 
● Software leverages assists via hardware abstraction layer 
● Work distribution 
● Packet order preservation 
● Ordered / atomic code sections 
● Classification 
● Crypto operations originally performed by external coprocessor 
● Things have changed over the last decade...
Cisco Data Plane - Today 
● Deployed on a variety of merchant silicon devices 
● x86 
● MIPS 
● PPC 
● ARMv7 and ARMv8 
● Deployed in a variety of environments 
● Bare Metal (QFP ASIC) 
● Bare Metal (Merchant Silicon) 
● Native Linux Process 
● Crypto operations offloaded in a variety of ways 
● Synchronous in place 
● Asynchronous coprocessor
Cisco Data Plane - Initial Merchant Silicon 
● Initial merchant silicon deployment was (still is) “bare metal” 
● Device provides several hardware assists 
● Work distribution 
● Packet ordering 
● Ordered / atomic code sections 
● Hierarchical queuing and scheduling 
● Cryptographic operations 
● Drawbacks 
● Control plane runs on separate device (difficult to partition Linux / bare metal) 
● Huge investment in time and resources 
● Lack of OS required instrumenting large amount of infrastructure 
● Required reworking existing Cisco hardware abstraction API 
● HW abstraction implementation written from scratch (did not leverage vendor SDK)
Cisco Data Plane - Later Merchant Silicon 
● Subsequent deployments run as multithreaded Linux process 
● Advantages 
● Control plane able to run on same device 
● Rapid deployment velocity on new architectures 
● Consistent infrastructure (file I/O, core files, etc) 
● Drawbacks 
● Kernel interaction must be kept to a minimum for performance 
● Hardware assists (if available) are difficult to leverage 
● If assist controlled by kernel then high interaction price 
● If assist directly accessible from user space, inconsistent API across vendors
OpenDataPlane - Cisco’s Interest 
● Deploying on merchant silicon makes good business sense 
● Allows our ASIC teams to focus on high end differentiation 
● Allows us to take advantage of “economies of scale” using off the shelf silicon 
● Difficult to compare devices today 
● Often unable to consider a device’s HW assists due to the SW effort required to leverage them 
● Goal is to compare devices based on throughput, power and cost 
● Desire well defined, common APIs for hardware assists 
● Common APIs are good for everyone 
● Common APIs accelerate the development of both proprietary and open source apps 
● Well crafted APIs allow vendors to differentiate while maintaining portability 
● Facilitates device selection based on all merits
OpenDataPlane - Cisco Crypto API 
● Defining HW assist APIs is a daunting task - prioritize 
● Crypto performance is becoming increasingly important 
● Getting crypto working “on the next device” has been challenging 
● Cisco developed “Crypto Device Abstraction Layer” (CDAL) 
● Initial version defines symmetric key operations 
● Session creation and per packet APIs, both synchronous and asynchronous 
● CDAL has been / is being implemented by multiple HW vendors 
● ODP project presents an awesome opportunity 
● Helps Cisco accelerate crypto development and participate in open source community 
● Cisco requested Crypto API become an ODP priority at LCA14
OpenDataPlane - ODP Crypto API 
● Goals for the ODP Crypto API 
● Level of functionality (but not necessarily semantics) similar to CDAL API 
● Develop within existing ODP constructs (i.e. don’t force ODP to be CDAL) 
● Be usable “à la carte”, i.e. don’t require wholesale conversion of app to ODP 
● Deliverables 
● Crypto API Specification 
● Linux-generic reference implementation 
● Example application to evaluate API definition 
● Stretch Goals 
● Test application to evaluate performance across implementations 
● Cisco data plane using ODP crypto API
ODP Crypto API Status - Definition 
● Document version 1.0 available today (opendataplane.org) 
● Reference implementation also available (git.linaro.org/lng/odp.git) 
● Patches accepted for “linux-generic” in August 
● Implements 3DES cipher and MD5 hash for authentication using OpenSSL libraries 
● Supports sync and async versions of per session and per packet operations 
● Supports multiple models for storing results of per packet operations 
● Result into same buffer (i.e. in place) 
● Result into new buffer (supplied by application) 
● Result into new buffer (allocated by implementation) 
● Open issues / work items 
● Resolve packet / completion event relationship questions 
● Ability to query implementation capacities and capabilities
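The three result storage models and the sync / async split listed above can be sketched as follows. This is a toy model only: `ToySession`, `operate`, `poll` and the XOR "cipher" are hypothetical names invented for illustration and are not the ODP crypto API.

```python
from collections import deque


def toy_transform(data, key):
    """Stand-in for a real cipher: XOR every byte with a one-byte key."""
    return bytes(b ^ key for b in data)


class ToySession:
    """Hypothetical session object (not the ODP API)."""

    def __init__(self, key, asynchronous=False):
        self.key = key
        self.asynchronous = asynchronous
        self.completions = deque()  # completion events, used in async mode

    def operate(self, pkt, out=None, alloc=False):
        """Run one per packet operation and store the result.

        alloc=True       -> result into a new buffer allocated here
        out is not None  -> result into a new buffer supplied by the caller
        otherwise        -> result into the same buffer (in place)
        """
        result = toy_transform(bytes(pkt), self.key)
        if alloc:
            dst = bytearray(len(result))
        elif out is not None:
            dst = out
        else:
            dst = pkt
        dst[:len(result)] = result
        if self.asynchronous:
            # Async: queue a completion event, caller polls for it later.
            self.completions.append(dst)
            return None
        return dst  # Sync: result is available on return.

    def poll(self):
        """Return the next completed buffer, or None if nothing is ready."""
        return self.completions.popleft() if self.completions else None
```

In this sketch the caller chooses the storage model per operation and the sync / async behavior per session, which is one possible split; the slides do not say where the real API draws that line.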
ODP Crypto API Status - Applications 
● IPsec example application 
● Patches reviewed and ready for implementation 
● Vehicle to evaluate “robustness” of ODP crypto API 
● Implements IPsec ESP and AH protocols using 3DES and MD5/96 
● Performance test application 
● Initial version functioning, more work to do before submitting patches 
● Measures throughput for various payload sizes 
● Preliminary results (next slides) 
● Cisco DP using ODP crypto API 
● Start gated by Cisco data plane work items 
● Pending DP port to ODP infrastructure 
● Pending DP support for configuring crypto in “headless” environment
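For readers unfamiliar with the “MD5/96” authentication used by the IPsec example application, here is a minimal sketch. It assumes the term denotes HMAC-MD5 truncated to 96 bits, as specified for ESP and AH in RFC 2403; the function names are illustrative and the 3DES cipher side is not shown.

```python
import hashlib
import hmac

ICV_LEN = 12  # 96 bits, the truncated length assumed here


def hmac_md5_96(key, data):
    """Return the first 96 bits of HMAC-MD5 over data."""
    return hmac.new(key, data, hashlib.md5).digest()[:ICV_LEN]


def verify_icv(key, data, icv):
    """Constant-time check of a received integrity check value (ICV)."""
    return hmac.compare_digest(hmac_md5_96(key, data), icv)


if __name__ == "__main__":
    key = bytes(16)                      # placeholder key, all zeros
    icv = hmac_md5_96(key, b"payload")
    print(len(icv), verify_icv(key, b"payload", icv))
```

The sender appends the 12-byte ICV to the packet; the receiver recomputes it over the same bytes and drops the packet when the values differ.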
ODP Crypto API Status - IPsec Example App 
● Configuration driven from command line (modeled after “setkey”) 
● IPv4 forwarding between ports based on configured routing table 
● IPsec encode/decode based on configured SA/SP database entries 
● Currently transport mode only, tunnel mode to be added (Bug 641) 
● Supports live traffic (demos on multiple platforms this week) 
● Supports standalone traffic generation / verification 
● Generates packets internally, captures and verifies results without need for packet IO 
● Utilizes key features of ODP 
● Runs on multiple cores, utilizing either odp_schedule or polled queues 
● Utilizes ORDERED and ATOMIC queues to maintain ordering
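The effect of an ORDERED queue, where packets are processed in parallel on several cores but leave in arrival order, can be illustrated with a small reorder stage. This is a toy model of the observable behavior, not a description of how any ODP implementation does it; the `Reorderer` class and the ingress sequence numbers are assumptions made for the example.

```python
import heapq


class Reorderer:
    """Releases completed packets strictly in ingress sequence order."""

    def __init__(self):
        self.next_seq = 0   # sequence number allowed to leave next
        self.pending = []   # min-heap of (seq, packet) finished early

    def complete(self, seq, packet):
        """Record a finished packet; return the packets now releasable."""
        heapq.heappush(self.pending, (seq, packet))
        released = []
        while self.pending and self.pending[0][0] == self.next_seq:
            released.append(heapq.heappop(self.pending)[1])
            self.next_seq += 1
        return released
```

If core 1 finishes packet 2 before core 0 finishes packets 0 and 1, packet 2 is held back until both earlier packets have completed.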
ODP Crypto API Status - Performance Test App 
● All testing performed on TI Keystone II eval system (4xA15) 
● Compare “linux-generic” (SW) versus “linux-keystone2” (HW) 
● Test loops invoking the per packet crypto API and measures elapsed time 
● Single encode/encrypt session used for testing 
● Session specifies both cipher (3DES) and authentication (MD5-96) 
● Async test saturates pipeline with parallel encrypt operations, polls for responses 
● Caveats 
● The “linux-generic” as tested focuses on functionality not performance 
● The “linux-keystone2” as tested has yet to be performance optimized (but will be soon)
ODP Crypto API Status - Perf Test App Results 
payload    linux-generic                       linux-keystone2
(bytes)    elapsed (us)   throughput (KB/s)    elapsed (us)   throughput (KB/s)
16               14.447               1,081           2.782               5,615
64               22.132               2,823           2.804              22,290
256              52.910               4,725           2.867              87,198
1,024           176.745               5,657           7.349             136,076
8,192         1,331.475               6,008          56.250             142,221
16,384        2,652.426               6,032         112.500             142,221
● In summary, H/W assist is ~22 times faster for sizeable payloads
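The throughput columns and the speedup can be recomputed from the payload sizes and elapsed times. The sketch below assumes throughput is reported in KB/s with 1 KB = 1024 bytes, which matches the table to within rounding; the function names are illustrative.

```python
# Rows copied from the results table:
# (payload bytes, linux-generic elapsed us, linux-keystone2 elapsed us)
ROWS = [
    (16, 14.447, 2.782),
    (64, 22.132, 2.804),
    (256, 52.910, 2.867),
    (1024, 176.745, 7.349),
    (8192, 1331.475, 56.250),
    (16384, 2652.426, 112.500),
]


def throughput_kb_per_s(payload_bytes, elapsed_us):
    """Throughput in KB/s, assuming 1 KB = 1024 bytes."""
    bytes_per_second = payload_bytes / (elapsed_us * 1e-6)
    return bytes_per_second / 1024


def speedup(sw_elapsed_us, hw_elapsed_us):
    """How many times faster the HW path is than the SW path."""
    return sw_elapsed_us / hw_elapsed_us


if __name__ == "__main__":
    for payload, sw_us, hw_us in ROWS:
        print(payload,
              round(throughput_kb_per_s(payload, sw_us)),
              round(throughput_kb_per_s(payload, hw_us)),
              round(speedup(sw_us, hw_us), 1))
```

With the table's values the ratio is roughly 18x at 256 bytes and about 24x from 1,024 bytes upward, in the same range as the ~22x summary figure.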
ODP Crypto API Status - HW Implementations 
● Several vendors demoing this week using IPsec example app 
● Texas Instruments - Keystone II 
● Asynchronous / new buffer mode 
● Cavium - Octeon CN66XX 
● Synchronous / in place mode 
● Freescale - P4080DS 
● Asynchronous / in place mode 
● Avago - AXM5500 
● Asynchronous / in place mode
Cisco DP on ODP - Introduction 
● With Crypto API defined, where do we focus next? 
● As core counts grow, HW assists become critical to core over core scaling 
● Leveraging merchant silicon HW assists proves challenging 
● Large resource investment for each device / SDK targeted 
● Different operating environments 
● Different levels of abstraction 
● ODP potentially allows Cisco to quickly leverage critical assists 
● Work distribution - odp_schedule() 
● Packet ordering - ODP_SCHED_SYNC_ORDERED queues 
● Ordered / atomic code sections - ODP_SCHED_SYNC_ATOMIC queues 
● Buffer management 
● Crypto operations
Cisco DP on ODP - Block Diagram 
[Block diagram: RX IF (x2), SCHEDULER, Core 0, Core 1, ... Core N, Crypto Assist, ORDERED and ATOMIC queues, BUFFER MANAGER, TX IF (x2)]
Loop doing the following 
● Call odp_schedule for new work 
● Process as much as possible 
● Call odp_queue_enq to send to 
○ Output interface or 
○ Crypto assist engine or 
○ Ordered queue or 
○ Atomic queue 
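The per core loop described on this slide can be sketched as below. The Python names are stand-ins: `ToyScheduler.schedule` and `ToyScheduler.enqueue` only play the roles of odp_schedule and odp_queue_enq, and the `classify` policy is a hypothetical placeholder for real packet processing.

```python
from collections import deque


class ToyScheduler:
    """Toy stand-in for the scheduler and its destination queues."""

    def __init__(self):
        self.ready = deque()  # work waiting to be scheduled
        self.queues = {"tx": [], "crypto": [], "ordered": [], "atomic": []}

    def schedule(self):
        """Plays the role of odp_schedule: next work item, or None."""
        return self.ready.popleft() if self.ready else None

    def enqueue(self, dest, pkt):
        """Plays the role of odp_queue_enq: hand the packet onward."""
        self.queues[dest].append(pkt)


def classify(pkt):
    """Hypothetical policy deciding where a processed packet goes next."""
    if pkt.get("needs_crypto"):
        return "crypto"
    return "tx"


def worker_loop(sched):
    """Schedule, process, enqueue; returns the number of packets handled."""
    handled = 0
    while True:
        pkt = sched.schedule()
        if pkt is None:
            break  # a real worker would keep polling instead of exiting
        sched.enqueue(classify(pkt), pkt)
        handled += 1
    return handled
```

Each core would run the same loop, so work distribution is left entirely to the scheduler rather than to the application.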
Cisco DP on ODP - Status 
● Currently forwarding IPv4 on x86 and ARM using “linux-generic” 
● Development started on ARM using “linux-keystone2” 
● Demoing on ARM this week 
● Next steps 
● Target additional platforms as ODP implementations become available 
● Performance analysis and optimizations 
● End to end QoS analysis (priority, over subscription, etc.) 
● Integration of CDAL / crypto API
Going Forward 
● For ODP 1.0 
● Quickly finalize the basic APIs 
● Strive for functionality not perfection 
● Define tear down APIs for normal application exit 
● Define abnormal cleanup APIs / mechanisms for abnormal exit 
● Complete API compliance test suite 
● Post ODP 1.0 
● Focus on performance and HW implementations 
● Verify 1.0 APIs can be implemented efficiently across member hardware 
● Verify 1.0 APIs can be used to build a non-trivial application considering 
● Portability 
● Performance 
● Quality of Service (for example, behavior of overall system when over-subscribed)
More about Linaro Connect: connect.linaro.org 
Linaro members: www.linaro.org/members 
More about Linaro: www.linaro.org/about/

  • 4. Cisco Data Plane - Today ● Deployed on a variety of merchant silicon devices ● x86 ● MIPS ● PPC ● ARMv7 and ARMv8 ● Deployed in a variety of environments ● Bare Metal (QFP ASIC) ● Bare Metal (Merchant Silicon) ● Native Linux Process ● Crypto operations offloaded in a variety of ways ● Synchronous in place ● Asynchronous coprocessor
  • 5. Cisco Data Plane - Initial Merchant Silicon ● Initial merchant silicon deployment was (still is) “bare metal” ● Device provides several hardware assists ● Work distribution ● Packet ordering ● Ordered / atomic code sections ● Hierarchical queuing and scheduling ● Cryptographic operations ● Drawbacks ● Control plane runs on separate device (difficult to partition Linux / bare metal) ● Huge investment in time and resources ● Lack of OS required instrumenting large amount of infrastructure ● Required reworking existing Cisco hardware abstraction API ● HW abstraction implementation written from scratch (did not leverage vendor SDK)
  • 6. Cisco Data Plane - Later Merchant Silicon ● Subsequent deployments run as multithreaded Linux process ● Advantages ● Control plane able to run on same device ● Rapid deployment velocity on new architectures ● Consistent infrastructure (file I/O, core files, etc) ● Drawbacks ● Kernel interaction must be kept to a minimum for performance ● Hardware assists (if available) are difficult to leverage ● If assist controlled by kernel then high interaction price ● If assist directly accessible from user space, inconsistent API across vendors
  • 7. OpenDataPlane - Cisco’s Interest ● Deploying on merchant silicon makes good business sense ● Allows our ASIC teams to focus on high end differentiation ● Allows us to take advantage of “economies of scale” using off the shelf silicon ● Difficult to compare devices today ● Often unable to consider a device’s HW assists due to SW effort required to leverage ● Goal is to compare devices based on throughput, power and cost ● Desire well defined, common APIs for hardware assists ● Common APIs are good for everyone ● Common APIs accelerate the development of both proprietary and open source apps ● Well crafted APIs allow vendors to differentiate while maintaining portability ● Facilitates device selection based on all merits
  • 8. OpenDataPlane - Cisco Crypto API ● Defining HW assist APIs is a daunting task - prioritize ● Crypto performance is becoming increasingly important ● Getting crypto working “on the next device” has been challenging ● Cisco developed “Crypto Device Abstraction Layer” (CDAL) ● Initial version defines symmetric key operations ● Session creation and per packet APIs, both synchronous and asynchronous ● CDAL has been / is being implemented by multiple HW vendors ● ODP project presents an awesome opportunity ● Helps Cisco accelerate crypto development and participate in open source community ● Cisco requested Crypto API become an ODP priority at LCA14
  • 9. OpenDataPlane - ODP Crypto API ● Goals for the ODP Crypto API ● Level of functionality (but not necessarily semantics) similar to CDAL API ● Develop within existing ODP constructs (i.e. don’t force ODP to be CDAL) ● Be usable “à la carte”, i.e. don’t require wholesale conversion of app to ODP ● Deliverables ● Crypto API Specification ● Linux-generic reference implementation ● Example application to evaluate API definition ● Stretch Goals ● Test application to evaluate performance across implementations ● Cisco data plane using ODP crypto API
  • 10. ODP Crypto API Status - Definition ● Document version 1.0 available today (opendataplane.org) ● Reference implementation also available (git.linaro.org/lng/odp.git) ● Patches accepted for “linux-generic” in August ● Implements 3DES cipher and MD5 hash for authentication using OpenSSL libraries ● Supports sync and async versions of per session and per packet operations ● Supports multiple models for storing results of per packet operations ● Result into same buffer (i.e. in place) ● Result into new buffer (supplied by application) ● Result into new buffer (allocated by implementation) ● Open issues / work items ● Resolve packet / completion event relationship questions ● Ability to query implementation capacities and capabilities
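The three result-storage models listed on slide 10 can be sketched in a few lines. This is a toy model only, not the ODP crypto API: `OutputMode`, `crypto_op` and `toy_transform` are hypothetical names, and the XOR transform is a stand-in for a real cipher such as 3DES.

```python
from enum import Enum
from typing import Optional


class OutputMode(Enum):
    """Hypothetical labels for the three result-storage models on the slide."""
    IN_PLACE = "in_place"        # result overwrites the input buffer
    APP_BUFFER = "app_buffer"    # result goes into a buffer supplied by the application
    IMPL_BUFFER = "impl_buffer"  # result goes into a buffer allocated by the implementation


def toy_transform(data: bytes) -> bytes:
    """Stand-in for a real cipher: XOR every byte with 0x5A (not secure)."""
    return bytes(b ^ 0x5A for b in data)


def crypto_op(src: bytearray, mode: OutputMode,
              dst: Optional[bytearray] = None) -> bytearray:
    """Apply the toy transform and store the result according to `mode`."""
    result = toy_transform(bytes(src))
    if mode is OutputMode.IN_PLACE:
        src[:] = result              # input buffer is reused for the output
        return src
    if mode is OutputMode.APP_BUFFER:
        if dst is None or len(dst) < len(result):
            raise ValueError("caller must supply a large enough buffer")
        dst[:len(result)] = result   # input buffer is left untouched
        return dst
    return bytearray(result)         # implementation allocates a fresh buffer
```

In the in-place mode the caller gets its own buffer back; in the other two modes the input survives the operation, which matters if the application still needs the original packet.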
  • 11. ODP Crypto API Status - Applications ● IPsec example application ● Patches reviewed and ready for implementation ● Vehicle to evaluate “robustness” of ODP crypto API ● Implements IPsec ESP and AH protocols using 3DES and MD5/96 ● Performance test application ● Initial version functioning, more work to do before submitting patches ● Measures throughput for various payload sizes ● Preliminary results (next slides) ● Cisco DP using ODP crypto API ● Start gated by Cisco data plane work items ● Pending DP port to ODP infrastructure ● Pending DP support for configuring crypto in “headless” environment
  • 12. ODP Crypto API Status - IPsec Example App ● Configuration driven from command line (modeled after “setkey”) ● IPv4 forwarding between ports based on configured routing table ● IPsec encode/decode based on configured SA/SP database entries ● Currently transport mode only, tunnel mode to be added (Bug 641) ● Supports live traffic (demos on multiple platforms this week) ● Supports standalone traffic generation / verification ● Generates packets internally, captures and verifies results without need for packet IO ● Utilizes key features of ODP ● Runs on multiple cores, utilizing either odp_schedule or polled queues ● Utilizes ORDERED and ATOMIC queues to maintain ordering
  • 13. ODP Crypto API Status - Performance Test App ● All testing performed on TI Keystone II eval system (4xA15) ● Compare “linux-generic” (SW) versus “linux-keystone2” (HW) ● Test loops, invoking per packet crypto API, measures elapsed time ● Single encode/encrypt session used for testing ● Session specifies both cipher (3DES) and authentication (MD5-96) ● Async test saturates pipeline with parallel encrypt operations, polls for responses ● Caveats ● The “linux-generic” as tested focuses on functionality not performance ● The “linux-keystone2” as tested has yet to be performance optimized (but will be soon)
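The test method on slide 13 (loop, invoke a per-packet operation, measure elapsed time) can be approximated as below. This is a rough sketch under stated assumptions: it does not call ODP, it covers only the authentication half (HMAC-MD5 truncated to 96 bits) because 3DES is not in the Python standard library, and `hmac_md5_96` and `time_per_packet_us` are hypothetical helper names.

```python
import hashlib
import hmac
import time


def hmac_md5_96(key: bytes, payload: bytes) -> bytes:
    """HMAC-MD5 truncated to the first 96 bits (12 bytes)."""
    return hmac.new(key, payload, hashlib.md5).digest()[:12]


def time_per_packet_us(payload_len: int, iterations: int = 1000) -> float:
    """Average microseconds per operation for one payload size."""
    key = b"k" * 16                 # fixed dummy key, as with a single test session
    payload = bytes(payload_len)    # zero-filled payload of the requested size
    start = time.perf_counter()
    for _ in range(iterations):
        hmac_md5_96(key, payload)
    elapsed_s = time.perf_counter() - start
    return elapsed_s / iterations * 1e6


if __name__ == "__main__":
    for size in (16, 64, 256, 1024, 8192, 16384):
        print(size, round(time_per_packet_us(size), 3))
```

Numbers from this loop say nothing about the Keystone II results; it only illustrates the shape of a synchronous per-packet measurement.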
  • 14. ODP Crypto API Status - Perf Test App Results
                        linux-generic                       linux-keystone2
      payload (bytes)   elapsed (us)   throughput (KB/s)    elapsed (us)   throughput (KB/s)
      16                14.447         1,081                2.782          5,615
      64                22.132         2,823                2.804          22,290
      256               52.910         4,725                2.867          87,198
      1,024             176.745        5,657                7.349          136,076
      8,192             1,331.475      6,008                56.250         142,221
      16,384            2,652.426      6,032                112.500        142,221
    ● In summary, H/W assist is ~22 times faster for sizeable payloads
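The throughput figures on slide 14 appear to be payload size divided by elapsed time, in units of 1024 bytes per second. That reading is inferred from the numbers, not stated on the slide. A short check, with `throughput_kib_per_s` and `speedup` as hypothetical helper names:

```python
# (payload bytes, linux-generic elapsed us, linux-keystone2 elapsed us), from slide 14
RESULTS = [
    (16, 14.447, 2.782),
    (64, 22.132, 2.804),
    (256, 52.910, 2.867),
    (1024, 176.745, 7.349),
    (8192, 1331.475, 56.250),
    (16384, 2652.426, 112.500),
]


def throughput_kib_per_s(payload_bytes: int, elapsed_us: float) -> float:
    """Bytes per second, expressed in units of 1024 bytes (assumed unit)."""
    return payload_bytes / (elapsed_us * 1e-6) / 1024


def speedup(generic_us: float, keystone_us: float) -> float:
    """How many times faster the hardware path is for the same payload."""
    return generic_us / keystone_us


if __name__ == "__main__":
    for payload, generic_us, keystone_us in RESULTS:
        print(payload,
              int(throughput_kib_per_s(payload, generic_us)),
              int(throughput_kib_per_s(payload, keystone_us)),
              round(speedup(generic_us, keystone_us), 1))
```

Recomputed this way, the values agree with the slide to within rounding of the elapsed times, and the elapsed-time ratio for the 16,384-byte payload comes out at roughly 23.6, in the same range as the slide's "~22 times" summary.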
  • 15. ODP Crypto API Status - HW Implementations ● Several vendors demoing this week using IPsec example app ● Texas Instruments - Keystone II ● Asynchronous / new buffer mode ● Cavium - Octeon CN66XX ● Synchronous / in place mode ● Freescale - P4080DS ● Asynchronous / in place mode ● Avago - AXM5500 ● Asynchronous / in place mode
  • 16. Cisco DP on ODP - Introduction ● With Crypto API defined, where do we focus next? ● As core counts grow, HW assists critical to core over core scaling ● Leveraging merchant silicon HW assists proves challenging ● Large resource investment for each device / SDK targeted ● Different operating environments ● Different levels of abstraction ● ODP potentially allows Cisco to quickly leverage critical assists ● Work distribution - odp_schedule() ● Packet ordering - ODP_SCHED_SYNC_ORDERED queues ● Ordered / atomic code sections - ODP_SCHED_SYNC_ATOMIC queues ● Buffer management ● Crypto operations
  • 17. Cisco DP on ODP - Block Diagram [Diagram: RX IF and TX IF interfaces, SCHEDULER, BUFFER MANAGER, Crypto Assist, ORDERED and ATOMIC queues, Core 0, Core 1, ... Core N] Loop doing the following ● Call odp_schedule for new work ● Process as much as possible ● Call odp_queue_enq to send to ○ Output interface or ○ Crypto assist engine or ○ Ordered queue or ○ Atomic queue
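The per-core loop on slide 17 (get work from the scheduler, process it, enqueue it to the next destination) can be modelled as below. This is a single-threaded toy, not ODP code: `Scheduler`, `worker_loop`, `crypto_engine` and `run` are hypothetical names, packets are plain dictionaries, and the assumption that completed crypto work re-enters through the scheduler is made for illustration only.

```python
from collections import deque


class Scheduler:
    """Toy stand-in for the scheduler block; not the ODP API."""

    def __init__(self):
        self._work = deque()

    def enqueue(self, event):
        self._work.append(event)

    def schedule(self):
        """Return the next event, or None when no work is pending."""
        return self._work.popleft() if self._work else None


def worker_loop(sched, tx_queue, crypto_queue):
    """One core's loop: take work, then send it to crypto or to the output."""
    while True:
        pkt = sched.schedule()
        if pkt is None:
            return
        if pkt["needs_crypto"] and not pkt["encrypted"]:
            crypto_queue.append(pkt)   # hand off to the crypto assist engine
        else:
            tx_queue.append(pkt)       # ready for the output interface


def crypto_engine(crypto_queue, sched):
    """Toy async engine: marks packets done and feeds them back (assumed flow)."""
    while crypto_queue:
        pkt = crypto_queue.popleft()
        pkt["encrypted"] = True
        sched.enqueue(pkt)


def run(packets):
    sched, tx_queue, crypto_queue = Scheduler(), deque(), deque()
    for pkt in packets:
        sched.enqueue(pkt)
    while True:
        worker_loop(sched, tx_queue, crypto_queue)
        if not crypto_queue:
            break
        crypto_engine(crypto_queue, sched)
    return list(tx_queue)
```

With three input packets where only the second needs crypto, this model transmits them in the order first, third, second. That kind of reordering is what the ordered queues in the diagram are there to prevent; the toy does not implement order restoration.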
  • 18. Cisco DP on ODP - Status ● Currently forwarding IPv4 on x86 and ARM using “linux-generic” ● Development started on ARM using “linux-keystone2” ● Demoing on ARM this week ● Next steps ● Target additional platforms as ODP implementations become available ● Performance analysis and optimizations ● End to end QoS analysis (priority, oversubscription, etc.) ● Integration of CDAL / crypto API
  • 19. Going Forward ● For ODP 1.0 ● Quickly finalize the basic APIs ● Strive for functionality not perfection ● Define tear down APIs for normal application exit ● Define abnormal cleanup APIs / mechanisms for abnormal exit ● Complete API compliance test suite ● Post ODP 1.0 ● Focus on performance and HW implementations ● Verify 1.0 APIs can be implemented efficiently across member hardware ● Verify 1.0 APIs can be used to build a non-trivial application considering ● Portability ● Performance ● Quality of Service (for example, behavior of overall system when over-subscribed)
  • 20. More about Linaro Connect: connect.linaro.org Linaro members: www.linaro.org/members More about Linaro: www.linaro.org/about/