© 2016 Synopsys, Inc. All rights reserved.
May 9, 2016
Case study:
Performance-efficient Implementation of
Robust Header Compression (ROHC)
using an Application-Specific Processor
Gert Goossens, Patrick Verbist,
Erik Brockmeyer, Luc De Coster
Synopsys
Agenda
1. Robust Header Compression (ROHC) in network processing
2. Application-Specific Processor (ASIP) methodology
3. Accelerating control processing in ROHC
4. Accelerating data processing in ROHC
5. Conclusions
ROHC in Network Processing
ROHC compressor
• 1.2 Mpackets/s
• 600 MHz clock → 500 cycles/packet
− Header Parser: ~100 cycles/packet
− Encoder+Context+CRC: ~400 cycles/packet
• Optimize for worst-case control path
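The cycle budget above follows from dividing the clock frequency by the packet rate. A minimal sketch of that arithmetic in C, where the function names `cycles_per_packet` and `budget_slack` are hypothetical and introduced only for illustration:

```c
#include <stdint.h>

/* Cycles available per packet = clock frequency / packet rate. */
uint32_t cycles_per_packet(uint64_t clock_hz, uint64_t packets_per_s)
{
    return (uint32_t)(clock_hz / packets_per_s);
}

/* Cycles left over once the parser and encoder stages are accounted for.
 * A negative result means the worst-case path does not fit the budget. */
int32_t budget_slack(uint32_t budget, uint32_t parser_cycles, uint32_t encoder_cycles)
{
    return (int32_t)budget - (int32_t)parser_cycles - (int32_t)encoder_cycles;
}
```

With the slide's figures, `cycles_per_packet(600000000, 1200000)` gives 500, and the two stage budgets of roughly 100 and 400 cycles use it up completely. That is why the worst-case control path, not the average one, is the quantity to optimize.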
High Performance Streaming Data (IP/UDP/RTP Protocol)
[Figure: uncompressed packet = IP Header (20-40 bytes) + UDP Header (8 bytes) + RTP Header (12 bytes) + Payload (Video/Audio…); compressed packet = ROHC Header + Payload (Video/Audio…)]
[Figure: ROHC Compressor → Radio or Cable Link → ROHC Decompressor]
[Figure: block diagram: Header Parser, Header Field Encoder, Packet Modification Buffer, Feedback Buffer, Context Processor, CRC, Context Mem]
ROHC Implementation
[Figure: block diagram (Header Parser, Header Field Encoder, Packet Modification Buffer, Feedback Buffer, Context Processor, CRC, Context Mem), shaded per the legend below]
█ Blocks requiring efficient control-flow
→ Tiny microprocessor with efficient branching and logic operations
█ Blocks requiring efficient control-flow and data processing
→ Tiny microprocessor with hardware-accelerated instructions
ASIP technology enables the design of such processors
ASIPs in SoC Design
ASIP architectural optimization space
• Parallelism
– Instruction-level parallelism: orthogonal instruction set (VLIW), encoded instruction set
– Data-level parallelism: vector processing (SIMD)
– Task-level parallelism: multi-core, multi-threading
• Specialization
– Application-specific data types: integer, fractional, floating-point, bits, complex, vector…
– Application-specific instructions: data processing (any exotic operator; single or multi-cycle), memory addressing (direct, indirect, post-modification, indexed, stack indirect…), control processing (jumps, subroutines, interrupts, HW do-loops, residual control, predication; relative or absolute, address range, delay slots)
– Connectivity & storage matching application’s data-flow: distributed regs, sub-ranges; multiple mem’s, sub-ranges
• Pipeline
– Pipeline depth
– Hazards: HW/SW stall, bypass
[Figure: architecture spectrum: Microprocessor, Extensible Processor, Application-Specific uP / DSP, Programmable Datapath, Hardwired Datapath]
“ASIP Designer” Tool-Suite
Accelerated Control Processing
Customization of a 16-bit CPU: “Strip Down & Beef Up”
• Architectural exploration with ASIP Designer
• Starting point: “Tmicro” CPU
– 16-bit general-purpose CPU (already leaner than 32-bit)
– Variable-length instructions: arithmetic (16), move (16, 32), load/store (16, 32), control (16, 32, 48)
• End point: “Tnano” ASIP
– 16-bit stripped CPU
– Fixed-length instructions: arithmetic, move, load/store, control (16)
– No multi-word decoding overhead
– Improved clock frequency
– Add compact control instructions to accelerate ROHC code
– Predicated execution (Selection)
– Field extraction (Masking)
– Shortcut logic instructions
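Field extraction by masking is the kind of operation a header parser repeats constantly, which is why a compact instruction for it pays off. As a rough illustration in portable C (not the Tnano instruction itself, whose encoding the slides do not give), the hypothetical helper `extract_field` below pulls a bit field out of a 16-bit word with a shift and a mask:

```c
#include <stdint.h>

/* Extract `width` bits starting at bit `lsb` (bit 0 = least significant).
 * Assumes 1 <= width <= 16 and lsb + width <= 16. */
uint16_t extract_field(uint16_t word, unsigned lsb, unsigned width)
{
    uint16_t mask = (uint16_t)(((uint32_t)1 << width) - 1u);
    return (uint16_t)((word >> lsb) & mask);
}
```

For example, the first 16-bit word of a typical IPv4 header is 0x4500: `extract_field(0x4500, 12, 4)` yields the version (4) and `extract_field(0x4500, 8, 4)` yields the header length in 32-bit words (5, i.e. 20 bytes). On a general-purpose CPU each call costs a shift plus an AND; a dedicated field-extraction instruction can fold both into one operation.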
Accelerated Control Processing
Control Path Balancing
• Example: Control-Flow Graph of Header Parser
[Figure: control-flow graph with the longest and shortest control paths marked]
• Improve control path balancing by
– C source code refactoring
– User control over code hoisting
– Predicated execution in tail of long control paths
Accelerated Control Processing
If-Else, No Predication on Tmicro (gen.-purp. CPU)
• nML: conditional jump instruction, 2-cycle branch penalty
• C: condition at tail of long control path
• Machine code: conditional jump with branch penalty; one of two delay slots filled, one ‘nop’ left
Accelerated Control Processing
Predication on Tnano (optimized ASIP)
• nML: select instruction
• C: condition at tail of long control path
• Machine code: conditional code always executes; result is used selectively
→ No branch penalty
• nML: predication threshold
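The difference between the branching and the predicated form can be sketched in plain C. The code listings from the slides are not preserved, so the two hypothetical functions below are only an illustration: `pick_branch` is the shape that compiles naturally to a conditional jump, and `pick_select` computes the conditional result unconditionally and then keeps or discards it. The mask trick stands in for a hardware select instruction, and an optimizing compiler is free to emit either form for either function.

```c
#include <stdint.h>

/* Branching form: the conditional work sits behind a conditional jump. */
uint16_t pick_branch(uint16_t cond, uint16_t a, uint16_t b)
{
    if (cond)
        return (uint16_t)(a + 1);
    return b;
}

/* Predicated form: the conditional work always executes,
 * and the result is used selectively. */
uint16_t pick_select(uint16_t cond, uint16_t a, uint16_t b)
{
    uint16_t t = (uint16_t)(a + 1);           /* executed unconditionally */
    uint16_t mask = (uint16_t)-(cond != 0);   /* 0xFFFF if cond is nonzero, else 0 */
    return (uint16_t)((t & mask) | (b & (uint16_t)~mask));
}
```

Both functions return the same value for every input. The predicated form trades a little redundant work for the absence of a branch penalty, which is worthwhile at the tail of a long control path, where a wasted delay slot lengthens the worst case.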
Accelerated Control Processing
If-Else with Multiple Tests on Tmicro (gen.-purp. CPU)
• nML: stand-alone compare instruction
• C: “if-else” with multiple tests
• Machine code: multiple compare and c-jump instructions
→ Slow in the worst case
Accelerated Control Processing
If-Else with Multiple Tests on Tnano (optimized ASIP)
• nML: “compare + shortcut-logic” instruction
CND &= Rj==Ri
CND |= Rj!=Ri
• C: “if-else” with multiple tests
• Machine code: multiple “compare + shortcut-logic” instructions, single c-jump
→ Worst case is always faster!
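The “compare + shortcut-logic” idea can be mimicked in C by folding each comparison into a single condition flag and branching once at the end. This is a sketch under assumptions: the function names and the particular combination of tests are hypothetical, and only the `CND &=` / `CND |=` pattern comes from the slide.

```c
#include <stdint.h>
#include <stdbool.h>

/* Conventional form: each test may compile to its own compare and
 * conditional jump. */
bool cond_branchy(uint16_t a, uint16_t b, uint16_t c, uint16_t d,
                  uint16_t e, uint16_t f)
{
    if ((a == b && c == d) || e != f)
        return true;
    return false;
}

/* Accumulating form: every compare updates one condition flag,
 * mirroring CND &= Rj==Ri and CND |= Rj!=Ri; one test remains at the end. */
bool cond_shortcut(uint16_t a, uint16_t b, uint16_t c, uint16_t d,
                   uint16_t e, uint16_t f)
{
    bool cnd = (a == b);
    cnd &= (c == d);
    cnd |= (e != f);
    return cnd;
}
```

The two functions agree because the comparisons have no side effects, so evaluating all of them unconditionally changes nothing but the instruction count. The accumulating form always executes the same number of operations, so its worst case equals its only case.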
Accelerated Control Processing
Results – Header Parser
                                     Tmicro CPU      Tnano ASIP
Rohc_parse program code size         347 x 16-bit    227 x 16-bit (-35%)
Rohc_parse cycle count per packet    191             87 (-55%)
Clock frequency (28nm HPM)           800 MHz         1 GHz (+25%)
Gate count (core only, 28nm HPM)     14K gates       5.4K gates (-61%)
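The cycle-count and clock-frequency rows combine into a per-packet parsing time, which the table does not state directly. A small sketch of that derivation, with hypothetical helper names:

```c
/* Time per packet in nanoseconds = cycles / clock frequency in GHz. */
double packet_time_ns(unsigned cycles, double clock_ghz)
{
    return (double)cycles / clock_ghz;
}

/* Packet rate one core could sustain, in Mpackets/s, if it did nothing
 * but this task. */
double packet_rate_mpps(unsigned cycles, double clock_ghz)
{
    return 1000.0 * clock_ghz / (double)cycles;
}
```

Applied to the reported figures, `packet_time_ns(191, 0.8)` is 238.75 ns for Tmicro and `packet_time_ns(87, 1.0)` is 87 ns for Tnano, so the header parser runs roughly 2.7 times faster in wall-clock terms once the higher clock frequency is taken into account.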
Accelerated Data Processing
• Implementation styles
– Software on processor: too slow?
– Hardware co-processors: (manual) design effort, synchronization challenge?
– Hardware-accelerated instructions in ASIP instruction set: well supported by tools, potential for resource sharing!
[Figure: block diagram: Header Parser, Header Field Encoder, Packet Modification Buffer, Feedback Buffer, Context Processor, CRC, Context Mem]
[Figure callouts: CRC; WLSB encoder; Scaled / Timer-Based RTP Timestamp Compression; …]
Accelerated Data Processing
WLSB Encoder: SW Implementation on Tmicro (gen.-purp. CPU)
• nML: general-purpose ALU: add, sub, shift, mask…
• C: software implementation of WLSB encoder: for-loop with called function
• Machine code: 30 instructions for called function
• 6-packet test program: 2110 cycles
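The slide's C listing is not preserved, so the following is only a sketch of what a “for-loop with called function” W-LSB encoder might look like. It follows the window-based LSB definition in RFC 3095, where a value v can be sent with k bits relative to a reference v_ref if v lies in the interval [v_ref - p, v_ref + 2^k - 1 - p]. The function names, the 16-bit field width, and the unsigned treatment of the offset p are assumptions made for illustration.

```c
#include <stdint.h>

/* Called function: smallest k such that v lies in
 * [v_ref - p, v_ref + 2^k - 1 - p], using 16-bit modular arithmetic. */
unsigned wlsb_min_bits(uint16_t v_ref, uint16_t v, uint16_t p)
{
    /* Distance of v from the lower edge of the interval, modulo 2^16. */
    uint16_t offset = (uint16_t)(v - v_ref + p);
    unsigned k = 0;
    while (k < 16 && offset > (uint16_t)(((uint32_t)1 << k) - 1u))
        k++;
    return k;
}

/* For-loop over the sliding window of candidate references: the number
 * of bits to send is the maximum required by any reference. */
unsigned wlsb_encode_bits(const uint16_t *window, unsigned n,
                          uint16_t v, uint16_t p)
{
    unsigned k = 0;
    for (unsigned i = 0; i < n; i++) {
        unsigned ki = wlsb_min_bits(window[i], v, p);
        if (ki > k)
            k = ki;
    }
    return k;
}
```

The inner function is made of the shifts, subtractions, and compares that a general-purpose ALU executes one at a time, and it runs once per window entry. That structure is what makes it a natural candidate for a single hardware-accelerated instruction.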
Accelerated Data Processing
WLSB Encoder: HW-Accelerated Instruction on Tnano (optimized ASIP)
• nML (ISA view): WLSB encoder instruction, calling hardware primitive
• nML (behavioral view): WLSB hardware primitive in bit-accurate C code, auto-translated to RTL
• C: intrinsic function call to WLSB encoder instruction
• Machine code: called function replaced by single instruction
• 6-packet test program: 267 cycles (7.9x speedup)
© 2016 Synopsys, Inc. All rights reserved.
May 9, 2016
Accelerated Data Processing
Results: Adding HW-Accelerated Instructions
Tmicro
CPU
Tnano ASIP Tnano ASIP
w/ WLSB instr
WLSB 6-packet test program
code size
134 x 16-bit 126 x 16-bit 84 x 16-bit (-33%)
WLSB 6-packet test program
cycle count
2122 2110 267 (-87%)
Clock frequency
(28nm HPM)
800 MHz 1 GHz 1 GHz (0%)
Gate count
(core only, 28nm HPM)
14K gates 5.4K gates 6.3K gates (+16%)
Conclusions
• Application-Specific Processors (ASIP)
– Enable acceleration of control and data processing, similar to fixed-function hardware
– Flexibility of a software-programmable processor
• ASIP Designer enables rapid ASIP design
– Architectural exploration: Compiler-in-the-Loop
– SDK generation
– RTL generation
• Benefits illustrated with Robust Header Compression (ROHC) case study

More Related Content

What's hot

Iptables
IptablesIptables
Iptables
rohit verma
 
BKK16-TR08 How to generate power models for EAS and IPA
BKK16-TR08 How to generate power models for EAS and IPABKK16-TR08 How to generate power models for EAS and IPA
BKK16-TR08 How to generate power models for EAS and IPA
Linaro
 
eBPF/XDP
eBPF/XDP eBPF/XDP
eBPF/XDP
Netronome
 
BUD17-218: Scheduler Load tracking update and improvement
BUD17-218: Scheduler Load tracking update and improvement BUD17-218: Scheduler Load tracking update and improvement
BUD17-218: Scheduler Load tracking update and improvement
Linaro
 
AMD and the new “Zen” High Performance x86 Core at Hot Chips 28
AMD and the new “Zen” High Performance x86 Core at Hot Chips 28AMD and the new “Zen” High Performance x86 Core at Hot Chips 28
AMD and the new “Zen” High Performance x86 Core at Hot Chips 28
AMD
 
CAPI and OpenCAPI Hardware acceleration enablement
CAPI and OpenCAPI Hardware acceleration enablementCAPI and OpenCAPI Hardware acceleration enablement
CAPI and OpenCAPI Hardware acceleration enablement
Ganesan Narayanasamy
 
@IBM Power roadmap 8
@IBM Power roadmap 8 @IBM Power roadmap 8
@IBM Power roadmap 8
Diego Alberto Tamayo
 
Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021
Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021
Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021
Deepak Shankar
 
An Update on Arm HPC
An Update on Arm HPCAn Update on Arm HPC
An Update on Arm HPC
inside-BigData.com
 
Host Data Plane Acceleration: SmartNIC Deployment Models
Host Data Plane Acceleration: SmartNIC Deployment ModelsHost Data Plane Acceleration: SmartNIC Deployment Models
Host Data Plane Acceleration: SmartNIC Deployment Models
Netronome
 
POWER10 innovations for HPC
POWER10 innovations for HPCPOWER10 innovations for HPC
POWER10 innovations for HPC
Ganesan Narayanasamy
 
Measuring directly from cpu hardware performance counters
Measuring directly from cpu  hardware performance countersMeasuring directly from cpu  hardware performance counters
Measuring directly from cpu hardware performance counters
Jean-Philippe BEMPEL
 
OpenPOWER Acceleration of HPCC Systems
OpenPOWER Acceleration of HPCC SystemsOpenPOWER Acceleration of HPCC Systems
OpenPOWER Acceleration of HPCC Systems
HPCC Systems
 
Rtos ameba
Rtos amebaRtos ameba
Rtos ameba
Jou Neo
 
It's Time to ROCm!
It's Time to ROCm!It's Time to ROCm!
It's Time to ROCm!
inside-BigData.com
 
Using a Field Programmable Gate Array to Accelerate Application Performance
Using a Field Programmable Gate Array to Accelerate Application PerformanceUsing a Field Programmable Gate Array to Accelerate Application Performance
Using a Field Programmable Gate Array to Accelerate Application Performance
Odinot Stanislas
 
P4 Introduction
P4 Introduction P4 Introduction
P4 Introduction
Netronome
 
Programming Models for Heterogeneous Chips
Programming Models for  Heterogeneous ChipsProgramming Models for  Heterogeneous Chips
Programming Models for Heterogeneous Chips
Facultad de Informática UCM
 
Accelerate Big Data Processing with High-Performance Computing Technologies
Accelerate Big Data Processing with High-Performance Computing TechnologiesAccelerate Big Data Processing with High-Performance Computing Technologies
Accelerate Big Data Processing with High-Performance Computing Technologies
Intel® Software
 

What's hot (20)

Iptables
IptablesIptables
Iptables
 
BKK16-TR08 How to generate power models for EAS and IPA
BKK16-TR08 How to generate power models for EAS and IPABKK16-TR08 How to generate power models for EAS and IPA
BKK16-TR08 How to generate power models for EAS and IPA
 
eBPF/XDP
eBPF/XDP eBPF/XDP
eBPF/XDP
 
BUD17-218: Scheduler Load tracking update and improvement
BUD17-218: Scheduler Load tracking update and improvement BUD17-218: Scheduler Load tracking update and improvement
BUD17-218: Scheduler Load tracking update and improvement
 
AMD and the new “Zen” High Performance x86 Core at Hot Chips 28
AMD and the new “Zen” High Performance x86 Core at Hot Chips 28AMD and the new “Zen” High Performance x86 Core at Hot Chips 28
AMD and the new “Zen” High Performance x86 Core at Hot Chips 28
 
CAPI and OpenCAPI Hardware acceleration enablement
CAPI and OpenCAPI Hardware acceleration enablementCAPI and OpenCAPI Hardware acceleration enablement
CAPI and OpenCAPI Hardware acceleration enablement
 
@IBM Power roadmap 8
@IBM Power roadmap 8 @IBM Power roadmap 8
@IBM Power roadmap 8
 
Opal rt e phaso rsim_2013
Opal rt e phaso rsim_2013Opal rt e phaso rsim_2013
Opal rt e phaso rsim_2013
 
Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021
Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021
Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021
 
An Update on Arm HPC
An Update on Arm HPCAn Update on Arm HPC
An Update on Arm HPC
 
Host Data Plane Acceleration: SmartNIC Deployment Models
Host Data Plane Acceleration: SmartNIC Deployment ModelsHost Data Plane Acceleration: SmartNIC Deployment Models
Host Data Plane Acceleration: SmartNIC Deployment Models
 
POWER10 innovations for HPC
POWER10 innovations for HPCPOWER10 innovations for HPC
POWER10 innovations for HPC
 
Measuring directly from cpu hardware performance counters
Measuring directly from cpu  hardware performance countersMeasuring directly from cpu  hardware performance counters
Measuring directly from cpu hardware performance counters
 
OpenPOWER Acceleration of HPCC Systems
OpenPOWER Acceleration of HPCC SystemsOpenPOWER Acceleration of HPCC Systems
OpenPOWER Acceleration of HPCC Systems
 
Rtos ameba
Rtos amebaRtos ameba
Rtos ameba
 
It's Time to ROCm!
It's Time to ROCm!It's Time to ROCm!
It's Time to ROCm!
 
Using a Field Programmable Gate Array to Accelerate Application Performance
Using a Field Programmable Gate Array to Accelerate Application PerformanceUsing a Field Programmable Gate Array to Accelerate Application Performance
Using a Field Programmable Gate Array to Accelerate Application Performance
 
P4 Introduction
P4 Introduction P4 Introduction
P4 Introduction
 
Programming Models for Heterogeneous Chips
Programming Models for  Heterogeneous ChipsProgramming Models for  Heterogeneous Chips
Programming Models for Heterogeneous Chips
 
Accelerate Big Data Processing with High-Performance Computing Technologies
Accelerate Big Data Processing with High-Performance Computing TechnologiesAccelerate Big Data Processing with High-Performance Computing Technologies
Accelerate Big Data Processing with High-Performance Computing Technologies
 

Similar to Gert Goossens,Sen. Director, ASIP Tools, Synopsys

JVM and OS Tuning for accelerating Spark application
JVM and OS Tuning for accelerating Spark applicationJVM and OS Tuning for accelerating Spark application
JVM and OS Tuning for accelerating Spark application
Tatsuhiro Chiba
 
Large-Scale Optimization Strategies for Typical HPC Workloads
Large-Scale Optimization Strategies for Typical HPC WorkloadsLarge-Scale Optimization Strategies for Typical HPC Workloads
Large-Scale Optimization Strategies for Typical HPC Workloads
inside-BigData.com
 
Kunal Varshney, VLSI Engineer, Open-Silicon
Kunal Varshney, VLSI Engineer, Open-SiliconKunal Varshney, VLSI Engineer, Open-Silicon
Kunal Varshney, VLSI Engineer, Open-Silicon
chiportal
 
Webinar replay: MySQL Query Tuning Trilogy: Query tuning process and tools
Webinar replay: MySQL Query Tuning Trilogy: Query tuning process and toolsWebinar replay: MySQL Query Tuning Trilogy: Query tuning process and tools
Webinar replay: MySQL Query Tuning Trilogy: Query tuning process and tools
Severalnines
 
Designing for High Performance Ceph at Scale
Designing for High Performance Ceph at ScaleDesigning for High Performance Ceph at Scale
Designing for High Performance Ceph at Scale
James Saint-Rossy
 
Ti DSP optimization on Jacinto
Ti DSP optimization on JacintoTi DSP optimization on Jacinto
Ti DSP optimization on Jacinto
Hank (Tai-Chi) Wang
 
Application Profiling at the HPCAC High Performance Center
Application Profiling at the HPCAC High Performance CenterApplication Profiling at the HPCAC High Performance Center
Application Profiling at the HPCAC High Performance Center
inside-BigData.com
 
Cisco Connect Toronto 2017 - Model-driven Telemetry
Cisco Connect Toronto 2017 - Model-driven TelemetryCisco Connect Toronto 2017 - Model-driven Telemetry
Cisco Connect Toronto 2017 - Model-driven Telemetry
Cisco Canada
 
DataCore Technology Overview
DataCore Technology OverviewDataCore Technology Overview
DataCore Technology Overview
Jeff Slapp
 
Optimizing Servers for High-Throughput and Low-Latency at Dropbox
Optimizing Servers for High-Throughput and Low-Latency at DropboxOptimizing Servers for High-Throughput and Low-Latency at Dropbox
Optimizing Servers for High-Throughput and Low-Latency at Dropbox
ScyllaDB
 
BKK16-308 The tool called Auto-Tuned Optimization System (ATOS)
BKK16-308 The tool called Auto-Tuned Optimization System (ATOS)BKK16-308 The tool called Auto-Tuned Optimization System (ATOS)
BKK16-308 The tool called Auto-Tuned Optimization System (ATOS)
Linaro
 
"Combining Flexibility and Low-Power in Embedded Vision Subsystems: An Applic...
"Combining Flexibility and Low-Power in Embedded Vision Subsystems: An Applic..."Combining Flexibility and Low-Power in Embedded Vision Subsystems: An Applic...
"Combining Flexibility and Low-Power in Embedded Vision Subsystems: An Applic...
Edge AI and Vision Alliance
 
Capi snap overview
Capi snap overviewCapi snap overview
Capi snap overview
Yutaka Kawai
 
Best Practices and Performance Studies for High-Performance Computing Clusters
Best Practices and Performance Studies for High-Performance Computing ClustersBest Practices and Performance Studies for High-Performance Computing Clusters
Best Practices and Performance Studies for High-Performance Computing Clusters
Intel® Software
 
MongoDB Europe 2016 - Deploying MongoDB on NetApp storage
MongoDB Europe 2016 - Deploying MongoDB on NetApp storageMongoDB Europe 2016 - Deploying MongoDB on NetApp storage
MongoDB Europe 2016 - Deploying MongoDB on NetApp storage
MongoDB
 
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDK
Kernel TLV
 
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese..."Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
Edge AI and Vision Alliance
 
IBM Runtimes Performance Observations with Apache Spark
IBM Runtimes Performance Observations with Apache SparkIBM Runtimes Performance Observations with Apache Spark
IBM Runtimes Performance Observations with Apache Spark
AdamRobertsIBM
 
SoC Solutions Enabling Server-Based Networking
SoC Solutions Enabling Server-Based NetworkingSoC Solutions Enabling Server-Based Networking
SoC Solutions Enabling Server-Based Networking
Netronome
 
Performance is not an Option - gRPC and Cassandra
Performance is not an Option - gRPC and CassandraPerformance is not an Option - gRPC and Cassandra
Performance is not an Option - gRPC and Cassandra
Dave Bechberger
 

Similar to Gert Goossens,Sen. Director, ASIP Tools, Synopsys (20)

JVM and OS Tuning for accelerating Spark application
JVM and OS Tuning for accelerating Spark applicationJVM and OS Tuning for accelerating Spark application
JVM and OS Tuning for accelerating Spark application
 
Large-Scale Optimization Strategies for Typical HPC Workloads
Large-Scale Optimization Strategies for Typical HPC WorkloadsLarge-Scale Optimization Strategies for Typical HPC Workloads
Large-Scale Optimization Strategies for Typical HPC Workloads
 
Kunal Varshney, VLSI Engineer, Open-Silicon
Kunal Varshney, VLSI Engineer, Open-SiliconKunal Varshney, VLSI Engineer, Open-Silicon
Kunal Varshney, VLSI Engineer, Open-Silicon
 
Webinar replay: MySQL Query Tuning Trilogy: Query tuning process and tools
Webinar replay: MySQL Query Tuning Trilogy: Query tuning process and toolsWebinar replay: MySQL Query Tuning Trilogy: Query tuning process and tools
Webinar replay: MySQL Query Tuning Trilogy: Query tuning process and tools
 
Designing for High Performance Ceph at Scale
Designing for High Performance Ceph at ScaleDesigning for High Performance Ceph at Scale
Designing for High Performance Ceph at Scale
 
Ti DSP optimization on Jacinto
Ti DSP optimization on JacintoTi DSP optimization on Jacinto
Ti DSP optimization on Jacinto
 
Application Profiling at the HPCAC High Performance Center
Application Profiling at the HPCAC High Performance CenterApplication Profiling at the HPCAC High Performance Center
Application Profiling at the HPCAC High Performance Center
 
Cisco Connect Toronto 2017 - Model-driven Telemetry
Cisco Connect Toronto 2017 - Model-driven TelemetryCisco Connect Toronto 2017 - Model-driven Telemetry
Cisco Connect Toronto 2017 - Model-driven Telemetry
 
DataCore Technology Overview
DataCore Technology OverviewDataCore Technology Overview
DataCore Technology Overview
 
Optimizing Servers for High-Throughput and Low-Latency at Dropbox
Optimizing Servers for High-Throughput and Low-Latency at DropboxOptimizing Servers for High-Throughput and Low-Latency at Dropbox
Optimizing Servers for High-Throughput and Low-Latency at Dropbox
 
BKK16-308 The tool called Auto-Tuned Optimization System (ATOS)
BKK16-308 The tool called Auto-Tuned Optimization System (ATOS)BKK16-308 The tool called Auto-Tuned Optimization System (ATOS)
BKK16-308 The tool called Auto-Tuned Optimization System (ATOS)
 
"Combining Flexibility and Low-Power in Embedded Vision Subsystems: An Applic...
"Combining Flexibility and Low-Power in Embedded Vision Subsystems: An Applic..."Combining Flexibility and Low-Power in Embedded Vision Subsystems: An Applic...
"Combining Flexibility and Low-Power in Embedded Vision Subsystems: An Applic...
 
Capi snap overview
Capi snap overviewCapi snap overview
Capi snap overview
 
Best Practices and Performance Studies for High-Performance Computing Clusters
Best Practices and Performance Studies for High-Performance Computing ClustersBest Practices and Performance Studies for High-Performance Computing Clusters
Best Practices and Performance Studies for High-Performance Computing Clusters
 
MongoDB Europe 2016 - Deploying MongoDB on NetApp storage
MongoDB Europe 2016 - Deploying MongoDB on NetApp storageMongoDB Europe 2016 - Deploying MongoDB on NetApp storage
MongoDB Europe 2016 - Deploying MongoDB on NetApp storage
 
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDK
 
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese..."Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
 
IBM Runtimes Performance Observations with Apache Spark
IBM Runtimes Performance Observations with Apache SparkIBM Runtimes Performance Observations with Apache Spark
IBM Runtimes Performance Observations with Apache Spark
 
SoC Solutions Enabling Server-Based Networking
SoC Solutions Enabling Server-Based NetworkingSoC Solutions Enabling Server-Based Networking
SoC Solutions Enabling Server-Based Networking
 
Performance is not an Option - gRPC and Cassandra
Performance is not an Option - gRPC and CassandraPerformance is not an Option - gRPC and Cassandra
Performance is not an Option - gRPC and Cassandra
 

More from chiportal

Prof. Zhihua Wang, Tsinghua University, Beijing, China
Prof. Zhihua Wang, Tsinghua University, Beijing, China Prof. Zhihua Wang, Tsinghua University, Beijing, China
Prof. Zhihua Wang, Tsinghua University, Beijing, China
chiportal
 
Prof. Steve Furber, University of Manchester, Principal Designer of the BBC M...
Prof. Steve Furber, University of Manchester, Principal Designer of the BBC M...Prof. Steve Furber, University of Manchester, Principal Designer of the BBC M...
Prof. Steve Furber, University of Manchester, Principal Designer of the BBC M...
chiportal
 
Prof. Steve Furber, University of Manchester, Principal Designer of the BBC M...
Prof. Steve Furber, University of Manchester, Principal Designer of the BBC M...Prof. Steve Furber, University of Manchester, Principal Designer of the BBC M...
Prof. Steve Furber, University of Manchester, Principal Designer of the BBC M...
chiportal
 
Prof. Uri Weiser,Technion
Prof. Uri Weiser,TechnionProf. Uri Weiser,Technion
Prof. Uri Weiser,Technion
chiportal
 
Ken Liao, Senior Associate VP, Faraday
Ken Liao, Senior Associate VP, FaradayKen Liao, Senior Associate VP, Faraday
Ken Liao, Senior Associate VP, Faraday
chiportal
 
Prof. Danny Raz, Director, Bell Labs Israel, Nokia
 Prof. Danny Raz, Director, Bell Labs Israel, Nokia  Prof. Danny Raz, Director, Bell Labs Israel, Nokia
Prof. Danny Raz, Director, Bell Labs Israel, Nokia
chiportal
 
Marco Casale-Rossi, Product Mktg. Manager, Synopsys
Marco Casale-Rossi, Product Mktg. Manager, SynopsysMarco Casale-Rossi, Product Mktg. Manager, Synopsys
Marco Casale-Rossi, Product Mktg. Manager, Synopsys
chiportal
 
Dr.Efraim Aharoni, ESD Leader, TowerJazz
Dr.Efraim Aharoni, ESD Leader, TowerJazzDr.Efraim Aharoni, ESD Leader, TowerJazz
Dr.Efraim Aharoni, ESD Leader, TowerJazz
chiportal
 
Eddy Kvetny, System Engineering Group Leader, Intel
Eddy Kvetny, System Engineering Group Leader, IntelEddy Kvetny, System Engineering Group Leader, Intel
Eddy Kvetny, System Engineering Group Leader, Intel
chiportal
 
Dr. John Bainbridge, Principal Application Architect, NetSpeed
 Dr. John Bainbridge, Principal Application Architect, NetSpeed  Dr. John Bainbridge, Principal Application Architect, NetSpeed
Dr. John Bainbridge, Principal Application Architect, NetSpeed
chiportal
 
Xavier van Ruymbeke, App. Engineer, Arteris
Xavier van Ruymbeke, App. Engineer, ArterisXavier van Ruymbeke, App. Engineer, Arteris
Xavier van Ruymbeke, App. Engineer, Arteris
chiportal
 
Asi Lifshitz, VP R&D, Vtool
Asi Lifshitz, VP R&D, VtoolAsi Lifshitz, VP R&D, Vtool
Asi Lifshitz, VP R&D, Vtool
chiportal
 
Zvika Rozenshein,General Manager, EngineeringIQ
Zvika Rozenshein,General Manager, EngineeringIQZvika Rozenshein,General Manager, EngineeringIQ
Zvika Rozenshein,General Manager, EngineeringIQ
chiportal
 
Lewis Chu,Marketing Director,GUC
Lewis Chu,Marketing Director,GUC Lewis Chu,Marketing Director,GUC
Lewis Chu,Marketing Director,GUC
chiportal
 
Tuvia Liran, Director of VLSI, Nano Retina
Tuvia Liran, Director of VLSI, Nano RetinaTuvia Liran, Director of VLSI, Nano Retina
Tuvia Liran, Director of VLSI, Nano Retina
chiportal
 
Sagar Kadam, Lead Software Engineer, Open-Silicon
Sagar Kadam, Lead Software Engineer, Open-SiliconSagar Kadam, Lead Software Engineer, Open-Silicon
Sagar Kadam, Lead Software Engineer, Open-Silicon
chiportal
 
Ronen Shtayer,Director of ASG Operations & PMO, NXP Semiconductor
Ronen Shtayer,Director of ASG Operations & PMO, NXP SemiconductorRonen Shtayer,Director of ASG Operations & PMO, NXP Semiconductor
Ronen Shtayer,Director of ASG Operations & PMO, NXP Semiconductor
chiportal
 
Prof. Emanuel Cohen, Technion
Prof. Emanuel Cohen, TechnionProf. Emanuel Cohen, Technion
Prof. Emanuel Cohen, Technion
chiportal
 
Prof. Ramez Daniel, Technion
Prof. Ramez Daniel, TechnionProf. Ramez Daniel, Technion
Prof. Ramez Daniel, Technion
chiportal
 
Rotem Ben-Hur,Graduate Student,Technio
Rotem Ben-Hur,Graduate Student,TechnioRotem Ben-Hur,Graduate Student,Technio
Rotem Ben-Hur,Graduate Student,Technio
chiportal
 

More from chiportal (20)

Prof. Zhihua Wang, Tsinghua University, Beijing, China
Prof. Zhihua Wang, Tsinghua University, Beijing, China Prof. Zhihua Wang, Tsinghua University, Beijing, China
Prof. Zhihua Wang, Tsinghua University, Beijing, China
 
Prof. Steve Furber, University of Manchester, Principal Designer of the BBC M...
Prof. Steve Furber, University of Manchester, Principal Designer of the BBC M...Prof. Steve Furber, University of Manchester, Principal Designer of the BBC M...
Prof. Steve Furber, University of Manchester, Principal Designer of the BBC M...
 
Prof. Steve Furber, University of Manchester, Principal Designer of the BBC M...
Prof. Steve Furber, University of Manchester, Principal Designer of the BBC M...Prof. Steve Furber, University of Manchester, Principal Designer of the BBC M...
Prof. Steve Furber, University of Manchester, Principal Designer of the BBC M...
 
Prof. Uri Weiser,Technion
Prof. Uri Weiser,TechnionProf. Uri Weiser,Technion
Prof. Uri Weiser,Technion
 
Ken Liao, Senior Associate VP, Faraday
Ken Liao, Senior Associate VP, FaradayKen Liao, Senior Associate VP, Faraday
Ken Liao, Senior Associate VP, Faraday
 
Prof. Danny Raz, Director, Bell Labs Israel, Nokia
 Prof. Danny Raz, Director, Bell Labs Israel, Nokia  Prof. Danny Raz, Director, Bell Labs Israel, Nokia
Prof. Danny Raz, Director, Bell Labs Israel, Nokia
 
Gert Goossens, Sen. Director, ASIP Tools, Synopsys

  • 1. Case study: Performance-efficient Implementation of Robust Header Compression (ROHC) using an Application-Specific Processor
      Gert Goossens, Patrick Verbist, Erik Brockmeyer, Luc De Coster, Synopsys
      © 2016 Synopsys, Inc. All rights reserved. May 9, 2016
  • 2. Agenda
      1. Robust Header Compression (ROHC) in network processing
      2. Application-Specific Processor (ASIP) methodology
      3. Accelerating control processing in ROHC
      4. Accelerating data processing in ROHC
      5. Conclusions
  • 3. ROHC in Network Processing
      ROHC compressor
      • 1.2 Mpackets/s
      • 600 MHz clock → 500 cycles/packet
        − Header Parser: ~100 cycles/packet
        − Encoder+Context+CRC: ~400 cycles/packet
      • Optimize for worst-case control path
      [Figure: High Performance Streaming Data (IP/UDP/RTP Protocol). Uncompressed packet: IP Header (20-40 bytes), UDP Hdr (8 bytes), RTP Header (12 bytes), Payload (Video/Audio…). Compressed packet: ROHC Header, Payload (Video/Audio…). A ROHC Compressor and a ROHC Decompressor sit on either side of a Radio or Cable Link. Block labels: Header Parser, Header Field Encoder, Packet Modification Buffer, Feedback Buffer, Context Processor, CRC, Context Mem.]
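The 500 cycles/packet figure follows from dividing the clock frequency by the packet rate. A minimal sketch of that arithmetic in C; the function name `cycles_per_packet` is an illustrative assumption and does not come from the slides:

```c
#include <assert.h>
#include <stdint.h>

/* Cycle budget per packet = clock frequency / packet rate.
 * Hypothetical helper, for illustration only. */
uint32_t cycles_per_packet(uint64_t clock_hz, uint64_t packets_per_s)
{
    assert(packets_per_s > 0);
    return (uint32_t)(clock_hz / packets_per_s);
}
```

Calling `cycles_per_packet(600000000, 1200000)` gives the 500-cycle total, which the slide splits into roughly 100 cycles for the Header Parser and 400 cycles for Encoder+Context+CRC.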
  • 4. ROHC Implementation
      [Figure: compressor block diagram. Block labels: Header Parser, Header Field Encoder, Packet Modification Buffer, Feedback Buffer, Context Processor, CRC, Context Mem.]
      █ Blocks requiring efficient control-flow → Tiny microprocessor with efficient branching and logic operations
      █ Blocks requiring efficient control-flow and data processing → Tiny microprocessor with hardware-accelerated instructions
      ASIP technology enables the design of such processors
  • 6. ASIPs in SoC Design
      [Figure: ASIP architectural optimization space, along two axes: Parallelism and Specialization.]
      Parallelism
      • Instruction-level parallelism: Orthogonal instruction set (VLIW), Encoded instruction set
      • Data-level parallelism: Vector processing (SIMD)
      • Task-level parallelism: Multi-core, Multi-threading
      Specialization
      • Applic.-specific data types: Integer, fractional, floating-point, bits, complex, vector…
      • Applic.-specific instructions
        − App.-spec. data processing: Any exotic operator; single or multi-cycle
        − App.-spec. memory addressing: Direct, indirect, post-modification, indexed, stack indirect…
        − App.-spec. control processing: Jumps, subroutines, interrupts, HW do-loops, residual control, predication; relative or absolute, address range, delay slots
      • Connectivity & storage matching application’s data-flow: Distributed regs, sub-ranges; multiple mem’s, sub-ranges
      • Pipeline: Pipeline depth; hazards: HW/SW stall, bypass
      Implementation spectrum: Microprocessor, Extensible Processor, Application-Specific uP / DSP, Programmable Datapath, Hardwired Datapath
  • 7. “ASIP Designer” Tool-Suite
  • 9. Accelerated Control Processing
      Customization of a 16-bit CPU: “Strip Down & Beef Up”
      • Architectural exploration with ASIP Designer
      • Starting point: “Tmicro” CPU
        – 16-bit gen.-purpose CPU (already leaner than 32-bit)
        – Variable-length instructions: arithmetic (16), move (16, 32), load/store (16, 32), control (16, 32, 48)
      • End point: “Tnano” ASIP
        – 16-bit stripped CPU
        – Fixed-length instructions: arithmetic, move, load/store, control (16)
          – No multi-word decoding overhead
          – Improved clock frequency
        – Add compact control instructions to accelerate ROHC code
          – Predicated execution (Selection)
          – Field extraction (Masking)
          – Shortcut logic instructions
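To make "Field extraction (Masking)" concrete, the sketch below shows in plain C what such an operation computes on a 16-bit word: a shift followed by a mask. The function name `extract_field` and its parameters are assumptions for illustration; the slides do not show the actual Tnano instruction or its operands.

```c
#include <assert.h>
#include <stdint.h>

/* Extract `width` bits starting at bit position `lsb` from a 16-bit word.
 * Hypothetical helper: on a general-purpose CPU this takes a shift plus an
 * AND; a dedicated field-extraction instruction could do it in one step. */
uint16_t extract_field(uint16_t word, unsigned lsb, unsigned width)
{
    assert(width >= 1 && lsb + width <= 16);
    uint32_t mask = ((uint32_t)1 << width) - 1u;   /* `width` low bits set */
    return (uint16_t)((word >> lsb) & mask);
}
```

For example, the first 16-bit word of a typical IPv4 header is 0x4500: `extract_field(0x4500, 12, 4)` yields the version (4) and `extract_field(0x4500, 8, 4)` yields the header length in 32-bit words (5).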
  • 10. Accelerated Control Processing: Control Path Balancing
      • Example: Control-Flow Graph of Header Parser
        [Figure: control-flow graph with the longest and the shortest control path marked.]
      • Improve control path balancing by
        – C source code re-factorization
        – User-control on code hoisting
        – Predicated execution in tail of long control paths
  • 11. Accelerated Control Processing: If-Else, No Predication
      Tmicro (gen.-purp. CPU)
      nML: Conditional jump instruction, 2-cycle branch penalty
      C: Condition at tail of long control path
      Machine code: Conditional jump with branch penalty: one of two delay slots filled, one ‘nop’ left
  • 12. Accelerated Control Processing: Predication
      Tnano (optimized ASIP)
      nML: Select instruction
      nML: Predication threshold
      C: Condition at tail of long control path
      Machine code:
        • Conditional code executes always
        • Result is used selectively → No branch penalty
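The contrast between a conditional jump and a select can be sketched in plain C. This is an illustration of the idea only, not the nML description or the generated machine code from the slides; the function names are hypothetical, and the mask-based form simply makes "both values are computed, the result is used selectively" explicit.

```c
#include <assert.h>
#include <stdint.h>

/* If-else form: typically compiles to a conditional jump. */
uint16_t pick_with_jump(uint16_t cond, uint16_t a, uint16_t b)
{
    if (cond) {
        return a;
    }
    return b;
}

/* Select form: both inputs are available, and the condition only
 * chooses which one is kept. No control-flow change is needed. */
uint16_t pick_with_select(uint16_t cond, uint16_t a, uint16_t b)
{
    /* mask is 0xFFFF when cond is non-zero, 0x0000 otherwise */
    uint16_t mask = (uint16_t)(0u - (unsigned)(cond != 0));
    return (uint16_t)((a & mask) | (b & (uint16_t)~mask));
}
```

Both functions return the same value for every input. On a processor with a select instruction, a compiler can often map the simple `cond ? a : b` form directly, so the explicit mask is usually unnecessary in real source code.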
  • 13. Accelerated Control Processing: If-Else with Multiple Tests
      Tmicro (gen.-purp. CPU)
      nML: Stand-alone compare instruction
      C: “If-else” with multiple tests
      Machine code: Multiple compare and c-jump instructions
      Slow in worst-case
  • 14. Accelerated Control Processing: If-Else with Multiple Tests
      Tnano (optimized ASIP)
      nML: “Compare + shortcut-logic” instruction
        CND &= Rj==Ri
        CND |= Rj!=Ri
      C: “If-else” with multiple tests
      Machine code:
        • Multiple “compare + shortcut-logic”
        • Single c-jump
      Worst case is always faster!
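The effect of the `CND &= Rj==Ri` and `CND |= Rj!=Ri` instructions can be modelled in C by accumulating the tests into one flag and branching once. The sketch below is an assumption-laden illustration: the example condition, the function names and the variable `cnd` are invented here, and the real instruction semantics are defined by the nML model, which the transcript does not reproduce.

```c
#include <assert.h>
#include <stdint.h>

/* Example condition: (a == ra && b == rb) || (c != rc).
 * Jump-per-test form: each comparison may cost its own conditional jump. */
int test_with_jumps(uint16_t a, uint16_t ra, uint16_t b, uint16_t rb,
                    uint16_t c, uint16_t rc)
{
    if (a == ra) {
        if (b == rb) {
            return 1;
        }
    }
    if (c != rc) {
        return 1;
    }
    return 0;
}

/* Flag-accumulating form: each line models one "compare + logic" step,
 * followed by a single decision on the combined flag. */
int test_with_cnd(uint16_t a, uint16_t ra, uint16_t b, uint16_t rb,
                  uint16_t c, uint16_t rc)
{
    unsigned cnd = (a == ra);   /* CND  = Rj==Ri */
    cnd &= (b == rb);           /* CND &= Rj==Ri */
    cnd |= (c != rc);           /* CND |= Rj!=Ri */
    return cnd ? 1 : 0;         /* single conditional jump */
}
```

Because the comparisons have no side effects, evaluating all of them gives the same result as C's short-circuit evaluation. The flag form trades a few always-executed compares for a fixed, branch-free sequence, which is what improves the worst case.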
  • 15. Accelerated Control Processing: Results – Header Parser
                                            Tmicro CPU      Tnano ASIP
      Rohc_parse program code size          347 x 16-bit    227 x 16-bit (-35%)
      Rohc_parse cycle count per packet     191             87 (-55%)
      Clock frequency (28nm HPM)            800 MHz         1 GHz (+25%)
      Gate count (core only, 28nm HPM)      14K gates       5.4K gates (-61%)
  • 17. Accelerated Data Processing
      • Implementation styles
        – Software on processor: too slow?
        – Hardware co-processors: (manual) design effort, synchronization challenge?
        – Hardware-accelerated instructions in ASIP instruction set: well supported by tools, potential for resource sharing!
      [Figure: compressor block diagram. Block labels: Header Parser, Header Field Encoder, Packet Modification Buffer, Feedback Buffer, Context Processor, CRC, Context Mem. Functions called out: CRC, WLSB encoder, Scaled / Timer-Based RTP Timestamp Compression, ….]
  • 18. Accelerated Data Processing: WLSB Encoder: SW Implementation
      Tmicro (gen.-purp. CPU)
      nML: General-purpose ALU: add, sub, shift, mask…
      C: Software implementation of WLSB encoder: for-loop with called function
      Machine code:
        • 30 instructions for called function
        • 6-packet test program: 2110 cycles
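The slide's C code is not reproduced in the transcript. As a rough idea of what a "for-loop with called function" WLSB encoder can look like, here is a minimal sketch based on the window-based LSB definition in RFC 3095: find the smallest number of bits k such that the value lies in the interpretation interval [v_ref - p, v_ref + 2^k - 1 - p] for every reference value in the window. The 16-bit field width, the function names and the parameter layout are assumptions; this is not the authors' implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Called function: returns 1 if v lies in the interpretation interval
 * [v_ref - p, v_ref + 2^k - 1 - p], with arithmetic modulo 2^16. */
int wlsb_in_interval(uint16_t v, uint16_t v_ref, unsigned k, uint16_t p)
{
    assert(k <= 16);
    uint32_t offset = (uint16_t)(v - v_ref + p);  /* distance from interval start */
    uint32_t size = (uint32_t)1 << k;             /* interval holds 2^k values */
    return offset < size;
}

/* For-loop over the window: smallest k (0..16) that works for every
 * reference value. The intervals only grow with k, so k never needs
 * to be reduced once it has been raised for an earlier reference. */
unsigned wlsb_min_bits(uint16_t v, const uint16_t *window, size_t n, uint16_t p)
{
    assert(window != NULL && n > 0);
    unsigned k = 0;
    for (size_t i = 0; i < n; i++) {
        while (k < 16 && !wlsb_in_interval(v, window[i], k, p)) {
            k++;
        }
    }
    return k;
}
```

With a window of {100, 101, 102}, p = 0 and a new value of 103, the largest distance is 3, so two bits suffice. The shift, subtract and compare pattern in the inner function is the kind of work a single hardware-accelerated instruction can absorb.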
  • 19. Accelerated Data Processing: WLSB Encoder: HW-Accelerated Instruction
      Tnano (optimized ASIP)
      nML (ISA view): WLSB encoder instruction, calling hardware primitive
      nML (behavioral view):
        • WLSB hardware primitive in bit-accurate C code
        • Auto-translated to RTL
      C: Intrinsic function call to WLSB encoder instruction
      Machine code:
        • Called function replaced by single instruction
        • 6-packet test program: 267 cycles (7.9x speedup)
  • 20. Accelerated Data Processing: Results: Adding HW-Accelerated Instructions
                                                 Tmicro CPU      Tnano ASIP      Tnano ASIP w/ WLSB instr
      WLSB 6-packet test program code size       134 x 16-bit    126 x 16-bit    84 x 16-bit (-33%)
      WLSB 6-packet test program cycle count     2122            2110            267 (-87%)
      Clock frequency (28nm HPM)                 800 MHz         1 GHz           1 GHz (0%)
      Gate count (core only, 28nm HPM)           14K gates       5.4K gates      6.3K gates (+16%)
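The percentages in the last column appear to be relative to the Tnano ASIP without the WLSB instruction, not to the Tmicro CPU. A small sketch of the arithmetic, with hypothetical helper names, makes this easy to check:

```c
#include <assert.h>

/* Relative change in percent, going from `before` to `after`. */
double relative_change_pct(double before, double after)
{
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}

/* Speedup factor from a reduction in cycle count. */
double speedup(double cycles_before, double cycles_after)
{
    assert(cycles_after > 0.0);
    return cycles_before / cycles_after;
}
```

Using the table values, `relative_change_pct(126, 84)` is about -33%, `relative_change_pct(2110, 267)` is about -87%, `relative_change_pct(5.4, 6.3)` is about +17% (shown as +16% on the slide), and `speedup(2110, 267)` is about 7.9.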
  • 22. Conclusions
      • Application-Specific Processors (ASIP)
        – Enable acceleration of control and data processing, similar to fixed-function hardware
        – Retain the flexibility of a software-programmable processor
      • ASIP Designer enables quick design of ASIPs
        – Architectural exploration: Compiler-in-the-Loop
        – SDK generation
        – RTL generation
      • Benefits illustrated with Robust Header Compression (ROHC) case study