© 2014, Esencia Technology Confidential
EScala Design Platform
Esencia Technologies Inc.
2014
What is EScala Technology?
■ A design platform that generates field-reprogrammable HDL IP cores and programs them efficiently
1. Core Generator: generates the high-performance IP core that best fits your data-processing algorithm and design goals
2. SDK: a highly optimizing C/C++ compiler that programs the generated core
RISC CPU Architecture
■ RISC CPUs like ARM, MIPS:
− Program Control
− Instruction Decoder
− ALU
− Register Bank
− Load Store Unit
[Figure: RISC CPU block diagram. Blocks: Program Memory, PC, Decoder, Register Bypass & Flow Control, General Purpose ALU, Load/Store, Data Memory Controller, bank of 32 General Purpose Registers, Exception / interrupt controller]
Typical RISC CPU Pipeline
■ RISC CPUs like ARM, MIPS:
− Program Control
− Instruction Decoder
− ALU
− Register Bank
− Load Store Unit
■ Pipeline stages: Fetch, Decode, Execute, Write Back
[Figure: RISC CPU block diagram with the stages Fetch, Decode, Execute and Write Back marked. Blocks: Program Memory, PC, Decoder, Register Bypass & Flow Control, General Purpose ALU, Load/Store, Data Memory Controller, bank of 32 General Purpose Registers, Exception / interrupt controller]
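To make the four stages concrete, the sketch below lists which instruction occupies each stage on every cycle. It assumes an ideal pipeline (one instruction issued per cycle, no stalls or hazards); the function name `pipeline_timeline` is illustrative and not part of any tool described in this deck.

```python
STAGES = ["Fetch", "Decode", "Execute", "Write Back"]


def pipeline_timeline(num_instructions):
    """Return one dict per cycle mapping stage name -> instruction index (or None).

    Assumes an ideal pipeline: one instruction enters Fetch per cycle
    and nothing ever stalls.
    """
    total_cycles = num_instructions + len(STAGES) - 1
    timeline = []
    for cycle in range(total_cycles):
        row = {}
        for depth, stage in enumerate(STAGES):
            instr = cycle - depth  # instruction that entered Fetch `depth` cycles ago
            row[stage] = instr if 0 <= instr < num_instructions else None
        timeline.append(row)
    return timeline


if __name__ == "__main__":
    for cycle, row in enumerate(pipeline_timeline(3)):
        print(cycle, row)
```

Under these assumptions, N instructions finish in N + 3 cycles, so throughput approaches one instruction per cycle once the pipeline is full.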
EScala-Core
[Figure: EScala generated core with a single slot (Slot 0). Blocks: Program Memory (RAM or ROM), PC, Decoder, Register Bypass & Flow Control, General Purpose ALU, Load/Store, Data Memory Controller and/or MMIO, Configurable Size General Purpose Register Bank, Exception / interrupt controller]
EScala-Core
[Figure: EScala generated core with two slots. Slot 0: Decoder, Load/Store, General Purpose ALU, Data Memory Controller and/or MMIO. Slot 1: Decode, Load/Store, Custom ALU, Data Memory Controller and/or MMIO. Other blocks: Program Memory (RAM or ROM), PC, Register Bypass & Flow Control, Configurable Size General Purpose Register Bank, Exception / interrupt controller]
EScala-Core
[Figure: EScala generated core with N slots (Slot 0, Slot 1, ... Slot N-2, Slot N-1). Slot 0: Decoder, Load/Store, General Purpose ALU, Data Memory Controller and/or MMIO. Each further slot: Decode, Load/Store, Custom ALU, Data Memory Controller and/or MMIO. Other blocks: Program Memory (RAM or ROM), PC, Register Bypass & Flow Control, Configurable Size General Purpose Register Bank, Exception / interrupt controller]
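A rough way to see what extra slots buy is to pack a small dependency graph into N issue slots per cycle. The sketch below is a toy greedy list scheduler with unit-latency operations; it is an illustration only and is not the scheduling algorithm used by the EScala tools. The operation names and the `schedule` function are hypothetical.

```python
def schedule(deps, slots):
    """Greedy list scheduling with unit latency.

    deps  -- dict mapping each operation to the set of operations it depends on
    slots -- number of operations that can issue per cycle
    Returns a list of cycles, each a list of the operations issued in that cycle.
    """
    done = set()
    remaining = dict(deps)
    cycles = []
    while remaining:
        # An operation is ready once all of its inputs were produced in earlier cycles.
        ready = sorted(op for op, needs in remaining.items() if needs <= done)
        if not ready:
            raise ValueError("dependency cycle")
        issued = ready[:slots]
        cycles.append(issued)
        done.update(issued)
        for op in issued:
            del remaining[op]
    return cycles


# Hypothetical expression tree: g = (a + b) * (c + d) with four independent loads.
DEPS = {
    "a": set(), "b": set(), "c": set(), "d": set(),
    "e": {"a", "b"}, "f": {"c", "d"},
    "g": {"e", "f"},
}

if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, "slots ->", len(schedule(DEPS, n)), "cycles")
```

In this example the cycle count drops from 7 (one slot) to 4 (two slots) to 3 (four slots) and then stops improving, because the dependency chain, not the slot count, becomes the limit.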
EScala Generated Embedded Subsystem
[Figure: EScala generated embedded subsystem. Blocks: EScala-core, Program Memory (RAM or ROM), DataMem banks, Multi-Port DataMem, High BW IO channels (MMIO), DMA, IO bridge, Peripheral bus, Timer, Intc, BDM, JTAG, UART. Outbound links: to bulk/external memory; to other EScala cores or dedicated accelerators]
EScala for HP Data-Plane processing
■ Can create arbitrarily complex topologies to exploit task-level parallelism
■ All PEs can access Bulk Memory (not shown in diagram)
[Figure: topology of nine PE blocks and a Main/Bulk Memory block]
EScala on Data-Plane processing
■ PEs are individually tailored to the code they run
■ Each PE is automatically configured for maximum ILP
■ High BW point-to-point connections possible through MMIO channels
■ No bus bottleneck
■ Built-in flow control
[Figure: a PE shown as an EScala-core with Program Memory, Data Mem Bank(s), optional DMA/AGU, and MMI / MMO ports]
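The point-to-point channel with built-in flow control can be modelled in software as a bounded FIFO between two workers. The sketch below is only an analogy under that assumption: threads stand in for PEs, `queue.Queue(maxsize=...)` stands in for an MMIO channel, and all names are hypothetical.

```python
import queue
import threading


def producer(channel, items):
    """Upstream PE: pushes items into the channel."""
    for item in items:
        channel.put(item)  # blocks while the channel is full (back-pressure)
    channel.put(None)      # end-of-stream marker


def consumer(channel, results):
    """Downstream PE: pulls items and processes them."""
    while True:
        item = channel.get()  # blocks while the channel is empty
        if item is None:
            break
        results.append(item * 2)  # stand-in for the PE's real processing


def run_pipeline(items, depth=2):
    """Connect one producer to one consumer through a channel of the given depth."""
    channel = queue.Queue(maxsize=depth)
    results = []
    workers = [
        threading.Thread(target=producer, args=(channel, items)),
        threading.Thread(target=consumer, args=(channel, results)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results


if __name__ == "__main__":
    print(run_pipeline([1, 2, 3, 4, 5]))
```

Because each pair of workers has its own channel, a slow consumer only stalls its own producer, which is the software counterpart of having no shared bus to contend for.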
EScala Applications
■ EScala offers a unique solution for the space where traditional RISC CPUs (ARM, MIPS…) run into their limitations and hand-optimized full-custom RTL is the traditional alternative
■ A broad range of compute-intensive algorithms benefits from the EScala Design Platform’s tailored solution. A subset of these includes:
− HD and 3D Video & Audio
− Video analytics (motion tracking, facial recognition, etc.)
− 2D and 3D graphics
− Motor / low latency embedded control
− Security
− Image Processing
− Speech recognition
− Baseband Algorithms
− High-Frequency Trading
− Deep Packet Inspection (QoS / Firewalls / Signature detection, etc.)
− Packet compression
EScala Design Platform Provides
■ Rapid Design Space Exploration
− Performance and cost of design points are rapidly generated
(typically in a few minutes instead of hours or days)
■ Rapid Design Entry
− Algorithms can be expressed in C/C++
− Uses simple high-level design constraints
■ Flexible Hardware and Software Platform
− Architecturally advantaged processing core for your problem that
can run multiple algorithms or target multiple functions
− Compiler and SDK to enable algorithm changes and updates after
silicon design is frozen or even post customer delivery
■ Design Deliverables
− Synthesizable HDL with guaranteed Design, Timing, and Physical
Closure
− Binary Object Code
− Simulation Model, Regression Testbench, Synthesis Scripts
Rapid Design Space Exploration
■ Quickly pick the best design from an array of automatically generated options
■ Web interface; jobs run in the cloud
■ Customer-dedicated VMs
■ Local installations available
Technology Overview
■ EScala Performance Evaluation Models
− Provides exact measurement of performance across different
design points as controlled by design constraints
■ EScala Core Generator
− Generates hardware design points for the designated algorithm(s)
− Generated designs are timing / design closure guaranteed
■ EScala Software Development Kit
− Maps C/C++ algorithms (programs) to executable binaries
− Used for design exploration while hardware design is flexible
− Locked for the specific chosen design allowing for multiple
algorithms, enhancement, modifications, and corrections to be
made after hardware design is frozen
EScala Algorithm Flow
[Figure: algorithm flow diagram. Blocks: C/C++ Source, GCC/G++, Object file, PE libs, EScala SDK, ISS Trace, EScala Generator, EScala Design Platform, RTL Core & Models]
EScala Key Benefits
■ Significant reduction in design time
− C/C++ high-level design entry
− Core is built out of pre-verified components
− Generates regression test environment
− Generates simulation model for system simulation
■ Area and power efficient
− Able to remove unused HW resources
− Ability to reuse hardware resources for multiple algorithms
■ Reduces risk
− Implement bug fixes post-silicon
− Code and features can be maintained or upgraded in the field
Conclusion
■ EScala combines the advantages of a programmable core with the
computational power of parallel processing typical for dedicated RTL
hardware implementations.
■ Designers using the EScala Design Platform see a major reduction in design cycle times compared to a traditional hand-written RTL approach.
■ EScala field re-programmability reduces the risk of silicon re-spins.
■ Ideal building block for efficient many-core configurations in highly demanding data-plane processing applications.
Please contact Esencia Technologies Inc. for more information or visit our web site.
Thank You!
Back-up
EScala: Core Generator
■ Processor Elements (PEs) are hand-written templated processors
■ Highly configurable and parametric
■ High level programmable (C/C++/…)
■ Multitude of Performance knobs
− Number of slots
− Configurable Register resources
− Configurable number of Memory banks
− Custom/user defined instructions
− Auto-Generated instructions (operator fusion)
− Unified / clustered register file
− Number of SIMD units / vector sizes
− Number of FP units
− 32 vs. 16 bit data-path
− Endianness, etc.
■ Resource Optimizations
− All unused resources are stripped from the PE at a fine-grained level
− Performance/Area trade-offs are controlled by the user
EScala: SDK toolset
■ Targets high-level languages (C/C++)
■ Toolset combines a configuration-unaware compiler (gcc) with a proprietary configuration-aware one
■ Back-end compiler steps are optimized by our compiler to match the processor configuration
− Instruction Scheduling
− Register renaming / allocation
− Memory un-aliasing
− Combo instruction generation, etc.
■ Source transformations are used only opportunistically
− Full / partial loop unrolling
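As an illustration of why partial loop unrolling helps a multi-slot core, the sketch below unrolls a dot product by four with separate accumulators, so the four multiply-adds in each iteration no longer depend on one another. It is written in Python purely for readability; the SDK operates on C/C++, and the function names here are hypothetical.

```python
def dot(a, b):
    """Rolled loop: every iteration depends on the previous value of acc."""
    acc = 0
    for i in range(len(a)):
        acc += a[i] * b[i]
    return acc


def dot_unrolled4(a, b):
    """Partially unrolled by 4 with independent accumulators.

    The four multiply-adds inside the main loop have no dependencies on each
    other, so a scheduler could issue them in parallel slots.
    """
    n = len(a)
    acc0 = acc1 = acc2 = acc3 = 0
    i = 0
    while i + 4 <= n:
        acc0 += a[i] * b[i]
        acc1 += a[i + 1] * b[i + 1]
        acc2 += a[i + 2] * b[i + 2]
        acc3 += a[i + 3] * b[i + 3]
        i += 4
    while i < n:  # remainder loop for lengths that are not a multiple of 4
        acc0 += a[i] * b[i]
        i += 1
    return acc0 + acc1 + acc2 + acc3
```

With integer inputs both versions return the same result; with floating-point inputs the changed summation order can alter rounding, which is one reason such transformations are applied selectively.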
EScala Algorithm Flow
■ Three levels of abstraction:
− Host (Linux x86) simulation
− EScala ISS
− RTL simulation
■ Same API across all three levels
Digital Design Flow Summary
1. Describe functionality in HDL
2. Synthesize HDL to gates of the target process
3. Lay out the netlist
4. Generate masks
5. Masks are used to etch/deposit… wafer
Performance comparison
EScala vs. MicroBlaze soft-core case study
Benchmark conditions
■ Synthesized on the smallest Virtex-6 device
− xc6vlx75t-2ff484
■ Area reported as the number of LUT+FF pairs used
− Excludes memories and specialized DSP primitives
− For MicroBlaze, area is constant across all benchmarks
● 2729 LUT+FF pairs
■ Performance reported at maximum frequency
− For MicroBlaze, max frequency is constant across all benchmarks
● Max freq: 197.6 MHz
■ EScala frequency and area are customized to the benchmarked application
Benchmark: IDCT
■ MPEG2/4 style 8x8 block
■ Chen-Wang algorithm (IEEE ASSP-32, Aug 1984)
− Intended to save operations, but has low ILP
■ Reference implementation by MPEG software simulation
group
■ 32-bit integer arithmetic (12-bit coefficients); 11 multiplies and 29 adds per 1-D IDCT (16 per block)
■ Performance results (fps) assume 6 IDCTs per macroblock (MB) in 4:2:0 format
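The frame rates reported for this benchmark follow from the IDCT rate and the frame size. The sketch below reproduces that arithmetic under two assumptions that the slides do not state explicitly: a macroblock covers 16x16 pixels, and VGA and 720p mean 640x480 and 1280x720. The function name `idct_fps` is illustrative.

```python
MACROBLOCK = 16    # assumed macroblock edge length in pixels
IDCTS_PER_MB = 6   # from the slide: 4 luma + 2 chroma 8x8 blocks in 4:2:0


def idct_fps(idcts_per_sec, width, height):
    """Frames per second sustainable at a given 8x8 IDCT rate."""
    mbs_per_frame = (width // MACROBLOCK) * (height // MACROBLOCK)
    return idcts_per_sec / (mbs_per_frame * IDCTS_PER_MB)


if __name__ == "__main__":
    # 61570 IDCTs/s is the first EScala row of the IDCT results table.
    print(round(idct_fps(61570, 640, 480), 1))    # VGA
    print(round(idct_fps(61570, 1280, 720), 1))   # 720p
```

With these assumptions a VGA frame needs 1200 x 6 = 7200 IDCTs and a 720p frame needs 3600 x 6 = 21600, which matches the fps columns of the results table to one decimal place.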
Benchmark: IDCT
Config  Ways  Period (ns)  Slices  SA/MB area  Clks  SA/MB perf  Freq (MHz)  Ops/sec  fps @ VGA  fps @ 720p  SA/MB perf @ max freq
SA1     1     8.174        9209    3.4         1987  1.17        122.3       61570    8.6        2.9         0.7
SA1     4     8.301        12497   4.6         983   2.37        120.5       122551   17.0       5.7         1.4
SA2     2     6.635        18656   6.8         1334  1.74        150.7       112980   15.7       5.2         1.3
SA2     3     6.693        21160   7.8         925   2.51        149.4       161524   22.4       7.5         1.9
SA2     4     6.808        27051   9.9         731   3.18        146.9       200938   27.9       9.3         2.4
SA2     5     7.388        36458   13.4        621   3.74        135.4       217962   30.3       10.1        2.6
MB      1     5.062        2729    n/a         2325  n/a         197.6       84968    11.8       3.9         n/a
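The derived columns in these result tables can be recomputed from the raw period, slice and cycle counts. The sketch below shows one reading of them, inferred from the numbers rather than stated on the slides: "SA/MB area" as a slice ratio, "SA/MB perf" as a cycle-count ratio, and "SA/MB perf @ max freq" as a ratio of operations per second. The function names are hypothetical.

```python
MB_SLICES = 2729      # MicroBlaze LUT+FF pairs (from the benchmark conditions)
MB_PERIOD_NS = 5.062  # MicroBlaze clock period (from the result tables)


def freq_mhz(period_ns):
    """Clock frequency in MHz for a given clock period in ns."""
    return 1000.0 / period_ns


def area_ratio(slices):
    """Area relative to MicroBlaze."""
    return slices / MB_SLICES


def cycle_speedup(clks, mb_clks):
    """Speed-up in clock cycles only, ignoring clock frequency."""
    return mb_clks / clks


def speedup_at_max_freq(clks, period_ns, mb_clks):
    """Speed-up in wall-clock time with each core at its own max frequency."""
    return (freq_mhz(period_ns) / clks) / (freq_mhz(MB_PERIOD_NS) / mb_clks)


if __name__ == "__main__":
    # IDCT row "SA2, 4 ways": period 6.808 ns, 27051 slices, 731 clks; MicroBlaze: 2325 clks.
    print(round(freq_mhz(6.808), 1))
    print(round(area_ratio(27051), 1))
    print(round(cycle_speedup(731, 2325), 2))
    print(round(speedup_at_max_freq(731, 6.808, 2325), 1))
```

The gap between the last two figures shows the cost of the lower clock frequency of the wider configurations relative to MicroBlaze.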
Benchmark: SHA-1 Hashing
■ As per FIPS PUB 180-1 (April 17, 1995)
■ Processes a series of 512-bit message blocks
■ Last block is padded to 512-bit as per standard
■ Result is a 160-bit hash
■ Sequential operation in nature
− Latency limits overall performance
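The block and digest sizes above, and the way a block rate converts to a bit rate, can be checked with a few lines of Python. This is a sketch for orientation only: it uses the standard `hashlib` module rather than the benchmarked implementation, and the helper names are hypothetical.

```python
import hashlib

BLOCK_BITS = 512


def padded_blocks(message_len_bytes):
    """Number of 512-bit blocks after SHA-1 padding.

    Padding appends one 0x80 byte, zero bytes, and an 8-byte length field.
    """
    return (message_len_bytes + 1 + 8 + 63) // 64


def throughput_mbps(blocks_per_sec):
    """Convert a block rate into megabits per second."""
    return blocks_per_sec * BLOCK_BITS / 1e6


if __name__ == "__main__":
    digest = hashlib.sha1(b"abc").digest()
    print(len(digest) * 8)                     # digest size in bits
    print(padded_blocks(55), padded_blocks(56))
    print(round(throughput_mbps(86569), 1))    # 1-way EScala row of the SHA-1 results
```

A 55-byte message still fits in one block, while a 56-byte message spills into a second block because the length field no longer fits.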
Benchmark: SHA-1 Hashing
Config  Ways  Period (ns)  Slices  SA/MB area  Clks  SA/MB perf  Freq (MHz)  Blks/sec  SA/MB perf @ max freq  Mbps
SA      1     8.078        4519    1.7         1430  2.4         123.8       86569     1.5                    44.3
SA      2     7.359        7434    2.7         803   4.2         135.9       169225    2.9                    86.6
SA      3     7.281        10067   3.7         618   5.5         137.3       222239    3.8                    113.8
SA      4     7.900        13996   5.1         537   6.3         126.6       235721    4.0                    120.7
MB      1     5.062        2729    n/a         3391  n/a         197.6       58257     n/a                    29.8
Benchmark: AES encryption
■ Rijndael algorithm as per FIPS PUB 197
■ Block size of 128-bits
■ Key size of 128-bits
■ Sequential application in nature (due to chaining modes)
− Overall latency limits performance
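The sequential nature of chaining modes can be seen in a short sketch. CBC is used here as an assumed example of such a mode, since the slide does not name one, and `toy_block_encrypt` is a deliberately trivial placeholder that is neither AES nor secure; only the data dependency between blocks is the point.

```python
BLOCK_BYTES = 16  # 128-bit block, as on the slide


def toy_block_encrypt(block, key):
    """Placeholder for the block cipher: XOR with the key, then rotate one byte.

    NOT AES and NOT secure; it only stands in for a keyed permutation.
    """
    mixed = bytes(b ^ k for b, k in zip(block, key))
    return mixed[1:] + mixed[:1]


def cbc_encrypt(plaintext, key, iv):
    """CBC chaining: each block is XORed with the previous ciphertext block."""
    if len(plaintext) % BLOCK_BYTES != 0:
        raise ValueError("plaintext must be a multiple of the block size")
    prev = iv
    out = []
    for i in range(0, len(plaintext), BLOCK_BYTES):
        block = plaintext[i:i + BLOCK_BYTES]
        chained = bytes(p ^ c for p, c in zip(block, prev))
        prev = toy_block_encrypt(chained, key)  # cannot start before the previous block is done
        out.append(prev)
    return b"".join(out)
```

Because block i needs the ciphertext of block i-1, blocks cannot be encrypted in parallel; the only parallelism left is inside a single block, which is why per-block latency bounds throughput in this benchmark.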
Benchmark: AES encryption
Config  Ways  Period (ns)  Slices  SA/MB area  Clks  SA/MB perf  Freq (MHz)  Mblks/sec  Mbps   SA/MB perf @ max freq
SA      1     7.582        6711    2.5         331   3.6         131.9       0.40       51.0   2.4
SA      2     7.440        10518   3.9         170   7.0         134.4       0.79       101.2  4.8
SA      3     6.343        12782   4.7         123   9.7         157.7       1.28       164.1  7.8
SA      4     6.694        15886   5.8         102   11.7        149.4       1.46       187.5  8.9
SA      5     6.774        20020   7.3         91    13.2        147.6       1.62       207.6  9.8
SA      6     6.797        23596   8.6         82    14.6        147.1       1.79       229.7  10.9
SA      7     6.955        26131   9.6         76    15.8        143.8       1.89       242.2  11.5
MB      1     5.062        2729    n/a         1198  n/a         197.6       0.16       21.1   n/a
Benchmark: Color Space Conv.
■ Performs y = M * x + b
■ M a 3x3 coefficient matrix
■ y,x,b are 3x1 vectors
■ x input color triple (in 4:4:4 sampling format)
■ y output color triple
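A minimal sketch of y = M * x + b for one pixel follows. The slides do not say which matrix the benchmark used; the coefficients below are the full-range BT.601 RGB to YCbCr values (JFIF convention), chosen here only as a familiar example, and `color_convert` is a hypothetical name.

```python
# Assumed example coefficients: full-range BT.601 RGB -> YCbCr (JFIF convention).
M = [
    [0.299, 0.587, 0.114],
    [-0.168736, -0.331264, 0.5],
    [0.5, -0.418688, -0.081312],
]
B = [0.0, 128.0, 128.0]


def color_convert(x, m=M, b=B):
    """Return y = m * x + b for a single 3-component pixel."""
    return [sum(m[r][c] * x[c] for c in range(3)) + b[r] for r in range(3)]


if __name__ == "__main__":
    print([round(v) for v in color_convert([255, 255, 255])])  # white
    print([round(v) for v in color_convert([0, 0, 0])])        # black
```

Each pixel costs 9 multiplies and 9 adds, and the three output rows are independent of one another, which gives a multi-slot core plenty of parallel work.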
Benchmark: Color Space Conv.
Config  Ways  Period (ns)  Slices  SA/MB area  Clks   SA/MB perf  Freq (MHz)  Clks/pix  Mpix/s  fps @ VGA  fps @ 720p  SA/MB perf @ max freq
SA      1     7.300        6889    2.5         10132  6.9         137.0       33.8      4.1     13.2       4.4         4.8
SA      2     6.680        8121    3.0         5197   13.5        149.7       17.3      8.6     28.1       9.38        10.2
SA      3     6.668        13013   4.8         3601   19.4        150.0       12.0      12.5    40.7       13.56       14.7
SA      4     6.565        16129   5.9         2789   25.1        152.3       9.3       16.4    53.3       17.78       19.3
SA      5     6.719        21280   7.8         2368   29.5        148.8       7.9       18.9    61.4       20.5        22.2
SA      6     7.395        25011   9.2         2098   33.3        135.2       7.0       19.3    62.9       20.98       22.8
SA      7     6.760        29054   10.6        1917   36.5        147.9       6.4       23.2    75.4       25.1        27.3
SA      8     6.532        34586   12.7        1737   40.2        153.1       5.8       26.4    86.1       28.69       31.2
SA      9     7.143        36866   13.5        1678   41.7        140.0       5.6       25.0    81.5       27.16       29.5
MB      1     5.062        2729    n/a         69900  n/a         197.6       233.00    0.85    2.76       0.92        n/a
Benchmark: FIR filtering
■ Implements the BT.601-7 high-quality interpolation filter
■ Coefficients quantized to 12 bits; 8-bit input samples
■ 26-tap symmetric filter
■ Applied to the 2 chroma components in a 4:2:2 to 4:4:4 up-sampling application (two filtering operations per pixel)
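A symmetric filter lets mirrored samples be added before multiplying, which halves the multiplier count (13 instead of 26 for this filter). The sketch below shows that restructuring with made-up integer coefficients; the real BT.601 coefficients are not reproduced here, and whether the benchmarked code used this trick is not stated on the slides.

```python
def fir_direct(x, h):
    """Plain FIR: one multiply per tap per output sample."""
    n = len(h)
    return [sum(h[k] * x[i + k] for k in range(n)) for i in range(len(x) - n + 1)]


def fir_symmetric(x, half):
    """FIR for an even-length symmetric filter given only its first half.

    The full tap vector is half + reversed(half); mirrored input samples are
    added first so each coefficient is multiplied only once per output.
    """
    n = 2 * len(half)
    out = []
    for i in range(len(x) - n + 1):
        acc = 0
        for k, coeff in enumerate(half):
            acc += coeff * (x[i + k] + x[i + n - 1 - k])
        out.append(acc)
    return out


# Hypothetical 13 coefficients standing in for half of a 26-tap filter.
HALF = list(range(1, 14))
FULL = HALF + HALF[::-1]
```

Both functions produce identical outputs for integer inputs; the symmetric form trades 13 multiplies for 13 extra additions per output sample.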
Benchmark: FIR filtering
Config  Ways  Period (ns)  Slices  SA/MB area  Clks    SA/MB perf  Freq (MHz)  Clks/pix  Mpix/s  fps @ VGA  fps @ 720p  SA/MB perf @ max freq
SA      1     6.602        3641    1.3         8634    42.6        151.5       86.34     1.75    5.71       1.90        32.6
SA      2     6.624        6902    2.5         4527    81.2        151.0       45.27     3.33    10.86      3.62        62.1
SA      3     6.458        7597    2.8         3128    117.5       154.8       31.28     4.95    16.11      5.37        92.1
SA      4     6.884        10467   3.8         2828    130.0       145.3       28.28     5.14    16.72      5.57        95.6
SA      5     7.526        12527   4.6         2678    137.3       132.9       26.78     4.96    16.15      5.38        92.3
SA      6     7.853        15940   5.8         2628    139.9       127.3       26.28     4.85    15.77      5.26        90.2
MB      1     5.062        2729    n/a         367600  n/a         197.6       3676.00   0.05    0.17       0.06        n/a
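Picking a design point from a set of generated options, as described under Rapid Design Space Exploration, amounts to discarding every option that another option beats on both area and performance. The sketch below applies that Pareto filter to the six EScala rows of the FIR results (slices and Mpix/s); the function name and the labels are hypothetical, and the actual selection logic of the EScala tools is not described on the slides.

```python
def pareto_front(points):
    """Return the labels of points not dominated by any other point.

    points -- list of (label, area, perf); lower area and higher perf are better.
    """
    front = []
    for label, area, perf in points:
        dominated = any(
            (a <= area and p >= perf) and (a < area or p > perf)
            for _, a, p in points
        )
        if not dominated:
            front.append(label)
    return front


# (label, slices, Mpix/s) taken from the FIR filtering results.
FIR_POINTS = [
    ("1-way", 3641, 1.75),
    ("2-way", 6902, 3.33),
    ("3-way", 7597, 4.95),
    ("4-way", 10467, 5.14),
    ("5-way", 12527, 4.96),
    ("6-way", 15940, 4.85),
]

if __name__ == "__main__":
    print(pareto_front(FIR_POINTS))
```

For this data the 5-way and 6-way points drop out: they use more area than the 4-way point yet deliver less throughput, because their lower clock frequency outweighs the small saving in cycle count.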
Comparison Conclusion
■ Highly scalable core generator
■ Application driven configuration
■ Programmability optionally removed for efficient single purpose
implementations (FPGA already reprogrammable!)
■ Slots can be configured with high level of granularity
■ Considerable performance gains compared to a traditional RISC soft core such as MicroBlaze
Thank You
Esencia Technologies Inc.
2014

 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 

EScala Design Platform

  • 1. © 2014, Esencia Technology Confidential EScala Design Platform Esencia Technologies Inc. 2014
  • 2. What is EScala Technology ■ Design Platform that generates field-reprogrammable HDL IP cores and programs them efficiently 1. Core Generator: You can easily generate the high-performance IP core that best fits your data processing algorithm needs and design goals 2. SDK: Highly optimizing C/C++ compiler to program the generated core
  • 3. RISC CPU Architecture ■ RISC CPUs like ARM, MIPS: − Program Control − Instruction Decoder − ALU − Register Bank − Load Store Unit [Diagram blocks: Program Memory, PC, Decoder, Load/Store, General Purpose ALU, Register Bypass & Flow Control, 32 General Purpose Registers Bank, Data Memory Controller, Exception / interrupt controller]
  • 4. Typical RISC CPU Pipeline ■ RISC CPUs like ARM, MIPS: − Program Control − Instruction Decoder − ALU − Register Bank − Load Store Unit [Diagram: same blocks as slide 3, annotated with the pipeline stages Fetch, Decode, Execute, Write Back]
  • 5. EScala Generated Core [Diagram: EScala-Core with one slot (Slot 0). Blocks: Program Memory (RAM or ROM), PC, Decoder, Load/Store, General Purpose ALU, Register Bypass & Flow Control, Exception / interrupt controller, Configurable Size General Purpose Register Bank, Data Memory Controller AND/OR MMIO]
  • 6. EScala Generated Core [Diagram: EScala-Core with two slots (Slot 0, Slot 1). Blocks as on slide 5, plus one additional Decode, Load/Store, Custom ALU group and a second Data Memory Controller AND/OR MMIO]
  • 7. EScala Generated Core [Diagram: EScala-Core with N slots (Slot 0, Slot 1, Slot N-2, Slot N-1). Blocks as on slide 5, plus three Decode, Load/Store, Custom ALU groups and four Data Memory Controller AND/OR MMIO blocks in total]
  • 8. EScala Generated Embedded Subsystem [Diagram labels: EScala-core, Program Memory (RAM or ROM), DataMem, Multi-Port DataMem, High BW IO channels (MMIO), IO bridge, Peripheral bus, Timer, Intc, BDM, JTAG, UART, DMA, "To Bulk/External Mem", "To other EScala cores or dedicated accelerators"]
  • 9. EScala for HP Data-Plane processing ■ Can create arbitrarily complex topologies to exploit task-level parallelism ■ All PEs can access Bulk Memory (not shown in diagram) [Diagram: nine PE blocks and a Main/Bulk Memory block]
  • 10. EScala on Data-Plane processing ■ PEs are individually tailored to the code they run ■ Each PE is automatically configured for maximum ILP ■ High BW point-to-point connections possible through MMIO channels ■ No bus bottleneck ■ Built-in flow control [Diagram: a PE containing an EScala-core, Program Memory, Data Mem Bank(s), and MMI/MMO ports; legend entry: Optional DMA/AGU]
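The point-to-point channels with built-in flow control described on slide 10 can be pictured with a small software analogy. The following is only a sketch under stated assumptions: each PE is modelled as a thread, each MMIO channel as a bounded queue, and the names `pe`, `run_pipeline`, and `SENTINEL` are hypothetical and not part of the EScala tools. A full queue blocks the producer, which is the back-pressure behaviour that flow control provides.

```python
import queue
import threading

SENTINEL = None  # end-of-stream marker (an assumption of this sketch)

def pe(work, inbox, outbox):
    """Model one PE: read from its input channel, apply work(), write downstream."""
    while True:
        item = inbox.get()            # blocks while the input channel is empty
        if item is SENTINEL:
            outbox.put(SENTINEL)      # propagate end-of-stream
            return
        outbox.put(work(item))        # blocks while the downstream channel is full

def run_pipeline(stages, samples, depth=2):
    """Chain the stages with one bounded point-to-point channel per link."""
    channels = [queue.Queue(maxsize=depth) for _ in range(len(stages) + 1)]
    workers = [
        threading.Thread(target=pe, args=(fn, channels[i], channels[i + 1]))
        for i, fn in enumerate(stages)
    ]
    results = []

    def drain():
        while True:
            item = channels[-1].get()
            if item is SENTINEL:
                return
            results.append(item)

    sink = threading.Thread(target=drain)
    for t in workers + [sink]:
        t.start()
    for s in samples:
        channels[0].put(s)            # the source is throttled by the first channel
    channels[0].put(SENTINEL)
    for t in workers + [sink]:
        t.join()
    return results

print(run_pipeline([lambda v: v + 1, lambda v: v * 2], range(5)))
```

There is no shared bus in this model: every link has its own queue, so a slow stage only stalls its direct neighbours.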
  • 11. EScala Applications ■ EScala offers a unique solution for the space where traditional RISC CPUs (ARM, MIPS…) run into their limitations and hand-optimized full-custom RTL is the traditional alternative ■ A broad range of compute-intensive algorithms benefits greatly from the EScala Design Platform’s tailored solution. A subset of these includes: − HD and 3D Video & Audio − Video analytics (motion tracking, facial recognition, etc.) − 2D and 3D graphics − Motor / low latency embedded control − Security − Image Processing − Speech recognition − Baseband Algorithms − High-Frequency Trading − Deep Packet Inspection (QoS / Firewalls / Signature detection…) − Packet compression
  • 12. EScala Design Platform Provides ■ Rapid Design Space Exploration − Performance and cost of design points are rapidly generated (typically in a few minutes instead of hours or days) ■ Rapid Design Entry − Algorithms can be expressed in C/C++ − Uses simple high-level design constraints ■ Flexible Hardware and Software Platform − Architecturally advantaged processing core for your problem that can run multiple algorithms or target multiple functions − Compiler and SDK to enable algorithm changes and updates after silicon design is frozen or even post customer delivery ■ Design Deliverables − Synthesizable HDL with guaranteed Design, Timing, and Physical Closure − Binary Object Code − Simulation Model, Regression Testbench, Synthesis Scripts
  • 13. Rapid Design Space Exploration ■ Quickly pick the best design from an array of automatically generated options ■ Web interface; jobs run in the cloud ■ Customer-dedicated VMs ■ Local installations available
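One common way to narrow an array of generated options is to keep only the design points that no other point beats on both area and throughput (a Pareto filter). The sketch below is an illustration, not a description of how the EScala tools rank designs; the function name `pareto_front` is hypothetical, and the data are the EScala (SA) area and Ops/sec figures from the IDCT benchmark slide of this deck.

```python
def pareto_front(points):
    """points: (name, area, throughput). Keep points that no other point
    beats on both axes (smaller area and higher throughput)."""
    front = []
    best = float("-inf")
    # Walk from the smallest design upwards; a larger design is only worth
    # keeping if it is faster than everything smaller than it.
    for name, area, perf in sorted(points, key=lambda p: (p[1], -p[2])):
        if perf > best:
            front.append(name)
            best = perf
    return front

# (configuration, LUT+FF pairs, IDCT ops/sec) from the IDCT results slide
idct_points = [
    ("SA1 x1", 9209, 61570),
    ("SA1 x4", 12497, 122551),
    ("SA2 x2", 18656, 112980),
    ("SA2 x3", 21160, 161524),
    ("SA2 x4", 27051, 200938),
    ("SA2 x5", 36458, 217962),
]
print(pareto_front(idct_points))
```

With these figures the 2-way SA2 point drops out, because the 4-way SA1 point is both smaller and faster.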
  • 14. Technology Overview ■ EScala Performance Evaluation Models − Provide exact measurement of performance across different design points as controlled by design constraints ■ EScala Core Generator − Generates hardware design points for the designated algorithm(s) − Generated designs are timing / design closure guaranteed ■ EScala Software Development Kit − Maps C/C++ algorithms (programs) to executable binaries − Used for design exploration while the hardware design is still flexible − Locked to the chosen design, allowing multiple algorithms, enhancements, modifications, and corrections after the hardware design is frozen
  • 15. EScala Algorithm Flow [Flow diagram labels: C/C++ Source, Object file, GCC/G++, PE libs, EScala Generator, EScala SDK, ISS, Trace, EScala Design Platform, RTL Core & Models]
  • 16. EScala Key Benefits ■ Significant reduction in design time − C/C++ high-level design entry − Core is built out of pre-verified components − Generates regression test environment − Generates simulation model for system simulation ■ Area and power efficient − Able to remove unused HW resources − Ability to reuse hardware resources for multiple algorithms ■ Reduces risk − Implement bug fixes post-silicon − Allows code/features to be maintained or upgraded in the field
  • 17. Conclusion ■ EScala combines the advantages of a programmable core with the computational power of parallel processing typical of dedicated RTL hardware implementations. ■ Designers using the EScala Design Platform experience a major reduction in design cycle times compared to a traditional handwritten RTL approach. ■ EScala field re-programmability reduces the risk of silicon re-spins. ■ Ideal building block to generate efficient many-core configurations for highly demanding data-plane processing applications. Please contact Esencia Technologies Inc. for more information or visit our web site!
  • 18. Thank You!
  • 19. Back-up
  • 20. EScala: Core Generator ■ Processor Elements (PEs) are hand-written templated processors ■ Highly configurable and parametric ■ High-level programmable (C/C++/…) ■ Multitude of performance knobs − Number of slots − Configurable register resources − Configurable number of memory banks − Custom/user-defined instructions − Auto-generated instructions (operator fusion) − Unified / clustered register file − Number of SIMD units / vector sizes − Number of FP units − 32 vs. 16 bit data-path − Endianness etc. ■ Resource Optimizations − PE is stripped of all unused resources at a fine-grain level − Performance/area trade-offs are controlled by the user
  • 21. EScala: SDK toolset ■ Targets high-level languages (C/C++) ■ Toolset uses a combination of a configuration-unaware compiler (GCC) and a proprietary configuration-aware one ■ Back-end compiler steps are optimized by our compiler to match the processor configuration − Instruction scheduling − Register renaming / allocation − Memory un-aliasing − Combo instruction generation, etc. ■ Source transformations are used only opportunistically − Full / partial loop unrolling
  • 22. EScala Algorithm Flow ■ Three levels of abstraction: ■ Host (Linux x86) simulation ■ EScala ISS ■ RTL simulation ■ Same API across all three
  • 23. Digital Design Flow Summary 1. Describe functionality in HDL 2. Synthesize HDL to gates of the target process 3. Lay out the netlist 4. Generate masks 5. Masks are used to etch/deposit… wafer
  • 24. Performance comparison: EScala vs. MicroBlaze soft-core case study
  • 25. Benchmark conditions ■ Synthesized on the smallest Virtex-6 device − xc6vlx75t-2ff484 ■ Area reported as the number of LUT+FF pairs used − Memories and specialized DSP primitives excluded − For MicroBlaze, area is constant across all benchmarks ● 2729 LUT+FF pairs ■ Performance reported at maximum frequency − For MicroBlaze, maximum frequency is constant across all benchmarks ● Max freq: 197.6 MHz ■ EScala frequency/area is customized to the benchmarked application
  • 26. Benchmark: IDCT ■ MPEG2/4 style 8x8 block ■ Chen-Wang algorithm (IEEE ASSP-32, Aug 1984) − Intended to save operations, but has low ILP ■ Reference implementation by the MPEG Software Simulation Group ■ 32-bit integer arithmetic (12-bit coefficients); 11 mults, 29 adds per DCT (16) ■ Perf. results (fps) reported assuming 6 IDCTs per MB performed (4:2:0 format)
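The frame-rate figures follow from macroblock arithmetic. A minimal sketch, assuming 16x16 macroblocks, VGA taken as 640x480 and 720p as 1280x720 (the slide does not state the resolutions), with hypothetical function names:

```python
def idcts_per_frame(width, height, idcts_per_mb=6, mb_size=16):
    """8x8 IDCTs needed for one 4:2:0 frame: 4 luma + 2 chroma blocks per macroblock."""
    macroblocks = (width // mb_size) * (height // mb_size)
    return macroblocks * idcts_per_mb

def frames_per_second(idcts_per_sec, width, height):
    """Frame rate sustained by a core that completes idcts_per_sec IDCTs each second."""
    return idcts_per_sec / idcts_per_frame(width, height)

# 61570 IDCTs/s is the Ops/sec value the deck reports for the 1-way SA1 row.
print(idcts_per_frame(640, 480))                       # 7200
print(round(frames_per_second(61570, 640, 480), 1))    # 8.6
print(round(frames_per_second(61570, 1280, 720), 1))   # 2.9
```

Under these assumptions the results agree with the fps @ VGA and fps @ 720p values reported for that row.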
  • 27. Benchmark: IDCT
    config  ways  period (ns)  slices  SA/MB area  clks  SA/MB perf  Freq (MHz)  Ops/sec  fps @ VGA  fps @ 720p  SA/MB perf @ max freq
    SA1     1     8.174        9209    3.4         1987  1.17        122.3       61570    8.6        2.9         0.7
    SA1     4     8.301        12497   4.6         983   2.37        120.5       122551   17.0       5.7         1.4
    SA2     2     6.635        18656   6.8         1334  1.74        150.7       112980   15.7       5.2         1.3
    SA2     3     6.693        21160   7.8         925   2.51        149.4       161524   22.4       7.5         1.9
    SA2     4     6.808        27051   9.9         731   3.18        146.9       200938   27.9       9.3         2.4
    SA2     5     7.388        36458   13.4        621   3.74        135.4       217962   30.3       10.1        2.6
    MB      1     5.062        2729    n/a         2325  n/a         197.6       84968    11.8       3.9         n/a
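The derived columns in the benchmark tables appear to follow from the period, slice, and clock counts. The sketch below reproduces them for one row under that reading; the column interpretations (area ratio = slices / MicroBlaze slices, cycle ratio = MicroBlaze clks / clks, throughput = frequency / clks) are inferred from the numbers, not stated in the deck, and the function name `derived` is hypothetical.

```python
MB_SLICES = 2729       # MicroBlaze area in LUT+FF pairs (benchmark conditions slide)
MB_CLKS = 2325         # MicroBlaze cycles per IDCT (MB row of the IDCT table)
MB_FREQ_MHZ = 197.6    # MicroBlaze maximum frequency

def derived(period_ns, slices, clks):
    """Recompute the derived columns of one table row from its raw measurements."""
    freq_mhz = 1000.0 / period_ns                  # clock period -> frequency
    ops_per_sec = freq_mhz * 1e6 / clks            # one operation every `clks` cycles
    mb_ops_per_sec = MB_FREQ_MHZ * 1e6 / MB_CLKS
    return {
        "freq_mhz": freq_mhz,
        "area_ratio": slices / MB_SLICES,          # "SA/MB area"
        "cycle_ratio": MB_CLKS / clks,             # "SA/MB perf" (same clock assumed)
        "ops_per_sec": ops_per_sec,
        "perf_ratio_at_max_freq": ops_per_sec / mb_ops_per_sec,
    }

row = derived(period_ns=6.808, slices=27051, clks=731)   # SA2, 4 ways
for name, value in row.items():
    print(f"{name}: {value:.2f}")
```

The two ratios differ because the cycle ratio ignores clock speed, while the last column also accounts for MicroBlaze reaching a higher maximum frequency.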
  • 28. Benchmark: SHA-1 Hashing ■ As per FIPS PUB 180-1 (Apr 17th, 1995) ■ Processes a series of 512-bit message blocks ■ Last block is padded to 512 bits as per the standard ■ Result is a 160-bit hash ■ Sequential operation in nature − Latency limits overall performance
  • 29. Benchmark: SHA-1 Hashing
    config  ways  period (ns)  slices  SA/MB area  clks  SA/MB perf  Freq (MHz)  Blks/sec  SA/MB perf @ max freq  Mbps
    SA      1     8.078        4519    1.7         1430  2.4         123.8       86569     1.5                    44.3
    SA      2     7.359        7434    2.7         803   4.2         135.9       169225    2.9                    86.6
    SA      3     7.281        10067   3.7         618   5.5         137.3       222239    3.8                    113.8
    SA      4     7.900        13996   5.1         537   6.3         126.6       235721    4.0                    120.7
    MB      1     5.062        2729    n/a         3391  n/a         197.6       58257     n/a                    29.8
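The block structure of SHA-1 and the Mbps column can be checked with a few lines. This is a sketch using Python's standard `hashlib`, not the benchmarked implementation; `padded_blocks` and `mbps` are hypothetical helper names, and the Mbps formula (blocks per second times 512 bits) is inferred from the table.

```python
import hashlib

BLOCK_BITS = 512

def padded_blocks(message_len_bytes):
    """512-bit blocks processed after SHA-1 padding:
    one 0x80 byte, zero fill, then a 64-bit message length."""
    total = message_len_bytes + 1 + 8
    return -(-total // 64)          # ceiling division by the 64-byte block size

def mbps(blocks_per_sec):
    """Hashing throughput in megabits per second."""
    return blocks_per_sec * BLOCK_BITS / 1e6

print(len(hashlib.sha1(b"abc").digest()) * 8)   # 160
print(padded_blocks(55), padded_blocks(56))     # 1 2
print(round(mbps(86569), 1))                    # 44.3
```

A 56-byte message already needs a second block, because the padding byte and the length field no longer fit in the first one.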
  • 30. Benchmark: AES encryption ■ Rijndael algorithm as per FIPS PUB 197 ■ Block size of 128 bits ■ Key size of 128 bits ■ Sequential application in nature (due to chaining modes) − Overall latency limits performance
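Why a chaining mode makes encryption sequential can be seen in a short sketch. The slide does not say which mode was benchmarked; CBC is used here purely as an example, and `toy_block_encrypt` is a stand-in that is NOT AES and offers no security. It only keeps the example self-contained.

```python
BLOCK_BYTES = 16  # 128-bit block, as in the benchmark

def toy_block_encrypt(block, key):
    """Placeholder for AES-128 block encryption. NOT a real cipher."""
    return bytes(((b ^ k) + 1) % 256 for b, k in zip(block, key))

def cbc_encrypt(plaintext, key, iv):
    """CBC chaining: C[i] = E(P[i] XOR C[i-1]), with C[-1] = IV."""
    assert len(plaintext) % BLOCK_BYTES == 0
    prev = iv
    out = []
    for i in range(0, len(plaintext), BLOCK_BYTES):
        block = plaintext[i:i + BLOCK_BYTES]
        mixed = bytes(p ^ c for p, c in zip(block, prev))
        prev = toy_block_encrypt(mixed, key)   # needs the previous ciphertext block
        out.append(prev)
    return b"".join(out)

ciphertext = cbc_encrypt(b"A" * 32, key=bytes(16), iv=bytes(16))
print(ciphertext[:16] != ciphertext[16:])      # True: identical inputs, different outputs
```

Block i cannot start until block i-1 has finished, so adding execution slots can only shorten the latency of a single block rather than process several blocks at once.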
  • 31. Benchmark: AES encryption
    config  ways  period (ns)  slices  SA/MB area  clks  SA/MB perf  Freq (MHz)  Mblks/sec  Mbps   SA/MB perf @ max freq
    SA      1     7.582        6711    2.5         331   3.6         131.9       0.40       51.0   2.4
    SA      2     7.440        10518   3.9         170   7.0         134.4       0.79       101.2  4.8
    SA      3     6.343        12782   4.7         123   9.7         157.7       1.28       164.1  7.8
    SA      4     6.694        15886   5.8         102   11.7        149.4       1.46       187.5  8.9
    SA      5     6.774        20020   7.3         91    13.2        147.6       1.62       207.6  9.8
    SA      6     6.797        23596   8.6         82    14.6        147.1       1.79       229.7  10.9
    SA      7     6.955        26131   9.6         76    15.8        143.8       1.89       242.2  11.5
    MB      1     5.062        2729    n/a         1198  n/a         197.6       0.16       21.1   n/a
  • 32. Benchmark: Color Space Conv. ■ Performs y = M * x + b ■ M is a 3x3 coefficient matrix ■ y, x, b are 3x1 vectors ■ x is the input color triple (in 4:4:4 sampling format) ■ y is the output color triple
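The operation y = M * x + b is nine multiplies and nine adds per pixel. A minimal sketch follows; the slide does not give the benchmark's coefficients, so the matrix below (full-range BT.601 RGB to YCbCr, as used by JFIF) is an assumed example, and `color_convert` is a hypothetical name.

```python
def color_convert(M, x, b):
    """Return y = M * x + b for a 3x3 matrix M and 3-element vectors x and b."""
    return [sum(M[r][c] * x[c] for c in range(3)) + b[r] for r in range(3)]

# Assumed example coefficients: full-range BT.601 RGB -> YCbCr.
M = [
    [ 0.299,     0.587,     0.114   ],
    [-0.168736, -0.331264,  0.5     ],
    [ 0.5,      -0.418688, -0.081312],
]
b = [0.0, 128.0, 128.0]

white = color_convert(M, [255, 255, 255], b)
print([round(v) for v in white])   # [255, 128, 128]
```

The three output rows are independent of one another, which is the kind of instruction-level parallelism that additional slots can exploit.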
  • 33. Benchmark: Color Space Conv.
    config  ways  period (ns)  slices  SA/MB area  clks   SA/MB perf  Freq (MHz)  clks/pix  Mpix/s  fps @ VGA  fps @ 720p  SA/MB perf @ max freq
    SA      1     7.300        6889    2.5         10132  6.9         137.0       33.8      4.1     13.2       4.4         4.8
    SA      2     6.680        8121    3.0         5197   13.5        149.7       17.3      8.6     28.1       9.38        10.2
    SA      3     6.668        13013   4.8         3601   19.4        150.0       12.0      12.5    40.7       13.56       14.7
    SA      4     6.565        16129   5.9         2789   25.1        152.3       9.3       16.4    53.3       17.78       19.3
    SA      5     6.719        21280   7.8         2368   29.5        148.8       7.9       18.9    61.4       20.5        22.2
    SA      6     7.395        25011   9.2         2098   33.3        135.2       7.0       19.3    62.9       20.98       22.8
    SA      7     6.760        29054   10.6        1917   36.5        147.9       6.4       23.2    75.4       25.1        27.3
    SA      8     6.532        34586   12.7        1737   40.2        153.1       5.8       26.4    86.1       28.69       31.2
    SA      9     7.143        36866   13.5        1678   41.7        140.0       5.6       25.0    81.5       27.16       29.5
    MB      1     5.062        2729    n/a         69900  n/a         197.6       233.00    0.85    2.76       0.92        n/a
  • 34. Benchmark: FIR filtering ■ Implemented BT-601-7 high quality interpolation filter ■ Coefficients quantized to 12 bits, 8-bit input samples ■ 26-tap symmetric filter ■ Used over 2 chroma components in a 4:2:2 to 4:4:4 up-sampling application (two filtering operations per pixel)
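A symmetric filter lets each coefficient be shared by two samples, halving the multiplies per output. The sketch below compares a direct 26-tap FIR with the folded form; the coefficients are made-up integers chosen only to be symmetric (they are not the BT.601 filter taps), and the function names are hypothetical.

```python
def fir_direct(x, h):
    """Direct form: y[n] = sum over k of h[k] * x[n - k], for fully overlapped outputs."""
    taps = len(h)
    return [sum(h[k] * x[n - k] for k in range(taps)) for n in range(taps - 1, len(x))]

def fir_symmetric(x, half):
    """Folded form for an even-length symmetric filter with h[k] == h[taps - 1 - k].
    `half` holds the first taps/2 coefficients; mirrored samples are added before
    the multiply, so each output needs taps/2 multiplies instead of taps."""
    taps = 2 * len(half)
    out = []
    for n in range(taps - 1, len(x)):
        acc = 0
        for k, coeff in enumerate(half):
            acc += coeff * (x[n - k] + x[n - (taps - 1 - k)])
        out.append(acc)
    return out

half = list(range(1, 14))                    # 13 placeholder coefficients
h = half + half[::-1]                        # 26-tap symmetric filter
x = [(i * 7) % 256 for i in range(64)]       # 8-bit test samples

print(len(h), fir_direct(x, h) == fir_symmetric(x, half))   # 26 True
```

With integer samples and coefficients the two forms give identical results, so the folding is a pure cost saving: 13 multiplies per output instead of 26.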
  • 35. Benchmark: FIR filtering
    config  ways  period (ns)  slices  SA/MB area  clks    SA/MB perf  Freq (MHz)  clks/pix  Mpix/s  fps @ VGA  fps @ 720p  SA/MB perf @ max freq
    SA      1     6.602        3641    1.3         8634    42.6        151.5       86.34     1.75    5.71       1.90        32.6
    SA      2     6.624        6902    2.5         4527    81.2        151.0       45.27     3.33    10.86      3.62        62.1
    SA      3     6.458        7597    2.8         3128    117.5       154.8       31.28     4.95    16.11      5.37        92.1
    SA      4     6.884        10467   3.8         2828    130.0       145.3       28.28     5.14    16.72      5.57        95.6
    SA      5     7.526        12527   4.6         2678    137.3       132.9       26.78     4.96    16.15      5.38        92.3
    SA      6     7.853        15940   5.8         2628    139.9       127.3       26.28     4.85    15.77      5.26        90.2
    MB      1     5.062        2729    n/a         367600  n/a         197.6       3676.00   0.05    0.17       0.06        n/a
  • 36. Comparison Conclusion ■ Highly scalable core generator ■ Application-driven configuration ■ Programmability can optionally be removed for efficient single-purpose implementations (the FPGA is already reprogrammable!) ■ Slots can be configured with a high level of granularity ■ Considerable performance gains when compared to a traditional RISC core such as MicroBlaze
  • 37. Thank You Esencia Technologies Inc. 2014