Copyright © 2016 Intel Corporation 1
Accelerating Deep Learning Using Altera FPGAs
Bill Jenkins
May 3, 2016
Legal Notices and Disclaimers
• Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service
activation. Learn more at intel.com, or from the OEM or retailer. No computer system can be absolutely secure.
• Tests document performance of components on a particular test, in specific systems. Results have been estimated or simulated
using internal Intel analysis or architecture simulation or modeling, and provided to you for informational purposes. Differences in
hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance
as you consider your purchase. For more complete information about performance and benchmark results, visit
http://www.intel.com/performance.
• Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances
and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs
or cost reduction.
• All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest Intel product
specifications and roadmaps.
• Statements in this document that refer to Intel’s plans and expectations for the quarter, the year, and the future, are forward-
looking statements that involve a number of risks and uncertainties. A detailed discussion of the factors that could affect Intel’s
results and plans is included in Intel’s SEC filings, including the annual report on Form 10-K.
• The products described may contain design defects or errors known as errata which may cause the product to deviate from
published specifications. Current characterized errata are available on request.
• No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
• Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the
referenced web site and confirm whether referenced data are accurate.
• Intel, the Intel logo, and Xeon and others are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and
brands may be claimed as the property of others.
Altera and Intel Enhance the FPGA Value Proposition
• Accelerated FPGA innovation from combined R&D scale
• Improved FPGA performance/power via early access and greater optimization of process node advancements
• New, breakthrough Data Center and IoT products harnessing combined FPGA + CPU expertise
Accelerated FPGA investment
Operational excellence
STRATEGIC RATIONALE
• Superior product design capabilities
• Continued excellence in customer service and support
• Increased resources bolster long-term innovation
• Focused, additive investments today
What is Machine Learning?
• Extracting features from data in order to solve predictive problems
  • Image classification & detection
  • Image recognition/tagging
  • Network intrusion detection
  • Fraud / face detection
• Aim is programs that automatically learn to recognize complex patterns and make intelligent decisions based on insight generated from learning
• For accuracy, models must be trained, tested and calibrated to detect patterns using previous experience
When to Apply Machine Learning
• Human expertise is absent
  • Navigating to Pluto
• Humans cannot explain their expertise
  • Speech recognition
• Solution changes over time
  • Tracking traffic
• Solution needs to be adapted to particular cases
  • Medical diagnosis
• Problem is vast in relation to human reasoning capabilities
  • Ranking web pages on Google or Bing
Value Proposition of Machine Learning
[Figure: Increasing Variety of Things × (Volume × Velocity = Throughput) = 35ZB/s]
• Data is the problem
• Separating Signal from Noise Provides Value
• Revenue Growth, Cost Savings, Increased Margin
Convolutional Neural Networks (CNN)
• A network of interconnected neurons, modeled after biological processes, for computing approximate functions
• Layers extract successively higher-level features
• Often want a custom topology to meet specific application accuracy/throughput requirements
Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-Based Learning Applied to Document Recognition. Proceedings of the IEEE, 1998.
CNN Computation in One Slide
$$I_{new}(x, y) = \sum_{x'=-1}^{1} \sum_{y'=-1}^{1} I_{old}(x + x', y + y') \times F(x', y')$$
[Figure: Input Feature Map (Set of 2D Images); Filter (3D Space); Output Feature Map]
Repeat for Multiple Filters to Create Multiple “Layers” of Output Feature Map
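The 3x3 sum on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the FPGA implementation: the names `conv3x3` and `conv_layer` are hypothetical, and treating pixels outside the image as zero is an assumption, since the slide does not say how borders are handled.

```python
def conv3x3(image, filt):
    """Compute I_new(x, y) = sum over x', y' in -1..1 of
    I_old(x + x', y + y') * F(x', y').

    image: 2D list indexed [y][x]; filt: 3x3 list indexed [y' + 1][x' + 1].
    Assumption: pixels outside the image count as zero.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < height and 0 <= xx < width:
                        acc += image[yy][xx] * filt[dy + 1][dx + 1]
            out[y][x] = acc
    return out


def conv_layer(feature_maps, filters):
    """Produce one output feature map per 3D filter.

    feature_maps: list of 2D images (the input feature map).
    filters: list of 3D filters, each a list with one 3x3 slice per input image.
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    outputs = []
    for filt3d in filters:
        total = [[0] * width for _ in range(height)]
        # Sum the 2D convolutions across the depth of the filter.
        for fmap, filt2d in zip(feature_maps, filt3d):
            partial = conv3x3(fmap, filt2d)
            for y in range(height):
                for x in range(width):
                    total[y][x] += partial[y][x]
        outputs.append(total)
    return outputs
```

Calling `conv_layer` with several filters yields the multiple "layers" of output feature map that the slide describes.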
What’s in my FPGA?
• DSPs
  • Dedicated single-precision floating point multiply and accumulators
• Block RAMs
  • Small embedded memories that can be stitched to form an arbitrary memory system
• Programmable Interconnect
  • Programmable logic and routing that can build arbitrary topologies
• Compute architecture with high degree of customization
Why an FPGA for CNN? (Arria 10)
• 1 TFLOP floating point performance in mid-range part
• 35W total device power
• Use every DSP, every clock cycle: compute spatially
• 8 TB/s memory bandwidth to keep the state on chip!
• Exceeds available external bandwidth by a factor of 50
• Random access, low latency (2 clks)
• Place all data in on-chip memory: compute temporally
[Figure: multiply-add (X, +) DSP blocks paired with M20K memory blocks. Fine-grained & low latency between compute and memory]
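The headline figures on this slide imply two derived quantities that are easy to check. The sketch below uses only the slide's numbers (1 TFLOP, 35W, 8 TB/s, factor of 50); the function names are hypothetical, and the inputs are peak figures rather than measurements.

```python
PEAK_FLOPS = 1e12          # 1 TFLOP peak, from the slide
DEVICE_POWER_W = 35        # total device power, from the slide
ON_CHIP_BW_BYTES = 8e12    # 8 TB/s on-chip memory bandwidth, from the slide
EXTERNAL_RATIO = 50        # on-chip bandwidth exceeds external by this factor


def gflops_per_watt(peak_flops, power_w):
    """Peak compute efficiency in GFLOPS per watt."""
    return peak_flops / power_w / 1e9


def implied_external_bandwidth_gbs(on_chip_bytes_per_s, ratio):
    """External bandwidth (GB/s) implied by the on-chip figure and the ratio."""
    return on_chip_bytes_per_s / ratio / 1e9


if __name__ == "__main__":
    print(round(gflops_per_watt(PEAK_FLOPS, DEVICE_POWER_W), 1))
    print(implied_external_bandwidth_gbs(ON_CHIP_BW_BYTES, EXTERNAL_RATIO))
```

Under these figures the part peaks at roughly 28.6 GFLOPS/W, and the implied external memory bandwidth is about 160 GB/s.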
CNNs on FPGAs — Scalable Architecture
Market Demands Scalability for Machine Learning
Cloud Analytics
• 1000s of Classes
• Large Workloads
• Highly Efficient (Performance / W)
• Varying accuracy
• Server Form Factor
Transportation Safety
• < 10 Classes
• Frame Rate: 15–30fps
• Power: 1W-5W
• Cost: Low
• Varying accuracy
• Custom Form Factor
Different Parallelism in CNN
Old Approach
• Parallelism across the “face” of the kernel window, and across multiple convolution stages
• Low hardware re-use
New Approach
• Parallelism in the depth of the kernel window and across output features
• Defer complex spatial math to random access memory
• Re-use hardware to compute multiple layers
Scalable CNN Computations — In One Slide
[Figure labels: Filters; accum (x3); Output Feature Map]
“Slide” → No data movement. Addressing an on-chip RAM!
Scalable CNN Architecture on FPGA (1)
[Figure labels: DDR; FPGA; Double-Buffer On-Chip RAM; Filters (on-chip RAM); # of Parallel Convolutions]
Scalable CNN Architecture on FPGA (2)
• Array size (x, y)
• Clock rate
• External memory bandwidth
Calculated throughput & resource utilization
• Layer descriptions
• Given resource constraints, find optimal architecture
• Ex. AlexNet on A10-115 is 52x26 for 800 img/s @ 350 MHz
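The flow on this slide (array size, clock rate and layer descriptions in; throughput out; then a search under resource constraints) can be sketched as a toy cycle-count model. Everything here is an assumption made for illustration: the slide does not give the actual model, so the mapping of the array's x dimension to depth-vector width and its y dimension to parallel output features, the layer dictionary keys, and the function names are all hypothetical. External memory bandwidth is ignored.

```python
from math import ceil


def layer_cycles(layer, vec_width, num_features):
    """Cycles for one convolution layer on a vec_width x num_features array.

    Hypothetical model: each cycle, every column computes one dot product of
    up to vec_width depth elements for one output feature.
    """
    depth_steps = ceil(layer["in_channels"] / vec_width)
    feature_steps = ceil(layer["out_channels"] / num_features)
    window = layer["kernel_h"] * layer["kernel_w"]
    pixels = layer["out_h"] * layer["out_w"]
    return pixels * window * depth_steps * feature_steps


def images_per_second(layers, vec_width, num_features, clock_hz):
    """Estimated throughput when the same array is re-used for every layer."""
    total = sum(layer_cycles(l, vec_width, num_features) for l in layers)
    return clock_hz / total


def best_array(layers, multiplier_budget, clock_hz, max_dim=64):
    """Exhaustive search for the array shape with the highest estimate.

    multiplier_budget stands in for the DSP resource constraint (assumption).
    Returns (vec_width, num_features, images_per_second).
    """
    best = None
    for vec_width in range(1, max_dim + 1):
        for num_features in range(1, max_dim + 1):
            if vec_width * num_features > multiplier_budget:
                continue
            rate = images_per_second(layers, vec_width, num_features, clock_hz)
            if best is None or rate > best[2]:
                best = (vec_width, num_features, rate)
    return best


if __name__ == "__main__":
    # A made-up single-layer network, not AlexNet.
    toy = [dict(in_channels=8, kernel_h=3, kernel_w=3,
                out_channels=16, out_h=4, out_w=4)]
    print(best_array(toy, multiplier_budget=32, clock_hz=350e6, max_dim=16))
```

The `ceil` terms are the point of the sketch: an array whose dimensions do not divide the layer's depth or feature count wastes cycles, which is why the best shape depends on the layer descriptions and not only on the resource budget.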
Scalable CNN Architecture on FPGA (3)
• Choice of parallelism has large impact on end compute architecture and properties of solution
• Defined a scalable approach to CNNs on the FPGA
  • Not tied to specific FPGA device
  • Not tied to specific CNN topology
• Design Methodology:
  1. Fit largest possible accelerator network on FPGA (52x26 on Arria 10)
     • Limited by DSP Blocks & M20K (RAM) Resources
  2. Tile network onto available accelerator
     • Decompose filter window into 1x1xW vectors for dot product
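The decomposition in step 2 can be checked numerically. The sketch below assumes that "1x1xW" means a vector of W elements taken along the depth (channel) axis at a single filter position; that reading, the data layout `[channel][y][x]`, and the function names are assumptions, not details from the slides.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def direct_output(inputs, filt, y, x):
    """One output value computed as a plain triple loop over the filter."""
    total = 0
    for c in range(len(filt)):
        for ky in range(len(filt[0])):
            for kx in range(len(filt[0][0])):
                total += inputs[c][y + ky][x + kx] * filt[c][ky][kx]
    return total


def tiled_output(inputs, filt, y, x, vec_width):
    """Same output value built from 1x1xW dot products plus an accumulator."""
    depth = len(filt)
    acc = 0
    for ky in range(len(filt[0])):
        for kx in range(len(filt[0][0])):
            # Walk the depth axis in chunks of vec_width at this filter position.
            for c0 in range(0, depth, vec_width):
                chans = range(c0, min(c0 + vec_width, depth))
                in_vec = [inputs[c][y + ky][x + kx] for c in chans]
                w_vec = [filt[c][ky][kx] for c in chans]
                acc += dot(in_vec, w_vec)
    return acc
```

Because every chunk is the same fixed-width dot product, one piece of hardware can serve any filter size or depth; only the addressing and the number of accumulation steps change.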
AlexNet Competitive Analysis: Classification

System (Precision, Image, Speed)¹                      Throughput    Est. Board Power   Throughput / Watt
Arria 10-115 (Current: FP32, Full Size, @275 MHz)      575 img/s     ~31W               18.5 img/s/W
Arria 10-115 (Optimized: FP32, Full Size, @350 MHz)    750 img/s     ~36W               20.8 img/s/W
Arria 10-115 (Estimate: FP16, Full Size, @350 MHz)     900 img/s     ~39W               23.1 img/s/W
Arria 10-115 (Estimate: 21b, Full Size, @350 MHz)      1200 img/s    ~40W               30 img/s/W
2 x Arria 10-115, Nallatech 510T Board                 2400 img/s    ~75W               32 img/s/W
cuDNN4 on NVIDIA Titan X                               3216 img/s    227W               14.2 img/s/W

Source for the Titan X row: NVIDIA Corporation, GPU-Based Deep Learning Inference: A Performance and Power Analysis, November 2015
• Further algorithmic optimization of FPGA possible
• Expect similar ratios for Stratix10 vs. NVIDIA 14nm Pascal
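The Throughput / Watt column follows directly from the other two columns. A short check, using only the table's own figures (the row labels are abbreviated and the function name is hypothetical):

```python
# (system, throughput in img/s, estimated board power in W), copied from the table
ROWS = [
    ("Arria 10-115, FP32 @275 MHz (current)", 575, 31),
    ("Arria 10-115, FP32 @350 MHz (optimized)", 750, 36),
    ("Arria 10-115, FP16 @350 MHz (estimate)", 900, 39),
    ("Arria 10-115, 21b @350 MHz (estimate)", 1200, 40),
    ("2 x Arria 10-115, Nallatech 510T", 2400, 75),
    ("cuDNN4 on NVIDIA Titan X", 3216, 227),
]


def images_per_second_per_watt(throughput, power_w):
    """Efficiency metric used in the last column of the table."""
    return throughput / power_w


if __name__ == "__main__":
    for name, throughput, power in ROWS:
        print(f"{name}: {images_per_second_per_watt(throughput, power):.1f} img/s/W")
```

The board power values are the slide's estimates (marked "~"), so the efficiency figures inherit that uncertainty.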
Getting Started with CNNs on FPGAs
High-Performance Machine Learning Desired
Accelerate Computation (Current Intel PSG focus, original MSFT focus)
• Scale & Speed of Devices
• Better Compute Architecture
• Math Optimization (Winograd, FFT)
• Optimized RTL / HLD
Tune Problem to Platform (Current i-Abra and partner focus)
• Simplify network topology
• Reduce precision / use fixed point
• Create more local neuron structures
• Integrated training and classification
Not Mutually Exclusive: Combine for Optimal Solution
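As an illustration of the "Math Optimization (Winograd, FFT)" item, the sketch below shows the standard one-dimensional Winograd F(2,3) transform, which produces two outputs of a 3-tap filter with four data multiplications instead of six. It is a generic textbook form, not taken from the slides, and says nothing about how Intel's implementation applies it; the function names are hypothetical.

```python
def direct_fir(d, g):
    """Two outputs of a 3-tap filter computed directly: 6 multiplications."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """The same two outputs via Winograd F(2,3): 4 data multiplications."""
    # Filter-side terms depend only on g, so they can be computed once per filter.
    g_sum = (g[0] + g[1] + g[2]) / 2
    g_alt = (g[0] - g[1] + g[2]) / 2
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * g_sum
    m3 = (d[2] - d[1]) * g_alt
    m4 = (d[1] - d[3]) * g[2]
    return [m1 + m2 + m3, m2 - m3 - m4]


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    taps = [2, 4, 6]
    print(direct_fir(data, taps), winograd_f23(data, taps))
```

The saving trades multiplications for extra additions, which suits a device where multipliers (DSP blocks) are the scarcer resource.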
Overview: Design Flow Using CNN IP
[Figure: Data Collection → Data Store → Choose Network → Train Network → Execution Engine; labels: Parameters, Selection, Architecture; Altera CNN IP]
Improvement Strategies
• Collect more data
• Improve network
Choose Network
• Use framework (e.g. Caffe, Torch)
• Choose based on experience or limits of execution engine
Train Network
• An HPC workload
• Requires data to be pre-selected
• Weeks-to-months process
Execution Engine
• Implementation of the Neural Network
• Flexibility, performance & power dominate choice
Overview: Design Flow for CNN Using Partner
[Figure: Data Collection → Data Store → Neural Pathways → Neural Synapse; labels: Parameters, Selection, Architecture; Altera CNN IP]
Neural Pathways
• Integrated Network selection and training
• Capable of acceleration in FPGA
• Minutes-to-hours process
Neural Synapse
• Implementation of highly efficient Neural Network
• Built in FPGA fabric with OpenCL
Join Us on Our Journey Together…
How can Intel + Altera help your business grow?
• New opportunities to increase the FPGA value proposition
• Accelerated FPGA investment driving product innovation to increase your performance and productivity
• Increased operational excellence to accelerate time-to-market
• Expanded product portfolio to arm you with new solutions for your most challenging applications
• Come join us at our booth to see a demo of machine learning on FPGAs
Resources
• Altera Website
  • Altera SDK for OpenCL Page (www.altera.com/opencl)
  • Technical Article “Efficient Implementation of Neural Network Systems Built on FPGAs, Programmed with OpenCL” (www.altera.com/deeplearning-tech-article)
  • GPU vs FPGA overview online training (available mid-May)
  • CNN on FPGA whitepaper (available mid-May)
  • “Machine Learning on FPGAs” web page (available mid-May)
• Embedded Vision Alliance Website
  • Technical Article “OpenCL Streamlines FPGA Acceleration of Computer Vision”
Legal Notices and Disclaimers
Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at www.intel.com.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
© Intel Corporation
Slide 18
Footnote 1. Configurations:
AlexNet configurations on Arria 10-115 FPGAs optimized via IP - tested by Intel PSG
For more information go to https://www.altera.com/content/dam/altera-www/global/en_US/pdfs/literature/pt/arria-10-product-table.pdf
Thank You

“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
Edge AI and Vision Alliance
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
Edge AI and Vision Alliance
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
Edge AI and Vision Alliance
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
Edge AI and Vision Alliance
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
Edge AI and Vision Alliance
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
Edge AI and Vision Alliance
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara
Edge AI and Vision Alliance
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
Edge AI and Vision Alliance
 

"Accelerating Deep Learning Using Altera FPGAs," a Presentation from Intel

  • 1. Accelerating Deep Learning Using Altera FPGAs. Bill Jenkins, May 3, 2016. Copyright © 2016 Intel Corporation
  • 2. Legal Notices and Disclaimers
      • Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at intel.com, or from the OEM or retailer. No computer system can be absolutely secure.
      • Tests document performance of components on a particular test, in specific systems. Results have been estimated or simulated using internal Intel analysis or architecture simulation or modeling, and provided to you for informational purposes. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit http://www.intel.com/performance.
      • Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.
      • All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest Intel product specifications and roadmaps.
      • Statements in this document that refer to Intel’s plans and expectations for the quarter, the year, and the future, are forward-looking statements that involve a number of risks and uncertainties. A detailed discussion of the factors that could affect Intel’s results and plans is included in Intel’s SEC filings, including the annual report on Form 10-K.
      • The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.
      • No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
      • Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.
      • Intel, the Intel logo, and Xeon and others are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others.
  • 3. Altera and Intel Enhance the FPGA Value Proposition
      Strategic rationale: accelerated FPGA investment and operational excellence
      • Accelerated FPGA innovation from combined R&D scale
      • Improved FPGA performance/power via early access and greater optimization of process node advancements
      • New, breakthrough Data Center and IoT products harnessing combined FPGA + CPU expertise
      • Superior product design capabilities
      • Continued excellence in customer service and support
      • Increased resources bolster long-term innovation
      • Focused, additive investments today
  • 4. What is Machine Learning?
      • Extracting features from data in order to solve predictive problems
          • Image classification & detection
          • Image recognition/tagging
          • Network intrusion detection
          • Fraud / face detection
      • The aim is programs that automatically learn to recognize complex patterns and make intelligent decisions based on insight generated from learning
      • For accuracy, models must be trained, tested and calibrated to detect patterns using previous experience
  • 5. When to Apply Machine Learning
      • Human expertise is absent
          • Navigating to Pluto
      • Humans cannot explain their expertise
          • Speech recognition
      • Solution changes over time
          • Tracking traffic
      • Solution needs to be adapted to particular cases
          • Medical diagnosis
      • Problem is vast in relation to human reasoning capabilities
          • Ranking web pages on Google or Bing
  • 6. Value Proposition of Machine Learning
      [Diagram labels]
      • Data is the problem
      • Volume x Velocity = Throughput (35ZB/s)
      • Increasing Variety of Things
      • Separating Signal from Noise Provides Value
      • Revenue Growth, Cost Savings, Increased Margin
  • 7. Convolutional Neural Networks (CNN)
      • A network of interconnected neurons, modeled after biological processes, for computing approximate functions
      • Layers extract successively higher-level features
      • A custom topology is often needed to meet application-specific accuracy/throughput requirements
      Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based Learning Applied to Document Recognition. Proceedings of the IEEE, 1998
  • 8. CNN Computation in One Slide
      $$I_{\text{new}}(x, y) = \sum_{x'=-1}^{1} \sum_{y'=-1}^{1} I_{\text{old}}(x + x',\, y + y') \times F(x', y')$$
      [Diagram labels] Input Feature Map (Set of 2D Images); Filter (3D Space); Output Feature Map
      • Repeat for Multiple Filters to Create Multiple “Layers” of Output Feature Map
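The per-pixel formula on this slide can be sketched in plain Python. This is an illustrative sketch rather than the presenter's implementation: the function name `convolve3x3` and the choice to leave border pixels at zero are assumptions, and it handles only a single 2D image with a single 3x3 filter (the slide's full case sums over a set of 2D images).

```python
def convolve3x3(i_old, f):
    """Compute I_new(x, y) = sum over x', y' in {-1, 0, 1} of
    I_old(x + x', y + y') * F(x', y') for every interior pixel.

    i_old is a list of rows (indexed [y][x]); f is a 3x3 list indexed
    [y' + 1][x' + 1]. Border pixels are left at 0.0 (an assumption,
    the slide does not specify border handling).
    """
    height, width = len(i_old), len(i_old[0])
    i_new = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    acc += i_old[y + dy][x + dx] * f[dy + 1][dx + 1]
            i_new[y][x] = acc
    return i_new


# A 3x3 image of ones with a 3x3 filter of ones: the single interior
# pixel accumulates nine products of 1 * 1.
ones_image = [[1.0] * 3 for _ in range(3)]
ones_filter = [[1.0] * 3 for _ in range(3)]
result = convolve3x3(ones_image, ones_filter)
```

Repeating the call with a different filter per output channel corresponds to the slide's "repeat for multiple filters" step.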
  • 9. What’s in my FPGA?
      • DSPs
          • Dedicated single-precision floating point multipliers and accumulators
      • Block RAMs
          • Small embedded memories that can be stitched to form an arbitrary memory system
      • Programmable Interconnect
          • Programmable logic and routing that can build arbitrary topologies
      • Compute architecture with high degree of customization
  • 10. Why an FPGA for CNN? (Arria 10)
      • 1 TFLOP floating point performance in mid-range part
          • 35W total device power
          • Use every DSP, every clock cycle (compute spatially)
      • 8 TB/s memory bandwidth to keep the state on chip!
          • Exceeds available external bandwidth by factor of 50
          • Random access, low latency (2 clks)
          • Place all data in on-chip memory (compute temporally)
      [Diagram: multiply-add DSP blocks paired with M20K memories] Fine-grained & low latency between compute and memory
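The headline numbers on this slide imply two derived figures that the slide does not state directly. The short calculation below is a back-of-the-envelope sketch using only the slide's values; the variable names are made up for illustration, and the results are peak figures under the assumption that the quoted 1 TFLOP and 35 W apply at the same operating point.

```python
# Values quoted on the slide.
on_chip_bandwidth_bytes_per_s = 8e12   # 8 TB/s on-chip memory bandwidth
bandwidth_advantage = 50               # "exceeds external bandwidth by factor of 50"
peak_flops = 1e12                      # 1 TFLOP floating point performance
device_power_w = 35                    # 35 W total device power

# Derived: external bandwidth implied by the factor-of-50 claim, in GB/s.
implied_external_bw_gb_s = on_chip_bandwidth_bytes_per_s / bandwidth_advantage / 1e9

# Derived: peak compute efficiency in GFLOPS per watt.
peak_gflops_per_watt = peak_flops / device_power_w / 1e9

print(f"Implied external bandwidth: {implied_external_bw_gb_s:.0f} GB/s")
print(f"Peak efficiency: {peak_gflops_per_watt:.1f} GFLOPS/W")
```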
  • 11. CNNs on FPGAs: Scalable Architecture
  • 12. Market Demands Scalability for Machine Learning
      Cloud Analytics
          • 1000s of Classes
          • Large Workloads
          • Highly Efficient (Performance / W)
          • Varying accuracy
          • Server Form Factor
      Transportation Safety
          • < 10 Classes
          • Frame Rate: 15–30fps
          • Power: 1W-5W
          • Cost: Low
          • Varying accuracy
          • Custom Form Factor
  • 13. Different Parallelism in CNN
      Old Approach
          • Parallelism across the “face” of the kernel window, and across multiple convolution stages
          • Low hardware re-use
      New Approach
          • Parallelism in the depth of the kernel window and across output features
          • Defer complex spatial math to random access memory
          • Re-use hardware to compute multiple layers
  • 14. Scalable CNN Computations: In One Slide
      [Diagram labels] Filters; accum (x3); Output Feature Map
      • “Slide” → No data movement. Addressing an on-chip RAM!
  • 15. Scalable CNN Architecture on FPGA (1)
      [Diagram labels] FPGA; DDR; Double-Buffer On-Chip RAM; Filters (on-chip RAM); # of Parallel Convolutions
  • 16. Scalable CNN Architecture on FPGA (2)
      • Array size (x, y)
      • Clock rate
      • External memory bandwidth
      [Diagram label] Calculated throughput & resource utilization
      • Layer descriptions
      • Given resource constraints, find optimal architecture
      • Ex. AlexNet on A10-115 is 52x26 for 800 img/s @ 350 MHz
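The AlexNet example on this slide fixes three of the model's quantities (array size, clock rate, throughput), so the compute budget per image follows by arithmetic. The sketch below assumes each element of the 52x26 array is one processing element that is busy every cycle; the slide does not say how much work an element does per cycle, so the result is a cycle budget, not an operation count. All names are illustrative.

```python
# Values quoted on the slide for AlexNet on the A10-115.
array_x, array_y = 52, 26
clock_hz = 350e6          # 350 MHz
throughput_img_s = 800    # 800 img/s

# Number of array elements (assumed: one processing element each).
processing_elements = array_x * array_y

# Element-cycles available per second across the whole array.
element_cycles_per_s = processing_elements * clock_hz

# Element-cycles that can be spent on each image at the quoted throughput.
element_cycles_per_image = element_cycles_per_s / throughput_img_s

print(f"{processing_elements} elements, "
      f"{element_cycles_per_image / 1e6:.1f} M element-cycles per image")
```

A model of the kind the slide describes would go the other way: given the layer descriptions and a candidate array size, it would estimate the cycles needed per image and hence the throughput.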
  • 17. Scalable CNN Architecture on FPGA (3)
      • Choice of parallelism has large impact on end compute architecture and properties of solution
      • Defined a scalable approach to CNNs on the FPGA
          • Not tied to specific FPGA device
          • Not tied to specific CNN topology
      • Design Methodology:
          1. Fit largest possible accelerator network on FPGA (52x26 on Arria 10)
              • Limited by DSP Blocks & M20K (RAM) Resources
          2. Tile network onto available accelerator
              • Decompose filter window into 1x1xW vectors for dot product
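The decomposition named in step 2 can be illustrated in a few lines of Python. This is a minimal sketch under stated assumptions, not the accelerator's design: it treats W as the depth (channel count) of the filter window, and the helper names `direct_output_pixel`, `dot` and `tiled_output_pixel` are hypothetical. It shows that one output pixel of a K x K x W convolution equals the accumulated sum of K*K dot products, each taken over a 1x1xW depth vector.

```python
def direct_output_pixel(window, filt):
    """Reference: multiply-accumulate over the whole K x K x W window."""
    total = 0.0
    for ky in range(len(filt)):
        for kx in range(len(filt[0])):
            for c in range(len(filt[0][0])):
                total += window[ky][kx][c] * filt[ky][kx][c]
    return total


def dot(a, b):
    """Dot product of two equal-length depth vectors (1x1xW)."""
    return sum(p * q for p, q in zip(a, b))


def tiled_output_pixel(window, filt):
    """Same result, computed as K*K depth-wise dot products whose
    partial sums are accumulated one window position at a time."""
    acc = 0.0
    for ky in range(len(filt)):
        for kx in range(len(filt[0])):
            acc += dot(window[ky][kx], filt[ky][kx])
    return acc


# Example: a 3x3 window with depth 4, filled with the values 0..35,
# against a filter of all ones.
K, W = 3, 4
window = [[[float(ky * K * W + kx * W + c) for c in range(W)]
           for kx in range(K)] for ky in range(K)]
filt = [[[1.0] * W for _ in range(K)] for _ in range(K)]

direct = direct_output_pixel(window, filt)
tiled = tiled_output_pixel(window, filt)
```

Because every window position reduces to the same fixed-length dot product, the same hardware unit can be reused for any filter size by changing only how many partial sums are accumulated.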
  • 18. AlexNet Competitive Analysis: Classification
      System (Precision, Image, Speed) [1] | Throughput | Est. Board Power | Throughput / Watt
      Arria 10-115 (Current: FP32, Full Size, @275 MHz) | 575 img/s | ~31W | 18.5 img/s/W
      Arria 10-115 (Optimized: FP32, Full Size, @350 MHz) | 750 img/s | ~36W | 20.8 img/s/W
      Arria 10-115 (Estimate: FP16, Full Size, @350 MHz) | 900 img/s | ~39W | 23.1 img/s/W
      Arria 10-115 (Estimate: 21b, Full Size, @350 MHz) | 1200 img/s | ~40W | 30 img/s/W
      2 x Arria 10-115 Nallatech 510T Board | 2400 img/s | ~75W | 32 img/s/W
      cuDNN4 on NVIDIA Titan X | 3216 img/s | 227W | 14.2 img/s/W
      Source for the Titan X row: NVIDIA Corporation, GPU-Based Deep Learning Inference: A Performance and Power Analysis, November 2015
      • Further algorithmic optimization of FPGA possible
      • Expect similar ratios for Stratix10 vs. NVIDIA 14nm Pascal
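The Throughput / Watt column is simply throughput divided by estimated board power, which can be checked directly. The snippet below is a small verification sketch using the table's figures; the short system labels and the `img_per_s_per_watt` name are made up for illustration.

```python
# (label, throughput in img/s, estimated board power in W) from the table.
systems = [
    ("Arria 10-115 FP32 @275 MHz", 575, 31),
    ("Arria 10-115 FP32 @350 MHz", 750, 36),
    ("Arria 10-115 FP16 @350 MHz", 900, 39),
    ("Arria 10-115 21b @350 MHz", 1200, 40),
    ("2 x Arria 10-115 Nallatech 510T", 2400, 75),
    ("cuDNN4 on NVIDIA Titan X", 3216, 227),
]

# Efficiency rounded to one decimal place, as in the table.
img_per_s_per_watt = {
    label: round(throughput / power, 1)
    for label, throughput, power in systems
}

for label, efficiency in img_per_s_per_watt.items():
    print(f"{label}: {efficiency} img/s/W")
```

On these figures the GPU has the highest absolute throughput, while every FPGA configuration listed has a higher throughput per watt.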
  • 19. Getting Started with CNNs on FPGAs
      High-Performance Machine Learning Desired
      Accelerate Computation
          • Scale & Speed of Devices
          • Better Compute Architecture
          • Math Optimization (Winograd, FFT)
          • Optimized RTL / HLD
          (Current Intel PSG focus, original MSFT focus)
      Tune Problem to Platform
          • Simplify network topology
          • Reduce precision / use fixed point
          • Create more local neuron structures
          • Integrated training and classification
          (Current i-Abra and partner focus)
      Not Mutually Exclusive: Combine for Optimal Solution
  • 20. Overview: Design Flow Using CNN IP
      [Diagram stages] Data Collection; Data Store; Choose Network; Train Network; Execution Engine (Altera CNN IP)
      [Diagram labels] Parameters; Selection; Architecture
      Improvement Strategies
          • Collect more data
          • Improve network
      Choose Network
          • Use framework (e.g. Caffe, Torch)
          • Choose based on experience or limits of execution engine
      Train Network
          • An HPC workload
          • Requires data to be pre-selected
          • Weeks to months process
      Execution Engine
          • Implementation of the Neural Network
          • Flexibility, performance & power dominate choice
  • 21. Overview: Design Flow for CNN Using Partner
      [Diagram stages] Data Collection; Data Store; Neural Pathways; Neural Synapse (Altera CNN IP)
      [Diagram labels] Parameters; Selection; Architecture
      Neural Pathways
          • Integrated Network selection and training
          • Capable of acceleration in FPGA
          • Minutes to hours process
      Neural Synapse
          • Implementation of highly efficient Neural Network
          • Built in FPGA fabric with OpenCL
  • 22. Join Us on Our Journey Together…
      How can Intel + Altera help your business grow?
      • New opportunities to increase the FPGA value proposition
      • Accelerated FPGA investment driving product innovation to increase your performance and productivity
      • Increased operational excellence to accelerate time-to-market
      • Expanded product portfolio to arm you with new solutions for your most challenging applications
      • Come join us at our booth to see a demo of machine learning on FPGAs
  • 23. Resources
      • Altera Website
          • Altera SDK for OpenCL Page (www.altera.com/opencl)
          • Technical Article “Efficient Implementation of Neural Network Systems Built on FPGAs, Programmed with OpenCL” (www.altera.com/deeplearning-tech-article)
          • GPU vs FPGA overview online training (available mid-May)
          • CNN on FPGA whitepaper (available mid-May)
          • “Machine Learning on FPGAs” web page (available mid-May)
      • Embedded Vision Alliance Website
          • Technical Article “OpenCL Streamlines FPGA Acceleration of Computer Vision”
  • 24. Legal Notices and Disclaimers
      Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at www.intel.com.
      Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
      © Intel Corporation
      Slide 18 Footnote 1. Configurations: AlexNet configurations on Arria 10-115 FPGAs optimized via IP - tested by Intel PSG
      For more information go to https://www.altera.com/content/dam/altera-www/global/en_US/pdfs/literature/pt/arria-10-product-table.pdf
  • 25. Thank You