Xeon+FPGA: Better Together
An Overview of Architecture and Practices
Elijah Charles, Gaurav Kaul
Intel Corporation
Legal Notices and Disclaimers
Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at intel.com,
or from the OEM or retailer.
No computer system can be absolutely secure.
Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual
performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and
benchmark results, visit http://www.intel.com/performance.
Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future
costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.
This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice.
Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps.
Statements in this document that refer to Intel’s plans and expectations for the quarter, the year, and the future, are forward-looking statements that involve a
number of risks and uncertainties. A detailed discussion of the factors that could affect Intel’s results and plans is included in Intel’s SEC filings, including the annual
report on Form 10-K.
The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current
characterized errata are available on request.
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether
referenced data are accurate.
Intel, Xeon and the Intel logo are trademarks of Intel Corporation in the United States and other countries.
*Other names and brands may be claimed as the property of others.
OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos.
© 2015 Intel Corporation.
Risk Factors
The above statements and any others in this document that refer to plans and expectations for the second quarter, the year and the future are forward-
looking statements that involve a number of risks and uncertainties. Words such as "anticipates," "expects," "intends," "plans," "believes," "seeks,"
"estimates," "may," "will," "should" and their variations identify forward-looking statements. Statements that refer to or are based on projections, uncertain
events or assumptions also identify forward-looking statements. Many factors could affect Intel's actual results, and variances from Intel's current
expectations regarding such factors could cause actual results to differ materially from those expressed in these forward-looking statements. Intel
presently considers the following to be important factors that could cause actual results to differ materially from the company's expectations. Demand for
Intel's products is highly variable and could differ from expectations due to factors including changes in business and economic conditions; consumer
confidence or income levels; the introduction, availability and market acceptance of Intel's products, products used together with Intel products and
competitors' products; competitive and pricing pressures, including actions taken by competitors; supply constraints and other disruptions affecting
customers; changes in customer order patterns including order cancellations; and changes in the level of inventory at customers. Intel's gross margin
percentage could vary significantly from expectations based on capacity utilization; variations in inventory valuation, including variations related to the
timing of qualifying products for sale; changes in revenue levels; segment product mix; the timing and execution of the manufacturing ramp and associated
costs; excess or obsolete inventory; changes in unit costs; defects or disruptions in the supply of materials or resources; and product manufacturing
quality/yields. Variations in gross margin may also be caused by the timing of Intel product introductions and related expenses, including marketing
expenses, and Intel's ability to respond quickly to technological developments and to introduce new products or incorporate new features into existing
products, which may result in restructuring and asset impairment charges. Intel's results could be affected by adverse economic, social, political and
physical/infrastructure conditions in countries where Intel, its customers or its suppliers operate, including military conflict and other security risks, natural
disasters, infrastructure disruptions, health concerns and fluctuations in currency exchange rates. Results may also be affected by the formal or informal
imposition by countries of new or revised export and/or import and doing-business regulations, which could be changed without prior notice. Intel
operates in highly competitive industries and its operations have high costs that are either fixed or difficult to reduce in the short term. The amount, timing
and execution of Intel's stock repurchase program could be affected by changes in Intel's priorities for the use of cash, such as operational spending,
capital spending, acquisitions, and as a result of changes to Intel's cash flows or changes in tax laws. Product defects or errata (deviations from published
specifications) may adversely impact our expenses, revenues and reputation. Intel's results could be affected by litigation or regulatory matters involving
intellectual property, stockholder, consumer, antitrust, disclosure and other issues. An unfavorable ruling could include monetary damages or an
injunction prohibiting Intel from manufacturing or selling one or more products, precluding particular business practices, impacting Intel's ability to design
its products, or requiring other remedies such as compulsory licensing of intellectual property. Intel's results may be affected by the timing of closing of
acquisitions, divestitures and other significant transactions. A detailed discussion of these and other factors that could affect Intel's results is included in
Intel's SEC filings, including the company's most recent reports on Form 10-Q, Form 10-K and earnings release.
Rev. 4/14/15
Agenda
• Accelerators: Motivation and Use Cases
• Using Field Programmable Gate Array (FPGA) as an Accelerator
• Intel® Xeon® Processor + FPGA Accelerator Platform
• Hardware and Software Programming Interfaces
• Example Applications
Digital Services Economy…
…Fueling Cloud Computing Growth
• 50 Billion DEVICES¹
• Build out of the CLOUD: $120B³
• New SERVICES: $450B²
1: Sources: AMS Research, Gartner, IDC, McKinsey Global Institute, and various other industry analysts and commentators
2: Source: IDC, 2013. 2016 calculated based on reported CAGR ‘13-’17
3: Source: iDATA/Digiworld, 2013
Cloud Economics
Amazon’s TCO Analysis¹
Workload Performance Metrics
• VMs per System
• Web Transactions/Sec
• Storage Capacity
• Hadoop Queries
1: Source: James Hamilton, Amazon* http://perspectives.mvdirona.com/2010/09/overall-data-center-costs/
Performance / TCO is the key metric
Diverse Data Center Demands
Accelerators can increase Performance at lower TCO for targeted workloads
Intel estimates; bubble size is relative CPU intensity
Accelerator Architecture Landscape
[Figure: Fixed Function Accelerator, Reconfigurable Accelerator, and CPU placed by Application Flexibility and Ease of Programming/Development]
Benefits of Reconfigurable Accelerators:
Savings in Area/Power
• Can be configured to implement different functions efficiently
- Meeting performance goals for segment
- Saving area and power compared to multiple Fixed Functions
[Figure: Fixed Functions, Programmable Accelerator, and Software compared by Cost and Performance]
Benefits of Reconfigurable Accelerators:
Meeting Customer Needs for Differentiation
• Workload Optimized Silicon
• Pervasive Analytics & Insights
• Intelligent Resource Orchestration
• Dynamic Resource Pooling
Driving the Digital Service Economy
What is a Field Programmable Gate Array (FPGA)?
FPGAs (Field Programmable Gate Arrays) are semiconductor devices that can be programmed
• Desired functionality of the FPGA can be (re-)programmed by downloading a configuration into the device
FPGAs offer several advantages over potential alternatives:
• Lower one-time development cost and faster time to market compared to custom-designed chips (ASICs)
• Ability to implement customer-specific functionality beyond what is available from standard products (ASSPs)
• Customizable and reprogrammable after the device has been deployed to the field, unlike both ASICs and ASSPs
[Figure: FPGA structure, with labels: Logic Blocks, Interconnect Resources, I/O Cells]
A Complete Solutions Portfolio
Powering Your Innovation
• CPLDs: Lowest Cost, Lowest Power
• FPGAs: Cost/Power Balance
• Mid-range FPGAs
• FPGAs: Optimized for High Bandwidth
• SoC & Transceivers
• PowerSoCs: High-efficiency Power Management
Resources
• Design Software
• Development Kits
• Embedded Soft and Hard Processors
• Intellectual Property (IP)
Industrial, Computing, Enterprise
Efficiency via Specialization
[Figure: GPUs, FPGAs, and ASICs compared]
Source: Bob Brodersen, Berkeley Wireless group
OpenCL and FPGAs Address These Challenges
Power-efficient acceleration
– Typically 1/5 the power of a GPU and orders of magnitude more performance per watt than a CPU
FPGA lifecycle over 15 years
– GPU lifespan is short
Re-optimization and testing required between generations
– FPGA OpenCL code can be retargeted to future devices without modification
Our OpenCL flow abstracts away the FPGA hardware flow
– Puts the FPGA into software engineers' hands
Our OpenCL SDK allows for streaming I/O channels and kernel channels
– Data movement without host involvement
– Low-latency data transmission to the accelerator
Shared virtual memory
– IBM CAPI and Intel QPI
More SW Engineering Resources than HW?
1000:1 ratio of software engineers to FPGA designers
Software engineers are not used to long compile times
OpenCL Solves This!
• Our OpenCL flow abstracts away the FPGA hardware flow, bringing the FPGA to low-level software programmers
• Software developers write, optimize, and debug in their familiar software environment
• Quartus is run behind the scenes
• Emulator and profiler are software development tools
• Pushing long compile times to the end
• OpenCL optimization doesn’t require a board
• Allowing SW to drive board requirements (.xml file)
Application Development Paradigm
[Figure: developer populations, with labels: ASIC; FPGA Programmers; Parallel Programmers; Standard CPU Programmers]
OpenCL expands the number of application developers
Intel® Xeon® E5 + Field Programmable Gate Array Software Development Platform (SDP) Shipping Today
[Figure: platform block diagram, with labels: Intel Xeon Processor E5 Product Family, FPGA, Intel QPI, DDR3 channels, PCIe 3.0 x8 links, DMI2]
Processor: Intel Xeon Processor E5
FPGA Module: Altera* Stratix* V
QPI Speed: 6.4 GT/s full width (target 8.0 GT/s at full width)
Memory to FPGA Module: 2 channels of DDR3 (up to 64 GB)
Expansion connector to FPGA Module: PCI Express® (PCIe) 3.0 x8 lanes - may be used for direct I/O, e.g. Ethernet
Features: Configuration Agent, Caching Agent, (optional) Memory Controller
Software: Accelerator Abstraction Layer (AAL) runtime, drivers, sample applications
Software Development for Accelerating Workloads using Intel® Xeon® processors and coherently attached FPGA in-socket
Intel® QuickPath Interconnect (Intel® QPI)
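The QPI speed quoted for the platform is a transfer rate, not a byte rate. A minimal sketch of the conversion follows. It assumes that a full-width QPI link moves 2 bytes of payload per transfer in each direction; that figure is an assumption brought in for illustration and is not stated on the slides.

```python
# Assumption (not from the slides): at full width, a QPI link carries
# 16 payload bits (2 bytes) per transfer in each direction.
PAYLOAD_BYTES_PER_TRANSFER = 2


def qpi_bandwidth_gb_s(gigatransfers_per_s):
    """Payload bandwidth per direction, in GB/s, for a full-width link."""
    return gigatransfers_per_s * PAYLOAD_BYTES_PER_TRANSFER


# The two rates quoted for the SDP: current speed and stated target.
for label, rate in (("current", 6.4), ("target", 8.0)):
    bandwidth = qpi_bandwidth_gb_s(rate)
    print(f"{label}: {rate} GT/s -> {bandwidth:.1f} GB/s per direction")
# current: 6.4 GT/s -> 12.8 GB/s per direction
# target: 8.0 GT/s -> 16.0 GB/s per direction
```

Under that assumption, moving from 6.4 GT/s to the 8.0 GT/s target raises per-direction payload bandwidth by 25%.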
System Logical View
• AFUs can access coherent cache on FPGA
• AFUs cannot implement a second-level cache
• Intel® QuickPath Interconnect (Intel® QPI) IP participates in cache coherency with processors
[Figure: system logical view, with labels: Processor (Cores, LLC), DRAM, QPI, FPGA (Intel QPI IP, Cache, CCI, AFUs), DDR DRAM, Multi-processor Coherence Domain, Cache access Domain]
Intel® Xeon® + Field Programmable Gate Array SDP: Intel® QuickPath Interconnect 1.1 RTL Microarchitecture
• PHY – Implements the Intel QPI PHY 1.1 (Analog/Digital)
• Intel QPI Link Layer – Provides flow control and reliable communication
• Intel QPI Protocol – Implements Intel QPI Cache Agent + Configuration Agent
• Cache Controller – Determines cache hit/miss and generates Intel QPI protocol requests
• Cache Tag – Tracks state of cacheline (MESI + internal states for tracking outstanding requests)
• Coherency Table – Programmable table that implements coherency protocol rules
• System Protocol Layer (SPL2) – Implements address translation functionality. Can provide up to 2 GB of device virtual address space to the AFU. SPL2 cannot handle page faults.
• AFU – User-designed Accelerator Function Unit
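The practical consequence of the SPL2 description is that every page the AFU touches must be mapped before use, because a miss is an error rather than a trigger to page memory in. The toy model below sketches that behavior in software. The class, its method names, and the 2 MiB page size are hypothetical and chosen only for illustration; only the 2 GB address-space limit and the lack of page-fault handling come from the slide.

```python
PAGE_SIZE = 2 * 1024 * 1024    # hypothetical page size (2 MiB), not from the slides
MAX_DEVICE_VA = 2 * 1024 ** 3  # 2 GB device virtual address space (from the slide)


class TranslationFault(Exception):
    """Raised on an unmapped access; this model never pages anything in."""


class ToyTranslator:
    """Hypothetical software model of a fixed, pre-populated translation table."""

    def __init__(self):
        # device virtual page number -> physical page base address
        self.page_table = {}

    def map_page(self, device_va, physical_base):
        if not 0 <= device_va < MAX_DEVICE_VA:
            raise ValueError("outside the device virtual address space")
        if device_va % PAGE_SIZE or physical_base % PAGE_SIZE:
            raise ValueError("addresses must be page aligned")
        self.page_table[device_va // PAGE_SIZE] = physical_base

    def translate(self, device_va):
        if not 0 <= device_va < MAX_DEVICE_VA:
            raise TranslationFault(f"address {device_va:#x} out of range")
        page, offset = divmod(device_va, PAGE_SIZE)
        if page not in self.page_table:
            # No page-fault handling: an unmapped page is simply an error.
            raise TranslationFault(f"no mapping for device VA {device_va:#x}")
        return self.page_table[page] + offset


spl = ToyTranslator()
spl.map_page(0x0020_0000, 0x4000_0000)  # map one page before the AFU uses it
print(hex(spl.translate(0x0020_0040)))  # 0x40000040
```

In this model, host software is responsible for calling `map_page` for the whole working set up front, which mirrors the constraint that the workspace must be set up before the accelerator starts.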
[Figure: Intel QPI FPGA IP block diagram, with labels: QPI PHY (Rx Align, Tx Align); QPI Link/Protocol Control (Rx Control, Tx Control); Cache Controller; Cache Data; Cache Tag; Cache Table; CCI-S (Rx, Tx); SPL2 (Address translation); CCI-E (Rx, Tx); 640 bits; User: Accelerator Function Unit (AFU); QPI interface to pins]
Intel® QuickPath Interconnect (Intel® QPI)
Intel® Xeon® Processor + Field Programmable Gate Array Tool Flow
HDL Programming
• C → SW Compiler → exe → Intel® Xeon® (AAL)
• HDL → Syn., PAR → bitstream → FPGA (Shell)
OpenCL™ Programming
• Host → SW Compiler → exe → Intel Xeon (AAL)
• Kernels → OpenCL Compiler → bitstream → FPGA (Shell)
AAL: Accelerator Abstraction Layer; FPGA: Field Programmable Gate Array
Programming Interfaces
[Figure: programming interface stack, with labels: CPU (Host Application; Accelerator Abstraction Layer with Service API, Virtual Memory API, Physical Memory API); Intel QPI; Field Programmable Gate Array (Intel QPI/KTI Link, Protocol, & PHY; CCI¹ standard; Addr Translation; CCI¹ extended; Accelerator Function Units (AFU)); Interfaces]
Standard Programming Interfaces: AAL and CCI
Programming interfaces will be forward compatible from SDP² to future MCP³ solutions
Simulation Environment available for development of SW and RTL⁴
1. Coherent Cache Interface 2. Software Development Platform 3. Multi-chip package 4. Register Transfer Level
Intel® QuickPath Interconnect (Intel® QPI)
Programming Interfaces: OpenCL™
[Figure: OpenCL programming stack, with labels: CPU (OpenCL Application; OpenCL™ Host Code; OpenCL RunTime; Accelerator Abstraction Layer with Service API, Virtual Memory API, Physical Memory API; VirtMem); Field Programmable Gate Array (OpenCL Kernels; OpenCL Kernel Code; CCI Extended; CCI Standard; CFG); Intel QPI/PCI Express®; System Memory]
Unified application code abstracted from the hardware environment
Portable across generations and families of CPUs and FPGAs
Intel® QuickPath Interconnect (Intel® QPI)
Example Usage:
Deep Learning Framework for Visual Understanding
[Figure: framework layers (cluster, node, device, primitives) and FPGA accelerator block diagram, with labels: DMA (Weights, Inputs, Outputs); Processing Tile 0, Processing Tile 1, ... Processing Tile ‘n’ (PEs); Read/Write Reg Access; Control State Machine; IP Registers; CCI Interface; SRAM Controller]
CNN (Convolutional Neural Network) function accelerated on FPGA:
Power-performance of CNN classification boosted up to 2.2X†
†Source: Intel Measured (Intel® Xeon® processor E5-2699v3 results); Altera Estimated (4x Arria-10 results).
2S Intel® Xeon® E5-2699v3 + 4x GX1150 PCI Express® cards. Most computations executed on Arria-10 FPGAs; 2S Intel Xeon E5-2699v3 host assumed to be near idle, doing misc. networking/housekeeping functions.
Arria-10 results estimated by Altera with Altera custom classification network. 2x Intel Xeon E5-2699v3 power estimated @ 139W while doing "housekeeping" for GX1150 cards based on Intel measured microbenchmark.
In order to sustain ~2400 img/s we need an I/O bandwidth of ~500 MB/s, which can be supported by a 10GigE link and software stack.
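The I/O bandwidth claim in the footnote can be sanity-checked with simple arithmetic. The sketch below uses the two approximate figures from the footnote. The 10GigE number is the raw line rate with no protocol overhead, which is an assumption added here and not something the slides state.

```python
# Approximate figures quoted in the footnote.
IMAGES_PER_SECOND = 2400   # sustained classification rate
IO_BANDWIDTH_MB_S = 500    # required I/O bandwidth, in MB/s

# Assumption (not from the slides): raw 10 Gigabit Ethernet line rate,
# ignoring protocol overhead, converted from megabits to megabytes.
TEN_GIGE_MB_S = 10_000 / 8  # 1250 MB/s


def implied_kb_per_image(bandwidth_mb_s, images_per_s):
    """Average input size per image implied by the two quoted rates."""
    return bandwidth_mb_s * 1000 / images_per_s


def link_utilization(bandwidth_mb_s, link_mb_s):
    """Fraction of the link consumed by the required bandwidth."""
    return bandwidth_mb_s / link_mb_s


kb_per_image = implied_kb_per_image(IO_BANDWIDTH_MB_S, IMAGES_PER_SECOND)
utilization = link_utilization(IO_BANDWIDTH_MB_S, TEN_GIGE_MB_S)
print(f"~{kb_per_image:.0f} KB per image, {utilization:.0%} of raw 10GigE line rate")
# ~208 KB per image, 40% of raw 10GigE line rate
```

At roughly 40% of the raw line rate there is headroom for protocol and software-stack overhead, which is consistent with the footnote's statement that one 10GigE link is enough.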
Example Usage:
Genomics Analysis Toolkit
• HaplotypeCaller (PairHMM)
• BWA mem (Smith-Waterman)
PairHMM function accelerated on FPGA:
Power-performance of pHMM boosted up to 3.8X†
†pHMM algorithm performance is measured in terms of millions of Cell Updates Per Second (CUPS).
Performance projections: CPU performance: 1 core of Intel® Xeon® processor E5-2680v2 @ 2.8 GHz delivers 2101.1 MCUP/s measured; estimated value assumes linear scaling to 10 cores on Xeon E5-2680v2 @ 2.8 GHz & 115W TDP. FPGA performance: 1 FPGA PE (Processing Engine) delivers 408.9 MCUP/s @ 200 MHz measured; estimated value assumes linear scaling to 32 PEs and 90% frequency scaling on Stratix-V A7 400 MHz based on RTL synthesis results (35W TDP). Intel estimated based on 1S Xeon E5-2680v2 + 1 Stratix-V A7 with QPI 1.1 @ 6.4 GT/s full width using Intel® QuickAssist FPGA System Release 3.3, ICC (CPU is essentially idle when workload is offloaded to the FPGA).
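The projection in the footnote can be followed step by step. The sketch below is one possible reading, not Intel's actual calculation: it interprets "90% frequency scaling" as reaching 90% of the ideal speedup when the clock goes from 200 MHz to 400 MHz, and it divides throughput by the quoted TDP values to get performance per watt. Both are assumptions.

```python
# CPU side, from the footnote: measured single-core rate, scaled to 10 cores.
CPU_MCUPS_PER_CORE = 2101.1
CPU_CORES = 10
CPU_TDP_W = 115

# FPGA side, from the footnote: measured single-PE rate at 200 MHz.
FPGA_MCUPS_PER_PE = 408.9
FPGA_PES = 32
MEASURED_MHZ = 200
TARGET_MHZ = 400
FPGA_TDP_W = 35

# Assumption: "90% frequency scaling" means 90% of the ideal clock speedup.
FREQ_SCALING_EFFICIENCY = 0.9


def cpu_projection_mcups():
    """Linear scaling of the measured single-core rate to all cores."""
    return CPU_MCUPS_PER_CORE * CPU_CORES


def fpga_projection_mcups():
    """Linear scaling to all PEs, then derated clock scaling to 400 MHz."""
    clock_gain = (TARGET_MHZ / MEASURED_MHZ) * FREQ_SCALING_EFFICIENCY
    return FPGA_MCUPS_PER_PE * FPGA_PES * clock_gain


cpu_per_watt = cpu_projection_mcups() / CPU_TDP_W
fpga_per_watt = fpga_projection_mcups() / FPGA_TDP_W
ratio = fpga_per_watt / cpu_per_watt

print(f"CPU:  {cpu_projection_mcups():.1f} MCUP/s, {cpu_per_watt:.1f} MCUP/s per W")
print(f"FPGA: {fpga_projection_mcups():.1f} MCUP/s, {fpga_per_watt:.1f} MCUP/s per W")
print(f"Power-performance ratio: {ratio:.2f}x")
```

Under this reading the ratio comes out at about 3.7x, close to the "up to 3.8X" on the slide. The footnote does not give enough detail to reproduce the slide's figure exactly.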
Intel® Xeon® + FPGA¹ in the Cloud
Vision
[Figure: cloud deployment flow, with labels: Cloud Users (Launch workload); Workload; Orchestration Software (Place workload; Static/dynamic FPGA programming); Software Defined Infrastructure; Resource Pool (Storage, Network, Compute with Intel® Xeon® + FPGA); Workload accelerators; IP Library (Intel Developed IP, 3rd party Developed IP, FPGA Vendor Developed IP, End User Developed IP)]
1: Field Programmable Gate Array (FPGA)
“Programmer Friendly” Acceleration
Software Programmers
• Need Logic and Data Management – By writing lines of code
OpenCL™ Compiler Benefits
• Ease of use
• Scalable
• Heterogeneous
• Leverage existing libraries
• Vendor choice w/open standards
• Foundation for OpenMP (80% reuse)
Channels/Pipe Extension
• Kernel to Kernel
• External I/O to Kernel
• Mix ‘n Match HDL & Kernels
[Figure: kernel pipeline and host flow, with labels: I/O, Kernel, DDRx Global Memory Buffer; CPU and FPGA steps: Context, Compile code, Create data & arguments, Execute]
© 2015 Altera Corporation – Public
Spectrum of Workload Acceleration
• Software Library – Example: Data Plane Development Kit
• Processor Instruction – Example: Intel® Advanced Vector Extensions
• Discrete Accelerator – Example: Field Programmable Gate Array, Quick Assist Technology
• Integrated Accelerator – Example: Intel® Iris™ Pro Graphics
Workload Acceleration Beyond CPU
• Intel® Silicon Photonics
• Intel® Omni-Path Fabric
• 3D XPoint™ Technology
Intel Architecture Vision for Software:
Code Once – Run Anywhere
• Software Library
• Processor Instruction
• Discrete Accelerator
• Integrated Accelerator
Consistent programming model for all accelerators
Additional Sources of Information
• A PDF of this presentation is available from our Technical Session Catalog:
www.intel.com/idfsessionsSF.
• Intel® Xeon Phi™ coprocessor resources: software.intel.com/mic-developer
• Network Compression resources: intel.com/quickassist
• Media Transcoding resources: software.intel.com/intel-media-server-studio
• Storage Cryptography resources: software.intel.com/storage
• FPGA: Please see demo in Altera* booth in the demo showcase

More Related Content

What's hot

Intel® Select Solutions for the Network
Intel® Select Solutions for the NetworkIntel® Select Solutions for the Network
Intel® Select Solutions for the Network
Liz Warner
 
QATCodec: past, present and future
QATCodec: past, present and futureQATCodec: past, present and future
QATCodec: past, present and future
boxu42
 
E5 Intel Xeon Processor E5 Family Making the Business Case
E5 Intel Xeon Processor E5 Family Making the Business Case E5 Intel Xeon Processor E5 Family Making the Business Case
E5 Intel Xeon Processor E5 Family Making the Business Case
Intel IT Center
 
8 intel network builders overview
8 intel network builders overview8 intel network builders overview
8 intel network builders overview
videos
 
Improving Quality of Service via Intel RDT
Improving Quality of Service via Intel RDTImproving Quality of Service via Intel RDT
Improving Quality of Service via Intel RDT
Liz Warner
 
Platform Observability and Infrastructure Closed Loops
Platform Observability and Infrastructure Closed LoopsPlatform Observability and Infrastructure Closed Loops
Platform Observability and Infrastructure Closed Loops
Liz Warner
 
Dynamic datacenter planning and design
Dynamic datacenter   planning and designDynamic datacenter   planning and design
Dynamic datacenter planning and designYeonki Choi
 
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
HPC DAY
 
Closed Loop Platform Automation - Tong Zhong & Emma Collins
Closed Loop Platform Automation - Tong Zhong & Emma CollinsClosed Loop Platform Automation - Tong Zhong & Emma Collins
Closed Loop Platform Automation - Tong Zhong & Emma Collins
Liz Warner
 
Disrupting the Data Center: Unleashing the Digital Services Economy
Disrupting the Data Center: Unleashing the Digital Services EconomyDisrupting the Data Center: Unleashing the Digital Services Economy
Disrupting the Data Center: Unleashing the Digital Services EconomyIntel IT Center
 
Overcoming Scaling Challenges in MongoDB Deployments with SSD
Overcoming Scaling Challenges in MongoDB Deployments with SSDOvercoming Scaling Challenges in MongoDB Deployments with SSD
Overcoming Scaling Challenges in MongoDB Deployments with SSD
MongoDB
 
Introduction to container networking in K8s - SDN/NFV London meetup
Introduction to container networking in K8s - SDN/NFV  London meetupIntroduction to container networking in K8s - SDN/NFV  London meetup
Introduction to container networking in K8s - SDN/NFV London meetup
Haidee McMahon
 
What's under the hood of Exadata X2-2 and X2-8?
What's under the hood of Exadata X2-2 and X2-8?What's under the hood of Exadata X2-2 and X2-8?
What's under the hood of Exadata X2-2 and X2-8?Enkitec
 
Exadata x5 investor-relations-final
Exadata x5 investor-relations-finalExadata x5 investor-relations-final
Exadata x5 investor-relations-final
Chien Auron
 
Build a Deep Learning Video Analytics Framework | SIGGRAPH 2019 Technical Ses...
Build a Deep Learning Video Analytics Framework | SIGGRAPH 2019 Technical Ses...Build a Deep Learning Video Analytics Framework | SIGGRAPH 2019 Technical Ses...
Build a Deep Learning Video Analytics Framework | SIGGRAPH 2019 Technical Ses...
Intel® Software
 
Intel's Machine Learning Strategy
Intel's Machine Learning StrategyIntel's Machine Learning Strategy
Intel's Machine Learning Strategy
inside-BigData.com
 
Distributed Resource Management Application API (DRMAA) Version 2
Distributed Resource Management Application API (DRMAA) Version 2Distributed Resource Management Application API (DRMAA) Version 2
Distributed Resource Management Application API (DRMAA) Version 2
Peter Tröger
 
Informix IWA data life cycle mgmt & Performance on Intel.
Informix IWA data life cycle mgmt & Performance on Intel.Informix IWA data life cycle mgmt & Performance on Intel.
Informix IWA data life cycle mgmt & Performance on Intel.
Keshav Murthy
 
HPC DAY 2017 | NVIDIA Volta Architecture. Performance. Efficiency. Availability
HPC DAY 2017 | NVIDIA Volta Architecture. Performance. Efficiency. AvailabilityHPC DAY 2017 | NVIDIA Volta Architecture. Performance. Efficiency. Availability
HPC DAY 2017 | NVIDIA Volta Architecture. Performance. Efficiency. Availability
HPC DAY
 
Intel Developer Program
Intel Developer ProgramIntel Developer Program
Intel Developer Program
Intel® Software
 

What's hot (20)

Intel® Select Solutions for the Network
Intel® Select Solutions for the NetworkIntel® Select Solutions for the Network
Intel® Select Solutions for the Network
 
QATCodec: past, present and future
QATCodec: past, present and futureQATCodec: past, present and future
QATCodec: past, present and future
 
E5 Intel Xeon Processor E5 Family Making the Business Case
E5 Intel Xeon Processor E5 Family Making the Business Case E5 Intel Xeon Processor E5 Family Making the Business Case
E5 Intel Xeon Processor E5 Family Making the Business Case
 
8 intel network builders overview
8 intel network builders overview8 intel network builders overview
8 intel network builders overview
 
Improving Quality of Service via Intel RDT
Improving Quality of Service via Intel RDTImproving Quality of Service via Intel RDT
Improving Quality of Service via Intel RDT
 
Platform Observability and Infrastructure Closed Loops
Platform Observability and Infrastructure Closed LoopsPlatform Observability and Infrastructure Closed Loops
Platform Observability and Infrastructure Closed Loops
 
Dynamic datacenter planning and design
Dynamic datacenter   planning and designDynamic datacenter   planning and design
Dynamic datacenter planning and design
 
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
 
Closed Loop Platform Automation - Tong Zhong & Emma Collins
Closed Loop Platform Automation - Tong Zhong & Emma CollinsClosed Loop Platform Automation - Tong Zhong & Emma Collins
Closed Loop Platform Automation - Tong Zhong & Emma Collins
 
Disrupting the Data Center: Unleashing the Digital Services Economy
Disrupting the Data Center: Unleashing the Digital Services EconomyDisrupting the Data Center: Unleashing the Digital Services Economy
Disrupting the Data Center: Unleashing the Digital Services Economy
 
Overcoming Scaling Challenges in MongoDB Deployments with SSD
Overcoming Scaling Challenges in MongoDB Deployments with SSDOvercoming Scaling Challenges in MongoDB Deployments with SSD
Overcoming Scaling Challenges in MongoDB Deployments with SSD
 
Introduction to container networking in K8s - SDN/NFV London meetup
Introduction to container networking in K8s - SDN/NFV  London meetupIntroduction to container networking in K8s - SDN/NFV  London meetup
Introduction to container networking in K8s - SDN/NFV London meetup
 
What's under the hood of Exadata X2-2 and X2-8?
What's under the hood of Exadata X2-2 and X2-8?What's under the hood of Exadata X2-2 and X2-8?
What's under the hood of Exadata X2-2 and X2-8?
 
Exadata x5 investor-relations-final
Exadata x5 investor-relations-finalExadata x5 investor-relations-final
Exadata x5 investor-relations-final
 
Build a Deep Learning Video Analytics Framework | SIGGRAPH 2019 Technical Ses...
Build a Deep Learning Video Analytics Framework | SIGGRAPH 2019 Technical Ses...Build a Deep Learning Video Analytics Framework | SIGGRAPH 2019 Technical Ses...
Build a Deep Learning Video Analytics Framework | SIGGRAPH 2019 Technical Ses...
 
Intel's Machine Learning Strategy
Intel's Machine Learning StrategyIntel's Machine Learning Strategy
Intel's Machine Learning Strategy
 
Distributed Resource Management Application API (DRMAA) Version 2
Distributed Resource Management Application API (DRMAA) Version 2Distributed Resource Management Application API (DRMAA) Version 2
Distributed Resource Management Application API (DRMAA) Version 2
 
Informix IWA data life cycle mgmt & Performance on Intel.
Informix IWA data life cycle mgmt & Performance on Intel.Informix IWA data life cycle mgmt & Performance on Intel.
Informix IWA data life cycle mgmt & Performance on Intel.
 
HPC DAY 2017 | NVIDIA Volta Architecture. Performance. Efficiency. Availability
HPC DAY 2017 | NVIDIA Volta Architecture. Performance. Efficiency. AvailabilityHPC DAY 2017 | NVIDIA Volta Architecture. Performance. Efficiency. Availability
HPC DAY 2017 | NVIDIA Volta Architecture. Performance. Efficiency. Availability
 
Intel Developer Program
Intel Developer ProgramIntel Developer Program
Intel Developer Program
 

Viewers also liked

Shifter: Containers in HPC Environments
Shifter: Containers in HPC EnvironmentsShifter: Containers in HPC Environments
Shifter: Containers in HPC Environments
inside-BigData.com
 
Fpga computing
Fpga computingFpga computing
Fpga computing
rinnocente
 
Containers and HPC
Containers and HPCContainers and HPC
Containers and HPC
Olli-Pekka Lehto
 
In-Network Acceleration with FPGA (MEMO)
In-Network Acceleration with FPGA (MEMO)In-Network Acceleration with FPGA (MEMO)
In-Network Acceleration with FPGA (MEMO)
Naoto MATSUMOTO
 
Some experiences for porting application to Intel Xeon Phi
Some experiences for porting application to Intel Xeon PhiSome experiences for porting application to Intel Xeon Phi
Some experiences for porting application to Intel Xeon Phi
Maho Nakata
 
Stratix FPGA Overview
Stratix FPGA OverviewStratix FPGA Overview
Stratix FPGA Overview
Premier Farnell
 
Today's FPGA Ecosystem - Neeraj Varma, Xilinx
Today's FPGA Ecosystem - Neeraj Varma, XilinxToday's FPGA Ecosystem - Neeraj Varma, Xilinx
Today's FPGA Ecosystem - Neeraj Varma, XilinxFPGA Central
 
Altera’s Role In Accelerating the Internet of Things
Using Xeon + FPGA for Accelerating HPC Workloads

  • 1. Xeon+FPGA: Better Together An Overview of Architecture and Practices Elijah Charles, Gaurav Kaul Intel Corporation
  • 2. Legal Notices and Disclaimers 2 Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at intel.com, or from the OEM or retailer. No computer system can be absolutely secure. Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit http://www.intel.com/performance. Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction. This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps. Statements in this document that refer to Intel’s plans and expectations for the quarter, the year, and the future, are forward-looking statements that involve a number of risks and uncertainties. A detailed discussion of the factors that could affect Intel’s results and plans is included in Intel’s SEC filings, including the annual report on Form 10-K. The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. Intel does not control or audit third-party benchmark data or the web sites referenced in this document. 
You should visit the referenced web site and confirm whether referenced data are accurate. Intel, Xeon and the Intel logo are trademarks of Intel Corporation in the United States and other countries. *Other names and brands may be claimed as the property of others. OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos. © 2015 Intel Corporation.
  • 3. Risk Factors: The above statements and any others in this document that refer to plans and expectations for the second quarter, the year and the future are forward-looking statements that involve a number of risks and uncertainties. Words such as "anticipates," "expects," "intends," "plans," "believes," "seeks," "estimates," "may," "will," "should" and their variations identify forward-looking statements. Statements that refer to or are based on projections, uncertain events or assumptions also identify forward-looking statements. Many factors could affect Intel's actual results, and variances from Intel's current expectations regarding such factors could cause actual results to differ materially from those expressed in these forward-looking statements. Intel presently considers the following to be important factors that could cause actual results to differ materially from the company's expectations. Demand for Intel's products is highly variable and could differ from expectations due to factors including changes in business and economic conditions; consumer confidence or income levels; the introduction, availability and market acceptance of Intel's products, products used together with Intel products and competitors' products; competitive and pricing pressures, including actions taken by competitors; supply constraints and other disruptions affecting customers; changes in customer order patterns including order cancellations; and changes in the level of inventory at customers. 
Intel's gross margin percentage could vary significantly from expectations based on capacity utilization; variations in inventory valuation, including variations related to the timing of qualifying products for sale; changes in revenue levels; segment product mix; the timing and execution of the manufacturing ramp and associated costs; excess or obsolete inventory; changes in unit costs; defects or disruptions in the supply of materials or resources; and product manufacturing quality/yields. Variations in gross margin may also be caused by the timing of Intel product introductions and related expenses, including marketing expenses, and Intel's ability to respond quickly to technological developments and to introduce new products or incorporate new features into existing products, which may result in restructuring and asset impairment charges. Intel's results could be affected by adverse economic, social, political and physical/infrastructure conditions in countries where Intel, its customers or its suppliers operate, including military conflict and other security risks, natural disasters, infrastructure disruptions, health concerns and fluctuations in currency exchange rates. Results may also be affected by the formal or informal imposition by countries of new or revised export and/or import and doing-business regulations, which could be changed without prior notice. Intel operates in highly competitive industries and its operations have high costs that are either fixed or difficult to reduce in the short term. The amount, timing and execution of Intel's stock repurchase program could be affected by changes in Intel's priorities for the use of cash, such as operational spending, capital spending, acquisitions, and as a result of changes to Intel's cash flows or changes in tax laws. Product defects or errata (deviations from published specifications) may adversely impact our expenses, revenues and reputation. 
Intel's results could be affected by litigation or regulatory matters involving intellectual property, stockholder, consumer, antitrust, disclosure and other issues. An unfavorable ruling could include monetary damages or an injunction prohibiting Intel from manufacturing or selling one or more products, precluding particular business practices, impacting Intel's ability to design its products, or requiring other remedies such as compulsory licensing of intellectual property. Intel's results may be affected by the timing of closing of acquisitions, divestitures and other significant transactions. A detailed discussion of these and other factors that could affect Intel's results is included in Intel's SEC filings, including the company's most recent reports on Form 10-Q, Form 10-K and earnings release. Rev. 4/14/15
  • 4. Agenda
      • Accelerators: Motivation and Use Cases
      • Using Field Programmable Gate Array (FPGA) as an Accelerator
      • Intel® Xeon® Processor + FPGA Accelerator Platform
      • Hardware and Software Programming Interfaces
      • Example Applications
  • 5. Digital Services Economy… 50¹ billion devices; build-out of the cloud $120B³; new services $450B². 1: Sources: AMS Research, Gartner, IDC, McKinsey Global Institute, and various other industry analysts and commentators. 2: Source: IDC, 2013; 2016 calculated based on reported CAGR ‘13-’17. 3: Source: iDATA/Digiworld, 2013.
  • 7. Cloud Economics: Amazon’s TCO Analysis¹. Workload performance metrics: VMs per system, web transactions/sec, storage capacity, Hadoop queries. Performance/TCO is the key metric. 1: Source: James Hamilton, Amazon* http://perspectives.mvdirona.com/2010/09/overall-data-center-costs/
  • 8. Diverse Data Center Demands: Accelerators can increase performance at lower TCO for targeted workloads. (Figure note: Intel estimates; bubble size is relative CPU intensity.)
  • 9. Agenda
      • Accelerators: Motivation and Use Cases
      • Using Field Programmable Gate Array (FPGA) as an Accelerator
      • Intel® Xeon® Processor + FPGA Accelerator Platform
      • Hardware and Software Programming Interfaces
      • Example Applications
  • 10. Accelerator Architecture Landscape (Figure labels: Fixed Function Accelerator, Reconfigurable Accelerator, CPU; axes: Application Flexibility, Ease of Programming/Development)
  • 11. Benefits of Reconfigurable Accelerators: Savings in Area/Power
      • Can be configured to implement different functions efficiently
          • Meeting performance goals for segment
          • Saving area and power compared to multiple Fixed Functions
      • (Figure labels: Fixed Functions, Programmable Accelerator, Software; axes: Cost, Performance)
  • 12. Benefits of Reconfigurable Accelerators: Meeting Customer Needs for Differentiation. Workload Optimized Silicon; Pervasive Analytics & Insights; Intelligent Resource Orchestration; Dynamic Resource Pooling. Driving the Digital Service Economy.
  • 13. What is a Field Programmable Gate Array (FPGA)?
      • FPGAs (Field Programmable Gate Arrays) are semiconductor devices that can be programmed
      • Desired functionality of the FPGA can be (re-)programmed by downloading a configuration into the device
      • FPGAs offer several advantages over potential alternatives:
          • Lower one-time development cost and faster time to market compared to custom-designed chips (ASICs)
          • Ability to implement customer-specific functionality beyond what is available from standard products (ASSPs)
          • Customizable and reprogrammable after the device has been deployed to the field, unlike both ASICs and ASSPs
      • (Figure labels: Logic Blocks, Interconnect Resources, I/O Cells)
  • 14. A Complete Solutions Portfolio: Powering Your Innovation. CPLDs: lowest cost, lowest power. PowerSoCs: high-efficiency power management. FPGAs: cost/power balance; mid-range FPGAs; FPGAs optimized for high bandwidth; SoC & transceivers. Resources: design software, development kits, embedded soft and hard processors, intellectual property (IP). Industrial Computing Enterprise.
  • 15. Efficiency via Specialization (Figure labels: GPUs, FPGAs, ASICs). Source: Bob Broderson, Berkeley Wireless group
  • 16. OpenCL and FPGAs Address These Challenges
      • Power-efficient acceleration
          • Typically 1/5 the power of a GPU and orders of magnitude more performance per watt than a CPU
      • FPGA lifecycle over 15 years
          • GPU lifespan is short; GPUs require re-optimization and testing between generations
          • FPGA OpenCL code is retargeted to future devices without modification
      • Our OpenCL flow abstracts away the FPGA hardware flow
          • Puts the FPGA into software engineers' hands
      • Our OpenCL SDK allows for streaming IO channels and kernel channels
          • Data movement without host involvement
          • Low-latency data transmissions to accelerator
      • Shared virtual memory
          • IBM CAPI and Intel QPI
  • 17. More SW Engineering Resources than HW?
      • 1000:1 software engineers to FPGA designers
      • Software engineers are not used to long compile times
      • OpenCL Solves This!
          • Our OpenCL flow abstracts away the FPGA hardware flow, bringing the FPGA to low-level software programmers
          • Software developers write, optimize and debug in their familiar software environment
          • Quartus is run behind the scenes
          • Emulator and profiler are software development tools
          • Long compile times are pushed to the end of development
          • OpenCL optimization doesn’t require a board
          • Allowing SW to drive board requirements (.xml file)
  • 18. Application Development Paradigm: ASIC/FPGA programmers, parallel programmers, standard CPU programmers. OpenCL expands the number of application developers.
  • 19. Agenda
      • Accelerators: Motivation and Use Cases
      • Using Field Programmable Gate Array (FPGA) as an Accelerator
      • Intel® Xeon® Processor + FPGA Accelerator Platform
      • Hardware and Software Programming Interfaces
      • Example Applications
  • 20. Intel® Xeon® E5 + Field Programmable Gate Array Software Development Platform (SDP): Shipping Today
      • Software development for accelerating workloads using Intel® Xeon® processors and a coherently attached FPGA in-socket
      • (Block diagram: Intel Xeon Processor E5 Product Family and FPGA connected by Intel® QuickPath Interconnect (Intel® QPI), with DDR3 channels, PCIe 3.0 x8 links, and DMI2)
      • Processor: Intel Xeon Processor E5 Product Family
      • FPGA Module: Altera* Stratix* V
      • QPI Speed: 6.4 GT/s full width (target 8.0 GT/s at full width)
      • Memory to FPGA Module: 2 channels of DDR3 (up to 64 GB)
      • Expansion connector to FPGA Module: PCI Express® (PCIe) 3.0 x8 lanes; may be used for direct I/O, e.g. Ethernet
      • Features: Configuration Agent, Caching Agent, (optional) Memory Controller
      • Software: Accelerator Abstraction Layer (AAL) runtime, drivers, sample applications
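To give a rough feel for the QPI speeds quoted on this slide, the sketch below converts a transfer rate into peak per-direction bandwidth. It assumes a full-width QPI link carries 2 bytes of payload per transfer in each direction; that figure and the function name are assumptions for illustration and do not appear on the slide.

```python
# Assumption (not from the slide): a full-width QPI link moves 2 bytes of
# payload per transfer in each direction.
BYTES_PER_TRANSFER = 2


def qpi_bandwidth_gb_s(gt_per_s: float) -> float:
    """Peak one-direction bandwidth in GB/s for a full-width link."""
    return gt_per_s * BYTES_PER_TRANSFER


# Current SDP speed and the stated target speed, both taken from the slide.
for rate in (6.4, 8.0):
    print(f"{rate} GT/s -> {qpi_bandwidth_gb_s(rate):.1f} GB/s per direction")
```

These are peak numbers only; sustained throughput seen by an accelerator would be lower once protocol overhead is included.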
  • 21. System Logical View
      • AFUs can access coherent cache on FPGA
      • AFUs cannot implement a second-level cache
      • Intel® QuickPath Interconnect (Intel® QPI) IP participates in cache coherency with processors
      • (Diagram labels: Processor Cores, LLC, DRAM, QPI, FPGA, Intel QPI IP, Cache, CCI, AFUs, DDR; Multi-processor Coherence Domain; Cache Access Domain)
  • 22. Intel® Xeon® + Field Programmable Gate Array SDP: Intel® QuickPath Interconnect 1.1 RTL Microarchitecture
      • PHY: implements the Intel QPI PHY 1.1 (analog/digital)
      • Intel QPI Link layer: provides flow control and reliable communication
      • Intel QPI Protocol: implements Intel QPI Cache Agent + Configuration Agent
      • Cache Controller: determines cache hit/miss and generates Intel QPI protocol requests
      • Cache Tag: tracks state of cacheline (MESI + internal states for tracking outstanding requests)
      • Coherency Table: programmable table that implements coherency protocol rules
      • System Protocol Layer (SPL2): implements address translation functionality. Can provide up to 2 GB of device virtual address space to the AFU. SPL2 cannot handle page faults.
      • AFU: user-designed Accelerator Function Unit
      • (Diagram labels: QPI interface to pins; Intel QPI FPGA IP with QPI PHY, Rx/Tx Align, Rx/Tx Control, QPI Link/Protocol Control, Cache Controller, Cache Data, Cache Tag, Cache Table; CCI-S; SPL2 address translation; CCI-E; 640-bit Rx/Tx; User: Accelerator Function Unit (AFU))
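The SPL2 behaviour described on this slide (a bounded device virtual address space, translation by lookup, no page-fault handling) can be pictured with a toy model. Everything in the sketch is hypothetical: the 2 MB page size, the class name, and its methods are assumptions chosen for illustration, not the SPL2 design or any AAL API.

```python
PAGE_SIZE = 2 * 1024 * 1024       # assumed page size (hypothetical)
DEVICE_VA_LIMIT = 2 * 1024 ** 3   # 2 GB device virtual address space (from the slide)


class ToyTranslator:
    """Toy device-virtual to physical translation over pre-mapped pages."""

    def __init__(self):
        # device virtual page number -> physical page base address
        self.table = {}

    def map_page(self, dev_va: int, phys_base: int) -> None:
        if not 0 <= dev_va < DEVICE_VA_LIMIT:
            raise ValueError("outside the 2 GB device virtual address space")
        if dev_va % PAGE_SIZE or phys_base % PAGE_SIZE:
            raise ValueError("addresses must be page aligned")
        self.table[dev_va // PAGE_SIZE] = phys_base

    def translate(self, dev_va: int) -> int:
        if not 0 <= dev_va < DEVICE_VA_LIMIT:
            raise ValueError("outside the 2 GB device virtual address space")
        page, offset = divmod(dev_va, PAGE_SIZE)
        if page not in self.table:
            # No page-fault handling: an unmapped access is simply an error,
            # so every page must be mapped before the accelerator touches it.
            raise LookupError("unmapped page; cannot be faulted in")
        return self.table[page] + offset


translator = ToyTranslator()
translator.map_page(0, 0x40000000)
print(hex(translator.translate(0x10)))
```

The practical consequence the model tries to show is that host software has to set up all mappings up front, since the translation layer has no way to recover from a miss.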
  • 23. Agenda
      • Accelerators: Motivation and Use Cases
      • Using Field Programmable Gate Array (FPGA) as an Accelerator
      • Intel® Xeon® Processor + FPGA Accelerator Platform
      • Hardware and Software Programming Interfaces
      • Example Applications
  • 24. Intel® Xeon® Processor + Field Programmable Gate Array (FPGA) Tool Flow
      • HDL programming: C goes through the SW compiler to an exe (Intel® Xeon®, AAL); HDL goes through synthesis (Syn.) and place-and-route (PAR) to a bitstream (FPGA shell)
      • OpenCL™ programming: host code goes through the SW compiler to an exe (Intel Xeon, AAL); kernels go through the OpenCL compiler to a bitstream (FPGA shell)
      • AAL: Accelerator Abstraction Layer
  • 25. Programming Interfaces
      • (Diagram labels: Host Application; Accelerator Abstraction Layer with Virtual Memory API, Service API, Physical Memory API; CPU; Intel QPI/KTI link, protocol, & PHY; CCI¹ standard; CCI¹ extended with address translation; Accelerator Function Units (AFU))
      • Standard programming interfaces: AAL and CCI
      • Programming interfaces will be forward compatible from SDP² to future MCP³ solutions
      • Simulation environment available for development of SW and RTL⁴
      • 1: Coherent Cache Interface. 2: Software Development Platform. 3: Multi-chip package. 4: Register Transfer Level.
  • 26. Programming Interfaces: OpenCL™
      • (Diagram labels: OpenCL Application; OpenCL™ Host Code; OpenCL RunTime; Accelerator Abstraction Layer with Virtual Memory API, Service API, Physical Memory API; CPU; System Memory; Intel QPI/PCI Express®; Field Programmable Gate Array with CCI Standard, CCI Extended, VirtMem, CFG, OpenCL Kernels; OpenCL Kernel Code)
      • Unified application code abstracted from the hardware environment
      • Portable across generations and families of CPUs and FPGAs
  • 27. Agenda
      • Accelerators: Motivation and Use Cases
      • Using Field Programmable Gate Array (FPGA) as an Accelerator
      • Intel® Xeon® Processor + FPGA Accelerator Platform
      • Hardware and Software Programming Interfaces
      • Example Applications
  • 28. Example Usage: Deep Learning Framework for Visual Understanding
      • CNN (Convolutional Neural Network) function accelerated on FPGA: power-performance of CNN classification boosted up to 2.2X†
      • (Diagram labels: cluster, node, device, primitives; DMA; Weights, Inputs, Outputs; Processing Tile 0, Processing Tile 1, Processing Tile ‘n’; PE; Read, Write; Reg Access; Control State Machine; IP Registers; CCI Interface; SRAM Controller)
      • †Source: Intel measured (Intel® Xeon® processor E5-2699v3 results); Altera estimated (4x Arria-10 results). 2S Intel Xeon E5-2699v3 + 4x GX1150 PCI Express® cards. Most computations executed on Arria-10 FPGAs; 2S Intel Xeon E5-2699v3 host assumed to be near idle, doing misc. networking/housekeeping functions. Arria-10 results estimated by Altera with Altera custom classification network. 2x Intel Xeon E5-2699v3 power estimated @ 139 W while doing "housekeeping" for GX1150 cards, based on Intel measured microbenchmark. In order to sustain ~2400 img/s we need an I/O bandwidth of ~500 MB/s, which can be supported by a 10GigE link and software stack.
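The I/O claim in the footnote of slide 28 can be sanity-checked with a few lines of arithmetic. The sketch below uses the slide's approximate figures (~2400 img/s, ~500 MB/s) and, as a simplifying assumption, compares them with the raw 10 Gbit/s line rate of a 10GigE link without modelling protocol overhead.

```python
IMAGES_PER_S = 2400      # approximate sustained rate, from the slide
IO_MB_PER_S = 500        # approximate required I/O bandwidth, from the slide
LINK_GBIT_PER_S = 10     # raw 10GigE line rate; protocol overhead is ignored

# Implied payload per image at the quoted rates.
bytes_per_image = IO_MB_PER_S * 1e6 / IMAGES_PER_S

# Raw link capacity in MB/s and the fraction the workload would occupy.
link_mb_per_s = LINK_GBIT_PER_S * 1e9 / 8 / 1e6
utilization = IO_MB_PER_S / link_mb_per_s

print(f"~{bytes_per_image / 1e3:.0f} KB per image")
print(f"link capacity {link_mb_per_s:.0f} MB/s, utilization {utilization:.0%}")
```

Under these assumptions the stream needs well under half of the raw link rate, which is consistent with the slide's statement that a 10GigE link can support it.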
  • 29. Example Usage: Genomics Analysis Toolkit
      • (Diagram labels: BWA mem (Smith-Waterman); HaplotypeCaller (PairHMM))
      • PairHMM function accelerated on FPGA: power-performance of pHMM boosted up to 3.8X†
      • †pHMM algorithm performance is measured in millions of cell updates per second (MCUP/s). Performance projections: CPU performance: 1 core of Intel® Xeon® processor E5-2680v2 @ 2.8 GHz delivers 2101.1 MCUP/s measured; estimated value assumes linear scaling to 10 cores on Xeon E5-2680v2 @ 2.8 GHz and 115 W TDP. FPGA performance: 1 FPGA PE (Processing Engine) delivers 408.9 MCUP/s @ 200 MHz measured; estimated value assumes linear scaling to 32 PEs and 90% frequency scaling on Stratix-V A7 400 MHz based on RTL synthesis results (35 W TDP). Intel estimated based on 1S Xeon E5-2680v2 + 1 Stratix-V A7 with QPI 1.1 @ 6.4 GT/s full width using Intel® QuickAssist FPGA System Release 3.3, ICC (CPU is essentially idle when the workload is offloaded to the FPGA).
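The footnote on slide 29 gives enough numbers to attempt the performance-per-watt comparison by hand. The slide does not spell out its formula, so the sketch below is one possible reading: CPU throughput scales linearly to 10 cores, and FPGA throughput scales linearly to 32 PEs, doubles with the clock going from 200 MHz to 400 MHz, and is then derated to 90%. That interpretation of "90% frequency scaling" is an assumption.

```python
# Measured and projected figures quoted in the slide footnote.
CPU_MCUPS_PER_CORE = 2101.1   # 1 core, E5-2680v2 @ 2.8 GHz
CPU_CORES = 10
CPU_TDP_W = 115

FPGA_MCUPS_PER_PE = 408.9     # 1 PE @ 200 MHz
FPGA_PES = 32
FPGA_TDP_W = 35

# Assumed reading of "90% frequency scaling": the clock doubles from
# 200 MHz to 400 MHz and 90% of that gain is realized.
FREQ_GAIN = (400 / 200) * 0.9

cpu_mcups = CPU_MCUPS_PER_CORE * CPU_CORES
fpga_mcups = FPGA_MCUPS_PER_PE * FPGA_PES * FREQ_GAIN

cpu_perf_per_watt = cpu_mcups / CPU_TDP_W
fpga_perf_per_watt = fpga_mcups / FPGA_TDP_W
ratio = fpga_perf_per_watt / cpu_perf_per_watt

print(f"CPU : {cpu_mcups:.0f} MCUP/s, {cpu_perf_per_watt:.1f} MCUP/s per W")
print(f"FPGA: {fpga_mcups:.0f} MCUP/s, {fpga_perf_per_watt:.1f} MCUP/s per W")
print(f"power-performance ratio: {ratio:.2f}x")
```

Under this reading the ratio comes out near 3.7x, in the neighborhood of the slide's "up to 3.8X" but not identical to it, so the slide's estimate presumably uses slightly different inputs or rounding.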
  • 30. Intel® Xeon® + FPGA¹ in the Cloud: Vision
      • Flow: cloud users launch workload; orchestration software places workload; static/dynamic FPGA programming
      • Software Defined Infrastructure resource pool: Storage, Network, Compute (Intel® Xeon® + FPGA)
      • IP library of workload accelerators: Intel developed IP, 3rd party developed IP, FPGA vendor developed IP, end user developed IP
      • 1: Field Programmable Gate Array (FPGA)
  • 31. “Programmer Friendly” Acceleration
      • Software programmers need logic and data management, by writing lines of code
      • OpenCL™ compiler benefits: ease of use; scalable; heterogeneous; leverage existing libraries; vendor choice w/ open standards; foundation for OpenMP (80% reuse)
      • Channels/pipe extension: kernel to kernel; external IO to kernel; mix ‘n match HDL & kernels
      • (Diagram labels: CPU; FPGA; Context; Compile code; Create data & arguments; Execute; I/O; Kernel; DDRx; Global Memory Buffer)
      • © 2015 Altera Corporation – Public
  • 32. Spectrum of Workload Acceleration
      • Software Library. Example: Data Plane Development Kit
      • Processor Instruction. Example: Intel® Advanced Vector Extensions
      • Discrete Accelerator. Example: Field Programmable Gate Array, Quick Assist Technology
      • Integrated Accelerator. Example: Intel® Iris™ Pro Graphics
  • 33. Workload Acceleration Beyond CPU: Intel® Silicon Photonics; Intel® Omni-Path Fabric; 3D XPoint™ Technology
  • 34. Intel Architecture Vision for Software: Code Once – Run Anywhere. Software Library; Processor Instruction; Discrete Accelerator; Integrated Accelerator. Consistent programming model for all accelerators.
  • 35. Additional Sources of Information
      • A PDF of this presentation is available from our Technical Session Catalog: www.intel.com/idfsessionsSF
      • Intel® Xeon Phi™ coprocessor resources: software.intel.com/mic-developer
      • Network Compression resources: intel.com/quickassist
      • Media Transcoding resources: software.intel.com/intel-media-server-studio
      • Storage Cryptography resources: software.intel.com/storage
      • FPGA: Please see demo in Altera* booth in the demo showcase