©2018 Micron Technology, Inc. All rights reserved. Information, products, and/or specifications are subject
to change without notice. All information is provided on an “AS IS” basis without warranties of any kind.
Statements regarding products, including regarding their features, availability, functionality, or
compatibility, are provided for informational purposes only and do not modify the warranty, if any,
applicable to any product. Drawings may not be to scale. Micron, the Micron logo, and all other Micron
trademarks are the property of Micron Technology, Inc. All other trademarks are the property of their
respective owners.
Seamless Prediction at the Edge
Using TensorFlow on FPGAs
Brad Spiers, Principal Solutions Architect
Linley Spring Processor Conference: April 12, 2018
Prediction... At the Edge
• Limited Weight, Space and Power
• Very Limited External Bandwidth
• Cannot Move Data → Must Compute Locally
• FPGAs Have Speed, Efficiency & Memory Capability
• Now Program FPGAs – with No Code Change!
Micron Confidential
What are Field Programmable Gate Arrays (FPGAs)?
• Unlike a CPU, no Pre-Defined Instructions
• Can be Dynamically Reprogrammed
• Massive Inherent Parallelism
[Figure: CPU, GPU, and FPGA block diagrams, with ALU, Control, and Cache blocks labeled]
Current Customer Challenges
• Person and Face Recognition
• Body Pose Recognition
• Fingerprint Recognition
• Voice and Speaker Identification
• Object Categorization
• Time-Series Pattern Recognition (LSTM-based RNNs)
FWDNXT Performance on FPGAs
From Just 24 Watts to Handle Power Constraints on “The Edge”
FWDNXT’s Approach
• Speed up Traces, not Layers
• Key Idea: Hide Non-Essential Work Behind Long Traces
• Traces Stretch Across Network Layers
• With Long Traces, Bandwidth Becomes Key
FWDNXT Has a Hierarchical Architecture
• Hierarchical Memory Design Achieves Efficiency
• Hidden, Long Memory Fetches Fill Buffers
• Full Buffers Feed Compute Units
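As a rough illustration of why hiding memory fetches behind compute matters, the sketch below compares a serial schedule with a double-buffered one. It is a generic timing model and not FWDNXT's actual design: the function names, the two-buffer assumption, and the per-tile times are all made up for illustration.

```python
def serial_time(fetch, compute):
    """Total time when every tile is fetched and then computed, with no overlap."""
    return sum(fetch) + sum(compute)


def overlapped_time(fetch, compute):
    """Total time with two buffers: while tile i is being computed,
    tile i+1 is fetched into the other buffer."""
    if not fetch:
        return 0
    total = fetch[0]  # the very first fetch cannot be hidden
    for i in range(len(compute)):
        next_fetch = fetch[i + 1] if i + 1 < len(fetch) else 0
        # Compute on tile i and the fetch of tile i+1 run concurrently,
        # so this step costs whichever of the two takes longer.
        total += max(compute[i], next_fetch)
    return total


# Hypothetical per-tile times in arbitrary units.
fetch_times = [4, 4, 4, 4]
compute_times = [5, 5, 5, 5]

print(serial_time(fetch_times, compute_times))      # 36
print(overlapped_time(fetch_times, compute_times))  # 24
```

In this model, when each compute step is at least as long as the next fetch, only the first fetch is exposed. When fetches take longer than compute, the schedule is limited by memory, which is consistent with the slides' point that bandwidth becomes key.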
Micron Hybrid Memory Cube
June 8, 2018
Low-Power Bandwidth to Feed Long Traces
8.5x more bandwidth than DDR4
70% less energy per bit
How?
• Stacked DRAM
• Multiple “banks” per layer
• “Light up” smaller bank → less energy
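To make the two ratios on this slide concrete, the sketch below applies them to a baseline. The slides do not say which DDR4 configuration was used for the comparison, so the baseline here is an assumption: one 64-bit DDR4-2400 channel, whose peak rate is 2400 MT/s × 8 bytes = 19.2 GB/s. The baseline energy figure is a placeholder, not a measured value, and "8.5x more" is read as 8.5 times the baseline.

```python
BANDWIDTH_RATIO = 8.5  # from the slide: 8.5x more bandwidth than DDR4
ENERGY_SAVING = 0.70   # from the slide: 70% less energy per bit


def ddr4_peak_gb_per_s(mega_transfers_per_s, bus_bits=64):
    """Peak bandwidth of one DDR4 channel in GB/s."""
    return mega_transfers_per_s * (bus_bits // 8) / 1000


def apply_slide_ratios(baseline_gb_per_s, baseline_pj_per_bit):
    """Scale a DDR4 baseline by the ratios quoted on the slide."""
    bandwidth = baseline_gb_per_s * BANDWIDTH_RATIO
    energy = baseline_pj_per_bit * (1 - ENERGY_SAVING)
    return bandwidth, energy


# Assumed baseline: one DDR4-2400 channel; 10 pJ/bit is a placeholder value.
baseline_bw = ddr4_peak_gb_per_s(2400)
bw, pj = apply_slide_ratios(baseline_bw, baseline_pj_per_bit=10.0)

print(f"{baseline_bw:.1f} GB/s baseline")  # 19.2 GB/s baseline
print(f"{bw:.1f} GB/s scaled")             # 163.2 GB/s scaled
print(f"{pj:.1f} pJ/bit scaled")           # 3.0 pJ/bit scaled
```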
Problem: How to Program FPGAs?
• Programming has Been a Barrier in the Past
− Verilog, HDL → Months to Deploy
• FWDNXT’s Snowflake Compiler & Micron FPGA Modules: ML for IoT
[Figure: flow from Your Network and Your Framework through Network Description and the Snowflake Compiler to the Micron FPGA Module, giving Machine Learning at the Edge]
What Model Types Can FWDNXT Handle?
• Any Model
− CNN
− RNN
− LSTM
− …
• Any Framework
− PyTorch
− Caffe
− TensorFlow
− …
FWDNXT Representations
• Now, 16-bit Fixed Point Used for Inputs
• Fixed Point: 5-bit integer, 11-bit fraction
• Moving to 16-bit Floating Point
• Now, 32-bit Fixed Point Used for Multiplication Output and Adds
[Figure: Fixed Point Representation]
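The 16-bit format with 5 integer bits and 11 fraction bits can be sketched in a few lines. The slides do not state the sign convention or rounding mode, so this sketch assumes signed two's complement with the sign bit counted among the 5 integer bits, round-to-nearest, and saturation on overflow. All function names are hypothetical.

```python
FRAC_BITS = 11                       # fraction bits, from the slide
TOTAL_BITS = 16                      # 5 integer bits + 11 fraction bits
SCALE = 1 << FRAC_BITS               # 2048 steps per unit
Q_MIN = -(1 << (TOTAL_BITS - 1))     # -32768, represents -16.0
Q_MAX = (1 << (TOTAL_BITS - 1)) - 1  # 32767, just under +16.0


def to_fixed(x):
    """Quantize a float to a 16-bit fixed-point integer (round, then saturate)."""
    q = int(round(x * SCALE))
    return max(Q_MIN, min(Q_MAX, q))


def to_float(q, frac_bits=FRAC_BITS):
    """Convert a fixed-point integer with `frac_bits` fraction bits back to a float."""
    return q / (1 << frac_bits)


def multiply(a_q, b_q):
    """Multiply two 16-bit values. The product carries 22 fraction bits
    and always fits in a signed 32-bit integer."""
    return a_q * b_q


def dot(xs_q, ws_q):
    """Accumulate products at 22 fraction bits. A long sum could exceed
    32 bits in hardware; this sketch does not model that overflow."""
    return sum(multiply(x, w) for x, w in zip(xs_q, ws_q))


a = to_fixed(1.5)
b = to_fixed(-0.25)
print(a, b)                                      # 3072 -512
print(to_float(multiply(a, b), 2 * FRAC_BITS))   # -0.375
print(to_float(to_fixed(100.0)))                 # 15.99951171875 (saturated)
```

Under these assumptions the representable range is [-16, 16) with a step of 1/2048, and a single product needs the wider 32-bit format because the fraction bits of the two operands add up.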
Steps to Deploy Models on FPGAs
1. Define Model in PyTorch, Caffe or TensorFlow
2. Train Model with Data on GPUs
3. Input Framework-Trained Model into Snowflake Compiler
4. Deploy Snowflake Output Directly onto Micron FPGA Module
NO CODE CHANGE
What New Problems Can We Solve?
• Some Domains Have Problems that Require Larger Memory Footprints
− Medical Imaging
− Oil Exploration
− Videos
− Government
• Need both High-Bandwidth and High-Capacity Memory
• Micron FPGA Cards Plus FWDNXT Snowflake Compiler Provide Missing Links
[Figure labels: Hybrid Memory Cube; Up to 512GB DDR Footprints; Advanced FPGAs (Xilinx UltraScale+, Intel Stratix 10)]
Summary
• The Edge Poses Challenges in Power and Bandwidth
• FPGAs Can Help, but Programming Was a Challenge, Until Now
• Memory Bandwidth Now Key to Machine Learning Performance
• Plus, Solve Larger Problems on Boards with up to 512GB of Memory
www.micron.com/tensorflow
