SlideShare a Scribd company logo
1 of 15
Download to read offline
Challenges in
Architecting Vision
Inference Systems for
Transformer Models
Cheng Wang
CTO & SVP, Software & Architecture
Flex Logix
Blind person and the elephant
– Receptive field is a big limitation
2
© 2023 Flex Logix Technologies, Inc.
CNNs are limited by receptive field
3
© 2023 Flex Logix Technologies, Inc.
Smaller receptive field
Smaller features &
objects
Larger receptive field
Larger features &
objects
Transformers use context from whole image
4
© 2023 Flex Logix Technologies, Inc.
CNN: Elephant? Transformers: Elephant!
Receptive Field
Receptive Field
MAYBE
VERY HARD
• Uses CNN backbone for feature extraction & transformer “head”
• Transformer Encoder extracts features from all patches for context
• Decoder makes predictions based on all extracted features
• Transformer Encoder/Decoder operations are very different from CNN
DETR 2020
– The de-facto vision transformer model
5
© 2023 Flex Logix Technologies, Inc.
• InferX provides flexibility essential
for transformers:
• Each TPU can stream data with: TPU,
L1 weight mem, L2 Data mem & DDR
• TPU natively supports mixed precision
• Flexible activations in EFLX eFPGA
• More data bandwidth vs Network-on-
Chip based AI
• Also more flexible data manipulation
InferX dynamic TPU
– Flexible, balanced & memory-efficient
6
© 2023 Flex Logix Technologies, Inc.
• First stage is positional encode:
• PE values are stored eFPGA ROMs
• EFLX lookup PE “on the fly” to add to
the K/Q matrix into the attention head
Diving into vision transformers
7
© 2023 Flex Logix Technologies, Inc.
CNN output
• Second stage multiplies input with 3
matrices for each head (Q/K/V):
• Each matrix maps to TPU weights
Diving into vision transformers (2)
8
© 2023 Flex Logix Technologies, Inc.
Attention Head #0 Attention Head #1
• Main part of multi-head attention layer is a
challenge on traditional edge accelerators:
• The (Q, K, V) for each matrix is activation data
• Q×KT multiplies 2 activation data:
• InferX can load activation into weight memory
Diving into vision transformers (3)
9
© 2023 Flex Logix Technologies, Inc.
Q0 W
0
Q
K0
K Q×K
T
• Softmax and normalization operators are also
challenging on int8 datapaths
• InferX mixed-precision includes BFloat16 format
• Enables softmax & normalization computation
without going to a separate floating-point unit
Diving into vision transformers (4)
10
© 2023 Flex Logix Technologies, Inc.
• Normalization is executed in BF16 due to
its large dynamic range.
• Add and Feed-forward Network (FFN)
operators are similar those in CNNs
Diving into vision transformers (5)
11
© 2023 Flex Logix Technologies, Inc.
L1 norm
Squared
L2 norm
InferX IP is linearly scalable (5nm, batch=1)
12
© 2023 Flex Logix Technologies, Inc.
DETR 2020 (1024×1024) 12 IPS 23 IPS 70 IPS 127 IPS
YOLOv5s (640×640) 75 IPS 175 IPS 396 IPS 850 IPS
YOLOv5L6 (1280×1280) 5 IPS 10 IPS 24 IPS 48 IPS
ResNet50 (1024x1024) 18 IPS 38 IPS 102 IPS 158 IPS
How to deploy InferX models in your system?
13
© 2023 Flex Logix Technologies, Inc.
Application data Transfer learning FP32 model
Standard training and
transfer learning workflow
Flex Logix model developer
workflow with InferX MDK
Model conversion Edge model
Int8 Model
Your Inference
Application
Your SoC with
InferX accelerator
InferX model
developer kit
‣ Optimization
‣ Quantization
‣ Compilation
(Optional)
‣ Refine model
for int8
InferX compiler is available for evaluation
14
© 2023 Flex Logix Technologies, Inc.
• InferX IP provides flexibility to future-proof your AI solution,
including state-of-the art CNN & transformers workloads
• Please come visit our booth for demos and brochures
• Please visit www.flex-logix.com for more information
Come see us for more information!
15
© 2023 Flex Logix Technologies, Inc.

More Related Content

Similar to “Challenges in Architecting Vision Inference Systems for Transformer Models,” a Presentation from Flex Logix

 Network Innovations Driving Business Transformation
 Network Innovations Driving Business Transformation Network Innovations Driving Business Transformation
 Network Innovations Driving Business TransformationCisco Service Provider
 
Cisco connect winnipeg 2018 gain insight and programmability with cisco dc ...
Cisco connect winnipeg 2018   gain insight and programmability with cisco dc ...Cisco connect winnipeg 2018   gain insight and programmability with cisco dc ...
Cisco connect winnipeg 2018 gain insight and programmability with cisco dc ...Cisco Canada
 
OpenNebula - Mellanox Considerations for Smart Cloud
OpenNebula - Mellanox Considerations for Smart CloudOpenNebula - Mellanox Considerations for Smart Cloud
OpenNebula - Mellanox Considerations for Smart CloudOpenNebula Project
 
Gain Insight and Programmability with Cisco DC Networking
Gain Insight and Programmability with Cisco DC NetworkingGain Insight and Programmability with Cisco DC Networking
Gain Insight and Programmability with Cisco DC NetworkingCisco Canada
 
Cisco DC Networking: Gain Insight and Programmability with
Cisco DC Networking: Gain Insight and Programmability with Cisco DC Networking: Gain Insight and Programmability with
Cisco DC Networking: Gain Insight and Programmability with Cisco Canada
 
Gain Insight and Programmability with Cisco DC Networking
Gain Insight and Programmability with Cisco DC NetworkingGain Insight and Programmability with Cisco DC Networking
Gain Insight and Programmability with Cisco DC NetworkingCisco Canada
 
Versal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud AccelerationVersal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud Accelerationinside-BigData.com
 
Designing and deploying converged storage area networks final
Designing and deploying converged storage area networks finalDesigning and deploying converged storage area networks final
Designing and deploying converged storage area networks finalBhavin Yadav
 
Radisys Virtualized RAN using the Mobile-CORD platform
Radisys Virtualized RAN using the Mobile-CORD platformRadisys Virtualized RAN using the Mobile-CORD platform
Radisys Virtualized RAN using the Mobile-CORD platformSmall Cell Forum
 
Cisco Connect 2018 Thailand - Secured and visualized data center connectivity...
Cisco Connect 2018 Thailand - Secured and visualized data center connectivity...Cisco Connect 2018 Thailand - Secured and visualized data center connectivity...
Cisco Connect 2018 Thailand - Secured and visualized data center connectivity...NetworkCollaborators
 
Logical_Routing_NSX_T_2.4.pptx.pptx
Logical_Routing_NSX_T_2.4.pptx.pptxLogical_Routing_NSX_T_2.4.pptx.pptx
Logical_Routing_NSX_T_2.4.pptx.pptxAnwarAnsari40
 
“How Transformers are Changing the Direction of Deep Learning Architectures,”...
“How Transformers are Changing the Direction of Deep Learning Architectures,”...“How Transformers are Changing the Direction of Deep Learning Architectures,”...
“How Transformers are Changing the Direction of Deep Learning Architectures,”...Edge AI and Vision Alliance
 
OptiXtrans E6600 main slide.pdf
OptiXtrans E6600 main slide.pdfOptiXtrans E6600 main slide.pdf
OptiXtrans E6600 main slide.pdfssuserc99286
 
Acronym Soup – NFV, SDN, OVN and VNF
Acronym Soup – NFV, SDN, OVN and VNFAcronym Soup – NFV, SDN, OVN and VNF
Acronym Soup – NFV, SDN, OVN and VNFEmulex Corporation
 
Cisco Connect Vancouver 2017 - Gain insight and programmability with Cisco DC...
Cisco Connect Vancouver 2017 - Gain insight and programmability with Cisco DC...Cisco Connect Vancouver 2017 - Gain insight and programmability with Cisco DC...
Cisco Connect Vancouver 2017 - Gain insight and programmability with Cisco DC...Cisco Canada
 
Turbocharge the NFV Data Plane in the SDN Era - a Radisys presentation
Turbocharge the NFV Data Plane in the SDN Era - a Radisys presentationTurbocharge the NFV Data Plane in the SDN Era - a Radisys presentation
Turbocharge the NFV Data Plane in the SDN Era - a Radisys presentationRadisys Corporation
 
LTE and Satellite: Solutions for Rural and Public Safety Networking
LTE and Satellite: Solutions for Rural and Public Safety NetworkingLTE and Satellite: Solutions for Rural and Public Safety Networking
LTE and Satellite: Solutions for Rural and Public Safety NetworkingSmall Cell Forum
 

Similar to “Challenges in Architecting Vision Inference Systems for Transformer Models,” a Presentation from Flex Logix (20)

 Network Innovations Driving Business Transformation
 Network Innovations Driving Business Transformation Network Innovations Driving Business Transformation
 Network Innovations Driving Business Transformation
 
IPLOOK Networks.pdf
IPLOOK Networks.pdfIPLOOK Networks.pdf
IPLOOK Networks.pdf
 
Cisco connect winnipeg 2018 gain insight and programmability with cisco dc ...
Cisco connect winnipeg 2018   gain insight and programmability with cisco dc ...Cisco connect winnipeg 2018   gain insight and programmability with cisco dc ...
Cisco connect winnipeg 2018 gain insight and programmability with cisco dc ...
 
OpenNebula - Mellanox Considerations for Smart Cloud
OpenNebula - Mellanox Considerations for Smart CloudOpenNebula - Mellanox Considerations for Smart Cloud
OpenNebula - Mellanox Considerations for Smart Cloud
 
Gain Insight and Programmability with Cisco DC Networking
Gain Insight and Programmability with Cisco DC NetworkingGain Insight and Programmability with Cisco DC Networking
Gain Insight and Programmability with Cisco DC Networking
 
Cisco DC Networking: Gain Insight and Programmability with
Cisco DC Networking: Gain Insight and Programmability with Cisco DC Networking: Gain Insight and Programmability with
Cisco DC Networking: Gain Insight and Programmability with
 
Gain Insight and Programmability with Cisco DC Networking
Gain Insight and Programmability with Cisco DC NetworkingGain Insight and Programmability with Cisco DC Networking
Gain Insight and Programmability with Cisco DC Networking
 
Versal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud AccelerationVersal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud Acceleration
 
Designing and deploying converged storage area networks final
Designing and deploying converged storage area networks finalDesigning and deploying converged storage area networks final
Designing and deploying converged storage area networks final
 
Radisys Virtualized RAN using the Mobile-CORD platform
Radisys Virtualized RAN using the Mobile-CORD platformRadisys Virtualized RAN using the Mobile-CORD platform
Radisys Virtualized RAN using the Mobile-CORD platform
 
Cisco Connect 2018 Thailand - Secured and visualized data center connectivity...
Cisco Connect 2018 Thailand - Secured and visualized data center connectivity...Cisco Connect 2018 Thailand - Secured and visualized data center connectivity...
Cisco Connect 2018 Thailand - Secured and visualized data center connectivity...
 
Logical_Routing_NSX_T_2.4.pptx.pptx
Logical_Routing_NSX_T_2.4.pptx.pptxLogical_Routing_NSX_T_2.4.pptx.pptx
Logical_Routing_NSX_T_2.4.pptx.pptx
 
“How Transformers are Changing the Direction of Deep Learning Architectures,”...
“How Transformers are Changing the Direction of Deep Learning Architectures,”...“How Transformers are Changing the Direction of Deep Learning Architectures,”...
“How Transformers are Changing the Direction of Deep Learning Architectures,”...
 
Design Principles for 5G
Design Principles for 5GDesign Principles for 5G
Design Principles for 5G
 
OptiXtrans E6600 main slide.pdf
OptiXtrans E6600 main slide.pdfOptiXtrans E6600 main slide.pdf
OptiXtrans E6600 main slide.pdf
 
Acronym Soup – NFV, SDN, OVN and VNF
Acronym Soup – NFV, SDN, OVN and VNFAcronym Soup – NFV, SDN, OVN and VNF
Acronym Soup – NFV, SDN, OVN and VNF
 
Cisco Connect Vancouver 2017 - Gain insight and programmability with Cisco DC...
Cisco Connect Vancouver 2017 - Gain insight and programmability with Cisco DC...Cisco Connect Vancouver 2017 - Gain insight and programmability with Cisco DC...
Cisco Connect Vancouver 2017 - Gain insight and programmability with Cisco DC...
 
Turbocharge the NFV Data Plane in the SDN Era - a Radisys presentation
Turbocharge the NFV Data Plane in the SDN Era - a Radisys presentationTurbocharge the NFV Data Plane in the SDN Era - a Radisys presentation
Turbocharge the NFV Data Plane in the SDN Era - a Radisys presentation
 
A consolidated virtualization approach to deploying distributed cloud networks
A consolidated virtualization approach to deploying distributed cloud networksA consolidated virtualization approach to deploying distributed cloud networks
A consolidated virtualization approach to deploying distributed cloud networks
 
LTE and Satellite: Solutions for Rural and Public Safety Networking
LTE and Satellite: Solutions for Rural and Public Safety NetworkingLTE and Satellite: Solutions for Rural and Public Safety Networking
LTE and Satellite: Solutions for Rural and Public Safety Networking
 

More from Edge AI and Vision Alliance

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...Edge AI and Vision Alliance
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...Edge AI and Vision Alliance
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...Edge AI and Vision Alliance
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...Edge AI and Vision Alliance
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...Edge AI and Vision Alliance
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...Edge AI and Vision Alliance
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...Edge AI and Vision Alliance
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsightsEdge AI and Vision Alliance
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...Edge AI and Vision Alliance
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...Edge AI and Vision Alliance
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...Edge AI and Vision Alliance
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...Edge AI and Vision Alliance
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...Edge AI and Vision Alliance
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...Edge AI and Vision Alliance
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...Edge AI and Vision Alliance
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from SamsaraEdge AI and Vision Alliance
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...Edge AI and Vision Alliance
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...Edge AI and Vision Alliance
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...Edge AI and Vision Alliance
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...Edge AI and Vision Alliance
 

More from Edge AI and Vision Alliance (20)

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
 

Recently uploaded

GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 

Recently uploaded (20)

GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 

“Challenges in Architecting Vision Inference Systems for Transformer Models,” a Presentation from Flex Logix

  • 1. Challenges in Architecting Vision Inference Systems for Transformer Models Cheng Wang CTO & SVP, Software & Architecture Flex Logix
  • 2. Blind person and the elephant – Receptive field is a big limitation 2 © 2023 Flex Logix Technologies, Inc.
  • 3. CNNs are limited by receptive field 3 © 2023 Flex Logix Technologies, Inc. Smaller receptive field Smaller features & objects Larger receptive field Larger features & objects
  • 4. Transformers use context from whole image 4 © 2023 Flex Logix Technologies, Inc. CNN: Elephant? Transformers: Elephant! Receptive Field Receptive Field MAYBE VERY HARD
  • 5. • Uses CNN backbone for feature extraction & transformer “head” • Transformer Encoder extracts features from all patches for context • Decoder makes predictions based on all extracted features • Transformer Encoder/Decoder operations are very different from CNN DETR 2020 – The de-facto vision transformer model 5 © 2023 Flex Logix Technologies, Inc.
  • 6. • InferX provides flexibility essential for transformers: • Each TPU can stream data with: TPU, L1 weight mem, L2 Data mem & DDR • TPU natively supports mixed precision • Flexible activations in EFLX eFPGA • More data bandwidth vs Network-on- Chip based AI • Also more flexible data manipulation InferX dynamic TPU – Flexible, balanced & memory-efficient 6 © 2023 Flex Logix Technologies, Inc.
  • 7. • First stage is positional encode: • PE values are stored eFPGA ROMs • EFLX lookup PE “on the fly” to add to the K/Q matrix into the attention head Diving into vision transformers 7 © 2023 Flex Logix Technologies, Inc. CNN output
  • 8. • Second stage multiplies input with 3 matrices for each head (Q/K/V): • Each matrix maps to TPU weights Diving into vision transformers (2) 8 © 2023 Flex Logix Technologies, Inc. Attention Head #0 Attention Head #1
  • 9. • Main part of multi-head attention layer is a challenge on traditional edge accelerators: • The (Q, K, V) for each matrix is activation data • Q×KT multiplies 2 activation data: • InferX can load activation into weight memory Diving into vision transformers (3) 9 © 2023 Flex Logix Technologies, Inc. Q0 W 0 Q K0 K Q×K T
  • 10. • Softmax and normalization operators are also challenging on int8 datapaths • InferX mixed-precision includes BFloat16 format • Enables softmax & normalization computation without going to a separate floating-point unit Diving into vision transformers (4) 10 © 2023 Flex Logix Technologies, Inc.
  • 11. • Normalization is executed in BF16 due to its large dynamic range. • Add and Feed-forward Network (FFN) operators are similar those in CNNs Diving into vision transformers (5) 11 © 2023 Flex Logix Technologies, Inc. L1 norm Squared L2 norm
  • 12. InferX IP is linearly scalable (5nm, batch=1) 12 © 2023 Flex Logix Technologies, Inc. DETR 2020 (1024×1024) 12 IPS 23 IPS 70 IPS 127 IPS YOLOv5s (640×640) 75 IPS 175 IPS 396 IPS 850 IPS YOLOv5L6 (1280×1280) 5 IPS 10 IPS 24 IPS 48 IPS ResNet50 (1024x1024) 18 IPS 38 IPS 102 IPS 158 IPS
  • 13. How to deploy InferX models in your system? 13 © 2023 Flex Logix Technologies, Inc. Application data Transfer learning FP32 model Standard training and transfer learning workflow Flex Logix model developer workflow with InferX MDK Model conversion Edge model Int8 Model Your Inference Application Your SoC with InferX accelerator InferX model developer kit ‣ Optimization ‣ Quantization ‣ Compilation (Optional) ‣ Refine model for int8
  • 14. InferX compiler is available for evaluation 14 © 2023 Flex Logix Technologies, Inc.
  • 15. • InferX IP provides flexibility to future-proof your AI solution, including state-of-the art CNN & transformers workloads • Please come visit our booth for demos and brochures • Please visit www.flex-logix.com for more information Come see us for more information! 15 © 2023 Flex Logix Technologies, Inc.